# Summarization

Large language models are well suited to summarization: training on vast amounts of text gives them a strong grasp of language structure, semantics, and context. They identify the core elements and main ideas of a text and distill long, complex content into succinct summaries. Such summaries let users quickly grasp the essential points of lengthy documents, supporting faster decision-making and knowledge acquisition without wading through every detail.

First, we need to instantiate an Aleph Alpha `Client` so that we can make calls to the Aleph Alpha API, which gives us access to the Aleph Alpha family of models.

We use the `LimitedConcurrencyClient`, which makes sure that we can run concurrent calls to the API without running into problems later.
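
For intuition on what limiting concurrency means, here is a minimal, library-independent sketch built on a semaphore. It is not how `LimitedConcurrencyClient` is implemented; `fake_api_call`, `MAX_CONCURRENCY`, and `peak_active` are hypothetical names used only for illustration.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENCY = 2  # hypothetical limit, chosen for illustration
_slots = threading.Semaphore(MAX_CONCURRENCY)
_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of calls observed in flight at once


def fake_api_call(prompt: str) -> str:
    """Stand-in for a remote call; at most MAX_CONCURRENCY run at once."""
    global _active, peak_active
    with _slots:  # blocks while MAX_CONCURRENCY calls are already in flight
        with _lock:
            _active += 1
            peak_active = max(peak_active, _active)
        time.sleep(0.01)  # simulate network latency
        with _lock:
            _active -= 1
    return prompt.upper()


# Eight worker threads compete, but the semaphore lets only two proceed at a time.
with ThreadPoolExecutor(max_workers=8) as pool:
    responses = list(pool.map(fake_api_call, ["a", "b", "c", "d", "e", "f"]))
```

Callers can submit as many requests as they like; the semaphore, not the caller, decides how many are actually in flight.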

In [None]:
from intelligence_layer.connectors import LimitedConcurrencyClient

from dotenv import load_dotenv

load_dotenv()

client = LimitedConcurrencyClient.from_env()

Let's start building a simple summarization workflow.
First, let's import a ready-to-go summarization method.

This particular summarization method is optimized for handling extensive text inputs and producing summaries with a medium level of compression.
Designed to maintain a balance between brevity and comprehensive detail, this method efficiently distills long-context information into concise summaries without omitting crucial content.

Note that the task makes calls to the Aleph Alpha API under the hood; here we construct it with its default settings.

In [None]:
from intelligence_layer.use_cases import SteerableLongContextSummarize

summarize_task = SteerableLongContextSummarize()

Ok, now let's find some longer text to summarize.

Lately, I've been into wooden skyscrapers, so let's hit up Wikipedia.

In [None]:
from intelligence_layer.use_cases import LongContextSummarizeInput

text = """Plyscraper

A plyscraper, or timber tower is a skyscraper made (at least partly) of wood. They may alternatively be known as mass timber buildings.

There are four main types of engineered wood used for mass timber including cross-laminated timber (CLT), glued laminated timber (glulam), laminated strand lumber (LSL), and laminated veneer lumber (LVL). Of these three wood systems, CLT is the most commonly used.[1]

When other materials, such as concrete or steel, are used in conjunction with engineered wood, these plyscrapers are called “hybrids”. For hybrid buildings, there are some approaches to how different materials can be used including the “Cree’s System” which was developed by Cree Buildings, and the “Finding the Forest Through the Trees" (FFTT) construction model” developed by Michael Green. Cree's System combines the use of concrete and wood mainly in its hybrid flooring systems. In some instances, concrete can also be used as a core or for the foundation of a building because wood is too light. The FFTT construction model incorporates a wooden core and wooden floor slabs mixed with steel beams to provide ductility to the building.[1][2]

When considering which engineered wood system to use for a plyscraper the individual benefits of each must be compared. CLT has a high fire resistance due to the fire-resistant adhesive used and the surface char layer that forms when it is exposed to fire. The surface char layer protects the interior of the wood from further damage. Glulam is typically used for columns and beams as an alternative to commonly used steel and concrete.[1][3] This is because it has a greater tensile strength-to-weight ratio than steel and can resist compression better than concrete.  LVL also has the same strength as concrete.[4]  As plyscrapers are made from wood, they sequester carbon during construction and are renewable if the forests that they are sourced from are sustainably managed.[1][3]

Despite these benefits, there are bound to be some drawbacks when using the various engineered woods.  Steel overall has a greater strength and durability for the same sized profile when compared to its wood counterpart.[5] Thus, a building made with steel beams would require smaller beams than the same building constructed with wooden beams.  Walls and columns in the interior spaces of these plyscrapers can get so thick that the size of said interior space gets heavily reduced. This issue however, does not occur within shorter buildings."""
summarize_input = LongContextSummarizeInput(text=text)

To help developers and users alike understand how LLM-based applications work, we offer so-called `Tracer`s.
They can be inserted into any task run and automatically log all underlying processes to enable easier debugging and auditability.
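
To make the idea concrete, here is a toy sketch of what a tracer conceptually does: it records named steps, their inputs and outputs, and any nested sub-steps. `ToySpan` and `toy_summarize` are hypothetical and are not part of the Intelligence Layer; the real `Tracer` classes offer considerably more.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToySpan:
    """One traced step: a name, its logged entries, and nested sub-steps."""

    name: str
    entries: list[tuple[str, Any]] = field(default_factory=list)
    children: list["ToySpan"] = field(default_factory=list)

    def span(self, name: str) -> "ToySpan":
        """Open a nested step and return it so the callee can log into it."""
        child = ToySpan(name)
        self.children.append(child)
        return child

    def log(self, message: str, value: Any) -> None:
        """Record a labelled value on this step."""
        self.entries.append((message, value))


def toy_summarize(text: str, tracer: ToySpan) -> str:
    """Hypothetical task that records its input and output on the tracer."""
    span = tracer.span("toy_summarize")
    span.log("input", text)
    summary = text.split(".")[0] + "."  # naive: keep only the first sentence
    span.log("output", summary)
    return summary


root = ToySpan("root")
result = toy_summarize("Wood is renewable. Steel is strong.", root)
```

After the run, `root.children` holds one span per step, which is what makes it possible to inspect a run after the fact.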

In [None]:
from intelligence_layer.core import InMemoryTracer

tracer = InMemoryTracer()

Finally, let's give our `summarize_input` and `tracer` to our `summarize_task` and run!

In [None]:
from IPython.display import Pretty

output = summarize_task.run(summarize_input, tracer)
Pretty(
    " ".join(partial_summary.summary for partial_summary in output.partial_summaries)
)

Ok, let's have a look at what our tracer picked up along the way...

In [None]:
tracer

Note how the task is broken up into subtasks, such as `ChunkTask` and `SingleChunkFewShotSummarize`.
To work with longer texts, we break them up into smaller sections ("chunks") and summarize each in turn.
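
The chunk-then-summarize pattern can be sketched in a few lines. This is a simplified illustration under stated assumptions: it splits on whitespace-separated words, and `placeholder_summarize` is a hypothetical stand-in for a model call. The library's actual chunking strategy and summarization subtasks may differ.

```python
def chunk_by_words(text: str, max_words: int) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i : i + max_words]) for i in range(0, len(words), max_words)
    ]


def placeholder_summarize(chunk: str) -> str:
    """Hypothetical stand-in for a model call: keeps the first three words."""
    return " ".join(chunk.split()[:3])


def summarize_long_text(text: str, max_words: int) -> list[str]:
    """Summarize each chunk in turn and return the partial summaries in order."""
    return [placeholder_summarize(chunk) for chunk in chunk_by_words(text, max_words)]


sample = "one two three four five six seven eight nine ten"
chunks = chunk_by_words(sample, 4)
partial_summaries = summarize_long_text(sample, 4)
```

The result is a list of partial summaries, one per chunk, mirroring the `partial_summaries` joined together in the cell above.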

With the Intelligence Layer task framework, you can easily run multiple tasks concurrently.
This noticeably reduces the time it takes to obtain results.
Let's have a look at how to do this.

In [None]:
text1 = """The concrete industry is a crucial sector in the global construction market, playing a pivotal role in the development of infrastructure and buildings. Central to its operations is the production of concrete, a composite material consisting primarily of aggregates (like gravel and sand), cement, and water. This industry is characterized by its extensive supply chain, which includes the extraction of raw materials, manufacturing of cement, and the production and delivery of ready-mix concrete. The versatility, strength, and durability of concrete make it a preferred choice for a wide range of construction projects, from residential buildings to large-scale infrastructural developments like bridges, dams, and roads. However, the industry also faces significant challenges, including environmental concerns related to carbon emissions during cement production and the need for sustainable mining practices."""
text2 = """The timber industry revolves around the cultivation, management, harvesting, and processing of trees for the production of wood and wood products. This industry is deeply integrated with forestry management, aiming to balance the economic benefits of timber production with environmental and ecological considerations. Timber is a renewable resource and is used in various applications, including construction, paper manufacturing, and the creation of furniture and other wood products. The industry is known for its sustainable practices, such as replanting and managing forests to ensure a continuous supply of timber while preserving biodiversity. However, it also faces challenges like deforestation, illegal logging, and the need to manage forests in a way that mitigates climate change and preserves habitats. The timber industry's focus on sustainability and renewable resources makes it a key player in the global move towards more environmentally friendly construction and manufacturing practices."""
summarize_inputs = [
    LongContextSummarizeInput(text=text1),
    LongContextSummarizeInput(text=text2),
]

results = summarize_task.run_concurrently(summarize_inputs, tracer)

In [None]:
# for the first text about concrete...
Pretty(
    " ".join(
        partial_summary.summary for partial_summary in results[0].partial_summaries
    )
)

In [None]:
# for the second text about timber...
Pretty(
    " ".join(
        partial_summary.summary for partial_summary in results[1].partial_summaries
    )
)

Congrats, you now know how a `Task` works and what a `Tracer` is!
In the next notebooks, we'll dive into more detail on other tasks, as well as how to build your own and how to evaluate methodologies.