In [1]:
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Text Summarization of Large Documents


<table align="left">

  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/language/examples/reference-architectures/text_summarization_for_large_documents.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/generative-ai/blob/main/language/examples/reference-architectures/text_summarization_for_large_documents.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/blob/main/language/examples/reference-architectures/text_summarization_for_large_documents.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>

## Overview

Text summarization is the process of creating a shorter version of a text document while still preserving the most important information. This can be useful for a variety of purposes, such as quickly skimming a long document, getting the gist of an article, or sharing a summary with others.

Although summarizing a short paragraph is a non-trivial task, there are a few challenges to overcome if you want to summarize a large document, such as a PDF file with multiple pages. In this notebook, you will go through a few examples of how you can use generative models to summarize large documents.

### Objective

In this tutorial, you will learn how to use generative models to summarize information from text by working through the following examples:

- Stuffing method
- MapReduce method
- MapReduce with Overlapping Chunks method
- MapReduce with Rolling Summary method

### Costs

This tutorial uses billable components of Google Cloud:
- Vertex AI Generative AI Studio

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing), [Generative AI pricing](https://cloud.google.com/vertex-ai/pricing#generative_ai_models), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.

## Getting Started

### Install Vertex AI SDK, other packages and their dependencies

In [2]:
!pip install google-cloud-aiplatform google-cloud-translate PyPDF2 ratelimit backoff --upgrade --quiet --user

**Colab only**: Uncomment the following cell to restart the kernel. For Vertex AI Workbench you can restart the terminal using the button on top.

In [3]:
# # Automatically restart kernel after installs so that your environment can access the new packages
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

### Authenticating your notebook environment
* If you are using **Colab** to run this notebook, uncomment the cell below and continue.
* If you are using **Vertex AI Workbench**, check out the setup instructions [here](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/setup-env).

In [4]:
# from google.colab import auth
# auth.authenticate_user()

### Import libraries

**Colab only:** Uncomment the following cell to initialize the Vertex AI SDK. For Vertex AI Workbench, you don't need to run this.

In [5]:
# import vertexai

# PROJECT_ID = "[your-project-id]"  # @param {type:"string"}
# vertexai.init(project=PROJECT_ID, location="us-central1")

In [6]:
import re
import urllib
import warnings
from pathlib import Path

import backoff
import pandas as pd
import PyPDF2
import ratelimit
from google.api_core import exceptions
from tqdm import tqdm
from vertexai.preview.language_models import TextGenerationModel

warnings.filterwarnings("ignore")

### Import models

Here you load the pre-trained text generation model called `text-bison@001`.


In [7]:
generation_model = TextGenerationModel.from_pretrained("text-bison@001")

### Create the translation wrapper function

In [8]:
from google.cloud import translate

project_id = !gcloud config list project
project_id = project_id[1].split('=')[1].strip()
parent = f'projects/' + project_id


def traduza(texto, idioma_destino):
    client = translate.TranslationServiceClient()

    response = client.translate_text(
        parent=parent,
        contents=[texto],
        target_language_code=idioma_destino,
        mime_type="text/plain"
    )

    return response.translations[0].translated_text

### Preparing data files

To begin, you will need to download a pdf file for the summarizing tasks below.

In [9]:
# Define a folder to store the files
data_folder = "data"
Path(data_folder).mkdir(parents=True, exist_ok=True)

# Define a pdf link to download and place to store the download file
pdf_url = "https://www.ic.unicamp.br/~reltech/PFG/2021/PFG-21-52.pdf"
pdf_file = Path(data_folder, pdf_url.split("/")[-1])

# Download the file using `urllib` library
urllib.request.urlretrieve(pdf_url, pdf_file)

(PosixPath('data/PFG-21-52.pdf'), <http.client.HTTPMessage at 0x7fe90421b210>)

Here you will take a peak at a few pages of the downloaded pdf file

In [10]:
# Read the PDF file and create a list of pages
reader = PyPDF2.PdfReader(pdf_file)
pages = reader.pages

# Print three pages from the pdf
for i in range(3):
    text = pages[i].extract_text().strip()
    print(f"Page {i}: {text} \n\n")

Page 0: UNIVERSIDADE ESTADUAL DE CAMPINAS
INSTITUTO DE COMPUTAÇÃOCategorização dos Desafios de
Segurança em Nuvem
relacionados à tecnologia de
Virtualização
M. G. Andrietta P. L. Geus
Relatório Técnico - IC-PFG-21-52
Projeto Final de Graduação
2021 - Dezembro
The contents of this report are the sole responsibility of the authors.
O conteúdo deste relatório é de única responsabilidade dos autores. 


Page 1: 1 Categoriza ção dos Desafios de Segurança em Nuvem relacionados à 
tecnologia de Virtualização  
Murilo Guidetti Andrietta1, Paulo Lício de Geus1 
1 Instituto de Computação Universidade Estadual de Campinas (UNICAMP), Caixa Postal 6176  
13083 -970 Campinas -SP, Brasil  
m147472@dac.unicamp.br  e pgeus@unicamp.br  
 
Resumo.  Ao longo dos últimos anos, organizações de variados ramos industriais têm aderido 
ao uso da Nuvem Computacional, pois os benéficos dessa arquitetura são cada vez mais 
evidentes. A tecnologia d e Virtualização funciona como base para o sucesso desse ambiente 

## Method 1: Stuffing

The simplest way to pass data to a language model is to "stuff" it all into the prompt as context. This means simply including all of the relevant information in the prompt, in the order that you want the model to process it.

Here you will extract the text from all the pages in the pdf file.

In [11]:
# Read the PDF file and create a list of pages
reader = PyPDF2.PdfReader(pdf_file)
pages = reader.pages

# Entry string to concatenate all the extacted texts
concatenated_text = ""

# Loop through the pages
for page in tqdm(pages):

    # Extract the text from the page and remove any leading or trailing whitespace
    text = page.extract_text().strip()

    # Concate the extracted text to the concatenated text
    concatenated_text += text

print(f"There are {len(concatenated_text)} characters in the pdf")

100%|██████████| 25/25 [00:01<00:00, 17.92it/s]

There are 39231 characters in the pdf





You will now create a prompt template that can be used later in the notebook.

In [12]:
prompt_template = traduza("""
    Escreva um sumário conciso do texto abaixo delimitado por três aspas invertidas.
    Retorne sua resposta em bullets que cubram os pontos principais do texto.

    ```{text}```

    SUMARIO EM BULLETS:
""", "en")

Here you will use LLM via the API to summarize the extracted texts. Please note that LLMs currently have input text limit and stuffing a large input text might not be accepted. You can read more about quotas and limits [here](https://cloud.google.com/vertex-ai/docs/quotas).

The following code will cause **an exception**!

In [13]:
# Define the prompt using the prompt template
prompt = prompt_template.format(text=traduza(concatenated_text, "en"))

# Use the model to summarize the text using the prompt
summary = traduza(generation_model.predict(prompt=prompt, max_output_tokens=1024).text, "pt")

print(summary)

InvalidArgument: 400 Text is too long. [field_violations {
  field: "contents"
  description: "The total codepoints in the request must be less than 30720, actual: 39231"
}
]

#### Retrying

The model responded with an error message: **400 Request contains an invalid argument** because the extracted text is too long for the generative model to process.

To avoid this issue, you will only input a chunk of the extracted text (e.g. the first 30,000 words).

In [14]:
# Define the prompt using the prompt template
prompt = prompt_template.format(text=traduza(concatenated_text[:30000], "en"))

# Use the model to summarize the text using the prompt
summary = traduza(generation_model.predict(prompt=prompt, max_output_tokens=1024).text, "pt")

print(summary)

- Cloud Computing é uma tecnologia que está crescendo em popularidade.
- A virtualização é uma tecnologia chave para Cloud Computing.
- A tecnologia de virtualização cria uma série de desafios de segurança.
- Não há uma maneira padrão de categorizar os desafios de segurança relacionados à tecnologia de virtualização.
- Este relatório propõe uma nova maneira de categorizar os desafios de segurança relacionados à tecnologia de virtualização.
- A categorização proposta baseia-se nos três pilares da segurança informática: disponibilidade, confidencialidade e integridade.
- A categorização proposta mapeia os desafios de segurança para elementos de uma arquitetura tradicional de Cloud Computing.
- A categorização proposta é simples, direta, sistemática e completa.


### Recap

Although full text is too large for the model, you have managed to create a concise, bulleted list of the most important information from a portion of the PDF using the model. Thus, here are the pros and cons of using the stuffing method:

**Pros:**
- Only required a single call to the model.
- When summarizing text, the model has access to all the data at once so that the result may be better.

**Cons:**
- Most models have a context length, and for large documents (or many documents) this will not work as it will result in a prompt larger than the context length.
- This method only works on smaller pieces of data and not suitable to large documents most of the time.

In the following session, you will explore approaches which designed to help deal with having longer text than context lengh limit of LLMs.

### Adding rate limit to model calls

When you use MapReduce or other similar methods, you will be making multiple API calls to the model in a short period of time. There is a limit on the number of API calls you can make per minute, so you will need to add a safety measure to your code to prevent exceeding the limit. This will help to ensure that your code runs smoothly and does not encounter any errors.

For this method, here are a few specific things that you will do:
1. You will make use of a Python library called [ratelimit](https://pypi.org/project/ratelimit/) to limit the number of API calls per minute
2. You will make use of a Python library called [backoff](https://pypi.org/project/backoff/) to retry until the maximum time limit has reached

The following function improves the API call process by limiting the number of calls to **20 per minute**. It also back offs and retries calling the API after encountering **Resource Exhausted** exception. The wait duration grows **exponentially until the 5-minute mark**, and then the function will give up on retrying.

In [15]:
CALL_LIMIT = 20  # Number of calls to allow within a period
ONE_MINUTE = 60  # One minute in seconds
FIVE_MINUTE = 5 * ONE_MINUTE

# A function to print a message when the function is retrying
def backoff_hdlr(details):
    print(
        "Backing off {} seconds after {} tries".format(
            details["wait"], details["tries"]
        )
    )


@backoff.on_exception(  # Retry with exponential backoff strategy when exceptions occur
    backoff.expo,
    (
        exceptions.ResourceExhausted,
        ratelimit.RateLimitException,
    ),  # Exceptions to retry on
    max_time=FIVE_MINUTE,
    on_backoff=backoff_hdlr,  # Function to call when retrying
)
@ratelimit.limits(  # Limit the number of calls to the model per minute
    calls=CALL_LIMIT, period=ONE_MINUTE
)

# This function will call the `generation_model.predict` function, but it will retry if defined exceptions occur.
def model_with_limit_and_backoff(**kwargs):
    return generation_model.predict(**kwargs)

## Method 2: MapReduce

This method works by first splitting the large data into chunks, then running a prompt on each chunk of text. For summarization tasks, the output from the initial prompt would be a summary of that chunk. Once all the initial outputs have been generated, a different prompt is run to combine them.

This method is a bit more complex than the first method, but it can be more effective for large datasets. Here you will prepare two prompt templates: one for the initial summary step and another for the final combine step. You will be using these two templates later in this notebook.

In [16]:
initial_prompt_template = traduza("""
    Escreva um sumário conciso sobre o texto abaixo delimitado por três aspas invertidas.

    ```{text}```

    SUMARIO CONCISO:
""", "en")

final_prompt_template = traduza("""
    Escreva um cenário conciso do texto abaixo delimitdo por aspas invertidas.
    Retorne sua resposta em bullets que cubram os pontos principais do texto.

    ```{text}```

    SUMARIO EM BULLETS:
""", "en")

#### Map step

In this section, you will read the PDF file again and use the model to summarize each page individually using the initial prompt template.

In [17]:
# Read the PDF file and create a list of pages
reader = PyPDF2.PdfReader(pdf_file)
pages = reader.pages

# Create an empty list to store the summaries
initial_summary = []

# Iterate over the pages and generate a summary for each page
for page in tqdm(pages):

    # Extract the text from the page and remove any leading or trailing whitespace
    text = traduza(page.extract_text().strip(), "en")

    # Create a prompt for the model using the extracted text and a prompt template
    prompt = initial_prompt_template.format(text=text)

    # Generate a summary using the model and the prompt
    summary = model_with_limit_and_backoff(prompt=prompt, max_output_tokens=1024).text

    # Append the summary to the list of summaries
    initial_summary.append(summary)

 80%|████████  | 20/25 [00:41<00:10,  2.12s/it]

Backing off 0.40881491390959934 seconds after 1 tries
Backing off 0.09398439018053617 seconds after 2 tries
Backing off 1.2653382979372685 seconds after 3 tries
Backing off 5.524524880784982 seconds after 4 tries
Backing off 6.075871890268356 seconds after 5 tries
Backing off 27.36901061349193 seconds after 6 tries


100%|██████████| 25/25 [01:33<00:00,  3.73s/it]


Take a look at the first few summaries of from the initial Map phrase.

In [18]:
print(traduza("\n\n".join(initial_summary[:10]), "pt"))

```
UNIVERSIDADE ESTADUAL DE CAMPINAS
INSTITUTO DE COMPUTAÇÃO
Categorização de Desafios
Segurança na Nuvem
relacionado a tecnologia
virtualização
MG Andrietta P. L. Geus
Relatório Técnico - IC-PFG-21-52
Projeto Final de Graduação
2021 - dezembro
O conteúdo deste relatório é de responsabilidade exclusiva dos autores.
O conteúdo deste relatório é de responsabilidade exclusiva dos autores.
```


A computação em nuvem é um campo em rápido crescimento, e a tecnologia de virtualização é essencial para seu sucesso. No entanto, a segurança ainda é uma grande preocupação para muitas organizações que estão pensando em migrar para a nuvem. Este artigo propõe uma nova maneira de categorizar os desafios de segurança na nuvem, especificamente aqueles relacionados à virtualização.

Nos últimos anos, a computação em nuvem tornou-se cada vez mais popular.
De fato, 77% das empresas do setor de tecnologia já possuem alguma parte de sua infraestrutura instalada com um provedor de nuvem.
Algumas das princi

Here you will count the number of characters in the initial summary to see if they are small enough to fit in a prompt.

In [19]:
len("\n".join(initial_summary))

10280

As you managed to input 30,000 characters in a prompt previously, you can input this whole summary which has fewer characters to a prompt directly too. You will do that in the next step.

#### Reduce step

Here you will create a reduce function that concatenate the summaries from the inital summarization step (Map step) and use the final prompt template to summarize the summaries again.

In [20]:
# Define a function to create a summary of the summaries
def reduce(summary, prompt_template):

    # Concatenate the summaries from the inital step
    concat_summary = "\n".join(initial_summary)

    # Create a prompt for the model using the concatenated text and a prompt template
    prompt = prompt_template.format(text=concat_summary)

    # Generate a summary using the model and the prompt
    summary = model_with_limit_and_backoff(prompt=prompt, max_output_tokens=1024).text

    return summary

You are ready to proceed on to the next step to combine all the summary into an even smaller summary using the final prompt template and the function that you created earlier.

In [21]:
# Use defined `reduce` function to summarize the summaries
summary = reduce(initial_summary, final_prompt_template)

print(traduza(summary, "pt"))

- A computação em nuvem é um campo em rápido crescimento e a tecnologia de virtualização é essencial para seu sucesso.
- A virtualização é uma tecnologia que permite que vários sistemas operacionais sejam executados no mesmo hardware físico.
- Os provedores de nuvem estão interessados ​​em adotar a virtualização porque ela permite que eles ofereçam serviços minimamente coesos, isolados e encapsulados em um ambiente multilocatário.
- A virtualização é uma tecnologia fundamental para a computação em nuvem, mas também há questões de segurança que precisam ser abordadas.
- Muitos pesquisadores que se concentram nos desafios de segurança na nuvem não classificaram as ameaças estudadas de forma padronizada e sistemática.
- Este artigo propõe uma categorização dos desafios de segurança na computação em nuvem, especificamente relacionados à tecnologia de virtualização.
- A classificação é simples, direta, sistemática e completa.
- A categorização proposta foi elaborada com base em um conjunto 

#### Recap

You just summarized the whole paper into a few bullet points using the MapReduce method. Here are the pros and cons of using such method:

**Pros:**
- Can summarize a large document
- Can work well with parallel processing as the processes to summarize pages are independent to each other

**Cons:**
- Multiple calls to the model is needed
- As the pages are summarized individually, the context between the pages could be loss


In the next section, you will try another method which makes use of more than one chunk (page) per prompt to summarize.

## Method 3: MapReduce with Overlapping Chunks

It is similar to MapReduce, but with one key difference: overlapping chunks. This means that a few pages will be summarized together, rather than each page being summarized separately. This helps to preserve more context or information between chunks, which can improve the accuracy of the results.

It is important to note that combining chunks may sometimes exceed the token limit imposed by the model. If this occurs, you can either implement the chunk splitting method showor creatively solve the issue (e.g. removing a few initial chunks).

#### Map step

In this section, you will read the PDF file again and use the model to summarize <b>a few pages</b> together using the initial prompt template that you defined earlier.

In [22]:
# Read the PDF file and create a list of pages
reader = PyPDF2.PdfReader(pdf_file)
pages = reader.pages

# Create an empty list to store the extracted text from the pages
text_from_pages = []

# Iterate over the pages and generate a summary for each page
for page in tqdm(pages):

    # Extract the text from the page and remove any leading or trailing whitespace
    text = traduza(page.extract_text().strip(), "en")

    # Append the extracted text to the list of extracted text
    text_from_pages.append(text)

100%|██████████| 25/25 [00:27<00:00,  1.10s/it]


Here you will define the chunk size (number of pages to combine in this example) and summarize the chunks.

In [23]:
CHUNK_SIZE = 2  # number of overlapping pages

# Read the PDF file and create a list of pages
reader = PyPDF2.PdfReader(pdf_file)
pages = reader.pages

# Create an empty list to store the summaries
initial_summary = []

# Iterate over the pages and generate a summary for a few pages as one chunk based on `CHUNK_SIZE`
for i in tqdm(range(len(pages))):

    # Select a list of pages to merge as one chunk
    pages_to_merge = [x for x in range(i, i + CHUNK_SIZE) if x < len(pages)]

    extracted_texts = [text_from_pages[x] for x in pages_to_merge]

    # Concatenate the
    text = "\n".join(extracted_texts)

    # Create a prompt for the model using the concatenated text and a prompt template
    prompt = initial_prompt_template.format(text=text)

    # Generate a summary using the model and the prompt
    summary = model_with_limit_and_backoff(prompt=prompt, max_output_tokens=1024).text

    # Append the summary to the list of summaries
    initial_summary.append(summary)

    # If the last page is reached, break the loop
    if pages_to_merge[-1] == len(reader.pages):
        break

 56%|█████▌    | 14/25 [00:12<00:09,  1.17it/s]

Backing off 0.6760637274195608 seconds after 1 tries
Backing off 1.096445949176078 seconds after 2 tries
Backing off 3.549740242112873 seconds after 3 tries


100%|██████████| 25/25 [00:29<00:00,  1.18s/it]


Take a look at the first few summaries of from the initial Map phrase.

In [24]:
print(traduza("\n\n".join(initial_summary[:10]), "pt"))

Este relatório propõe uma nova forma de categorizar os desafios de segurança na Nuvem, especificamente relacionados à Virtualização, de forma simples e sistemática.

A computação em nuvem é uma arquitetura popular que oferece muitos benefícios, mas a segurança ainda é uma preocupação. Este artigo propõe uma nova maneira de categorizar os desafios de segurança na nuvem relacionados à virtualização.

Nos últimos anos, a computação em nuvem tornou-se cada vez mais popular.
Empresas em vários setores estão lutando para justificar a propriedade de seu próprio hardware, enquanto a computação em nuvem continua a crescer como uma solução para atender às suas necessidades computacionais.
Algumas das principais plataformas que oferecem serviços Cloud são Microsoft Azure, Amazon AWS, Google Cloud e IBM Cloud.
Em termos de conceituação básica, Ali, Khan e Vasilakos utilizaram o National Institute of Standards and Technology (NIST) para trazer a Figura 2, abaixo, e resumir as principais característ

#### Reduce step

You are ready to proceed on to the next step to combine all the summary into an even smaller summary using the final prompt template and the function that you created earlier.

In [25]:
# Use defined `reduce` function to summarize the summaries
summary = reduce(initial_summary, final_prompt_template)

print(traduza(summary, "pt"))

- A computação em nuvem é uma tecnologia em franca ascensão.
 - Existem três modelos de serviço disponíveis no momento da compra: IaaS, PaaS e SaaS.
 - São quatro modelos de implantação: Público, Privado, Comunitário e Híbrido.
 - A virtualização é uma tecnologia que possibilita a existência de toda a arquitetura Cloud.
 - Possui inúmeras vantagens tanto para os usuários quanto para os prestadores de serviços.
 - Virtualização é o uso de recursos computacionais para imitar outros recursos computacionais ou um computador inteiro.
 - Possui inúmeras vantagens tanto para os usuários quanto para os provedores de serviços, como a possibilidade de rodar vários sistemas operacionais no mesmo hardware físico, melhor gerenciamento de energia, melhor aproveitamento dos recursos, possibilidade de balanceamento de carga, manutenção, aumento da disponibilidade dos sistemas e isolamento.
 - A virtualização pode estar presente em praticamente todos os recursos computacionais. O tipo de virtualização 

#### Recap

The model was able to summary the whole paper into a few bullet points using the MapReduce with Overlapping Chunks method. Here are the pros and cons of using such method:

**Pros:**
- Can summarize a large document
- As the sequential pages are summarized together, the context between the pages are preserved
- Can use parallel processing as the results are independent to each other

**Cons:**
- Multiple calls to the model is needed
- Slightly slower than pure MapReduce method
- Create larger input text


In the next section, you will try a different approach that make use of a summary from the previous page instead of the entire text.

## Method 4: MapReduce with Rolling Summary (Refine)

On some occasions, combining a few pages might be too large to summarize. To resolve that issue, you will now a different approach that uses an initial summary from the previous step along with the next page to summarize each prompt. This helps to ensure that the summary is complete and accurate, as it takes into account the context of the previous page.

In [26]:
initial_prompt_template = traduza("""
    Levando em consideração o contexto abaixo, delimitado por aspas invertidas triplas:

    ```{context}```

    Escreva um sumário conciso sobre o seguinte texto delimitado por aspas invertidas.

    ```{text}```

    SUMÁRIO CONCISO:
""", "en")

In [27]:
# Read the PDF file and create a list of pages.
reader = PyPDF2.PdfReader(pdf_file)
pages = reader.pages

# Create an empty list to store the summaries.
initial_summary = []

# Iterate over the pages and generate a summary
for idx, page in enumerate(tqdm(pages)):

    # Extract the text from the page and remove any leading or trailing whitespace.
    text = traduza(page.extract_text().strip(), "en")

    if idx == 0:  # if current page is the first page, no previous context
        prompt = initial_prompt_template.format(context="", text=text)

    else:  # if current page is not the first page, previous context is the summary of the previous page
        prompt = initial_prompt_template.format(
            context=initial_summary[idx - 1], text=text
        )

    # Generate a summary using the model and the prompt
    summary = model_with_limit_and_backoff(prompt=prompt, max_output_tokens=1024).text

    # Append the summary to the list of summaries
    initial_summary.append(summary)

 32%|███▏      | 8/25 [00:17<00:40,  2.40s/it]

Backing off 0.13016626116645247 seconds after 1 tries
Backing off 1.5822510633823128 seconds after 2 tries
Backing off 0.07123886737655605 seconds after 3 tries
Backing off 3.1590057554721716 seconds after 4 tries
Backing off 12.627177942862355 seconds after 5 tries
Backing off 23.72472707586244 seconds after 6 tries


100%|██████████| 25/25 [01:32<00:00,  3.70s/it]


Here you will list out a few entries from the initial summary list.

In [28]:
print(traduza("\n\n".join(initial_summary[:10]), "pt"))

Este é um relatório técnico sobre a categorização dos desafios relacionados à segurança e virtualização da nuvem.

Este relatório técnico propõe uma nova maneira de categorizar os desafios de segurança na nuvem, especificamente relacionados à virtualização.

A computação em nuvem é uma maneira popular de atender às necessidades computacionais. As empresas que usam serviços de nuvem obtêm benefícios como redução de custos, maior agilidade e maior escalabilidade. Algumas das principais plataformas que oferecem serviços em nuvem são Microsoft Azure, Amazon AWS, Google Cloud e IBM Cloud.

A computação em nuvem é uma maneira popular de atender às necessidades computacionais. As empresas que usam serviços de nuvem obtêm benefícios como redução de custos, maior agilidade e maior escalabilidade. Algumas das principais plataformas que oferecem serviços em nuvem são Microsoft Azure, Amazon AWS, Google Cloud e IBM Cloud.

A computação em nuvem possui cinco características essenciais: amplo acesso

It is expected that there will be a few duplicate entries in the list, as you are rolling in context from previous pages to the next. You can easily remove these duplicates by using the set function.

#### Reduce step
You are ready to proceed on to the next step to combine all the summary into an even smaller summary using the final prompt template and the function that you created earlier.

In [29]:
# Use defined `reduce` function to summarize the summaries
initial_summary = set(initial_summary)  # set() function removes duplicate items
summary = reduce(initial_summary, final_prompt_template)

print(traduza(summary, "pt"))

- A computação em nuvem é uma maneira popular de atender às necessidades computacionais.
- As empresas que usam serviços em nuvem obtêm benefícios como redução de custos, maior agilidade e maior escalabilidade.
- Algumas das principais plataformas que oferecem serviços em nuvem são Microsoft Azure, Amazon AWS, Google Cloud e IBM Cloud.
- A computação em nuvem possui cinco características essenciais: amplo acesso à rede, elasticidade, possibilidade de mensuração do uso de recursos, oferta sob demanda e arquitetura para oferta de recursos e serviços multilocatários.
- A virtualização é uma tecnologia que permite que vários sistemas operacionais sejam executados no mesmo hardware físico.
- Os provedores de nuvem usam a virtualização para oferecer soluções escaláveis, elásticas e sob demanda.
- Existem três tipos de virtualização: virtualização completa, para-virtualização e virtualização assistida por hardware.
- Existem muitas vantagens em usar a virtualização, mas também existem algumas

#### Recap

The model was able to summarize the whole paper into a few bullet points using the MapReduce with Rolling Summary method. Here are the pros and cons of using such method:

**Pros:**
- Can summarize a large document
- As the sequential pages are summarized using the context from previous pages, the context between the pages are preserved

**Cons:**
- Multiple calls to the model is needed
- Cannot work well with parallel processing as the processes to summarize pages are dependent to each other

## Conclusion

You have successfully summarized a long document, even though it was initially impossible due to an input prompt limit. You have also learned several methods for summarizing long documents, along with their advantages and disadvantages.

Summarizing a long document can be challenging. It requires you to identify the main points of the document, synthesize the information, and present it in a concise and coherent way. This can be especially difficult if the document is complex or technical. Additionally, summarizing a long document can be time-consuming, as you need to carefully read and analyze the text to ensure that the summary is accurate and complete.

While these methods allow you to interact with LLMs and summarize long documents in a flexible way, you may sometimes want to speed up the process by using bootstrapping or pre-built methods. This is where libraries like LangChain come in. You can read more about LangChain support on Vertex AI [here](https://python.langchain.com/en/latest/modules/models/llms/integrations/google_vertex_ai_palm.html).