# Demo: Summarizing PDFs with Map-Reduce Technique

**Objective:**

This notebook demonstrates a workflow that uses the `Map-Reduce` technique to generate coherent summaries of PDF documents.

The workflow consists of three sequential steps:
1. **Load and split PDF**: Load the desired PDF and create text chunks for each page.
2. **Map-Step**: For each page, generate a concise summary (~150 words) using `GPT-3.5-turbo`.
3. **Reduce-Step**: Create a final, consolidated summary using `GPT-4-turbo` from the set of summaries generated in the `Map-Step`.

**Key Advantages:**

This modular approach, which breaks down a complex task into manageable sub-tasks, offers several advantages over directly generating a summary from the whole PDF:
- **Handling of Large Data Volumes**: Splitting a document into pages keeps each request small and allows the pages to be processed in parallel.
- **Cost Savings with Cheaper LLMs**: The intermediate summaries in the `Map-Step` can be generated with a more cost-effective LLM, so the stronger model is only needed for the final `Reduce-Step`.
- **Improved Summary Quality**: Each page is summarized individually, which can yield more nuanced and accurate summaries, and the process can be customized for different use cases.
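
The parallel-processing advantage can be sketched as follows. This is a minimal illustration and not part of the notebook's workflow: `summarize_page` is a hypothetical stand-in for the per-page LLM call, and the pool size of 4 is an arbitrary assumption. Threads are a reasonable fit here because a real request would spend most of its time waiting on the network.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def summarize_page(text: str, max_words: int = 150) -> str:
    """Hypothetical stand-in for an LLM call: keeps the first `max_words` words."""
    return " ".join(text.split()[:max_words])


def map_in_parallel(pages: Dict[int, str], max_workers: int = 4) -> List[str]:
    """Summarize all pages concurrently and return the summaries in page order."""
    page_numbers = sorted(pages)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, regardless of completion order.
        return list(pool.map(lambda number: summarize_page(pages[number]), page_numbers))


demo_pages = {2: "text of the second page", 1: "text of the first page"}
parallel_summaries = map_in_parallel(demo_pages)
```

Because the results come back in page order, the list can be passed to a reduce step unchanged.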

**Disclaimer:**

This notebook is strictly for educational purposes. Users should employ it responsibly, ensuring accuracy, respecting privacy and copyright, and avoiding the dissemination of misinformation. Always consider the ethical implications and context of the generated content.

## Setup: Load dependencies and set parameters

As an initial step, we load the required dependencies and set the parameters for the `Map-Reduce` workflow. In this demo, we summarize the main content of the seminal `"Attention Is All You Need"` paper by Google Research.

In [1]:
import logging
import os
from tqdm.notebook import tqdm

from genaipy.extractors.pdf import extract_pages_text
from genaipy.openai_apis.chat import get_chat_response
from genaipy.prompts.build_prompt import build_prompt
from genaipy.prompts.generate_summaries import SUMMARY_PROMPT_TPL, REDUCE_SUMMARY_PROMPT_TPL

In [2]:
os.chdir('d:/genai/GenAIPy/') # Change directory to root

# Logging settings
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# PDF settings
PDF_URL = "demos/data/transformers_attention_paper.pdf" # https://arxiv.org/abs/1706.03762
START = 1
END = 10 # exclude references and appendix

# LLM settings
MAP_LLM = "gpt-3.5-turbo"
REDUCE_LLM = "gpt-4-1106-preview"
SYS_MESSAGE = "You are a generative AI expert. You explain complex AI concepts in simple language so regular users can understand your answers."
MAP_MAX_WORDS = 150
REDUCE_MAX_WORDS = 300

## Step [1/3]: Load and split PDF

In the first step, we load the PDF article and split its text content into one chunk per page. This enables us to generate an individual summary per page in the following `Map-Step`.

In [3]:
pages = extract_pages_text(pdf_path=PDF_URL, start_page=START, end_page=END)
print(f"Successfully loaded text from {len(pages)} PDF pages.")

Successfully loaded text from 10 PDF pages.
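
The Map-Step indexes `pages` by page number and reads a `"content"` field, which suggests a dictionary of per-page records. The sketch below builds that shape from a plain list of page strings. The helper `to_page_chunks` is hypothetical, and whether `extract_pages_text` returns exactly this structure is an assumption inferred from how it is used in this notebook.

```python
from typing import Dict, List, Optional


def to_page_chunks(raw_pages: List[str], start_page: int = 1,
                   end_page: Optional[int] = None) -> Dict[int, Dict[str, str]]:
    """Hypothetical helper: map 1-based page numbers to {"content": text} records."""
    last_page = len(raw_pages) if end_page is None else min(end_page, len(raw_pages))
    return {
        number: {"content": raw_pages[number - 1].strip()}
        for number in range(start_page, last_page + 1)  # end_page is inclusive
    }


demo_chunks = to_page_chunks(["Abstract ... ", "Introduction ...", "References ..."], end_page=2)
```

Treating `end_page` as inclusive matches the run above, where `START = 1` and `END = 10` yield 10 pages.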


## Step [2/3]: Map-Step

In the second step, we execute the mapping loop of the `Map-Reduce` technique. Specifically, we generate a summary of each page's text content with `GPT-3.5-turbo` and store the resulting intermediate summaries in a list.

In [4]:
map_summaries = []
for page in tqdm(pages, desc="Generating Map Summaries"):
    try:
        map_prompt = build_prompt(template=SUMMARY_PROMPT_TPL, text=pages[page]["content"], max_words=MAP_MAX_WORDS)
        summary = get_chat_response(map_prompt, sys_message=SYS_MESSAGE, model=MAP_LLM)
        map_summaries.append(summary)
        logging.info("Map Summary #%d: %s", page, summary)
    except Exception as e:
        # Log the exception itself; `summary` may be undefined or stale if the call failed
        logging.error("An error occurred while generating summary #%d: %s", page, e)


Generating Map Summaries:   0%|          | 0/10 [00:00<?, ?it/s]

2023-12-13 12:17:25,163 - INFO - Successfully completed Chat API request. Total token usage: 906
2023-12-13 12:17:25,163 - INFO - Map Summary #1: The text is discussing a new network architecture called the Transformer, which is based solely on attention mechanisms and does not use recurrent or convolutional neural networks. The authors conducted experiments on machine translation tasks and found that the Transformer models outperformed existing models in terms of quality, training time, and parallelizability. The model achieved a BLEU score of 28.4 on the English-to-German translation task and a state-of-the-art BLEU score of 41.8 on the English-to-French translation task. The authors also demonstrated that the Transformer performed well on other tasks, such as English constituency parsing, with both large and limited training data. The paper was published in the 31st Conference on Neural Information Processing Systems (NIPS 2017).
2023-12-13 12:17:41,194 - INFO - Successfully complet
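
To make the prompt construction in the mapping loop more concrete, here is a rough sketch of template filling. The template text and the helper `fill_prompt` are illustrative assumptions; the actual `SUMMARY_PROMPT_TPL` and `build_prompt` in `genaipy` may differ.

```python
# Hypothetical template; not the actual SUMMARY_PROMPT_TPL from genaipy.
DEMO_SUMMARY_TPL = (
    "Summarize the following text in at most {max_words} words.\n\n"
    "Text:\n{text}"
)


def fill_prompt(template: str, **fields) -> str:
    """Fill named placeholders in a template, failing loudly if one is missing."""
    try:
        return template.format(**fields)
    except KeyError as missing:
        raise ValueError(f"Missing template field: {missing}") from missing


demo_prompt = fill_prompt(
    DEMO_SUMMARY_TPL,
    text="The Transformer relies on attention.",
    max_words=150,
)
```

Keeping the word limit as a template field lets the same mechanism serve both steps, with only the template and the limit changing between `Map-Step` and `Reduce-Step`.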

## Step [3/3]: Reduce-Step

In the last step of the workflow, we combine the individual intermediate summaries and distill them into a cohesive summary in the `Reduce-Step` with `GPT-4-turbo`. As the output shows, the resulting final summary gives a structured overview of the paper's main ideas.

In [5]:
# Process generated map summaries for reduce step

joined_summaries = "\n".join(map_summaries)
text = joined_summaries.replace("\n\n", "\n") # collapse double newlines so text on either side is not glued together
print(text)

The text is discussing a new network architecture called the Transformer, which is based solely on attention mechanisms and does not use recurrent or convolutional neural networks. The authors conducted experiments on machine translation tasks and found that the Transformer models outperformed existing models in terms of quality, training time, and parallelizability. The model achieved a BLEU score of 28.4 on the English-to-German translation task and a state-of-the-art BLEU score of 41.8 on the English-to-French translation task. The authors also demonstrated that the Transformer performed well on other tasks, such as English constituency parsing, with both large and limited training data. The paper was published in the 31st Conference on Neural Information Processing Systems (NIPS 2017).
The text discusses the concept of the Transformer model architecture, which uses attention mechanisms instead of recurrent networks or convolutions. The Transformer model allows for more parallelizat
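
A slightly more defensive way to combine the intermediate summaries is sketched below: it drops empty or whitespace-only entries and collapses whitespace inside each summary so that every summary occupies exactly one line. The helper `combine_summaries` is a hypothetical illustration, not part of `genaipy`.

```python
from typing import Iterable


def combine_summaries(summaries: Iterable[str]) -> str:
    """Join summaries one per line, normalizing whitespace and dropping empty ones."""
    cleaned = (" ".join(summary.split()) for summary in summaries)
    return "\n".join(line for line in cleaned if line)


demo_text = combine_summaries(
    ["First summary.\n\nSecond paragraph.", "   ", "Next  page summary."]
)
```

Splitting on whitespace and rejoining with single spaces avoids the case where removing a paragraph break leaves two sentences without a separator.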

In [6]:
# Execute reduce step to generate final summary

try:
    reduce_prompt = build_prompt(template=REDUCE_SUMMARY_PROMPT_TPL, text=text, max_words=REDUCE_MAX_WORDS)
    final_summary = get_chat_response(prompt=reduce_prompt, sys_message=SYS_MESSAGE, model=REDUCE_LLM, max_tokens=1024)
    logging.info("Final Reduce Summary:\n%s", final_summary)
except Exception as e:
    logging.error("An error occurred while generating final summary: %s", e)
else:
    output = {"final_summary": final_summary, "map_summaries": map_summaries}

2023-12-13 12:19:25,903 - INFO - Successfully completed Chat API request. Total token usage: 1919
2023-12-13 12:19:25,904 - INFO - Final Reduce Summary:
**Overview of the Transformer Model**

- The Transformer is a novel AI model created for tasks like language translation.
- It is built around attention mechanisms instead of previous methods like recurrent networks or convolutions.
- Its architecture consists of an encoder and a decoder, each with multiple layers that facilitate parallel processing and reduce training time.
  
**Key Features and Mechanisms**

- **Self-Attention Layers**: These layers allow each position to attend to all positions in the previous layer and help the model to learn long-range dependencies efficiently.
- **Encoder and Decoder**: The encoder processes the input sequence and the decoder generates the translated output, both leveraging self-attention and feed-forward networks.
- **Attention Types**:
  - *Scaled Dot-Product Attention*: This calculates the rel