# Summarizing Research Papers with GPT-4.1
## ABB #5 - Session 2

Code authored by: Shaw Talebi

### imports

In [1]:
import fitz  # PyMuPDF
from IPython.display import Markdown, display

from openai import OpenAI
from dotenv import load_dotenv
import os

In [2]:
# load the secret key (OPENAI_API_KEY) from the .env file
load_dotenv()

# connect to openai API
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

### 1) Extract text

In [3]:
filepath = "content/attention-is-all-you-need.pdf"

In [4]:
pdf = fitz.open(filepath)
text = "".join([page.get_text() for page in pdf])
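
Before sending the extracted text to a model, it can help to check roughly how large it is. The sketch below is not part of the original notebook: `rough_token_estimate` and `fits_budget` are hypothetical helpers, and the 4-characters-per-token ratio is only a common rule of thumb for English text, not the model's actual tokenizer. The `max_tokens` budget is a placeholder to be set from the limits of whichever model is used.

```python
def rough_token_estimate(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumption: ~4 characters per token)."""
    return int(len(text) / chars_per_token)


def fits_budget(text: str, max_tokens: int) -> bool:
    """Return True if the rough estimate is within a caller-chosen budget."""
    return rough_token_estimate(text) <= max_tokens


# Stand-in for the `text` variable extracted from the PDF above
sample = "Attention is all you need. " * 100
print(f"characters: {len(sample)}, estimated tokens: {rough_token_estimate(sample)}")
```

In the notebook, the same check could be run on `text` right after extraction, for example `fits_budget(text, max_tokens=...)` with a budget chosen by the reader.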

### 2) Write prompt

In [5]:
prompt = """# Role and Objective

You are a research assistant. Your task is to read technical articles and condense them into concise, clear, and accessible summaries. 

# Instructions

Review the article provided by the user and generate a summary in the following format.

- **Title**: The paper's title.
- **Executive Summary**: The research goal or question being addressed.
- **Key Terms**: Technical terms and concepts central to the article's contents
    - Term 1 = term 1 definition in context of article
    - Term 2 = term 2 definition in context of article
    - ...
- **Key Findings**: Most critical findings and insights of the paper relevant to AI researchers and engineers.
- **Further Reading**: References which are foundational to article's work or dive deeper into key concepts.

# Guidelines

- Write in a neutral and academic tone.
- Use simple, precise language to ensure clarity for a broad audience.
- Keep summaries concise (150-300 words) unless otherwise specified.
- Assume the audience has general technical knowledge but may not be familiar with the specific field of the paper.
- Please limit the bullets under the key terms, key findings, and further reading sections to the 3-5 most essential.

"""

In [6]:
print(prompt)

# Role and Objective

You are a research assistant. Your task is to read technical articles and condense them into concise, clear, and accessible summaries. 

# Instructions

Review the article provided by the user and generate a summary in the following format.

- **Title**: The paper's title.
- **Executive Summary**: The research goal or question being addressed.
- **Key Terms**: Technical terms and concepts central to the article's contents
    - Term 1 = term 1 definition in context of article
    - Term 2 = term 2 definition in context of article
    - ...
- **Key Findings**: Most critical findings and insights of the paper relevant to AI researchers and engineers.
- **Further Reading**: References which are foundational to article's work or dive deeper into key concepts.

# Guidelines

- Write in a neutral and academic tone.
- Use simple, precise language to ensure clarity for a broad audience.
- Keep summaries concise (150-300 words) unless otherwise specified.
- Assume the audi

### 3) Summarize Paper with GPT-4.1

Responses API: https://platform.openai.com/docs/api-reference/responses/create

In [7]:
%%time
# make api call
response = client.responses.create(
    model="gpt-4.1",
    instructions=prompt,
    input=text
)

# extract response
summary = response.output_text
print(summary)

- **Title**: Attention Is All You Need

- **Executive Summary**:  
  This paper introduces the Transformer, a novel neural network architecture for sequence transduction tasks such as machine translation. Unlike previous models, it dispenses completely with recurrence and convolution, instead relying entirely on attention mechanisms—specifically, self-attention and multi-head attention. The Transformer achieves state-of-the-art results in machine translation (notably on WMT 2014 English-to-German and English-to-French) with much greater computational efficiency, parallelizability, and significantly reduced training time compared to prior approaches. The model also generalizes well to other sequence tasks, demonstrated by its strong performance in English constituency parsing.

- **Key Terms**:
    - **Transformer**: A neural network architecture for sequence-to-sequence tasks that employs solely attention mechanisms (no RNNs or CNNs), featuring stacked self-attention and feed-forward l
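
The API call above can also be wrapped in a small function so it is easy to reuse across papers. This is a minimal sketch, not the notebook's own code: `summarize` is a hypothetical helper that makes the same `client.responses.create(...)` call and reads `output_text` as in the cell above. The fake client is only an offline stand-in so the example runs without an API key.

```python
from types import SimpleNamespace


def summarize(client, instructions: str, article_text: str, model: str = "gpt-4.1") -> str:
    """Send the article text with the given instructions and return the summary text."""
    if not article_text.strip():
        # An empty string usually means the PDF extraction step returned nothing
        raise ValueError("article_text is empty; check the text extraction step")
    response = client.responses.create(
        model=model,
        instructions=instructions,
        input=article_text,
    )
    return response.output_text


class _FakeResponses:
    """Offline stand-in (assumption) that mimics the shape of the response object."""

    def create(self, model, instructions, input):
        return SimpleNamespace(output_text=f"[{model}] summary of {len(input)} characters")


fake_client = SimpleNamespace(responses=_FakeResponses())
print(summarize(fake_client, "Summarize the article.", "Some article text."))
```

With the real client from the notebook, the call would be `summary = summarize(client, prompt, text)`.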

### 4) Display Markdown

In [8]:
display(Markdown(summary))

- **Title**: Attention Is All You Need

- **Executive Summary**:  
  This paper introduces the Transformer, a novel neural network architecture for sequence transduction tasks such as machine translation. Unlike previous models, it dispenses completely with recurrence and convolution, instead relying entirely on attention mechanisms—specifically, self-attention and multi-head attention. The Transformer achieves state-of-the-art results in machine translation (notably on WMT 2014 English-to-German and English-to-French) with much greater computational efficiency, parallelizability, and significantly reduced training time compared to prior approaches. The model also generalizes well to other sequence tasks, demonstrated by its strong performance in English constituency parsing.

- **Key Terms**:
    - **Transformer**: A neural network architecture for sequence-to-sequence tasks that employs solely attention mechanisms (no RNNs or CNNs), featuring stacked self-attention and feed-forward layers in both encoder and decoder.
    - **Self-Attention**: An attention mechanism relating different positions of a single input sequence to compute contextual representations, enabling the model to capture dependencies irrespective of their distance in the sequence.
    - **Multi-Head Attention**: Multiple parallel attention mechanisms (heads) allowing the model to jointly focus on information from different subspaces at different positions.
    - **Positional Encoding**: Injects sequence order information into embeddings, typically using sinusoidal functions, since the Transformer has no recurrence or convolution to model order.
    - **Sequence Transduction**: The transformation of an input sequence into an output sequence (e.g., machine translation).

- **Key Findings**:
    - The Transformer outperforms state-of-the-art models (including ensembles) in English-German and English-French machine translation, achieving higher BLEU scores (28.4 and 41.8 respectively) with dramatically lower training costs.
    - The model can be efficiently trained due to its parallelizable architecture, reducing training time from days/weeks to hours.
    - Self-attention enables shorter paths for information flow, making it easier to capture long-range dependencies compared to RNN- or CNN-based models.
    - The architecture generalizes beyond translation, achieving strong results in English constituency parsing with minimal task-specific tuning.
    - Attention mechanisms within the model can be interpreted: different attention heads learn to focus on different linguistic phenomena (such as syntax, co-reference).

- **Further Reading**:
    - [2] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. (Attention mechanism in NMT.)
    - [9] Jonas Gehring et al. Convolutional sequence to sequence learning. (CNN-based sequence models.)
    - [35] Ilya Sutskever, Oriol Vinyals, and Quoc Le. Sequence to sequence learning with neural networks. (RNN encoder-decoder models.)
    - [20] Diederik Kingma and Jimmy Ba. Adam: A method for stochastic optimization. (Optimization algorithm.)
    - [38] Yonghui Wu et al. Google's neural machine translation system: Bridging the gap between human and machine translation. (Large-scale NMT.)
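
The prompt asks for summaries of 150-300 words, so a quick word count is one way to check whether a response respects that guideline. The sketch below is an illustration only: `word_count` and `within_word_limit` are hypothetical helpers, and the regular expression is a simple approximation that ignores markdown symbols and treats hyphenated terms as one word.

```python
import re

# A "word" here is a run of letters/digits, optionally joined by hyphens or apostrophes
WORD_PATTERN = re.compile(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*")


def word_count(markdown_text: str) -> int:
    """Count words in a markdown string, skipping bullets and emphasis markers."""
    return len(WORD_PATTERN.findall(markdown_text))


def within_word_limit(markdown_text: str, low: int = 150, high: int = 300) -> bool:
    """Check the count against the 150-300 word guideline from the prompt."""
    return low <= word_count(markdown_text) <= high


example = "- **Title**: Attention Is All You Need"
print(word_count(example))
```

In the notebook, `word_count(summary)` and `within_word_limit(summary)` could be run after the display cell to see how closely the model followed the length guideline.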