In [1]:
from dotenv import load_dotenv
load_dotenv()

True

In [2]:
from langchain_groq import ChatGroq
llm = ChatGroq(
    model = "openai/gpt-oss-120b"
    )



'**Transformers** can refer to two very different things depending on the context—**electrical devices** that transfer electrical energy, or **machine‑learning models** that have transformed (pun intended) the field of natural‑language processing (NLP) and many other AI tasks. Below is a quick overview of both meanings, followed by a deeper dive into the AI‑focused transformer model.\n\n---\n\n## 1️⃣ Electrical Transformers\n\n| **What it is** | A device that transfers electrical energy between two or more circuits through electromagnetic induction. |\n|---|---|\n| **Core principle** | **Faraday’s law of induction**: a changing magnetic field in a coil of wire induces a voltage in another coil. |\n| **Main parts** | 1. **Core** (usually laminated iron) – guides magnetic flux.<br>2. **Primary winding** – receives input voltage.<br>3. **Secondary winding** – delivers output voltage.<br>4. **Optional tap points, shielding, cooling, and protective devices.** |\n| **How it works** | 1. AC v

In [3]:
speech = """
"The Future of Artificial Intelligence in Healthcare"

Good morning, everyone. Today, I want to talk about one of the most transformative applications of artificial intelligence: its role in revolutionizing healthcare.

Artificial intelligence has already begun to reshape how we diagnose, treat, and prevent diseases. In radiology, AI systems can now detect certain types of cancer more accurately than human radiologists. For instance, Google's DeepMind developed an AI system that can identify over 50 eye diseases with 94% accuracy, potentially preventing blindness in millions of people worldwide.

Machine learning algorithms are also transforming drug discovery. Traditionally, developing a new drug takes 10-15 years and costs billions of dollars. However, AI can analyze vast databases of molecular structures and predict which compounds are most likely to become effective medicines, reducing both time and cost significantly.

In personalized medicine, AI analyzes patient data including genetic information, lifestyle factors, and medical history to create tailored treatment plans. This approach has shown remarkable success in cancer treatment, where AI helps oncologists select the most effective therapy for each individual patient.

However, we must address several challenges. Data privacy remains a critical concern, as healthcare AI systems require access to sensitive patient information. We also need to ensure these systems are unbiased and work equally well across different demographic groups. Additionally, healthcare professionals need proper training to work alongside AI systems effectively.

Looking ahead, I believe AI will enable predictive healthcare, where we can identify and prevent diseases before symptoms appear. Wearable devices combined with AI could continuously monitor our health and alert us to potential issues early.

The integration of AI in healthcare represents not just technological advancement, but a fundamental shift toward more precise, personalized, and accessible medical care for everyone.
"""

In [4]:
# Message classes live in langchain_core; langchain.schema is a deprecated import path.
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage


In [5]:
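# Use an f-string so the speech text is actually interpolated into the message;
# without the f prefix the model would receive the literal text "{speech}".
chat_message = [
    SystemMessage(content="You are a helpful assistant that summarizes text."),
    HumanMessage(content=f"Provide a short summary of the following speech: {speech}"),
]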
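A placeholder such as `{speech}` is only substituted when the string is an f-string or is formatted explicitly. The following is a minimal pure-Python sketch of that difference; `speech_demo` is a hypothetical stand-in for the `speech` variable, and no LangChain classes are involved.

```python
speech_demo = "AI is reshaping healthcare."  # hypothetical stand-in for `speech`

# Plain string literal: the braces are kept as ordinary characters.
plain = "Summarize: {speech_demo}"

# f-string: the variable is substituted when the string is created.
fstring = f"Summarize: {speech_demo}"

# str.format: the placeholder is filled in later, on demand.
formatted = "Summarize: {speech}".format(speech=speech_demo)

print(plain)
print(fstring)
print(formatted)
```

The third form is closest in spirit to what a prompt template does: the text is stored with placeholders and filled in at call time.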


In [6]:
llm.get_num_tokens(speech)

360

#### Creating the Prompt Template

In [None]:
from langchain.prompts import ChatPromptTemplate  

prompt = ChatPromptTemplate.from_messages([
    ('system', "You are a helpful assistant that summarizes text."),
    ('human', "Provide the short summary of the following speech: {speech} and translate the summary to {language}"),
])

prompt

ChatPromptTemplate(input_variables=['language', 'speech'], input_types={}, partial_variables={}, messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], input_types={}, partial_variables={}, template='You are a helpful assistant that summarizes text.'), additional_kwargs={}), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['language', 'speech'], input_types={}, partial_variables={}, template='Provide the short summary of the following speech: {speech} and translate the summary to {language}'), additional_kwargs={})])

In [12]:
llm_chain = prompt | llm
result = llm_chain.invoke({'speech': speech, 'language': 'Spanish'})
result.content

'**Summary (English):**  \nAI is transforming healthcare by improving disease diagnosis (e.g., AI detecting cancers and eye diseases), accelerating drug discovery, and enabling personalized treatment plans, especially in oncology. While the benefits are clear, challenges such as data privacy, algorithmic bias, and the need for professional training must be addressed. Future AI-driven healthcare will focus on predictive, preventive care using wearables and continuous monitoring, moving toward more precise, personalized, and accessible medicine.\n\n**Resumen (Español):**  \nLa IA está transformando la atención médica al mejorar el diagnóstico de enfermedades (por ejemplo, la detección de cáncer y de más de 50 enfermedades oculares), acelerar el descubrimiento de fármacos y permitir tratamientos personalizados, especialmente en oncología. Aunque los beneficios son claros, se deben enfrentar desafíos como la privacidad de datos, el sesgo algorítmico y la necesidad de capacitación profesion
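The `prompt | llm` expression composes two steps so that the output of the first becomes the input of the second. The sketch below illustrates that idea in plain Python only; `Step`, `fill_prompt`, and `fake_llm` are hypothetical names, and this is not LangChain's implementation.

```python
class Step:
    """Hypothetical stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # Compose left to right: the output of self feeds into other.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value):
        return self.fn(value)


# First step: fill the placeholders of a template from a dict of inputs.
fill_prompt = Step(
    lambda v: "Summarize: {speech} and translate to {language}".format(**v)
)

# Second step: a fake model that only reports how much text it received.
fake_llm = Step(lambda text: f"[model reply to {len(text)} chars]")

chain = fill_prompt | fake_llm
inputs = {"speech": "AI helps doctors.", "language": "Spanish"}
reply = chain.invoke(inputs)
print(reply)
```

Calling `chain.invoke(inputs)` here is equivalent to calling the two steps one after the other by hand.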

#### Stuff Documents Chain Text Summarization

In [15]:
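# Document loaders live in langchain_community; langchain.document_loaders is a deprecated import path.
from langchain_community.document_loaders import PyPDFLoader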
loader = PyPDFLoader("Research_Papers/Attention_Is_All_You_Need.pdf")
pages = loader.load_and_split()
pages

[Document(metadata={'producer': 'pdfTeX-1.40.25', 'creator': 'LaTeX with hyperref', 'creationdate': '2024-04-10T21:11:43+00:00', 'author': '', 'keywords': '', 'moddate': '2024-04-10T21:11:43+00:00', 'ptex.fullbanner': 'This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) kpathsea version 6.3.5', 'subject': '', 'title': '', 'trapped': '/False', 'source': 'Research_Papers/Attention_Is_All_You_Need.pdf', 'total_pages': 15, 'page': 0, 'page_label': '1'}, page_content='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.com\nNoam Shazeer∗\nGoogle Brain\nnoam@google.com\nNiki Parmar∗\nGoogle Research\nnikip@google.com\nJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.com\nAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.edu\nŁukas

In [19]:
from langchain.prompts import PromptTemplate

template = '''
Write the concise and short summary of the following document,
Document: {text}
'''

prompt_template = PromptTemplate(
    input_variables=["text"],
    template=template
)

In [21]:
# load_summarize_chain builds a ready-made summarization chain.
# chain_type="stuff" places all the given documents into a single prompt.
from langchain.chains import load_summarize_chain

chain = load_summarize_chain(
    llm=llm,
    chain_type="stuff",
    prompt=prompt_template,
    verbose = True
)
chain.run(pages[:2])



> Entering new StuffDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write the concise and short summary of the following document,
Document: Provided proper attribution is provided, Google hereby grants permission to
reproduce the tables and figures in this paper solely for use in journalistic or
scholarly works.
Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.com
Noam Shazeer∗
Google Brain
noam@google.com
Niki Parmar∗
Google Research
nikip@google.com
Jakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.com
Aidan N. Gomez∗ †
University of Toronto
aidan@cs.toronto.edu
Łukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗ ‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the e

'**Summary**\n\nThe paper introduces the **Transformer**, a novel sequence‑to‑sequence architecture that replaces the traditional recurrent or convolutional layers in encoder‑decoder models with **pure attention mechanisms** (self‑attention and multi‑head attention). By eliminating sequential recurrence, the model enables **high parallelism**, drastically reducing training time while achieving superior translation quality.  \n\nKey results:  \n- **English‑German** translation (WMT 2014) – 28.4 BLEU (≈\u202f2\u202fBLEU improvement over prior best).  \n- **English‑French** translation (WMT 2014) – 41.8 BLEU, the highest single‑model score, achieved after only 3.5\u202fdays on eight GPUs.  \n- Demonstrated **generalization** to English constituency parsing with both large and limited datasets.\n\nThe work shows that self‑attention alone can capture global dependencies with a constant number of operations, outperforming prior RNN‑ or CNN‑based models in both speed and accuracy.  \n\n*Autho
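The "stuff" strategy can be sketched as: join every document into one block of text, fill the prompt once, and make a single model call. The code below is a simplified illustration under stated assumptions; `fake_llm` is a hypothetical stub that records the prompts it receives, and the blank-line separator is a choice made for this sketch.

```python
calls = []  # every prompt sent to the fake model is recorded here


def fake_llm(prompt):
    """Hypothetical model stub: records the prompt and returns a label."""
    calls.append(prompt)
    return f"summary #{len(calls)}"


def stuff_summarize(docs, template, llm):
    # Join all documents into one string and make a single model call.
    combined = "\n\n".join(docs)
    return llm(template.format(text=combined))


docs = ["Page one text.", "Page two text."]
result = stuff_summarize(docs, "Summarize:\n{text}", fake_llm)
print(result, "| model calls:", len(calls))
```

Because everything goes into one prompt, this approach only works while the combined text fits in the model's context window, which is why the cell above passes just `pages[:2]`.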

#### Map-Reduce Summarization

In [26]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

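# Split pages into chunks of at most 1000 characters, with 200 characters
# shared between neighbouring chunks so context is not lost at the boundaries.
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    length_function=len
)
docs_final = text_splitter.split_documents(pages)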
docs_final

[Document(metadata={'producer': 'pdfTeX-1.40.25', 'creator': 'LaTeX with hyperref', 'creationdate': '2024-04-10T21:11:43+00:00', 'author': '', 'keywords': '', 'moddate': '2024-04-10T21:11:43+00:00', 'ptex.fullbanner': 'This is pdfTeX, Version 3.141592653-2.6-1.40.25 (TeX Live 2023) kpathsea version 6.3.5', 'subject': '', 'title': '', 'trapped': '/False', 'source': 'Research_Papers/Attention_Is_All_You_Need.pdf', 'total_pages': 15, 'page': 0, 'page_label': '1'}, page_content='Provided proper attribution is provided, Google hereby grants permission to\nreproduce the tables and figures in this paper solely for use in journalistic or\nscholarly works.\nAttention Is All You Need\nAshish Vaswani∗\nGoogle Brain\navaswani@google.com\nNoam Shazeer∗\nGoogle Brain\nnoam@google.com\nNiki Parmar∗\nGoogle Research\nnikip@google.com\nJakob Uszkoreit∗\nGoogle Research\nusz@google.com\nLlion Jones∗\nGoogle Research\nllion@google.com\nAidan N. Gomez∗ †\nUniversity of Toronto\naidan@cs.toronto.edu\nŁukas
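To make `chunk_size` and `chunk_overlap` concrete, here is a minimal fixed-width splitter. It is a simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, whereas this hypothetical `split_with_overlap` helper cuts purely by character position.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into fixed-width chunks that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, which is the same role the 200-character overlap plays in the cell above.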

In [28]:
chunk_prompt = '''
Provide the summary of the following document,
Document: {text}
'''

map_prompt_template = PromptTemplate(
    input_variables=["text"],
    template=chunk_prompt
)

In [29]:
final_prompt = '''
Provide the important points from the following document in a numbered, structured format,
Document: {text}
'''
final_prompt_template = PromptTemplate(
    input_variables=["text"],
    template=final_prompt
)

final_chain = load_summarize_chain(
    llm=llm,
    chain_type="map_reduce",
    map_prompt=map_prompt_template,
    combine_prompt=final_prompt_template,
    verbose = True
)

final_chain.run(docs_final[:5])



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Provide the summary of the following document,
Document: Provided proper attribution is provided, Google hereby grants permission to
reproduce the tables and figures in this paper solely for use in journalistic or
scholarly works.
Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.com
Noam Shazeer∗
Google Brain
noam@google.com
Niki Parmar∗
Google Research
nikip@google.com
Jakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.com
Aidan N. Gomez∗ †
University of Toronto
aidan@cs.toronto.edu
Łukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗ ‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and d

'**Important Points from the Document (Numbered & Structured)**  \n\n---\n\n### 1. Core Contribution – The Transformer  \n1.1. **Purely attention‑based architecture** that replaces recurrent (RNN) and convolutional (CNN) layers with *self‑attention* mechanisms.  \n1.2. **Encoder‑decoder design**: each side consists of stacked **multi‑head self‑attention** + **position‑wise feed‑forward** sub‑layers, wrapped with **residual connections** and **layer‑norm**.  \n1.3. **Positional Encoding**: sinusoidal, parameter‑free encodings added to input embeddings to supply absolute position information.  \n\n---\n\n### 2. Technical Innovations  \n2.1. **Scaled dot‑product attention** (the basic attention operation).  \n2.2. **Multi‑head attention** – allows the model to attend to information from different representation sub‑spaces simultaneously.  \n2.3. **Parameter‑free positional encoding** – avoids extra learned parameters while providing sequence order.  \n\n---\n\n### 3. Performance & Efficie
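The map-reduce strategy can be sketched as two phases: summarize each chunk on its own (map), then summarize the joined partial summaries (reduce). The code below is a simplified illustration; `fake_llm` is a hypothetical stub, and the sketch leaves out any handling of partial summaries that are too long to combine in one call.

```python
calls = []  # every prompt sent to the fake model is recorded here


def fake_llm(prompt):
    """Hypothetical model stub: records the prompt and returns a label."""
    calls.append(prompt)
    return f"S{len(calls)}"


def map_reduce_summarize(chunks, map_template, combine_template, llm):
    # Map: one independent model call per chunk.
    partial = [llm(map_template.format(text=chunk)) for chunk in chunks]
    # Reduce: one final call over the joined partial summaries.
    return llm(combine_template.format(text="\n\n".join(partial)))


result = map_reduce_summarize(
    ["c1", "c2", "c3"],
    "Summarize: {text}",
    "Key points: {text}",
    fake_llm,
)
print(result, "| model calls:", len(calls))
```

For N chunks this makes N + 1 model calls, and the map calls do not depend on each other, so they could in principle run in parallel.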

#### Refine Chain Summarization

In [31]:
chain = load_summarize_chain(
    llm=llm,
    chain_type="refine",
    verbose = True
)
chain.run(docs_final[:5])



> Entering new RefineDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"Provided proper attribution is provided, Google hereby grants permission to
reproduce the tables and figures in this paper solely for use in journalistic or
scholarly works.
Attention Is All You Need
Ashish Vaswani∗
Google Brain
avaswani@google.com
Noam Shazeer∗
Google Brain
noam@google.com
Niki Parmar∗
Google Research
nikip@google.com
Jakob Uszkoreit∗
Google Research
usz@google.com
Llion Jones∗
Google Research
llion@google.com
Aidan N. Gomez∗ †
University of Toronto
aidan@cs.toronto.edu
Łukasz Kaiser∗
Google Brain
lukaszkaiser@google.com
Illia Polosukhin∗ ‡
illia.polosukhin@gmail.com
Abstract
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks that include an encoder and a decoder. The best
performing models also connect the encoder and decoder through a

'**Summary of “Attention Is All You Need” (Vaswani\u202fet\u202fal., 2017)**  \n\n**1. Motivation and Problem Setting**  \n* Recurrent neural networks (RNNs), especially LSTMs and gated‑recurrent units, have dominated sequence‑modeling tasks (language modeling, machine translation, etc.).  \n* RNN‑based encoder‑decoder models compute hidden states sequentially—each hidden state \\(h_t\\) depends on the previous state \\(h_{t-1}\\) and the current input. This **intrinsic sequentiality** prevents parallel computation across time steps, which becomes a severe bottleneck for long sequences and limits batch size due to memory constraints.  \n* The authors set out to replace the recurrent (and convolutional) components of such models with a **purely attention‑based architecture** that can process all positions simultaneously, thus enabling massive parallelism while preserving or improving performance.\n\n**2. Core Architectural Contributions**  \n1. **Self‑Attention (Scaled Dot‑Product Atten
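The refine strategy can be sketched as a loop: summarize the first chunk, then repeatedly ask the model to update the running summary with the next chunk. The code below is a simplified illustration; `fake_llm` is a hypothetical stub and the refine prompt wording is invented for this sketch, not taken from the library's default prompts.

```python
calls = []  # every prompt sent to the fake model is recorded here


def fake_llm(prompt):
    """Hypothetical model stub: records the prompt and returns a label."""
    calls.append(prompt)
    return f"R{len(calls)}"


def refine_summarize(chunks, initial_template, refine_template, llm):
    if not chunks:
        return ""
    # Start from a summary of the first chunk only.
    summary = llm(initial_template.format(text=chunks[0]))
    # Fold each remaining chunk into the running summary, one call at a time.
    for chunk in chunks[1:]:
        summary = llm(refine_template.format(existing=summary, text=chunk))
    return summary


result = refine_summarize(
    ["c1", "c2", "c3"],
    "Write a concise summary of the following:\n{text}",
    "Existing summary: {existing}\nRefine it with: {text}",
    fake_llm,
)
print(result, "| model calls:", len(calls))
```

Unlike map-reduce, each call depends on the previous result, so the calls are strictly sequential: one per chunk.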