# 🔍 LLM Text Summarization: Top 10 Colab Questions
This notebook covers the most frequently asked coding questions about using Large Language Models (LLMs) for text summarization.

Each section includes code, best practices, and comments for easy understanding.

In [4]:
!pip install -U google-generativeai
import google.generativeai as genai

from google.colab import userdata
GOOGLE_AI_STUDIO = userdata.get('GOOGLE_API_KEY')

genai.configure(api_key=GOOGLE_AI_STUDIO)



In [5]:
# Get a list of available models
models = genai.list_models()

# Iterate through the models and print their information
for model in models:
    print(f"Model Name: {model.name}")
    print(f"Supported Methods: {model.supported_generation_methods}")
    print(f"Description: {model.description}")
    print("-" * 20)  # Separator for clarity

Model Name: models/chat-bison-001
Supported Methods: ['generateMessage', 'countMessageTokens']
Description: A legacy text-only model optimized for chat conversations
--------------------
Model Name: models/text-bison-001
Supported Methods: ['generateText', 'countTextTokens', 'createTunedTextModel']
Description: A legacy model that understands text and generates text as an output
--------------------
Model Name: models/embedding-gecko-001
Supported Methods: ['embedText', 'countTextTokens']
Description: Obtain a distributed representation of a text.
--------------------
Model Name: models/gemini-1.0-pro-vision-latest
Supported Methods: ['generateContent', 'countTokens']
Description: The original Gemini 1.0 Pro Vision model version which was optimized for image understanding. Gemini 1.0 Pro Vision was deprecated on July 12, 2024. Move to a newer Gemini version.
--------------------
Model Name: models/gemini-pro-vision
Supported Methods: ['generateContent', 'countTokens']
Description: The 
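
Only models that list `generateContent` among their supported methods can be used with `generate_content` later in this notebook. A minimal sketch of filtering the listing, assuming each entry exposes `name` and `supported_generation_methods` as printed above. The helper name `content_models` and the stand-in objects are hypothetical and only illustrate the idea:

```python
from types import SimpleNamespace

def content_models(models):
    """Return the names of models that support the generateContent method."""
    return [m.name for m in models if "generateContent" in m.supported_generation_methods]

# Stand-in objects mirroring two entries from the listing above (illustration only);
# in the notebook you would pass genai.list_models() instead.
sample = [
    SimpleNamespace(
        name="models/text-bison-001",
        supported_generation_methods=["generateText", "countTextTokens", "createTunedTextModel"],
    ),
    SimpleNamespace(
        name="models/gemini-pro-vision",
        supported_generation_methods=["generateContent", "countTokens"],
    ),
]
print(content_models(sample))
```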

In [14]:
# Step 1: Define the model
model = genai.GenerativeModel("models/gemini-2.0-flash-lite")



In [20]:
# Step 2: Define the few-shot example, the new input, and the prompt


# Few-shot learning example with structured case note + care plan + action items
few_shot_example = """
Discharge Summary:
Patient: James L.
Admit: 3/15/25 | Discharge: 3/20/25 | Procedure: Left Hip Replacement
Disposition: Home with walker
Plan: Home PT 3x/wk, f/u ortho 2 wks
Meds: Acetaminophen, Oxycodone, Enoxaparin

Expected Output:

Case Note:
Patient admitted for left hip arthroplasty, discharged home on POD5 with walker. No complications. Home PT arranged, follow-up scheduled. Pain controlled. Will contact patient to verify DME delivery and medication adherence.

Care Plan:
Goals:
- Ambulate independently with walker by 4/5
- Attend orthopedic follow-up by 4/3
- Maintain surgical site healing

Gaps:
- Follow-up appointment not confirmed
- DME delivery not verified

Interventions:
- Review patient’s home PT status
- Monitor for signs of wound infection
- Reinforce medication adherence

Action Items for Case Manager:
- ☑️ Call patient to confirm walker delivery and use
- ☑️ Confirm orthopedic appointment date
- ☑️ Verify home PT has started and is appropriately scheduled
"""



# 🔁 Your New Input Discharge Summary
discharge_summary = """
Patient Name: Mary Thompson
MRN: 987654321
Admitted: 4/3/25
Discharged: 4/9/25
Procedure: Left Total Knee Replacement
Surgeon: Dr. Alan Rivera
Course: No complications. Started PT POD1. Walker used. Pain managed with acetaminophen and celecoxib. CPAP used for OSA. Home PT arranged. F/u ortho in 2 weeks. Daughter assisting at home.
Meds: Acetaminophen, Celecoxib, Ferrous Sulfate, Oxycodone, Escitalopram
"""

# 🔧 Final Prompt Assembly
full_prompt = f"""
You are a healthcare case manager. Based on the following discharge summary, generate:

1. A clear and concise **case note** for internal documentation
2. A structured **care plan** including goals, identified gaps, and interventions

Use appropriate clinical case management language. Be concise but complete.
<Example>
{few_shot_example}
</Example>

Now generate based on this:

Discharge Summary:
{discharge_summary}
"""

In [21]:
print(few_shot_example)


Discharge Summary:
Patient: James L.
Admit: 3/15/25 | Discharge: 3/20/25 | Procedure: Left Hip Replacement
Disposition: Home with walker
Plan: Home PT 3x/wk, f/u ortho 2 wks
Meds: Acetaminophen, Oxycodone, Enoxaparin

Expected Output:

Case Note:
Patient admitted for left hip arthroplasty, discharged home on POD5 with walker. No complications. Home PT arranged, follow-up scheduled. Pain controlled. Will contact patient to verify DME delivery and medication adherence.

Care Plan:
Goals:
- Ambulate independently with walker by 4/5
- Attend orthopedic follow-up by 4/3
- Maintain surgical site healing

Gaps:
- Follow-up appointment not confirmed
- DME delivery not verified

Interventions:
- Review patient’s home PT status
- Monitor for signs of wound infection
- Reinforce medication adherence

Action Items for Case Manager:
- ☑️ Call patient to confirm walker delivery and use
- ☑️ Confirm orthopedic appointment date
- ☑️ Verify home PT has started and is appropriately scheduled
</Example>

In [22]:
print(full_prompt)


You are a healthcare case manager. Based on the following discharge summary, generate:

1. A clear and concise **case note** for internal documentation
2. A structured **care plan** including goals, identified gaps, and interventions

Use appropriate clinical case management language. Be concise but complete.
<Example>

Discharge Summary:
Patient: James L.
Admit: 3/15/25 | Discharge: 3/20/25 | Procedure: Left Hip Replacement
Disposition: Home with walker
Plan: Home PT 3x/wk, f/u ortho 2 wks
Meds: Acetaminophen, Oxycodone, Enoxaparin

Expected Output:

Case Note:
Patient admitted for left hip arthroplasty, discharged home on POD5 with walker. No complications. Home PT arranged, follow-up scheduled. Pain controlled. Will contact patient to verify DME delivery and medication adherence.

Care Plan:
Goals:
- Ambulate independently with walker by 4/5
- Attend orthopedic follow-up by 4/3
- Maintain surgical site healing

Gaps:
- Follow-up appointment not confirmed
- DME delivery not verified
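
The prompt printed above follows a fixed layout: instructions, one tagged example, then the new discharge summary. A small sketch of a reusable builder for that layout is shown here. The function name `build_prompt` and its parameters are hypothetical, and the short strings stand in for the full texts defined earlier:

```python
def build_prompt(instructions, example, new_input):
    """Assemble a few-shot prompt: instructions, one tagged example, then the new input."""
    return (
        f"{instructions.strip()}\n\n"
        f"<Example>\n{example.strip()}\n</Example>\n\n"
        "Now generate based on this:\n\n"
        f"Discharge Summary:\n{new_input.strip()}\n"
    )

# Shortened stand-ins for the notebook's few_shot_example and discharge_summary
demo = build_prompt(
    "You are a healthcare case manager.",
    "Discharge Summary:\nPatient: James L.\n\nExpected Output:\nCase Note: Patient admitted for left hip arthroplasty.",
    "Patient Name: Mary Thompson",
)
print(demo)
```

Keeping the `<Example>` tags inside the builder, rather than inside the example text, makes it harder to end up with an unbalanced opening or closing tag in the final prompt.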

In [23]:
response = model.generate_content(full_prompt)
print(response.text)

Here's the case note and care plan based on the provided discharge summary:

**Case Note:**

Patient admitted 4/3/25 for Left Total Knee Replacement. Discharged home 4/9/25. No complications. Home PT ordered. Pain managed. CPAP used for OSA. Daughter assisting. Will contact patient to verify home PT start date, DME (walker) use, medication adherence, and follow-up appointment details.

**Care Plan:**

**Goals:**

*   Ambulate safely with walker at discharge
*   Attend orthopedic follow-up appointment within 2 weeks
*   Adhere to medication regimen as prescribed
*   Maintain CPAP compliance for OSA
*   Demonstrate wound care management

**Identified Gaps:**

*   Home PT start date unconfirmed
*   Medication adherence unclear
*   Durable Medical Equipment (DME) use/supply not verified
*   Orthopedic follow-up appointment date not verified

**Interventions:**

*   Contact patient to confirm home PT schedule and address any barriers to participation.
*   Assess medication adherence (acetam
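
The response comes back as Markdown with bold section headers such as `**Goals:**`. If the sections need to be stored or checked separately, they can be split with a few lines of standard-library code. This is only a sketch that assumes the header style seen in the output above. `split_sections` is a hypothetical helper, and real model output can vary in format:

```python
import re

def split_sections(markdown_text):
    """Group non-empty lines under the most recent bold header line (e.g. '**Goals:**')."""
    sections, current = {}, None
    for line in markdown_text.splitlines():
        stripped = line.strip()
        match = re.fullmatch(r"\*\*(.+?):?\*\*", stripped)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None and stripped:
            # Drop leading bullet markers ("*", "-") and surrounding spaces
            sections[current].append(stripped.lstrip("*- ").strip())
    return sections

sample_output = """**Case Note:**

Patient admitted 4/3/25 for Left Total Knee Replacement.

**Goals:**

*   Ambulate safely with walker at discharge
*   Attend orthopedic follow-up appointment within 2 weeks
"""
sections = split_sections(sample_output)
print(sections["Goals"])
```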

In [11]:
prompt = """
You are a clinical assistant generating a care note and a care plan for a patient based on a discharge summary.


Example:
Discharge Summary:


Patient Name: Mary Thompson
Medical Record Number: 987654321
Date of Admission: 2025-04-03
Date of Discharge: 2025-04-09
Attending Physician: Dr. Alan Rivera, MD
Primary Diagnosis:

End-stage Osteoarthritis of the Left Knee (M17.12)

Secondary Diagnoses:

Obesity (BMI 35)

Obstructive Sleep Apnea (OSA)

Depression (well-controlled)

Procedure Performed
Date: 2025-04-04
Procedure: Left Total Knee Arthroplasty (Replacement)
Surgeon: Dr. Alan Rivera, MD
Anesthesia: General Anesthesia with Adductor Canal Block

Hospital Course
Patient was admitted for elective left total knee replacement due to chronic pain and significant functional limitations from advanced osteoarthritis. Preoperative clearance obtained. The procedure was completed without intraoperative complications.

Postoperatively, pain was initially difficult to manage due to patient sensitivity to opioids, but improved with adjustment to acetaminophen, celecoxib, and limited oxycodone. The patient used CPAP during hospitalization due to known OSA.

Physical therapy began on postoperative day 1. Patient progressed with ambulation using a front-wheeled walker but required additional assistance with stairs. No signs of surgical site infection. Mild anemia post-op managed conservatively.

Discharge Condition
Awake, alert, oriented

Ambulating with front-wheeled walker

Mild pain controlled with oral medication

Surgical site clean and dry

Hemoglobin stable and improving

Mood stable, no signs of post-op delirium

Medications on Discharge
Acetaminophen 650 mg, PO q6h PRN

Oxycodone 5 mg, PO q6h PRN (limit 5 days)

Celecoxib 200 mg, PO daily with food

Ferrous Sulfate 325 mg, PO BID for mild anemia

Escitalopram 10 mg, PO daily (home med)

CPAP: Continue nightly use (home device)

Discharge Plan
Discharged to home with support from adult daughter

Home PT scheduled to begin within 48 hours

Anticipated need for stair rail installation at home

Follow-up with orthopedic surgeon in 2 weeks

Instructed to monitor for signs of infection, DVT, worsening pain

Activity Restrictions
Weight-bearing as tolerated with walker

No driving until cleared

Avoid twisting movements or unsupported stairs

Continue use of CPAP nightly

Prepared by: Dr. Alan Rivera, MD
Date: 2025-04-09
Signature: __________________________




"""
response = model.generate_content(prompt)
print(response.text)

Okay, I understand. I will use the provided discharge summary to generate a care note and a care plan.

**Care Note**

**Patient Name:** Mary Thompson
**Medical Record Number:** 987654321
**Date of Admission:** 2025-04-03
**Date of Discharge:** 2025-04-09
**Date of Note:** 2025-04-09
**Provider:** (Clinical Assistant)

**Subjective:**

*   Patient discharged home today following Left Total Knee Arthroplasty (LTKA) performed on 2025-04-04.
*   Patient reports mild pain, controlled with oral medications.
*   Mood stable, no complaints of post-operative delirium or anxiety.
*   Patient verbalizes understanding of discharge instructions and medication regimen.
*   States she has her CPAP machine at home and ready to use.

**Objective:**

*   Awake, alert, and oriented to person, place, and time.
*   Ambulating with a front-wheeled walker.
*   Surgical site clean, dry, and without signs of infection.
*   Hemoglobin stable and improving (per discharge summary).
*   Vitals: (To be filled in u

In [13]:
prompt = """
Did you identify any care gaps from the discharge summary?


Example:
Discharge Summary:


Patient Name: Mary Thompson
Medical Record Number: 987654321
Date of Admission: 2025-04-03
Date of Discharge: 2025-04-09
Attending Physician: Dr. Alan Rivera, MD
Primary Diagnosis:

End-stage Osteoarthritis of the Left Knee (M17.12)

Secondary Diagnoses:

Obesity (BMI 35)

Obstructive Sleep Apnea (OSA)

Depression (well-controlled)

Procedure Performed
Date: 2025-04-04
Procedure: Left Total Knee Arthroplasty (Replacement)
Surgeon: Dr. Alan Rivera, MD
Anesthesia: General Anesthesia with Adductor Canal Block

Hospital Course
Patient was admitted for elective left total knee replacement due to chronic pain and significant functional limitations from advanced osteoarthritis. Preoperative clearance obtained. The procedure was completed without intraoperative complications.

Postoperatively, pain was initially difficult to manage due to patient sensitivity to opioids, but improved with adjustment to acetaminophen, celecoxib, and limited oxycodone. The patient used CPAP during hospitalization due to known OSA.

Physical therapy began on postoperative day 1. Patient progressed with ambulation using a front-wheeled walker but required additional assistance with stairs. No signs of surgical site infection. Mild anemia post-op managed conservatively.

Discharge Condition
Awake, alert, oriented

Ambulating with front-wheeled walker

Mild pain controlled with oral medication

Surgical site clean and dry

Hemoglobin stable and improving

Mood stable, no signs of post-op delirium

Medications on Discharge
Acetaminophen 650 mg, PO q6h PRN

Oxycodone 5 mg, PO q6h PRN (limit 5 days)

Celecoxib 200 mg, PO daily with food

Ferrous Sulfate 325 mg, PO BID for mild anemia

Escitalopram 10 mg, PO daily (home med)

CPAP: Continue nightly use (home device)

Discharge Plan
Discharged to home with support from adult daughter

Home PT scheduled to begin within 48 hours

Anticipated need for stair rail installation at home

Follow-up with orthopedic surgeon in 2 weeks

Instructed to monitor for signs of infection, DVT, worsening pain

Activity Restrictions
Weight-bearing as tolerated with walker

No driving until cleared

Avoid twisting movements or unsupported stairs

Continue use of CPAP nightly

Prepared by: Dr. Alan Rivera, MD
Date: 2025-04-09
Signature: __________________________
"""
response = model.generate_content(prompt)
print(response.text)


Here's an analysis of potential care gaps from the provided discharge summary:

**Potential Care Gaps:**

1.  **Obesity Management:** While obesity is listed as a secondary diagnosis, there's no explicit mention of counseling or referral for weight management. This is a significant comorbidity that could impact the long-term success of the knee replacement and overall health.

2.  **OSA Adherence:** The summary mentions CPAP use during hospitalization. However, there is no details on the patient's current CPAP adherence and if she is utilizing it at home, if she has been compliant, if there is any follow up with the sleep specialist or sleep clinic.

**Explanation of why these are care gaps:**

*   **Obesity:** Obesity is a chronic disease and a risk factor for many conditions, including osteoarthritis. Addressing obesity through lifestyle modifications (diet, exercise) and potentially other interventions could improve long-term outcomes and overall health.
*   **OSA:** Lack of adheren

In [None]:
# 📦 Install required packages (for Google Colab)
!pip install transformers datasets rouge-score fastapi uvicorn[standard] bitsandbytes accelerate --quiet


## 1. Summarize Text with Hugging Face BART

In [None]:
# ✅ Load a BART model pre-trained for summarization
from transformers import pipeline

# Create a summarization pipeline
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Input text
text = "Long article text goes here..."

# Generate the summary
summary = summarizer(text, max_length=130, min_length=30, do_sample=False)
print("📝 Summary:", summary[0]['summary_text'])


## 2. Fine-tune BART on CNN/DailyMail

In [None]:
# ✅ Fine-tune BART using Hugging Face `Trainer` API
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, TrainingArguments, Trainer

# Load a small portion of the dataset for demonstration
dataset = load_dataset("cnn_dailymail", "3.0.0", split="train[:1%]")
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")

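# Preprocess function for summarization
def preprocess(examples):
    inputs = tokenizer(examples["article"], truncation=True, padding="max_length", max_length=512)
    targets = tokenizer(examples["highlights"], truncation=True, padding="max_length", max_length=128)
    # Mask padding in the labels with -100 so padded positions are ignored by the loss
    inputs["labels"] = [
        [token if token != tokenizer.pad_token_id else -100 for token in label]
        for label in targets["input_ids"]
    ]
    return inputs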

# Tokenize the dataset
tokenized_dataset = dataset.map(preprocess, batched=True)

# Load the model
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

# Training configuration
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    logging_steps=10
)

# Fine-tune the model
trainer = Trainer(model=model, args=training_args, train_dataset=tokenized_dataset)
trainer.train()


## 3. Extractive vs. Abstractive Summarization

In [None]:
# ✅ Extractive summarization with spaCy (highlights original sentences)
import spacy
from spacy.lang.en.stop_words import STOP_WORDS
from heapq import nlargest

text = "Long article text here..."
nlp = spacy.load("en_core_web_sm")
doc = nlp(text)

# Calculate word frequencies
word_freq = {}
for word in doc:
    if word.text.lower() not in STOP_WORDS and word.is_alpha:
        word_freq[word.text.lower()] = word_freq.get(word.text.lower(), 0) + 1

# Score sentences based on word frequency
sentence_scores = {}
for sent in doc.sents:
    for word in sent:
        if word.text.lower() in word_freq:
            sentence_scores[sent] = sentence_scores.get(sent, 0) + word_freq[word.text.lower()]

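# Extract the top 3 sentences, then restore their original order so the summary reads naturally
summary_sentences = sorted(nlargest(3, sentence_scores, key=sentence_scores.get), key=lambda sent: sent.start)
summary = " ".join([sent.text for sent in summary_sentences])
print("📝 Extractive Summary:", summary)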
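
The same frequency-based scoring can be tried without spaCy. The sketch below is a simplified stand-in, not the spaCy pipeline above: it splits sentences with a regular expression and uses a tiny hand-picked stop-word set. The `STOP` set and the `extractive_summary` name are assumptions made for this illustration:

```python
import re
from collections import Counter
from heapq import nlargest

STOP = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "for", "on", "was", "with"}

def content_words(sentence):
    """Lowercased alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP]

def extractive_summary(text, n=2):
    """Pick the n sentences with the highest summed word frequency, kept in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in content_words(s))
    scores = {i: sum(freq[w] for w in content_words(s)) for i, s in enumerate(sentences)}
    top = sorted(nlargest(n, scores, key=scores.get))
    return " ".join(sentences[i] for i in top)

demo_text = (
    "Knee surgery went well. The patient started therapy after knee surgery. "
    "Lunch was served at noon. Therapy continues at home after surgery."
)
print(extractive_summary(demo_text, n=2))
```

Because every selected sentence is copied verbatim from the input, this is extractive summarization. The BART pipeline in section 1 is abstractive and may generate wording that does not appear in the source.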


## 4. Summarize Long Documents

In [None]:
# ✅ Handle long documents using chunking
def split_text(text, chunk_size=400):
    words = text.split()
    for i in range(0, len(words), chunk_size):
        yield " ".join(words[i:i + chunk_size])

# Example input
long_text = "Very long document text..."

chunks = list(split_text(long_text))

from transformers import pipeline
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Summarize each chunk (truncation=True guards against chunks that exceed the model's token limit)
summary_parts = [summarizer(chunk, max_length=130, min_length=30, do_sample=False, truncation=True)[0]['summary_text'] for chunk in chunks]
full_summary = " ".join(summary_parts)
print("📝 Full Summary:", full_summary)
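
Fixed-size chunks can cut a thought in half at the boundary. One common mitigation is to let consecutive chunks share a few words. The sketch below is a variation on `split_text`, not part of the original cell. The name `split_text_overlap` and the default `overlap=50` are assumptions:

```python
def split_text_overlap(text, chunk_size=400, overlap=50):
    """Yield chunks of up to chunk_size words; consecutive chunks share `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    for start in range(0, len(words), step):
        yield " ".join(words[start:start + chunk_size])
        # Stop once the current chunk has reached the end of the text
        if start + chunk_size >= len(words):
            break

demo_chunks = list(split_text_overlap("w0 w1 w2 w3 w4 w5 w6 w7 w8 w9", chunk_size=4, overlap=1))
print(demo_chunks)
```

Overlap increases the number of chunks, and therefore the number of model calls, so it trades some speed for context at the boundaries.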


## 5. ROUGE Evaluation

In [None]:
# ✅ Evaluate summarization quality using ROUGE metric
# `datasets.load_metric` was deprecated and later removed; the `evaluate` library provides the same metric
!pip install -q evaluate rouge-score
import evaluate

rouge = evaluate.load("rouge")

# Example prediction and reference
predictions = ["The company posted strong revenue growth and plans expansion."]
references = ["The company reported revenue increase and future expansion."]

# Compute ROUGE scores
results = rouge.compute(predictions=predictions, references=references)
print("📊 ROUGE Scores:", results)
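
To see what ROUGE-1 measures, the unigram overlap for the example pair can be worked out by hand. This is a simplified sketch (lowercasing, alphanumeric tokens, no stemming), not the library's implementation, and its numbers may differ slightly from library output depending on tokenization settings. `rouge1` is a hypothetical helper name:

```python
import re
from collections import Counter

def rouge1(prediction, reference):
    """Unigram precision, recall, and F1 between a prediction and a reference."""
    pred = Counter(re.findall(r"[a-z0-9]+", prediction.lower()))
    ref = Counter(re.findall(r"[a-z0-9]+", reference.lower()))
    overlap = sum((pred & ref).values())  # clipped count of shared unigrams
    precision = overlap / max(sum(pred.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

p, r, f1 = rouge1(
    "The company posted strong revenue growth and plans expansion.",
    "The company reported revenue increase and future expansion.",
)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Here five unigrams are shared (the, company, revenue, and, expansion) out of nine predicted and eight reference tokens, so precision is 5/9 and recall is 5/8.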


## 6. Prompt-based Summarization (Chat Models)

In [None]:
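# ✅ Summarize using chat/instruction-tuned LLMs
# Note: meta-llama/Llama-2-7b-chat-hf is a gated model; accept its license on Hugging Face and log in first
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Half precision plus device_map="auto" (requires `accelerate`) makes the 7B model more likely to fit on a Colab GPU
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")

prompt = "Summarize this article:\n" + "Long article..." + "\nSummary:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=150)
# Decode only the newly generated tokens so the prompt is not echoed back in the summary
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
summary = tokenizer.decode(new_tokens, skip_special_tokens=True)
print("📝 Prompt-based Summary:", summary)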


## 7. Batch Summarization from CSV

In [None]:
# ✅ Load and summarize articles from CSV
import pandas as pd
from transformers import pipeline

df = pd.read_csv("articles.csv")  # Assume column: 'content'
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

# Generate summaries for each row (truncation=True guards against articles longer than the model's token limit)
df["summary"] = df["content"].apply(lambda x: summarizer(x, max_length=130, min_length=30, do_sample=False, truncation=True)[0]['summary_text'])
df.to_csv("summaries.csv", index=False)
print("✅ Summaries saved to summaries.csv")


## 8. REST API with FastAPI

In [None]:
# ✅ Build a summarization REST API with FastAPI
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

class TextRequest(BaseModel):
    text: str

@app.post("/summarize")
def summarize(req: TextRequest):
    result = summarizer(req.text, max_length=130, min_length=30, do_sample=False)
    return {"summary": result[0]['summary_text']}

# ➤ To run: save as app.py and run `uvicorn app:app --reload`


## 9. Quantized Summarization (4-bit LLM)

In [None]:
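# ✅ Use quantized LLMs for memory-efficient summarization
# Note: bitsandbytes quantizes a standard Hugging Face checkpoint at load time (typically on a CUDA GPU).
# GGML repos such as TheBloke/LLaMA-2-7B-GGML target llama.cpp and cannot be loaded this way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-7b-chat-hf"  # gated model: accept the license and log in first
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True, bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Inference would proceed as usual using tokenizer and model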
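
A rough back-of-the-envelope calculation shows why 4-bit loading matters. The sketch below counts weight storage only, for a nominal 7 billion parameters. It ignores activations, the KV cache, and quantization overhead, so real usage will be higher. `weight_memory_gib` is a hypothetical helper:

```python
def weight_memory_gib(n_params, bits_per_param):
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 1024**3

n_params = 7e9  # nominal size of a "7B" model (assumption)
for bits in (32, 16, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gib(n_params, bits):.1f} GiB")
```

Going from 16-bit to 4-bit weights cuts this estimate by a factor of four, which is what lets a 7B model sit comfortably on a single Colab GPU.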


## 10. Multilingual Summarization (mBART)

In [None]:
# ✅ Summarize multilingual text using mBART
# Note: facebook/mbart-large-cc25 is a pretrained (denoising) checkpoint, not one fine-tuned for
# summarization, so expect poor summaries until it is fine-tuned on summarization data
from transformers import MBartTokenizer, MBartForConditionalGeneration

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")
tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-cc25", src_lang="fr_XX")

text = "Texte en français ici..."  # French input

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
# mBART expects the target language code as the first decoder token
summary_ids = model.generate(
    **inputs,
    decoder_start_token_id=tokenizer.convert_tokens_to_ids("fr_XX"),
    max_length=100,
)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print("📝 French Summary:", summary)
