## **Clinical Notes Summarization - BART, T5 and PEGASUS Models**

### **Clinical Notes Dataset Overview**

**UPLOAD DATASET**

The dataset consists of timestamped clinical notes recorded by healthcare providers, primarily nurses, during patient encounters, including ED visits. Each note captures a specific observation, intervention, or update made by clinical staff.

In [1]:
import pandas as pd

# Load the CSV file
file_path = '/content/sample_data/Patient-3-NurseNotes - Summary.csv'
df = pd.read_csv(file_path)

In [2]:
df.head()

Unnamed: 0,VisitID,NoteSeqID,Type,ServiceDateTime,FullNoteText
0,PAT3,1,Emergency Department Notes,3/3/2025 12:31,PT ON EMS GURNEY BY FRONT NURSE STATION.
1,PAT3,2,Emergency Department Notes,3/3/2025 13:12,Pt used the bathroom. Pt is x2 assist; unable ...
2,PAT3,3,Emergency Department Notes,3/3/2025 13:15,Hand off tool: Hand off tool confirmed with...
3,PAT3,4,Emergency Department Notes,3/3/2025 13:21,CT tech taking pt for scan; pending results.
4,PAT3,5,Emergency Department Notes,3/3/2025 13:42,Pt provided food tray. VSS.


**Notes Preprocessing**

In [3]:
import nltk
import re
import string
from nltk.tokenize import word_tokenize

# Download necessary NLTK data
nltk.download('punkt')
nltk.download('punkt_tab')

def preprocess_text(text, to_lowercase=True):
    if to_lowercase:
        text = text.lower()

    # Remove URLs
    text = re.sub(r'http\S+|www\S+|https\S+', '', text)

    # Remove mentions and hashtags
    text = re.sub(r'@\w+|#', '', text)

    # Remove all punctuation except comma and period
    punct_to_remove = string.punctuation.replace(',', '').replace('.', '')
    text = text.translate(str.maketrans('', '', punct_to_remove))

    # Remove extra spaces
    text = re.sub(r'\s+', ' ', text).strip()

    # Tokenize
    tokens = word_tokenize(text)

    return ' '.join(tokens)

# Apply to DataFrame column
df['NurseNotes_cleanText'] = df['FullNoteText'].apply(preprocess_text)

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.


In [4]:
# Preview the first few rows, including the new "NurseNotes_cleanText" column
df.head()

Unnamed: 0,VisitID,NoteSeqID,Type,ServiceDateTime,FullNoteText,NurseNotes_cleanText
0,PAT3,1,Emergency Department Notes,3/3/2025 12:31,PT ON EMS GURNEY BY FRONT NURSE STATION.,pt on ems gurney by front nurse station .
1,PAT3,2,Emergency Department Notes,3/3/2025 13:12,Pt used the bathroom. Pt is x2 assist; unable ...,pt used the bathroom . pt is x2 assist unable ...
2,PAT3,3,Emergency Department Notes,3/3/2025 13:15,Hand off tool: Hand off tool confirmed with...,hand off tool hand off tool confirmed with rec...
3,PAT3,4,Emergency Department Notes,3/3/2025 13:21,CT tech taking pt for scan; pending results.,ct tech taking pt for scan pending results .
4,PAT3,5,Emergency Department Notes,3/3/2025 13:42,Pt provided food tray. VSS.,pt provided food tray . vss .
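
One side effect of the punctuation step is worth seeing in isolation: because `/`, `:` and `;` are removed outright, a slash-separated value loses its separator. The sketch below reproduces only the lowercase, punctuation and whitespace steps of `preprocess_text` (no NLTK tokenization); the helper name `strip_punctuation` and the sample note are assumptions made for illustration, not part of the notebook.

```python
import re
import string

# Same punctuation set as preprocess_text: everything except comma and period
PUNCT_TO_REMOVE = string.punctuation.replace(',', '').replace('.', '')

def strip_punctuation(text):
    """Lowercase, drop punctuation (except , and .), collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans('', '', PUNCT_TO_REMOVE))
    return re.sub(r'\s+', ' ', text).strip()

# Hypothetical note text, for illustration only
sample = "Pt BP 139/99 mmHg; RN notified."
cleaned = strip_punctuation(sample)
print(cleaned)  # pt bp 13999 mmhg rn notified.
```

If numeric readings matter downstream, one option would be to replace `/` with a space or a word such as "over" before stripping punctuation; that is a design suggestion, not something the notebook does.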


**Remove repeated phrases to reduce redundancy in clinical notes**

In [5]:
# Define the function to remove repeated phrases
def remove_repeated_phrases(text):
    if pd.isna(text):
        return text
    pattern = r'\b(\w+(?:\s+\w+){0,3})\b(?:\s+\1\b)+'
    return re.sub(pattern, r'\1', text, flags=re.IGNORECASE)

# Apply to the column
df['NurseNotes_phraseText'] = df['NurseNotes_cleanText'].apply(remove_repeated_phrases)

# Preview the first few rows, including the new "NurseNotes_phraseText" column (repeated phrases removed)
df.head()


Unnamed: 0,VisitID,NoteSeqID,Type,ServiceDateTime,FullNoteText,NurseNotes_cleanText,NurseNotes_phraseText
0,PAT3,1,Emergency Department Notes,3/3/2025 12:31,PT ON EMS GURNEY BY FRONT NURSE STATION.,pt on ems gurney by front nurse station .,pt on ems gurney by front nurse station .
1,PAT3,2,Emergency Department Notes,3/3/2025 13:12,Pt used the bathroom. Pt is x2 assist; unable ...,pt used the bathroom . pt is x2 assist unable ...,pt used the bathroom . pt is x2 assist unable ...
2,PAT3,3,Emergency Department Notes,3/3/2025 13:15,Hand off tool: Hand off tool confirmed with...,hand off tool hand off tool confirmed with rec...,hand off tool confirmed with receiving unit on...
3,PAT3,4,Emergency Department Notes,3/3/2025 13:21,CT tech taking pt for scan; pending results.,ct tech taking pt for scan pending results .,ct tech taking pt for scan pending results .
4,PAT3,5,Emergency Department Notes,3/3/2025 13:42,Pt provided food tray. VSS.,pt provided food tray . vss .,pt provided food tray . vss .


In [6]:
row_ids = [2]  # Index 2 (0-based), i.e. row 3

for idx in row_ids:
    print(f"\nRow {idx + 1} — Original Nurse Note:")
    print(df.loc[idx, 'NurseNotes_cleanText'])

    print(f"\nRow {idx + 1} — Removed repeated phrases:")
    print(df.loc[idx, 'NurseNotes_phraseText'])
    print("-" * 80)


Row 3 — Original Nurse Note:
hand off tool hand off tool confirmed with receiving unit oncoming rn chiquita .

Row 3 — Removed repeated phrases:
hand off tool confirmed with receiving unit oncoming rn chiquita .
--------------------------------------------------------------------------------
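
To make the behaviour of the regular expression concrete, here is a small standalone sketch using the same pattern. It suggests that the pattern collapses back-to-back repeats of one to four words, but leaves repeats that are separated by punctuation untouched. The function name `collapse_repeats` and the example strings are illustrative assumptions.

```python
import re

# Same pattern as remove_repeated_phrases: a 1-4 word phrase followed by
# one or more immediate repetitions of itself
PATTERN = r'\b(\w+(?:\s+\w+){0,3})\b(?:\s+\1\b)+'

def collapse_repeats(text):
    return re.sub(PATTERN, r'\1', text, flags=re.IGNORECASE)

examples = [
    "hand off tool hand off tool confirmed with receiving unit",
    "vss vss vss .",
    "pt is restless . pt is restless .",  # separated by " . ", so not adjacent
]
results = [collapse_repeats(e) for e in examples]
for before, after in zip(examples, results):
    print(f"{before!r} -> {after!r}")
```

The third example is unchanged because the period sits between the two copies; repeats of that kind are only addressed later, at the sentence level.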


**pyspellchecker to detect and correct spelling errors in clinical notes**

In [7]:
#!pip install pyspellchecker
#!pip install pyspellchecker==0.8.2

Collecting pyspellchecker
  Downloading pyspellchecker-0.8.3-py3-none-any.whl.metadata (9.5 kB)
Downloading pyspellchecker-0.8.3-py3-none-any.whl (7.2 MB)
Installing collected packages: pyspellchecker
Successfully installed pyspellchecker-0.8.3
Collecting pyspellchecker==0.8.2
  Downloading pyspellchecker-0.8.2-py3-none-any.whl.metadata (9.4 kB)
Downloading pyspellchecker-0.8.2-py3-none-any.whl (7.1 MB)
Installing collected packages: pyspellchecker
  Attempting uninstall: pyspellchecker
    Found existing installation: pyspellchecker 0.8.3
    Uninstalling pyspellchecker-0.8.3:
      Successfully uninstalled pyspellchecker-0.8.3
Successfully installed pyspellchecker-0.8.2


In [8]:
from spellchecker import SpellChecker

spell = SpellChecker()

def spell_check(text):
    words = text.split()
    corrected_words = []
    for word in words:
        # Only correct if it's alphabetic AND at least 8 characters long
        if word.isalpha() and len(word) >= 8:
            corrected = spell.correction(word)
            corrected_words.append(corrected if corrected else word)
        else:
            corrected_words.append(word)
    return ' '.join(corrected_words)

# Apply to DataFrame
df['NurseNotes_spellchecked'] = df['NurseNotes_phraseText'].apply(spell_check)

In [9]:
row_ids = [5, 19]  # Indices 5 and 19 (0-based), i.e. rows 6 and 20

for idx in row_ids:
    print(f"\nRow {idx + 1} — Original Nurse Note:")
    print(df.loc[idx, 'NurseNotes_phraseText'])

    print(f"\nRow {idx + 1} — Spell-Checked Note:")
    print(df.loc[idx, 'NurseNotes_spellchecked'])
    print("-" * 80)


Row 6 — Original Nurse Note:
pt bp elevated rn notified md tummala orders initated .

Row 6 — Spell-Checked Note:
pt bp elevated rn notified md tummala orders initiated .
--------------------------------------------------------------------------------

Row 20 — Original Nurse Note:
unable to give report . nurse currently unavailble will call back .

Row 20 — Spell-Checked Note:
unable to give report . nurse currently unavailable will call back .
--------------------------------------------------------------------------------
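
The length filter in `spell_check` decides which tokens ever reach the spell checker. The sketch below isolates that filter, without calling pyspellchecker, to show which words in the row 6 note would be eligible. The helper name `eligible_for_correction` and its `min_len` parameter are assumptions introduced here.

```python
def eligible_for_correction(word, min_len=8):
    """Mirror of the guard in spell_check: alphabetic and at least min_len chars."""
    return word.isalpha() and len(word) >= min_len

note = "pt bp elevated rn notified md tummala orders initated ."
eligible = [w for w in note.split() if eligible_for_correction(w)]
print(eligible)  # ['elevated', 'notified', 'initated']
```

Short clinical abbreviations (`pt`, `bp`, `rn`, `md`) and the seven-letter name `tummala` fall below the threshold, so they are passed through unchanged; this appears to be the reason for restricting correction to longer words.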


### **Zero-Shot Model for Note Classification | Restraint vs. Non-Restraint Notes**

In [10]:
#!pip install --upgrade transformers --quiet

In [11]:
from transformers import pipeline
import pandas as pd
from collections import Counter
import matplotlib.pyplot as plt

# Load the zero-shot classification model
zero_shot_classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

Device set to use cpu


**Zero-Shot Classifier**

In [12]:
#def is_restraint_note(note):
    #result = zero_shot_classifier(note,
        #candidate_labels=["restraint event", "non-restraint"],
        #hypothesis_template="This note is about {}.")
    #return result['labels'][0] == "restraint event" and result['scores'][0] > 0.7  # threshold

# Apply to DataFrame
#df['is_restraint'] = df['NurseNotes_spellchecked'].apply(lambda x: is_restraint_note(str(x)))
#df_restraint = df[df['is_restraint']]


**Zero-Shot Classifier + Custom Rules**

In [13]:
# 1. Define custom keyword check (case-insensitive substring match)
def keyword_check(text):
    keywords = ["restraint","soft wrist restriants","placed","discontinued","orders","initated","initate","continued","Renewal","sitter","bed side"]
    return any(kw.lower() in text.lower() for kw in keywords)

# 2. Combine both in one function
def is_restraint_note(text):
    # Zero-shot part: accept only a confident "restraint event" top label
    try:
        result = zero_shot_classifier(text, candidate_labels=["restraint event", "non-restraint"])
        label = result['labels'][0]
        score = result['scores'][0]
        zero_shot_decision = label == "restraint event" and score >= 0.75
    except Exception:
        # Treat a classifier failure as "no decision" and fall back to keywords
        zero_shot_decision = False

    # Custom rule part
    keyword_match = keyword_check(text)

    # Final decision: flag the note if either signal fires
    return zero_shot_decision or keyword_match

df['is_restraint'] = df['NurseNotes_spellchecked'].apply(lambda x: is_restraint_note(str(x)))
df_restraint = df[df['is_restraint']]
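
The decision rule can be exercised without loading the model. The sketch below uses hand-written dictionaries in the shape the zero-shot pipeline returns (`labels` and `scores`, sorted by descending score) and a shortened keyword list. All helper names, the two-keyword list, and the scores are assumptions for illustration, not real model output.

```python
def zero_shot_says_restraint(result, threshold=0.75):
    """True when the top label is 'restraint event' with enough confidence."""
    return result['labels'][0] == "restraint event" and result['scores'][0] >= threshold

def keyword_says_restraint(text, keywords=("restraint", "sitter")):
    # Case-insensitive substring match, as in keyword_check (shortened list)
    return any(kw in text.lower() for kw in keywords)

def combined_decision(text, result):
    # Either signal is enough to flag the note
    return zero_shot_says_restraint(result) or keyword_says_restraint(text)

# Hand-written stand-ins for classifier output (not real model scores)
low_conf = {'labels': ["restraint event", "non-restraint"], 'scores': [0.60, 0.40]}
negative = {'labels': ["non-restraint", "restraint event"], 'scores': [0.90, 0.10]}
high_conf = {'labels': ["restraint event", "non-restraint"], 'scores': [0.85, 0.15]}

a = combined_decision("pt is restless on soft wrist restraints", low_conf)
b = combined_decision("pt provided food tray . vss .", negative)
c = combined_decision("pt combative pulling at lines", high_conf)
print(a, b, c)  # True False True
```

Because the two signals are joined with `or`, broad keywords such as "placed" or "orders" can flag notes that have nothing to do with restraints; the combination favours recall over precision.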

In [14]:
from datetime import datetime

# Generate today's date string
today_str = datetime.today().strftime('%m%d%Y')

# Define filename with today's date
filename = f'Restraint_Notes-{today_str}.csv'

# Save DataFrame to CSV
df_restraint.to_csv(filename, index=False)

# If using Google Colab, download the file
from google.colab import files
files.download(filename)


In [15]:
df_restraint.head()

Unnamed: 0,VisitID,NoteSeqID,Type,ServiceDateTime,FullNoteText,NurseNotes_cleanText,NurseNotes_phraseText,NurseNotes_spellchecked,is_restraint
1,PAT3,2,Emergency Department Notes,3/3/2025 13:12,Pt used the bathroom. Pt is x2 assist; unable ...,pt used the bathroom . pt is x2 assist unable ...,pt used the bathroom . pt is x2 assist unable ...,pt used the bathroom . pt is x2 assist unable ...,True
3,PAT3,4,Emergency Department Notes,3/3/2025 13:21,CT tech taking pt for scan; pending results.,ct tech taking pt for scan pending results .,ct tech taking pt for scan pending results .,ct tech taking pt for scan pending results .,True
5,PAT3,6,Emergency Department Notes,3/3/2025 14:38,Pt BP elevated; RN notified MD Tummala; orders...,pt bp elevated rn notified md tummala orders i...,pt bp elevated rn notified md tummala orders i...,pt bp elevated rn notified md tummala orders i...,True
8,PAT3,9,Emergency Department Notes,3/3/2025 16:10,Pt still trying to get out of bed. Ativan; no ...,pt still trying to get out of bed . ativan no ...,pt still trying to get out of bed . ativan no ...,pt still trying to get out of bed . ativan no ...,True
10,PAT3,11,Emergency Department Notes,3/3/2025 17:20,Soft wrists and ankle restraints intiated; pt ...,soft wrists and ankle restraints intiated pt i...,soft wrists and ankle restraints intiated pt i...,soft wrists and ankle restraints initiated pt ...,True


**Copying final restraint notes back to df**

In [16]:
df = df_restraint.copy()

**--------------------------------------------------------------------------------------------------------------------------------------------------------------**

### **Group Notes Day-Wise**

In [17]:
# Combine the ServiceDateTime and NurseNotes_spellchecked into a new column called FullNotes
df['FullNotes'] = df['ServiceDateTime'].apply(lambda x: f"On {x.split()[0]} at {x.split()[1]}") + " " + df['NurseNotes_spellchecked']

# Preview the first few rows, including the new "FullNotes" column
df.head()

Unnamed: 0,VisitID,NoteSeqID,Type,ServiceDateTime,FullNoteText,NurseNotes_cleanText,NurseNotes_phraseText,NurseNotes_spellchecked,is_restraint,FullNotes
1,PAT3,2,Emergency Department Notes,3/3/2025 13:12,Pt used the bathroom. Pt is x2 assist; unable ...,pt used the bathroom . pt is x2 assist unable ...,pt used the bathroom . pt is x2 assist unable ...,pt used the bathroom . pt is x2 assist unable ...,True,On 3/3/2025 at 13:12 pt used the bathroom . pt...
3,PAT3,4,Emergency Department Notes,3/3/2025 13:21,CT tech taking pt for scan; pending results.,ct tech taking pt for scan pending results .,ct tech taking pt for scan pending results .,ct tech taking pt for scan pending results .,True,On 3/3/2025 at 13:21 ct tech taking pt for sca...
5,PAT3,6,Emergency Department Notes,3/3/2025 14:38,Pt BP elevated; RN notified MD Tummala; orders...,pt bp elevated rn notified md tummala orders i...,pt bp elevated rn notified md tummala orders i...,pt bp elevated rn notified md tummala orders i...,True,On 3/3/2025 at 14:38 pt bp elevated rn notifie...
8,PAT3,9,Emergency Department Notes,3/3/2025 16:10,Pt still trying to get out of bed. Ativan; no ...,pt still trying to get out of bed . ativan no ...,pt still trying to get out of bed . ativan no ...,pt still trying to get out of bed . ativan no ...,True,On 3/3/2025 at 16:10 pt still trying to get ou...
10,PAT3,11,Emergency Department Notes,3/3/2025 17:20,Soft wrists and ankle restraints intiated; pt ...,soft wrists and ankle restraints intiated pt i...,soft wrists and ankle restraints intiated pt i...,soft wrists and ankle restraints initiated pt ...,True,On 3/3/2025 at 17:20 soft wrists and ankle res...


In [18]:
import pandas as pd

# Ensure ServiceDateTime is in datetime format
df['ServiceDateTime'] = pd.to_datetime(df['ServiceDateTime'])

# Extract only the date part
df['ServiceDate'] = df['ServiceDateTime'].dt.date

# Group by ServiceDate and merge notes
merged_notes_df = df.groupby('ServiceDate')['FullNotes'].apply(lambda x: ' '.join(x.dropna())).reset_index()

# Rename for clarity
merged_notes_df.columns = ['ServiceDate', 'MergedNote']

# Preview
merged_notes_df.head()

Unnamed: 0,ServiceDate,MergedNote
0,2025-03-03,On 3/3/2025 at 13:12 pt used the bathroom . pt...
1,2025-03-04,On 3/4/2025 at 1:40 patient is sleeping . no s...
2,2025-03-05,On 3/5/2025 at 7:30 patient sleeping eyes clos...
3,2025-03-06,On 3/6/2025 at 7:45 opening shift note assumed...
4,2025-03-07,On 3/7/2025 at 7:05 closing shift note endorse...


In [19]:
import re

def remove_redundant_dates(text):
    # Walk over every "On M/D/YYYY" occurrence: keep the first of each date, blank the rest
    seen = set()
    result = []

    for match in re.finditer(r'On \d{1,2}/\d{1,2}/\d{4}', text):
        date_str = match.group()
        if date_str not in seen:
            seen.add(date_str)
            result.append((match.start(), match.end(), date_str))
        else:
            # Mark for removal (replace later)
            result.append((match.start(), match.end(), ''))

    # Remove duplicates from the text (starting from the end to preserve indices)
    clean_text = text
    for start, end, replacement in reversed(result):
        clean_text = clean_text[:start] + replacement + clean_text[end:]

    return clean_text

# Apply to your merged notes
merged_notes_df['DaywiseNotes'] = merged_notes_df['MergedNote'].apply(remove_redundant_dates)

# Preview result
merged_notes_df[['ServiceDate', 'DaywiseNotes']]

Unnamed: 0,ServiceDate,DaywiseNotes
0,2025-03-03,On 3/3/2025 at 13:12 pt used the bathroom . pt...
1,2025-03-04,On 3/4/2025 at 1:40 patient is sleeping . no s...
2,2025-03-05,On 3/5/2025 at 7:30 patient sleeping eyes clos...
3,2025-03-06,On 3/6/2025 at 7:45 opening shift note assumed...
4,2025-03-07,On 3/7/2025 at 7:05 closing shift note endorse...
5,2025-03-08,On 3/8/2025 at 18:19 patient transferred to un...
6,2025-03-10,On 3/10/2025 at 19:30 opening shift note assum...
7,2025-03-11,On 3/11/2025 at 10:36 resident md cruz moreno ...
8,2025-03-12,On 3/12/2025 at 8:00 opening shift note assume...
9,2025-03-13,On 3/13/2025 at 9:58 resident rounds md cruz a...
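
As a quick check of what the date clean-up does to a merged day, the sketch below applies the same idea to a two-note string. It is written as a compact `re.sub` callback rather than the index-based loop, which should give the same result under the assumption that only the exact `On M/D/YYYY` prefix is targeted. The name `drop_repeated_dates` is hypothetical.

```python
import re

DATE_PATTERN = re.compile(r'On \d{1,2}/\d{1,2}/\d{4}')

def drop_repeated_dates(text):
    """Keep the first 'On M/D/YYYY' per date and delete later repeats."""
    seen = set()

    def keep_first(match):
        date_str = match.group()
        if date_str in seen:
            return ''
        seen.add(date_str)
        return date_str

    return DATE_PATTERN.sub(keep_first, text)

merged = ("On 3/3/2025 at 13:12 pt used the bathroom . "
          "On 3/3/2025 at 13:21 ct tech taking pt for scan .")
result = drop_repeated_dates(merged)
print(result)
```

Note that deleting the repeated date leaves two consecutive spaces before `at 13:21`; the time stamps themselves are preserved.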


In [20]:
# Step 1: Define the function to remove repeated sentences
def remove_repeated_sentences(text):
    seen = set()
    unique_sentences = []

    # Split text into sentences using '. ' as the delimiter
    for sentence in text.split('. '):
        cleaned = sentence.strip().lower()  # Normalize (remove extra spaces and lowercase)
        if cleaned and cleaned not in seen:
            seen.add(cleaned)  # Track the sentence
            unique_sentences.append(sentence.strip())  # Keep original casing

    return '. '.join(unique_sentences)

# Step 2: Apply the function to the DaywiseNotes column
merged_notes_df['DaywiseNotes_Cleaned'] = merged_notes_df['DaywiseNotes'].apply(remove_repeated_sentences)

# Step 3: Preview the cleaned column
merged_notes_df[['ServiceDate', 'DaywiseNotes_Cleaned']].head()

Unnamed: 0,ServiceDate,DaywiseNotes_Cleaned
0,2025-03-03,On 3/3/2025 at 13:12 pt used the bathroom. pt ...
1,2025-03-04,On 3/4/2025 at 1:40 patient is sleeping. no si...
2,2025-03-05,On 3/5/2025 at 7:30 patient sleeping eyes clos...
3,2025-03-06,On 3/6/2025 at 7:45 opening shift note assumed...
4,2025-03-07,On 3/7/2025 at 7:05 closing shift note endorse...
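
The sentence-level de-duplication is easiest to follow on a short string. The sketch below copies the function and runs it on two made-up day notes. The second one illustrates a possible edge case: a repeated final sentence keeps its trailing ` .`, so it does not compare equal to the earlier copy and survives.

```python
def remove_repeated_sentences(text):
    seen = set()
    unique_sentences = []
    # Split on '. ' and compare sentences case-insensitively
    for sentence in text.split('. '):
        cleaned = sentence.strip().lower()
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique_sentences.append(sentence.strip())
    return '. '.join(unique_sentences)

# Made-up notes, for illustration only
day_note = "vss . pt is confuse . vss . pt is calm ."
trailing = "vss . pt is calm . vss ."

deduped = remove_repeated_sentences(day_note)
kept_tail = remove_repeated_sentences(trailing)
print(deduped)    # vss. pt is confuse. pt is calm .
print(kept_tail)  # vss. pt is calm. vss .
```

Stripping a trailing period before comparison would close that gap, at the cost of slightly changing the function's output format.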


In [21]:
from datetime import datetime

# Generate today's date string
today_str = datetime.today().strftime('%m%d%Y')

# Define filename with today's date
filename = f'Final-Notes-{today_str}.csv'

# Save DataFrame to CSV
merged_notes_df.to_csv(filename, index=False)

# If using Google Colab, download the file
from google.colab import files
files.download(filename)


**--------------------------------------------------------------------------------------------------------------------------------------------------------------**

### **BART - facebook/bart-large-cnn**

In [22]:
# Install Required Libraries
#!pip install transformers
#!pip install torch
!pip install rouge_score
!pip install nltk

Collecting rouge_score
  Downloading rouge_score-0.1.2.tar.gz (17 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: rouge_score
  Building wheel for rouge_score (setup.py) ... done
  Created wheel for rouge_score: filename=rouge_score-0.1.2-py3-none-any.whl size=24934 sha256=6b1eb748982bfe0098b0e0f3350b5cbc85b6c2aa4de7e7ae0c72aacecccb9499
  Stored in directory: /root/.cache/pip/wheels/1e/19/43/8a442dc83660ca25e163e1bd1f89919284ab0d0c1475475148
Successfully built rouge_score
Installing collected packages: rouge_score
Successfully installed rouge_score-0.1.2


In [23]:
# Import Libraries
import pandas as pd
from transformers import BartTokenizer, BartForConditionalGeneration
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu
from datetime import date
import nltk

# Ensure required packages are downloaded
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

In [24]:
# Initialize Tokenizer and Model
tokenizer = BartTokenizer.from_pretrained('facebook/bart-large-cnn')
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large-cnn')

**BART Abstractive Summarization Logic**

In [25]:
#BART Abstractive Summarization Logic
def abstractive_summarization_bart(text, max_length=300, min_length=100):
    inputs = tokenizer.encode(text, return_tensors='pt', max_length=1024, truncation=True)
    summary_ids = model.generate(
        inputs,
        max_length=max_length,
        min_length=min_length,
        length_penalty=2.0,
        num_beams=8,
        early_stopping=True
    )
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# ROUGE and BLEU Evaluation
def evaluate_summary(reference, hypothesis):
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
    scores = scorer.score(reference, hypothesis)
    reference_tokens = reference.split()
    hypothesis_tokens = hypothesis.split()
    bleu_score = sentence_bleu([reference_tokens], hypothesis_tokens)
    return {
        "ROUGE-1": round(scores['rouge1'].fmeasure, 4),
        "ROUGE-2": round(scores['rouge2'].fmeasure, 4),
        "ROUGE-L": round(scores['rougeL'].fmeasure, 4),
        "BLEU": round(bleu_score, 4)
    }
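
For readers unfamiliar with the metric, the following is a simplified sketch of what a ROUGE-1 F-measure captures: clipped unigram overlap between reference and hypothesis, turned into precision, recall and their harmonic mean. It skips the stemming and tokenization that `rouge_scorer` applies, so its numbers will not always match the library; the function name `rouge1_f` and the example strings are assumptions.

```python
from collections import Counter

def rouge1_f(reference, hypothesis):
    """Unigram-overlap F1 on whitespace tokens (no stemming)."""
    ref_counts = Counter(reference.split())
    hyp_counts = Counter(hypothesis.split())
    # Counter intersection clips each word to its minimum count in both texts
    overlap = sum((ref_counts & hyp_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

reference = "pt is restless on soft wrist restraints"
hypothesis = "pt restless on restraints"
score = rouge1_f(reference, hypothesis)
print(round(score, 4))  # 0.7273
```

In this notebook the "reference" passed to `evaluate_summary` is the source note itself, so a high score mainly indicates that the summary reuses words from the note, not that it matches a human-written summary.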

In [26]:
# Step 1: Initialize lists
summaries = []
metrics = []

# Step 2: Run BART summarization and evaluation on each note
for note in merged_notes_df['DaywiseNotes_Cleaned']:
    summary = abstractive_summarization_bart(note)
    summaries.append(summary)
    metric = evaluate_summary(note, summary)
    metrics.append(metric)

# Step 3: Create the new DataFrame df_bart
df_bart = pd.DataFrame({
    'ServiceDate': merged_notes_df['ServiceDate'].values,
    'DaywiseNotes_Cleaned': merged_notes_df['DaywiseNotes_Cleaned'].values,
    'Summary_BART': summaries,
    'Metrics_BART': metrics
})

# Step 4: Save to CSV
from datetime import date
today = date.today().strftime("%Y-%m-%d")
df_bart.to_csv(f'BART_Summary_{today}.csv', index=False)
print(f'Saved as BART_Summary_{today}.csv')

Saved as BART_Summary_2025-06-11.csv


**BART - Hyperparameter test to select the combination with the best evaluation metrics**

In [None]:
# Abstractive Summarization Logic with Dynamic Parameters
def abstractive_summarization_bart(text, max_length, min_length, length_penalty, num_beams):
    prompt = "Summarize the following nurse note, concentrating on restraint orders and patient behavior:\n\n"
    final_input = prompt + text

    inputs = tokenizer.encode(final_input, return_tensors='pt', max_length=1024, truncation=True)
    summary_ids = model.generate(
        inputs,
        max_length=max_length,
        min_length=min_length,
        length_penalty=length_penalty,
        num_beams=num_beams,
        early_stopping=True,
        no_repeat_ngram_size=3
    )
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# ROUGE and BLEU Evaluation (No smoothing)
from nltk.translate.bleu_score import sentence_bleu

def evaluate_summary(reference, hypothesis):
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
    scores = scorer.score(reference, hypothesis)
    reference_tokens = reference.split()
    hypothesis_tokens = hypothesis.split()

    bleu_score = sentence_bleu([reference_tokens], hypothesis_tokens)  # No smoothing

    return {
        "ROUGE-1": round(scores['rouge1'].fmeasure, 4),
        "ROUGE-2": round(scores['rouge2'].fmeasure, 4),
        "ROUGE-L": round(scores['rougeL'].fmeasure, 4),
        "BLEU": round(bleu_score, 4)
    }

# Sample Data for Testing
texts = ['''
On 3/3/2025 at 12:31 pt on ems gurney by front nurse station . at 13:12 pt used the bathroom . pt is x2 assist unable to provide ua specimen will try later again . at 13:15 hand off tool confirmed with receiving unit oncoming rn chiquita .  at 13:21 ct tech taking pt for scan pending results .
at 13:42 pt provided food tray . vss .  at 14:38 pt bp elevated rn notified md tummala orders initiated . at 14:53 labatolol held bp 13999 mmhg . at 15:38 pt keeps getting out of bed multiple times . vitals bp improving . rn notified md tummala in regards pt being restless and agitated will initiate medication for comfort .
at 16:10 pt still trying to get out of bed . ativan no effect . will initiated restraints if pt does not cooperate .  at 17:05 np opkan at bedside assessing the pt . at 17:20 soft wrists and ankle restraints initiated pt is too combative towards staff and not following nurses orders . pt constantly pulling out lines and getting out of bed . neuro pt is not confuse aox4 . perrla no issues . cnl noted .
at 17:47 soft wrists and ankle restraints initiated pt constantly pulling out lines and getting out of bed . neuro aox2 only to self and place . perrla no issues . cnl noted .  at 17:48 md tummala at bedside to assess pt . pt keeps yelling orders initiated .
at 18:33 pt is restless on soft wrist restraints . pulses are all palpable no issues . vss . pt is confuse . at 19:30 initial contact pt presents to ed with fall injury , contusion to l side of the face , swollen , and skin tight . pt reports having no pain . pt presents with 1 edema to the l hand . pt denies any ss of cp , sob , nvd ,
dizziness or blurred vision . no ss of respiratory distress . 20g iv placed in the l fa . pt aox2 pmh tbi , htn , dm , schizophrenia , seziure pt currently placed in bed connected to cardiac , bp , and pulse ox monitoring . comfort and safety measures implemented .  at 19:30 pt is restless on soft wrist restraints . pulses are all palpable no issues . vss . pt is confuse .
at 19:30 offered pt urinal at this time . pt refused stating he wants to go to snow beach . attempted to reorient pt back to hospital setting . pt ao x 2 at 20:43 bed assignment bed 285b nurse ivy sitter room at 20:50 tele req and sent
at 21:00 unable to give report . nurse currently unavailable will call back . at 19:30 pt skin intact for restraints  at 21:14 unable to give report . nurse currently unavailable will call back . at 21:19 nurse unavailable , nurse ivy stated she will call back to our extension
at 20:30 pt is restless on soft wrist restraints . pulses are all palpable no issues . vss . pt is confuse . at 20:30 skin assessment skin intact for restraints . at 20:30 offered pt urinal . denied at this time
at 21:27 report given to nurse , ivy . all questions answered . at 21:27 pt being transferred to floor with restraints . skin intact with adequate room on cuffs at 21:40 pt . placed in soft restraints admitted from er . patient is calm and very sleepy . received patient with 4 point restraint applied . per report patient was very agitated and combative in er pulling at lines or tubes
and attempting to ambulate , risk of falling and is non compliant with safety instruction . dc 4 point restraint and applied soft restraint to both wrist as patient is calm and very sleepy at this time . medical non violent restraint order obtained from np okpan . will continue to monitor if patient needs continuous restraint . not able to contact patients family at this time . restraints placed , see restraint assessment for further charting .
at 21:00 patient bathelinen change patient is incontinent in urine . keep patient clean and dry .skin integrity assessed for any changes . linens changed . patient repositioned for comfort . fall precaution and seizure precaution initiated . will continue to monitor . addendum 030325
at 2341 by alas , rn 030325 at 2200 patient bathelinen change patient is incontinent in urine . keep patient clean and dry .skin integrity assessed for any changes . linens changed . patient repositioned for comfort . fall precaution and seizure precaution initiated . will continue to monitor .
at 23:40 patient is calm and sleeping . no sign of distresssob noted . sitter at bed side . bed alarm on for safety . will continue to monitor patient .
''']

# Hyperparameter Ranges
length_penalties = [0.8, 1.0, 1.2, 1.5, 1.8, 2.0]
num_beams_list = [2, 4, 6, 8]
max_lengths = [150, 200, 250]
min_lengths = [10, 20, 30]

# Loop through all combinations
from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%d_%H%M")

# The context manager closes the file even if a generation call raises
with open(f"Summary_BART_{timestamp}.txt", "w", encoding="utf-8") as output_file:
    for text in texts:
        for max_len in max_lengths:
            for min_len in min_lengths:
                for length_penalty in length_penalties:
                    for num_beams in num_beams_list:
                        output_file.write(f"\n=== Parameters: max_length={max_len}, min_length={min_len}, length_penalty={length_penalty}, num_beams={num_beams} ===\n")
                        summary = abstractive_summarization_bart(
                            text,
                            max_length=max_len,
                            min_length=min_len,
                            length_penalty=length_penalty,
                            num_beams=num_beams
                        )
                        output_file.write(f"Abstractive Summary:\n{summary}\n")
                        scores = evaluate_summary(text, summary)
                        output_file.write(f"Evaluation Metrics:\n{scores}\n")
                        output_file.write("-" * 100 + "\n")
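
The size of the search is easy to underestimate. The sketch below enumerates the same four hyperparameter lists with `itertools.product`, which is one way to flatten the nested loops; using `product` here is a suggestion, not what the notebook does.

```python
from itertools import product

length_penalties = [0.8, 1.0, 1.2, 1.5, 1.8, 2.0]
num_beams_list = [2, 4, 6, 8]
max_lengths = [150, 200, 250]
min_lengths = [10, 20, 30]

# Same iteration order as the nested loops: max_len, min_len, length_penalty, num_beams
grid = list(product(max_lengths, min_lengths, length_penalties, num_beams_list))
print(len(grid))  # 216
print(grid[0])    # (150, 10, 0.8, 2)
```

Each combination triggers one full beam-search generation, so a single input text means 216 `generate` calls; on CPU it may be worth trimming the lists before running the full sweep.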



### **T5 - google/flan-t5-large**

In [27]:
from transformers import T5Tokenizer, T5ForConditionalGeneration
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu
import torch

In [28]:
# Initialize Tokenizer and Model
tokenizer = T5Tokenizer.from_pretrained('google/flan-t5-large')
model = T5ForConditionalGeneration.from_pretrained('google/flan-t5-large')

**T5 Abstractive Summarization Logic**

In [31]:
# T5 Abstractive Summarization Logic
def abstractive_summarization_t5(text, max_length=300, min_length=100):
    # T5 expects a task-specific prompt
    input_text = "summarize: " + text

    # Tokenization and generation
    inputs = tokenizer.encode(input_text, return_tensors='pt', max_length=1024, truncation=True)
    summary_ids = model.generate(
        inputs,
        max_length=max_length,
        min_length=min_length,
        length_penalty=2.0,
        #length_penalty=1.0,
        num_beams=8,
        #num_beams=6,
        early_stopping=True,
        no_repeat_ngram_size=3
    )
    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# ROUGE and BLEU Evaluation
def evaluate_summary(reference, hypothesis):
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
    scores = scorer.score(reference, hypothesis)
    reference_tokens = reference.split()
    hypothesis_tokens = hypothesis.split()
    bleu_score = sentence_bleu([reference_tokens], hypothesis_tokens)

    return {
        "ROUGE-1": scores['rouge1'].fmeasure,
        "ROUGE-2": scores['rouge2'].fmeasure,
        "ROUGE-L": scores['rougeL'].fmeasure,
        "BLEU": bleu_score
    }
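
The ROUGE-1 F-measure reported by `evaluate_summary` can be approximated by hand, which helps when reading the scores. The sketch below is a simplified stand-in, not the `rouge_score` implementation: it assumes plain whitespace tokenization and skips stemming, and the helper name `rouge1_f` and the two example strings are hypothetical. Because the notebook passes the full note as the reference, a short summary can reach perfect precision while recall stays low.

```python
from collections import Counter

def rouge1_f(reference: str, hypothesis: str) -> float:
    """Unigram-overlap F-measure on whitespace tokens (no stemming)."""
    ref_counts = Counter(reference.lower().split())
    hyp_counts = Counter(hypothesis.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts
    overlap = sum((ref_counts & hyp_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical example strings in the style of the cleaned notes
reference = "pt is restless on soft wrist restraints"
hypothesis = "pt restless on restraints"
score = rouge1_f(reference, hypothesis)
print(round(score, 4))
```

Here every summary token appears in the reference (precision 4/4), but only four of the seven reference tokens are covered (recall 4/7), so the F-measure is 8/11.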

In [32]:
# Step 1: Initialize lists
summaries = []
metrics = []

# Step 2: Run T5 summarization and evaluation on each note
for note in merged_notes_df['DaywiseNotes_Cleaned']:
    summary = abstractive_summarization_t5(note)
    summaries.append(summary)
    metric = evaluate_summary(note, summary)
    metrics.append(metric)

# Step 3: Create the new DataFrame df_t5
df_t5 = pd.DataFrame({
    'ServiceDate': merged_notes_df['ServiceDate'].values,
    'DaywiseNotes_Cleaned': merged_notes_df['DaywiseNotes_Cleaned'].values,
    'Summary_T5': summaries,
    'Metrics_T5': metrics
})

# Step 4: Save to CSV
from datetime import date
today = date.today().strftime("%Y-%m-%d")
output_path = f'T5_Summary_{today}.csv'  # One variable so the saved and printed names match
df_t5.to_csv(output_path, index=False)
print(f'Saved as {output_path}')


Saved as T5_Summary_2025-06-11.csv


**T5 - Hyperparameter test to select the best combination based on evaluation metrics**

In [None]:
# Abstractive Summarization Logic for T5 with Dynamic Parameters
def abstractive_summarization_t5(text, max_length, min_length, length_penalty, num_beams):
    input_text = "summarize: " + text
    inputs = tokenizer.encode(input_text, return_tensors='pt', max_length=1024, truncation=True)

    summary_ids = model.generate(
        inputs,
        max_length=max_length,
        min_length=min_length,
        length_penalty=length_penalty,
        num_beams=num_beams,
        early_stopping=True,
        no_repeat_ngram_size=3
    )

    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# ROUGE and BLEU Evaluation (No smoothing)
from nltk.translate.bleu_score import sentence_bleu
from rouge_score import rouge_scorer

def evaluate_summary(reference, hypothesis):
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
    scores = scorer.score(reference, hypothesis)
    reference_tokens = reference.split()
    hypothesis_tokens = hypothesis.split()

    bleu_score = sentence_bleu([reference_tokens], hypothesis_tokens)  # No smoothing

    return {
        "ROUGE-1": round(scores['rouge1'].fmeasure, 4),
        "ROUGE-2": round(scores['rouge2'].fmeasure, 4),
        "ROUGE-L": round(scores['rougeL'].fmeasure, 4),
        "BLEU": round(bleu_score, 4)
    }

# Sample Data for Testing
texts = ['''
On 3/3/2025 at 12:31 pt on ems gurney by front nurse station . at 13:12 pt used the bathroom . pt is x2 assist unable to provide ua specimen will try later again . at 13:15 hand off tool confirmed with receiving unit oncoming rn chiquita .  at 13:21 ct tech taking pt for scan pending results .
at 13:42 pt provided food tray . vss .  at 14:38 pt bp elevated rn notified md tummala orders initiated . at 14:53 labatolol held bp 13999 mmhg . at 15:38 pt keeps getting out of bed multiple times . vitals bp improving . rn notified md tummala in regards pt being restless and agitated will initiate medication for comfort .
at 16:10 pt still trying to get out of bed . ativan no effect . will initiated restraints if pt does not cooperate .  at 17:05 np opkan at bedside assessing the pt . at 17:20 soft wrists and ankle restraints initiated pt is too combative towards staff and not following nurses orders . pt constantly pulling out lines and getting out of bed . neuro pt is not confuse aox4 . perrla no issues . cnl noted .
at 17:47 soft wrists and ankle restraints initiated pt constantly pulling out lines and getting out of bed . neuro aox2 only to self and place . perrla no issues . cnl noted .  at 17:48 md tummala at bedside to assess pt . pt keeps yelling orders initiated .
at 18:33 pt is restless on soft wrist restraints . pulses are all palpable no issues . vss . pt is confuse . at 19:30 initial contact pt presents to ed with fall injury , contusion to l side of the face , swollen , and skin tight . pt reports having no pain . pt presents with 1 edema to the l hand . pt denies any ss of cp , sob , nvd ,
dizziness or blurred vision . no ss of respiratory distress . 20g iv placed in the l fa . pt aox2 pmh tbi , htn , dm , schizophrenia , seziure pt currently placed in bed connected to cardiac , bp , and pulse ox monitoring . comfort and safety measures implemented .  at 19:30 pt is restless on soft wrist restraints . pulses are all palpable no issues . vss . pt is confuse .
at 19:30 offered pt urinal at this time . pt refused stating he wants to go to snow beach . attempted to reorient pt back to hospital setting . pt ao x 2 at 20:43 bed assignment bed 285b nurse ivy sitter room at 20:50 tele req and sent
at 21:00 unable to give report . nurse currently unavailable will call back . at 19:30 pt skin intact for restraints  at 21:14 unable to give report . nurse currently unavailable will call back . at 21:19 nurse unavailable , nurse ivy stated she will call back to our extension
at 20:30 pt is restless on soft wrist restraints . pulses are all palpable no issues . vss . pt is confuse . at 20:30 skin assessment skin intact for restraints . at 20:30 offered pt urinal . denied at this time
at 21:27 report given to nurse , ivy . all questions answered . at 21:27 pt being transferred to floor with restraints . skin intact with adequate room on cuffs at 21:40 pt . placed in soft restraints admitted from er . patient is calm and very sleepy . received patient with 4 point restraint applied . per report patient was very agitated and combative in er pulling at lines or tubes
and attempting to ambulate , risk of falling and is non compliant with safety instruction . dc 4 point restraint and applied soft restraint to both wrist as patient is calm and very sleepy at this time . medical non violent restraint order obtained from np okpan . will continue to monitor if patient needs continuous restraint . not able to contact patients family at this time . restraints placed , see restraint assessment for further charting .
at 21:00 patient bathelinen change patient is incontinent in urine . keep patient clean and dry .skin integrity assessed for any changes . linens changed . patient repositioned for comfort . fall precaution and seizure precaution initiated . will continue to monitor . addendum 030325
at 2341 by alas , rn 030325 at 2200 patient bathelinen change patient is incontinent in urine . keep patient clean and dry .skin integrity assessed for any changes . linens changed . patient repositioned for comfort . fall precaution and seizure precaution initiated . will continue to monitor .
at 23:40 patient is calm and sleeping . no sign of distresssob noted . sitter at bed side . bed alarm on for safety . will continue to monitor patient .
''']

# Hyperparameter Ranges
length_penalties = [0.8, 1.0, 1.2, 1.5, 1.8, 2.0]
num_beams_list = [2, 4, 6, 8]
max_lengths = [150, 200, 250]
min_lengths = [10, 20, 30]

# Output File Setup
from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%d_%H%M")
# The with-block closes the file even if generation fails part-way through the search
with open(f"Summary_T5_{timestamp}.txt", "w", encoding="utf-8") as output_file:
    # Hyperparameter Search Loop
    for text in texts:
        for max_len in max_lengths:
            for min_len in min_lengths:
                for length_penalty in length_penalties:
                    for num_beams in num_beams_list:
                        output_file.write(f"\n=== Parameters: max_length={max_len}, min_length={min_len}, length_penalty={length_penalty}, num_beams={num_beams} ===\n")
                        summary = abstractive_summarization_t5(
                            text,
                            max_length=max_len,
                            min_length=min_len,
                            length_penalty=length_penalty,
                            num_beams=num_beams
                        )
                        output_file.write(f"Abstractive Summary:\n{summary}\n")
                        scores = evaluate_summary(text, summary)
                        output_file.write(f"Evaluation Metrics:\n{scores}\n")
                        output_file.write("-" * 100 + "\n")
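
Before starting the search, it can help to know how many `generate` calls the four ranges imply. The sketch below enumerates the same grid with `itertools.product`. The variable names `grid` and `n_combinations` are introduced here for illustration; the ordering matches the nested loops (max length outermost, number of beams innermost).

```python
from itertools import product

# Same ranges as in the search cell
length_penalties = [0.8, 1.0, 1.2, 1.5, 1.8, 2.0]
num_beams_list = [2, 4, 6, 8]
max_lengths = [150, 200, 250]
min_lengths = [10, 20, 30]

# One tuple per (max_length, min_length, length_penalty, num_beams) combination
grid = list(product(max_lengths, min_lengths, length_penalties, num_beams_list))
n_combinations = len(grid)  # 3 * 3 * 6 * 4

print(n_combinations)
print(grid[0], grid[-1])
```

Each input text therefore triggers 216 beam-search generations, so a smaller grid may be worth considering when running on CPU.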


### **PEGASUS - google/pegasus-large**

In [33]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu
import nltk

nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

In [34]:
# Initialize Model and Tokenizer
tokenizer = AutoTokenizer.from_pretrained('google/pegasus-large')
model = AutoModelForSeq2SeqLM.from_pretrained('google/pegasus-large')


Some weights of PegasusForConditionalGeneration were not initialized from the model checkpoint at google/pegasus-large and are newly initialized: ['model.decoder.embed_positions.weight', 'model.encoder.embed_positions.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



**PEGASUS Abstractive Summarization Logic**

In [35]:
# PEGASUS Abstractive Summarization Logic
def abstractive_summarization_pegasus(text, max_length=300, min_length=100):
    # No task prefix needed for PEGASUS
    inputs = tokenizer.encode(text, return_tensors='pt', max_length=1024, truncation=True)

    summary_ids = model.generate(
        inputs,
        max_length=max_length,
        min_length=min_length,
        length_penalty=2.0,
        num_beams=8,
        early_stopping=True,
        no_repeat_ngram_size=3
    )

    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary

# ROUGE and BLEU Evaluation (same as before, no smoothing)
from nltk.translate.bleu_score import sentence_bleu
from rouge_score import rouge_scorer

def evaluate_summary(reference, hypothesis):
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
    scores = scorer.score(reference, hypothesis)
    reference_tokens = reference.split()
    hypothesis_tokens = hypothesis.split()
    bleu_score = sentence_bleu([reference_tokens], hypothesis_tokens)  # No smoothing

    return {
        "ROUGE-1": round(scores['rouge1'].fmeasure, 4),
        "ROUGE-2": round(scores['rouge2'].fmeasure, 4),
        "ROUGE-L": round(scores['rougeL'].fmeasure, 4),
        "BLEU": round(bleu_score, 4)
    }
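
The "no smoothing" choice matters for short or heavily rephrased summaries. BLEU combines 1-gram to 4-gram precisions through a geometric mean, so a single n-gram order with no matches drives the score to (near) zero. The sketch below is a simplified illustration rather than NLTK's implementation; the helper `ngram_precision` and the two token lists are hypothetical.

```python
from collections import Counter

def ngram_precision(reference_tokens, hypothesis_tokens, n):
    """Clipped n-gram precision of the hypothesis against a single reference."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    hyp = ngrams(hypothesis_tokens)
    ref = ngrams(reference_tokens)
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    # Count each hypothesis n-gram at most as often as it occurs in the reference
    matched = sum((hyp & ref).values())
    return matched / total

# Hypothetical example tokens in the style of the cleaned notes
reference = "pt is restless on soft wrist restraints".split()
hypothesis = "pt restless on soft restraints".split()

precisions = [ngram_precision(reference, hypothesis, n) for n in range(1, 5)]
print(precisions)
```

In this example the unigram precision is perfect but no 4-gram matches, so an unsmoothed BLEU would collapse even though the summary is reasonable. If that behavior is undesirable, NLTK's `SmoothingFunction` (in `nltk.translate.bleu_score`) can be passed to `sentence_bleu` through its `smoothing_function` argument; whether to do so is a judgment call for this evaluation.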


In [36]:
# Step 1: Initialize lists
summaries = []
metrics = []

# Step 2: Run PEGASUS summarization and evaluation on each note
for note in merged_notes_df['DaywiseNotes_Cleaned']:
    summary = abstractive_summarization_pegasus(note)  # Use PEGASUS summarizer
    summaries.append(summary)
    metric = evaluate_summary(note, summary)
    metrics.append(metric)

# Step 3: Create the new DataFrame df_pegasus
df_pegasus = pd.DataFrame({
    'ServiceDate': merged_notes_df['ServiceDate'].values,
    'DaywiseNotes_Cleaned': merged_notes_df['DaywiseNotes_Cleaned'].values,
    'Summary_Pegasus': summaries,
    'Metrics_Pegasus': metrics
})

# Step 4: Save to CSV
from datetime import date
today = date.today().strftime("%Y-%m-%d")
df_pegasus.to_csv(f'Pegasus_Summary_{today}.csv', index=False)
print(f'Saved as Pegasus_Summary_{today}.csv')


Passing a tuple of `past_key_values` is deprecated and will be removed in Transformers v4.58.0. You should pass an instance of `EncoderDecoderCache` instead, e.g. `past_key_values=EncoderDecoderCache.from_legacy_cache(past_key_values)`.


Saved as Pegasus_Summary_2025-06-11.csv


**PEGASUS - Hyperparameter test to select the best combination based on evaluation metrics**

In [None]:
# Abstractive Summarization Logic for PEGASUS with Dynamic Parameters
def abstractive_summarization_pegasus(text, max_length, min_length, length_penalty, num_beams):
    # PEGASUS does NOT require a task prefix like T5
    inputs = tokenizer.encode(text, return_tensors='pt', max_length=1024, truncation=True)

    summary_ids = model.generate(
        inputs,
        max_length=max_length,
        min_length=min_length,
        length_penalty=length_penalty,
        num_beams=num_beams,
        early_stopping=True,
        no_repeat_ngram_size=3
    )

    summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    return summary


# ROUGE and BLEU Evaluation (No smoothing)
from nltk.translate.bleu_score import sentence_bleu
from rouge_score import rouge_scorer

def evaluate_summary(reference, hypothesis):
    scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
    scores = scorer.score(reference, hypothesis)
    reference_tokens = reference.split()
    hypothesis_tokens = hypothesis.split()

    bleu_score = sentence_bleu([reference_tokens], hypothesis_tokens)  # No smoothing

    return {
        "ROUGE-1": round(scores['rouge1'].fmeasure, 4),
        "ROUGE-2": round(scores['rouge2'].fmeasure, 4),
        "ROUGE-L": round(scores['rougeL'].fmeasure, 4),
        "BLEU": round(bleu_score, 4)
    }


# Sample Data for Testing
texts = ['''
On 3/3/2025 at 12:31 pt on ems gurney by front nurse station . at 13:12 pt used the bathroom . pt is x2 assist unable to provide ua specimen will try later again . at 13:15 hand off tool confirmed with receiving unit oncoming rn chiquita .  at 13:21 ct tech taking pt for scan pending results .
at 13:42 pt provided food tray . vss .  at 14:38 pt bp elevated rn notified md tummala orders initiated . at 14:53 labatolol held bp 13999 mmhg . at 15:38 pt keeps getting out of bed multiple times . vitals bp improving . rn notified md tummala in regards pt being restless and agitated will initiate medication for comfort .
at 16:10 pt still trying to get out of bed . ativan no effect . will initiated restraints if pt does not cooperate .  at 17:05 np opkan at bedside assessing the pt . at 17:20 soft wrists and ankle restraints initiated pt is too combative towards staff and not following nurses orders . pt constantly pulling out lines and getting out of bed . neuro pt is not confuse aox4 . perrla no issues . cnl noted .
at 17:47 soft wrists and ankle restraints initiated pt constantly pulling out lines and getting out of bed . neuro aox2 only to self and place . perrla no issues . cnl noted .  at 17:48 md tummala at bedside to assess pt . pt keeps yelling orders initiated .
at 18:33 pt is restless on soft wrist restraints . pulses are all palpable no issues . vss . pt is confuse . at 19:30 initial contact pt presents to ed with fall injury , contusion to l side of the face , swollen , and skin tight . pt reports having no pain . pt presents with 1 edema to the l hand . pt denies any ss of cp , sob , nvd ,
dizziness or blurred vision . no ss of respiratory distress . 20g iv placed in the l fa . pt aox2 pmh tbi , htn , dm , schizophrenia , seziure pt currently placed in bed connected to cardiac , bp , and pulse ox monitoring . comfort and safety measures implemented .  at 19:30 pt is restless on soft wrist restraints . pulses are all palpable no issues . vss . pt is confuse .
at 19:30 offered pt urinal at this time . pt refused stating he wants to go to snow beach . attempted to reorient pt back to hospital setting . pt ao x 2 at 20:43 bed assignment bed 285b nurse ivy sitter room at 20:50 tele req and sent
at 21:00 unable to give report . nurse currently unavailable will call back . at 19:30 pt skin intact for restraints  at 21:14 unable to give report . nurse currently unavailable will call back . at 21:19 nurse unavailable , nurse ivy stated she will call back to our extension
at 20:30 pt is restless on soft wrist restraints . pulses are all palpable no issues . vss . pt is confuse . at 20:30 skin assessment skin intact for restraints . at 20:30 offered pt urinal . denied at this time
at 21:27 report given to nurse , ivy . all questions answered . at 21:27 pt being transferred to floor with restraints . skin intact with adequate room on cuffs at 21:40 pt . placed in soft restraints admitted from er . patient is calm and very sleepy . received patient with 4 point restraint applied . per report patient was very agitated and combative in er pulling at lines or tubes
and attempting to ambulate , risk of falling and is non compliant with safety instruction . dc 4 point restraint and applied soft restraint to both wrist as patient is calm and very sleepy at this time . medical non violent restraint order obtained from np okpan . will continue to monitor if patient needs continuous restraint . not able to contact patients family at this time . restraints placed , see restraint assessment for further charting .
at 21:00 patient bathelinen change patient is incontinent in urine . keep patient clean and dry .skin integrity assessed for any changes . linens changed . patient repositioned for comfort . fall precaution and seizure precaution initiated . will continue to monitor . addendum 030325
at 2341 by alas , rn 030325 at 2200 patient bathelinen change patient is incontinent in urine . keep patient clean and dry .skin integrity assessed for any changes . linens changed . patient repositioned for comfort . fall precaution and seizure precaution initiated . will continue to monitor .
at 23:40 patient is calm and sleeping . no sign of distresssob noted . sitter at bed side . bed alarm on for safety . will continue to monitor patient .
''']

# Hyperparameter Ranges
length_penalties = [0.8, 1.0, 1.2, 1.5, 1.8, 2.0]
num_beams_list = [2, 4, 6, 8]
max_lengths = [150, 200, 250]
min_lengths = [10, 20, 30]

# Output File Setup
from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%d_%H%M")
# The with-block closes the file even if generation fails part-way through the search
with open(f"Summary_Pegasus_{timestamp}.txt", "w", encoding="utf-8") as output_file:
    # Hyperparameter Search Loop
    for text in texts:
        for max_len in max_lengths:
            for min_len in min_lengths:
                for length_penalty in length_penalties:
                    for num_beams in num_beams_list:
                        output_file.write(f"\n=== Parameters: max_length={max_len}, min_length={min_len}, length_penalty={length_penalty}, num_beams={num_beams} ===\n")
                        summary = abstractive_summarization_pegasus(
                            text,
                            max_length=max_len,
                            min_length=min_len,
                            length_penalty=length_penalty,
                            num_beams=num_beams
                        )
                        output_file.write(f"Abstractive Summary:\n{summary}\n")
                        scores = evaluate_summary(text, summary)
                        output_file.write(f"Evaluation Metrics:\n{scores}\n")
                        output_file.write("-" * 100 + "\n")
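
The search writes every summary and its scores to a text file, which leaves the comparison to manual inspection. One possible way to pick a winner programmatically is to collect one record per combination and take the maximum of a chosen metric. The sketch below assumes such records exist; the helper `pick_best`, the choice of ROUGE-L as the primary metric, and all scores shown are placeholders for illustration, not results from this notebook.

```python
from typing import Dict, List

def pick_best(results: List[Dict], primary: str = "ROUGE-L",
              tie_breaker: str = "ROUGE-1") -> Dict:
    """Return the record with the highest primary metric, ties broken by a second metric."""
    if not results:
        raise ValueError("results is empty")
    return max(results, key=lambda r: (r[primary], r[tie_breaker]))

# Placeholder records with made-up scores, for illustration only.
# In the notebook, each record would merge the loop parameters with the
# dict returned by evaluate_summary.
results = [
    {"max_length": 150, "min_length": 10, "length_penalty": 0.8, "num_beams": 2,
     "ROUGE-1": 0.30, "ROUGE-L": 0.20},
    {"max_length": 200, "min_length": 20, "length_penalty": 1.2, "num_beams": 4,
     "ROUGE-1": 0.35, "ROUGE-L": 0.25},
    {"max_length": 250, "min_length": 30, "length_penalty": 2.0, "num_beams": 8,
     "ROUGE-1": 0.33, "ROUGE-L": 0.25},
]

best = pick_best(results)
print(best["max_length"], best["min_length"], best["length_penalty"], best["num_beams"])
```

Since the scores here compare each summary against its own source note, the top-ranked combination is best read as a starting point for manual review rather than a final choice.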
