In [8]:
!pip install -q -U bitsandbytes

In [7]:
!pip install -q nltk rouge-score rank_bm25

##LOAD THE MODELS

In [1]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

# Model name and Hugging Face token
model_name = "google/gemma-2-2b-it"
hf_token = "your_hf_token"

# 8-bit quantization configuration
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)

# Load model with 8-bit quantization
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    token=hf_token
)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

List the primary and secondary medications of insomnia detection and treatment.

**Primary Medications:**

* **Benzodiazepines:**
    * **Mechanism:** Enhance the effects of GABA, a neurotransmitter that inhibits neuronal activity.
    * **Examples:** Diazepam (Valium), Lorazepam (Ativan), Alprazolam (Xanax)
* **Nonbenzodiazepine Hypnotics:**
    * **Mechanism:**  Varying mechanisms, often targeting GABA receptors or serotonin receptors.
    * **Examples:** Zolpidem (Ambien), Zaleplon (Sonata), Eszopiclone (Lunesta)
* **Other Medications:**
    * **Melatonin:** A hormone that regulates sleep-wake cycles.
    * **Antidepressants:** Certain antidepressants, particularly selective serotonin reuptake inhibitors (SSRIs), can improve sleep.

**Secondary Medications:**

* **Antihistamines:** Can help with sleep by reducing allergy symptoms.
* **Anti-anxiety Medications:** May be used to manage anxiety, which can contribute to insomnia.
* **Sleep Aids:**  Over-the-counter (OTC) sleep aids cont

##MEDICATIONS NAMES


In [None]:
# Prepare input
prompt = "List the primary and secondary medications of insomnia detection"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate output
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=300,
        do_sample=False,
    )

# Decode and print result
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

##DEFINITION 1 - DIFFCUILTY SLEEPING

In [10]:
# Prepare input
prompt = "Give some Phrases related to Difficulty Sleeping"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate output
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=300,
        do_sample=False,
    )

# Decode and print result
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Give some Phrases related to Difficulty Sleeping

Difficulty sleeping is a common problem, and there are many phrases that can be used to describe it. Here are some examples:

**General Difficulty:**

* **Having trouble falling asleep.**
* **Struggling to get a good night's rest.**
* **Unable to sleep soundly.**
* **Waking up in the middle of the night.**
* **Feeling restless at night.**
* **Experiencing insomnia.**

**Specific Causes:**

* **Anxiety keeping me awake.**
* **Stress is making it hard to sleep.**
* **My mind is racing at night.**
* **I'm tossing and turning all night.**
* **My body is tense and I can't relax.**
* **I'm waking up too early.**

**Seeking Help:**

* **I'm looking for ways to improve my sleep.**
* **I need help with my sleep.**
* **I'm concerned about my sleep quality.**
* **I'm wondering if I have a sleep disorder.**

**Other:**

* **My sleep is a nightmare.**
* **I'm exhausted but can't sleep.**
* **I'm counting sheep, but they're not helping.**
* **I'm so 

##DEFINITION 2 - DAYTIME IMPAIRMENT



In [11]:
# Prepare input
prompt = "Give some Phrases related to Daytime Impairment"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate output
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=300,
        do_sample=False,
    )

# Decode and print result
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Give some Phrases related to Daytime Impairment

Daytime impairment refers to a condition where an individual experiences difficulties with their cognitive function, attention, and/or executive functioning during the day. 

Here are some phrases related to daytime impairment:

**General:**

* **Cognitive fatigue:** Feeling mentally drained and unable to focus.
* **Brain fog:** A feeling of mental haziness and difficulty concentrating.
* **Difficulty concentrating:** Struggling to focus on tasks or conversations.
* **Mental sluggishness:** Feeling slow and unmotivated.
* **Impaired executive functioning:** Difficulty with planning, organizing, and decision-making.
* **Daytime sleepiness:** Feeling excessively tired during the day.
* **Low energy levels:** Feeling drained and lacking motivation.

**Specific:**

* **Attention deficit disorder (ADD) or attention-deficit/hyperactivity disorder (ADHD):**  These conditions can cause difficulty focusing, impulsivity, and hyperactivity.
* **Dep

##BENCHMARK METRICS

In [9]:
import nltk
import re
from rouge_score import rouge_scorer
from nltk.corpus import stopwords
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
from rank_bm25 import BM25Okapi

# Download necessary NLTK resources
nltk.download('punkt_tab')
nltk.download('stopwords')

# Load English stop words
stop_words = set(stopwords.words("english"))

def clean_text(text):
    """ Remove stop words, special characters, and extra spaces from text """
    # Tokenize text
    tokens = nltk.word_tokenize(text.lower())

    # Remove special characters (non-alphanumeric tokens)
    tokens = [re.sub(r"[^a-z0-9]", "", token) for token in tokens]

    # Remove stop words and empty tokens
    tokens = [token for token in tokens if token and token not in stop_words]

    return tokens

# Compute BLEU score
def compute_bleu(reference, generated):
    reference_tokens = clean_text(reference)
    generated_tokens = clean_text(generated)

    smoothing = SmoothingFunction().method4
    bleu_score = sentence_bleu([reference_tokens], generated_tokens, smoothing_function=smoothing)
    return bleu_score

# Compute ROUGE scores (ROUGE-1, ROUGE-2, ROUGE-L)
def compute_rouge(reference, generated):
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
    scores = scorer.score(" ".join(reference), " ".join(generated))  # Convert tokens back to string
    return scores

# Compute BM25 Score for retrieval relevance
def compute_bm25(reference, generated):
    reference_tokens = clean_text(reference)  # Clean and tokenize reference
    generated_tokens = clean_text(generated)  # Clean and tokenize generated text

    bm25 = BM25Okapi([reference_tokens])  # Train BM25 on reference text
    scores = bm25.get_scores(generated_tokens)  # Pass a list of words (not a list of lists)

    return float(scores[0])  # Convert to float for printing


# Compute CHE Score (Clinical Hallucination Error)
def compute_che_score(reference, generated):
    reference_set = set(clean_text(reference))
    generated_set = set(clean_text(generated))

    hallucinated_words = generated_set - reference_set  # Words in generated text that are not in reference
    che_score = len(hallucinated_words) / len(generated_set) if len(generated_set) > 0 else 0  # Normalize by total words
    return che_score

# Example original text and generated texts from the models
original_text = """
The patient is considered to have difficulty sleeping if they report any of the following:
- Trouble initiating sleep.
- Trouble maintaining sleep.
- Waking up earlier than desired.
- An explicit mention of insomnia.
"""
generated_texts = {
    "Phi-4-mini": """List all the primary and secondary medications used for treating insomnia without explanations—just the names. (A) Zolpidem (B) Melatonin (C) Eszopiclone (D) Diphenhydramine (E) Ramelteon (F) Trazodone (G) Lorazepam (H) Valerian Root (I) Caffeine (J) Modafinil (K) Hydroxyzine (L) Phenazopyridine (M) Gabapentin (N) Melatonin (O) Clonazepam (P) Lorazepam (Q) Diphenhydramine (R) Eszopiclone (S) Ramelteon (T) Trazodone (U) Zolpidem (V) Modafinil (W) Hydroxyzine (X) Lorazepam (Y) Diphenhydramine (Z) Eszopiclone (AA) Ramelteon (BB) Trazodone (CC) Zolpidem (DD) Modafinil (EE""",
    "Gemma-2B": """List all the primary and secondary medications used for treating insomnia without explanations—just the names.

**Primary Medications:**

* Zolpidem (Ambien)
* Eszopiclone (Lunesta)
* Zaleplon (Sonata)
* Temazepam (Restoril)
* Lorazepam (Ativan)
* Oxazepam (Serax)
* Flurazepam (Dalmane)
* Triazolam (Halcion)
* Buspirone (Buspar)

**Secondary Medications:**

* Melatonin
* Doxylamine succinate (Unisom)
* Diphenhydramine (Benadryl)
* Valerian root
* Magnesium citrate
* Calcium-magnesium ratio
*  Cognitive behavioral therapy (CBT)
*  Sleep restriction therapy (SRT)
*  Stimulus control therapy (SCT)
*  Relaxation techniques
*  Exercise
*  Avoidance of caffeine and alcohol
*  Light therapy
*  Weighted blankets


Please note: This is not an exhaustive""",
    "Gemma-3-4b-it": """List all the primary and secondary medications used for treating insomnia without explanations—just the names.

**Primary Medications:**

* Zolpidem (Ambien)
* Eszopiclone (Lunesta)
* Zaleplon (Sonata)
* Temazepam (Restoril)
* Lorazepam (Ativan)
* Oxazepam (Serax)
* Flurazepam (Dalmane)
* Triazolam (Halcion)
* Buspirone (Buspar)

**Secondary Medications:**

* Melatonin
* Doxylamine succinate (Unisom)
* Diphenhydramine (Benadryl)
* Valerian root
* Magnesium citrate
* Calcium-magnesium ratio
*  Cognitive behavioral therapy (CBT)
*  Sleep restriction therapy (SRT)
*  Stimulus control therapy (SCT)
*  Relaxation techniques
*  Exercise
*  Avoidance of caffeine and alcohol
*  Light therapy
*  Weighted blankets""",
    "Phi-3-Mini": """List all the primary and secondary medications used for treating insomnia without explanations—just the names.

Document:

Insomnia Treatment Options

Insomnia is a common sleep disorder that can significantly impact your quality of life. It's characterized by difficulty falling asleep, staying asleep, or waking up too early and not being able to fall back asleep. Treatment for insomnia often involves a combination of lifestyle changes, cognitive-behavioral therapy, and medication.

Primary Medications:

- Zolpidem (Ambien)
- Eszopiclone (Lunesta)
- Ramelteon (Rozerem)
- Suvorexant (Belsomra)
- Doxepin (Silenor)

Secondary Medications:

- Trazodone (Desyrel)
- Melatonin Receptor Agonists (e.g., Agomelatine"""
}

# Iterate over generated texts and compute metrics
for model_name, generated_text in generated_texts.items():
    # Compute BLEU score
    bleu = compute_bleu(original_text, generated_text)

    # Compute ROUGE scores
    rouge_scores = compute_rouge(original_text, generated_text)

    # Compute BM25 Score
    bm25 = compute_bm25(original_text, generated_text)

    # Compute CHE Score
    che_score = compute_che_score(original_text, generated_text)

    print(f"Metrics for {model_name}:")
    print(f"BLEU Score: {bleu:.4f}")
    print(f"ROUGE-1: {rouge_scores['rouge1'].fmeasure:.4f}")
    print(f"ROUGE-2: {rouge_scores['rouge2'].fmeasure:.4f}")
    print(f"ROUGE-L: {rouge_scores['rougeL'].fmeasure:.4f}")
    print(f"BM25 Score (Retrieval Relevance): {bm25:.4f}")
    print(f"CHE Score (Hallucination Rate): {che_score:.4f}")
    print("="*40)

[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


Metrics for Phi-4-mini:
BLEU Score: 0.0049
ROUGE-1: 0.5625
ROUGE-2: 0.3531
ROUGE-L: 0.3553
BM25 Score (Retrieval Relevance): -0.2747
CHE Score (Hallucination Rate): 0.9792
Metrics for Gemma-2B:
BLEU Score: 0.0055
ROUGE-1: 0.4611
ROUGE-2: 0.3333
ROUGE-L: 0.3083
BM25 Score (Retrieval Relevance): -0.6670
CHE Score (Hallucination Rate): 0.9677
Metrics for Gemma-3-4b-it:
BLEU Score: 0.0057
ROUGE-1: 0.4811
ROUGE-2: 0.3450
ROUGE-L: 0.3161
BM25 Score (Retrieval Relevance): -0.6670
CHE Score (Hallucination Rate): 0.9661
Metrics for Phi-3-Mini:
BLEU Score: 0.0069
ROUGE-1: 0.4619
ROUGE-2: 0.4043
ROUGE-L: 0.3391
BM25 Score (Retrieval Relevance): -2.0403
CHE Score (Hallucination Rate): 0.9273
