# Zero-Shot Evaluation of Llama-3.2-3B-Instruct

This notebook runs a zero-shot evaluation of the unmodified Llama-3.2-3B-Instruct model on hand-written radiology prompts. The quality of its outputs will inform whether the model needs fine-tuning.

Here is the link to the model on Hugging Face: https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct

## Step 0: Mounting Google Drive and Importing Libraries


In [None]:
from google.colab import drive
drive.mount('/content/drive')
%cd /content/drive/MyDrive/multimodal-xray-agent
!ls

In [2]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from huggingface_hub import login

In [None]:
login()

## Step 1: Verifying GPU and Environment

In [4]:
if torch.cuda.is_available():
    device_name = torch.cuda.get_device_name(0)
    device = torch.device("cuda")
    print(f"GPU detected: {device_name}")
else:
    device = torch.device("cpu")
    print("GPU not detected. Falling back to CPU.")

print(f"Running on device: {device}")

GPU detected: Tesla T4
Running on device: cuda


## Step 2: Loading the Model and Tokenizer

In [5]:
model_id = "meta-llama/Llama-3.2-3B-Instruct"

In [None]:
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left")

In [23]:
tokenizer.pad_token = tokenizer.eos_token

In [None]:
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # note: the T4 has no native bfloat16 support; torch.float16 may be faster here
    device_map="auto"
)

In [None]:
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer
)
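
Llama-3.2-3B-Instruct is chat-tuned, and its tokenizer ships a chat template. The cells in Step 3 use a plain `### Question` / `### Context` layout instead. As a possible alternative, the same fields could be packed into chat messages. The sketch below is only an assumption about how that might look: `build_messages` is a hypothetical helper, and the commented-out call assumes a recent `transformers` version in which the text-generation pipeline accepts a list of messages.

```python
def build_messages(question, context, instruction):
    """Pack the prompt fields into chat-style messages (hypothetical helper)."""
    system = instruction.strip()
    user = f"Context:\n{context.strip()}\n\nQuestion:\n{question.strip()}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_messages(
    question="Summarize the key thoracic findings.",
    context="Hyperexpanded lungs, suggesting chronic obstructive pulmonary disease. No acute pulmonary process.",
    instruction="Answer as a radiology report summary. Do not go beyond the context.",
)
for m in messages:
    print(m["role"], "->", m["content"][:60])

# With the `pipe` object created above, the call might look like this
# (assumption, not run here):
# outputs = pipe(messages, max_new_tokens=150)
# reply = outputs[0]["generated_text"][-1]["content"]
```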

## Step 3: Testing Model Responses with Different Prompts

In [28]:
# Define your custom prompt
prompt = """### Question:
Summarize the key thoracic findings.

### Context:
Hyperexpanded lungs, suggesting chronic obstructive pulmonary disease. No acute pulmonary process.

### Instruction:
Use the context above to directly and professionally answer the user’s question as a radiology report summary. Do not go beyond what is provided in the context.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=150,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

The key thoracic finding on this patient's imaging is hyperexpansion of the lungs, indicative of chronic obstructive pulmonary disease (COPD). There is no evidence of an acute pulmonary process.
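
Every cell in this step repeats the same prompt layout and the same `split("### Answer:")` post-processing. A minimal sketch of how that could be factored out follows; `build_prompt` and `extract_answer` are hypothetical helper names, not part of the notebook. Slicing off the echoed prompt is used first because the marker split can misfire if the model writes `### Answer:` again in its own output.

```python
ANSWER_MARKER = "### Answer:"

def build_prompt(question, context, instruction):
    """Assemble the Question / Context / Instruction / Answer layout."""
    return (
        f"### Question:\n{question}\n\n"
        f"### Context:\n{context}\n\n"
        f"### Instruction:\n{instruction}\n\n"
        f"{ANSWER_MARKER}"
    )

def extract_answer(generated_text, prompt):
    """Return only the continuation after the prompt.

    Falls back to splitting on the answer marker when the prompt
    is not echoed at the start of the generated text.
    """
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    return generated_text.split(ANSWER_MARKER)[-1].strip()

demo_prompt = build_prompt(
    question="Summarize the key thoracic findings.",
    context="Hyperexpanded lungs, suggesting chronic obstructive pulmonary disease.",
    instruction="Answer as a radiology report summary.",
)
# Simulated pipeline output: the prompt followed by a continuation.
fake_output = demo_prompt + " Hyperexpanded lungs consistent with COPD."
print(extract_answer(fake_output, demo_prompt))
```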


In [27]:
# Define your custom prompt
prompt = """### Question:
Summarize the key thoracic findings.

### Context:
Hyperexpanded lungs, suggesting chronic obstructive pulmonary disease. No acute pulmonary process.

### Instruction:
Use the context above to directly and professionally answer the user’s question as a radiology report summary. Do not go beyond what is provided in the context. Also, provide an explanation of the key diagnostic terms used.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=150,
    temperature=0.3,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

**Summary of Thoracic Findings:**

The chest radiograph reveals hyperexpanded lungs with no evidence of acute pulmonary processes. This suggests a chronic condition affecting lung expansion, consistent with chronic obstructive pulmonary disease (COPD).

**Key Diagnostic Terms:**

*   **Hyperexpanded lungs**: A condition where the lungs are over-inflated, resulting in increased lung volumes.
*   **Chronic obstructive pulmonary disease (COPD)**: A progressive lung disease characterized by airflow limitation, often caused by smoking or environmental factors.

These findings indicate that the patient's respiratory system has been affected by a long-standing condition, leading to persistent changes in lung structure and function.


In [26]:
# Define your custom prompt
prompt = """### Question:
Summarize the key thoracic findings.

### Context:
Hyperexpanded lungs, suggesting chronic obstructive pulmonary disease. No acute pulmonary process.

### Instruction:
Use the context above to directly and professionally answer the user’s question as a radiology report summary. Do not go beyond what is provided in the context. Also, provide a terse explanation of the key diagnostic terms used.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=150,
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

The key thoracic findings are hyperexpanded lungs with no evidence of an acute pulmonary process. Hyperexpansion refers to increased lung volume that can be seen on imaging due to air trapping or other pathologies, such as COPD (Chronic Obstructive Pulmonary Disease). 

In this case, it suggests chronic obstruction leading to inadequate expiration and retention of air in the lungs, causing them to expand more than normal. This finding supports the diagnosis of COPD without any evidence of acute infection or inflammation.


In [25]:
# Define your custom prompt
prompt = """### Question:
Summarize the key thoracic findings.

### Context:
Hyperexpanded lungs, suggesting chronic obstructive pulmonary disease. No acute pulmonary process.

### Instruction:
Use the context above to directly and professionally answer the user’s question as a radiology report summary. Do not go beyond what is provided in the context. Do not add inferred pathophysiology. Only use diagnostic terms explicitly present in the context. Provide a terse definition of the terms if they appear in the context.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=150,
    temperature=0.3,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

Key thoracic findings include hyperexpanded lungs, indicating chronic obstructive pulmonary disease (COPD). No acute pulmonary processes are evident.
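
The four cells above change the instruction and the temperature at the same time (0.7, 0.3, 0.7, 0.3), which makes it hard to attribute a difference in output to either one. One way to separate the two, sketched under assumptions, is to hold the sampling settings fixed and vary only the instruction. `compare_instructions` and `generate_fn` are hypothetical; `generate_fn` stands in for a wrapper around `pipe`, and a stub is used here so the sketch runs without the model.

```python
def compare_instructions(generate_fn, question, context, instructions, **gen_kwargs):
    """Run one prompt per instruction with identical generation settings."""
    results = {}
    for name, instruction in instructions.items():
        prompt = (
            f"### Question:\n{question}\n\n"
            f"### Context:\n{context}\n\n"
            f"### Instruction:\n{instruction}\n\n"
            "### Answer:"
        )
        results[name] = generate_fn(prompt, **gen_kwargs)
    return results

def stub_generate(prompt, **gen_kwargs):
    # Stand-in for the real model call: reports prompt length and settings.
    return f"{len(prompt)} chars, temperature={gen_kwargs.get('temperature')}"

results = compare_instructions(
    stub_generate,
    question="Summarize the key thoracic findings.",
    context="Hyperexpanded lungs, suggesting chronic obstructive pulmonary disease.",
    instructions={
        "plain": "Answer as a radiology report summary.",
        "with_definitions": "Answer as a radiology report summary. Define key terms tersely.",
    },
    temperature=0.3,
)
for name, text in results.items():
    print(name, "->", text)
```

With the real model, `generate_fn` could be a small function that calls `pipe(prompt, **gen_kwargs)` and returns `outputs[0]["generated_text"]`.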


In [29]:
# Define your custom prompt
prompt = """### Question:
What is the radiology impression?

### Context:
There is a right lower lobe opacity with air bronchograms. Mild cardiomegaly is present. No pleural effusion or pneumothorax.

### Instruction:
Use the context above to directly and professionally answer the user’s question as a radiology report summary. Do not go beyond what is provided in the context. Do not add inferred pathophysiology. Only use diagnostic terms explicitly present in the context. Provide a terse definition of the terms if they appear in the context.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=150,
    temperature=0.3,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

The radiology impression is a right lower lobe opacity with air bronchograms, mild cardiomegaly, no pleural effusion, and no pneumothorax. 

**Definitions:**

- **Right lower lobe opacity**: A radiographic finding indicating abnormality within the lung tissue.
- **Air bronchograms**: A sign seen on chest X-rays where air-filled bronchi are visible through opacified alveoli.
- **Mild cardiomegaly**: Enlargement of the heart that is less than 16mm in diameter at its greatest dimension.
- **Pleural effusion**: The accumulation of fluid between the layers of the pleura outside the lungs.
- **Pneumothorax


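Comment: The model defines mild cardiomegaly as a heart "less than 16mm in diameter", which is anatomically impossible: a 16 mm heart would be smaller than a toddler’s fist. The threshold looks hallucinated or misremembered. The response also stops mid-definition at "Pneumothorax", most likely because it reached the `max_new_tokens=150` limit.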
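
A cheap automatic check can catch this kind of numeric error: any number in the answer that never appears in the context is suspect. The sketch below is a heuristic under that assumption, not a full hallucination detector, and `ungrounded_numbers` is a hypothetical helper.

```python
import re

# Matches integers and simple decimals such as "16" or "2.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")

def ungrounded_numbers(answer, context):
    """Return numbers that occur in the answer but not in the context."""
    context_numbers = set(NUMBER.findall(context))
    return [n for n in NUMBER.findall(answer) if n not in context_numbers]

context = (
    "There is a right lower lobe opacity with air bronchograms. "
    "Mild cardiomegaly is present. No pleural effusion or pneumothorax."
)
answer = (
    "Mild cardiomegaly: Enlargement of the heart that is less than 16mm "
    "in diameter at its greatest dimension."
)
print(ungrounded_numbers(answer, context))  # ['16']
```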

In [30]:
# Define your custom prompt
prompt = """### Question:
What are the thoracic findings and their significance?

### Context:
There is a faint left perihilar haziness that may represent early infiltrate or artifact. Cardiac silhouette and mediastinum are within normal limits. No pleural effusion or pneumothorax is seen.

### Instruction:
Using only the information provided, summarize the thoracic findings in a concise radiology report style. Do not speculate beyond the context. If any finding is uncertain or equivocal, clearly indicate that. Provide terse definitions of any diagnostic terms used.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=150,
    temperature=0.3,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

**Thoracic Findings:** Faint left perihilar haziness.
**Significance:** Uncertain, possibly early infiltrate or artifact; no other abnormalities present. **Definitions:** Perihilar refers to an area surrounding the hilum (the region where the bronchi, blood vessels, and nerves enter the lung). Infiltrate refers to abnormal tissue growth within the lung parenchyma. Artifact refers to an image distortion caused by external factors such as patient movement during imaging.


In [31]:
# Define your custom prompt
prompt = """### Question:
Summarize the key thoracic findings.

### Context:
Chest radiograph shows bibasilar reticulonodular opacities, more pronounced on the right. Mild blunting of the left costophrenic angle suggests a small effusion. The cardiac silhouette is within normal limits. No pneumothorax or acute consolidation is noted. Trachea is midline. Bony thorax appears intact with no obvious rib fractures or lytic lesions.

### Instruction:
Use the context above to directly and professionally answer the user’s question as a radiology report summary. Do not go beyond what is provided in the context. Do not infer pathophysiology. Provide terse, accurate definitions of any diagnostic terms that appear in the context. Output must include the structured headings: **Thoracic Findings** and **Definitions**.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=256,
    temperature=0.3,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

#### Thoracic Findings
- Bibasilar reticulonodular opacities
- Mild blunting of the left costophrenic angle

#### Definitions
- **Bibasilar**: Refers to the lower parts of both lungs (the bases).
- **Reticulonodular**: A pattern of lung tissue appearance characterized by a mixture of reticular (network-like) and nodular (small masses) densities.
- **Costophrenic angle**: An angle formed between the ribs and the diaphragm.
- **Effusion**: Accumulation of fluid in a body cavity, often seen as blunting of the costophrenic angle on a chest X-ray.
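
The instruction asked for the headings `**Thoracic Findings**` and `**Definitions**`, and the model produced them as `####` headings instead of bold text. When checking format compliance across many outputs, it may help to test for the heading text while ignoring the Markdown decoration. A minimal sketch, with `missing_headings` as a hypothetical helper:

```python
import re

def missing_headings(answer, required):
    """Return required headings with no matching line.

    Leading '#', '*' and whitespace and trailing '*', ':' and whitespace
    are stripped from each line before comparing, case-insensitively.
    """
    normalized = set()
    for line in answer.splitlines():
        normalized.add(re.sub(r"^[#*\s]+|[*:\s]+$", "", line).lower())
    return [h for h in required if h.lower() not in normalized]

answer = """#### Thoracic Findings
- Bibasilar reticulonodular opacities

#### Definitions
- **Bibasilar**: Refers to the lower parts of both lungs (the bases)."""

print(missing_headings(answer, ["Thoracic Findings", "Definitions"]))  # []
print(missing_headings(answer, ["Impression"]))  # ['Impression']
```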


In [32]:
# Define your custom prompt
prompt = """### Question:
Summarize the key thoracic findings.

### Context:
Chest radiograph shows bibasilar reticulonodular opacities, more pronounced on the right. Mild blunting of the left costophrenic angle suggests a small effusion. The cardiac silhouette is within normal limits. No pneumothorax or acute consolidation is noted. Trachea is midline. Bony thorax appears intact with no obvious rib fractures or lytic lesions.

### Instruction:
Using only the information provided, summarize the thoracic findings in a concise radiology report style. Do not speculate beyond the context. If any finding is uncertain or equivocal, clearly indicate that. Provide terse definitions of any diagnostic terms used.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=256,
    temperature=0.3,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

**Thoracic Findings:**

* **Bibasilar Reticulonodular Opacities:** Bilateral interstitial infiltrates with a reticular pattern and nodules, typically indicating an inflammatory or infectious process.
* **Mild Blunting of Left Costophrenic Angle:** Slight elevation of the costophrenic angle, suggesting a small pleural effusion (less than 200 mL).
* **Normal Cardiac Silhouette:** The heart's size and shape appear unchanged, ruling out significant cardiomegaly or pericardial disease.
* **Midline Trachea:** The trachea's position is centered, which is normal and does not suggest any significant airway obstruction or mediastinal shift.
* **Intact Bony Thorax:** No evidence of rib fractures, lytic lesions, or other bony abnormalities affecting the thoracic cage.

Note: These findings are based solely on the provided chest radiograph description and do not account for potential clinical correlations or further diagnostic testing required to confirm these interpretations.


In [33]:
# Define your custom prompt
prompt = """### Question:
Summarize the key thoracic findings.

### Context:
Mild hyperinflation. Scattered chronic appearing irregular interstitial markings with no focal alveolar consolidation. No definite pleural effusion seen. Heart size near top normal limits, aortic ectasia/tortuosity similar to prior. Right hemidiaphragm eventration. No typical findings of pulmonary edema.

### Instruction:
Using only the information provided, summarize the thoracic findings in a concise radiology report style. Do not speculate beyond the context. If any finding is uncertain or equivocal, clearly indicate that. Provide terse definitions of any diagnostic terms used.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=256,
    temperature=0.3,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

Thoracic Findings:

* **Hyperinflation**: Overinflation of lung tissue resulting in increased lung volume.
* **Irregular interstitial markings**: Disorganized patterns of fibrosis and/or inflammation within the lung parenchyma.
* **Focal alveolar consolidation**: Localized areas of lung consolidation (e.g., pneumonia).
* **Pleural effusion**: Accumulation of fluid between the lungs and chest wall.
* **Heart size**: Normal limits; "near top" implies slightly enlarged.
* **Aortic ectasia/tortuosity**: Abnormal widening or twisting of the aorta.
* **Right hemidiaphragm eventration**: Asymmetrical elevation of the diaphragm on one side due to muscle weakness or paralysis.

Note: The absence of specific findings does not necessarily imply their nonexistence. For example, mild hyperinflation may be present without overt clinical significance. Similarly, the lack of pleural effusion does not rule out its presence entirely.


In [34]:
# Define your custom prompt
prompt = """### Question:
Summarize the key thoracic findings.

### Context:
Heart size moderately enlarged for technique, aortic ectasia/tortuosity. Right hemidiaphragm eventration. No focal alveolar consolidation, no definite pleural effusion seen. Vascular redistribution without typical findings of pulmonary edema. No pneumothorax

### Instruction:
Using only the information provided, summarize the thoracic findings in a concise radiology report style. Do not speculate beyond the context. If any finding is uncertain or equivocal, clearly indicate that. Provide terse definitions of any diagnostic terms used.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=256,
    temperature=0.3,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

Thoracic Findings:

* Heart size: Moderately enlarged
* Aorta: Ectasia/Tortuosity
* Diaphragm: Eventration (right hemidiaphragm)
* Lungs: No focal alveolar consolidation; No definite pleural effusion; Vascular redistribution without pulmonary edema; No pneumothorax

Note: Definitions:
- Eventration: An abnormal elevation of a portion of the diaphragm.
- Ectasia: Enlargement of a blood vessel due to thinning and stretching of its wall.
- Tortuosity: Winding or twisting of a blood vessel.


In [35]:
# Define your custom prompt
prompt = """### Question:
Summarize the key thoracic findings.

### Context:
The heart is top normal in size. The aorta is mildly tortuous. There is mild right basilar scarring versus atelectasis. Lungs are otherwise clear. No pneumothorax or effusion. Osseous structures are intact.

### Instruction:
Write a professional radiology report summary using only the findings in the context. Present them in fluent narrative prose. Do not speculate or add inferred pathophysiology. If uncertain or ambiguous findings are mentioned, preserve that uncertainty. After the summary, briefly define any diagnostic terms used.

### Answer:"""

# Generate output
outputs = pipe(
    prompt,
    max_new_tokens=256,
    temperature=0.3,
    top_p=0.9,
    repetition_penalty=1.1,
    do_sample=True
)

# Show result
raw_output = outputs[0]["generated_text"]
answer_only = raw_output.split("### Answer:")[-1].strip()
print(answer_only)

This chest radiograph shows a heart of normal size with minimal aortic tortuosity. Mild right basal scarring and atelectasis are present without evidence of pneumothorax or pleural effusion. Lung fields appear clear. Osseous structures remain intact.

**Definitions:**

- **Aortic Tortuosity:** An abnormal curvature of the aorta.
- **Basal Scarring:** Fibrosis (scar tissue) in the lower parts of the lungs.
- **Atelectasis:** Collapse or closure of a lung resulting in reduced or absent gas exchange.
- **Pneumothorax:** Air in the pleural space surrounding the lungs, which can collapse the lung.
- **Pleural Effusion:** Accumulation of fluid between the lungs and chest wall.
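
The instruction in the last cell asks the model to preserve uncertainty, yet the context's "scarring versus atelectasis" comes back as "scarring and atelectasis". A rough way to flag such cases is to look for hedging words that appear in the context but not in the answer. The word list and the `dropped_hedges` helper below are assumptions for illustration, and a real check would need a broader vocabulary.

```python
import re

# Illustrative hedge vocabulary (assumption, not exhaustive).
HEDGES = ("versus", "may", "possibly", "suggests", "suggesting", "equivocal", "definite")

def dropped_hedges(context, answer, hedges=HEDGES):
    """Return hedging words present in the context but absent from the answer."""
    def words(text):
        return set(re.findall(r"[a-z]+", text.lower()))
    return sorted((words(context) & set(hedges)) - words(answer))

context = (
    "The heart is top normal in size. The aorta is mildly tortuous. "
    "There is mild right basilar scarring versus atelectasis."
)
answer = (
    "This chest radiograph shows a heart of normal size with minimal aortic "
    "tortuosity. Mild right basal scarring and atelectasis are present."
)
print(dropped_hedges(context, answer))  # ['versus']
```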
