## Prompt Engineering & Task Decomposition Practice
Use an LLM to identify vaccine batches stored at inappropriate temperatures.

**Data sources**:

- Storage temperature requirements: `who_audit_summary_synthetic.md`
- Temperature data: `who_audit_synthetic_data.csv`

**Key points**: Some columns in the dataset contain misleading information. Identify the two temperature range columns that record the lowest and the highest temperature for each batch, and base the check on those columns only.
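
Once the two range columns are known, the check itself is plain arithmetic, so it can be written deterministically and used as a reference when judging the LLM's answer. The following is a minimal sketch using only the standard library. The column names (`batch_id`, `timestamp`, `temp_min_c`, `temp_max_c`), the sample rows, and the 2.0 to 8.0 limits are placeholders, not the names or limits in the actual files, and a single allowed range is assumed where the real document may give one per vaccine type.

```python
import csv
import io


def find_excursions(csv_text, low_col, high_col, allowed_low, allowed_high,
                    id_col="batch_id", time_col="timestamp"):
    """Return (batch, timestamp) pairs whose recorded range leaves the allowed range."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lowest = float(row[low_col])
        highest = float(row[high_col])
        # A reading is a violation if either end of its range is outside the limits.
        if lowest < allowed_low or highest > allowed_high:
            flagged.append((row[id_col], row[time_col]))
    return flagged


# Toy data with made-up column names and values, only to show the call.
sample = """batch_id,timestamp,temp_min_c,temp_max_c
X1,2025-01-01 08:00:00,2.5,7.9
X2,2025-01-01 12:00:00,1.4,6.0
X3,2025-01-01 16:00:00,3.0,8.6
"""
excursions = find_excursions(sample, "temp_min_c", "temp_max_c", 2.0, 8.0)
print(excursions)
```

Passing the column names as arguments keeps the sketch independent of which columns turn out to be the trustworthy ones.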

**Target**: Find the 3 batches stored inappropriately:

- B10046 (2025-02-14 8:00:00)
- B10048 (2025-02-16 16:00:00)
- B10001 (2025-01-01 8:00:00)

**Your task**: Develop a prompting strategy that gets the LLM to correctly identify these batches.
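
To compare prompting strategies consistently, it can help to score each answer automatically against the target batches. The sketch below is one possible way to do that. The helper name `score_answer` and the assumption that every batch ID is a `B` followed by five digits are illustrative, not part of the exercise materials.

```python
import re

# Target batch IDs copied from the list above.
EXPECTED_BATCHES = {"B10046", "B10048", "B10001"}


def score_answer(answer_text, expected=EXPECTED_BATCHES):
    """Compare batch IDs mentioned in an LLM answer against the expected set."""
    # Assumes batch IDs always look like 'B' followed by five digits.
    mentioned = set(re.findall(r"\bB\d{5}\b", answer_text))
    return {
        "found": sorted(mentioned & expected),
        "missed": sorted(expected - mentioned),
        "extra": sorted(mentioned - expected),
    }


result = score_answer("Flagged: B10046, B10039 and B10001.")
print(result)
```

This counts every batch ID that appears in the answer, including ones the model mentions only to say they are within range, so it works best when the prompt asks for a final list of violating batches and nothing else.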


In [3]:
from dotenv import load_dotenv
import os
import anthropic

load_dotenv()
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
client = anthropic.Anthropic(api_key=ANTHROPIC_API_KEY)

In [4]:
text_f = "who_audit_summary_synthetic.md"
data_f = "who_audit_synthetic_data.csv"

In [5]:
with open(text_f, "r") as f:
    audit_text = f.read()
with open(data_f, "r") as f:
    data_text = f.read()

Claude model names:
claude-sonnet-4-20250514 -> larger model, so the task is easier
claude-3-5-haiku-20241022 -> smaller model, so the task is harder

In [10]:

def send_claude_message(input_text, data_text="", audit_text="", model="claude-sonnet-4-20250514", max_tokens=4000, temperature=0):
    """
    Send a message to Claude API with customizable parameters.
    
    Args:
        input_text (str): The message template to send to Claude; may contain
            the placeholders {data_text} and {audit_text}
        data_text (str): Data to be inserted into the input_text template
        audit_text (str): Audit information to be inserted into the input_text template
        model (str): Claude model to use (default: claude-sonnet-4-20250514)
        max_tokens (int): Maximum tokens for response (default: 4000)
        temperature (float): Sampling temperature; 0 gives the least random output (default: 0)
    
    Returns:
        str: Claude's response text
    """
    message = client.messages.create(
        model=model,
        max_tokens=max_tokens,
        temperature=temperature,
        system="You are a helpful assistant that can answer questions about the data.",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": input_text.format(data_text=data_text, audit_text=audit_text)
                    }
                ]
            }
        ]
    )
    return message.content[0].text
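
Because the function fills the template with `str.format`, every `{` and `}` in the prompt is treated as a placeholder delimiter. A prompt that contains literal braces, for example a JSON output schema, raises an error unless the braces are doubled. One possible workaround, sketched here with a hypothetical helper `fill_template` that is not part of the notebook, substitutes only the two known placeholders.

```python
def fill_template(template, data_text="", audit_text=""):
    """Insert the two known placeholders without interpreting other braces."""
    # Hypothetical helper, not part of the notebook's send_claude_message.
    return (template
            .replace("{data_text}", data_text)
            .replace("{audit_text}", audit_text))


# A template with literal JSON braces that str.format would reject.
template = 'Reply as JSON like {"batches": []}.\nData: {data_text}'
filled = fill_template(template, data_text="a,b\n1,2")
print(filled)
```

The alternative is to keep `str.format` and write literal braces as `{{` and `}}` in the template.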

In [None]:
input_text = """
Identify which batches of vaccine in this dataset were stored at the wrong temperature. 
The document gives info on acceptable ranges. 
Data: {data_text} \n\n
Document: {audit_text}
"""
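
The prompt above asks for everything in a single step. One common way to decompose the task is to wrap each input in tags and spell out the intermediate steps, as sketched below. The tag names, the step wording, and the name `structured_prompt` are illustrative assumptions, and this template has not been verified against the target batches.

```python
structured_prompt = """
You are auditing vaccine storage temperature records.

<document>
{audit_text}
</document>

<data>
{data_text}
</data>

Work through these steps in order:
1. From the document, list the acceptable temperature range for each vaccine type.
2. From the data header, name the two columns that record the lowest and the highest temperature of each reading. Ignore other temperature-like columns.
3. For every row, compare those two columns with the range for that row's vaccine type.
4. List only the rows where a value falls outside the range, as: batch ID, timestamp.
"""

# The placeholders match the ones used by the original input_text template.
preview = structured_prompt.format(data_text="<csv here>", audit_text="<summary here>")
print(preview)
```

Since it uses the same `{data_text}` and `{audit_text}` placeholders, the template can be passed to `send_claude_message` in place of `input_text`.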

In [12]:
answer = send_claude_message(input_text, data_text=data_text, audit_text=audit_text)
print(answer)

Looking at the dataset and the WHO document, I can identify the batches that were stored at wrong temperatures by comparing the recorded temperatures against the acceptable ranges specified in the document.

## Batches Stored at Wrong Temperature:

### **B10039 (Polio vaccine)**
- **Required range:** 2.0 to 8.0°C
- **Problem:** Multiple readings below 2.0°C
- **Specific violations:**
  - Feb 7, 8:00: 3.7°C (high) / 3.7°C (low) - *borderline but within range*
  - Feb 7, 12:00: 5.4°C (high) / 4.6°C (low) - *within range*
  - Feb 7, 16:00: 6.5°C (high) / 5.7°C (low) - *within range*
  - Feb 7, 20:00: 7.4°C (high) / **3.1°C (low)** - *within range*
- **Status:** All quality checks show FAIL, with notes indicating "Temperature below range" and "Critical low temperature"

### **B10040 (Hepatitis B vaccine)**
- **Required range:** 2.0 to 8.0°C  
- **Problem:** Temperature excursions above 8.0°C
- **Specific violations:**
  - Feb 8, 12:00: **9.0°C** (thermal reading) - *exceeds maximum*
  - Fe