[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/kgweber-cwru/coding-with-ai-wn26/blob/main/week-3-prompt-engineering/concepts.ipynb)

# Week 3: Programmatic Prompt Engineering

## Learning Objectives
- Build dynamic prompts using templates
- Implement few-shot learning programmatically
- Parse and validate LLM outputs
- Create reusable prompt libraries
- Handle structured data in prompts

In [None]:
import os
import sys
from pathlib import Path
import json

IN_COLAB = "google.colab" in sys.modules

if IN_COLAB:
    !pip install -q google-genai google-auth python-dotenv
    from google.colab import auth
    auth.authenticate_user()
    try:
        PROJECT_ID = input("Enter your Google Cloud Project ID (press Enter to use default ADC): ").strip()
    except Exception:
        PROJECT_ID = ""
    if PROJECT_ID:
        os.environ["GOOGLE_CLOUD_PROJECT"] = PROJECT_ID
else:
    def find_service_account_json(max_up=6):
        """Walk up from the current directory looking for a service-account JSON key."""
        p = Path.cwd()
        for _ in range(max_up):
            # Check both known credential locations at this level
            for candidate in (p / "series-2-coding-llms" / "creds", p / "creds"):
                if candidate.exists():
                    # Use the first JSON file found, if any
                    first = next(candidate.glob("*.json"), None)
                    if first is not None:
                        return str(first.resolve())
            p = p.parent
        return None

    sa_path = find_service_account_json()
    if sa_path:
        os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = sa_path
    else:
        try:
            from dotenv import load_dotenv
            load_dotenv()
        except Exception:
            pass


In [2]:
import google.auth
from google import genai
from google.genai import types

creds, project = google.auth.default()
project = os.environ.get("GOOGLE_CLOUD_PROJECT", project)
client = genai.Client(vertexai=True, project=project, location="us-central1")
print(f"Using project: {project}")

print("✅ Environment loaded successfully!")

Using project: coding-with-ai-wn-26
✓ Environment loaded successfully!


## Part 1: Dynamic Prompt Templates

### Basic Template with f-strings

In [3]:
def summarize_article(article_text, max_sentences=3, style="academic"):
    """Summarize an article with customizable parameters"""
    
    prompt = f"""Summarize the following article in {max_sentences} sentences.
    Use a {style} writing style.
    
    Article:
    {article_text}
    
    Summary:"""
    
    response = client.models.generate_content(
        model="gemini-2.5-flash-lite",
        contents=prompt,
        config=types.GenerateContentConfig(temperature=0.3)
    )
    
    return response.text

# Test it
sample_article = """Machine learning has transformed healthcare diagnostics. 
Recent studies show AI models can detect diseases from medical imaging with high accuracy. 
However, challenges remain in model interpretability and clinical integration."""

print("Academic style:")
print(summarize_article(sample_article, 2, "academic"))
print("\n" + "="*50 + "\n")

print("Plain language:")
print(summarize_article(sample_article, 1, "plain language"))

Academic style:
Machine learning has significantly advanced healthcare diagnostics, with artificial intelligence models demonstrating high accuracy in disease detection from medical imaging. Despite these advancements, challenges persist concerning model interpretability and seamless clinical integration.


Plain language:
AI is improving disease detection in healthcare using medical images, but making these tools understandable and usable in hospitals is still a work in progress.
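
One detail of the f-string version above: the indented lines inside the triple-quoted string become part of the prompt, so the model receives leading spaces on most lines. A minimal sketch of one way to avoid that, assuming the template is dedented first and filled in afterwards; `SUMMARY_TEMPLATE`, `build_summary_prompt`, and `demo_prompt` are hypothetical names, and no model call is made:

```python
import textwrap

# Hypothetical module-level template. dedent() strips the indentation that
# the triple-quoted literal picks up from the surrounding code.
SUMMARY_TEMPLATE = textwrap.dedent("""\
    Summarize the following article in {max_sentences} sentences.
    Use a {style} writing style.

    Article:
    {article_text}

    Summary:""")


def build_summary_prompt(article_text, max_sentences=3, style="academic"):
    """Return the prompt string only; no model call is made here."""
    return SUMMARY_TEMPLATE.format(
        article_text=article_text,
        max_sentences=max_sentences,
        style=style,
    )


demo_prompt = build_summary_prompt("Line one.\nLine two.", max_sentences=2)
print(demo_prompt)
```

Dedenting before formatting matters: a multi-line article has no leading spaces of its own, so dedenting after substitution would find no common indentation to remove.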


### Advanced Template Class

In [8]:
class PromptTemplate:
    """Reusable prompt template with validation"""
    
    def __init__(self, template, required_vars=None):
        self.template = template
        self.required_vars = required_vars or []
    
    def format(self, **kwargs):
        """Format template with variables, checking all required vars provided"""
        missing = [var for var in self.required_vars if var not in kwargs]
        if missing:
            raise ValueError(f"Missing required variables: {missing}")
        
        return self.template.format(**kwargs)
    
    def run(self, **kwargs):
        """Format and execute the prompt.

        An optional `temperature` keyword overrides the default of 0.3.
        """
        prompt = self.format(**kwargs)
        
        response = client.models.generate_content(
            model="gemini-2.5-flash-lite",
            contents=prompt,
            config=types.GenerateContentConfig(
                temperature=kwargs.get('temperature', 0.3)
            )
        )
        
        return response.text

# Create a template
email_template = PromptTemplate(
    template="""Write a professional email with the following details.
    Do not use the phrase, "I hope this email finds you well."
    
    To: {recipient}
    Subject: {subject}
    Tone: {tone}
    Key points to include:
    {key_points}
    
    Email:""",
    required_vars=['recipient', 'subject', 'tone', 'key_points']
)

# Use it
email = email_template.run(
    recipient="Dr. Smith",
    subject="Research Collaboration Proposal",
    tone="friendly but professional",
    key_points="- Interested in collaborating on AI project\n- Have preliminary data\n- Request meeting next week"
)

print(email)

Subject: Research Collaboration Proposal

Dear Dr. Smith,

I am writing to express my strong interest in exploring a potential research collaboration with you and your team. I have been following your work in [mention Dr. Smith's specific area of research, if known, e.g., natural language processing, computer vision] with great admiration, and I believe our respective research interests align particularly well with an exciting AI project I am currently developing.

My team has generated some preliminary data that shows promising results in [briefly describe the area of your preliminary data, e.g., a novel approach to image recognition, a new algorithm for sentiment analysis]. I am eager to discuss how this could potentially complement and advance your ongoing research.

Would you be available for a brief meeting sometime next week to discuss this further? I am flexible and happy to work around your schedule. Please let me know what day and time might be convenient for you.

Thank you f
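
`PromptTemplate` takes `required_vars` as a hand-written list, which can drift out of sync with the template text. As a sketch of one alternative, the placeholder names can be read from the template itself with the standard library's `string.Formatter`. The helper name `template_fields` is an assumption for illustration, not part of the class above:

```python
from string import Formatter


def template_fields(template):
    """Return the named placeholders in a str.format-style template, in order."""
    fields = []
    for _literal, field_name, _spec, _conversion in Formatter().parse(template):
        # field_name is None for plain text and "" for a bare "{}"
        if not field_name:
            continue
        # Reduce "user.name" or "items[0]" to the top-level variable name
        base = field_name.split(".")[0].split("[")[0]
        if base not in fields:
            fields.append(base)
    return fields


demo_fields = template_fields(
    "To: {recipient}\nSubject: {subject}\nTone: {tone}\n{key_points}"
)
print(demo_fields)
```

The result could be passed as `required_vars` when constructing a template, so the check always matches the placeholders actually present.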

## Part 2: Few-Shot Learning

Provide examples to guide the model's behavior:

In [11]:
class FewShotClassifier:
    """Classify text using few-shot examples"""
    
    def __init__(self, examples, task_description, debug=False):
        self.examples = examples  # List of (input, output) tuples
        self.task_description = task_description
        self.debug = debug
    
    def build_prompt(self, new_input):
        """Build prompt with examples"""
        prompt = f"{self.task_description}\n\n"
        
        # Add examples
        for inp, out in self.examples:
            prompt += f"Input: {inp}\nOutput: {out}\n\n"
        
        # Add new input
        prompt += f"Input: {new_input}\nOutput:"
        
        return prompt
    
    def classify(self, text):
        prompt = self.build_prompt(text)
        
        if self.debug:
            print("DEBUG: Prompt sent to model:")
            print(prompt)
            print("-" * 50)
        
        response = client.models.generate_content(
            model="gemini-2.5-flash-lite",
            contents=prompt,
            config=types.GenerateContentConfig(
                temperature=0,
                max_output_tokens=50
            )
        )
        
        # response.text can be None when no text is returned (e.g. a blocked response)
        return (response.text or "").strip()

# Example: Medical note classification
medical_classifier = FewShotClassifier(
    examples=[
        ("Patient reports chest pain and shortness of breath", "URGENT"),
        ("Routine follow-up appointment scheduled", "ROUTINE"),
        ("Lab results show elevated blood sugar", "FOLLOW-UP"),
    ],
    task_description="Classify medical notes as: URGENT, ROUTINE, or FOLLOW-UP",
    debug=True
)

# Test it
test_cases = [
    "Patient has persistent fever and confusion",
    "Annual physical examination completed",
    "X-ray shows possible fracture"
]

for case in test_cases:
    classification = medical_classifier.classify(case)
    print(f"Note: {case}")
    print(f"Classification: {classification}")
    print()

DEBUG: Prompt sent to model:
Classify medical notes as: URGENT, ROUTINE, or FOLLOW-UP

Input: Patient reports chest pain and shortness of breath
Output: URGENT

Input: Routine follow-up appointment scheduled
Output: ROUTINE

Input: Lab results show elevated blood sugar
Output: FOLLOW-UP

Input: Patient has persistent fever and confusion
Output:
--------------------------------------------------
Note: Patient has persistent fever and confusion
Classification: URGENT

DEBUG: Prompt sent to model:
Classify medical notes as: URGENT, ROUTINE, or FOLLOW-UP

Input: Patient reports chest pain and shortness of breath
Output: URGENT

Input: Routine follow-up appointment scheduled
Output: ROUTINE

Input: Lab results show elevated blood sugar
Output: FOLLOW-UP

Input: Annual physical examination completed
Output:
--------------------------------------------------
Note: Annual physical examination completed
Classification: ROUTINE

DEBUG: Prompt sent to model:
Classify medical notes as: URGENT, ROU
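
The classifier returns whatever text the model produces, so downstream code may want to confirm that the reply is one of the expected labels. A minimal sketch, assuming the three labels from the example above; `ALLOWED_LABELS`, `normalize_label`, and the `"UNKNOWN"` fallback are hypothetical choices:

```python
import re

ALLOWED_LABELS = {"URGENT", "ROUTINE", "FOLLOW-UP"}


def normalize_label(raw, allowed=ALLOWED_LABELS, default="UNKNOWN"):
    """Map raw model text onto exactly one allowed label, or return `default`."""
    # Split into upper-case word tokens, keeping internal hyphens ("FOLLOW-UP")
    tokens = {t.strip("-") for t in re.findall(r"[A-Z][A-Z-]*", raw.upper())}
    hits = tokens & set(allowed)
    # Accept only an unambiguous answer; zero or several labels are rejected
    return hits.pop() if len(hits) == 1 else default


for reply in [" urgent\n", "Classification: Follow-up.", "URGENT or ROUTINE", "non-urgent"]:
    print(repr(reply), "->", normalize_label(reply))
```

Matching on whole tokens rather than substrings keeps a reply such as "non-urgent" from being read as `URGENT`.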

## Part 3: Structured Output Parsing

### JSON Output

In [12]:
def extract_structured_data(text, schema):
    """Extract structured data according to schema"""
    
    prompt = f"""Extract information from the text according to this schema.
    
    Schema: {json.dumps(schema, indent=2)}
    
    Text: {text}
    """
    
    response = client.models.generate_content(
        model="gemini-2.5-flash-lite",
        contents=prompt,
        config=types.GenerateContentConfig(
            temperature=0,
            response_mime_type="application/json"
        )
    )
    
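    # Parse the JSON response
    try:
        return json.loads(response.text)
    except json.JSONDecodeError:
        # Fallback: the model may have wrapped the JSON in extra text,
        # so parse only the span from the first '{' to the last '}'
        content = response.text
        start = content.find('{')
        end = content.rfind('}') + 1
        return json.loads(content[start:end])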

# Example: Extract patient information
patient_text = """John Smith, age 45, presented with hypertension. 
Blood pressure: 150/95. Prescribed lisinopril 10mg daily. 
Follow-up in 2 weeks."""

schema = {
    "patient_name": "string",
    "age": "integer",
    "diagnosis": "string",
    "medication": "string",
    "follow_up": "string"
}

result = extract_structured_data(patient_text, schema)
print(json.dumps(result, indent=2))

{
  "patient_name": "John Smith",
  "age": 45,
  "diagnosis": "hypertension",
  "medication": "lisinopril 10mg daily",
  "follow_up": "2 weeks"
}
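
The schema above names a type for each field, but nothing yet checks that the returned values match. A sketch of a simple type check, assuming the informal type names map onto Python types as listed in `TYPE_MAP`; that mapping, `check_types`, and `demo_schema` are illustrative assumptions, not a JSON Schema validator:

```python
# Assumed mapping from the schema's informal type names to Python types
TYPE_MAP = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_types(result, schema):
    """Return a list of problems; an empty list means the result passed."""
    if not isinstance(result, dict):
        return ["result is not a JSON object"]
    problems = []
    for key, type_name in schema.items():
        if key not in result:
            problems.append(f"{key}: missing")
            continue
        expected = TYPE_MAP.get(type_name)  # unknown type names are skipped
        value = result[key]
        # bool is a subclass of int, so reject it explicitly for other types
        bool_mismatch = isinstance(value, bool) and type_name != "boolean"
        if expected and (bool_mismatch or not isinstance(value, expected)):
            problems.append(
                f"{key}: expected {type_name}, got {type(value).__name__}"
            )
    return problems


demo_schema = {"patient_name": "string", "age": "integer"}
print(check_types({"patient_name": "John Smith", "age": 45}, demo_schema))
print(check_types({"patient_name": "John Smith", "age": "45"}, demo_schema))
```

Returning a list of problems rather than raising makes it easy to log every issue at once or feed them into a retry loop.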


### Validation and Error Handling

In [13]:
def extract_with_validation(text, schema, max_retries=2):
    """Extract data with retry on validation failure"""
    
    for attempt in range(max_retries + 1):
        try:
            result = extract_structured_data(text, schema)
            
            # Validate that every field named in the schema is present
            missing = [key for key in schema.keys() if key not in result]
            if missing:
                if attempt < max_retries:
                    print(f"Attempt {attempt + 1}: Missing fields {missing}, retrying...")
                    continue
                else:
                    raise ValueError(f"Missing required fields: {missing}")
            
            return result
            
        except Exception as e:
            if attempt < max_retries:
                print(f"Attempt {attempt + 1} failed: {e}, retrying...")
            else:
                raise
    
    return None

# Test it
result = extract_with_validation(patient_text, schema)
print("Validated result:")
print(json.dumps(result, indent=2))

Validated result:
{
  "patient_name": "John Smith",
  "age": 45,
  "diagnosis": "hypertension",
  "medication": "lisinopril 10mg daily",
  "follow_up": "2 weeks"
}
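
The retry logic above is tied to one extraction function and one check. As a sketch of how the same pattern might be generalized, the helper below takes the call and the validator as arguments, so it can be exercised without a model. `retry_until_valid` and the stubbed replies are hypothetical:

```python
def retry_until_valid(call, validate, max_retries=2):
    """Call `call()` until `validate(result)` reports no problems.

    `call` is any zero-argument function; `validate` returns a list of
    problem strings (empty when the result is acceptable).
    """
    last_problems = []
    for _attempt in range(max_retries + 1):
        try:
            result = call()
        except ValueError as exc:  # json.JSONDecodeError is a ValueError
            last_problems = [f"call failed: {exc}"]
            continue
        last_problems = validate(result)
        if not last_problems:
            return result
    raise ValueError(
        f"No valid result after {max_retries + 1} attempts: {last_problems}"
    )


# Stub standing in for a model call: the first reply lacks a field
fake_replies = iter([
    {"patient_name": "John Smith"},
    {"patient_name": "John Smith", "age": 45},
])
required_keys = ["patient_name", "age"]

demo_valid = retry_until_valid(
    call=lambda: next(fake_replies),
    validate=lambda r: [k for k in required_keys if k not in r],
)
print(demo_valid)
```

Catching only `ValueError` instead of every exception is a deliberate choice in this sketch: unexpected errors such as authentication failures surface immediately rather than being retried.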


## Part 4: Batch Processing with Templates

In [14]:
def batch_process(items, template_func, **kwargs):
    """Process multiple items with same template"""
    results = []
    
    for i, item in enumerate(items, 1):
        print(f"Processing {i}/{len(items)}...", end=" ")
        result = template_func(item, **kwargs)
        results.append({"input": item, "output": result})
        print("Done")
    
    return results

def translate_to_plain_language(medical_term):
    """Translate medical jargon to plain language"""
    prompt = f"""Explain this medical term in simple, patient-friendly language (one sentence):
    
    Term: {medical_term}
    
    Plain language explanation:"""
    
    response = client.models.generate_content(
        model="gemini-2.5-flash-lite",
        contents=prompt,
        config=types.GenerateContentConfig(
            temperature=0.3,
            max_output_tokens=100
        )
    )
    
    return response.text

# Process multiple terms
medical_terms = [
    "hypertension",
    "myocardial infarction",
    "dyspnea"
]

translations = batch_process(medical_terms, translate_to_plain_language)

Processing 1/3... Done
Processing 2/3... Done
Processing 3/3... Done


In [15]:
translations

[{'input': 'hypertension',
  'output': 'Hypertension is the medical term for high blood pressure, which means the force of blood pushing against your artery walls is consistently too high.'},
 {'input': 'myocardial infarction',
  'output': 'Myocardial infarction means a heart attack, which happens when blood flow to a part of the heart muscle is blocked.'},
 {'input': 'dyspnea',
  'output': "Dyspnea means you're having trouble breathing, like feeling short of breath or gasping for air."}]
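
In `batch_process`, one failed call stops the whole loop and discards the results collected so far. A minimal sketch of a variant that records the failure and moves on; `batch_process_safe` and the `shout` stub are hypothetical names, with the stub standing in for a model-backed function:

```python
def batch_process_safe(items, func, **kwargs):
    """Like batch_process, but record failures instead of stopping the batch."""
    results = []
    for item in items:
        try:
            output = func(item, **kwargs)
            results.append({"input": item, "output": output, "error": None})
        except Exception as exc:  # broad on purpose: one bad item should not end the run
            results.append({"input": item, "output": None, "error": str(exc)})
    return results


def shout(term, suffix="!"):
    """Stub standing in for a model-backed function."""
    if not term:
        raise ValueError("empty input")
    return term.upper() + suffix


demo_batch = batch_process_safe(["dyspnea", "", "hypertension"], shout, suffix="?")
for row in demo_batch:
    print(row)
```

Keeping an `error` field on every row lets the failed inputs be filtered out and retried later.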

## Part 5: Prompt Library Pattern

In [16]:
class PromptLibrary:
    """Manage a collection of reusable prompts"""
    
    def __init__(self):
        self.prompts = {}
    
    def register(self, name, template, required_vars=None):
        """Register a new prompt template"""
        self.prompts[name] = PromptTemplate(template, required_vars)
    
    def get(self, name):
        """Get a prompt template by name"""
        if name not in self.prompts:
            raise ValueError(f"Prompt '{name}' not found")
        return self.prompts[name]
    
    def list(self):
        """List all registered prompts"""
        return list(self.prompts.keys())

# Create library
lib = PromptLibrary()

# Register prompts
lib.register(
    "summarize",
    "Summarize this text in {num_sentences} sentences:\n\n{text}\n\nSummary:",
    required_vars=['text', 'num_sentences']
)

lib.register(
    "expand",
    "Expand this concept with {detail_level} detail:\n\n{concept}\n\nExpanded explanation:",
    required_vars=['concept', 'detail_level']
)

lib.register(
    "compare",
    "Compare and contrast {item_a} and {item_b}. Focus on: {focus}\n\nComparison:",
    required_vars=['item_a', 'item_b', 'focus']
)

# Use prompts
print("Available prompts:", lib.list())
print("\n" + "="*50 + "\n")

result = lib.get('compare').run(
    item_a="MRI",
    item_b="CT scan",
    focus="clinical applications and safety considerations"
)

print(result)

Available prompts: ['summarize', 'expand', 'compare']


Let's compare and contrast MRI and CT scans, focusing on their clinical applications and safety considerations.

## MRI vs. CT Scan: A Comparison and Contrast

Both Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) are invaluable diagnostic imaging modalities that provide detailed cross-sectional views of the body. However, they utilize fundamentally different principles, leading to distinct strengths, weaknesses, clinical applications, and safety profiles.

### Comparison: Similarities

*   **Non-Invasive Imaging:** Both MRI and CT are non-invasive procedures, meaning they do not require surgery or the insertion of instruments into the body to obtain images.
*   **Cross-Sectional Imaging:** Both techniques generate cross-sectional images (slices) of the body, allowing for visualization of internal structures that would be obscured by overlying tissues in traditional X-rays.
*   **Diagnostic Power:** Both are powerful 
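
A library that lives only in memory has to be re-registered in every notebook. One possible way to share prompts across a team is to keep them in a JSON file. The sketch below assumes a simple layout of `{name: {"template": ..., "required_vars": [...]}}`; the helper names and the file layout are illustrative assumptions, not part of `PromptLibrary`:

```python
import json
import tempfile
from pathlib import Path


def save_prompts(prompts, path):
    """Write a {name: {"template", "required_vars"}} mapping to a JSON file."""
    Path(path).write_text(json.dumps(prompts, indent=2), encoding="utf-8")


def load_prompts(path):
    """Read prompts back, checking that every entry has a template."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    for name, entry in data.items():
        if "template" not in entry:
            raise ValueError(f"Prompt '{name}' has no template")
        entry.setdefault("required_vars", [])
    return data


demo_prompts = {
    "summarize": {
        "template": "Summarize this text in {num_sentences} sentences:\n\n{text}\n\nSummary:",
        "required_vars": ["text", "num_sentences"],
    }
}

# Round-trip through a temporary file to show the format survives intact
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "prompts.json"
    save_prompts(demo_prompts, demo_path)
    loaded_prompts = load_prompts(demo_path)

print(loaded_prompts == demo_prompts)
```

Because each entry's keys match the parameters of `register`, a loaded entry could be added to the library with `lib.register(name, **entry)`.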

## Key Takeaways

1. **Templates enable reusability** - Write a prompt once and fill in different variables each time
2. **Few-shot examples guide behavior** - A handful of well-chosen input/output pairs shows the model the expected format and labels
3. **Structure your outputs** - Requesting JSON makes responses easier to parse reliably
4. **Validate and retry** - LLMs aren't perfect, so build in checks
5. **Build libraries** - Organize prompts for team use

## Next Week

We'll explore **embeddings and RAG concepts**:
- Understanding vector representations
- Semantic similarity
- Document retrieval
- Setting up for RAG systems