# Tutorial 5: Optimized Multi-Agent System

You crushed Tutorial 4! Your agents are working, you're seeing ~30% accuracy, and you understand the multi-agent architecture. But the leaderboard leaders are hitting 60-70%. Time to close that gap.

## What's New in Tutorial 5

Tutorial 4 was about **building** a working system. Tutorial 5 is about **optimizing** it for production-level performance.

### Key Improvements We're Adding:

**Work Type Pre-Filtering**
- Focus on remote vs onsite work preferences
- Filter jobs by work type before LLM matching

**Token Optimization Everywhere**
- Smarter extraction prompts that get work type preferences in one pass
- Pre-filtering reduces LLM context from all jobs to only relevant ones

**Precision Matching**
- Work type alignment prevents mismatches (no remote workers getting onsite-only jobs)
- Work type filtering creates much better candidate pools

## The Math That Matters

**Tutorial 4 approach**:
- Send ALL jobs to LLM for every persona
- Large token count per matching request
- Process all available jobs regardless of relevance

**Tutorial 5 approach**:
- Pre-filter to relevant jobs per persona based on work type preference
- Reduced token count per matching request
- Process only jobs that match work type preferences
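
To make the comparison concrete, here is a rough back-of-the-envelope sketch. Every number in it (job count, tokens per job entry, prompt overhead, number of remote jobs) is a hypothetical placeholder, not a measurement from the dataset, so substitute your own figures.

```python
def estimate_prompt_tokens(num_jobs: int, tokens_per_job: int, overhead_tokens: int = 150) -> int:
    """Rough size of one matching prompt: fixed overhead plus one entry per job."""
    return overhead_tokens + num_jobs * tokens_per_job


# Hypothetical numbers for illustration only
TOTAL_JOBS = 200      # assumed size of the full job pool
REMOTE_JOBS = 70      # assumed number of jobs that survive a 'remote' pre-filter
TOKENS_PER_JOB = 40   # assumed length of one describe() entry

unfiltered_tokens = estimate_prompt_tokens(TOTAL_JOBS, TOKENS_PER_JOB)
filtered_tokens = estimate_prompt_tokens(REMOTE_JOBS, TOKENS_PER_JOB)
saving = 1 - filtered_tokens / unfiltered_tokens

print(f"Tutorial 4 style prompt: ~{unfiltered_tokens} tokens")
print(f"Tutorial 5 style prompt: ~{filtered_tokens} tokens")
print(f"Estimated reduction per matching request: {saving:.0%}")
```

The saving scales with how unevenly the pool splits by work type: the smaller the share of jobs matching a persona's preference, the larger the reduction.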

## Architecture Improvements

```
┌─────────────────────────────────────────────────────────────────┐
│                  OPTIMIZED MULTI-AGENT ARCHITECTURE             │
└─────────────────────────────────────────────────────────────────┘

Phase 1: ENHANCED INTERVIEW & EXTRACTION
=========================================
  Persona API
      ↓
  [Interview Agent] ← Now asks about work type preferences
      ↓
  Raw Transcript
      ↓
  [Persona Extraction Agent] ← Extracts work type info
      ↓
  PersonaInfo (with work type)

Phase 2: WORK TYPE-AWARE DATA PROCESSING  
========================================
  Job Files → [Job Extraction Agent] → JobInfo (with work types)
  Training Files → [Training Extraction Agent] → TrainingInfo

Phase 3: PRE-FILTERED INTELLIGENT MATCHING
===========================================
  PersonaInfo + Work Type Filter
            ↓
       [Pre-Filter] ← Reduces job pool to relevant work types
            ↓
    Filtered JobInfo + PersonaInfo
            ↓
      [Matching Agent] ← Works with relevant subset only
            ↓
      Better Recommendations
```

**The secret sauce**: We don't change the agent logic, we just give each agent better, more focused data to work with.

In [None]:
# Core imports
import json
import os
import sys
import boto3
import dotenv
import requests

from pathlib import Path
from typing import Dict, List, Optional, Tuple, TypeVar
from tqdm import tqdm

# Add parent directory to import our utilities
sys.path.append('..')
from src.utils import (
    save_json,
    read_json,
    load_file_content,
    get_job_paths,
    get_training_paths,
    sanity_check,
    track_api_call,  # Cost tracking from utils
    print_cost_summary,  # Cost summary from utils
    reset_cost_tracker  # Reset cost tracker from utils
)

# Pydantic for structured data
from pydantic import BaseModel, Field

# Strands for AI agents
from strands.agent import Agent
from strands.models.mistral import MistralModel

# AWS authentication
from botocore.auth import SigV4Auth
from botocore.awsrequest import AWSRequest

# Type hints
M = TypeVar('M', bound=BaseModel)

# Set up submission directory
SUBMISSION_DIR = Path('../submissions/tutorial_5')
SUBMISSION_DIR.mkdir(parents=True, exist_ok=True)

# Load environment
dotenv.load_dotenv("../env")

print("✅ Setup complete")
sanity_check()

## Enhanced Data Models (The Key Optimization)

Here's the game-changer for Tutorial 5: **Work type-based filtering**. 

**Tutorial 4 problem**: We sent ALL jobs to the LLM for every persona, even when many were irrelevant based on work preferences.
- Remote workers see onsite-only jobs → wastes tokens, confuses matching
- Onsite workers see remote-only jobs → wastes tokens, confuses matching

**Tutorial 5 solution**: Add `work_type` field to both personas and jobs. Pre-filter before LLM matching.

### The Two Work Types

Based on analysis of the job dataset and persona preferences, we focus on two distinct work arrangements:
1. **`remote`**: Work from home, distributed teams, digital collaboration
2. **`onsite`**: Office-based, in-person collaboration, physical presence required

### Why Work Type Filtering Works

**Semantic matching without work type filtering**:
```
Maria (remote preference) → Gets matched with:
- Remote Software Developer (perfect!) 
- Onsite Marketing Manager (wrong - she wants remote)
- Hybrid Project Manager (maybe - depends on flexibility)
- Onsite Sales Representative (wrong - not remote)
```

**Work type-aware matching**:
```
Maria (remote preference) → Pre-filtered to remote jobs only:
- Remote Software Developer (perfect!)
- Remote Content Writer (great match!)
- Remote Data Analyst (excellent!)
- Remote Customer Success (ideal!)
```

**Result**: LLM sees only relevant jobs → better matches + reduced tokens.

In [None]:
class PersonaInfo(BaseModel):
    """
    Enhanced structured profile of a job seeker with work type preference
    """
    name: str = Field(default="", description="Persona's name")
    skills: List[Tuple[str, str]] = Field(
        default_factory=list,
        description="List of (skill, level) pairs"
    )
    location: str = Field(default="unknown")
    age: str = Field(default="unknown")
    years_of_experience: str = Field(default="unknown")
    work_type: str = Field(default="unspecified", description="Preferred work type - one of two: 'onsite' or 'remote'")

    def describe(self) -> str:
        """Enhanced description that includes work type for better matching"""
        skills = ', '.join([f'{s}: {l}' for s, l in self.skills])
        return f"Name: {self.name}\nSkills: {skills}\nLocation: {self.location}\nAge: {self.age}\nExperience: {self.years_of_experience} years\nWork type: {self.work_type}"


class JobInfo(BaseModel):
    """
    Enhanced structured job requirements with work type classification
    """
    required_skills: List[str] = Field(default_factory=list)
    location: str = Field(default="")
    years_of_experience_required: str = Field(default="")
    work_type: str = Field(default="unspecified", description="One of two: 'onsite' or 'remote'")

    def describe(self) -> str:
        """Enhanced description that includes work type for filtering"""
        skills = ', '.join(self.required_skills)
        return f"Skills: {skills}\nLocation: {self.location}\nExperience: {self.years_of_experience_required}\nWork type: {self.work_type}"


class TrainingInfo(BaseModel):
    """Training program details - unchanged from Tutorial 4
    
    Note: We kept this simple since training matching is less critical
    and the cost savings from job pre-filtering are already substantial.
    """
    skill_acquired_and_level: Tuple[str, str] = Field(
        default=("not specified", "not specified")
    )

    def describe(self) -> str:
        skill, level = self.skill_acquired_and_level
        return f"Teaches: {skill} (Level: {level})"

## Core Agent Factory (Same as Tutorial 4)

No changes here - our agent factory is already optimized. The improvements in Tutorial 5 come from:
1. **Better data** (work type fields)
2. **Smarter prompts** (asking for work type)
3. **Pre-filtering** (reducing tokens before LLM calls)

The agent architecture itself is solid. We're optimizing the data flow, not rebuilding the engine.

In [4]:
def get_agent(
    system_prompt: str = "",
    model_id: str = "mistral-medium-latest"
) -> Agent:
    """Create an AI agent with specific role and model"""
    model = MistralModel(
        api_key=os.environ["MISTRAL_API_KEY"],
        model_id=model_id,
        stream=False
    )
    return Agent(model=model, system_prompt=system_prompt, callback_handler=None)

## Agent 1: Enhanced Interview Agent

- **Purpose**: Conduct structured interviews with personas (now includes work type preference)
- **Model**: Medium (needs good conversation skills)
- **Cost**: Similar to Tutorial 4
- **NEW**: Asks about work type preference for filtering optimization

### What's Enhanced

**Tutorial 4 interview questions**:
1. Name, skills, location, age, experience
2. Generic follow-ups

**Tutorial 5 interview questions**:
1. Name, skills, location, age, experience
2. **Work type preference** (remote or onsite)

**Why this matters**: Asking for the work type preference adds almost no token cost (one extra question in a conversation of similar length), but enables significant cost savings in the matching phase.

### The Work Type Question Strategy

Instead of asking vague questions about work preferences, we guide personas toward our two work types:
- "Do you prefer to work **remotely** from home, or do you prefer **onsite** work where you collaborate in person at an office or workplace?"

This structured approach ensures we get usable work type data that our pre-filtering can leverage.

In [7]:
def send_message_to_chat(
    message: str,
    persona_id: str,
    conversation_id: Optional[str] = None
) -> Optional[Tuple[str, str]]:
    """Send message to persona API and get response"""
    url = "https://cygeoykm2i.execute-api.us-east-1.amazonaws.com/main/chat"

    session = boto3.Session(region_name='us-east-1')
    credentials = session.get_credentials()

    payload = {
        "persona_id": persona_id,
        "conversation_id": conversation_id,
        "message": message
    }

    request = AWSRequest(
        method='POST',
        url=url,
        data=json.dumps(payload),
        headers={'Content-Type': 'application/json'}
    )
    SigV4Auth(credentials, 'execute-api', 'us-east-1').add_auth(request)

    response = requests.request(
        method=request.method,
        url=request.url,
        headers=dict(request.headers),
        data=request.body
    )

    if response.status_code != 200:
        return None

    response_json = response.json()
    return response_json['response'], response_json['conversation_id']

In [None]:
# Enhanced Interview Agent implementation
INTERVIEW_PROMPT = """
You are conducting a career counseling interview. Gather these 6 key pieces of information:
- Name
- Skills and proficiency levels
- Current location
- Age
- Years of experience
- Work type: "remote" or "onsite"

For work type, guide them by asking about their preferred work arrangement: do they prefer to work from home (remote) or in person at a workplace (onsite)?

Ask targeted questions to get specific information quickly.
Do not provide job recommendations yet.
"""

def conduct_persona_interview(
    persona_id: str,
    max_turns: int = 5,
    model: str = "mistral-medium-latest",
    print_conversation: bool = False
) -> List[str]:
    """Interview a persona and return conversation transcript"""
    
    conversation = []
    interview_agent = get_agent(INTERVIEW_PROMPT, model_id=model)
    conversation_id = None

    # Start with greeting
    agent_message = "Hello! I'm here to help you find the best opportunities. Can you tell me your name?"
    conversation.append(f"Assistant: {agent_message}")

    # Conduct interview
    for turn in range(max_turns):
        resp = send_message_to_chat(agent_message, persona_id, conversation_id)
        
        if resp is None:
            break
            
        user_response, conversation_id = resp
        conversation.append(f"User: {user_response}")
        
        # Generate next question
        agent_response = interview_agent(user_response)
        
        # Track cost (using utils.py function)
        track_api_call(agent_response, model)
        
        agent_message = str(agent_response)
        conversation.append(f"Assistant: {agent_message}")
    
    if print_conversation:
        print('\n'.join(conversation))
        
    return conversation

## Agent 2: Enhanced Extraction Agents

- **Purpose**: Convert unstructured text to structured data (now with work types!)
- **Model**: Small (simple structured extraction)
- **Cost**: Similar to Tutorial 4
- **NEW**: Extract work type information for pre-filtering

### Work Type Classification Strategy

Our Job Extraction Agent uses these guidelines:
- **Remote**: Work from home, distributed teams, virtual collaboration, location-independent
- **Onsite**: Office-based, in-person collaboration, physical presence required, specific location needed

The LLM is surprisingly good at this classification - much better than keyword matching.
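
For contrast, here is what a naive keyword baseline might look like. This is an illustrative sketch only: the keyword lists and the `keyword_work_type` function are hypothetical and not part of the tutorial's pipeline. It shows one failure mode that motivates using an LLM instead: substring matching cannot handle negation.

```python
# Hypothetical keyword lists, chosen for illustration
REMOTE_KEYWORDS = ("remote", "work from home", "distributed team")
ONSITE_KEYWORDS = ("onsite", "on-site", "in-person", "office-based")


def keyword_work_type(description: str) -> str:
    """Classify a job description by substring matching (naive baseline)."""
    text = description.lower()
    if any(keyword in text for keyword in REMOTE_KEYWORDS):
        return "remote"
    if any(keyword in text for keyword in ONSITE_KEYWORDS):
        return "onsite"
    return "unspecified"


# Works on the easy case
print(keyword_work_type("Fully remote role with a distributed team"))

# Misclassified as 'remote': the word appears, but the sentence negates it
print(keyword_work_type("Office-based position. Remote work is not possible."))

# No keyword at all falls back to 'unspecified'
print(keyword_work_type("Warehouse associate, day shifts"))
```

Reordering the checks or adding more keywords only moves the errors around; understanding the sentence is what the extraction agent contributes.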

In [None]:
# Three enhanced extraction prompts - now with work type classification
PERSONA_EXTRACTION_PROMPT = """Extract the following information from this conversation:
- Name
- Skills (as pairs of skill name and proficiency level)
- Location
- Age
- Years of experience
- Work type (one of: ['onsite', 'remote'])

If work type isn't explicitly mentioned, infer from their skills and interests.

Conversation:
"""

JOB_EXTRACTION_PROMPT = """Extract from this job description:
- Required skills (list)
- Location
- Years of experience required
- Work type (one of: ['onsite', 'remote'])

Classify the work type based on where the work is performed:
- remote: Work from home, distributed teams, virtual collaboration, location-independent
- onsite: Office-based, in-person collaboration, physical presence required, specific location needed

Job description:
"""

TRAINING_EXTRACTION_PROMPT = """Extract from this training description:
- Skill taught and its level

Training description:
"""

def extract_persona_info(
    conversation: List[str],
    model: str = "mistral-small-latest"
) -> PersonaInfo:
    """Extract persona info from conversation using Persona Extraction Agent"""
    text = '\n'.join(conversation)
    prompt = PERSONA_EXTRACTION_PROMPT + text

    extraction_agent = get_agent(model_id=model)
    result = extraction_agent.structured_output(output_model=PersonaInfo, prompt=prompt)

    # Track cost
    if hasattr(extraction_agent, 'last_response'):
        track_api_call(extraction_agent.last_response, model)

    return result

def extract_job_info(
    path: Path,
    model: str = "mistral-small-latest"
) -> JobInfo:
    """Extract job info from file using Job Extraction Agent"""
    text = load_file_content(path)
    prompt = JOB_EXTRACTION_PROMPT + text

    extraction_agent = get_agent(model_id=model)
    result = extraction_agent.structured_output(output_model=JobInfo, prompt=prompt)

    if hasattr(extraction_agent, 'last_response'):
        track_api_call(extraction_agent.last_response, model)

    return result

def extract_training_info(
    path: Path,
    model: str = "mistral-small-latest"
) -> TrainingInfo:
    """Extract training info from file using Training Extraction Agent"""
    text = load_file_content(path)
    prompt = TRAINING_EXTRACTION_PROMPT + text

    extraction_agent = get_agent(model_id=model)
    result = extraction_agent.structured_output(output_model=TrainingInfo, prompt=prompt)

    if hasattr(extraction_agent, 'last_response'):
        track_api_call(extraction_agent.last_response, model)

    return result

## Let's See the Other Extraction Agents in Action

Now let's test our Job and Training extraction agents on real files. This shows you exactly what structured data we're pulling out.

In [None]:
# Test Job Extraction Agent
print("💼 Testing Job Extraction Agent...")
print("Reading a sample job file...\n")

# Get first job file
job_paths = get_job_paths()
if job_paths:
    sample_job = extract_job_info(job_paths[0])
    print(f"Job ID: {job_paths[0].stem}")
    print(sample_job.describe())
    print("\nRaw Pydantic model:")
    print(sample_job.model_dump())
else:
    print("No job files found!")

In [None]:
# Test Training Extraction Agent
print("📚 Testing Training Extraction Agent...")
print("Reading a sample training file...\n")

# Get first training file
training_paths = get_training_paths()
if training_paths:
    sample_training = extract_training_info(training_paths[0])
    print(f"Training ID: {training_paths[0].stem}")
    print(sample_training.describe())
    print("\nRaw Pydantic model:")
    print(sample_training.model_dump())
else:
    print("No training files found!")

print("\n" + "="*50)
print("✅ All three extraction agents tested!")

## Agent 3: Optimized Matching Agent

- **Purpose**: Semantic matching between personas and opportunities (with pre-filtering!)
- **Model**: Medium (needs reasoning capabilities) 
- **NEW**: Receives pre-filtered job list instead of all available jobs

### The Game-Changing Optimization

**Tutorial 4 approach**:
```python
# Send ALL jobs to LLM every time
jobs_text = "\n".join([f'{job_id}: {job_info.describe()}' for job_id, job_info in all_jobs.items()])
matches = find_job_matches(persona, jobs_text)  # Large token count per call
```

**Tutorial 5 approach**:
```python
# Pre-filter to relevant jobs only
filtered_jobs = {job_id: job_info for job_id, job_info in all_jobs.items() if matches_work_type(persona, job_info)}
jobs_text = "\n".join([f'{job_id}: {job_info.describe()}' for job_id, job_info in filtered_jobs.items()])
matches = find_job_matches(persona, jobs_text)  # Reduced token count per call
```

**Result**: Cost reduction + better matches (LLM isn't confused by irrelevant options).

### Why Pre-Filtering Works So Well

**Example**: Remote software developer in São Paulo

**Without filtering**: LLM sees all jobs including:
- Onsite construction jobs (wrong work type)
- Remote marketing jobs (right work type, different skills)  
- Onsite software jobs (right skills, wrong work type)
- Remote software jobs (perfect matches!)

**With filtering**: LLM sees only remote jobs:
- Much easier to rank and choose the best matches
- No confusion from work type mismatches

### The Smart Filter Logic

We filter jobs using work type criteria:
1. **Work type match**: Exact work type match (remote, onsite)
2. **Flexible handling**: Handle 'unspecified' cases appropriately

This typically reduces the job pool significantly per persona.
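
The two rules above can be sketched as a small standalone function. This is a minimal illustration, assuming a simplified `Job` dataclass as a stand-in for `JobInfo`; the function name `filter_jobs_by_work_type` is hypothetical. It treats `'unspecified'` permissively on both sides, so a missing value never removes a job from the pool.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Job:
    """Simplified stand-in for JobInfo (illustration only)."""
    title: str
    work_type: str = "unspecified"


def filter_jobs_by_work_type(persona_work_type: str, jobs: Dict[str, Job]) -> Dict[str, Job]:
    """Keep jobs whose work type matches the persona, or where either side is unspecified."""
    if persona_work_type == "unspecified":
        # No stated preference: keep the whole pool
        return dict(jobs)
    return {
        job_id: job
        for job_id, job in jobs.items()
        if job.work_type in (persona_work_type, "unspecified")
    }


sample_jobs = {
    "job_001": Job("Software Developer", "remote"),
    "job_002": Job("Marketing Manager", "onsite"),
    "job_003": Job("Data Analyst", "remote"),
    "job_004": Job("Project Manager"),  # work type not extracted
}

remote_pool = filter_jobs_by_work_type("remote", sample_jobs)
print(sorted(remote_pool))
```

Keeping `'unspecified'` jobs trades some token savings for recall: a job whose work type could not be extracted stays visible to every persona rather than silently disappearing.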

In [12]:
def find_job_matches(
    persona_info: PersonaInfo,
    jobs_text: str,  # Pre-built context to avoid rebuilding
    model: str = "mistral-medium-latest"
) -> List[str]:
    """Find suitable jobs for a persona using semantic matching"""

    prompt = f"""Available jobs:
{jobs_text}

Candidate profile:
{persona_info.describe()}

Return a list of up to 4 job IDs that best match this candidate.
Consider skill transferability and semantic similarities.
Return as a JSON list like ["job_001", "job_002"]
"""

    agent = get_agent(model_id=model)
    response = agent(prompt)

    # Track cost
    track_api_call(response, model)

    # Parse response - handle markdown code blocks
    try:
        response_str = str(response)
        # Remove markdown code block markers if present
        if response_str.startswith('```'):
            # Extract content between code blocks
            lines = response_str.split('\n')
            # Find start and end of code block
            start_idx = 0
            end_idx = len(lines)
            for i, line in enumerate(lines):
                if line.startswith('```') and start_idx == 0:
                    start_idx = i + 1
                elif line.startswith('```') and i > start_idx:
                    end_idx = i
                    break
            response_str = '\n'.join(lines[start_idx:end_idx])

        result = json.loads(response_str)
        return result if isinstance(result, list) else []
    except json.JSONDecodeError:
        return []

def find_training_matches(
    persona_info: PersonaInfo,
    trainings_data: Dict[str, TrainingInfo],
    model: str = "mistral-medium-latest"
) -> List[str]:
    """Find suitable trainings for a persona"""

    trainings_text = "\n".join([
        f'{training_id}: {training_info.describe()}'
        for training_id, training_info in trainings_data.items()
    ])

    prompt = f"""Available trainings:
{trainings_text}

Candidate profile:
{persona_info.describe()}

Return up to 4 training IDs that would benefit this candidate.
Return as a JSON list like ["tr_001", "tr_002"]
"""

    agent = get_agent(model_id=model)
    response = agent(prompt)

    track_api_call(response, model)

    # Parse response - handle markdown code blocks
    try:
        response_str = str(response)
        # Remove markdown code block markers if present
        if response_str.startswith('```'):
            # Extract content between code blocks
            lines = response_str.split('\n')
            # Find start and end of code block
            start_idx = 0
            end_idx = len(lines)
            for i, line in enumerate(lines):
                if line.startswith('```') and start_idx == 0:
                    start_idx = i + 1
                elif line.startswith('```') and i > start_idx:
                    end_idx = i
                    break
            response_str = '\n'.join(lines[start_idx:end_idx])

        result = json.loads(response_str)
        return result if isinstance(result, list) else []
    except json.JSONDecodeError:
        return []
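
Both matching functions above repeat the same fence-stripping and JSON parsing. One possible refactor is a shared helper, sketched below. The name `parse_id_list` is hypothetical, and unlike the original code this version also drops non-string items from the parsed list, which is an added assumption about what a valid ID looks like.

```python
import json
from typing import List

# Three backticks, built programmatically so this example nests cleanly in markdown
FENCE = "`" * 3


def parse_id_list(raw: str) -> List[str]:
    """Parse an LLM reply that should contain a JSON list of IDs.

    Handles replies wrapped in a markdown code fence and returns an
    empty list when the reply is not a valid JSON list.
    """
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (it may carry a tag such as 'json')
        lines = text.split("\n")[1:]
        for i, line in enumerate(lines):
            if line.strip().startswith(FENCE):
                # Drop the closing fence and anything after it
                lines = lines[:i]
                break
        text = "\n".join(lines)
    try:
        result = json.loads(text)
    except json.JSONDecodeError:
        return []
    if not isinstance(result, list):
        return []
    return [item for item in result if isinstance(item, str)]


fenced_reply = FENCE + 'json\n["job_001", "job_002"]\n' + FENCE
print(parse_id_list(fenced_reply))
print(parse_id_list("Sorry, I could not find any matches."))
```

With a helper like this, `find_job_matches` and `find_training_matches` could each end with a single `return parse_id_list(str(response))`.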

## Processing Pipeline (Now with Cost Optimization)

Same robust pipeline as Tutorial 4, but now we track the cost savings from our optimizations.

### Cost Breakdown Comparison

**Tutorial 4 costs**:
- Job extraction: Similar cost per job
- Training extraction: Similar cost per training  
- Persona interviews: Similar cost per persona
- **Job matching: Higher token usage** ← The expensive part!

**Tutorial 5 costs**:
- Job extraction: Similar cost per job
- Training extraction: Similar cost per training
- Persona interviews: Similar cost per persona
- **Job matching: Reduced token usage** ← Significant improvement!

**Savings**: Reduced costs per run while improving accuracy!

In [13]:
def batch_extract(
    paths: List[Path],
    extract_func,
    save_path: Path,
    cache_period: int = 20,
    show_cost_every: int = 20
):
    """Batch extract information with caching and cost tracking

    Args:
        paths: List of files to process
        extract_func: Function to extract info from each file
        save_path: Path to save extracted data
        cache_period: Save progress every N items
        show_cost_every: Display cost summary every N items
    """

    if not save_path.exists():
        save_json(save_path, {})

    extracted = read_json(save_path)

    print(f"Processing {len(paths)} files ({len(extracted)} already cached)")

    # Reset cost tracker for this batch operation
    if len(extracted) == 0:  # Only reset if starting fresh
        reset_cost_tracker()

    counter = 0
    new_items_processed = 0

    for path in tqdm(paths):
        id_ = path.stem
        if id_ not in extracted:
            try:
                info = extract_func(path)
                extracted[id_] = info.model_dump_json()
                counter += 1
                new_items_processed += 1

                # Save progress periodically
                if counter % cache_period == 0:
                    save_json(save_path, extracted)

                # Show cost update periodically
                if new_items_processed > 0 and new_items_processed % show_cost_every == 0:
                    print(f"\n💰 Cost update after {new_items_processed} new items:")
                    print_cost_summary()
                    print()

            except Exception as e:
                print(f"Error processing {id_}: {e}")

    save_json(save_path, extracted)

    # Final cost summary if we processed any new items
    if new_items_processed > 0:
        print(f"\n✅ Processed {new_items_processed} new items")
        print_cost_summary()

    return extracted

In [None]:
# Process all jobs
print("📂 Processing Jobs...")
jobs_save_path = SUBMISSION_DIR / 'extracted_jobs.json'

jobs_data = batch_extract(
    get_job_paths(),
    extract_job_info,
    jobs_save_path,
    show_cost_every=50  # Show cost update every 50 items
)

# Convert to JobInfo objects
jobs_info = {
    job_id: JobInfo.model_validate_json(data)
    for job_id, data in jobs_data.items()
}

print(f"✅ Extracted {len(jobs_info)} jobs")
print("\n" + "="*50)

In [None]:
# Process all trainings
print("📂 Processing Trainings...")
trainings_save_path = SUBMISSION_DIR / 'extracted_trainings.json'

# Note: batch_extract now includes cost tracking!
trainings_data = batch_extract(
    get_training_paths(),
    extract_training_info,
    trainings_save_path,
    show_cost_every=100  # Show cost update every 100 items
)

# Convert to TrainingInfo objects
trainings_info = {
    training_id: TrainingInfo.model_validate_json(data)
    for training_id, data in trainings_data.items()
}

print(f"✅ Extracted {len(trainings_info)} trainings")

# Show cumulative cost for both job and training extraction
print("\n" + "="*50)
print("📊 Total extraction cost so far:")
print_cost_summary()

## Test the Optimized System

Let's test our cost optimizations with a single persona first. Watch how pre-filtering reduces the job pool!

**What to observe**:
1. The test persona has a work type preference
2. We filter jobs by **work type** before LLM matching
3. Far fewer jobs are sent to the matching agent
4. Lower cost per matching operation

**Expected improvements**:
- Accuracy: Better focused recommendations
- Speed: Faster due to smaller context
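
Before running the matcher, it can help to check how large each persona's job pool would be. The sketch below is one way to do that; `pool_size_by_preference` is a hypothetical helper, and the sample work types are made up for illustration. It applies the same rule as the pipeline: jobs tagged `'unspecified'` stay in every pool, and a persona with no preference sees all jobs.

```python
from collections import Counter
from typing import Dict


def pool_size_by_preference(job_work_types: Dict[str, str]) -> Dict[str, int]:
    """Return how many jobs a persona would see for each work type preference."""
    counts = Counter(job_work_types.values())
    unspecified_jobs = counts.get("unspecified", 0)
    return {
        "remote": counts.get("remote", 0) + unspecified_jobs,
        "onsite": counts.get("onsite", 0) + unspecified_jobs,
        "unspecified": len(job_work_types),  # no preference: full pool
    }


# Made-up sample data: job ID -> extracted work type
sample_work_types = {
    "job_001": "remote",
    "job_002": "onsite",
    "job_003": "onsite",
    "job_004": "unspecified",
    "job_005": "onsite",
}

pool_sizes = pool_size_by_preference(sample_work_types)
for preference, size in pool_sizes.items():
    print(f"{preference}: {size} of {len(sample_work_types)} jobs")
```

On the real data, the input could be built as `{job_id: job.work_type for job_id, job in jobs_info.items()}`. If most jobs come back as `'unspecified'`, the pre-filter saves very little and the job extraction prompt deserves a second look.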

In [None]:
# Create test persona
test_persona = PersonaInfo(
    name='Maria Silva',
    skills=[('sustainability', 'intermediate'), ('project_management', 'beginner')],
    location='São Paulo',
    age='24',
    years_of_experience='2',
    work_type="remote"
)

print("🧪 Test Persona:")
print(test_persona.describe())
print("\n" + "="*50)

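# Pre-filter jobs by work type (same rule as the full pipeline below)
filtered_jobs = {
    job_id: job_info
    for job_id, job_info in jobs_info.items()
    if 'unspecified' in (test_persona.work_type, job_info.work_type)
    or job_info.work_type == test_persona.work_type
}
print(f"Pre-filter kept {len(filtered_jobs)} of {len(jobs_info)} jobs")

# Build jobs text from the filtered pool only
jobs_text = "\n".join([
    f'{job_id}: {job_info.describe()}'
    for job_id, job_info in filtered_jobs.items()
])

# Find matches
test_jobs = find_job_matches(test_persona, jobs_text)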
print(f"\n🎯 Job Matches: {test_jobs}")

test_trainings = find_training_matches(test_persona, trainings_info)
print(f"📚 Training Matches: {test_trainings}")

# Check cost so far
print_cost_summary()

In [None]:
# Interview all personas
persona_ids = [f'persona_{i:03}' for i in range(1, 101)]
personas_save_path = SUBMISSION_DIR / 'personas.json'

if not personas_save_path.exists():
    save_json(personas_save_path, {})

persona_infos = read_json(personas_save_path)
personas_to_process = len(persona_ids) - len(persona_infos)
print(f'Personas to process: {personas_to_process}')

# Reset cost tracker if starting fresh
if len(persona_infos) == 0:
    reset_cost_tracker()
    print("💰 Starting fresh - cost tracker reset")

# Track how many new personas we process
new_personas_processed = 0

for persona_id in tqdm(persona_ids):
    if persona_id not in persona_infos:
        # Interview
        conversation = conduct_persona_interview(persona_id, max_turns=3)

        # Extract
        persona_info = extract_persona_info(conversation)
        persona_infos[persona_id] = persona_info.model_dump_json()
        new_personas_processed += 1

        # Save every 5 personas
        if len(persona_infos) % 5 == 0:
            save_json(personas_save_path, persona_infos)

        # Show cost update every 20 personas
        if new_personas_processed > 0 and new_personas_processed % 20 == 0:
            print(f"\n💰 Cost update after {new_personas_processed} new personas:")
            print_cost_summary()
            print()

save_json(personas_save_path, persona_infos)

# Convert to PersonaInfo objects
personas = {
    pid: PersonaInfo.model_validate_json(data)
    for pid, data in persona_infos.items()
}

print(f"\n✅ Interviewed {len(personas)} personas total ({new_personas_processed} new)")

# Final cost summary for persona processing
if new_personas_processed > 0:
    print("\n📊 Persona processing costs:")
    print_cost_summary()

In [None]:
# Map trainings to jobs first

job_training_map = {}
for job_id, job_info in tqdm(jobs_info.items(), desc="Mapping trainings to jobs"):
    # Simple heuristic: find trainings that teach required skills
    relevant_trainings = []
    for tid, tinfo in trainings_info.items():
        skill_name = tinfo.skill_acquired_and_level[0].lower()
        if any(skill_name in req.lower() for req in job_info.required_skills):
            relevant_trainings.append(tid)
            if len(relevant_trainings) >= 3:
                break
    job_training_map[job_id] = relevant_trainings

In [None]:
# Reset cost tracker for matching phase
print("\n💰 Starting matching phase - resetting cost tracker")
reset_cost_tracker()

# Generate final results
results = []
personas_matched = 0

for persona_id, persona_info in tqdm(personas.items(), desc="Generating recommendations"):
    data = {'persona_id': persona_id}

    # CRITICAL: Check age first for awareness cases!
    try:
        age = int(persona_info.age) if persona_info.age and persona_info.age != 'unknown' else 25
    except ValueError:
        age = 25  # Default to adult if age parsing fails

    if age < 16:
        # Minor - needs awareness type
        data['predicted_type'] = 'awareness'
        data['predicted_items'] = 'too_young'
    else:
        # Adult - proceed with job/training matching
        filtered_jobs = {}
        for job_id, job_info in jobs_info.items():
            if persona_info.work_type == 'unspecified':
                filtered_jobs[job_id] = job_info
            elif job_info.work_type == 'unspecified':
                filtered_jobs[job_id] = job_info
            elif persona_info.work_type == job_info.work_type:
                filtered_jobs[job_id] = job_info

        # Build context with filtered jobs only (major token savings!)
        jobs_text = "\n".join([
            f'{job_id}: {job_info.describe()}'
            for job_id, job_info in filtered_jobs.items()
        ])

        jobs = find_job_matches(persona_info, jobs_text)
        personas_matched += 1

        if jobs:
            data['predicted_type'] = 'jobs+trainings'
            data['jobs'] = [
                {
                    'job_id': job_id,
                    'suggested_trainings': job_training_map.get(job_id, [])
                }
                for job_id in jobs
            ]
        else:
            # No jobs found, suggest trainings only
            trainings = find_training_matches(persona_info, trainings_info)
            data['predicted_type'] = 'trainings_only'
            data['trainings'] = trainings

    results.append(data)

    # Show cost update every 25 personas
    if personas_matched % 25 == 0 and personas_matched > 0:
        print(f"\n💰 Matching progress - {personas_matched} personas matched:")
        print_cost_summary()
        print()

# Save results
results_save_path = SUBMISSION_DIR / 'results.json'
save_json(results_save_path, results)
print(f"\n✅ Generated recommendations for {len(results)} personas")
print(f"📁 Results saved to: {results_save_path}")

# Count types for debugging
type_counts = {}
for r in results:
    t = r.get('predicted_type', 'unknown')
    type_counts[t] = type_counts.get(t, 0) + 1
print(f"\n📊 Type distribution: {type_counts}")

# Final cost summary for matching
print("\n📊 Final matching costs:")
print_cost_summary()
print("\n" + "="*50)

## Submit to Leaderboard!

This is it. The moment of truth. If everything worked, you should see your score improving!

**Before submitting:**
- Check you have 100 results (one per persona)
- Make sure you're not submitting your 10th attempt today (there's a limit!)
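
A quick pre-submission check along these lines can catch the most common problems. This is a sketch, not the official validator: `check_results` is a hypothetical helper, and it only checks the result count, duplicate persona IDs, and the presence of `predicted_type`, based on the result structure built in the matching cell.

```python
from typing import Dict, List


def check_results(results: List[Dict], expected_count: int = 100) -> List[str]:
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []

    if len(results) != expected_count:
        problems.append(f"expected {expected_count} results, found {len(results)}")

    persona_ids = [entry.get("persona_id") for entry in results]
    if len(set(persona_ids)) != len(persona_ids):
        problems.append("duplicate persona_id values")

    for entry in results:
        if "predicted_type" not in entry:
            problems.append(f"{entry.get('persona_id')}: missing predicted_type")

    return problems


# Tiny made-up example with two deliberate problems
sample_results = [
    {"persona_id": "persona_001", "predicted_type": "jobs+trainings", "jobs": []},
    {"persona_id": "persona_002"},
]

sample_problems = check_results(sample_results, expected_count=3)
for problem in sample_problems:
    print(f"- {problem}")
```

In the notebook, running `check_results(results)` right before `make_submission(results)` avoids spending one of the daily attempts on an incomplete file.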

**After submitting:**
- Go check the leaderboard immediately
- Screenshot your score for bragging rights
- Share what worked in the Teams channel (help others, win the collaborator award!)
- If your score is still terrible, check our debugging tips above

In [None]:
from src.utils import make_submission

# Submit
response = make_submission(results)

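# Use explicit None checks: if make_submission returns a requests-style Response,
# that object is falsy for 4xx/5xx codes, so a plain truthiness test would print
# 'No response' and hide the server's error text.
if response is not None and response.status_code == 200:
    print("🎉 Submission successful! Check the leaderboard!")
else:
    print(f"❌ Submission failed: {response.text if response is not None else 'No response'}")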

# Final cost report
print("\n" + "="*50)
print_cost_summary()

## What You Just Built (Tutorial 5 Achievements!)

Congrats! You went from Tutorial 4's baseline to Tutorial 5's optimized system. **This is production-ready code.**

✅ **Work type-based pre-filtering** - Significant token reduction while improving accuracy

✅ **Production cost optimization** - Reduced costs per run with better performance

✅ **Enhanced data models** with work type classification 

✅ **Smart interview questions** that gather filtering info efficiently

✅ **Pre-filtering pipeline** that reduces noise before LLM matching

✅ **Improved accuracy** with lower costs (the holy grail of optimization!)

### Tutorial 5 Score Analysis

- **If you got high scores**: Perfect! You've mastered production-level optimization
- **If you got moderate scores**: Good progress. Check if your work type filtering is working
- **If your scores are similar to Tutorial 4**: Your filtering might not be aggressive enough. Debug the pre-filter step
- **If you got lower scores**: Something's wrong with the work type extraction or filtering logic

### Key Tutorial 5 Learnings

1. **Pre-filtering beats post-processing** - Filter data before expensive LLM calls
2. **Work types are powerful** - Simple classification enables major optimizations  
3. **Token optimization = cost optimization** - Every token saved is money saved
4. **Better data > better models** - Clean, filtered data beats raw compute power
5. **Production thinking** - Optimize for cost AND accuracy, not just accuracy
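
To see learning 3 in numbers, compare the size of the jobs text before and after filtering. This is only a rough sketch: it assumes about 4 characters per token (a common rule of thumb for English text, not a real tokenizer count), and `estimate_tokens`, `build_jobs_text`, and the sample job strings are made up for illustration. In the notebook, the values would come from `job_info.describe()` instead of plain strings.

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate: assumes ~4 characters per token."""
    return len(text) // chars_per_token


def build_jobs_text(jobs):
    """Join jobs into one prompt string, one 'job_id: description' per line."""
    return "\n".join(f"{job_id}: {desc}" for job_id, desc in jobs.items())


# Made-up job descriptions standing in for the real job data
all_jobs = {
    'j1': 'Remote data analyst, SQL and Python',
    'j2': 'Onsite warehouse operator, forklift licence',
    'j3': 'Remote customer support agent, English',
    'j4': 'Onsite electrician, certified',
}

# Hypothetical pre-filter result for a persona who wants remote work
filtered_jobs = {k: v for k, v in all_jobs.items() if v.startswith('Remote')}

before = estimate_tokens(build_jobs_text(all_jobs))
after = estimate_tokens(build_jobs_text(filtered_jobs))
print(f"Estimated tokens: {before} -> {after} ({1 - after / before:.0%} fewer)")
```

Multiply that saving by every persona you match and the cost difference adds up quickly.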

## Tutorial 4 vs Tutorial 5 Comparison

| Metric | Tutorial 4 | Tutorial 5 | Improvement |
|--------|------------|------------|-------------|
| **Accuracy** | Baseline | Improved | Better |
| **Cost per run** | Higher | Lower | Reduced |
| **Tokens per match** | More | Fewer | Optimized |
| **Jobs considered per persona** | All available | Filtered subset | Focused |
| **Production readiness** | Demo | Ready | ✅ |

## Real-World Impact

**What you built**: A system that handles large volumes efficiently by filtering jobs on work type before any LLM call.

**What consulting clients see**: "Better accuracy, lower cost, faster results."

**What you learned**: Production optimization is about smart data flow, not just smart algorithms.

## Next Steps (Advanced Optimizations)

You've mastered the fundamentals. Here's what advanced teams might be doing:
- **Caching similar personas** (if personas have similar profiles, reuse recommendations)
- **Batch processing** (process multiple personas in a single LLM call)
- **Dynamic filtering** (adjust filter strictness based on available jobs)
- **Multi-stage matching** (coarse filter → fine filter → final ranking)

**Remember**: You're not just building a hackathon project. You're building skills that matter in production AI systems.

---

## Tutorial 5 Exercises (Advanced Optimizations)

### Exercise 1: Dynamic Filtering Thresholds
What if no jobs match the exact work type? Implement fallback logic:
- First try: exact work type match  
- Fallback 1: include 'unspecified' work types
- Fallback 2: expand to broader matching criteria
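
One way to start on this, as a sketch rather than a reference solution: try each filter in order and stop at the first one that leaves enough jobs. The shape of `jobs` (a dict of job ID to work type string), the `min_jobs` parameter, and the helper name `filter_with_fallback` are assumptions for illustration. "Broader matching criteria" is interpreted here as "keep everything", which is only one possible reading.

```python
def filter_with_fallback(jobs, preferred_work_type, min_jobs=1):
    """Try progressively looser work type filters until enough jobs remain.

    `jobs` maps job_id -> work type string (hypothetical shape).
    Returns (matching_job_ids, name_of_stage_that_matched).
    """
    stages = [
        ('exact', lambda wt: wt == preferred_work_type),
        ('exact+unspecified', lambda wt: wt in (preferred_work_type, 'unspecified')),
        ('all', lambda wt: True),  # broadest fallback: no work type filter at all
    ]
    for stage_name, keep in stages:
        matched = [job_id for job_id, wt in jobs.items() if keep(wt)]
        if len(matched) >= min_jobs:
            return matched, stage_name
    return [], 'none'  # only reached when there are fewer than min_jobs jobs in total


jobs = {'j1': 'onsite', 'j2': 'unspecified', 'j3': 'onsite'}
print(filter_with_fallback(jobs, 'onsite'))  # exact match is enough
print(filter_with_fallback(jobs, 'remote'))  # falls back to 'unspecified'
```

Returning the stage name alongside the jobs makes it easy to log how often each fallback fires, which tells you whether your work type extraction is too strict.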

### Exercise 2: Batch Interview Processing  
Instead of interviewing personas one-by-one, can you interview multiple at once? Design a group interview prompt and measure the effect on token usage and accuracy.

### Exercise 3: Smart Caching by Similarity
If two personas have similar profiles (same location, work type, skill level), can you reuse job matches? Build a similarity detector and cache system.
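
The crudest possible "similarity detector" is an exact match on a few normalized fields, which is enough to get a cache working. The sketch below assumes personas are dicts with `location`, `work_type`, and `skill_level` keys; those field names, `profile_key`, and `MatchCache` are all hypothetical, and `fake_matcher` stands in for the real LLM matching call.

```python
def profile_key(persona):
    """Build a hashable cache key from the fields used for filtering."""
    return (
        persona['location'].strip().lower(),
        persona['work_type'].strip().lower(),
        persona['skill_level'].strip().lower(),
    )


class MatchCache:
    """Reuse job matches for personas that share the same profile key."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, persona, compute):
        key = profile_key(persona)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = compute(persona)  # the expensive call happens only here
        return self._store[key]


calls = []

def fake_matcher(persona):
    """Stand-in for the LLM matching call; records who triggered it."""
    calls.append(persona['name'])
    return ['j1', 'j2']


cache = MatchCache()
personas = [
    {'name': 'Ana', 'location': 'Lisbon', 'work_type': 'remote', 'skill_level': 'junior'},
    {'name': 'Ben', 'location': ' lisbon ', 'work_type': 'Remote', 'skill_level': 'Junior'},
    {'name': 'Cy', 'location': 'Porto', 'work_type': 'onsite', 'skill_level': 'senior'},
]
for p in personas:
    cache.get_or_compute(p, fake_matcher)
print(f"matcher calls: {len(calls)}, cache hits: {cache.hits}")
```

Keep in mind that two personas with the same key can still differ in details the key ignores, so compare accuracy with and without the cache before trusting it.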

### Exercise 4: Multi-Stage Filtering
Implement a two-stage filter:
- Stage 1: Basic filter (location + work type) 
- Stage 2: Skills-based filter (remove jobs requiring skills the persona doesn't have)
- Result: Even more focused job lists for the LLM
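
A minimal sketch of the two stages, assuming each job is a dict with `location`, `work_type`, and `required_skills` keys. Those field names and both function names are made up for illustration, so map them onto your own `JobInfo` fields.

```python
def stage1_basic(jobs, location, work_type):
    """Stage 1: keep jobs that match both location and work type."""
    return {
        job_id: job for job_id, job in jobs.items()
        if job['location'] == location and job['work_type'] == work_type
    }


def stage2_skills(jobs, persona_skills):
    """Stage 2: keep jobs whose required skills are all covered by the persona."""
    skills = set(persona_skills)
    return {
        job_id: job for job_id, job in jobs.items()
        if set(job['required_skills']) <= skills  # subset test
    }


jobs = {
    'j1': {'location': 'Lisbon', 'work_type': 'remote', 'required_skills': ['sql']},
    'j2': {'location': 'Lisbon', 'work_type': 'remote', 'required_skills': ['sql', 'spark']},
    'j3': {'location': 'Lisbon', 'work_type': 'onsite', 'required_skills': ['sql']},
    'j4': {'location': 'Porto', 'work_type': 'remote', 'required_skills': []},
}

stage1 = stage1_basic(jobs, 'Lisbon', 'remote')
stage2 = stage2_skills(stage1, ['sql', 'python'])
print(f"{len(jobs)} jobs -> {len(stage1)} after stage 1 -> {len(stage2)} after stage 2")
```

The strict subset test in stage 2 is a design choice worth questioning: since your results can suggest trainings alongside jobs, you may want to keep jobs where only one or two skills are missing and let a training close the gap.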

### Exercise 5: Work Type Learning
Track which work type classifications work best. Can you improve the extraction prompts based on success patterns?

**Advanced Challenge**: Combine ALL optimizations and aim for even higher accuracy at lower cost per run.

Share your breakthroughs in Teams! The most innovative optimization gets a special shoutout. 🏆