# Tutorial 4: Multi-Agent System for Smart Matching

Alright, your 5-10% score from Tutorial 3 was... expected. You literally gave everyone the same 5 jobs. Time to build something that actually works.

---

# Matching Rules

Before you start, here are the basic matching rules used to define the best matches. They are the baseline for deciding whether a persona is a fit for a job. Applying for a job you have no experience in can sometimes pay off, but **these hard filters exist for good reason**.

Think of them as the "minimum viable candidate" criteria. Ignore them at your own risk - your accuracy will tank.

## Hard Filters

- **Domain matches the persona's target domain**.
- **Location**:
  - If the persona has a defined city → the job must be in the same city or remote.
  - If the persona is "open to relocate" → there is no location constraint.
- **Languages include at least one in common**.
- **Education level** of the persona is equal to or greater than the job's requirement.
- **Experience**: persona's years of experience are equal to or greater than the job's requirement.

**Brazilian education levels** (in ascending order):

Ensino Fundamental < Ensino Médio < Técnico < Tecnólogo < Graduação < Bacharelado < Licenciatura < Pós-graduação < Especialização < Mestrado < MBA < Doutorado

**If any of these filters fail, the job must not be recommended.**
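
The hard filters above can be sketched as a single predicate. This is a minimal illustration, not the scoring implementation: the `Persona` and `Job` dataclasses, their field names, and the `"remote"` city marker are assumptions made for the example. Only the rules and the education ordering come from the list above.

```python
from dataclasses import dataclass
from typing import Optional, Set

# Education ladder copied from the ordering above (lowest to highest).
EDUCATION_ORDER = [
    "Ensino Fundamental", "Ensino Médio", "Técnico", "Tecnólogo",
    "Graduação", "Bacharelado", "Licenciatura", "Pós-graduação",
    "Especialização", "Mestrado", "MBA", "Doutorado",
]
EDUCATION_RANK = {name: rank for rank, name in enumerate(EDUCATION_ORDER)}


@dataclass
class Persona:  # hypothetical shape, for illustration only
    domain: str
    city: Optional[str]        # None when no city is defined
    open_to_relocate: bool
    languages: Set[str]
    education: str
    years_experience: float


@dataclass
class Job:  # hypothetical shape, for illustration only
    domain: str
    city: str                  # assumption: "remote" marks remote jobs
    languages: Set[str]
    education: str
    years_experience: float


def passes_hard_filters(p: Persona, j: Job) -> bool:
    """Return True only if every hard filter passes."""
    if p.domain != j.domain:
        return False
    # Location constrains only personas with a defined city who will not relocate.
    # Assumption: a persona with no city and no relocation flag is unconstrained.
    if not p.open_to_relocate and p.city is not None:
        if j.city != "remote" and j.city != p.city:
            return False
    if not (p.languages & j.languages):  # need at least one shared language
        return False
    if EDUCATION_RANK[p.education] < EDUCATION_RANK[j.education]:
        return False
    return p.years_experience >= j.years_experience
```

Running this check in plain code before (or after) the LLM matching step is one way to make sure a semantically plausible but rule-breaking job never reaches the submission.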

### Skills

If a persona lacks required skills for a job:
- The job must still be recommended (as long as it passes the hard filters).
- Trainings must be suggested to cover all missing skills.

Training recommendations must follow a strict level-by-level progression:
- If the persona has no knowledge of a skill and the job requires Intermediário, suggest both Básico and Intermediário (if available in the catalog).
- If the persona has Básico and the job requires Avançado, suggest Intermediário and Avançado (if available).
- More generally: if the persona is at level $p$ and the job requires $r$, you must propose all trainings available from $p+1$ to $r$.
- Missing levels in the catalog are acceptable. If no training exists for a given skill, the training list for that skill can remain empty (this is not a non-conformity).
Skills should never be a blocker: a missing skill must not prevent a job from being suggested (although, for a better user experience, it makes sense to list the jobs whose skills the persona already has first).
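
The level-by-level rule (propose everything from $p+1$ to $r$) can be sketched as below. The catalog layout, a dict keyed by `(skill, level)`, and the training IDs are hypothetical and chosen only to keep the example short.

```python
from typing import Dict, List, Optional, Tuple

LEVELS = ["Básico", "Intermediário", "Avançado"]


def level_index(level: Optional[str]) -> int:
    """Map a level name to 1..3; no knowledge of the skill maps to 0."""
    return 0 if level is None else LEVELS.index(level) + 1


def trainings_for_gap(
    skill: str,
    persona_level: Optional[str],
    required_level: str,
    catalog: Dict[Tuple[str, str], str],
) -> List[str]:
    """Return training IDs for every level from p+1 to r that the catalog offers."""
    p, r = level_index(persona_level), level_index(required_level)
    suggestions = []
    for idx in range(p + 1, r + 1):
        training_id = catalog.get((skill, LEVELS[idx - 1]))
        if training_id is not None:  # a missing level is acceptable
            suggestions.append(training_id)
    return suggestions


# Hypothetical catalog: (skill, level) -> training ID
catalog = {
    ("excel", "Básico"): "tr_excel_01",
    ("excel", "Avançado"): "tr_excel_03",
}
print(trainings_for_gap("excel", None, "Avançado", catalog))
# ['tr_excel_01', 'tr_excel_03']  (no Intermediário training exists, so it is skipped)
```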

### Trainings

Trainings are available at three levels:

Básico → Intermediário → Avançado

Rules for training recommendations:
- A persona can only benefit from trainings above their current level.
- If the persona already knows a skill at Básico, recommend Intermediário or Avançado, but not Básico.
- In standalone training mode (when no jobs are being recommended):
  - Recommend only the next level above the current one.
  - If the persona has no prior level, recommend only Básico.
  - Do not suggest multiple levels at once in this mode.
- Personas may seek trainings either as a standalone goal or as preparation for a job.
- If no relevant training exists, the list may remain empty.
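
For standalone training mode, the "next level only" rule reduces to a small lookup. A minimal sketch, assuming the persona's current level is passed as one of the three level names or `None` when they have no prior level:

```python
from typing import Optional

LEVELS = ["Básico", "Intermediário", "Avançado"]


def next_standalone_level(current_level: Optional[str]) -> Optional[str]:
    """Return the single level to recommend in standalone training mode."""
    if current_level is None:
        return LEVELS[0]            # no prior level: only Básico
    idx = LEVELS.index(current_level)
    if idx + 1 < len(LEVELS):
        return LEVELS[idx + 1]      # only the next level up, never several
    return None                     # already at Avançado: nothing to recommend
```

Whether a training for that level actually exists in the catalog is a separate check; if it does not, the list stays empty.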

### Awareness

Some personas are not yet ready for jobs or trainings. In these cases, your assistant should provide awareness content to help them explore and learn.
Two possible cases:
- **Too young:** If the persona is under 16, return awareness content with the reason "too_young".
- **Seeking information:** If the persona is simply exploring (e.g., asking “What does an engineer do?”), return awareness content with the reason "info".
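
The two awareness cases can be checked before any job or training matching runs. In this sketch the function name and the `is_exploring` flag are assumptions (how you detect an exploring persona is up to your interview logic), and giving the age check precedence is also an assumption; only the two reason strings come from the rules above.

```python
from typing import Optional

MINIMUM_AGE = 16  # personas under 16 receive awareness content


def awareness_reason(age: Optional[int], is_exploring: bool) -> Optional[str]:
    """Return "too_young", "info", or None when jobs/trainings apply."""
    if age is not None and age < MINIMUM_AGE:
        return "too_young"
    if is_exploring:
        return "info"
    return None
```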

## Why These Rules Matter

**Example failure**: Recommending a Doutorado-level research position to someone with Ensino Médio education.
- **Human logic**: "Maybe they'll accept it anyway!"
- **Reality**: Automatic rejection, wastes everyone's time
- **Your score**: Takes a hit because it's an obviously bad match

**Example success**: Finding a remote sustainability job for someone in a small city with environmental interests.
- **Filters pass**: Location (remote), domain (environmental), experience level matches
- **Result**: Realistic recommendation that could actually work


---

## What we're building

A multi-agent system that:
- **Interviews personas** to understand their background (and costs money doing it)
- **Extracts structured data** from messy conversations into clean JSON
- **Matches intelligently** using semantic understanding, not just keywords
- **Handles all three types correctly** (jobs+trainings, trainings_only, awareness)
- **Achieves 40-50% accuracy** (vs that embarrassing 5-10% from Tutorial 3)

**Reality check**: This tutorial will cost you $5-10 in API credits. But you'll learn patterns worth way more than that in real consulting work.

## The Competitive Landscape

Look, everyone can do keyword matching. The teams hitting 60%+ are doing something smarter:
- They're having actual conversations with personas
- They're understanding context ("I want to help the environment" → green jobs)
- They're catching edge cases (minors need awareness, not jobs!)
- They're being efficient with their API calls (remember: 5 conversations/day limit per persona)

We're going to build that. And then share tips in Teams because the best collaborator award is pretty cool too.

## Key Improvements in This Tutorial

1. **Age-aware routing**: Personas under 16 automatically get the awareness type
2. **Structured data extraction**: Pydantic models instead of regex nightmares
3. **Semantic matching**: Understanding "sustainability" = "environmental"
4. **Training suggestions**: Worth 25% of your score - we won't ignore them!

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                     MULTI-AGENT ARCHITECTURE                    │
└─────────────────────────────────────────────────────────────────┘

Phase 1: INTERVIEW & EXTRACTION
================================
  Persona API
      ↓
  [Interview Agent] ← Conducts conversations (Medium model, $0.02/persona)
      ↓
  Raw Transcript
      ↓
  [Persona Extraction Agent] ← Structures conversation data (Small model, $0.001/extraction)
      ↓
  PersonaInfo (structured)

Phase 2: DATA PROCESSING
========================
  Job Files → [Job Extraction Agent] → JobInfo (200 jobs)
  Training Files → [Training Extraction Agent] → TrainingInfo (467 trainings)

Phase 3: INTELLIGENT MATCHING
==============================
  PersonaInfo + JobInfo + TrainingInfo
                ↓
          [Matching Agent] ← Semantic matching (Medium model, $0.05/persona)
                ↓
          Recommendations
                ↓
            Submission
```

Each agent has ONE job. Think of it like a consulting team where everyone has their specialty. Nobody does everything well.

In [None]:
# Core imports
import json
import os
import sys
import dotenv

from pathlib import Path
from typing import Dict, List, Tuple, TypeVar
from tqdm import tqdm

# Add parent directory to import our utilities
sys.path.append('..')
from src.utils import (
    save_json,
    read_json,
    load_file_content,
    get_job_paths,
    get_training_paths,
    sanity_check,
    chat_with_persona,
    track_api_call,  # Cost tracking from utils
    print_cost_summary,  # Cost summary from utils
    reset_cost_tracker  # Reset cost tracker from utils
)

# Pydantic for structured data
from pydantic import BaseModel, Field

# Strands for AI agents
from strands.agent import Agent
from strands.models.mistral import MistralModel

# Type hints
M = TypeVar('M', bound=BaseModel)

# Set up submission directory
SUBMISSION_DIR = Path('../submissions')
SUBMISSION_DIR.mkdir(parents=True, exist_ok=True)

# Load environment
dotenv.load_dotenv("../env")

print("✅ Setup complete")
sanity_check()

✅ Setup complete
✅ API connection successful!


True

## Data Models (Why Pydantic Will Save Your Life)

Remember Tutorial 3 where we just threw strings around? Yeah, that doesn't scale.

**The problem with free-form LLM output:**
```
"Maria seems good for environmental jobs in São Paulo, maybe needs data training"
```

**What we actually need:**
```python
PersonaInfo(
    name="Maria Santos",
    skills=[("sustainability", "intermediate"), ("project_management", "beginner")],
    location="São Paulo"
)
```

Pydantic models force the LLM to give us structured data. No more regex parsing nightmares. This is the difference between hobby projects and production code.

In [16]:
class PersonaInfo(BaseModel):
    """Structured profile of a job seeker"""
    name: str = Field(default="", description="Persona's name")
    skills: List[Tuple[str, str]] = Field(
        default_factory=list,
        description="List of (skill, level) pairs"
    )
    location: str = Field(default="unknown")
    age: str = Field(default="unknown")
    years_of_experience: str = Field(default="unknown")

    def describe(self) -> str:
        skills = ', '.join(f'{skill}: {level}' for skill, level in self.skills)
        return f"Name: {self.name}\nSkills: {skills}\nLocation: {self.location}\nAge: {self.age}\nExperience: {self.years_of_experience} years"


class JobInfo(BaseModel):
    """Structured job requirements"""
    required_skills: List[str] = Field(default_factory=list)
    location: str = Field(default="")
    years_of_experience_required: str = Field(default="")

    def describe(self) -> str:
        skills = ', '.join(self.required_skills)
        return f"Skills: {skills}\nLocation: {self.location}\nExperience: {self.years_of_experience_required}"


class TrainingInfo(BaseModel):
    """Training program details"""
    skill_acquired_and_level: Tuple[str, str] = Field(
        default=("not specified", "not specified")
    )

    def describe(self) -> str:
        skill, level = self.skill_acquired_and_level
        return f"Teaches: {skill} (Level: {level})"

## Core Agent Factory

Here's where we create agents with different personalities. Think of this as the HR department of our AI company - we're hiring specialists, not generalists.

**Why multiple agents instead of one mega-prompt?**
- Each agent can use different models (small for extraction = cheap, large for reasoning = smart)
- Easier to debug when something inevitably goes wrong
- Costs less (use the cheapest model that gets the job done)
- **Context overload prevention** - LLMs get confused with too much at once!

It's like asking someone to read 200 resumes while conducting an interview while making hiring decisions. Nobody does that well. Split the work, get better results.

In [17]:
def get_agent(
    system_prompt: str = "",
    model_id: str = "mistral-medium-latest"
) -> Agent:
    """Create an AI agent with specific role and model"""
    model = MistralModel(
        api_key=os.environ["MISTRAL_API_KEY"],
        model_id=model_id,
        stream=False
    )
    return Agent(model=model, system_prompt=system_prompt, callback_handler=None)

## Agent 1: Interview Agent

- **Purpose**: Conduct structured interviews with personas
- **Model**: Medium (needs good conversation skills)
- **Cost**: ~$0.02 per persona

Time to build agents that actually talk to personas. This is where your money starts disappearing.

### The Persona System Reality Check

**⚠️ CRITICAL LIMITS:**
- **5 conversations per day** per persona
- **20 messages max** per conversation (10 on each side)
- **30k tokens per persona** per conversation
  - Each persona reply is itself a Mistral call that consumes tokens, and the persona's side must stay under 30k tokens for the whole conversation.
- **Conversation IDs** must be tracked (or you lose context)
- **1 submission** per week

**Translation**: You can't just spam personas with questions. Be smart or run out of attempts.
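
One way to avoid hitting these limits by accident is to count locally before each call. The `PersonaBudget` class below is a hypothetical helper, not part of `src.utils`, and it only mirrors the conversation and message limits listed above; the service still enforces the real ones, and token usage is not tracked here.

```python
from dataclasses import dataclass

MAX_CONVERSATIONS_PER_DAY = 5   # per persona
MAX_MESSAGES_PER_SIDE = 10      # 20 messages per conversation in total


@dataclass
class PersonaBudget:  # hypothetical helper for local bookkeeping
    conversations_today: int = 0
    messages_sent: int = 0

    def start_conversation(self) -> bool:
        """Reserve one of today's conversations; False when none are left."""
        if self.conversations_today >= MAX_CONVERSATIONS_PER_DAY:
            return False
        self.conversations_today += 1
        self.messages_sent = 0
        return True

    def can_send(self) -> bool:
        """True while our side still has messages left in this conversation."""
        return self.messages_sent < MAX_MESSAGES_PER_SIDE

    def record_message(self) -> None:
        """Count one message sent by our side."""
        self.messages_sent += 1
```

Keeping one `PersonaBudget` per persona ID (for example in a dict) makes it easy to skip personas that have no attempts left today instead of wasting a failed call.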

### Conversation Strategy That Actually Works

**Bad approach** (what everyone tries first):
```
"Hi, tell me everything about yourself"
*persona gives vague answer*
"Uh... tell me more?"
*waste 20 messages getting nothing*
```

**Good approach** (what we're building):
```
"Hi! What's your name?"
"What are your top skills and how experienced are you with each?"
"Where are you located and are you willing to relocate?"
*Get everything in 5 messages*
```

In [None]:
# Interview Agent implementation
INTERVIEW_PROMPT = """
You are conducting a career counseling interview. Gather:
- Name
- Skills and proficiency levels
- Current location
- Age
- Years of experience

Ask targeted questions to get specific information quickly.
Do not provide job recommendations yet.
"""

def conduct_persona_interview(
    persona_id: str,
    max_turns: int = 5,
    model: str = "mistral-medium-latest",
    print_conversation: bool = False
) -> List[str]:
    """Interview a persona and return conversation transcript"""

    conversation = []
    interview_agent = get_agent(INTERVIEW_PROMPT, model_id=model)
    conversation_id = None

    # Start with greeting
    agent_message = "Hello! I'm here to help you find the best opportunities. Can you tell me your name?"
    conversation.append(f"Assistant: {agent_message}")

    # Conduct interview
    for _ in range(max_turns):
        resp = chat_with_persona(persona_id, agent_message, conversation_id)

        if resp is None:
            break

        user_response, conversation_id = resp
        conversation.append(f"User: {user_response}")

        # Generate next question
        agent_response = interview_agent(user_response)

        # Track cost (using utils.py function)
        track_api_call(agent_response, model)

        agent_message = str(agent_response)
        conversation.append(f"Assistant: {agent_message}")

    if print_conversation:
        print('\n'.join(conversation))

    return conversation

# Test it
print("🎤 Testing Interview Agent...")
test_interview = conduct_persona_interview(
    "persona_001",
    max_turns=2,
    print_conversation=True
)

🎤 Testing Interview Agent...
Assistant: Hello! I'm here to help you find the best opportunities. Can you tell me your name?
User: hi my name is rafael ribeiro. i am 21 and i like food things.
Assistant: Great, Rafael! Let’s keep this quick and focused:

1. **Skills & Proficiency**: What technical or professional skills do you have (e.g., cooking, food science, customer service, inventory management)? Rate your proficiency for each (Beginner/Intermediate/Advanced).
2. **Current Location**: Where are you based (city/country)? Are you open to relocating?
3. **Experience**: How many years of formal work experience do you have (paid/unpaid, including internships or part-time roles)?
4. **Food Focus**: Do you specialize in any area (e.g., culinary arts, food tech, nutrition, hospitality)?

*(Example: "I have 2 years of experience as a line cook (Intermediate) and 1 year in food delivery logistics (Beginner).")*

User: 1. i dont have skils yet. i just know food is important and i wanna lern h

## Agent 2: Extraction Agents (Yes, Plural!)

- **Purpose**: Convert unstructured text to structured data
- **Model**: Small (simple structured extraction)
- **Cost**: ~$0.001 per extraction

We actually have THREE extraction agents - think of them as specialist junior analysts:
1. **Persona Extraction Agent**: Pulls name, skills, location from conversations
2. **Job Extraction Agent**: Extracts requirements from job descriptions
3. **Training Extraction Agent**: Gets skill outcomes from training programs

Why three agents instead of one? Each has a specific prompt optimized for its data type. It's like having specialists who know exactly what to look for.

The beauty of extraction agents is we can use a cheap model - it's just pattern matching and structuring, not complex reasoning. Save the expensive models for when you actually need intelligence.

### The Magic of .structured_output()

Here's the game-changer. Instead of this nightmare:
```python
# Old way - pray the LLM formats correctly
response = agent("Extract the skills...")
# Now parse the string and hope it's valid JSON...
# Handle 17 different ways the LLM might format it...
```

We do this:
```python
# New way - the reply is parsed and validated against the Pydantic model
result = agent.structured_output(output_model=PersonaInfo, prompt=prompt)
# result is a PersonaInfo instance; output that fails validation raises instead of slipping through
```

**Why .structured_output() is perfect for our problem:**
- Forces the LLM to return data that matches our Pydantic schema exactly
- No more regex parsing, no more "hope it's valid JSON"
- If a field is missing, Pydantic uses the default value
- Type validation built-in (can't put a string where we expect a list)
- **100% consistent output structure** - crucial when processing 100 personas

This is the difference between a hackathon project and production code.

In [20]:
# Three specialized extraction prompts - each optimized for its data type
PERSONA_EXTRACTION_PROMPT = """Extract the following information from this conversation:
- Name
- Skills (as pairs of skill name and proficiency level)
- Location
- Age
- Years of experience

Conversation:
"""

JOB_EXTRACTION_PROMPT = """Extract from this job description:
- Required skills (list)
- Location
- Years of experience required

Job description:
"""

TRAINING_EXTRACTION_PROMPT = """Extract from this training description:
- Skill taught and its level

Training description:
"""

def extract_persona_info(
    conversation: List[str],
    model: str = "mistral-small-latest"
) -> PersonaInfo:
    """Extract persona info from conversation using Persona Extraction Agent"""
    text = '\n'.join(conversation)
    prompt = PERSONA_EXTRACTION_PROMPT + text

    extraction_agent = get_agent(model_id=model)
    result = extraction_agent.structured_output(output_model=PersonaInfo, prompt=prompt)

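    # Track cost only if the agent exposes `last_response`; without that
    # attribute this call is silently left out of the cost summary
    if hasattr(extraction_agent, 'last_response'):
        track_api_call(extraction_agent.last_response, model)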

    return result

def extract_job_info(
    path: Path,
    model: str = "mistral-small-latest"
) -> JobInfo:
    """Extract job info from file using Job Extraction Agent"""
    text = load_file_content(path)
    prompt = JOB_EXTRACTION_PROMPT + text

    extraction_agent = get_agent(model_id=model)
    result = extraction_agent.structured_output(output_model=JobInfo, prompt=prompt)

    if hasattr(extraction_agent, 'last_response'):
        track_api_call(extraction_agent.last_response, model)

    return result

def extract_training_info(
    path: Path,
    model: str = "mistral-small-latest"
) -> TrainingInfo:
    """Extract training info from file using Training Extraction Agent"""
    text = load_file_content(path)
    prompt = TRAINING_EXTRACTION_PROMPT + text

    extraction_agent = get_agent(model_id=model)
    result = extraction_agent.structured_output(output_model=TrainingInfo, prompt=prompt)

    if hasattr(extraction_agent, 'last_response'):
        track_api_call(extraction_agent.last_response, model)

    return result

# Test extraction on conversation
print("🔍 Testing Persona Extraction Agent...")
test_persona = extract_persona_info(test_interview)
print(test_persona.describe())

🔍 Testing Persona Extraction Agent...
Name: Rafael Ribeiro
Skills: Food safety protocols: Beginner, Factory/hygienic food production: Beginner, Hands-on machine operation: Beginner
Location: São Paulo, Brazil
Age: 21
Experience: 0 years


## Let's See the Other Extraction Agents in Action

Now let's test our Job and Training extraction agents on real files. This shows you exactly what structured data we're pulling out.

In [21]:
# Test Job Extraction Agent
print("💼 Testing Job Extraction Agent...")
print("Reading a sample job file...\n")

# Get first job file
job_paths = get_job_paths()
if job_paths:
    sample_job = extract_job_info(job_paths[0])
    print(f"Job ID: {job_paths[0].stem}")
    print(sample_job.describe())
    print("\nRaw Pydantic model:")
    print(sample_job.model_dump())
else:
    print("No job files found!")

💼 Testing Job Extraction Agent...
Reading a sample job file...

Job ID: job_acc_001
Skills: Tax regulations and compliance processes at an intermediate level, Attention to detail, Organizational skills, Fluent in Portuguese (Brazilian), Fluent in English
Location: Brasília
Experience: 2 years

Raw Pydantic model:
{'required_skills': ['Tax regulations and compliance processes at an intermediate level', 'Attention to detail', 'Organizational skills', 'Fluent in Portuguese (Brazilian)', 'Fluent in English'], 'location': 'Brasília', 'years_of_experience_required': '2 years'}


In [22]:
# Test Training Extraction Agent
print("📚 Testing Training Extraction Agent...")
print("Reading a sample training file...\n")

# Get first training file
training_paths = get_training_paths()
if training_paths:
    sample_training = extract_training_info(training_paths[0])
    print(f"Training ID: {training_paths[0].stem}")
    print(sample_training.describe())
    print("\nRaw Pydantic model:")
    print(sample_training.model_dump())
else:
    print("No training files found!")

print("\n" + "="*50)
print("✅ All three extraction agents tested!")

📚 Testing Training Extraction Agent...
Reading a sample training file...

Training ID: tr_adm_client_record_management_01
Teaches: Basic Customer Data Management (Level: foundational level)

Raw Pydantic model:
{'skill_acquired_and_level': ('Basic Customer Data Management', 'foundational level')}

✅ All three extraction agents tested!


## Agent 3: Matching Agent

- **Purpose**: Semantic matching between personas and opportunities
- **Model**: Medium (needs reasoning capabilities)
- **Cost**: ~$0.05 per persona

This is where the magic happens. The "Senior Consultant" of our team that actually understands context.

### Semantic Matching vs Keyword Matching

**Keyword matching**:
- Person has "sustainability" → Job needs "environmental" → ❌ No match
- Person in "Greater São Paulo" → Job in "São Paulo" → ❌ No match  
- "Data analysis" → "Analytics" → ❌ No match

**Semantic matching** (what we're doing now):
- Agent understands "sustainability" = "environmental" = "green energy" = "climate action"
- Knows "Greater São Paulo" includes "São Paulo" 
- Recognizes "data analysis" skills transfer to "analytics" roles

**Example**: Maria, sustainability consultant, 3 years experience, São Paulo

**Simple matcher**: Looks for jobs with "sustainability" and "consultant" in the title. Finds 2.

**Our semantic matcher**: 
- Understands she could do environmental consulting, ESG reporting, green project management
- Knows her consulting skills transfer to advisory, strategy, implementation roles
- Recognizes São Paulo includes opportunities in the greater metro area
- Finds 15 relevant opportunities

That's the difference between 10% and 50% accuracy on the leaderboard.
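
To make the contrast concrete, here is a toy comparison. The hand-written `CONCEPTS` table is only a stand-in for the LLM's semantic judgment (the matching agent in this tutorial uses a prompt, not a lookup table), and the group names are invented for the example.

```python
# Hand-written concept groups; a stand-in for the LLM's semantic judgment.
CONCEPTS = {
    "environment": {"sustainability", "environmental", "green energy", "climate action"},
    "analytics": {"data analysis", "analytics"},
}


def keyword_match(persona_term: str, job_term: str) -> bool:
    """Exact string comparison: what a naive keyword matcher does."""
    return persona_term.lower() == job_term.lower()


def concept_match(persona_term: str, job_term: str) -> bool:
    """Match when both terms are identical or fall in the same concept group."""
    a, b = persona_term.lower(), job_term.lower()
    return a == b or any(a in group and b in group for group in CONCEPTS.values())


print(keyword_match("sustainability", "environmental"))  # False
print(concept_match("sustainability", "environmental"))  # True
```

A fixed table like this breaks down quickly on real data, which is why the matching step is delegated to a model that can generalize beyond the terms you thought of.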

In [23]:
def find_job_matches(
    persona_info: PersonaInfo,
    jobs_text: str,  # Pre-built context to avoid rebuilding
    model: str = "mistral-medium-latest"
) -> List[str]:
    """Find suitable jobs for a persona using semantic matching"""

    prompt = f"""Available jobs:
{jobs_text}

Candidate profile:
{persona_info.describe()}

Return a list of up to 4 job IDs that best match this candidate.
Consider skill transferability and semantic similarities.
Return as a JSON list like ["job_001", "job_002"]
"""

    agent = get_agent(model_id=model)
    response = agent(prompt)

    # Track cost
    track_api_call(response, model)

    # Parse response - handle markdown code blocks
    try:
        response_str = str(response)
        # Remove markdown code block markers if present
        if response_str.startswith('```'):
            # Extract content between code blocks
            lines = response_str.split('\n')
            # Find start and end of code block
            start_idx = 0
            end_idx = len(lines)
            for i, line in enumerate(lines):
                if line.startswith('```') and start_idx == 0:
                    start_idx = i + 1
                elif line.startswith('```') and i > start_idx:
                    end_idx = i
                    break
            response_str = '\n'.join(lines[start_idx:end_idx])

        result = json.loads(response_str)
        return result if isinstance(result, list) else []
    except json.JSONDecodeError:
        # Unparseable reply: return no matches rather than crash the run
        return []

def find_training_matches(
    persona_info: PersonaInfo,
    trainings_data: Dict[str, TrainingInfo],
    model: str = "mistral-medium-latest"
) -> List[str]:
    """Find suitable trainings for a persona"""

    trainings_text = "\n".join([
        f'{training_id}: {training_info.describe()}'
        for training_id, training_info in trainings_data.items()
    ])

    prompt = f"""Available trainings:
{trainings_text}

Candidate profile:
{persona_info.describe()}

Return up to 4 training IDs that would benefit this candidate.
Return as a JSON list like ["tr_001", "tr_002"]
"""

    agent = get_agent(model_id=model)
    response = agent(prompt)

    track_api_call(response, model)

    # Parse response - handle markdown code blocks
    try:
        response_str = str(response)
        # Remove markdown code block markers if present
        if response_str.startswith('```'):
            # Extract content between code blocks
            lines = response_str.split('\n')
            # Find start and end of code block
            start_idx = 0
            end_idx = len(lines)
            for i, line in enumerate(lines):
                if line.startswith('```') and start_idx == 0:
                    start_idx = i + 1
                elif line.startswith('```') and i > start_idx:
                    end_idx = i
                    break
            response_str = '\n'.join(lines[start_idx:end_idx])

        result = json.loads(response_str)
        return result if isinstance(result, list) else []
    except json.JSONDecodeError:
        # Unparseable reply: return no matches rather than crash the run
        return []
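
Both functions repeat the same parsing logic, and that logic only strips code fences when the reply starts with one. A shared helper could look like the sketch below. `parse_id_list` is a hypothetical name, and the regex approach assumes a flat list of string IDs with no `]` inside them.

```python
import json
import re
from typing import List


def parse_id_list(response_text: str) -> List[str]:
    """Extract the first JSON array of strings from an LLM reply, else []."""
    # Grab the first [...] span; works with or without Markdown fences or prose around it.
    match = re.search(r"\[.*?\]", response_text, flags=re.DOTALL)
    if match is None:
        return []
    try:
        parsed = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    # Keep only string entries so stray numbers or nulls never reach the submission.
    return [item for item in parsed if isinstance(item, str)]
```

With this in place, each matcher could end with `return parse_id_list(str(response))`.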

## Processing Pipeline

Now let's process all jobs and trainings. First time running this? Grab coffee. We're processing 200 jobs and 467 trainings.

**Cost alert**: ~$2-3 to process everything with the medium model (the extraction functions above default to `mistral-small-latest`, which is cheaper). But here's the beautiful part - we cache everything. Run it once, pay once. If it crashes halfway? Just run again, it picks up where it left off. We're not savages.

### Production Patterns That Matter

**Caching** (never pay twice for the same thing):
- Process all jobs once, save to JSON
- Process all trainings once, save to JSON
- If your code crashes, you don't lose everything

**Batch Processing** (with progress bars because we're not animals):
- Shows you exactly where you are
- Saves progress every N items
- Can resume from interruptions
- **Now with cost updates!** See your spending as you go

**Error Handling** (because everything fails eventually):
- Retry with exponential backoff
- Log failures for debugging
- Graceful degradation (partial results > no results)
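
Retry with exponential backoff can be sketched as a small wrapper. `with_retries` is a hypothetical helper, not something provided by `src.utils`. It retries on any exception, which is a simplification: in practice you may want to retry only on rate-limit or network errors.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    func: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, waiting base_delay, then 2x, then 4x... between failed attempts."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

Inside a batch loop, a call such as `extract_func(path)` could be wrapped as `with_retries(lambda: extract_func(path))`. The `sleep` parameter exists so tests can pass a stub instead of actually waiting.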

In [24]:
def batch_extract(
    paths: List[Path],
    extract_func,
    save_path: Path,
    cache_period: int = 20,
    show_cost_every: int = 20
):
    """Batch extract information with caching and cost tracking

    Args:
        paths: List of files to process
        extract_func: Function to extract info from each file
        save_path: Path to save extracted data
        cache_period: Save progress every N items
        show_cost_every: Display cost summary every N items
    """

    if not save_path.exists():
        save_json(save_path, {})

    extracted = read_json(save_path)

    print(f"Processing {len(paths)} files ({len(extracted)} already cached)")

    # Reset cost tracker for this batch operation
    if len(extracted) == 0:  # Only reset if starting fresh
        reset_cost_tracker()

    counter = 0
    new_items_processed = 0

    for path in tqdm(paths):
        id_ = path.stem
        if id_ not in extracted:
            try:
                info = extract_func(path)
                extracted[id_] = info.model_dump_json()
                counter += 1
                new_items_processed += 1

                # Save progress periodically
                if counter % cache_period == 0:
                    save_json(save_path, extracted)

                # Show cost update periodically
                if new_items_processed > 0 and new_items_processed % show_cost_every == 0:
                    print(f"\n💰 Cost update after {new_items_processed} new items:")
                    print_cost_summary()
                    print()

            except Exception as e:
                print(f"Error processing {id_}: {e}")

    save_json(save_path, extracted)

    # Final cost summary if we processed any new items
    if new_items_processed > 0:
        print(f"\n✅ Processed {new_items_processed} new items")
        print_cost_summary()

    return extracted

In [25]:
# Process all jobs
print("📂 Processing Jobs...")
jobs_save_path = SUBMISSION_DIR / 'extracted_jobs.json'

jobs_data = batch_extract(
    get_job_paths(),
    extract_job_info,
    jobs_save_path,
    show_cost_every=50  # Show cost update every 50 items
)

# Convert to JobInfo objects
jobs_info = {
    job_id: JobInfo.model_validate_json(data)
    for job_id, data in jobs_data.items()
}

print(f"✅ Extracted {len(jobs_info)} jobs")
print("\n" + "="*50)

📂 Processing Jobs...
Processing 200 files (200 already cached)


100%|██████████| 200/200 [00:00<00:00, 178405.10it/s]

✅ Extracted 200 jobs






In [26]:
# Process all trainings
print("📂 Processing Trainings...")
trainings_save_path = SUBMISSION_DIR / 'extracted_trainings.json'

# Note: batch_extract now includes cost tracking!
trainings_data = batch_extract(
    get_training_paths(),
    extract_training_info,
    trainings_save_path,
    show_cost_every=100  # Show cost update every 100 items
)

# Convert to TrainingInfo objects
trainings_info = {
    training_id: TrainingInfo.model_validate_json(data)
    for training_id, data in trainings_data.items()
}

print(f"✅ Extracted {len(trainings_info)} trainings")

# Show cumulative cost for both job and training extraction
print("\n" + "="*50)
print("📊 Total extraction cost so far:")
print_cost_summary()

📂 Processing Trainings...
Processing 467 files (467 already cached)


100%|██████████| 467/467 [00:00<00:00, 416487.34it/s]

✅ Extracted 467 trainings

📊 Total extraction cost so far:
💰 Cost Summary:
  Total API calls: 3
  Total tokens: 1,390
  Estimated cost: $0.0019

  By model:
    mistral-medium-latest: 3 calls, $0.0019





## Test the System (Before You Blow $10 on Broken Code)

Always test with fake data first. Seriously. I know you want to just run everything, but trust me on this one.

**Quick math for talking to 100 personas:**
- ~2000 tokens per conversation (input + output)
- 100 personas = 200,000 tokens
- Cost: ~$0.02 with small model, ~$0.40 with large model
- **But wait**: You'll retry failed conversations, test your code, mess up... 
- **Real cost**: Probably $5-10 for this tutorial if you do the exercises
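
The arithmetic behind these numbers is easy to reproduce. The snippet below backs out the per-million-token rates implied by the figures in this list; they are derived from the tutorial's own estimates, not from an official price sheet, so check current pricing before budgeting.

```python
TOKENS_PER_CONVERSATION = 2_000   # input + output, from the estimate above
NUM_PERSONAS = 100

total_tokens = TOKENS_PER_CONVERSATION * NUM_PERSONAS   # 200,000


def implied_price_per_million(total_cost_usd: float, tokens: int) -> float:
    """Back out the blended $/1M-token rate implied by a total cost."""
    return total_cost_usd / tokens * 1_000_000


small_rate = implied_price_per_million(0.02, total_tokens)
large_rate = implied_price_per_million(0.40, total_tokens)
print(f"{total_tokens:,} tokens")
print(f"small: ${small_rate:.2f}/1M tokens, large: ${large_rate:.2f}/1M tokens")
# 200,000 tokens
# small: $0.10/1M tokens, large: $2.00/1M tokens
```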

Let's test with one persona before we burn through our budget.

In [27]:
# Create test persona
test_persona = PersonaInfo(
    name='Maria Silva',
    skills=[('sustainability', 'intermediate'), ('project_management', 'beginner')],
    location='São Paulo',
    age='24',
    years_of_experience='2'
)

print("🧪 Test Persona:")
print(test_persona.describe())
print("\n" + "="*50)

# Pre-build jobs text for efficiency
jobs_text = "\n".join([
    f'{job_id}: {job_info.describe()}'
    for job_id, job_info in jobs_info.items()
])

# Find matches
test_jobs = find_job_matches(test_persona, jobs_text)
print(f"\n🎯 Job Matches: {test_jobs}")

test_trainings = find_training_matches(test_persona, trainings_info)
print(f"📚 Training Matches: {test_trainings}")

# Check cost so far
print_cost_summary()

🧪 Test Persona:
Name: Maria Silva
Skills: sustainability: intermediate, project_management: beginner
Location: São Paulo
Age: 24
Experience: 2 years


🎯 Job Matches: ['job_soc_008', 'job_tou_003', 'job_fib_009', 'job_pro_007']
📚 Training Matches: ['tr_fib_eco_regulations_02', 'tr_fib_waste_management_02', 'tr_pro_process_optimization_01', 'tr_pur_risk_management_supply_01']
💰 Cost Summary:
  Total API calls: 5
  Total tokens: 25,768
  Estimated cost: $0.0118

  By model:
    mistral-medium-latest: 5 calls, $0.0118


## 🎯 Practice Exercise: Manual Matching

Before we run the automated matching on everyone, let's build some intuition. Look at the test persona above and think:
1. What kind of jobs would suit Maria?
2. What trainings might help her qualify for better roles?

**Your turn**: Based on Maria's profile (intermediate sustainability skills, beginner project management, São Paulo, 2 years of experience), which of these would you recommend?
- job_env_001: Environmental Analyst (São Paulo, requires: data analysis, sustainability)
- job_mkt_005: Marketing Manager (Rio, requires: marketing, leadership)
- job_sus_003: Sustainability Coordinator (São Paulo, requires: sustainability, project management)
- tr_pm_101: Project Management Fundamentals
- tr_data_202: Data Analysis for Environmental Science

Think about it, then compare your reasoning with what our agents recommended for Maria in the test output above!
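
One way to sanity-check your answer is to apply the location hard filter by hand. The sketch below is only an illustration: the job records are simplified dictionaries built from the list above, Maria is assumed *not* to be open to relocation, and the other hard filters (domain, languages, education, experience) are left out because the exercise does not give that data.

```python
def passes_location_filter(persona_city, open_to_relocate, job_city):
    """Location hard filter: same city, remote job, or persona open to relocate."""
    if open_to_relocate:
        return True
    return job_city.lower() in (persona_city.lower(), "remote")


# Hypothetical simplified records based on the exercise list.
candidate_jobs = {
    "job_env_001": {"city": "São Paulo", "skills": {"data analysis", "sustainability"}},
    "job_mkt_005": {"city": "Rio", "skills": {"marketing", "leadership"}},
    "job_sus_003": {"city": "São Paulo", "skills": {"sustainability", "project management"}},
}
maria_skills = {"sustainability", "project management"}

shortlist = {}
for job_id, job in candidate_jobs.items():
    if passes_location_filter("São Paulo", False, job["city"]):
        # Missing skills do not disqualify a job; they become training needs.
        shortlist[job_id] = sorted(job["skills"] - maria_skills)

print(shortlist)
# {'job_env_001': ['data analysis'], 'job_sus_003': []}
```

The Rio job drops out on location alone, and the missing `data analysis` skill points at a training rather than a rejection.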

## Process All 100 Personas

**⚠️ THIS WILL COST REAL MONEY ⚠️**

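Estimated cost: $3-5 for all 100 personas (assuming some failures and retries). For reference, the sample run below spent about $0.63 interviewing 60 personas with `mistral-medium-latest`.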

**Before you run this:**
1. ✅ Have you tested with 1-2 personas? 
2. ✅ Did the extraction work correctly?
3. ✅ Are your matches reasonable?
4. ✅ Do you have $10 to spare?

If you answered no to any of these, go back and test more.

**What's happening here:**
- Our **Interview Agent** talks to each persona
- Our **Extraction Agent** structures the conversation data
- We save progress every 5 personas (in case something breaks)
- Total time: ~10-15 minutes for all 100 in the best case; the sample run below needed about 34 minutes for 60 personas

**Pro tip**: Loop over `persona_ids[:20]` instead of `persona_ids` first to test with 20 personas. Your wallet will thank you.
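
Saving progress only helps if the checkpoint file itself survives a crash. The sketch below shows one possible way to write a checkpoint safely: dump to a temporary file, then swap it into place. `save_json_atomic` is a hypothetical helper and not the `save_json` used in this notebook, whose implementation is not shown here.

```python
import json
import os
import tempfile
from pathlib import Path


def save_json_atomic(path, data):
    """Write JSON to a temp file in the same directory, then swap it into place."""
    path = Path(path)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
        # os.replace overwrites the target in one step on the same filesystem.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the partial temp file, then re-raise the original error.
        os.unlink(tmp_name)
        raise


# Usage: write a checkpoint and read it back.
with tempfile.TemporaryDirectory() as d:
    checkpoint = Path(d) / "personas.json"
    save_json_atomic(checkpoint, {"persona_001": "{}"})
    restored = json.loads(checkpoint.read_text(encoding="utf-8"))

print(restored)
```

If the process dies mid-write, the previous checkpoint is still intact, so the resume logic in the next cell has something valid to load.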

In [None]:
# Interview all personas
persona_ids = [f'persona_{i:03}' for i in range(1, 101)]
personas_save_path = SUBMISSION_DIR / 'personas.json'

if not personas_save_path.exists():
    save_json(personas_save_path, {})

persona_infos = read_json(personas_save_path)
personas_to_process = len(persona_ids) - len(persona_infos)
print(f'Personas to process: {personas_to_process}')

# Reset cost tracker if starting fresh
if len(persona_infos) == 0:
    reset_cost_tracker()
    print("💰 Starting fresh - cost tracker reset")

# Track how many new personas we process
new_personas_processed = 0

for persona_id in tqdm(persona_ids):
    if persona_id not in persona_infos:
        # Interview
        conversation = conduct_persona_interview(persona_id, max_turns=3)

        # Extract
        persona_info = extract_persona_info(conversation)
        persona_infos[persona_id] = persona_info.model_dump_json()
        new_personas_processed += 1

        # Save every 5 personas
        if len(persona_infos) % 5 == 0:
            save_json(personas_save_path, persona_infos)

        # Show cost update every 20 personas
        if new_personas_processed > 0 and new_personas_processed % 20 == 0:
            print(f"\n💰 Cost update after {new_personas_processed} new personas:")
            print_cost_summary()
            print()

save_json(personas_save_path, persona_infos)

# Convert to PersonaInfo objects
personas = {
    pid: PersonaInfo.model_validate_json(data)
    for pid, data in persona_infos.items()
}

print(f"\n✅ Interviewed {len(personas)} personas total ({new_personas_processed} new)")

# Final cost summary for persona processing
if new_personas_processed > 0:
    print("\n📊 Persona processing costs:")
    print_cost_summary()

Personas to process: 60


 60%|██████    | 60/100 [11:32<17:22, 26.05s/it]


💰 Cost update after 20 new personas:
💰 Cost Summary:
  Total API calls: 184
  Total tokens: 317,843
  Estimated cost: $0.3690

  By model:
    mistral-medium-latest: 184 calls, $0.3690



 80%|████████  | 80/100 [24:52<18:51, 56.55s/it]


💰 Cost update after 40 new personas:
💰 Cost Summary:
  Total API calls: 244
  Total tokens: 444,189
  Estimated cost: $0.5198

  By model:
    mistral-medium-latest: 244 calls, $0.5198



100%|██████████| 100/100 [34:15<00:00, 20.56s/it]


💰 Cost update after 60 new personas:
💰 Cost Summary:
  Total API calls: 286
  Total tokens: 536,314
  Estimated cost: $0.6293

  By model:
    mistral-medium-latest: 286 calls, $0.6293


✅ Interviewed 100 personas total (60 new)

📊 Persona processing costs:
💰 Cost Summary:
  Total API calls: 286
  Total tokens: 536,314
  Estimated cost: $0.6293

  By model:
    mistral-medium-latest: 286 calls, $0.6293





In [32]:
# Match trainings to jobs first
# Pre-build jobs text ONCE for efficiency
jobs_text = "\n".join([
    f'{job_id}: {job_info.describe()}'
    for job_id, job_info in jobs_info.items()
])

job_training_map = {}
for job_id, job_info in tqdm(jobs_info.items(), desc="Mapping trainings to jobs"):
    # Simple heuristic: find trainings that teach required skills
    relevant_trainings = []
    for tid, tinfo in trainings_info.items():
        skill_name = tinfo.skill_acquired_and_level[0].lower()
        if any(skill_name in req.lower() for req in job_info.required_skills):
            relevant_trainings.append(tid)
            if len(relevant_trainings) >= 3:
                break
    job_training_map[job_id] = relevant_trainings

Mapping trainings to jobs: 100%|██████████| 200/200 [00:00<00:00, 1545.52it/s]
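
The heuristic above grabs the first three trainings whose skill name matches and ignores levels entirely. The matching rules ask for a strict level-by-level progression, so here is a minimal sketch of that rule. It assumes the scale is exactly Básico < Intermediário < Avançado (the three levels named in the rules) and uses a made-up catalog keyed by `(skill, level)`; the function names and training IDs are hypothetical.

```python
LEVELS = ["Básico", "Intermediário", "Avançado"]


def levels_to_suggest(current_level, required_level):
    """Return every level above the persona's current one, up to the required level."""
    start = LEVELS.index(current_level) + 1 if current_level else 0
    end = LEVELS.index(required_level) + 1
    return LEVELS[start:end]


def trainings_for_gap(skill, current_level, required_level, catalog):
    """Map each missing level to a training ID, skipping levels absent from the catalog."""
    return [
        catalog[(skill, level)]
        for level in levels_to_suggest(current_level, required_level)
        if (skill, level) in catalog
    ]


# Hypothetical catalog; IDs are invented for illustration.
catalog = {
    ("waste management", "Básico"): "tr_example_01",
    ("waste management", "Intermediário"): "tr_example_02",
}

print(trainings_for_gap("waste management", None, "Intermediário", catalog))
# ['tr_example_01', 'tr_example_02']
print(levels_to_suggest("Básico", "Avançado"))
# ['Intermediário', 'Avançado']
```

A persona who already meets or exceeds the required level gets an empty list, which is what you want: no training needed for that skill.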


In [33]:
# Reset cost tracker for matching phase
print("\n💰 Starting matching phase - resetting cost tracker")
reset_cost_tracker()

# Generate final results
results = []
personas_matched = 0

for persona_id, persona_info in tqdm(personas.items(), desc="Generating recommendations"):
    data = {'persona_id': persona_id}

    # CRITICAL: Check age first for awareness cases!
    try:
        age = int(persona_info.age) if persona_info.age and persona_info.age != 'unknown' else 25
    except (TypeError, ValueError):
        age = 25  # Default to adult if age parsing fails

    if age < 16:
        # Minor - needs awareness type
        data['predicted_type'] = 'awareness'
        data['predicted_items'] = 'too_young'
    else:
        # Adult - proceed with job/training matching
        jobs = find_job_matches(persona_info, jobs_text)
        personas_matched += 1

        if jobs:
            data['predicted_type'] = 'jobs+trainings'
            data['jobs'] = [
                {
                    'job_id': job_id,
                    'suggested_trainings': job_training_map.get(job_id, [])
                }
                for job_id in jobs
            ]
        else:
            # No jobs found, suggest trainings only
            trainings = find_training_matches(persona_info, trainings_info)
            data['predicted_type'] = 'trainings_only'
            data['trainings'] = trainings

    results.append(data)

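    # Show cost update every 25 matched personas. Awareness cases do not
    # increment the counter, so skip them to avoid repeating the last report.
    if data['predicted_type'] != 'awareness' and personas_matched % 25 == 0: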
        print(f"\n💰 Matching progress - {personas_matched} personas matched:")
        print_cost_summary()
        print()

# Save results
results_save_path = SUBMISSION_DIR / 'results.json'
save_json(results_save_path, results)
print(f"\n✅ Generated recommendations for {len(results)} personas")
print(f"📁 Results saved to: {results_save_path}")

# Count types for debugging
type_counts = {}
for r in results:
    t = r.get('predicted_type', 'unknown')
    type_counts[t] = type_counts.get(t, 0) + 1
print(f"\n📊 Type distribution: {type_counts}")

# Final cost summary for matching
print("\n📊 Final matching costs:")
print_cost_summary()
print("\n" + "="*50)


💰 Starting matching phase - resetting cost tracker


Generating recommendations:  27%|██▋       | 27/100 [01:13<03:17,  2.70s/it]


💰 Matching progress - 25 personas matched:
💰 Cost Summary:
  Total API calls: 25
  Total tokens: 305,525
  Estimated cost: $0.1241

  By model:
    mistral-medium-latest: 25 calls, $0.1241



Generating recommendations:  54%|█████▍    | 54/100 [02:18<01:20,  1.76s/it]


💰 Matching progress - 50 personas matched:
💰 Cost Summary:
  Total API calls: 51
  Total tokens: 623,835
  Estimated cost: $0.2543

  By model:
    mistral-medium-latest: 51 calls, $0.2543



Generating recommendations:  81%|████████  | 81/100 [03:02<00:29,  1.53s/it]


💰 Matching progress - 75 personas matched:
💰 Cost Summary:
  Total API calls: 76
  Total tokens: 929,334
  Estimated cost: $0.3784

  By model:
    mistral-medium-latest: 76 calls, $0.3784



Generating recommendations: 100%|██████████| 100/100 [04:27<00:00,  2.67s/it]


✅ Generated recommendations for 100 personas
📁 Results saved to: ../submissions/3/results.json

📊 Type distribution: {'jobs+trainings': 87, 'awareness': 6, 'trainings_only': 7}

📊 Final matching costs:
💰 Cost Summary:
  Total API calls: 101
  Total tokens: 1,237,638
  Estimated cost: $0.5090

  By model:
    mistral-medium-latest: 101 calls, $0.5090






## Submit to Leaderboard!

This is it. The moment of truth. If everything worked, you should jump from ~10% to ~40-50%.

**Before submitting:**
- Check you have 100 results (one per persona)
- Make sure you're not submitting your 10th attempt today (there's a limit!)
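
A quick pre-flight check can save you a wasted attempt. The sketch below only validates the field names used in the results cell above (`persona_id`, `predicted_type`, `jobs`, `trainings`, `predicted_items`); the leaderboard's actual schema is not shown in this tutorial, so treat `validate_results` as a hypothetical helper and adjust it to whatever the submission endpoint expects.

```python
KNOWN_TYPES = {"jobs+trainings", "trainings_only", "awareness"}


def validate_results(results, expected_count=100):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if len(results) != expected_count:
        problems.append(f"expected {expected_count} results, got {len(results)}")

    ids = [r.get("persona_id") for r in results]
    if len(set(ids)) != len(ids):
        problems.append("duplicate persona_id entries")

    for r in results:
        pid, ptype = r.get("persona_id"), r.get("predicted_type")
        if ptype not in KNOWN_TYPES:
            problems.append(f"{pid}: unknown predicted_type {ptype!r}")
        elif ptype == "jobs+trainings" and not r.get("jobs"):
            problems.append(f"{pid}: jobs+trainings without jobs")
        elif ptype == "trainings_only" and "trainings" not in r:
            problems.append(f"{pid}: trainings_only without trainings")
        elif ptype == "awareness" and "predicted_items" not in r:
            problems.append(f"{pid}: awareness without predicted_items")
    return problems


# Usage on a deliberately broken two-row sample.
sample = [
    {"persona_id": "persona_001", "predicted_type": "awareness", "predicted_items": "too_young"},
    {"persona_id": "persona_001", "predicted_type": "jobs+trainings", "jobs": []},
]
issues = validate_results(sample, expected_count=2)
print(issues)
```

Run it as `validate_results(results)` right before submitting and only continue when the list comes back empty.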

**After submitting:**
- Go check the leaderboard immediately
- Screenshot your score for bragging rights
- Share what worked in the Teams channel (help others, win the collaborator award!)
- If your score is still terrible, check our debugging tips above

In [None]:
from src.utils import make_submission

# Submit
response = make_submission(results)

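# Compare against None explicitly: a requests-style Response is falsy for
# 4xx/5xx statuses, which would hide the server's error text below.
if response is not None and response.status_code == 200:
    print("🎉 Submission successful! Check the leaderboard!")
else:
    print(f"❌ Submission failed: {response.text if response is not None else 'No response'}")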

# Final cost report
print("\n" + "="*50)
print_cost_summary()

## What You Just Built

Congrats! You went from giving everyone the same 5 jobs (5-10%) to intelligent AI agents (40-50%). That's real progress.

✅ **Multi-agent system** with specialized roles (like a real consulting team)

✅ **Semantic matching** that actually understands context

✅ **Cost optimization** using appropriate models (small for simple, large for complex)

✅ **40-50% accuracy** (vs that embarrassing 5-10% from Tutorial 3)

### Your Score Analysis

- **If you got 40-50%**: Great! The agents are working. Your semantic matching is solid.
- **If you got 20-30%**: Check your conversation quality. Are personas actually answering your questions?
- **If you got <20%**: Something's broken. Check extraction, check matching logic, check everything.
- **If you got >60%**: Share your secret sauce in Teams! Seriously, help others and maybe win that collaborator award.

## Final Tips

1. **Track your costs** - Set a budget and stick to it
2. **Share knowledge** - The Teams channel is there for a reason
3. **Experiment boldly** - Try different approaches, compare scores
4. **Help others** - The best collaborator award is as cool as winning
5. **Have fun** - You're building AI agents that help people find green jobs. That's pretty awesome.

See you in Tutorial 5, where we'll push for 70%+ with advanced techniques! 🚀

**Remember**: Bad code that ships > Perfect code that doesn't. You shipped. You improved. You're already ahead of most.

---

## Exercises

### Exercise 1: Model Optimization
Try using small models for everything. How much do you save? How much does accuracy drop? The answer might surprise you.

### Exercise 2: Better Interviews
Design interview questions that get all info in 2 turns instead of 5. Compound questions are your friend.

### Exercise 3: Semantic Caching
If two personas have similar profiles, can you reuse recommendations? This could cut costs by 60-70%!
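
A possible starting point is sketched below. To be clear about what it is: this is an exact-match cache on a *normalized* profile key (case, whitespace, and skill order are ignored), which is a simpler stand-in for true semantic similarity. Catching profiles that are merely similar would need something like embeddings plus a distance threshold. `profile_key`, `RecommendationCache`, and the fake matcher are all hypothetical names for illustration.

```python
def profile_key(skills, location, years_of_experience):
    """Build a hashable key that ignores skill order, letter case, and stray spaces."""
    normalized_skills = tuple(sorted((name.lower(), level.lower()) for name, level in skills))
    return (normalized_skills, location.strip().lower(), str(years_of_experience))


class RecommendationCache:
    """Reuse the matcher's result for profiles that normalize to the same key."""

    def __init__(self, matcher):
        self._matcher = matcher
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, skills, location, years_of_experience):
        key = profile_key(skills, location, years_of_experience)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            # Only cache misses reach the (expensive) matcher.
            self._store[key] = self._matcher(skills, location, years_of_experience)
        return self._store[key]


# Usage with a fake matcher standing in for an LLM call.
calls = []

def fake_matcher(skills, location, years_of_experience):
    calls.append(location)
    return ["job_example_001"]

cache = RecommendationCache(fake_matcher)
first = cache.get([("sustainability", "intermediate"), ("project_management", "beginner")], "São Paulo", "2")
second = cache.get([("Project_Management", "Beginner"), ("sustainability", "intermediate")], " são paulo", 2)
print(cache.hits, cache.misses)
```

Track `hits` and `misses` on your real personas before trusting any savings estimate; how often profiles actually collide depends entirely on your extraction output.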

### Exercise 4: Training Paths
Instead of individual trainings, recommend learning paths (sequences of trainings). Much more useful for career progression.

Share your improvements in Teams! The best optimization wins eternal glory (and maybe some swag). 🚀