# Resume Screening with LLMs

**Goal**: Demonstrate structured outputs with LLMs for resume analysis

## What You'll Learn
1. Load and view resume data
2. Use structured outputs to extract information from resumes
3. Build reusable functions with clear inputs/outputs

## Setup

In [None]:
# Configuration
OPENROUTER_API_KEY = ""  # Paste your key here

if not OPENROUTER_API_KEY.strip():
    raise RuntimeError(
        "⚠️  Please set OPENROUTER_API_KEY above before running this notebook.\n"
        "Get your key from: https://openrouter.ai/keys"
    )

print("✓ API key configured")

In [None]:
import csv
import json
import httpx
from typing import Dict, List, Any, Optional

print("✓ Imports loaded")

## Part 1: Loading Resume Data

First, we need a function to load our resume data from the CSV file.

In [None]:
def load_resumes(csv_path: str) -> Dict[str, Dict[str, str]]:
    """
    Load all resumes from CSV into a dictionary.
    
    Args:
        csv_path: Path to the resumes CSV file
    
    Returns:
        Dict mapping resume ID to resume data (ID, Resume_str, Category)
    """
    resumes = {}
    with open(csv_path, 'r', encoding='utf-8') as f:
        reader = csv.DictReader(f)
        for row in reader:
            resumes[row['ID']] = {
                'ID': row['ID'],
                'Resume_str': row['Resume_str'],
                'Category': row['Category']
            }
    return resumes
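
One thing to watch with this loader: because the dictionary is keyed by `ID`, a later row that repeats an ID silently overwrites the earlier one. The sketch below shows one possible way to check for that before loading. It assumes the same `ID` column as above; the helper name `find_duplicate_ids` and the in-memory sample rows are made up for illustration and are not part of the notebook's dataset.

```python
import csv
import io
from collections import Counter
from typing import Iterable, List


def find_duplicate_ids(lines: Iterable[str], id_column: str = "ID") -> List[str]:
    """Return IDs that appear more than once (hypothetical helper)."""
    reader = csv.DictReader(lines)
    counts = Counter(row[id_column] for row in reader)
    return sorted(rid for rid, n in counts.items() if n > 1)


# Made-up sample data standing in for the real CSV file
sample_csv = io.StringIO(
    "ID,Resume_str,Category\n"
    "101,Python developer,IT\n"
    "102,Staff accountant,FINANCE\n"
    "101,Duplicate row,IT\n"
)
duplicates = find_duplicate_ids(sample_csv)
print(duplicates)
```

For a real file, an open file handle can be passed in place of the `io.StringIO` object, since `csv.DictReader` accepts any iterable of lines.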

In [None]:
# Load all resumes
resumes = load_resumes('../data/resumes_final.csv')

print(f"Loaded {len(resumes)} resumes")
print("\nSample resume IDs:")
for resume_id in list(resumes.keys())[:5]:
    print(f"  {resume_id} - {resumes[resume_id]['Category']}")

In [None]:
# View a single resume
sample_id = list(resumes.keys())[0]
sample_resume = resumes[sample_id]

print(f"Resume ID: {sample_resume['ID']}")
print(f"Category: {sample_resume['Category']}")
print("\nResume text (first 800 characters):")
print("="*70)
print(sample_resume['Resume_str'][:800])
print("...")

## Part 2: Structured Output with LLMs

Now we'll create a reusable function that takes:
1. A prompt
2. A resume
3. A structured output schema

The function returns the model's response parsed into a Python dictionary.

In [None]:
def analyze_resume_with_structure(
    api_key: str,
    prompt: str,
    resume_text: str,
    output_schema: str,
    model: str = "anthropic/claude-3.5-sonnet",
    temperature: float = 0.3
) -> Dict[str, Any]:
    """
    Analyze a resume using an LLM with structured output.
    
    Args:
        api_key: OpenRouter API key
        prompt: The instruction for what to analyze
        resume_text: The resume text to analyze
        output_schema: Text describing the expected JSON structure
            (an example shape, not a formal JSON Schema)
        model: Model to use (default: Claude 3.5 Sonnet)
        temperature: Sampling temperature (default: 0.3 for consistency)
    
    Returns:
        Dict with 'result' (parsed JSON, or None on failure), 'error'
        (message string, or None on success), and 'usage' (token counts)
    """
    # Build the full prompt (only the first 3000 characters of the resume are sent)
    full_prompt = f"""{prompt}

Resume:
{resume_text[:3000]}

Return a JSON object with this structure:
{output_schema}

Return ONLY valid JSON, no additional text."""
    
    # Make API call
    url = "https://openrouter.ai/api/v1/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": full_prompt}],
        "temperature": temperature,
        "max_tokens": 1500,
        "response_format": {"type": "json_object"}
    }
    
    try:
        with httpx.Client(timeout=60) as client:
            resp = client.post(url, headers=headers, json=payload)
            resp.raise_for_status()
            data = resp.json()
            
            content = data["choices"][0]["message"]["content"]
            result = json.loads(content)
            
            return {
                "result": result,
                "error": None,
                "usage": data.get("usage", {})
            }
    except Exception as e:
        # Covers network failures, non-2xx responses, and malformed or unparseable replies
        return {
            "result": None,
            "error": str(e),
            "usage": {}
        }
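
The function above passes the model's reply straight to `json.loads`. Some models wrap JSON in a Markdown code fence even when asked to return only JSON; whether that happens with `response_format` set depends on the model and is not something this notebook establishes. As a hedged sketch, a more tolerant parser could look like the following. The name `parse_json_content` is hypothetical.

```python
import json
from typing import Any

# Three backticks, built programmatically so this example stays easy to embed
FENCE = "`" * 3


def parse_json_content(content: str) -> Any:
    """Parse JSON, tolerating a surrounding Markdown code fence (hypothetical helper)."""
    text = content.strip()
    if text.startswith(FENCE) and text.endswith(FENCE):
        # Drop the opening fence line (which may carry a "json" tag)
        # and the closing fence
        first_newline = text.find("\n")
        if first_newline != -1:
            text = text[first_newline + 1 : -len(FENCE)]
    return json.loads(text)


fenced = FENCE + 'json\n{"tools": ["git"]}\n' + FENCE
print(parse_json_content(fenced))
print(parse_json_content('{"tools": ["git"]}'))
```

Input that is still not valid JSON after unwrapping raises `json.JSONDecodeError`, so it would land in the same `except` branch as before.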

## Example 1: Extract Technical Skills

Let's use our function to extract technical skills from a resume.

In [None]:
# Define what we want to extract
prompt = "Extract the technical skills, programming languages, frameworks, and technologies from this resume."

# Define the output structure
output_schema = """
{
  "programming_languages": ["list of languages"],
  "frameworks_libraries": ["list of frameworks"],
  "databases": ["list of databases"],
  "cloud_platforms": ["list of cloud platforms"],
  "tools": ["list of development tools"]
}
"""

# Analyze the resume
result = analyze_resume_with_structure(
    OPENROUTER_API_KEY,
    prompt,
    sample_resume['Resume_str'],
    output_schema
)

if result['error']:
    print(f"❌ Error: {result['error']}")
else:
    print("✓ Skills extracted successfully\n")
    print(json.dumps(result['result'], indent=2))
    print(f"\nTokens used: {result['usage'].get('total_tokens', 0)}")
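
The schema in the prompt is only a request, so the reply may omit a key or return a string where a list was expected. A minimal sketch of a shape check for the skills result follows. The key names come from the schema above; the function name `check_skill_result` and the example dictionary are assumptions for illustration.

```python
from typing import Any, Dict, List

# Keys taken from the skills schema used in this example
SKILL_KEYS = [
    "programming_languages",
    "frameworks_libraries",
    "databases",
    "cloud_platforms",
    "tools",
]


def check_skill_result(result: Dict[str, Any]) -> List[str]:
    """Return a list of problems found; an empty list means the shape looks right."""
    problems = []
    for key in SKILL_KEYS:
        if key not in result:
            problems.append(f"missing key: {key}")
        elif not isinstance(result[key], list):
            problems.append(f"{key} is not a list")
    return problems


# Made-up reply with one wrong type and one missing key
example = {
    "programming_languages": ["Python"],
    "frameworks_libraries": [],
    "databases": "PostgreSQL",
    "cloud_platforms": [],
}
problems = check_skill_result(example)
print(problems)
```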

## Example 2: Assess Years of Experience

By changing only the prompt and the schema, the same function can estimate experience level and summarize key qualifications.

In [None]:
# Different prompt and structure
prompt = "Analyze this resume and estimate the candidate's experience level and key qualifications."

output_schema = """
{
  "estimated_years_experience": <number>,
  "experience_level": "<entry|junior|mid|senior|principal>",
  "job_titles": ["list of job titles held"],
  "education": "highest degree or education",
  "key_strengths": ["list of 3-5 key strengths"]
}
"""

result = analyze_resume_with_structure(
    OPENROUTER_API_KEY,
    prompt,
    sample_resume['Resume_str'],
    output_schema
)

if result['error']:
    print(f"❌ Error: {result['error']}")
else:
    print("✓ Experience assessed successfully\n")
    print(json.dumps(result['result'], indent=2))
    print(f"\nTokens used: {result['usage'].get('total_tokens', 0)}")
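
Downstream code is often easier to write against a typed object than a raw dictionary. The sketch below converts the experience result into a dataclass and rejects levels outside the set listed in the schema. The class `ExperienceSummary` and the function `to_experience_summary` are hypothetical names, and the sample input is invented.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Allowed values, copied from the schema in this example
VALID_LEVELS = {"entry", "junior", "mid", "senior", "principal"}


@dataclass
class ExperienceSummary:
    years: float
    level: str
    job_titles: List[str] = field(default_factory=list)


def to_experience_summary(result: Dict[str, Any]) -> ExperienceSummary:
    """Convert a raw result dict into a typed summary (hypothetical helper)."""
    level = str(result.get("experience_level", "")).strip().lower()
    if level not in VALID_LEVELS:
        raise ValueError(f"unexpected experience_level: {level!r}")
    # Raises KeyError if the field is missing, ValueError/TypeError if not numeric
    years = float(result["estimated_years_experience"])
    titles = [str(t) for t in result.get("job_titles", [])]
    return ExperienceSummary(years=years, level=level, job_titles=titles)


summary = to_experience_summary(
    {
        "estimated_years_experience": 7,
        "experience_level": "Senior",
        "job_titles": ["Data Engineer"],
    }
)
print(summary)
```

Raising on unexpected values is one design choice; falling back to a default level would be another, at the cost of hiding bad replies.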

## Example 3: Match to Job Requirements

Now let's compare a resume against a job requisition.

In [None]:
# Load a job requisition
with open('../data/job_req_senior.md', 'r', encoding='utf-8') as f:
    job_req = f.read()

print("Job Requisition (first 300 characters):")
print("="*70)
print(job_req[:300])
print("...")

In [None]:
# Match resume to job
prompt = f"""Compare this resume against the job requirements below and assess the candidate's fit.

Job Requirements:
{job_req[:2000]}"""

output_schema = """
{
  "fit_score": <number 0-100>,
  "recommendation": "<STRONG_FIT|GOOD_FIT|MODERATE_FIT|WEAK_FIT|POOR_FIT>",
  "matching_qualifications": ["list of requirements the candidate meets"],
  "missing_qualifications": ["list of requirements the candidate lacks"],
  "reasoning": "<2-3 sentence explanation of the fit score>"
}
"""

result = analyze_resume_with_structure(
    OPENROUTER_API_KEY,
    prompt,
    sample_resume['Resume_str'],
    output_schema
)

if result['error']:
    print(f"❌ Error: {result['error']}")
else:
    print("✓ Match assessment complete\n")
    match_data = result['result']
    # Use .get() so a field the model omitted does not raise KeyError
    matching = match_data.get('matching_qualifications', [])
    missing = match_data.get('missing_qualifications', [])
    print(f"Fit Score: {match_data.get('fit_score', 'n/a')}/100")
    print(f"Recommendation: {match_data.get('recommendation', 'n/a')}")
    print(f"\nReasoning: {match_data.get('reasoning', 'n/a')}")
    print(f"\nMatching Qualifications ({len(matching)}):")
    for qual in matching[:5]:
        print(f"  ✓ {qual}")
    print(f"\nMissing Qualifications ({len(missing)}):")
    for qual in missing[:5]:
        print(f"  ✗ {qual}")
    print(f"\nTokens used: {result['usage'].get('total_tokens', 0)}")
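
Nothing forces the model to keep `fit_score` within 0-100 or to use one of the five recommendation labels. As a rough sketch, the match result could be normalized before it is stored or compared. The function name `normalize_match`, the `"UNKNOWN"` fallback label, and the choice to default a missing score to 0 are all assumptions made for this example.

```python
from typing import Any, Dict

# Labels copied from the schema in this example
RECOMMENDATIONS = {"STRONG_FIT", "GOOD_FIT", "MODERATE_FIT", "WEAK_FIT", "POOR_FIT"}


def normalize_match(match: Dict[str, Any]) -> Dict[str, Any]:
    """Clamp the score and fill in defaults (hypothetical helper)."""
    try:
        score = float(match.get("fit_score", 0))
    except (TypeError, ValueError):
        score = 0.0  # assumption: treat an unusable score as 0
    score = max(0.0, min(100.0, score))

    recommendation = str(match.get("recommendation", "")).strip().upper()
    if recommendation not in RECOMMENDATIONS:
        recommendation = "UNKNOWN"  # assumed fallback label, not part of the schema

    return {
        "fit_score": score,
        "recommendation": recommendation,
        "matching_qualifications": list(match.get("matching_qualifications") or []),
        "missing_qualifications": list(match.get("missing_qualifications") or []),
        "reasoning": str(match.get("reasoning", "")),
    }


cleaned = normalize_match({"fit_score": 140, "recommendation": "good_fit"})
print(cleaned)
```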

## Key Takeaways

1. **Simple data loading**: One function returns all resumes as a dictionary
2. **Reusable structured output function**: Same function works for different analysis tasks
3. **Flexibility**: Change the prompt and output schema to extract different information
4. **Clear inputs/outputs**: Each function has a single, well-defined purpose

## Next Steps

Try experimenting with:
- Different prompts and output schemas
- Analyzing multiple resumes in a loop
- Comparing resumes against different job requirements
- Extracting other information (soft skills, certifications, etc.)
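
As a starting point for the loop idea, here is a minimal sketch that applies an analysis function to several resumes and collects the outcomes. It assumes the dictionary shape returned by `load_resumes` and the `result`/`error` keys returned by the analysis function. The name `analyze_many`, the stand-in `fake_analyze`, and the demo data are hypothetical, so the example runs without an API key.

```python
from typing import Any, Callable, Dict, List, Optional


def analyze_many(
    resumes: Dict[str, Dict[str, str]],
    analyze: Callable[[str], Dict[str, Any]],
    limit: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Run `analyze` on each resume's text and collect one row per resume."""
    rows = []
    # Slicing with limit=None keeps every resume
    for resume_id, resume in list(resumes.items())[:limit]:
        outcome = analyze(resume["Resume_str"])
        rows.append(
            {
                "ID": resume_id,
                "result": outcome.get("result"),
                "error": outcome.get("error"),
            }
        )
    return rows


def fake_analyze(text: str) -> Dict[str, Any]:
    """Stand-in for a real LLM call; it only counts words."""
    if not text:
        return {"result": None, "error": "empty resume"}
    return {"result": {"word_count": len(text.split())}, "error": None}


# Made-up data in the same shape as the loaded resumes
demo_resumes = {
    "1": {"Resume_str": "Python and SQL developer"},
    "2": {"Resume_str": ""},
}
rows = analyze_many(demo_resumes, fake_analyze)
for row in rows:
    print(row)
```

In the notebook itself, `analyze` could be a small wrapper such as `lambda text: analyze_resume_with_structure(OPENROUTER_API_KEY, prompt, text, output_schema)`. Each call is billed, so a small `limit` is a sensible default while experimenting.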