# Resume Screening with LLMs: Building Modular AI Systems

**Goal**: Demonstrate how to build AI systems from modules with well-defined inputs and outputs, starting with a vertical slice

## Key Concepts
1. **Modular Design**: Break complex tasks into functions with clear inputs/outputs
2. **Vertical Slices**: Build end-to-end functionality for one case first
3. **Horizontal Expansion**: Add features after vertical slice works

## System Overview
```
Resume PDF/CSV → Extract Skills → Match to Job → Rank Candidates
     (Input)         (Module 1)      (Module 2)      (Output)
```

## Setup
This notebook requires:
- An OpenRouter API key, pasted into `OPENROUTER_API_KEY` in the first code cell
- `resumes_final.csv` in `../data/`
- `job_req.md` in `../data/`

In [None]:
# Add current directory to Python path for imports
import sys
from pathlib import Path
sys.path.insert(0, str(Path.cwd()))

import pandas as pd
import json
from typing import Dict, List

from resume_utils import (
    load_resume_from_csv,
    load_all_resumes,
    load_job_req,
    extract_skills,
    match_to_job,
    screen_resume_vertical_slice
)

# Configuration
OPENROUTER_API_KEY = ""  # Paste your key here
MODEL = "anthropic/claude-3.5-sonnet"

# File paths
CSV_PATH = "../data/resumes_final.csv"
JOB_REQ_PATH = "../data/job_req.md"

if not OPENROUTER_API_KEY.strip():  # Empty or whitespace-only key
    raise RuntimeError(
        "⚠️  Please set OPENROUTER_API_KEY above before running this notebook.\n"
        "Get your key from: https://openrouter.ai/keys"
    )

print("✓ Imports loaded")
print("✓ API key configured")

## Part 1: Understanding the Data

Before building any AI system, we need to understand our inputs.

In [None]:
# Load and explore the resume data
resumes_df = pd.read_csv(CSV_PATH)

print(f"Total resumes: {len(resumes_df)}")
print(f"\nCategories:")
print(resumes_df['Category'].value_counts())

# Show sample resume IDs
print(f"\nSample IT Resume IDs:")
it_resumes = resumes_df[resumes_df['Category'] == 'INFORMATION-TECHNOLOGY']['ID'].head(5).tolist()
for rid in it_resumes:
    print(f"  {rid}")

In [None]:
# Load and display job requisition
job_req = load_job_req(JOB_REQ_PATH)

print("Job Requisition (first 500 characters):")
print("="*70)
print(job_req[:500])
print("...")
print(f"\nTotal length: {len(job_req)} characters")

## Part 2: Building a Vertical Slice (Crawl Phase)

**Goal**: Process ONE resume end-to-end to prove the concept works.

### Module 1: Extract Skills
**Input**: Resume text (string)  
**Output**: Structured skills data (dict)

This module has a single, clear responsibility: parse a resume and extract technical skills.
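
The cells below read three keys from the value `extract_skills` returns: `error`, `parsed_content`, and `usage`. The implementation lives in `resume_utils` and is not shown here, so the following is only a minimal sketch of how such a result envelope might be built from a raw model reply. The helper name `parse_llm_json` and the fence-stripping step are assumptions, not part of the notebook's code.

```python
import json
from typing import Any, Dict, Optional

FENCE = "`" * 3  # Markdown code-fence marker that models often wrap JSON in


def parse_llm_json(raw_text: str, usage: Optional[Dict[str, int]] = None) -> Dict[str, Any]:
    """Hypothetical helper: wrap a raw model reply in an error/parsed_content/usage dict."""
    text = raw_text.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence line
        if lines and lines[-1].strip().startswith(FENCE):
            lines = lines[:-1]  # drop the closing fence line
        text = "\n".join(lines)
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as exc:
        # Report the failure in the envelope instead of raising, so callers can branch on it
        return {"error": f"Invalid JSON: {exc}", "parsed_content": None, "usage": usage or {}}
    return {"error": None, "parsed_content": parsed, "usage": usage or {}}
```

Returning the error inside the dict, rather than raising, is what lets the notebook cells check `skills_result['error']` before touching `parsed_content`.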

In [None]:
# Test skill extraction on ONE resume
test_resume_id = it_resumes[0]  # Use first IT resume

# Load the resume
test_resume = load_resume_from_csv(CSV_PATH, test_resume_id)
print(f"Testing with Resume ID: {test_resume_id}")
print(f"Category: {test_resume['Category']}")
print(f"\nResume text (first 500 chars):")
print(test_resume['Resume_str'][:500])
print("...")

In [None]:
# Extract skills using LLM
print("Extracting skills...")
skills_result = extract_skills(
    OPENROUTER_API_KEY,
    test_resume['Resume_str'],
    MODEL
)

if skills_result['error']:
    print(f"❌ Error: {skills_result['error']}")
else:
    print("✓ Skills extracted successfully")
    skills = skills_result['parsed_content']
    
    print(f"\nTokens used: {skills_result['usage'].get('total_tokens', 0)}")
    print(f"\nExtracted Skills:")
    print(json.dumps(skills, indent=2))

### Module 2: Match to Job Requirements
**Input**: Skills dict + Job req text  
**Output**: Match score and analysis (dict)

This module takes the structured output from Module 1 and compares it to job requirements.
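
Later cells read `fit_score`, `recommendation`, `reasoning`, `matching_skills`, and `missing_skills` from the match output. Because an LLM can omit or mistype any of these, it may help to check the dict before using it. The sketch below is one possible check; `validate_match` is a hypothetical helper, and the 0 to 100 range is inferred from the notebook printing scores as `/100`.

```python
from typing import Any, Dict, List

# Fields the notebook reads from the match result, with the types it appears to expect
EXPECTED_FIELDS = {
    "fit_score": (int, float),
    "recommendation": (str,),
    "reasoning": (str,),
    "matching_skills": (list,),
    "missing_skills": (list,),
}


def validate_match(match: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a match dict (an empty list means it looks usable)."""
    problems = []
    for field, types in EXPECTED_FIELDS.items():
        if field not in match:
            problems.append(f"missing field: {field}")
        elif isinstance(match[field], bool) or not isinstance(match[field], types):
            # bool is a subclass of int, so reject it explicitly
            problems.append(f"wrong type for {field}: {type(match[field]).__name__}")
    score = match.get("fit_score")
    if isinstance(score, (int, float)) and not isinstance(score, bool) and not 0 <= score <= 100:
        problems.append(f"fit_score out of range: {score}")
    return problems
```

A caller could run this on `match_result['parsed_content']` and treat a non-empty list the same way as an API error.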

In [None]:
# Match extracted skills to job requirements
print("Matching to job requirements...")
match_result = match_to_job(
    OPENROUTER_API_KEY,
    skills,
    job_req,
    MODEL
)

if match_result['error']:
    print(f"❌ Error: {match_result['error']}")
else:
    print("✓ Matching completed successfully")
    match_data = match_result['parsed_content']
    
    print(f"\nTokens used: {match_result['usage'].get('total_tokens', 0)}")
    print(f"\nMatch Results:")
    print(json.dumps(match_data, indent=2))

### Complete Vertical Slice: End-to-End Pipeline

Now that we've tested each module independently, let's combine them into a single pipeline.

This is our **vertical slice**: one resume through the entire system.
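
`screen_resume_vertical_slice` is defined in `resume_utils`, so its body is not visible here. As a rough sketch of the idea, the two modules can be chained by a small function that stops at the first failure and sums token usage. The name `run_slice`, its parameters, and the stub modules are all assumptions for illustration; the returned keys mirror the ones the cells below read, plus an assumed `skills` key.

```python
from typing import Any, Callable, Dict

Envelope = Dict[str, Any]  # {"error": ..., "parsed_content": ..., "usage": {...}}


def run_slice(
    resume_id: str,
    category: str,
    resume_text: str,
    job_req: str,
    extract: Callable[[str], Envelope],
    match: Callable[[Any, str], Envelope],
) -> Dict[str, Any]:
    """Chain extraction and matching, stopping at the first failure."""
    skills_result = extract(resume_text)
    if skills_result["error"]:
        return {"resume_id": resume_id, "error": f"extract: {skills_result['error']}"}

    match_result = match(skills_result["parsed_content"], job_req)
    if match_result["error"]:
        return {"resume_id": resume_id, "error": f"match: {match_result['error']}"}

    tokens = sum(r["usage"].get("total_tokens", 0) for r in (skills_result, match_result))
    return {
        "resume_id": resume_id,
        "category": category,
        "skills": skills_result["parsed_content"],
        "match": match_result["parsed_content"],
        "total_tokens": tokens,
    }


# Stub modules standing in for the real LLM-backed functions
def fake_extract(text: str) -> Envelope:
    return {"error": None, "parsed_content": {"skills": ["Python"]}, "usage": {"total_tokens": 10}}


def fake_match(skills: Any, job_req: str) -> Envelope:
    return {"error": None, "parsed_content": {"fit_score": 70}, "usage": {"total_tokens": 5}}


demo = run_slice("r-001", "INFORMATION-TECHNOLOGY", "resume text", "job text", fake_extract, fake_match)
```

Passing the modules in as arguments means the wiring can be exercised with stubs, without spending any tokens, before the real functions are plugged in.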

In [None]:
# Run complete pipeline on one resume
print(f"Running complete pipeline on resume {test_resume_id}...\n")

result = screen_resume_vertical_slice(
    OPENROUTER_API_KEY,
    test_resume_id,
    CSV_PATH,
    JOB_REQ_PATH,
    MODEL
)

if 'error' in result:
    print(f"❌ Error: {result['error']}")
else:
    print("✓ Pipeline completed successfully\n")
    print(f"Resume ID: {result['resume_id']}")
    print(f"Category: {result['category']}")
    print(f"Total tokens: {result['total_tokens']}")
    print(f"\nFit Score: {result['match']['fit_score']}/100")
    print(f"Recommendation: {result['match']['recommendation']}")
    print(f"\nReasoning: {result['match']['reasoning']}")

## Part 3: Horizontal Expansion (Walk Phase)

**Goal**: Now that our vertical slice works, expand to process multiple resumes.

### Batch Processing Multiple Resumes

We can now reuse our tested pipeline to process multiple resumes.

In [None]:
# Select a sample of resumes to process
# Mix of IT resumes and non-IT resumes
it_sample = resumes_df[resumes_df['Category'] == 'INFORMATION-TECHNOLOGY']['ID'].head(3).tolist()
non_it_sample = resumes_df[resumes_df['Category'] != 'INFORMATION-TECHNOLOGY']['ID'].head(2).tolist()

sample_ids = it_sample + non_it_sample

print(f"Processing {len(sample_ids)} resumes...")
print(f"IT resumes: {len(it_sample)}")
print(f"Non-IT resumes: {len(non_it_sample)}")

In [None]:
# Process all sample resumes
results = []
total_tokens = 0

for i, resume_id in enumerate(sample_ids, 1):
    print(f"\n[{i}/{len(sample_ids)}] Processing {resume_id}...")
    
    result = screen_resume_vertical_slice(
        OPENROUTER_API_KEY,
        resume_id,
        CSV_PATH,
        JOB_REQ_PATH,
        MODEL
    )
    
    if 'error' in result:
        print(f"  ❌ Error: {result['error']}")
        results.append({
            'resume_id': resume_id,
            'error': result['error']
        })
    else:
        print(f"  ✓ Score: {result['match']['fit_score']}/100 - {result['match']['recommendation']}")
        results.append(result)
        total_tokens += result['total_tokens']

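print("\n" + "="*70)
print("Batch processing complete!")
print(f"Total tokens used: {total_tokens:,}")
# Rough estimate only: prices every token at an assumed flat $3 per million.
# Providers usually charge different rates for input and output tokens, so check your model's pricing.
print(f"Estimated cost: ${total_tokens * 3 / 1_000_000:.4f}")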

### Rank and Compare Candidates

Now we can analyze and rank all candidates based on their fit scores.

In [None]:
# Create DataFrame from results
results_data = []
for r in results:
    if 'error' not in r:
        results_data.append({
            'Resume ID': r['resume_id'],
            'Category': r['category'],
            'Fit Score': r['match']['fit_score'],
            'Recommendation': r['match']['recommendation'],
            'Matching Skills Count': len(r['match']['matching_skills']),
            'Missing Skills Count': len(r['match']['missing_skills']),
            'Tokens': r['total_tokens']
        })

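results_df = pd.DataFrame(results_data)
# Sort only when there are rows: an empty DataFrame has no 'Fit Score' column
if not results_df.empty:
    results_df = results_df.sort_values('Fit Score', ascending=False)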

print("Candidate Rankings:")
print("="*70)
print(results_df.to_string(index=False))

In [None]:
# Display detailed results for top candidate
# Check the ranked table rather than results[0]: the first resume processed
# may have failed even though others succeeded
if len(results_df) > 0:
    top_result = results_df.iloc[0]
    top_resume_id = top_result['Resume ID']
    
    # Find full result
    top_full = next(r for r in results if r.get('resume_id') == top_resume_id)
    
    print(f"\nTop Candidate Details: {top_resume_id}")
    print("="*70)
    print(f"\nFit Score: {top_full['match']['fit_score']}/100")
    print(f"Recommendation: {top_full['match']['recommendation']}")
    print(f"\nMatching Skills ({len(top_full['match']['matching_skills'])}):")
    for skill in top_full['match']['matching_skills'][:10]:  # Show first 10
        print(f"  ✓ {skill}")
    if len(top_full['match']['matching_skills']) > 10:
        print(f"  ... and {len(top_full['match']['matching_skills']) - 10} more")
    
    print(f"\nMissing Skills ({len(top_full['match']['missing_skills'])}):")
    for skill in top_full['match']['missing_skills'][:5]:  # Show first 5
        print(f"  ✗ {skill}")
    
    print(f"\nReasoning:")
    print(f"  {top_full['match']['reasoning']}")

## Part 4: Analysis and Insights

### What Did We Build?

1. **Modular Design**: Each function has clear inputs/outputs
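   - `extract_skills(resume_text) -> skills_dict`
   - `match_to_job(skills, job_req) -> match_dict`
   - `screen_resume_vertical_slice(resume_id) -> full_result`
   - Signatures are simplified here: the actual calls also take the API key and model name, and the pipeline function takes the CSV and job requisition paths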

2. **Vertical Slice First**: We tested one resume end-to-end before expanding
   - Verified each module works independently
   - Combined into complete pipeline
   - Only then expanded to batch processing

3. **Easy to Test and Debug**: Each piece can be tested independently
   - If skill extraction fails, we know exactly where the problem is
   - If matching is poor, we can improve just that module
   - Clear boundaries make debugging straightforward

### Next Steps (Run Phase)

To make this production-ready, we would add:
- Error handling and retry logic
- Caching of job requirements and common skills
- Cost tracking and budgeting
- Human-in-the-loop review for borderline cases
- Prompt optimization based on accuracy metrics
- Parallel processing for large batches
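
As one illustration of the retry item above, the modules in this notebook report failures through an `error` value instead of raising, so a retry wrapper can simply re-invoke a call until that value is empty. This is a minimal sketch, not the notebook's implementation: `call_with_retries`, its parameters, and the exponential backoff schedule are assumptions.

```python
import time
from typing import Any, Callable, Dict


def call_with_retries(
    call: Callable[[], Dict[str, Any]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Re-run `call` until its result has no error, doubling the wait between attempts."""
    result: Dict[str, Any] = {"error": "call was never attempted"}
    for attempt in range(1, max_attempts + 1):
        result = call()
        if not result.get("error"):
            return result
        if attempt < max_attempts:
            # Exponential backoff: base_delay, 2x, 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return result  # Last failed result, so the caller can still inspect the error
```

It could wrap an existing call with a lambda, for example `call_with_retries(lambda: extract_skills(OPENROUTER_API_KEY, text, MODEL))`. The `sleep` parameter exists so tests can substitute a fake and run instantly.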

## Exercise: Identify Non-IT Resumes

Our dataset includes 10 non-IT resumes mixed in with 120 IT resumes.

**Challenge**: Did our system give the non-IT resumes low fit scores?

In [None]:
# Show which resumes are actually non-IT
if len(results_df) > 0:
    print("Non-IT Resumes in Our Sample:")
    print("="*70)
    non_it_results = results_df[results_df['Category'] != 'INFORMATION-TECHNOLOGY']
    if len(non_it_results) > 0:
        print(non_it_results.to_string(index=False))
        print(f"\nAverage fit score for non-IT: {non_it_results['Fit Score'].mean():.1f}")
    else:
        print("No non-IT resumes in this sample")
    
    print(f"\nIT Resumes in Our Sample:")
    print("="*70)
    it_results = results_df[results_df['Category'] == 'INFORMATION-TECHNOLOGY']
    if len(it_results) > 0:
        print(it_results.to_string(index=False))
        print(f"\nAverage fit score for IT: {it_results['Fit Score'].mean():.1f}")

## Key Takeaways

1. **Break problems into pieces with clear interfaces**
   - Each function does one thing well
   - Easy to test, debug, and improve

2. **Build vertical slices first**
   - Prove end-to-end flow works with simplest case
   - Identify integration issues early

3. **Expand horizontally after vertical slice works**
   - Add batch processing
   - Add error handling and edge cases
   - Don't optimize prematurely

4. **AI systems especially benefit from this approach**
   - LLM behavior is less predictable than conventional code
   - Modular design helps isolate issues
   - Easy to swap models or prompts for specific steps
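
One way to make per-step swapping concrete is to keep each step's model and prompt in a small config object. The sketch below is hypothetical throughout: `StepConfig`, `PIPELINE_CONFIG`, `with_model`, and the prompt templates are illustrative and do not reflect how `resume_utils` is actually organized. Only the model name comes from this notebook.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class StepConfig:
    """Model and prompt for one pipeline step (hypothetical structure)."""
    model: str
    prompt_template: str


PIPELINE_CONFIG: Dict[str, StepConfig] = {
    "extract_skills": StepConfig(
        model="anthropic/claude-3.5-sonnet",
        prompt_template="List the technical skills in this resume as JSON:\n{resume_text}",
    ),
    "match_to_job": StepConfig(
        model="anthropic/claude-3.5-sonnet",
        prompt_template="Compare these skills to the job and return JSON:\n{skills}\n{job_req}",
    ),
}


def with_model(config: Dict[str, StepConfig], step: str, model: str) -> Dict[str, StepConfig]:
    """Return a copy of the config with one step's model swapped; the original is untouched."""
    updated = dict(config)
    updated[step] = replace(config[step], model=model)
    return updated
```

With this layout, trying a different model for matching alone is a one-line change, and the extraction step keeps its tested configuration.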