# Resume Matching System - Interactive Demo

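This notebook walks through the resume matching system: it scores a small set of synthetic resumes against a sample job description, then evaluates the resulting ranking against manual labels.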


## Setup

In [None]:
import sys
import warnings
warnings.filterwarnings('ignore')

from matcher import SemanticMatcher
from parsers import StructuredJobDescription, StructuredResume
from structured_matcher import EnhancedResumeMatchingSystem
import pandas as pd
import numpy as np
from sklearn.metrics import ndcg_score


print("✅ Setup complete!")

✅ Setup complete!


## Part 1: Quick Start Example

In [15]:
# Simple job description
job_description = """
    Senior Data Scientist
    
    About the Role:
    Join our data science team to drive insights from large-scale datasets
    and build predictive models for business decision-making.
    
    Required Skills:
    - 5+ years of Python programming
    - Strong SQL and database experience
    - Machine learning frameworks (scikit-learn, TensorFlow)
    - Statistical analysis and hypothesis testing
    - Data visualization (Matplotlib, Tableau)
    
    Responsibilities:
    - Analyze complex datasets to identify trends and patterns
    - Build and deploy predictive models
    - Collaborate with product teams on data-driven features
    - Present findings to stakeholders
    - Mentor junior data scientists
    
    Qualifications:
    - Master's or PhD in Statistics, Computer Science, or related field
    - 5+ years of professional data science experience
    - Experience with A/B testing and experimental design
    - Strong communication skills
    
"""

## Part 2: Synthetic Resumes and Manual Labels

In [16]:
# =========================
# Synthetic Resumes + Manual Labels
# =========================
# Labels:
# 1.0  = Good Match
# 0.5  = Partial Match
# 0.0  = Poor Match


# Sample resumes
resumes = [
    {
        "text": """
        John Doe
        Senior ML Engineer with 7 years of experience.
        Skills: Python, TensorFlow, PyTorch, AWS, SQL, Docker
        Deployed ML models to production at scale.
        """,
        "label": 1.0  # Good Match
    },
    {
        "text": """
        Jane Smith
        Data Scientist with 4 years experience.
        Skills: Python, scikit-learn, SQL, basic AWS.
        Limited production deployment exposure.
        """,
        "label": 0.5  # Partial Match
    },
    {
        "text": """
        Mike Johnson
        Backend Developer with 6 years experience.
        Skills: Java, Spring Boot, MySQL, Docker.
        No ML background.
        """,
        "label": 0.0  # Poor Match
    },
    {
        "text": """
        Chef Marcus Thorne
        Executive Head Chef | 12 years experience
        Expertise: Menu Development, Kitchen Management, Fine Dining, Food Safety.
        Technical Skills: Inventory Management Software, POS Systems.
        Experience: Managed 5-star restaurant kitchen with a team of 15 chefs.
        """,
        "label": 0.0,
        "test_case": "Culinary vs. Data Science (Management context overlap)"
    },
    {
        "text": """
        Dr. Arisaka Chandra
        Principal Research Scientist | 15+ years experience
        Expertise: Neural Networks, Bayesian Inference, NLP, PyTorch.
        Leadership: Managed a department of 20+ Data Scientists.
        Tech: Python, R, Spark, High-Performance Computing (HPC).
        Education: PhD in Physics, IIT Patna.
        """,
        "label": 0.5, # Partial: Overqualified/Executive level vs. Senior JD
        "test_case": "Overqualification & Academic-to-Industry Mapping"
    },
    
    {
        "text": """
        Candidate #99 (Unstructured)
        py.thon - ten.sor.flow - py.torch - aws - s.q.l
        Work: 5 years doing ML models in the cloud. 2018-2023.
        """,
        "label": 1.0, # Good: Tests if hybrid_parser.py can handle messy text
        "test_case": "Noisy Text / Hybrid Parser Robustness"
    },
    {
        "text": """
        Sarah Lee
        ML Engineer with 5 years experience.
        Skills: Python, PyTorch, GCP, SQL.
        Experience deploying ML pipelines.
        """,
        "label": 1.0
    },
    {
        "text": """
        Alex Brown
        Junior Data Analyst, 2 years experience.
        Skills: Excel, SQL, Tableau, basic Python.
        No cloud or ML production experience.
        """,
        "label": 0.0
    }
]


## Part 3: Full Evaluation on Synthetic Dataset

Score every resume in the labeled synthetic set against the job description, then compare the predicted scores with the manual labels.

In [17]:
system = EnhancedResumeMatchingSystem()

results = system.score_resumes_detailed(
    job_description,
    [r["text"] for r in resumes]
)

BertModel LOAD REPORT from: sentence-transformers/all-MiniLM-L6-v2
Key                     | Status
------------------------+-----------
embeddings.position_ids | UNEXPECTED

Notes:
- UNEXPECTED: can be ignored when loading from different task/architecture; not ok if you expect identical arch.


In [19]:
# Combine predictions with manual labels

records = []

for result in results:
    idx = result['resume_id']
    predicted_score = result.get('final_score', result.get('overall_score', 0))
    manual_label = resumes[idx]['label']

    records.append({
        "Candidate_ID": idx,
        "Predicted_Score": predicted_score,
        "Manual_Label": manual_label
    })


df = pd.DataFrame(records)
df = df.sort_values(by="Predicted_Score", ascending=False)

print("\n===== Evaluation Results =====\n")
display(df)


===== Evaluation Results =====



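|   | Candidate_ID | Predicted_Score | Manual_Label |
|---|--------------|-----------------|--------------|
| 0 | 0            | 0.6049          | 1.0          |
| 1 | 5            | 0.4715          | 1.0          |
| 2 | 1            | 0.4139          | 0.5          |
| 3 | 4            | 0.3878          | 0.5          |
| 4 | 7            | 0.3780          | 0.0          |
| 5 | 6            | 0.3701          | 1.0          |
| 6 | 3            | 0.3190          | 0.0          |
| 7 | 2            | 0.2841          | 0.0          |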


In [20]:
# =========================
# Ranking Evaluation Metrics
# =========================

# Convert to arrays for ranking metrics.
# Both arrays are ordered by Candidate_ID so labels and scores line up.
score_by_id = {item["Candidate_ID"]: item["Predicted_Score"] for item in records}
true_relevance = np.array([r["label"] for r in resumes]).reshape(1, -1)
predicted_scores = np.array(
    [score_by_id[i] for i in range(len(resumes))]
).reshape(1, -1)

# nDCG (best for ranking tasks)
ndcg = ndcg_score(true_relevance, predicted_scores)

# Precision@K: fraction of the K highest-scored resumes labeled as a good match (1.0).
# df is already sorted by Predicted_Score in descending order.
K = 3
top_k_labels = df.iloc[:K]["Manual_Label"]
precision_at_k = np.sum(top_k_labels == 1.0) / K

print(f"nDCG Score: {ndcg:.3f}")
print(f"Precision@{K}: {precision_at_k:.3f}")


nDCG Score: 0.966
Precision@3: 0.667
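
To make the reported nDCG easier to audit, here is a minimal NumPy sketch of the same computation. It assumes linear gain with a log2 position discount (the formulation documented for scikit-learn's `ndcg_score`) and no tied scores. The helper names `dcg` and `ndcg` are illustrative and are not part of the matching system; the labels and scores are copied from the results table above, ordered by `Candidate_ID`.

```python
import numpy as np


def dcg(relevance_in_ranked_order):
    """Discounted cumulative gain: sum of rel_i / log2(i + 1) for rank i = 1..n."""
    rel = np.asarray(relevance_in_ranked_order, dtype=float)
    ranks = np.arange(1, rel.size + 1)
    return float(np.sum(rel / np.log2(ranks + 1)))


def ndcg(true_relevance, predicted_scores):
    """DCG of the predicted ordering divided by the DCG of the ideal ordering."""
    true_relevance = np.asarray(true_relevance, dtype=float)
    predicted_scores = np.asarray(predicted_scores, dtype=float)
    # Rank candidates from highest to lowest predicted score.
    order = np.argsort(-predicted_scores, kind="stable")
    # The ideal ordering places the highest labels first.
    ideal = np.sort(true_relevance)[::-1]
    ideal_dcg = dcg(ideal)
    return dcg(true_relevance[order]) / ideal_dcg if ideal_dcg > 0 else 0.0


# Manual labels and predicted scores, indexed by Candidate_ID (0..7).
labels = [1.0, 0.5, 0.0, 0.0, 0.5, 1.0, 1.0, 0.0]
scores = [0.6049, 0.4139, 0.2841, 0.3190, 0.3878, 0.4715, 0.3701, 0.3780]

manual_ndcg = ndcg(labels, scores)
print(f"Manual nDCG: {manual_ndcg:.3f}")
```

The only ordering error that costs anything here is candidate 6 (label 1.0) landing at rank 6, below two partial matches and one poor match, which is why the score falls short of 1.0.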


## Part 4: Evaluation Strategy


If we had a larger labeled dataset, the most important metrics would be:

1. nDCG (Normalized Discounted Cumulative Gain)
   - Well suited to ranked retrieval systems
   - Supports graded relevance (1, 0.5, 0)
   - Rewards correct ordering at the top, since lower ranks are discounted

2. Precision@K
   - Measures what fraction of the top-K resumes are truly strong matches
   - Very important for recruiter workflows (only the top few are reviewed)

3. Recall@K
   - Measures what fraction of all strong matches appear in the top K
   - Ensures strong candidates are not missed

4. ROC-AUC (binary framing)
   - Requires collapsing the graded labels into good vs. not good
   - Measures separability: how often a good candidate is scored above a poor one

For hiring use cases, nDCG and Precision@K are the most critical,
because ranking quality at the top matters more than overall classification.
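
The binary-framed metrics in the list above can be sketched in plain NumPy. This is an illustrative sketch, not the notebook's evaluation code. It assumes that only label 1.0 counts as a positive, so partial matches (0.5) are treated as negatives; that is one possible choice and a different threshold would change the numbers. The helper names `precision_recall_at_k` and `roc_auc` are hypothetical.

```python
import numpy as np


def precision_recall_at_k(binary_labels, scores, k):
    """Precision@K and Recall@K for a single ranked list."""
    labels = np.asarray(binary_labels, dtype=int)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = int(labels[order][:k].sum())      # positives found in the top K
    total_positives = int(labels.sum())      # positives in the whole list
    precision = hits / k
    recall = hits / total_positives if total_positives else 0.0
    return precision, recall


def roc_auc(binary_labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Tied pairs count as half a correct pair.
    """
    labels = np.asarray(binary_labels, dtype=int)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    diff = pos[:, None] - neg[None, :]       # every positive vs. every negative
    correct = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(correct / (pos.size * neg.size))


# Graded labels and predicted scores from the evaluation table, by Candidate_ID.
graded = np.array([1.0, 0.5, 0.0, 0.0, 0.5, 1.0, 1.0, 0.0])
scores = np.array([0.6049, 0.4139, 0.2841, 0.3190, 0.3878, 0.4715, 0.3701, 0.3780])

# Assumption: only a full match (1.0) is a positive in the binary framing.
binary = (graded == 1.0).astype(int)

p_at_3, r_at_3 = precision_recall_at_k(binary, scores, k=3)
auc = roc_auc(binary, scores)
print(f"Precision@3: {p_at_3:.3f}  Recall@3: {r_at_3:.3f}  ROC-AUC: {auc:.3f}")
```

On a list this small, each metric moves in large steps (one misplaced resume shifts Precision@3 by a third), so these numbers are best read as a sanity check rather than a performance estimate.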
