# HR Workflow Simulation: CV Analysis & Role Matching System 🟢

## 🎯 Project Overview

Welcome to the **CV Analyzer Project**! In this comprehensive tutorial, you'll build an intelligent CV analysis system that processes candidate resumes and matches them to predefined job roles.

## 🧠 Manual CV Analysis Challenge

Before we dive into automation, let's experience what HR professionals do every day.

**📝 Your First Task**: Explore the CVs in the `data/cvs` folder and manually extract key information from each CV. For each CV, try to identify:

1. **Personal Details**: Name, contact information, location
2. **Key Skills**: Technical skills, soft skills, tools
3. **Experience**: Years of experience, previous roles, responsibilities
4. **Education**: Degrees, institutions, graduation years
5. **Role Fit**: Which job role would be the best match? Developer, QA, or DevOps?

**🔍 Instructions**:
1. Open the text files in the `data/cvs` folder
2. For each CV, create a summary with the information above
3. Estimate how suitable the candidate is for each role (score 1-10)
4. Track how long it takes you to process each CV

**📝 Manual Analysis Template**:
```
Candidate Name: 
Contact Info: 
Location: 
Key Skills:
Years of Experience:
Previous Positions:
Education:
Best Role Match:
Fit Score (1-10):
Time spent analyzing: ___ minutes
```

**💭 Reflection**: After analyzing 2-3 CVs, consider:
- How long did this take you?
- How consistent were your evaluations?
- How confident are you in your role recommendations?
- Could you maintain this quality for dozens of CVs?

## 🤖 Finding it Difficult? Let's Automate!

If you found the manual CV analysis:
- Time-consuming (taking 5-10 minutes per CV)
- Prone to inconsistency and human error
- Difficult to scale when processing dozens of applications
- Challenging to standardize across different HR team members

... then you're experiencing the exact challenges that HR professionals face every day!

Our automated solution will:
- Process text-based CVs in seconds
- Extract key entities (name, skills, experience, education) into a structured profile
- Match candidates to job roles systematically
- Generate consistent reports for hiring decisions

**Skills Focus**:
- Entity extraction using prompt engineering
- Text classification with OpenAI
- Basic candidate scoring
- Structured output generation

**Input**: Text files (.txt) containing candidate CVs  
**Output**: Candidate profiles with role recommendations

## 📝 What You'll Learn

1. **Prompt Engineering**: Design effective prompts for entity extraction
2. **Text Classification**: Categorize candidates by role fit
3. **Structured Data**: Convert unstructured CV text into organized profiles
4. **HR Decision Making**: Generate actionable hiring recommendations

## 🔍 Project Workflow

1. **Load Text CVs** from the data folder
2. **Extract Entities** (name, skills, experience, education)
3. **Calculate Role Fit** scores for each position
4. **Generate Reports** with hiring recommendations
5. **Batch Process** multiple candidates efficiently
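
Before any AI is involved, the five steps above can be sketched as a plain-Python skeleton. The function names (`extract_entities_stub`, `score_roles_stub`, `build_report`, `run_pipeline`) and the keyword lists are hypothetical placeholders, and the keyword matching is only a stand-in for the LLM calls built later in this notebook.

```python
from typing import Dict, List

# Hypothetical keyword lists; the real role definitions come later in the notebook.
ROLE_KEYWORDS: Dict[str, List[str]] = {
    "Software Developer": ["python", "javascript", "sql"],
    "QA Engineer": ["selenium", "test automation", "api testing"],
    "DevOps Engineer": ["docker", "kubernetes", "ci/cd"],
}


def extract_entities_stub(cv_text: str) -> dict:
    """Step 2 stand-in: treat the first non-empty line as the candidate name."""
    lines = [line.strip() for line in cv_text.splitlines() if line.strip()]
    return {"name": lines[0] if lines else "Unknown", "text": cv_text.lower()}


def score_roles_stub(entities: dict) -> Dict[str, float]:
    """Step 3 stand-in: percentage of role keywords found in the CV text."""
    scores = {}
    for role, keywords in ROLE_KEYWORDS.items():
        hits = sum(1 for kw in keywords if kw in entities["text"])
        scores[role] = round(100 * hits / len(keywords), 1)
    return scores


def build_report(entities: dict, scores: Dict[str, float]) -> dict:
    """Step 4 stand-in: pick the highest-scoring role."""
    best_role = max(scores, key=scores.get)
    return {"name": entities["name"], "best_role": best_role, "scores": scores}


def run_pipeline(cv_texts: List[str]) -> List[dict]:
    """Step 5: apply steps 2-4 to every CV (step 1, loading, is assumed done)."""
    reports = []
    for text in cv_texts:
        entities = extract_entities_stub(text)
        reports.append(build_report(entities, score_roles_stub(entities)))
    return reports


# Made-up sample CV text, for illustration only.
sample_cv = "Jane Doe\nSkills: Docker, Kubernetes, CI/CD, Python"
print(run_pipeline([sample_cv]))
```

The shape of the pipeline stays the same once the stubs are swapped for model calls: each step takes the previous step's output and returns a plain dictionary.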

## Let's Start Building Our CV Analysis System

Now that you understand the challenges of manual CV analysis, let's build an automated solution. Our system will:

1. Read CV text files from the `data/cvs` folder 
2. Use AI to extract structured information
3. Match candidates to appropriate job roles
4. Generate comprehensive HR reports

This approach will save hours of manual work and provide consistent results across all candidates.

In [None]:
# Step 1: Install Required Libraries

# Basic libraries for Project 1
!pip install langchain openai

In [None]:
# Note: You might need to restart your kernel after installation
import sys
print("Python version:", sys.version)
print("\n✅ Installation complete! Restart kernel if needed.")

## Setup Environment and Import Libraries 🔧

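First, we'll set up our environment and import the libraries needed for the CV analysis:

- **langchain**: For working with language models and prompt templates
- **openai**: The OpenAI API client
- **re**: For regular expressions (pattern matching)
- **os** and **pathlib**: For file and path handling
- **json**: For parsing the model's JSON responses
- **enum**, **typing**, **datetime**: For role definitions, type hints, and timestamps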

In [None]:
# Step 2: Import Libraries and Setup Environment

import os
import re
import json
from typing import Dict, List, Optional
from datetime import datetime
from pathlib import Path
from enum import Enum

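# For AI/LLM operations
# Note: these are legacy LangChain import paths; newer releases move the
# OpenAI wrapper into the separate langchain-openai package.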
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Read API key from config.txt file
config_file = Path('config.txt')
openai_api_key = None

if config_file.exists():
    with open(config_file, 'r') as f:
        for line in f:
            if line.startswith('OPENAI_API_KEY='):
                # Split on the first '=' only, so keys containing '=' stay intact
                openai_api_key = line.strip().split('=', 1)[1]
                break

if openai_api_key:
    print("✅ OpenAI API key loaded successfully from config.txt!")
    print(f"🔑 API key starts with: {openai_api_key[:8]}...")
else:
    print("❌ OpenAI API key not found!")
    print("Please make sure config.txt file exists with your OPENAI_API_KEY")
    print("Example content: OPENAI_API_KEY=sk-your-key-here")

print("\n📚 All libraries imported successfully!")
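
The config reader above expects an exact `OPENAI_API_KEY=` prefix. A slightly more tolerant reader might look like the sketch below, which assumes the same `KEY=VALUE` file layout and also accepts comments, spaces around `=`, and quoted values. The helper name `read_config_value` is hypothetical, and the key in the demo is a fake placeholder.

```python
from pathlib import Path
from typing import Optional
import tempfile


def read_config_value(path: Path, key: str) -> Optional[str]:
    """Return the value for `key` from a KEY=VALUE file, or None if absent."""
    if not path.exists():
        return None
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")  # splits on the first '=' only
        if sep and name.strip() == key:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip('"').strip("'") or None
    return None


# Demonstration with a throwaway file and a fake key (not a real credential).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "config.txt"
    demo.write_text('# demo\nOPENAI_API_KEY = "sk-demo=abc"\n', encoding="utf-8")
    demo_key = read_config_value(demo, "OPENAI_API_KEY")
    missing_key = read_config_value(demo, "OTHER_KEY")

print(demo_key)
```

Returning `None` for a missing file or key lets the calling cell keep its existing `if openai_api_key:` check unchanged.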

### Define Job Roles and Requirements 📋

Before we can analyze CVs, we need to define the job roles and their requirements. This will be used later for candidate matching.

In [None]:
# Define job roles and evaluation criteria

class ExperienceLevel(Enum):
    JUNIOR = "Junior (0-2 years)"
    MID = "Mid-level (2-5 years)"
    SENIOR = "Senior (5+ years)"
    LEAD = "Lead/Principal (8+ years)"

class JobRole(Enum):
    DEVELOPER = "Software Developer"
    QA_ENGINEER = "QA Engineer"
    DEVOPS_ENGINEER = "DevOps Engineer"

class RoleRequirements:
    """Define requirements for each job role"""
    def __init__(self, role, required_skills, preferred_skills, 
                min_experience, education_requirements, soft_skills):
        self.role = role
        self.required_skills = required_skills
        self.preferred_skills = preferred_skills
        self.min_experience = min_experience
        self.education_requirements = education_requirements
        self.soft_skills = soft_skills
    
# Define role requirements
ROLE_DEFINITIONS = {
    JobRole.DEVELOPER: RoleRequirements(
        role=JobRole.DEVELOPER,
        required_skills=["Python", "JavaScript", "SQL", "Git", "REST APIs"],
        preferred_skills=["React", "Node.js", "Django", "AWS", "Docker"],
        min_experience=2,
        education_requirements=["Computer Science", "Software Engineering", "Information Technology"],
        soft_skills=["Problem-solving", "Teamwork", "Communication", "Adaptability"]
    ),
    
    JobRole.QA_ENGINEER: RoleRequirements(
        role=JobRole.QA_ENGINEER,
        required_skills=["Test Automation", "Selenium", "API Testing", "SQL", "Bug Tracking"],
        preferred_skills=["Cypress", "TestNG", "Postman", "JIRA", "Performance Testing"],
        min_experience=1,
        education_requirements=["Computer Science", "Engineering", "Information Technology"],
        soft_skills=["Attention to Detail", "Analytical Thinking", "Documentation", "Patience"]
    ),
    
    JobRole.DEVOPS_ENGINEER: RoleRequirements(
        role=JobRole.DEVOPS_ENGINEER,
        required_skills=["AWS", "Docker", "Kubernetes", "CI/CD", "Linux", "Monitoring"],
        preferred_skills=["Terraform", "Jenkins", "Ansible", "Prometheus", "ELK Stack"],
        min_experience=3,
        education_requirements=["Computer Science", "Systems Administration", "Engineering"],
        soft_skills=["System Thinking", "Troubleshooting", "Collaboration", "Continuous Learning"]
    )
}

print("🎯 Job roles and requirements defined!")
print("\n📋 Available Positions:")
for role, req in ROLE_DEFINITIONS.items():
    print(f"- {role.value}: {len(req.required_skills)} core skills, {req.min_experience}+ years experience")

print("\n✅ HR workflow configuration ready!")
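
The required and preferred skill lists can also drive a simple rule-based score, which is useful as a sanity check next to the LLM's `role_fit_score`. The sketch below is one possible baseline, not the scoring used later in this notebook: the function name `skill_overlap_score` and the 70/30 weighting are assumptions, and matching is by exact skill name (case-insensitive) only.

```python
from typing import Iterable, List


def skill_overlap_score(candidate_skills: Iterable[str],
                        required: List[str],
                        preferred: List[str],
                        required_weight: float = 0.7) -> float:
    """Score 0-100 from case-insensitive exact matches on skill names."""
    have = {s.strip().lower() for s in candidate_skills}

    def coverage(wanted: List[str]) -> float:
        # Fraction of the wanted skills that the candidate lists.
        if not wanted:
            return 0.0
        return sum(1 for s in wanted if s.lower() in have) / len(wanted)

    score = (required_weight * coverage(required)
             + (1 - required_weight) * coverage(preferred))
    return round(100 * score, 1)


# Skill lists copied from the Software Developer definition above.
dev_required = ["Python", "JavaScript", "SQL", "Git", "REST APIs"]
dev_preferred = ["React", "Node.js", "Django", "AWS", "Docker"]

example_score = skill_overlap_score(
    ["python", "SQL", "Git", "Docker"], dev_required, dev_preferred
)
print(example_score)
```

A baseline like this cannot recognise synonyms ("Postgres" versus "SQL"), which is one reason the notebook delegates the final judgement to the language model.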

## Create Project Directories 📁

Let's create the necessary directories for our project:

In [None]:
# Create project directory structure
def create_project_structure():
    """
    Create folder structure for HR workflow simulation.
    """
    folders = [
        'output/reports',      # Generated reports
        'output/matches',      # Role matching results
        'data/job_roles',      # Job role definitions
    ]
    
    print("📁 Creating project directory structure...")
    for folder in folders:
        try:
            os.makedirs(folder, exist_ok=True)
            print(f"✅ Created: {folder}")
        except Exception as e:
            print(f"❌ Error creating {folder}: {str(e)}")
    
    print("\n🏢 Project Structure:")
    print("├── data/")
    print("│   ├── cvs/         # CV files for analysis")
    print("│   └── job_roles/   # Role definitions")
    print("└── output/")
    print("    ├── reports/     # HR reports")
    print("    └── matches/     # Role matching")
    print("\n📝 Note: We'll use CVs from the data/cvs folder for analysis")

create_project_structure()

## Setup LLM and Prompt Templates 🤖

Now we'll initialize our language model and create prompt templates for entity extraction and role classification:

In [None]:
# Initialize OpenAI LLM
if openai_api_key:
    llm = OpenAI(
        temperature=0,  # Low temperature for consistent, factual responses
        openai_api_key=openai_api_key,
        model_name="gpt-3.5-turbo-instruct"  # Cost-effective model
    )
    print("🤖 OpenAI LLM initialized successfully!")
else:
    print("❌ Cannot initialize LLM without API key")
    llm = None

# Entity Extraction Prompt Template
entity_extraction_prompt = PromptTemplate(
    input_variables=["cv_text"],
    template="""
    You are an expert HR assistant analyzing a candidate's CV. Extract the following information:
    
    CV Text:
    {cv_text}
    
    Extract and return ONLY the following information in JSON format:
    {{
        "name": "Candidate's full name",
        "email": "Email address",
        "phone": "Phone number", 
        "location": "City, Country",
        "skills": ["List of all technical and soft skills"],
        "experience": [
            {{
                "company": "Company name",
                "position": "Job title",
                "duration": "Employment period",
                "responsibilities": "Key responsibilities"
            }}
        ],
        "education": [
            {{
                "degree": "Degree name",
                "institution": "University/School name", 
                "year": "Graduation year",
                "field": "Field of study"
            }}
        ],
        "total_experience_years": "Number of years of relevant work experience"
    }}
    
    Be thorough but concise. Extract only factual information present in the CV.
    """
)

# Role Classification Prompt Template
role_classification_prompt = PromptTemplate(
    input_variables=["candidate_profile", "role_requirements"],
    template="""
    You are an HR specialist evaluating a candidate for a specific role.
    
    Candidate Profile:
    {candidate_profile}
    
    Role Requirements:
    {role_requirements}
    
    Analyze the candidate's fit for this role and return a JSON response:
    {{
        "role_fit_score": "Score from 0-100 based on skills and experience match",
        "matching_skills": ["List of candidate skills that match role requirements"],
        "missing_skills": ["List of required skills the candidate lacks"],
        "experience_level": "Junior/Mid-level/Senior based on years and complexity",
        "strengths": ["Top 3 strengths for this role"],
        "concerns": ["Top 3 concerns or gaps"],
        "recommendation": "Strong Fit/Good Fit/Moderate Fit/Poor Fit",
        "justification": "Brief explanation of the recommendation"
    }}
    
    Be objective and consider both technical skills and experience relevance.
    """
)

print("✅ Prompt templates created!")
print("📋 Templates ready for:")
print("- Entity extraction from text CVs")
print("- Role-based candidate classification")
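
Both templates double the braces around their JSON skeleton. Assuming `PromptTemplate` is used with its default f-string style formatting, which follows the same escaping rules as Python's `str.format`, `{{` and `}}` render as literal braces while `{cv_text}` is substituted. The sketch below shows the effect with plain `str.format` and a shortened, hypothetical template, so it runs without LangChain.

```python
import json

# A shortened, hypothetical version of the extraction template.
mini_template = """CV Text:
{cv_text}

Return JSON in this shape:
{{"name": "Candidate's full name", "skills": ["..."]}}
"""

# Substitute the placeholder; doubled braces collapse to single braces.
rendered = mini_template.format(cv_text="Jane Doe, Python developer")
print(rendered)

# The last non-empty line of the rendered prompt is now valid JSON.
shape_line = rendered.strip().splitlines()[-1]
shape = json.loads(shape_line)
```

Forgetting to double a brace in the JSON skeleton would make the formatter treat it as a placeholder and fail when the prompt is rendered.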

## Implement Basic CV Analyzer 🔍

Let's implement the core functionality of our CV Analyzer for Project 1:

In [None]:
# Project 1: Core Functions for Text CV Processing

class Project1_BasicExtractor:
    """
    Basic CV analyzer for text-based documents (Project 1)
    Focus: Entity extraction and role classification
    """
    
    def __init__(self):
        self.results = []
        self.processed_count = 0
        print("🟢 Project 1: Basic Entity Extractor initialized!")
    
    def load_text_cv(self, file_path: str) -> str:
        """
        Load text content from a .txt CV file.
        
        Args:
            file_path (str): Path to the text CV file
            
        Returns:
            str: CV content as text
        """
        try:
            with open(file_path, 'r', encoding='utf-8') as file:
                content = file.read().strip()
                print(f"📝 Loaded CV: {os.path.basename(file_path)} ({len(content)} characters)")
                return content
        except Exception as e:
            print(f"❌ Error loading {file_path}: {str(e)}")
            return ""
    
    def extract_entities(self, cv_text: str) -> dict:
        """
        Extract structured entities from CV text using AI.
        
        Args:
            cv_text (str): Raw CV content
            
        Returns:
            dict: Extracted entities
        """
        if not llm:
            return {"error": "LLM not initialized"}
        
        try:
            print("🔄 Extracting entities from CV...")
            chain = LLMChain(llm=llm, prompt=entity_extraction_prompt)
            result = chain.run(cv_text=cv_text)
            
            # Parse JSON response
            try:
                entities = json.loads(result.strip())
                print("✅ Entities extracted successfully")
                return entities
            except json.JSONDecodeError:
                print("⚠️ JSON parsing failed, using raw response")
                return {"raw_response": result.strip()}
                
        except Exception as e:
            print(f"❌ Error extracting entities: {str(e)}")
            return {"error": str(e)}
    
    def classify_for_role(self, candidate_profile: dict, target_role: JobRole) -> dict:
        """
        Classify candidate fit for a specific role.
        
        Args:
            candidate_profile (dict): Extracted candidate information
            target_role (JobRole): Role to evaluate against
            
        Returns:
            dict: Role fit analysis
        """
        if not llm:
            return {"error": "LLM not initialized"}
        
        role_req = ROLE_DEFINITIONS[target_role]
        
        try:
            print(f"🎯 Analyzing fit for {target_role.value}...")
            
            # Prepare role requirements summary
            requirements_summary = {
                "role": target_role.value,
                "required_skills": role_req.required_skills,
                "preferred_skills": role_req.preferred_skills,
                "min_experience": role_req.min_experience,
                "education": role_req.education_requirements,
                "soft_skills": role_req.soft_skills
            }
            
            chain = LLMChain(llm=llm, prompt=role_classification_prompt)
            result = chain.run(
                candidate_profile=json.dumps(candidate_profile, indent=2),
                role_requirements=json.dumps(requirements_summary, indent=2)
            )
            
            # Parse JSON response
            try:
                classification = json.loads(result.strip())
                classification["target_role"] = target_role.value
                print(f"✅ Role analysis completed - {classification.get('recommendation', 'Unknown')} fit")
                return classification
            except json.JSONDecodeError:
                print("⚠️ JSON parsing failed for classification")
                return {"target_role": target_role.value, "raw_response": result.strip()}
                
        except Exception as e:
            print(f"❌ Error in role classification: {str(e)}")
            return {"target_role": target_role.value, "error": str(e)}
    
    def process_single_cv(self, cv_file_path: str) -> dict:
        """
        Complete processing pipeline for a single CV.
        
        Args:
            cv_file_path (str): Path to CV text file
            
        Returns:
            dict: Complete candidate analysis
        """
        print(f"\n🚀 Processing CV: {os.path.basename(cv_file_path)}")
        print("=" * 50)
        
        # Step 1: Load CV text
        cv_text = self.load_text_cv(cv_file_path)
        if not cv_text:
            return {"error": "Failed to load CV"}
        
        # Step 2: Extract entities
        entities = self.extract_entities(cv_text)
        if "error" in entities:
            return entities
        
        # Step 3: Classify for all roles
        role_analyses = {}
        for role in JobRole:
            analysis = self.classify_for_role(entities, role)
            role_analyses[role.value] = analysis
        
        # Step 4: Determine best fit role
        best_role = self.find_best_role_match(role_analyses)
        
        # Compile complete profile
        complete_profile = {
            "candidate_info": entities,
            "role_analyses": role_analyses,
            "best_role_match": best_role,
            "processed_at": datetime.now().isoformat(),
            "source_file": os.path.basename(cv_file_path)
        }
        
        self.processed_count += 1
        self.results.append(complete_profile)
        
        print(f"✅ CV processing completed! Best fit: {best_role['role']} ({best_role['score']}% match)")
        return complete_profile
    
    def find_best_role_match(self, role_analyses: dict) -> dict:
        """
        Find the role with the highest fit score.
        
        Args:
            role_analyses (dict): All role analysis results
            
        Returns:
            dict: Best matching role information
        """
        best_role = {"role": "Unknown", "score": 0, "recommendation": "No match"}
        
        for role_name, analysis in role_analyses.items():
            if "role_fit_score" in analysis:
                try:
                    # Tolerate scores returned as strings such as "85" or "85%"
                    score = float(str(analysis["role_fit_score"]).strip().rstrip("%"))
                    if score > best_role["score"]:
                        best_role = {
                            "role": role_name,
                            "score": score,
                            "recommendation": analysis.get("recommendation", "Unknown")
                        }
                except (ValueError, TypeError):
                    continue
        
        return best_role
    
    def save_results(self, output_dir: str = "output/reports") -> None:
        """
        Save analysis results to JSON files.
        
        Args:
            output_dir (str): Directory to save results
        """
        os.makedirs(output_dir, exist_ok=True)
        
        for result in self.results:
            # Fall back to "unknown" when the name is missing or null
            candidate_name = str(result["candidate_info"].get("name") or "unknown").replace(" ", "_").lower()
            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
            filename = f"{candidate_name}_{timestamp}.json"
            filepath = os.path.join(output_dir, filename)
            
            try:
                with open(filepath, 'w', encoding='utf-8') as f:
                    json.dump(result, f, indent=2)
                print(f"💾 Saved report: {filename}")
            except Exception as e:
                print(f"❌ Error saving report: {str(e)}")

# Initialize Project 1 extractor
project1_extractor = Project1_BasicExtractor()
print("\n🎯 Project 1 ready for text CV processing!")
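
`extract_entities` and `classify_for_role` fall back to a raw response whenever `json.loads` fails, which can happen if the model wraps its JSON in a Markdown fence or adds a sentence around it. A more forgiving parser might look like the sketch below. The helper name `parse_json_response` is hypothetical and it is not wired into the class above.

```python
import json
from typing import Optional

FENCE = "`" * 3  # Markdown code-fence marker


def parse_json_response(text: str) -> Optional[dict]:
    """Best-effort parse of a JSON object embedded in model output."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (it may carry a language tag) ...
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # ... and everything from the closing fence onward.
        cleaned = cleaned.rsplit(FENCE, 1)[0]
    # Keep only the span between the first '{' and the last '}'.
    start, end = cleaned.find("{"), cleaned.rfind("}")
    if start == -1 or end < start:
        return None
    try:
        parsed = json.loads(cleaned[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


wrapped = 'Here is the result:\n{"role_fit_score": 80}\nThanks'
print(parse_json_response(wrapped))
```

Returning `None` instead of raising keeps the caller's existing fallback path (`{"raw_response": ...}`) available when nothing parseable is found.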

## Analyzing CV Files in the Data Folder 📄

We'll analyze the CV files you manually reviewed in the `data/cvs` folder. Let's see what files are available:

In [None]:
# List the CV files for analysis
def list_cv_files():
    """
    List the CV files in the data/cvs folder that were manually analyzed.
    """
    cv_dir = os.path.join('data', 'cvs')
    
    # Ensure the directory exists
    if not os.path.exists(cv_dir):
        print(f"⚠️ Data directory {cv_dir} does not exist!")
        print("Creating the directory now...")
        os.makedirs(cv_dir, exist_ok=True)
    
    # List CV files with various extensions
    extensions = ['.txt', '.docx', '.pdf', '.doc']
    cv_files = []
    
    for ext in extensions:
        files = [f for f in os.listdir(cv_dir) if f.endswith(ext)]
        cv_files.extend(files)
    
    # Display results
    if cv_files:
        print(f"📊 Found {len(cv_files)} CV files that you analyzed manually")
        print("\nCV files available for automated analysis:")
        for i, file in enumerate(cv_files, 1):
            print(f"{i}. {file}")
        
        # Show file type statistics
        txt_files = len([f for f in cv_files if f.endswith('.txt')])
        other_files = len(cv_files) - txt_files
        print("\n📝 File type breakdown:")
        print(f"   - Text files (.txt): {txt_files} (will be processed in Project 1)")
        if other_files > 0:
            print(f"   - Other formats: {other_files} (will need Project 2 for processing)")
    else:
        print(f"⚠️ No CV files found in {cv_dir}/")
        print("Please add some sample CV files (in .txt format) to this folder to continue.")
    
    return cv_files

# List the CV files to be analyzed
cv_files = list_cv_files()
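
`list_cv_files` matches extensions case-sensitively, so a file named `CV.TXT` would be missed. A `pathlib`-based variant that lower-cases the suffix and groups files by type could look like the sketch below. The function name `group_cv_files` and the file names in the demo are made up for illustration.

```python
from pathlib import Path
from typing import Dict, List
import tempfile


def group_cv_files(cv_dir: Path) -> Dict[str, List[str]]:
    """Group file names in cv_dir by lower-cased extension."""
    supported = {".txt", ".docx", ".pdf", ".doc"}
    groups: Dict[str, List[str]] = {}
    if not cv_dir.is_dir():
        return groups
    for path in sorted(cv_dir.iterdir()):
        ext = path.suffix.lower()
        # Ignore sub-folders and unsupported file types.
        if path.is_file() and ext in supported:
            groups.setdefault(ext, []).append(path.name)
    return groups


# Demonstration on a temporary folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["alice.txt", "BOB.TXT", "carol.pdf", "notes.md"]:
        (Path(tmp) / name).write_text("sample", encoding="utf-8")
    demo_groups = group_cv_files(Path(tmp))

print(demo_groups)
```

Grouping by extension gives the same ".txt versus other formats" breakdown as the cell above without a second pass over the file list.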

## Process and Analyze CVs 🚀

Now, let's run our CV analyzer on the sample CVs:

In [None]:
# Process CVs from data folder

def process_cv_batch():
    """
    Process CVs from the data/cvs folder.
    """
    print("🚀 Starting CV Analysis with Project 1")
    print("=" * 60)
    
    # Check if API key is available
    if not openai_api_key:
        print("❌ OpenAI API key not configured")
        print("Please set up your config.txt file with OPENAI_API_KEY")
        return
    
    # Get CV files from data folder
    cv_dir = os.path.join('data', 'cvs')
    cv_files = []
    
    if os.path.exists(cv_dir):
        for file in os.listdir(cv_dir):
            # For Project 1, we'll focus on text files, but note which files we're skipping
            if file.endswith('.txt'):
                cv_files.append(file)
            elif file.endswith(('.docx', '.doc', '.pdf')):
                print(f"⚠️ Skipping {file} - Project 1 supports only .txt files")
                print("   (Use Project 2 for multi-format document processing)")
    
    if not cv_files:
        print("❌ No text CV files found in data/cvs/")
        print("The system expects text (.txt) CV files in this folder.")
        print("Please add text CV files to continue.")
        return
    
    # Display processing information
    print(f"📄 Processing {len(cv_files)} CV files from the data folder")
    
    # Process each CV
    for cv_file in cv_files:
        cv_path = os.path.join(cv_dir, cv_file)
        result = project1_extractor.process_single_cv(cv_path)
        
        if "error" not in result:
            print(f"\n📊 Results for {cv_file}:")
            print(f"- Name: {result['candidate_info'].get('name', 'Not found')}")
            print(f"- Best Role: {result['best_role_match']['role']} ({result['best_role_match']['score']}% fit)")
            print(f"- Skills Count: {len(result['candidate_info'].get('skills', []))}")
            print(f"- Experience: {result['candidate_info'].get('total_experience_years', 'Unknown')} years")
    
    # Save results to files
    project1_extractor.save_results()
    
    print(f"\n✅ CV Analysis completed! Processed {project1_extractor.processed_count} CVs")
    print("   HR summary report will be generated next")

# Run the analysis on CVs in the data folder
if cv_files:  # Only run if CV files were found
    process_cv_batch()
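
Batch runs make several API calls per CV, so a transient failure (a timeout or a rate limit) can cost a whole candidate. One common mitigation is retrying with exponential backoff. The sketch below is a generic helper under that assumption: `call_with_retry` is a hypothetical name, it is not part of LangChain or the OpenAI client, and the demo uses a fake function instead of a real API call.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(func: Callable[[], T], max_attempts: int = 3,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func(); after an exception wait base_delay * 2**n, then try again."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")


# Demo: a fake call that fails twice, then succeeds. Delays are recorded
# instead of slept so the example runs instantly.
attempts = {"count": 0}
delays = []


def flaky_call() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


retry_result = call_with_retry(flaky_call, sleep=delays.append)
print(retry_result, delays)
```

Because `process_single_cv` already catches exceptions internally, a helper like this would have to wrap the model call itself, for example `call_with_retry(lambda: chain.run(cv_text=cv_text))` inside `extract_entities`.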

## Generate HR Summary Report 📊

Now that we've processed all the CVs from the data folder, let's create an HR summary report that shows the best candidates for each role:

In [None]:
# Generate HR Summary Report

def generate_hr_summary():
    """
    Generate an HR summary report showing best candidates for each role.
    """
    if not project1_extractor.results:
        print("❌ No results to summarize. Process some CVs first.")
        return
    
    print("\n📊 HR SUMMARY REPORT")
    print("=" * 60)
    print(f"Total Candidates Processed: {len(project1_extractor.results)}")
    print(f"Processing Date: {datetime.now().strftime('%Y-%m-%d %H:%M')}")
    print("=" * 60)
    
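    # Organize candidates by best-fit role; start with every defined role
    # so that roles without any candidates still appear in the report
    candidates_by_role = {role.value: [] for role in JobRole}
    for result in project1_extractor.results:
        best_role = result["best_role_match"]["role"]
        # "Unknown" is used when no role produced a usable score
        candidates_by_role.setdefault(best_role, [])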
        
        candidates_by_role[best_role].append({
            "name": result["candidate_info"].get("name", "Unknown"),
            "score": result["best_role_match"]["score"],
            "recommendation": result["best_role_match"]["recommendation"],
            "experience": result["candidate_info"].get("total_experience_years", "Unknown"),
            "file": result["source_file"]
        })
    
    # Sort candidates by score for each role
    for role, candidates in candidates_by_role.items():
        candidates.sort(key=lambda x: x["score"], reverse=True)
        
        print(f"\n🎯 Role: {role}")
        print("-" * 40)
        
        for i, candidate in enumerate(candidates, 1):
            print(f"{i}. {candidate['name']}")
            print(f"   Score: {candidate['score']}% | Recommendation: {candidate['recommendation']}")
            print(f"   Experience: {candidate['experience']} years | Source: {candidate['file']}")
        
        if not candidates:
            print("No candidates matched this role.")
    
    # Generate recommendations
    print("\n🔍 HR RECOMMENDATIONS")
    print("-" * 40)
    
    for role, candidates in candidates_by_role.items():
        strong_candidates = [c for c in candidates if c["score"] >= 75]
        
        if strong_candidates:
            top_candidate = strong_candidates[0]
            print(f"✅ For {role}: Recommend interviewing {top_candidate['name']} ({top_candidate['score']}% match)")
        else:
            print(f"⚠️ For {role}: No strong candidates found. Consider expanding search.")
    
    # Save summary report
    try:
        os.makedirs('output/reports', exist_ok=True)
        report_file = f"output/reports/hr_summary_{datetime.now().strftime('%Y%m%d_%H%M%S')}.txt"
        
        with open(report_file, 'w', encoding='utf-8') as f:
            f.write("HR SUMMARY REPORT\n")
            f.write("=" * 60 + "\n")
            f.write(f"Total Candidates Processed: {len(project1_extractor.results)}\n")
            f.write(f"Processing Date: {datetime.now().strftime('%Y-%m-%d %H:%M')}\n")
            f.write("=" * 60 + "\n\n")
            
            for role, candidates in candidates_by_role.items():
                f.write(f"Role: {role}\n")
                f.write("-" * 40 + "\n")
                
                for i, candidate in enumerate(candidates, 1):
                    f.write(f"{i}. {candidate['name']}\n")
                    f.write(f"   Score: {candidate['score']}% | Recommendation: {candidate['recommendation']}\n")
                    f.write(f"   Experience: {candidate['experience']} years | Source: {candidate['file']}\n\n")
            
            f.write("\nHR RECOMMENDATIONS\n")
            f.write("-" * 40 + "\n")
            
            for role, candidates in candidates_by_role.items():
                strong_candidates = [c for c in candidates if c["score"] >= 75]
                
                if strong_candidates:
                    top_candidate = strong_candidates[0]
                    f.write(f"For {role}: Recommend interviewing {top_candidate['name']} ({top_candidate['score']}% match)\n")
                else:
                    f.write(f"For {role}: No strong candidates found. Consider expanding search.\n")
        
        print(f"\n✅ Summary report saved to {report_file}")
    except Exception as e:
        print(f"❌ Error saving summary report: {str(e)}")

# Generate the HR summary
generate_hr_summary()
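
The text report is convenient to read, but HR teams often want the same data in a spreadsheet. The sketch below flattens result dictionaries shaped like those in `project1_extractor.results` into CSV text. The function name `results_to_csv`, the column choice, and the two sample records are assumptions for illustration, not output from a real run.

```python
import csv
import io
from typing import List

FIELDS = ["name", "best_role", "score", "recommendation", "source_file"]


def results_to_csv(results: List[dict]) -> str:
    """Flatten analysis results into CSV text, highest score first."""
    rows = []
    for result in results:
        info = result.get("candidate_info", {})
        best = result.get("best_role_match", {})
        rows.append({
            "name": info.get("name", "Unknown"),
            "best_role": best.get("role", "Unknown"),
            "score": best.get("score", 0),
            "recommendation": best.get("recommendation", "Unknown"),
            "source_file": result.get("source_file", ""),
        })
    rows.sort(key=lambda row: row["score"], reverse=True)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up records that mimic the structure built by process_single_cv.
sample_results = [
    {"candidate_info": {"name": "Candidate A"},
     "best_role_match": {"role": "QA Engineer", "score": 62.0,
                         "recommendation": "Moderate Fit"},
     "source_file": "a.txt"},
    {"candidate_info": {"name": "Candidate B"},
     "best_role_match": {"role": "DevOps Engineer", "score": 88.0,
                         "recommendation": "Strong Fit"},
     "source_file": "b.txt"},
]
csv_text = results_to_csv(sample_results)
print(csv_text)
```

Writing to an in-memory buffer keeps the function easy to test; the returned string can then be saved under `output/reports` with a normal file write.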

## 🎉 Conclusion

In Project 1, you've learned how to:

✅ **Extract structured information** from plain text CVs using AI  
✅ **Classify candidates** against predefined job roles  
✅ **Calculate role fit scores** based on skills and experience  
✅ **Generate HR recommendations** for candidate selection  
✅ **Process multiple CVs** from a real data folder  

This project demonstrates how to automate the tedious task of CV analysis that HR professionals typically do manually. By analyzing CVs from the data folder, we've shown how this solution can be applied to real-world HR workflows.

In Project 2, we'll expand these capabilities to handle different document formats (PDF, Word, Excel) and generate more sophisticated outputs, making the system even more versatile for HR departments.

### Next Steps

To use this system with your own CVs:
1. Place your text CV files in the `data/cvs` folder
2. Run this notebook to process them automatically
3. Check the `output/reports` folder for detailed results