<a href="https://colab.research.google.com/github/mantuonweb/Google_Collab/blob/master/Agent_Resume_Matcher.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
%pip install langchain-openai langchain-community langchain-text-splitters langchain-core faiss-cpu python-dotenv pypdf

Collecting langchain-openai
  Downloading langchain_openai-1.1.6-py3-none-any.whl.metadata (2.6 kB)
Collecting langchain-community
  Downloading langchain_community-0.4.1-py3-none-any.whl.metadata (3.0 kB)
Collecting langchain-text-splitters
  Downloading langchain_text_splitters-1.1.0-py3-none-any.whl.metadata (2.7 kB)
Collecting faiss-cpu
  Downloading faiss_cpu-1.13.2-cp310-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (7.6 kB)
Collecting pypdf
  Downloading pypdf-6.5.0-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain-core
  Downloading langchain_core-1.2.5-py3-none-any.whl.metadata (3.7 kB)
Collecting langchain-classic<2.0.0,>=1.0.0 (from langchain-community)
  Downloading langchain_classic-1.0.1-py3-none-any.whl.metadata (4.2 kB)
Collecting requests<3.0.0,>=2.32.5 (from langchain-community)
  Downloading requests-2.32.5-py3-none-any.whl.metadata (4.9 kB)
Collecting dataclasses-json<0.7.0,>=0.6.7 (from langchain-community)
  Downloading dataclasses_json-0.6.7

In [None]:
import os
import shutil
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader, DirectoryLoader, PyPDFLoader
from dotenv import load_dotenv
from google.colab import userdata

# Load API key
os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')
load_dotenv()
print("✓ Setup complete")
# Show only the key-type prefix so the secret itself stays out of saved output
print("API Key:", os.environ.get("OPENAI_API_KEY", "Not set")[:8] + "...")

# Create folders if they don't exist
os.makedirs("./resumes", exist_ok=True)
os.makedirs("./resume_db", exist_ok=True)
print("✓ Folders created: ./resumes and ./resume_db")


✓ Setup complete
API Key: sk-proj-...
✓ Folders created: ./resumes and ./resume_db
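
The setup cell prints a prefix of the API key to confirm that it loaded. A minimal sketch of a more conservative check follows. `mask_secret` is a hypothetical helper, not part of this notebook or of any library, and it assumes that showing only the last few characters is enough to tell one key from another.

```python
import os


def mask_secret(value, visible=4):
    """Return a display-safe version of a secret (hypothetical helper)."""
    if not value:
        return "Not set"
    if len(value) <= visible:
        # Too short to reveal any part of it safely
        return "*" * len(value)
    # Hide everything except the last `visible` characters
    return "*" * 8 + value[-visible:]


print("API Key:", mask_secret(os.environ.get("OPENAI_API_KEY")))
```

Because the masked string never contains the start of the key, it is safer to leave in a notebook that gets saved or shared.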


In [None]:
import shutil
import os

# Remove corrupted database
if os.path.exists("./resume_db"):
    shutil.rmtree("./resume_db")
    print("✓ Cleaned up old database")

# Recreate folder
os.makedirs("./resume_db", exist_ok=True)
print("✓ Ready for fresh start")


✓ Cleaned up old database
✓ Ready for fresh start


In [None]:
import shutil
import os
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import TextLoader, DirectoryLoader, PyPDFLoader
from dotenv import load_dotenv
from google.colab import userdata

# Initialize embeddings globally
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

def ingest_resumes():
    """Load resumes from ./resumes folder and add to vector database"""
    print("📥 Ingesting resumes...")

    # Load text files
    txt_loader = DirectoryLoader("./resumes", glob="**/*.txt", loader_cls=TextLoader)
    txt_docs = txt_loader.load()

    # Load PDF files
    pdf_loader = DirectoryLoader("./resumes", glob="**/*.pdf", loader_cls=PyPDFLoader)
    pdf_docs = pdf_loader.load()

    all_docs = txt_docs + pdf_docs

    if not all_docs:
        print("❌ No resumes found in ./resumes folder")
        return

    # PyPDFLoader yields one document per page, so count distinct source files
    resume_count = len({doc.metadata.get("source") for doc in all_docs})

    # Split documents into chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = text_splitter.split_documents(all_docs)

    # Check if FAISS index file exists (not just folder)
    db_file_exists = os.path.exists("./resume_db/index.faiss")

    if db_file_exists:
        # Load existing and add new documents.
        # The index is unpickled on load, so only load files created by this notebook.
        vectorstore = FAISS.load_local("./resume_db", embeddings, allow_dangerous_deserialization=True)
        vectorstore.add_documents(chunks)
        print(f"✓ Added {len(chunks)} chunks from {resume_count} resumes")
    else:
        # Create new vector store
        vectorstore = FAISS.from_documents(chunks, embeddings)
        print(f"✓ Created new database with {len(chunks)} chunks from {resume_count} resumes")

    vectorstore.save_local("./resume_db")
    print("✓ Database saved successfully")


def list_resumes():
    """List all resumes stored in vector database"""
    print("📋 Listing resumes...")

    if not os.path.exists("./resume_db/index.faiss"):
        print("❌ No database found. Please ingest resumes first.")
        return

    vectorstore = FAISS.load_local("./resume_db", embeddings, allow_dangerous_deserialization=True)

    # Get all documents
    all_docs = vectorstore.docstore._dict

    # Extract unique sources
    sources = set()
    for doc in all_docs.values():
        if hasattr(doc, 'metadata') and 'source' in doc.metadata:
            sources.add(os.path.basename(doc.metadata['source']))

    print(f"\n✓ Found {len(sources)} resumes in database:")
    for i, source in enumerate(sorted(sources), 1):
        print(f"  {i}. {source}")


def search_resumes(skills):
    """Search resumes by skills and return best matches"""
    print(f"🔍 Searching for candidates with skills: {skills}")

    if not os.path.exists("./resume_db/index.faiss"):
        print("❌ No database found. Please ingest resumes first.")
        return

    vectorstore = FAISS.load_local("./resume_db", embeddings, allow_dangerous_deserialization=True)

    # Search for relevant resume chunks
    retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
    docs = retriever.invoke(skills)

    # Create context from retrieved documents
    context = "\n\n".join([f"Resume {i+1}:\n{doc.page_content}" for i, doc in enumerate(docs)])

    # Create prompt for LLM (adjacent string literals are joined into one prompt)
    prompt = (
        "You are a recruiter assistant. Based on the following resume excerpts, "
        "identify and rank the best candidates for the required skills.\n\n"
        f"Required Skills: {skills}\n\n"
        f"Resume Excerpts:\n{context}\n\n"
        "Please provide a quick summary for the top 3 best matching candidates. "
        "For each candidate, include their relevant skills, why they are a good fit, "
        "and a matching percentage. The response should be concise.\n\n"
        "Answer:"
    )

    # Get LLM response
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    response = llm.invoke(prompt)

    print("\n" + "="*60)
    print("🎯 SEARCH RESULTS")
    print("="*60)
    print(response.content)
    print("="*60)

    return response.content


def clear_resumes():
    """Clear all resumes from vector database"""
    print("🗑️  Clearing resume database...")

    if os.path.exists("./resume_db"):
        shutil.rmtree("./resume_db")
        print("✓ Database cleared successfully")
    else:
        print("❌ No database found")

print("✓ Agent functions loaded successfully")

✓ Agent functions loaded successfully
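
`ingest_resumes` splits each document with `chunk_size=1000` and `chunk_overlap=200`. The sketch below illustrates what those two numbers mean using a plain sliding window. It is a simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph and line breaks, so its real chunk boundaries will differ. `sliding_chunks` is a hypothetical name used only for this illustration.

```python
def sliding_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows that share `chunk_overlap` characters.

    Illustration only: this is not how LangChain's splitter picks boundaries.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

Under this sketch a text shorter than `chunk_size` stays in a single chunk, which is consistent with the short sample resumes in this notebook each producing one chunk.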


In [None]:
def generate_resume(data):
    """Generate a text resume from data dictionary"""
    resume = []

    # Header
    resume.append(data['name'].upper())
    resume.append(f"{data['email']} | {data['phone']} | {data['location']}")
    resume.append("")

    # Skills
    resume.append("SKILLS")
    resume.append(", ".join(data['skills']))
    resume.append("")

    # Experience
    resume.append("EXPERIENCE")
    for exp in data['experiences']:
        resume.append(f"{exp['title']} | {exp['company']} | {exp['duration']}")
        for resp in exp['responsibilities']:
            resume.append(f"- {resp}")
        resume.append("")

    # Education
    resume.append("EDUCATION")
    edu = data['education']
    resume.append(f"{edu['degree']} | {edu['institution']} | {edu['year']}")

    return "\n".join(resume)


def save_resume(data, filepath):
    """Save resume to file"""
    with open(filepath, 'w', encoding='utf-8') as f:
        f.write(generate_resume(data))


# Example usage
if __name__ == "__main__":
    resume_data = {
        "name": "Mantu Nigam",
        "email": "mantu.nigam@email.com",
        "phone": "+91-9876543210",
        "location": "Bangalore",
        "skills": ["Python", "React", "Angular", "Nest", "Html", "CSS", "Google Cloud", "Docker"],
        "experiences": [
            {
                "title": "Senior AI Engineer",
                "company": "TechCorp",
                "duration": "2021-Present",
                "responsibilities": [
                    "Built Full stack applications with Angular and Nest",
                    "Developed Mobile App Using Material UI"
                ]
            },
            {
                "title": "Software Engineer",
                "company": "TCS",
                "duration": "2022-2025",
                "responsibilities": [
                    "Created ML models and REST APIs with Python"
                ]
            }
        ],
        "education": {
            "degree": "MCA",
            "institution": "IPU Delhi",
            "year": "2011"
        }
    }

    # Generate and print
    print(generate_resume(resume_data))

    # Save to file
    save_resume(resume_data, "resumes/mantu_nigam_resume.txt")


MANTU NIGAM
mantu.nigam@email.com | +91-9876543210 | Bangalore

SKILLS
Python, React, Angular, Nest, Html, CSS, Google Cloud, Docker

EXPERIENCE
Senior AI Engineer | TechCorp | 2021-Present
- Built Full stack applications with Angular and Nest
- Developed Mobile App Using Material UI

Software Engineer | TCS | 2022-2025
- Created ML models and REST APIs with Python

EDUCATION
MCA | IPU Delhi | 2011
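
`generate_resume` indexes the dictionary directly, so a missing key surfaces as a `KeyError` partway through building the text. A small pre-check, sketched here, can report every missing field at once. The field lists are taken from the keys `generate_resume` reads. `find_missing_fields` is a hypothetical helper and is not called anywhere in the notebook.

```python
REQUIRED_FIELDS = ("name", "email", "phone", "location", "skills", "experiences", "education")
EXPERIENCE_FIELDS = ("title", "company", "duration", "responsibilities")
EDUCATION_FIELDS = ("degree", "institution", "year")


def find_missing_fields(data):
    """Return the names of keys that generate_resume() would fail on (hypothetical helper)."""
    missing = [field for field in REQUIRED_FIELDS if field not in data]

    # Check every experience entry that is present
    for i, exp in enumerate(data.get("experiences", [])):
        missing += [f"experiences[{i}].{field}" for field in EXPERIENCE_FIELDS if field not in exp]

    # Only look inside education when the block itself exists
    if "education" in data:
        missing += [f"education.{field}" for field in EDUCATION_FIELDS if field not in data["education"]]

    return missing
```

An empty list means the dictionary has every key the generator needs; anything else can be printed before calling `save_resume`.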


In [None]:
# Reuses generate_resume() and save_resume() from the previous cell

# Example usage
if __name__ == "__main__":
    resume_data = {
        "name": "Vinod Malik",
        "email": "vinod.malik@email.com",
        "phone": "+91-9876543210",
        "location": "Bangalore",
        "skills": ["Python", "LangChain", "VectorDB", "Google Cloud", "Docker"],
        "experiences": [
            {
                "title": "Senior AI Engineer",
                "company": "TechCorp",
                "duration": "2021-Present",
                "responsibilities": [
                    "Built Full stack Gen AI App",
                    "Developed Mobile App Using Material UI"
                ]
            },
            {
                "title": "Software Engineer",
                "company": "TCS",
                "duration": "2022-2025",
                "responsibilities": [
                    "Created ML models and REST APIs with Python"
                ]
            }
        ],
        "education": {
            "degree": "MCA",
            "institution": "IPU Delhi",
            "year": "2011"
        }
    }

    # Generate and print
    print(generate_resume(resume_data))

    # Save to file
    save_resume(resume_data, "resumes/vinod_malik_resume.txt")


VINOD MALIK
vinod.malik@email.com | +91-9876543210 | Bangalore

SKILLS
Python, LangChain, VectorDB, Google Cloud, Docker

EXPERIENCE
Senior AI Engineer | TechCorp | 2021-Present
- Built Full stack Gen AI App
- Developed Mobile App Using Material UI

Software Engineer | TCS | 2022-2025
- Created ML models and REST APIs with Python

EDUCATION
MCA | IPU Delhi | 2011


In [None]:
# Add resumes to database
ingest_resumes()


📥 Ingesting resumes...
✓ Created new database with 2 chunks from 2 resumes
✓ Database saved successfully
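
Each call to `ingest_resumes()` reloads every file in `./resumes`, so running it again against an existing index adds the same chunks a second time. One way to avoid that, sketched here under assumptions, is to keep a small manifest of content hashes and hand only new or changed files to the loaders. The name `find_new_files` and the manifest file are hypothetical, and wiring the result into `ingest_resumes()` is left out.

```python
import hashlib
import json
from pathlib import Path


def find_new_files(folder, manifest_path):
    """Return resume files whose content hash is not yet in the manifest.

    Hypothetical helper: the manifest is a JSON map of file path -> SHA-256 digest
    and is rewritten on every call, so call this only when ingestion will follow.
    """
    manifest_file = Path(manifest_path)
    seen = json.loads(manifest_file.read_text()) if manifest_file.exists() else {}

    new_files = []
    for path in sorted(Path(folder).rglob("*")):
        # Same file types the notebook ingests
        if not path.is_file() or path.suffix.lower() not in {".txt", ".pdf"}:
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if seen.get(str(path)) != digest:
            new_files.append(str(path))
            seen[str(path)] = digest

    manifest_file.write_text(json.dumps(seen, indent=2))
    return new_files
```

A changed file is reported as new, but its old chunks would still be in the index, so this sketch prevents repeat ingestion of unchanged files rather than handling edits.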


In [None]:
# Search for candidates with specific skills
skills = "front end, angular, react, microservice using nestjs"  # Change this to your required skills
search_resumes(skills)

🔍 Searching for candidates with skills: front end, angular, react, microservice using nestjs

🎯 SEARCH RESULTS
### Top Candidates Summary

**1. Mantu Nigam**  
- **Relevant Skills:** React, Angular, Nest, Full stack development  
- **Why They Are a Good Fit:** Mantu has direct experience building full stack applications using Angular and Nest, which aligns perfectly with the required skills. His background in both front-end and back-end technologies makes him a strong candidate for roles involving microservices.  
- **Matching Percentage:** 90%

**2. Vinod Malik**  
- **Relevant Skills:** (Limited relevant skills)  
- **Why They Are a Good Fit:** While Vinod has experience as a Senior AI Engineer, his resume does not mention any front-end technologies like Angular or React, nor does it indicate experience with microservices using Nest. His skills are more focused on AI and Python, making him less suitable for the required role.  
- **Matching Percentage:** 40%

### Summary
Mantu Nigam is the clear top candidate due to his relevant experience with both Angular and Nest, while Vinod

In [None]:
# Search for candidates with specific skills
skills = "python, ML, Gen AI"  # Change this to your required skills
search_resumes(skills)

🔍 Searching for candidates with skills: python, ML, Gen AI

🎯 SEARCH RESULTS
### Top Candidates Summary

**1. Vinod Malik**  
- **Relevant Skills:** Python, ML, Gen AI  
- **Why a Good Fit:** Vinod has direct experience in building a full-stack Gen AI application and has created ML models using Python. His role as a Senior AI Engineer indicates a strong background in AI technologies.  
- **Matching Percentage:** 95%

**2. Mantu Nigam**  
- **Relevant Skills:** Python, ML  
- **Why a Good Fit:** Mantu has experience in creating ML models and REST APIs with Python. However, he lacks specific experience in Gen AI, which slightly lowers his fit compared to Vinod.  
- **Matching Percentage:** 85%

**3. (No third candidate)**  
- **Why No Third Candidate:** Only two candidates provided relevant experience and skills related to the required skills of Python, ML, and Gen AI. 

### Summary
Vinod Malik is the strongest candidate due to his direct experience with Gen AI, followed by Mantu Nigam, who has solid Python and ML experience but lacks Gen AI expo
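
`search_resumes` returns the model's answer as free text. If the ranking is needed as data, the names and percentages can be pulled out with regular expressions, as in the sketch below. It assumes the answer follows the layout seen in the two recorded runs (`**1. Name**` headers and `**Matching Percentage:** NN%` lines). The model is not required to produce that layout, so treat `parse_matches` as a hypothetical, best-effort helper.

```python
import re

# A candidate header such as "**1. Vinod Malik**" at the start of a line
HEADER_RE = re.compile(r"^\*\*(\d+)\.\s*(.+?)\*\*", re.MULTILINE)
# A line such as "- **Matching Percentage:** 95%"
PCT_RE = re.compile(r"Matching Percentage:\*\*\s*(\d+)\s*%")


def parse_matches(response_text):
    """Return [(candidate_name, percentage), ...] in the order they appear."""
    headers = list(HEADER_RE.finditer(response_text))
    results = []
    for i, header in enumerate(headers):
        # A candidate's section runs until the next header or the end of the text
        end = headers[i + 1].start() if i + 1 < len(headers) else len(response_text)
        pct = PCT_RE.search(response_text[header.end():end])
        if pct:  # skip entries such as "(No third candidate)" that carry no score
            results.append((header.group(2).strip(), int(pct.group(1))))
    return results
```

Searching within each candidate's own section keeps a percentage from being attached to the wrong name when one entry has no score.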

In [None]:
from google.colab import drive
drive.mount('/content/drive')
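
With Drive mounted, the FAISS index can be copied out of the temporary Colab filesystem so it survives a runtime reset. The sketch below is one way to do that. The destination folder `/content/drive/MyDrive/resume_db` is an assumed path and `copy_index` is a hypothetical helper; adjust both to your own Drive layout.

```python
import shutil
from pathlib import Path


def copy_index(src="./resume_db", dst="/content/drive/MyDrive/resume_db"):
    """Copy the saved FAISS folder to another location (hypothetical helper)."""
    src_path = Path(src)
    # save_local() writes index.faiss alongside index.pkl; require the index file
    if not (src_path / "index.faiss").exists():
        raise FileNotFoundError(f"No FAISS index found in {src_path}")
    # dirs_exist_ok lets repeated backups overwrite the previous copy (Python 3.8+)
    shutil.copytree(src_path, dst, dirs_exist_ok=True)
    return Path(dst)
```

Calling `copy_index()` after `ingest_resumes()` backs the index up; calling it with `src` and `dst` swapped restores it in a new session.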