# AI-Powered Resume Screener 

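This notebook builds an AI-powered resume screener on Google's Gemini Flash model (the code below configures `gemini-1.5-flash-latest`).
It uses the model to read resumes, extract structured candidate details, and score each resume against a job description, automating the first pass of the recruitment process.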

### Capabilities Demonstrated:
- Document Understanding
- Few-shot Prompting
- Structured Output (JSON mode)


## Problems with Manual Screening:
1. Recruiters have to manually review hundreds or thousands of resumes.
2. Decisions may be influenced by unconscious bias (e.g., names, gender, background).
3. Good candidates might be overlooked due to resume formatting or keyword mismatch.


## Use Case:
1. Automatically reads and analyzes resumes to shortlist suitable candidates.
2. Compares candidate skills and experience with the job requirements.
3. Ranks applicants based on how well they match the job description.


### ***1. Install and Import Libraries***

Installs libraries:

* `google-generativeai`: to use Google's AI model (Gemini).

* `PyMuPDF`: to read and extract text from PDF files.

Imports modules:

* `fitz`: from PyMuPDF, used to open and read PDFs.

* `os`, `json`: for file handling and data formatting.

* `google.generativeai`: to access and use the generative AI model.

* `pandas` and `display`: to work with and display data in tables.

In [8]:
!pip install google-generativeai --quiet
!pip install PyMuPDF --quiet  # For PDF parsing

import fitz  # PyMuPDF
import os
import json
import google.generativeai as genai
import pandas as pd
from IPython.display import display


### ***2. Configure Google Gemini 1.5 Flash***

1. Gets your API key securely

    * It uses `UserSecretsClient` to read your Google API key from Kaggle Secrets, so the key is never exposed in the code.

2. Connects to Google's Generative AI (Gemini)

    * It sets the API key with `genai.configure()`.

    * It then creates a model object for Gemini 1.5 Flash, a fast and lightweight Gemini model.

3. Sets the response type

    * It sets `response_mime_type` to `application/json`, so the model replies in JSON, which is easy to parse in code.

In [9]:
from kaggle_secrets import UserSecretsClient

GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY") 

genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel(
    model_name="models/gemini-1.5-flash-latest",
    generation_config={
        "response_mime_type": "application/json"
    }
)
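
The `kaggle_secrets` module is only available inside Kaggle. As a minimal sketch for running the notebook elsewhere, the key could instead be read from an environment variable. The helper name `load_api_key` and the variable name `GOOGLE_API_KEY` are assumptions made for this example, not part of the original notebook.

```python
import os


def load_api_key(env_var="GOOGLE_API_KEY", environ=None):
    """Return the API key from an environment mapping, or raise a clear error.

    Hypothetical helper: the function name and its parameters are
    illustrative assumptions, not part of the original notebook.
    """
    environ = os.environ if environ is None else environ
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running.")
    return key


# An explicit mapping is passed here so the sketch runs without a real key.
print(load_api_key(environ={"GOOGLE_API_KEY": "demo-key"}))
```

The returned value would then be passed to `genai.configure(api_key=...)` in place of the Kaggle secret.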


### ***3. Load and Read PDF Resumes***

1. Defines the directory:

    * RESUME_DIR specifies the location of the resume files on Kaggle.

2. Extracts text from PDF:

    * The function extract_text_from_pdf(pdf_path) opens each PDF, reads it, and returns all the text in the PDF.

3. Iterates over the resume files:

    * It loops through all files in the RESUME_DIR directory.

    * For each PDF file (.pdf), it calls the extraction function and stores the text in a dictionary (resumes) where the key is the filename and the value is the extracted text.

In [10]:
RESUME_DIR = "/kaggle/input/resume-it/INFORMATION-TECHNOLOGY" 
# Dataset folder name from Kaggle

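def extract_text_from_pdf(pdf_path):
    """Return the concatenated text of every page in the PDF."""
    # The context manager closes the document even if extraction fails.
    with fitz.open(pdf_path) as doc:
        return "".join(page.get_text() for page in doc)

# Map each resume filename to its extracted text.
resumes = {}
for filename in os.listdir(RESUME_DIR):
    if filename.lower().endswith(".pdf"):  # also accept ".PDF"
        pdf_path = os.path.join(RESUME_DIR, filename)
        resumes[filename] = extract_text_from_pdf(pdf_path)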
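
Text extracted from PDFs often carries irregular spacing and long runs of blank lines. The sketch below shows one possible clean-up step before the text is sent to the model. The helper name `clean_resume_text` and the `max_chars` cap of 20000 are assumptions chosen for illustration; the original notebook sends the extracted text unchanged.

```python
import re


def clean_resume_text(text, max_chars=20000):
    """Collapse runs of whitespace and cap the length of extracted PDF text.

    Hypothetical helper: the name and the max_chars default are
    illustrative assumptions.
    """
    collapsed = re.sub(r"[ \t]+", " ", text)            # squeeze spaces and tabs
    collapsed = re.sub(r"\n\s*\n+", "\n\n", collapsed)  # keep at most one blank line
    return collapsed.strip()[:max_chars]


sample = "JANE  DOE\n\n\n\nSkills:\tPython,   SQL  \n"
print(repr(clean_resume_text(sample)))
```

An empty result after cleaning is also a useful signal: image-only (scanned) PDFs yield no text without OCR, so such files could be skipped rather than screened.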

### ***4. Define Job Description***

In [11]:
job_description = """
We are looking for a Data Analyst with:
- Strong skills in Python, SQL, and data visualization
- Experience with Pandas, NumPy, Matplotlib
- Knowledge of machine learning is a plus
- Good communication and teamwork
"""

### ***5. Few-shot Prompt Template***

This code creates a prompt for the AI that:

* Tells the AI it is acting as a resume screener.

* Provides a job description.

* Asks the AI to read a resume and extract useful information like:

    * Name

    * Skills

    * Education

    * Years of experience

    * A match score (how well the resume fits the job)

In [12]:
few_shot_prompt = f"""
You are a resume screener AI.
Given a resume text and a job description, extract relevant details in structured JSON format:
{{
  "name": "Candidate's full name",
  "skills": ["List of skills"],
  "education": "Brief education summary",
  "experience": "Years of relevant experience",
  "match_score": "Score from 0 to 100 based on how well the resume matches the job description"
}}

Job Description:
{job_description}
"""

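The prompt above describes the output format but includes no worked examples. As a rough sketch of what a few-shot variant could look like, the helper below appends (resume, expected JSON) pairs to the instructions. The function name `build_few_shot_prompt` and the example pair are invented for illustration and do not come from the dataset.

```python
import json


def build_few_shot_prompt(job_description, examples):
    """Build a screening prompt that includes worked (resume, answer) pairs.

    Hypothetical helper: `examples` is a list of (resume_text, expected_dict)
    tuples supplied by the caller.
    """
    parts = [
        "You are a resume screener AI.",
        "Given a resume text and a job description, return JSON with the keys "
        "name, skills, education, experience, match_score.",
        f"Job Description:\n{job_description.strip()}",
    ]
    for i, (resume_text, expected) in enumerate(examples, start=1):
        parts.append(f"Example {i} resume:\n{resume_text.strip()}")
        parts.append(f"Example {i} output:\n{json.dumps(expected, indent=2)}")
    return "\n\n".join(parts)


# Invented example pair, for illustration only.
demo_examples = [(
    "Jane Doe. 4 years as a data analyst. Python, SQL, Tableau. BSc Statistics.",
    {"name": "Jane Doe", "skills": ["Python", "SQL", "Tableau"],
     "education": "BSc Statistics", "experience": "4 years", "match_score": 85},
)]
prompt = build_few_shot_prompt("Data Analyst with Python and SQL.", demo_examples)
print(prompt)
```

Building the example outputs with `json.dumps` avoids the doubled braces that an f-string template needs and keeps the examples valid JSON.
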
### ***6. Process Resumes with Gemini 1.5 Flash***

The `model.generate_content()` call does the core work of this step. Here's what it does:

1. Reads and Understands the Resume

    * The model receives the full text of each resume (education, skills, work experience, etc.).

2. Follows Instructions from the Prompt

    * The `few_shot_prompt` supplies the instructions and a JSON template listing the fields to fill in (name, skills, education, experience, match score). As written, it contains no worked example resumes.

    * The model uses this to understand what kind of output is expected.

3. Generates Structured Output

    * Instead of just repeating the resume, the model turns it into clean, organized data, such as a list of skills, years of experience, and education.

    * It writes this information in JSON format, which is easy for programs to parse.

In [13]:
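results = {}
# Only the first 20 resumes are screened, which limits the number of API calls.
for name, text in list(resumes.items())[:20]:
    full_prompt = few_shot_prompt + f"\n\nResume:\n{text}"
    response = model.generate_content(full_prompt)
    raw = response.text
    try:
        results[name] = json.loads(raw)
    except json.JSONDecodeError:
        # Keep the raw reply so failed parses can be inspected later.
        results[name] = {"error": "Failed to parse JSON", "raw": raw}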
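
The parsed records are not guaranteed to use consistent types: the output further down shows experience values such as `26 years`, `18`, and `11+ years`. A minimal sketch of a normalization pass is shown below. The helper name `normalize_result` and the added `experience_years` key are assumptions for this example.

```python
import re


def normalize_result(record):
    """Return a copy of one parsed record with numeric score and experience.

    Hypothetical helper: `experience_years` is a new key introduced by this
    sketch, not a field the prompt asks the model for.
    """
    cleaned = dict(record)

    # Clamp the match score into the 0-100 range requested by the prompt.
    try:
        score = int(float(cleaned.get("match_score")))
    except (TypeError, ValueError, OverflowError):
        score = None
    cleaned["match_score"] = None if score is None else max(0, min(100, score))

    # Pull the first number out of strings such as "22 years" or "11+ years".
    match = re.search(r"\d+(?:\.\d+)?", str(cleaned.get("experience", "")))
    cleaned["experience_years"] = float(match.group()) if match else None
    return cleaned


print(normalize_result({"experience": "11+ years", "match_score": "20"}))
```

Applying it to every entry, for example with `{name: normalize_result(r) for name, r in results.items()}`, would give numeric columns that can be sorted and filtered.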

### ***7. Show Output***

In [14]:

display(pd.DataFrame.from_dict(results, orient="index"))


| filename | name | skills | education | experience | match_score |
|---|---|---|---|---|---|
| 68460556.pdf | | [American Sign Language, Excellent communicati... | December 2016, Information and Technology Mana... | More than 2 years of experience in IT related ... | 20 |
| 57002858.pdf | INFORMATION TECHNOLOGY MANAGER | [Accounting, backup, Billing system, budget, C... | Associate of Science : Business Administration... | 26 years | 20 |
| 13836471.pdf | INFORMATION TECHNOLOGY MANAGER | [Python, SQL, data visualization, Pandas, NumP... | BS: Computer System Engineer, January 29, 2000... | 22 years | 20 |
| 41344156.pdf | VP OF INFORMATION TECHNOLOGY | [IT Governance, Team Leadership, Systems Integ... | Information Systems 2014 Park University GPA: ... | 18 | 30 |
| 10553553.pdf | INFORMATION TECHNOLOGY MANAGER | [Operations management, Project tracking, Perf... | Master of Science: Business Information Techno... | 10 | 40 |
| 38753827.pdf | VICE PRESIDENT, INFORMATION TECHNOLOGY | [IT Strategy, IT Management, Project managemen... | M.B.A, University of Massachusetts; B.S, Real ... | 11+ years | 20 |
| 23864648.pdf | VICE PRESIDENT INFORMATION TECHNOLOGY INFRASTR... | [Infrastructure Management, Data Center Operat... | Bachelor of Science in Computer Science from T... | 20 | 20 |
| 12045067.pdf | INFORMATION TECHNOLOGY (IT) SPECIALIST | [Python, SQL, data visualization, Pandas, NumP... | Bachelor of Science (BS) : Information Technol... | 12 | 30 |
| 21780877.pdf | Kevin L. Trostle | [Information Technology, Project Management, T... | HS Diploma: General Studies, BS Degree: Electr... | 22 years | 20 |
| 11584809.pdf | Candidate's full name not provided | [System administration, Windows Server 2003, W... | Master of Science in Computer & Information Sc... | 10+ years | 30 |
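
The use case lists ranking applicants, but the output above is shown in file order. As a small sketch of that ranking step, the function below sorts a results dictionary by `match_score`. The name `rank_candidates` and the `top_n` parameter are assumptions; the sample scores are taken from the output above, and `broken.pdf` is a made-up entry standing in for a failed parse.

```python
def rank_candidates(results, top_n=5):
    """Return (filename, score) pairs sorted by match_score, highest first.

    Hypothetical helper: records without a usable numeric score are skipped.
    """
    scored = []
    for filename, record in results.items():
        try:
            score = float(record.get("match_score"))
        except (TypeError, ValueError):
            continue  # failed parses and missing scores are left out
        scored.append((filename, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


sample_results = {
    "68460556.pdf": {"match_score": 20},
    "41344156.pdf": {"match_score": 30},
    "10553553.pdf": {"match_score": 40},
    "broken.pdf": {"error": "Failed to parse JSON", "raw": ""},  # made-up entry
}
print(rank_candidates(sample_results, top_n=2))
```

Because Python's sort is stable, candidates with equal scores keep their original relative order, so ties in the output above would still need a secondary criterion.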
