# **RESUME EVALUATOR**


**Introduction**


This project evaluates resumes by extracting skills, education, and experience using NLP techniques. It also calculates an ATS (Applicant Tracking System) score and suggests improvements based on missing skills.


**Step 1: Install Dependencies**

Before running the project, install the required libraries (NLTK, spaCy, and pdfminer.six) and download the spaCy English model.


In [1]:
!pip install nltk spacy pdfminer.six
!python -m spacy download en_core_web_sm

Collecting en-core-web-sm==3.8.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.8.0/en_core_web_sm-3.8.0-py3-none-any.whl (12.8 MB)

**Step 2: Import Required Libraries**

We import essential libraries for text extraction, tokenization, stopword removal, and Named Entity Recognition (NER).


In [2]:
import spacy
import nltk
from nltk.tokenize import word_tokenize
from nltk.corpus import stopwords
from pdfminer.high_level import extract_text


Download the NLTK data used for tokenization (`punkt_tab`) and stopword removal (`stopwords`).


In [3]:
nltk.download('punkt_tab')
nltk.download('stopwords')

[nltk_data] Downloading package punkt_tab to
[nltk_data]     C:\Users\UDIT\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\UDIT\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

Load the spaCy English model (`en_core_web_sm`).


In [4]:
nlp = spacy.load("en_core_web_sm")

**Step 3: Extract Text from PDF Resume**

We extract text from a given PDF resume file.


In [5]:
def extract_resume_text(pdf_path):
    text = extract_text(pdf_path)
    return text
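
Text extracted from a PDF usually keeps the layout's hard line breaks, which is why some entities in the sample output at the end of this notebook contain `\n`. A minimal sketch of a cleanup step that could be applied to the extracted text follows. `normalize_whitespace` is a hypothetical helper, not part of pdfminer.six, and whether to run it before NER is a design choice this notebook does not make (collapsing line breaks may change which entities spaCy detects).

```python
import re

def normalize_whitespace(text):
    """Collapse runs of whitespace (including newlines) into single spaces.

    Hypothetical helper for illustration; not part of pdfminer.six.
    """
    return re.sub(r"\s+", " ", text).strip()

sample = "MBM UNIVERSITY\nBachelor of Engineering-\n\nComputer Science\n"
print(normalize_whitespace(sample))
# MBM UNIVERSITY Bachelor of Engineering- Computer Science
```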

**Step 4: NLP Preprocessing**

We perform tokenization and stopword removal to clean the resume text.


In [6]:
def preprocess_resume(text):
    tokens = word_tokenize(text)
    stop_words = set(stopwords.words("english"))
    filtered_tokens = [word.lower() for word in tokens if word.isalpha() and word.lower() not in stop_words]
    return filtered_tokens
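
The `isalpha()` check in the filter keeps only purely alphabetic tokens, so a token such as `HTML5` is dropped along with numbers and punctuation. The sketch below illustrates that effect using plain whitespace splitting and a tiny hand-picked stopword set. Both are stand-ins assumed for illustration, not NLTK's `word_tokenize` or its English stopword list.

```python
# Stand-in stopword set for illustration only (NLTK's list is much longer).
DEMO_STOP_WORDS = {"and", "in", "with", "the"}

def demo_filter(text):
    """Apply the lowercase / isalpha / stopword filter to whitespace-split tokens."""
    return [
        word.lower()
        for word in text.split()
        if word.isalpha() and word.lower() not in DEMO_STOP_WORDS
    ]

print(demo_filter("Experienced in Python and HTML5 with C++"))
# ['experienced', 'python']
```

If skills like these matter for scoring, the filter may need a looser test than `isalpha()`.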

**Step 5: Named Entity Recognition (NER)**

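We use spaCy's NER as a rough proxy for resume sections: `ORG` and `WORK_OF_ART` entities are collected as skills and `DATE` entities as experience. The `en_core_web_sm` model does not produce an `EDUCATION` label, so the education list stays empty with this model.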


In [7]:
def extract_entities(text):
    doc = nlp(text)
    entities = {"skills": [], "education": [], "experience": []}
    
    for ent in doc.ents:
        if ent.label_ in ["ORG", "WORK_OF_ART"]:
            entities["skills"].append(ent.text)
        elif ent.label_ == "DATE":
            entities["experience"].append(ent.text)
        elif ent.label_ == "EDUCATION":
            entities["education"].append(ent.text)
    
    return entities
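
The education list comes back empty in the sample run at the end of this notebook. One possible way to fill it, sketched under the assumption that degree names can be caught with a small keyword pattern, is a regular-expression pass over the lines of the resume text. The pattern and the function name `find_education_lines` are illustrative assumptions, not part of spaCy.

```python
import re

# Assumed, deliberately small set of degree keywords; extend as needed.
DEGREE_PATTERN = re.compile(
    r"\b(bachelor|master|b\.?tech|m\.?tech|ph\.?d|diploma)\b",
    re.IGNORECASE,
)

def find_education_lines(text):
    """Return the stripped lines of `text` that mention a degree keyword."""
    return [
        line.strip()
        for line in text.splitlines()
        if DEGREE_PATTERN.search(line)
    ]

sample = "MBM UNIVERSITY\nBachelor of Engineering-\n\nComputer Science\n"
print(find_education_lines(sample))
# ['Bachelor of Engineering-']
```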

**Step 6: Calculate ATS Score**

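We compare the extracted skills with an ideal skill set and report the percentage of ideal skills found as the ATS score. Matching is exact after lowercasing, so an entity that merely contains a skill name does not count.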


In [8]:
ideal_skills = ["python", "machine learning", "data analysis", "nlp", "sql", "deep learning", "cloud computing"]

def calculate_score(extracted_skills):
    # Lowercase and de-duplicate so a skill repeated in the resume counts only once
    found = {skill.lower() for skill in extracted_skills}
    match_count = len(found & set(ideal_skills))
    score = (match_count / len(ideal_skills)) * 100
    return round(score, 2)
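
As a worked example of the formula, the sketch below recomputes the score for a shortened version of the sample resume's extracted skills. `score_against` is a hypothetical standalone helper written for this illustration. It takes the ideal list as a parameter and de-duplicates the extracted skills, which is an assumption about the desired behaviour rather than something the notebook specifies.

```python
def score_against(extracted_skills, ideal_skills):
    """Percentage of ideal skills that exactly match an extracted skill (case-insensitive)."""
    found = {skill.lower() for skill in extracted_skills}
    matched = found & {skill.lower() for skill in ideal_skills}
    return round(len(matched) / len(ideal_skills) * 100, 2)

ideal = ["python", "machine learning", "data analysis", "nlp",
         "sql", "deep learning", "cloud computing"]

# Only 'NLP' matches exactly; 'MySQL\n\nHTML5' does not count as 'sql'.
print(score_against(["JavaScript", "MySQL\n\nHTML5", "NLP", "NLP"], ideal))
# 14.29
```

One match out of seven ideal skills gives 1 / 7 × 100 ≈ 14.29, the same value as the sample run.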

**Step 7: Suggest Resume Improvements**

We list the ideal skills that were not found among the extracted skills and return them as improvement suggestions.


In [9]:
def suggest_improvements(extracted_skills):
    missing_skills = [skill for skill in ideal_skills if skill.lower() not in extracted_skills]
    return missing_skills if missing_skills else ["Your resume is excellent!"]
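
In the sample run at the end of this notebook, `nlp` is listed as missing even though `NLP` was extracted, because the membership test compares a lowercase ideal skill with the raw extracted strings. A minimal sketch of a case-insensitive variant follows. `missing_skills_ci` is a hypothetical name, and the ideal list is passed in explicitly to keep the example self-contained.

```python
def missing_skills_ci(extracted_skills, ideal_skills):
    """Return ideal skills with no case-insensitive exact match among the extracted ones."""
    found = {skill.lower().strip() for skill in extracted_skills}
    return [skill for skill in ideal_skills if skill.lower() not in found]

print(missing_skills_ci(["NLP", "JavaScript"], ["python", "nlp", "sql"]))
# ['python', 'sql']
```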

**Step 8: Test the System with a Sample Resume PDF**

We run the entire process on a sample resume and display the results.


In [11]:
def main():
    pdf_path = r"C:\Users\UDIT\Downloads\resumeudit (4).pdf"  # path to the resume PDF; adjust for your machine
    resume_text = extract_resume_text(pdf_path)
    tokens = preprocess_resume(resume_text)  # cleaned tokens; not used by the scoring below
    entities = extract_entities(resume_text)
    
    score = calculate_score(entities["skills"])
    improvements = suggest_improvements(entities["skills"])
    
    print("\nResume Analysis Result:")
    print(f"Extracted Skills: {entities['skills']}")
    print(f"Extracted Education: {entities['education']}")
    print(f"Extracted Experience: {entities['experience']}")
    print(f"ATS Score: {score}%")
    print(f"Suggested Improvements: {improvements}")

main()


Resume Analysis Result:
Extracted Skills: ['MBM UNIVERSITY\nBachelor of Engineering-\n\nComputer Science\n', 'XII', 'JavaScript', 'MySQL\n\nHTML5', 'Canva, Figma', 'Arduino', 'Computer Networks\nLogic Design', 'MBM University', 'Harmonium', 'Operations', 'MATLAB', 'Arduino Uno', 'IIT', 'NLP', 'Arduino', 'IR']
Extracted Education: []
Extracted Experience: ['7976434666', '2023-2027', '2020', 'June 2023 - July 2023']
ATS Score: 14.29%
Suggested Improvements: ['python', 'machine learning', 'data analysis', 'nlp', 'sql', 'deep learning', 'cloud computing']


**Conclusion**

- This project extracts text from resumes, cleans it, and applies NLP techniques to extract key details.
- It calculates an ATS score as the percentage of a predefined skill list found in the resume.
- Finally, it reports the skills from that list that are missing as suggested improvements.
