<a href="https://colab.research.google.com/github/Satish-970/ATS_Checker_Job-Description/blob/main/ATS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ATS Resume Matcher: Your Guide to a Better Resume

This tool evaluates how well your resume matches a job description, providing an ATS score and suggestions for improvement. It's designed to help you pass through Applicant Tracking Systems (ATS) and get your resume seen by recruiters.

To use it, run all the code cells sequentially. You'll be prompted to:
1. Upload your resume (PDF or DOCX) via a file upload dialog.
2. Paste the job description or provide a URL link to a job posting.
3. View the ATS score, keyword analysis, and suggestions directly in Colab.

**No Google Drive access required** - your resume is uploaded directly from your local machine.

### How to Run
1. Run the first cell to install dependencies.
2. Run the second cell to install Java (required for LanguageTool).
3. Run the third cell to define the ATS functions.
4. Run the fourth cell to start the ATS checker, which will prompt for file upload and job description input (text or URL).

**Note**: Ensure your resume is a text-based PDF or DOCX file (not a scanned image) for accurate text extraction. If providing a URL, ensure it points to a publicly accessible job posting page.
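
Because text extraction depends on the actual file type rather than the file name, it can help to confirm what was uploaded before parsing it. The sketch below is not part of the notebook: `sniff_resume_type` is a hypothetical helper that inspects the leading bytes of a file. PDF files normally begin with `%PDF-`, DOCX files are ZIP archives that begin with `PK`, and legacy `.doc` files carry the OLE compound-file signature. It cannot tell a scanned PDF from a text-based one; for that, an empty extraction result is still the practical signal.

```python
# Hypothetical helper (not part of the notebook): guess a resume's format
# from its leading bytes instead of trusting the file extension.

PDF_MAGIC = b"%PDF-"
ZIP_MAGIC = b"PK\x03\x04"                         # DOCX files are ZIP archives
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"   # legacy binary .doc


def sniff_resume_type(data: bytes) -> str:
    """Return 'pdf', 'docx', 'doc', or 'unknown' for the given file bytes."""
    if data.startswith(PDF_MAGIC):
        return "pdf"
    if data.startswith(ZIP_MAGIC):
        # Any ZIP archive matches here; a DOCX is assumed, not verified.
        return "docx"
    if data.startswith(OLE_MAGIC):
        return "doc"
    return "unknown"


print(sniff_resume_type(b"%PDF-1.7\n..."))   # pdf
print(sniff_resume_type(b"PK\x03\x04rest"))  # docx
print(sniff_resume_type(b"plain text"))      # unknown
```

In Colab, `files.upload()` returns a dictionary that maps each file name to its bytes, so a value from that dictionary could be passed to this helper directly.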

In [1]:
# Step 1: Install and Import Libraries

# Install necessary libraries
!pip install spacy PyPDF2 python-docx scikit-learn matplotlib language-tool-python requests beautifulsoup4 --quiet
!python -m spacy download en_core_web_sm --quiet

# Import libraries
import spacy
import PyPDF2
from docx import Document
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import matplotlib.pyplot as plt
from google.colab import files
import os
import language_tool_python
import requests
from bs4 import BeautifulSoup

print("Setup complete!")

In [2]:
# Step 2: Install Java (required for LanguageTool)
# Quote the pattern so the shell passes it to apt-get unexpanded
!sudo apt-get purge 'openjdk-*' -y
!sudo apt-get install openjdk-17-jdk -y
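
LanguageTool needs a working Java runtime, so it can be worth confirming that the install succeeded before moving on. The following is a minimal sketch, not part of the original notebook; `java_available` is a hypothetical helper name, and it only checks that a `java` executable on `PATH` runs, not which version it is.

```python
import shutil
import subprocess


def java_available() -> bool:
    """Return True if a `java` executable is found on PATH and runs."""
    java_path = shutil.which("java")
    if java_path is None:
        return False
    try:
        # `java -version` exits with status 0 when the runtime works.
        result = subprocess.run(
            [java_path, "-version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("Java found" if java_available() else "Java not found; rerun the install cell")
```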

In [3]:
# Step 3: Define Core Functions

# Load the spaCy model
try:
    nlp = spacy.load('en_core_web_sm')
except OSError:
    print("Downloading spaCy model...")
    !python -m spacy download en_core_web_sm --quiet
    nlp = spacy.load('en_core_web_sm')

# Initialize LanguageTool for grammar and spell checking
tool = language_tool_python.LanguageTool('en-US')

# Function to extract text from PDF
def extract_pdf_text(pdf_path):
    text = ''
    try:
        with open(pdf_path, 'rb') as file:
            reader = PyPDF2.PdfReader(file)
            for page in reader.pages:
                extracted_page_text = page.extract_text()
                if extracted_page_text:
                    text += extracted_page_text + ' '
    except Exception as e:
        print(f"Error extracting text from PDF: {e}")
        return ""
    return text.strip()

# Function to extract text from DOCX
def extract_docx_text(docx_path):
    text = ''
    try:
        doc = Document(docx_path)
        for paragraph in doc.paragraphs:
            text += paragraph.text + ' '
    except Exception as e:
        print(f"Error extracting text from DOCX: {e}")
        return ""
    return text.strip()

# Function to extract text from uploaded resume
def extract_resume_text(file_path):
    file_extension = file_path.lower().split('.')[-1]
    if file_extension == 'pdf':
        return extract_pdf_text(file_path)
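    elif file_extension == 'docx':
        return extract_docx_text(file_path)
    elif file_extension == 'doc':
        # python-docx reads only the .docx format, not legacy binary .doc files
        print("Legacy .doc files are not supported. Please save your resume as DOCX or PDF and upload it again.")
        return ""
    else:
        print("Unsupported file format. Please upload PDF or DOCX files only.")
        return ""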

# Function to extract text from a webpage
def extract_text_from_url(url):
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # Raise an error for bad status codes
        soup = BeautifulSoup(response.text, 'html.parser')
        # Remove scripts, styles, and other non-content tags
        for script in soup(["script", "style", "header", "footer", "nav"]):
            script.decompose()
        # Extract the remaining visible text once. Collecting text from every <div>, <p>,
        # and <li> separately would repeat content, because a <div> already includes its children.
        text = soup.get_text(separator=' ', strip=True)
        return text.strip()
    except requests.exceptions.RequestException as e:
        print(f"Error fetching job description from URL: {e}")
        return ""

# Function to extract keywords
def extract_keywords(text):
    doc = nlp(text.lower())
    keywords = set()

    # Comprehensive list of technical and professional skills
    tech_skills = {
        'python', 'java', 'javascript', 'sql', 'html', 'css', 'react', 'angular',
        'vue', 'node.js', 'django', 'flask', 'spring', 'hibernate', 'mongodb',
        'postgresql', 'mysql', 'oracle', 'aws', 'azure', 'gcp', 'docker',
        'kubernetes', 'jenkins', 'git', 'github', 'gitlab', 'agile', 'scrum',
        'kanban', 'jira', 'confluence', 'tensorflow', 'pytorch', 'keras',
        'pandas', 'numpy', 'scipy', 'matplotlib', 'seaborn', 'tableau',
        'powerbi', 'excel', 'r', 'scala', 'spark', 'hadoop', 'kafka',
        'elasticsearch', 'redis', 'graphql', 'rest', 'api', 'microservices',
        'devops', 'ci/cd', 'linux', 'windows', 'macos', 'bash', 'powershell',
        'communication', 'leadership', 'teamwork', 'management', 'analytics',
        'problem-solving', 'debugging', 'testing', 'automation'
    }

    # Extract nouns, proper nouns, and technical skills
    for token in doc:
        if token.pos_ in ['NOUN', 'PROPN'] and len(token.text) > 2 and token.text.isalpha():
            keywords.add(token.text)
        elif token.text in tech_skills:
            keywords.add(token.text)

    # Handle multi-word skills and phrases
    text_lower = text.lower()
    multi_word_skills = [
        'machine learning', 'data analysis', 'data science', 'project management',
        'artificial intelligence', 'deep learning', 'natural language processing',
        'computer vision', 'big data', 'business intelligence', 'data mining',
        'software development', 'web development', 'mobile development',
        'full stack', 'front end', 'back end', 'database management',
        'cloud computing', 'cyber security', 'network security',
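        'quality assurance', 'user experience', 'user interface',
        # spaCy's default tokenizer typically splits these at '/' and '-',
        # so they are matched as substrings here instead of as single tokens
        'ci/cd', 'problem-solving'
    ]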

    for skill in multi_word_skills:
        if skill in text_lower:
            keywords.add(skill)

    return keywords

# Function to calculate ATS score
def calculate_ats_score(resume_text, job_description):
    documents = [resume_text, job_description]
    vectorizer = TfidfVectorizer(
        stop_words='english',
        ngram_range=(1, 2),
        max_features=1000,
        lowercase=True
    )
    tfidf_matrix = vectorizer.fit_transform(documents)
    similarity = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:2])[0][0]
    return similarity * 100

# Function to provide improvement suggestions
def get_suggestions(resume_keywords, job_keywords):
    missing = job_keywords - resume_keywords
    if missing:
        missing_list = sorted(missing)[:10]  # First 10 missing keywords in alphabetical order
        return f"Consider adding these keywords to your resume: {', '.join(missing_list)}"
    return "Excellent! Your resume contains all the key skills from the job description."

# Function to provide score interpretation
def get_score_feedback(score):
    if score >= 80:
        return "Excellent match! Your resume aligns very well with this job."
    elif score >= 65:
        return "Good match! Minor improvements could boost your score."
    elif score >= 50:
        return "Moderate match. Review suggestions to improve alignment."
    elif score >= 35:
        return "Below average match. Consider significant improvements."
    else:
        return "Low match. Major revisions recommended to align with job requirements."

# Function to visualize the score
def visualize_score(score):
    # Color coding based on score ranges
    if score >= 80:
        color = 'green'
        performance = 'Excellent'
    elif score >= 65:
        color = 'lightgreen'
        performance = 'Good'
    elif score >= 50:
        color = 'orange'
        performance = 'Moderate'
    elif score >= 35:
        color = 'red'
        performance = 'Below Average'
    else:
        color = 'darkred'
        performance = 'Low'

    plt.figure(figsize=(10, 6))
    bars = plt.bar(['ATS Match Score'], [score], color=color, alpha=0.8, width=0.6)

    plt.ylim(0, 100)
    plt.ylabel('Match Percentage (%)', fontsize=12)
    plt.title('Resume ATS Match Analysis', fontsize=14, fontweight='bold')

    # Add score text on the bar
    plt.text(0, score + 2, f'{score:.1f}%\n({performance})',
             ha='center', va='bottom', fontsize=14, fontweight='bold')

    # Add reference lines
    plt.axhline(y=80, color='green', linestyle='--', alpha=0.3, label='Excellent (80%+)')
    plt.axhline(y=65, color='lightgreen', linestyle='--', alpha=0.3, label='Good (65%+)')
    plt.axhline(y=50, color='orange', linestyle='--', alpha=0.3, label='Moderate (50%+)')
    plt.axhline(y=35, color='red', linestyle='--', alpha=0.3, label='Below Average (35%+)')

    plt.legend(loc='upper right')
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()

# Function to perform general resume check (grammar, spelling, formatting)
def perform_general_check(resume_text, file_name):
    print("\n" + "=" * 60)
    print("GENERAL RESUME HEALTH CHECK RESULTS")
    print("=" * 60)

    # Grammar and Spelling Analysis
    print("\nGrammar and Spelling Analysis:")
    matches = tool.check(resume_text)
    if matches:
        print(f"- Found {len(matches)} potential issue(s). Review these suggestions:\n")
        for i, match in enumerate(matches[:5]):  # Show up to 5 examples
            print(f"  {i+1}. Issue: '{match.ruleId}', Suggestion: '{match.replacements}', Context: '{match.context}'")
        if len(matches) > 5:
            print("  ... and more. Please proofread your resume carefully.")
    else:
        print("- No major grammar or spelling issues detected. Great job! 👍")

    # Formatting and Structure Analysis
    print("\nFormatting and Structure Tips:")
    # Check for common section headings
    resume_lower = resume_text.lower()
    sections = ['experience', 'skills', 'education', 'projects', 'summary']
    found_sections = [sec for sec in sections if sec in resume_lower]

    if len(found_sections) >= 3:
        print(f"- Your resume uses standard section headings ({', '.join(found_sections)}), which is great for ATS parsing. ✅")
    else:
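        # Keep the original order of `sections` so the message is deterministic
        missing_sections = [sec for sec in sections if sec not in found_sections]
        print(f"- Consider using standard headings like 'Experience', 'Skills', and 'Education' to help ATS easily categorize your information. Your resume appears to be missing: {', '.join(missing_sections)}.")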

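    # Check for complex formatting cues (rough heuristic: looks for these words in the
    # extracted text, matched as whole words so 'tableau' does not count as 'table')
    import re  # local import keeps this check self-contained
    if re.search(r'\b(?:tables?|columns?|images?)\b', resume_lower):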
        print("- Warning: Your resume may contain complex formatting (tables, columns, images). These can be difficult for some ATS to parse correctly and may lead to information being lost.")
    else:
        print("- Your resume appears to have a clean, ATS-friendly format. 👍")

    # Keyword and Content Analysis
    print("\nKeyword & Content Recommendations:")
    # Check for action verbs
    action_verbs = ['managed', 'developed', 'created', 'implemented', 'analyzed', 'led', 'designed', 'optimized', 'improved']
    found_verbs = [verb for verb in action_verbs if verb in resume_lower]

    if len(found_verbs) > 5:
        print("- You've used a good variety of strong action verbs to describe your experience. This is highly recommended!")
    else:
        print(f"- To make your achievements more impactful, try to use stronger action verbs like: {', '.join(action_verbs[:5])}.")

    # Check for quantifiable results (heuristic: the text contains at least one digit
    # and at least one '%' or '$' somewhere; they need not be adjacent)
    has_digit = any(char.isdigit() for char in resume_text)
    if has_digit and ('%' in resume_text or '$' in resume_text):
        print("- Your resume includes quantifiable results, which effectively showcases your impact. Keep this up! 💪")
    else:
        print("- Consider adding quantifiable results (e.g., 'Increased sales by 15%', 'Reduced costs by $10K', 'Managed a team of 5') to demonstrate your achievements with concrete evidence.")

    # General skill recommendations
    resume_keywords = extract_keywords(resume_text)
    print(f"- Your resume contains {len(resume_keywords)} keywords. A strong resume typically has a good mix of both technical ('hard') and professional ('soft') skills.")

    # Cleanup
    if os.path.exists(file_name):
        os.remove(file_name)
        print(f"\nTemporary file '{file_name}' removed.")

    print("\n" + "=" * 60)
    print("Analysis Complete! Good luck! 🎉")
    print("=" * 60)

print("All functions defined successfully.")
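
`calculate_ats_score` reports the cosine similarity between the TF-IDF vectors of the resume and the job description, scaled to a percentage. To show what that measure does, here is a simplified sketch that uses raw term counts in place of TF-IDF weights, with no stop-word removal and no bigrams, so its numbers will differ from the notebook's score. The helper names `term_counts` and `cosine_score` are illustrative and not part of the notebook.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Lowercase the text and count alphabetic word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_score(text_a: str, text_b: str) -> float:
    """Cosine similarity of raw term-count vectors, scaled to 0-100."""
    a, b = term_counts(text_a), term_counts(text_b)
    # Only words present in both texts contribute to the dot product.
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return 100.0 * dot / (norm_a * norm_b)


resume = "Python developer with Django and SQL experience"
job = "Seeking Python developer with SQL and AWS skills"
print(f"{cosine_score(resume, job):.1f}")  # 66.8
```

Five of the words are shared, so the score is 5 / (sqrt(7) * sqrt(8)), about 66.8. Words that appear in only one text lengthen that text's vector without adding to the dot product, which is why missing keywords lower the score.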

In [4]:
# Step 4: Run the ATS Checker

def run_ats_checker():
    print("ATS Resume Checker")
    print("=" * 50)
    print("This tool analyzes how well your resume matches a job description.")
    print("Supported formats: PDF, DOCX")
    print("\nStep 1: Upload your resume file")
    print("Click 'Choose Files' in the dialog below to select your resume from your local machine...")

    # Upload resume file from local machine
    uploaded_files = files.upload()

    if not uploaded_files:
        print("No file was uploaded. Please try again.")
        return

    file_name = list(uploaded_files.keys())[0]
    print(f"Successfully uploaded: {file_name}")

    # Extract text from resume
    try:
        resume_text = extract_resume_text(file_name)
        if not resume_text or len(resume_text.strip()) < 100:
            print("Could not extract enough text from the resume. Please make sure your PDF/DOCX is text-based (not a scanned image) and try again.")
            if os.path.exists(file_name):
                os.remove(file_name)
            return
    except Exception as e:
        print(f"Error reading file: {e}")
        print("Please ensure you uploaded a valid PDF or DOCX file.")
        if os.path.exists(file_name):
            os.remove(file_name)
        return

    print("\n" + "=" * 50)
    print("Step 2: Provide the job description.")
    print("You can either paste the job description text or provide a URL to a job posting.")
    print("- To paste the job description, enter it in the text box below and press Enter.")
    print("- To use a URL, enter the full URL (e.g., https://example.com/job) and press Enter.")
    print("- Leave both blank and press Enter to perform a general resume health check (grammar, spelling, and formatting).")

    job_desc_url = input("\nJob Description URL (or leave blank to paste text): ")
    job_desc = ""

    if job_desc_url.strip():
        print(f"\nFetching job description from URL: {job_desc_url}")
        job_desc = extract_text_from_url(job_desc_url)
        if not job_desc:
            print("Failed to extract job description from URL. Please try pasting the job description text instead.")
            job_desc = input("\nJob Description (paste here): ")
    else:
        job_desc = input("\nJob Description (paste here): ")

    if not job_desc.strip():
        perform_general_check(resume_text, file_name)
    else:
        # Check if job description is too short
        if len(job_desc.strip()) < 50:
            print("Warning: Job description seems very short. Consider providing a complete job posting for better analysis.")

        print("\nAnalyzing your resume against the job description...")

        try:
            # Extract keywords and calculate score
            resume_keywords = extract_keywords(resume_text)
            job_keywords = extract_keywords(job_desc)
            ats_score = calculate_ats_score(resume_text, job_desc)

            # Display comprehensive results
            print("\n" + "=" * 60)
            print("ATS MATCH ANALYSIS RESULTS")
            print("=" * 60)

            print(f"\nATS Match Score: {ats_score:.2f}%")
            print(f"Score Category: {get_score_feedback(ats_score)}")

            matching_keywords = resume_keywords & job_keywords

            print("\nKeyword Analysis:")
            print(f"- Resume keywords found: {len(resume_keywords)}")
            print(f"- Job description keywords: {len(job_keywords)}")
            print(f"- Matching keywords: {len(matching_keywords)}")

            if matching_keywords:
                print(f"\nMatching skills found: {', '.join(sorted(matching_keywords)[:15])}")

            print("\nImprovement Suggestions:")
            print(get_suggestions(resume_keywords, job_keywords))

            # Show visualization
            visualize_score(ats_score)

            # Provide additional recommendations
            print("\nAdditional Tips to Boost Your Score:")
            if ats_score < 65:
                print("- Use exact keywords from the job description in your resume.")
                print("- Include relevant skills in a dedicated skills section.")
                print("- Tailor your experience descriptions to match job requirements.")
                print("- Use industry-standard terminology.")

            print("- Ensure your resume is in a simple, ATS-friendly format.")
            print("- Use standard section headings (Experience, Skills, Education).")
            print("- Avoid images, tables, and complex formatting.")

        except Exception as e:
            print(f"Error during analysis: {e}")

        finally:
            # Clean up uploaded file
            if os.path.exists(file_name):
                os.remove(file_name)
                print(f"\nTemporary file '{file_name}' removed.")

            print("\n" + "=" * 60)
            print("Analysis Complete! Good luck! 🎉")
            print("=" * 60)

# Run the ATS checker
run_ats_checker()
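
The keyword analysis printed by `run_ats_checker` comes down to set operations on the two keyword sets. The sketch below shows this with hand-written stand-in sets rather than real output from `extract_keywords`. `keyword_report` is a hypothetical helper, and the coverage percentage is an extra illustration that the notebook itself does not compute.

```python
def keyword_report(resume_keywords: set, job_keywords: set) -> dict:
    """Summarize keyword overlap using plain set operations."""
    matching = resume_keywords & job_keywords   # present in both
    missing = job_keywords - resume_keywords    # in the job posting only
    coverage = 100.0 * len(matching) / len(job_keywords) if job_keywords else 0.0
    return {
        "matching": sorted(matching),
        "missing": sorted(missing),
        "coverage": coverage,
    }


# Stand-in keyword sets for illustration only.
resume_kw = {"python", "sql", "django", "git"}
job_kw = {"python", "sql", "aws", "docker"}

report = keyword_report(resume_kw, job_kw)
print(report["matching"])   # ['python', 'sql']
print(report["missing"])    # ['aws', 'docker']
print(report["coverage"])   # 50.0
```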

## Saving to GitHub

To store this notebook on GitHub:
1. Save this notebook as `ATS.ipynb`, matching the filename in the Colab badge link.
2. Create a GitHub repository (e.g., `ATS_Checker_Job-Description`).
3. Upload `ATS.ipynb` to the repository.
4. Open the notebook in Colab directly from GitHub using the badge link in the first cell.

**Note**: This version runs entirely in Colab and does not require Streamlit. For a web-based version, consider deploying the Streamlit version (see previous versions of this code). Ensure resume files are text-based for accurate extraction. If using a job description URL, ensure it is publicly accessible and contains the job posting details.
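
Before fetching a job posting, it can help to reject input that is clearly not a web address. The sketch below is an optional pre-check that is not in the notebook; `looks_like_http_url` is a hypothetical helper built on the standard library's `urllib.parse`. It validates only the shape of the URL. Whether the page is publicly accessible is known only once the request in `extract_text_from_url` succeeds.

```python
from urllib.parse import urlparse


def looks_like_http_url(url: str) -> bool:
    """Return True if the string parses as an http(s) URL with a host."""
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(looks_like_http_url("https://example.com/job"))  # True
print(looks_like_http_url("example.com/job"))          # False (no scheme)
print(looks_like_http_url("ftp://example.com/job"))    # False (not http/https)
```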