<a href="https://colab.research.google.com/github/Munalisa24/Machine-Learning-Projects/blob/main/resumeandjobmatcher.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [8]:
# Install necessary libraries
!pip install spacy
!pip install scikit-learn
!pip install pytextrank
!pip install PyPDF2
!python -m spacy download en_core_web_sm

import spacy
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import PyPDF2
import re
from google.colab import files, drive
from io import BytesIO
import os
os.listdir('/content/drive/MyDrive/')


# Load spaCy model
nlp = spacy.load("en_core_web_sm")

# Function to clean and preprocess text
def preprocess_text(text):
    """
    Preprocesses the input text: removes punctuation, converts to lowercase,
    removes stopwords, and lemmatizes the text using spaCy.
    """
    text = re.sub(r'[^a-zA-Z\s]', '', text.lower())
    doc = nlp(text)
    processed_text = ' '.join([token.lemma_ for token in doc if not token.is_stop and not token.is_punct])
    return processed_text

# Function to extract text from a PDF file
def extract_text_from_pdf(pdf_file_path):
    """
    Extracts text from a PDF file.
    """
    with open(pdf_file_path, 'rb') as file:
        pdf_reader = PyPDF2.PdfReader(file)
        text = ''
        for page in pdf_reader.pages:
            text += page.extract_text()
    return text

# Function to compute the match score between resume (PDF) and job description
def compute_match_score(resume_text, job_description_text):
    """
    Computes the cosine similarity match score between resume and job description.
    """
    resume_text = preprocess_text(resume_text)
    job_description_text = preprocess_text(job_description_text)

    vectorizer = TfidfVectorizer()
    tfidf_matrix = vectorizer.fit_transform([resume_text, job_description_text])

    similarity = cosine_similarity(tfidf_matrix[0:1], tfidf_matrix[1:2])[0][0]
    return similarity * 100  # Return as percentage

# Function to identify the gaps between candidate's resume and job description
def identify_gaps(resume_text, job_description_text):
    """
    Identify the gaps between the candidate's skills in resume and the job requirements.
    """
    resume_text = preprocess_text(resume_text)
    job_description_text = preprocess_text(job_description_text)

    resume_keywords = set(resume_text.split())
    job_keywords = set(job_description_text.split())

    missing_keywords = job_keywords - resume_keywords
    return missing_keywords

# Example usage with job description and resume (PDF)
job_description_text = """
Job Description: We are seeking an experienced AI Engineer to join our innovative team. This role involves designing, developing, and deploying advanced AI models to solve complex real-world problems. The ideal candidate will have a strong background in machine learning and deep learning, with a passion for staying updated on the latest advancements in AI technologies. You will work with cutting-edge frameworks like TensorFlow, PyTorch, and Vision Transformers to create AI solutions that drive business success.

Responsibilities:

Design, develop, and optimize machine learning and deep learning models using Python, TensorFlow, and PyTorch.
Collaborate with cross-functional teams to build and deploy AI solutions for real-world applications.
Enhance model accuracy through feature engineering, data preprocessing, and optimization techniques.
Stay current with the latest AI research, frameworks, and tools to continuously improve model performance.
Conduct thorough testing and validation to ensure AI models meet business requirements and deliver high-quality results.
Develop end-to-end AI solutions, including deployment, integration, and maintenance of models in production environments.
Requirements:

MSc in Artificial Intelligence or a related field (or equivalent work experience).
3+ years of experience in machine learning, deep learning, and AI model development.
Strong proficiency in Python and machine learning frameworks such as TensorFlow, PyTorch, and Keras.
Experience with advanced deep learning techniques (CNNs, RNNs, Transformers) and vision-based AI systems.
Familiarity with cloud platforms (AWS, GCP, Azure) for deploying AI models.
Strong analytical and problem-solving skills, with the ability to optimize models for efficiency and performance.
Excellent collaboration skills and the ability to work in a fast-paced, team-oriented environment.
Passion for AI, machine learning, and staying up-to-date with emerging technologies.
Preferred Skills:

Experience in NLP, computer vision, or reinforcement learning.
Familiarity with version control tools (Git) and agile development methodologies.
Knowledge of database management and data visualization tools.
"""

# *** Option 1: Mount Google Drive and provide path ***
drive.mount('/content/drive')
pdf_file_path = '/CV_MUNALISA_AUH_6.pdf'  # Adjust path to your resume in Google Drive
resume_text = extract_text_from_pdf(pdf_file_path)

# *** Option 2: Using Session Storage ***
# Uncomment this block if you are uploading the file
# uploaded = files.upload()
# pdf_file_name = next(iter(uploaded))
# pdf_file_content = uploaded[pdf_file_name]
# pdf_file = BytesIO(pdf_file_content)
# pdf_reader = PyPDF2.PdfReader(pdf_file)
# resume_text = ''
# for page in pdf_reader.pages:
#     resume_text += page.extract_text()

# Calculate match score
match_score = compute_match_score(resume_text, job_description_text)
print(f"Match Score: {match_score:.2f}%")

# Identify gaps (missing skills)
missing_skills = identify_gaps(resume_text, job_description_text)
if missing_skills:
    print("Missing Skills/Experience:", missing_skills)
else:
    print("No major gaps identified.")



Collecting en-core-web-sm==3.7.1
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)
[2K     [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m12.8/12.8 MB[0m [31m37.6 MB/s[0m eta [36m0:00:00[0m
[38;5;2m✔ Download and installation successful[0m
You can now load the package via spacy.load('en_core_web_sm')
[38;5;3m⚠ Restart to reload dependencies[0m
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Match Score: 35.17%
Missing Skills/Experience: {'agile', 'preprocessing', 'continuously', 'technology', 'meet', 'experienced', 'highquality', 'innovative', 'responsibility', 'aws', 'prefer', 'ideal', 'integration

In [None]:
from google.colab import drive
drive.mount('/content/drive')