## <u>Name</u> : ADVAIT GURUNATH CHAVAN
## <u>Contact No.</u> : +91 70214 55852
## <u>Mail ID </u> : advaitchavan135@gmail.com
## <u>Bharat Intern Task No. </u> : 1 -> Resume Parser
### Build a keyword-matching tool that uses NLTK to find the best-fitting candidate for a job description

## Importing the necessary modules

In [1]:
import nltk
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.corpus import stopwords
from nltk.tag import pos_tag
from nltk import FreqDist

# Fetch the tokenizer models and stopword lists (a no-op if they are already up to date)
nltk.download('punkt')
nltk.download('stopwords')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Advait\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Advait\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

## Step 1: Creating a function to tokenize the text and remove stopwords

In [9]:
def preprocess_text(text):
    # Tokenize the text
    tokens = word_tokenize(text.lower())

    # Remove stopwords
    stop_words = set(stopwords.words("english")) #selected text language is English
    filtered_tokens = [token for token in tokens if token not in stop_words]

    return filtered_tokens
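One thing to watch: `word_tokenize` emits punctuation marks as separate tokens, and the English stopword list does not contain them, so commas and periods pass through the filter above and can later count as "matching" tokens. Below is a minimal sketch of an extra punctuation filter. To keep it self-contained it does not call NLTK: `simple_tokenize` and `DEMO_STOPWORDS` are hypothetical stand-ins for `word_tokenize` and `stopwords.words("english")`, not part of the original notebook.

```python
import re

# Hypothetical stand-in for stopwords.words("english"); the real list is much longer.
DEMO_STOPWORDS = {"a", "an", "and", "the", "in", "of", "to", "with", "is"}


def simple_tokenize(text):
    """Rough stand-in for word_tokenize: words (with optional +/# suffix) or single punctuation marks."""
    return re.findall(r"[a-z0-9]+[+#]*|[^\sa-z0-9]", text.lower())


def preprocess_text_no_punct(text):
    """Drop stopwords and any token that contains no letter or digit."""
    tokens = simple_tokenize(text)
    return [
        token
        for token in tokens
        if token not in DEMO_STOPWORDS and any(ch.isalnum() for ch in token)
    ]


print(preprocess_text_no_punct("Skilled in Python, Java, and C++."))
# ['skilled', 'python', 'java', 'c++']
```

The `any(ch.isalnum() ...)` test is used instead of `token.isalnum()` so that skill names such as `c++` are kept while pure punctuation is discarded. The same condition could be added to the list comprehension in `preprocess_text`.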

## Step 2: Calculating similarity

**Preprocessing:** The `calculate_similarity` function first preprocesses the job keywords and the candidate profile by calling `preprocess_text`, which lowercases the text, tokenizes it, and removes stopwords. This keeps common function words such as "the" and "and" from inflating the comparison.

**Term frequencies:** The function then builds a frequency distribution (`FreqDist`) for each of the two texts. A frequency distribution maps each token to the number of times it occurs in the text.

**Common tokens:** Next, the function finds the tokens that appear in both the job keywords and the candidate profile by intersecting the key sets of the two distributions.

**Score calculation:** Finally, for each common token the function multiplies its frequency in the job keywords by its frequency in the candidate profile, and sums these products to obtain the similarity score.
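Written as a formula, the score is $\text{score} = \sum_{t \in J \cap C} f_J(t)\, f_C(t)$, where $f_J$ and $f_C$ are the token counts in the job text and the candidate profile. The small worked example below uses `collections.Counter` (the class that NLTK's `FreqDist` is built on) so that it runs without NLTK; the function name `dot_product_score` and the token lists are made up for illustration.

```python
from collections import Counter


def dot_product_score(job_tokens, candidate_tokens):
    """Sum of frequency products over the tokens the two lists share."""
    job_freq = Counter(job_tokens)
    cand_freq = Counter(candidate_tokens)
    common = job_freq.keys() & cand_freq.keys()
    return sum(job_freq[token] * cand_freq[token] for token in common)


# Illustrative token lists, as they might look after preprocessing
job_tokens = ["python", "ai", "python", "data"]
candidate_tokens = ["python", "data", "data", "excel"]

# python: 2 * 1 = 2, data: 1 * 2 = 2, so the total is 4
print(dot_product_score(job_tokens, candidate_tokens))  # 4
```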

In [10]:
def calculate_similarity(job_keywords, candidate_profile):
    # Preprocess the job keywords and candidate profile
    job_tokens = preprocess_text(job_keywords)
    candidate_tokens = preprocess_text(candidate_profile)

    # Calculate term frequencies
    job_frequencies = FreqDist(job_tokens)
    candidate_frequencies = FreqDist(candidate_tokens)

    # Calculate the similarity score
    common_tokens = set(job_frequencies.keys()) & set(candidate_frequencies.keys())
    score = sum([job_frequencies[token] * candidate_frequencies[token] for token in common_tokens])

    return score
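Because the score is a sum of raw count products, a longer profile that repeats the same keywords scores higher than a shorter one with the same content. One common way to reduce that effect is to divide by the lengths of the two frequency vectors (cosine similarity). This is a possible variant, not what the notebook does; `cosine_similarity` is a hypothetical helper and `Counter` again stands in for `FreqDist`.

```python
import math
from collections import Counter


def cosine_similarity(job_tokens, candidate_tokens):
    """Dot product of the two count vectors divided by the product of their norms."""
    job_freq = Counter(job_tokens)
    cand_freq = Counter(candidate_tokens)
    common = job_freq.keys() & cand_freq.keys()
    dot = sum(job_freq[token] * cand_freq[token] for token in common)
    job_norm = math.sqrt(sum(count * count for count in job_freq.values()))
    cand_norm = math.sqrt(sum(count * count for count in cand_freq.values()))
    if job_norm == 0 or cand_norm == 0:
        return 0.0  # an empty text matches nothing
    return dot / (job_norm * cand_norm)


job = ["python", "data"]
short_profile = ["python", "data"]
long_profile = short_profile * 3  # same words, repeated three times

# The raw dot product would be 2 for the short profile and 6 for the long one;
# after normalization both come out at (approximately) 1.0.
print(cosine_similarity(job, short_profile))
print(cosine_similarity(job, long_profile))
```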


## Step 3: Finding the best candidate

**A. Initialization:** The `find_best_candidate` function initializes `best_score` to 0 and `best_candidate` to `None`. These variables track the highest similarity score seen so far and the name of the candidate who achieved it.

**B. Iteration:** The function iterates over each candidate in the `candidate_details` list.

**C. Similarity calculation:** For each candidate, it reads the candidate's profile and calls `calculate_similarity` with `job_keywords` and `candidate_profile`, which returns the similarity score between the two texts.

**D. Update best candidate:** If the current candidate's score is strictly higher than `best_score`, `best_score` is updated to the new score and `best_candidate` to the current candidate's name. On a tie, the earlier candidate is kept.

**E. Return best candidate:** After all candidates have been processed, the function returns the name of the candidate with the highest score, or `None` if no candidate scored above 0.

In [11]:
def find_best_candidate(job_keywords, candidate_details):
    best_score = 0
    best_candidate = None

    for candidate in candidate_details:
        candidate_profile = candidate['profile']
        score = calculate_similarity(job_keywords, candidate_profile)

        if score > best_score:
            best_score = score
            best_candidate = candidate['name']

    return best_candidate
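The function above reports only the single top name. When it helps to see how close the other candidates were, the same loop can return a full ranking instead. The sketch below is one way to do that: `rank_candidates` and `overlap_score` are hypothetical names, the scoring function is passed in as a parameter so the example runs on its own, and the candidate data is invented for illustration.

```python
def rank_candidates(job_keywords, candidate_details, score_fn):
    """Return (score, name) pairs sorted from highest to lowest score."""
    scored = [
        (score_fn(job_keywords, candidate["profile"]), candidate["name"])
        for candidate in candidate_details
    ]
    # sorted() is stable, so candidates with equal scores keep their input order
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def overlap_score(job_text, profile_text):
    """Toy scorer for this demo only: number of distinct shared words."""
    return len(set(job_text.lower().split()) & set(profile_text.lower().split()))


candidates = [
    {"name": "Candidate A", "profile": "excel sql reporting"},
    {"name": "Candidate B", "profile": "python machine learning"},
    {"name": "Candidate C", "profile": "python sql"},
]

ranking = rank_candidates("python machine learning intern", candidates, overlap_score)
for score, name in ranking:
    print(f"{name}: {score}")
```

In the notebook, `calculate_similarity` could be passed as `score_fn` in place of the toy scorer.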


In [12]:
def take_job_inputs():
    job_description = input("Enter the job description: ")
    num_candidates = int(input("\n\nEnter the number of candidates: "))

    candidate_details = []
    for i in range(num_candidates):
        name = input(f"\n\nEnter the name of candidate {i+1}: ")
        profile = input(f"\n\nEnter the profile details of candidate {i+1}: ")
        candidate_details.append({'name': name, 'profile': profile})

    return job_description, candidate_details
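Note that `int(input(...))` raises `ValueError` if the user types anything that is not a whole number. A small validation helper, sketched below under the assumption that only positive counts make sense, can guard against that; `parse_candidate_count` is a hypothetical name and is not part of the original notebook.

```python
def parse_candidate_count(raw):
    """Return a positive integer parsed from raw, or None if the input is invalid."""
    try:
        count = int(raw.strip())
    except ValueError:
        return None
    return count if count > 0 else None


print(parse_candidate_count("3"))    # 3
print(parse_candidate_count("abc"))  # None
print(parse_candidate_count("0"))    # None
```

Inside `take_job_inputs`, the prompt could then be repeated until the helper returns something other than `None`.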

In [13]:
def find_best_candidate_for_job(job_keywords, candidate_details):
    # Same logic as find_best_candidate defined earlier, so delegate instead of duplicating it
    return find_best_candidate(job_keywords, candidate_details)

In [14]:
job_description, candidate_details = take_job_inputs()

Enter the job description: Job Title: AI Software Development Intern Job Description: We are seeking a motivated and skilled AI Software Development Intern to join our software  engineering team. As an AI Software Development Intern, you will have the opportunity to work on  developing and implementing AI-powered software solutions. You will collaborate with experienced  engineers to design and build robust AI systems, leverage machine learning algorithms, and contribute  to the development of cutting-edge AI applications. Responsibilities: 1. Collaborate with the software engineering team to understand project requirements and contribute  to the design and architecture of AI-driven software solutions. 2. Assist in developing and implementing AI models, algorithms, and systems using programming  languages such as Python, Java, or C++. 3. Preprocess and analyze large-scale data sets to extract meaningful insights and prepare them for AI  model training. 4. Train and fine-tune AI models 



Enter the name of candidate 3: Advait


Enter the profile details of candidate 3: ADVAIT CHAVAN +91 70214 55852 Andheri East, Mumbai, Maharashtra, India advaitchavan135@gmail.com CONTACT EDUCATION Anjuman-I-Islam's M.H. Saboo Siddik College of Engineering BACHELOR OF ENGINEERING ELECTRONICS - BE ELECTRONICS Nirmala Memorial College of Commerce and Science Aug 2017 - Mar 2019 HIGHER SECONDARY - SCIENCE HSC SCIENCE MAHARASHRA BOARD 63.54% CGPA till Sem 6- 9.36 out of 10 Aug 2019 - May 2023 SSC MAHARASHTA BOARD Children's Academy May 2016 - Feb 2017 87.20% SKILLS Excellent Problem Analysis Solid Numerical Solving Excel proficiency and knowledge of querying languages Expertise in Data Visualization Great Communication WORK EXPERIENCE Trainity Sept 2022 - Dec 2022 DATA ANALYST INTERN Analyzed data and generated reports using MySQL Workbench 8.0 to identify platform-oriented problems and provide insights for the marketing team and investors, resulting in effective problemsolving. Created 

In [15]:
best_candidate = find_best_candidate_for_job(job_description, candidate_details)
print(f"The best candidate for the job is: {best_candidate}")

The best candidate for the job is: David
