# Introduction

In this digital era, technology has rapidly changed recruiting by helping organisations optimise and scale their hiring processes.
The goal of this project is to improve the hiring process by building a machine learning pipeline that parses resumes in different formats and automates their comparison with job descriptions, helping recruiters determine whether a candidate's qualifications align with the job criteria.
The Resume Parsing AI project uses Natural Language Processing (NLP) and Artificial Intelligence (AI) methods to automate the analysis of resume data. The ultimate goal is to give hiring managers and recruiters a simpler, more accurate, and more efficient way of finding top talent and making sound recruiting decisions.


The project begins by importing the libraries needed for data analysis and visualisation.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px

The dataset is loaded from a CSV file into a pandas DataFrame named 'df'.

In [2]:
df = pd.read_csv("UpdatedResumeDataSet.csv")
df.head()

Unnamed: 0,Category,Resume,Resume_Length,manual_resumes
0,data science,skills programming languages python pandas num...,3960,skills programming languages python pandas num...
1,data science,education details may 2013 may 2017 uitrgpv da...,1027,education details may 2013 may 2017 uitrgpv da...
2,data science,areas interest deep learning control system de...,1524,areas interest deep learning control system de...
3,data science,skills â r â python â sap hana â tableau â sap...,5864,skills â r â python â sap hana â tableau â sap...
4,data science,education details mca ymcaust faridabad haryan...,373,education details mca ymcaust faridabad haryan...


The dataset has 962 rows and 4 columns: 'Category', 'Resume', 'Resume_Length' and 'manual_resumes'.

In [3]:
df.shape

(962, 4)

The code below displays basic information about the dataset (the number of entries, null values, memory usage, and so on), followed by summary statistics for the numerical columns. Because the dataset is mostly text, there are few numerical values to summarise. It then prints the distribution of categories, i.e. the number of resumes in each category, and statistics on the length of each resume.

In [4]:
# Display basic information about the dataset
print("Dataset Information:\n")
df.info()
print('---------------------------------------------------')

# Summary statistics for numerical columns
print("\nSummary Statistics for Numerical Columns:\n")
print(df.describe())
print('---------------------------------------------------')

# Display the distribution of categories
print("\nDistribution of Categories:\n")
print(df['Category'].value_counts())
print('---------------------------------------------------')

# Display the length of resumes
df['Resume_Length'] = df['Resume'].apply(len)
print("\nResume Length Statistics:\n")
print(df['Resume_Length'].describe())
print('---------------------------------------------------')

Dataset Information:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 962 entries, 0 to 961
Data columns (total 4 columns):
 #   Column          Non-Null Count  Dtype 
---  ------          --------------  ----- 
 0   Category        962 non-null    object
 1   Resume          962 non-null    object
 2   Resume_Length   962 non-null    int64 
 3   manual_resumes  50 non-null     object
dtypes: int64(1), object(3)
memory usage: 30.2+ KB
---------------------------------------------------

Summary Statistics for Numerical Columns:

       Resume_Length
count     962.000000
mean     2587.891892
std      2323.793086
min       115.000000
25%      1015.000000
50%      1866.000000
75%      3174.000000
max     11965.000000
---------------------------------------------------

Distribution of Categories:

Category
java developer               84
testing                      70
devops engineer              55
python developer             48
web designing                45
hr                     
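The resume-length step above relies on `apply(len)`, which raises a `TypeError` if a resume is missing. As a minimal sketch on hypothetical toy data (not the real dataset), pandas' `.str.len()` computes the same lengths while leaving missing entries as missing values:

```python
import pandas as pd

# Hypothetical toy data; the last entry is deliberately missing
toy = pd.DataFrame({"Resume": ["abc", "hello world", None]})

# .str.len() returns a missing value for missing entries instead of raising an error
toy["Resume_Length"] = toy["Resume"].str.len()

print(toy)
```

In this dataset the 'Resume' column has no null values, so both approaches give the same result.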

The code below uses the Plotly library for visualisation. The horizontal bar chart shows the distribution of categories, i.e. the number of resumes in each category, with the highest count for Java Developer (count=84) and the lowest for Advocate (count=20).

In [5]:
import plotly.express as px

category_counts = df['Category'].value_counts().reset_index()
category_counts.columns = ['Category', 'Count']

fig = px.bar(category_counts, x='Count', y='Category', orientation='h', title='Distribution of Categories')
fig.show()
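The count table that feeds the bar chart can also be built without overwriting the column names by hand. The sketch below uses hypothetical toy data and leaves out the plotting step. It names the index and the count column explicitly, which should give the same column names whichever names `value_counts()` assigns by default.

```python
import pandas as pd

# Hypothetical toy data standing in for the 'Category' column
toy = pd.DataFrame({"Category": ["java developer", "testing", "java developer"]})

# value_counts() sorts by count (descending); rename_axis names the index,
# and reset_index(name=...) names the column holding the counts
category_counts = (
    toy["Category"]
    .value_counts()
    .rename_axis("Category")
    .reset_index(name="Count")
)

print(category_counts)
```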

# Cleaning

After exploring the dataset, the project proceeds to cleaning.

The code below uses pandas to drop the rows that belong to categories which are not needed, namely 'Arts' and 'Advocate'.

In [6]:
# Drop rows whose category is 'Arts' or 'Advocate' (case-insensitive, since the stored categories are lowercase)
df_filtered = df[~df['Category'].str.lower().isin(['arts', 'advocate'])]
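Because `isin` compares strings exactly, a category-based filter only removes rows whose category is spelled exactly as listed, including case. The sketch below uses hypothetical toy data with mixed spellings to show one way of making the comparison case-insensitive:

```python
import pandas as pd

# Hypothetical toy data; the spellings are chosen to show the effect of case
toy = pd.DataFrame({
    "Category": ["Arts", "advocate", "data science", "Advocate"],
    "Resume": ["r1", "r2", "r3", "r4"],
})

excluded = ["arts", "advocate"]

# Lower-case the column before comparing so 'Arts' and 'arts' are treated alike
mask = toy["Category"].str.lower().isin(excluded)
toy_filtered = toy[~mask].reset_index(drop=True)

print(toy_filtered["Category"].tolist())
```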

The code below imports the libraries needed for cleaning and preprocessing the dataset. It defines a function called 'clean' that uses compiled regular expressions to match and remove URLs and email addresses, then removes special characters, keeping only word characters and whitespace. Finally, it removes stopwords such as "the", "is" and "in" using the NLTK library. The function is applied to the 'Resume' column with the 'apply' method.

In [7]:
import re
import nltk
from nltk.corpus import stopwords
nltk.download('stopwords')

# Compile the URL and email patterns once, outside the function, so every call reuses them
url_pattern = re.compile(r'https?://\S+|www\.\S+')
email_pattern = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')

# Build the stop-word set once instead of on every call
stop_words = set(stopwords.words('english'))

def clean(text):
    # Remove URLs
    clean_text = url_pattern.sub('', text)

    # Remove emails
    clean_text = email_pattern.sub('', clean_text)

    # Remove special characters (keeping only words & whitespace)
    clean_text = re.sub(r'[^\w\s]', '', clean_text)

    # Remove stop words by filtering the split words of the text
    clean_text = ' '.join(word for word in clean_text.split() if word.lower() not in stop_words)

    return clean_text

df["Resume"] = df["Resume"].apply(clean)
df.head()

[nltk_data] Downloading package stopwords to
[nltk_data]     /Users/saadiyashaikh/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Unnamed: 0,Category,Resume,Resume_Length,manual_resumes
0,data science,skills programming languages python pandas num...,3960,skills programming languages python pandas num...
1,data science,education details may 2013 may 2017 uitrgpv da...,1027,education details may 2013 may 2017 uitrgpv da...
2,data science,areas interest deep learning control system de...,1524,areas interest deep learning control system de...
3,data science,skills â r â python â sap hana â tableau â sap...,5864,skills â r â python â sap hana â tableau â sap...
4,data science,education details mca ymcaust faridabad haryan...,373,education details mca ymcaust faridabad haryan...
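To make the effect of the cleaning steps concrete, here is a minimal, self-contained sketch of the same sequence (URLs, emails, special characters, stop words) applied to one made-up sentence. The function name `clean_sketch`, the sample text and the small hand-picked stop-word set are assumptions for illustration only. The notebook itself uses NLTK's full English stop-word list.

```python
import re

URL_PATTERN = re.compile(r'https?://\S+|www\.\S+')
EMAIL_PATTERN = re.compile(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b')

# Assumption: a tiny stand-in for NLTK's English stop-word list
STOP_WORDS = {"me", "at", "or", "in", "the", "is"}

def clean_sketch(text):
    text = URL_PATTERN.sub('', text)        # drop URLs
    text = EMAIL_PATTERN.sub('', text)      # drop email addresses
    text = re.sub(r'[^\w\s]', '', text)     # drop punctuation and symbols
    # drop stop words, comparing in lowercase
    return ' '.join(w for w in text.split() if w.lower() not in STOP_WORDS)

sample = "Contact me at jane.doe@example.com or https://example.com/cv. Skilled in Python & SQL!"
print(clean_sketch(sample))  # prints: Contact Skilled Python SQL
```

Note that the URL pattern consumes everything up to the next whitespace, so the trailing full stop after the URL is removed together with it.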


This for loop iterates over each column in df and checks whether the column holds string values (object dtype). If it does, the values are converted to lowercase using str.lower().

In [8]:
for column in df.columns:
    if df[column].dtype == 'O':  # Check if the column contains object (string) values
        df[column] = df[column].str.lower()

##### Checking the changes on one of the resumes in the dataset.

In [9]:
df['Resume'][0]

'skills programming languages python pandas numpy scipy scikitlearn matplotlib sql java javascriptjquery machine learning regression svm naãve bayes knn random forest decision trees boosting techniques cluster analysis word embedding sentiment analysis natural language processing dimensionality reduction topic modelling lda nmf pca neural nets database visualizations mysql sqlserver cassandra hbase elasticsearch d3js dcjs plotly kibana matplotlib ggplot tableau others regular expression html css angular 6 logstash kafka python flask git docker computer vision open cv understanding deep learningeducation details data science assurance associate data science assurance associate ernst young llp skill details javascript exprience 24 months jquery exprience 24 months python exprience 24 monthscompany details company ernst young llp description fraud investigations dispute services assurance technology assisted review tar technology assisted review assists accelerating review process run ana

# Preprocessing

This code takes the first 50 resumes from the dataset, creates a new column in the DataFrame named "manual_resumes", copies the selected resumes into this column, and saves the modified DataFrame back to a CSV file.

In [10]:
import pandas as pd

# Select the first 50 resumes from the dataset
selected_resumes = df.loc[:49, 'Resume']

# Create a new column 'manual_resumes' of missing values (object dtype, so it can hold text)
df['manual_resumes'] = pd.Series(np.nan, index=df.index, dtype='object')

# Assign the selected resumes to the 'manual_resumes' column
df.loc[:49, 'manual_resumes'] = selected_resumes.values

# Save the DataFrame with the new column
df.to_csv("UpdatedResumeDataSet.csv", index=False)

print(df.head(10))

       Category                                             Resume  \
0  data science  skills programming languages python pandas num...   
1  data science  education details may 2013 may 2017 uitrgpv da...   
2  data science  areas interest deep learning control system de...   
3  data science  skills â r â python â sap hana â tableau â sap...   
4  data science  education details mca ymcaust faridabad haryan...   
5  data science  skills c basics iot python matlab data science...   
6  data science  skills â python â tableau â data visualization...   
7  data science  education details btech rayat bahra institute ...   
8  data science  personal skills â ability quickly grasp techni...   
9  data science  expertise â data quantitative analysis â decis...   

   Resume_Length                                     manual_resumes  
0           3960  skills programming languages python pandas num...  
1           1027  education details may 2013 may 2017 uitrgpv da...  
2           1524  a
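One detail of this step is worth spelling out: `.loc` slices by label and includes the end label, which is why `df.loc[:49, ...]` selects 50 rows with the default integer index. The sketch below shows this on hypothetical toy data with five rows. The object-dtype placeholder column is an assumption made here so that text can be assigned into it without a dtype conflict.

```python
import pandas as pd

# Hypothetical toy data standing in for the resume dataset
toy = pd.DataFrame({"Resume": ["a", "b", "c", "d", "e"]})

# .loc includes the end label, so :2 selects the rows labelled 0, 1 and 2
selected = toy.loc[:2, "Resume"]

# Start from an object-dtype column of missing values, then fill the selected rows
toy["manual_resumes"] = pd.Series([None] * len(toy), index=toy.index, dtype="object")
toy.loc[:2, "manual_resumes"] = selected.values

print(len(selected))                        # prints: 3
print(toy["manual_resumes"].notna().sum())  # prints: 3
```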

The code below uses a list of dictionaries called job_descriptions to create a DataFrame 'df2'. Each dictionary represents one job posting and holds details such as the job title, company, location, job description, information about the company and requirements. These sample job descriptions were collected from several different websites.

The columns of the DataFrame df2 are as follows:

"Job Title": The title of the job role.

"Company": The name of the company that is hiring.

"Location": The job's location.

"About Company": A description of the company, including its aims, core values, and services.

"Description": A thorough explanation of what is required for the position.

"Responsibilities": Specific tasks related to the position.

"Requirements": The education, training, and work experience needed for the position.

Finally, the code prints the first few rows of the DataFrame df2.
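The list-of-dictionaries pattern described above can be sketched on two heavily shortened records. The records, company names and the names `sample_jobs` and `sample_df` are hypothetical and are not part of the project's data.

```python
import pandas as pd

# Hypothetical, shortened records; the real entries carry full text and more fields
sample_jobs = [
    {"Job Title": "Data Analyst", "Company": "Example Co.", "Location": "Remote",
     "Requirements": "* SQL * Python"},
    {"Job Title": "QA Tester", "Company": "Sample Ltd.", "Location": "Pune, India",
     "Requirements": "* Manual testing"},
]

# Each dictionary becomes one row; the dictionary keys become the column names
sample_df = pd.DataFrame(sample_jobs)

print(sample_df.head())
```

If a key is missing from one of the dictionaries, pandas fills that cell with a missing value, so it helps to keep the keys consistent across all entries.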

In [11]:
import pandas as pd

# Define the job descriptions with additional information
job_descriptions = [
    {
    "Job Title": "Senior Data Scientist",
    "Company": "DataTech Solutions Inc.",
    "Location": "San Francisco, CA, USA",
    "About Company":"DataTech Solutions helps businesses use their data to get ahead. We're a tech company with experts in data analysis, artificial intelligence, and machine learning. We take a bunch of information, make it clear and useful, and help companies figure out new and better ways to do things. Our team loves using the latest tech to solve problems and make a real difference!",
    "Description": "We are seeking a highly skilled Senior Data Scientist to join our dynamic team. This role involves tackling complex data challenges, developing predictive models, and extracting valuable insights to inform strategic decisions.The ideal candidate will have a strong background in data science, with proven experience in machine learning, statistical analysis, and big data technologies. You will lead projects from conception to deployment, mentor junior data scientists, and collaborate with cross-functional teams to drive innovation and improve our data-driven decision-making processes. If you are passionate about leveraging data to solve problems and have a track record of delivering impactful data science solutions, we would love to hear from you.",
    "Responsibilities": "* Lead end-to-end data science projects, from problem formulation and data exploration to model development, deployment, and performance monitoring.* Develop and implement advanced machine learning algorithms and statistical models to analyze large-scale datasets, extract valuable insights, and drive business outcomes.* Collaborate with cross-functional teams, including engineers, product managers, and business stakeholders, to define project objectives, prioritize tasks, and deliver high-impact solutions on time and within budget.* Mentor and guide junior data scientists, providing technical leadership, sharing best practices, and fostering a culture of continuous learning and professional growth.* Stay abreast of the latest advancements in data science, machine learning, and artificial intelligence, and apply cutting-edge techniques and methodologies to solve real-world problems.\n* Communicate findings, insights, and recommendations effectively to both technical and non-technical audiences through clear visualizations, reports, and presentations.\n* Contribute to the development of reusable frameworks, libraries, and tools to streamline data analysis, model development, and deployment processes.* Act as a subject matter expert in data science and machine learning, providing thought leadership, participating in industry conferences, and representing DataTech Solutions Inc. in the data science community.",
    "Requirements": "* Develop and implement advanced statistical and machine learning models to solve complex problems * Lead the data-driven decision-making process, from data collection and analysis to implementation and monitoring of solutions* Manage data science projects, ensuring they meet business requirements and are delivered on time* Mentor junior data scientists, providing guidance and support in their professional development* Collaborate with cross-functional teams to understand business challenges and objectives, translating complex data into actionable insights* Stay abreast of industry trends and advancements in data science and machine learning, continuously improving our methodologies and technologies"},
    
    {
    "Job Title": "Software Engineer",
    "Company": "TechGenius Ltd.",
    "Location": "Seattle, WA, USA",
    "About Company": "TechGenius Ltd. helps businesses win in the digital age. We're a tech company with a team of super-smart engineers, designers, and innovators who build cutting-edge software and digital tools. We focus on making high-quality products that help businesses succeed. We're passionate about using technology to solve problems and make a real difference!",
    "Description": "We are looking for a passionate Software Engineer to design, develop and install software solutions. Software Engineer responsibilities include gathering user requirements, defining system functionality and writing code in various languages, like Java, Ruby on Rails or .NET programming languages (e.g. C++ or JScript.NET.) Our ideal candidates are familiar with the software development life cycle (SDLC) from preliminary system analysis to tests and deployment. Ultimately, the role of the Software Engineer is to build high-quality, innovative and fully performing software that complies with coding standards and technical design.",
    "Responsibilities": "* Execute full software development life cycle (SDLC)* Develop flowcharts, layouts and documentation to identify requirements and solutions* Write well-designed, testable code* Produce specifications and determine operational feasibility* Integrate software components into a fully functional software system* Develop software verification plans and quality assurance procedures* Document and maintain software functionality* Troubleshoot, debug and upgrade existing systems* Deploy programs and evaluate user feedback* Comply with project plans and industry standards* Ensure software is updated with latest features", 
    "Requirements": "* Proven work experience as a Software Engineer or Software Developer*Experience designing interactive applications*Ability to develop software in Java, Ruby on Rails, C++ or other programming languages*Excellent knowledge of relational databases, SQL and ORM technologies (JPA2, Hibernate) *Experience developing web applications using at least one popular web framework (JSF, Wicket, GWT, Spring MVC)*Experience with test-driven development *Proficiency in software engineering tools *Ability to document requirements and specifications*BSc degree in Computer Science, Engineering or relevant field"},
     
    {
    "Job Title": "Project Manager",
    "Company": "ShopSmart Inc.",
    "Location": "New York, NY, USA",
    "About Company": "We're a leading e-commerce company that's all about creating new and exciting ways to shop online. We focus on keeping our customers happy with an easy-to-use platform and personalized features. Our awesome team is always looking for ways to improve and grow, making sure ShopSmart stays ahead of the curve in the ever-changing world of online shopping!",
    "Description": "We are seeking a talented and experienced Project Manager to join our team in New York City. As a Project Manager at ShopSmart Inc., you will be responsible for leading the development and execution of our e-commerce platform strategy, driving product enhancements, and delivering exceptional shopping experiences for our customers. You will collaborate closely with cross-functional teams to define project requirements, prioritize features, and manage project roadmap execution to achieve business objectives and customer satisfaction.",
    "Responsibilities": "* Coordinate internal resources and third parties/vendors for the flawless execution of projects * Ensure that all projects are delivered on-time, within scope and within budget * Develop project scopes and objectives, involving all relevant stakeholders and ensuring technical feasibility * Ensure resource availability and allocation * Develop a detailed project plan to track progress * Use appropriate verification techniques to manage changes in project scope, schedule and costs * Measure project performance using appropriate systems, tools and techniques * Report and escalate to management as needed * Manage the relationship with the client and all stakeholders * Perform risk management to minimize project risks * Establish and maintain relationships with third parties/vendors * Create and maintain comprehensive project documentation",
    "Requirements": "* Bachelor's degree in computer science, business, or a related field * 5-8 years of project management and related experience * Project Management Professional (PMP) certification preferred * Proven ability to solve problems creatively * Strong familiarity with project management software tools, methodologies, and best practices * Experience seeing projects through the full life cycle * Excellent analytical skills * Strong interpersonal skills and extremely resourceful * Proven ability to complete projects according to outlined scope, budget, and timeline"},

    {
    "Job Title": "Marketing Manager",
    "Company": "DigitalPulse Solutions LLC",
    "Location": "Los Angeles, CA, USA",
    "About Company": "DigitalPulse Solutions LLC is a leading digital marketing agency specializing in delivering innovative and results-driven marketing solutions for businesses across industries. With a focus on leveraging cutting-edge technology and strategic insights, we help our clients achieve their marketing goals and drive growth in the digital landscape. Our team of creative professionals is dedicated to delivering impactful campaigns that resonate with audiences and deliver measurable results.",
    "Description": "At DigitalPulse Solutions LLC, marketing is about understanding people and building awareness of how our products and services can satisfy their needs. We’re looking for an experienced and versatile marketing manager who’s eager to do this and more. The ideal candidate has experience in developing and executing marketing campaigns while managing and inspiring a team. The manager should be equally proficient with day-to-day marketing activities and long-term strategizing, and strive under tight deadlines to meet the company's changing needs.",
    "Responsibilities": "* Help develop creative briefs and guide creative direction to meet objectives for all advertising and public-­facing communications, including print, digital, and video assets * Conceptualize and execute on multichannel campaigns across the prospect and customer lifecycle, ensuring the alignment of communications and messaging in all channels * Manage content and updates for customer and internal touch points, establish budget guidelines, participate in events, document business processes, and provide sales support * Gather customer and market insights to inform outreach strategies, increase customer conversions, and generate more qualified leads * Identify effectiveness and impact of current marketing initiatives with tracking and analysis, and optimize accordingly * Present ideas and final deliverables to internal and external teams, and communicate with senior leaders about marketing programs, strategies, and budgets",
    "Requirements": "* Proven success in developing marketing plans and campaigns * Excellent written and verbal communication skills * Strong project management, multitasking, and decision-making skills * Metrics-driven marketing mind with eye for creativity * Experience with marketing automation and CRM tools * Bachelor’s degree (or equivalent) in marketing, business, or related field * Proficiency with online marketing and social media strategy * Proven success in designing interactive applications and networking platforms * Willingness to travel * Established contacts in media"},

    {
    "Job Title": "Data Engineer",
    "Company": "CloudWorks Technologies Corp.",
    "Location": "Austin, TX, USA",
    "About Company": "CloudWorks Technologies Corp. is a leading provider of cloud-based solutions and services, empowering businesses to leverage the power of the cloud for data analytics, machine learning, and digital transformation. With a focus on innovation and scalability, we help our clients unlock the full potential of their data and drive business growth in the digital age. Our team of skilled professionals is dedicated to delivering high-quality solutions that meet the evolving needs of our customers.",
    "Description": "We are looking for a savvy Data Engineer to join our growing team of analytics experts. The hire will be responsible for expanding and optimizing our data and data pipeline architecture, as well as optimizing data flow and collection for cross functional teams. The ideal candidate is an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up. The Data Engineer will support our software developers, database architects, data analysts and data scientists on data initiatives and will ensure optimal data delivery architecture is consistent throughout ongoing projects. They must be self-directed and comfortable supporting the data needs of multiple teams, systems and products. The right candidate will be excited by the prospect of optimizing or even re-designing our company’s data architecture to support our next generation of products and data initiatives.",
    "Responsibilities": "* Create and maintain optimal data pipeline architecture, * Assemble large, complex data sets that meet functional / non-functional business requirements. * Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc. * Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies. * Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics. * Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs. * Keep our data separated and secure across national boundaries through multiple data centers and AWS regions. * Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader. * Work with data and analytics experts to strive for greater functionality in our data systems.",
    "Requirements": "* Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases. * Experience building and optimizing ‘big data’ data pipelines, architectures and data sets. * Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement. * Strong analytic skills related to working with unstructured datasets. * Build processes supporting data transformation, data structures, metadata, dependency and workload management. * A successful history of manipulating, processing and extracting value from large disconnected datasets. * Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores. * Strong project management and organizational skills. * Experience supporting and working with cross-functional teams in a dynamic environment."},

    
    {
    "Job Title": "Business Analyst",
    "Company": "FinanceGenie Enterprises Inc.",
    "Location": "Chicago, IL, USA",
    "About Company": "Leading financial services provider FinanceGenie Enterprises Inc. is committed to providing creative solutions and tactical insights to support companies in reaching their financial goals. We allow our clients to make well-informed decisions, maximize performance, and drive growth in the fast-paced and cutthroat business environment by putting a strong emphasis on using technology and analytics. Our experienced group of experts is dedicated to providing our clients with outstanding service and value.",
    "Description": "We are seeking a highly motivated and detail-oriented Business Analyst to join our team and play a key role in identifying, analyzing, and implementing improvements to our business processes. You will work closely with stakeholders across various departments to understand their needs, collect and analyze data, and develop solutions that enhance efficiency, effectiveness, and overall business value.",
    "Responsibilities": "* Conduct thorough business analysis to identify opportunities for improvement in processes, systems, and workflows. * Elicit and document business requirements from stakeholders, ensuring clarity and completeness. * Analyze data from various sources to identify trends, patterns, and insights. * Develop and present detailed reports and recommendations with clear justifications and data visualization. * Work with technical teams to design and implement solutions aligned with business requirements. * Manage project timelines, budgets, and resources effectively. * Stay up-to-date on industry trends and best practices in business analysis. * Participate in continuous improvement initiatives to optimize operations and performance. * Provide ongoing support and training to stakeholders on new processes and systems.",
    "Requirements": "* Previous experience in Business / Systems Analysis or Quality Assurance * A degree or certificate in IT / Computer Science * Proven experience in eliciting requirements and testing * Experience in analyzing data to draw business-relevant conclusions and in data visualization techniques and tools * Solid experience in writing SQL queries * Basic knowledge in generating process documentation * Strong written and verbal communication skills, including technical writing skills"},

    {
    "Job Title": "Cybersecurity Specialist",
    "Company": "SecureTech Solutions Ltd.",
    "Location": "Washington, D.C., USA",
    "About Company": "SecureTech Solutions Ltd. is a top cybersecurity company committed to providing innovative services and solutions to shield businesses from online threats and weaknesses. By combining state-of-the-art technology and industry best practices, we help our clients to protect their digital assets, stay compliant with regulations, and reduce cybersecurity threats. Our seasoned team of experts is dedicated to providing clients in a variety of sectors with outstanding service and knowledge.",
    "Description": "As a Cyber Security Specialist, you will be responsible for ensuring the confidentiality, integrity, and availability of company data and systems. You will develop and implement security policies and procedures, perform vulnerability assessments and penetration testing, and investigate security incidents. You will also work closely with IT teams, professional services, and client-facing teams to ensure the security of information systems.",
    "Responsibilities": "* Develop and implement security policies and procedures * Conduct vulnerability assessments and penetration testing * Investigate security incidents and provide incident response * Maintain security awareness program and deliver training to employees * Ensure compliance with applicable regulations and standards * Perform risk assessments and develop risk management strategies * Collaborate with IT teams to identify security vulnerabilities and implement solutions * Monitor network and system activity for security threats * Perform security audits and conduct security assessments * Provide technical expertise and support to client-facing teams",
    "Requirements": "* Bachelor's degree in Computer Science or related field * 2+ years of experience in cyber security * Strong knowledge of information security principles and practices * Experience with vulnerability assessment and penetration testing tools * Knowledge of security-related laws, regulations, and standards * Excellent analytical and problem-solving skills * Strong communication and interpersonal skills"},

    {
    "Job Title": "Human Resources Manager",
    "Company": "TalentForge Inc.",
    "Location": "Atlanta, GA, USA",
    "About Company": "TalentForge Inc. is a leading human resources and talent acquisition firm dedicated to connecting top talent with leading organizations across industries. With a focus on innovation and excellence, we provide comprehensive HR solutions and recruitment services to help our clients attract, engage, and retain the best talent in the market. Our team of HR professionals is committed to delivering exceptional service and driving success for both candidates and clients.",
    "Description": "If you thrive in a people-driven work environment, then TalentForge Inc. HR Manager role is the right fit for you. As an HR Manager, you will oversee the HR team and help other department managers ensure policies and practices are fair for all employees. You’ll manage the administrative process of our organization to provide employees with an ethical work experience. Specifically, you’ll work directly with other members of our HR team, like the Onboarding Managers, and you’ll report to our HR Director.",
    "Responsibilities": "* Handle conflict resolution and disciplinary actions * Oversee payroll and follow pay schedule * Lead and train departmental managers on HR best practices to cultivate an ethical workspace * Monitor job performance of employees and aid in career path development * Oversee the recruitment, hiring process, and onboarding process * Make sure our organization complies with HR labor laws and regulations * Create a smooth onboarding process as well as employee termination process",
    "Requirements": "* At least 4 years of HR experience * A bachelor’s degree in Human Resources * An advanced understanding of labor laws, rules, regulations, and best practices * Excellent communication skills * Organizational expertise * General understanding of HR software and MS Office"},

    {
    "Job Title": "UX/UI Designer",
    "Company": "DesignWorks Creative Agency",
    "Location": "San Diego, CA, USA",
    "About Company": "Leading design company DesignWorks Creative Agency specializes in developing engaging and user-centered digital experiences for customers in a variety of sectors. Through effective design solutions, we assist our customers in elevating their brands and engaging their audiences. We do this by putting a strong emphasis on creativity, innovation, and teamwork. Our group of gifted designers and creatives is committed to pushing the boundaries of design and providing our clients with outstanding outcomes.",
    "Description": "We are looking for a UI/UX Designer to turn our software into easy-to-use products for our clients. UI/UX Designer responsibilities include gathering user requirements, designing graphic elements, and building navigation components. To be successful in this role, you should have experience with design software and wireframe tools. If you also have a portfolio of professional design projects that includes work with web/mobile applications, we’d like to meet you. Ultimately, you’ll create both functional and appealing features that address our clients’ needs and help us grow our customer base.",
    "Responsibilities": "* Gather and evaluate user requirements in collaboration with product managers and engineers * Illustrate design ideas using storyboards, process flows, and sitemaps * Design graphic user interface elements, like menus, tabs, and widgets * Build page navigation buttons and search fields * Develop UI mockups and prototypes that clearly illustrate how sites function and look like * Create original graphic designs (e.g., images, sketches, and tables) * Prepare and present rough drafts to internal teams and key stakeholders * Identify and troubleshoot UX problems (e.g., responsiveness) * Conduct layout adjustments based on user feedback * Adhere to style standards on fonts, colors, and images",
    "Requirements": "* Proven work experience as a UI/UX Designer or similar role * Portfolio of design projects * Knowledge of wireframe tools (e.g., Wireframe.cc and InVision) * Up-to-date knowledge of design software like Adobe Illustrator and Photoshop * Team spirit; strong communication skills to collaborate with various stakeholders * Good time-management skills * BSc in Design, Computer Science, or relevant field"},

    
    {
    "Job Title": "Customer Service Representative",
    "Company": "EnterpriseSales Solutions Inc.",
    "Location": "Remote",
    "About Company": "EnterpriseSales Solutions is a leading provider of sales solutions and services to businesses of all sizes. With a focus on innovation and customer satisfaction, we help our clients maximize sales performance, drive revenue growth, and achieve business success. Our team of dedicated professionals is committed to delivering exceptional service and support to our clients.",
    "Description": "At EnterpriseSales Solutions, we count on the customer service department to interact professionally with our valued customers when they have questions or concerns. We’re looking for a highly skilled customer service representative to join our team and handle inbound and outbound phone calls, email requests, and face-to-face interactions using a friendly, helpful approach. The ideal candidate is a quick learner who can think on their feet and resolve any issues with a customer-first business mentality. This person should also have experience in sales, as the opportunity may arise to promote company products and services. The most successful customer service representative will have the communication and interpersonal skills needed to provide support, answer questions, and resolve issues in an efficient manner. Serving as the voice and face of our company, the representative will be integral in reinforcing our reputation for exceptional customer service.",
    "Responsibilities": "* Build expert, dynamic knowledge of the company’s products and services * Conduct research with available resources to satisfy customer inquiries * Engage with customers in an inviting, friendly, and professional manner to deliver exceptional experiences and nurture lasting relationships * Respond quickly, professionally, and accurately to customer inquiries regarding quotes, orders, status, complaints, returns, and warranties * Meet personal/team qualitative and quantitative targets by explaining the benefits of additional products and services to customers and seizing opportunities to sell * Maintain daily recordings and documentation of issues and resolutions in a database for sales and executive management teams to review",
    "Requirements": "* High school diploma or equivalent * Successful experience in a corporate environment * Strong communication skills, including active listening and clear articulation * Ability to solve problems, alleviate conflicts, and escalate tactfully * Ability to multitask, manage time, and prioritize * Ability to work individually and as a team member * Experience in sales * Experience in a call-center environment * Proven track record of meeting or exceeding sales quotas"},

    {
    "Job Title": "Sales Associate (Part-time)",
    "Company": "SuccessPlus Inc.",
    "Location": "Mumbai, India",
    "About Company": "SuccessPlus Inc. is a dynamic and rapidly growing company dedicated to providing innovative solutions and services to individuals and businesses worldwide. With a focus on excellence and customer satisfaction, we help our clients achieve their goals and drive success in their endeavors. Our team is comprised of talented professionals who are passionate about making a positive impact and delivering exceptional results.",
    "Description": "The Part-Time Sales Associate will be responsible for assisting customers with their shopping needs, promoting products and services, providing excellent customer service, and ensuring store cleanliness and organization. The successful candidate will have a welcoming attitude, strong communication skills, and the ability to work as part of a team in a fast-paced retail environment.",
    "Responsibilities": "* Greet customers and provide assistance with their shopping needs * Promote products and services to maximize sales * Process customer transactions at the cash register and handle returns/exchanges * Maintain an organized and clean store environment * Assist in stocking and restocking merchandise as needed * Answer customer inquiries and resolve complaints in a timely and professional manner",
    "Requirements": "* High school diploma or equivalent * Prior experience in a customer service or retail role * Excellent communication and interpersonal skills * Ability to work a flexible schedule, including evenings and weekends * Strong attention to detail and ability to multitask in a fast-paced environment"},


    {
    "Job Title": "IT Support Specialist",
    "Company": "TechSupport GmbH",
    "Location": "Berlin, Germany",
    "About Company": "Need IT help? TechSupport GmbH is here for you! We're a leading IT support company that helps businesses of all sizes keep their computers and networks running smoothly. Our experts focus on keeping you happy with fast, reliable service so you can focus on running your business.",
    "Description": "We are looking for a highly capable IT support specialist to provide technical assistance to our staff. In this role, your duties will include ensuring optimal use of our hardware and software technologies, enhancing system performance, and securing data. You will also be required to advise on IT equipment upgrades. To ensure success as an IT support specialist, you should possess extensive experience in providing information technology support in a fast-paced environment. Top-notch IT support specialists contribute to increased productivity by ensuring that company IT systems run efficiently.",
    "Responsibilities": "* Consulting with IT managers and other departments as required. * Providing IT assistance to staff and customers. * Training end-users on hardware functionality and software programs. * Resolving logged errors in a timely manner. * Monitoring hardware, software, and system performance metrics. * Updating computer software. as well as upgrading hardware and systems. * Maintaining databases and ensuring system security. * Documenting processes and performing diagnostic tests. * Keeping track of technological advancements and trends in IT support.",
    "Requirements": "* A bachelor's degree in computer science, information technology, or similar. * 3-5 years of experience as an IT support specialist. * Exceptional ability to provide technical support and resolve queries. * In-depth knowledge of computer hardware, software, and networks. * Ability to determine IT needs and train end-users. * Proficiency in IT helpdesk software, such as Freshservice and SysAid. * Experience in documenting processes and monitoring performance metrics. * Advanced knowledge of database maintenance and system security. * Ability to keep up with technical innovation and trends in IT support. * Exceptional interpersonal and communication skills."},

    
    {
    "Job Title": "AI Research Scientist",
    "Company": "BrainTech Research Center",
    "Location": "Seoul, South Korea",
    "About Company": "BrainTech Research Center is a cutting-edge research institution dedicated to advancing artificial intelligence (AI) technologies and applications. Situated in the heart of Seoul, South Korea, our interdisciplinary team of researchers, engineers, and scientists collaborates to push the boundaries of AI research and develop innovative solutions for real-world problems. We are committed to fostering creativity, excellence, and collaboration in AI research and driving positive societal impact.",
    "Description": "BrainTech Research Center is at the forefront of digital reinvention, helping clients reimagine how they serve their connected customers and operate enterprises. We’re looking for an experienced artificial intelligence engineer to join the revolution, using deep learning, neuro-linguistic programming (NLP), computer vision, chatbots, and robotics to help us improve various business outcomes and drive innovation. The engineer will join a multidisciplinary team helping to shape our AI strategy and showcasing the potential for AI through early-stage solutions. This is an excellent opportunity to take advantage of emerging trends and technologies and make a real-world difference.",
    "Responsibilities": "* Advise C-suite executives and business leaders on a broad range of technology, strategy, and policy issues associated with AI * Work on functional design, process design (including scenario design, flow mapping), prototyping, testing, training, and defining support procedures, in collaboration with an advanced engineering team and executive leadership * Articulate and document the solutions architecture and lessons learned for each exploration and accelerated incubation * Manage a team in conducting assessments of the AI and automation market and competitor landscape * Serve as liaison between stakeholders and project teams, delivering feedback and enabling team members to make necessary changes in product performance or presentation",
    "Requirements": "* Two or more years of experience in applying AI to practical and comprehensive technology solutions * Experience with ML, deep learning, TensorFlow, Python, NLP * Experience in program leadership, governance, and change enablement * Knowledge of basic algorithms, object-oriented and functional design principles, and best-practice patterns * Experience in REST API development, NoSQL database design, and RDBMS design and optimizations * Bachelor’s or master's degree in computer science or related field * Experience with innovation accelerators * Experience with cloud environments"},

    {
    "Job Title": "Oil & Gas Engineer",
    "Company": "PetroEnergy Corporation",
    "Location": "Riyadh, Saudi Arabia",
    "About Company": "PetroEnergy Corporation is a leading energy company specializing in the exploration, production, and distribution of oil and gas resources. With a strong presence in Saudi Arabia, we are committed to leveraging advanced technologies and industry expertise to maximize the value of our assets and contribute to the sustainable development of the energy sector. Our team of professionals is dedicated to excellence, innovation, and environmental stewardship in all aspects of our operations.",
    "Description": "We are looking for a skilled and experienced Oil & Gas Engineer to join our team at PetroEnergy Corporation in Riyadh, Saudi Arabia. As an Oil & Gas Engineer, you will play an important role in the design, operation, and optimization of oil and gas production facilities and processes, ensuring the safe, efficient, and cost-effective extraction and processing of hydrocarbon resources.",
    "Responsibilities": "* Perform basic maintenance on our equipment and facilities to prevent any breakdowns. * Aid with the installation of new equipment. * Troubleshoot and research any problems that arise. * Confidently create solutions to problems within a reasonable timeframe. * Perform routine inspections and safety tests on our equipment and facilities. * Stay updated on new technologies and their many uses.",
    "Requirements": "* Bachelor's or Master's degree in Petroleum Engineering, Chemical Engineering, or a related field. * 5+ years of experience in oil and gas engineering roles, with a focus on production engineering, reservoir engineering, or field operations. * A degree in a related field is required, while past experience is strongly preferred * Must be proficient with mathematics and engineering strategies * Must possess a highly analytical mind that can quickly find solutions to problems * Must be able to accurately diagnose problems and identify what caused the problem * Must be an excellent communicator and a highly motivated individual * Fluency in English is required, and proficiency in Arabic is a plus."},


    {
    "Job Title": "Marketing Assistant (Working Student)",
    "Company": "TechVision Solutions Ltd.",
    "Location": "Amsterdam, Netherlands",
    "About Company": "TechVision Solutions Ltd. is a dynamic technology solutions provider specializing in innovative software and digital solutions for businesses worldwide. With a focus on creativity and cutting-edge technology, we help our clients achieve their digital goals and stay ahead in the competitive market. Our team is comprised of talented professionals who are passionate about driving innovation and delivering exceptional results.",
    "Description": "We are seeking a motivated and enthusiastic Marketing Assistant to join our team as a working student in Amsterdam. As a Marketing Assistant at TechVision Solutions Ltd., you will support the marketing team in executing various marketing initiatives and campaigns to promote our products and services and engage our target audience.",
    "Responsibilities": "* Undertake daily administrative tasks to ensure the functionality and coordination of the department’s activities * Support marketing executives in organizing various projects * Conduct market research and analyze consumer rating reports/ questionnaires * Employ marketing analytics techniques to gather important data (social media, web analytics, rankings etc.) * Update spreadsheets, databases and inventories with statistical, financial and non-financial information * Assist in the organizing of promotional events and traditional or digital campaigns and attend them to facilitate their success * Prepare and deliver promotional presentations * Compose and post online content on the company’s website and social media accounts * Write marketing literature (brochures, press releases etc) to augment the company’s presence in the market * Communicate directly with clients and encourage trusting relationships",
    "Requirements": "* Currently enrolled as a student in a Bachelor's or Master's degree program, preferably in Marketing, Communications, Business, or a related field. * Proven experience as a marketing assistant * Good understanding of office management and marketing principles * Demonstrable ability to multi-task and adhere to deadlines * Well-organized with a customer-oriented approach * Good knowledge of market research techniques and databases * Good knowledge of MS Office, marketing computer software and online applications (CRM tools, Online analytics, Google Adwords etc.) * Exquisite communication and people skills * Fluency in English is required, and proficiency in Dutch is a plus."}

]

df2 = pd.DataFrame(job_descriptions) # Create new DataFrame

print(df2.head())

               Job Title                        Company  \
0  Senior Data Scientist        DataTech Solutions Inc.   
1      Software Engineer                TechGenius Ltd.   
2        Project Manager                 ShopSmart Inc.   
3      Marketing Manager     DigitalPulse Solutions LLC   
4          Data Engineer  CloudWorks Technologies Corp.   

                 Location                                      About Company  \
0  San Francisco, CA, USA  DataTech Solutions helps businesses use their ...   
1        Seattle, WA, USA  TechGenius Ltd. helps businesses win in the di...   
2       New York, NY, USA  We're a leading e-commerce company that's all ...   
3    Los Angeles, CA, USA  DigitalPulse Solutions LLC is a leading digita...   
4         Austin, TX, USA  CloudWorks Technologies Corp. is a leading pro...   

                                         Description  \
0  We are seeking a highly skilled Senior Data Sc...   
1  We are looking for a passionate Software Engin...

In [12]:
df2.head()

Unnamed: 0,Job Title,Company,Location,About Company,Description,Responsibilities,Requirements
0,Senior Data Scientist,DataTech Solutions Inc.,"San Francisco, CA, USA",DataTech Solutions helps businesses use their ...,We are seeking a highly skilled Senior Data Sc...,"* Lead end-to-end data science projects, from ...",* Develop and implement advanced statistical a...
1,Software Engineer,TechGenius Ltd.,"Seattle, WA, USA",TechGenius Ltd. helps businesses win in the di...,We are looking for a passionate Software Engin...,* Execute full software development life cycle...,* Proven work experience as a Software Enginee...
2,Project Manager,ShopSmart Inc.,"New York, NY, USA",We're a leading e-commerce company that's all ...,We are seeking a talented and experienced Proj...,* Coordinate internal resources and third part...,"* Bachelor's degree in computer science, busin..."
3,Marketing Manager,DigitalPulse Solutions LLC,"Los Angeles, CA, USA",DigitalPulse Solutions LLC is a leading digita...,"At DigitalPulse Solutions LLC, marketing is ab...",* Help develop creative briefs and guide creat...,* Proven success in developing marketing plans...
4,Data Engineer,CloudWorks Technologies Corp.,"Austin, TX, USA",CloudWorks Technologies Corp. is a leading pro...,We are looking for a savvy Data Engineer to jo...,* Create and maintain optimal data pipeline ar...,* Advanced working SQL knowledge and experienc...


# Modeling

This code evaluates how well the resume scoring system ranks resumes for each job description and reports the result as a Mean Average Precision (MAP) score. It first initialises a TF-IDF vectorizer, which converts text into feature vectors. It then concatenates the "Description" and "Requirements" columns of each job description, fits the vectorizer on that text, and applies it to the manually uploaded resumes. Cosine similarity between each job description and every resume gives a score, and the resumes are ranked by that score in descending order. For each job description, the top 50 ranked resumes are compared with the manually labelled relevant resumes, which act as the ground truth showing which resumes are considered appropriate for the job. In the code, the relevant set is the first 50 rows (indices 0 to 49), and the score stored for each job is the fraction of its top 50 resumes that fall in this set. The helper calculate_MAP defines average precision over a ranked list. Finally, the overall score, the mean over all job descriptions, is calculated and displayed. It indicates how well the algorithm ranks suitable resumes for each job description.
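
The ranking step can be illustrated on a small scale before running it on the full data. The sketch below is only an illustration: the texts in `toy_jobs` and `toy_resumes` are invented and are not taken from the dataset. It follows the same order as the cell below (fit the vectorizer on the job text, then transform the resumes) and ranks the resumes by cosine similarity.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy texts, invented for illustration only (not taken from the dataset)
toy_jobs = ["python machine learning engineer"]
toy_resumes = [
    "java spring backend developer",
    "python machine learning pandas",
    "sales retail customer service",
]

# Fit the vocabulary on the job text, then project the resumes into the same space
vectorizer = TfidfVectorizer()
job_vectors = vectorizer.fit_transform(toy_jobs)
resume_vectors = vectorizer.transform(toy_resumes)

# One row of similarity scores per job description
scores = cosine_similarity(job_vectors, resume_vectors)[0]

# Resume indices ordered from most to least similar
ranking = np.argsort(scores)[::-1]
print(ranking[0], scores.round(3))
```

Because the vocabulary is fitted on the job text only, words that appear only in the resumes (such as "pandas" here) do not contribute to the score.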

In [13]:
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def calculate_MAP(actual_top_resumes, ranked_resumes_indices):
    num_relevant_resumes = 0
    precision_sum = 0.0
    for i, resume_index in enumerate(ranked_resumes_indices):
        if resume_index in actual_top_resumes:
            num_relevant_resumes += 1
            precision_sum += num_relevant_resumes / (i + 1)  # Precision at position i
    if num_relevant_resumes == 0:
        return 0.0  # If there are no relevant resumes, return 0
    else:
        return precision_sum / num_relevant_resumes  # Average precision

# Initialize TF-IDF vectorizer
tfidf_vectorizer = TfidfVectorizer()

# Concatenate "Description" and "Requirements" for job descriptions from df2
job_descriptions_text = df2["Description"].fillna('') + " " + df2["Requirements"].fillna('')

# Vectorize job descriptions
job_descriptions_vectorized = tfidf_vectorizer.fit_transform(job_descriptions_text)

# Vectorize manual resumes from df
manual_resumes_text = df['manual_resumes'].values.astype(str)
manual_resumes_vectorized = tfidf_vectorizer.transform(manual_resumes_text)

# Initialize a dictionary to store the ranked resumes for each job description
ranked_resumes = {}

# Iterate over each job description (i is the row position, job_title its title)
for i, job_title in enumerate(df2["Job Title"]):
    
    # Compute similarity scores between job description and manual resumes
    resume_similarities = cosine_similarity(job_descriptions_vectorized[i], manual_resumes_vectorized)
    
    # Get indices of resumes sorted by similarity score (descending order)
    ranked_resumes_indices = np.argsort(resume_similarities[0])[::-1]
    
    # Store the ranked resumes for this job description
    ranked_resumes[job_title] = ranked_resumes_indices

# Initialize a dictionary to store the indices of the top 50 resumes
top_50_resumes_indices = {job_title: indices[:50] for job_title, indices in ranked_resumes.items()}

# Score each job description against the relevant set (row indices 0-49).
# The value stored is the fraction of the top 50 retrieved resumes that are
# in the relevant set, i.e. precision at 50, reported under the MAP label.
mean_avg_precision = {}
for job_title, ranked_resumes_indices in top_50_resumes_indices.items():
    relevant_indices = set(range(50))
    retrieved_indices = set(ranked_resumes_indices)
    relevant_and_retrieved = relevant_indices.intersection(retrieved_indices)
    precision_at_k = len(relevant_and_retrieved) / len(retrieved_indices) if len(retrieved_indices) > 0 else 0
    mean_avg_precision[job_title] = precision_at_k

# Calculate overall Mean Average Precision (MAP)
overall_map = np.mean(list(mean_avg_precision.values()))

print("Mean Average Precision (MAP) for each job description:")
for job_title, map_score in mean_avg_precision.items():
    print(f"Job Title: {job_title}, MAP: {map_score}")

print("\nOverall Mean Average Precision (MAP):", overall_map)

Mean Average Precision (MAP) for each job description:
Job Title: Senior Data Scientist, MAP: 0.88
Job Title: Software Engineer, MAP: 0.84
Job Title: Project Manager, MAP: 0.9
Job Title: Marketing Manager, MAP: 1.0
Job Title: Data Engineer, MAP: 1.0
Job Title: Business Analyst, MAP: 0.92
Job Title: Cybersecurity Specialist, MAP: 1.0
Job Title: Human Resources Manager, MAP: 0.84
Job Title: UX/UI Designer, MAP: 0.92
Job Title: Customer Service Representative, MAP: 1.0
Job Title: Sales Associate (Part-time), MAP: 0.82
Job Title: IT Support Specialist, MAP: 1.0
Job Title: AI Research Scientist, MAP: 0.86
Job Title: Oil & Gas Engineer, MAP: 0.76
Job Title: Marketing Assistant (Working Student), MAP: 0.92

Overall Mean Average Precision (MAP): 0.9106666666666666
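
To make the metric itself concrete, the following sketch works through average precision on a toy example. The labels in `toy_relevant` and the rankings in `toy_ranked` are hypothetical and exist only for illustration. The function normalises by the number of relevant items found in the ranking, as calculate_MAP in the cell above does.

```python
def average_precision(relevant, ranked):
    """Mean of the precision values taken at each rank where a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position  # precision at this rank
    return precision_sum / hits if hits else 0.0

# Hypothetical relevance labels and rankings for two jobs (invented for illustration)
toy_relevant = {"job_a": {0, 2}, "job_b": {1}}
toy_ranked = {"job_a": [0, 1, 2, 3], "job_b": [3, 1, 0, 2]}

# Average precision per job, then the mean over jobs (MAP)
ap_scores = {job: average_precision(toy_relevant[job], toy_ranked[job]) for job in toy_ranked}
toy_map = sum(ap_scores.values()) / len(ap_scores)
print(ap_scores, toy_map)
```

For job_a the relevant resumes appear at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2. For job_b the single relevant resume appears at rank 2, giving 1/2. Unlike a plain fraction of relevant items in the top K, this score rewards placing relevant resumes earlier in the ranking.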


This cell calculates Precision@K for each job description, where K is the number of top-ranked resumes considered. It first replaces any NaN values in the DataFrame with empty strings. It then creates a TF-IDF vectorizer and vectorizes the manual resumes and the job descriptions. Next, it loops over the job descriptions, calculates the cosine similarity scores between the resumes and the job description, and ranks the resumes according to these scores. Precision@K, the proportion of relevant resumes among the top K ranked resumes, is calculated for each job description for K = 5, 10, and 20; the code takes indices 0 to K-1 as the relevant set. The results are stored in a dictionary and displayed for every job description, showing how well the algorithm finds suitable resumes for different jobs.
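
As a minimal sketch of the metric, assuming an explicit set of relevant resume indices is available, Precision@K can be computed as follows. The values in `toy_relevant` and `toy_ranked` are hypothetical and chosen only to show the arithmetic.

```python
def precision_at_k(relevant, ranked, k):
    """Fraction of the top-k ranked items that belong to the relevant set."""
    if k <= 0:
        return 0.0
    top_k = ranked[:k]
    return sum(1 for item in top_k if item in relevant) / k

# Hypothetical relevant indices and ranking (invented for illustration)
toy_relevant = {2, 5, 7}
toy_ranked = [5, 1, 2, 9, 7, 0]

for k in (1, 3, 5):
    print(k, precision_at_k(toy_relevant, toy_ranked, k))
```

With these values, one of the top 1, two of the top 3, and three of the top 5 items are relevant, so the scores are 1/1, 2/3, and 3/5.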

In [14]:
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Replace NaN values with empty strings
df2.fillna('', inplace=True)

# Initialize TF-IDF vectorizer
tfidf_vectorizer = TfidfVectorizer()

# Vectorize job descriptions
job_descriptions_text = df['manual_resumes'].astype(str)
job_descriptions_vectorized = tfidf_vectorizer.fit_transform(job_descriptions_text)

# Vectorize manual resumes
manual_resumes_text = df2['Description'] + df2['Responsibilities'] + df2['Requirements']
manual_resumes_vectorized = tfidf_vectorizer.transform(manual_resumes_text)

# Initialize a dictionary to store Precision@K for each job description
precision_at_k_dict = {}

k_values = [5, 10, 20] # Define values of K for Precision@K

# Iterate over each job description
for job_title, job_info in df2.iterrows():
    
    # Compute similarity scores between job description and manual resumes
    resume_similarities = cosine_similarity(job_descriptions_vectorized[job_title], manual_resumes_vectorized)
    
    # Get indices of resumes sorted by similarity score (descending order)
    ranked_resumes_indices = np.argsort(resume_similarities[0])[::-1]
    
    # Initialize a dictionary to store Precision@K for this job description
    precision_at_k_job = {}
    
    # Calculate Precision@K for each value of K
    for k in k_values:
        top_k_resumes_indices = ranked_resumes_indices[:k]
        relevant_indices = set(range(k))  # All top K resumes are considered relevant
        retrieved_indices = set(top_k_resumes_indices)
        relevant_and_retrieved = relevant_indices.intersection(retrieved_indices)
        precision_at_k = len(relevant_and_retrieved) / k if k > 0 else 0  # Precision@K
        precision_at_k_job[k] = precision_at_k
    
    # Store Precision@K for this job description
    precision_at_k_dict[job_title] = precision_at_k_job

# Display Precision@K for each job description
print("Precision@K for each job description:")
for job_title, precision_at_k_job in precision_at_k_dict.items():
    print(f"Job Title: {df2.loc[job_title, 'Job Title']}")
    for k, precision_at_k in precision_at_k_job.items():
        print(f"  Precision@{k}: {precision_at_k}")

Precision@K for each job description:
Job Title: Senior Data Scientist
  Precision@5: 0.4
  Precision@10: 0.6
  Precision@20: 0.75
Job Title: Software Engineer
  Precision@5: 0.4
  Precision@10: 0.7
  Precision@20: 0.75
Job Title: Project Manager
  Precision@5: 0.6
  Precision@10: 0.6
  Precision@20: 0.75
Job Title: Marketing Manager
  Precision@5: 0.6
  Precision@10: 0.6
  Precision@20: 0.75
Job Title: Data Engineer
  Precision@5: 0.6
  Precision@10: 0.8
  Precision@20: 0.75
Job Title: Business Analyst
  Precision@5: 0.4
  Precision@10: 0.7
  Precision@20: 0.75
Job Title: Cybersecurity Specialist
  Precision@5: 0.4
  Precision@10: 0.7
  Precision@20: 0.75
Job Title: Human Resources Manager
  Precision@5: 0.4
  Precision@10: 0.7
  Precision@20: 0.75
Job Title: UX/UI Designer
  Precision@5: 0.4
  Precision@10: 0.6
  Precision@20: 0.75
Job Title: Customer Service Representative
  Precision@5: 0.6
  Precision@10: 0.6
  Precision@20: 0.75
Job Title: Sales Associate (Part-time)
  Precision@

# Cosine Similarity & Ranking

This code gives a complete process for ranking resumes against a given job description. The load_data() function first loads the resume data from a CSV file, and the job description is taken as user input. The train_word2vec_model() function then trains a Word2Vec model, a popular word embedding method, on the resume data. The fine_tune_word2vec_model() function extends the model's vocabulary and continues training on the job description text, so that the model also covers the language specific to the job. Next, rank_resumes() calls calculate_similarity(), which represents the job description and each resume as the average of its word vectors and computes the cosine similarity between them; the resumes are then sorted by this score. Finally, the sorted resumes are displayed to help recruiters or employers find the best applicants for a job opening. Because Word2Vec captures semantic relationships between words, this integrated method aims to improve the relevance and precision of matching in the hiring process.

##### Output:-
The output displays the ranked resumes with their similarity scores. Each score indicates how similar the resume is to the job description. Duplicate resumes are removed, and the remaining resumes are displayed in descending order of similarity score.
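
The averaging step can be sketched without training a model. In the example below, `toy_vectors` is a hypothetical set of 3-dimensional word vectors standing in for a trained `model.wv`, and the helper names `embed` and `cosine` are introduced here for illustration only. Skipping unknown tokens is an assumption of this sketch, not part of the cell below.

```python
import numpy as np

# Hypothetical 3-dimensional word vectors, standing in for a trained model.wv
toy_vectors = {
    "python": np.array([1.0, 0.0, 0.0]),
    "data": np.array([0.8, 0.2, 0.0]),
    "sales": np.array([0.0, 1.0, 0.0]),
    "retail": np.array([0.0, 0.9, 0.1]),
}

def embed(text, vectors):
    """Mean vector of the known tokens; zero vector if no token is known."""
    known = [vectors[token] for token in text.split() if token in vectors]
    dim = len(next(iter(vectors.values())))
    return np.mean(known, axis=0) if known else np.zeros(dim)

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector has zero length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

job = embed("python data", toy_vectors)
toy_resumes = ["data python", "sales retail", "unknown words"]
toy_scores = [cosine(job, embed(resume, toy_vectors)) for resume in toy_resumes]

# Resume positions ordered from most to least similar
toy_order = sorted(range(len(toy_resumes)), key=lambda i: toy_scores[i], reverse=True)
print(toy_order, toy_scores)
```

Averaging discards word order, so "data python" and "python data" map to the same vector and score as identical.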

In [15]:
import pandas as pd
from gensim.models import Word2Vec
from sklearn.metrics.pairwise import cosine_similarity

def load_data():  # Load the resume data
    return pd.read_csv("UpdatedResumeDataSet.csv")

def train_word2vec_model(resumes):  # Train Word2Vec model on the resume data
    tokenized_resumes = [resume.split() for resume in resumes]
    model = Word2Vec(sentences=tokenized_resumes, vector_size=100, window=5, min_count=1, workers=4)
    return model

# Fine-tune Word2Vec model on the Job descriptions
def fine_tune_word2vec_model(job_descriptions, model):
    tokenized_job_descriptions = [description.split() for description in job_descriptions]
    model.build_vocab(tokenized_job_descriptions, update=True)
    model.train(tokenized_job_descriptions, total_examples=len(tokenized_job_descriptions), epochs=model.epochs)
    return model

# Calculate similarity scores between Job description and Resumes using Word2Vec
def calculate_similarity(job_description, resumes, model):
    job_description_tokens = job_description.split()
    job_description_embedding = sum(model.wv[word] for word in job_description_tokens) / len(job_description_tokens)
    resume_embeddings = [sum(model.wv[word] for word in resume.split()) / len(resume.split()) for resume in resumes]
    similarity_scores = [cosine_similarity([job_description_embedding], [resume_embedding])[0][0] for resume_embedding in resume_embeddings]
    return similarity_scores

# Rank resumes based on similarity scores
def rank_resumes(job_description, resumes_df, model):
    similarity_scores = calculate_similarity(job_description, resumes_df['Resume'], model) # Calculate similarity scores

    # Create DataFrame with resumes and their similarity scores
    results_df = pd.DataFrame({
        'Resume': resumes_df['Resume'],
        'Similarity Score': similarity_scores
    })

    results_df = results_df.drop_duplicates(subset=['Resume'])  # Remove duplicates
    
    results_df = results_df.sort_values(by='Similarity Score', ascending=False)    # Sort by similarity score
    
    return results_df

if __name__ == '__main__':
    resumes_df = load_data()   # Load the resume data
    
    job_description = input("Enter the job description: ")    # Take job description input from user

    model = train_word2vec_model(resumes_df['Resume'])  # Train Word2Vec model on resume data

    # Fine-tune Word2Vec model on job descriptions
    job_descriptions = [job_description]
    model = fine_tune_word2vec_model(job_descriptions, model)

    ranked_resumes = rank_resumes(job_description, resumes_df, model)   # Rank resumes

    print("Ranked Resumes:")
    print(ranked_resumes)

Ranked Resumes:
                                                Resume  Similarity Score
270  education details may 2014 diploma nutrition e...          0.983338
513  skills â well versed ms office internet applic...          0.980139
408  education details february 2006 february 2006 ...          0.980023
512  key competencies âmulti operations managementâ...          0.978782
41   skills â windows xp ms office word excel looku...          0.966506
..                                                 ...               ...
5    skills c basics iot python matlab data science...          0.560827
319  education details august 2010 may 2017 electro...          0.558618
45   education details bba lovely professional univ...          0.552844
46   education details mba acn college engineering ...          0.549010
4    education details mca ymcaust faridabad haryan...          0.505945

[166 rows x 2 columns]
