In [39]:
job_desc = '''Job Title: Software Engineer

Location: San Francisco, CA

Company: Tech Innovators Inc.

Job Description:
Tech Innovators Inc. is seeking a talented and passionate Software Engineer to join our dynamic team. The ideal candidate will have a strong background in software development, excellent problem-solving skills, and a keen interest in cutting-edge technology.

Responsibilities:

Develop, test, and maintain high-quality software applications.
Collaborate with cross-functional teams to define, design, and ship new features.
Identify and correct bottlenecks and fix bugs.
Continuously discover, evaluate, and implement new technologies to maximize development efficiency.
Participate in code reviews and contribute to the development of best practices.
Requirements:

Bachelor’s degree in Computer Science, Engineering, or a related field.
3+ years of experience in software development.
Proficiency in at least one programming language such as Java, Python, or C++.
Experience with web technologies (HTML, CSS, JavaScript) and frameworks (React, Angular).
Knowledge of database systems (SQL, NoSQL).
Strong understanding of software development principles and methodologies.
Excellent communication skills and a team player attitude.
Preferred Qualifications:

Master’s degree in Computer Science or related field.
Experience with cloud platforms (AWS, Azure, GCP).
Familiarity with DevOps practices and tools (Docker, Kubernetes).
Knowledge of machine learning algorithms and data analysis.
Benefits:

Competitive salary and performance bonuses.
Health, dental, and vision insurance.
401(k) with company match.
Flexible work hours and remote work options.
Professional development opportunities.
'''

In [40]:
resume_desc='''
Name: Harshita Singh

Location: San Francisco, CA

Contact Information:

Email: alex.johnson@email.com
Phone: +91 3050359888
LinkedIn: linkedin.com/in/alexjohnson
Summary:
Enthusiastic Software Engineer with 4 years of experience in developing robust software applications. Adept at problem-solving and collaborating with cross-functional teams to deliver high-quality solutions. Proficient in Java, Python, and web development technologies. Passionate about continuous learning and implementing new technologies.

Professional Experience:

Software Engineer
InnovateTech Solutions
San Francisco, CA
June 2020 – Present

Developed and maintained scalable web applications using Java and React, improving user engagement by 25%.
Collaborated with designers and product managers to define and implement new features, resulting in a 15% increase in customer satisfaction.
Identified and resolved performance bottlenecks, reducing load times by 40%.
Conducted code reviews and mentored junior developers on best practices.
Junior Software Engineer
CodeCrafters Inc.
San Francisco, CA
July 2018 – May 2020

Assisted in the development of web applications using Python and Angular, contributing to a 30% growth in user base.
Implemented RESTful APIs and integrated third-party services to enhance application functionality.
Participated in Agile development processes and collaborated with cross-functional teams to deliver projects on time.
Performed unit testing and debugging to ensure high-quality software releases.
Education:

Bachelor of Science in Computer Science
University of California, Berkeley
Graduated: May 2018
Skills:

Programming Languages: Java, Python, JavaScript
Web Technologies: HTML, CSS, React, Angular
Databases: SQL, NoSQL (MongoDB)
Tools: Git, Docker, Jenkins
Cloud Platforms: AWS, GCP
Methodologies: Agile, Scrum
Soft Skills: Problem-solving, teamwork, communication
Certifications:

AWS Certified Solutions Architect
Certified ScrumMaster (CSM)
Projects:

E-commerce Platform: Developed a full-stack e-commerce platform using Java and React, featuring user authentication, product management, and payment integration.
Data Analysis Tool: Created a Python-based tool for analyzing and visualizing large datasets, utilized by the marketing team to gain insights into customer behavior.
Volunteer Experience:

Tech Mentor at Code for Good, mentoring high school students in coding and software development principles.
'''

In [41]:
# %pip install numpy
# %pip install pandas 
# %pip install nltk
# %pip install gensim
# %pip install spacy
# %pip install streamlit


In [42]:
import numpy as np
import pandas as pd 
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
import string
import scipy
import gensim
from gensim.models import Word2Vec
import gensim.downloader as api
import spacy
from spacy import displacy
from numpy.linalg import norm
import re 



In [43]:
# %pip install gensim
# %pip install scipy
# nltk.download('punkt') 

In [44]:
# Function 1 

def preprocess(text):
    # Lowercase 
    text = text.lower()
    # Tokenization
    tokens = word_tokenize(text)
    # Remove punctuation (keep only purely alphanumeric tokens)
    tokens = [word for word in tokens if word.isalnum()]
    # Remove stop words
    stop_words = set(stopwords.words('english'))
    tokens = [word for word in tokens if word not in stop_words]
    return tokens 

In [45]:
# %pip install gensim

In [46]:
job_desc_token = preprocess(job_desc)
resume_desc_token = preprocess(resume_desc)

In [47]:
# !gzip -d GoogleNews-vectors-negative300.bin.gz

In [48]:
# !pip install wget

In [49]:

# # Load Google News pretrained Word2Vec model
# model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz', binary=True)


In [50]:
# Load the pretrained Word2Vec model
model = api.load('word2vec-google-news-300')

2024-07-25 15:33:52.763 loading projection weights from C:\Users\CT_USER/gensim-data\word2vec-google-news-300\word2vec-google-news-300.gz
2024-07-25 15:34:46.135 KeyedVectors lifecycle event {'msg': 'loaded (3000000, 300) matrix of type float32 from C:\\Users\\CT_USER/gensim-data\\word2vec-google-news-300\\word2vec-google-news-300.gz', 'binary': True, 'encoding': 'utf8', 'datetime': '2024-07-25T15:34:46.135549', 'gensim': '4.2.0', 'python': '3.6.3 |Anaconda, Inc.| (default, Nov  8 2017, 15:10:56) [MSC v.1900 64 bit (AMD64)]', 'platform': 'Windows-10-10.0.22621-SP0', 'event': 'load_word2vec_format'}


In [51]:
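def get_paragraph_vector(tokens, model):
    # Average the vectors of all tokens that exist in the model's vocabulary
    word_vectors = [model[word] for word in tokens if word in model]
    if not word_vectors:
        # No known tokens: fall back to a zero vector of the right size
        return np.zeros(model.vector_size)
    paragraph_vector = np.mean(word_vectors, axis=0)
    return paragraph_vector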
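
The averaging step can be checked on a toy vocabulary without loading the pretrained model. The following is a minimal sketch: the `toy_model` dictionary, its 3-dimensional vectors, and the `average_vector` helper are made up for illustration and only stand in for the pretrained vectors, which support the same `word in model` and `model[word]` lookups.

```python
import numpy as np

# Hypothetical 3-dimensional "embeddings", invented for this illustration only
toy_model = {
    "python": np.array([1.0, 0.0, 2.0]),
    "java": np.array([3.0, 2.0, 0.0]),
}

def average_vector(tokens, model, size):
    # Keep only tokens the model knows; unknown tokens are silently skipped
    vectors = [model[t] for t in tokens if t in model]
    if not vectors:
        # Nothing to average: return a zero vector of the requested size
        return np.zeros(size)
    return np.mean(vectors, axis=0)

known = average_vector(["python", "java", "kotlin"], toy_model, size=3)
unknown = average_vector(["kotlin"], toy_model, size=3)
print(known)    # mean of the "python" and "java" vectors
print(unknown)  # all zeros, because no token is in the vocabulary
```

Note that out-of-vocabulary tokens do not contribute at all, so a text made mostly of rare terms is represented by only a few words.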

In [52]:
job_vector = get_paragraph_vector(job_desc_token,model)
resume_vector = get_paragraph_vector(resume_desc_token,model)

In [53]:
# job_vector

In [54]:
# Cosine Similarity 
def cosine_sim(resume_vector,job_vector):
    cosine = np.dot(resume_vector,job_vector)/(norm(resume_vector)*norm(job_vector))
    return cosine
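
The notebook defines `cosine_sim`, but the call that turns the two paragraph vectors into a score is not shown. A minimal sketch of that step on small made-up vectors might look like the following. The name `safe_cosine_sim` and the zero-norm guard are assumptions added here, not part of the original function; the guard matters because an all-zero paragraph vector would otherwise lead to a division by zero.

```python
import numpy as np
from numpy.linalg import norm

def safe_cosine_sim(a, b):
    # Hypothetical variant of cosine_sim that returns 0.0 for zero-length vectors
    denom = norm(a) * norm(b)
    if denom == 0:
        return 0.0
    return float(np.dot(a, b) / denom)

# Made-up vectors standing in for resume_vector and job_vector
v1 = np.array([1.0, 0.0])
v2 = np.array([1.0, 1.0])
score = safe_cosine_sim(v1, v2)
print(round(score, 4))                   # 0.7071
print(safe_cosine_sim(v1, np.zeros(2)))  # 0.0
```

With the notebook's own variables, the equivalent call would be `cosine_sim(resume_vector, job_vector)`.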

In [55]:
# model.most_similar('software')

In [56]:
# model.similarity('java','python')

In [57]:
# !python -m spacy download en_core_web_lg

In [58]:
# Named Entity Recognition


NER = spacy.load("en_core_web_lg")
text1 = NER(job_desc)
text2 = NER(resume_desc)

In [59]:
# for word in text1.ents:
#     print(word.text,word.label_)

In [60]:
# for word in text2.ents:
#     print(word.text,word.label_)

In [61]:

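# Note: the stock en_core_web_lg model has no EMAIL or PHONE entity labels,
# so only PERSON can match here; email and phone are handled with regular expressions.
desired_labels = ["PERSON", "EMAIL", "PHONE"]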

# Extract entities based on the desired labels
entities = [(ent.text, ent.label_) for ent in text2.ents if ent.label_ in desired_labels]


In [62]:
entities

[('Harshita Singh', 'PERSON'), ('Docker', 'PERSON')]

In [63]:
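# Email address and phone number using regular expressions

# [A-Za-z] for the top-level domain (a '|' inside a character class would match a literal pipe)
email_regex = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
# Matches only numbers written as "+91 " followed by ten digits
phone_regex = r'\+91 \d{10}'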
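
The phone pattern above only matches numbers written as `+91`, one space, and ten digits. As a sketch of how the same approach could tolerate small formatting differences, the example below uses a hypothetical `loose_phone_regex` that allows an optional space or hyphen after the country code. The sample text is made up and is not taken from the resume.

```python
import re

# Hypothetical, slightly more tolerant pattern: optional space or hyphen after +91
loose_phone_regex = r'\+91[ -]?\d{10}\b'
email_regex = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

# Made-up sample text, invented for this illustration only
sample = "Reach me at jane.doe@example.org, +91-9876543210 or +919876543211."

emails = re.findall(email_regex, sample)
phones = re.findall(loose_phone_regex, sample)
print(emails)  # ['jane.doe@example.org']
print(phones)  # ['+91-9876543210', '+919876543211']
```

Resumes with other country codes or grouped digits would need a broader pattern or a dedicated phone-number parsing library.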

In [64]:

email = re.findall(email_regex, resume_desc)
phone = re.findall(phone_regex, resume_desc)

In [65]:
email

['alex.johnson@email.com']

In [66]:
phone

['+91 3050359888']

In [67]:
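def details(text2, email_regex, phone_regex):
    # The first PERSON entity is taken as the candidate's name (None if there is none)
    names = [ent.text for ent in text2.ents if ent.label_ == 'PERSON']
    name = names[0] if names else None
    # Search the text of the parsed document instead of the global resume_desc
    email = re.findall(email_regex, text2.text)
    phone = re.findall(phone_regex, text2.text)
    result = [name] + email + phone
    return result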


In [68]:
details(text2,email_regex,phone_regex)

['Harshita Singh', 'alex.johnson@email.com', '+91 3050359888']

In [69]:
# import streamlit as st

In [70]:
# st.title('Resume Score')

In [71]:
import pickle

In [72]:
# Save the model
with open('word2vec_model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Save the vectors
with open('vectors.pkl', 'wb') as f:
    pickle.dump({'job_vector': job_vector, 'resume_vector': resume_vector}, f)
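
Reading the saved vectors back is the mirror image of the cell above. The sketch below round-trips a dictionary of made-up vectors through a temporary file so that it runs on its own; in the notebook the file would be `vectors.pkl`. Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.

```python
import os
import pickle
import tempfile

import numpy as np

# Made-up stand-ins for job_vector and resume_vector
vectors = {'job_vector': np.array([0.1, 0.2]), 'resume_vector': np.array([0.3, 0.4])}

# Write to a temporary file instead of vectors.pkl so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), 'vectors_demo.pkl')
with open(path, 'wb') as f:
    pickle.dump(vectors, f)

# Load the dictionary back from disk
with open(path, 'rb') as f:
    loaded = pickle.load(f)

print(sorted(loaded))  # ['job_vector', 'resume_vector']
```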
