# Resume Chain

Steps:

1. Read in the resume and the job spec.
2. Scores the resume in terms of fit based on the job spec.
3. Makes suggestions to improve the scoring of the match.

## Prerequisites: Installing and Importing Modules

In [26]:
# Install libraries.

%%capture
!pip install langchain
!pip install openai 
!pip install "unstructured[local-inference]"
!pip install chromadb
!pip install tiktoken 
!pip install python-dotenv 

In [2]:
import os
from langchain.chains import RetrievalQA
import openai
from dotenv import load_dotenv
from langchain.llms import OpenAI
from langchain.vectorstores import Chroma
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.document_loaders import UnstructuredPDFLoader
from langchain.llms import OpenAI
from langchain.chains.question_answering import load_qa_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
import json

In [3]:
openai.api_key = os.getenv("OPENAI_API_KEY")

## Extracting Information From the Resume 

In [4]:
class ResumeExtractor(object):
    def __init__(self, path : str) -> None:

        # Precondition checks.
        if path == "" or path == None:
            raise ValueError(f"The path provided: {path} is not a valid path.")
        if not path.endswith(".pdf"):
            raise ValueError(f"The path provided: {path} is not a valid pdf path.") 

        self.path = path
        loader = UnstructuredPDFLoader(path)
        self.pages  = loader.load_and_split()
        if (len(self.pages) > 1):
            raise ValueError(f"The resume provided has more than 1 page. Please send a resume with just one page.")
        embeddings = OpenAIEmbeddings()

        # Just one page resumes accepted.
        self.docsearch = Chroma.from_documents([self.pages[0]], embeddings).as_retriever(search_kwargs={ "k": 1 })
        self.chain = load_qa_chain(OpenAI(temperature=0, max_tokens=2500), chain_type="stuff")

    def ask(self, question : str) -> str:
        docs = self.docsearch.get_relevant_documents(question)
        output = self.chain.run(input_documents=docs, question=question)
        return output

    def extract_details(self) -> str:
        query_to_extract_info = """Using the document, answer the following questions and output valid json with property names enclosed with double quotes with keys: "is_resume", "skills", "years_of_experience", "experience_summary", "achievements", "highest_education":

        1. Is this document of a resume? Answer in "True" or "False". The answer should correspond to the "is_resume" key.
        2. What are the candidates skills? The answer should be a json list associated with the "skills" key.
        3. How many years of experience does the candidate have? The answer should correspond to the "years_of_experience" key.
        4. Based on the candidate's experience, extract achievements that are backed by numbers that the candidate has made in the form of a json list associated with the "achievements" key.
        5. What is the candidate's highest education and field of study?"""
        return self.ask(query_to_extract_info)

### Software Engineering Resume Details

In [5]:
software_engineer_doc = ResumeExtractor("./ResumeExamples/SoftwareEngineer.pdf")
software_engineer_doc.pages[0].page_content

'FIRST LAST Bay Area, California • +1-234-456-789 • professionalemail@resumeworded.com • linkedin.com/in/username\n\nSoftware Engineer with over six years of experience in full-stack development and leading product cycle from conception to completion. Guided a team of 5-15 members through 5+ product launches at a recent experience in a high growth technology startup.\n\nPROFESSIONAL EXPERIENCE\n\nResume Worded, New York, NY Software Engineer\n\n2020 – Present\n\nCreated an invoicing system for subscription services that managed monthly invoices and\n\nprinted invoices to be sent to customers; increased conversion rate by 15%.\n\nCollaborated with internal teams, including graphic design and QA testers to develop and\n\nlaunch a new application in just 6 months, ahead of schedule by 6 months.\n\nWrote reusable unit test documents to ensure quality control and detect bugs by increasing\n\nover 35% efﬁciency rate.\n\nGrowthsi, San Diego, CA Software Engineer\n\n2016 – 2020\n\nAnalyzed inf

In [6]:
software_engineering_details = software_engineer_doc.extract_details()

In [10]:
software_engineering_resume_details = json.loads(software_engineering_details)
software_engineering_resume_details

{'is_resume': 'True',
 'skills': ['CSS',
  'Javascript',
  'Python',
  'Advanced SAP',
  'HTML and XML',
  'Scrum Methodology',
  'Database management software',
  'Software Development Life Cycle'],
 'years_of_experience': 6,
 'experience_summary': 'Software Engineer with over six years of experience in full-stack development and leading product cycle from conception to completion. Guided a team of 5-15 members through 5+ product launches at a recent experience in a high growth technology startup.',
 'achievements': ['Increased conversion rate by 15%',
  'Launched a new application in just 6 months, ahead of schedule by 6 months',
  'Increased efﬁciency rate by 35%',
  'Presented outputs to CTO',
  'Ensured project completion 3 weeks prior to targeted due date',
  'Released and updated 15+ custom .net applications for company clients in health niches',
  'Increased mobile trafﬁc by 22%',
  'Created dynamic web pages using Javascript, resulting in website leads increase by 15%',
  'Inc

### Investment Banking Resume Details

In [13]:
investment_banking_associate_doc = ResumeExtractor("./ResumeExamples/InvestmentBankingAssociate.pdf")

In [14]:
investment_banking_associate_details = json.loads(investment_banking_associate_doc.extract_details())

In [15]:
investment_banking_associate_details

{'is_resume': 'True',
 'skills': ['Restructuring',
  'Valuation',
  'Financial Analysis',
  'Financial Modeling Investment Banking',
  'Mergers and Acquisitions',
  'Leveraged Buyouts',
  'Bond Issuance',
  'Bloomberg Terminal',
  'S&P Capital IQ',
  'Microsoft Excel',
  'LexisNexis',
  "Moody's RiskCalc",
  'Certified Financial Planner (CFP)',
  'Series 7 License',
  'Chartered Financial Analyst (CFA)'],
 'years_of_experience': 10,
 'experience_summary': 'Investment banking associate with 10 years of experience executing transactions and providing world-class client service. Possesses exceptional knowledge of M&A, leveraged finance, and capital markets.',
 'achievements': ['Identified ways to streamline cost savings with 17 Asian counterparts, resulting in a 90% increase in monthly revenue',
  'Delivered marketing pitches to 17 C-suite members of 1.2K large corporate clients in the first six months of employment',
  'Evaluated administrative costs and created a firmwide cost-saving me

### Business Development Manager

In [17]:
business_development_manager_doc = ResumeExtractor("./ResumeExamples/BusinessDevelopmentManager.pdf")

In [18]:
business_dev_manager_details = json.loads(business_development_manager_doc.extract_details())
business_dev_manager_details

{'is_resume': 'True',
 'skills': ['Microsoft Access',
  'Excel',
  'CRM',
  'D2C ECommerce',
  'SEO/SEM',
  'Project Management Professional (PMP)',
  'ABC Certification',
  'Certified Business Analysis Professional (2013)'],
 'years_of_experience': 8,
 'experience_summary': 'Generated 300+ leads per month, landed 25 global accounts in 8 months, cut costs by $1 million annually, collaborated with leadership, engineering, and marketing teams to develop new offerings that raised sales $300k, increased customer retention by 25%, persuaded 5 partners to invest over $200,000 in other entities, coordinated and hosted 15+ conferences, events, and trips for the inside sales team of 10, generated new business and long-term account opportunities through prospecting and cold-calling, resulting in over $60k in closed new and recurring business, developed and implemented sales strategies that identified and produced new business in 3 states, established CRM to improve account tracking, which improv

## Extracting Information From The Job Description

In [5]:
class JobDescriptionExtractor(object):
    def __init__(self, path : str):
        # Precondition checks.
        if path == "" or path == None:
            raise ValueError(f"The path provided: {path} is not a valid path.")

        self.path = path
        self.loader = TextLoader(path)
        documents = self.loader.load()
        #text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
        #texts = text_splitter.split_documents(documents)
        from langchain.indexes import VectorstoreIndexCreator
        self.index = VectorstoreIndexCreator().from_loaders([self.loader])
        #embeddings = OpenAIEmbeddings()
        #self.docsearch = Chroma.from_documents(texts, embeddings).as_retriever(search_kwargs = { "k" : 1 })
        #llm = OpenAI(temperature= 0, max_tokens=3000)
        #self.chain = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=self.docsearch)

    def ask(self, question : str) -> str:
        output = self.index.query(question)
        return output

    def extract_details(self) -> str:
        query_to_extract_info = """Using the document, answer the following questions and output valid json with property names enclosed with double quotes with keys: "is_job_description", "skills_required", "responsibilities", "qualifications", "preferences":

        1. Is this document of a job description? Answer in "True" or "False". The answer should correspond to the "is_job_description" key.
        2. What are the skills required? The answer should be a json list associated with the "skills_required" key.
        3. What are the responsibilities? The answer should be a json list associated with the "responsibilities" key.
        4. What are the qualifications required? The answer should be in the form of a json list associated with the "qualifications" key.
        5. What are the preferences or preferred qualifications or skills? The answer should be in the form of a json list associated with the preferences key.
        """ 
        return self.ask(query_to_extract_info)

### Software Engineering Job Description

In [6]:
software_engineer_job_description = JobDescriptionExtractor("./JobDescriptions/SoftwareEngineering.txt")

In [7]:
software_engineer_job_details  = json.loads(software_engineer_job_description.extract_details())

Number of requested results 4 is greater than number of elements in index 2, updating n_results = 2


In [8]:
software_engineer_job_details

{'is_job_description': 'True',
 'skills_required': ['C#', '.NET Core', 'Azure DevOps', 'Angular', 'Agile'],
 'responsibilities': ['Bring fresh ideas to the table and collaborate with their eclectic team of .NET Software Engineer',
  'Approach tasks and projects the way you want, allowing you to diversify your skillset and fast-track your career'],
 'qualifications': ['Good experience with C#, .NET Core, Azure DevOps, Angular, and Agile'],
 'preferences': ['Work onsite 5-days per week',
  'Generous salary',
  'Holiday pay',
  'PTO',
  'Medical & dental insurance',
  'Clear career pathway for progression and promotion',
  'Training']}

### Investment Banking Job Description

In [29]:
investment_banking_job_description = JobDescriptionExtractor("./JobDescriptions/InvestmentBankingAssociate.txt")

In [30]:
investment_banking_job_details = json.loads(investment_banking_job_description.extract_details())

In [31]:
investment_banking_job_details

{'is_job_description': 'True',
 'skills_required': ['Strong leadership skills',
  'Ability to assess risks inherent in complex credit transactions and mitigate, structure and negotiate accordingly',
  'Excellent problem solving, oral, and written communication skills',
  'Extensive knowledge of products and services'],
 'responsibilities': ['Champion a culture of innovation and a customer centric mindset',
  'Stay up-to-date with industry trends to identify opportunities for innovation or strategic partnerships',
  'Find ways to increase efficiency with existing technical infrastructure through automation, while embracing the innovative opportunities offered by new technologies'],
 'qualifications': ['3+ years with above average performance results in a similar banking role or credit/lending related experience'],
 'preferences': ['Bachelors degree preferred',
  'Superior knowledge of the market dynamics and landscape preferred',
  'Outstanding professional reputation and integrity']}

### Business Development Manager Job Description

In [33]:
business_dev_manager_job_description = JobDescriptionExtractor("./JobDescriptions/InvestmentBankingAssociate.txt")

In [34]:
business_dev_manager_job_details = json.loads(business_dev_manager_job_description.extract_details())

In [35]:
business_dev_manager_job_details

{'is_job_description': 'True',
 'skills_required': ['Strong leadership skills',
  'Ability to assess risks inherent in complex credit transactions and mitigate, structure and negotiate accordingly',
  'Excellent problem solving, oral, and written communication skills',
  'Extensive knowledge of products and services'],
 'responsibilities': ['Champion a culture of innovation and a customer centric mindset',
  'Stay up-to-date with industry trends to identify opportunities for innovation or strategic partnerships',
  'Find ways to increase efficiency with existing technical infrastructure through automation, while embracing the innovative opportunities offered by new technologies'],
 'qualifications': ['3+ years with above average performance results in a similar banking role or credit/lending related experience'],
 'preferences': ['Bachelors degree preferred',
  'Superior knowledge of the market dynamics and landscape preferred',
  'Outstanding professional reputation and integrity']}

## Semantic Comparison 

In [57]:
import openai

class ResumeComparer(object):
    def __init__(self, resume_details : json, job_description_details : json): 
        self.resume_details = resume_details
        self.job_description_details = job_description_details
        self.messages = [{"role": "system", "content" : "You are a sophisticated career advisor who is trying to discern if the resume data matches that of the job description."}]

    '''TODO: Improve this.
    def __chat_completion(self, query : str) -> str:
        model="gpt-3.5-turbo",
        response = openai.ChatCompletion.create(
            model=model
            messages=self.messages,
            temperature=0, # this is the degree of randomness of the model's output
        )
        response_text = (response.choices[0].message["content"])
        return response_text
    '''

    def extract_details(self) -> dict:

        # Comparison 1.
        query_to_extract_info = f"""Based on just the two following json objects of resume details and job description details {self.resume_details} and {self.job_description_details},
        check or suggest the following in a list like fashion:
        1. Check if the resume details match the job description requirements and qualifications.
        2. The main differences.
        3. The similarities.
        4. Suggest improvements to the resume.
        """ 
        self.messages.append( {"role": "user", "content": query_to_extract_info} )
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=self.messages,
            temperature=0, # this is the degree of randomness of the model's output
        )
        response_text = (response.choices[0].message["content"])

        # Chain in comparison.
        query_specifics = "Provide detailed suggestions that'll make the candidate's chance of getting the said job most probable quoting specific lines in the resume that need to be changed."
        self.messages.append({"role" : "assistant", "content" : response_text})
        self.messages.append({"role" : "user", "content" : query_specifics}) 

        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=self.messages,
            temperature=0, # this is the degree of randomness of the model's output
        )
        return {"summary": response_text, "specifics": response.choices[0].message["content"] }

### Software Engineer Resume vs. Job Description Comparison

In [44]:
software_engineering_comparison = ResumeComparer(software_engineering_resume_details, software_engineer_job_description)
software_engineering_comparison_extracted = software_engineering_comparison.extract_details()

In [49]:
software_engineering_comparison_extracted

{'summary': "1. Based on the provided resume details and job description, there are some similarities and differences. The resume details indicate that the candidate has experience in full-stack development, leading product cycles, and has knowledge of CSS, Javascript, Python, Advanced SAP, HTML and XML, Scrum Methodology, Database management software, and Software Development Life Cycle. The job description requires experience in full-stack development, knowledge of Javascript, Python, and HTML/CSS, and experience with Agile/Scrum methodologies. The candidate's experience and skills seem to match the job requirements to some extent, but further analysis is needed to determine if they are a good fit for the role.\n\n2. The main differences between the resume details and job description are that the job description requires experience with Agile/Scrum methodologies, while the resume only mentions Scrum Methodology. Additionally, the job description requires knowledge of HTML/CSS, while 

In [48]:
with open('./Comparisons/SoftwareEngineeringComparison.json', 'w') as f:
    f.write(json.dumps(software_engineering_comparison_extracted, indent= 3))

### Investment Banking Associate Resume vs. Job Description Comparison

In [50]:
investment_banking_comparison = ResumeComparer(investment_banking_associate_details, investment_banking_job_details)
investment_banking_comparison_extracted = investment_banking_comparison.extract_details()

In [51]:
investment_banking_comparison_extracted

{'summary': "1. Check if the resume details match the job description requirements and qualifications:\n- The resume details show that the candidate has 10 years of experience in investment banking, which meets the job description's requirement of 3+ years of experience in a similar banking role or credit/lending related experience.\n- The candidate's skills in restructuring, valuation, financial analysis, financial modeling, investment banking, mergers and acquisitions, leveraged buyouts, bond issuance, Bloomberg Terminal, S&P Capital IQ, Microsoft Excel, LexisNexis, Moody's RiskCalc, Certified Financial Planner (CFP), Series 7 License, and Chartered Financial Analyst (CFA) match the job description's requirement of extensive knowledge of products and services.\n- The candidate's achievements in cost savings, marketing, cost-saving measures, financial analysis, liquidity ratios, credit rating, investment portfolio, bond investments, and payroll system overhaul demonstrate strong probl

In [53]:
with open('./Comparisons/InvestmentBankingComparison.json', 'w') as f:
    f.write(json.dumps(investment_banking_comparison_extracted, indent= 3))

### Business Development Manager Resume vs. Job Description Comparison

In [54]:
business_development_comparison = ResumeComparer(business_dev_manager_job_details, business_dev_manager_job_details)
business_development_comparison_extracted = business_development_comparison.extract_details()

In [55]:
business_development_comparison_extracted

{'summary': '1. The resume details match the job description requirements and qualifications.\n2. There are no main differences between the resume details and the job description requirements and qualifications.\n3. The similarities between the resume details and the job description requirements and qualifications are:\n- Strong leadership skills\n- Ability to assess risks inherent in complex credit transactions and mitigate, structure and negotiate accordingly\n- Excellent problem solving, oral, and written communication skills\n- Extensive knowledge of products and services\n- Champion a culture of innovation and a customer centric mindset\n- Stay up-to-date with industry trends to identify opportunities for innovation or strategic partnerships\n- Find ways to increase efficiency with existing technical infrastructure through automation, while embracing the innovative opportunities offered by new technologies\n- 3+ years with above average performance results in a similar banking rol

In [56]:
with open('./Comparisons/BusinessDevelopmentManager.json', 'w') as f:
    f.write(json.dumps(business_development_comparison_extracted, indent= 3))

## PDF Annotations

In [20]:
from pdfminer.layout import LAParams, LTTextBox
from pdfminer.pdfpage import PDFPage
from pdfminer.pdfinterp import PDFResourceManager
from pdfminer.pdfinterp import PDFPageInterpreter
from pdfminer.converter import PDFPageAggregator

fp = open('./ResumeExamples/SoftwareEngineer.pdf', 'rb')
rsrcmgr = PDFResourceManager()
laparams = LAParams()
device = PDFPageAggregator(rsrcmgr, laparams=laparams)
interpreter = PDFPageInterpreter(rsrcmgr, device)
pages = PDFPage.get_pages(fp)

x, y = 0, 0
text_to_coordinates = {}

for page in pages:
    print('Processing next page...')
    interpreter.process_page(page)
    layout = device.get_result()
    for lobj in layout:
        if isinstance(lobj, LTTextBox):
            x, y, text = lobj.bbox[0], lobj.bbox[3], lobj.get_text()
            text_to_coordinates[text] = x, y
            print('At %r is text: %s' % ((x, y), text))

Processing next page...
At (69.09142125, 767.024002862) is text: FIRST LAST
Bay Area, California • +1-234-456-789 • professionalemail@resumeworded.com • linkedin.com/in/username

At (57.148429875, 720.2899936365) is text: Software Engineer with over six years of experience in full-stack development and leading product
cycle from conception to completion. Guided a team of 5-15 members through 5+ product launches
at a recent experience in a high growth technology startup.

At (222.37683, 666.0789942712499) is text: PROFESSIONAL EXPERIENCE

At (54.0, 642.8289942712499) is text: Resume Worded, New York, NY
Software Engineer

At (476.600415, 642.8289942712499) is text: 2020 – Present

At (72.0, 614.9308490604785) is text: ● Created an invoicing system for subscription services that managed monthly invoices and

At (90.0, 601.0400096365001) is text: printed invoices to be sent to customers; increased conversion rate by 15%.

At (72.0, 587.9308690604786) is text: ● Collaborated with internal 

In [None]:
from pdf_annotate import PdfAnnotator, Location, Appearance
a = PdfAnnotator('./ResumeExamples/SoftwareEngineer.pdf')

x, y = text_to_coordinates['FIRST LAST\n']

a.add_annotation(
    'square',
    Location(x1=x, y1=y, x2=x+30, y2=y+30, page=0),
    Appearance(stroke_color=(1, 0, 0), stroke_width=5),
)

    
a.write('./b.pdf')  # or use overwrite=True if you feel lucky