# AI-Powered Resume Analyzer for Job Postings

This tool analyzes a resume against a specific job posting and reports:

- Identification of skill gaps
- Keyword matching between the CV and the job description
- Tailored recommendations for CV improvement
- An alignment score reflecting how well the CV fits the job
- Personalized feedback
- Job market trend insights

An example of the tool's output can be found [here](https://tvarol.github.io/sideProjects/AILLMAgents/output.html).

In [1]:
# Imports
import os
import io
import time
import requests
import PyPDF2
from dotenv import load_dotenv
from IPython.display import Markdown, display
from openai import OpenAI
from ipywidgets import Textarea, FileUpload, Button, VBox, HTML

In [2]:
# Load environment variables
load_dotenv(override=True)
api_key = os.getenv('OPENAI_API_KEY')

# Check the key
if not api_key:
    print("No API key was found!!!")
else:
    print("API key found and looks good so far!")

API key found and looks good so far!
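
The check above only confirms that a key is present. A slightly stricter check is sketched below; `check_api_key` is a hypothetical helper that is not part of the original notebook, and the `sk-` prefix test rests on the assumption that OpenAI keys follow that convention.

```python
import os


def check_api_key(api_key):
    """Return a short status message describing the API key (hypothetical helper)."""
    if not api_key:
        return "No API key was found"
    if api_key != api_key.strip():
        # Stray spaces or newlines often sneak in when a key is pasted into .env
        return "API key has leading or trailing whitespace"
    if not api_key.startswith("sk-"):
        # Assumption: OpenAI keys conventionally begin with "sk-"
        return "API key does not start with 'sk-'"
    return "API key found and looks good so far"


print(check_api_key(os.getenv("OPENAI_API_KEY")))
```

None of these checks prove the key is valid; only a successful API call does that.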


In [3]:
openai = OpenAI()

## Using a Frontier Model (GPT-4o Mini) for This Project

### Types of Prompts

Models like GPT-4o have been trained to receive instructions in a particular way.

They expect to receive:

**A system prompt** that tells them what task they are performing and what tone they should use

**A user prompt** -- the conversation starter that they should reply to

In [4]:
# Define our system prompt
system_prompt = """You are a powerful AI model designed to assist with resume analysis.
Your task is to analyze a resume against a given job posting and provide feedback on how
well the resume aligns with the job requirements. Your response should include the following:
1) Skill gap identification: Compare the skills listed in the resume with those required in the job posting,
highlighting areas where the resume may be lacking or overemphasized.
2) Keyword matching between a CV and a job posting: Match keywords from the job description with the resume,
determining how well they align. Provide specific suggestions for missing keywords to add to the CV.
3) Recommendations for CV improvement: Provide actionable suggestions on how to enhance the resume, such as
adding missing skills or rephrasing experience to match job requirements.
4) Alignment score: Display a score that represents the degree of alignment between the resume and the job posting.
5) Personalized feedback: Offer tailored advice based on the job posting, guiding the user on how to optimize their
CV for the best chances of success.
6) Job market trend insights: Provide broader market trends and insights, such as in-demand skills and salary ranges.
Provide responses that are concise, clear, and to the point. Respond in markdown."""

In [5]:
# The job posting and the CV are required to define the user prompt
# The user will input the job posting as text in a box here
# The user will upload the CV in PDF format, from which the text will be extracted

# You might need to install PyPDF2 via pip if it's not already installed
# !pip install PyPDF2

# Create widgets - to create a box for the job posting text
job_posting_area = Textarea(
    placeholder='Paste the job posting text here...',
    description='Job Posting:',
    disabled=False,
    layout={'width': '800px', 'height': '300px'}
)

# Define file upload for CV
cv_upload = FileUpload(
    accept='.pdf',  # Only accept PDF files
    multiple=False,  # Only allow single file selection
    description='Upload CV (PDF)'
)

status = HTML(value="<b>Status:</b> Waiting for inputs...")

# Create Submit Buttons
submit_cv_button = Button(description='Submit CV', button_style='success')
submit_job_posting_button = Button(description='Submit Job Posting', button_style='success')

# Initialize variables to store the data
# This dictionary will hold the text for both the job posting and the CV
# It will be used to define the user_prompt
for_user_prompt = {
    'job_posting': '',
    'cv_text': ''
}

# Functions
def submit_cv_action(b):
    # Nothing to process until a file has been chosen in the upload widget
    if not cv_upload.value:
        status.value = "<b>Status:</b> Please upload a CV before submitting."
        return

    # Get the uploaded file
    # (indexing with [0] assumes ipywidgets 8, where value is a tuple of dicts)
    uploaded_file = cv_upload.value[0]
    content = io.BytesIO(uploaded_file['content'])

    try:
        pdf_reader = PyPDF2.PdfReader(content)
        cv_text = ""
        for page in pdf_reader.pages:
            # Guard against pages that yield no text (e.g. scanned images)
            cv_text += page.extract_text() or ""
    except Exception as e:
        status.value = f"<b>Status:</b> Error processing PDF: {str(e)}"
        return

    if not cv_text.strip():
        status.value = "<b>Status:</b> No text could be extracted from the PDF."
        return

    # Store CV text in for_user_prompt
    for_user_prompt['cv_text'] = cv_text
    status.value = "<b>Status:</b> CV uploaded and processed successfully!"

    time.sleep(0.5)  # Short pause between upload and submit messages to display both
    status.value = "<b>Status:</b> CV submitted successfully!"

def submit_job_posting_action(b):
    for_user_prompt['job_posting'] = job_posting_area.value
    if for_user_prompt['job_posting']:
        #print("Job Posting Submitted:")
        #print(for_user_prompt['job_posting'])
        status.value = "<b>Status:</b> Job posting submitted successfully!"
    else:
        status.value = "<b>Status:</b> Please enter a job posting before submitting."

# Attach actions to buttons
submit_cv_button.on_click(submit_cv_action)
submit_job_posting_button.on_click(submit_job_posting_action)

# Layout
job_posting_box = VBox([job_posting_area, submit_job_posting_button])
cv_buttons = VBox([submit_cv_button])

# Display all widgets
display(VBox([
    HTML(value="<h3>Input Job Posting and CV</h3>"),
    job_posting_box,
    cv_upload,
    cv_buttons,
    status
]))
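
The page-by-page text extraction in the cell above can be isolated into a small helper that is easy to test without a real PDF. This is only a sketch: `join_page_text` is a hypothetical name, it accepts any iterable of objects exposing an `extract_text()` method (as the pages of a `PyPDF2.PdfReader` do), and it joins pages with a newline, which differs slightly from the direct concatenation used in the cell.

```python
def join_page_text(pages):
    """Concatenate the text of page-like objects exposing extract_text().

    Hypothetical helper: pages that return None or an empty string
    contribute an empty entry instead of raising an error.
    """
    parts = []
    for page in pages:
        text = page.extract_text() or ""  # treat a missing result as empty
        parts.append(text)
    # Newline between pages so words at page boundaries do not run together
    return "\n".join(parts).strip()
```

With PyPDF2 this could be called as `join_page_text(PyPDF2.PdfReader(content).pages)`.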


In [21]:
# Now define user_prompt using for_user_prompt dictionary
# Clearly label each input to differentiate the job posting and CV
# The model can parse and analyze each section based on these labels
user_prompt = f"""
Job Posting:
{for_user_prompt['job_posting']}

CV:
{for_user_prompt['cv_text']}
"""

## Messages

The OpenAI API expects to receive messages in a particular structure.
Many other LLM APIs share this structure:

```
[
    {"role": "system", "content": "system message goes here"},
    {"role": "user", "content": "user message goes here"}
]
```

In [22]:
# Define messages with system_prompt and user_prompt
def messages_for(system_prompt_input, user_prompt_input):
    return [
        {"role": "system", "content": system_prompt_input},
        {"role": "user", "content": user_prompt_input}
    ]
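
Before sending a message list, it can help to confirm it has the structure described above. The sketch below is illustrative only: `validate_messages` and `ALLOWED_ROLES` are hypothetical names, and the role set is limited to the roles relevant to this notebook rather than everything the API may accept.

```python
# Assumption: only the roles relevant to this notebook are listed here
ALLOWED_ROLES = {"system", "user", "assistant"}


def validate_messages(messages):
    """Raise ValueError unless messages is a non-empty list of role/content dicts."""
    if not isinstance(messages, list) or not messages:
        raise ValueError("messages must be a non-empty list")
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            raise ValueError(f"message {i} is not a dict")
        missing = {"role", "content"} - set(message)
        if missing:
            raise ValueError(f"message {i} is missing keys: {sorted(missing)}")
        if message["role"] not in ALLOWED_ROLES:
            raise ValueError(f"message {i} has unexpected role {message['role']!r}")
        if not isinstance(message["content"], str):
            raise ValueError(f"message {i} content must be a string")
    return messages
```

It returns the list unchanged, so it can wrap the existing call: `validate_messages(messages_for(system_prompt, user_prompt))`.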

In [23]:
# And now: call the OpenAI API.
response = openai.chat.completions.create(
    model = "gpt-4o-mini",
    messages = messages_for(system_prompt, user_prompt)
)

# Response is provided in Markdown and displayed accordingly
display(Markdown(response.choices[0].message.content))

# Resume Analysis for Cloud Data Engineer Position

## 1) Skill Gap Identification
### Lacking Skills:
- **Cloud Technologies**: The resume emphasizes Azure over AWS and lacks specific experience with AWS Glue, Step Functions, or other AWS services mentioned in the job posting.
- **Infrastructure as Code (IaC)**: Terraform is specifically mentioned as a requirement, but there's no evidence of experience or skills portrayed in that area.
- **Streaming Technologies**: Experience with Confluent Kafka or Spark Streaming is not mentioned.
- **Data Modeling Techniques**: While data modeling is mentioned, specific techniques like dimensional and time series modeling (Star Schemas, Data Vault) are not highlighted.

### Overemphasized Skills:
- **Actuarial Science and Financial Modeling**: While valuable, these skills may not be as relevant to the data engineering position being applied for, given the technical requirements focused on data pipelines and cloud infrastructure.

## 2) Keyword Matching
### Keywords Matched:
- Data Engineering (mentioned)
- Python, SQL (mentioned)
- ETL/ELT processes (mentioned)
- Data Warehousing (related concepts)
- Collaboration with technical teams (implied through various roles)

### Suggestions for Missing Keywords:
- "Cloud Infrastructure", "Big Data tools", "AWS Glue", "Kafka", "Terraform", "Infrastructure as Code", "Agile Software Development", "No-SQL databases", "Docker", "Kubernetes", "Data Modeling techniques", "Streaming technologies".

## 3) Recommendations for CV Improvement
1. **Highlight Experience with AWS**: Specifically mention any projects where AWS was used, particularly tools like Glue or Step Functions.
2. **Add IaC Experience**: Include any Terraform knowledge or projects, perhaps under Development Skills.
3. **Mention Streaming Technologies**: If applicable, describe any experience with tools like Kafka or Spark Streaming.
4. **Expand on Data Modeling**: Specific techniques such as Data Vault or Star Schemas should be included if they are part of your skill set.
5. **Tailor Job Titles**: Adjust job titles or add relevant keywords within existing job descriptions (e.g., explicitly mention data pipeline development).

## 4) Alignment Score
**Alignment Score: 65/100**  
This score reflects a moderate alignment due to strong quantitative skills and cloud experience, but substantial skill gaps in specifically required tools and techniques.

## 5) Personalized Feedback
To optimize your CV for the Cloud Data Engineer role at Regions, emphasize your direct experience with cloud technologies, particularly AWS. Consider presenting data engineering projects or components of your work that relate directly to data pipeline construction and machine learning model deployment. Utilize industry vernacular from the job description, mirroring their phrasing to boost keyword optimization.

## 6) Job Market Trend Insights
- **In-Demand Skills**: Cloud computing (AWS, Azure), Data Engineering, Machine Learning, Infrastructure as Code (Terraform), Data Modeling, and Agile methodologies.
- **Salary Range**: For Cloud Data Engineers in major cities across the U.S., the salary range is typically between **$100,000 to $150,000** per year, depending on experience and specific skills.
- **Trends**: There is a growing need for professionals who can bridge the gap between data engineering and analytical teams, especially in sectors like financial services that require robust data solutions.

By taking these recommendations into account and tailoring your resume specifically for the Cloud Data Engineer position, you'll enhance your chances of standing out to recruiters and hiring managers.
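
The API call in the cell above, and the score in the response it produced, can both be handled programmatically. The following is a minimal sketch under stated assumptions: `analyze_resume` and `extract_alignment_score` are hypothetical helpers, the client is passed in so the function works with the `openai` object created earlier, and the regular expression assumes the model writes the score as `Alignment Score: NN/100`, which the prompt does not guarantee.

```python
import re


def analyze_resume(client, system_prompt, user_prompt, model="gpt-4o-mini"):
    """Send the prompts to the chat completions endpoint and return the reply text."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content


def extract_alignment_score(markdown_text):
    """Return the score as an int, or None if the expected pattern is absent.

    Assumption: the model formats the score as 'Alignment Score: NN/100'.
    """
    match = re.search(r"Alignment Score:\s*(\d{1,3})\s*/\s*100", markdown_text, re.IGNORECASE)
    return int(match.group(1)) if match else None
```

In the notebook, `analyze_resume(openai, system_prompt, user_prompt)` would replace the inline call, and applying `extract_alignment_score` to the sample response above would return 65. Because the model is free to phrase the score differently, a `None` result should be treated as "not found" rather than as an error.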

In [None]:
## If you would like to save the response content as a Markdown file, uncomment the following lines
#with open('yourfile.md', 'w') as file:
#    file.write(response.choices[0].message.content)

## You can then run the line below (requires pandoc) to create output.html, which you can open in your browser
#!pandoc yourfile.md -o output.html
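
The same two steps can be done from plain Python instead of commented-out lines and a shell escape. This is a sketch rather than part of the original notebook: `save_markdown` and `convert_to_html` are hypothetical helpers, and the conversion step assumes the `pandoc` executable is installed and on the PATH.

```python
import shutil
import subprocess
from pathlib import Path


def save_markdown(text, path="yourfile.md"):
    """Write the response text to a Markdown file and return its path."""
    out = Path(path)
    out.write_text(text, encoding="utf-8")
    return out


def convert_to_html(md_path, html_path="output.html"):
    """Convert Markdown to HTML with pandoc; return False if pandoc is unavailable."""
    if shutil.which("pandoc") is None:
        return False
    # Equivalent to the shell command: pandoc yourfile.md -o output.html
    subprocess.run(["pandoc", str(md_path), "-o", str(html_path)], check=True)
    return True
```

For example, `convert_to_html(save_markdown(response.choices[0].message.content))` would write both files in the current working directory.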