# ZAF059-Resume Screening Using Large Language Modelling
**Developer Name:** SIVADHANDAPANI S


**E-mail:** sivadhandapanis25@gmail.com

# 1.Checking Python Version
Knowing the exact Python version is very helpful when creating a container, a venv, or any config files before pushing to production.

In [None]:
!python --version

Python 3.10.12
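
The same check can also be made from inside Python, which is handy when a script should refuse to run on an older interpreter. This is a minimal sketch, not part of the original notebook: the helper name `meets_minimum` is hypothetical, and the 3.10 minimum is assumed only because that is the version printed above.

```python
import sys

def meets_minimum(required=(3, 10), current=None):
    """Return True if the interpreter version is at least `required`."""
    if current is None:
        current = sys.version_info[:2]
    # Compare only (major, minor) so patch releases do not matter
    return tuple(current[:2]) >= tuple(required)

print(sys.version.split()[0], meets_minimum())
```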


# 2.Install necessary libraries

In [None]:
!pip install pdf2image google-generativeai



# 3.Import necessary libraries

In [None]:
import base64
import os
import io
from PIL import Image
import pdf2image
import google.generativeai as genai

# 4.Define a Function to get a response from LLM

Note: I use the Gemini Pro Vision model (`gemini-pro-vision`) as the LLM, because the resume is passed to it as an image.

In [None]:
genai.configure(api_key="YOUR_GEMINI_API_KEY") # Replace with your own Gemini API key; never commit a real key

def get_gemini_response(input,pdf_content,prompt):
    # Send the instruction text, the first-page image and the job description in one request
    model=genai.GenerativeModel('gemini-pro-vision')
    response=model.generate_content([input,pdf_content[0],prompt])
    return response.text
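
Rather than pasting the key into the notebook, it could be read from an environment variable. The sketch below is one possible way to do that, under assumptions: the variable name `GEMINI_API_KEY` and the helper `load_api_key` are hypothetical and not part of the original notebook or of the SDK.

```python
import os

def load_api_key(env_var="GEMINI_API_KEY"):
    """Read the API key from an environment variable; fail loudly if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running this notebook")
    return key

# Usage (commented out so the sketch runs without the SDK or a real key):
# genai.configure(api_key=load_api_key())
```

Keeping the key out of the notebook means the file can be shared or committed without exposing the credential.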

# 5.Define a Function to convert a PDF to input for Gemini Model

**Function Explanation:**

1. **Argument**:
   - `uploaded_file_path`: This is the path to the uploaded PDF file.

2. **Check if file path is provided**:
   - The function checks whether `uploaded_file_path` is `None`. If a path was provided, it proceeds with processing the PDF file. Otherwise, it raises a `FileNotFoundError` indicating that no file was uploaded.

3. **PDF to Image Conversion**:
   - The function uses `pdf2image.convert_from_path()` to convert the PDF file into a list of PIL (Python Imaging Library) images, one per page.

4. **Extract First Page**:
   - It selects the first page from the list of images generated from the PDF.

5. **Convert Image to Bytes**:
   - It converts the first page image into a byte array using `io.BytesIO()` and `save()` methods. This byte array represents the image in memory.

6. **Encode Image as Base64**:
   - The byte array representing the image is then encoded into a base64 string using `base64.b64encode()` method. This is done to facilitate easy transmission of image data.

7. **Construct Metadata**:
   - It constructs a dictionary containing metadata about the image, including its MIME type (`image/jpeg`) and the base64 encoded image data.

8. **Return Output**:
   - The function returns a list containing this dictionary, representing the image data in a format that can be easily transmitted or processed further.


In [None]:
def input_pdf_setup(uploaded_file_path):
    if uploaded_file_path is not None:
        ## Convert the PDF to image
        images=pdf2image.convert_from_path(uploaded_file_path)
        # Use convert_from_bytes instead if you receive a file object (for a web-based application)
        # Eg:
        #images=pdf2image.convert_from_bytes(file_obj.read())
        first_page=images[0]

        # Convert to bytes
        img_byte_arr = io.BytesIO()
        first_page.save(img_byte_arr, format='JPEG')
        img_byte_arr = img_byte_arr.getvalue()

        pdf_parts = [
            {
                "mime_type": "image/jpeg",
                "data": base64.b64encode(img_byte_arr).decode()  # encode to base64
            }
        ]
        return pdf_parts
    else:
        raise FileNotFoundError("No file uploaded")
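
The encoding and metadata steps (5 to 7 above) can be checked in isolation, without a PDF or Poppler. The following is only a sketch: `to_inline_part` is a hypothetical helper that mirrors the dictionary built in `input_pdf_setup`, and the sample bytes are a stand-in, not a real JPEG.

```python
import base64

def to_inline_part(image_bytes, mime_type="image/jpeg"):
    """Wrap raw image bytes in the same dict layout that input_pdf_setup returns."""
    return {
        "mime_type": mime_type,
        "data": base64.b64encode(image_bytes).decode(),  # base64 text, not raw bytes
    }

# Stand-in payload: the JPEG start-of-image marker followed by filler, not a real image
sample = b"\xff\xd8\xff" + b"\x00" * 16
part = to_inline_part(sample)

# Decoding the base64 text must give back exactly the original bytes
assert base64.b64decode(part["data"]) == sample
print(part["mime_type"], len(part["data"]))
```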

# 6.Installing Poppler and adding it to the PATH variable for pdf2image [Error Handling]

In [None]:
!apt-get install poppler-utils libpoppler-cpp-dev
!pip install -v -v python-poppler
!pip install pdf2image

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libpoppler-cpp0v5
The following NEW packages will be installed:
  libpoppler-cpp-dev libpoppler-cpp0v5 poppler-utils
0 upgraded, 3 newly installed, 0 to remove and 45 not upgraded.
Need to get 236 kB of archives.
After this operation, 928 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 libpoppler-cpp0v5 amd64 22.02.0-2ubuntu0.3 [38.7 kB]
Get:2 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 libpoppler-cpp-dev amd64 22.02.0-2ubuntu0.3 [11.7 kB]
Get:3 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 poppler-utils amd64 22.02.0-2ubuntu0.3 [186 kB]
Fetched 236 kB in 2s (99.8 kB/s)
Selecting previously unselected package libpoppler-cpp0v5:amd64.
(Reading database ... 121920 files and directories currently installed.)
Preparing to unpack .../libpoppler-cpp0v5_22.02.0
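
After the installation, it may help to confirm that the Poppler command-line tools are actually visible on the PATH before calling `pdf2image`. The sketch below assumes that `pdftoppm` and `pdfinfo` are the binaries `pdf2image` relies on; the helper name `poppler_status` is hypothetical.

```python
import shutil

def poppler_status(tools=("pdftoppm", "pdfinfo")):
    """Map each Poppler command-line tool to its resolved path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}

status = poppler_status()
missing = [tool for tool, path in status.items() if path is None]
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("Poppler tools found:", status)
```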

# **Sample Job Description:**
**Job Title: Junior Data Scientist**

**Job Description:**

Are you passionate about data and its potential to drive insights and innovation? We are seeking a motivated and analytical Junior Data Scientist to join our dynamic team. As a Junior Data Scientist, you will work alongside experienced professionals to contribute to various data science projects aimed at solving real-world problems and enhancing business performance.

**Key Responsibilities:**

1. **Data Collection and Preparation:** Assist in gathering and cleaning data from various sources, ensuring its accuracy and reliability for analysis.

2. **Exploratory Data Analysis (EDA):** Conduct preliminary analysis to understand the structure and patterns within the data, identifying potential trends and outliers.

3. **Statistical Analysis:** Apply basic statistical techniques to interpret data and derive meaningful insights.

4. **Machine Learning Modeling:** Collaborate with senior team members to develop and implement machine learning models for predictive analysis, classification, and clustering tasks.

5. **Model Evaluation and Validation:** Evaluate model performance using appropriate metrics and techniques, and validate model results to ensure reliability and generalization.

6. **Data Visualization:** Create visualizations to effectively communicate findings and insights to stakeholders using tools like matplotlib, seaborn, or Tableau.

7. **Documentation:** Maintain thorough documentation of data science processes, methodologies, and results to facilitate knowledge sharing and reproducibility.

**Requirements:**

- Bachelor’s degree in Computer Science, Statistics, Mathematics, or related field.
- Proficiency in programming languages such as Python or R.
- Familiarity with data manipulation libraries like pandas and numpy.
- Basic understanding of machine learning algorithms and techniques.
- Strong analytical and problem-solving skills.
- Excellent communication and teamwork abilities.

**Preferred Qualifications:**

- Experience with data visualization tools such as matplotlib, seaborn, or Tableau.
- Knowledge of SQL for data manipulation and extraction.
- Previous internship or project experience in data science or related field.

**Benefits:**

- Competitive salary and benefits package.
- Opportunities for career growth and professional development.
- Collaborative and inclusive work environment.
- Exposure to cutting-edge technologies and methodologies in data science.
- Chance to make a real impact by contributing to meaningful projects that drive business success.

Join us in harnessing the power of data to unlock new insights and drive innovation! Apply now to kick-start your career in data science.

# 7.Testing the LLM model with Sample data

In [None]:
input_prompt= """
You are a skilled ATS (Applicant Tracking System) scanner with a deep understanding of all technologies and ATS functionality.
Your task is to evaluate the resume against the provided job description. Give me the percentage of match if the resume matches
the job description. The output should come first as the percentage, then the missing keywords, and last the final thoughts.
"""
job_description='''
Job Title: Junior Data Scientist

Job Description:

Are you passionate about data and its potential to drive insights and innovation? We are seeking a motivated and analytical Junior Data Scientist to join our dynamic team. As a Junior Data Scientist, you will work alongside experienced professionals to contribute to various data science projects aimed at solving real-world problems and enhancing business performance.

Key Responsibilities:

Data Collection and Preparation: Assist in gathering and cleaning data from various sources, ensuring its accuracy and reliability for analysis.

Exploratory Data Analysis (EDA): Conduct preliminary analysis to understand the structure and patterns within the data, identifying potential trends and outliers.

Statistical Analysis: Apply basic statistical techniques to interpret data and derive meaningful insights.

Machine Learning Modeling: Collaborate with senior team members to develop and implement machine learning models for predictive analysis, classification, and clustering tasks.

Model Evaluation and Validation: Evaluate model performance using appropriate metrics and techniques, and validate model results to ensure reliability and generalization.

Data Visualization: Create visualizations to effectively communicate findings and insights to stakeholders using tools like matplotlib, seaborn, or Tableau.

Documentation: Maintain thorough documentation of data science processes, methodologies, and results to facilitate knowledge sharing and reproducibility.

Requirements:

Bachelor’s degree in Computer Science, Statistics, Mathematics, or related field.
Proficiency in programming languages such as Python or R.
Familiarity with data manipulation libraries like pandas and numpy.
Basic understanding of machine learning algorithms and techniques.
Strong analytical and problem-solving skills.
Excellent communication and teamwork abilities.
Preferred Qualifications:

Experience with data visualization tools such as matplotlib, seaborn, or Tableau.
Knowledge of SQL for data manipulation and extraction.
Previous internship or project experience in data science or related field.
Benefits:

Competitive salary and benefits package.
Opportunities for career growth and professional development.
Collaborative and inclusive work environment.
Exposure to cutting-edge technologies and methodologies in data science.
Chance to make a real impact by contributing to meaningful projects that drive business success.
Join us in harnessing the power of data to unlock new insights and drive innovation! Apply now to kick-start your career in data science.
'''
resume1="/content/Sivadhandapani_S_DS_Resume.pdf" #Replace Resume path
if resume1 is not None:
  pdf_content=input_pdf_setup(uploaded_file_path=resume1)
  response=get_gemini_response(input_prompt,pdf_content,job_description)
  print("The Response is:")
  print(response)
else:
  print("Please upload the resume")

The Response is:
 Percentage Match: 75%

Keywords Missing: NLP, Natural Language Processing, Deep Learning, Computer Vision, SQL, NoSQL, Hadoop, Spark, Hive, Pig, TensorFlow, PyTorch, Keras, scikit-learn, Pandas, NumPy, Matplotlib, Seaborn, Tableau, Power BI, QlikView, Data Mining, Data Warehousing, Data Integration, Data Governance, Data Security, Data Privacy, Data Ethics, Cloud Computing, AWS, Azure, Google Cloud Platform, Big Data, Artificial Intelligence, Machine Learning, Deep Learning, Natural Language Processing, Computer Vision, Robotics, Autonomous Vehicles, Internet of Things, Blockchain, Augmented Reality, Virtual Reality, Mixed Reality, Extended Reality, 5G, Edge Computing, Quantum Computing, Cybersecurity, Information Security, Risk Management, Compliance, Governance, Audit, Controls, Data Protection, Privacy, Incident Response, Disaster Recovery, Business Continuity, Information Technology, IT Service Management, ITIL, DevOps, Agile, Scrum, Kanban, Lean, Six Sigma, Conti

In [None]:
resume2="/content/web-developer-resume-example.pdf" #Replace Resume path
if resume2 is not None:
  pdf_content=input_pdf_setup(uploaded_file_path=resume2)
  response=get_gemini_response(input_prompt,pdf_content,job_description)
  print("The Response is:")
  print(response)
else:
  print("Please upload the resume")

The Response is:
 Percentage Match: 40%

Keywords Missing: Python, R, pandas, numpy, machine learning algorithms, SQL, data visualization tools

Final Thoughts: The candidate has a background in computer science and has taken some relevant coursework, but their skills and experience do not closely align with the job description.
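
To compare several resumes, the percentage could be pulled out of the free-text response and used for ranking. This is a rough sketch, assuming the model keeps the "Percentage Match: NN%" pattern seen in the two outputs above; `parse_match_percentage` is a hypothetical helper, and the sample strings are shortened copies of those outputs.

```python
import re

def parse_match_percentage(response_text):
    """Return the first 'NN%' value in the response as an int, or None if absent."""
    match = re.search(r"(\d{1,3})\s*%", response_text)
    if match is None:
        return None
    return min(int(match.group(1)), 100)  # clamp in case the model overshoots

# Shortened copies of the two responses shown above
sample_outputs = {
    "resume1": "Percentage Match: 75%\n\nKeywords Missing: NLP",
    "resume2": "Percentage Match: 40%\n\nKeywords Missing: Python, R",
}
scores = {name: parse_match_percentage(text) for name, text in sample_outputs.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores, ranked)
```

Because the model output is free text, the pattern may not always be present, so the `None` case should be handled before sorting real results.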
