# Overview
### *Purpose*: This notebook demonstrates an iterative approach to building a system that uses a language model to analyze and score resumes against specific criteria.

### *Summary*: The primary task is to guide the model in understanding a set of criteria and then use that understanding to score each skill found in a given resume. The scoring prompt is refined iteratively, with each iteration designed to address the shortcomings of the previous one.

First, install the libraries this notebook needs.

In [None]:
!pip install openai
!pip install PyMuPDF
!pip install textract
!pip install python-docx
!pip install tiktoken

Collecting openai
  Downloading openai-0.28.0-py3-none-any.whl (76 kB)
Installing collected packages: openai
Successfully installed openai-0.28.0
Collecting PyMuPDF
  Downloading PyMuPDF-1.23.3-cp310-none-manylinux2014_x86_64.whl (4.3 MB)
Collecting PyMuPDFb==1.23.3 (from PyMuPDF)
  Downloading PyMuPDFb-1.23.3-py3-none-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (30.6 MB)
Installing collected packages: PyMuPDFb, PyMuPDF
Successfully installed PyMuPDF-1.23.3 PyMuPDFb-1.23.3
Collecting textract
  Downloading textract-1.6.5-py3-none-any.whl (23 kB)
Collecting argcomplete~=1.10.0 (from textract)
  Downloading argcomplete-

Collecting python-docx
  Downloading python-docx-0.8.11.tar.gz (5.6 MB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: python-docx
  Building wheel for python-docx (setup.py) ... done
  Created wheel for python-docx: filename=python_docx-0.8.11-py3-none-any.whl size=184487 sha256=b725ad7dc3fe33ac163483889ad9947c4220afe8a97458f965f66d073cd55985
  Stored in directory: /root/.cache/pip/wheels/80/27/06/837436d4c3bd989b957a91679966f207bfd71d358d63a8194d
Successfully built python-docx
Installing collected packages: python-docx
Successfully installed python-docx-0.8.11
Collecting tiktoken
  Downloading tiktoken-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB)
Installing co

Upload the `.env` file containing `OPENAI_API_KEY` to the `/content/` directory.

The cells below load sensitive values such as the OpenAI API key from that file into environment variables, so the key never appears in the notebook itself.

In [None]:
# Export your API Key to environment variable
# Upload the .env file to the directory "/content/"
!pip install python-dotenv
from dotenv import load_dotenv
load_dotenv()

Collecting python-dotenv
  Downloading python_dotenv-1.0.0-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.0.0


True

In [None]:
import openai
import os
# Retrieve the API key from environment variable
openai_api_key = os.getenv("OPENAI_API_KEY")

# Set the API key for OpenAI
openai.api_key = openai_api_key
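
If the `.env` file is missing or the variable name is misspelled, `os.getenv` returns `None` and the problem only surfaces later, on the first API call. A minimal sketch of an early check follows; `require_env` is a hypothetical helper and is not part of the original notebook.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "check that the .env file was uploaded and loaded."
        )
    return value


# Example usage (assumes load_dotenv() has already run):
# openai.api_key = require_env("OPENAI_API_KEY")
```

Failing at this point keeps the error message close to its cause instead of inside a later scoring call.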

Upload three files: the JSON file with the job requirements generated in Assignment1 (`requirements_output.json`), the JSON file with the filtered resumes and their summaries generated in Assignment3 (`filtered_applications_summary.json`), and the text file with the scoring criteria generated in Assignment6 (`criterion_and_string_match_output.txt`).

In [None]:
from google.colab import files

# Upload the first file
print("Please upload the first file (filtered_applications_summary.json):")
uploaded1 = files.upload()

# Check to ensure a file was uploaded. If not, prompt again.
while len(uploaded1) == 0:
    print("No file uploaded. Please upload the first file (filtered_applications_summary.json) again:")
    uploaded1 = files.upload()

# Upload the second file
print("Please upload the second file (requirements_output.json):")
uploaded2 = files.upload()

# Check to ensure a file was uploaded. If not, prompt again.
while len(uploaded2) == 0:
    print("No file uploaded. Please upload the second file (requirements_output.json) again:")
    uploaded2 = files.upload()

# Upload the third file
print("Please upload the third file (criterion_and_string_match_output.txt):")
uploaded3 = files.upload()

# Check to ensure a file was uploaded. If not, prompt again.
while len(uploaded3) == 0:
    print("No file uploaded. Please upload the third file (criterion_and_string_match_output.txt) again:")
    uploaded3 = files.upload()

# Merge the dictionaries to have all uploaded files in one
uploaded = {**uploaded1, **uploaded2, **uploaded3}

# Print details of uploaded files
for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))


Please upload the first file (filtered_applications_summary.json):


Saving filtered_applications_summary.json to filtered_applications_summary.json
Please upload the second file (requirements_output.json):


Saving requirements_output (3).json to requirements_output (3).json
Please upload the third file (criterion_and_string_match_output.txt):


Saving criterion_and_string_match_output (1).txt to criterion_and_string_match_output (1).txt
User uploaded file "filtered_applications_summary.json" with length 13380 bytes
User uploaded file "requirements_output (3).json" with length 810 bytes
User uploaded file "criterion_and_string_match_output (1).txt" with length 2531 bytes
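
In the output above, two files were saved with a duplicate suffix (`requirements_output (3).json`, `criterion_and_string_match_output (1).txt`), while later cells read the unsuffixed paths under `/content/`. The sketch below shows one way to recover the expected name; `canonical_upload_name` is a hypothetical helper, and the suffix pattern it strips is an assumption based on the names shown in that output.

```python
import re


def canonical_upload_name(filename):
    """Strip a duplicate suffix such as ' (3)' that sits just before the extension.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    return re.sub(r" \(\d+\)(?=\.[^.]+$)", "", filename)


print(canonical_upload_name("requirements_output (3).json"))
print(canonical_upload_name("filtered_applications_summary.json"))
```

The bytes held in `uploaded[fn]` could then be written to the canonical name, so that the later `read_json` and `read_requirements` calls pick up the file that was just uploaded.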


Now download `Webinar_resumes.zip`, which contains all the resumes.


In [None]:
import requests

def download_file_from_google_drive(file_id, destination):
    base_url = "https://drive.google.com/uc?export=download"

    session = requests.Session()

    response = session.get(base_url, params={'id': file_id}, stream=True)
    token = get_confirm_token(response)

    if token:
        params = {'id': file_id, 'confirm': token}
        response = session.get(base_url, params=params, stream=True)

    save_response_content(response, destination)

def get_confirm_token(response):
    for key, value in response.cookies.items():
        if key.startswith('download_warning'):
            return value
    return None

def save_response_content(response, destination):
    CHUNK_SIZE = 32768

    with open(destination, "wb") as f:
        for chunk in response.iter_content(CHUNK_SIZE):
            if chunk:
                f.write(chunk)
# Example Usage
file_id = '17V_o0Snt-Lj0FmegENPQ_rXpvWTWlZgQ'
destination = 'Webinar_resumes.zip'  # Replace with your desired file name and extension
download_file_from_google_drive(file_id, destination)
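
The download helper writes whatever the server returns, so if the response is an HTML page (for example a confirmation or permission page) rather than the archive, the saved file will not unzip. A minimal sketch of a sanity check follows; `is_valid_zip` is a hypothetical helper, and the demonstration uses temporary files instead of a real download.

```python
import os
import tempfile
import zipfile


def is_valid_zip(path):
    """Return True if `path` exists and is a zip archive.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    return os.path.isfile(path) and zipfile.is_zipfile(path)


# Self-contained demonstration using temporary files instead of a real download.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.zip")
    with zipfile.ZipFile(good, "w") as zf:
        zf.writestr("resume.txt", "sample")

    bad = os.path.join(tmp, "bad.zip")
    with open(bad, "w") as f:
        f.write("<html>not an archive</html>")

    results = (is_valid_zip(good), is_valid_zip(bad))

print(results)
```

Calling `is_valid_zip('Webinar_resumes.zip')` right after the download would flag a bad file before the extraction step runs.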

`Import Statements`: Essential libraries are imported. These include:
1. `openai` for interacting with the OpenAI API.
2. Utilities like `json`, `os`, and `re` for handling data and file operations.
3. Libraries for reading specific file formats (`docx` for Word documents, `textract` for older Word format, and `fitz` for PDFs).
4. `pandas` for reading Excel files.
5. `tiktoken` for counting and trimming tokens.

`summarize_resume` Function (described for context; it is not defined in the cell below):

1. This function interfaces with the GPT-3.5 model to summarize the content of a resume.
2. The model is provided with a system message (prompt) and user message (resume text). It then returns a summarized version of the resume.

`read_requirements` Function:

1. Reads job requirements from a JSON file and returns the data. It includes error handling to manage potential reading errors.

`read_json` Function:
1. Simplified function to read data from a JSON file and return it.

`read_document` Function:

1. Reads content from various file types including .docx, .doc, .pdf, .xls, and .xlsx.
2. Depending on the file type, different libraries/methods are utilized to extract the text.

`check_and_trim` Function:
1. Uses the `tiktoken` library (with the `cl100k_base` encoding) to tokenize the text of a resume.
2. If the text exceeds a specified token count (default is 1500 tokens), the function trims the text to fit within the limit.
3. Returns the trimmed text along with the original and new token counts.

In [None]:
import openai
import json
import os
from collections import OrderedDict
import re
from docx import Document
import textract
import fitz  # PyMuPDF
import pandas as pd
import math
import tiktoken


def read_requirements(file_path):
    # Reads the job requirements from a JSON file
    try:
        with open(file_path, 'r') as f:
            data = json.load(f)
        return data
    except Exception as e:
        print(f"Error reading requirements JSON: {e}")
        return None

def read_json(file_path):
    with open(file_path, 'r') as f:
        data = json.load(f)
    return data

def read_document(file_path):
    file_path = str(file_path)
    _, file_extension = os.path.splitext(file_path)
    # Normalize once so that extensions such as ".DOCX" or ".PDF" are matched too
    file_extension = file_extension.lower()
    text = ""
    if file_extension == '.docx':
        # Word documents: join the text of every paragraph
        doc = Document(file_path)
        for para in doc.paragraphs:
            text = text + para.text + " "
    elif file_extension == '.doc':
        # Older Word format
        text = textract.process(file_path).decode()
    elif file_extension == '.pdf':
        # PDFs: join the text of every page
        doc = fitz.open(file_path)
        for page_number in range(len(doc)):
            page = doc[page_number]
            text = text + page.get_text() + " "
    elif file_extension in ['.xls', '.xlsx']:
        # Spreadsheets: render the first sheet as plain text
        data = pd.read_excel(file_path)
        text = data.to_string(index=False)
    else:
        print(f"Unsupported file type: {file_extension}")

    return text


def check_and_trim(resume_text, max_tokens=1500):
    # tokens = nltk.word_tokenize(resume_text)
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(resume_text)
    old_len = len(tokens)
    if len(tokens) > max_tokens:
        tokens = tokens[:max_tokens]
        resume_text = enc.decode(tokens)
    return resume_text, old_len, len(tokens)


The next cell lets the user choose how many resumes to process, defaulting to 2 if no input is given. The **user_select_number_of_resumes** function prompts for a number, asks again until the input is a valid integer between 1 and the total, and returns it. The main execution block reads the **filtered_applications_summary** data from a JSON file, asks the user how many resumes to process, and then randomly selects that many from the total set, storing the result in the **selected_applications** variable.

In [None]:
import json
import random


def user_select_number_of_resumes(total_resumes, default=2):
    """
    Allow the user to input a number of resumes to process.
    If no input is given, the default value is returned.

    Args:
    - total_resumes (int): Total number of resumes available.
    - default (int): The default number to return if no input.

    Returns:
    - int: The number of resumes the user wants to process.
    """
    print(f"Total resumes available: {total_resumes}")
    user_input = input(f"How many resumes do you want to process? (Default is {default}): ")

    # If the user doesn't provide any input, return the default value,
    # capped at the number of resumes actually available.
    if not user_input:
        return min(default, total_resumes)

    try:
        # Convert user input to an integer and ensure it's within the range.
        selected_num = int(user_input)
        if 1 <= selected_num <= total_resumes:
            return selected_num
        else:
            print(f"Please select a number between 1 and {total_resumes}.")
            return user_select_number_of_resumes(total_resumes, default)
    except ValueError:
        # If the user provides non-numeric input, prompt them again.
        print("Please enter a valid number.")
        return user_select_number_of_resumes(total_resumes, default)

# Read the filtered_applications_summary data from the JSON file
json_data = read_json('/content/filtered_applications_summary.json')

# Display total resumes and get the user's choice
n = user_select_number_of_resumes(len(json_data))

# Randomly select n resumes
selected_applications = random.sample(json_data, n)

Total resumes available: 6
How many resumes do you want to process? (Default is 2): 1
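
Because the selection is random, different runs can score different candidates, which makes it harder to compare one prompt iteration with the next. A minimal sketch of a repeatable selection follows; `pick_resumes` and its `seed` parameter are hypothetical additions, not part of the original notebook.

```python
import random


def pick_resumes(applications, n, seed=None):
    """Return n applications chosen at random without replacement.

    Hypothetical helper; `seed` is an added parameter for repeatable runs.
    """
    if not applications:
        return []
    n = max(1, min(n, len(applications)))  # clamp to the valid range
    rng = random.Random(seed)  # private generator; global random state is untouched
    return rng.sample(applications, n)


# Placeholder records, invented only to show the call.
applications = [{"email_id": f"candidate{i}@example.com"} for i in range(6)]
print(pick_resumes(applications, 2, seed=42))
```

With a fixed seed, every iteration of the prompt is evaluated on the same resumes.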


The next code cell defines two functions: `score_from_criterion` and `read_from_textfile`.

`score_from_criterion` takes a prompt, the resume text, the previously generated criteria text, the must-have skills, and a string-matching criterion. It sends the prompt, the criteria, and the resume text to the GPT-3.5 Turbo model through the OpenAI API and returns the model's response, which scores the resume against the criteria. The must-have skills and string-matching arguments are accepted but are not used in the request itself.

`read_from_textfile` reads the text file saved by the previous `Assignment6` and extracts two sections from it: the generated criteria and the string match. The sections are delimited by the headers `----Criterion Generated Output----` and `----String Match----`, and the function returns their contents separately so they can be used to assess a resume.



---

## Understanding the roles
**assistant** - We use the assistant role to pass in the criteria that we generated in the last assignment.

**system** - We use the system role to define the high-level task that we want the API to perform.

**user** - We use the user role to pass in the resume details, which change with every request.
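
The three roles map onto a list of message dictionaries. The sketch below assembles that list without calling the API, which makes it easy to inspect what would be sent; `build_messages` is a hypothetical helper, and it keeps the same role order that the notebook's `score_from_criterion` uses.

```python
def build_messages(prompt, resume_text, criteria):
    """Assemble the chat messages without calling the API.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    return [
        {"role": "assistant", "content": criteria},   # criteria from the last assignment
        {"role": "system", "content": prompt},        # high-level task description
        {"role": "user", "content": resume_text},     # input that changes per resume
    ]


# Placeholder strings, invented only to show the structure.
messages = build_messages(
    prompt="You are an assistant to a recruiter.",
    resume_text="placeholder resume text",
    criteria="placeholder criteria text",
)
for message in messages:
    print(message["role"], "->", message["content"])
```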

In [None]:
def score_from_criterion(prompt, text, generated_text, must_have, string_match):
    # Note: must_have and string_match are accepted but not used in the request.
    model = "gpt-3.5-turbo-16k"
    max_tokens = 2000
    messages = [
            {"role": "assistant", "content": f"{generated_text}"},
            {"role": "system", "content": f"{prompt}"},
            {"role": "user", "content": f"{text}"},
        ]
    response = openai.ChatCompletion.create(
            model=model,  # use the variable instead of repeating the model name
            messages=messages,
            temperature=0,
            max_tokens=max_tokens
        )
    # Collect the text of every returned choice and return the first one
    generated_texts = [
        choice.message["content"].strip() for choice in response["choices"]
    ]
    return generated_texts[0]

def read_from_textfile(filename):
    with open(filename, 'r') as file:
        content = file.read()

    # Separate criterion_gen_output and string_match
    criterion_start = content.find("----Criterion Generated Output----") + len("----Criterion Generated Output----")
    criterion_end = content.find("----String Match----")

    criterion_gen_output = content[criterion_start:criterion_end].strip()
    string_match = content[criterion_end + len("----String Match----"):].strip()

    return criterion_gen_output, string_match
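
If either header is missing from the file, `str.find` returns `-1` and the slices above silently produce the wrong text. A minimal sketch of the same split with an explicit check follows; `split_sections` is a hypothetical variant, and the sample content is a placeholder invented only to show the layout.

```python
CRITERION_HEADER = "----Criterion Generated Output----"
STRING_MATCH_HEADER = "----String Match----"


def split_sections(content):
    """Split file content into (criterion text, string-match text).

    Raises ValueError when a header is missing or out of order,
    instead of slicing from a -1 index.
    """
    start = content.find(CRITERION_HEADER)
    end = content.find(STRING_MATCH_HEADER)
    if start == -1 or end == -1 or end < start:
        raise ValueError("Section headers not found in the expected order")
    criterion = content[start + len(CRITERION_HEADER):end].strip()
    string_match = content[end + len(STRING_MATCH_HEADER):].strip()
    return criterion, string_match


# Placeholder content, invented purely to show the layout.
sample = (
    "----Criterion Generated Output----\n"
    "placeholder criteria text\n"
    "----String Match----\n"
    "placeholder match text\n"
)
print(split_sections(sample))
```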


The code provides functionality to extract and reorganize files from a given zip archive. After reading job requirements from a JSON file, the **extract_and_rename** function unzips the contents of a specified zip file (like "**Webinar_resumes.zip**") into a directory (defaulted as "**extracted_files**"). If the directory to extract to doesn't exist, it's created; if it's already populated, extraction is skipped. Post-extraction, the function scans the contents, and if it finds any directories with spaces in their names, it renames them by replacing spaces with underscores. If the directory with the new name already exists, it transfers files from the old directory to the new one and then deletes the old directory. The function finally returns the path of the reorganized or main content directory. The main execution block then calls this function with the given zip file path and stores the result in the **resume_path** variable.

In [None]:
import zipfile
import shutil
job_requirements = read_requirements('/content/requirements_output.json')
must_have_skills = job_requirements["must_have_skills"]
zip_file_path = "/content/Webinar_resumes.zip" # For example give the path to resume_data.zip

def extract_and_rename(zip_file_path, extract_path="extracted_files"):
    """
    Extract files from a zip archive to a specified directory.
    Rename directories containing spaces to use underscores instead.

    Args:
    - zip_file_path (str): The path to the zip file to be extracted.
    - extract_path (str, optional): The path where the zip file content should be extracted to.
                                    Defaults to "extracted_files".

    Returns:
    - str: Path to the resume or directory.
    """
    # Check if extract_path exists, if not, create it
    if not os.path.exists(extract_path):
        os.makedirs(extract_path)

    # If extract_path is not empty, skip extraction
    if not os.listdir(extract_path):
        with zipfile.ZipFile(zip_file_path, 'r') as zip_ref:
            zip_ref.extractall(extract_path)

    resume_path = extract_path
    for item in os.listdir(extract_path):
        item_path = os.path.join(extract_path, item)

        # Check if the current item is a directory and if it has spaces in its name
        if os.path.isdir(item_path) and ' ' in item:
            new_name = item.replace(' ', '_')
            new_path = os.path.join(extract_path, new_name)

            # If the new directory name doesn't already exist, create it
            if not os.path.exists(new_path):
                os.makedirs(new_path)

            # Copying contents from the old directory to the new one
            for sub_item in os.listdir(item_path):
                shutil.copy2(os.path.join(item_path, sub_item), new_path)

            # Removing the old directory
            shutil.rmtree(item_path)
            resume_path = new_path
        else:
            resume_path = item_path

    return resume_path
resume_path = extract_and_rename(zip_file_path)
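
The renaming step can be tried in isolation on a temporary directory. The sketch below is a simplified variant, not the notebook's function: `rename_dirs_with_spaces` is a hypothetical helper that renames in place with `os.rename` and skips a directory when the target name already exists, whereas `extract_and_rename` copies the contents across. The directory name is a placeholder.

```python
import os
import tempfile


def rename_dirs_with_spaces(root):
    """Rename direct subdirectories of `root`, replacing spaces with underscores.

    Simplified sketch: a directory is skipped if the target name already exists.
    """
    renamed = []
    for item in sorted(os.listdir(root)):
        old_path = os.path.join(root, item)
        if os.path.isdir(old_path) and " " in item:
            new_path = os.path.join(root, item.replace(" ", "_"))
            if not os.path.exists(new_path):
                os.rename(old_path, new_path)
                renamed.append(new_path)
    return renamed


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "Webinar resumes"))  # placeholder directory name
    result = [os.path.basename(p) for p in rename_dirs_with_spaces(tmp)]

print(result)
```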



# Iteration 1:
### *Goal*: Introduce the model to the task of using the generated criteria to judge a resume and give scores to each of the skills.
### *Prompt*:
```
You are an assistant to a recruiter. Use the criteria given by {criterion_gen_output} to judge the resume given to you and give \
score to each of the skills present in {must_have_skills}.
```

### *Reason for Change*: We start with a basic prompt to get the initial results from the assistant.


In [None]:
def final_score_func(resume_text, must_have_skills):
    # Read the criteria and string_match from the text file
    criterion_gen_output, string_match = read_from_textfile("/content/criterion_and_string_match_output.txt")
    prompt=f'''You are an assistant to a recruiter. \
    Use the criteria given by {criterion_gen_output} to judge the resume given to you and give score to \
    each of the skills present in {must_have_skills}.'''

    # Assuming the function score_from_criterion exists and takes these parameters
    score_output = score_from_criterion(prompt, resume_text, criterion_gen_output, must_have_skills, string_match)

    return score_output

for application in selected_applications:
    if 'resume_path' in application and 'email_id' in application:

        # Extract resume text
        resume_text = read_document(os.path.join(resume_path, application['resume_path']))
        resume_text, _, _ = check_and_trim(resume_text)

        # Directly assign the resume_summary without json.loads()
        resume_summary = application['resume_summary']

        final_score_out = final_score_func(resume_text, must_have_skills)
        print(f'''[Score Request] for {resume_summary["name_of_candidate"]} ''', final_score_out)


[Score Request] for Soso Sukhitashvili  Based on the information provided in the resume, the scores for each skill can be given as follows:

- TensorFlow: 
  - Score: 0
  - Justification: The candidate has no mention or projects related to TensorFlow.

- Keras: 
  - Score: 2
  - Justification: The candidate mentions experience in computer vision projects, which likely involved the use of Keras.

- PyTorch: 
  - Score: 5
  - Justification: The candidate mentions working as a deep learning engineer and algorithm developer, with a focus on computer vision projects. They specifically mention using PyTorch, indicating significant experience in this skill.

- Computer Vision: 
  - Score: 3
  - Justification: The candidate mentions working on computer vision projects, including face recognition, image similarity search, object tracking, object detection, object segmentation, and OCR. This indicates good experience in computer vision.

The scores for each skill can be summarized as follows:

{

# Output:


```
[Score Request] for Abhilash Babu  Based on the provided resume, the scores for each skill can be given as follows:

- TensorFlow:
  - Score: 5
  - Justification: The candidate has mentioned experience in developing deep learning models using TensorFlow for detecting fraudulent identity documents and optimizing model performance.

- Keras:
  - Score: 5
  - Justification: The candidate has mentioned experience in developing deep learning models using Keras for various scenarios of identity verification in the KYC domain.

- PyTorch:
  - Score: 5
  - Justification: The candidate has mentioned experience in developing deep learning models using PyTorch for tasks such as facial attribute detection, object detection, and background elimination. They have also mentioned optimizing and tuning SOTA models using Apache TVM.

- Computer Vision:
  - Score: 5
  - Justification: The candidate has mentioned 18 years of experience in successfully delivering projects in the domain of Computer Vision and Image processing. They have developed machine learning solutions for tasks such as object detection, image classification, and image segmentation.

The final scores for each skill are as follows:

{
  "TensorFlow": {
    "score": 5,
    "justification": "The candidate has mentioned experience in developing deep learning models using TensorFlow for detecting fraudulent identity documents and optimizing model performance."
  },
  "Keras": {
    "score": 5,
    "justification": "The candidate has mentioned experience in developing deep learning models using Keras for various scenarios of identity verification in the KYC domain."
  },
  "PyTorch": {
    "score": 5,
    "justification": "The candidate has mentioned experience in developing deep learning models using PyTorch for tasks such as facial attribute detection, object detection, and background elimination. They have also mentioned optimizing and tuning SOTA models using Apache TVM."
  },
  "Computer Vision": {
    "score": 5,
    "justification": "The candidate has mentioned 18 years of experience in successfully delivering projects in the domain of Computer Vision and Image processing. They have developed machine learning solutions for tasks such as object detection, image classification, and image segmentation."
  }
}
[Score Request] for Pankaj Kumar Goyal  Based on the provided resume, the scores for each skill can be given as follows:

- TensorFlow: 2
- Keras: 2
- PyTorch: 0
- Computer Vision: 5

Justifications:

- TensorFlow: The candidate has mentioned experience in Deep learning, which often involves the use of TensorFlow. However, there are no specific projects mentioned, so the score is 2.
- Keras: The candidate has mentioned experience in Deep learning, which often involves the use of Keras. There is one project mentioned that involves fine-tuning a BERT model, so the score is 2.
- PyTorch: There is no mention of PyTorch in the resume, so the score is 0.
- Computer Vision: The candidate has mentioned a project related to skin cancer classification using CNN and CNN+LSTM, which indicates experience in Computer Vision. The score is 5.

The scores and justifications can be given in JSON format as shown below:

{
  "TensorFlow": {
    "score": 2,
    "justification": "The candidate has some experience in Deep learning, but no specific projects mentioned."
  },
  "Keras": {
    "score": 2,
    "justification": "The candidate has experience in Deep learning and has mentioned a project involving fine-tuning a BERT model."
  },
  "PyTorch": {
    "score": 0,
    "justification": "There is no mention of PyTorch in the resume."
  },
  "Computer Vision": {
    "score": 5,
    "justification": "The candidate has experience in Computer Vision, as indicated by the skin cancer classification project."
  }
}
```



# Iteration 2:
### *Goal*: Refine the output by asking the model to return it in a JSON format to structure the data.

### *Prompt*:


```
You are an assistant to a recruiter. Use the criteria given by {criterion_gen_output} to judge the resume given to you and give \
score to each of the skills present in {must_have_skills}. \
Return the output in JSON format.
```

### *Reason for Change*: The output of the first prompt contains a lot of irrelevant text around the scores. To fix this, we ask the assistant to return the final output in JSON format.


In [None]:

def final_score_func(resume_text, must_have_skills):
    # Read the criteria and string_match from the text file
    criterion_gen_output, string_match = read_from_textfile("/content/criterion_and_string_match_output.txt")
    prompt=f'''You are an assistant to a recruiter. \
    Use the criteria given by {criterion_gen_output} to judge the resume given to you and give score to \
    each of the skills present in {must_have_skills}. Return the output in JSON format.'''

    # Assuming the function score_from_criterion exists and takes these parameters
    score_output = score_from_criterion(prompt, resume_text, criterion_gen_output, must_have_skills, string_match)

    return score_output

for application in selected_applications:
    if 'resume_path' in application and 'email_id' in application:

        # Extract resume text
        resume_text = read_document(os.path.join(resume_path, application['resume_path']))
        resume_text, _, _ = check_and_trim(resume_text)

        # Directly assign the resume_summary without json.loads()
        resume_summary = application['resume_summary']

        final_score_out = final_score_func(resume_text, must_have_skills)
        print(f'''[Score Request] for {resume_summary["name_of_candidate"]} ''', final_score_out)


[Score Request] for Soso Sukhitashvili  {
  "TensorFlow": {
    "score": 0,
    "justification": "The candidate has no experience in TensorFlow."
  },
  "Keras": {
    "score": 2,
    "justification": "The candidate has some experience in Keras, as mentioned in the resume."
  },
  "PyTorch": {
    "score": 5,
    "justification": "The candidate has significant experience in PyTorch, as mentioned in the resume."
  },
  "Computer Vision": {
    "score": 5,
    "justification": "The candidate has good experience in Computer Vision, as mentioned in the resume."
  }
}


# Output:


```
[Score Request] for Abhilash Babu  Based on the provided resume, the scores for each skill can be given as follows:

{
  "TensorFlow": {
    "score": 2,
    "justification": "The candidate has mentioned experience in developing deep learning models using TensorFlow in their current and previous roles."
  },
  "Keras": {
    "score": 2,
    "justification": "The candidate has mentioned experience in using Keras for developing machine learning models in their technical skills section."
  },
  "PyTorch": {
    "score": 2,
    "justification": "The candidate has mentioned experience in using PyTorch and PyTorch-Lightning for developing machine learning models in their technical skills section."
  },
  "Computer Vision": {
    "score": 5,
    "justification": "The candidate has mentioned 18 years of experience in delivering projects in the domain of Computer Vision and Image processing. They have also mentioned developing and deploying machine learning solutions for various applications in Computer Vision."
  }
}
[Score Request] for Pankaj Kumar Goyal  Based on the provided resume, the scores for each skill can be given as follows:

{
  "TensorFlow": {
    "score": 2,
    "justification": "The candidate has mentioned experience in TensorFlow in the skills section."
  },
  "Keras": {
    "score": 2,
    "justification": "The candidate has mentioned experience in Keras in the skills section."
  },
  "PyTorch": {
    "score": 0,
    "justification": "The candidate has no mention of PyTorch in the resume."
  },
  "Computer Vision": {
    "score": 5,
    "justification": "The candidate has significant experience in Computer Vision, as mentioned in the resume."
  }
}
```
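
Even with the JSON instruction, some responses above still begin with an introductory sentence before the JSON object. A minimal sketch of pulling the object out of such a response follows; `extract_json_object` is a hypothetical helper, and the sample response is abridged from the output above.

```python
import json


def extract_json_object(text):
    """Parse the outermost {...} span in a model response.

    Returns None when no parseable JSON object is found.
    Hypothetical helper for illustration; not part of the original notebook.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None


# Abridged from the output shown above.
response = (
    "Based on the provided resume, the scores for each skill can be given as follows:\n\n"
    '{"PyTorch": {"score": 0, "justification": '
    '"The candidate has no mention of PyTorch in the resume."}}'
)
scores = extract_json_object(response)
print(scores["PyTorch"]["score"])
```

This only handles a single top-level object per response; anything the parser rejects is returned as `None` so the caller can decide whether to retry.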



# Iteration 3:
### *Goal*: Improve the accuracy of the scores by emphasizing that scores should be based on skills found in the resume.

### *Prompt*:


```
You are an assistant to a recruiter. Use the criteria given by {criterion_gen_output} to judge the resume given to you. \
 Scores must be given with respect to each of the must have \
 skills present here {must_have_skills}, which can be found inside \
 the resume. Return the output in JSON format.
```

### *Reason for Change*: The previous prompt removes most of the irrelevant statements, but in some cases the scores are still not accurate. This version states explicitly that a score must be given for each must-have skill, based on what can be found in the resume.


In [None]:

def final_score_func(resume_text, must_have_skills):
    # Read the criteria and string_match from the text file
    criterion_gen_output, string_match = read_from_textfile("/content/criterion_and_string_match_output.txt")
    prompt=f'''You are an assistant to a recruiter. \
    Use the criteria given by {criterion_gen_output} to judge the resume given to you. \
    The scores must be given with respect to each of the must have skills present here {must_have_skills}, which \
    can be found inside the resume. \
    Return the output in JSON format.'''

    # Assuming the function score_from_criterion exists and takes these parameters
    score_output = score_from_criterion(prompt, resume_text, criterion_gen_output, must_have_skills, string_match)

    return score_output

for application in selected_applications:
    if 'resume_path' in application and 'email_id' in application:

        # Extract resume text
        resume_text = read_document(os.path.join(resume_path, application['resume_path']))
        resume_text, _, _ = check_and_trim(resume_text)

        # Directly assign the resume_summary without json.loads()
        resume_summary = application['resume_summary']

        final_score_out = final_score_func(resume_text, must_have_skills)
        print(f'''[Score Request] for {resume_summary["name_of_candidate"]} ''', final_score_out)


[Score Request] for Soso Sukhitashvili  {
  "TensorFlow": {
    "score": 0,
    "justification": "The candidate has no experience in TensorFlow."
  },
  "Keras": {
    "score": 2,
    "justification": "The candidate has some experience in Keras, as mentioned in the resume."
  },
  "PyTorch": {
    "score": 5,
    "justification": "The candidate has significant experience in PyTorch, as mentioned in the resume."
  },
  "Computer Vision": {
    "score": 5,
    "justification": "The candidate has good experience in Computer Vision, as mentioned in the resume."
  }
}


# Output:

```
[Score Request] for Abhilash Babu  {
  "TensorFlow": {
    "score": 2,
    "justification": "The candidate has mentioned experience in developing deep learning models using TensorFlow in their previous role at Bundesdruckerei GmbH."
  },
  "Keras": {
    "score": 2,
    "justification": "The candidate has mentioned experience in using Keras as one of the machine learning libraries they are familiar with."
  },
  "PyTorch": {
    "score": 5,
    "justification": "The candidate has mentioned significant experience in developing deep learning models using PyTorch in their previous role at Bundesdruckerei GmbH. They have also mentioned using PyTorch-Lightning, a PyTorch wrapper library."
  },
  "Computer Vision": {
    "score": 5,
    "justification": "The candidate has mentioned 18 years of experience in delivering projects in the domain of Computer Vision and Image processing. They have also mentioned developing and deploying machine learning solutions for various computer vision applications."
  }
}
[Score Request] for Pankaj Kumar Goyal  {
  "TensorFlow": {
    "score": 2,
    "justification": "The candidate has mentioned experience in TensorFlow in the skills section."
  },
  "Keras": {
    "score": 2,
    "justification": "The candidate has mentioned experience in Keras in the skills section."
  },
  "PyTorch": {
    "score": 0,
    "justification": "The candidate has no mention of PyTorch in the resume."
  },
  "Computer Vision": {
    "score": 5,
    "justification": "The candidate has significant experience in Computer Vision, as mentioned in the resume."
  }
}
```
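
Once the output is consistently JSON, its structure can be checked automatically before the scores are used. The sketch below is a hedged example: `find_score_problems` is a hypothetical helper, and the 0 to 5 score range is an assumption inferred from the values that appear in the outputs above, not a documented scale. It checks structure only and says nothing about whether a score is justified.

```python
def find_score_problems(scores, must_have_skills, min_score=0, max_score=5):
    """Return a list of human-readable problems found in a parsed score dict.

    The default score range is an assumption based on the sample outputs.
    """
    problems = []
    for skill in must_have_skills:
        entry = scores.get(skill)
        if not isinstance(entry, dict) or "score" not in entry:
            problems.append(f"{skill}: missing score")
            continue
        score = entry["score"]
        if not isinstance(score, (int, float)) or not min_score <= score <= max_score:
            problems.append(f"{skill}: score {score!r} outside {min_score}-{max_score}")
    return problems


must_have_skills = ["TensorFlow", "Keras", "PyTorch", "Computer Vision"]
# Scores taken from one of the outputs above, with "Keras" removed to show a failure.
parsed = {
    "TensorFlow": {"score": 2},
    "PyTorch": {"score": 0},
    "Computer Vision": {"score": 5},
}
print(find_score_problems(parsed, must_have_skills))
```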



Overall, the development of the prompt was an iterative process with the goal of maximizing accuracy and minimizing irrelevant or redundant outputs. Each iteration was driven by feedback from the previous run, and adjustments were made to improve clarity and specificity.