# Potential Talents Ranking System  
### LLM-Based Candidate Fitness Prediction & Intelligent Re-Ranking

## Problem Overview

Talent sourcing teams spend significant manual effort identifying high-potential candidates for specific job roles. The challenge is not only finding candidates, but ranking them based on how well they match a role.

In this project, we design an **AI-powered ranking system** that:

- Predicts how well a candidate fits a job role
- Ranks candidates automatically
- Learns from recruiter feedback
- Reduces manual screening time

We experiment with three approaches:

1. **Prompt-based ranking (No fine-tuning)**
2. **Fine-tuned LLM scoring**
3. **RAG (Retrieval-Augmented Ranking)**

## Environment Setup

In this section, we prepare the environment for working with large language models.

Key actions:
- Mount Google Drive to access data
- Install memory-efficient libraries (bitsandbytes)
- Log into Hugging Face to access the LLM

This allows us to load and fine-tune a large language model using limited computational resources.
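To make "limited computational resources" concrete, the sketch below estimates the memory taken by the model weights alone at different precisions. The parameter count is the "all params" figure printed later in this notebook. The helper name `weight_memory_gb` is hypothetical, and the estimate ignores quantization constants, activations, optimizer state, and the KV cache, so it should be read as a rough lower bound.

```python
def weight_memory_gb(num_params: int, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# "all params" as reported by print_trainable_parameters() in this notebook
NUM_PARAMS = 3_217_337_344

for label, bits in [("float32", 32), ("float16", 16), ("4-bit NF4", 4)]:
    print(f"{label:>10}: {weight_memory_gb(NUM_PARAMS, bits):.2f} GB")
```

Under these assumptions the weights shrink from roughly 6.4 GB in float16 to about 1.6 GB in 4-bit, which is what makes a single Colab GPU workable.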


In [None]:
# Mount Google Drive in Google Colab

from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# Hugging Face login to access the LLM.
# Do not paste an access token into the notebook; login() without arguments prompts for it.

from huggingface_hub import login
login()

In [None]:

!pip install -q -U bitsandbytes
!pip install -q -U accelerate
!pip install peft
!pip install -U trl
!pip install sentence-transformers faiss-cpu

Collecting trl
  Downloading trl-0.27.2-py3-none-any.whl.metadata (11 kB)
Downloading trl-0.27.2-py3-none-any.whl (530 kB)
Installing collected packages: trl
Successfully installed trl-0.27.2
Collecting faiss-cpu
  Downloading faiss_cpu-1.13.2-cp310-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (7.6 kB)
Downloading faiss_cpu-1.13.2-cp310-abi3-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (23.8 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.13.2


In [None]:
# Importing and loading required libraries

import torch
import re
import transformers
import trl
import numpy as np
import faiss
import pandas as pd

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, PeftModel, get_peft_model, prepare_model_for_kbit_training
from datasets import load_dataset, Dataset
from trl import SFTTrainer
from sentence_transformers import SentenceTransformer
from datetime import datetime

## Data Loading and Preparation

We load the candidate dataset, which contains one row per candidate with the structured fields `id`, `job_title`, `location`, `connection`, and `fit`.

Steps performed:
- Read dataset into a DataFrame
- Remove the original "fit" column (we want the model to predict fitness instead)
- Convert structured candidate information into natural language text

In [None]:
# Loading the data

file_path = '/content/drive/MyDrive/Machine Learning/potential-talents - Aspiring human resources - seeking human resources.csv'
df = pd.read_csv(file_path)

In [None]:
df.head()

Unnamed: 0,id,job_title,location,connection,fit
0,1,2019 C.T. Bauer College of Business Graduate (...,"Houston, Texas",85,
1,2,Native English Teacher at EPIK (English Progra...,Kanada,500+,
2,3,Aspiring Human Resources Professional,"Raleigh-Durham, North Carolina Area",44,
3,4,People Development Coordinator at Ryan,"Denton, Texas",500+,
4,5,Advisory Board Member at Celal Bayar University,"İzmir, Türkiye",500+,


In [None]:
df = df.drop(columns=["fit"])

In [None]:
# Define Base LLM Model and Device

model_path = "meta-llama/Llama-3.2-3B-Instruct"
device = "cuda" # the device to load the model onto

# Quantization configuration

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=False,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float32
)

# Loading the model and tokenizer

model = AutoModelForCausalLM.from_pretrained(model_path,quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(
    model_path,
    model_max_length=512,
    padding_side="left",
    add_eos_token=True)
tokenizer.pad_token = tokenizer.eos_token  # reuse the EOS token for padding so inputs can be padded

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
df.head()

Unnamed: 0,id,job_title,location,connection
0,1,2019 C.T. Bauer College of Business Graduate (...,"Houston, Texas",85
1,2,Native English Teacher at EPIK (English Progra...,Kanada,500+
2,3,Aspiring Human Resources Professional,"Raleigh-Durham, North Carolina Area",44
3,4,People Development Coordinator at Ryan,"Denton, Texas",500+
4,5,Advisory Board Member at Celal Bayar University,"İzmir, Türkiye",500+


## Baseline Approach: Ranking Without Fine-Tuning

> Here we use the LLM **as-is**, without any training.

> We describe candidates in text and ask the LLM to act like a recruiter and rank them for a job role using prompt engineering.

> This simulates a recruiter giving instructions such as:
"Rank the following candidates for a Human Resources role."

#### **System Flow: Prompt-Based Ranking (No Fine-Tuning)**

<center>
Job Role (Search Term)<br>
↓<br>
Convert candidate data → natural language profiles<br>
↓<br>
Insert profiles into ranking prompt<br>
↓<br>
Send prompt to base LLM<br>
↓<br>
LLM reasons using general knowledge<br>
↓<br>
LLM outputs ranked list of candidates
</center><br><br>

In [None]:
# Converts a structured profile row into a compact text description for prompting/embedding.

def profile_to_text(row):
    return f"""Job Title: {row['job_title']}
Location: {row['location']}
Connections: {row['connection']}"""

In [None]:
# Apply Text Conversion to Entire Dataset

df["profile_text"] = df.apply(profile_to_text, axis=1)

In [None]:
df.head()

Unnamed: 0,id,job_title,location,connection,profile_text
0,1,2019 C.T. Bauer College of Business Graduate (...,"Houston, Texas",85,Job Title: 2019 C.T. Bauer College of Business...
1,2,Native English Teacher at EPIK (English Progra...,Kanada,500+,Job Title: Native English Teacher at EPIK (Eng...
2,3,Aspiring Human Resources Professional,"Raleigh-Durham, North Carolina Area",44,Job Title: Aspiring Human Resources Profession...
3,4,People Development Coordinator at Ryan,"Denton, Texas",500+,Job Title: People Development Coordinator at R...
4,5,Advisory Board Member at Celal Bayar University,"İzmir, Türkiye",500+,Job Title: Advisory Board Member at Celal Baya...


In [None]:
# Builds an LLM prompt that asks the model to rank multiple candidate profiles for a search term.

def ranking_prompt(search_term, profiles):
    candidates = "\n".join(
        [f"{i+1}. {p}" for i, p in enumerate(profiles)]
    )

    return f"""You are a recruiter.

Search Requirement:
{search_term}

Candidates:
{candidates}

Task:
Rank the candidates from best to worst match.
Assign a relevance score from 1 (poor) to 5 (excellent).
Provide a brief explanation for each ranking.
"""
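Instruction-tuned checkpoints such as the one used here are usually prompted through a chat template rather than a raw string. The sketch below shows one way the same ranking prompt could be split into system and user messages. The function name `build_ranking_messages` is hypothetical and this is not the prompt format the notebook actually runs.

```python
def build_ranking_messages(search_term, profiles):
    """Return chat-style messages carrying the same content as the plain ranking prompt."""
    candidates = "\n".join(f"{i + 1}. {p}" for i, p in enumerate(profiles))
    user_content = (
        f"Search Requirement:\n{search_term}\n\n"
        f"Candidates:\n{candidates}\n\n"
        "Task:\n"
        "Rank the candidates from best to worst match.\n"
        "Assign a relevance score from 1 (poor) to 5 (excellent).\n"
        "Provide a brief explanation for each ranking."
    )
    return [
        {"role": "system", "content": "You are a recruiter."},
        {"role": "user", "content": user_content},
    ]


messages = build_ranking_messages(
    "Human Resource",
    ["Job Title: Aspiring Human Resources Professional"],
)
print(messages[1]["content"])
```

Assuming the tokenizer ships a chat template, the messages could then be passed to `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")` before calling `model.generate`.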

In [None]:
#Select Sample Candidate Profiles

profiles = df["profile_text"].tolist()[:5]

In [None]:
# Test Baseline Ranking

prompt = ranking_prompt(
    "Human Resource",
    profiles
)

outputs = model.generate(
    **tokenizer(prompt, return_tensors="pt").to(model.device),
    max_new_tokens=300
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


You are a recruiter.

Search Requirement:
Human Resource

Candidates:
1. Job Title: 2019 C.T. Bauer College of Business Graduate (Magna Cum Laude) and aspiring Human Resources professional
Location: Houston, Texas
Connections: 85
2. Job Title: Native English Teacher at EPIK (English Program in Korea)
Location: Kanada
Connections: 500+ 
3. Job Title: Aspiring Human Resources Professional
Location: Raleigh-Durham, North Carolina Area
Connections: 44
4. Job Title: People Development Coordinator at Ryan
Location: Denton, Texas
Connections: 500+ 
5. Job Title: Advisory Board Member at Celal Bayar University
Location: İzmir, Türkiye
Connections: 500+ 

Task:
Rank the candidates from best to worst match.
Assign a relevance score from 1 (poor) to 5 (excellent).
Provide a brief explanation for each ranking.


Based on the job requirements and the candidates' profiles, here is the ranking from best to worst match:

1. **Job Title: 2019 C.T. Bauer College of Business Graduate (Magna Cum Laude) an

In [None]:
# Test Baseline Ranking

prompt = ranking_prompt(
    "Student",
    profiles
)

outputs = model.generate(
    **tokenizer(prompt, return_tensors="pt").to(model.device),
    max_new_tokens=300
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


You are a recruiter.

Search Requirement:
Student

Candidates:
1. Job Title: 2019 C.T. Bauer College of Business Graduate (Magna Cum Laude) and aspiring Human Resources professional
Location: Houston, Texas
Connections: 85
2. Job Title: Native English Teacher at EPIK (English Program in Korea)
Location: Kanada
Connections: 500+ 
3. Job Title: Aspiring Human Resources Professional
Location: Raleigh-Durham, North Carolina Area
Connections: 44
4. Job Title: People Development Coordinator at Ryan
Location: Denton, Texas
Connections: 500+ 
5. Job Title: Advisory Board Member at Celal Bayar University
Location: İzmir, Türkiye
Connections: 500+ 

Task:
Rank the candidates from best to worst match.
Assign a relevance score from 1 (poor) to 5 (excellent).
Provide a brief explanation for each ranking.


Based on the job requirements, I have ranked the candidates from best to worst match as follows:

1. Job Title: Aspiring Human Resources Professional
Location: Raleigh-Durham, North Carolina Area

In [None]:
# Test Baseline Ranking

prompt = ranking_prompt(
    "Engineer",
    profiles
)

outputs = model.generate(
    **tokenizer(prompt, return_tensors="pt").to(model.device),
    max_new_tokens=300
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Setting `pad_token_id` to `eos_token_id`:128001 for open-end generation.


You are a recruiter.

Search Requirement:
Engineer

Candidates:
1. Job Title: 2019 C.T. Bauer College of Business Graduate (Magna Cum Laude) and aspiring Human Resources professional
Location: Houston, Texas
Connections: 85
2. Job Title: Native English Teacher at EPIK (English Program in Korea)
Location: Kanada
Connections: 500+ 
3. Job Title: Aspiring Human Resources Professional
Location: Raleigh-Durham, North Carolina Area
Connections: 44
4. Job Title: People Development Coordinator at Ryan
Location: Denton, Texas
Connections: 500+ 
5. Job Title: Advisory Board Member at Celal Bayar University
Location: İzmir, Türkiye
Connections: 500+ 

Task:
Rank the candidates from best to worst match.
Assign a relevance score from 1 (poor) to 5 (excellent).
Provide a brief explanation for each ranking.



## Advanced Approach: Fine-Tuning the LLM

> In this section, we improve the model by teaching it what makes a strong candidate.

> Instead of just prompting, we fine-tune the LLM using **QLoRA**, a memory-efficient method that allows training large models with limited hardware.


### Model Training Pipeline

#### **System Flow: Fine-Tuned Ranking System - Model Training**

<center>
Raw Candidate Data<br>
↓<br>
Weak labeling (keyword-based fitness scores)<br>
↓<br>
Instruction-format training samples<br>
↓<br>
QLoRA Fine-Tuning<br>
↓<br>
Fine-Tuned Model
</center><br><br>

In [None]:
#Load Quantized Model for Efficient Training

MODEL_NAME = "meta-llama/Llama-3.2-3B-Instruct"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token

base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    dtype=torch.float16
)

base_model = prepare_model_for_kbit_training(base_model)

Loading weights:   0%|          | 0/254 [00:00<?, ?it/s]

In [None]:
# Configure LoRA for Parameter-Efficient Fine-Tuning

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(base_model, peft_config)
model.print_trainable_parameters()

trainable params: 4,587,520 || all params: 3,217,337,344 || trainable%: 0.1426
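The trainable-parameter count printed above can be reproduced by hand. LoRA adds two low-rank matrices to each adapted linear layer, one of shape `r × in_features` and one of shape `out_features × r`, so each layer contributes `r * (in_features + out_features)` parameters. The layer dimensions below are assumptions about the Llama-3.2-3B configuration (hidden size 3072, 28 decoder layers, 8 key/value heads of dimension 128); they are not read from this notebook, although they do reproduce the printed figure.

```python
def lora_params(in_features: int, out_features: int, r: int) -> int:
    """Parameters added by one LoRA adapter: A is (r x in), B is (out x r)."""
    return r * (in_features + out_features)


R = 16               # LoRA rank from LoraConfig above
HIDDEN = 3072        # assumed hidden size
NUM_LAYERS = 28      # assumed number of decoder layers
KV_DIM = 8 * 128     # assumed v_proj output size (KV heads x head dim)

# Only q_proj and v_proj are adapted in each layer
per_layer = lora_params(HIDDEN, HIDDEN, R) + lora_params(HIDDEN, KV_DIM, R)
trainable = per_layer * NUM_LAYERS

ALL_PARAMS = 3_217_337_344  # "all params" from the printed output
print(f"trainable params: {trainable:,}")
print(f"trainable%: {100 * trainable / ALL_PARAMS:.4f}")
```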


In [None]:
# Reload Dataset for Training Pipeline

file_path = '/content/drive/MyDrive/Machine Learning/potential-talents - Aspiring human resources - seeking human resources.csv'
df_candidates = pd.read_csv(file_path)

df_candidates["profile_text"] = (
    "Job Title: " + df_candidates["job_title"].astype(str) + ", " +
    "Location: " + df_candidates["location"].astype(str) + ", " +
    "Connections: " + df_candidates["connection"].astype(str)
)

In [None]:
# Define Search Terms & Expand Training Data

search_terms = ["HR", "Human Resource", "Student", "Aspiring", "Engineer"]

expanded_rows = []

for _, row in df_candidates.iterrows():
    for term in search_terms:
        expanded_rows.append({
            "search_term": term,
            "profile_text": row["profile_text"]
        })

df_train = pd.DataFrame(expanded_rows)

In [None]:
# Generates a heuristic relevance score: 5 if any word of the search term occurs as a substring of the profile, otherwise 2.

def weak_label(search_term, profile):
    st = search_term.lower()
    pf = profile.lower()

    if any(word in pf for word in st.split()):
        return 5
    return 2
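Because the heuristic above matches substrings, a short term such as "HR" also matches inside unrelated words like "through". The sketch below shows a whole-word variant for comparison. The name `weak_label_whole_word` is hypothetical, and this is an alternative labeling rule rather than the one used to train the model in this notebook.

```python
import re


def weak_label_whole_word(search_term: str, profile: str) -> int:
    """Return 5 if any search-term word appears as a whole word in the profile, else 2."""
    profile_words = set(re.findall(r"[a-z0-9]+", profile.lower()))
    term_words = re.findall(r"[a-z0-9]+", search_term.lower())
    return 5 if any(word in profile_words for word in term_words) else 2


# "hr" is a substring of "through", but not a whole word in this profile
print(weak_label_whole_word("HR", "Job Title: Teacher through 2020"))
print(weak_label_whole_word("HR", "Job Title: HR Manager"))
```

Whole-word matching is stricter in both directions: "resource" no longer matches "resources", so a stemming step may be worth adding if plural forms matter.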

In [None]:
# Formats a supervised fine-tuning training example using weak labels for instruction-style learning.

def build_training_sample(row):
    score = weak_label(row["search_term"], row["profile_text"])

    return f"""
### Instruction:
You are an AI recruitment assistant that ranks candidate relevance.

### Search Term:
{row['search_term']}

### Candidate Profile:
{row['profile_text']}

### Task:
Rate relevance from 1 (poor match) to 5 (strong match) and explain briefly.

### Response:
Score: {score}
Reason: The candidate profile was compared with the search term using keyword alignment and role similarity.
"""

In [None]:
# Prepare Dataset for Model Fine-Tuning

df_train["text"] = df_train.apply(build_training_sample, axis=1)

dataset = Dataset.from_pandas(df_train[["text"]])
dataset = dataset.train_test_split(test_size=0.1, seed=42)

In [None]:
# Define Training Hyperparameters

training_args = TrainingArguments(
    output_dir="./qlora_results",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=3,
    fp16=False,
    bf16=True,  # train in bfloat16 mixed precision (requires a GPU with bf16 support)
    logging_steps=10,
    optim="adamw_torch",
    save_strategy="epoch",
    eval_strategy="epoch",
    report_to="none"
)
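The batch settings above interact: gradients are accumulated over several small batches before each optimizer update. The sketch below works out the effective batch size and an approximate number of optimizer steps, using the 468 training examples reported in the trainer output. The helper `training_schedule` is hypothetical, a single GPU is assumed, and the exact step count depends on how the trainer handles the final partial accumulation window.

```python
import math


def training_schedule(num_examples, per_device_batch, grad_accum, epochs, num_devices=1):
    """Return (effective batch size, approx. steps per epoch, approx. total steps)."""
    effective_batch = per_device_batch * grad_accum * num_devices
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    return effective_batch, steps_per_epoch, steps_per_epoch * epochs


effective, per_epoch, total = training_schedule(
    num_examples=468,      # training split size from the trainer output
    per_device_batch=2,    # per_device_train_batch_size
    grad_accum=4,          # gradient_accumulation_steps
    epochs=3,              # num_train_epochs
)
print(effective, per_epoch, total)
```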

In [None]:
# Initialize Fine-Tuning Trainer

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    processing_class=tokenizer,
    args=training_args,
)

trainer.model.print_trainable_parameters()

Adding EOS to train dataset:   0%|          | 0/468 [00:00<?, ? examples/s]

Tokenizing train dataset:   0%|          | 0/468 [00:00<?, ? examples/s]

Truncating train dataset:   0%|          | 0/468 [00:00<?, ? examples/s]

Adding EOS to eval dataset:   0%|          | 0/52 [00:00<?, ? examples/s]

Tokenizing eval dataset:   0%|          | 0/52 [00:00<?, ? examples/s]

Truncating eval dataset:   0%|          | 0/52 [00:00<?, ? examples/s]

trainable params: 4,587,520 || all params: 3,217,337,344 || trainable%: 0.1426


In [None]:
# Start Model Fine-Tuning

trainer.train()

The tokenizer has new PAD/BOS/EOS tokens that differ from the model config and generation config. The model config and generation config were aligned accordingly, being updated with the tokenizer's values. Updated tokens: {'eos_token_id': 128009, 'pad_token_id': 128009}.


Epoch,Training Loss,Validation Loss
1,0.431051,0.327322
2,0.263163,0.219251




In [None]:
# Save Fine-Tuned Adapter and Tokenizer

trainer.model.save_pretrained("./qlora_adapter")
tokenizer.save_pretrained("./qlora_adapter")

In [None]:
# Creates the inference-time prompt for scoring a single candidate against a search term.

def build_inference_prompt(search_term, profile):
    return f"""
### Instruction:
You are an AI recruitment assistant that ranks candidate relevance.

### Search Term:
{search_term}

### Candidate Profile:
{profile}

### Task:
Rate relevance from 1 (poor match) to 5 (strong match) and explain briefly.

### Response:
"""

In [None]:
# Runs the LLM to generate a relevance evaluation for one (search term, profile) pair.

def get_score(term, profile):
    prompt = build_inference_prompt(term, profile)

    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

    outputs = model.generate(
        **inputs,
        max_new_tokens=80,
        temperature=0.2
    )

    return tokenizer.decode(outputs[0], skip_special_tokens=True)

### Application (Inference) Pipeline

#### **System Flow: Fine-Tuned Ranking System - Application Pipeline**

<center>
New Job Role + Candidate Profiles<br>
↓<br>
Convert profiles to text<br>
↓<br>
Send to Fine-Tuned Model<br>
↓<br>
Model predicts fitness score<br>
↓<br>
Sort by score<br>
↓<br>
Ranked Candidate List
</center><br><br>

In [None]:
# Reload Fine-Tuned Model for Inference

MODEL_NAME = "meta-llama/Llama-3.2-3B-Instruct"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16
)

# Tokenizer
tokenizer = AutoTokenizer.from_pretrained("./qlora_adapter")

# Load base model again
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    quantization_config=bnb_config,
    device_map="auto",
    dtype=torch.float16
)

# 🔥 Attach LoRA weights
model = PeftModel.from_pretrained(base_model, "./qlora_adapter")
model.eval()

In [None]:
# Constructs the scoring prompt template used before sending input to the language model.

def build_prompt(search_term, profile_text):
    return f"""
### Instruction:
You are an AI recruitment assistant that ranks candidate relevance.

### Search Term:
{search_term}

### Candidate Profile:
{profile_text}

### Task:
Rate relevance from 1 (poor match) to 5 (strong match) and explain briefly.

### Response:
"""

In [None]:
# Uses the LLM to score a candidate profile and returns both numeric score and model explanation.

def get_candidate_score(search_term, profile_text):
    prompt = build_prompt(search_term, profile_text)

    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=100,
            temperature=0.1
        )

    text = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Extract score
    match = re.search(r"Score:\s*(\d)", text)
    score = int(match.group(1)) if match else 0

    return score, text
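Score extraction is the fragile step in this function, since the decoded text contains the prompt as well as the model's answer. The sketch below is one possible stricter parser: it looks only at the text after the last `### Response:` marker and accepts only digits 1 to 5. The name `extract_score` is hypothetical, and returning `None` for unparseable output (instead of 0) is a design assumption, not the notebook's behavior.

```python
import re
from typing import Optional


def extract_score(generated_text: str, marker: str = "### Response:") -> Optional[int]:
    """Parse a 1-5 score from the part of the text that follows the response marker."""
    # Keep only what follows the last marker so prompt text is never parsed
    response = generated_text.rsplit(marker, 1)[-1]
    match = re.search(r"Score:\s*([1-5])\b", response)
    return int(match.group(1)) if match else None


sample = "### Task:\nRate relevance.\n\n### Response:\nScore: 4\nReason: close match."
print(extract_score(sample))
print(extract_score("### Response:\nI cannot rate this profile."))
```

Returning `None` lets the caller separate "the model gave no score" from a genuinely low score, at the cost of having to handle the missing value before sorting.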

In [None]:
# Scores all candidate profiles in a dataframe and returns them sorted by relevance.

def rank_candidates(search_term, df):
    results = []

    for _, row in df.iterrows():
        score, response = get_candidate_score(search_term, row["profile_text"])

        results.append({
            "profile": row["profile_text"],
            "score": score,
            #"llm_output": response
        })

    ranked = sorted(results, key=lambda x: x["score"], reverse=True)
    return ranked

In [None]:
# Run Ranking

search_term = "Human Resource"

ranked_candidates = rank_candidates(search_term, df)

for r in ranked_candidates[:5]:
    print("Score:", r["score"])
    print(r["profile"])
    print()

In [None]:
# Run Ranking

search_term = "Student"

ranked_candidates = rank_candidates(search_term, df)

for r in ranked_candidates[:5]:
    print("Score:", r["score"])
    print(r["profile"])
    print()

In [None]:
# Run Ranking

search_term = "engineer"

ranked_candidates = rank_candidates(search_term, df)

for r in ranked_candidates[:5]:
    print("Score:", r["score"])
    print(r["profile"])
    print()

## Retrieval-Augmented Ranking (RAG)

>Instead of sending all candidates to the model, we first retrieve the most relevant ones using semantic search.

>Large candidate pools can overwhelm LLMs. Retrieval narrows down the best matches before scoring.

### **System Flow: RAG-Based Ranking**

<center>
All Candidate Profiles<br>
↓<br>
Convert profiles to embeddings (vector representation)<br>
↓<br>
Store embeddings for search
</center><br><br>

<center>
Job Role Query<br>
↓<br>
Convert role to embedding<br>
↓<br>
Similarity search (retrieve top-K relevant candidates)<br>
↓<br>
Selected candidate subset<br>
↓<br>
Send subset to Fine-Tuned LLM<br>
↓<br>
Model predicts fitness scores<br>
↓<br>
Final ranked list
</center><br><br>

In [None]:
# Load Sentence Embedding Model

embed_model = SentenceTransformer("all-MiniLM-L6-v2")  # fast + good

In [None]:
# Generate Embeddings for All Profiles

profiles = df["profile_text"].tolist()

embeddings = embed_model.encode(
    profiles,
    convert_to_numpy=True,
    show_progress_bar=True
)

In [None]:
# Build FAISS Vector Index

dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)

In [None]:
# Retrieves top-k semantically similar candidate profiles using vector embeddings and FAISS search.

def retrieve_candidates(search_term, top_k=5):
    query_vec = embed_model.encode([search_term])
    distances, indices = index.search(query_vec, top_k)

    return df.iloc[indices[0]]
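FAISS's `IndexFlatL2` performs an exact brute-force search by squared Euclidean distance. The pure-Python sketch below mirrors that logic on toy vectors to show conceptually what `index.search` returns. The function `l2_top_k` is hypothetical and is meant as an illustration, not as a replacement for FAISS.

```python
import heapq


def l2_top_k(query, vectors, k):
    """Return the k nearest vectors as (squared L2 distance, index) pairs, closest first."""
    scored = [
        (sum((q - v) ** 2 for q, v in zip(query, vec)), i)
        for i, vec in enumerate(vectors)
    ]
    return heapq.nsmallest(k, scored)


# Toy 2-D "embeddings" standing in for the profile vectors
toy_vectors = [[0, 0], [1, 0], [3, 4]]
nearest = l2_top_k([0, 0], toy_vectors, k=2)
print(nearest)
```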

In [None]:
# Performs RAG-style ranking: retrieve top candidates via embeddings, then re-rank using the LLM.

def rag_rank(search_term, top_k=5):
    retrieved_df = retrieve_candidates(search_term, top_k)
    results = []

    for _, row in retrieved_df.iterrows():
        score, response = get_candidate_score(search_term, row["profile_text"])

        results.append({
            "profile": row["profile_text"],
            "score": score,
            "llm_output": response
        })

    return sorted(results, key=lambda x: x["score"], reverse=True)
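Because the LLM scores are integers from 1 to 5, ties within the retrieved set are likely. One option, shown here as an assumption rather than something this notebook implements, is to break ties with the retrieval distance. The function `rank_with_tiebreak` is hypothetical, and the `"distance"` field is not produced by `rag_rank` above; it would have to be carried through from `index.search`.

```python
def rank_with_tiebreak(results):
    """Sort by LLM score (descending), then by retrieval distance (ascending)."""
    return sorted(results, key=lambda r: (-r["score"], r["distance"]))


# Toy results; "distance" is a hypothetical field holding the retrieval distance
toy_results = [
    {"profile": "A", "score": 5, "distance": 0.9},
    {"profile": "B", "score": 2, "distance": 0.1},
    {"profile": "C", "score": 5, "distance": 0.4},
]
ranked_toy = rank_with_tiebreak(toy_results)
print([r["profile"] for r in ranked_toy])
```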

In [None]:
# Run RAG Ranking - Query

search_term = "Human Resource"

ranked = rag_rank(search_term, top_k=10)

for r in ranked:
    print("Score:", r["score"])
    print(r["profile"])
    print()

In [None]:
# Run RAG Ranking - Query

search_term = "Student"

ranked = rag_rank(search_term, top_k=10)

for r in ranked:
    print("Score:", r["score"])
    print(r["profile"])
    print()

In [None]:
# Run RAG Ranking - Query

search_term = "Engineer"

ranked = rag_rank(search_term, top_k=10)

for r in ranked:
    print("Score:", r["score"])
    print(r["profile"])
    print()

## Summary of Approaches

| Approach | Learning | Strength | Weakness |
|----------|----------|----------|----------|
| Prompt Only | No | Fast | Inconsistent |
| Fine-Tuned LLM | Yes | Accurate | Requires training |
| RAG + LLM | Yes | Scalable | More complex |

<br>This project demonstrates how modern AI systems combine LLMs, retrieval, and human feedback to solve real-world ranking problems.


## Project Conclusion

This project presents an AI-powered system for ranking job candidates based on role fitness. The goal was to reduce manual screening effort and create a scalable, intelligent candidate evaluation pipeline.

We explored three approaches:

- **Prompt-Based Ranking** – Used an LLM without training as a baseline. Fast but inconsistent.
- **Fine-Tuned Ranking** – Trained the model using weak supervision and QLoRA, enabling more reliable and role-aware fitness scoring.
- **RAG-Based Ranking** – Added semantic retrieval to narrow candidates before scoring, improving relevance and scalability.

### 🔑 Key Insights
- LLMs can function as evaluators, not just text generators.
- Fine-tuning improves consistency and decision quality.
- Retrieval mechanisms make large-scale ranking feasible.
- Human feedback can be integrated to continuously improve rankings.

### 🚀 Final Takeaway
This project demonstrates how combining **LLMs, fine-tuning, retrieval systems, and human-in-the-loop learning** can build a practical AI solution for modern talent sourcing and candidate ranking.