# 03 - Medium Roaster 

This notebook implements our balanced CV critique model.

## Characteristics
- Professional but honest tone, with direct feedback and no sugarcoating
- Points out obvious issues clearly while maintaining professionalism
- Uses a medium temperature (0.6-0.8) for more varied and candid responses
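
As a rough intuition for why temperature changes how varied the responses are, the sketch below applies temperature scaling to a softmax over made-up token scores. It illustrates the general technique only; it is not a description of Gemini's internal sampler, and `toy_logits` and `softmax_with_temperature` are hypothetical names introduced here.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate tokens (assumption, for illustration only)
toy_logits = [2.0, 1.0, 0.5]
for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures spread it across alternatives, which is why the 0.6-0.8 range tends to give more varied wording than near-zero settings.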

---

In [11]:
import pandas as pd
import json
from pathlib import Path
from datetime import datetime
import sys
sys.path.append('..')
import google.generativeai as genai

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', 100)

## Setup

In [2]:
# Load API key from config.py
from config import GEMINI_API_KEY
genai.configure(api_key=GEMINI_API_KEY)
print("API key loaded from config.py")

API key loaded from config.py


## Load Data and Helper Functions

In [3]:
# Load dataset
df = pd.read_csv('../data/resume_data.csv')

# Load test CV indices
with open('../data/test_cv_indices.json', 'r') as f:
    test_data = json.load(f)
    test_cv_indices = test_data['indices']

print(f"Loaded {len(df)} resumes")
print(f"Test CVs: {test_cv_indices}")

Loaded 9544 resumes
Test CVs: [0, 1]


In [4]:
# CV formatting function
def format_cv_for_llm(resume_row):
    """
    Format a resume row into a readable text for LLM processing.
    """
    cv_text = []
    
    if pd.notna(resume_row.get('career_objective')):
        cv_text.append(f"CAREER OBJECTIVE:\n{resume_row['career_objective']}")
    
    if pd.notna(resume_row.get('skills')):
        cv_text.append(f"\nSKILLS:\n{resume_row['skills']}")
    
    education_parts = []
    if pd.notna(resume_row.get('educational_institution_name')):
        education_parts.append(f"Institution: {resume_row['educational_institution_name']}")
    if pd.notna(resume_row.get('degree_names')):
        education_parts.append(f"Degree: {resume_row['degree_names']}")
    if pd.notna(resume_row.get('major_field_of_studies')):
        education_parts.append(f"Major: {resume_row['major_field_of_studies']}")
    if pd.notna(resume_row.get('passing_years')):
        education_parts.append(f"Year: {resume_row['passing_years']}")
    
    if education_parts:
        cv_text.append(f"\nEDUCATION:\n" + "\n".join(education_parts))
    
    work_parts = []
    if pd.notna(resume_row.get('professional_company_names')):
        work_parts.append(f"Company: {resume_row['professional_company_names']}")
    if pd.notna(resume_row.get('positions')):
        work_parts.append(f"Position: {resume_row['positions']}")
    if pd.notna(resume_row.get('start_dates')):
        period = f"Period: {resume_row['start_dates']}"
        if pd.notna(resume_row.get('end_dates')):
            # Keep the end date on the same line as the start date
            period += f" to {resume_row['end_dates']}"
        work_parts.append(period)
    if pd.notna(resume_row.get('responsibilities')):
        work_parts.append(f"Responsibilities:\n{resume_row['responsibilities']}")
    
    if work_parts:
        cv_text.append(f"\nWORK EXPERIENCE:\n" + "\n".join(work_parts))
    
    if pd.notna(resume_row.get('languages')):
        cv_text.append(f"\nLANGUAGES:\n{resume_row['languages']}")
    
    if pd.notna(resume_row.get('certification_skills')):
        cv_text.append(f"\nCERTIFICATIONS:\n{resume_row['certification_skills']}")
    
    return "\n".join(cv_text)

## Medium Roaster Prompt Design

In [5]:
MEDIUM_SYSTEM_PROMPT = """You are an experienced hiring manager who provides direct, honest CV feedback.

Your approach:
1. Be direct and honest - no sugarcoating
2. Point out obvious flaws and red flags
3. Call out generic buzzwords and filler content
4. Be professional but don't hold back the truth
5. Focus on what actually matters to employers

Keep your feedback:
- Brutally honest but professional
- Direct about weaknesses
- Critical of vague or generic content
- Focused on real-world hiring standards

Structure your response:
 FIRST IMPRESSION: What stands out (good or bad)
 MAJOR ISSUES: Glaring problems that need fixing
 CONCERNS: Things that raise questions
 WHAT WORKS: Brief acknowledgment of strengths
 BOTTOM LINE: Final verdict and priority fixes
"""

def create_medium_prompt(cv_text):
    """Create a medium roasting prompt."""
    return f"""Review this CV with honest, direct feedback. Don't hold back on pointing out issues:

{cv_text}

Provide your honest assessment following the structure in the system prompt."""
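
Because the system prompt asks for five named sections, a critique can be split back into those sections for side-by-side comparison. The following is a minimal sketch, assuming the model writes each heading at the start of a line, optionally wrapped in `**` and followed by a colon; `split_critique_sections` is a hypothetical helper and not part of the notebook.

```python
import re

# Section names taken from the response structure in MEDIUM_SYSTEM_PROMPT
SECTION_NAMES = [
    "FIRST IMPRESSION",
    "MAJOR ISSUES",
    "CONCERNS",
    "WHAT WORKS",
    "BOTTOM LINE",
]

# Heading at the start of a line, e.g. "**MAJOR ISSUES:**" or "BOTTOM LINE:"
_HEADING = re.compile(
    r"^\s*\*{0,2}(" + "|".join(re.escape(n) for n in SECTION_NAMES) + r"):?\*{0,2}",
    re.MULTILINE,
)

def split_critique_sections(critique):
    """Return {section name: body text} for the headings found in the critique."""
    matches = list(_HEADING.finditer(critique))
    sections = {}
    for i, match in enumerate(matches):
        # Each section runs until the next heading or the end of the text
        end = matches[i + 1].start() if i + 1 < len(matches) else len(critique)
        sections[match.group(1)] = critique[match.end():end].strip()
    return sections
```

Sections the model skipped are simply absent from the returned dictionary, so a caller can check `"BOTTOM LINE" in sections` to spot incomplete responses.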

## Temperature Tuning Experiments

We run the same test CV through the model at three temperatures (0.6, 0.7, 0.8) and compare how varied and candid the critiques are.

In [6]:
def roast_cv(cv_text, temperature=0.7, model_name="gemini-2.0-flash"):
    """
    Generate CV critique using Gemini.
    
    Args:
        cv_text: Formatted CV text
        temperature: Sampling temperature; higher values give more varied output
        model_name: Gemini model to use
    
    Returns:
        str: Generated critique
    """
    model = genai.GenerativeModel(
        model_name=model_name,
        generation_config=genai.GenerationConfig(
            temperature=temperature,
            top_p=0.95,
            top_k=40,
            max_output_tokens=1024,
        )
    )
    
    full_prompt = f"{MEDIUM_SYSTEM_PROMPT}\n\n{create_medium_prompt(cv_text)}"
    
    response = model.generate_content(full_prompt)
    return response.text

# Test with first CV
test_cv = format_cv_for_llm(df.iloc[test_cv_indices[0]])
print("Test CV:")
print("="*80)
print(test_cv[:500] + "...")
print("="*80)

Test CV:
CAREER OBJECTIVE:
Big data analytics working and database warehouse manager with robust experience in handling all kinds of data. I have also used multiple cloud infrastructure services and am well acquainted with them. Currently in search of role that offers more of development.

SKILLS:
['Big Data', 'Hadoop', 'Hive', 'Python', 'Mapreduce', 'Spark', 'Java', 'Machine Learning', 'Cloud', 'Hdfs', 'YARN', 'Core Java', 'Data Science', 'C++', 'Data Structures', 'DBMS', 'RDBMS', 'Informatica', 'Talend...
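
The three experiments that follow each call the model once at a fixed temperature. The same comparison could be written as a single loop; the sketch below assumes a generator callable with the same signature as `roast_cv(cv_text, temperature=...)`. `run_temperature_sweep` and `fake_generate` are hypothetical names, and the stand-in generator exists only so the example runs without an API key.

```python
def run_temperature_sweep(cv_text, temperatures, generate_fn):
    """Call generate_fn once per temperature and collect the critiques."""
    results = {}
    for temperature in temperatures:
        results[temperature] = generate_fn(cv_text, temperature=temperature)
    return results

# Stand-in generator so the sketch runs offline (assumption);
# in the notebook, roast_cv would be passed instead.
def fake_generate(cv_text, temperature=0.7):
    return f"[temperature={temperature}] critique of {len(cv_text)} characters"

sweep = run_temperature_sweep("CAREER OBJECTIVE: sample text", [0.6, 0.7, 0.8], fake_generate)
for temperature, critique in sweep.items():
    print(temperature, "->", critique)
```

Passing the generator as an argument keeps the loop testable and makes it easy to swap in a different model or a cached response.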


### Experiment 1: Medium Temperature (0.6)
More varied but still controlled

In [7]:
print(" Temperature: 0.6 (Controlled Variety)")
print("="*80)
result_temp_06 = roast_cv(test_cv, temperature=0.6)
print(result_temp_06)
print("\n" + "="*80)

 Temperature: 0.6 (Controlled Variety)
Okay, here's my brutally honest assessment of this CV. Buckle up.

**FIRST IMPRESSION:**

Generic and underwhelming. It screams "entry-level" in the worst possible way. The formatting is basic, and the content feels like a keyword dump rather than a compelling narrative. The "Till Date" is unprofessional.

**MAJOR ISSUES:**

*   **Career Objective:** This is terrible. It's a rambling, unfocused sentence that tells me nothing concrete about what you *actually* want to do or what you bring to the table. "More of development" is vague and grammatically incorrect. This needs a complete rewrite to focus on a specific target role and quantifiable contributions.
*   **Skills Section:** This is a laundry list of buzzwords. Listing technologies is fine, but without context or evidence of proficiency, it's meaningless. Anyone can copy and paste a list of technologies. Where's the proof you're actually good at these things?
*   **Work Experience (Coca-Cola):

### Experiment 2: Medium-High Temperature (0.7)
Good balance of creativity and directness

In [8]:
print(" Temperature: 0.7 (Balanced)")
print("="*80)
result_temp_07 = roast_cv(test_cv, temperature=0.7)
print(result_temp_07)
print("\n" + "="*80)

 Temperature: 0.7 (Balanced)
Okay, here's my brutally honest assessment of this CV:

**FIRST IMPRESSION:**

This CV screams "generic and underdeveloped." It looks like a template was filled in with buzzwords and lacks any substance or quantifiable achievements. The formatting is also questionable with the odd spacing around "to ['Till Date']".

**MAJOR ISSUES:**

*   **Career Objective is Awful:** It's vague, grammatically incorrect, and focuses on *what you want* instead of what you offer. "Currently in search of role that offers more of development" tells me nothing about your actual capabilities or career goals. It also feels tacked on and doesn't flow. This needs a complete rewrite to highlight your key skills and value proposition.
*   **Skills Section is a Buzzword Dump:** Listing a bunch of technologies without any context or evidence of proficiency is useless. Anyone can claim to know 'Big Data' or 'Machine Learning.' This section needs to be substantiated with specific project

### Experiment 3: High Temperature (0.8)
More creative and varied critiques

In [9]:
print(" Temperature: 0.8 (More Creative)")
print("="*80)
result_temp_08 = roast_cv(test_cv, temperature=0.8)
print(result_temp_08)
print("\n" + "="*80)

 Temperature: 0.8 (More Creative)
Okay, here's the brutally honest feedback on this CV:

**FIRST IMPRESSION:** This CV screams "generic and underdeveloped." It looks like a template was filled in with minimal effort. The lack of detail is alarming, and the formatting is basic to the point of being unprofessional.

**MAJOR ISSUES:**

*   **Career Objective - Useless:** This is a vague, rambling statement that says absolutely nothing. "Robust experience in handling all kinds of data" is meaningless. Every candidate claims to be seeking a role that offers more development. Delete it or rewrite it to clearly state the specific role you are pursuing and why you're a good fit.
*   **Skills - Keyword Vomit:** This is just a laundry list of buzzwords. There's no indication of proficiency level in any of these technologies. Listing 'C++' and 'Core Java' after 'Big Data' skills makes it look like you are throwing everything at the wall.
*   **Work Experience - Vague and Meaningless:** "Technical

## Select Optimal Temperature

Based on reading through the experiments above, we manually select the temperature that gives direct, honest feedback, with critique points delivered in a candid tone.

**Recommended: 0.7 for medium roasting**
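
The selection here is a manual judgment. If a rough numeric cross-check is wanted, simple lexical statistics can show how much the critiques differ from one another. This is a sketch of one possible supplement, not the method used in this notebook; the helper names are hypothetical and the sample strings are placeholders rather than real model output.

```python
import re
from itertools import combinations

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def describe_critique(text):
    """Word count and share of distinct words for one critique."""
    words = tokenize(text)
    return {
        "words": len(words),
        "distinct_ratio": len(set(words)) / len(words) if words else 0.0,
    }

def jaccard_overlap(text_a, text_b):
    """Vocabulary overlap between two critiques (1.0 means identical word sets)."""
    a, b = set(tokenize(text_a)), set(tokenize(text_b))
    return len(a & b) / len(a | b) if (a | b) else 1.0

# Placeholder strings (assumption); in the notebook these would be
# result_temp_06, result_temp_07 and result_temp_08.
critiques = {
    0.6: "placeholder critique about a vague objective",
    0.7: "placeholder critique about a buzzword heavy skills list",
    0.8: "another placeholder critique with different wording",
}

for temperature, text in critiques.items():
    print(temperature, describe_critique(text))
for (t_a, text_a), (t_b, text_b) in combinations(critiques.items(), 2):
    print(t_a, t_b, round(jaccard_overlap(text_a, text_b), 2))
```

These numbers only describe surface wording; they say nothing about whether a critique is accurate or appropriately toned, so they complement a manual read rather than replace it.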


In [10]:
# Set optimal temperature
OPTIMAL_TEMPERATURE = 0.7

print(f" Selected optimal temperature: {OPTIMAL_TEMPERATURE}")

 Selected optimal temperature: 0.7


## Summary

This notebook demonstrated:
1. Direct CV critique prompt design
2. Temperature tuning experiments (0.6, 0.7, 0.8)
3. Selection of optimal temperature
4. Generation of critiques
5. Saving results for comparison
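
For the last step, one way to persist critiques for later comparison is a timestamped JSON file. The sketch below is an assumption about how that could look: the `save_critiques` helper, the `medium_roaster_` file prefix, and the payload keys are all hypothetical, and the demo writes to a temporary directory rather than a project path.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

def save_critiques(critiques, output_dir, temperature, model_name):
    """Write critiques plus run metadata to a timestamped JSON file."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    payload = {
        "model": model_name,
        "temperature": temperature,
        "created_at": timestamp,
        "critiques": critiques,  # e.g. {cv index as string: critique text}
    }
    out_path = output_dir / f"medium_roaster_{timestamp}.json"
    out_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return out_path

# Demo in a temporary directory so nothing is written to the project tree
with tempfile.TemporaryDirectory() as tmp:
    path = save_critiques({"0": "placeholder critique"}, tmp, 0.7, "gemini-2.0-flash")
    loaded = json.loads(path.read_text(encoding="utf-8"))
    print(path.name, loaded["temperature"])
```

Storing the model name and temperature alongside the text makes it possible to compare runs from different notebooks without relying on file names alone.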