<a href="https://www.kaggle.com/code/sreejab22/gen-ai-job-application-assistant-capstone?scriptVersionId=234150903" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# 🤖 Gen AI Job Application Assistant (Capstone Project)

This notebook demonstrates a Generative AI-powered assistant that:
- Matches resumes to job descriptions
- Generates structured JSON with job match info
- Creates personalized cover letters

Built using **Google Gemini** (the code calls `gemini-2.0-flash`), it showcases six GenAI capabilities:
1. Retrieval-Augmented Generation
2. Structured Output (JSON)
3. Agent-style Task Automation
4. Few-shot Prompting
5. Grounding
6. Long Context Handling

## 🔗 Project Links

- 📖 **Blog Post**: [Read it on Medium](https://medium.com/@bethusreeja/automating-job-applications-with-gen-ai-my-google-capstone-project-using-gemini-pro-701e31745a9e)
- 🎥 **Demo Video**: [Watch on YouTube](https://www.youtube.com/watch?v=olx944mnz5U)

This project was created as part of the **Google Gen AI Intensive Capstone 2025**. It shows how Generative AI can automate parts of a job application using Google Gemini.


## 👥 Authors

Sreeja Bethu — Lead Developer, Prompt Engineer, and Workflow Architect
  🔗 [LinkedIn](https://www.linkedin.com/in/sreejabethu/) | 🧠 [Kaggle](https://www.kaggle.com/sreejab22)  

## 📌 Use Case: Automating Tailored Job Applications

Job seekers spend hours tailoring resumes, writing cover letters, and organizing job submissions.

**This assistant automates that workflow**:
- Compares resumes with job descriptions
- Calculates match score and generates resume bullets
- Produces structured JSON and a personalized cover letter
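
The workflow above can be sketched as a small chain in which each step's output feeds the next prompt. This is a minimal sketch under stated assumptions, not the notebook's implementation: `run_pipeline`, the prompt wording, and the `generate` callable are hypothetical. In practice `generate` would wrap a Gemini call; here a stub stands in so the example runs without an API key.

```python
from typing import Callable, Dict, List


def run_pipeline(job: str, resume: str, generate: Callable[[str], str]) -> Dict[str, str]:
    """Chain three LLM calls: match analysis -> resume bullets -> cover letter."""
    # Step 1: compare the resume with the job description.
    match = generate(
        "Compare the resume and job description. List matching skills "
        f"and give a match score (0-10).\n\nJob Description:\n{job}\n\nResume:\n{resume}"
    )
    # Step 2: the match analysis becomes context for the bullet suggestions.
    bullets = generate(
        "Using the analysis below, suggest 2 bullet points to add to the resume.\n\n"
        f"Analysis:\n{match}\n\nResume:\n{resume}"
    )
    # Step 3: the cover letter builds on the bullets from step 2.
    letter = generate(
        "Write a short cover letter for this job that highlights these points.\n\n"
        f"Job Description:\n{job}\n\nPoints:\n{bullets}"
    )
    return {"match": match, "bullets": bullets, "cover_letter": letter}


# Stub model (hypothetical) so the sketch runs offline; it records every prompt.
calls: List[str] = []


def fake_generate(prompt: str) -> str:
    calls.append(prompt)
    return f"<output {len(calls)}>"


result = run_pipeline("Data Engineer role", "7+ years of SQL", fake_generate)
print(result["cover_letter"])
```

Passing the model in as a callable keeps the chaining logic separate from the API client, which makes each step easy to test or swap.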

## 🤖 Gen AI Capabilities Used
**1. Retrieval-Augmented Generation** – Supplies the job description and resume as prompt context so that the output draws on both.

**2. Structured Output (JSON)** – Formats results for use in job trackers or automation tools.

**3. Agent-style Automation** – Chains multiple LLM tasks together (match → bullet points → cover letter).

**4. Few-shot Prompting** – Leverages example-driven prompts to guide Gemini in producing high-quality, personalized content.

**5. Grounding** – Ensures responses are based on actual input from resumes and job descriptions.

**6. Long Context Handling** – Processes entire resumes and lengthy job descriptions within a single prompt efficiently.
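
Few-shot prompting, as described in item 4, places worked examples ahead of the real input. The sketch below shows one way such a prompt might be assembled. The function name `build_few_shot_prompt` and the example pairs are hypothetical and are not taken from this notebook's prompts.

```python
from typing import List, Tuple


def build_few_shot_prompt(examples: List[Tuple[str, str]], requirement: str, resume: str) -> str:
    """Assemble a prompt with (requirement, bullet) example pairs before the real input."""
    parts = ["Rewrite a job requirement as a resume bullet point.", ""]
    for example_requirement, example_bullet in examples:
        parts += [f"Requirement: {example_requirement}", f"Bullet: {example_bullet}", ""]
    # Grounding: instruct the model to rely only on the supplied resume text.
    parts += [
        "Use only facts from the resume below.",
        f"Resume: {resume}",
        f"Requirement: {requirement}",
        "Bullet:",
    ]
    return "\n".join(parts)


# Hypothetical examples, for illustration only.
examples = [
    ("Experience with SQL", "Wrote and tuned SQL queries for reporting workloads."),
    ("Cloud platforms", "Deployed data pipelines on AWS."),
]
prompt = build_few_shot_prompt(examples, "Experience with Snowflake", "7+ years in Snowflake and SQL")
print(prompt)
```

The prompt ends on an open `Bullet:` line so that the model continues the pattern set by the examples.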

In [1]:
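# ✅ Install and configure Gemini API
!pip install -q google-generativeai
import google.generativeai as genai
from kaggle_secrets import UserSecretsClient

# Load the API key from Kaggle Secrets rather than hard-coding it in the notebook.
# Assumption: the key is saved under the secret name 'GOOGLE_API_KEY'.
GOOGLE_API_KEY = UserSecretsClient().get_secret('GOOGLE_API_KEY')
genai.configure(api_key=GOOGLE_API_KEY)
model = genai.GenerativeModel(model_name='models/gemini-2.0-flash')  # Check model name via genai.list_models()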

In [2]:
# ✅ Provide sample job description and resume
job_description = '''We are hiring a Data Engineer with expertise in Snowflake, SQL, dbt, and cloud platforms like AWS or Azure. The role requires building and maintaining scalable data pipelines, ensuring data quality, and working with stakeholders.''' 

resume = '''Sreeja Bethu – 7+ years of experience in data engineering, SQL, Snowflake, AWS, Azure, and ETL workflows. Strong in stakeholder collaboration, data modeling, and automation.'''

In [3]:
# ✅ Step 1: Score match and suggest bullet points
prompt = f'''
Compare the resume and job description below.

1. List matching skills
2. Provide a match score (0–10)
3. Suggest 2 bullet points to add to the resume

Job Description:
{job_description}

Resume:
{resume}
'''

response = model.generate_content(prompt)
print(response.text)

Okay, let's analyze the resume against the job description.

**1. Matching Skills:**

*   **Snowflake:** Explicitly mentioned in both.
*   **SQL:** Explicitly mentioned in both.
*   **AWS:** Explicitly mentioned in both.
*   **Azure:** Explicitly mentioned in both.
*   **Stakeholder collaboration:** Explicitly mentioned in both.

**2. Match Score:**

I would give Sreeja's resume a match score of **8/10**.

**Explanation of Score:**

*   **Strong Positives:** Sreeja explicitly lists all the key technical skills (Snowflake, SQL, AWS, Azure) and stakeholder collaboration that the job description highlights. The experience level (7+ years) also likely aligns well.
*   **Areas for Improvement:** The resume lacks specific details about building and maintaining scalable data pipelines, data quality initiatives, and `dbt`.  While "ETL workflows" hints at pipelines, it could be more direct. The resume also doesn't mention specific projects or achievements related to these skills.

**3. Suggeste

In [4]:
# ✅ Step 2: Generate structured JSON output
json_prompt = f'''
Generate a JSON object with:
- job_title
- company
- match_score
- resume_bullets
- custom_cover_letter

Use the job description and resume.

Job:
{job_description}
Resume:
{resume}
'''

response = model.generate_content(json_prompt)
print(response.text)

```json
{
  "job_title": "Data Engineer",
  "company": "Hiring Company (Assumed from context, replace with actual company)",
  "match_score": 0.85,
  "resume_bullets": [
    "7+ years of experience in data engineering",
    "Proficient in SQL, Snowflake, AWS, Azure, and ETL workflows",
    "Strong in stakeholder collaboration",
    "Experienced in data modeling",
    "Expert in automation"
  ],
  "custom_cover_letter": "Dear Hiring Manager,\n\nI am writing to express my keen interest in the Data Engineer position at [Company Name], as advertised on [Platform]. With over 7 years of experience in data engineering and a strong skillset encompassing SQL, Snowflake, AWS, Azure, and ETL workflows, I am confident that I possess the technical expertise and collaborative skills necessary to excel in this role.\n\nMy resume highlights my experience in building and maintaining data pipelines, ensuring data quality, and collaborating with stakeholders. I am particularly proficient in Snowflake and
```

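The response above arrives wrapped in a Markdown `json` code fence, so it needs cleaning before it can be loaded into a job tracker. The following is a minimal sketch, assuming the model returns a complete JSON object with an optional surrounding fence. `parse_json_response` is a hypothetical helper, and the sample string is shortened for illustration.

```python
import json
import re

# Markdown code-fence marker, built by repetition so it nests cleanly in this example.
FENCE = "`" * 3


def parse_json_response(text: str) -> dict:
    """Strip an optional Markdown code fence and parse the remaining JSON."""
    cleaned = text.strip()
    # Match an opening fence line (with an optional language tag) and a closing fence.
    pattern = rf"^{FENCE}[a-zA-Z]*\s*\n(.*?)\n?{FENCE}$"
    match = re.match(pattern, cleaned, flags=re.DOTALL)
    if match:
        cleaned = match.group(1)
    # json.loads raises json.JSONDecodeError if the output was cut off.
    return json.loads(cleaned)


sample = FENCE + 'json\n{"job_title": "Data Engineer", "match_score": 0.85}\n' + FENCE
record = parse_json_response(sample)
print(record["job_title"], record["match_score"])
```

Truncated output like the response shown above would raise `json.JSONDecodeError`, so a caller may want to catch that error and retry. The `google-generativeai` SDK also accepts a `response_mime_type` setting in the generation config that requests JSON directly; check the SDK documentation for the version in use.
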
## ✅ Conclusion

This GenAI Assistant automates a previously manual process:
- Smartly analyzes job fit
- Structures results for tracking
- Writes personalized cover letters

**Extensions:**
- Job scraping (LinkedIn, Indeed)
- Google Sheets job tracker
- Gmail API for automated follow-ups

🎯 A perfect example of real-world GenAI in career automation.

!kaggle competitions submit -c gen-ai-intensive-course-capstone-2025q1 -f submission.csv -m "Capstone demo"

## 📎 Supported Text & Reasoning

**Resume**:  
5+ years experience in SQL, Power BI, and Healthcare Analytics

**Job Description**:  
Looking for someone with experience in dashboards, healthcare data, and ETL

**🧠 Reasoning**:  
The candidate shows a clear alignment with the job description through skills overlap in SQL, BI tools, and healthcare domain knowledge.  
Gemini Pro generated a score of **87** and recommended the candidate as a **strong match**.
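
The scores quoted in this notebook use different scales: 8/10 in Step 1, 0.85 in the JSON output, and 87 here. A tracker that stores them side by side would need a common scale. The sketch below shows one possible convention. `normalize_score` and its thresholds are assumptions for illustration, not something Gemini or this notebook defines.

```python
def normalize_score(raw: float) -> float:
    """Map a match score onto a 0-1 scale.

    Assumed convention: values up to 1 are already fractions, values up to 10
    are on a 0-10 scale, and values up to 100 are percentages.
    """
    if raw < 0 or raw > 100:
        raise ValueError(f"score out of range: {raw}")
    if raw <= 1:
        return float(raw)
    if raw <= 10:
        return raw / 10
    return raw / 100


for raw in (8, 0.85, 87):
    print(raw, "->", normalize_score(raw))
```

The heuristic is ambiguous for a raw value of exactly 1, which could be a fraction or a point on a 0–10 scale. Requesting one fixed scale in every prompt avoids that guesswork.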

## 🧪 Evaluation & End Notes

This GenAI assistant was evaluated based on:
- Human-validated accuracy of resume-job matching
- Skill and gap identification relevance
- Coherence and specificity in cover letter generation

**Note**: No quantitative metrics were used. Evaluation was a qualitative review of the quality, alignment, and clarity of outputs from Gemini 2.0 Flash.

## 📚 Citations

**Capstone Reference**  
@misc{gen-ai-intensive-course-capstone-2025q1,  
author = {Addison Howard and Brenda Flynn and Kinjal Parekh and Myles O'Neill and Nate and Polong Lin},  
title = {Gen AI Intensive Course Capstone 2025Q1},  
year = {2025},  
howpublished = {\url{https://www.kaggle.com/competitions/gen-ai-intensive-course-capstone-2025q1}},  
note = {Kaggle}
}

**Model Reference**  
- Google Gemini 2.0 Flash – [Documentation](https://ai.google.dev/models/gemini)  
- Access via: [Google AI Studio](https://makersuite.google.com/app) | `google-generativeai` SDK

**Project Author**  
- Sreeja Bethu — Lead Developer, Prompt Designer, Workflow Architect  
[Kaggle](https://www.kaggle.com/sreejab22) | [LinkedIn](https://www.linkedin.com/in/sreejabethu/)

## 📝 License

This project is licensed under the **Apache License 2.0**.

Copyright 2025 Sreeja Bethu

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

[http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.