# Cover Letter Generator
In this notebook, we build a RAG-based cover letter generator using LangChain.

## Workflow of Cover Letter Generator
1. Input Collection
* User uploads Resume (PDF/DOCX → parsed into text).
* User pastes Job Description (JD).

2. Keyword Extraction (LangChain pipeline #1)
* Use an LLM + PydanticOutputParser to extract:
    - role (e.g., "Data Scientist")
    - seniority (e.g., "Mid-level")
    - must_have (e.g., Python, SQL, Machine Learning)
    - nice_to_have (e.g., Cloud, Docker, NLP)
    - tools (e.g., TensorFlow, PyTorch, Tableau)
✅ Output = structured JDKeyWords object.

3. Resume Analysis

* Parse the resume into sections: Experience, Skills, Projects, Education.
* Extract skills & experiences → again with a structured schema (e.g., ResumeProfile).

4. Keyword Matching

* Compare JD keywords vs Resume keywords.
* Mark:
    - ✅ Matches (strengths to emphasize).
    - ❌ Gaps (skills missing → handled carefully, not overclaimed).
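
The matching step can be sketched in plain Python before any LLM is involved. This is a minimal illustration, assuming both sides are already flat lists of skill strings and that a case-insensitive exact match is good enough; `match_keywords` is a hypothetical helper, not part of the notebook's pipeline.

```python
from typing import Dict, List


def match_keywords(jd_keywords: List[str], resume_skills: List[str]) -> Dict[str, List[str]]:
    """Split JD keywords into matches and gaps using case-insensitive comparison."""
    # Normalize resume skills once so each lookup is a cheap set membership test.
    resume_lookup = {skill.strip().lower() for skill in resume_skills}
    matches: List[str] = []
    gaps: List[str] = []
    for keyword in jd_keywords:
        if keyword.strip().lower() in resume_lookup:
            matches.append(keyword)
        else:
            gaps.append(keyword)
    return {"matches": matches, "gaps": gaps}


report = match_keywords(
    ["Python", "SQL", "Docker"],
    ["python", "Docker ", "Tableau"],
)
print(report)
```

Exact matching misses synonyms (for example "Postgres" vs "PostgreSQL"), which is one reason the notebook hands the comparison to an LLM later on.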

5. Cover Letter Generation (LangChain pipeline #2)

* Prompt template uses:
    - JDKeyWords (so the letter aligns with the employer’s needs).
    - ResumeProfile (so the letter emphasizes relevant experience).
* LLM writes a personalized cover letter, structured into:
    - Greeting
    - Hook (why the candidate is interested in this role/company)
    - Body (match candidate’s experience → JD requirements)
    - Closing (enthusiasm + call to action).

## Configure the LLM

In [2]:
from dotenv import load_dotenv
import os
import sys
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_openai import ChatOpenAI
from langchain_community.tools.tavily_search import TavilySearchResults

# Load GOOGLE_API_KEY and TAVILY_API_KEY from a local .env file, if one exists.
load_dotenv()

class Settings:
    def __init__(self) -> None:
        sys.path.append(os.path.abspath(".."))
        self.gemini_api = os.environ.get("GOOGLE_API_KEY")
        self.tavily_api_key = os.environ.get("TAVILY_API_KEY")
        
    def load_gemini(self, temp: float = 0.5) -> ChatGoogleGenerativeAI:
        llm = ChatGoogleGenerativeAI(
            model = "gemini-2.5-flash",
            api_key = self.gemini_api,
            temperature = temp
        )
        print("LLM ready:", type(llm).__name__)
        return llm
    def load_gemma(self, temp: float = 0.5) -> ChatOpenAI:
        """
        Return the local Gemma 3 model served by LM Studio.
        """
        llm = ChatOpenAI(
            model="google/gemma-3-4b",
            openai_api_key = 'lm-studio', # type: ignore
            openai_api_base="http://localhost:1234/v1", # type: ignore
            temperature=temp
        )
        return llm
    
    def load_tavily_search(self, max_results: int = 2) -> TavilySearchResults:
        return TavilySearchResults(max_results = max_results)
        

In [3]:
config = Settings()
llm = config.load_gemini()

LLM ready: ChatGoogleGenerativeAI


## Define the schemas
We will use these schemas to structure the outputs from the LLMs.

1. Schema for extracting keywords

In [4]:
from pydantic import BaseModel, Field
from typing import List, Optional

class JDKeyWords(BaseModel):
    company_name: Optional[str] = Field(None, description="Name of the company")
    address: Optional[str] = Field(None, description="Location of the job or company")
    role: Optional[str] = Field(None, description="Role inferred from job description")
    seniority: Optional[str] = Field(None, description="Seniority level like junior, senior, associate etc")
    must_have: List[str] = Field(..., description="Critical must-have skills needed for the job")
    nice_to_have: List[str] = Field(default_factory=list, description="Optional skills")
    tools: List[str] = Field(default_factory=list, description="Software/tools needed for this job")

# ... means the field is required
# Optional[str] means the value can be a string or None
# default_factory=list means the field defaults to [] when it is omitted (e.g. tools)

## Creating Pipelines

* Pipeline to extract keywords from the job description

In [5]:
job_desc = """
Job Description: Senior Full-Stack Developer
Company: InnovateTech Solutions
Location: San Francisco, CA (Hybrid Remote)
Job Type: Full-Time

About Us
At InnovateTech Solutions, we're building the next generation of SaaS tools that empower businesses to thrive. Our platform leverages cutting-edge AI and data analytics to provide actionable insights. Join our passionate team of engineers and play a key role in shaping our product's future.

The Role
We are seeking a highly skilled and motivated Senior Full-Stack Developer to design, develop, and implement robust software solutions. You will be involved in all stages of the product lifecycle, from concept to deployment, and will mentor junior developers on the team. This is a fantastic opportunity to make a significant impact on a product used by thousands.

Key Responsibilities
Design, code, test, and manage full-stack applications from the database to the UI.

Collaborate with product managers, designers, and other engineers to define, design, and ship new features.

Lead technical architecture discussions and make recommendations on system improvements.

Write clean, maintainable, and efficient code while following best practices.

Conduct code reviews and provide constructive feedback to team members.

Identify and troubleshoot complex performance and scalability issues.

Must-Have Qualifications
5+ years of professional experience in software development.

Frontend: Proven expertise with modern JavaScript frameworks, specifically React and its ecosystem (Redux, Webpack, Hooks).

Backend: Strong proficiency in Python and experience with web frameworks, specifically Django or FastAPI.

Database: Experience with both PostgreSQL and Redis.

Cloud & DevOps: Hands-on experience with AWS (EC2, S3, RDS, Lambda) and familiarity with Docker and CI/CD pipelines.

Solid understanding of RESTful API design principles.

Experience with version control using Git.

Nice-to-Have Qualifications
Experience with TypeScript.

Knowledge of GraphQL.

Familiarity with testing frameworks (e.g., Jest, Pytest, Cypress).

Understanding of agile/scrum development methodologies.

Previous experience in a startup or SaaS environment.

Experience with Kubernetes.

Tools You'll Use
Frontend: React, Redux Toolkit, TypeScript, Vite, Jest

Backend: Python, Django REST Framework, FastAPI, Celery

Database: PostgreSQL, Redis

Infrastructure: AWS, Docker, GitHub Actions, Terraform

Collaboration: Jira, Slack, Figma, Confluence

What We Offer
Competitive salary and equity package.

Comprehensive health, dental, and vision insurance.

401(k) with company matching.

Flexible work schedule and generous PTO.

Professional development budget.

A collaborative, inclusive, and innovative culture.

How to Apply

If you are excited about this opportunity, please apply with your resume and a link to your GitHub profile or portfolio.
"""

In [14]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser

def generate_jd_keywords(job_desc: str, llm) -> dict:
    # JsonOutputParser returns a plain dict shaped like JDKeyWords, not a model instance.
    parser = JsonOutputParser(pydantic_object=JDKeyWords)
    prompt = ChatPromptTemplate.from_messages([
        ("system",
         "You are an AI tool that is responsible for extracting hiring signals from a job description. "
         "Return ONLY valid JSON that matches this schema:\n{format_instructions}"),
        ("human",
         "Job description:\n{job_description}\n"
         "Be precise. Keep lists concise and deduplicated."),
    ]).partial(format_instructions=parser.get_format_instructions())
    chain = prompt | llm | parser
    result = chain.invoke({"job_description": job_desc})

    return result
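
Because the JSON parser hands back a plain dict, nothing guarantees that the LLM actually filled in every field. The sketch below shows one way to check and normalize such a dict with the standard library only. It assumes the field names of `JDKeyWords` defined earlier; `normalize_jd_keywords` and the two field tuples are hypothetical names introduced for illustration.

```python
from typing import Any, Dict

# Field names copied from the JDKeyWords schema in this notebook.
TEXT_FIELDS = ("company_name", "address", "role", "seniority")
LIST_FIELDS = ("must_have", "nice_to_have", "tools")


def normalize_jd_keywords(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Check a parsed JD dict against the JDKeyWords layout and fill in defaults."""
    # must_have is the only required field in the schema.
    if not isinstance(raw.get("must_have"), list):
        raise ValueError("must_have is required and must be a list")

    cleaned: Dict[str, Any] = {}
    for field in TEXT_FIELDS:
        value = raw.get(field)
        # Missing or empty text fields become None, mirroring Optional[str].
        cleaned[field] = str(value).strip() if value else None
    for field in LIST_FIELDS:
        values = raw.get(field) or []
        # Drop blank entries and surrounding whitespace.
        cleaned[field] = [str(v).strip() for v in values if str(v).strip()]
    return cleaned


sample = normalize_jd_keywords({"role": " Data Scientist ", "must_have": ["Python", " SQL "]})
print(sample)
```

In a real pipeline, validating the dict against the Pydantic model itself would likely be the more robust choice; this version only illustrates the checks involved.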


In [16]:
import json

#llm2 = config.load_gemma()

result1 = generate_jd_keywords(job_desc , llm)
print(json.dumps(result1 , indent=2))

{
  "company_name": "InnovateTech Solutions",
  "address": "San Francisco, CA",
  "role": "Full-Stack Developer",
  "seniority": "Senior",
  "must_have": [
    "JavaScript",
    "React",
    "Redux",
    "Webpack",
    "Python",
    "Django",
    "FastAPI",
    "PostgreSQL",
    "Redis",
    "AWS",
    "Docker",
    "CI/CD",
    "RESTful API design",
    "Git"
  ],
  "nice_to_have": [
    "TypeScript",
    "GraphQL",
    "Jest",
    "Pytest",
    "Cypress",
    "Agile/Scrum",
    "Kubernetes"
  ],
  "tools": [
    "React",
    "Redux Toolkit",
    "TypeScript",
    "Vite",
    "Jest",
    "Python",
    "Django REST Framework",
    "FastAPI",
    "Celery",
    "PostgreSQL",
    "Redis",
    "AWS",
    "Docker",
    "GitHub Actions",
    "Terraform",
    "Jira",
    "Slack",
    "Figma",
    "Confluence"
  ]
}
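
In the output above, several entries (React, Python, FastAPI, and others) appear in both `must_have` and `tools`. If a single flat keyword list is needed downstream, the lists can be merged and deduplicated while preserving order. A minimal sketch, where `dedupe_keywords` is a hypothetical helper and case-insensitive equality is an assumption about what counts as a duplicate:

```python
from typing import Iterable, List


def dedupe_keywords(keywords: Iterable[str]) -> List[str]:
    """Remove case-insensitive duplicates, keeping the first spelling seen."""
    seen = set()
    unique: List[str] = []
    for keyword in keywords:
        cleaned = keyword.strip()
        key = cleaned.casefold()
        # Skip blanks and anything already seen under a different casing.
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


combined = dedupe_keywords(["React", "react", "Python", " Python", "Docker"])
print(combined)  # ['React', 'Python', 'Docker']
```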


## Pipeline to extract the contents of the resume

In [6]:
class ResumeProfile(BaseModel):
    name: str = Field(..., description="Name of the candidate applying for the job")
    contact: List[str] = Field(default_factory=list, description="Contact information such as email, phone etc")
    education: List[str] = Field(default_factory=list, description="Education information of the candidate.")
    experience: List[str] = Field(default_factory=list, description="Work experience of the candidate mentioned in the resume, with a short description")
    skills: List[str] = Field(..., description="Skills of the candidate mentioned in the resume.")
    projects: List[str] = Field(default_factory=list, description="Projects done by the candidate with a short description.")

In [11]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("resume.pdf")

pages = loader.load()
resume_text = "\n".join([p.page_content for p in pages])
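
Text extracted from a PDF often carries irregular spacing and long runs of blank lines. A light cleanup pass before prompting can keep the input tidy. The sketch below is one possible approach using only the standard library; `clean_resume_text` is a hypothetical helper, and the rules it applies (collapse spaces, trim lines, limit blank lines) are assumptions about what is safe to normalize.

```python
import re


def clean_resume_text(text: str) -> str:
    """Normalize whitespace in extracted resume text without touching the wording."""
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim leading and trailing whitespace on every line.
    text = "\n".join(line.strip() for line in text.splitlines())
    # Allow at most one blank line between blocks.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


cleaned = clean_resume_text("Senior   Developer\n\n\n\nSkills:\tPython ")
print(cleaned)
```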

## Pipeline to generate a resume report

In [10]:
class ResumeReport(BaseModel):
    matched_skills: List[str] = Field(... , description="Skills that are needed in job description and present in resume")
    missed_skills: List[str] = Field(default_factory=list , description="Skills needed in job description but not present in resume.")
    phrasing_suggestions: List[str] = Field(
        default_factory=list, description="Concrete bullet suggestions to add"
    )
    relevance_score: int = Field(..., ge=0, le=100)

In [13]:
def get_resume_profile(resume: str, llm) -> dict:
    # JsonOutputParser returns a plain dict shaped like ResumeProfile, not a model instance.
    parser = JsonOutputParser(pydantic_object=ResumeProfile)
    prompt = ChatPromptTemplate.from_messages([
        ("system",
         "You are an AI tool that analyzes a resume given as a string and parses its information into a structured format. "
         "Include a brief description of each experience (if any) and each project (if any) of the candidate. "
         "Return ONLY valid JSON that matches this schema:\n{format_instructions}"),
        ("human",
         "The resume is given below:\n\n{resume_text}"),
    ]).partial(format_instructions=parser.get_format_instructions())

    chain = prompt | llm | parser
    result = chain.invoke({"resume_text": resume})

    return result



In [38]:
result = get_resume_profile(resume_text , llm)
print(json.dumps(result , indent=2))

{
  "name": "Amal Varghese",
  "contact": [
    "Kerala, India",
    "LinkedIn",
    "+919207506741",
    "officialamalv2004@gmail.com"
  ],
  "education": [
    "Model Engineering College, Trikkakara, Kerala, India - Bachelor of Technology (GPA: 9.33) September 2023 \u2013 Present"
  ],
  "experience": [
    "IEDC MEC, Kerala, India - Tech Team Member (September 2020 \u2013 Present): Collaborated in the development and maintenance of EventSync, an event management platform for over 1000 users, contributing to UI development and architecture scaling.",
    "CSRBOX & IBM SkillsBuild, Remote - Web Development Intern (June 2024 - August 2024): Completed a competitive 6-week internship, developing BookBridge, a full-featured, serverless book donation platform.",
    "Wrench Solutions, Kochi, Kerala - Machine Learning Intern (May 2025 \u2013 Present): Developing a predictive ML model to forecast epileptic seizures using EEG data, applying fractal analysis for feature extraction, and researc

In [17]:
dummy_resume = """
Johnathan Chen
San Francisco, CA | (555) 123-4567 | johnathan.chen@email.com | linkedin.com/in/johnathanchen | github.com/jchen-dev

Summary
Senior Full-Stack Developer with 6 years of experience building scalable web applications in fast-paced startup environments. Proven expertise in modern JavaScript (React) and Python (Django, FastAPI) stacks. Passionate about clean architecture, mentorship, and leveraging AWS to build efficient, cloud-native solutions.

Technical Skills
Languages: JavaScript (ES6+), TypeScript, Python, SQL, HTML5, CSS3

Frontend: React, Redux, Redux Toolkit, Context API, Vite, Webpack, Jest, Cypress

Backend: Django, Django REST Framework, FastAPI, Flask, Celery

Databases: PostgreSQL, Redis, SQLite

Cloud & DevOps: AWS (EC2, S3, RDS, Lambda, IAM), Docker, GitHub Actions, CI/CD, Terraform

Tools & Methods: Git, Jira, Agile/Scrum, Figma, Confluence, Slack

Professional Experience
Senior Software Engineer | TechNova Inc., San Francisco, CA | June 2020 – Present

Led the end-to-end development of a new B2B SaaS analytics dashboard using React, TypeScript, and FastAPI, resulting in a 30% increase in user engagement.

Migrated legacy monolithic Django application to a microservices architecture, improving system scalability and reducing API response time by 40%.

Designed and implemented CI/CD pipelines with GitHub Actions and Docker, automating testing and deployment processes.

Mentored 3 junior developers on best practices for React state management, Python code quality, and AWS security.

Key Technologies: Python, FastAPI, React, TypeScript, PostgreSQL, Redis, AWS (EC2, S3, Lambda), Docker, Jest

Full-Stack Developer | StartUpGrid, Austin, TX | July 2018 – May 2020

Developed and maintained core features for a project management platform using Django REST Framework and React.

Built real-time notification features using WebSockets and Redis for pub/sub.

Optimized database queries on PostgreSQL, reducing page load times by over 25%.

Collaborated in an Agile team, participating in sprint planning, code reviews, and daily standups.

Key Technologies: Python, Django, Django REST Framework, React, JavaScript, PostgreSQL, Redis, Jira

Projects
CI/CD Pipeline Automator | github.com/jchen-dev/cicd-automator

A custom Docker and GitHub Actions pipeline for automating testing, building, and deployment of Python/JS applications to AWS.

Reduced deployment time by 70% and eliminated manual deployment errors.

Education
Bachelor of Science in Computer Science | University of Texas at Austin | *2014 – 2018*
"""
result2 = get_resume_profile(dummy_resume , llm)
print(json.dumps(result2 , indent=2))

{
  "name": "Johnathan Chen",
  "contact": [
    "(555) 123-4567",
    "johnathan.chen@email.com",
    "linkedin.com/in/johnathanchen",
    "github.com/jchen-dev"
  ],
  "education": [
    "Bachelor of Science in Computer Science | University of Texas at Austin | 2014 \u2013 2018"
  ],
  "experience": [
    "Senior Software Engineer | TechNova Inc., San Francisco, CA | June 2020 \u2013 Present: Led end-to-end development of a B2B SaaS analytics dashboard, migrated a monolithic application to microservices, designed and implemented CI/CD pipelines, and mentored junior developers.",
    "Full-Stack Developer | StartUpGrid, Austin, TX | July 2018 \u2013 May 2020: Developed and maintained core features for a project management platform, built real-time notification features, optimized database queries, and collaborated in an Agile team."
  ],
  "skills": [
    "JavaScript (ES6+)",
    "TypeScript",
    "Python",
    "SQL",
    "HTML5",
    "CSS3",
    "React",
    "Redux",
    "Redux Toolk

In [18]:
class ATSCheckResult(BaseModel):
    score: int = Field(...,ge=0,le=100, description="ATS compatibility score of the resume from 0 to 100")
    suggestions: List[str] = Field(..., description="Concrete suggestions to improve the resume to pass ATS")

In [19]:
def checkATS(resume: str, llm) -> dict:
    # JsonOutputParser returns a plain dict shaped like ATSCheckResult, not a model instance.
    parser = JsonOutputParser(pydantic_object=ATSCheckResult)
    prompt = ChatPromptTemplate.from_messages([
        ("system",
         "You are an AI tool that analyzes a resume and provides an ATS compatibility score from 0 to 100 along with concrete suggestions to improve the resume to pass ATS. "
         "Return ONLY valid JSON that matches this schema:\n{format_instructions}"),
        ("human",
         "The resume is given below as a string:\n\n{resume_text}"),
    ]).partial(format_instructions=parser.get_format_instructions())

    chain = prompt | llm | parser
    result = chain.invoke({"resume_text": resume})
    return result
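
An LLM-produced ATS score may vary from run to run, so a cheap rule-based check can serve as a sanity baseline alongside it. The sketch below only looks for common section headings. `missing_sections` and the `EXPECTED_SECTIONS` tuple are hypothetical, and the list of headings is an assumption rather than any ATS standard.

```python
from typing import List

# Assumed set of headings; adjust to the resume conventions you expect.
EXPECTED_SECTIONS = ("experience", "skills", "education")


def missing_sections(resume_text: str) -> List[str]:
    """Return the expected section names that never appear in the resume text."""
    lowered = resume_text.lower()
    return [section for section in EXPECTED_SECTIONS if section not in lowered]


gaps = missing_sections("Technical Skills\nPython, SQL\n\nEducation\nB.Sc. Computer Science")
print(gaps)  # ['experience']
```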

In [20]:
ats_score = checkATS(dummy_resume, llm)

In [21]:
print(json.dumps(ats_score, indent=2))

{
  "score": 95,
  "suggestions": [
    "Tailor the resume for each specific job application by incorporating keywords directly from the job description into your summary and experience bullet points. ATS systems heavily weigh exact keyword matches.",
    "Consider adding a dedicated 'Certifications' section if you hold any relevant industry certifications (e.g., AWS Certified Developer, Professional Scrum Master). This adds valuable keywords and validates expertise.",
    "Ensure that every technology listed in your 'Technical Skills' section that is critical for a target role is also contextually mentioned within your 'Professional Experience' bullet points. While your 'Key Technologies' sub-sections are excellent, embedding them naturally throughout the descriptions further reinforces keyword density.",
    "If 'Microservices Architecture' is a highly sought-after keyword for your target roles, consider adding it explicitly to your 'Cloud & DevOps' or a new 'Architecture' sub-sectio

In [27]:
def generate_resume_report(job_desc: dict, resume: dict, llm) -> dict:
    # Inputs are the dicts returned by generate_jd_keywords and get_resume_profile.
    parser = JsonOutputParser(pydantic_object=ResumeReport)
    prompt = ChatPromptTemplate.from_messages([
        ("system",
         "You are an AI tool that takes structured JSON describing a job description and a resume. "
         "Your task is to analyze both to find matching skills and missing skills and to make phrasing suggestions. "
         "You must make the suggestions as concrete bullet points. "
         "You must give a relevance_score between 0 and 100 for how well the resume fits the job description. "
         "Return ONLY valid JSON that matches this schema:\n{format_instructions}"),
        ("human",
         "Keywords of the job description:\n\n{job_description}\n\n"
         "Structured information from the resume:\n\n{resume}\n\n"
         "Be precise. Keep lists concise and deduplicated."),
    ]).partial(format_instructions=parser.get_format_instructions())

    chain = prompt | llm | parser
    result = chain.invoke({"job_description": job_desc, "resume": resume})

    return result
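
The `relevance_score` in the report is chosen by the LLM. For comparison, a deterministic coverage score can be computed directly from the keyword lists. This is a sketch under stated assumptions: `coverage_score` is a hypothetical helper, matching is case-insensitive and exact, and the 0.8 weight on must-have skills is an arbitrary illustrative choice.

```python
from typing import List


def coverage_score(
    must_have: List[str],
    nice_to_have: List[str],
    resume_skills: List[str],
    must_weight: float = 0.8,
) -> int:
    """Weighted share of JD keywords found in the resume, scaled to 0-100."""
    resume_lookup = {skill.strip().lower() for skill in resume_skills}

    def fraction(keywords: List[str]) -> float:
        # An empty requirement list counts as fully satisfied.
        if not keywords:
            return 1.0
        hits = sum(1 for k in keywords if k.strip().lower() in resume_lookup)
        return hits / len(keywords)

    score = must_weight * fraction(must_have) + (1 - must_weight) * fraction(nice_to_have)
    return round(100 * score)


score = coverage_score(
    must_have=["Python", "React", "GraphQL", "Docker"],
    nice_to_have=["Kubernetes", "TypeScript"],
    resume_skills=["python", "react", "docker", "typescript"],
)
print(score)  # 70
```

A large gap between this number and the LLM's `relevance_score` could be a hint to inspect the report more closely.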

In [39]:
result = generate_resume_report(result1 , result2 , llm)
print(json.dumps(result , indent=2))

{
  "matched_skills": [
    "JavaScript",
    "React",
    "Redux",
    "Webpack",
    "Python",
    "Django",
    "FastAPI",
    "PostgreSQL",
    "Redis",
    "AWS",
    "Docker",
    "CI/CD",
    "Git",
    "TypeScript",
    "Jest",
    "Cypress",
    "Agile/Scrum",
    "Redux Toolkit",
    "Vite",
    "Django REST Framework",
    "Celery",
    "GitHub Actions",
    "Terraform",
    "Jira",
    "Slack",
    "Figma",
    "Confluence"
  ],
  "missed_skills": [
    "GraphQL",
    "Pytest",
    "Kubernetes",
    "RESTful API design"
  ],
  "phrasing_suggestions": [
    "Add 'Proficient in RESTful API design and implementation' to your technical skills list, given your extensive experience with Django REST Framework and FastAPI.",
    "If applicable, include a bullet point detailing any experience with GraphQL, Pytest, or Kubernetes, e.g., 'Explored/Implemented solutions using GraphQL for API development' or 'Deployed and managed applications on Kubernetes clusters'.",
    "Quantify the 

## Pipeline to generate the cover letter
We will feed the LLM with:
* Job description keywords
* Resume Profile
* Resume Report

In [44]:
from langchain_core.messages import BaseMessage

def generate_cover_letter(job_desc: dict, resume: dict, report: dict, llm, pages: int = 1) -> BaseMessage:
    """
    Build and run a pipeline that generates a cover letter based on
    * Job description keywords
    * Resume profile
    * Resume report

    Returns the LLM message; the letter text is in its `.content` attribute.
    """
    prompt = ChatPromptTemplate.from_messages([
        ("system",
         "You are a career assistant that writes professional, ATS-friendly and personalized cover letters. "
         "The cover letter should be structured properly with paragraphs. "
         "The cover letter should have a proper structure such as from address, introduction, body, conclusion and salutation."),
        ("human",
         "Here are the job description keywords:\n{job_description}\n"
         "Here is the candidate's resume profile:\n{resume}\n"
         "Here is the analysis report of the resume against the job description:\n{report}\n"
         "Write a {pages}-page cover letter in a professional tone.\n"
         "Make sure to emphasize the matching skills, address missing skills diplomatically, "
         "and align the candidate's experience with the job requirements.\n\n"
         "Output only the cover letter text."),
    ])
    chain = prompt | llm
    result = chain.invoke({
        "job_description": job_desc,
        "resume": resume,
        "report": report,
        "pages": pages
    })

    return result

In [45]:
cover_letter = generate_cover_letter(result1 , result2 ,result, llm)

In [46]:
print(cover_letter.content)

Johnathan Chen
(555) 123-4567
johnathan.chen@email.com
linkedin.com/in/johnathanchen
github.com/jchen-dev

[Current Date]

InnovateTech Solutions
San Francisco, CA

Dear InnovateTech Solutions Hiring Team,

I am writing to express my enthusiastic interest in the Senior Full-Stack Developer position at InnovateTech Solutions, as advertised. With over five years of experience in developing robust, scalable web applications and a proven track record of leading complex projects from conception to deployment, I am confident that my skills and experience align perfectly with the requirements of this role and your company's innovative mission.

My background as a Senior Software Engineer at TechNova Inc. has provided me with extensive expertise across the full stack. I possess deep proficiency in **JavaScript (ES6+)**, **React**, and **Redux**, complemented by experience with modern front-end tooling such as **Webpack**, **Redux Toolkit**, and **Vite**. On the back end, I am highly skilled in
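
The generated letter contains Markdown bold markers (`**React**`) and a `[Current Date]` placeholder, neither of which belongs in a plain-text, ATS-friendly document. A small post-processing step can handle both. The sketch below assumes the placeholder is spelled exactly `[Current Date]` as in the output above; `to_plain_text` is a hypothetical helper.

```python
import re
from datetime import date


def to_plain_text(letter: str, today: date) -> str:
    """Strip Markdown bold markers and fill in the date placeholder."""
    # Replace **text** with text, leaving everything else untouched.
    plain = re.sub(r"\*\*(.+?)\*\*", r"\1", letter)
    # Substitute the placeholder the model tends to emit for the date.
    return plain.replace("[Current Date]", today.strftime("%B %d, %Y"))


draft = "[Current Date]\nSkilled in **React** and **Redux**."
final_letter = to_plain_text(draft, date(2025, 1, 15))
print(final_letter)
```

With the notebook's variables, this would be called on `cover_letter.content` rather than on the message object itself.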

## Pipeline to generate the email

In [49]:
from langchain_core.prompts import ChatPromptTemplate

def generate_cover_email(
    job_desc: JDKeyWords,
    resume: ResumeProfile,
    report: ResumeReport,
    llm
) -> str:
    """
    Generates a professional email-style cover letter.
    Includes subject, greeting, body, and closing.
    """

    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a career assistant that writes professional job application emails."),
        ("human",
         "Here are the job description keywords:\n{job_description}\n\n"
         "Here is the candidate resume profile:\n{resume}\n\n"
         "Here is the analysis report comparing resume and job description:\n{report}\n\n"
         "Write a professional email that the candidate can send directly to a recruiter.\n"
         "The email must include:\n"
         "- A clear subject line mentioning the role.\n"
         "- A polite greeting.\n"
         "- A concise body that emphasizes relevant skills and experience.\n"
         "- A closing with name and contact details.\n\n"
         "Output only the final email content.")
    ])

    chain = prompt | llm
    result = chain.invoke({
        "job_description": job_desc,
        "resume": resume,
        "report": report
    })

    return result.content
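
Since the prompt asks for a subject line, the returned string usually starts with a `Subject:` line followed by the body. To pass the email to a mail client or library, the two parts need to be separated. A minimal sketch, assuming the subject sits on the first line with a `Subject:` prefix; `split_subject` is a hypothetical helper.

```python
from typing import Tuple


def split_subject(email_text: str) -> Tuple[str, str]:
    """Separate a leading 'Subject:' line from the rest of the email."""
    lines = email_text.strip().splitlines()
    prefix = "subject:"
    if lines and lines[0].lower().startswith(prefix):
        subject = lines[0][len(prefix):].strip()
        body = "\n".join(lines[1:]).strip()
        return subject, body
    # No subject line found: return an empty subject and the full text as body.
    return "", email_text.strip()


subject, body = split_subject(
    "Subject: Application for Senior Full-Stack Developer - Johnathan Chen\n\n"
    "Dear Hiring Team,\nHello."
)
print(subject)
print(body)
```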


In [50]:
cover_email = generate_cover_email(result1 , result2 ,result, llm)
# generate_cover_email already returns result.content (a str), so print it directly.
print(cover_email)

Subject: Application for Senior Full-Stack Developer - Johnathan Chen

Dear InnovateTech Solutions Hiring Team,

I am writing to express my enthusiastic interest in the Senior Full-Stack Developer position at InnovateTech Solutions, as advertised. With over five years of experience in full-stack development, including a recent role as a Senior Software Engineer, I am confident that my skills and experience align perfectly with the requirements of this role.

In my current role at TechNova Inc., I have led the end-to-end development of complex B2B SaaS applications, including the architectural design and successful migration of a legacy monolithic Django application to a scalable microservices architecture. I have also designed and implemented robust CI/CD pipelines using GitHub Actions and Terraform on AWS, significantly streamlining deployment processes. My experience also includes mentoring junior developers and collaborating effectively within Agile/Scrum methodologies.

My technica