COLD EMAIL GENERATOR

In [1]:
from langchain_groq import ChatGroq

llm = ChatGroq(
    temperature=0, 
    groq_api_key='Enter your created API key from Groq', 
    model_name="llama3-70b-8192"
)
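
Hard-coding the key in the notebook makes it easy to leak when the file is shared. One alternative is to read it from the environment. This is a minimal sketch: the variable name `GROQ_API_KEY` and the helper `get_groq_api_key` are assumptions made for illustration and are not part of the original notebook.

```python
import os


def get_groq_api_key(env_var: str = "GROQ_API_KEY") -> str:
    """Return the Groq API key from the environment, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before creating the client."
        )
    return key
```

The result could then be passed as `groq_api_key=get_groq_api_key()` in the `ChatGroq(...)` call above.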


Load career page content from a URL

In [2]:
from langchain_community.document_loaders import WebBaseLoader

# Notion job posting, hosted on Greenhouse
url = "https://job-boards.greenhouse.io/notion/jobs/6549644003"

loader = WebBaseLoader(url)
page_data = loader.load().pop().page_content

print(page_data[:1000])


USER_AGENT environment variable not set, consider setting it to identify your requests.
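
The warning above appears when the loader module is imported without a `USER_AGENT` environment variable. A minimal sketch of setting one first follows. The value string is a placeholder assumption, and the assignment has to run before the `WebBaseLoader` import to take effect.

```python
import os

# Set a descriptive User-Agent before importing WebBaseLoader, since the
# warning is emitted at import time. The value below is a placeholder.
os.environ.setdefault("USER_AGENT", "cold-email-generator/0.1 (personal project)")

user_agent = os.environ["USER_AGENT"]
print(user_agent)
```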


Job Application for Data Engineer, Product at NotionBack to jobsData Engineer, ProductSan Francisco, CaliforniaApplyAbout Us:
We're on a mission to make it possible for every person, team, and company to be able to tailor their software to solve any problem and take on any challenge. Computers may be our most powerful tools, but most of us can't build or modify the software we use on them every day. At Notion, we want to change this with focus, design, and craft.
We've been working on this together since 2016, and have customers like OpenAI, Toyota, Figma, Ramp, and thousands more on this journey with us. Today, we're growing fast and excited for new teammates to join us who are the best at what they do. We're passionate about building a company as diverse and creative as the millions of people Notion reaches worldwide.
Notion is an in-person company, and currently requires its employees to come to the office for two Anchor Days (Mondays & Thursdays) and requests that employees spend t
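
Scraped pages often carry runs of spaces and empty lines that add tokens without adding information. The sketch below shows an optional cleanup step that could run before the text is sent to the LLM. `clean_text` is a hypothetical helper, not something the notebook defines. It only normalizes whitespace and does not separate words the page itself ran together.

```python
import re


def clean_text(text: str) -> str:
    """Collapse runs of spaces/tabs and drop empty lines from scraped text."""
    lines = (re.sub(r"[ \t]+", " ", line).strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)
```

It could be applied as `page_data = clean_text(page_data)` right after loading.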

Extract job JSON info from the page content using your LLM

In [3]:
from langchain_core.prompts import PromptTemplate

prompt_extract = PromptTemplate.from_template(
        """
        ### SCRAPED TEXT FROM WEBSITE:
        {page_data}
        ### INSTRUCTION:
        The scraped text is from the careers page of a website.
        Your job is to extract the job postings and return them in JSON format containing the 
        following keys: `role`, `experience`, `skills` and `description`.
        Only return the valid JSON.
        ### VALID JSON (NO PREAMBLE):    
        """
)

chain_extract = prompt_extract | llm 
res = chain_extract.invoke(input={'page_data':page_data})
type(res.content)

str

Parse the extracted JSON string into a Python dictionary

In [4]:
from langchain_core.output_parsers import JsonOutputParser

json_parser = JsonOutputParser()
job = json_parser.parse(res.content)
print(job)

{'role': 'Data Engineer, Product', 'experience': 4, 'skills': ['analytics use cases', 'solving complex data problems', 'hands-on experience shipping scalable data solutions in the cloud', 'SQL expert', 'object-oriented programming paradigms', 'designing and building highly scalable and reliable data pipelines using BigData stack'], 'description': "As Notion continues to grow rapidly, we're seeking talented data engineers to join our team and help us build foundational datasets and pipelines, as well as the infrastructure that supports them. Your work will accelerate the decision-making process of key product and business functions."}


In [5]:
type(job)

dict
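
The prompt asks for "job postings" in the plural, so the model may return a list of objects rather than a single object. Here the result was a `dict`, but later cells call `job.get(...)` and would fail on a list. The sketch below shows one way to guard against that. `first_posting` is a hypothetical helper introduced here.

```python
from typing import Any


def first_posting(parsed: Any) -> dict:
    """Return a single job dict whether the model produced a dict or a list."""
    if isinstance(parsed, dict):
        return parsed
    if isinstance(parsed, list) and parsed and isinstance(parsed[0], dict):
        return parsed[0]
    raise ValueError(f"Unexpected JSON shape: {type(parsed).__name__}")
```

It could be used as `job = first_posting(json_parser.parse(res.content))`.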

In [6]:
import pandas as pd
df = pd.read_csv("sample_resumes_by_role.csv")
df

Unnamed: 0,Role,Skills,Resume_Link
0,Frontend Developer,"HTML, CSS, JavaScript, React",https://example.com/frontend-resume
1,Backend Developer,"Java, Spring Boot, MySQL",https://example.com/backend-resume
2,Full Stack Developer,"JavaScript, Node.js, MongoDB, React",https://example.com/fullstack-resume
3,Mobile App Developer,"Flutter, Dart, Firebase",https://example.com/mobile-resume
4,DevOps Engineer,"Docker, Jenkins, Kubernetes, AWS",https://example.com/devops-resume
5,Data Scientist,"Python, Pandas, NumPy, SQL",https://example.com/data-scientist-resume
6,Machine Learning Engineer,"Python, TensorFlow, Scikit-learn",https://example.com/ml-engineer-resume
7,UI/UX Designer,"Figma, Adobe XD, Sketch",https://example.com/uiux-resume
8,Cloud Architect,"AWS, Azure, GCP, Terraform",https://example.com/cloud-architect-resume
9,Cybersecurity Analyst,"Python, Wireshark, Kali Linux",https://example.com/cybersecurity-resume


Load your resumes CSV and initialize ChromaDB collection

In [7]:
import pandas as pd
import uuid
import chromadb

# Load resumes data
df = pd.read_csv("sample_resumes_by_role.csv")

# Initialize ChromaDB client and collection
client = chromadb.PersistentClient('vectorstore')
resume_collection = client.get_or_create_collection(name="resume_collection")

# Add resumes to collection if empty
if not resume_collection.count():
    for _, row in df.iterrows():
        resume_collection.add(
            documents=[row["Skills"]],
            metadatas=[{"link": row["Resume_Link"]}],
            ids=[str(uuid.uuid4())]
        )
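
The `count()` guard above skips loading whenever the persisted collection already holds anything, so later edits to the CSV are not picked up. Because the ids are random, removing the guard would insert duplicates. One possible alternative is to derive each id deterministically from the resume link. This is only a sketch, and `build_payload` is a hypothetical helper.

```python
import uuid


def build_payload(rows):
    """Turn (skills, link) pairs into keyword arguments for a Chroma add/upsert."""
    documents, metadatas, ids = [], [], []
    for skills, link in rows:
        documents.append(skills)
        metadatas.append({"link": link})
        # uuid5 is deterministic: the same link always maps to the same id.
        ids.append(str(uuid.uuid5(uuid.NAMESPACE_URL, link)))
    return {"documents": documents, "metadatas": metadatas, "ids": ids}


payload = build_payload([
    ("Python, Pandas, NumPy, SQL", "https://example.com/data-scientist-resume"),
    ("Docker, Jenkins, Kubernetes, AWS", "https://example.com/devops-resume"),
])
print(payload["ids"])
```

With the notebook's DataFrame, the rows could come from `zip(df["Skills"], df["Resume_Link"])`, and the payload could be passed as `resume_collection.upsert(**payload)` so that re-running the cell overwrites entries instead of duplicating them.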


Query ChromaDB to find top matching resumes by job skills

In [8]:
# Join the extracted skills into one query string (the LLM may return a list or a plain string)
skills = job.get("skills", [])
query_text = ", ".join(skills) if isinstance(skills, list) else skills

results = resume_collection.query(query_texts=[query_text], n_results=3)

# Extract resume links from query results
metadatas = results.get("metadatas", [])
links = []
if metadatas:
    for meta in metadatas[0]:
        link = meta.get("link", "")
        if link:
            links.append(link)

print("Matched resume links:", links)

Matched resume links: ['https://example.com/ba-resume', 'https://example.com/cloud-architect-resume', 'https://example.com/devops-resume']
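
The query result also reports a distance for each match, which helps judge how close the top hits really are. The sketch below pairs links with distances. It assumes the result is a dict whose `metadatas` and `distances` entries each hold one inner list per query text. `ranked_links` is a hypothetical helper, and the example dict is hand-written with made-up distances.

```python
def ranked_links(results: dict) -> list[tuple[str, float]]:
    """Pair each resume link with its distance for the first query text."""
    metadatas = (results.get("metadatas") or [[]])[0]
    distances = (results.get("distances") or [[]])[0]
    pairs = [
        (meta["link"], dist)
        for meta, dist in zip(metadatas, distances)
        if meta and meta.get("link")
    ]
    # Smaller distance means a closer match.
    return sorted(pairs, key=lambda pair: pair[1])


# Hand-written stand-in shaped like a query result; the distances are made up.
example = {
    "metadatas": [[{"link": "https://example.com/devops-resume"},
                   {"link": "https://example.com/data-scientist-resume"}]],
    "distances": [[1.42, 1.17]],
}
print(ranked_links(example)[0][0])
```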


Generate a cold email based on the job and matched resumes

In [12]:
from langchain_core.prompts import PromptTemplate

link_list = ", ".join(links)

prompt_email = PromptTemplate.from_template(
   """
    ### JOB DESCRIPTION:
    {job_description}
    
    ### INSTRUCTION:
    You are Sri, a job seeker reaching out to HR/recruiter about the job described above.
    Write a polite and concise cold email expressing your interest in the role.
    Highlight your relevant skills and experience based on the job description.
    Include a link to your resume suitable for the role from this list: {link_list}
    Keep the tone professional, simple, and formal.
    Do not mention that you selected a resume to suit the role.
    The email must include:
    - A subject line
    - A body starting with "Hello [Hiring manager's name],"
    
    
    ### EMAIL (NO PREAMBLE):
    """
)

chain_email = prompt_email | llm  # reuses the ChatGroq instance created in the first cell
email_response = chain_email.invoke({"job_description": str(job), "link_list": link_list})

print(email_response.content)


Subject: Application for Data Engineer, Product Role

Hello Hiring Manager,

I am excited to express my interest in the Engineer, Product role at Notion. With 4+ years of experience in building scalable data solutions in the cloud, I believe I can make a significant impact in accelerating the decision-making process of key product and business functions.

My skills align well with the job requirements, including analytics use cases, hands-on experience with BigData stack, and building highly scalable and reliable data pipelines. I am an SQL expert and proficient in object-oriented programming paradigms.

I would love the opportunity to discuss how my skills and experience can contribute to Notion's growth. Please find my resume at https://example.com/ba-resume

Thank you for considering my application. I look forward to hearing from you soon.

Best regards,
Sri
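
If the generated text is handed to a mail client or API, the subject line usually has to be separated from the body. A small sketch follows. It assumes the model puts `Subject:` on the first non-empty line, as it did in the output above. `split_email` is a hypothetical helper.

```python
def split_email(text: str) -> tuple[str, str]:
    """Split generated text into (subject, body); subject is '' if absent."""
    lines = text.strip().splitlines()
    if lines and lines[0].lower().startswith("subject:"):
        subject = lines[0].split(":", 1)[1].strip()
        body = "\n".join(lines[1:]).strip()
        return subject, body
    return "", text.strip()
```

It could be called as `subject, body = split_email(email_response.content)`.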
