# Cold Email Generator using LangChain and Chroma Vector Database

## 1. Loading Modules

In [2]:
import os
import pandas as pd
from dotenv import load_dotenv

from langchain_community.document_loaders import WebBaseLoader

from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser
from langchain_core.documents import Document

import uuid
import chromadb

from langchain_groq import ChatGroq

from IPython.display import display, Markdown

## 2. Setup ENV Config

In [3]:
load_dotenv()

True

In [4]:
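# load_dotenv() above already exported the key; fail early with a clear message if it is missing
assert os.getenv('GROQ_API_KEY'), 'GROQ_API_KEY is not set; add it to the .env file'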

## 3. Global Configuration

In [6]:
llm_model = 'llama-3.1-8b-instant'
groq_llm = ChatGroq(model=llm_model, temperature=0.3) 

## 4. Job Data Scraping and Extraction

### 4.1 Web Page Link for the Job Posting

In [7]:
# This URL must point to a current job posting, not an old or removed one.

web_page = 'https://jobs.lever.co/paytm/43e1a748-88e0-4068-a934-8e519a72982b'

### 4.2 Web Page Data Loading

In [8]:
page_loader = WebBaseLoader(web_page)

page_data = page_loader.load()

In [9]:
print(f'''Loaded Web Page: "{page_data[0].metadata['source']}"''')
print(f'''Job Post Title: "{page_data[0].metadata['title']}"''')

Loaded Web Page: "https://jobs.lever.co/paytm/43e1a748-88e0-4068-a934-8e519a72982b"
Job Post Title: "Paytm - Business Analyst - Team Lead (NOIDA)"


In [10]:
data = page_data[0].page_content

display(Markdown(data))

Paytm - Business Analyst - Team Lead (NOIDA)Business Analyst - Team Lead (NOIDA)Noida, Uttar PradeshAnalytics – EDC Retail /On-roll /On-siteApply for this jobAbout Us: Paytm is India’s leading digital payments and financial services company, which is focused on driving consumers and merchants to its platform by offering them a variety of payment use cases. Paytm provides consumers with services like utility payments and money transfers, while empowering them to pay via Paytm Payment Instruments (PPI) like Paytm Wallet, Paytm UPI, Paytm Payments Bank Netbanking, Paytm FASTag and Paytm Postpaid - Buy Now, Pay Later. To merchants, Paytm offers acquiring devices like Soundbox, EDC, QR and Payment Gateway where payment aggregation is done through PPI and also other banks’ financial instruments. To further enhance merchants’ business, Paytm offers merchants commerce services through advertising and Paytm Mini app store. Operating on this platform leverage, the company then offers credit services such as merchant loans, personal loans and BNPL, sourced by its financial partners. About the Team: Analytics- Lending- CollectionsLiaison between the creditors and consumers. They are in charge of observing accounts to identify overdue payments, report collection activity, address client queries, and develop repayment plans. About the role: As a Business Analyst for Paytm Collections, you will be responsible for managing end-to-end projects for the Collections team. You will work closely with the business team to understand their requirements and translate them into detailed Business Requirement Documents (BRDs) for the Product team. You will also be responsible for project planning, prioritization, and tracking the progress of projects to ensure timely delivery of high-quality solutions. Expectations/ Requirements 1. Must have advance SQL experience2. Basic knowledge of Python3. Working experience on Google Looker or another visualization tool4. 
Should have worked on automation projects related to data integration, data extraction and data visualization.5. Should have advanced excel knowledge6. Should have built management reports, MIS and dashboardsExperience in VBA.7. Analytics and critical thinking is important8. Good to have worked on Looker Studio for visualization Superpowers/ Skills that will help you succeed in this role: 1.   Bachelor's degree in Computer Science, Engineering, or a related field.2.  1+ years of experience as a Project Manager, preferably in the      financial services or technology industry.3. Strong project management skills, including the ability to develop      and manage project plans, prioritize tasks, and track progress.4. Strong analytical and problem-solving skills, with the ability to      understand complex business problems and translate them into technical      requirements.5. Excellent communication and interpersonal skills, with the ability      to work effectively with cross-functional teams and stakeholders.6. Experience in market research and analysis is a plus7. Experience working with Agile methodologies is a plus. Education: Graduation/Post Graduation Why Join Us:1.  A collaborative output driven program that brings cohesiveness across businesses through technology. 2.  Improve the average revenue per use by increasing the cross-sell opportunities.3.  A solid 360 feedback from your peer teams on your support of their goals.4.  Respect, that is earned, not demanded from your peers and manager. Compensation:If you are the right fit, we believe in creating wealth for you.With enviable 500 mn+ registered users, 21 mn+ merchants and depth of data in our ecosystem, we are in a unique position to democratize credit for deserving consumers & merchants – and we are committed to it. India’s largest digital lending story is brewing here. It’s your opportunity to be a part of the story! Apply for this jobPaytm Home PageJobs powered by 
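
The scraped text above runs headings, list items, and footer links together. The following is a minimal sketch of an optional clean-up step before the text goes into the prompt. `clean_text` is a hypothetical helper (not part of LangChain), and the rule it applies, collapsing whitespace, is only an assumption about what helps the model.

```python
import re

def clean_text(text: str) -> str:
    """Collapse runs of whitespace and strip leading/trailing blanks."""
    # Replace any run of spaces, tabs, or newlines with a single space
    text = re.sub(r'\s+', ' ', text)
    return text.strip()

sample = "Should have   advanced excel knowledge6.\n\n  Experience in VBA. "
print(clean_text(sample))
```

If used, the cleaned string would replace `data` in the `invoke` call further down.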

### 4.3 Prompt Template for Job Detail Extraction

In [11]:
job_template = """
### CONTEXT:
The following context contains the scraped data about the job posting from a website.
{text}

### STRICT INSTRUCTION:
Your job is to extract the relevant information from the context provided above and convert it into a simple JSON string.
Do not add anything of your own, such as a preamble.

### OUTPUT JSON STRING FORMAT:
Create a JSON string with the following keys: 'Role', 'Location', 'Experience', 'Skills'
"""

gen_job_template = PromptTemplate.from_template(job_template)
gen_job_template

PromptTemplate(input_variables=['text'], input_types={}, partial_variables={}, template="\n### CONTEXT:\nThe following context contains the scraped data about the job posting from a website.\n{text}\n\n### STRICT INSTRUCTION:\nYour job is to extract the relevant information from the context provided above and convert it into a simple JSON string.\nDo not add anything of your own, such as a preamble.\n\n### OUTPUT JSON STRING FORMAT:\nCreate a JSON string with the following keys: 'Role', 'Location', 'Experience', 'Skills'\n")

### 4.4 GROQ API Call to Extract the Job Information

In [12]:
chain_job = gen_job_template | groq_llm

In [13]:
resp_job = chain_job.invoke(input={'text':data})

### 4.5 GROQ Response

In [14]:
print(resp_job.content)

{
  "Role": "Business Analyst - Team Lead",
  "Location": "Noida, Uttar Pradesh",
  "Experience": {
    "Required": "1+ years of experience as a Project Manager",
    "Preferred": "financial services or technology industry"
  },
  "Skills": [
    "Advance SQL experience",
    "Basic knowledge of Python",
    "Google Looker or another visualization tool",
    "Automation projects related to data integration, data extraction, and data visualization",
    "Advanced excel knowledge",
    "VBA",
    "Analytics and critical thinking",
    "Looker Studio for visualization"
  ]
}


In [15]:
print(f'Data Type of the Response: {type(resp_job.content)} \n')

json_parser = JsonOutputParser()
json_dict = json_parser.parse(resp_job.content)

print(f'Data Type of the Response: {type(json_dict)} \n')

json_dict

Data Type of the Response: <class 'str'> 

Data Type of the Response: <class 'dict'> 



{'Role': 'Business Analyst - Team Lead',
 'Location': 'Noida, Uttar Pradesh',
 'Experience': {'Required': '1+ years of experience as a Project Manager',
  'Preferred': 'financial services or technology industry'},
 'Skills': ['Advance SQL experience',
  'Basic knowledge of Python',
  'Google Looker or another visualization tool',
  'Automation projects related to data integration, data extraction, and data visualization',
  'Advanced excel knowledge',
  'VBA',
  'Analytics and critical thinking',
  'Looker Studio for visualization']}
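
The response above happened to be clean JSON, but the model may still wrap it in extra text despite the instruction. A minimal sketch of a fallback that pulls the first JSON object out of a noisy reply using only the standard library; `extract_json` is a hypothetical helper, and the sample reply is made up for illustration.

```python
import json
import re

def extract_json(raw: str) -> dict:
    """Parse the JSON object embedded in an LLM reply."""
    # Grab everything from the first '{' to the last '}' (greedy, spans newlines)
    match = re.search(r'\{.*\}', raw, flags=re.DOTALL)
    if match is None:
        raise ValueError('No JSON object found in the response')
    return json.loads(match.group(0))

# Made-up reply with text before and after the JSON
reply = 'Here is the JSON you asked for:\n{"Role": "Business Analyst - Team Lead", "Skills": ["VBA"]}\nLet me know if you need more.'
parsed = extract_json(reply)
print(parsed['Skills'])
```

This only covers a single top-level object surrounded by plain text; replies containing several separate objects would need a stricter approach.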

In [16]:
jobs = json_dict['Skills']
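
The prompt does not fix the shape of each value (note that `Experience` came back as a nested dictionary), so `Skills` is not guaranteed to be a list either. A small sketch of a defensive normalisation step; `normalize_skills` is a hypothetical helper, and treating a string as comma-separated is an assumption.

```python
def normalize_skills(value) -> list:
    """Return the skills as a clean list of strings, whatever shape the model used."""
    if value is None:
        return []
    if isinstance(value, str):
        # Assume a comma-separated string such as "SQL, Python, VBA"
        value = value.split(',')
    # Trim each entry and drop the empty ones
    return [str(item).strip() for item in value if str(item).strip()]

print(normalize_skills('Advance SQL experience, VBA'))
```

With this in place the cell above could read `jobs = normalize_skills(json_dict.get('Skills'))`.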

## 5. Vector Store Index

### 5.1 Data Loading and Pre-Processing

In [17]:
df = pd.read_csv(r'./data/links.csv')

print(f'Shape of the Dataset: {df.shape} \n')

df.head()

Shape of the Dataset: (20, 2) 



| | Techstack | Links |
|---|---|---|
| 0 | React, Node.js, MongoDB | https://example.com/react-portfolio |
| 1 | Angular,.NET, SQL Server | https://example.com/angular-portfolio |
| 2 | Vue.js, Ruby on Rails, PostgreSQL | https://example.com/vue-portfolio |
| 3 | Python, Django, MySQL | https://example.com/python-portfolio |
| 4 | Java, Spring Boot, Oracle | https://example.com/java-portfolio |


In [18]:
# Preparing Documents from the CSV file features

def data_prep(dff):
    """Build one Document per CSV row: tech stack as content, portfolio link as metadata."""
    docs = []

    for _, row in dff.iterrows():
        doc = Document(page_content=row['Techstack'],
                       metadata={'Links': row['Links']})
        docs.append(doc)

    return docs

In [19]:
docs = data_prep(df)

In [20]:
docs[:2]

[Document(metadata={'Links': 'https://example.com/react-portfolio'}, page_content='React, Node.js, MongoDB'),
 Document(metadata={'Links': 'https://example.com/angular-portfolio'}, page_content='Angular,.NET, SQL Server')]

### 5.2 Create Persistent Vector Store

In [21]:
print('Creating Chroma Vector DB ... wait \n')

client = chromadb.PersistentClient('./chroma_db')
collection = client.get_or_create_collection(name="portfolio")

if collection.count():
    print('Collection Exists ...')
else:
    for idx, row in df.iterrows():
        collection.add(documents=row["Techstack"],
                       metadatas={"Links": row["Links"]},
                       ids=[str(uuid.uuid4())])
        
print('Chroma Vector DB created successfully ... \n')

Creating Chroma Vector DB ... wait 

Chroma Vector DB created successfully ... 
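
The cell above assigns a random `uuid4` to every row, so the IDs differ on each rebuild of the collection. A minimal sketch of one alternative, deriving the ID from the row content with `uuid5` so the same row always maps to the same ID across runs; `stable_id` is a hypothetical helper, and the choice of `NAMESPACE_URL` and of the `link|techstack` key is arbitrary.

```python
import uuid

def stable_id(techstack: str, link: str) -> str:
    """Derive a repeatable ID from the row's content."""
    # uuid5 hashes the name within a namespace, so equal inputs give equal IDs
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f'{link}|{techstack}'))

first = stable_id('React, Node.js, MongoDB', 'https://example.com/react-portfolio')
again = stable_id('React, Node.js, MongoDB', 'https://example.com/react-portfolio')
print(first == again)  # True
```

The result could be passed as `ids=[stable_id(row["Techstack"], row["Links"])]` in place of the `uuid4` call.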



### 5.3 Load Vector Store

In [22]:
print('Loading Chroma Vector DB ... wait \n')

client = chromadb.PersistentClient('./chroma_db')
collection = client.get_collection(name="portfolio")

print('Chroma Vector DB loaded successfully ... \n')

Loading Chroma Vector DB ... wait 

Chroma Vector DB loaded successfully ... 



## 6. Job Similarity Search and Links Metadata Output

In [23]:
links = []

for job in jobs:
    resp = collection.query(query_texts=job, n_results=1).get('metadatas', [])
    # print(job)
    # print(resp[0][0]['Links'])
    links.append((job,resp[0][0]['Links']))

links

[('Advance SQL experience', 'https://example.com/angular-portfolio'),
 ('Basic knowledge of Python', 'https://example.com/ml-python-portfolio'),
 ('Google Looker or another visualization tool',
  'https://example.com/ml-python-portfolio'),
 ('Automation projects related to data integration, data extraction, and data visualization',
  'https://example.com/magento-portfolio'),
 ('Advanced excel knowledge', 'https://example.com/ml-python-portfolio'),
 ('VBA', 'https://example.com/ml-python-portfolio'),
 ('Analytics and critical thinking', 'https://example.com/ios-portfolio'),
 ('Looker Studio for visualization',
  'https://example.com/ml-python-portfolio')]
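
With `n_results=1` every skill receives a link, even when the nearest portfolio entry is a weak match (for example, 'Advance SQL experience' landing on the Angular portfolio). The sketch below shows one way to keep only close matches and drop duplicate links, assuming the query result carries the `distances` lists alongside `metadatas`. `pick_links` is a hypothetical helper, the threshold of 1.0 is arbitrary, and the sample result is hand-written with made-up distances.

```python
def pick_links(result: dict, max_distance: float) -> list:
    """Return the unique links of hits whose distance is at or below max_distance."""
    links = []
    # The result holds one inner list per query text; index 0 is the first query
    metadatas = result['metadatas'][0]
    distances = result['distances'][0]
    for meta, dist in zip(metadatas, distances):
        if dist <= max_distance and meta['Links'] not in links:
            links.append(meta['Links'])
    return links

# Hand-written stand-in for a query result; the distances are made up
fake_result = {
    'metadatas': [[{'Links': 'https://example.com/ml-python-portfolio'},
                   {'Links': 'https://example.com/ios-portfolio'}]],
    'distances': [[0.8, 1.7]],
}
print(pick_links(fake_result, max_distance=1.0))
```

A suitable threshold depends on the embedding function and distance metric in use, so it would need tuning against real query results.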

## 7. Cold Email Generation 

### 7.1 Prompt Template for Email Generation

In [24]:
email_template = """

### JOB DESCRIPTION:
{job_descriptions}

### INSTRUCTION:
You are John, working as a Business Development Executive at ABC Corporation. 

ABC is an AI & Software Consulting company dedicated to facilitating the seamless integration of business processes 
through automated tools. 
With our experience, we have empowered numerous enterprises with tailored solutions, fostering scalability, 
process optimization, cost reduction, and improved overall efficiency. 

Your job is to write a convincing cold email to the hiring manager regarding the job description mentioned above 
describing the capability of ABC in fulfilling their needs.

Also add the most relevant links as a list from the following list of links to showcase ABC's portfolio: {links}

Add your formal details with designation at the end as given below:
John,
Business Development Executive,
ABC Corporation

Do not provide a preamble.
### EMAIL (NO PREAMBLE):

"""

gen_email_template = PromptTemplate.from_template(email_template)
gen_email_template

PromptTemplate(input_variables=['job_descriptions', 'links'], input_types={}, partial_variables={}, template="\n\n### JOB DESCRIPTION:\n{job_descriptions}\n\n### INSTRUCTION:\nYou are John, working as a Business Development Executive at ABC Corporation. \n\nABC is an AI & Software Consulting company dedicated to facilitating the seamless integration of business processes \nthrough automated tools. \nWith our experience, we have empowered numerous enterprises with tailored solutions, fostering scalability, \nprocess optimization, cost reduction, and improved overall efficiency. \n\nYour job is to write a convincing cold email to the hiring manager regarding the job description mentioned above \ndescribing the capability of ABC in fulfilling their needs.\n\nAlso add the most relevant links as a list from the following list of links to showcase ABC's portfolio: {links}\n\nAdd your formal details with designation at the end as given below:\nJohn,\nBusiness Development Executive,\nABC Corpor

### 7.2 GROQ API Call to Generate the Cold Email

In [25]:
chain_email = gen_email_template | groq_llm
resp_email = chain_email.invoke(input={'job_descriptions':str(jobs), 'links':links})
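
The chain receives `links` as a Python list of `(skill, link)` tuples, which is stringified into the prompt as-is. A minimal sketch of rendering those pairs as one bullet per unique link before invoking the chain; `format_links` is a hypothetical helper, and the bullet layout is just one possible choice.

```python
def format_links(pairs) -> str:
    """Render (skill, link) pairs as one bullet per unique link."""
    grouped = {}
    for skill, link in pairs:
        # Collect every skill that matched the same portfolio link
        grouped.setdefault(link, []).append(skill)
    return '\n'.join(f'- {link} (matched: {", ".join(skills)})'
                     for link, skills in grouped.items())

pairs = [('VBA', 'https://example.com/ml-python-portfolio'),
         ('Basic knowledge of Python', 'https://example.com/ml-python-portfolio'),
         ('Advance SQL experience', 'https://example.com/angular-portfolio')]
print(format_links(pairs))
```

The formatted string could then be passed as `'links': format_links(links)` in the `invoke` call.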

### 7.3 Generated Email Content

In [26]:
display(Markdown(resp_email.content))

Subject: Unlock the Full Potential of Your Business with ABC Corporation

Dear [Hiring Manager's Name],

I came across your job description for a skilled professional with expertise in advanced SQL, Python, and data visualization tools. I am excited to introduce you to ABC Corporation, a leading AI & Software Consulting company that has been empowering enterprises with tailored solutions for seamless business process integration.

At ABC Corporation, we pride ourselves on delivering cutting-edge automation projects that cater to data integration, data extraction, and data visualization needs. Our team of experts has a proven track record of leveraging advanced excel knowledge, VBA, and analytics to drive business growth and efficiency. We have successfully implemented projects using Google Looker and Looker Studio for visualization, ensuring that our clients gain valuable insights into their operations.

I would like to highlight some of our notable projects that align with your requirements:

- **Advanced SQL Experience**: [https://example.com/angular-portfolio](https://example.com/angular-portfolio)
- **Automation projects related to data integration, data extraction, and data visualization**: [https://example.com/magento-portfolio](https://example.com/magento-portfolio)
- **Google Looker or another visualization tool**: [https://example.com/ml-python-portfolio](https://example.com/ml-python-portfolio)
- **Looker Studio for visualization**: [https://example.com/ml-python-portfolio](https://example.com/ml-python-portfolio)

Our team is passionate about delivering innovative solutions that drive business success. We would be delighted to discuss how ABC Corporation can support your organization's goals and objectives. Please feel free to contact me to schedule a call and explore how we can collaborate.

Thank you for considering ABC Corporation. I look forward to the opportunity to discuss this further.

Best regards,

John,
Business Development Executive,
ABC Corporation
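
The generated text starts with a `Subject:` line followed by the body. If the email is to be handed to a mail client, the two parts need to be separated. A minimal sketch, assuming the subject always sits on the first line; `split_email` is a hypothetical helper.

```python
def split_email(text: str):
    """Separate a leading 'Subject:' line from the rest of the email."""
    first_line, _, rest = text.strip().partition('\n')
    if first_line.lower().startswith('subject:'):
        return first_line[len('subject:'):].strip(), rest.strip()
    # No subject line found; treat everything as body
    return '', text.strip()

subject, body = split_email('Subject: Unlock the Full Potential of Your Business\n\nDear [Hiring Manager\'s Name],')
print(subject)
```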