## Step 1: Set up LLM

In [1]:
from langchain_groq import ChatGroq

In [None]:
llm = ChatGroq(
    model_name="llama-3.3-70b-versatile",
    groq_api_key="YOUR_API_KEY",
    temperature=0, 
)

# Quick check that the model responds
response = llm.invoke("The first person to land on moon was ...")
print(response.content)

The first person to land on the moon was Neil Armstrong. He stepped out of the lunar module Eagle and onto the moon's surface on July 20, 1969, during the Apollo 11 mission. Armstrong famously declared, "That's one small step for man, one giant leap for mankind," as he became the first human to set foot on the moon.
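
Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of one alternative, assuming the key has been exported as an environment variable named `GROQ_API_KEY`; the helper names `get_groq_api_key` and `build_llm` are hypothetical and not part of LangChain:

```python
import os


def get_groq_api_key(env_var: str = "GROQ_API_KEY") -> str:
    """Read the API key from the environment instead of hardcoding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


def build_llm():
    """Create the same ChatGroq client as above, with the key from the environment."""
    # Imported lazily so this cell can be defined without langchain_groq installed.
    from langchain_groq import ChatGroq

    return ChatGroq(
        model_name="llama-3.3-70b-versatile",
        groq_api_key=get_groq_api_key(),
        temperature=0,
    )
```

With this in place, `llm = build_llm()` could stand in for the cell above, and the notebook no longer contains the secret.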


## Step 2: Web scraping

In [3]:
from langchain_community.document_loaders import WebBaseLoader, AsyncHtmlLoader

def web_scraping(url: str):
    try:
        # Try WebBaseLoader first
        loader = WebBaseLoader(url)
        page_data = loader.load().pop().page_content
        
        # If WebBaseLoader returned empty/too small, use AsyncHtmlLoader
        if not page_data or len(page_data) < 100:
            print("⚠️ WebBaseLoader returned blank, switching to AsyncHtmlLoader...")
            loader = AsyncHtmlLoader(url)
            page_data = loader.load().pop().page_content
            
        return page_data

    except Exception as e:
        print("❌ Error in loaders:", e)
        return ""

url = "https://intel.wd1.myworkdayjobs.com/en-US/External/details/GenAI-Software-Solutions-Engineer_JR0277516-1?locations=1e4a4eb3adf1016620eafb74bf81d9cd&locations=1e4a4eb3adf101740c9ff674bf81d4cd"
page_data = web_scraping(url)
print(page_data)

USER_AGENT environment variable not set, consider setting it to identify your requests.


⚠️ WebBaseLoader returned blank, switching to AsyncHtmlLoader...


Fetching pages: 100%|############################| 1/1 [00:00<00:00,  1.01it/s]

<!DOCTYPE html>
<html lang="en-US">
<head>
    <title></title>
    <!-- Application Properties -->
    <meta http-equiv="X-UA-Compatible" content="chrome=1;IE=EDGE"/>
    <meta http-equiv="content-type" content="text/html; charset=UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=2.0">

    <link rel="canonical" href="https://intel.wd1.myworkdayjobs.com/en-US/External/job/GenAI-Software-Solutions-Engineer_JR0277516-1" />

    <!-- OpenGraph Tags -->
    <meta name="title" property="og:title" content="GenAI Software Solutions Engineer">
    <meta name="description" property="og:description" content="Job Details: Job Description: The ideal candidate is hands-on with AI systems engineering, has experience integrating multiple models and runtimes, and is passionate about building secure, scalable, and efficient AI solutions that power next-generation agentic applications. Responsibilities Hybrid AI Agent Development: Architect, build, and optim




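The fallback loader returns raw HTML, so the prompt in the next step receives tags and markup along with the job text. A rough sketch of reducing the page to text with only the standard library, assuming (as in the output above) that the job summary sits in an `og:description` meta tag; `html_to_text` is a hypothetical helper, not a LangChain function:

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    _SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1
        # The page above exposes the job summary in an og:description meta tag.
        if tag == "meta":
            attr = dict(attrs)
            if attr.get("property") == "og:description" and attr.get("content"):
                self.chunks.append(attr["content"])

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Return the meta description and visible text, one chunk per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Calling `page_data = html_to_text(page_data)` before Step 3 would likely shrink the prompt considerably, at the cost of losing any information that only lives in embedded scripts.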
## Step 3: Extract important info from job posting

In [7]:
from langchain_core.prompts import PromptTemplate

prompt_extract_1 = PromptTemplate.from_template(
        """
        ### SCRAPED TEXT FROM WEBSITE:
        {page_data}
        ### INSTRUCTION:
        The scraped text is from the careers page of a website.
        Your job is to extract the job postings and return them in JSON format containing the 
        following keys: `role`, `experience`, `skills` and `description`.
        Only return the valid JSON.
        ### VALID JSON (NO PREAMBLE):    
        """
)

# Equivalent to the legacy LLMChain(llm=llm, prompt=prompt_extract_1)
job_chain_extract = prompt_extract_1 | llm  # pipe the formatted prompt into the LLM
job = job_chain_extract.invoke(input={'page_data':page_data})

In [8]:
from langchain_core.output_parsers import JsonOutputParser

json_parser = JsonOutputParser()  # parses the LLM's string output as JSON
json_job = json_parser.parse(job.content)  # job.content is a str; parse() returns a dict
json_job

{'role': 'GenAI Software Solutions Engineer',
 'experience': '2+ years hands-on experience on AI/ML algorithm development, 1+ years of hands-on experience in NLP, LLM-based systems, or AI agent development',
 'skills': ['Deep expertise in GenAI algorithms, solution architecture, and performance tuning',
  'Proven experience building custom AI tools, agents, or apps for real-world use cases',
  'Strong Python or C++ skills',
  'Excellent problem-solving skills with a results-driven, customer-focused mindset',
  'Familiarity with client AI tools, cross-platform agents, or plugin ecosystems',
  'Experience with RAG pipelines, vector databases (e.g.,FAISS, Chroma), and embedding techniques',
  'Experience optimizing GenAI workloads for edge devices using xPU accelerators',
  'Experience with local LLMs (e.g., Mistral, Llama) or fine-tuning open-source models',
  'Experience in customer/partner support for GenAI workflow design and deployment',
  'Experience with frameworks such as LangChai
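
The prompt asks for "job postings" in the plural, so the model may return a list of postings or omit a key, while Step 7 indexes `json_job['skills']` directly. A small defensive sketch, assuming a single posting is wanted; `normalize_job` and `REQUIRED_JOB_KEYS` are hypothetical names introduced here:

```python
REQUIRED_JOB_KEYS = ("role", "experience", "skills", "description")


def normalize_job(parsed):
    """Return a single job dict with all expected keys, or raise ValueError."""
    # The prompt asks for "job postings", so the model may return a list.
    if isinstance(parsed, list):
        if not parsed:
            raise ValueError("The model returned an empty list of postings.")
        parsed = parsed[0]
    if not isinstance(parsed, dict):
        raise ValueError(f"Expected a dict, got {type(parsed).__name__}.")

    missing = [key for key in REQUIRED_JOB_KEYS if key not in parsed]
    if missing:
        raise ValueError(f"Missing keys in extracted job: {missing}")

    job = dict(parsed)
    # Step 7 passes the skills as a list of query texts, so wrap a lone string.
    if isinstance(job["skills"], str):
        job["skills"] = [job["skills"]]
    return job
```

Running `json_job = normalize_job(json_job)` right after parsing would surface a malformed reply here rather than as a `KeyError` several cells later.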

## Step 4: Scrape text from resume

In [11]:
from langchain_community.document_loaders import PyPDFLoader, Docx2txtLoader

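def resume_scraping(pdf_path: str) -> str:
    # Pick the loader from the file extension (case-insensitive)
    path = pdf_path.lower()
    if path.endswith(".pdf"):
        loader = PyPDFLoader(pdf_path)
    elif path.endswith(".docx"):
        loader = Docx2txtLoader(pdf_path)
    else:
        # Fail loudly instead of returning an undefined variable
        raise ValueError(f"Unsupported resume format: {pdf_path}")

    # PyPDFLoader returns one document per page, so join all of them
    # rather than keeping only the last page
    return "\n".join(doc.page_content for doc in loader.load())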

pdf_path = "../Data/Resume.pdf"   # local file
resume = resume_scraping(pdf_path)
print(resume)

DENNIS
SCHERRER
IT Engineer
d.scherrer38@email.com
(123) 456-7890
Houston, TX
LinkedIn
EDUCATION
Bachelor of Science
Computer Science
Texas A&M University
2005 - 2009
College Station, TX
SKILLS
Python
Microsoft 365
Agile Project Management
Network Infrastructure
Troubleshooting
Windows/Apple OS
VPN Maintenance
Verbal Communication
Customer Service
CERTIFICATIONS
MCSE
CCNA
WORK EXPERIENCE
IT Engineer
Loomis Armored US, LLC
2020 - current Houston, TX
Hired 11 technicians and instructed them in Agile project
management, increasing efﬁciency by 39%
Drafted troubleshooting guides for common technical
strategies, decreasing average ticket resolution time by 48%
Collaborated with 13 techs to upgrade VPN security,
including updating encryption methods and adding
antivirus protection, reducing chances of a breach by 67%
Developed and enhanced product security systems,
meeting 100% of client requirements
Network Engineer
ADP
2017 - 2020 Houston, TX
Created and reorganized SQL queries and scripts
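
PDF text extraction can emit typographic ligatures (a single "fi" glyph instead of the letters f and i) and uneven whitespace. A minimal cleanup sketch using Unicode NFKC normalization from the standard library; `clean_resume_text` is a hypothetical helper name:

```python
import unicodedata


def clean_resume_text(text: str) -> str:
    """Normalize compatibility characters and tidy whitespace line by line."""
    # NFKC maps typographic ligatures (e.g. U+FB01) to their plain-letter forms.
    normalized = unicodedata.normalize("NFKC", text)
    # Collapse runs of whitespace inside each line and drop empty lines.
    lines = (" ".join(line.split()) for line in normalized.splitlines())
    return "\n".join(line for line in lines if line)
```

Note that NFKC also rewrites other compatibility characters (full-width letters, non-breaking spaces), which is usually acceptable for resume text but worth keeping in mind.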

## Step 5: Extract important info from the resume (PDF/DOCX)

In [12]:
from langchain_core.prompts import PromptTemplate

prompt_extract_2 = PromptTemplate.from_template(
        """
        ### SCRAPED TEXT FROM RESUME:
        {page_data}
        ### INSTRUCTION:
        The scraped text is from a resume uploaded by the user as a PDF or DOCX file.
        Your job is to extract the resume details and return them in JSON format containing the
        following keys: `education`, `experience` and `skills`.
        Only return the valid JSON.
        ### VALID JSON (NO PREAMBLE):    
        """
)

resume_chain_extract = prompt_extract_2 | llm
# Keep the raw resume text in `resume` and store the LLM reply under its own name,
# so this cell can be re-run without feeding the previous reply back in.
resume_response = resume_chain_extract.invoke(input={'page_data': resume})

In [13]:
from langchain_core.output_parsers import JsonOutputParser

json_parser = JsonOutputParser()  # parses the LLM's string output as JSON
json_resume = json_parser.parse(resume_response.content)  # .content is a str; parse() returns a dict
json_resume

{'education': [{'degree': 'Bachelor of Science',
   'field': 'Computer Science',
   'university': 'Texas A&M University',
   'location': 'College Station, TX',
   'duration': '2005 - 2009'}],
 'experience': [{'jobTitle': 'IT Engineer',
   'company': 'Loomis Armored US, LLC',
   'location': 'Houston, TX',
   'duration': '2020 - current',
   'achievements': ['Hired 11 technicians and instructed them in Agile project management, increasing efficiency by 39%',
    'Drafted troubleshooting guides for common technical strategies, decreasing average ticket resolution time by 48%',
    'Collaborated with 13 techs to upgrade VPN security, including updating encryption methods and adding antivirus protection, reducing chances of a breach by 67%',
    'Developed and enhanced product security systems, meeting 100% of client requirements']},
  {'jobTitle': 'Network Engineer',
   'company': 'ADP',
   'location': 'Houston, TX',
   'duration': '2017 - 2020',
   'achievements': ['Created and reorganize

## Step 6: Store the embedded resume in a vector DB

In [21]:
import uuid
import chromadb
import json

client = chromadb.PersistentClient('vectorstore') 
collection = client.get_or_create_collection(name="resume")

if not collection.count():
    # Education
    for edu in json_resume.get("education", []):  # default to [] if the key is missing
        collection.add(
            documents=[json.dumps(edu)],  # store each education entry as a JSON string
            metadatas=[{"type": "education"}],
            ids=[str(uuid.uuid4())]
        )
        
    # Experience
    for exp in json_resume.get("experience", []):  # a missing key yields no iterations
        collection.add(
            documents=[json.dumps(exp)],
            metadatas=[{"type": "experience"}],
            ids=[str(uuid.uuid4())]
        )

    # Skills
    for skill in json_resume.get("skills", []):
        collection.add(
            documents=[json.dumps(skill)],
            metadatas=[{"type": "skills"}],
            ids=[str(uuid.uuid4())]
        )
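
Because the IDs above are random UUIDs and the whole block is guarded by `collection.count()`, a changed resume is never re-indexed unless the store is deleted. One possible alternative, sketched under the assumption that content-derived IDs are acceptable: build all records in one pass with deterministic IDs, then write them in a single call. `build_records` is a hypothetical helper:

```python
import hashlib
import json


def build_records(parsed_resume: dict):
    """Flatten the resume dict into parallel lists for one bulk write."""
    documents, metadatas, ids = [], [], []
    seen = set()
    for section in ("education", "experience", "skills"):
        for item in parsed_resume.get(section, []):
            doc = json.dumps(item, sort_keys=True)
            # Content-derived ID: the same entry always gets the same ID.
            record_id = hashlib.sha256(f"{section}:{doc}".encode("utf-8")).hexdigest()
            if record_id in seen:
                continue  # skip exact duplicates within the same resume
            seen.add(record_id)
            documents.append(doc)
            metadatas.append({"type": section})
            ids.append(record_id)
    return documents, metadatas, ids


# Usage with the collection created above:
# documents, metadatas, ids = build_records(json_resume)
# collection.upsert(documents=documents, metadatas=metadatas, ids=ids)
```

With `upsert`, re-running the cell overwrites matching entries instead of duplicating them, though entries removed from the resume would still need to be deleted separately.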

## Step 7: Query the resume with the job skills

In [22]:
education = collection.query(
    query_texts=json_job['skills'],
    n_results=1, 
    where={"type": "education"}
)

education = education['documents']
education
# One inner list per query text (i.e. per job skill), each holding n_results documents

[['{"degree": "Bachelor of Science", "field": "Computer Science", "university": "Texas A&M University", "location": "College Station, TX", "duration": "2005 - 2009"}'],
 ['{"degree": "Bachelor of Science", "field": "Computer Science", "university": "Texas A&M University", "location": "College Station, TX", "duration": "2005 - 2009"}'],
 ['{"degree": "Bachelor of Science", "field": "Computer Science", "university": "Texas A&M University", "location": "College Station, TX", "duration": "2005 - 2009"}'],
 ['{"degree": "Bachelor of Science", "field": "Computer Science", "university": "Texas A&M University", "location": "College Station, TX", "duration": "2005 - 2009"}'],
 ['{"degree": "Bachelor of Science", "field": "Computer Science", "university": "Texas A&M University", "location": "College Station, TX", "duration": "2005 - 2009"}'],
 ['{"degree": "Bachelor of Science", "field": "Computer Science", "university": "Texas A&M University", "location": "College Station, TX", "duration": "200

In [23]:
experience = collection.query(
    query_texts=json_job['skills'],
    n_results=2, 
    where={"type": "experience"}
)

experience = experience['documents']
experience

[['{"jobTitle": "IT Engineer", "company": "Loomis Armored US, LLC", "location": "Houston, TX", "duration": "2020 - current", "achievements": ["Hired 11 technicians and instructed them in Agile project management, increasing efficiency by 39%", "Drafted troubleshooting guides for common technical strategies, decreasing average ticket resolution time by 48%", "Collaborated with 13 techs to upgrade VPN security, including updating encryption methods and adding antivirus protection, reducing chances of a breach by 67%", "Developed and enhanced product security systems, meeting 100% of client requirements"]}',
  '{"jobTitle": "Network Engineer", "company": "ADP", "location": "Houston, TX", "duration": "2017 - 2020", "achievements": ["Created and reorganized SQL queries and scripts for internal troubleshooting, decreasing work tickets by 28%", "Analyzed escalated tickets and coached junior techs to resolve 84% of excessive escalations", "Analyzed diagnostic data to understand causes/correlat

In [24]:
skills = collection.query(
    query_texts=json_job['skills'],
    n_results=3, 
    where={"type": "skills"}
)

skills = skills['documents']
skills

[['"Network Infrastructure"', '"VPN Maintenance"', '"Python"'],
 ['"Python"', '"Microsoft 365"', '"Troubleshooting"'],
 ['"Python"', '"Network Infrastructure"', '"Agile Project Management"'],
 ['"Customer Service"', '"Troubleshooting"', '"Agile Project Management"'],
 ['"Microsoft 365"', '"Python"', '"Network Infrastructure"'],
 ['"Python"', '"Customer Service"', '"Network Infrastructure"'],
 ['"Network Infrastructure"', '"Microsoft 365"', '"Windows/Apple OS"'],
 ['"Agile Project Management"',
  '"Network Infrastructure"',
  '"Troubleshooting"'],
 ['"Customer Service"', '"Agile Project Management"', '"VPN Maintenance"'],
 ['"Python"', '"Microsoft 365"', '"Agile Project Management"'],
 ['"Troubleshooting"', '"Python"', '"Verbal Communication"']]
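
Each query returns one inner list per job skill, so the same resume entry shows up many times, and every document is still a JSON-encoded string. A short sketch of flattening, decoding, and de-duplicating these results before they reach the email prompt; `flatten_unique` is a hypothetical helper:

```python
import json


def flatten_unique(documents):
    """Flatten per-query result lists, decode the JSON, and drop repeats."""
    seen = set()
    unique = []
    for per_query in documents:
        for raw in per_query:
            if raw in seen:
                continue
            seen.add(raw)
            unique.append(json.loads(raw))  # undo the json.dumps from Step 6
    return unique
```

Applied to the first two rows of the skills output above, this would give `['Network Infrastructure', 'VPN Maintenance', 'Python', 'Microsoft 365', 'Troubleshooting']`, preserving the order of first appearance.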

## Step 8: Generate email

In [25]:
prompt_email = PromptTemplate.from_template(
        """
        ### JOB DESCRIPTION:
        {job_description}
        
        ### INSTRUCTION:
        You are the person described in the resume: a job seeker who writes excellent emails.
        Your job is to write a personalized cold email to the company to apply for the job mentioned above, describing your capability
        in fulfilling their needs.
        Make the email **professional**, **confident**, and **concise**.
        Mention the company name for more personalization.
        If you don't know the company name, leave it as [Company Name].
        Also add the most relevant of the following education, experience and skills to showcase yourself: {education}, {experience} and {skills}
        Remember that you are the person described in the resume; write in the first person.
        If you don't know your name, leave it as [Your Name].
        Do not invent or write anything that is not in the resume.
        A recruiter may skim emails, so keep the email short and to the point.
        Do not provide a preamble.
        ### EMAIL (NO PREAMBLE):
        
        """
        )

chain_email = prompt_email | llm
res = chain_email.invoke({"job_description": str(json_job), "education": education, "experience": experience, "skills": skills})
print(res.content)

Subject: Application for GenAI Software Solutions Engineer at [Company Name]

Dear Hiring Manager,

I am [Your Name], a highly motivated and experienced IT professional with a strong background in AI/ML algorithm development, NLP, and LLM-based systems. With over 2 years of hands-on experience in AI/ML and 1 year of experience in NLP, I am confident in my ability to fulfill the requirements of the GenAI Software Solutions Engineer role at [Company Name].

As a seasoned IT Engineer with experience at Loomis Armored US, LLC, and ADP, I possess deep expertise in Python, Network Infrastructure, and Agile Project Management. My achievements include increasing efficiency by 39% through Agile project management, decreasing average ticket resolution time by 48%, and collaborating with teams to upgrade VPN security, reducing breach chances by 67%.

I am excited about the opportunity to leverage my skills and experience to contribute to the development of secure, scalable, and efficient AI solut
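
The prompt tells the model to fall back to `[Company Name]` and `[Your Name]`, and the generated email above still contains both. A small sketch of a check that could run before the email is sent; `find_placeholders` and `PLACEHOLDER_PATTERN` are hypothetical names:

```python
import re

# The two fallback placeholders named in the prompt above.
PLACEHOLDER_PATTERN = re.compile(r"\[(?:Company Name|Your Name)\]")


def find_placeholders(email_text: str):
    """Return the distinct placeholders still present, in order of first appearance."""
    found = []
    for match in PLACEHOLDER_PATTERN.findall(email_text):
        if match not in found:
            found.append(match)
    return found
```

A non-empty result from `find_placeholders(res.content)` signals that the name or company still has to be filled in by hand. It does not catch unsupported claims in the body, which would need a separate review against the resume.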