In [1]:
from langchain_groq import ChatGroq
from dotenv import load_dotenv
import os

# Load the .env file
load_dotenv()

# Get the API key from the environment variable
api_key = os.getenv('GROQ_API_KEY')
if not api_key:
    raise ValueError("GROQ_API_KEY is not set in the .env file")

# ChatGroq reads GROQ_API_KEY from the environment, which load_dotenv() has already populated

# Initialize the ChatGroq object
llm = ChatGroq(
    temperature=0,
    model_name="llama3-70b-8192",
)

response = llm.invoke("The first person to land on the moon was . . .")
print(response.content)

Neil Armstrong! On July 20, 1969, Neil Armstrong became the first person to set foot on the moon, famously declaring "That's one small step for man, one giant leap for mankind" as he stepped off the lunar module Eagle onto the moon's surface.


In [2]:
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://amazon.jobs/en/jobs/2879224/applied-scientist-generative-ai-innovation-center")
page_data = loader.load().pop().page_content
print(page_data)

USER_AGENT environment variable not set, consider setting it to identify your requests.


Applied Scientist, Generative AI Innovation Center - Job ID: 2879224 | Amazon.jobs
Skip to main contentHomeTeamsLocationsJob categoriesMy careerMy applicationsMy profileAccount securitySettingsSign outResourcesDisability accommodationsBenefitsDiversity, equity, and inclusionInterview tipsLeadership principlesWorking at AmazonFAQ×Applied Scientist, Generative AI Innovation CenterJob ID: 2879224 | AWS EMEA SARL (Spain Branch)Apply nowDESCRIPTIONMachine learning (ML) has been strategic to Amazon from the early years. We are pioneers in areas such as recommendation engines, product search, eCommerce fraud detection, and large-scale optimization of fulfillment center operations. The Generative AI team helps AWS customers accelerate the use of Generative AI to solve business and operational challenges and promote innovation in their organization. As an applied scientist, you are proficient in designing and developing advanced ML models to solve diverse challenges and opportunities. You will 
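
The scraped page text above contains runs of blank lines and irregular spacing, which waste prompt tokens. A minimal sketch of a whitespace normalizer that could be applied to `page_data` before it is sent to the model follows. The helper name `clean_text` is hypothetical and not part of the original notebook, and it does not repair words that the page itself fuses together (such as the navigation labels).

```python
import re

def clean_text(text: str) -> str:
    """Collapse runs of whitespace in scraped text into single spaces."""
    # Replace any run of whitespace (newlines, tabs, repeated spaces) with one space
    text = re.sub(r"\s+", " ", text)
    # Drop leading and trailing whitespace
    return text.strip()

sample = "\n\n\nApplied Scientist,   Generative AI\tInnovation Center\n\n"
print(clean_text(sample))
```

In the notebook this would be used as `page_data = clean_text(page_data)` right after loading the page.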

In [3]:
from langchain_core.prompts import PromptTemplate

prompt_extract = PromptTemplate.from_template(
    """
    ### SCRAPED TEXT FROM WEBSITE:
    {page_data}
    
    ### INSTRUCTION:
    The scraped text is from the careers page of a website.
    Your job is to extract the job postings and return them in JSON format 
    containing the following keys: `role`, `experience`, `skills`, and `description`.
    Only return the valid JSON.
    
    ### VALID JSON (NO PREAMBLE):
    """
)
chain_extract = prompt_extract | llm
res = chain_extract.invoke(input={'page_data': page_data})
print(res.content)

Here is the extracted job posting in JSON format:

```
{
  "role": "Applied Scientist, Generative AI Innovation Center",
  "experience": "PhD, or a Master's degree and experience in CS, CE, ML or related field",
  "skills": [
    "Java",
    "C++",
    "Python",
    "Unix/Linux",
    "algorithms and data structures",
    "parsing",
    "numerical optimization",
    "data mining",
    "parallel and distributed computing",
    "high-performance computing"
  ],
  "description": "Design, develop, and evaluate innovative ML models to solve diverse challenges and opportunities across industries. Interact with customer directly to understand their business problems, and help them with defining and implementing scalable Generative AI solutions to solve them. Work closely with account teams, research scientist teams, and product engineering teams to drive model implementations and new solution."
}
```
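
Note that the model added a preamble and a code fence even though the prompt asked for JSON only, so the raw string cannot be passed straight to `json.loads`. The notebook handles this with LangChain's `JsonOutputParser`. Purely to illustrate the idea, here is a standard-library sketch that pulls the JSON object out of such a reply. The helper `extract_json` and the shortened sample reply are assumptions made for this example, and this is not how LangChain implements its parser.

```python
import json
import re

FENCE = "`" * 3  # built programmatically so the example does not contain a literal fence

def extract_json(text: str):
    """Parse the JSON object embedded in `text`, ignoring any preamble or code fence."""
    # Prefer the contents of a fenced block if one is present
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    candidate = fenced.group(1) if fenced else text
    # Keep only the span between the first "{" and the last "}"
    start, end = candidate.find("{"), candidate.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found")
    return json.loads(candidate[start:end + 1])

raw = (
    "Here is the extracted job posting in JSON format:\n\n"
    + FENCE + "\n"
    + '{"role": "Applied Scientist", "skills": ["Java", "Python"]}\n'
    + FENCE
)
print(extract_json(raw)["skills"])
```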


In [4]:
type(res.content)

str

In [5]:
from langchain_core.output_parsers import JsonOutputParser

json_parser = JsonOutputParser()
json_res = json_parser.parse(res.content)
json_res

{'role': 'Applied Scientist, Generative AI Innovation Center',
 'experience': "PhD, or a Master's degree and experience in CS, CE, ML or related field",
 'skills': ['Java',
  'C++',
  'Python',
  'Unix/Linux',
  'algorithms and data structures',
  'parsing',
  'numerical optimization',
  'data mining',
  'parallel and distributed computing',
  'high-performance computing'],
 'description': 'Design, develop, and evaluate innovative ML models to solve diverse challenges and opportunities across industries. Interact with customer directly to understand their business problems, and help them with defining and implementing scalable Generative AI solutions to solve them. Work closely with account teams, research scientist teams, and product engineering teams to drive model implementations and new solution.'}

In [6]:
type(json_res)

dict
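
Here the parser returned a single `dict`, but a careers page listing several postings could plausibly yield a list of objects instead, and the model may omit a key. A small sketch of a guard that normalizes both cases and checks for the four keys requested in the prompt is shown below. The names `REQUIRED_KEYS` and `normalize_jobs` are hypothetical and are not part of the original notebook.

```python
REQUIRED_KEYS = {"role", "experience", "skills", "description"}

def normalize_jobs(parsed):
    """Return a list of job dicts, raising if any posting lacks a required key."""
    # The model may return one object or a list of objects; treat both uniformly
    jobs = parsed if isinstance(parsed, list) else [parsed]
    for job in jobs:
        missing = REQUIRED_KEYS - set(job)
        if missing:
            raise KeyError(f"job posting is missing keys: {sorted(missing)}")
    return jobs

example = {
    "role": "Applied Scientist",
    "experience": "PhD",
    "skills": ["Python"],
    "description": "Design ML models.",
}
jobs = normalize_jobs(example)
print(len(jobs))
```

With the notebook's variables, `normalize_jobs(json_res)` would give a list that later cells can iterate over.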

In [7]:
import pandas as pd

df = pd.read_csv("Cold_email_data.csv")
df

Unnamed: 0,Techstack,Links
0,"React, Node.js, MongoDB",https://example.com/react-portfolio
1,"Angular,.NET, SQL Server",https://example.com/angular-portfolio
2,"Vue.js, Ruby on Rails, PostgreSQL",https://example.com/vue-portfolio
3,"Python, Django, MySQL",https://example.com/python-portfolio
4,"Java, Spring Boot, Oracle",https://example.com/java-portfolio
5,"Flutter, Firebase, GraphQL",https://example.com/flutter-portfolio
6,"WordPress, PHP, MySQL",https://example.com/wordpress-portfolio
7,"Magento, PHP, MySQL",https://example.com/magento-portfolio
8,"React Native, Node.js, MongoDB",https://example.com/react-native-portfolio
9,"iOS, Swift, Core Data",https://example.com/ios-portfolio


In [8]:
import uuid
import chromadb

client = chromadb.PersistentClient('vectorstore')
collection = client.get_or_create_collection(name="portfolio")

# Populate the collection only on the first run; later runs reuse the persisted data
if not collection.count():
    for _, row in df.iterrows():
        collection.add(documents=[row["Techstack"]],
                       metadatas=[{"links": row["Links"]}],
                       ids=[str(uuid.uuid4())])
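
The loop above calls `collection.add` once per row. Since `add` also accepts parallel lists of documents, metadata, and IDs, the rows can be collected first and inserted in one call. The sketch below only builds those lists, so it runs without Chroma installed. The helper `build_add_payload` is a hypothetical name, and the two sample rows are copied from the CSV shown earlier.

```python
import uuid

def build_add_payload(rows):
    """Turn (techstack, link) pairs into parallel lists of documents, metadata and IDs."""
    documents, metadatas, ids = [], [], []
    for techstack, link in rows:
        documents.append(techstack)            # text that will be embedded
        metadatas.append({"links": link})      # returned alongside query matches
        ids.append(str(uuid.uuid4()))          # random unique ID per document
    return {"documents": documents, "metadatas": metadatas, "ids": ids}

rows = [
    ("React, Node.js, MongoDB", "https://example.com/react-portfolio"),
    ("Python, Django, MySQL", "https://example.com/python-portfolio"),
]
payload = build_add_payload(rows)
print(len(payload["ids"]))
```

With the notebook's objects, this would be used as `collection.add(**build_add_payload(zip(df["Techstack"], df["Links"])))` inside the same `if not collection.count():` guard.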

In [9]:
job = json_res
job['skills']

['Java',
 'C++',
 'Python',
 'Unix/Linux',
 'algorithms and data structures',
 'parsing',
 'numerical optimization',
 'data mining',
 'parallel and distributed computing',
 'high-performance computing']

In [10]:
links = collection.query(query_texts=job['skills'], n_results=2).get('metadatas', [])
links

[[{'links': 'https://example.com/java-portfolio'},
  {'links': 'https://example.com/android-portfolio'}],
 [{'links': 'https://example.com/ml-python-portfolio'},
  {'links': 'https://example.com/magento-portfolio'}],
 [{'links': 'https://example.com/ml-python-portfolio'},
  {'links': 'https://example.com/python-portfolio'}],
 [{'links': 'https://example.com/magento-portfolio'},
  {'links': 'https://example.com/java-portfolio'}],
 [{'links': 'https://example.com/ml-python-portfolio'},
  {'links': 'https://example.com/magento-portfolio'}],
 [{'links': 'https://example.com/ml-python-portfolio'},
  {'links': 'https://example.com/magento-portfolio'}],
 [{'links': 'https://example.com/ml-python-portfolio'},
  {'links': 'https://example.com/wordpress-portfolio'}],
 [{'links': 'https://example.com/ml-python-portfolio'},
  {'links': 'https://example.com/magento-portfolio'}],
 [{'links': 'https://example.com/ml-python-portfolio'},
  {'links': 'https://example.com/xamarin-portfolio'}],
 [{'links'
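
Because one query is issued per skill, the result is a list of lists in which the same portfolio link appears many times. A minimal sketch of flattening and de-duplicating that structure before it goes into the email prompt is given below. The helper `unique_links` is hypothetical, and the sample input reuses the first three result groups shown above.

```python
def unique_links(metadatas):
    """Flatten per-query metadata lists into an ordered list of unique URLs."""
    seen = set()
    result = []
    for per_query in metadatas:
        for meta in per_query:
            link = meta.get("links")
            # Keep the first occurrence of each link, preserving query order
            if link and link not in seen:
                seen.add(link)
                result.append(link)
    return result

sample = [
    [{"links": "https://example.com/java-portfolio"},
     {"links": "https://example.com/android-portfolio"}],
    [{"links": "https://example.com/ml-python-portfolio"},
     {"links": "https://example.com/magento-portfolio"}],
    [{"links": "https://example.com/ml-python-portfolio"},
     {"links": "https://example.com/python-portfolio"}],
]
print(unique_links(sample))
```

Passing `unique_links(links)` instead of `links` to the email chain would keep the prompt shorter and avoid repeated URLs.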

In [11]:
prompt_email = PromptTemplate.from_template(
        """
        ### JOB DESCRIPTION:
        {job_description}
        
        ### INSTRUCTION:
        You are Mohan, a business development executive at AtliQ. AtliQ is an AI & Software Consulting company dedicated to facilitating
        the seamless integration of business processes through automated tools.
        Over the years, we have empowered numerous enterprises with tailored solutions, fostering scalability,
        process optimization, cost reduction, and heightened overall efficiency.
        Your job is to write a cold email to the client regarding the job mentioned above, describing AtliQ's capability
        to fulfill their needs.
        Also add the most relevant ones from the following links to showcase AtliQ's portfolio: {link_list}
        Remember you are Mohan, BDE at AtliQ. 
        Do not provide a preamble.
        ### EMAIL (NO PREAMBLE):
        
        """
        )

chain_email = prompt_email | llm
res = chain_email.invoke({"job_description": str(job), "link_list": links})
print(res.content)

Subject: Expertise in Generative AI Innovation for Your Business Challenges

Dear [Hiring Manager's Name],

I came across the job posting for an Applied Scientist, Generative AI Innovation Center, and I'm excited to introduce AtliQ, a leading AI & Software Consulting company, as a potential partner to help you design, develop, and evaluate innovative ML models to solve diverse challenges and opportunities across industries.

Our team of experts has extensive experience in CS, CE, ML, and related fields, with proficiency in Java, C++, Python, Unix/Linux, algorithms and data structures, parsing, numerical optimization, data mining, parallel and distributed computing, and high-performance computing. We have a proven track record of delivering scalable Generative AI solutions that drive business growth and efficiency.

At AtliQ, we understand the importance of understanding customer business problems and defining and implementing tailored solutions to address them. Our team is well-equippe