## Groq Llama integration
This section sets up the LLM used by the rest of the notebook: a Llama 3.1 70B model served by [Groq](https://groq.com) and accessed through LangChain's `ChatGroq` wrapper. Every later step (extracting the job posting and drafting the email) calls this `llm` object.

In [2]:
import os

from langchain_groq import ChatGroq

# Read the key from the environment instead of hard-coding it in the notebook.
llm = ChatGroq(
    temperature=0,
    groq_api_key=os.environ["GROQ_API_KEY"],
    model_name="llama-3.1-70b-versatile"
)

### Job Description Web Scraping
Web scraping lets us tailor our prompts and queries to the specific job we are applying for, which gives the Groq Llama integration concrete text to work from.

In [3]:
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader("https://nutanix.dejobs.org/san-jose-ca/systems-reliability-engineer-intern-san-jose-ca-summer-2025-undergraduate-only/FF33B9F348D44EF1874E2147AF635482/job/")
page_data = loader.load().pop().page_content
print(page_data)

USER_AGENT environment variable not set, consider setting it to identify your requests.

Nutanix Jobs - Systems Reliability Engineer Intern, San Jose, CA (Summer 2025 - Undergraduate Only) in San Jose, California, United States

We use cookies to improve your experience on our site. To find out more, read our privacy policy.
Accept

Nutanix Jobs
What
job title, keywords
Where
city, state, country
Home
View All Jobs (535)

Job Information
Nutanix
Systems Reliability Engineer Intern, San Jose, CA (Summer 2025 - Undergraduate Only) in San Jose, California

Hungry, Humble, Honest, with Heart.
The Opportunity
This is a 12-week internship with start dates beginning in  May through June 2025,  contingent on your availability. You will work in our San Jose, CA office (3 days/week) and remotely (2 days/week).  Submit your application by October 4, 2024.
Build on your studies and gain profess
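
The text returned by the loader typically contains long runs of blank lines left over from stripped markup. A minimal sketch of one way to compact it before sending it to the model, using only the standard library. `clean_text` is a hypothetical helper, not part of LangChain, and the sample string is made up for illustration:

```python
import re


def clean_text(text: str) -> str:
    """Collapse the whitespace left behind after HTML tags are stripped."""
    # Trim every line and drop the ones that end up empty.
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]
    # Collapse runs of spaces and tabs inside each remaining line.
    return "\n".join(re.sub(r"[ \t]+", " ", line) for line in lines)


# Made-up sample that mimics the shape of scraped page text.
sample = "\n\n\n  Nutanix Jobs\n\n\n\nWhat\n\n   job title,   keywords\n\n"
cleaned = clean_text(sample)
print(cleaned)
```

Compacting the text this way keeps the prompt shorter, which matters when the page is large relative to the model's context window.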

In [4]:
from langchain_core.prompts import PromptTemplate

prompt_extract = PromptTemplate.from_template(
    """
    ### SCRAPED TEXT FROM WEBSITE:
    {page_data}
    ### INSTRUCTION:
    The scraped text is from the careers page of a website.
    Your job is to extract the first visible job posting and return it in JSON format containing the following keys: `company`, `role`, `experience`, `skills`, `company_description` and `role_description`.
    Only return the valid JSON.
    ### VALID JSON (NO PREAMBLE):
    """
)

chain_extract = prompt_extract | llm
res = chain_extract.invoke(input={'page_data':page_data})
print(res.content)

```json
{
  "company": "Nutanix",
  "role": "Systems Reliability Engineer Intern",
  "experience": "Intern",
  "skills": [
    "Linux",
    "Networking",
    "Virtualization",
    "Storage technologies"
  ],
  "company_description": "Hungry, Humble, Honest, with Heart. Our Worldwide Support team delivers industry-leading customer experiences to 24,000+ happy customers in over 200 countries.",
  "role_description": "Work with our enterprise customers to provide technical engineering support and an enriching customer experience, troubleshooting and resolving complex technical issues to shape the allegiance and success of our growing customer base."
}
```


In [5]:
from langchain_core.output_parsers import JsonOutputParser

json_parser = JsonOutputParser()
json_res = json_parser.parse(res.content)
json_res

{'company': 'Nutanix',
 'role': 'Systems Reliability Engineer Intern',
 'experience': 'Intern',
 'skills': ['Linux', 'Networking', 'Virtualization', 'Storage technologies'],
 'company_description': 'Hungry, Humble, Honest, with Heart. Our Worldwide Support team delivers industry-leading customer experiences to 24,000+ happy customers in over 200 countries.',
 'role_description': 'Work with our enterprise customers to provide technical engineering support and an enriching customer experience, troubleshooting and resolving complex technical issues to shape the allegiance and success of our growing customer base.'}
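
The model wrapped its reply in a Markdown code fence, and the parser still returned a plain dictionary. A rough standard-library sketch of that idea, for readers who want to see what such parsing involves. `parse_fenced_json` is a hypothetical helper and is not how `JsonOutputParser` is implemented:

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the example stays readable


def parse_fenced_json(reply: str) -> dict:
    """Parse a JSON object that may be wrapped in a Markdown code fence."""
    pattern = FENCE + r"(?:json)?\s*(.*?)" + FENCE
    match = re.search(pattern, reply, re.DOTALL)
    # Fall back to the raw reply when no fence is present.
    payload = match.group(1) if match else reply
    return json.loads(payload)


# Made-up reply in the same shape as the model output shown earlier.
reply = FENCE + 'json\n{"company": "Nutanix", "skills": ["Linux", "Networking"]}\n' + FENCE
parsed = parse_fenced_json(reply)
print(parsed["skills"])
```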

Load your portfolio from a CSV file that maps each project's tech stack and skills to its link and description.

In [6]:
import pandas as pd

df = pd.read_csv("app/resources/my_portfolio.csv")
df

| Unnamed: 0 | Techstack | Links | Description |
|---|---|---|---|
| 0 | Java, Event-Driven Architecture, Microservices... | https://github.com/Biswas57/Tributary-Cluster | A Java-based event processing API emulating Ap... |
| 1 | Java, Spring Boot, RESTful API Development, Re... | https://github.com/Biswas57/shopping-website-s... | An e-commerce platform enhanced with search ca... |
| 2 | Shell Scripting, POSIX Compliance, Linux, Dash... | https://github.com/Biswas57/GRiD | A version control system implemented in Dash, ... |
| 3 | React, JavaScript, HTML, CSS, Styled-Component... | https://github.com/Biswas57/Recipe-Book-Website | An interactive web application built with Reac... |
| 4 | TypeScript, JavaScript, Node.js, Express.js, A... | https://github.com/Biswas57/Toohak | A TypeScript and JavaScript-based Kahoot! emul... |


## ChromaDB integration

The vector database lets the tool search our portfolio for the entries most relevant to a job posting, so that answers to application questions and generated emails draw on projects that actually match the role.

The cells below show how this works. The database embeds each query string and each stored document as a vector, then returns the stored documents whose vectors are closest to the query. The `distances` field in the query result reports that closeness, with smaller values indicating a better match (the exact metric depends on how the collection is configured).
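
To make the matching step concrete, here is a toy nearest-neighbor lookup in plain Python. The three-dimensional vectors are hand-made assumptions for illustration; real embeddings come from a model and have hundreds of dimensions. The sketch ranks by squared Euclidean distance, whereas a dot product or cosine similarity would rank by similarity, with larger values being better:

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query, documents):
    """Return the name of the document whose vector is closest to the query."""
    return min(documents, key=lambda name: squared_l2(query, documents[name]))


# Hypothetical, hand-made "embeddings" for three portfolio entries.
documents = {
    "Shell scripting, Linux": [0.9, 0.1, 0.0],
    "Java, microservices": [0.1, 0.9, 0.1],
    "React, JavaScript": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for the embedding of the skill "Linux"

best = nearest(query, documents)
print(best)
```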

In [19]:
import chromadb
import uuid

client = chromadb.PersistentClient()
collection = client.get_or_create_collection(name="portfolio")

if not collection.count():
    for _, row in df.iterrows():
        collection.add(
            # Index the tech stack text and keep the project link as metadata.
            documents=[row["Techstack"]],
            metadatas=[{"links": row["Links"]}],
            ids=[str(uuid.uuid4())]
        )

job = json_res
job['skills']

['Linux', 'Networking', 'Virtualization', 'Storage technologies']
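
A standard-library sketch of the record shape the collection is being filled with: one document string, one metadata dictionary, and one unique id per portfolio row. The CSV content and the `example.com` URLs are placeholders, and `portfolio_records` is a hypothetical helper rather than part of ChromaDB:

```python
import csv
import io
import uuid

# Hypothetical two-row stand-in for the portfolio CSV.
csv_text = """Techstack,Links
"Java, Microservices",https://example.com/project-a
"React, JavaScript",https://example.com/project-b
"""


def portfolio_records(handle):
    """Turn CSV rows into (id, document, metadata) records for a vector store."""
    records = []
    for row in csv.DictReader(handle):
        records.append({
            "id": str(uuid.uuid4()),              # random unique id per row
            "document": row["Techstack"],         # the text that gets embedded
            "metadata": {"links": row["Links"]},  # returned alongside each match
        })
    return records


records = portfolio_records(io.StringIO(csv_text))
for record in records:
    print(record["document"], "->", record["metadata"]["links"])
```

Because the ids are random, guarding the insert with a count check, as the cell above does, keeps a re-run from adding the same rows twice.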

In [20]:
links = collection.query(query_texts=job['skills'], n_results=1)
links

{'ids': [['f0cb5492-5a10-4316-b2dc-6df53de60497'],
  ['ddc892dc-2389-43a4-bd0d-1f59de0f0d84'],
  ['ddc892dc-2389-43a4-bd0d-1f59de0f0d84'],
  ['f0cb5492-5a10-4316-b2dc-6df53de60497']],
 'distances': [[1.2457473689951613],
  [1.6892686887854633],
  [1.5122358562382314],
  [1.698296256316049]],
 'metadatas': [[{'links': 'https://github.com/Biswas57/GRiD'}],
  [{'links': 'https://github.com/Biswas57/Tributary-Cluster'}],
  [{'links': 'https://github.com/Biswas57/Tributary-Cluster'}],
  [{'links': 'https://github.com/Biswas57/GRiD'}]],
 'embeddings': None,
 'documents': [['Shell Scripting, POSIX Compliance, Linux, Dash Shell, Bash Utilities, File System Management, Version Control Desing, '],
  ['Java, Event-Driven Architecture, Microservices, Concurrency, Multithreading, Object-Oriented Programming (OOP), Design Patterns, JUnit Testing, Mockito Testing, API Development'],
  ['Java, Event-Driven Architecture, Microservices, Concurrency, Multithreading, Object-Oriented Programming (OOP), Des
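
The query returns one result list per skill, so the same project can appear several times. A small sketch, assuming the nested `metadatas` shape shown in the output, that flattens it into an ordered list of unique links. `unique_links` is a hypothetical helper:

```python
def unique_links(metadatas):
    """Flatten nested per-query metadata lists into de-duplicated links."""
    seen = []
    for per_query in metadatas:
        for meta in per_query:
            link = meta.get("links")
            # Keep first-seen order and skip repeats or missing values.
            if link and link not in seen:
                seen.append(link)
    return seen


# Same shape as the 'metadatas' field in the query result.
metadatas = [
    [{"links": "https://github.com/Biswas57/GRiD"}],
    [{"links": "https://github.com/Biswas57/Tributary-Cluster"}],
    [{"links": "https://github.com/Biswas57/Tributary-Cluster"}],
    [{"links": "https://github.com/Biswas57/GRiD"}],
]
flat_links = unique_links(metadatas)
print(flat_links)
```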

### Prompt Engineering

The tool takes a job posting from the user, scrapes the job description from the web, and matches it against the user's portfolio in the vector database. The prompt below then combines the extracted job description with the matched portfolio links to write an email that the user can send to the job recruiter.

In [None]:
prompt_email = PromptTemplate.from_template(
        """
        ### JOB DESCRIPTION:
        {job_description}
        
        ### INSTRUCTION:
        You are Biswas Simkhada, an aspiring software engineer and a Bachelor of Computer Science student at the University of New South Wales, expected to graduate in December 2026. You have hands-on experience in developing complex software projects, such as creating a Kafka Cluster Emulator using Java and a Kahoot! clone API using TypeScript and JavaScript with Node.js and Express. You have expertise in Java, Python, SQL, JavaScript, and frameworks like Spring Boot, React.js, and Express, as well as experience with CI/CD pipelines, testing frameworks like JUnit and Spock, and various development tools including Git, GitHub, and Vercel.
        Your job is to write a cold email to a job recruiter regarding the job mentioned above, showcasing your expertise in software development, your ability to build scalable and efficient solutions, and your experience with relevant technologies that meet their needs.
        Also, add the most relevant ones from the following links to highlight your portfolio: {link_list}
        Remember you are Biswas Simkhada, a skilled software engineer with a background in AI and software consulting.
        Do not provide a preamble.
        ### EMAIL (NO PREAMBLE):
        """
        )

chain_email = prompt_email | llm
# Pass only the matched link metadata rather than the whole query result.
res = chain_email.invoke({"job_description": str(job), "link_list": links["metadatas"]})
print(res.content)