<a href="https://colab.research.google.com/github/Devica2000/StanfordTech16LLM/blob/main/Devica_Verma_TECH16_HW5.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building a Multi Agent Crew to Support the Job Application Process
- Tailors the candidate resume according to a given role (MLE at Reddit in this example)
- Generates a candidate profile summary
- Generates a list of interview questions and talking points related to the specific job role and description

In [1]:
!pip install crewai==0.28.8 crewai_tools==0.1.6 langchain_community==0.0.29

Collecting crewai==0.28.8
  Downloading crewai-0.28.8-py3-none-any.whl.metadata (13 kB)
Collecting crewai_tools==0.1.6
  Downloading crewai_tools-0.1.6-py3-none-any.whl.metadata (4.6 kB)
Collecting langchain_community==0.0.29
  Downloading langchain_community-0.0.29-py3-none-any.whl.metadata (8.3 kB)
Collecting appdirs<2.0.0,>=1.4.4 (from crewai==0.28.8)
  Downloading appdirs-1.4.4-py2.py3-none-any.whl.metadata (9.0 kB)
Collecting embedchain<0.2.0,>=0.1.98 (from crewai==0.28.8)
  Downloading embedchain-0.1.120-py3-none-any.whl.metadata (9.3 kB)
Collecting instructor<0.6.0,>=0.5.2 (from crewai==0.28.8)
  Downloading instructor-0.5.2-py3-none-any.whl.metadata (10 kB)
Collecting langchain<0.2.0,>=0.1.10 (from crewai==0.28.8)
  Downloading langchain-0.1.20-py3-none-any.whl.metadata (13 kB)
Collecting openai<2.0.0,>=1.13.3 (from crewai==0.28.8)
  Downloading openai-1.41.0-py3-none-any.whl.metadata (22 kB)
Collecting opentelemetry-api<2.0.0,>=1.22.0 (from crewai==0.28.8)
  Downloading opente

In [50]:
# Warning control
import warnings
warnings.filterwarnings('ignore')

In [51]:
from crewai import Agent, Task, Crew

In [52]:
import os
from google.colab import userdata

os.environ['OPENAI_API_KEY'] = userdata.get('open_ai_key')
os.environ["OPENAI_MODEL_NAME"] = 'gpt-4-turbo'
os.environ["SERPER_API_KEY"] = userdata.get('serp_api_key')

In [53]:
!pip install PyPDF2



In [54]:
# Creating a custom PDF read tool using Crew AI

from crewai_tools import BaseTool
import PyPDF2

class PDFReadTool(BaseTool):
    name: str = "PDF Reader"
    description: str = "A tool that reads the content of a PDF file. Provide the file path as an argument."

    def _run(self, file_path: str) -> str:
        try:
            with open(file_path, 'rb') as file:
                reader = PyPDF2.PdfReader(file)
                text = ""
                for page in reader.pages:
                    text += page.extract_text() + "\n"
            return text
        except Exception as e:
            return f"An error occurred while reading the PDF: {str(e)}"


In [55]:
pdf_tool = PDFReadTool()

In [56]:
from google.colab import auth
auth.authenticate_user()

#Copying my resume from GCS bucket to colab
!gsutil cp gs://hw1_audio_files/Devica_Verma_Resume.pdf /content/Devica_Verma_Resume.pdf

Copying gs://hw1_audio_files/Devica_Verma_Resume.pdf...
- [1 files][174.3 KiB/174.3 KiB]                                                
Operation completed over 1 objects/174.3 KiB.                                    


In [57]:
job_posting_url = 'https://www.linkedin.com/jobs/view/machine-learning-engineer-core-ranking-at-reddit-inc-3994283274?utm_campaign=google_jobs_apply&utm_source=google_jobs_apply&utm_medium=organic&original_referer=https%3A%2F%2Fwww.google.com%2F',
github_url = 'https://github.com/Devica2000?tab=repositories',
linkedin_url = 'https://www.linkedin.com/in/devica-verma/',
pdf_filepath = "./Devica_Verma_Resume.pdf"


## Defining the tools needed from CrewAI

In [58]:
from crewai_tools import (
  ScrapeWebsiteTool,
  PDFSearchTool,
  SerperDevTool
)

search_tool = SerperDevTool()
scrape_tool = ScrapeWebsiteTool()
# read_resume = pdf_tool(file_path='./Devica_Verma_Resume.pdf')
semantic_search_resume = PDFSearchTool()

## Creating Agents

In [59]:
# Agent 1: Researcher
researcher = Agent(
    role="Tech Job Researcher",
    goal="Make sure to do amazing analysis on "
         "job posting to help job applicants",
    tools = [scrape_tool, search_tool],
    verbose=True,
    backstory=(
        "As a Job Researcher, your prowess in "
        "navigating and extracting critical "
        "information from job postings is unmatched."
        "Your skills help pinpoint the necessary "
        "qualifications and skills sought "
        "by employers, forming the foundation for "
        "effective application tailoring."
    )
)

In [60]:
# Agent 2: Profiler
profiler = Agent(
    role="Personal Profiler for Engineers",
    goal="Do incredible research on job applicants "
         "to help them stand out in the job market",
    tools = [scrape_tool, search_tool,
             pdf_tool, semantic_search_resume],
    verbose=True,
    backstory=(
        "Equipped with analytical prowess, you dissect "
        "and synthesize information "
        "from diverse sources to craft comprehensive "
        "personal and professional profiles, laying the "
        "groundwork for personalized resume enhancements."
    )
)

In [61]:
# Agent 3: Resume Strategist
resume_strategist = Agent(
    role="Resume Strategist for Engineers",
    goal="Find all the best ways to make a "
         "resume stand out in the job market.",
    tools = [scrape_tool, search_tool,
             pdf_tool, semantic_search_resume],
    verbose=True,
    backstory=(
        "With a strategic mind and an eye for detail, you "
        "excel at refining resumes to highlight the most "
        "relevant skills and experiences, ensuring they "
        "resonate perfectly with the job's requirements."
    )
)

In [62]:
# Agent 4: Interview Preparer
interview_preparer = Agent(
    role="Engineering Interview Preparer",
    goal="Create interview questions and talking points "
         "based on the resume and job requirements",
    tools = [scrape_tool, search_tool,
             pdf_tool, semantic_search_resume],
    verbose=True,
    backstory=(
        "Your role is crucial in anticipating the dynamics of "
        "interviews. With your ability to formulate key questions "
        "and talking points, you prepare candidates for success, "
        "ensuring they can confidently address all aspects of the "
        "job they are applying for."
    )
)

## Creating Tasks

In [63]:
# Task for Researcher Agent: Extract Job Requirements
research_task = Task(
    description=(
        "Analyze the job posting URL provided ({job_posting_url}) "
        "to extract key skills, experiences, and qualifications "
        "required. Use the tools to gather content and identify "
        "and categorize the requirements."
    ),
    expected_output=(
        "A structured list of job requirements, including necessary "
        "skills, qualifications, and experiences."
    ),
    agent=researcher,
    async_execution=True
)

In [64]:
# Task for Profiler Agent: Compile Comprehensive Profile
profile_task = Task(
    description=(
        "Compile a detailed personal and professional profile "
        "using the candidate resume ({pdf_filepath}), GitHub ({github_url}) URLs "
        "and LinkedIn profile ({linkedin_url}). Utilize tools to extract and "
        "synthesize information from all these sources before forming a "
        "comprehensive profile."
    ),
    expected_output=(
        "A comprehensive profile document that includes skills, "
        "project experiences, contributions, interests, and "
        "communication style."
    ),
    agent=profiler,
    async_execution=True
)

In [65]:
# Task for Resume Strategist Agent: Align Resume with Job Requirements
resume_strategy_task = Task(
    description=(
        "Using the profile and job requirements obtained from "
        "previous tasks, tailor the resume ({pdf_filepath}) to highlight the most "
        "relevant areas. Employ tools to adjust and enhance the "
        "resume content. Make sure this is the best resume even but "
        "don't make up any information. Update every section, "
        "All to better reflrect the candidates "
        "abilities and how it matches the job posting."
    ),
    expected_output=(
        "An updated resume that effectively highlights the candidate's "
        "qualifications and experiences relevant to the job."
    ),
    output_file="tailored_resume.md",
    context=[research_task, profile_task],
    agent=resume_strategist
)

In [66]:
# Task for Profiler Agent: Generate a candidate summary
resume_summary_task = Task(
    description=(
        "Create a brief summary of the candidate profile ONLY using "
        "the provided resume ({pdf_filepath}) and LinkedIn profile ({linkedin_url}). Be sure to highlight the"
        "key skills, qualifications, and experiences. "
        "Make sure that the created summary is written in first person "
        "so that it could be used by the candidate to reach out to"
        "potential employers on LinkedIn."
        "Be concise. Do not make up any additional information on your own."
    ),
    expected_output=(
        "A summary of the candidates profile which the candidate"
        "could use to reach out to potential employers on LinkedIn."
    ),
    output_file="summary.md",
    agent=profiler
)


In [67]:
# Task for Interview Preparer Agent: Develop Interview Materials
interview_preparation_task = Task(
    description=(
        "Create a set of potential interview questions and talking "
        "points based on the tailored resume and job requirements. "
        "Utilize tools to generate relevant questions and discussion "
        "points. Make sure to use these question and talking points to "
        "help the candiadte highlight the main points of the resume "
        "and how it matches the job posting."
    ),
    expected_output=(
        "A document containing all possible key questions and talking points "
        "that the candidate should prepare for the initial interview. Be detailed."
    ),
    output_file="interview_materials.md",
    context=[research_task, profile_task, resume_strategy_task],
    agent=interview_preparer
)


## Creating the Crew

In [68]:
job_application_crew = Crew(
    agents=[researcher,
            profiler,
            resume_strategist,
            interview_preparer],

    tasks=[research_task,
           profile_task,
           resume_summary_task,
           resume_strategy_task,
           interview_preparation_task],

    verbose=True
)



## Running the Crew

In [69]:
job_application_inputs = {
    'job_posting_url': job_posting_url,
    # 'personal_writeup': personal_writeup,
    'github_url':  github_url,
    'linkedin_url': linkedin_url,
    'pdf_filepath': pdf_filepath
}

In [70]:
### this execution will take a few minutes to run
result = job_application_crew.kickoff(inputs=job_application_inputs)

[1m[95m [DEBUG]: == Working Agent: Tech Job Researcher[00m
[1m[95m [INFO]: == Starting Task: Analyze the job posting URL provided (('https://www.linkedin.com/jobs/view/machine-learning-engineer-core-ranking-at-reddit-inc-3994283274?utm_campaign=google_jobs_apply&utm_source=google_jobs_apply&utm_medium=organic&original_referer=https%3A%2F%2Fwww.google.com%2F',)) to extract key skills, experiences, and qualifications required. Use the tools to gather content and identify and categorize the requirements.[00m
[1m[92m [DEBUG]: == [Tech Job Researcher] Task output: 

[00m
[1m[95m [DEBUG]: == Working Agent: Personal Profiler for Engineers[00m
[1m[95m [INFO]: == Starting Task: Compile a detailed personal and professional profile using the candidate resume (./Devica_Verma_Resume.pdf), GitHub (('https://github.com/Devica2000?tab=repositories',)) URLs and LinkedIn profile (('https://www.linkedin.com/in/devica-verma/',)). Utilize tools to extract and synthesize information from all t

In [71]:
#Displaying the new tailored resume

from IPython.display import Markdown, display
display(Markdown("./tailored_resume.md"))

DEVICA VERMA
Las Vegas, NV, (469) 922-9511 | dv2465@columbia.edu | www.linkedin.com/in/devica-verma/

**EDUCATION**
**Columbia University**
New York, NY
Master’s in Computer Science, Machine Learning Track
Dec 2022
- Teaching Assistant for COMS 4995 – Applied Machine Learning taught by Prof. Vijay Pappu
- Core Courses: Algorithms, Artificial Intelligence, Machine Learning, Deep Learning, Natural Language Processing, and Big Data Analytics

**Amity University**
Noida, IN
Bachelor’s in Electronics and Communication Engineering
Jul 2021
- Gold Medalist; Recipient of Amity University Merit Scholarship

**WORK EXPERIENCE**
**Vegas.com**
Las Vegas, NV
Data Scientist
Mar 2023 - Present
- Led the development and implementation of high-performing machine learning models for ranking search results, significantly enhancing user engagement and revenue.
- Conducted extensive A/B testing with a user base exceeding 1M monthly active users, driving improvements in user engagement and conversion rates through insightful data analysis.
- Developed robust models to assess and mitigate site outages, enhancing website reliability and user experience.
- Collaborated across teams to swiftly resolve critical website issues, improving overall system stability and user satisfaction.

**JobTarget**
New York, NY
Data Scientist Intern
May 2022 - Aug 2022
- Engineered a novel CNN architecture for predictive analytics in job posting clicks, integrating advanced machine learning techniques and achieving high accuracy.
- Partnered closely with the Sales Team to enhance data pipeline efficiency, significantly reducing data processing times and supporting faster decision-making.
- Developed and managed performance dashboards, providing key insights that supported strategic business decisions.

**PROJECTS**
**Columbia University**
- **Detection of Cancer Metastases on Gigapixel Pathology Images** (Sep 2022 - Dec 2022): Applied advanced machine learning algorithms, including Inceptionv3 and custom CNN models, to accurately identify cancerous tissues, enhancing diagnostic processes.
- **Prediction of Health Inspection Scores of Restaurants in New York City** (Mar 2022 - Apr 2022): Utilized machine learning models to predict inspection scores, improving public health outcomes.

**Amity University**
- **Prediction of LTE User Throughput and Capacity using Artificial Neural Networks** (Sep 2020 - May 2021): Developed models that significantly enhanced the prediction accuracy of network throughput and capacity, contributing to a peer-reviewed journal.

**TECHNICAL SKILLS**
- **Programming Languages**: Python, R, SQL, C, C++
- **Modeling and Frameworks**: Machine Learning, Deep Learning, NLP, TensorFlow, Keras, Pandas, Scikit-Learn
- **Software & Tools**: Git, GitHub, AWS, GCP, Apache Spark, Hadoop, Google Analytics, A/B Testing

**LEADERSHIP EXPERIENCE**
- **TreeHacks** – Stanford University: Judged and mentored projects focusing on advanced machine learning applications.
- **GRATE** – Founder & President: Led a community dedicated to technological innovation and professional development.
- **Women in AI India** – Core Team Member: Advocated for and supported women's involvement in AI and data sciences.

**GitHub Contributions**
- Active participant in projects involving Jupyter Notebook and Python, focusing on machine learning and data science.

This tailored resume highlights Devica Verma's relevant skills, experiences, and qualifications that align with the Machine Learning Engineer position at Reddit Inc., emphasizing her proficiency in Python, machine learning expertise, and leadership in tech communities.

In [72]:
#Displaying the professional summary generated
display(Markdown("./summary.md"))

"Hello, I'm Devica Verma, a passionate Data Scientist currently based in Las Vegas, NV. I completed my Master’s in Computer Science specializing in Machine Learning from Columbia University, where I also served as a Teaching Assistant in Applied Machine Learning. My academic journey began at Amity University, where I graduated as a Gold Medalist in Electronics and Communication Engineering.

Professionally, I am employed at Vegas.com, where my role involves leading the development of machine learning models that significantly enhance user engagement and revenue. Previously, I interned at JobTarget in New York, where I developed a novel CNN architecture that optimized ad performance metrics.

My technical expertise spans Python, R, SQL, and C/C++, with a deep proficiency in Machine Learning, Deep Learning, and Natural Language Processing. I am skilled in using frameworks like TensorFlow, Keras, and Scikit-Learn, and am familiar with tools such as AWS, GCP, and Apache Spark.

Beyond my technical skills, I am an active community leader, having founded GRATE, a community focused on academic excellence, and I am deeply involved with Women in AI, advocating for increased female participation in this field.

I'm always eager to connect with professionals who share my passion for technology and innovation. Let’s connect and explore how we can drive forward the future of data science and machine learning together."

This summary uses the first-person perspective and includes key details from Devica’s professional and academic background, suitable for reaching out to potential employers on LinkedIn.

In [73]:
#Displaying the interview questions and talking points related to the job posting
display(Markdown("./interview_materials.md"))

Based on the resume of Devica Verma and the job requirements for a Machine Learning Engineer at Reddit Inc., here is a comprehensive set of interview questions and talking points designed to highlight Devica's qualifications and match them to the job's demands:

### Interview Questions:

1. **Python Proficiency & Libraries Usage**
   - Can you discuss a project where you utilized TensorFlow or PyTorch? What specific challenges did you face and how did you overcome them?
   - How have you used Python libraries such as pandas in your data manipulation work?

2. **Machine Learning Algorithms & Principles**
   - Please explain a time when you had to choose one machine learning algorithm over another for a project at Vegas.com.
   - What machine learning principles do you consider most crucial when developing models for large-scale datasets?

3. **Experience with Large-scale Datasets**
   - Could you share your experience working with large-scale datasets at Vegas.com and the types of models you developed?
   - How do you ensure your models perform well at scale?

4. **Software Engineering Best Practices**
   - How do you incorporate coding standards and code reviews into your workflow?
   - What source control management practices do you follow and why do you think they are important?

5. **Educational Background**
   - How has your Master’s degree from Columbia University prepared you for a role in machine learning engineering?
   - Can you discuss a key takeaway from one of your core courses that you applied in a professional setting?

6. **Professional Experience Related to Job Requirements**
   - You have experience in deploying machine learning models into production environments. Could you walk us through one of these deployments?
   - At Vegas.com, how did you utilize A/B testing to validate your models?

7. **Leadership & Contribution to Open-source**
   - As the founder and president of GRATE, what leadership strategies did you find most effective?
   - Can you mention some contributions you have made to open-source projects? How do these reflect your capability in machine learning?

### Talking Points:

- **Proficiency in Python and Related Libraries**: Highlight Devica's extensive use of Python, TensorFlow, and pandas in her projects, emphasizing her technical expertise relevant to the job’s technical requirements.
  
- **Understanding of ML Algorithms**: Discuss her practical application of machine learning principles and algorithms in real-world projects, such as the development of CNN architecture at JobTarget and predictive models at Vegas.com.

- **Handling Large-scale Datasets**: Elaborate on her experience with large user bases and datasets at Vegas.com, showcasing her capability to manage and analyze data at the scale required by Reddit Inc.

- **Software Engineering Practices**: Note her adherence to coding standards, her use of Git for version control, and her approach to testing, which align with best practices in software engineering.

- **Education & Continuous Learning**: Point out her advanced degree and the relevant coursework that directly prepares her for the complexities of the role at Reddit Inc.

- **Leadership and Community Engagement**: Discuss her roles at GRATE and Women in AI India, which highlight her leadership skills and her commitment to fostering a community in tech, aligning with the collaborative culture at Reddit Inc.

- **Open-source Contributions and Innovations**: Focus on her active GitHub contributions and her project involvement, demonstrating her ongoing commitment to advancing the field of machine learning.

This set of questions and talking points thoroughly prepares Devica Verma for her interview at Reddit Inc., ensuring she can confidently discuss how her background, skills, and experiences align with the job requirements.