In [None]:
import os
from dotenv import load_dotenv
from scraper import fetch_website_contents
from IPython.display import Markdown, display
from openai import OpenAI

In [None]:
load_dotenv(override=True)
api_key = os.getenv("OPENAI_API_KEY")

if not api_key:
    print("❌ No API key found. Please add it to the .env file in the project root.")
elif not api_key.startswith("sk-proj-"):
    print("⚠️ API key found, but it doesn't start with 'sk-proj-'; please verify.")
else:
    print("✅ API key loaded successfully!")

# Connect to OpenAI
openai = OpenAI()

✅ API key loaded successfully!


In [None]:
def read_resume():
    """Read the resume text file from a fixed absolute path."""
    path = r"C:\Users\sangeeta\projects\llm_engineering\week1\solutions\resume.txt"
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        print(f"❌ Resume file not found at path: {path}")
        return ""
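
The absolute path in `read_resume` only resolves on one machine. A minimal sketch of a more portable variant follows, assuming the résumé location can be passed in or read from an environment variable. The `read_resume_from` helper and the `RESUME_PATH` variable name are hypothetical and not part of the original notebook.

```python
import os
from pathlib import Path


def read_resume_from(path=None):
    """Read résumé text from `path`, else from RESUME_PATH, else ./resume.txt."""
    # RESUME_PATH is a hypothetical environment variable used for illustration.
    resume_path = Path(path or os.getenv("RESUME_PATH", "resume.txt"))
    try:
        return resume_path.read_text(encoding="utf-8")
    except FileNotFoundError:
        print(f"❌ Resume file not found at path: {resume_path}")
        return ""
```

Like `read_resume`, this returns an empty string when the file is missing, so the existing empty-text check in `match_job_resume` would still apply.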

In [None]:
system_prompt = """
You are an AI HR assistant.

Compare the following:
1. A job description from a website
2. The candidate's résumé text

Your tasks:
- Extract the following details from the job posting (if available):
  • 📍 Job location
  • 💼 Job type (Full-time / Contract / Internship / Remote / Hybrid)
  • 💰 Salary range or compensation details
- Summarize each source briefly in a professional tone.
- Estimate a match score (0–100%) based on skill alignment and relevance.
- List the required and optional skills from the job description.
- List the candidate's top 3 strengths.
- Highlight missing or weak skills in red using <span style="color:red"> tags.
- Do not mention or infer the number of years of experience (e.g., avoid "3+ years", "five years").
- Focus only on the candidate’s skills, tools, domains, and achievements — not duration or tenure.

Respond in Markdown (no code blocks).
"""

In [None]:
def messages_for(job_text, resume_text):
    """Build the system and user messages for the chat completion call."""
    user_prompt = f"""
Job Description:
{job_text}

Candidate Resume:
{resume_text}
"""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]


In [None]:
def match_job_resume(job_url):
    """Fetch the job posting, read the resume, and return the model's Markdown reply."""
    print("📡 Fetching job description...")
    job_text = fetch_website_contents(job_url)

    print("📄 Reading resume file...")
    resume_text = read_resume()

    if not resume_text.strip():
        print("⚠️ Resume file is empty or not found. Please check the path.")
        return

    print("🤖 Sending data to OpenAI model... please wait.")
    response = openai.chat.completions.create(
        model="gpt-4.1-mini",
        messages=messages_for(job_text, resume_text)
    )

    return response.choices[0].message.content
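
`match_job_resume` checks that the résumé text is non-empty but sends the scraped job text as-is. A minimal sketch of a shared input check that could run before the API call is shown here. The `check_inputs` helper and its `min_chars` threshold are assumptions made for illustration; the notebook defines neither.

```python
def check_inputs(job_text, resume_text, min_chars=200):
    """Return a list of warnings about the two texts; an empty list means no issues found."""
    problems = []
    # min_chars is an arbitrary illustrative threshold, not a value from the notebook.
    for label, text in (("job description", job_text), ("resume", resume_text)):
        cleaned = (text or "").strip()
        if not cleaned:
            problems.append(f"{label} is empty")
        elif len(cleaned) < min_chars:
            problems.append(f"{label} is only {len(cleaned)} characters long")
    return problems
```

One way to use it would be to print each warning and return early when the list is non-empty, which avoids spending an API call on a page that failed to scrape.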

In [None]:
def display_match(job_url):
    """Display the final result in Markdown."""
    result = match_job_resume(job_url)
    if result:
        display(Markdown(result))

In [None]:
display_match("https://www.linkedin.com/jobs/view/4316368656")

📡 Fetching job description...
📄 Reading resume file...
🤖 Sending data to OpenAI model... please wait.


### Job Description Summary
The job posting is for various Analyst roles in the United States, with prominent locations such as New York, Houston, Austin, Miami, and Denver. Available job types include full-time, part-time, contract, temporary, and internship positions. Remote, hybrid, and on-site options are offered. Salary ranges start from $40,000 and go beyond $120,000 depending on the role and location. Companies hiring include Nomura, Toyota North America, and others, with roles typically in sectors like global markets, investment banking, acquisitions, or business insights.

- 📍 Job location: Multiple locations including New York, NY; Houston, TX; Austin, TX; Miami, FL; Denver, CO; Plano, TX  
- 💼 Job type: Full-time (majority), Part-time, Contract, Temporary, Internship; also Remote and Hybrid options  
- 💰 Salary: Starting from $40,000+ up to $120,000+ depending on role/location  

---

### Candidate Résumé Summary
Sangeeta Ramdas Kamite is a proficient Data Analyst based in Texas with expertise in data analysis, statistical modeling, machine learning, and cloud platforms. She has advanced skills in Python, SQL, data visualization (Tableau, Power BI), big data technologies (Hadoop, Spark, Kafka), and cloud computing (AWS, Azure, Google Cloud). Her experience spans building ETL pipelines, predictive modeling, anomaly detection, interactive dashboards, and compliance management in finance, education, and healthcare sectors. She demonstrates strong analytical thinking, stakeholder management, and technical delivery with proven results in improving decision-making efficiency and data quality.

---

### Required Skills from Job Description (Inferred for Analyst Roles)
- Data Analysis  
- Statistical Modeling  
- SQL and database management  
- Reporting and data visualization  
- Business insights and strategy  
- Communication and stakeholder collaboration  
- Use of cloud platforms and big data tools (optional depending on role)  

### Optional Skills from Job Description  
- Machine learning techniques  
- Financial or market data experience  
- Advanced analytics and predictive modeling  
- Agile or SDLC methodologies  

---

### Top 3 Candidate Strengths
1. **Advanced Data Analysis & Statistical Modeling:** Uses Python, R, SQL for predictive analytics, clustering, regression, and time series forecasting (Prophet, ARIMA).  
2. **Data Engineering & Big Data Technologies:** Skilled in building scalable ETL pipelines, leveraging Apache Airflow, Spark, Kafka, Hadoop, and cloud infrastructures (AWS, Azure, GCP).  
3. **Data Visualization & Reporting:** Expertise in Tableau, Power BI, Excel dashboards, and reporting tools like SSRS for actionable insights and stakeholder presentations.  

---

### Skill Match and Gap Analysis

| Skill / Area                         | Candidate Matches?                       | Notes                                                                                      |
|------------------------------------|----------------------------------------|--------------------------------------------------------------------------------------------|
| Data Analysis & Statistical Skills | ✅ Strong                              | Comprehensive experience including supervised learning, regression, clustering, A/B testing |
| SQL and Database Management        | ✅ Strong                              | Extensive use of SQL variants, Oracle, MySQL, PostgreSQL, MongoDB                         |
| Data Visualization                 | ✅ Strong                              | Tableau, Power BI, Advanced Excel, Plotly, Seaborn                                        |
| Cloud Platforms                   | ✅ Strong                              | AWS, Azure, Google Cloud experience with data storage and processing                      |
| Big Data Technologies             | ✅ Strong                              | Hadoop, Spark, Kafka, Hive, HBase, MapReduce, Snowflake                                  |
| Machine Learning                  | ✅ Strong                              | Supervised ML models, anomaly detection, AI/LLM familiarity                              |
| Business Insights / Strategy       | ✅ Moderate - Indirect                 | Worked on funnel analysis, marketing, mortgage risk insights, but no explicit business strategy roles mentioned |
| Agile / SDLC                      | ✅ Moderate                          | Familiar with Agile and Waterfall methodologies                                           |
| Financial Markets / Investment Banking | <span style="color:red">No direct mention</span>                  | No explicit experience in investment banking or global markets noted                      |
| Communication / Stakeholder Management | ✅ Strong                           | Presentation skills and stakeholder engagement demonstrated                               |

---

### Candidate Missing or Weak Skills
- <span style="color:red">Direct financial markets or investment banking experience</span>  
- <span style="color:red">Explicit business strategy formulation role</span>  

---

### Match Score Estimate: **85%**

The candidate is highly qualified on core data analytics, big data, cloud computing, machine learning, and data visualization skills relevant for most analyst positions. The main gap lies in direct financial market or investment banking experience and explicit business strategy exposure, which may limit fit for highly specialized finance-focused analyst roles but not general analyst roles.

---

**Summary:**  
Sangeeta brings strong technical expertise in advanced analytics, machine learning, cloud-based data engineering, and visualization with proven results across multiple industries. Although lacking direct finance or market-specific experience, her comprehensive skills and accomplishments indicate a strong match for analyst roles focusing on data insights, reporting, and scalable data solutions in diverse domains.
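
The match score in the response above arrives as Markdown text, which is awkward to compare across several postings. A minimal sketch of pulling it out as an integer follows, assuming the model keeps writing the score as a percentage after the words "match score". The `extract_match_score` helper is hypothetical, and the model's wording is not guaranteed to stay stable between runs.

```python
import re


def extract_match_score(markdown_text):
    """Return the first percentage that follows the words 'match score', or None."""
    match = re.search(r"match score[^\d%]*(\d{1,3})\s*%", markdown_text, re.IGNORECASE)
    if match is None:
        return None
    score = int(match.group(1))
    # Reject values outside the 0-100 range requested in the system prompt.
    return score if 0 <= score <= 100 else None


print(extract_match_score("### Match Score Estimate: **85%**"))  # → 85
```

Returning `None` rather than raising lets a caller fall back to displaying the full Markdown when no score is found.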