In [1]:
from app.backend.core.file_processor import FileProcessor
from app.backend.core.analyzer import Analyzer

processor = FileProcessor()
analyzer = await Analyzer.create()

2025-07-14 23:20:51 | app.backend.services.groq | DEBUG | Retrieving Groq API key from Azure Key Vault...
2025-07-14 23:20:54 | app.backend.services.groq | DEBUG | Groq API key retrieved.


In [14]:
job = """
Job Summary
Our corporate activities are growing rapidly, and we are currently seeking a full-time, office-based Junior Data Engineer to join our Information Technology team. This position will work on a team to accomplish tasks and projects that are instrumental to the company’s success. If you want an exciting career where you use your previous expertise and can develop and grow your career even further, then this is the opportunity for you. 


Responsibilities

Utilize skills in development areas including data warehousing, business intelligence, and databases (Snowflake, ANSI SQL, SQL Server, T-SQL);
Support programming/software development using Extract, Transform, and Load (ETL) and Extract, Load and Transform (ELT) tools, (dbt, Azure Data Factory, SSIS);
Design, develop, enhance and support business intelligence systems primarily using Microsoft Power BI;
Collect, analyze and document user requirements;
Participate in software validation process through development, review, and/or execution of test plan/cases/scripts;
Create software applications by following software development lifecycle process, which includes requirements gathering, design, development, testing, release, and maintenance;
Communicate with team members regarding projects, development, tools, and procedures; and
Provide end-user support including setup, installation, and maintenance for applications

Qualifications

Bachelor's Degree in Computer Science, Data Science, or a related field;
Internship experience in Data or Software Engineering;
Knowledge of developing dimensional data models and awareness of the advantages and limitations of Star Schema and Snowflake schema designs;
Solid ETL development, reporting knowledge based off intricate understanding of business process and measures;
Knowledge of Snowflake cloud data warehouse, Fivetran data integration and dbt transformations is preferred;
Knowledge of Python is preferred;
Knowledge of REST API;
Basic knowledge of SQL Server databases is required;
Knowledge of C#, Azure development is a bonus; and
Excellent analytical, written and oral communication skills.

Medpace Overview

Medpace is a full-service clinical contract research organization (CRO). We provide Phase I-IV clinical development services to the biotechnology, pharmaceutical and medical device industries. Our mission is to accelerate the global development of safe and effective medical therapeutics through its scientific and disciplined approach. We leverage local regulatory and therapeutic expertise across all major areas including oncology, cardiology, metabolic disease, endocrinology, central nervous system, anti-viral and anti-infective. Headquartered in Cincinnati, Ohio, employing more than 5,000 people across 40+ countries.


Why Medpace?

People. Purpose. Passion. Make a Difference Tomorrow. Join Us Today.

 

The work we’ve done over the past 30+ years has positively impacted the lives of countless patients and families who face hundreds of diseases across all key therapeutic areas. The work we do today will improve the lives of people living with illness and disease in the future.
"""

In [15]:
job_response = await analyzer.analyze_job(job)

2025-07-14 23:27:33 | app.backend.services.groq | INFO | Running structured query with function calling using ainvoke...
2025-07-14 23:27:35 | app.backend.services.groq | DEBUG | Beautified JSON args:
{
    "certificate": [],
    "domain": "Information Technology",
    "educational_requirements": [{
        "fields": ["Computer Science", "Data Science"],
        "level": "Bachelor's"
    }],
    "employment_type": "Full-time",
    "experience": ["Internship experience in Data or Software Engineering"],
    "ideal_candidate_summary": "A Junior Data Engineer with a strong foundation in data warehousing, business intelligence, and databases, and excellent analytical, written, and oral communication skills.",
    "job_level": "Junior",
    "job_title": "Junior Data Engineer",
    "location_requirement": "Office-based",
    "responsibilities": ["Utilize skills in development areas including data warehousing, business intelligence, and databases (Snowflake, ANSI SQL, SQL Server, T-SQL)", "Su

In [3]:
resume = await processor.extract_from_file("data/ExampleResume1.pdf")
candidate_response = await analyzer.analyze_candidate(resume)

2025-07-14 23:21:03 | app.backend.core.file_processor | INFO | Extracting text from local file: data/ExampleResume1.pdf
2025-07-14 23:21:03 | app.backend.core.file_processor | INFO | Using .PDF parser for file: data/ExampleResume1.pdf
2025-07-14 23:21:03 | app.backend.core.file_processor | INFO | Extracting text from 'data/ExampleResume1.pdf' using local PDF parser (pypdf)
2025-07-14 23:21:03 | app.backend.services.groq | INFO | Running structured query with function calling using ainvoke...
2025-07-14 23:21:06 | app.backend.services.groq | DEBUG | Beautified JSON args:
{
    "candidate_name": "Hedy Lamarr",
    "certificate": [],
    "comment": "Hedy Lamarr is a highly skilled and experienced data engineer with a strong educational background in Computer Science and Statistics. She has demonstrated excellent technical skills and the ability to work effectively in various roles.",
    "education": [{
        "field": "Computer Science, Statistics",
        "institution": "University of

In [6]:
import json

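# Serialize both analyses together to gauge the combined payload size.
combined_input = {
    "candidate": candidate_response,
    "job": job_response
}
combined_input_str = json.dumps(combined_input)
print(combined_input_str)
# Rough token estimate, assuming about 4 characters per token.
print(len(combined_input_str) / 4)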

{"candidate": {"candidate_name": "Hedy Lamarr", "certificate": [], "comment": "Hedy Lamarr is a highly skilled candidate with experience in data engineering, software development, and teaching. She has a strong technical background and has worked with various technologies and tools. Her experience in data analysis and modeling is also notable.", "education": [{"field": "Computer Science, Statistics", "gpa": null, "institution": "University of North Carolina at Chapel Hill", "level": "Bachelor's", "year": "2026"}], "email": "hedylamarr@email.edu", "job_recommended": ["Data Engineer", "Software Engineer"], "phone_number": "(555) 555-5555", "responsibilities": ["Designed a cumulative aggregated dataset", "Implemented a multi-level Tableau dashboard system", "Built and deployed new front-end features", "Automated backend API test suite", "Managed and coordinated support tasks"], "roles": [{"company": "Spotify", "summary": "Designed a cumulative aggregated dataset hosted in BigQuery and GCS

In [16]:
match_response = await analyzer.match(candidate_analysis=candidate_response, job_analysis=job_response)

2025-07-14 23:27:50 | app.backend.services.groq | INFO | Running structured query with function calling using ainvoke...
2025-07-14 23:27:53 | app.backend.services.groq | DEBUG | Beautified JSON args:
{
    "certificate": {
        "comment": "The candidate does not have any certificates, and the job does not require any specific certificates. Therefore, the score is 0 as there are no certificates to evaluate.",
        "score": 0
    },
    "domain": {
        "comment": "The candidate's domain experience is in Information Technology, specifically in data engineering, which closely aligns with the job's domain in Information Technology and data engineering. The candidate's experience in data engineering and technical skills make them a strong fit in terms of domain.",
        "score": 90
    },
    "education": {
        "comment": "The candidate has a Bachelor's degree in Computer Science and Statistics, which closely aligns with the job's educational requirements in Computer Science

In [19]:
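def compute_match_score(match_result: dict) -> float:
    """Return the weighted sum of per-category match scores, rounded to 2 places."""
    # Category weights sum to 1.0; "certificate" has weight 0, so it is ignored.
    weights = {
        "education": 0.20,
        "experience": 0.25,
        "technical_skill": 0.25,
        "responsibility": 0.20,
        "soft_skill": 0.05,
        "domain": 0.05,
        "certificate": 0,
    }

    total_score = 0.0
    for key, weight in weights.items():
        # A missing category (or a missing "score") contributes 0.
        score = match_result.get(key, {}).get("score", 0)
        total_score += score * weight

    return round(total_score, 2)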

print(compute_match_score(match_response))


66.0
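
The score above is a plain weighted sum of the per-category scores. As a sketch of how the arithmetic works, the variant below copies the weights from `compute_match_score`, adds a check that they sum to 1, and runs on made-up scores. The name `weighted_score` and the values in `sample_match` are hypothetical and are not the scores the model returned in this session.

```python
import math

# Weights copied from compute_match_score above.
WEIGHTS = {
    "education": 0.20,
    "experience": 0.25,
    "technical_skill": 0.25,
    "responsibility": 0.20,
    "soft_skill": 0.05,
    "domain": 0.05,
    "certificate": 0,
}


def weighted_score(match_result: dict, weights: dict = WEIGHTS) -> float:
    """Weighted sum of category scores; rejects weights that do not sum to 1."""
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("weights must sum to 1")
    total = sum(
        match_result.get(key, {}).get("score", 0) * weight
        for key, weight in weights.items()
    )
    return round(total, 2)


# Hypothetical per-category scores, for illustration only.
sample_match = {
    "education": {"score": 80},
    "experience": {"score": 60},
    "technical_skill": {"score": 70},
    "responsibility": {"score": 60},
    "soft_skill": {"score": 50},
    "domain": {"score": 90},
    "certificate": {"score": 0},
}
# 80*0.20 + 60*0.25 + 70*0.25 + 60*0.20 + 50*0.05 + 90*0.05 = 67.5
print(weighted_score(sample_match))
```

Because `certificate` has weight 0, its score of 0 in the match output does not lower the total.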


In [1]:
from app.backend.services.factory import AIServiceFactory

service = await AIServiceFactory.create_service()
response = await service.query("What is the area of the UK?")
print(response)

2025-07-07 00:38:58 | app.backend.services.groq | DEBUG | Retrieving Groq API key from Azure Key Vault...
2025-07-07 00:39:01 | app.backend.services.groq | DEBUG | Groq API key retrieved.


Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0x00000255A5E7B1C0>


2025-07-07 00:39:01 | app.backend.services.groq | INFO | Calling Groq model 'llama3-70b-8192' using ainvoke...
The total area of the United Kingdom (UK) is approximately 243,610 square kilometers (93,625 square miles). This includes:

* England: 130,279 km² (50,261 sq mi)
* Scotland: 78,772 km² (30,414 sq mi)
* Wales: 20,779 km² (8,023 sq mi)
* Northern Ireland: 14,160 km² (5,470 sq mi)

Note: These figures are approximate and may vary slightly depending on the source.


In [13]:
from datetime import date

prompt = f"""
You are an expert HR recruiter and resume screener. Your task is to extract relevant information from a resume and organize it into structured categories using the format below.

---

**Categories**:

1. **Synopsis** - A personal summary, description, or “about me” section.  
2. **Experience** - Includes jobs/internships under `<job>` and standalone projects under `<project>`.  
3. **Education** - Each degree appears under a `<degreeEntry>` with optional GPA and graduation year.  
4. **Skills** - Divided into technical and non-technical.

---

**Important Instructions**:
- If no synopsis is present, insert the word `Unknown` in the `<synopsis>` tag.
- Do **not** invent or infer missing information. Only extract what is explicitly present.
- Use the exact XML structure and tag names provided.
- All durations must use time units (e.g., `"3 months"`, `"2 years"`).
    - If only a single month is present in the date range, the duration is `"1 month"`.
    - For reference, `"Present"` in a job duration refers to {date.today()}.
- Only include UTF-8 characters. Strip bullets, markdown, or decorative formatting.

---

**Format your output as follows**:

```xml
<resume>
    <synopsis>
        INSERT SYNOPSIS INFORMATION HERE
    </synopsis>
    <experience>
        <job>
            <company>
                INSERT COMPANY NAME
            </company>
            <duration>
                INSERT JOB DURATION
            </duration>
            <description>
                INSERT JOB DESCRIPTION
            </description>
        </job>
        <!-- Repeat <job> as needed -->

        <project>
            <name>
                INSERT PROJECT NAME
            </name>
            <duration>
                INSERT PROJECT DURATION (if available)
            </duration>
            <description>
                INSERT PROJECT DESCRIPTION
            </description>
        </project>
        <!-- Repeat <project> as needed -->
    </experience>
    <education>
        <degreeEntry>
            <university>
                INSERT UNIVERSITY NAME
            </university>
            <degree>
                <level>
                    INSERT DEGREE LEVEL (e.g., Doctorate, Masters, Baccalaureate)
                </level>
                <name>
                    INSERT DEGREE NAME (e.g., Computer Science)
                </name>
            </degree>
            <graduationYear>
                INSERT GRADUATION YEAR (if available)
            </graduationYear>
            <gpa>
                INSERT GPA (if available)
            </gpa>
        </degreeEntry>
        <!-- Repeat <degreeEntry> as needed -->
    </education>
    <skills>
        <technical>
            <software>
                LIST SOFTWARE SKILLS
            </software>
            <languages>
                LIST PROGRAMMING LANGUAGES
            </languages>
            <frameworks>
                LIST FRAMEWORKS
            </frameworks>
            <tools>
                LIST TECHNICAL TOOLS
            </tools>
        </technical>
        <nontechnical>
            LIST NON-TECHNICAL SKILLS
        </nontechnical>
    </skills>
</resume>
```

---

You will now be given a resume in plain text. Extract and organize the information as described above.
**Resume**:  
{resume}
"""