Important aspects of a job description:
1. Job title
2. Responsibilities
3. Required Qualifications
4. Preferred Qualifications

Some job descriptions may also have:
1. Skills and Competencies
2. Experience Requirements (e.g., "5+ years of marketing experience")

These extra sections can be grouped under "Required Qualifications".

In [2]:
from utils.with_structured_output import with_structured_output
from pydantic import BaseModel, RootModel, Field
from pprint import pprint
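The `with_structured_output` helper lives in a local `utils` module that this notebook does not show. As a rough sketch of what such a helper might do, assuming it sends the prompt to a model and validates the JSON reply against the schema, consider the following. The name `structured_output_sketch`, the `llm` callable, the `Demo` schema, and `fake_llm` are all hypothetical stand-ins, not the real implementation.

```python
import json
from typing import Callable

from pydantic import BaseModel


def structured_output_sketch(
    prompt: str, schema: type[BaseModel], llm: Callable[[str], str]
) -> dict:
    """Hypothetical stand-in: call a model, validate its JSON reply, return a dict."""
    raw_reply = llm(prompt)  # the model is expected to reply with JSON text
    parsed = schema.model_validate_json(raw_reply)  # raises ValidationError on bad output
    return parsed.model_dump(by_alias=True)  # keys use the schema's aliases, if any


class Demo(BaseModel):
    # Hypothetical schema used only to exercise the sketch.
    title: str
    skills: list[str]


def fake_llm(prompt: str) -> str:
    # Assumption: a real model call would go here; this returns canned JSON.
    return json.dumps({"title": "Engineer", "skills": ["Python"]})


result = structured_output_sketch("Parse this posting.", Demo, fake_llm)
```

Validating against the schema means a malformed reply fails loudly instead of flowing into later cells.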

In [9]:
class JobDescription(BaseModel):
    required_skills: list[str]          = Field(..., alias="Required Skills")
    preferred_skills: list[str]         = Field(..., alias="Preferred Skills")
    required_experience: list[str]      = Field(..., alias="Required Experience")
    preferred_experience: list[str]     = Field(..., alias="Preferred Experience")
    required_education: list[str]       = Field(..., alias="Required Education")
    preferred_education: list[str]      = Field(..., alias="Preferred Education")
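Because each field declares an alias, the model validates input keyed by the display names (such as "Required Skills") rather than by the Python attribute names. A minimal sketch of that behavior, using a trimmed-down, hypothetical `MiniJobDescription` with two of the six fields and made-up sample values:

```python
from pydantic import BaseModel, Field


class MiniJobDescription(BaseModel):
    # Hypothetical two-field version of the schema, for illustration only.
    required_skills: list[str] = Field(..., alias="Required Skills")
    preferred_skills: list[str] = Field(..., alias="Preferred Skills")


# Input is keyed by alias, matching the JSON shape requested from the model.
raw = {"Required Skills": ["Python"], "Preferred Skills": ["Go"]}
parsed = MiniJobDescription.model_validate(raw)

print(parsed.required_skills)            # attribute access uses the field name
print(parsed.model_dump(by_alias=True))  # dumping by alias restores the display keys
```

With the default configuration, passing the attribute names (for example `required_skills`) as input keys would fail validation, since only the aliases are accepted.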

In [89]:
JOB_DESC_EXTRACTION_TEMPLATE = """
You are an expert at parsing important information from job descriptions. Given a job description, your job is to parse key details from it in this format:
    {{
        "Required Skills": ["list", "of", "required", "skills"],
        "Preferred Skills": ["list", "of", "preferred", "skills"],
        "Required Experience": ["list", "of", "required", "experience"],
        "Preferred Experience": ["list", "of", "preferred", "experience"],
        "Required Education": ["list", "of", "required", "education"],
        "Preferred Education": ["list", "of", "preferred", "education"]
    }}

Please parse all information from bulleted lists in the job description.

When parsing "Education" requirements, please output each individual education requirement separately. For example, "Master's degree or PhD in Computer Science" -> ["Master's degree in Computer Science", "PhD in Computer Science"].

Skills are typically specific, technical terms that are prefixed with "Proficient in," "Knowledge of," "Familiarity with," "Ability to," or "Expertise in." When parsing skills, only include the skill itself in the output, e.g. "Proficient in software development" -> "software development."

Experience requirements are typically prefixed with "Years of experience," "Prior work in," "Proven history of," or "Demonstrated track record."

**Ensure that all information parsed is explicitly contained in the job description.**
---
Job Description:
{job_desc}

Output:
"""

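The education and skills rules stated in the prompt can also be approximated in plain Python, which may be useful as a sanity check on the model's output. The sketch below is an illustration under stated assumptions, not part of the original pipeline: `split_education` and `strip_skill_prefix` are hypothetical helpers, and the splitting heuristic only handles the simple "A or B in Field" pattern.

```python
SKILL_PREFIXES = (
    "Proficient in", "Knowledge of", "Familiarity with", "Ability to", "Expertise in",
)


def split_education(requirement: str) -> list[str]:
    """Split 'A or B in Field' into ['A in Field', 'B in Field']."""
    degrees, sep, field = requirement.partition(" in ")
    if not sep:
        return [requirement]  # no field of study named; leave the text untouched
    return [f"{degree.strip()} in {field}" for degree in degrees.split(" or ")]


def strip_skill_prefix(skill: str) -> str:
    """Drop a leading phrase such as 'Proficient in' and keep only the skill."""
    for prefix in SKILL_PREFIXES:
        if skill.lower().startswith(prefix.lower()):
            return skill[len(prefix):].strip()
    return skill.strip()


education = split_education("Master's degree or PhD in Computer Science")
skill = strip_skill_prefix("Proficient in software development")
```

Comparing these rule-based results against the parsed fields is one way to spot entries where the model did not follow the formatting instructions.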
In [90]:
with open("../sample_data/google_swe_senior.txt", "r") as file:
    job_desc = file.read()

In [91]:
job_desc = with_structured_output(
    prompt=JOB_DESC_EXTRACTION_TEMPLATE.format(job_desc=job_desc),
    schema=JobDescription)
pprint(job_desc)

{'Preferred Education': ["Master's degree or PhD in Computer Science or "
                         'related technical field'],
 'Preferred Experience': ['1 year of experience in a technical leadership role',
                          'Experience developing accessible technologies'],
 'Preferred Skills': ['accessible technologies'],
 'Required Education': ['Bachelor’s degree or equivalent practical experience'],
 'Required Experience': ['5 years of experience with software development in '
                         'one or more programming languages',
                         '3 years of experience testing, maintaining, or '
                         'launching software products',
                         '1 year of experience with software design and '
                         'architecture',
                         '3 years of experience with state of the art GenAI '
                         'techniques (e.g., LLMs, Multi-Modal, Large Vision '
                         'Models) or with 