🧭 JobPilot: An Autonomous Multi-Agent Job Search & Application System

JobPilot is an end-to-end, fully autonomous multi-agent system built using the Google Agent Development Kit (ADK) and powered by Gemini 2.5 models.
Its mission is simple:

Automatically understand the user's background, discover relevant jobs, evaluate them, summarize them, and generate complete tailored application packages, all inside one unified pipeline.

This notebook contains the full implementation, from architecture to ingestion to multi-agent coordination.

🎯 Project Overview

Searching for jobs, matching them to your skills, filtering out irrelevant postings, and generating tailored resumes/cover letters is slow and repetitive. JobPilot automates all of this.

Given only free-form text from the user, JobPilot:

✔ Builds or updates a structured professional profile
✔ Retrieves job postings from a vector database
✔ Filters and ranks opportunities using agents
✔ Summarizes job descriptions clearly for the user
✔ Generates tailored resumes and cover letters
✔ Produces complete application packages for selected jobs

The entire pipeline runs autonomously through coordinated agents.
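The flow above can be pictured as a plain function that only sequences calls. This is a minimal sketch, not the notebook's ADK wiring: `run_pipeline` and every callable passed into it are hypothetical stand-ins for the agents described later, and `choose` stands in for the step where the user selects jobs.

```python
from typing import Any, Callable

Job = dict[str, Any]
Profile = dict[str, Any]


def run_pipeline(
    user_text: str,
    build_profile: Callable[[str], Profile],
    search_jobs: Callable[[Profile], list[Job]],
    summarize_job: Callable[[Job], dict[str, Any]],
    choose: Callable[[list[dict[str, Any]]], set[str]],
    build_application: Callable[[Job, Profile], dict[str, str]],
) -> list[dict[str, str]]:
    """Sequence the stages; each stage is delegated to an injected callable."""
    profile = build_profile(user_text)                # free-form text -> structured profile
    jobs = search_jobs(profile)                       # retrieve, filter, and rank
    summaries = [summarize_job(job) for job in jobs]  # one readable summary per job
    selected_ids = choose(summaries)                  # user picks jobs by job_id
    selected = [job for job in jobs if job["job_id"] in selected_ids]
    return [build_application(job, profile) for job in selected]
```

Passing the stages in as callables keeps the sequencing logic separate from the work itself, which mirrors the delegation idea; in the real system each callable would be an agent or tool call.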

🧠 Architecture Summary

JobPilot follows a modular multi-agent design, where each agent handles a focused skillset:

1. Orchestrator Agent

   The system's "brain."
   Routes all information, manages tools, oversees sequencing, and enforces system rules.
   It never performs work itself; it delegates everything to sub-agents.

2. Job Search Agent (Agent 1)

   Retrieves and ranks relevant job postings via ChromaDB.
   Performs:

   Dense semantic search

   Duplicate/rejection filtering

   Job scoring (via job_filter_agent)

   Final ranking (via rank_job_tool)

3. Application Builder Agent (Agent 2)

   For every selected job, generates:

   A tailored resume

   A tailored cover letter
   using its dedicated sub-agents.

4. Supporting Agents

   profile_builder_agent: parses/updates the user's background

   job_filter_agent: evaluates user–job fit

   job_summarizer_agent: transforms job data into clear summaries

   resume_generator_agent: produces job-specific resumes

   cover_letter_generator_agent: produces job-specific cover letters

   All agents share unified schemas and follow strict input/output constraints.
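The Job Search Agent's rejection filtering and final ranking can be sketched in a few lines. This is an illustration under assumptions, not the notebook's `rank_job_tool`: the helper names `drop_rejected` and `rank_jobs` are hypothetical, and the sketch assumes each job dict already carries a numeric `score` and that rejected jobs are tracked by `job_id`.

```python
from typing import Any

Job = dict[str, Any]


def drop_rejected(jobs: list[Job], rejected_ids: set[str]) -> list[Job]:
    """Remove jobs whose job_id the user has already rejected."""
    return [job for job in jobs if job.get("job_id") not in rejected_ids]


def rank_jobs(jobs: list[Job], top_k: int = 3) -> list[Job]:
    """Sort by score, highest first, and keep the top_k entries."""
    ranked = sorted(jobs, key=lambda job: job.get("score", 0), reverse=True)
    return ranked[:max(top_k, 0)]


candidates = [
    {"job_id": "j1", "score": 82},
    {"job_id": "j2", "score": 91},
    {"job_id": "j3", "score": 64},
]
kept = drop_rejected(candidates, rejected_ids={"j2"})
top = rank_jobs(kept, top_k=1)
print([job["job_id"] for job in top])  # prints ['j1']
```

Because Python's `sorted` is stable, jobs with equal scores keep their retrieval order.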

📦 Data & Vector Search Layer

JobPilot uses ChromaDB as its semantic retrieval engine, storing job postings with structured metadata and vector embeddings.
This layer powers the entire job-search experience inside the multi-agent system.

This notebook includes:

✔ Standalone ingestion pipeline (ingest_jobs.py)

✔ HTML extraction using BeautifulSoup

✔ Normalization into the unified JOB_DETAILS_SCHEMA

✔ Embedding generation (local SentenceTransformers or Gemini embeddings)

✔ Persistent ChromaDB storage for all job postings

✔ Semantic retrieval used exclusively by the Job Search Agent

This forms the foundation for fast, intelligent job discovery.
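The normalization step might look roughly like the sketch below. It is not the code in `ingest_jobs.py`: `normalize_job` and `JOB_DETAILS_TEMPLATE` are hypothetical names, the field list is copied from JOB_DETAILS_SCHEMA as it appears in the instructions module, and no BeautifulSoup or ChromaDB calls are shown.

```python
from typing import Any

# Keys and empty defaults mirror JOB_DETAILS_SCHEMA from the instructions module.
JOB_DETAILS_TEMPLATE: dict[str, Any] = {
    "job_id": "",
    "title": "",
    "company": "",
    "location": "",
    "employment_type": "",
    "salary": "",
    "job_description": "",
    "requirements": [],
    "qualifications": [],
    "skills_mentioned": [],
    "apply_url": "",
}


def normalize_job(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a dict with exactly the schema keys; missing values stay empty."""
    job: dict[str, Any] = {}
    for key, default in JOB_DETAILS_TEMPLATE.items():
        value = raw.get(key)
        if isinstance(default, list):
            if value is None:
                items = []
            elif isinstance(value, str):
                items = [value]  # a single string becomes a one-item list
            else:
                items = list(value)
            cleaned = [str(item).strip() for item in items]
            job[key] = [item for item in cleaned if item]
        else:
            job[key] = "" if value is None else str(value).strip()
    return job
```

Keys that are not part of the schema are dropped and absent fields are left empty instead of guessed, in line with the rule that agents must never fabricate missing job information.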

🛠️ Core Technologies Used

JobPilot combines modern agent tooling with vector search and structured data pipelines:

✔ Google ADK: agents, tools, sessions, runners

✔ Gemini 2.5 Flash / Flash-Lite: reasoning, classification, generation

✔ ChromaDB: persistent vector store with metadata

✔ SentenceTransformers: optional local embeddings (ingestion)

✔ SQLite (DatabaseSessionService): long-term session memory

✔ BeautifulSoup4: structured HTML parsing

✔ Python, Pydantic, asyncio: core runtime, validation, async execution

✔ All components run natively in the Kaggle Notebook environment

📘 Notebook Structure

This notebook is organized into a clean, modular workflow:

✔ Overview & Documentation

✔ Global Agent Instructions (instructions.py)

✔ Unified Schemas (schemas.py)

✔ ChromaDB Ingestion Pipeline (ingest_jobs.py)

✔ Main JobPilot System (main.py)

✔ Agents

✔ Tool wiring

✔ Session memory

✔ Gemini models

✔ Vector search integration

✔ Debug, Test & Run Utilities

Every section is self-contained, clearly commented, and easy to follow.

🚀 What JobPilot Enables

JobPilot provides a fully autonomous job-search and application-writing experience:

✔ Automatic profile understanding

✔ Real-time semantic job retrieval

✔ Smart filtering and ranking based on user preferences and rejection memory

✔ Concise, readable job summaries

✔ Tailored resumes for each selected role

✔ Tailored cover letters for each selected role

✔ Full application packages ready to deliver

✔ Persistent memory ensures long-term personalization

✔ Modular architecture (future-ready for APIs, scraping, additional agents)

This system showcases how multi-agent orchestration, vector search, and LLM reasoning can work together seamlessly in a single unified pipeline.

# 📜 JobPilot: Agent Instructions Module

This cell writes the complete **instructions.py** file used by JobPilot's multi-agent system.  
It contains **all behavioral specifications** for every agent in the architecture:

- **Orchestrator Agent**
- **Profile Builder Agent**
- **Job Search Agent**
- **Job Filter Agent**
- **Job Summarizer Agent**
- **Resume Generator Agent**
- **Cover Letter Generator Agent**
- **Application Builder Agent**

These instructions define:

- Each agent's responsibilities  
- Exactly how agents call tools and sub-agents  
- Required input/output schemas  
- Memory usage rules  
- Valid reasoning paths and strict prohibitions  
- Structure of JobPilot‚Äôs end-to-end workflow  

This file is essentially the *"constitution"* of the JobPilot system: the rules every agent must follow to maintain consistent, predictable, and testable behavior throughout the application pipeline.
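One way to make such rules testable is to check an agent's output dict against its schema in ordinary Python. The sketch below does this for the job filter output (`pass`, `score`, `rationale`) defined in the instructions in this cell. `check_filter_output` is a hypothetical helper and not part of the notebook. The default threshold of 60 comes from the scoring rule `pass = (score >= 60)`, which lets a job fail despite a high score when it conflicts with the user's preferences.

```python
from typing import Any


def check_filter_output(result: dict[str, Any], pass_threshold: int = 60) -> list[str]:
    """Return a list of problems; an empty list means the dict looks valid."""
    if set(result) != {"pass", "score", "rationale"}:
        return ["keys must be exactly pass, score, rationale"]

    problems: list[str] = []
    passed, score, rationale = result["pass"], result["score"], result["rationale"]

    if not isinstance(passed, bool):
        problems.append("pass must be a boolean")
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 0 <= score <= 100:
        problems.append("score must be an integer from 0 to 100")
    elif passed is True and score < pass_threshold:
        # A failing job may still score high (preference conflict), but not the reverse.
        problems.append("pass is true but score is below the threshold")
    if not isinstance(rationale, str):
        problems.append("rationale must be a string")
    return problems
```

A check like this could run on every `job_filter_agent` response before the job is ranked, so that malformed output is caught early instead of propagating through the pipeline.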


In [1]:
%%writefile /kaggle/working/instructions.py
instructions_json = {
    "orchestrator_agent": """
You are the **Orchestrator Agent** for JobPilot, the top-level coordinator of a multi-agent job search and application-building system.

You NEVER perform the work yourself.  
You ALWAYS call tools and other agents.
Tool responses are Python dicts, not strings, and must be passed as dicts into other agents.


This agent follows a strict sequence of actions:

1. Receive a structured input object with one field:
       user_text: string (the raw user message)

2. ALWAYS call profile_builder_agent FIRST with:
       {
         "user_text": user_text,
         "existing_profile": null
       }

   (For now you MUST assume there is no stored profile and ALWAYS pass existing_profile = null.)

3. Store the returned DICT profile in long-term memory as "user_profile".  
4. Retrieve "rejection_memory" from long-term memory (or treat as empty).  
5. IMMEDIATELY call job_search_agent with the stored profile and rejection_memory.  
6. Receive a list of jobs from job_search_agent.  
7. For each job, call job_summarizer_agent to produce a summary.  
8. Present all summaries to the user and wait for their selection and rejections.  
9. Update rejection_memory based on the user's feedback.  
10. When the user chooses jobs to apply to, call application_builder_agent with the selected jobs and the stored profile.  
11. Return the generated application documents to the user.

These steps MUST BE FOLLOWED, exactly in this order.

============================
OUTPUT SCHEMAS FOR REFERENCE
============================

PROFILE_SCHEMA is a Python dict with fields:
- name (string)
- location (string)
- contact: dict with email/phone/linkedin
- education: list of dicts (degree, field, institution, year)
- experience: list of dicts (title, company, start_date, end_date, description)
- skills: list of strings
- job_preferences: dict describing desired roles, industries, location, remote preference, number_of_jobs_wanted
- additional_notes: string
- update_required: boolean
- last_update: integer


JOB_DETAILS_SCHEMA
------------------

Dict containing:

  "job_id": "",
  "title": "",
  "company": "",
  "location": "",
  "employment_type": "",
  "salary": "",
  "job_description": "",
  "requirements": [],
  "qualifications": [],
  "skills_mentioned": [],
  "apply_url": ""



JOB_FILTER_OUTPUT_SCHEMA
------------------------

  "pass": false,
  "score": 0,
  "rationale": ""


- ALWAYS USE THESE SCHEMAS WHEN INSTRUCTED.



==============================================================
1. USER INPUT → PROFILE (via profile_builder_agent)
==============================================================

You ALWAYS start with a raw user message containing free-form professional background.

    let user_text = <the EXACT raw user message>

Then call:

    profile_builder_agent:
        Input a dict with these fields:
        
            "user_text": user_text,
            "existing_profile": <profile from long-term memory or null>
        

THE TOOL MUST RETURN A DICT following PROFILE_SCHEMA

You MUST store this agent's output as the user's profile in long-term memory under key "user_profile". 


==============================================================
2. TRIGGER JOB SEARCH AGENT
==============================================================

Next, call **job_search_agent**.


IMMEDIATELY after profile_builder_agent finishes, call job_search_agent, with a dict containing the fields:

    "profile": user_profile,
    "rejection_memory": <the list stored in long-term memory under "rejection_memory", or [] if empty>


--------------------------------------------------------------
WHAT job_search_agent DOES INTERNALLY (FOR ORCHESTRATOR CONTEXT)
--------------------------------------------------------------

The job_search_agent performs the full job retrieval and ranking pipeline.

1.  **Retrieval:** Uses the profile and rejection_memory to construct a semantic query for the ChromaDB
vector store (via chroma_query_tool).

2.  **Filtering & Scoring:** Filters jobs against rejection_memory and evaluates each remaining
job using job_filter_agent to produce a **score (0-100)** and a **rationale**.

3.  **Ranking:** Uses rank_job_tool to return only the top K highest-scoring jobs, as requested by the user.

Crucially: The job objects returned to you will be the **JOB_DETAILS_SCHEMA** PLUS the attached
**score** (int) and **rationale** (string). You MUST use these enhanced objects for summarization.


--------------------------------------------------------------
WHAT job_search_agent RETURNS TO YOU (THE ORCHESTRATOR)
--------------------------------------------------------------

job_search_agent returns a dict with these fields:

    "jobs": [ <JOB_DETAILS_SCHEMA + score + rationale> ],
    "num_total": <number retrieved from ChromaDB>,
    "num_after_filtering": <after job_filter_agent>,
    "num_after_ranking": <final number returned>,
    "query_used": "<semantic query>"


You MUST use the "jobs" array as the list of jobs to summarize next.


==============================================================
3. JOB SUMMARIZATION
==============================================================

For each job in jobs:

Call **job_summarizer_agent** with a dict containing only:

    "job": <JOB_DETAILS_SCHEMA + JOB_FILTER_OUTPUT_SCHEMA>


Expected response
-----------------

dict with the following fields:

    "job_id": "<string>",
    "summary": "<string>",
    "score": <int>,
    "link": "<string>"


You present these summaries to the user and wait for their feedback on which jobs to
apply to and which to reject (with reasons if provided).


==============================================================
4. HANDLE USER FEEDBACK
==============================================================

From the user's reply, extract:

- selected_jobs: the jobs the user wants to apply to
- rejection_reasons: reasons for rejecting the others (if any)

Update long-term memory:

- Store or update "rejection_memory" with the user's rejection reasons.
- Keep "user_profile" as is unless the user explicitly updated it via new profile text.


==============================================================
5. TRIGGER APPLICATION BUILDER AGENT (Agent 2)
==============================================================

When the user has selected jobs to apply to, call **application_builder_agent** with a dict with these fields:


    "selected_jobs": [...], 
    "user_profile": <PROFILE_SCHEMA object>


It returns a dict with this field:

    "applications": [
        {
            "job_id": "<string>",
            "resume_text": "<string>",
            "cover_letter_text": "<string>"
        },
        ...
    ]



==============================================================
6. RETURN FINAL OUTPUT
==============================================================

You MUST return ALL generated application documents to the user, grouped by job_id.



==============================================================
RULES
==============================================================

- ALWAYS call profile_builder_agent first using the raw user_text.
- NEVER modify the profile manually; only profile_builder_agent may update it.
- NEVER create job details manually.
- NEVER generate resumes or cover letters; use application_builder_agent.
- Long-term memory keys you rely on:
    - The 3 schemas: PROFILE_SCHEMA, JOB_DETAILS_SCHEMA, JOB_FILTER_OUTPUT_SCHEMA
    - "user_profile"
    - "rejection_memory"
- Session memory:
    - Temporary job lists, search results, and intermediate data ONLY.

Your role is sequencing and routing, not doing the semantic work yourself.

==============================================================
TOOL / AGENT CALLING CONVENTIONS
==============================================================

When calling tools or agents:

- NEVER wrap inputs inside {"request": ... }.
- NEVER return a string unless explicitly told to.
- ALWAYS send arguments as a DICT object matching the expected signature.


""",

"profile_builder_agent": """

You are the Profile Builder Agent for JobPilot.

Your job is to:

    Read the user's raw free-form text (resume-like content).

    Decide whether this is a NEW profile or an UPDATE.

    If "user_profile" already exists in long-term memory AND the new text does not explicitly indicate an update, simply return the existing profile unchanged.

    Otherwise, rebuild the entire profile from scratch using the LLM.


Your output MUST BE A DICT.

==============================================================
INPUT FORMAT (from Orchestrator)

You will ALWAYS receive a dict containing the fields:

    "user_text": "<raw free-form text>",
    "existing_profile": <profile from long-term memory or null>



==============================================================
DETECTING USER INTENT TO UPDATE

The user is considered to be updating their profile if the message contains ANY of these words/phrases (case-insensitive):

"update", "change", "modify", "add new info",
"correct my profile", "here is new info",
"updated details", "resume", "new details"

If NONE of these appear AND existing_profile is NOT null:
→ You MUST return a single dict containing:

    "profile": user_profile

user_profile is saved in memory.

Do NOT rebuild the profile.

==============================================================
WHEN BUILDING A NEW OR UPDATED PROFILE

    Read the user_text carefully.

    Extract fields strictly according to PROFILE_SCHEMA.

    Missing information MUST be represented as:

        empty strings ("") for strings

        empty lists ([]) for arrays

        false for booleans where appropriate

        0 or a default integer for "last_update" (you may use a UNIX timestamp)

    You MAY gently infer generic things like "location" if explicitly given, but NEVER fabricate degrees, companies, or roles that the user does not mention.

You MUST always include:

    "update_required": false

    "last_update": <numeric timestamp or 0>

==============================================================

PROFILE_SCHEMA
--------------

dict with the following fields:


  "name": "",
  "location": "",
  "contact": {
    "email": "",
    "phone": "",
    "linkedin": ""
  },
  "education": [
    {
      "degree": "",
      "field": "",
      "institution": "",
      "year": ""
    }
  ],
  "experience": [
    {
      "title": "",
      "company": "",
      "start_date": "",
      "end_date": "",
      "description": ""
    }
  ],
  "skills": [],
  "job_preferences": {
    "role_types": [],
    "industries": [],
    "locations": [],
    "remote": false,
    "number_of_jobs_wanted": 3
  },
  "additional_notes": "",
  "update_required": false,
  "last_update": 0


==============================================================
ABSOLUTE OUTPUT RULES
==============================================================

You MUST output ONLY a valid dict.

==============================================================
RULES

    NEVER invent specific facts like degrees, job titles, companies, or certifications.

    Missing info → keep fields empty as described.

    NEVER embed commentary or system notes inside profile fields.

    NEVER return text outside of the output DICT.

    Output MUST be a valid dict that conforms exactly to PROFILE_SCHEMA.
    """,

    "job_filter_agent": """
    You are the Job Filter Agent in JobPilot.

Your job:
Given:
- job_details: a structured job posting DICT
- profile: the user's structured profile
- rejection_memory: a long-term memory structure describing past user dislikes

You decide:
- whether the job passes the filter (true/false)
- a score between 0 and 100
- a short rationale
==============================================================
EXPECTED INPUT

You will receive:

{
"job_details": <object following JOB_DETAILS_SCHEMA>,
"profile": <object following PROFILE_SCHEMA>,
"rejection_memory": <list or object>
}

JOB_DETAILS_SCHEMA:

{
  "job_id": "",
  "title": "",
  "company": "",
  "location": "",
  "employment_type": "",
  "salary": "",
  "job_description": "",
  "requirements": [],
  "qualifications": [],
  "skills_mentioned": [],
  "apply_url": ""
}

PROFILE_SCHEMA:

{
  "name": "",
  "location": "",
  "contact": {
    "email": "",
    "phone": "",
    "linkedin": ""
  },
  "education": [
    {
      "degree": "",
      "field": "",
      "institution": "",
      "year": ""
    }
  ],
  "experience": [
    {
      "title": "",
      "company": "",
      "start_date": "",
      "end_date": "",
      "description": ""
    }
  ],
  "skills": [],
  "job_preferences": {
    "role_types": [],
    "industries": [],
    "locations": [],
    "remote": false,
    "number_of_jobs_wanted": 3
  },
  "additional_notes": "",
  "update_required": false,
  "last_update": 0
}

==============================================================
EXPECTED OUTPUT

You MUST output:

{
  "pass": true,
  "score": 75,
  "rationale": "Short, clear explanation."
}

That is, the output DICT must have:

    "pass": <true/false>

    "score": <integer 0–100>

    "rationale": <short string>

==============================================================
SCORING RULES

    Score range:

        If job is strongly mismatched → < 40

        If partially matched → 40–69

        If well matched → 70+

        Required skills missing → subtract points

        Conflicts with rejection_memory → subtract significantly

    Binary pass:

        pass = (score >= 60) unless the job clearly conflicts with job_preferences
        (e.g., wrong location, wrong role type, non-remote when user wants remote only, etc.).

    Consistency:

        The "pass" value and the numeric "score" MUST be logically consistent.

    NEVER fabricate missing job info. If job_details lacks certain fields, just base your decision on what IS present.

==============================================================
OUTPUT CONSTRAINTS

    Output MUST be strictly a DICT.

    NO additional commentary or text outside the DICT.

    NO markdown or code fences in the output itself.
    """,



    "job_search_agent": """
You are **Agent 1, the Job Search Agent** in the JobPilot multi-agent system.

Your role is to take the user's structured profile and retrieve the most relevant jobs from a ChromaDB vector database. You do NOT perform web search, scraping, or LLM-based content generation. You ONLY retrieve, filter, score, and rank jobs using the tools provided.

Follow this workflow EXACTLY:

==============================================================
STEP 1: RECEIVE INPUT
==============================================================

You receive:
{
  "profile": { ... PROFILE_SCHEMA ... },
  "rejection_memory": [...]
}

- profile.job_preferences contains the roles, industries, locations, and remote preferences.
- rejection_memory contains job_ids that should NOT appear again.

You MUST use these for retrieval, filtering, and ranking.


==============================================================
STEP 2: CONSTRUCT A DENSE SEMANTIC QUERY
==============================================================

You MUST generate a single dense semantic query describing the type of roles the user wants.

Combine:
- Preferred roles
- Preferred industries
- Remote preference
- Locations
- Key skills from profile.skills
- Relevant experience from profile.experience

Example format (NOT literal):
"data analyst or machine learning engineer roles in US-based remote-friendly tech companies requiring Python, ML, statistics, and agent systems experience."

You MUST produce your own query every time based on the actual profile.


==============================================================
STEP 3: QUERY CHROMADB (MANDATORY)
==============================================================

You MUST call this tool:

    chroma_query_tool:
        Input:
        {
            "query_text": "<semantic query>",
            "top_k": <integer, typically 20–50>
        }

This returns:
{
  "results": [
      {
        "job_id": "...",
        "title": "...",
        "company": "...",
        "description": "...",
        "location": "...",
        "apply_url": "...",
        "raw_text": "...",
        "embedding_metadata": { ... }
      },
      ...
  ]
}

These are the only jobs you are allowed to work with.


==============================================================
STEP 4: FILTER USING REJECTION MEMORY
==============================================================

You MUST remove any job whose job_id appears inside rejection_memory.

Never return a rejected job.
Never re-score a rejected job.
Never bypass this rule.


==============================================================
STEP 5: SCORE EACH JOB USING job_filter_agent
==============================================================

For each remaining job:

    job_filter_agent:
        Input:
        {
            "job_details": <job object>,
            "profile": <profile>,
            "rejection_memory": <rejection_memory>
        }

It returns:
{
  "pass": true/false,
  "score": 0–100,
  "rationale": "..."
}

Rules:
- Keep ONLY jobs where pass == true.
- Attach the numeric score to the job object.
- If pass == false, exclude the job entirely.


==============================================================
STEP 6: RANK JOBS
==============================================================

Call:

    rank_job_tool:
    {
        "jobs": [ list of jobs with scores ],
        "top_k": <number requested by user or default 3>
    }

This sorts the jobs by score (descending) and returns the top K.


==============================================================
STEP 7: FINAL OUTPUT (MANDATORY SCHEMA)
==============================================================

You MUST return the final object:

{
  "jobs": [ ... top_k ranked job objects ... ],
  "num_total": <number retrieved from ChromaDB>,
  "num_after_filtering": <after job_filter_agent>,
  "num_after_ranking": <final length>,
  "query_used": "<semantic query>"
}

Rules:
- NEVER invent jobs.
- NEVER fabricate missing fields.
- NEVER modify job content except for attaching the score.
- ALWAYS use the schema exactly.


==============================================================
ERROR HANDLING
==============================================================

If ChromaDB returns zero results:
Return:
{
  "jobs": [],
  "num_total": 0,
  "num_after_filtering": 0,
  "num_after_ranking": 0,
  "query_used": "<semantic query>"
}

Do NOT hallucinate jobs.
Do NOT retry with alternative queries unless explicitly instructed.


==============================================================
STRICT RULES SUMMARY
==============================================================

1. You NEVER call google_search or fetch_job_tool.
2. You NEVER scrape URLs.
3. You NEVER ask the LLM to invent job descriptions.
4. You ONLY use ChromaDB via chroma_query_tool.
5. You ALWAYS filter via job_filter_agent.
6. You ALWAYS rank via rank_job_tool.
7. You ALWAYS return structured DICT exactly matching the required output schema.

""",
    "job_summarizer_agent": """
    You are the Job Summarizer Agent in JobPilot.

Your job:
Given a structured job DICT retrieved from ChromaDB and evaluated by job_filter_agent,
summarize the job in a short, clear, user-friendly way.
==============================================================
INPUT

You will receive a single job DICT object containing fields such as:

    job_id

    title

    company

    location

    employment_type

    salary (if available)

    job_description

    requirements

    qualifications

    skills_mentioned

    apply_url

    score (0–100) from job_filter_agent

    pass (boolean)

    rationale (short explanation from filter)

This job object is based on:

JOB_DETAILS_SCHEMA:

{
  "job_id": "",
  "title": "",
  "company": "",
  "location": "",
  "employment_type": "",
  "salary": "",
  "job_description": "",
  "requirements": [],
  "qualifications": [],
  "skills_mentioned": [],
  "apply_url": ""
}

==============================================================
OUTPUT (STRICT SCHEMA)

You MUST output:

{
  "job_id": "<job_id>",
  "summary": "<2–5 sentence readable summary>",
  "score": 0,
  "link": "<url>"
}

    "job_id": MUST match the input job.job_id.

    "summary": a short, readable 2–5 sentence description.

    "score": MUST match job.score.

    "link": MUST come from job.apply_url.

==============================================================
GUIDELINES FOR SUMMARY

Your summary should:

    Clearly state:

        The role title and company.

        The main responsibilities.

        Key requirements or skills.

        Why it might be a good fit given the score context (briefly).

    NEVER invent details that are not present in the job object.

    NEVER change the numerical "score".

    NEVER change the "job_id".

    ALWAYS use job.apply_url as the "link" field.

You do NOT:

    Decide whether the user should apply.

    Filter jobs.

    Call tools.

    Store memory.

You ONLY transform structured job data into a readable summary.
""",

"resume_generator_agent": """

You are the Resume Generator Agent in JobPilot.

Your task:
Given:
• user_profile: DICT strictly following PROFILE_SCHEMA
• job: DICT strictly following JOB_DETAILS_SCHEMA, with added fields: score, pass, rationale
produce a professionally written, tailored resume for that job.
==============================================================
INPUT FORMAT

You will receive:

{
  "user_profile": {
    "name": "",
    "location": "",
    "contact": {
      "email": "",
      "phone": "",
      "linkedin": ""
    },
    "education": [
      {
        "degree": "",
        "field": "",
        "institution": "",
        "year": ""
      }
    ],
    "experience": [
      {
        "title": "",
        "company": "",
        "start_date": "",
        "end_date": "",
        "description": ""
      }
    ],
    "skills": [],
    "job_preferences": {
      "role_types": [],
      "industries": [],
      "locations": [],
      "remote": false,
      "number_of_jobs_wanted": 3
    },
    "additional_notes": "",
    "update_required": false,
    "last_update": 0
  },
  "job": {
    "job_id": "",
    "title": "",
    "company": "",
    "location": "",
    "employment_type": "",
    "salary": "",
    "job_description": "",
    "requirements": [],
    "qualifications": [],
    "skills_mentioned": [],
    "apply_url": "",
    "score": 0,
    "pass": true,
    "rationale": ""
  }
}

You MUST treat these shapes as the true schema; some fields may be empty but the keys exist.
==============================================================
OUTPUT SCHEMA (STRICT)

You MUST output:

{
  "job_id": "<same as job.job_id>",
  "resume_text": "<professionally formatted tailored resume>"
}

    "job_id": MUST equal the input job.job_id.

    "resume_text": a complete resume as plain text.

==============================================================
RESUME RULES

    The resume MUST be tailored to the given job's:
    • responsibilities
    • requirements
    • preferred skills

    You MUST prioritize relevant parts of the user_profile (skills, experience, education).

    You MUST NOT fabricate:
    • degrees
    • job titles
    • companies
    • certifications
    • skills that the user does not list

You MAY:

    Restructure experience.

    Rewrite bullet points for clarity and impact.

    Emphasize matching skills or achievements.

Tone:

    Polished, professional, concise.

Format:

    You may use headings and bullet points as plain text, but the entire output must be a single string in "resume_text".

    No markdown formatting (no triple backticks or markdown headings).

==============================================================
OUTPUT CONSTRAINTS

    Output MUST be a strictly valid DICT.

    No extra keys.

    No text outside the DICT.
    """,

    "cover_letter_generator_agent": """
    You are the Cover Letter Generator Agent in JobPilot.

Your task:
Given:
• user_profile (PROFILE_SCHEMA)
• job (JOB_DETAILS_SCHEMA + score + pass + rationale)
produce a tailored 2–4 paragraph cover letter.
==============================================================
INPUT FORMAT

You will receive:

{
  "user_profile": {
    "name": "",
    "location": "",
    "contact": {
      "email": "",
      "phone": "",
      "linkedin": ""
    },
    "education": [
      {
        "degree": "",
        "field": "",
        "institution": "",
        "year": ""
      }
    ],
    "experience": [
      {
        "title": "",
        "company": "",
        "start_date": "",
        "end_date": "",
        "description": ""
      }
    ],
    "skills": [],
    "job_preferences": {
      "role_types": [],
      "industries": [],
      "locations": [],
      "remote": false,
      "number_of_jobs_wanted": 3
    },
    "additional_notes": "",
    "update_required": false,
    "last_update": 0
  },
  "job": {
    "job_id": "",
    "title": "",
    "company": "",
    "location": "",
    "employment_type": "",
    "salary": "",
    "job_description": "",
    "requirements": [],
    "qualifications": [],
    "skills_mentioned": [],
    "apply_url": "",
    "score": 0,
    "pass": true,
    "rationale": ""
  }
}

==============================================================
OUTPUT SCHEMA (STRICT)

You MUST output:

{
  "job_id": "<string>",
  "cover_letter_text": "<string>"
}

    "job_id": MUST match job.job_id.

    "cover_letter_text": the full cover letter as plain text.

==============================================================
COVER LETTER RULES

CONTENT:

    Explain:
    • Why the user is a strong match for job.title at job.company.
    • Relevant experience & skills tied directly to the job requirements.
    • Tangible value the user offers the company.
    • Motivation for the role and/or company (grounded in the job and profile).

TONE:

    Professional, warm, confident.

    NOT generic; MUST reference job.title and job.company at least once.

    Use the user's profile information for specificity.

STRUCTURE:

    2–4 paragraphs.

    Coherent and personalized.

    Clear opening, body, and closing.

CONSTRAINTS:

    NO tool calls.

    NO user interaction.

    ONLY output valid JSON with keys "job_id" and "cover_letter_text".

    No markdown, no code fences.

==============================================================
END OF SPECIFICATION

""",

"application_builder_agent": """

You are the Application Builder Agent (Agent 2) in JobPilot.

Your job:
Take the final selected job list from the orchestrator and produce complete application packages
(resume + cover letter) by calling your sub-agents.
==============================================================
EXPECTED INPUT SCHEMA

You will receive:

{
  "selected_jobs": [
    {
      "job_id": "",
      "title": "",
      "company": "",
      "location": "",
      "employment_type": "",
      "salary": "",
      "job_description": "",
      "requirements": [],
      "qualifications": [],
      "skills_mentioned": [],
      "apply_url": "",
      "score": 0,
      "pass": true,
      "rationale": ""
    }
  ],
  "user_profile": {
    "name": "",
    "location": "",
    "contact": {
      "email": "",
      "phone": "",
      "linkedin": ""
    },
    "education": [
      {
        "degree": "",
        "field": "",
        "institution": "",
        "year": ""
      }
    ],
    "experience": [
      {
        "title": "",
        "company": "",
        "start_date": "",
        "end_date": "",
        "description": ""
      }
    ],
    "skills": [],
    "job_preferences": {
      "role_types": [],
      "industries": [],
      "locations": [],
      "remote": false,
      "number_of_jobs_wanted": 3
    },
    "additional_notes": "",
    "update_required": false,
    "last_update": 0
  }
}

Each element of selected_jobs is a job DICT from Agent 1 (job_search_agent), enriched with
score, pass, and rationale.
==============================================================
PROCESS

For EACH job in selected_jobs:

    1. Call resume_generator_agent with:

    {
      "user_profile": <PROFILE_SCHEMA>,
      "job": <job DICT>
    }

    It returns:

    {
      "job_id": "<string>",
      "resume_text": "<string>"
    }

    2. Call cover_letter_generator_agent with:

    {
      "user_profile": <PROFILE_SCHEMA>,
      "job": <job DICT>
    }

    It returns:

    {
      "job_id": "<string>",
      "cover_letter_text": "<string>"
    }

    3. Combine both into a single application object:

    {
      "job_id": "<string>",
      "resume_text": "<string>",
      "cover_letter_text": "<string>"
    }

Collect all such application objects into a list.
==============================================================
FINAL OUTPUT SCHEMA

You MUST output:

{
  "applications": [
    {
      "job_id": "<string>",
      "resume_text": "<string>",
      "cover_letter_text": "<string>"
    }
  ]
}

    The "applications" list MUST be in the SAME order as selected_jobs.

    job_id MUST match the job.job_id from Agent 1 for each respective job.

==============================================================
RULES

    NEVER generate resume_text or cover_letter_text yourself; always call the sub-agents.

    NEVER modify job data or profile data.

    You MAY only assemble and return structured results.

    You MUST return DICT ONLY ‚Äî no extra keys, no additional text.

    No contacting the user; the orchestrator handles communication.

    If a sub-agent returns invalid or incomplete JSON, you should still produce a
    structured error object if possible, but your primary output schema remains:

    {
    "applications": [ ... ]
    }

==============================================================
END OF SPECIFICATION

"""
}

Writing /kaggle/working/instructions.py
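
The `application_builder_agent` specification above fixes an output contract: a single `applications` key, one entry per selected job, in the same order, with matching `job_id` values. A minimal sketch of how that contract could be checked in plain Python follows. The helper name `check_applications` is hypothetical and is not part of the notebook.

```python
from typing import Any, Dict, List

# Keys every application object must carry, per the FINAL OUTPUT SCHEMA.
REQUIRED_KEYS = {"job_id", "resume_text", "cover_letter_text"}


def check_applications(
    output: Dict[str, Any], selected_jobs: List[Dict[str, Any]]
) -> List[str]:
    """Return a list of contract violations (an empty list means the output conforms)."""
    apps = output.get("applications")
    if not isinstance(apps, list):
        return ['"applications" must be a list']

    problems: List[str] = []
    if set(output) != {"applications"}:
        problems.append("unexpected top-level keys")
    if len(apps) != len(selected_jobs):
        problems.append("applications and selected_jobs differ in length")

    # Compare position by position: order must follow selected_jobs.
    for i, (app, job) in enumerate(zip(apps, selected_jobs)):
        if not isinstance(app, dict) or set(app) != REQUIRED_KEYS:
            problems.append(f"applications[{i}] must have exactly {sorted(REQUIRED_KEYS)}")
            continue
        if app["job_id"] != job.get("job_id"):
            problems.append(f"applications[{i}].job_id does not match selected_jobs[{i}]")
    return problems
```

A check like this could run on the agent's parsed JSON before the result is shown to the user, so that reordered or mismatched packages are caught early.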


# 🧩 JobPilot: Unified Schema Definitions

This cell writes **schemas.py**, the central module containing all structured data schemas used throughout JobPilot.

These schemas define the **exact shape** of every structured object passed between agents, tools, and memory:

- **PROFILE_SCHEMA** ‚Äî user profile structure  
- **JOB_DETAILS_SCHEMA** ‚Äî normalized job posting structure  
- **JOB_FILTER_OUTPUT_SCHEMA** ‚Äî scoring/filtering output  

All agents rely on these schemas for validation, consistency, and interoperability.  
They mirror the specifications defined in `instructions.py` and ensure that every component of the system speaks the same “data language.”

This module serves as JobPilot’s single source of truth for structured data formats.
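
Because the schemas are plain dictionaries of default values, one way to use them for consistency is to coerce any incoming object onto the schema's keys. The sketch below is illustrative only: `conform` is a hypothetical helper, and `EXAMPLE_SCHEMA` is a small made-up schema in the same style, not one of the real JobPilot schemas.

```python
import copy
from typing import Any, Dict

# Illustrative schema in the same "dict of defaults" style (not a real JobPilot schema).
EXAMPLE_SCHEMA: Dict[str, Any] = {
    "job_id": "",
    "title": "",
    "requirements": [],
    "contact": {"email": "", "phone": ""},
}


def conform(obj: Dict[str, Any], schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of obj with exactly the schema's keys, filling gaps with defaults."""
    result: Dict[str, Any] = {}
    for key, default in schema.items():
        value = obj.get(key, default)
        if isinstance(default, dict):
            # Recurse into nested objects; a non-dict value is replaced by defaults.
            value = conform(value if isinstance(value, dict) else {}, default)
        # Deep-copy so mutable defaults (lists) are never shared between records.
        result[key] = copy.deepcopy(value)
    return result


record = conform({"job_id": "abc", "extra": 1, "contact": {"email": "a@b.c"}}, EXAMPLE_SCHEMA)
print(record)
```

Unknown keys such as `extra` are dropped and missing keys are filled in, so downstream agents always see the same shape.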


In [29]:
%%writefile /kaggle/working/schemas.py
"""
Unified schema definitions for JobPilot.

These schemas mirror EXACTLY the structures defined inside instructions.py.
They are simple Python dictionaries representing the expected fields and default
values for all structured objects shared across the JobPilot system.

"""


PROFILE_SCHEMA = {
    "name": "",
    "location": "",
    "contact": {
        "email": "",
        "phone": "",
        "linkedin": ""
    },
    "education": [
        {
            "degree": "",
            "field": "",
            "institution": "",
            "year": ""
        }
    ],
    "experience": [
        {
            "title": "",
            "company": "",
            "start_date": "",
            "end_date": "",
            "description": ""
        }
    ],
    "skills": [],
    "job_preferences": {
        "role_types": [],
        "industries": [],
        "locations": [],
        "remote": False,
        "number_of_jobs_wanted": 3
    },
    "additional_notes": "",
    "update_required": False,
    "last_update": 0
}



JOB_DETAILS_SCHEMA = {
    "job_id": "",
    "title": "",
    "company": "",
    "location": "",
    "employment_type": "",
    "salary": "",
    "job_description": "",
    "requirements": [],
    "qualifications": [],
    "skills_mentioned": [],
    "apply_url": ""
}




JOB_FILTER_OUTPUT_SCHEMA = {
    "pass": False,
    "score": 0,
    "rationale": ""
}



schemas = {
    "PROFILE_SCHEMA": PROFILE_SCHEMA,
    "JOB_DETAILS_SCHEMA": JOB_DETAILS_SCHEMA,
    "JOB_FILTER_OUTPUT_SCHEMA": JOB_FILTER_OUTPUT_SCHEMA,
}


Overwriting /kaggle/working/schemas.py


# 🛠️ JobPilot: Job Ingestion Module

This cell writes the `ingest_jobs.py` file, a **fully independent job ingestion pipeline** used to populate JobPilot’s vector database (ChromaDB) with job postings.

Although separate from the multi-agent system, this module is an essential component that prepares the job dataset used by JobPilot's Job Search Agent.

### **What this ingestion module does**
- Connects to the **same ChromaDB instance** used by the JobPilot pipeline  
- Retrieves job posting URLs (configurable source)  
- Fetches the raw HTML for each URL  
- Extracts key job information from the HTML using an LLM extraction agent  
- Normalizes all fields into `JOB_DETAILS_SCHEMA`  
- Deduplicates entries using a deterministic job_id  
- Embeds raw job text using a local embedding model  
- Stores job metadata + vector embeddings inside ChromaDB  
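
The deduplication step relies on an ID that depends only on the posting URL. A minimal sketch of that idea is shown here, using the same SHA-256 prefix scheme as the ingestion script. The helper names `make_job_id` and `dedupe_urls` are hypothetical, and the URLs are placeholders.

```python
import hashlib
from typing import Dict, Iterable


def make_job_id(url: str) -> str:
    """Derive a stable 16-hex-character ID from the posting URL."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]


def dedupe_urls(urls: Iterable[str]) -> Dict[str, str]:
    """Map job_id -> URL, keeping the first occurrence of each URL."""
    seen: Dict[str, str] = {}
    for url in urls:
        seen.setdefault(make_job_id(url), url)
    return seen


unique = dedupe_urls([
    "https://example.com/jobs/1",
    "https://example.com/jobs/2",
    "https://example.com/jobs/1",  # repeated URL maps to the same job_id
])
print(len(unique))
```

Because the ID is a pure function of the URL, re-running ingestion on the same posting produces the same ID and the record can be skipped. Two different URLs that point at the same posting would still be treated as separate jobs.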

This allows JobPilot’s retrieval system to perform fast semantic search over real job postings; retrieval itself needs no LLM calls or external APIs at runtime.
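
To make "semantic search" concrete: the `jobs` collection is configured for cosine distance, so retrieval amounts to finding the stored vectors that point in the most similar direction to the query vector. The sketch below is a brute-force illustration with toy 3-dimensional vectors. It is not how ChromaDB works internally (ChromaDB uses an approximate HNSW index), and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float], index: Dict[str, Sequence[float]], k: int = 3
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and return the k best matches."""
    scored = [(job_id, cosine_similarity(query_vec, vec)) for job_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "embeddings"; real ones come from the sentence-transformer model.
index = {"ml": [1.0, 0.0, 0.0], "data": [0.8, 0.6, 0.0], "chef": [0.0, 0.0, 1.0]}
print(top_k([1.0, 0.1, 0.0], index, k=2))
```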

The ingestion script is intended to be run periodically to refresh and expand the job database powering the Job Search Agent.


In [28]:
#%%writefile /kaggle/working/ingest_jobs.py
"""
JobPilot: Autonomous Job Ingestion Pipeline (FINAL VERSION)

Pipeline:
1. Use ADK Google Search to discover job URLs
2. Fetch raw HTML for each URL
3. Extract structured job details using LLM
4. Insert into ChromaDB using SentenceTransformer embeddings
5. Print summary

This script uses the same collection name and embedding model as main.py.
"""

import os
import hashlib
import requests
import chromadb
from sentence_transformers import SentenceTransformer
from chromadb.utils import embedding_functions


CHROMA_DB_PATH = "/kaggle/working/jobpilot_chroma_db"   

HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"
}

JOB_DETAILS_SCHEMA = {
    "job_id": "",
    "title": "",
    "company": "",
    "location": "",
    "employment_type": "",
    "salary": "",
    "job_description": "",
    "requirements": [],
    "qualifications": [],
    "skills_mentioned": [],
    "apply_url": ""
}


class LocalEmbeddingFunction:
    def __init__(self):
        self.model = SentenceTransformer("all-MiniLM-L6-v2")

    def __call__(self, input):
        if isinstance(input, str):
            input = [input]
        return self.model.encode(input, convert_to_numpy=True).tolist()

    def name(self):
        return "local-mini-lm-l6-v2"

embedding_fn = LocalEmbeddingFunction()


def connect_to_chromadb():
    client = chromadb.PersistentClient(path=CHROMA_DB_PATH)
    jobs = client.get_or_create_collection(
        name="jobs",
        metadata={"hnsw:space": "cosine"},
        embedding_function=embedding_fn
    )
    return jobs


from google.adk.tools.function_tool import FunctionTool
from google.adk.tools.google_search_tool import google_search

def job_link_search(tool_context, query: str, n_results: int = 15):
    """
    Uses ADK Google Search to retrieve job URLs.
    Returns only clean http/https URLs.
    """
    try:
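        # NOTE: this assumes `google_search` can be called like a plain function that
        # returns {"search_results": [...]}. In ADK it is a built-in tool object meant
        # to be attached to an agent, so this call may raise; the except branch below
        # then returns an empty result together with the error message.
        output = google_search(query=query, n_results=n_results)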
        raw = output.get("search_results", [])

        urls = []
        for item in raw:
            link = item.get("link")
            if isinstance(link, str) and link.startswith("http"):
                urls.append(link)

        return {"query": query, "count": len(urls), "urls": urls}

    except Exception as e:
        return {"query": query, "count": 0, "urls": [], "error": str(e)}

job_link_search_tool_adk = FunctionTool(func=job_link_search)



def get_job_urls(query="machine learning engineer remote", n_results=15):
    """
    Uses ADK tool to discover real job posting URLs.
    """
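    # Call the underlying function directly: FunctionTool objects are meant to be
    # invoked by an agent runtime rather than through a synchronous .run() call.
    result = job_link_search(None, query=query, n_results=n_results)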

    urls = result.get("urls", [])
    print(f"[INFO] ADK search discovered {len(urls)} job links.")
    return urls



def fetch_html(url: str) -> str | None:
    try:
        resp = requests.get(url, headers=HEADERS, timeout=12)
        if resp.status_code == 200:
            return resp.text
        print(f"[WARN] Failed {url}: status {resp.status_code}")
    except Exception as e:
        print(f"[ERROR] Fetch error for {url}: {e}")
    return None



from google.adk.agents import LlmAgent
from google.adk.models.google_llm import Gemini

HTML_EXTRACTION_INSTRUCTION = """
You are the Job HTML Extraction Agent.

Given raw HTML and a job URL, extract job details into JOB_DETAILS_SCHEMA.

Output EXACTLY this JSON dict:

{
  "job_id": "<SHA256(url)[:16]>",
  "title": "",
  "company": "",
  "location": "",
  "employment_type": "",
  "salary": "",
  "job_description": "",
  "requirements": [],
  "qualifications": [],
  "skills_mentioned": [],
  "apply_url": "<same as input url>"
}

RULES:
- Extract ONLY what appears in the HTML.
- NEVER hallucinate information.
- Missing fields → leave empty.
- All lists MUST be lists of strings.
- No markdown, no commentary.
"""

gemini_flash = Gemini(model="gemini-2.5-flash")

html_extractor_agent = LlmAgent(
    model=gemini_flash,
    name="html_extractor_agent",
    description="Extracts structured job details from raw HTML.",
    instruction=HTML_EXTRACTION_INSTRUCTION
)

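def parse_job_html(html: str, url: str) -> dict:
    # Deterministic ID: first 16 hex characters of the SHA-256 of the URL
    job_id = hashlib.sha256(url.encode()).hexdigest()[:16]

    # NOTE: assumes the agent exposes a synchronous run() that returns a parsed dict.
    # ADK agents are normally executed through a Runner, so this call may need adapting.
    response = html_extractor_agent.run({
        "url": url,
        "html": html
    })

    # Fall back to an empty record if the agent did not return a dict
    if not isinstance(response, dict):
        response = {}

    # Enforce schema: never trust the model for the ID or the URL
    response["job_id"] = job_id
    response["apply_url"] = url

    # Ensure all keys exist; copy list defaults so records never share one list
    for k, v in JOB_DETAILS_SCHEMA.items():
        response.setdefault(k, list(v) if isinstance(v, list) else v)

    return response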


def job_exists(collection, job_id: str) -> bool:
    try:
        out = collection.get(ids=[job_id])
        return len(out.get("ids", [])) > 0
    except Exception:
        return False


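def insert_job(collection, job_details: dict, raw_html: str):
    import json  # local import; only needed here
    # Chroma has historically required scalar metadata values (str, int, float, bool),
    # so list fields are stored as JSON strings.
    metadata = {k: json.dumps(v) if isinstance(v, (list, dict)) else v
                for k, v in job_details.items()}
    collection.add(
        ids=[job_details["job_id"]],
        documents=[raw_html],
        metadatas=[metadata]
    )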



def ingest():
    jobs_collection = connect_to_chromadb()

    urls = get_job_urls(
        query="machine learning engineer remote",
        n_results=15
    )

    inserted = 0
    skipped = 0

    for url in urls:
        print(f"\n[INFO] Processing: {url}")

        html = fetch_html(url)
        if not html:
            print("[WARN] Skipping: no HTML")
            continue

        parsed = parse_job_html(html, url)
        job_id = parsed["job_id"]

        if job_exists(jobs_collection, job_id):
            print(f"[INFO] Skipped (already exists): {job_id}")
            skipped += 1
            continue

        insert_job(jobs_collection, parsed, html)
        print(f"[SUCCESS] Inserted: {job_id}")
        inserted += 1

    print("\n======== INGEST SUMMARY ========")
    print(f"Inserted: {inserted}")
    print(f"Skipped: {skipped}")
    print("================================\n")


if __name__ == "__main__":
    ingest()


Overwriting /kaggle/working/ingest_jobs.py


# 🚀 JobPilot: Main Multi-Agent System (ADK Orchestrator)

This cell writes the full `main.py` module, which serves as the **core entry point of the JobPilot application**.  
It initializes all key components of the multi-agent job-search engine built with Google’s Agent Development Kit (ADK).

### **What this module does**
It sets up the entire JobPilot runtime:

### **1. Services & Infrastructure**
- Loads API keys and environment variables  
- Initializes the **SQLite-backed DatabaseSessionService** (for persistent sessions & memory)  
- Connects to the **shared ChromaDB vector store** that contains job postings  
- Registers the chosen embedding function for semantic search  

### **2. Tooling for the Agents**
Defines ADK-compliant tools including:
- `chroma_query_tool` → semantic job search  
- `rank_job_tool` → deterministic ranking of job candidates  

These tools are used internally by the Job Search Agent.
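
"Deterministic ranking" here means that the same scored jobs always come back in the same order, with no LLM involved. The following is a minimal sketch of that idea, not the notebook's `rank_job_tool` itself. The tie-break on `job_id` is an added assumption so that equal scores do not depend on input order, and `rank_jobs` is a hypothetical name.

```python
from typing import Any, Dict, List


def rank_jobs(jobs: List[Dict[str, Any]], top_k: int) -> List[Dict[str, Any]]:
    """Sort jobs by descending score (ties broken by job_id) and keep the top_k."""
    scored = [
        j for j in jobs
        if isinstance(j, dict)
        and isinstance(j.get("score"), (int, float))
        and not isinstance(j.get("score"), bool)  # bool is a subclass of int
    ]
    scored.sort(key=lambda j: (-j["score"], str(j.get("job_id", ""))))
    return scored[:max(top_k, 0)]


jobs = [
    {"job_id": "b", "score": 80},
    {"job_id": "a", "score": 80},
    {"job_id": "c", "score": 95},
    {"job_id": "d", "score": None},  # unscored jobs are dropped
]
print([j["job_id"] for j in rank_jobs(jobs, 2)])
```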

### **3. All LLM Agents**
Initializes every JobPilot agent using Gemini models:
- **Orchestrator Agent** (top-level controller)  
- **Profile Builder Agent**  
- **Job Search Agent**  
- **Job Filter Agent**  
- **Job Summarizer Agent**  
- **Resume Generator Agent**  
- **Cover Letter Generator Agent**  
- **Application Builder Agent**

Each agent receives its corresponding instruction block from `instructions.py`, ensuring consistent behavior.

### **4. Agent → Tool Wiring**
Attaches tools to the correct agents following ADK conventions:
- Orchestrator uses: profile builder → job search → summarizer → application builder  
- Job Search Agent uses: `chroma_query_tool`, `job_filter_agent`, `rank_job_tool`  
- Application Builder uses: resume + cover letter generators  
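
The wiring above forms a small delegation tree rooted at the orchestrator. As an illustration only, it can be written down as a plain mapping and walked to list everything the orchestrator can reach, directly or through a sub-agent. The `WIRING` dictionary and `reachable` helper are hypothetical and do not use any ADK API.

```python
from typing import Dict, List

# Delegation map copied from the wiring list; agents without an entry have no tools.
WIRING: Dict[str, List[str]] = {
    "orchestrator_agent": [
        "profile_builder_agent",
        "job_search_agent",
        "job_summarizer_agent",
        "application_builder_agent",
    ],
    "job_search_agent": ["chroma_query_tool", "job_filter_agent", "rank_job_tool"],
    "application_builder_agent": ["resume_generator_agent", "cover_letter_generator_agent"],
}


def reachable(root: str, wiring: Dict[str, List[str]]) -> List[str]:
    """Depth-first list of every agent or tool the root can delegate to."""
    order: List[str] = []
    stack = list(reversed(wiring.get(root, [])))
    while stack:
        node = stack.pop()
        if node in order:
            continue  # already visited; also guards against cycles
        order.append(node)
        stack.extend(reversed(wiring.get(node, [])))
    return order


for name in reachable("orchestrator_agent", WIRING):
    print(name)
```

Writing the graph down this way makes it easy to confirm that leaf agents (filter, summarizer, resume and cover letter generators) delegate to nothing.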

### **5. Runner Setup**
Creates an ADK `Runner` with:
- the orchestrator as the root agent  
- global logging  
- persistent session handling  

This is what enables multi-turn state, memory, and a reproducible job search workflow.

### **6. Optional Debug Entry Point**
The `main()` coroutine at the bottom demonstrates a **full end-to-end test run** using a sample user profile.  
Running it triggers the entire JobPilot pipeline:
1. Profile extraction  
2. Job database lookup  
3. Filtering & ranking  
4. Summarization  
5. Application package generation  

---

This module is the **heart of the JobPilot system**: the part that ties every agent, tool, schema, memory, and vector-search capability together into one cohesive multi-agent application.


In [30]:
#%%writefile /kaggle/working/main.py
import os
import json
import hashlib
from typing import List, Dict, Any
from dotenv import load_dotenv
import asyncio
import requests
from pydantic import BaseModel, Field

from google.genai import types
from google.adk.agents import LlmAgent
from google.adk.models.google_llm import Gemini
from google.adk.sessions import DatabaseSessionService
from google.adk.tools.agent_tool import AgentTool, ToolContext
from google.adk.tools.function_tool import FunctionTool
from google.adk.tools.google_search_tool import google_search
from google.adk.runners import Runner
from google.adk.plugins.logging_plugin import LoggingPlugin

from schemas import (
    JOB_DETAILS_SCHEMA,
    PROFILE_SCHEMA,
    JOB_FILTER_OUTPUT_SCHEMA
)

from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()
api_key = user_secrets.get_secret("GOOGLE_API_KEY")

os.environ["GOOGLE_API_KEY"] = api_key

retry_config = types.HttpRetryOptions(
    attempts=5,
    exp_base=7,
    initial_delay=1,
    http_status_codes=[429, 500, 503, 504],
)

gemini_flash = Gemini(model="gemini-2.5-flash", retry_options=retry_config)
gemini_lite = Gemini(model="gemini-2.5-flash-lite", retry_options=retry_config)

session_service = DatabaseSessionService(
    db_url="sqlite:////kaggle/working/autoapply_sessions.db"
)

from sentence_transformers import SentenceTransformer

class LocalEmbeddingFunction:
    def __init__(self):
        self.model = SentenceTransformer("all-MiniLM-L6-v2")

    def __call__(self, input):
        if isinstance(input, str):
            input = [input]
        return self.model.encode(input, convert_to_numpy=True).tolist()

    def name(self):
        return "local-mini-lm-l6-v2"

embedding_fn = LocalEmbeddingFunction()


import chromadb
from chromadb.utils import embedding_functions

CHROMA_DB_PATH = "/kaggle/working/jobpilot_chroma_db"  # same absolute path as ingest_jobs.py
client = chromadb.PersistentClient(path=CHROMA_DB_PATH)


jobs_collection = client.get_or_create_collection(
    name="jobs",
    metadata={"hnsw:space": "cosine"},
    embedding_function=embedding_fn
)

def chroma_query_tool(
    tool_context: ToolContext,
    query_text: str,
    top_k: int = 20
) -> Dict[str, Any]:
    """
    Performs a semantic search against the ChromaDB 'jobs' collection.

    Inputs:
        query_text (str): Dense semantic query created by job_search_agent.
        top_k (int): Number of results to return from vector search.

    Returns:
        {
            "results": [ job documents ],
            "query_text": "<query used>",
            "top_k": <int>,
            "num_returned": <int>,
            "error": None or <string>
        }
    """
    if not isinstance(query_text, str) or len(query_text.strip()) == 0:
        return {
            "results": [],
            "query_text": query_text,
            "top_k": top_k,
            "num_returned": 0,
            "error": "Invalid or empty query_text."
        }

    try:
        query_results = jobs_collection.query(
            query_texts=[query_text],
            n_results=top_k
        )

        documents = []
        if (
            query_results
            and "documents" in query_results
            and len(query_results["documents"]) > 0
        ):
            for idx, doc in enumerate(query_results["documents"][0]):
                metadata = query_results["metadatas"][0][idx]
                documents.append(metadata)

        return {
            "results": documents,
            "query_text": query_text,
            "top_k": top_k,
            "num_returned": len(documents),
            "error": None
        }

    except Exception as e:
        return {
            "results": [],
            "query_text": query_text,
            "top_k": top_k,
            "num_returned": 0,
            "error": f"CHROMA_EXCEPTION: {str(e)}"
        }

chroma_query_tool_adk = FunctionTool(func=chroma_query_tool)

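def rank_job_tool(tool_context: ToolContext, jobs: List[Dict[str, Any]], top_k: int) -> Dict[str, Any]:
    """
    Ranks scored jobs by descending "score" and returns the top_k entries.
    Jobs without a numeric score are dropped.
    """
    if not isinstance(jobs, list):
        return {
            "jobs": [],
            "top_k": top_k,
            "total_jobs_in": 0,
            "total_jobs_ranked": 0,
            "error": "Invalid input: jobs must be a list."
        }

    # Keep only dict entries that carry a numeric score
    valid_jobs = [
        j for j in jobs
        if isinstance(j, dict) and isinstance(j.get("score"), (int, float))
    ]
    ranked = sorted(valid_jobs, key=lambda j: j["score"], reverse=True)
    top_ranked = ranked[:top_k]

    # Same keys as the error path, so callers can always read "jobs" and "error"
    return {
        "jobs": top_ranked,
        "top_k": top_k,
        "total_jobs_in": len(jobs),
        "total_jobs_ranked": len(top_ranked),
        "error": None
    }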

rank_job_tool_adk = FunctionTool(func=rank_job_tool)

from instructions import instructions_json

orchestrator_agent = LlmAgent(
    model=gemini_flash,
    name="orchestrator_agent",
    description="Top-level controller for the JobPilot multi-agent system.",
    instruction=instructions_json['orchestrator_agent']
)

class ProfileBuilderInput(BaseModel):
    user_text: str
    existing_profile: Dict[str, Any] | None = None

profile_builder_agent = LlmAgent(
    model=gemini_flash,
    name="profile_builder_agent",
    description="Parses the user's free-form background into PROFILE_SCHEMA.",
    input_schema=ProfileBuilderInput,
    static_instruction=instructions_json['profile_builder_agent']
)

job_filter_agent = LlmAgent(
    model=gemini_lite,
    name="job_filter_agent",
    description="Evaluates user–job fit and produces a binary pass/fail and numeric score.",
    static_instruction=instructions_json['job_filter_agent']
)

class JobSearchAgentInput(BaseModel):
    profile: Dict[str, Any]
    rejection_memory: List[Any]

job_search_agent = LlmAgent(
    model=gemini_flash,
    name="job_search_agent",
    description="Searches for jobs in the existing database.",
    input_schema=JobSearchAgentInput,
    instruction=instructions_json['job_search_agent']
)

job_summarizer_agent = LlmAgent(
    model=gemini_lite,
    name="job_summarizer_agent",
    description="Generates clear, concise summaries of job postings.",
    static_instruction=instructions_json['job_summarizer_agent']
)

resume_generator_agent = LlmAgent(
    model=gemini_flash,
    name="resume_generator_agent",
    description="Generates a fully tailored resume for a specific job.",
    static_instruction=instructions_json['resume_generator_agent']
)

cover_letter_agent = LlmAgent(
    model=gemini_flash,
    name="cover_letter_generator_agent",
    description="Generates a tailored cover letter for a job.",
    instruction=instructions_json['cover_letter_generator_agent']
)

application_builder_agent = LlmAgent(
    model=gemini_flash,
    name="application_builder_agent",
    description="Agent 2 in JobPilot. Coordinates resume and cover letter generation.",
    instruction=instructions_json['application_builder_agent']
)

# === Attach Tools ===
profile_builder_agent_adk = AgentTool(agent=profile_builder_agent)
job_filter_agent_adk = AgentTool(agent=job_filter_agent)
resume_generator_agent_adk = AgentTool(agent=resume_generator_agent)
cover_letter_agent_adk = AgentTool(agent=cover_letter_agent)

orchestrator_agent.tools = [
    profile_builder_agent_adk,
    AgentTool(agent=job_search_agent),
    AgentTool(agent=job_summarizer_agent),
    AgentTool(agent=application_builder_agent),
]

job_search_agent.tools = [
    chroma_query_tool_adk,
    job_filter_agent_adk,
    rank_job_tool_adk,
]

job_filter_agent.tools = []
job_summarizer_agent.tools = []
resume_generator_agent.tools = []
cover_letter_agent.tools = []

application_builder_agent.tools = [
    resume_generator_agent_adk,
    cover_letter_agent_adk,
]

APP_NAME = "JobPilot_AgentSystem"

runner = Runner(
    agent=orchestrator_agent,
    app_name=APP_NAME,
    session_service=session_service,
    plugins=[LoggingPlugin()],
)

load_dotenv()
api_key = os.environ.get("GOOGLE_API_KEY")

async def main():
    test_input = """
Hi, my name is Ofer Harpaz Vaizman.

I'm currently based in Rockville, Maryland.
My phone number is 240-316-0830 and my email is oferharvai@gmail.com.
My LinkedIn is https://www.linkedin.com/in/ofer-v-data-analysis.

I have a BSc in Mathematics from the Open University of Israel (graduated with honors).
I also hold certifications in NASM CPT and CPR/AED.

Experience-wise, I’ve worked on several analytics and machine learning projects.
I’ve built agent-based systems (including multi-agent pipelines using Google’s ADK),
done data analysis in Python, and completed various machine learning projects ranging from
supervised models to RNNs, CNNs, and transformer-based architectures.

I also have experience tutoring students in math and assisting in coaching at a climbing gym.

My main skills include Python, data analysis, statistics, machine learning, agent systems,
and fitness coaching. I'm also familiar with TensorFlow, SQLAlchemy, and web scraping.

For job preferences:
I'm mainly looking for Data Analyst, Machine Learning Engineer, or AI Engineer roles.
I prefer remote or hybrid positions, ideally in the United States.
Industries I’m most interested in: AI, tech, startups, research organizations, or fitness tech.

I’d like to see 3 job options for now.
Let me know what roles you find.
"""

    response = await runner.run_debug(
        test_input,
        session_id="my_new_session_014"
    )

    print("\n============================")
    print("🟢 Test Run Complete")
    print("============================")
    print(response)
    print("ready")

print("Successful")

Overwriting /kaggle/working/main.py


In [None]:
await main()