# <center>**`Project Details`**</center>

#### **Purpose**:

This project's goal is to match a resume to a job description. A poorly aligned resume can lead to missed opportunities, even when the candidate is a strong fit. The project aims to showcase how AI agents can help applicants tailor their resumes more strategically, uncover hidden gaps, and present a stronger case to recruiters, all with minimal effort.

A [GitHub repo](https://github.com/vikrambhat2/MultiAgents-with-CrewAI-ResumeJDMatcher) tackling this problem already exists; our goal is to improve it by separating the **Backend** and **Frontend** logic.

##### **Why Split**?

 - Separation of concerns: The backend holds business logic, agent orchestration, model calls, state, data processing, and API endpoints, while the frontend focuses on UI/UX, user interaction, session management, file upload, and displaying results.
 - Scalability: Backend can scale independently of UI (and can even serve other clients)
 - Security: Sensitive logic, API keys, and resource-intensive processing are kept server-side
 - Performance: Streamlit remains snappy, while heavy lifting is offloaded to backend
 
##### **Responsibilities**

  - *<u>Backend</u>*: 
    - Expose REST API endpoints:
        - `/match`: Accepts resume + JD, returns match results and insights.
        - `/enhance`: Accepts resume + JD, returns resume improvement suggestions.
        - `/cover-letter`: Accepts resume + JD, returns a cover letter.
    - Agent orchestration: All CrewAI workflows run here.
    - Input validation, error handling.
    - PDF/text parsing if desired (or can also be handled in frontend, see below).
    - Optional: Authentication, user/session management, logging, monitoring.
    - Optional: Serve as an async queue for heavy jobs if latency is an issue (using Celery/RQ, etc.).
 
 - *<u>Frontend</u> (Streamlit)*:
    - UI for uploading files, entering/pasting text.
    - Visualization: Render reports, scores, enhanced resume, cover letter, etc.
    - API client: Handles all interaction with FastAPI backend.
    - Light preprocessing: E.g., local PDF parsing if you want to send plain text to backend (saves bandwidth).
    - Session/user state, feedback, download links, etc.

Here is how the system works (Flow):

 1. User uploads resume & JD (PDF or text) → Streamlit UI

 2. Frontend extracts or passes files → Sends to FastAPI (as text or file)

 3. FastAPI endpoint receives, orchestrates CrewAI agents, returns structured results

 4. Streamlit displays results, progress, suggestions, etc.
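
The flow above can be sketched from the client side. This is a minimal sketch, assuming the `/submit-job` and `/job/{job_id}` routes defined in the router later in this notebook. The `post_json` and `get_json` callables and the `run_job` helper are hypothetical stand-ins for whatever HTTP client the Streamlit app ends up using.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport signatures: the real frontend could back these
# with any HTTP client that sends/receives JSON.
PostJson = Callable[[str, Dict[str, Any]], Dict[str, Any]]
GetJson = Callable[[str], Dict[str, Any]]

# States after which polling can stop.
TERMINAL_STATES = {"SUCCESS", "FAILURE", "REVOKED"}


def run_job(
    post_json: PostJson,
    get_json: GetJson,
    job_type: str,
    resume: str,
    jd: str,
    poll_interval: float = 1.0,
    max_polls: int = 60,
) -> Dict[str, Any]:
    """Submit a job, then poll until it reaches a terminal state."""
    submitted = post_json(
        "/submit-job", {"job_type": job_type, "resume": resume, "jd": jd}
    )
    job_id = submitted["job_id"]
    last: Dict[str, Any] = {"job_id": job_id, "status": "PENDING"}
    for _ in range(max_polls):
        last = get_json(f"/job/{job_id}")
        if last.get("status") in TERMINAL_STATES:
            break
        time.sleep(poll_interval)
    # If max_polls is exhausted, the last non-terminal payload is returned.
    return last
```

Injecting the transport keeps the polling logic testable without a running backend, and lets the UI swap polling for the blocking `/job-wait/{job_id}` route if that turns out to be simpler.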

#### **Constraints**:

 - None


#### **Tools**:

 - Use a local **Ollama** model

#### **Requirements**:
 - Make it work as expected


***

## <center>**`Implementation`**</center>

## **`Backend`**

#### Config

In [None]:
%%writefile ../backend/app/config.py
# backend/app/config.py

from pydantic import BaseModel, Field
import os
from dotenv import load_dotenv

load_dotenv()

class Settings(BaseModel):
    # LLM config
    LLM_PROVIDER: str = Field(default=os.getenv("LLM_PROVIDER", "ollama"))
    LLM_API_KEY: str = Field(default=os.getenv("LLM_API_KEY", "ollama"))
    LLM_BASE_URL: str = Field(default=os.getenv("LLM_BASE_URL", "http://ollama:11434"))
    #LLM_BASE_URL: str = Field(default=os.getenv("LLM_BASE_URL", "http://host.docker.internal:11434"))
    LLM_MODEL_NAME: str = Field(default=os.getenv("LLM_MODEL_NAME", "llama3.2"))
    #LLM_MODEL_NAME: str = Field(default=os.getenv("LLM_MODEL_NAME", "qwen3"))
    LLM_TEMPERATURE: float = Field(default=float(os.getenv("LLM_TEMPERATURE", "0.0")))

    # Celery/Redis
    REDIS_URL: str = Field(default=os.getenv("REDIS_URL", "redis://host.docker.internal:6379/0"))

    def full_model_id(self) -> str:
        """
        Return provider-prefixed model id for LiteLLM, e.g.:
        - 'ollama/llama3.2'
        - 'openai/gpt-4o-mini'
        - 'groq/llama3-8b-8192'
        """
        provider = self.LLM_PROVIDER.strip().lower()
        # If already prefixed, keep as is
        if "/" in self.LLM_MODEL_NAME:
            return self.LLM_MODEL_NAME
        return f"{provider}/{self.LLM_MODEL_NAME}"


settings = Settings()

Overwriting ../backend/app/config.py
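
To make the provider-prefix rule in `full_model_id` concrete, here is a standalone sketch of the same logic. The function name `resolve_model_id` is hypothetical and not part of the project; it only mirrors the method above without the Pydantic dependency.

```python
def resolve_model_id(provider: str, model_name: str) -> str:
    """Plain-function mirror of Settings.full_model_id (illustrative only)."""
    # A name that already contains '/' is treated as provider-prefixed.
    if "/" in model_name:
        return model_name
    return f"{provider.strip().lower()}/{model_name}"


print(resolve_model_id("Ollama", "llama3.2"))            # ollama/llama3.2
print(resolve_model_id("ollama", "openai/gpt-4o-mini"))  # openai/gpt-4o-mini
```

Note that an already prefixed `LLM_MODEL_NAME` wins over `LLM_PROVIDER`, so the two settings can disagree without raising an error.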


### Core

#### PDF Parsing

In [2]:
%%writefile ../backend/app/core/pdf_parser.py

#backend/app/core/pdf_parser.py
from typing import Union
from pathlib import Path
from PyPDF2 import PdfReader


class PDFParser:
    """Handles PDF and plain text extraction."""

    def extract_text(self, file: Union[Path, bytes]) -> str:
        if isinstance(file, Path):
            with open(file, "rb") as f:
                reader = PdfReader(f)
                return self._extract_all(reader)
        elif isinstance(file, bytes):
            from io import BytesIO
            reader = PdfReader(BytesIO(file))
            return self._extract_all(reader)
        else:
            raise ValueError("Unsupported file type for PDFParser.")
        
    def _extract_all(self, reader: PdfReader) -> str:
        text = []
        for page in reader.pages:
            page_text = page.extract_text()
            if page_text:
                text.append(page_text)
        return "\n".join(text).strip()

Overwriting ../backend/app/core/pdf_parser.py


#### Artifacts

In [26]:
%%writefile ../backend/app/core/artifacts.py
# backend/app/core/artifacts.py

from typing import Dict, Any, List, Tuple
from reportlab.lib.pagesizes import A4
from reportlab.lib.units import cm
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.enums import TA_CENTER
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer, ListFlowable
from reportlab.lib import colors
import datetime
import re

class PDFRenderer:
    """Render job results as polished PDFs using ReportLab.

    Improvements in this version:
    - Consistent headers + timestamps
    - Proper bullet lists (unordered + ordered)
    - Basic Markdown-like rendering:
        * '#', '##', '###' headings
        * '- ' and '* ' bullets
        * '1. ' numbered lists
        * **bold** and *italic* inline
        * Paragraph spacing and line breaks
    - Normalization helpers for Enhance/Cover Letter content
    """

    def __init__(self):
        styles = getSampleStyleSheet()
        self.title_style = ParagraphStyle(
            name="TitleCentered",
            parent=styles["Title"],
            alignment=TA_CENTER,
            spaceAfter=12,
        )
        self.h1 = styles["Heading1"]
        self.h2 = styles["Heading2"]
        self.h3 = styles["Heading3"]
        self.body = styles["BodyText"]

        # Slightly tighter body text for letters
        self.body_letter = ParagraphStyle(
            name="BodyLetter",
            parent=self.body,
            leading=14
        )

    # ---------- Public API ----------
    def build_match_pdf(self, path: str, result: Dict[str, Any]) -> None:
        """Structured, sectioned report for matching results."""
        doc = SimpleDocTemplate(
            path, pagesize=A4,
            topMargin=2 * cm, bottomMargin=2 * cm,
            leftMargin=2 * cm, rightMargin=2 * cm
        )
        flow: List = []
        flow += self._header("Resume ↔ JD Match Report")

        score = result.get("match_score", "N/A")
        strengths: List[str] = result.get("strengths", []) or []
        gaps: List[str] = result.get("gaps", []) or []
        summary = result.get("summary", "")

        flow.append(Paragraph("Overall Score", self.h2))
        flow.append(Paragraph(f"<b>{self._escape_html(str(score))}%</b>", self.body))
        flow.append(Spacer(1, 0.3 * cm))

        flow.append(Paragraph("Strengths", self.h2))
        flow += self._bullet_list(strengths)
        flow.append(Spacer(1, 0.3 * cm))

        flow.append(Paragraph("Gaps", self.h2))
        flow += self._bullet_list(gaps)
        flow.append(Spacer(1, 0.3 * cm))

        flow.append(Paragraph("Summary", self.h2))
        flow.append(Paragraph(self._nl2br(self._escape_html(summary or "_No summary provided._")), self.body))

        doc.build(flow)

    def build_enhance_pdf(self, path: str, result: Dict[str, Any]) -> None:
        """Render enhancement suggestions with clear sections and bullets."""
        doc = SimpleDocTemplate(
            path, pagesize=A4,
            topMargin=2 * cm, bottomMargin=2 * cm,
            leftMargin=2 * cm, rightMargin=2 * cm
        )
        flow: List = []
        flow += self._header("Resume Enhancement Suggestions")

        raw_md = result.get("resume_enhancement_md", "") or "_No suggestions generated._"
        md = self._normalize_enhance_md(raw_md)
        flow += self._markdown_to_flowables(md, use_letter_style=False)

        doc.build(flow)

    def build_cover_letter_pdf(self, path: str, result: Dict[str, Any]) -> None:
        """Render the cover letter with readable paragraph spacing."""
        doc = SimpleDocTemplate(
            path, pagesize=A4,
            topMargin=2 * cm, bottomMargin=2 * cm,
            leftMargin=2 * cm, rightMargin=2 * cm
        )
        flow: List = []
        flow += self._header("Cover Letter")

        raw_md = result.get("cover_letter_md", "") or "_No cover letter generated._"
        md = self._normalize_cover_letter_md(raw_md)
        flow += self._markdown_to_flowables(md, use_letter_style=True)

        doc.build(flow)

    def build_generic_pdf(self, path: str, title: str, body_text_or_md: str) -> None:
        """Fallback generic PDF with a title and markdown-ish body."""
        doc = SimpleDocTemplate(
            path, pagesize=A4,
            topMargin=2 * cm, bottomMargin=2 * cm,
            leftMargin=2 * cm, rightMargin=2 * cm
        )
        flow: List = []
        flow += self._header(title)
        flow += self._markdown_to_flowables(body_text_or_md or "_No content._", use_letter_style=False)
        doc.build(flow)

    # ---------- Section Builders ----------
    def _header(self, title: str) -> List:
        # utcnow() is deprecated since Python 3.12; use a timezone-aware timestamp.
        now = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
        return [
            Paragraph(title, self.title_style),
            Paragraph(f"<font size=9 color=grey>Generated: {self._escape_html(now)}</font>", self.body),
            Spacer(1, 0.5 * cm),
        ]

    def _bullet_list(self, items: List[str]) -> List:
        """Unordered bullet list with clean bullets."""
        if not items:
            return [Paragraph("<i>None</i>", self.body)]
        paras = [Paragraph(self._inline_format(self._escape_html(x)), self.body) for x in items]
        return [ListFlowable(
            paras,
            bulletType="bullet",
            leftIndent=10,
            bulletColor=colors.black,
        )]

    def _numbered_list(self, items: List[str]) -> List:
        """Ordered list (1., 2., 3., ...)"""
        if not items:
            return [Paragraph("<i>None</i>", self.body)]
        paras = [Paragraph(self._inline_format(self._escape_html(x)), self.body) for x in items]
        return [ListFlowable(
            paras,
            bulletType="1",
            leftIndent=10,
            bulletColor=colors.black,
        )]

    # ---------- Markdown-lite Rendering ----------
    def _markdown_to_flowables(self, text: str, use_letter_style: bool) -> List:
        """
        Very light-weight markdown-ish parser to make nice PDFs:
        - '# ', '## ', '### ' headings
        - '- ' or '* ' unordered bullets
        - '1. ' ordered bullets
        - Blank lines -> paragraph spacing
        - Inline **bold** and *italic* supported
        """
        lines = text.splitlines()
        flow: List = []
        buffer_ul: List[str] = []
        buffer_ol: List[str] = []

        def flush_lists():
            nonlocal buffer_ul, buffer_ol, flow
            if buffer_ul:
                flow += self._bullet_list(buffer_ul)
                flow.append(Spacer(1, 0.2 * cm))
                buffer_ul = []
            if buffer_ol:
                flow += self._numbered_list(buffer_ol)
                flow.append(Spacer(1, 0.2 * cm))
                buffer_ol = []

        p_style = self.body_letter if use_letter_style else self.body

        for raw in lines:
            line = raw.rstrip()

            # Blank line separates blocks
            if not line.strip():
                flush_lists()
                flow.append(Spacer(1, 0.2 * cm))
                continue

            # Headings
            if line.startswith("### "):
                flush_lists()
                flow.append(Paragraph(self._escape_html(line[4:]), self.h3))
                continue
            if line.startswith("## "):
                flush_lists()
                flow.append(Paragraph(self._escape_html(line[3:]), self.h2))
                continue
            if line.startswith("# "):
                flush_lists()
                flow.append(Paragraph(self._escape_html(line[2:]), self.h1))
                continue

            # Ordered list "1. ", "2. ", etc.
            m_num = re.match(r"^\s*\d+\.\s+(.*)$", line)
            if m_num:
                buffer_ol.append(m_num.group(1))
                continue

            # Unordered bullets "- " or "* "
            if line.lstrip().startswith("- "):
                buffer_ul.append(line.lstrip()[2:])
                continue
            if line.lstrip().startswith("* "):
                buffer_ul.append(line.lstrip()[2:])
                continue

            # Normal paragraph
            flush_lists()
            flow.append(Paragraph(self._nl2br(self._inline_format(self._escape_html(line))), p_style))

        flush_lists()
        return flow

    # ---------- Normalizers for specific job types ----------
    def _normalize_enhance_md(self, md: str) -> str:
        """Ensure standard sections exist for Enhance output."""
        text = md.strip()
        if not text:
            return "_No suggestions generated._"

        # If it doesn't contain an H2, add standard headings
        has_h2 = any(line.startswith("## ") for line in text.splitlines())
        if not has_h2:
            # Heuristic split: first paragraph as intro, then bullets become "Improvements"
            parts = text.splitlines()
            # Strip indentation before dropping the "- " marker so indented bullets stay intact.
            bullets = [p.lstrip()[2:] for p in parts if p.lstrip().startswith("- ")]
            intro = "\n".join(p for p in parts if not p.lstrip().startswith("- "))
            rebuilt = "## Improvements\n"
            if bullets:
                rebuilt += "\n".join(f"- {b}" for b in bullets)
            else:
                rebuilt += "_No bullet suggestions found._"
            if intro.strip():
                rebuilt = f"## Notes\n{intro.strip()}\n\n" + rebuilt
            return rebuilt

        return text

    def _normalize_cover_letter_md(self, md: str) -> str:
        """Make sure the letter reads well; add minimal structure if missing."""
        text = md.strip()
        if not text:
            return "_No cover letter generated._"

        # No extra structure is imposed for now: with or without headings,
        # the text is rendered as-is by the markdown-lite parser.
        return text

    # ---------- Inline helpers ----------
    @staticmethod
    def _nl2br(text: str) -> str:
        """Convert newlines to <br/> for ReportLab Paragraph."""
        return text.replace("\n", "<br/>")

    @staticmethod
    def _escape_html(text: str) -> str:
        """Minimal XML/HTML escaping for ReportLab Paragraph."""
        return (
            text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
        )

    @staticmethod
    def _inline_format(text: str) -> str:
        """Convert **bold** and *italic* markdown to HTML for ReportLab."""
        # Bold: **text**
        text = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", text)
        # Italic: *text*
        text = re.sub(r"(?<!\*)\*(?!\s)(.+?)(?<!\s)\*(?!\*)", r"<i>\1</i>", text)
        return text

Overwriting ../backend/app/core/artifacts.py
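
The ordering of `_escape_html` and `_inline_format` matters: escaping has to happen first, otherwise the `<b>`/`<i>` tags produced by the inline formatter would themselves be escaped. The sketch below copies the two helpers as plain functions to show this; `render_inline` is a hypothetical wrapper name, not something defined in `PDFRenderer`.

```python
import re


def escape_html(text: str) -> str:
    """Same minimal escaping as PDFRenderer._escape_html."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def inline_format(text: str) -> str:
    """Same **bold** / *italic* conversion as PDFRenderer._inline_format."""
    text = re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", text)
    return re.sub(r"(?<!\*)\*(?!\s)(.+?)(?<!\s)\*(?!\*)", r"<i>\1</i>", text)


def render_inline(text: str) -> str:
    # Escape first so the <b>/<i> tags added afterwards survive intact.
    return inline_format(escape_html(text))


print(render_inline("Led **3 teams** & shipped *fast*"))
# Led <b>3 teams</b> &amp; shipped <i>fast</i>
print(render_inline("2 * 3 * 4"))
# 2 * 3 * 4  (asterisks surrounded by spaces are not treated as italics)
```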


#### Agents

In [3]:
%%writefile ../backend/app/core/agents.py
# backend/app/core/agents.py

from dataclasses import dataclass
from typing import Any
from crewai import Agent, LLM

@dataclass
class MatcherAgents:
    resume_parser: Agent
    jd_parser: Agent
    matcher: Agent
    enhancer: Agent
    cover_letter: Agent

class AgentsFactory:
    """Factory that builds all CrewAI agents with a shared LLM."""
    def __init__(self, llm: LLM):
        self.llm = llm

    def build(self) -> MatcherAgents:
        resume_parser = Agent(
            role="Resume Parsing Specialist",
            goal="Extract structured data (skills, experience, education, tools) from a resume.",
            backstory="You are meticulous and consistent. Output JSON only.",
            llm=self.llm,
            verbose=False
        )
        jd_parser = Agent(
            role="Job Description Analyst",
            goal="Extract required skills, responsibilities, and must-haves from a JD.",
            backstory="You identify core requirements and hiring signals. Output JSON only.",
            llm=self.llm,
            verbose=False
        )
        matcher = Agent(
            role="Resume-JD Matcher",
            goal="Compare parsed resume vs parsed JD. Score 0-100 and list strengths and gaps.",
            backstory="You are objective and concise. Output JSON only.",
            llm=self.llm,
            verbose=False
        )
        enhancer = Agent(
            role="Resume Enhancer",
            goal="Suggest resume improvements aligned with the JD and rewrite 3–5 key bullets.",
            backstory="Keep it ATS-friendly and specific. Output Markdown.",
            llm=self.llm,
            verbose=False
        )
        cover_letter = Agent(
            role="Cover Letter Writer",
            goal="Draft a tailored one-page cover letter aligned with resume and JD.",
            backstory="Professional, concise, concrete achievements. Output Markdown.",
            llm=self.llm,
            verbose=False
        )

        return MatcherAgents(
            resume_parser=resume_parser,
            jd_parser=jd_parser,
            matcher=matcher,
            enhancer=enhancer,
            cover_letter=cover_letter
        )

Writing ../backend/app/core/agents.py


#### Agent Orchestrator

In [28]:
%%writefile ../backend/app/core/agent_orchestrator.py

# backend/app/core/agent_orchestrator.py
from typing import Dict, Any
from crewai import Task, Crew, LLM, Process
from backend.app.config import settings
from backend.app.core.agents import AgentsFactory

class AgentOrchestrator:
    """Handles agent pipeline for resume-JD matching."""
    def __init__(self):
        # We will build LLM here
        model_id = settings.full_model_id()
        print(model_id)
        self.llm = LLM(
            model=model_id,
            base_url=settings.LLM_BASE_URL,
            api_key=settings.LLM_API_KEY,
            temperature=settings.LLM_TEMPERATURE,
        )

    def _common_validate(self, data: Dict[str, Any]):
        resume = (data or {}).get("resume") or ""
        jd = (data or {}).get("jd") or ""
        if not resume.strip() or not jd.strip():
            raise ValueError("Both 'resume' and 'jd' text are required.")
        return resume, jd
    
    def _build_parsing_tasks(self, agents, resume: str, jd: str):
        resume_task = Task(
            description=f"Extract structured JSON from the resume text below.\nReturn keys: skills, experience, education, tools.\n\nRESUME:\n{resume}",
            expected_output="Valid JSON with keys: skills, experience, education, tools.",
            agent=agents.resume_parser
        )
        jd_task = Task(
            description=f"Extract structured JSON from the job description below.\nReturn keys: must_haves, nice_to_haves, responsibilities, keywords.\n\nJD:\n{jd}",
            expected_output="Valid JSON with keys: must_haves, nice_to_haves, responsibilities, keywords.",
            agent=agents.jd_parser
        )
        return resume_task, jd_task

    def run(self, job_type: str, data: Dict[str, Any]) -> Dict[str, Any]:
        """Executes the specified agent pipeline.
        Args:
            job_type: 'match', 'enhance', or 'cover_letter'
            data: Dict with 'resume' and 'jd' (plain text)
        Returns:
            Dict with results
        """

        job_type = (job_type or "").lower()
        if job_type not in {"match", "enhance", "cover_letter"}:
            raise ValueError(f"Unsupported job_type: {job_type}")
        
        resume, jd = self._common_validate(data)
        agents = AgentsFactory(self.llm).build()

        if job_type == "match":
            resume_task, jd_task = self._build_parsing_tasks(agents, resume, jd)
            match_task = Task(
                description="Compare the parsed resume vs parsed JD and return a JSON with keys: "
                            "- match_score: integer from 0-100, "
                            "- strengths: list of matching skills from the resume and the JD, "
                            "- gaps: list of gaps in the resume compared to the JD, "
                            "- summary: string to summarize the evaluation.",
                expected_output="Valid JSON with keys: match_score, strengths, gaps, summary.",
                agent=agents.matcher,
                context=[resume_task, jd_task]
            )
            crew = Crew(
                agents=[agents.resume_parser, agents.jd_parser, agents.matcher],
                tasks=[resume_task, jd_task, match_task],
                process=Process.sequential,
                verbose=False,
                name="MatchCrew",
                description="Parses resume and JD, then computes a structured match report."
            )
            result = crew.kickoff()
            return self._safe_parse_result(result, kind="match")
        
        if job_type == "enhance":
            resume_task, jd_task = self._build_parsing_tasks(agents, resume, jd)
            enhance_task = Task(
                description="Using parsed resume and JD, suggest concrete improvements and rewrite 3–5 bullets. "
                            "Return Markdown with sections: 'Improvements' (bulleted) and 'Rewritten Bullets'.",
                expected_output="Markdown with 'Improvements' and 'Rewritten Bullets' sections.",
                agent=agents.enhancer,
                context=[resume_task, jd_task]
            )
            crew = Crew(
                agents=[agents.resume_parser, agents.jd_parser, agents.enhancer],
                tasks=[resume_task, jd_task, enhance_task],
                process=Process.sequential,
                verbose=False,
                name="EnhanceCrew",
                description="Parses resume and JD, then produces targeted enhancements."
            )
            result = crew.kickoff()
            return {"status": "done", "result": {"resume_enhancement_md": getattr(result, "raw", str(result))}}

        if job_type == "cover_letter":
            resume_task, jd_task = self._build_parsing_tasks(agents, resume, jd)
            cl_task = Task(
                description="Draft a tailored one-page cover letter in Markdown based on parsed resume and JD.",
                expected_output="A Markdown-formatted cover letter.",
                agent=agents.cover_letter,
                context=[resume_task, jd_task]
            )
            crew = Crew(
                agents=[agents.resume_parser, agents.jd_parser, agents.cover_letter],
                tasks=[resume_task, jd_task, cl_task],
                process=Process.sequential,
                verbose=False,
                name="CoverLetterCrew",
                description="Parses resume and JD, then writes a tailored cover letter."
            )
            result = crew.kickoff()
            return {"status": "done", "result": {"cover_letter_md": getattr(result, "raw", str(result))}}

        raise RuntimeError("Unreachable branch.")

    def _safe_parse_result(self, crew_result, kind: str) -> Dict[str, Any]:
        raw = getattr(crew_result, "raw", None)
        if not raw:
            return {"status": "done", "result": {"raw": str(crew_result)}}
        # The matcher agent is instructed to output JSON, but we guard anyway.
        try:
            import json
            parsed = json.loads(raw)
            return {"status": "done", "result": parsed}
        except Exception:
            return {"status": "done", "result": {"raw": raw}}


Overwriting ../backend/app/core/agent_orchestrator.py
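
`_safe_parse_result` falls back to `{"raw": ...}` whenever `json.loads(raw)` fails, which can happen if the model wraps its JSON in a Markdown code fence or adds a sentence around it. One possible hardening is sketched below. This is an assumption about how the output might look, not observed behaviour, and `extract_json` is a hypothetical helper that belongs to neither CrewAI nor this project.

```python
import json
import re
from typing import Any, Dict, Optional

# Built from single backticks so the pattern can live inside a fenced block.
FENCE = "`" * 3
_FENCE_RE = re.compile(
    r"^" + FENCE + r"(?:json)?\s*\n(.*?)\n" + FENCE + r"\s*$", re.DOTALL
)


def extract_json(raw: str) -> Optional[Dict[str, Any]]:
    """Best-effort extraction of a JSON object from LLM output."""
    text = raw.strip()
    fenced = _FENCE_RE.match(text)
    if fenced:
        text = fenced.group(1).strip()
    candidates = [text]
    # Fall back to the outermost {...} span if there is surrounding prose.
    start, end = text.find("{"), text.rfind("}")
    if 0 <= start < end:
        candidates.append(text[start : end + 1])
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None
```

With such a helper, `_safe_parse_result` could try `extract_json(raw)` and keep the `{"raw": raw}` fallback only when it returns `None`.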


### Job Queueing + Celery for distributed background job handling

#### Queueing

In [37]:
%%writefile ../backend/app/core/async_queue.py
# backend/app/core/async_queue.py

from typing import Dict, Any
from celery.result import AsyncResult
from backend.app.core.tasks import run_agent_job
from backend.worker.worker import celery_app

class AsyncJobQueueCelery:
    """Async job queue using Celery with queue routing."""

    def _pick_queue(self, job_type: str, payload: Dict[str, Any]) -> str:
        """
        Decide which queue to use based on job_type/payload.
        - LLM-heavy jobs → 'llm'
        - Future: add 'pdf' for large PDF parse tasks, etc.
        """
        jt = (job_type or "").lower()
        if jt in {"match", "enhance", "cover_letter"}:
            return "llm"
        return "default"
    
    def submit_job(self, job_type: str, payload: dict) -> str:
        # Ensure we don't pass 'job_type' twice (in task arg and inside payload)
        clean_payload = dict(payload or {})
        clean_payload.pop("job_type", None)

        queue_name = self._pick_queue(job_type, clean_payload)

        # Route to the selected queue via apply_async; use positional args
        async_result = run_agent_job.apply_async(
            args=[job_type, clean_payload],
            queue=queue_name,
            routing_key=queue_name,
        )
        return async_result.id    
    
    def get_status(self, job_id: str) -> Dict[str, Any]:
        result = AsyncResult(job_id, app=celery_app)
        status = result.status
        # `info` is reserved for progress metadata; nothing is reported yet.
        return {"job_id": job_id, "status": status, "info": None}
    
    def get_result(self, job_id: str) -> Dict[str, Any]:
        result = AsyncResult(job_id, app=celery_app)
        state = result.status
        if state == "SUCCESS":
            return {"job_id": job_id, "status": state, "result": result.result, "error": None}
        if state == "FAILURE":
            return {"job_id": job_id, "status": state, "result": None, "error": str(result.result)}
        return {"job_id": job_id, "status": state, "result": None, "error": None}
        
    
    def wait_for_result(self, job_id:str, timeout: float | None = None) -> Dict[str, Any]:
        """
        Block until job finishes or timeout (seconds).
        Returns same shape as get_result().
        """
        ar = AsyncResult(job_id, app=celery_app)
        try:
            val = ar.get(timeout=timeout, propagate=False)
        except Exception as e:
            state = ar.status
            return {"job_id": job_id, "status": state, "result": None, "error": str(e) if str(e) else "Timeout"}
        state = ar.status
        if state == "SUCCESS":
            return {"job_id": job_id, "status": state, "result": val, "error": None}
        if state == "FAILURE":
            return {"job_id": job_id, "status": state, "result": None, "error": str(ar.result)}
        return {"job_id": job_id, "status": state, "result": None, "error": None}

    
# Singleton
queue = AsyncJobQueueCelery()

Overwriting ../backend/app/core/async_queue.py


#### Celery config

In [36]:
%%writefile ../backend/celeryconfig.py
# backend/celeryconfig.py

import os
from kombu import Queue, Exchange

# Redis runs in another Docker container here.
# If that is not the case for you,
# use: "redis://localhost:6379/0"

BROKER_URL = os.getenv("CELERY_BROKER_URL", "redis://host.docker.internal:6379/0")
RESULT_BACKEND = os.getenv("CELERY_RESULT_BACKEND", BROKER_URL)

broker_url = BROKER_URL
result_backend = RESULT_BACKEND


task_serializer = "json"
result_serializer = "json"
accept_content = ["json"]
timezone = "UTC"
enable_utc = True

# -------- Queues & Routing --------
# Exchanges (direct for simple routing)

default_exchange = Exchange("default", type="direct")
llm_exchange = Exchange("llm", type="direct")
pdf_exchange = Exchange("pdf", type="direct")

# Declare queues
task_queues = (
    Queue("celery", exchange=default_exchange, routing_key="celery"),  # default
    Queue("default", exchange=default_exchange, routing_key="default"),
    Queue("llm", exchange=llm_exchange, routing_key="llm"),
    Queue("pdf", exchange=pdf_exchange, routing_key="pdf"),
)

# Default routing if a task has no explicit route
task_default_queue = "default"
task_default_exchange = "default"
task_default_routing_key = "default"

# We can optionally define task_routes if there are multiple tasks.
# Here we keep it minimal and mostly route from apply_async.
task_routes = {
    # Example (for dedicated PDF task):
    # "run_pdf_parse": {"queue": "pdf", "routing_key": "pdf"},
    # Our main agent task can default to llm via apply_async from code.
}

Overwriting ../backend/celeryconfig.py


#### Celery worker

In [38]:
%%writefile ../backend/worker/worker.py
# backend/worker/worker.py

from celery import Celery

# Create Celery app
celery_app = Celery("resume_jd_matcher")
celery_app.config_from_object("backend.celeryconfig")

# Ensure tasks are imported on worker start
import backend.app.core.tasks       # noqa: F401

Overwriting ../backend/worker/worker.py


#### Celery Task

In [13]:
%%writefile ../backend/app/core/tasks.py
# backend/app/core/tasks.py

from celery.utils.log import get_task_logger
from backend.worker.worker import celery_app
from backend.app.core.agent_orchestrator import AgentOrchestrator

logger = get_task_logger(__name__)

@celery_app.task(
    name="run_agent_job",
    bind=False,
    autoretry_for=(Exception,),
    retry_backoff=True,
    retry_jitter=True,
    retry_kwargs={"max_retries": 3},
    soft_time_limit=180,  # seconds
    time_limit=240,       # hard limit (seconds)
)
def run_agent_job(job_type: str, data: dict):
    logger.info("Starting job type=%s", job_type)
    orchestrator = AgentOrchestrator()
    result = orchestrator.run(job_type, data or {})
    logger.info("Finished job type=%s", job_type)
    return result

Overwriting ../backend/app/core/tasks.py
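
With `retry_backoff=True`, Celery delays automatic retries exponentially (1 s, 2 s, 4 s, ...), and with `retry_jitter=True` each computed delay is treated as an upper bound for a random wait. The sketch below only illustrates that schedule for the three retries configured above. `backoff_delays` is a hypothetical function, not Celery's implementation, and the 600-second cap is assumed to match Celery's default `retry_backoff_max`.

```python
import random
from typing import List


def backoff_delays(
    max_retries: int = 3,
    factor: int = 1,
    maximum: int = 600,
    jitter: bool = False,
) -> List[int]:
    """Illustrative retry delays (seconds) for exponential backoff."""
    delays = []
    for retry in range(max_retries):
        # Delay doubles with each retry, capped at `maximum`.
        ceiling = min(maximum, factor * (2 ** retry))
        # With jitter, the ceiling becomes the upper bound of a random wait.
        delays.append(random.randint(0, ceiling) if jitter else ceiling)
    return delays


print(backoff_delays())  # [1, 2, 4]
```

Because `autoretry_for=(Exception,)` also covers the `ValueError` raised for invalid input, a bad request would go through this whole schedule before failing; narrowing the exception tuple may be worth considering.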


### Backend API

#### Data models

In [2]:
%%writefile ../backend/app/models/job_models.py

#backend/app/models/job_models.py

from pydantic import BaseModel, Field
from typing import Optional, Dict, Any
from enum import Enum

class JobState(str, Enum):
    PENDING = "PENDING"
    RECEIVED = "RECEIVED"
    STARTED = "STARTED"
    RETRY = "RETRY"
    FAILURE = "FAILURE"
    SUCCESS = "SUCCESS"
    REVOKED = "REVOKED"
    UNKNOWN = "UNKNOWN"

class ResumeJDRequest(BaseModel):
    job_type: str = Field(..., description="One of: match, enhance, cover_letter")
    resume: Optional[str] = Field(default=None, description="Plain text resume")
    jd: Optional[str] = Field(default=None, description="Plain text job description")

class PDFUploadResponse(BaseModel):
    extracted_text: str

class JobSubmitResponse(BaseModel):
    job_id: str

class JobStatusResponse(BaseModel):
    job_id: str
    status: JobState
    info: Optional[Dict[str, Any]] = None

class JobResultResponse(BaseModel):
    job_id: str
    status: JobState
    result: Optional[Dict[str, Any]] = None
    error: Optional[str] = None

Overwriting ../backend/app/models/job_models.py
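
`JobState` has an `UNKNOWN` member, but nothing in the models maps unexpected strings to it, so a status outside the enum (Celery also defines states such as `REJECTED` and `IGNORED`) could fail response validation. A minimal sketch of a coercion helper follows; `coerce_state` is a hypothetical name, and the enum is repeated here only to keep the example self-contained.

```python
from enum import Enum


class JobState(str, Enum):
    PENDING = "PENDING"
    RECEIVED = "RECEIVED"
    STARTED = "STARTED"
    RETRY = "RETRY"
    FAILURE = "FAILURE"
    SUCCESS = "SUCCESS"
    REVOKED = "REVOKED"
    UNKNOWN = "UNKNOWN"


def coerce_state(value: object) -> JobState:
    """Map any status value to a JobState, defaulting to UNKNOWN."""
    try:
        return JobState(str(value).upper())
    except ValueError:
        return JobState.UNKNOWN


print(coerce_state("success").value)   # SUCCESS
print(coerce_state("REJECTED").value)  # UNKNOWN
```

The queue wrapper could apply this to `AsyncResult.status` before building `JobStatusResponse` or `JobResultResponse`.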


#### Router

In [24]:
%%writefile ../backend/app/api/routes.py
#backend/app/api/routes.py

from typing import Optional, Dict, Any, List
from fastapi import APIRouter, File, UploadFile, Query, HTTPException
from fastapi.responses import FileResponse
from backend.app.core.pdf_parser import PDFParser
from backend.app.core.async_queue import queue
from backend.app.core.artifacts import PDFRenderer
from backend.app.models.job_models import (
    ResumeJDRequest,
    PDFUploadResponse,
    JobSubmitResponse,
    JobStatusResponse,
    JobResultResponse,
)
import tempfile
import os
import json
import datetime


api_router = APIRouter()
_pdf = PDFRenderer()

@api_router.get("/health", tags=["Health"])
def health_check():
    return {"status": "ok"}

@api_router.post("/parse-pdf", response_model=PDFUploadResponse, tags=["Parsing"])
async def parse_pdf_endpoint(file: UploadFile = File(...)):
    """Extract text from uploaded PDF file."""
    content = await file.read()
    parser = PDFParser()
    text = parser.extract_text(content)
    return PDFUploadResponse(extracted_text=text)

@api_router.post("/submit-job", response_model=JobSubmitResponse, tags=["Jobs"])
async def submit_job(request: ResumeJDRequest):
    """Submit a matching/enhancing/cover letter job."""

    jt = (request.job_type or "").lower()
    if jt not in {"match", "enhance", "cover_letter"}:
        raise HTTPException(status_code=422, detail="job_type must be one of: match, enhance, cover_letter")
    job_id = queue.submit_job(jt, request.dict())
    return JobSubmitResponse(job_id=job_id)

@api_router.get("/job-status/{job_id}", response_model=JobStatusResponse, tags=["Jobs"])
async def job_status(job_id: str):
    status = queue.get_status(job_id)
    return JobStatusResponse(**status)

@api_router.get("/job/{job_id}", response_model=JobResultResponse, tags=["Jobs"])
async def job_result(job_id: str):
    result = queue.get_result(job_id)
    return JobResultResponse(**result)

@api_router.get("/job-wait/{job_id}", response_model=JobResultResponse, tags=["Jobs"])
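def job_wait(job_id: str, timeout: Optional[float] = Query(default=30.0, ge=0.0, description="Seconds to wait")):
    """
    Blocks up to `timeout` seconds for the result, then returns the current state/result.
    Good for Swagger testing or a Streamlit long poll.

    Declared as a plain `def` so FastAPI runs it in its threadpool; the blocking
    wait would otherwise stall the event loop for every other request.
    """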
    result = queue.wait_for_result(job_id, timeout=timeout)
    return JobResultResponse(**result)

# ---------------- Downloadable Artifacts ----------------

@api_router.get("/job/{job_id}/download", tags=["Jobs"])
async def job_download(
    job_id: str,
    format: str = Query("md", pattern="^(md|json|pdf)$", description="Download format: md, json, or pdf"),
):
    """
    Download the job's result as a Markdown (md), JSON (json), or PDF (pdf) file.
    """
    jr = queue.get_result(job_id)
    status = jr.get("status")
    raw_result = jr.get("result")

    if status is None:
        raise HTTPException(status_code=404, detail="Job not found")
    if status not in ("SUCCESS", "FAILURE"):
        raise HTTPException(status_code=202, detail=f"Job not finished yet (status={status})")
    if status == "FAILURE":
        err = jr.get("error") or "Unknown error"
        if format == "json":
            return _download_json({"job_id": job_id, "status": status, "error": err}, f"job_{job_id}_error.json")
        raise HTTPException(status_code=500, detail=f"Job failed: {err}")

    # SUCCESS — unwrap nested shapes like {"status":"done","result":{...}}
    result = _unwrap_result(raw_result)

    # Detect job type by keys at the unwrapped level
    if isinstance(result, dict) and "match_score" in result:
        job_type = "match"
        md = _markdown_for_match(result)
        filename_md = f"match_report_{job_id}.md"
        filename_json = f"match_report_{job_id}.json"
        filename_pdf = f"match_report_{job_id}.pdf"
        if format == "json":
            return _download_json({"job_id": job_id, "status": status, "result": result, "job_type": job_type}, filename_json)
        if format == "pdf":
            tmp_path = _tmp_path(filename_pdf)
            _pdf.build_match_pdf(tmp_path, result)
            return FileResponse(tmp_path, media_type="application/pdf", filename=os.path.basename(tmp_path))
        return _download_md(md, filename_md)

    if isinstance(result, dict) and "resume_enhancement_md" in result:
        job_type = "enhance"
        md = _markdown_for_enhance(result)
        filename_md = f"resume_enhancement_{job_id}.md"
        filename_json = f"resume_enhancement_{job_id}.json"
        filename_pdf = f"resume_enhancement_{job_id}.pdf"
        if format == "json":
            return _download_json({"job_id": job_id, "status": status, "result": result, "job_type": job_type}, filename_json)
        if format == "pdf":
            tmp_path = _tmp_path(filename_pdf)
            _pdf.build_enhance_pdf(tmp_path, result)
            return FileResponse(tmp_path, media_type="application/pdf", filename=os.path.basename(tmp_path))
        return _download_md(md, filename_md)

    if isinstance(result, dict) and "cover_letter_md" in result:
        job_type = "cover_letter"
        md = _markdown_for_cover_letter(result)
        filename_md = f"cover_letter_{job_id}.md"
        filename_json = f"cover_letter_{job_id}.json"
        filename_pdf = f"cover_letter_{job_id}.pdf"
        if format == "json":
            return _download_json({"job_id": job_id, "status": status, "result": result, "job_type": job_type}, filename_json)
        if format == "pdf":
            tmp_path = _tmp_path(filename_pdf)
            _pdf.build_cover_letter_pdf(tmp_path, result)
            return FileResponse(tmp_path, media_type="application/pdf", filename=os.path.basename(tmp_path))
        return _download_md(md, filename_md)

    # Unknown structure → generic
    if format == "json":
        return _download_json({"job_id": job_id, "status": status, "result": result, "job_type": "unknown"}, f"job_{job_id}.json")
    if format == "pdf":
        tmp_path = _tmp_path(f"job_{job_id}.pdf")
        pretty = _pretty_json(result)
        _pdf.build_generic_pdf(tmp_path, "Job Result", pretty)
        return FileResponse(tmp_path, media_type="application/pdf", filename=os.path.basename(tmp_path))
    md = _markdown_from_unknown(result)
    return _download_md(md, f"job_{job_id}.md")

# ---------------- Helpers: Markdown & File responses ----------------

def _unwrap_result(raw_result: Any) -> Any:
    """
    Accepts any structure. If it's a dict that looks like {'status': 'done', 'result': {...}},
    return the inner .result; otherwise return as-is.
    """
    if isinstance(raw_result, dict) and "result" in raw_result and set(raw_result.keys()) <= {"status", "result"}:
        return raw_result.get("result")
    return raw_result

def _download_md(markdown_text: str, filename: str) -> FileResponse:
    tmp_path = _write_temp_file(markdown_text, filename)
    return FileResponse(tmp_path, media_type="text/markdown", filename=os.path.basename(tmp_path))

def _download_json(payload: Dict[str, Any], filename: str) -> FileResponse:
    text = json.dumps(payload, indent=2, ensure_ascii=False)
    tmp_path = _write_temp_file(text, filename)
    return FileResponse(tmp_path, media_type="application/json", filename=os.path.basename(tmp_path))

def _write_temp_file(content: str, filename: str) -> str:
    tmp_path = _tmp_path(filename)
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(content)
    return tmp_path

def _tmp_path(filename: str) -> str:
    tmp_dir = tempfile.mkdtemp(prefix="artifacts_")
    return os.path.join(tmp_dir, filename)

def _header(title: str) -> str:
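    # utcnow() is deprecated since Python 3.12; use a timezone-aware timestamp instead
    now = datetime.datetime.now(datetime.timezone.utc).strftime("%Y-%m-%d %H:%M UTC")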
    return f"# {title}\n\n_Generated: {now}_\n\n"

def _pretty_json(value: Any) -> str:
    try:
        return json.dumps(value, indent=2, ensure_ascii=False)
    except Exception:
        return str(value)

def _markdown_for_match(result: Dict[str, Any]) -> str:
    score = result.get("match_score", "N/A")
    strengths: List[str] = result.get("strengths", []) or []
    gaps: List[str] = result.get("gaps", []) or []
    summary = result.get("summary", "")

    md = _header("Resume ↔ JD Match Report")
    md += f"## Overall Score\n**{score}%**\n\n"
    md += "## Strengths\n"
    md += "\n".join(f"- {s}" for s in strengths) + ("\n\n" if strengths else "_None_\n\n")
    md += "## Gaps\n"
    md += "\n".join(f"- {g}" for g in gaps) + ("\n\n" if gaps else "_None_\n\n")
    md += "## Summary\n"
    md += f"{summary or '_No summary provided._'}\n"
    return md

def _markdown_for_enhance(result: Dict[str, Any]) -> str:
    body = result.get("resume_enhancement_md", "") or "_No suggestions generated._"
    md = _header("Resume Enhancement Suggestions")
    md += body
    return md

def _markdown_for_cover_letter(result: Dict[str, Any]) -> str:
    body = result.get("cover_letter_md", "") or "_No cover letter generated._"
    md = _header("Cover Letter")
    md += body
    return md

def _markdown_from_unknown(result_any: Any) -> str:
    md = _header("Job Result")
    md += "```\n" + _pretty_json(result_any) + "\n```"
    return md

Overwriting ../backend/app/api/routes.py
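
The download endpoint above decides which report to build by looking for a marker key in the unwrapped result (`match_score`, `resume_enhancement_md`, or `cover_letter_md`). The following standalone sketch restates that detection as a lookup table, which may be easier to test in isolation. `RESULT_MARKERS`, `unwrap_result`, and `detect_job_type` are illustrative names and not part of the router.

```python
from typing import Any

# Marker key -> job type, in the same order as the checks in job_download.
RESULT_MARKERS = {
    "match_score": "match",
    "resume_enhancement_md": "enhance",
    "cover_letter_md": "cover_letter",
}

def unwrap_result(raw: Any) -> Any:
    """Return the inner 'result' of a {'status': ..., 'result': ...} wrapper, else raw."""
    if isinstance(raw, dict) and "result" in raw and set(raw) <= {"status", "result"}:
        return raw["result"]
    return raw

def detect_job_type(result: Any) -> str:
    """Classify a result by its marker key; anything else is 'unknown'."""
    if isinstance(result, dict):
        for key, job_type in RESULT_MARKERS.items():
            if key in result:
                return job_type
    return "unknown"

wrapped = {"status": "done", "result": {"match_score": 82, "gaps": []}}
print(detect_job_type(unwrap_result(wrapped)))  # match
```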


#### App

In [10]:
%%writefile ../backend/app/main.py
#backend/app/main.py

import os
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from backend.app.api.routes import api_router

class JDMatcherApp:
    def __init__(self):
        self.app = FastAPI(
            title="Resume-JD Matcher API",
            description="Backend for matching candidate resumes to job descriptions using AI agents.",
            version="0.2.0"
        )
        self._configure_cors()
        self.include_routers()

    def _configure_cors(self):
        origins_env = os.getenv("BACKEND_CORS_ORIGINS", "http://localhost:8501,http://127.0.0.1:8501")
        origins = [o.strip() for o in origins_env.split(",") if o.strip()]
        self.app.add_middleware(
            CORSMiddleware,
            allow_origins=origins,
            allow_credentials=True,
            allow_methods=["*"],
            allow_headers=["*"],
        )

    def include_routers(self):
        self.app.include_router(api_router)

def get_app():
    """Entrypoint for ASGI"""
    return JDMatcherApp().app

# Run with 'uvicorn backend.app.main:app'
# (or 'uvicorn backend.app.main:get_app --factory' to use the factory directly)
app = get_app()

Overwriting ../backend/app/main.py
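
The CORS setup reads a comma-separated `BACKEND_CORS_ORIGINS` variable and falls back to the local Streamlit addresses. The sketch below isolates that parsing so its edge cases are easy to see. `parse_cors_origins` is an illustrative helper that takes the environment as a mapping, and the example hostnames are placeholders.

```python
from typing import List, Mapping

DEFAULT_ORIGINS = "http://localhost:8501,http://127.0.0.1:8501"

def parse_cors_origins(env: Mapping[str, str]) -> List[str]:
    """Split the comma-separated origin list, dropping blanks and stray spaces."""
    raw = env.get("BACKEND_CORS_ORIGINS", DEFAULT_ORIGINS)
    return [o.strip() for o in raw.split(",") if o.strip()]

print(parse_cors_origins({}))
print(parse_cors_origins({"BACKEND_CORS_ORIGINS": " https://app.example.com , ,https://ui.example.com"}))
```

Note that a variable that is set but empty yields an empty list rather than the defaults, which would block all cross-origin browser requests.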


## **`Frontend`**

### API client

In [41]:
%%writefile ../frontend/api_client.py
# frontend/api_client.py

import os
import time
from typing import Optional, Dict, Any
import requests
from urllib.parse import urlencode

class BackendClient:
    def __init__(self, base_url: Optional[str] = None, timeout: float = 30.0):
        self.base_url = (base_url or os.getenv("BACKEND_URL") or "http://localhost:8000").rstrip("/")
        self.timeout = timeout

    # -------- Parsing --------
    def parse_pdf(self, file_bytes: bytes, filename: str = "resume.pdf") -> str:
        url = f"{self.base_url}/parse-pdf"
        files = {"file": (filename, file_bytes, "application/pdf")}
        resp = requests.post(url, files=files, timeout=self.timeout)
        resp.raise_for_status()
        data = resp.json()
        return data.get("extracted_text", "") or ""

    # -------- Jobs --------
    def submit_job(self, job_type: str, resume: str, jd: str) -> str:
        url = f"{self.base_url}/submit-job"
        payload = {"job_type": job_type, "resume": resume, "jd": jd}
        resp = requests.post(url, json=payload, timeout=self.timeout)
        resp.raise_for_status()
        return resp.json()["job_id"]

    def job_status(self, job_id: str) -> Dict[str, Any]:
        url = f"{self.base_url}/job-status/{job_id}"
        resp = requests.get(url, timeout=self.timeout)
        resp.raise_for_status()
        return resp.json()

    def job_result(self, job_id: str) -> Dict[str, Any]:
        url = f"{self.base_url}/job/{job_id}"
        resp = requests.get(url, timeout=self.timeout)
        resp.raise_for_status()
        return resp.json()

    def job_wait(self, job_id: str, timeout: float = 60.0) -> Dict[str, Any]:
        url = f"{self.base_url}/job-wait/{job_id}"
        params = {"timeout": timeout}
        resp = requests.get(url, params=params, timeout=timeout + 5)
        resp.raise_for_status()
        return resp.json()

    # -------- Convenience: poll with progress callback --------
    def wait_with_progress(
        self,
        job_id: str,
        total_wait: float = 120.0,
        poll_interval: float = 1.5,
        on_tick=None,
    ) -> Dict[str, Any]:
        elapsed = 0.0
        while elapsed < total_wait:
            try:
                res = self.job_result(job_id)
            except Exception as e:
                res = {"status": "UNKNOWN", "error": str(e)}
            if on_tick:
                on_tick(elapsed, res.get("status"))
            if res.get("status") in ("SUCCESS", "FAILURE"):
                return res
            time.sleep(poll_interval)
            elapsed += poll_interval
        # Fallback: final status fetch
        return self.job_result(job_id)
    
    # -------- Download URLs (for link buttons) --------
    def download_url(self, job_id: str, fmt: str) -> str:
        """Builds the direct backend URL to download artifacts (md/json/pdf)."""
        qs = urlencode({"format": fmt})
        return f"{self.base_url}/job/{job_id}/download?{qs}"


Overwriting ../frontend/api_client.py
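
`wait_with_progress` above tracks elapsed time by adding `poll_interval` after each sleep, so time spent inside the HTTP request itself is not counted. A possible variant, sketched below, measures elapsed time against a monotonic clock instead. `poll_until_done` and its `fetch`, `clock`, and `sleep` parameters are assumptions made for illustration (they make the loop testable without a backend) and are not part of `BackendClient`.

```python
import time
from typing import Any, Callable, Dict, Optional

TERMINAL_STATES = ("SUCCESS", "FAILURE")

def poll_until_done(
    fetch: Callable[[], Dict[str, Any]],
    total_wait: float = 120.0,
    poll_interval: float = 1.5,
    on_tick: Optional[Callable[[float, Optional[str]], None]] = None,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() until it reports a terminal state or total_wait has passed."""
    start = clock()
    while True:
        try:
            res = fetch()
        except Exception as exc:  # treat transient errors as "not done yet"
            res = {"status": "UNKNOWN", "error": str(exc)}
        elapsed = clock() - start
        if on_tick:
            on_tick(elapsed, res.get("status"))
        if res.get("status") in TERMINAL_STATES or elapsed >= total_wait:
            return res
        sleep(poll_interval)

# Usage with a fake backend that finishes on the third poll.
states = iter(["PENDING", "STARTED", "SUCCESS"])
seen = []
final = poll_until_done(
    lambda: {"status": next(states)},
    poll_interval=0.0,
    on_tick=lambda elapsed, status: seen.append(status),
)
print(final["status"], seen)
```

With the real client, `fetch` could be something like `lambda: client.job_result(job_id)`.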


### Streamlit app

In [45]:
%%writefile ../frontend/streamlit_app.py
# frontend/streamlit_app.py

import os
import streamlit as st

from api_client import BackendClient

st.set_page_config(page_title="🧠 Resume ↔ JD Matcher", layout="wide")

# Backend URL (can be set via env BACKEND_URL)
BACKEND_URL = os.getenv("BACKEND_URL", "http://localhost:8000")
client = BackendClient(base_url=BACKEND_URL)

st.title("🧠 AI Resume ↔ Job Description Matcher")
st.caption(f"Backend: {BACKEND_URL}")

with st.expander("ℹ️ Instructions", expanded=False):
    st.markdown("""
    1) Upload **Resume** and **Job Description** (PDF or paste text).
    2) Click an action: **Run Matching**, **Enhance Resume**, or **Generate Cover Letter**.
    3) The app submits a job to the backend and waits for the result.
    """)

st.markdown("---")

col1, col2 = st.columns(2)

def extract_text_from_upload(uploaded_file) -> str:
    if not uploaded_file:
        return ""
    # Delegate parsing to backend to keep logic consistent
    bytes_data = uploaded_file.read()
    return client.parse_pdf(bytes_data, filename=uploaded_file.name)

with col1:
    st.subheader("📄 Resume")
    resume_input = st.radio("Input method", ["Upload PDF", "Paste Text"], key="resume_method")
    resume_text = ""
    if resume_input == "Upload PDF":
        up_res = st.file_uploader("Upload Resume (PDF)", type=["pdf"], key="resume_pdf")
        if up_res:
            with st.spinner("Parsing resume PDF..."):
                try:
                    resume_text = extract_text_from_upload(up_res)
                except Exception as e:
                    st.error(f"Resume parsing failed: {e}")
    else:
        resume_text = st.text_area("Paste Resume Text", height=250, key="resume_textarea")

    if resume_text:
        with st.expander("🔍 Resume Preview"):
            st.text_area("Resume Text", resume_text, height=150, key="resume_preview")

with col2:
    st.subheader("📑 Job Description")
    jd_input = st.radio("Input method", ["Upload PDF", "Paste Text"], key="jd_method")
    jd_text = ""
    if jd_input == "Upload PDF":
        up_jd = st.file_uploader("Upload JD (PDF)", type=["pdf"], key="jd_pdf")
        if up_jd:
            with st.spinner("Parsing JD PDF..."):
                try:
                    jd_text = extract_text_from_upload(up_jd)
                except Exception as e:
                    st.error(f"JD parsing failed: {e}")
    else:
        jd_text = st.text_area("Paste JD Text", height=250, key="jd_textarea")

    if jd_text:
        with st.expander("🔍 JD Preview"):
            st.text_area("JD Text", jd_text, height=150, key="jd_preview")

st.markdown("---")

# Action buttons
disabled = not (resume_text and jd_text)
c1, c2, c3 = st.columns(3)
output = st.empty()

# Keep last job info in session_state for download links
if "last_job" not in st.session_state:
    st.session_state.last_job = {"id": None, "type": None, "result": None}

def _html_button(label: str, href: str):
    # Simple anchor styled like a button; works across Streamlit versions
    return f"""
    <a href="{href}" target="_blank" style="
        display: inline-block;
        text-decoration: none;
        background: #0f62fe;
        color: white;
        padding: 0.5rem 0.75rem;
        border-radius: 6px;
        font-weight: 600;
        border: 1px solid #0f62fe;
        text-align: center;
        width: 100%;
        ">
        {label}
    </a>
    """

def _render_downloads(job_id: str):
    st.markdown("### 📥 Download Result")
    url_pdf = client.download_url(job_id, "pdf")
    st.markdown(_html_button("⬇️ Download PDF", url_pdf), unsafe_allow_html=True)


def _run_job(job_type: str, resume: str, jd: str):
    with output.container():
        st.info(f"Submitting **{job_type}** job...")
        try:
            job_id = client.submit_job(job_type, resume, jd)
        except Exception as e:
            st.error(f"Failed to submit job: {e}")
            return

        st.success(f"Job submitted. ID: `{job_id}`")
        prog = st.progress(0)
        status_box = st.empty()

        total_wait = 180.0  # seconds; also used to scale the progress bar

        def on_tick(elapsed, status):
            pct = min(100, int((elapsed / total_wait) * 100))  # scale progress to the full wait
            prog.progress(pct)
            status_box.write(f"⏳ Elapsed: {int(elapsed)}s — Status: **{status}**")

        with st.spinner("Waiting for result..."):
            result = client.wait_with_progress(job_id, total_wait=total_wait, poll_interval=1.5, on_tick=on_tick)

        prog.progress(100)
        st.write("")

        status = result.get("status")
        st.session_state.last_job = {"id": job_id, "type": job_type, "result": result}

        if status == "SUCCESS":
            st.success("✅ Job finished")

            # Minimal on-screen preview; full artifacts are downloadable
            payload = result.get("result") or {}
            with st.expander("👀 Quick Preview (raw result)", expanded=False):
                st.json(payload)

            _render_downloads(job_id)
        elif status == "FAILURE":
            st.error(f"❌ Job failed: {result.get('error')}")
        else:
            st.warning(f"Job ended in state {status}. Try again or check logs.")
        st.write("---")

with c1:
    if st.button("🚀 Run Matching", disabled=disabled, use_container_width=True):
        _run_job("match", resume_text, jd_text)

with c2:
    if st.button("📝 Enhance Resume", disabled=disabled, use_container_width=True):
        _run_job("enhance", resume_text, jd_text)

with c3:
    if st.button("✉️ Generate Cover Letter", disabled=disabled, use_container_width=True):
        _run_job("cover_letter", resume_text, jd_text)

# If the last job of this session succeeded, show its download button again at the bottom
_last = st.session_state.last_job
if _last.get("id") and (_last.get("result") or {}).get("status") == "SUCCESS":
    st.markdown("---")
    st.markdown("### Last job downloads")
    _render_downloads(st.session_state.last_job["id"])

Overwriting ../frontend/streamlit_app.py
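
`_html_button` interpolates `label` and `href` straight into markup that is rendered with `unsafe_allow_html=True`. Here both values come from the app itself and the backend-issued job ID, but if either could ever carry user-influenced text, escaping is a cheap safeguard. The sketch below shows the idea with the standard library; `html_button` is an illustrative variant with the inline styling omitted for brevity.

```python
import html

def html_button(label: str, href: str) -> str:
    """Build an anchor tag with the label and href HTML-escaped."""
    safe_href = html.escape(href, quote=True)
    safe_label = html.escape(label)
    return f'<a href="{safe_href}" target="_blank">{safe_label}</a>'

print(html_button("Download PDF", "http://localhost:8000/job/abc/download?format=pdf"))
```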
