CV / Job Description Matcher

Built during my own job search — because I wanted to know exactly how well my CV matches a role before applying.

What it does

You upload your CV and a job description (PDF, DOCX, or TXT). The app returns a match score (semantic similarity + skill coverage), a skill-by-skill breakdown (full / partial / missing), gap analysis, and actionable suggestions plus a cover-letter opening line. Everything runs locally: no cloud APIs, no account, no data leaving your machine.
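As a rough illustration of how a score built from semantic similarity and skill coverage could be combined, here is a minimal sketch. The credit per coverage level (`LEVEL_CREDIT`), the 50/50 weighting, and the function names are assumptions made for this example, not the project's actual values.

```python
from typing import Dict

# Hypothetical credit per coverage level; the real engine may weigh these differently.
LEVEL_CREDIT = {"full": 1.0, "partial": 0.5, "missing": 0.0}


def skill_coverage(skills: Dict[str, str]) -> float:
    """Average credit over all required skills (0.0 when no skills are listed)."""
    if not skills:
        return 0.0
    return sum(LEVEL_CREDIT[level] for level in skills.values()) / len(skills)


def match_score(semantic: float, skills: Dict[str, str],
                semantic_weight: float = 0.5) -> float:
    """Blend semantic similarity (0..1) and skill coverage (0..1) into a 0..100 score."""
    coverage = skill_coverage(skills)
    blended = semantic_weight * semantic + (1.0 - semantic_weight) * coverage
    return 100.0 * blended


skills = {"python": "full", "docker": "partial",
          "kubernetes": "missing", "fastapi": "full"}
print(skill_coverage(skills))     # 0.625
print(match_score(0.75, skills))  # 68.75
```

Keeping the weight as a parameter makes it easy to try different balances between the two signals.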

Architecture

  PDF/DOCX (CV + JD)
         │
         ▼
  ┌──────────────┐
  │   Parser     │  PyMuPDF / python-docx → plain text, chunked
  └──────┬───────┘
         │
         ▼
  ┌──────────────┐
  │  ChromaDB    │  sentence-transformers embeddings, persistent store
  └──────┬───────┘
         │
         ▼
  ┌──────────────┐     ┌──────────────┐
  │ MatchEngine  │────▶│    Ollama    │  skill extraction, suggestions, cover hook
  └──────┬───────┘     └──────────────┘
         │
         ▼
  ┌──────────────┐
  │ MatchReport  │  score, verdict, skills[], missing_skills[], cv_suggestions[]
  └──────────────┘
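
The parser stage above turns documents into chunked plain text. One common way to chunk is a sliding window of words with some overlap, sketched below. The function name, window size, and overlap are assumptions for illustration; the project's own chunking strategy may differ.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of `size` words; neighbours share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


print(chunk_words("a b c d e f g", size=4, overlap=2))
# ['a b c d', 'c d e f', 'e f g']
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which tends to help retrieval.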

Tech Stack

Technology | Role | Why this choice
---------- | ---- | ---------------
FastAPI | HTTP API + static frontend | Async, auto OpenAPI docs, Pydantic. Single process serves API and UI.
PyMuPDF, python-docx | Document parsing | Solid PDF/DOCX text extraction without external services.
ChromaDB | Vector store | In-process, persistent. Good fit for small-scale embedding storage.
sentence-transformers (all-MiniLM-L6-v2) | Embeddings | Local, fast, no API key. Used for CV–JD semantic similarity.
Ollama (e.g. llama3.2) | LLM | Local inference for skill extraction and suggestions. No usage limits, data stays on device.
Vanilla HTML/CSS/JS | Frontend | One file, no build. Served at / from FastAPI.
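
Semantic similarity between embeddings is typically measured with cosine similarity. The sketch below shows that calculation in plain Python on toy vectors; it assumes cosine is the metric in use, and real all-MiniLM-L6-v2 embeddings have 384 dimensions rather than the few shown here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for a CV chunk and a JD chunk.
cv_vec = [1.0, 2.0, 3.0]
jd_vec = [2.0, 4.0, 6.0]   # same direction as cv_vec, so similarity is close to 1
print(cosine_similarity(cv_vec, jd_vec))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors give 0.0
```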

Quick Start

Prerequisites: Python 3.11+, Ollama installed.

cd cvmatcher
python -m venv .venv
.venv\Scripts\activate   # Windows; on macOS/Linux use: source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env     # optional, defaults work with local Ollama
ollama pull llama3.2     # first run only
uvicorn app.main:app --reload

Open http://localhost:8000/ for the app, http://localhost:8000/docs for the API.

Docker Setup

Prerequisites

  • Docker Desktop installed and running
  • Ollama installed on the host with the model already pulled

Important note about Ollama

Ollama must be started manually with these environment variables before running docker compose, because it needs to be accessible from the Docker network:

  • OLLAMA_HOST=0.0.0.0:11434
  • OLLAMA_ORIGINS=*

On Windows PowerShell:

$env:OLLAMA_HOST = "0.0.0.0:11434"
$env:OLLAMA_ORIGINS = "*"
ollama serve

Running with Docker

docker compose up -d

Then open http://localhost:8000

Notes

  • Ollama runs on the host to utilize the GPU (NVIDIA RTX 4060)
  • ChromaDB data persists in a Docker named volume (chroma_data)
  • The WSL interface IP (172.21.80.1) is used to connect the FastAPI container to host Ollama; this address comes from the author's machine and will likely differ on yours
  • To stop: docker compose down
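
Before starting the container, it can help to confirm that Ollama is reachable at the address the container will use. The sketch below queries Ollama's `/api/tags` endpoint (which lists locally available models) using only the standard library. The helper names are hypothetical, and the host shown in the usage comment is the example address from the notes above.

```python
import urllib.request


def ollama_tags_url(host: str = "localhost", port: int = 11434) -> str:
    """Build the URL of Ollama's model-listing endpoint."""
    return f"http://{host}:{port}/api/tags"


def ollama_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any connection error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, refused connections, timeouts
        return False


if __name__ == "__main__":
    # Replace the host with the address your container uses, e.g. "172.21.80.1".
    url = ollama_tags_url("localhost")
    print(url, "reachable:", ollama_reachable(url))
```

If this returns False from the host itself, check that `ollama serve` is running with `OLLAMA_HOST` set as described above.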

API Endpoints

Method | Endpoint | Description
------ | -------- | -----------
GET | / | Serves the frontend (index.html).
GET | /health | Health check; returns status, ollama, model.
POST | /api/upload/cv | Upload CV (multipart file). Returns file_id, filename, chunks, status.
POST | /api/upload/jd | Upload job description (same contract).
POST | /api/analyze | JSON body { "cv_id", "jd_id" }. Returns full MatchReport.

Uploads: .pdf, .docx, .txt only; max 10 MB.
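
A client can mirror the upload rules locally and build the analyze request as sketched below, using only the standard library. The helper names are hypothetical. The sketch also assumes that `cv_id` and `jd_id` are the `file_id` values returned by the upload endpoints, and that "10 MB" means 10 × 1024 × 1024 bytes.

```python
import json
import os
import urllib.request

ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt"}
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" interpreted as 10 MiB


def check_upload(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file would be rejected by the documented limits."""
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 10 MB limit")


def build_analyze_request(cv_id: str, jd_id: str,
                          base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Prepare (but do not send) the POST /api/analyze request."""
    body = json.dumps({"cv_id": cv_id, "jd_id": jd_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


check_upload("my_cv.PDF", 250_000)  # passes: extension check is case-insensitive
req = build_analyze_request("cv-123", "jd-456")  # placeholder ids
print(req.get_method(), req.full_url)
# With the server running: urllib.request.urlopen(req) returns the MatchReport JSON.
```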

Known Limitations & Roadmap

Limitations

  • Tuned for English; other languages may get weaker skill extraction and suggestions.
  • Quality of suggestions depends on the local Ollama model; smaller models can be noisy or generic.
  • No auth or multi-tenancy; intended for single-user, local use.
  • Semantic score uses a fixed embedding model; it does not adapt to domain-specific jargon.

Roadmap

  • Paste JD text (no file) for quick checks.
  • Export MatchReport as PDF.
  • Optional Ollama embeddings so one stack handles both similarity and generation.
  • Configurable score weights (e.g. skill match vs semantic).
