match-cv

CV ingestion and matching service built with Django, pgvector, Datapizza pipelines, and Celery.

Scoring uses three category weights: skill, experience, and education. The API requires all three, and they must sum to 1.0 (for example, 0.4 + 0.4 + 0.2). A higher weight gives that category a stronger impact on the final ranking.
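The weight rule can be checked on the client before a request is sent. The following is a minimal sketch, not the service's own validator: the function name `validate_weights` and the tolerance value are assumptions made for illustration.

```python
import math

# The three categories named in this README.
REQUIRED_CATEGORIES = ("skill", "experience", "education")


def validate_weights(weights: dict[str, float], tolerance: float = 1e-6) -> None:
    """Raise ValueError unless all three weights are present and sum to 1.0.

    Hypothetical helper; the tolerance is an assumption, the real API may
    compare the sum differently.
    """
    missing = [c for c in REQUIRED_CATEGORIES if c not in weights]
    if missing:
        raise ValueError(f"missing weights: {', '.join(missing)}")

    total = sum(weights[c] for c in REQUIRED_CATEGORIES)
    # Compare with a tolerance because float sums are rarely exact.
    if not math.isclose(total, 1.0, abs_tol=tolerance):
        raise ValueError(f"weights must sum to 1.0, got {total}")


validate_weights({"skill": 0.4, "experience": 0.4, "education": 0.2})
```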

🚀 Quick Start • 🔌 API • 🏗️ Architecture • 🧪 Testing • 📝 Notes

🏗️ Architecture Overview

Search flow

Split the job offer by category (skill, experience, education) -> parallel per-category retrieval (semantic search plus full-text search on metadata) -> merge by document -> weighted scoring.
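The last two steps of the search flow (merge by document, then weighted scoring) can be sketched as below. This is an illustration under stated assumptions, not the project's pipeline code: `rank_documents` is a hypothetical name, scores are assumed to lie in [0, 1] with higher meaning better, a document with no hit in a category is assumed to contribute 0 for it, and multiple hits for one document in one category are assumed to collapse to the best score.

```python
from collections import defaultdict


def rank_documents(hits_by_category, weights, top_k=10):
    """Merge per-category hits by document and apply category weights.

    hits_by_category: {"skill": [(doc_id, score), ...], ...}
    weights: {"skill": 0.1, "experience": 0.7, "education": 0.2}
    """
    per_doc = defaultdict(dict)
    for category, hits in hits_by_category.items():
        for doc_id, score in hits:
            # Assumption: keep the best score per document and category.
            previous = per_doc[doc_id].get(category, 0.0)
            per_doc[doc_id][category] = max(previous, score)

    # Weighted sum; categories without a hit contribute nothing.
    ranked = [
        (doc_id, sum(weights[c] * s for c, s in scores.items()))
        for doc_id, scores in per_doc.items()
    ]
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


hits = {
    "skill": [("cv-a", 0.9), ("cv-b", 0.5)],
    "experience": [("cv-a", 0.2), ("cv-b", 0.8)],
    "education": [("cv-b", 0.5)],
}
ranking = rank_documents(hits, {"skill": 0.1, "experience": 0.7, "education": 0.2})
```

With experience weighted at 0.7, `cv-b` ranks above `cv-a` even though `cv-a` has the stronger skill match.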

Upload flow

Single upload: API -> serializer -> metadata extraction -> embedding -> vector store write
Bulk upload: API creates batch/items -> Celery task per item -> status polling endpoint

⚙️ Requirements

Python 3.13+
uv
Docker (recommended for PostgreSQL + Redis)

🚀 Quick Start (Local)

Create and activate virtualenv.

uv venv
source .venv/bin/activate

Install dependencies.

uv sync

Create local env file from template.

cp .env.example .env

Configure environment variables in .env.

OPENAI_API_KEY=your_key
EMBEDDING_MODEL_NAME=text-embedding-3-small

Start infrastructure.

docker compose up -d

Run migrations.

python manage.py migrate

Start Django server.

python manage.py runserver

Start Celery worker (new terminal).

celery -A src.config.celery worker -l info

🧪 Testing

Run tests:

pytest

Run tests with coverage:

pytest --cov --cov-report=html

🎨 Formatting

ruff format --config ./ruff.toml .

🔌 API Endpoints

Base prefix: /api/

1. Upload single CV

Method: POST /api/cv-documents/
Content-Type: multipart/form-data
File field: source_file

Example:

curl -X POST http://127.0.0.1:8000/api/cv-documents/ \
  -F "source_file=@/absolute/path/cv.pdf"

Responses:

201 document created and ingested synchronously
400 validation error

2. Bulk upload CVs (async)

Method: POST /api/cv-documents/bulk/
Content-Type: multipart/form-data
File field: files (repeat the field once per file)

Example:

curl -X POST http://127.0.0.1:8000/api/cv-documents/bulk/ \
  -F "files=@/absolute/path/cv1.pdf" \
  -F "files=@/absolute/path/cv2.pdf"

Responses:

202 returns batch_id and upload_item_id list
400 invalid multipart payload

3. Bulk upload batch status

Method: GET /api/cv-documents/bulk/<batch_id>/status/

Response contains:

batch status (PENDING|RUNNING|SUCCESS|FAILED|PARTIAL)
counters (total_files, processed_files, failed_files)
per-item status and error_message
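A client can poll this endpoint until the batch finishes. The sketch below uses only the standard library. It assumes the batch status is exposed under a `status` key in the JSON response and that SUCCESS, FAILED, and PARTIAL are terminal; neither is confirmed here, and the helper names `fetch_batch_status` and `wait_for_batch` are hypothetical.

```python
import json
import time
import urllib.request

# Assumption: these three statuses mean the batch will not change again.
TERMINAL_STATUSES = {"SUCCESS", "FAILED", "PARTIAL"}


def fetch_batch_status(base_url, batch_id):
    """GET the batch status endpoint and decode the JSON body."""
    url = f"{base_url}/api/cv-documents/bulk/{batch_id}/status/"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_batch(batch_id, fetch, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch(batch_id) until the batch reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        payload = fetch(batch_id)
        # Assumption: the batch status lives under the "status" key.
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"batch {batch_id} did not finish in {timeout}s")
        sleep(interval)
```

Against a local server, usage could look like `wait_for_batch(batch_id, lambda b: fetch_batch_status("http://127.0.0.1:8000", b))`. Passing the fetch function in keeps the polling loop testable without a running server.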

4. Run matching pipeline

Method: POST /api/search-runs/
Content-Type: application/json

Example payload:

{
  "job_offer_text": "Looking for backend engineer with Python and 5+ years",
  "weights": {
    "skill": 0.1,
    "experience": 0.7,
    "education": 0.2
  },
  "top_k": 10
}

Responses:

200 ranked candidate list
400 request validation error
500 pipeline/runtime error
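The same request can be built from Python with the standard library. This is a minimal sketch assuming the server runs on the local address used in the curl examples; `build_search_request` is a hypothetical helper name, and the shape of the response body is not specified here, so the example stops at building the request.

```python
import json
import urllib.request


def build_search_request(base_url, job_offer_text, weights, top_k=10):
    """Build a POST request for /api/search-runs/ with a JSON body."""
    body = json.dumps(
        {"job_offer_text": job_offer_text, "weights": weights, "top_k": top_k}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/search-runs/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "http://127.0.0.1:8000",
    "Looking for backend engineer with Python and 5+ years",
    {"skill": 0.1, "experience": 0.7, "education": 0.2},
)
```

To send it, pass `request` to `urllib.request.urlopen` and decode the body with `json.load`. Note that `urlopen` raises `urllib.error.HTTPError` for the 400 and 500 responses listed above, so callers that want to inspect those should catch it.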

📝 Notes

CV upload endpoints require multipart file upload; JSON file paths are not accepted.
If Celery worker is not running, bulk upload items remain in PENDING.
Vector metadata must be JSON-serializable; UUID handling is normalized in vector store code.
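To illustrate the last note: `json.dumps` rejects `uuid.UUID` values, so they have to be converted before metadata is stored. The sketch below shows one way to do that. It is not the project's vector store code, and `to_json_safe` is a hypothetical name.

```python
import json
import uuid


def to_json_safe(value):
    """Recursively replace UUIDs with strings so json.dumps accepts the result."""
    if isinstance(value, uuid.UUID):
        return str(value)
    if isinstance(value, dict):
        return {str(key): to_json_safe(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_json_safe(item) for item in value]
    # Other values are returned unchanged.
    return value


document_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
metadata = to_json_safe({"document_id": document_id, "related": [document_id]})
encoded = json.dumps(metadata)
```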
