A robust FastAPI-based backend service for candidate management, CV parsing, and workflow automation.
- 🚀 High-performance FastAPI backend
- 🤖 AI-powered CV parsing for PDF and DOCX (document loaders & Unstructured OCR)
- 🔒 Rate limiting and security middleware
- 📊 PostgreSQL database with SQLAlchemy ORM (Async)
- 🔄 Asynchronous request handling
- 📝 Structured logging system with correlation IDs
- 🌐 CORS support
- 🔍 Request ID tracking
- 🐳 Docker support
- 📄 Pydantic schema validation
- 📚 Database Migrations using Alembic
- 📈 Pinecone vector database integration
- 📦 Poetry dependency management
- 🔑 Redis for rate limiting
- 🤖 LangChain & LangGraph integration for AI workflows
- 📄 Document processing capabilities
- 🔐 AWS S3 integration for file storage (Not enabled)
- 📝 Streaming and non-streaming chat endpoints
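Two of the features above — structured logging and request correlation IDs — follow a common pattern that can be sketched with the standard library alone. The names below (`CorrelationFilter`, `JsonFormatter`) are illustrative, not taken from the project's `utils/logger.py`:

```python
import json
import logging
import uuid
from contextvars import ContextVar

# Holds the current request's correlation ID; in the real service a
# middleware would set this at the start of each request.
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="-")

class CorrelationFilter(logging.Filter):
    """Injects the correlation ID from the context into every log record."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True

class JsonFormatter(logging.Formatter):
    """Renders each record as one JSON object per line."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": getattr(record, "correlation_id", "-"),
        })

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.addFilter(CorrelationFilter())

# Simulate one request: assign an ID, then log under it.
correlation_id.set(str(uuid.uuid4()))
logger.info("candidate parsed")
```

Because the ID lives in a `ContextVar`, it stays correct across concurrently handled async requests.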
```
api/
├── agent/            # AI agent implementation
│   ├── workflow.py   # Candidate processing workflow
│   ├── tools.py      # Agent tools
│   └── prompts.py    # Agent prompts
├── api/              # API routes
│   ├── v1/           # API version 1 endpoints
│   ├── deps.py       # Dependencies
│   └── router.py     # Main router
├── core/             # Core configuration
│   └── config.py     # Settings management
├── crud/             # Database operations
│   ├── candidates.py # Candidate CRUD operations
│   └── sections.py   # Section CRUD operations
├── models/           # SQLAlchemy models
│   ├── candidates.py
│   ├── education.py
│   ├── experience.py
│   ├── projects.py
│   └── skills.py
├── schema/           # Pydantic schemas
│   ├── agent.py
│   ├── candidates.py
│   ├── education.py
│   └── responses.py
├── services/         # Business logic
│   └── documents.py  # Document processing
└── utils/            # Utilities
    ├── helpers.py
    ├── logger.py
    ├── s3_client.py
    └── middlewares/
```
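The settings management in `core/config.py` reads configuration from environment variables. The project likely uses Pydantic settings for this; the stdlib sketch below only illustrates the pattern, and the field defaults are hypothetical:

```python
import os
from dataclasses import dataclass, field

# Simplified stand-in for core/config.py: each field pulls its value
# from the environment at construction time, with an illustrative default.
@dataclass(frozen=True)
class Settings:
    env: str = field(default_factory=lambda: os.getenv("ENV", "dev"))
    project_name: str = field(
        default_factory=lambda: os.getenv("PROJECT_NAME", "cv-parser"))
    database_port: int = field(
        default_factory=lambda: int(os.getenv("DATABASE_PORT", "5432")))
    redis_port: int = field(
        default_factory=lambda: int(os.getenv("REDIS_PORT", "6379")))

def get_settings() -> Settings:
    """Build settings from the current environment (no caching, for clarity)."""
    return Settings()

settings = get_settings()
```

Centralizing configuration this way keeps every module reading from one typed object instead of scattering `os.getenv` calls.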
- Python 3.12+
- Docker and Docker Compose (optional)
- Redis
- PostgreSQL
- OpenAI API key
- Pinecone API key (https://www.pinecone.io/)
- Unstructured API key (https://docs.unstructured.io/api-reference/api-services/free-api)
- Clone the repository:

```bash
git clone https://github.com/andrew-sameh/agentic-cv-parser.git
cd agentic-cv-parser
```

- Install dependencies:

```bash
poetry install
```

- Set up environment variables:

```bash
cp .env.example .env
# Edit .env with your configuration
```

- Run the application:

```bash
poetry run python main.py
```

Alternatively, without Poetry:

- Create a virtual environment:

```bash
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Set up environment variables:

```bash
cp .env.example .env
# Edit .env with your configuration
```

- Run the application:

```bash
python main.py
```

To run with Docker instead:

1. Clone the repository
2. Set up environment variables:

```bash
cp .env.example .env
# Edit .env with your configuration
```

3. Start the application:

```bash
docker-compose -p cv-parser up -d
# or
docker compose -p cv-parser up -d
```

The following environment variables need to be configured in your `.env` file:
- `ENV`: Environment (dev/prod)
- `PROJECT_NAME`: Project name
- `VERSION`: API version
- `DESCRIPTION`: API description
- `LOG_LEVEL`: Logging level
- `LOG_JSON_ENABLE`: Enable JSON logging
- `BACKEND_CORS_ORIGINS`: Allowed CORS origins
- `DATABASE_USER`: PostgreSQL username
- `DATABASE_PASSWORD`: PostgreSQL password
- `DATABASE_NAME`: Database name
- `DATABASE_HOSTNAME`: Database host
- `DATABASE_PORT`: Database port
- `REDIS_HOST`: Redis host
- `REDIS_PORT`: Redis port
- `REDIS_DB`: Redis database number
- `AWS_S3_BUCKET_NAME`: S3 bucket name
- `AWS_S3_ACCESS_KEY_ID`: AWS access key
- `AWS_S3_SECRET_ACCESS_KEY`: AWS secret key
- `AWS_S3_REGION_NAME`: AWS region
- `AWS_S3_BASE_FOLDER`: Base folder in S3
- `OPENAI_API_KEY`: OpenAI API key
- `LLM_MODEL`: Language model to use
- `EMBEDDING_MODEL`: Embedding model
- `UNSTRUCTURED_API_KEY`: Unstructured API key
- `PINECONE_API_KEY`: Pinecone API key
- `PINECONE_INDEX_NAME`: Pinecone index name
- `EMBEDDING_SEARCH_TYPE`: Search type
- `EMBEDDING_SCORE_THRESHOLD`: Similarity threshold
- `EMBEDDING_TOPK`: Top K results
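For reference, a filled-in `.env` might look like the following. Every value below is a placeholder chosen for illustration, not a real default or recommended setting:

```env
ENV=dev
PROJECT_NAME=agentic-cv-parser
VERSION=0.1.0
DESCRIPTION=Candidate management and CV parsing API
LOG_LEVEL=INFO
LOG_JSON_ENABLE=true
BACKEND_CORS_ORIGINS=["http://localhost:3000"]
DATABASE_USER=postgres
DATABASE_PASSWORD=changeme
DATABASE_NAME=cv_parser
DATABASE_HOSTNAME=localhost
DATABASE_PORT=5432
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_DB=0
OPENAI_API_KEY=sk-...
LLM_MODEL=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
UNSTRUCTURED_API_KEY=...
PINECONE_API_KEY=...
PINECONE_INDEX_NAME=cv-parser
EMBEDDING_SEARCH_TYPE=similarity_score_threshold
EMBEDDING_SCORE_THRESHOLD=0.5
EMBEDDING_TOPK=5
```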
1. Start PostgreSQL and Redis using Docker:

```bash
docker-compose up -d db redis
```

2. Initialize the database:

```bash
alembic upgrade head
```

Once the application is running, visit:

- Swagger UI: http://localhost:8000/
- ReDoc: http://localhost:8000/redoc
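The `docker-compose up -d db redis` step above assumes a `docker-compose.yml` that defines at least `db` and `redis` services. A minimal sketch, with image tags, ports, and credential wiring chosen for illustration only:

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: ${DATABASE_USER}
      POSTGRES_PASSWORD: ${DATABASE_PASSWORD}
      POSTGRES_DB: ${DATABASE_NAME}
    ports:
      - "5432:5432"
  redis:
    image: redis:7
    ports:
      - "6379:6379"
```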
The agent system (`agent/`) handles chat functionalities.
API endpoints are organized in the `api/` directory with versioning support.
SQLAlchemy models in `models/` define the database schema for:
- Candidates
- Education
- Experience
- Projects
- Skills
- Certifications
- Document processing (`services/documents.py`) handles document parsing and extraction.
- S3 integration for file storage (not used)
- Structured logging
- Rate limiting
- Request correlation
- Error handling
- Redis integration
In progress
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request