Before diving deep into the project, let's discuss Codex, the OpenAI CLI tool for automated code generation using the latest o4-mini and o3 models.
For this project, I exclusively used the o4-mini model with `--full-auto`, and I planned the project idea in the `codex.md` file using ChatGPT (o4-mini again).
I can't compare it with Claude Code, which I haven't used, but my experience with Codex was excellent.
The tool and model handled almost everything in one shot, with only two errors, both caused by outdated documentation in the model's training data:
- OpenAI Python package (yes, you read that right)
- ChromaDB
I'm impressed by the model's speed, precision, and overall performance, and it's cost-effective too. Building this app, including fixing the two errors and adding two features (frontend and stats routes), cost less than $1.50 USD. Larger apps will naturally cost more, but the tool seems to optimize token usage to keep costs reasonable. I still need to try o3, but so far o4-mini has proven excellent and I'm eager to explore it further.
The main challenge remains the knowledge cutoff date. It should be moved up to February/March 2025, or the OpenAI API/model should have internet access to fetch the latest data, since outdated information is a real problem in a rapidly evolving tech landscape. To work around the cutoff, relevant documentation can be included in markdown files: for instance, `doc_openai.md` was added to document the current OpenAI Python package.
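The exact contents of that file aren't reproduced here, but a minimal version might simply demonstrate the post-1.0 `openai` SDK call pattern that replaced the older `openai.ChatCompletion.create` interface. This is a hypothetical excerpt, not the file from the repository:

```python
# Hypothetical excerpt from doc_openai.md: the openai>=1.0 call pattern
# that replaced the pre-1.0 openai.ChatCompletion.create interface.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1-nano",
    messages=[{"role": "user", "content": "Summarize this article ..."}],
)
print(response.choices[0].message.content)
```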
Web Article Retrieval-Augmented Generation (RAG) API with FastAPI, Docker, and OpenAI
This project provides an API-first system for ingesting web articles, structuring their content via OpenAI, indexing them in a vector database, and answering user queries using a retrieval-augmented generation (RAG) approach.
- Asynchronous Ingestion: Background workers fetch URLs, extract article content with `trafilatura`, organize it into structured JSON using OpenAI GPT-4.1-nano, and store documents in MongoDB (a rough sketch of this pipeline follows this list).
- Vector Indexing: Text chunks are embedded with a local `sentence-transformers` model (`all-MiniLM-L6-v2`) and stored in ChromaDB.
- RAG Querying: FastAPI endpoints embed user questions, retrieve top-k relevant chunks from ChromaDB, and generate answers via OpenAI GPT-4.1-nano.
- Dockerized: All components (API, worker, MongoDB, Redis, ChromaDB) run in Docker containers orchestrated by Docker Compose.
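To make the ingestion flow concrete, here is a rough sketch of what such a background task could look like. It is not the project's actual `tasks.py`; the function name, prompt, and MongoDB database/collection names are assumptions made for the example.

```python
# Sketch of an ingestion task: fetch a URL, extract the article text,
# ask the model for structured JSON, and store the result in MongoDB.
import httpx
import trafilatura
from openai import OpenAI
from pymongo import MongoClient

openai_client = OpenAI()
mongo = MongoClient("mongodb://mongodb:27017")  # database name below is assumed

def ingest_url(url: str) -> dict:
    html = httpx.get(url, follow_redirects=True, timeout=30).text
    text = trafilatura.extract(html)  # plain article text, None if extraction fails
    if not text:
        raise ValueError(f"No article content extracted from {url}")
    completion = openai_client.chat.completions.create(
        model="gpt-4.1-nano",
        messages=[
            {"role": "system", "content": "Return the article as JSON with title, summary and key_points."},
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
    )
    document = {"url": url, "structured": completion.choices[0].message.content}
    mongo.neuraldocs.documents.insert_one(document)
    return document
```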
- Language & Framework: Python 3.11, FastAPI
- HTTP Client: `httpx`
- Article Extraction: `trafilatura`
- LLM API: OpenAI `gpt-4.1-nano` via the `openai` Python SDK
- Embeddings: `sentence-transformers` (`all-MiniLM-L6-v2`); see the indexing sketch after this list
- Vector Database: ChromaDB
- Metadata Store: MongoDB
- Task Queue: Redis + RQ
- Containerization: Docker, Docker Compose
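For the embedding and vector-store side, a minimal indexing sketch using `sentence-transformers` and the ChromaDB HTTP client might look like the following; the collection name and ID scheme are assumptions.

```python
# Sketch of the indexing step: embed text chunks locally, store them in ChromaDB.
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
chroma = chromadb.HttpClient(host="chromadb", port=8000)
collection = chroma.get_or_create_collection("articles")  # collection name assumed

def index_chunks(url: str, chunks: list[str]) -> None:
    embeddings = embedder.encode(chunks).tolist()
    collection.add(
        ids=[f"{url}#{i}" for i in range(len(chunks))],  # one ID per chunk
        embeddings=embeddings,
        documents=chunks,
        metadatas=[{"source": url} for _ in chunks],
    )
```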
- Docker Engine (v20+)
- Docker Compose (v2+)
- OpenAI API key
- Clone the repository
```bash
git clone https://github.com/mxmarchal/neuraldocs.git
cd neuraldocs
```
- Configure environment variables
Copy the example and set your OpenAI key:
```bash
cp .env.example .env
# Edit .env and set OPENAI_API_KEY
```
- Start services

```bash
docker-compose up --build
```

This will build the images and start the following services:
- `api`: The main FastAPI application, accessible at `http://localhost:8000`.
- `worker`: The RQ worker processing background tasks (no direct access needed).
- `mongodb`: The MongoDB database, accessible on the host at `mongodb://localhost:27018` (maps to container port 27017).
- `redis`: The Redis server used for the task queue, accessible on the host at `redis://localhost:6379`.
- `chromadb`: The ChromaDB vector store, accessible on the host at `http://localhost:8001` (maps to container port 8000).
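Inside the compose network, the API and worker containers reach these services by their service names on the original ports (27017, 6379, 8000), not the remapped host ports. A `db.py`-style wiring sketch, with an assumed database and collection name, could look like this:

```python
# Sketch of a db.py-style module wiring up the backing services.
# Inside the compose network, containers use the service names and
# original container ports, not the remapped host ports.
import chromadb
import redis
from pymongo import MongoClient
from rq import Queue

mongo_client = MongoClient("mongodb://mongodb:27017")
db = mongo_client["neuraldocs"]  # database name assumed

redis_conn = redis.Redis(host="redis", port=6379)
task_queue = Queue("default", connection=redis_conn)

chroma_client = chromadb.HttpClient(host="chromadb", port=8000)
collection = chroma_client.get_or_create_collection("articles")  # name assumed
```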
- Endpoint: `POST /add-url`
- Body: `{ "url": "https://example.com/article" }`
- Response: `{ "message": "URL queued for processing", "task_id": "<rq_job_id>" }`
- Endpoint: `GET /tasks/{task_id}`
- Response: `{ "task_id": "<rq_job_id>", "status": "queued|started|finished|failed", "result": { /* output of processing or error */ } }`
- Endpoint: `POST /query`
- Body: `{ "question": "What is the main idea of the article?", "top_k": 5 /* optional, defaults to 5 */ }`
- Response: `{ "answer": "The article explains ...", "sources": [ "https://example.com/article", ... ] }`
- Endpoint: `GET /stats`
- Response: `{ "documents": 123 /* ingested documents in MongoDB */, "vectors": 456 /* stored vectors in ChromaDB */ }`
```bash
# 1. Add a URL
curl -X POST http://localhost:8000/add-url \
  -H 'Content-Type: application/json' \
  -d '{"url":"https://example.com/article"}'

# 2. Check ingestion status
curl http://localhost:8000/tasks/<task_id>

# 3. Ask a question
curl -X POST http://localhost:8000/query \
  -H 'Content-Type: application/json' \
  -d '{"question":"Key points from the article?"}'
```
```text
.
├── docker-compose.yml   # Multi-service orchestration
├── .env.example         # Environment variable template
├── .env                 # Local environment variables (ignored by git)
├── README.md            # Project overview & usage
└── app/
    ├── Dockerfile       # API & worker image definition
    ├── requirements.txt # Python dependencies
    ├── config.py        # Pydantic settings
    ├── db.py            # DB & vector client setup
    ├── tasks.py         # RQ tasks: ingestion pipeline
    └── main.py          # FastAPI application
```
Environment variables (in `.env`):

| Variable | Description | Default |
|---|---|---|
| `OPENAI_API_KEY` | OpenAI API key | |
| `MONGO_HOST` | MongoDB hostname | `mongodb` |
| `MONGO_PORT` | MongoDB port | `27017` |
| `REDIS_HOST` | Redis hostname | `redis` |
| `REDIS_PORT` | Redis port | `6379` |
| `CHROMA_HOST` | ChromaDB hostname | `chromadb` |
| `CHROMA_PORT` | ChromaDB port | `8000` |
| `EMBEDDING_MODEL_NAME` | SentenceTransformer model name | `all-MiniLM-L6-v2` |
| `NANO_MODEL_NAME` | OpenAI model for structuring articles | `gpt-4.1-nano` |
| `RAG_MODEL_NAME` | OpenAI model for RAG answering | `gpt-4.1-nano` |
| `TOP_K` | Default number of retrieved chunks | `5` |
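A `config.py` reading these variables might look roughly like the sketch below. It assumes Pydantic v2 with the `pydantic-settings` package; the project may equally use Pydantic v1's `BaseSettings`.

```python
# Sketch of a pydantic-settings based config module mirroring the table above.
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    # Env var names match the field names case-insensitively, e.g. OPENAI_API_KEY.
    openai_api_key: str
    mongo_host: str = "mongodb"
    mongo_port: int = 27017
    redis_host: str = "redis"
    redis_port: int = 6379
    chroma_host: str = "chromadb"
    chroma_port: int = 8000
    embedding_model_name: str = "all-MiniLM-L6-v2"
    nano_model_name: str = "gpt-4.1-nano"
    rag_model_name: str = "gpt-4.1-nano"
    top_k: int = 5

    model_config = SettingsConfigDict(env_file=".env")

settings = Settings()
```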