Local-first documentation reasoning for engineers and AI coding agents.
DocsRAG is an open-source, self-hostable platform that turns public technical documentation into a grounded reasoning layer. It crawls docs, extracts structured sections, separates explanations from code examples, and answers implementation-focused questions with citations — using your own local setup and provider configuration.
Why now?
In the era of AI-assisted coding, normal docs search and naive RAG are no longer enough. Engineers and coding agents need answers that are specific, grounded, and actionable — not just related paragraphs retrieved from a large documentation site.
When I’m working on integrations (Stripe, APIs, SDKs, etc.), I constantly find myself going back and forth between documentation pages.
You search for something simple like:
“how to generate an API key”
And you end up:
- opening multiple pages
- reading partially related sections
- piecing together steps manually
- guessing which code example actually applies
Even with AI tools, the answers are often:
- incomplete
- not grounded in the actual docs
- or missing the exact implementation details
DocsRAG was built to fix that.
Instead of just searching documentation, it tries to understand how the docs are structured and to return:
- clear explanations
- actionable steps
- relevant code examples
- grounded answers with citations
Engineers and AI coding tools struggle to find exact implementation guidance inside large documentation sites.
The useful answer is usually:
- buried across multiple pages
- mixed with navigation noise
- surrounded by high-level explanations
- disconnected from the code examples
DocsRAG turns documentation into a structured reasoning layer.
Instead of treating docs as one big text corpus, it:
- separates explanation from examples
- preserves section context
- builds a retrieval flow optimized for real engineering questions
Technical documentation is designed for browsing — not for implementation-time reasoning.
- Docs are large, fragmented, and spread across many pages
- Engineers need exact steps, not broad summaries
- AI tools hallucinate when context is incomplete
- Code examples and explanations are not tightly linked
- Docs chatbots often return related pages, not the right ones
- Naive RAG treats all chunks equally (explanation, reference, examples)
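The last bullet can be made concrete: a retrieval layer that knows whether a chunk is an explanation, a reference table, or a code example can weight matches accordingly. A minimal sketch, with purely illustrative weights (not values DocsRAG actually uses):

```python
# Illustrative chunk-type weights; DocsRAG's actual scoring differs.
TYPE_WEIGHTS = {"explanation": 1.0, "reference": 0.6, "example": 0.4}

def rerank(hits: list[tuple[float, str, str]]) -> list[tuple[float, str, str]]:
    """Re-sort (similarity, chunk_type, text) hits by type-weighted score."""
    return sorted(hits, key=lambda h: -(h[0] * TYPE_WEIGHTS[h[1]]))

hits = [
    (0.82, "reference", "Query parameter table"),
    (0.78, "explanation", "How to declare a query parameter"),
    (0.90, "example", "def read_items(q): ..."),
]
best = rerank(hits)[0]
print(best[1])  # explanation
```

Even though the raw similarity of the example chunk is highest (0.90), the explanation chunk wins after weighting, which matches the explanation-first retrieval DocsRAG aims for.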
DocsRAG is a staged documentation reasoning engine.
It:
- crawls public technical docs
- extracts structured sections
- separates explanation blocks from code examples
- keeps them linked by section/page context
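In the actual stack this extraction is done with Trafilatura and BeautifulSoup (see the stack section below). As a stdlib-only sketch of the idea, the toy parser here splits a docs page into sections, keeping explanation text and code blocks separate but linked under the same heading; the HTML sample and class names are illustrative, not the project's:

```python
from __future__ import annotations

from dataclasses import dataclass, field
from html.parser import HTMLParser

@dataclass
class Section:
    heading: str
    explanation: list[str] = field(default_factory=list)
    code_examples: list[str] = field(default_factory=list)

class SectionSplitter(HTMLParser):
    """Toy parser: <h2> opens a section, <p> is explanation, <pre> is code."""

    def __init__(self):
        super().__init__()
        self.sections: list[Section] = []
        self._mode: str | None = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "p", "pre"):
            self._mode = tag

    def handle_endtag(self, tag):
        if tag in ("h2", "p", "pre"):
            self._mode = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._mode == "h2":
            self.sections.append(Section(heading=text))
        elif self.sections and self._mode == "pre":
            self.sections[-1].code_examples.append(text)
        elif self.sections and self._mode == "p":
            self.sections[-1].explanation.append(text)

parser = SectionSplitter()
parser.feed("""
<h2>Create an API key</h2>
<p>Generate a key from the dashboard, then send it as a bearer token.</p>
<pre>curl -H "Authorization: Bearer sk_test_123" https://api.example.com</pre>
""")
section = parser.sections[0]
print(section.heading)             # Create an API key
print(len(section.code_examples))  # 1
```

Because the explanation and the curl snippet land in the same `Section`, a later retrieval step can surface the explanation first and attach the example only when it is relevant.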
When a question comes in:
- retrieves explanation-first context
- attaches only relevant code examples
- validates whether the docs actually support the answer
- generates a grounded response with citations
The goal is not to be a generic docs chatbot.
👉 The goal is to help engineers and AI agents answer implementation questions reliably from official docs.
- Implementation-oriented, not summary-oriented
- Explanation-first retrieval
- Code examples are linked to their actual context
- Intent-aware retrieval pipeline
- Support validation before answering
- Designed for humans and AI agents
- Local-first and self-hostable
Ingestion flow:

- User adds a public documentation URL
- Docs are crawled and extracted
- Content is parsed into structured sections
- Explanation blocks and code examples are stored separately but linked
- Explanation chunks are indexed

Question flow:

- User asks a question
- Intent analysis + query planning
- Retrieval + reranking
- Support validation
- Relevant code examples are attached
- Final grounded answer is generated with citations
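The flow above can be sketched end to end. Everything below is a toy stand-in (word-overlap retrieval, a trivial support check) rather than DocsRAG's actual pipeline, but it shows the shape: explanation-first retrieval, support validation, then an answer with linked examples and citations:

```python
from __future__ import annotations

from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    page: str                        # citation target
    code_example: str | None = None  # linked example, if any

# Toy corpus of explanation chunks with linked code examples.
CORPUS = [
    Chunk("Declare query parameters as function arguments.",
          "https://fastapi.tiangolo.com/tutorial/query-params/",
          "def read_items(q: str | None = None): ..."),
    Chunk("FastAPI is a modern web framework.",
          "https://fastapi.tiangolo.com/"),
]

def retrieve(question: str, corpus: list[Chunk]) -> list[Chunk]:
    """Stand-in for vector retrieval: rank chunks by word overlap."""
    words = set(question.lower().split())
    scored = [(len(words & set(c.text.lower().split())), c) for c in corpus]
    return [c for score, c in sorted(scored, key=lambda s: -s[0]) if score > 0]

def supports(question: str, chunk: Chunk) -> bool:
    """Stand-in for support validation: demand real lexical overlap."""
    words = set(question.lower().split())
    return len(words & set(chunk.text.lower().split())) >= 2

def answer(question: str) -> dict:
    hits = [c for c in retrieve(question, CORPUS) if supports(question, c)]
    if not hits:
        return {"answer": None, "citations": [], "examples": []}
    top = hits[0]
    return {
        "answer": top.text,
        "citations": [top.page],
        "examples": [top.code_example] if top.code_example else [],
    }

result = answer("How do I declare a query parameter?")
print(result["citations"])
```

Note that the loosely related "FastAPI is a modern web framework" chunk is retrieved but rejected by the support check, so it never reaches the answer; that filtering step is what keeps responses grounded rather than merely related.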
DocsRAG uses a staged backend pipeline rather than a single "retrieve and prompt" step.
- Ingestion is responsible for crawling public docs, extracting text, parsing sections, and building linked explanation and code-example records.
- Retrieval focuses on explanation chunks first, because those are usually the most reliable source for procedural answers.
- Query planning and reranking use question intent, entities, and procedural signals to improve retrieval quality.
- Code example selection is handled separately so the answer only includes examples that are relevant to the question.
- Support validation checks whether the retrieved documentation directly supports the requested task before the answer is returned.
- Answer composition produces explanation-first responses with citations and optionally linked implementation steps and examples.
This separation is important. It makes the system easier to inspect, improve, and adapt for both human-facing UI flows and future agent tooling.
The current repository is a monorepo with a local frontend and backend:
- Next.js 15 + React 19 power the web UI for ingestion, questioning, status, citations, and answer display.
- FastAPI provides the backend API for ingestion, asking questions, streaming answers, status, and reset operations.
- Poetry manages the Python backend environment and CLI packaging.
- Typer powers the backend CLI for ingesting docs, asking questions, serving the API, checking status, and reset flows.
- Chroma stores vector indexes for explanation-oriented retrieval.
- SQLite + SQLAlchemy persist local metadata such as docsets, pages, ingestion runs, and provider-related state.
- Trafilatura handles primary content extraction from public documentation pages.
- BeautifulSoup supports HTML parsing and structured section/code extraction.
- OpenAI-compatible LLM and embedding providers supply embeddings and answer generation while letting users bring their own keys and model choices.
- Uvicorn runs the FastAPI application in development and local deployment scenarios.
- Tailwind CSS + Radix UI support the frontend UI layer.
- Docker Compose provides an optional local container-based development path.
- Pytest, Ruff, Black, ESLint, and GitHub Actions CI provide the current testing and quality baseline.
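Because providers are OpenAI-compatible and bring-your-own-key, configuration mostly reduces to resolving a base URL, model, and key from the environment. The sketch below is illustrative only: the provider-to-URL table and the defaults are my assumptions, not the project's actual mapping, though the variable names follow the repo's env template:

```python
# Illustrative provider -> base URL table (NOT the project's actual mapping).
PROVIDER_BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    # Any OpenAI-compatible server works, e.g. a local inference server:
    "local": "http://localhost:11434/v1",
}

def resolve_llm_config(env: dict) -> dict:
    """Resolve an OpenAI-compatible endpoint from DOCSRAG_* variables."""
    provider = env.get("DOCSRAG_LLM_PROVIDER", "openai")
    return {
        "base_url": PROVIDER_BASE_URLS[provider],
        "model": env.get("DOCSRAG_LLM_MODEL", "gpt-4o-mini"),  # placeholder default
        "api_key": env.get("OPENAI_API_KEY", ""),
    }

cfg = resolve_llm_config({"DOCSRAG_LLM_PROVIDER": "local",
                          "DOCSRAG_LLM_MODEL": "llama3"})
print(cfg["base_url"])  # http://localhost:11434/v1
```

In real use you would pass `dict(os.environ)`; the point is that swapping providers or pointing at a local model server is a configuration change, not a code change.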
DocsRAG is designed to run locally.
You can clone the repository, configure your own providers, and run the full stack on your machine without depending on a hosted DocsRAG service. The project is built around bring-your-own API keys and provider configuration, so you control the models, the storage, and the runtime environment.
This matters for engineering workflows. Local-first tools are easier to inspect, easier to extend, and easier to trust when they are being used inside real implementation work.
For engineers, DocsRAG helps turn large documentation sets into something closer to an implementation assistant. It is useful when you need the practical meaning of a docs section, the exact flow for setting something up, or the most relevant example tied to the explanation that justifies it.
For AI coding agents, the value is grounded context. Instead of relying on partial memory or weak retrieval, an agent can use DocsRAG to pull supportable answers from official docs, cite them, and reduce the chance of inventing steps that are not actually documented.
This is especially useful for AI-assisted coding, vibe coding, internal tooling experiments, and any workflow where the gap between "related docs" and "actionable implementation guidance" matters.
- "How do I implement Stripe webhook verification?"
- "How do I configure FastAPI background tasks?"
- "How do I rotate API keys?"
- "What does this docs section mean in practical terms?"
- "Which official example is most relevant to this setup flow?"
- Engineers working from public technical documentation
- Developer tools users who want grounded answers instead of loose summaries
- AI coding workflows that need reliable documentation retrieval
- Coding assistants and agentic tooling experiments
- Open-source contributors building better developer infrastructure
- Teams that want self-hostable, citation-backed documentation answers
DocsRAG/
├── backend/ FastAPI app, CLI, ingestion, retrieval, pipeline, tests
├── frontend/ Next.js app
├── docker-compose.yml Optional local container workflow
├── Makefile Common development commands
└── README.md
DocsRAG currently uses a monorepo with separate frontend and backend apps.
Clone the repository:

```bash
git clone git@github.com:Ando22/rag-docs.git
cd rag-docs
```

Copy the example env file at the repository root:
```bash
cp .env.example .env
```

Then set the values you want to use. At minimum, you will usually want to configure:
- OPENAI_API_KEY
- DOCSRAG_LLM_PROVIDER
- DOCSRAG_LLM_MODEL
- DOCSRAG_EMBEDDING_PROVIDER
- DOCSRAG_EMBEDDING_MODEL
- NEXT_PUBLIC_API_URL
By default, DocsRAG uses local SQLite and local Chroma persistence directories.
Install the backend dependencies:

```bash
cd backend
poetry install
cd ..
```

Install the frontend dependencies:

```bash
cd frontend
npm install
cd ..
```

Start the backend:

```bash
make backend-dev
```

This starts the FastAPI server via the DocsRAG CLI.
In a second terminal:
```bash
make frontend-dev
```

The frontend runs on http://localhost:3000 by default, and the backend runs on http://localhost:8000 by default.
Or run both with one command:

```bash
make dev
```

The backend already includes both a CLI and HTTP API.
Example CLI usage:
```bash
cd backend
poetry run docsrag ingest https://fastapi.tiangolo.com/
poetry run docsrag ask "How do I declare a query parameter?" --docset https://fastapi.tiangolo.com/
poetry run docsrag chat --docset https://fastapi.tiangolo.com/
poetry run docsrag status
poetry run docsrag reset
poetry run docsrag serve --reload
```

Current API surface includes:
- POST /api/ingest
- POST /api/ask
- POST /api/ask/stream
- GET /api/status
- GET /api/health
- POST /api/reset
OpenAPI docs are available at http://localhost:8000/docs.
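As a rough sketch of calling the API from code, the stdlib-only snippet below posts a question to /api/ask. The request body field names here are assumptions; check the OpenAPI schema at /docs for the actual contract:

```python
import json
from urllib import request

API_BASE = "http://localhost:8000"  # backend default address

def build_ask_payload(question: str, docset: str) -> dict:
    # Field names are assumptions; see the OpenAPI docs at /docs
    # for the actual request body of POST /api/ask.
    return {"question": question, "docset": docset}

def ask(question: str, docset: str) -> dict:
    req = request.Request(
        f"{API_BASE}/api/ask",
        data=json.dumps(build_ask_payload(question, docset)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example (requires a running backend, e.g. `make backend-dev`):
# ask("How do I declare a query parameter?", "https://fastapi.tiangolo.com/")
```

For streaming answers, POST /api/ask/stream would be the endpoint to use instead; an HTTP client that supports chunked responses (such as httpx) is a better fit there.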
Contributions are welcome.
If you want to help, useful areas include:
- retrieval improvements
- parser and section-structure improvements
- better explanation/code-example linkage
- provider support and configuration profiles
- UI and UX improvements
- MCP integration and agent-facing interfaces
- tests, fixtures, and documentation quality
Issues, discussions, and pull requests are all useful. If you are exploring the project for the first time, small improvements to docs, test coverage, or retrieval debugging are good places to start.
The repository already has a solid foundation, but the project is still early. The checklist below is intentionally conservative.
- Monorepo structure
- Local frontend + backend
- FastAPI API and Typer CLI
- Public documentation ingestion
- Local metadata persistence
- Vector indexing with Chroma
- Basic ask flow with citations
- Structured section parsing
- Separate explanation and code-example handling
- More robust parsing across inconsistent docs layouts
- Stronger multi-page reasoning and synthesis
- Intent analysis
- Query planning
- Retrieval reranking
- Support validation before answering
- Initial code-example filtering and selection
- Better lexical and hybrid reranking
- Stronger code-example relevance scoring
- Retrieval evaluation benchmarks and fixtures
- Better handling for ambiguous or underspecified questions
- Basic web UI
- Ingest and ask workflow
- Answer and citation display
- Settings/status surface
- Better code display and comparison views
- Expanded debug and retrieval inspection panel
- First-run onboarding/setup flow
- Sharper docset management UX
- MCP ask_docs interface
- Structured tool output modes for agents
- Agent-oriented answer mode
- Better streaming/tool-call ergonomics
- CLI runner flows for coding-agent pipelines
- MIT license
- CI for backend and frontend
- Environment template
- Better end-to-end examples
- Architecture docs
- Dedicated contribution guide
- More test coverage and fixtures
Near-term directions for the project include:
- improved structured parsing for inconsistent public docs
- better retrieval and reranking quality
- stronger code-example selection and explanation/example linking
- MCP support for AI coding tools
- a more agent-friendly CLI runner and structured output mode
- multi-doc and cross-doc reasoning
- better provider profiles and local configuration ergonomics
- improved debugging, evaluation, and benchmark workflows
MIT. See LICENSE.