Turn any llms.txt or documentation base into a production-ready, lightning-fast RAG assistant in 60 seconds. Built with Agno, Next.js, and FastAPI for developers who want private, reliable AI sidekicks.
DocsAI eliminates hallucinations by strictly grounding answers in your documentation. It ships with a polished web dashboard for agent management and an embeddable widget you can drop into any external website.
DocsAI is organized as a monorepo with a clean split: Next.js serves the user-facing dashboard and widget, while FastAPI orchestrates incoming chats through Agno, which retrieves context from the vector store and queries the configured LLM.
```mermaid
graph LR
A[Visitor] -->|Widget Chat| B(Next.js Web App)
C[Admin] -->|Dashboard / Setup| B
B -->|API Requests| D(FastAPI Backend)
D -->|RAG Framework| E(Agno)
E -->|Vector Store| F[(Chroma DB)]
E -->|Inference Node| G[Ollama, OpenAI, Anthropic...]
```
You can drop the embed script generated by DocsAI onto:
- Developer documentation portals (Nextra, Docusaurus, Mintlify)
- Internal support desks for employee FAQs
- Knowledge-base sidekicks for internal HR / Finance / DevOps policies
- Customer support widgets
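As a rough illustration of what embedding looks like, the snippet below is a hypothetical sketch; the script URL and `data-agent-id` attribute are placeholders, so copy the real snippet generated by your DocsAI dashboard instead:

```html
<!-- Hypothetical embed snippet: the real one is generated in the DocsAI dashboard -->
<script
  src="https://your-docsai-host.example.com/widget.js"
  data-agent-id="YOUR_AGENT_ID"
  defer></script>
```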
DocsAI is designed to run under Docker Compose. Ollama ships as the default provider (reached via host.docker.internal) for a completely free, local, and keyless experience. OpenAI, Anthropic, OpenRouter, and Aisa.one are also supported during agent setup.
- Clone the repository
- Set up your environment:

```shell
cp apps/web/.env.example apps/web/.env.local  # insert your Privy App ID here
```

- Spin up the stack:

```shell
docker-compose up --build
```
Ports:
- Web application (Dashboard, Widget generator): http://localhost:3000
- API layer (FastAPI, Swagger UI): http://localhost:8000
- Chroma DB data: mounted locally at `/tmp/chroma`
Note: if you chose local inference, make sure your local Ollama is running (`ollama serve`) and reachable from inside Docker.
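On Linux, `host.docker.internal` does not resolve inside containers by default. A common fix is to map it to the host gateway in your compose file; the sketch below assumes a service named `api`, so adjust it to match your `docker-compose.yml`:

```yaml
services:
  api:  # service name assumed; match your docker-compose.yml
    extra_hosts:
      # Lets the container reach the host's Ollama (default port 11434)
      - "host.docker.internal:host-gateway"
```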
To put DocsAI on the open web, deploy its Docker images to platforms such as GCP Cloud Run, Railway, Render, or ECS.
Deployment overview:
- Backend (FastAPI): deploy the `apps/api` directory using its Dockerfile. It requires no persistent state if you accept volatile in-memory indexing via Chroma (a mounted volume is strongly recommended for production). On serverless platforms, pick an external model provider such as OpenAI and supply its key, since a local Ollama instance is generally unreachable from a zero-scale network.
- Frontend (Next.js): deploy the `apps/web` directory (via its Dockerfile or natively on Vercel). Two build/runtime environment variables are required: `NEXT_PUBLIC_PRIVY_APP_ID` (for authentication) and `NEXT_PUBLIC_API_URL` (pointing to your newly minted backend URL).
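For the frontend, both variables can be supplied via an env file; a minimal sketch with placeholder values (note that `NEXT_PUBLIC_`-prefixed variables are inlined by Next.js at build time, so rebuild after changing them):

```shell
# apps/web/.env.local — values below are placeholders
NEXT_PUBLIC_PRIVY_APP_ID=your-privy-app-id
NEXT_PUBLIC_API_URL=https://your-backend.example.com
```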
- Frontend Application: Next.js 14, React.js, Tailwind CSS V4, Lucide Icons, ShadCN (Vanilla), Privy (Auth)
- Backend Application: Python 3.11+, FastAPI, Uvicorn (uvloop)
- AI Core Framework: Agno (Knowledge Base / Semantic RAG)
- Vector Database: ChromaDB (embeddings generated automatically via `sentence-transformers/all-MiniLM-L6-v2`)
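Under the hood, retrieval reduces to nearest-neighbor search over embedding vectors (384 dimensions for all-MiniLM-L6-v2). A minimal, dependency-free sketch of the ranking step such a vector store performs, with toy 3-dimensional vectors standing in for real embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def rank_chunks(query_vec: list[float], chunk_vecs: list[list[float]]) -> list[int]:
    """Return chunk indices ordered from most to least similar to the query."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return [i for i, _ in sorted(scores, key=lambda s: s[1], reverse=True)]

# Toy vectors standing in for 384-dim MiniLM embeddings
query = [1.0, 0.0, 0.5]
chunks = [[0.9, 0.1, 0.4], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
print(rank_chunks(query, chunks))  # most relevant chunk index first
```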
Privacy is baked into the foundation.
- Tenancy: every project is assigned its own unique ID, keeping its data separate from other projects.
- BYO Model: we store no provider configuration of our own; if you enter an API key for your assistant, the backend uses it only at runtime to route requests.
- Strict Guidelines: system prompts are designed to refuse questions that fall outside the indexed material.
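The kind of grounding instruction this implies might look like the following (a hypothetical sketch, not the actual prompt shipped with DocsAI):

```
You are a documentation assistant. Answer ONLY from the context provided below.
If the answer is not contained in the context, reply:
"I can only answer questions about the indexed documentation."
Never invent URLs, APIs, or configuration values.
```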