# AskGraph

A RAG knowledge base platform with custom GraphRAG on PostgreSQL.
AskGraph is a self-hosted retrieval-augmented generation platform that lets you upload documents, build a queryable knowledge base, and get grounded answers from a language model of your choice. Unlike systems that rely on dedicated graph databases or closed-source embedding APIs, AskGraph implements its full GraphRAG pipeline — entity extraction, knowledge graph construction, and multi-hop traversal — entirely inside PostgreSQL using pgvector and recursive CTEs. The result is a single-database architecture that is straightforward to operate, easy to back up, and capable of three distinct retrieval strategies (vector similarity, graph traversal, and a weighted hybrid of both) selectable per query.
## Architecture

```
Browser
   |
   v
Next.js (port 3000)
   | REST + WebSocket
   v
FastAPI (port 8000)
   |
   +---> PostgreSQL 17 + pgvector
   |        |
   |        +-- collections
   |        +-- documents
   |        +-- chunks (embeddings via HNSW index)
   |        +-- kg_entities (entity nodes, embeddings via HNSW index)
   |        +-- kg_entity_chunks
   |        +-- kg_relations (edges with weight, recursive CTE traversal)
   |
   +---> LLM provider (Ollama / OpenAI / any LLMWire-compatible)
   |
   +---> Embedding provider (local sentence-transformers / OpenAI)
```
## Features

- Document ingestion for PDF, DOCX, and plain text files, with configurable chunk size and overlap
- Three retrieval modes per query: pure vector similarity search, knowledge graph traversal, and a weighted hybrid of both
- Knowledge graph built automatically during ingestion: entities and typed relations are extracted by the LLM from every chunk and stored as nodes and edges with pgvector embeddings
- Graph traversal using a `WITH RECURSIVE` CTE over `kg_relations`, following edges bidirectionally up to a configurable hop depth, with relation-weight decay per hop
- Streaming chat responses over WebSocket, with per-token delivery and citation metadata
- Collection-scoped search so documents are isolated by project or tenant
- Knowledge graph visualisation endpoint returning all entities and relations for a collection
- Alembic-managed database schema with HNSW indexes on both chunk and entity embedding columns
- Fully containerised: a single `docker compose up -d` starts the database, API, and frontend
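The chunking behaviour described above (fixed size with overlap) amounts to a sliding character window. A minimal sketch of that idea; `split_text` is an illustrative helper, not the project's actual ingestion code:

```python
def split_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap their neighbours."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap  # each new chunk starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks
```

With the defaults, consecutive chunks share their last/first 50 characters, which helps sentences that straddle a chunk boundary remain retrievable.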
## Quick start

```bash
# Clone the repository
git clone https://github.com/your-org/askgraph.git
cd askgraph

# Start all services
docker compose up -d

# Open the application
open http://localhost:3000
```

The interactive API docs are available at http://localhost:8000/docs.

To use OpenAI instead of Ollama:

```bash
OPENAI_API_KEY=sk-... LLM_PROVIDER=openai LLM_MODEL=gpt-4o docker compose up -d
```

## API

| Method | Path | Description |
|---|---|---|
| GET | `/health` | Service liveness check |
| POST | `/collections` | Create a new collection |
| GET | `/collections` | List all collections |
| DELETE | `/collections/{id}` | Delete a collection and all its documents |
| POST | `/documents` | Upload and ingest a document (multipart/form-data) |
| GET | `/documents?collection_id={id}` | List documents in a collection |
| DELETE | `/documents/{id}` | Delete a document |
| POST | `/chat` | Run a RAG query, returns answer + citations |
| WS | `/chat/stream` | Stream a RAG answer token-by-token over WebSocket |
| GET | `/kg?collection_id={id}` | Return all KG entities and relations for a collection |
### POST /chat request body

```json
{
  "query": "What are the main themes in the uploaded papers?",
  "collection_id": "uuid",
  "retrieval_mode": "hybrid",
  "top_k": 5
}
```

### WS /chat/stream

Send a JSON message after connecting to `/chat/stream`:

```json
{
  "query": "Explain the architecture",
  "collection_id": "uuid",
  "retrieval_mode": "graph",
  "top_k": 5
}
```

The server sends `{"type": "token", "content": "..."}` events during generation, followed by a single `{"type": "citations", "data": [...]}` event.
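A client consuming this stream accumulates token events and stops at the citations event. The sketch below shows only that event handling; `handle_events` and the sample payloads are illustrative assumptions, not part of the API:

```python
import json

def handle_events(raw_events: list[str]) -> tuple[str, list]:
    """Assemble the streamed answer from token events; capture the final citations event."""
    answer_parts: list[str] = []
    citations: list = []
    for raw in raw_events:
        event = json.loads(raw)
        if event["type"] == "token":
            answer_parts.append(event["content"])
        elif event["type"] == "citations":
            citations = event["data"]
            break  # citations is documented as the last event in the stream
    return "".join(answer_parts), citations
```

A real client would wrap this loop around a WebSocket receive call instead of a Python list.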
## Retrieval modes

### Vector

Standard dense retrieval. The query is embedded with the configured embedding model, then the `chunks` table is searched by cosine distance via the HNSW index on `chunks.embedding`. Returns the top-k most similar chunks.
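In pure Python terms, this step is cosine similarity followed by a sort. A toy sketch for intuition only; in AskGraph the equivalent query runs inside PostgreSQL via pgvector:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_chunks(query_emb: list[float],
                 chunk_embs: dict[int, list[float]],
                 k: int = 5) -> list[tuple[int, float]]:
    """Rank chunk ids by cosine similarity to the query embedding."""
    scored = [(cid, cosine_similarity(query_emb, emb)) for cid, emb in chunk_embs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```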
### Graph

Knowledge-graph traversal. The query embedding is first used to find the top-3 most similar entity nodes in `kg_entities` (seed entities). A `WITH RECURSIVE` CTE then walks `kg_relations` bidirectionally up to 2 hops. Each hop multiplies the previous relation weight by 1/(depth+1) to decay scores with distance. The best score per chunk is kept and the top-k chunks are returned. This mode is particularly effective for multi-hop questions that require connecting related concepts across documents.
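The traversal and decay logic can be mirrored in plain Python as a breadth-first walk. A sketch under one reading of the 1/(depth+1) rule (the real traversal is a recursive CTE inside PostgreSQL, and the entity names here are illustrative):

```python
def traverse(edges: list[tuple[str, str, float]],
             seeds: dict[str, float],
             max_hops: int = 2) -> dict[str, float]:
    """Walk weighted edges bidirectionally from seed entities, decaying scores per hop.

    Returns the best score reached for every visited entity.
    """
    # Build a bidirectional adjacency list, since edges are followed both ways.
    adj: dict[str, list[tuple[str, float]]] = {}
    for src, dst, weight in edges:
        adj.setdefault(src, []).append((dst, weight))
        adj.setdefault(dst, []).append((src, weight))

    best = dict(seeds)
    frontier = dict(seeds)
    for hop in range(1, max_hops + 1):
        nxt: dict[str, float] = {}
        for node, score in frontier.items():
            for neighbour, weight in adj.get(node, []):
                candidate = score * weight / (hop + 1)  # decay with hop depth
                if candidate > best.get(neighbour, 0.0):
                    best[neighbour] = candidate  # keep only the best score per node
                    nxt[neighbour] = candidate
        frontier = nxt
    return best
```

In the real pipeline the scores landing on entities are then propagated to their linked chunks via `kg_entity_chunks`.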
### Hybrid

Runs both the vector and graph retrievers, then fuses their result sets. Each chunk receives a merged score of `0.5 * vector_score + 0.5 * graph_score`. Chunks that appear in only one result set receive 0.0 for the missing signal. The fused list is sorted by merged score and truncated to top-k. Weights are configurable in the `HybridRetriever` constructor.
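The fusion step reduces to a weighted merge of two score maps. A minimal sketch assuming the default 0.5/0.5 weights described above; `fuse_scores` is an illustrative name, not the actual method:

```python
def fuse_scores(vector_scores: dict[int, float],
                graph_scores: dict[int, float],
                w_vector: float = 0.5,
                w_graph: float = 0.5,
                top_k: int = 5) -> list[tuple[int, float]]:
    """Merge two per-chunk score maps; a missing signal counts as 0.0."""
    merged = {}
    for chunk_id in vector_scores.keys() | graph_scores.keys():
        merged[chunk_id] = (w_vector * vector_scores.get(chunk_id, 0.0)
                           + w_graph * graph_scores.get(chunk_id, 0.0))
    # Sort by fused score, highest first, and truncate to top-k.
    return sorted(merged.items(), key=lambda pair: pair[1], reverse=True)[:top_k]
```

Note that a chunk found by both retrievers outranks one found by only a single retriever with the same raw score, which is the point of the hybrid mode.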
## Configuration

All settings are read from environment variables (or a `.env` file in the project root).
| Variable | Default | Description |
|---|---|---|
| `DATABASE_URL` | `postgresql+asyncpg://askgraph:askgraph@localhost:5432/askgraph` | Async SQLAlchemy database URL |
| `EMBEDDING_PROVIDER` | `local` | Embedding backend: `local` or `openai` |
| `EMBEDDING_MODEL` | `all-MiniLM-L6-v2` | Model name for the embedding provider |
| `LLM_PROVIDER` | `ollama` | LLM backend: `ollama`, `openai`, or others |
| `LLM_MODEL` | `llama3` | Model name for the LLM provider |
| `LLM_API_KEY` | (empty) | API key for the LLM provider (if required) |
| `OPENAI_API_KEY` | (empty) | OpenAI API key (used when provider is `openai`) |
| `CHUNK_SIZE` | `512` | Maximum characters per chunk |
| `CHUNK_OVERLAP` | `50` | Character overlap between adjacent chunks |
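Reading these variables with their table defaults can be sketched as below; `Settings` is a hypothetical helper mirroring the table, not the project's actual config module:

```python
import os
from dataclasses import dataclass, field

@dataclass
class Settings:
    """Illustrative settings loader: each field falls back to the documented default."""
    database_url: str = field(default_factory=lambda: os.getenv(
        "DATABASE_URL",
        "postgresql+asyncpg://askgraph:askgraph@localhost:5432/askgraph"))
    embedding_provider: str = field(default_factory=lambda: os.getenv("EMBEDDING_PROVIDER", "local"))
    embedding_model: str = field(default_factory=lambda: os.getenv("EMBEDDING_MODEL", "all-MiniLM-L6-v2"))
    llm_provider: str = field(default_factory=lambda: os.getenv("LLM_PROVIDER", "ollama"))
    llm_model: str = field(default_factory=lambda: os.getenv("LLM_MODEL", "llama3"))
    chunk_size: int = field(default_factory=lambda: int(os.getenv("CHUNK_SIZE", "512")))
    chunk_overlap: int = field(default_factory=lambda: int(os.getenv("CHUNK_OVERLAP", "50")))
```

Using `default_factory` means the environment is read at instantiation time, so variables set after import are still picked up.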
## Benchmarks

Run these after ingesting a representative document set. Replace the placeholder values once measured.
| Mode | Precision@5 | Recall@5 | Latency (p50) | Latency (p95) |
|---|---|---|---|---|
| vector | — | — | — | — |
| graph | — | — | — | — |
| hybrid | — | — | — | — |
Measure each mode by comparing its retrieval results against a ground-truth QA set for your corpus.
## References

The design of AskGraph draws from the following research:

- LightRAG: Simple and Fast Retrieval-Augmented Generation (https://arxiv.org/abs/2410.05779)
- Microsoft GraphRAG: From Local to Global: A Graph RAG Approach to Query-Focused Summarization (https://arxiv.org/abs/2404.16130)
- Original RAG: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (https://arxiv.org/abs/2005.11401)
## License

MIT