88 changes: 72 additions & 16 deletions README.md
@@ -1,8 +1,22 @@
<div align="center">

# bigRAG

Open-source, self-hostable RAG platform with Turbopuffer-backed search. Upload documents, auto-chunk, embed, and retrieve through semantic, keyword, and hybrid modes behind a simple REST API.
**Open-source, self-hostable RAG platform with Turbopuffer-backed search.**

Upload documents, auto-chunk, embed, and retrieve through semantic, keyword, and hybrid search — all behind one clean REST API.

[![PyPI version](https://img.shields.io/pypi/v/bigrag?style=flat-square&logo=pypi&logoColor=white&label=PyPI)](https://pypi.org/project/bigrag/)
[![npm version](https://img.shields.io/npm/v/%40bigrag%2Fclient?style=flat-square&logo=npm&logoColor=white&label=npm)](https://www.npmjs.com/package/@bigrag/client)
[![Docker image](https://img.shields.io/docker/v/yoginth/bigrag-api?style=flat-square&logo=docker&logoColor=white&label=Docker&sort=semver)](https://hub.docker.com/r/yoginth/bigrag-api)
[![License: MIT](https://img.shields.io/badge/License-MIT-blue?style=flat-square)](LICENSE)
[![GitHub stars](https://img.shields.io/github/stars/bigint/bigrag?style=flat-square&logo=github&label=Stars)](https://github.com/bigint/bigrag)

[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)
[Quick Start](#quick-start) · [Architecture](#architecture) · [API Reference](#api-reference) · [SDKs](#sdks) · [MCP Server](#mcp-server) · [Configuration](#configuration)

</div>

---

## Features

@@ -32,7 +46,12 @@ Open-source, self-hostable RAG platform with Turbopuffer-backed search. Upload d
docker compose up -d
```

This starts bigRAG API, worker, admin UI, Postgres, and Redis. Configure Turbopuffer from onboarding before ingesting or querying collections. Open http://localhost:3000 for the admin UI or http://localhost:4000/docs for the interactive API docs.
This starts the bigRAG API, worker, admin UI, Postgres, and Redis. Open **[localhost:3000](http://localhost:3000)** for the admin UI or **[localhost:4000/docs](http://localhost:4000/docs)** for the interactive API docs.

> [!IMPORTANT]
> Configure Turbopuffer from onboarding before ingesting or querying collections.

Once Turbopuffer is configured, you can drive everything over HTTP:

```bash
# Create a collection
@@ -63,8 +82,7 @@ docker pull yoginth/bigrag-api:2026.4.30
docker pull yoginth/bigrag-ui:2026.4.30
```

Release artifacts use CalVer (`YYYY.M.D`). Docker also publishes `latest`; the
Python SDK publishes dated PyPI releases.
Release artifacts use CalVer (`YYYY.M.D`). Docker also publishes `latest`; the Python SDK publishes dated PyPI releases.
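
Because CalVer tags are not zero-padded, comparing them as plain strings orders them incorrectly (`2026.5.7` sorts after `2026.5.22`). The sketch below shows one way to compare `YYYY.M.D` tags chronologically. The `CalVer` type and `parse_calver` helper are hypothetical names for illustration and are not part of any bigRAG package.

```python
from datetime import date
from typing import NamedTuple


class CalVer(NamedTuple):
    """A YYYY.M.D version; tuples compare field by field, so order is chronological."""
    year: int
    month: int
    day: int


def parse_calver(tag: str) -> CalVer:
    """Parse a 'YYYY.M.D' tag (hypothetical helper, not a bigRAG API)."""
    parts = tag.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected YYYY.M.D, got {tag!r}")
    year, month, day = (int(part) for part in parts)  # ValueError if non-numeric
    date(year, month, day)  # ValueError if the calendar date cannot exist
    return CalVer(year, month, day)


tags = ["2026.5.7", "2026.4.30", "2026.5.22"]
newest = max(tags, key=parse_calver)  # numeric comparison, not string comparison
print(newest)
```

Pinning an exact tag, as the install commands in this README do, avoids the ordering question entirely; a parser like this only matters when a script has to pick the latest of several tags.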

## Architecture

@@ -244,7 +262,7 @@ const { results } = await client.queries.query("docs", { query: "What is RAG?" }
### Python

```bash
pip install bigrag==2026.5.7
pip install bigrag==2026.5.22
```

```python
@@ -259,7 +277,7 @@ doc = await client.documents.upload("docs", "/path/to/paper.pdf")
result = await client.queries.query("docs", {"query": "What is RAG?"})
```

## MCP server
## MCP Server

Expose bigRAG to Claude Desktop, Cursor, and any MCP-aware runtime:

@@ -289,36 +307,49 @@ Full-workspace keys expose 8 tools — `list_collections`, `get_collection`, `ge

## Configuration

Bootstrap settings use the `BIGRAG_` prefix as environment variables, or configure via `bigrag.toml`.
Backend logging defaults to `debug` / `text` for local development. Use `BIGRAG_LOG_LEVEL=info` and `BIGRAG_LOG_FORMAT=json` for production log collection. Configure Turbopuffer from the admin UI; it is stored in Postgres with the other instance settings.
Bootstrap settings use the `BIGRAG_` prefix as environment variables, or configure them in `bigrag.toml`. Backend logging defaults to `debug` / `text` for local development — use `BIGRAG_LOG_LEVEL=info` and `BIGRAG_LOG_FORMAT=json` for production log collection. Turbopuffer is configured from the admin UI and stored in Postgres alongside the other instance settings.
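
As a rough illustration of the prefix convention, the sketch below reads a few `BIGRAG_`-prefixed variables and falls back to the defaults documented in this section. It is not bigRAG's actual settings loader: `load_bootstrap_settings` and the `DEFAULTS` mapping are assumptions made for this example, and `bigrag.toml` handling is left out.

```python
import json
import os
from collections.abc import Mapping

# Documented defaults for a handful of settings (values are raw strings, as in the environment).
DEFAULTS = {
    "PORT": "4000",
    "HOST": "127.0.0.1",
    "LOG_LEVEL": "debug",
    "LOG_FORMAT": "text",
    "CORS_ORIGINS": "[]",
}


def load_bootstrap_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical loader: resolve BIGRAG_* variables against the documented defaults."""
    raw = {name: environ.get(f"BIGRAG_{name}", default) for name, default in DEFAULTS.items()}
    settings = {
        "port": int(raw["PORT"]),
        "host": raw["HOST"],
        "log_level": raw["LOG_LEVEL"].lower(),
        "log_format": raw["LOG_FORMAT"].lower(),
        "cors_origins": json.loads(raw["CORS_ORIGINS"]),  # documented as a JSON array
    }
    if settings["log_level"] not in {"debug", "info", "warning", "error"}:
        raise ValueError(f"unsupported log level: {settings['log_level']!r}")
    if settings["log_format"] not in {"text", "json"}:
        raise ValueError(f"unsupported log format: {settings['log_format']!r}")
    if not isinstance(settings["cors_origins"], list):
        raise ValueError("BIGRAG_CORS_ORIGINS must be a JSON array")
    return settings


# Production-style logging overrides; everything else keeps its default.
prod = load_bootstrap_settings({"BIGRAG_LOG_LEVEL": "info", "BIGRAG_LOG_FORMAT": "json"})
print(prod)
```

Passing the environment as a mapping keeps the function easy to test without mutating `os.environ`.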

#### Server

| Variable | Description | Default |
|----------|-------------|---------|
| `BIGRAG_PORT` | Server port | `4000` |
| `BIGRAG_HOST` | Bind address | `127.0.0.1` |
| `BIGRAG_WORKERS` | API worker processes | `1` |
| `BIGRAG_ENV` | `dev` or `prod` (prod enables startup safety checks) | `dev` |
| `BIGRAG_LOG_LEVEL` | Backend log level: `debug`, `info`, `warning`, or `error` | `debug` |
| `BIGRAG_LOG_FORMAT` | Backend log renderer: `text` or `json` | `text` |
| `BIGRAG_CORS_ORIGINS` | JSON array of allowed browser origins | `[]` |
| `BIGRAG_TRUSTED_PROXIES` | JSON array of trusted proxy CIDRs used to honor `X-Forwarded-For` for audit and access logs | `[]` |

#### Database & Redis

| Variable | Description | Default |
|----------|-------------|---------|
| `BIGRAG_DATABASE_URL` | Postgres URL (`postgres:5432` inside docker-compose, `localhost:5432` for bare-metal dev) | `postgres://bigrag:bigrag@localhost:5432/bigrag?sslmode=disable` |
| `BIGRAG_DB_POOL_MIN` | Min Postgres pool size | `5` |
| `BIGRAG_DB_POOL_MAX` | Max Postgres pool size | `50` |
| `BIGRAG_MIGRATION_TIMEOUT_SECONDS` | Startup migration check timeout (`0` disables the timeout) | `60` |
| `BIGRAG_REDIS_URL` | Redis URL | `redis://localhost:6379/0` |
| `BIGRAG_ENV` | `dev` or `prod` (prod enables startup safety checks) | `dev` |
| `BIGRAG_TRUSTED_PROXIES` | JSON array of trusted proxy CIDRs used to honor `X-Forwarded-For` for audit and access logs | `[]` |

#### Sessions & Auth

| Variable | Description | Default |
|----------|-------------|---------|
| `BIGRAG_SESSION_EXPIRY_HOURS` | Session cookie lifetime | `168` |
| `BIGRAG_SESSION_COOKIE_NAME` | Session cookie name | `bigrag_session` |
| `BIGRAG_SESSION_COOKIE_SECURE` | HTTPS-only session cookies | `false` |
| `BIGRAG_SESSION_COOKIE_SAMESITE` | Session cookie SameSite policy | `lax` |
| `BIGRAG_SESSION_COOKIE_DOMAIN` | Optional session cookie domain | — |
| `BIGRAG_AUTH_PRINCIPAL_CACHE_TTL` | Principal cache TTL in seconds | `60` |

`./dev.sh` and the default Docker Compose setup allow the local admin UI origin
`http://localhost:3000`. For production, set `BIGRAG_CORS_ORIGINS` to the exact
admin UI origin. Cross-site admin UI deployments also need
`BIGRAG_SESSION_COOKIE_SECURE=true` and usually
`BIGRAG_SESSION_COOKIE_SAMESITE=none`.
> [!TIP]
> `./dev.sh` and the default Docker Compose setup allow the local admin UI origin `http://localhost:3000`. For production, set `BIGRAG_CORS_ORIGINS` to the exact admin UI origin. Cross-site admin UI deployments also need `BIGRAG_SESSION_COOKIE_SECURE=true` and usually `BIGRAG_SESSION_COOKIE_SAMESITE=none`.

#### Embedding

| Variable | Description | Default |
|----------|-------------|---------|
| `BIGRAG_EMBEDDING_API_KEY` | Default embedding API key | — |
| `BIGRAG_EMBEDDING_PROVIDER` | Default embedding provider | `openai` |
| `BIGRAG_EMBEDDING_MODEL` | Default embedding model | `text-embedding-3-small` |
@@ -327,15 +358,30 @@ admin UI origin. Cross-site admin UI deployments also need
| `BIGRAG_EMBEDDING_CONCURRENCY` | Max concurrent embedding requests | `8` |
| `BIGRAG_ALLOWED_EMBEDDING_BASE_URLS` | JSON allow-list for embedding base URLs | `[]` |
| `BIGRAG_ALLOW_PRIVATE_EMBEDDING_BASE_URLS` | Allow private-network embedding endpoints | `false` |

#### Chat

| Variable | Description | Default |
|----------|-------------|---------|
| `BIGRAG_CHAT_PROVIDER` | Chat provider | `openai` |
| `BIGRAG_CHAT_MODEL` | Default chat model | `gpt-4o-mini` |
| `BIGRAG_CHAT_BASE_URL` | Base URL for OpenAI-compatible chat endpoints | — |
| `BIGRAG_CHAT_TEMPERATURE` | Default chat temperature | `0.2` |
| `BIGRAG_CHAT_MAX_CONTEXT_CHARS` | Max retrieved-context characters per chat call | `120000` |
| `BIGRAG_ALLOWED_CHAT_BASE_URLS` | JSON allow-list for chat base URLs | `[]` |
| `BIGRAG_ALLOW_PRIVATE_CHAT_BASE_URLS` | Allow private-network chat endpoints | `false` |

#### Security

| Variable | Description | Default |
|----------|-------------|---------|
| `BIGRAG_MASTER_KEY` | Fernet key that encrypts provider credentials, embedding cache rows, and Redis cache payloads (required in `prod`) | — |
| `BIGRAG_MASTER_KEY_PREVIOUS` | JSON array of old Fernet keys for staged rotation | `[]` |

#### Ingestion & Uploads

| Variable | Description | Default |
|----------|-------------|---------|
| `BIGRAG_UPLOAD_DIR` | Local upload directory | `./data/uploads` |
| `BIGRAG_INGESTION_WORKERS` | Ingestion concurrency target | `4` |
| `BIGRAG_MAX_UPLOAD_SIZE_MB` | Max single-file upload size | `64` |
@@ -344,11 +390,21 @@ admin UI origin. Cross-site admin UI deployments also need
| `BIGRAG_CONVERSION_TIMEOUT` | Docling conversion timeout in seconds | `300` |
| `BIGRAG_CONVERSION_PDF_OCR_ENABLED` | Enable OCR for scanned PDFs | `true` |
| `BIGRAG_QUEUE_MAX_DEPTH` | Max pending jobs in the ingestion queue | `10000` |

#### Caching

| Variable | Description | Default |
|----------|-------------|---------|
| `BIGRAG_COLLECTION_CACHE_TTL` | Collection metadata cache TTL in seconds | `30` |
| `BIGRAG_QUERY_EMBEDDING_CACHE_TTL` | Query embedding cache TTL in seconds | `300` |
| `BIGRAG_QUERY_RESULT_CACHE_TTL` | Exact query-result cache TTL in seconds | `30` |
| `BIGRAG_EMBEDDING_CACHE_MODE` | Persistent chunk embedding cache mode (`encrypted` or `disabled`) | `encrypted` |
| `BIGRAG_EMBEDDING_CACHE_RETENTION_DAYS` | Days to keep persistent embedding-cache rows after last use | `30` |

#### Webhooks

| Variable | Description | Default |
|----------|-------------|---------|
| `BIGRAG_WEBHOOK_DELIVERY_TIMEOUT` | Webhook HTTP timeout in seconds | `10` |
| `BIGRAG_WEBHOOK_RETRY_DELAYS` | JSON array of webhook retry delays in seconds | `[10,30,90]` |
| `BIGRAG_WEBHOOK_MAX_COUNT` | Max configured webhooks | `50` |
4 changes: 0 additions & 4 deletions api/bigrag/services/access_log/flusher.py
@@ -21,10 +21,6 @@
_access_log_flusher_task: asyncio.Task | None = None


def get_queue() -> asyncio.Queue[dict[str, Any]] | None:
    return _access_log_queue


async def _drain_batch(queue: asyncio.Queue[dict[str, Any]]) -> list[dict[str, Any]]:
    try:
        first = await asyncio.wait_for(queue.get(), timeout=_ACCESS_LOG_FLUSH_INTERVAL)
15 changes: 0 additions & 15 deletions api/bigrag/services/audit.py
@@ -164,21 +164,6 @@ async def stop_audit_flusher() -> None:
_audit_queue = None


async def flush_audit_logs() -> None:
    queue = _audit_queue
    if queue is None:
        return
    while not queue.empty():
        batch: list[dict[str, Any]] = []
        while len(batch) < _AUDIT_BATCH_MAX:
            try:
                batch.append(queue.get_nowait())
            except asyncio.QueueEmpty:
                break
        if batch:
            await _flush_batch(batch)


def record(
    request: Request,
    *,
15 changes: 0 additions & 15 deletions api/bigrag/services/backup/manifest.py
@@ -67,18 +67,3 @@ def _manifest(
    body = orjson.dumps(payload, option=orjson.OPT_SORT_KEYS)
    payload["hmac_sha256"] = hmac.new(key, body, hashlib.sha256).hexdigest()
    return payload


def verify_manifest(manifest: dict[str, Any]) -> bool:
    signature = manifest.get("hmac_sha256")
    if not signature:
        return False
    key = _signing_key()
    if key is None:
        return False
    body = orjson.dumps(
        {k: v for k, v in manifest.items() if k != "hmac_sha256"},
        option=orjson.OPT_SORT_KEYS,
    )
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
19 changes: 0 additions & 19 deletions api/bigrag/services/connectors/google_drive_sources.py
@@ -6,7 +6,6 @@
from bigrag.services.connector_core import (
    configured,
    create_source,
    create_sync_job,
    delete_source,
    list_sources,
    source_public,
@@ -83,24 +82,6 @@ async def create_google_source(
)


async def create_google_sync_job(
    session,
    *,
    source: ConnectorSource,
    trigger: str,
    user_id: str | None,
    commit: bool = True,
) -> ConnectorSyncJob:
    return await create_sync_job(
        session,
        provider=GOOGLE_PROVIDER,
        source=source,
        trigger=trigger,
        user_id=user_id,
        commit=commit,
    )


def start_google_sync_job(job_id: str) -> None:
    from bigrag.services.jobs.actors import enqueue_google_drive_sync

14 changes: 0 additions & 14 deletions api/bigrag/services/event_bus/types.py
@@ -30,20 +30,6 @@ class IngestionEvent:
    detail: dict = field(default_factory=dict)
    collection_name: str = ""

    def to_sse(self) -> str:
        data = {
            "document_id": self.document_id,
            "collection_name": self.collection_name,
            "step": self.step,
            "status": self.status,
            "message": self.message,
            "progress": self.progress,
            **self.detail,
        }
        return (
            f"id: {next_sse_id()}\nretry: {SSE_RETRY_MS}\ndata: {orjson.dumps(data).decode()}\n\n"
        )

    def serialize(self) -> bytes:
        return orjson.dumps(asdict(self))

43 changes: 0 additions & 43 deletions api/bigrag/services/queue.py
@@ -21,7 +21,6 @@
QUEUE_KEY = queue_state.QUEUE_KEY
PROCESSING_KEY = queue_state.PROCESSING_KEY
DEAD_LETTER_KEY = queue_state.DEAD_LETTER_KEY
RETRY_KEY = queue_state.RETRY_KEY
STATS_KEY = queue_state.STATS_KEY
LEASE_KEY_PREFIX = queue_state.LEASE_KEY_PREFIX
COLLECTION_EPOCH_KEY_PREFIX = queue_state.COLLECTION_EPOCH_KEY_PREFIX
@@ -81,9 +80,6 @@ async def _recover_stuck_jobs(self) -> int:
            return 0
        return await queue_recovery.recover_stuck_jobs(self._redis)

    async def _epoch_value(self, key: str) -> int:
        return await queue_state.epoch_value(self._redis, key)

    async def _collection_epoch(self, collection_name: str) -> int:
        return await queue_state.collection_epoch(self._redis, collection_name)

@@ -116,14 +112,6 @@ async def enqueue(self, job: IngestionJob) -> None:
await self._redis.hincrby(STATS_KEY, "queued", 1)
logger.info(f"{job.collection_name} | queued | {pending + 1} pending")

    async def flush_collection(self, collection_name: str) -> int:
        if not self._redis:
            return 0
        removed = await queue_state.flush_collection_jobs(self._redis, collection_name)
        if removed:
            logger.info("queue flushed jobs", collection=collection_name, removed=removed)
        return int(removed)

    async def cancel_collection(self, collection_name: str) -> int:
        if not self._redis:
            return 0
@@ -161,18 +149,6 @@ async def stats(self) -> dict:
)
return stats

    async def _promote_due_retries(self) -> int:
        from bigrag.services.runtime_settings import get_value

        queue_max_depth = await get_value("queue_max_depth")
        promoted = await queue_state.promote_due_retries(
            self._redis,
            queue_max_depth=queue_max_depth,
        )
        if promoted:
            logger.info("queue promoted retry jobs", count=promoted)
        return promoted

    async def _renew_lease(self, job_id: str) -> None:
        lease_key = _lease_key(job_id)
        while True:
@@ -241,25 +217,6 @@ async def _fanout_webhook_event(self, event: IngestionEvent) -> None:
error=repr(exc),
)

    async def _ocr_scanned_pdf(
        self,
        *,
        file_data: bytes,
        suffix: str,
        job: IngestionJob,
        prefix: str,
        start_time: float,
    ) -> str:
        return await queue_conversion.ocr_scanned_pdf(
            file_data=file_data,
            suffix=suffix,
            job=job,
            prefix=prefix,
            start_time=start_time,
            emit=self._emit,
            ensure_job_current=self._ensure_job_current,
        )

    async def _convert_document(self, job: IngestionJob, prefix: str) -> ParsedDocument:
        return await queue_conversion.convert_document(
            job,