erdkocak/PrivateRAG

PrivateRAG

PrivateRAG v0 is an open-source privacy layer for Retrieval-Augmented Generation systems. It lets a client search a local, client-visible index and retrieve matching remote chunks through PIR without sending the raw query, query embedding, selected chunk IDs, or selected internal record IDs to the retrieval server.

The current repository contains a working local v0 implementation:

  • Rust bundle packing, verification, local index search, PIR client/server logic, HTTP server, and CLI commands.
  • A Python SDK with public-cache download/validation, local search, PIR request preparation, HTTP retrieval, and local response decode.
  • Optional LangChain and LlamaIndex adapters that delegate to the core Python SDK.
  • Local benchmark, smoke, and evidence tooling for the supported backend profile.

PrivateRAG is not a full RAG framework, vector database, embedding service, LLM orchestration layer, authentication system, or general confidential-computing platform.

PrivateRAG v0 has one narrow supported private retrieval profile: simplepir-rs-v0-1000-1k. It is unaudited and does not provide production-grade cryptographic assurance.

Supported Backend Profile

The default private retrieval profile is simplepir-rs-v0-1000-1k over the simplepir-lwe-v0 scheme. It is supported, as an unaudited v0 profile, only within this envelope:

  • at most 1,000 records
  • exactly 1,024-byte PIR records
  • exactly 384-dimensional finite f32 local index rows
  • fixed top_k=2
  • fixed retrieval_count=4
  • public cache <= 2.5 MiB
  • full bundle <= 3.5 MiB

Unsupported shapes fail closed rather than silently widening the claim. The older simplepir-rs-experimental-v0 profile remains smoke-scale, experimental, unaudited, and compatibility/development only. The toy-development-only-v0 backend remains for development and regression tests only.
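
The envelope above can be read as a set of hard checks that reject anything outside the supported shape. The following is a minimal sketch of that fail-closed idea, not the repository's implementation: `check_envelope`, `UnsupportedShape`, and the argument layout are hypothetical names chosen for illustration, and only the numeric limits come from the list above.

```python
import math

# Limits taken from the supported-profile envelope listed above.
MAX_RECORDS = 1_000
RECORD_BYTES = 1_024
INDEX_DIM = 384
TOP_K = 2
RETRIEVAL_COUNT = 4
MAX_PUBLIC_CACHE_BYTES = int(2.5 * 1024 * 1024)
MAX_BUNDLE_BYTES = int(3.5 * 1024 * 1024)
F32_MAX = 3.4028234663852886e38  # largest finite IEEE 754 single-precision value


class UnsupportedShape(ValueError):
    """Hypothetical error type: the input falls outside the envelope."""


def check_envelope(records, index_rows, top_k, retrieval_count,
                   public_cache_bytes, bundle_bytes):
    """Raise UnsupportedShape on the first violated limit; return None otherwise."""
    if len(records) > MAX_RECORDS:
        raise UnsupportedShape(f"{len(records)} records exceeds {MAX_RECORDS}")
    for i, record in enumerate(records):
        if len(record) != RECORD_BYTES:
            raise UnsupportedShape(f"record {i} is {len(record)} bytes")
    for i, row in enumerate(index_rows):
        if len(row) != INDEX_DIM:
            raise UnsupportedShape(f"index row {i} has {len(row)} dimensions")
        # Reject NaN, infinities, and values that would overflow an f32.
        if not all(math.isfinite(x) and abs(x) <= F32_MAX for x in row):
            raise UnsupportedShape(f"index row {i} has a non-finite f32 value")
    if top_k != TOP_K:
        raise UnsupportedShape(f"top_k must be {TOP_K}, got {top_k}")
    if retrieval_count != RETRIEVAL_COUNT:
        raise UnsupportedShape(
            f"retrieval_count must be {RETRIEVAL_COUNT}, got {retrieval_count}")
    if public_cache_bytes > MAX_PUBLIC_CACHE_BYTES:
        raise UnsupportedShape("public cache exceeds 2.5 MiB")
    if bundle_bytes > MAX_BUNDLE_BYTES:
        raise UnsupportedShape("full bundle exceeds 3.5 MiB")
```

Raising on the first violation, rather than clamping or truncating, is what keeps an out-of-envelope input from being served under the supported profile's name.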

The supported profile is unaudited and does not provide production-grade cryptographic assurance. See docs/backend_profile_claims.md for limits, metadata, evidence, non-claims, and fail-closed behavior.

How It Works

PrivateRAG separates search from storage:

  1. A bundle producer builds a static bundle containing fixed-size chunk records, a client-visible index, manifest/checksum files, and PIR artifacts.
  2. The retrieval server hosts the bundle and answers POST /pir/query requests. It does not run embedding, vector search, reranking, LLM calls, or direct chunk lookup.
  3. The client downloads the public cache, validates checksums, embeds locally, searches the local index, prepares a padded batched PIR request, posts only the PIR envelope, and decodes the returned records locally.
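
The client-side half of step 3 can be illustrated with a small sketch. It is not the SDK's API: `top_k_rows` and `pad_selection` are hypothetical helpers, the brute-force dot-product search stands in for whatever the local index actually does, and padding with random distinct dummy record indices is only one plausible way to reach a fixed retrieval_count. The source does not describe the real padding strategy.

```python
import random


def top_k_rows(query, index_rows, k):
    """Return the indices of the k rows with the highest dot-product score."""
    scored = [
        (sum(q * x for q, x in zip(query, row)), i)
        for i, row in enumerate(index_rows)
    ]
    # Highest score first; ties broken by the lower row index.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [i for _, i in scored[:k]]


def pad_selection(selected, num_records, retrieval_count, rng):
    """Pad the selection to exactly retrieval_count distinct record indices.

    Assumption: dummy indices are drawn uniformly from the unselected
    records and the result is shuffled so position carries no signal.
    """
    if len(selected) > retrieval_count or num_records < retrieval_count:
        raise ValueError("selection does not fit the fixed retrieval_count")
    chosen = set(selected)
    candidates = [i for i in range(num_records) if i not in chosen]
    padded = list(selected) + rng.sample(candidates, retrieval_count - len(selected))
    rng.shuffle(padded)
    return padded


if __name__ == "__main__":
    rows = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0], [0.0, -1.0]]
    hits = top_k_rows([1.0, 0.0], rows, k=2)
    # SystemRandom draws from the OS entropy source instead of a seeded PRNG.
    print(pad_selection(hits, len(rows), 4, random.SystemRandom()))
```

Everything here runs on the client. In the flow described above, only the PIR-encoded request built from the padded selection is posted, so the server always sees a request for the same number of records.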

The server can still observe request timing, request count, retrieval_count, bundle identity, public-cache downloads, body sizes, network metadata, and deployment logs. See docs/threat_model.md and docs/limitations.md.

Quickstart

Install the Python SDK:

pip install privaterag

Build a fixture bundle:

cargo run -p privaterag-pack -- \
  --chunks fixtures/tiny_corpus/chunks.jsonl \
  --embeddings fixtures/tiny_corpus/embeddings.json \
  --out /tmp/tiny-kb.bundle \
  --bundle-id tiny-corpus \
  --bundle-version 2026-05-04 \
  --created-at 2026-05-04T00:00:00Z

Run the local HTTP server:

cargo run -p privaterag-server -- \
  --bundle /tmp/tiny-kb.bundle \
  --host 127.0.0.1 \
  --port 8080

Use the CLI:

cargo run -p privaterag-cli -- --help
cargo run -p privaterag-cli -- verify --bundle /tmp/tiny-kb.bundle

For the complete first-run path covering bundle packing, verification, serving, CLI retrieval, Python SDK retrieval, and Docker usage, see docs/getting_started.md.

For Python SDK usage, see docs/client_api.md and python/privaterag.

Workspace Layout

crates/
  privaterag-core/
  privaterag-pack/
  privaterag-server/
  privaterag-client/
  privaterag-cli/
  privaterag-python/
  privaterag-pir/
  privaterag-index/
python/
  privaterag/
  privaterag-langchain/
  privaterag-llamaindex/
examples/
docs/
benches/
tests/
fixtures/

Verification

Run Rust formatting, lint, and test checks:

cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace --all-targets

Run Python tests:

python3 -m unittest discover -s tests/python

Run Python compile checks:

python3 -m compileall -q examples scripts python tests

Check local Markdown links:

python3 scripts/check_markdown_links.py

python3 is used intentionally because some systems do not provide a python executable.

Documentation

Project metadata:

Core docs:

Security and deployment docs:

Benchmark and evidence docs:

Adapter docs and examples:

Current Boundaries

  • PrivateRAG does not hide the public index from authorized clients.
  • PrivateRAG does not hide corpus contents from the retrieval server.
  • PrivateRAG does not protect downstream prompts or retrieved context sent to an LLM provider.
  • PrivateRAG does not implement authentication, authorization, transport security, tenant isolation, or deployment log controls.
  • Unsigned bundles are the current format. Checksums are not signatures; they detect file drift only after a trusted deployment obtains checksums.json, and they do not authenticate the bundle producer.
  • The supported backend profile is limited and unaudited.
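
To make the checksum boundary concrete, here is a minimal drift check. It is a sketch under stated assumptions: it treats checksums.json as a flat JSON object mapping relative file paths to hex SHA-256 digests, which is a guess at the format rather than something the source specifies, and `verify_checksums` is a hypothetical helper, not an SDK function.

```python
import hashlib
import json
from pathlib import Path


def verify_checksums(bundle_dir, checksums_name="checksums.json"):
    """Return the sorted list of files that are missing or whose digest differs.

    Assumed format (hypothetical): {"relative/path": "<hex sha256>", ...}
    """
    root = Path(bundle_dir)
    expected = json.loads((root / checksums_name).read_text())
    drifted = []
    for rel_path, want in expected.items():
        path = root / rel_path
        if not path.is_file():
            drifted.append(rel_path)
            continue
        got = hashlib.sha256(path.read_bytes()).hexdigest()
        if got != want.lower():
            drifted.append(rel_path)
    return sorted(drifted)
```

An empty result means only that the files match the checksums.json on hand. Anyone able to replace the files can replace checksums.json too, which is why such a check detects drift but does not authenticate the bundle producer.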
