PrivateRAG v0 is an open-source privacy layer for Retrieval-Augmented Generation systems. It lets a client search a local, client-visible index and retrieve matching remote chunks through PIR without sending the raw query, query embedding, selected chunk IDs, or selected internal record IDs to the retrieval server.
The current repository contains a working local v0 implementation:
- Rust bundle packing, verification, local index search, PIR client/server logic, HTTP server, and CLI commands.
- A Python SDK with public-cache download/validation, local search, PIR request preparation, HTTP retrieval, and local response decode.
- Optional LangChain and LlamaIndex adapters that delegate to the core Python SDK.
- Local benchmark, smoke, and evidence tooling for the supported backend profile.
PrivateRAG is not a full RAG framework, vector database, embedding service, LLM orchestration layer, authentication system, or general confidential-computing platform.
PrivateRAG v0 has one narrow supported private retrieval profile:
simplepir-rs-v0-1000-1k. It is unaudited and does not provide
production-grade cryptographic assurance.
The default private retrieval profile is simplepir-rs-v0-1000-1k over the
simplepir-lwe-v0 scheme. It is a supported v0 unaudited profile only
within this envelope:
- at most 1,000 records
- exactly 1,024-byte PIR records
- exactly 384-dimensional finite f32 local index rows
- fixed top_k=2
- fixed retrieval_count=4
- public cache <= 2.5 MiB
- full bundle <= 3.5 MiB
Unsupported shapes fail closed rather than silently widening the claim. The
older simplepir-rs-experimental-v0 profile remains smoke-scale,
experimental, unaudited, and compatibility/development only. The
toy-development-only-v0 backend remains for development and regression tests
only.
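The envelope limits above lend themselves to a fail-closed check. The following is a minimal sketch, not the project's implementation: `BundleShape` and `check_supported_envelope` are hypothetical names, the fields are not the real manifest schema, and the check on finite f32 index rows is omitted for brevity.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass(frozen=True)
class BundleShape:
    """Hypothetical summary of a bundle's shape (not the real manifest schema)."""
    record_count: int
    record_bytes: int
    index_dim: int
    top_k: int
    retrieval_count: int
    public_cache_bytes: int
    bundle_bytes: int


def check_supported_envelope(shape: BundleShape) -> None:
    """Raise ValueError naming every limit that the shape violates."""
    problems = []
    if shape.record_count > 1000:
        problems.append("more than 1,000 records")
    if shape.record_bytes != 1024:
        problems.append("PIR records are not exactly 1,024 bytes")
    if shape.index_dim != 384:
        problems.append("index rows are not exactly 384-dimensional")
    if shape.top_k != 2:
        problems.append("top_k is not 2")
    if shape.retrieval_count != 4:
        problems.append("retrieval_count is not 4")
    if shape.public_cache_bytes > int(2.5 * MIB):
        problems.append("public cache exceeds 2.5 MiB")
    if shape.bundle_bytes > int(3.5 * MIB):
        problems.append("full bundle exceeds 3.5 MiB")
    if problems:
        # Fail closed: reject the bundle instead of widening the claim.
        raise ValueError("unsupported shape: " + "; ".join(problems))
```

Collecting every violation before raising is a design choice of this sketch; it gives the operator one complete error message rather than one failure per attempt.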
The supported profile is unaudited and does not provide
production-grade cryptographic assurance. See
docs/backend_profile_claims.md for limits,
metadata, evidence, non-claims, and fail-closed behavior.
PrivateRAG separates search from storage:
- A bundle producer builds a static bundle containing fixed-size chunk records, a client-visible index, manifest/checksum files, and PIR artifacts.
- The retrieval server hosts the bundle and answers POST /pir/query requests. It does not run embedding, vector search, reranking, LLM calls, or direct chunk lookup.
- The client downloads the public cache, validates checksums, embeds locally, searches the local index, prepares a padded batched PIR request, posts only the PIR envelope, and decodes the returned records locally.
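The client-side steps of local search and padding can be sketched as follows. This is an illustration under stated assumptions, not the SDK's code: `select_padded_indices` is a hypothetical helper, similarity is a plain dot product, and padding with random distinct decoy records is an assumed policy (the actual policy is described in docs/padding_policy.md and may differ). PIR encoding of the selected indices is not shown.

```python
import secrets

TOP_K = 2            # fixed by the supported profile
RETRIEVAL_COUNT = 4  # fixed by the supported profile


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def select_padded_indices(query_vec, index_rows,
                          top_k=TOP_K, retrieval_count=RETRIEVAL_COUNT):
    """Pick the top_k best-scoring rows, then pad to retrieval_count.

    Assumption: padding uses random distinct decoy indices so that every
    request asks for the same number of records.
    """
    if len(index_rows) < retrieval_count:
        raise ValueError("index has fewer rows than retrieval_count")
    # Rank every row by similarity to the locally computed query embedding.
    ranked = sorted(range(len(index_rows)),
                    key=lambda i: dot(query_vec, index_rows[i]),
                    reverse=True)
    chosen = ranked[:top_k]
    remaining = [i for i in range(len(index_rows)) if i not in chosen]
    # Fill the rest of the batch with decoys drawn without replacement.
    while len(chosen) < retrieval_count:
        chosen.append(remaining.pop(secrets.randbelow(len(remaining))))
    return chosen
```

In this sketch the first `top_k` entries are the real matches and the rest are decoys; the client would discard the decoy records after decoding the response locally.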
The server can still observe request timing, request count, retrieval_count,
bundle identity, public-cache downloads, body sizes, network metadata, and
deployment logs. See docs/threat_model.md and
docs/limitations.md.
Install the Python SDK:
pip install privaterag
Build a fixture bundle:
cargo run -p privaterag-pack -- \
--chunks fixtures/tiny_corpus/chunks.jsonl \
--embeddings fixtures/tiny_corpus/embeddings.json \
--out /tmp/tiny-kb.bundle \
--bundle-id tiny-corpus \
--bundle-version 2026-05-04 \
--created-at 2026-05-04T00:00:00Z
Run the local HTTP server:
cargo run -p privaterag-server -- \
--bundle /tmp/tiny-kb.bundle \
--host 127.0.0.1 \
--port 8080
Use the CLI:
cargo run -p privaterag-cli -- --help
cargo run -p privaterag-cli -- verify --bundle /tmp/tiny-kb.bundle
For the complete first-run path covering bundle packing, verification, serving,
CLI retrieval, Python SDK retrieval, and Docker usage, see
docs/getting_started.md.
For Python SDK usage, see docs/client_api.md and
python/privaterag.
crates/
  privaterag-core/
  privaterag-pack/
  privaterag-server/
  privaterag-client/
  privaterag-cli/
  privaterag-python/
  privaterag-pir/
  privaterag-index/
python/
  privaterag/
  privaterag-langchain/
  privaterag-llamaindex/
examples/
docs/
benches/
tests/
fixtures/
Run Rust tests:
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace --all-targets
Run Python tests:
python3 -m unittest discover -s tests/python
Run Python compile checks:
python3 -m compileall -q examples scripts python tests
Check local Markdown links:
python3 scripts/check_markdown_links.py
python3 is used intentionally because some systems do not provide a python
executable.
Project metadata:
Core docs:
- docs/architecture.md
- docs/getting_started.md
- docs/bundle_format.md
- docs/manifest.md
- docs/index_format.md
- docs/pir_protocol.md
- docs/server_api.md
- docs/client_api.md
- docs/adapter_compatibility.md
- docs/versioning.md
- docs/publishing.md
- docs/release_checklist.md
- docs/cli.md
Security and deployment docs:
- docs/backend_profile_claims.md
- docs/security_claims.md
- docs/threat_model.md
- docs/limitations.md
- docs/padding_policy.md
- docs/deployment_security.md
- docs/release_scope.md
- docs/adr/0010-signature-authenticity-deferral.md
Benchmark and evidence docs:
- docs/benchmark_plan.md
- docs/benchmark_corpus_generator.md
- In-process benchmark evidence
- HTTP benchmark evidence
- Benchmark report and scale limits
- Backend profile evidence
Adapter docs and examples:
- docs/langchain.md
- docs/llamaindex.md
- examples/simple_python_rag
- examples/langchain_retriever
- examples/llamaindex_retriever
- examples/local_docs_demo
- PrivateRAG does not hide the public index from authorized clients.
- PrivateRAG does not hide corpus contents from the retrieval server.
- PrivateRAG does not protect downstream prompts or retrieved context sent to an LLM provider.
- PrivateRAG does not implement authentication, authorization, transport security, tenant isolation, or deployment log controls.
- Unsigned bundles are the current format. Checksums are not signatures; they
detect file drift only after a trusted deployment obtains
checksums.json, and they do not authenticate the bundle producer.
- The supported backend profile is limited and unaudited.
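To illustrate what checksum validation does and does not provide, here is a minimal sketch of drift detection. It assumes, purely for illustration, that checksums.json maps relative file paths to hex SHA-256 digests; the real file layout is defined in docs/bundle_format.md and may differ. `sha256_file` and `find_drift` are hypothetical helpers, not SDK functions.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path):
    """Return the hex SHA-256 digest of a file, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def find_drift(bundle_dir, checksums_path):
    """List files that are missing or whose digest differs from the expected one.

    Assumed format: {"relative/path": "<hex sha256>", ...}.
    """
    expected = json.loads(Path(checksums_path).read_text())
    drifted = []
    for relative_path, expected_digest in expected.items():
        target = Path(bundle_dir) / relative_path
        if not target.is_file() or sha256_file(target) != expected_digest:
            drifted.append(relative_path)
    return sorted(drifted)
```

An empty result only means the files match the checksums.json that was supplied. Anyone able to replace both the files and checksums.json would pass this check, which is why checksums do not authenticate the bundle producer.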