Chat with your own documents. You control the infrastructure. Encryption protects what you store.
95% of professionals have documents they can't put into ChatGPT.
Medical records, legal contracts, financial reports, proprietary research, HR files, client data — all of it is off-limits for most AI tools because of a single fundamental design flaw: your documents go to a corporate server in plaintext, where someone else controls them.
The market for enterprise AI document tools is $4.4B and growing at 28% annually (Grand View Research, 2024), yet most solutions require organizations to hand their most sensitive information to a third party to process and store in readable form. Regulated industries — healthcare, finance, legal, government — are effectively locked out entirely.
The gap isn't just about privacy preferences. It's a structural trust problem: the moment you paste a document into ChatGPT, OpenAI receives it in full, may use it for model improvement depending on your plan and settings, and you have no visibility into where it goes.
PrivateAI is a self-hosted, privacy-first RAG (Retrieval-Augmented Generation) platform. You deploy it on infrastructure you control, and every document is encrypted before it is stored — so even a database breach exposes only ciphertext, never readable content.
PrivateAI doesn't eliminate the concept of a server — it changes who operates it and what that operator can see.
| Scenario | Who can read your documents? |
|---|---|
| ChatGPT / Claude | The AI company receives plaintext immediately |
| PrivateAI, self-hosted (Railway, AWS, your machine) | Only you — you are the operator |
| PrivateAI, hosted by a third party | That operator — same trust question as any hosted service |
| PostgreSQL DB breach (self or hosted) | Nobody — stored text is encrypted, blobs are unreadable without the key |
| Server operator with DB + filesystem access | Theoretically yes — this is true of any server-side application |
The primary use case is self-hosted deployment. When you run PrivateAI on your own infrastructure, there is no corporate third party involved — no OpenAI, no Anthropic, no cloud vendor reading your documents. If you use a managed host, you're trusting that host, just as you would with any web application.
What encryption adds in all scenarios: a database breach alone cannot expose document content. An attacker needs both the database and the per-user key files on the filesystem to decrypt anything.
What makes it different from corporate AI APIs:
| Feature | ChatGPT / Claude API | PrivateAI (self-hosted) |
|---|---|---|
| Documents received in plaintext by vendor | ✅ yes | ✗ never |
| Document text encrypted at rest | ✗ | ✅ Fernet AES-128 |
| Per-user key isolation | ✗ | ✅ |
| DB breach exposes readable text | ✅ yes | ✗ only ciphertext |
| Audit trail of every query | ✗ | ✅ |
| Runs fully offline | ✗ | ✅ (with Ollama) |
| No training on your data | ✗ (varies by plan) | ✅ |
| Open source, auditable | ✗ | ✅ |
- Healthcare: Clinicians query patient records, lab results, and research literature without violating HIPAA.
- Legal: Associates query case files and contracts without privileged data leaving the firm.
- Finance: Analysts query earnings reports and internal memos without SEC disclosure concerns.
- Government: Analysts query classified or sensitive documents on air-gapped infrastructure.
- Enterprise: Any team that needs AI on internal documentation without an IT security exception.
graph TB
subgraph Browser["User Browser"]
UI[Streamlit UI]
end
subgraph App["PrivateAI Application"]
AUTH[Auth Layer<br/>bcrypt passwords]
CRYPTO[Encryption Layer<br/>Fernet AES-128]
ROUTER[Model Router<br/>local vs cloud]
RAG[RAG Chain<br/>LangChain LCEL]
end
subgraph Storage["Persistent Storage"]
PG[(PostgreSQL<br/>user accounts<br/>document metadata<br/>audit log)]
CHROMA[(ChromaDB<br/>per-user vector store<br/>encrypted chunks)]
KEYS[(.key files<br/>Fernet keys<br/>per user)]
end
subgraph Models["AI Models"]
OLLAMA[Ollama<br/>local LLM<br/>llama3 / mistral]
OPENAI[OpenAI API<br/>GPT-4o<br/>cloud fallback]
EMBED[Embeddings<br/>sentence-transformers<br/>or OpenAI]
end
UI --> AUTH
AUTH --> CRYPTO
CRYPTO --> RAG
RAG --> ROUTER
ROUTER -->|simple queries| OLLAMA
ROUTER -->|complex queries| OPENAI
RAG --> CHROMA
CHROMA --> EMBED
AUTH --> PG
CRYPTO --> KEYS
RAG --> PG
sequenceDiagram
participant U as User
participant UI as Streamlit UI
participant P as Pipeline
participant C as ChromaDB
participant DB as PostgreSQL
U->>UI: Upload file (PDF/DOCX/TXT)
UI->>P: ingest_file(file, fernet_key, user_id)
P->>P: Check file size ≤ 1GB
P->>P: SHA256 hash → deduplication check
DB-->>P: Already ingested? → skip
P->>P: Load text → chunk into segments
P->>P: Encrypt each chunk (Fernet)
P->>C: Store vectors + encrypted metadata
P->>DB: Record document metadata (doc_id, user_id, hash)
P-->>UI: Result: ingested / skipped / error
UI-->>U: Confirmation + document list
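The "chunk into segments" step in the flow above can be sketched as a simple overlapping character window. This is a minimal illustration, not the project's chunker: the function name `chunk_text` and the default sizes are assumptions, and the real ingestion/chunker.py may split on tokens or sentences instead.

```python
from __future__ import annotations


def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap` characters.

    Hypothetical helper: the name and defaults are illustrative assumptions.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text,
        # so we do not emit a trailing chunk that is pure overlap.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.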
sequenceDiagram
participant U as User
participant UI as Chat Page
participant R as Retriever
participant C as ChromaDB
participant LLM as Model Router
participant OL as Ollama (local)
participant OA as OpenAI (cloud)
U->>UI: Ask a question
UI->>R: retrieve_context(query, fernet, user_id)
R->>C: Vector search (enabled docs only)
C-->>R: Top-K encrypted chunks
R->>R: Decrypt chunks with Fernet key
R-->>UI: Context string + source citations
UI->>LLM: get_llm(query, complexity_score)
alt Ollama running AND complexity < threshold
LLM-->>OL: Query stays local
OL-->>UI: Answer (LOCAL badge)
else Ollama unavailable OR complex query
LLM-->>OA: Query sent to OpenAI
OA-->>UI: Answer (CLOUD badge)
end
UI->>UI: Log event to audit trail
UI-->>U: Answer + sources + privacy badge
graph LR
subgraph PerUser["Per-User Isolation"]
K1[User A key]
K2[User B key]
VS1[(User A vector store)]
VS2[(User B vector store)]
end
subgraph DB["Shared Database"]
T1[users table]
T2[documents table<br/>+ user_id column]
T3[audit_log table<br/>+ user_id column]
end
K1 -->|decrypts only| VS1
K2 -->|decrypts only| VS2
T2 -->|row-level isolation| UserA[User A rows]
T2 -->|row-level isolation| UserB[User B rows]
style K1 fill:#e8f5e9
style K2 fill:#e8f5e9
style VS1 fill:#e8f5e9
style VS2 fill:#e8f5e9
Every document chunk is encrypted with Fernet (AES-128-CBC + HMAC-SHA256) before being stored. The vector store holds numerical embeddings alongside the encrypted chunk text (the text_enc field), never plaintext. The PostgreSQL database stores only accounts, document metadata, and the audit log, with no document content.
Each user generates their own encryption key during onboarding. The key is stored:
- In the user's browser session (cleared on logout)
- In a .key file in the user's private data directory on the server
User A's key cannot decrypt User B's data — ever. This is enforced cryptographically, not just by access control.
However, it is important to be clear: a server operator who has access to both the filesystem (where .key files live) and the database (where encrypted chunks are stored) could theoretically decrypt documents. This is the same trust boundary as any server-side application. The solution to this is self-hosting — when you control the server, you are the only one with that access.
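One reason key isolation is cryptographic rather than procedural: Fernet checks an HMAC-SHA256 tag on every token before it decrypts anything, so a token produced under one key is rejected outright under another. The standard-library sketch below isolates only that authentication step. It performs no encryption, the payload is a stand-in for ciphertext, and the function names `sign` and `verify` are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 32  # HMAC-SHA256 output size in bytes


def sign(signing_key: bytes, payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the payload (stand-in for Fernet's token tag)."""
    tag = hmac.new(signing_key, payload, hashlib.sha256).digest()
    return payload + tag


def verify(signing_key: bytes, token: bytes) -> bytes:
    """Return the payload only if the tag matches this key; otherwise raise."""
    payload, tag = token[:-TAG_LEN], token[-TAG_LEN:]
    expected = hmac.new(signing_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("authentication failed: token was not produced with this key")
    return payload


# Illustrative fixed keys; real keys would be random and per user.
KEY_A = b"A" * 16
KEY_B = b"B" * 16
token = sign(KEY_A, b"stand-in for an encrypted chunk")
```

Calling `verify(KEY_A, token)` returns the payload, while `verify(KEY_B, token)` raises, which mirrors how a Fernet token from User A fails under User B's key.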
If only the PostgreSQL database is compromised:
| Table | What an attacker sees |
|---|---|
| users | Usernames + bcrypt password hashes (not reversible) |
| documents | Filenames, chunk counts, SHA256 hashes; no content |
| audit_log | Event types, timestamps, model used; no content |
| ChromaDB | Encrypted blobs (text_enc field), unreadable without the key |
A database breach alone exposes metadata, not document content. The attacker also needs the .key files from the filesystem.
Each key is deterministically derived from a 12-word BIP39-style mnemonic phrase. The phrase is shown once during setup and never stored digitally. It is the only way to recover data if the key file is lost.
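To make "deterministically derived" concrete, here is one way a phrase could be stretched into a key in Fernet's format (32 bytes, URL-safe base64). This is a sketch under stated assumptions: it borrows BIP39's seed-stretching parameters (PBKDF2-HMAC-SHA512, 2048 iterations, salt "mnemonic"), and the function name is hypothetical. The actual derivation in core/crypto.py may differ.

```python
import base64
import hashlib


def fernet_key_from_phrase(phrase: str, salt: bytes = b"mnemonic") -> bytes:
    """Derive a Fernet-format key from a recovery phrase.

    Assumption: PBKDF2 parameters mirror BIP39 seed derivation; the project's
    real scheme is not documented here.
    """
    # Normalize case and whitespace so the same words always give the same key.
    normalized = " ".join(phrase.lower().split())
    seed = hashlib.pbkdf2_hmac("sha512", normalized.encode("utf-8"), salt, 2048)
    # A Fernet key is 32 bytes, URL-safe base64 encoded.
    return base64.urlsafe_b64encode(seed[:32])


# Made-up twelve-word phrase for illustration only; never reuse an example phrase.
phrase = "apple banana cherry delta eagle forest garden harbor island jungle kitten lemon"
key = fernet_key_from_phrase(phrase)
```

Because the derivation has no random input, re-entering the same twelve words on a new machine reproduces the same key, which is what makes recovery possible after a lost key file.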
PrivateAI scores each query for complexity and routes accordingly:
Complexity Score < Threshold → Ollama (local, private, free)
Complexity Score ≥ Threshold → OpenAI GPT-4o (cloud, more capable)
The UI always shows a LOCAL or CLOUD badge on every response so you know exactly where your query went. The audit log captures every event permanently.
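The routing rule can be sketched in a few lines. The scoring heuristic below is purely illustrative (the real scorer in core/model_router.py is not described in this README), and the names `complexity_score`, `route`, and `RouteDecision` are assumptions. The fallback to the cloud when Ollama is unavailable follows the query flow described earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteDecision:
    backend: str  # "ollama" or "openai"
    badge: str    # "LOCAL" or "CLOUD", as shown in the UI


def complexity_score(query: str) -> float:
    """Toy heuristic in [0, 1]: longer queries and reasoning keywords score higher."""
    words = query.lower().split()
    length_part = min(len(words) / 50.0, 1.0)
    keywords = {"compare", "summarize", "analyze", "explain", "why"}
    keyword_part = 1.0 if keywords.intersection(words) else 0.0
    return 0.5 * length_part + 0.5 * keyword_part


def route(query: str, threshold: float, ollama_available: bool) -> RouteDecision:
    """Stay local only when Ollama is up AND the query scores below the threshold."""
    if ollama_available and complexity_score(query) < threshold:
        return RouteDecision("ollama", "LOCAL")
    return RouteDecision("openai", "CLOUD")
```

Raising the threshold keeps more queries local. With this toy scorer, which never exceeds 1.0, any threshold above 1.0 would route everything to Ollama whenever it is running.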
All database tables include a user_id column. Every query is scoped by user_id — there is no data sharing between accounts at the application layer. An admin account can see user registration records but cannot access or decrypt any user's document content.
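Scoping by user_id amounts to putting the caller's ID into every WHERE clause as a bound parameter. The sketch below uses an in-memory SQLite database for illustration. The column names other than user_id and the helper `list_documents` are assumptions, not the project's actual schema.

```python
from __future__ import annotations

import sqlite3


def list_documents(conn: sqlite3.Connection, user_id: int) -> list[str]:
    """Return only the filenames that belong to `user_id`."""
    rows = conn.execute(
        # Bound parameter: the user ID is never interpolated into the SQL string.
        "SELECT filename FROM documents WHERE user_id = ? ORDER BY filename",
        (user_id,),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical schema and sample rows for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "doc_id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, "
    "filename TEXT NOT NULL, sha256 TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO documents (user_id, filename, sha256) VALUES (?, ?, ?)",
    [(1, "contract.pdf", "aa"), (1, "memo.txt", "bb"), (2, "labs.docx", "cc")],
)
```

Here `list_documents(conn, 1)` sees two files and `list_documents(conn, 2)` sees one; neither call can return the other user's rows.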
File uploads are capped at 1 GB per file to prevent resource exhaustion.
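The size cap and the SHA256 deduplication check from the ingestion flow could look roughly like this. It is a sketch: the constant and function names are hypothetical, 1 GB is assumed to mean 1024³ bytes, and the in-memory set stands in for the lookup against the documents table.

```python
from __future__ import annotations

import hashlib
import io
from typing import BinaryIO

MAX_UPLOAD_BYTES = 1024**3  # assumed reading of the 1 GB cap


def sha256_of_stream(stream: BinaryIO, block_size: int = 1 << 20) -> str:
    """Hash a file-like object in 1 MiB blocks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(block_size), b""):
        digest.update(block)
    return digest.hexdigest()


def precheck_upload(
    stream: BinaryIO, size_bytes: int, known_hashes: set[str]
) -> tuple[str, str | None]:
    """Return (status, hash) where status is 'error', 'skipped', or 'ingested'."""
    if size_bytes > MAX_UPLOAD_BYTES:
        return ("error", None)  # reject before reading the file at all
    file_hash = sha256_of_stream(stream)
    if file_hash in known_hashes:
        return ("skipped", file_hash)  # identical content already ingested
    known_hashes.add(file_hash)
    return ("ingested", file_hash)
```

Checking the size before hashing means an oversized upload is rejected without the server reading any of its contents.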
| Choice | Rationale |
|---|---|
| Streamlit | Fastest path to a production-quality data app without a separate frontend. Ideal for AI tooling. |
| LangChain LCEL | Composable chain definition — easy to swap LLM, retriever, or parser without rewriting logic. |
| ChromaDB | Embedded vector store — no extra service required locally, simple to persist on a volume in production. |
| Fernet encryption | Symmetric encryption with built-in authentication (HMAC). Simple, audited, no key management service needed. |
| Hybrid routing (Ollama + OpenAI) | Maximizes privacy (local first) while maintaining quality for complex queries. Users control the threshold. |
| PostgreSQL on Railway | Durable, scalable, free tier available. SQLite fallback means zero friction for local dev. |
| sentence-transformers | Local embedding model (~90MB) — documents can be indexed without any external API call. |
Want to use PrivateAI without sending any queries to OpenAI? You can run a free, open-source AI model entirely on your own machine using Ollama.
→ Full step-by-step guide for beginners: docs/local-ai-setup.md
Covers: hardware requirements, Windows/Mac/Linux install, choosing the right model for your computer, and connecting it to PrivateAI. No technical background required.
- Python 3.11+
- Ollama (optional — enables local-only mode; see setup guide)
git clone https://github.com/virtualryder/private-ai.git
cd private-ai
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env: at minimum, add your OPENAI_API_KEY if you want cloud fallback
ollama pull llama3
ollama serve
streamlit run app.py
Open http://localhost:8501. Create your first account; it automatically becomes the admin.
Runs PostgreSQL + Ollama + PrivateAI in one command:
cp .env.example .env
# Add OPENAI_API_KEY to .env
docker compose up --build
Then open http://localhost:8501.
- Fork this repo to your GitHub account
- Go to railway.app → New Project
- Select Deploy from GitHub repo → pick your fork
In your Railway project → New Service → Database → PostgreSQL
Railway will automatically create a DATABASE_URL variable available to your app.
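On the application side, choosing between PostgreSQL and the SQLite fallback can hinge on whether DATABASE_URL is set. The following is a sketch of that decision only: the function name and the SQLite file path are assumptions, and the real logic lives in core/database.py.

```python
from __future__ import annotations

import os
from typing import Mapping


def resolve_database(env: Mapping[str, str] | None = None) -> tuple[str, str]:
    """Return (backend, target): PostgreSQL when DATABASE_URL is set, else SQLite."""
    env = os.environ if env is None else env
    url = env.get("DATABASE_URL", "").strip()
    if url.startswith(("postgres://", "postgresql://")):
        return ("postgresql", url)
    # Hypothetical local path; the project's actual SQLite location may differ.
    return ("sqlite", "data/privateai.db")
```

Passing the environment in as a mapping keeps the function easy to test without mutating os.environ.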
In your PrivateAI service → Variables:
| Variable | Value |
|---|---|
| DATABASE_URL | Auto-populated from PostgreSQL service |
| OPENAI_API_KEY | Your OpenAI API key |
| DATA_DIR | /app/data |
In your PrivateAI service → Volumes → create a volume mounted at /app/data.
This persists:
- data/users/{user_id}/.key (Fernet encryption keys)
- data/users/{user_id}/vector_store/ (ChromaDB vector stores)
- data/users/{user_id}/uploads/ (temporary upload staging)
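The per-user layout on the volume can be expressed as a small path helper rooted at DATA_DIR. This is an illustrative sketch: `UserPaths` and `user_paths` are hypothetical names, the integer user ID is an assumption, and core/user_paths.py may be organized differently.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class UserPaths:
    key_file: Path
    vector_store: Path
    uploads: Path


def user_paths(user_id: int, data_dir: str | None = None) -> UserPaths:
    """Build the three per-user locations under <DATA_DIR>/users/<user_id>/."""
    # Fall back to the DATA_DIR environment variable, then to a local "data" folder.
    base = data_dir or os.environ.get("DATA_DIR", "data")
    root = Path(base) / "users" / str(user_id)
    return UserPaths(
        key_file=root / ".key",
        vector_store=root / "vector_store",
        uploads=root / "uploads",
    )
```

Keeping every user's files under a single directory makes it straightforward to back up, restrict, or delete one account's data without touching another's.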
Click Deploy. Railway builds the Dockerfile and starts the app. First startup downloads the sentence-transformers model (~90MB) — this takes ~60 seconds.
pip install pytest
pytest tests/ -v
tests/test_crypto.py: Fernet key generation, encrypt/decrypt, recovery phrase (7 tests)
tests/test_ingestion.py: File loading, chunking, pipeline (6 tests)
tests/test_router.py: Model routing logic, complexity scoring (8 tests)
private-ai/
├── app.py # Entry point — auth gate, sidebar, page routing
├── core/
│ ├── database.py # DB layer — PostgreSQL (prod) / SQLite (local)
│ ├── auth.py # User accounts — bcrypt password hashing
│ ├── crypto.py # Fernet encryption + BIP39 recovery phrases
│ ├── embeddings.py # Embedding provider (local sentence-transformers / OpenAI)
│ ├── model_router.py # Hybrid routing — Ollama vs OpenAI
│ ├── audit.py # Audit event logger
│ └── user_paths.py # Per-user filesystem paths
├── pages/
│ ├── auth.py # Login / signup page
│ ├── onboarding.py # Key generation / restore
│ ├── ingestion_ui.py # Document upload and knowledge base management
│ ├── chat.py # Conversational RAG interface
│ ├── settings.py # Model config, routing threshold, audit log
│ └── admin.py # Admin panel — user management
├── ingestion/
│ ├── pipeline.py # Upload → chunk → encrypt → embed → store
│ ├── loader.py # PDF, DOCX, TXT file loaders
│ └── chunker.py # Overlapping text chunker
├── rag/
│ ├── chain.py # LangChain LCEL RAG chain
│ └── retriever.py # ChromaDB retrieval + Fernet decryption
├── config/
│ ├── settings.yaml # Default model settings
│ └── permissions.yaml # Agent permission flags
├── tests/ # pytest test suite
├── Dockerfile # Production container
├── docker-compose.yml # Local dev stack (PostgreSQL + Ollama + app)
├── railway.toml # Railway deployment config
└── .env.example # Environment variable template
If you discover a security vulnerability, please open a GitHub issue marked [SECURITY] or email directly. Do not post exploit details publicly before a fix is available.
MIT License. See LICENSE for details.