PrivateAI — Private AI Agent Platform

Chat with your own documents. You control the infrastructure. Encryption protects what you store.

The Problem

95% of professionals have documents they can't put into ChatGPT.

Medical records, legal contracts, financial reports, proprietary research, HR files, client data — all of it is off-limits for most AI tools because of a single fundamental design flaw: your documents go to a corporate server in plaintext, where someone else controls them.

The market for enterprise AI document tools is $4.4B and growing at 28% annually (Grand View Research, 2024), yet most solutions require organizations to hand their most sensitive information to a third party to process and store in readable form. Regulated industries — healthcare, finance, legal, government — are effectively locked out entirely.

The gap isn't just about privacy preferences. It's a structural trust problem: the moment you paste a document into ChatGPT, OpenAI receives it in full, may use it for model improvement depending on your plan and settings, and you have little visibility into where it goes.

The Solution

PrivateAI is a self-hosted, privacy-first RAG (Retrieval-Augmented Generation) platform. You deploy it on infrastructure you control, and every document is encrypted before it is stored — so even a database breach exposes only ciphertext, never readable content.

An honest framing of the trust model

PrivateAI doesn't eliminate the concept of a server — it changes who operates it and what that operator can see.

| Scenario | Who can read your documents? |
|---|---|
| ChatGPT / Claude | The AI company receives plaintext immediately |
| PrivateAI, self-hosted (Railway, AWS, your machine) | Only you: you are the operator |
| PrivateAI, hosted by a third party | That operator: same trust question as any hosted service |
| PostgreSQL DB breach (self or hosted) | Nobody: stored text is encrypted, blobs are unreadable without the key |
| Server operator with DB + filesystem access | Theoretically yes: this is true of any server-side application |

The primary use case is self-hosted deployment. When you run PrivateAI on your own infrastructure, there is no corporate third party involved — no OpenAI, no Anthropic, no cloud vendor reading your documents. If you use a managed host, you're trusting that host, just as you would with any web application.

What encryption adds in all scenarios: a database breach alone cannot expose document content. An attacker needs both the database and the per-user key files on the filesystem to decrypt anything.

What makes it different from corporate AI APIs:

| Feature | ChatGPT / Claude API | PrivateAI (self-hosted) |
|---|---|---|
| Documents received in plaintext by vendor | ✅ yes | ✗ never |
| Document text encrypted at rest | ✗ | ✅ Fernet AES-128 |
| Per-user key isolation | ✗ | ✅ |
| DB breach exposes readable text | ✅ yes | ✗ only ciphertext |
| Audit trail of every query | ✗ | ✅ |
| Runs fully offline | ✗ | ✅ (with Ollama) |
| No training on your data | ✗ (varies by plan) | ✅ |
| Open source, auditable | ✗ | ✅ |

Potential Outcomes

Healthcare: Clinicians query patient records, lab results, and research literature without sending protected health information to a third-party AI vendor, which removes one common obstacle to HIPAA compliance.
Legal: Associates query case files and contracts without privileged data leaving the firm.
Finance: Analysts query earnings reports and internal memos without non-public information leaving infrastructure the firm controls.
Government: Analysts query classified or sensitive documents on air-gapped infrastructure.
Enterprise: Any team that needs AI on internal documentation without an IT security exception.

Architecture

System Overview

graph TB
    subgraph Browser["User Browser"]
        UI[Streamlit UI]
    end

    subgraph App["PrivateAI Application"]
        AUTH[Auth Layer<br/>bcrypt passwords]
        CRYPTO[Encryption Layer<br/>Fernet AES-128]
        ROUTER[Model Router<br/>local vs cloud]
        RAG[RAG Chain<br/>LangChain LCEL]
    end

    subgraph Storage["Persistent Storage"]
        PG[(PostgreSQL<br/>user accounts<br/>document metadata<br/>audit log)]
        CHROMA[(ChromaDB<br/>per-user vector store<br/>encrypted chunks)]
        KEYS[(.key files<br/>Fernet keys<br/>per user)]
    end

    subgraph Models["AI Models"]
        OLLAMA[Ollama<br/>local LLM<br/>llama3 / mistral]
        OPENAI[OpenAI API<br/>GPT-4o<br/>cloud fallback]
        EMBED[Embeddings<br/>sentence-transformers<br/>or OpenAI]
    end

    UI --> AUTH
    AUTH --> CRYPTO
    CRYPTO --> RAG
    RAG --> ROUTER
    ROUTER -->|simple queries| OLLAMA
    ROUTER -->|complex queries| OPENAI
    RAG --> CHROMA
    CHROMA --> EMBED
    AUTH --> PG
    CRYPTO --> KEYS
    RAG --> PG

Document Ingestion Flow

sequenceDiagram
    participant U as User
    participant UI as Streamlit UI
    participant P as Pipeline
    participant C as ChromaDB
    participant DB as PostgreSQL

    U->>UI: Upload file (PDF/DOCX/TXT)
    UI->>P: ingest_file(file, fernet_key, user_id)
    P->>P: Check file size ≤ 1GB
    P->>P: SHA256 hash → deduplication check
    DB-->>P: Already ingested? → skip
    P->>P: Load text → chunk into segments
    P->>P: Encrypt each chunk (Fernet)
    P->>C: Store vectors + encrypted metadata
    P->>DB: Record document metadata (doc_id, user_id, hash)
    P-->>UI: Result: ingested / skipped / error
    UI-->>U: Confirmation + document list
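
The pre-ingestion steps in the flow above (size check, SHA256 deduplication, chunking) can be sketched roughly as follows. This is a minimal illustration, not the code in `ingestion/pipeline.py`: the function names, the in-memory `seen_hashes` set (standing in for the PostgreSQL lookup), and the chunk size and overlap values are all assumptions.

```python
import hashlib

MAX_UPLOAD_BYTES = 1024 ** 3  # 1 GB cap, matching the limit in the flow above


def sha256_hex(data: bytes) -> str:
    """Content hash used as the deduplication key."""
    return hashlib.sha256(data).hexdigest()


def precheck(data: bytes, seen_hashes: set[str]) -> str:
    """Return 'error', 'skipped', or 'ingest' before any chunking happens.

    seen_hashes is a stand-in for the database lookup (assumption).
    """
    if len(data) > MAX_UPLOAD_BYTES:
        return "error"
    digest = sha256_hex(data)
    if digest in seen_hashes:
        return "skipped"
    seen_hashes.add(digest)
    return "ingest"


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (sizes are assumed values)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks
```

Hashing the raw bytes before loading or chunking means a duplicate upload is rejected cheaply, before any embedding work is done.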

RAG Query Flow

sequenceDiagram
    participant U as User
    participant UI as Chat Page
    participant R as Retriever
    participant C as ChromaDB
    participant LLM as Model Router
    participant OL as Ollama (local)
    participant OA as OpenAI (cloud)

    U->>UI: Ask a question
    UI->>R: retrieve_context(query, fernet, user_id)
    R->>C: Vector search (enabled docs only)
    C-->>R: Top-K encrypted chunks
    R->>R: Decrypt chunks with Fernet key
    R-->>UI: Context string + source citations
    UI->>LLM: get_llm(query, complexity_score)
    alt Ollama running AND complexity < threshold
        LLM-->>OL: Query stays local
        OL-->>UI: Answer (LOCAL badge)
    else Ollama unavailable OR complex query
        LLM-->>OA: Query sent to OpenAI
        OA-->>UI: Answer (CLOUD badge)
    end
    UI->>UI: Log event to audit trail
    UI-->>U: Answer + sources + privacy badge

Security Model

graph LR
    subgraph PerUser["Per-User Isolation"]
        K1[User A key]
        K2[User B key]
        VS1[(User A vector store)]
        VS2[(User B vector store)]
    end

    subgraph DB["Shared Database"]
        T1[users table]
        T2[documents table<br/>+ user_id column]
        T3[audit_log table<br/>+ user_id column]
    end

    K1 -->|decrypts only| VS1
    K2 -->|decrypts only| VS2
    T2 -->|row-level isolation| UserA[User A rows]
    T2 -->|row-level isolation| UserB[User B rows]

    style K1 fill:#e8f5e9
    style K2 fill:#e8f5e9
    style VS1 fill:#e8f5e9
    style VS2 fill:#e8f5e9

How Your Data Is Secured

1. Encryption at Rest

Every document chunk is encrypted with Fernet (AES-128-CBC + HMAC-SHA256) before being stored. The vector store holds numerical embeddings alongside the encrypted chunk text (the `text_enc` field), never plaintext. The PostgreSQL database stores accounts, document metadata, and the audit log, but no document content.

2. Per-User Key Isolation

Each user generates their own encryption key during onboarding. The key is stored:

In the user's browser session (cleared on logout)
In a .key file in the user's private data directory on the server

User A's key cannot decrypt User B's data — ever. This is enforced cryptographically, not just by access control.

However, it is important to be clear: a server operator who has access to both the filesystem (where .key files live) and the database (where encrypted chunks are stored) could theoretically decrypt documents. This is the same trust boundary as any server-side application. The solution to this is self-hosting — when you control the server, you are the only one with that access.

3. What a Database Breach Actually Exposes

If only the PostgreSQL database is compromised:

| Table | What an attacker sees |
|---|---|
| `users` | Usernames + bcrypt password hashes (not reversible) |
| `documents` | Filenames, chunk counts, SHA256 hashes, no content |
| `audit_log` | Event types, timestamps, model used, no content |
| ChromaDB | Encrypted blobs (`text_enc` field), unreadable without the key |

A database breach alone exposes metadata, not document content. The attacker also needs the .key files from the filesystem.
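
To make "metadata, not content" concrete, here is one possible shape for an audit record. The field names are assumptions based on the table above (event type, timestamp, model used) and are not taken from `core/audit.py`.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical audit row: describes what happened, never what was said."""
    user_id: int
    event_type: str
    model: str
    timestamp: str


def make_event(user_id: int, event_type: str, model: str) -> dict:
    """Build a content-free audit record stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    return asdict(AuditEvent(user_id, event_type, model, now))
```

Because the record type has no field for query or document text, a leaked `audit_log` table reveals activity patterns but not the underlying content.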

4. 12-Word Recovery Phrase

Each key is deterministically derived from a 12-word BIP39-style mnemonic phrase. The phrase is shown once during setup and never stored digitally. It is the only way to recover data if the key file is lost.
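
One way such a derivation could work is sketched below. The README does not specify the actual function, so this borrows the BIP39 seed parameters (PBKDF2-HMAC-SHA512, 2048 iterations, salt `"mnemonic"`) as an assumption and encodes the first 32 bytes in the URL-safe base64 form that Fernet keys use. The name `key_from_phrase` is hypothetical, and no wordlist or checksum validation is performed.

```python
import base64
import hashlib
import unicodedata


def key_from_phrase(phrase: str) -> bytes:
    """Derive a Fernet-format key (44 base64 bytes) from a recovery phrase.

    Assumed parameters: BIP39-style PBKDF2-HMAC-SHA512, 2048 iterations.
    """
    # Collapse whitespace and apply NFKD so the same words always give the same key.
    normalized = unicodedata.normalize("NFKD", " ".join(phrase.split()))
    seed = hashlib.pbkdf2_hmac(
        "sha512", normalized.encode("utf-8"), b"mnemonic", 2048
    )
    # Fernet expects 32 raw bytes, URL-safe base64 encoded.
    return base64.urlsafe_b64encode(seed[:32])
```

Because the derivation is deterministic, re-entering the same 12 words on a new machine reproduces the same key, which is what makes recovery possible without storing the phrase.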

5. Hybrid AI Routing with Privacy Awareness

PrivateAI scores each query for complexity and routes accordingly:

Complexity Score < Threshold → Ollama (local, private, free)
Complexity Score ≥ Threshold → OpenAI GPT-4o (cloud, more capable)

The UI always shows a LOCAL or CLOUD badge on every response so you know exactly where your query went. The audit log captures every event permanently.
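
The routing rule can be sketched in a few lines. The decision function mirrors the rule stated above and the fallback shown in the query flow; the `complexity_score` heuristic is a toy invented for this example, since the README does not describe how the real scorer in `core/model_router.py` works.

```python
def complexity_score(query: str) -> float:
    """Toy heuristic in [0, 1]: longer queries and reasoning keywords score higher."""
    words = query.lower().split()
    length_part = min(len(words) / 50, 1.0)
    keywords = {"compare", "analyze", "summarize", "explain", "why"}
    keyword_part = 1.0 if keywords & set(words) else 0.0
    return 0.6 * length_part + 0.4 * keyword_part


def choose_backend(score: float, threshold: float, ollama_available: bool) -> str:
    """Return the badge for the backend that should answer the query."""
    if ollama_available and score < threshold:
        return "LOCAL"   # stays on Ollama
    return "CLOUD"       # complex query, or Ollama is not reachable
```

Note that in this sketch an unavailable Ollama always routes to the cloud, as in the query flow diagram, so a deployment that must stay fully local needs Ollama running and a threshold high enough that no query crosses it.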

6. Multi-Tenant Row-Level Isolation

All database tables include a user_id column. Every query is scoped by user_id — there is no data sharing between accounts at the application layer. An admin account can see user registration records but cannot access or decrypt any user's document content.
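
A minimal illustration of user-scoped queries, using an in-memory SQLite database. The table layout and the `list_documents` helper are simplified assumptions, not the schema in `core/database.py`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "doc_id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, filename TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO documents (user_id, filename) VALUES (?, ?)",
    [(1, "contract.pdf"), (2, "labs.pdf"), (1, "memo.docx")],
)


def list_documents(conn: sqlite3.Connection, user_id: int) -> list[str]:
    """Return only the caller's filenames; user_id is always a bound parameter."""
    rows = conn.execute(
        "SELECT filename FROM documents WHERE user_id = ? ORDER BY doc_id",
        (user_id,),
    )
    return [row[0] for row in rows]
```

Keeping the `user_id` filter inside the data-access function, rather than in each caller, makes it harder to forget the scope on a new query.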

7. Upload Size Limit

File uploads are capped at 1 GB per file to prevent resource exhaustion.

Why This Architecture

| Choice | Rationale |
|---|---|
| Streamlit | Fastest path to a production-quality data app without a separate frontend. Ideal for AI tooling. |
| LangChain LCEL | Composable chain definition: easy to swap LLM, retriever, or parser without rewriting logic. |
| ChromaDB | Embedded vector store: no extra service required locally, simple to persist on a volume in production. |
| Fernet encryption | Symmetric encryption with built-in authentication (HMAC). Simple, audited, no key management service needed. |
| Hybrid routing (Ollama + OpenAI) | Maximizes privacy (local first) while maintaining quality for complex queries. Users control the threshold. |
| PostgreSQL on Railway | Durable, scalable, free tier available. SQLite fallback means zero friction for local dev. |
| sentence-transformers | Local embedding model (~90MB): documents can be indexed without any external API call. |

Running AI Locally (No Internet Required)

Want to use PrivateAI without sending any queries to OpenAI? You can run a free, open-source AI model entirely on your own machine using Ollama.

→ Full step-by-step guide for beginners: docs/local-ai-setup.md

Covers: hardware requirements, Windows/Mac/Linux install, choosing the right model for your computer, and connecting it to PrivateAI. No technical background required.

Quick Start — Local Development

Prerequisites

Python 3.11+
Ollama (optional — enables local-only mode; see setup guide)

1. Clone and install

git clone https://github.com/virtualryder/private-ai.git
cd private-ai
python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
pip install -r requirements.txt

2. Configure environment

cp .env.example .env
# Edit .env — at minimum add your OPENAI_API_KEY if you want cloud fallback

3. (Optional) Start Ollama

ollama pull llama3
ollama serve

4. Run the app

streamlit run app.py

Open http://localhost:8501. Create your first account — it will automatically become the admin.

Quick Start — Docker Compose (Full Stack)

Runs PostgreSQL + Ollama + PrivateAI in one command:

cp .env.example .env
# Add OPENAI_API_KEY to .env
docker compose up --build

Then open http://localhost:8501.

Deploy to Railway

Step 1 — Fork and create project

Fork this repo to your GitHub account
Go to railway.app → New Project
Select Deploy from GitHub repo → pick your fork

Step 2 — Add PostgreSQL

In your Railway project → New Service → Database → PostgreSQL

Railway will automatically create a DATABASE_URL variable available to your app.

Step 3 — Set environment variables

In your PrivateAI service → Variables:

| Variable | Value |
|---|---|
| `DATABASE_URL` | Auto-populated from PostgreSQL service |
| `OPENAI_API_KEY` | Your OpenAI API key |
| `DATA_DIR` | `/app/data` |

Step 4 — Create a Volume

In your PrivateAI service → Volumes → create a volume mounted at /app/data.

This persists:

data/users/{user_id}/.key — Fernet encryption keys
data/users/{user_id}/vector_store/ — ChromaDB vector stores
data/users/{user_id}/uploads/ — Temporary upload staging

Step 5 — Deploy

Click Deploy. Railway builds the Dockerfile and starts the app. First startup downloads the sentence-transformers model (~90MB) — this takes ~60 seconds.

Running Tests

pip install pytest
pytest tests/ -v

Test coverage

tests/test_crypto.py     — Fernet key generation, encrypt/decrypt, recovery phrase (7 tests)
tests/test_ingestion.py  — File loading, chunking, pipeline (6 tests)
tests/test_router.py     — Model routing logic, complexity scoring (8 tests)

Project Structure

private-ai/
├── app.py                    # Entry point — auth gate, sidebar, page routing
├── core/
│   ├── database.py           # DB layer — PostgreSQL (prod) / SQLite (local)
│   ├── auth.py               # User accounts — bcrypt password hashing
│   ├── crypto.py             # Fernet encryption + BIP39 recovery phrases
│   ├── embeddings.py         # Embedding provider (local sentence-transformers / OpenAI)
│   ├── model_router.py       # Hybrid routing — Ollama vs OpenAI
│   ├── audit.py              # Audit event logger
│   └── user_paths.py         # Per-user filesystem paths
├── pages/
│   ├── auth.py               # Login / signup page
│   ├── onboarding.py         # Key generation / restore
│   ├── ingestion_ui.py       # Document upload and knowledge base management
│   ├── chat.py               # Conversational RAG interface
│   ├── settings.py           # Model config, routing threshold, audit log
│   └── admin.py              # Admin panel — user management
├── ingestion/
│   ├── pipeline.py           # Upload → chunk → encrypt → embed → store
│   ├── loader.py             # PDF, DOCX, TXT file loaders
│   └── chunker.py            # Overlapping text chunker
├── rag/
│   ├── chain.py              # LangChain LCEL RAG chain
│   └── retriever.py          # ChromaDB retrieval + Fernet decryption
├── config/
│   ├── settings.yaml         # Default model settings
│   └── permissions.yaml      # Agent permission flags
├── tests/                    # pytest test suite
├── Dockerfile                # Production container
├── docker-compose.yml        # Local dev stack (PostgreSQL + Ollama + app)
├── railway.toml              # Railway deployment config
└── .env.example              # Environment variable template

Security Disclosure

If you discover a security vulnerability, please open a GitHub issue marked [SECURITY] or email directly. Do not post exploit details publicly before a fix is available.

License

MIT License. See LICENSE for details.
