Skip to content

virtualryder/PrivateAI

Repository files navigation

PrivateAI — Private AI Agent Platform

Chat with your own documents. You control the infrastructure. Encryption protects what you store.

Python 3.11 Streamlit LangChain Railway


The Problem

95% of professionals have documents they can't put into ChatGPT.

Medical records, legal contracts, financial reports, proprietary research, HR files, client data — all of it is off-limits for most AI tools because of a single fundamental design flaw: your documents go to a corporate server in plaintext, where someone else controls them.

The market for enterprise AI document tools is $4.4B and growing at 28% annually (Grand View Research, 2024), yet most solutions require organizations to hand their most sensitive information to a third party to process and store in readable form. Regulated industries — healthcare, finance, legal, government — are effectively locked out entirely.

The gap isn't just about privacy preferences. It's a structural trust problem: the moment you paste a document into ChatGPT, OpenAI receives it in full, can use it for model improvement, and you have no visibility into where it goes.


The Solution

PrivateAI is a self-hosted, privacy-first RAG (Retrieval-Augmented Generation) platform. You deploy it on infrastructure you control, and every document is encrypted before it is stored — so even a database breach exposes only ciphertext, never readable content.

An honest framing of the trust model

PrivateAI doesn't eliminate the concept of a server — it changes who operates it and what that operator can see.

Scenario Who can read your documents?
ChatGPT / Claude The AI company receives plaintext immediately
PrivateAI, self-hosted (Railway, AWS, your machine) Only you — you are the operator
PrivateAI, hosted by a third party That operator — same trust question as any hosted service
PostgreSQL DB breach (self or hosted) Nobody — stored text is encrypted, blobs are unreadable without the key
Server operator with DB + filesystem access Theoretically yes — this is true of any server-side application

The primary use case is self-hosted deployment. When you run PrivateAI on your own infrastructure, there is no corporate third party involved — no OpenAI, no Anthropic, no cloud vendor reading your documents. If you use a managed host, you're trusting that host, just as you would with any web application.

What encryption adds in all scenarios: a database breach alone cannot expose document content. An attacker needs both the database and the per-user key files on the filesystem to decrypt anything.

What makes it different from corporate AI APIs:

Feature ChatGPT / Claude API PrivateAI (self-hosted)
Documents received in plaintext by vendor ✅ yes ✗ never
Document text encrypted at rest ✅ Fernet AES-128
Per-user key isolation
DB breach exposes readable text ✅ yes ✗ only ciphertext
Audit trail of every query
Runs fully offline ✅ (with Ollama)
No training on your data ✗ (varies by plan)
Open source, auditable

Potential Outcomes

  • Healthcare: Clinicians query patient records, lab results, and research literature without violating HIPAA.
  • Legal: Associates query case files and contracts without privileged data leaving the firm.
  • Finance: Analysts query earnings reports and internal memos without SEC disclosure concerns.
  • Government: Analysts query classified or sensitive documents on air-gapped infrastructure.
  • Enterprise: Any team that needs AI on internal documentation without an IT security exception.

Architecture

System Overview

graph TB
    subgraph Browser["User Browser"]
        UI[Streamlit UI]
    end

    subgraph App["PrivateAI Application"]
        AUTH[Auth Layer<br/>bcrypt passwords]
        CRYPTO[Encryption Layer<br/>Fernet AES-128]
        ROUTER[Model Router<br/>local vs cloud]
        RAG[RAG Chain<br/>LangChain LCEL]
    end

    subgraph Storage["Persistent Storage"]
        PG[(PostgreSQL<br/>user accounts<br/>document metadata<br/>audit log)]
        CHROMA[(ChromaDB<br/>per-user vector store<br/>encrypted chunks)]
        KEYS[(.key files<br/>Fernet keys<br/>per user)]
    end

    subgraph Models["AI Models"]
        OLLAMA[Ollama<br/>local LLM<br/>llama3 / mistral]
        OPENAI[OpenAI API<br/>GPT-4o<br/>cloud fallback]
        EMBED[Embeddings<br/>sentence-transformers<br/>or OpenAI]
    end

    UI --> AUTH
    AUTH --> CRYPTO
    CRYPTO --> RAG
    RAG --> ROUTER
    ROUTER -->|simple queries| OLLAMA
    ROUTER -->|complex queries| OPENAI
    RAG --> CHROMA
    CHROMA --> EMBED
    AUTH --> PG
    CRYPTO --> KEYS
    RAG --> PG
Loading

Document Ingestion Flow

sequenceDiagram
    participant U as User
    participant UI as Streamlit UI
    participant P as Pipeline
    participant C as ChromaDB
    participant DB as PostgreSQL

    U->>UI: Upload file (PDF/DOCX/TXT)
    UI->>P: ingest_file(file, fernet_key, user_id)
    P->>P: Check file size ≤ 1GB
    P->>P: SHA256 hash → deduplication check
    DB-->>P: Already ingested? → skip
    P->>P: Load text → chunk into segments
    P->>P: Encrypt each chunk (Fernet)
    P->>C: Store vectors + encrypted metadata
    P->>DB: Record document metadata (doc_id, user_id, hash)
    P-->>UI: Result: ingested / skipped / error
    UI-->>U: Confirmation + document list
Loading

RAG Query Flow

sequenceDiagram
    participant U as User
    participant UI as Chat Page
    participant R as Retriever
    participant C as ChromaDB
    participant LLM as Model Router
    participant OL as Ollama (local)
    participant OA as OpenAI (cloud)

    U->>UI: Ask a question
    UI->>R: retrieve_context(query, fernet, user_id)
    R->>C: Vector search (enabled docs only)
    C-->>R: Top-K encrypted chunks
    R->>R: Decrypt chunks with Fernet key
    R-->>UI: Context string + source citations
    UI->>LLM: get_llm(query, complexity_score)
    alt Ollama running AND complexity < threshold
        LLM-->>OL: Query stays local
        OL-->>UI: Answer (LOCAL badge)
    else Ollama unavailable OR complex query
        LLM-->>OA: Query sent to OpenAI
        OA-->>UI: Answer (CLOUD badge)
    end
    UI->>UI: Log event to audit trail
    UI-->>U: Answer + sources + privacy badge
Loading

Security Model

graph LR
    subgraph PerUser["Per-User Isolation"]
        K1[User A key]
        K2[User B key]
        VS1[(User A vector store)]
        VS2[(User B vector store)]
    end

    subgraph DB["Shared Database"]
        T1[users table]
        T2[documents table<br/>+ user_id column]
        T3[audit_log table<br/>+ user_id column]
    end

    K1 -->|decrypts only| VS1
    K2 -->|decrypts only| VS2
    T2 -->|row-level isolation| UserA[User A rows]
    T2 -->|row-level isolation| UserB[User B rows]

    style K1 fill:#e8f5e9
    style K2 fill:#e8f5e9
    style VS1 fill:#e8f5e9
    style VS2 fill:#e8f5e9
Loading

How Your Data Is Secured

1. Encryption at Rest

Every document chunk is encrypted with Fernet (AES-128-CBC + HMAC-SHA256) before being stored. The vector store contains only numerical embeddings — never your actual text. The PostgreSQL database stores only ciphertext.

2. Per-User Key Isolation

Each user generates their own encryption key during onboarding. The key is stored:

  • In the user's browser session (cleared on logout)
  • In a .key file in the user's private data directory on the server

User A's key cannot decrypt User B's data — ever. This is enforced cryptographically, not just by access control.

However, it is important to be clear: a server operator who has access to both the filesystem (where .key files live) and the database (where encrypted chunks are stored) could theoretically decrypt documents. This is the same trust boundary as any server-side application. The solution to this is self-hosting — when you control the server, you are the only one with that access.

3. What a Database Breach Actually Exposes

If only the PostgreSQL database is compromised:

Table What an attacker sees
users Usernames + bcrypt password hashes (not reversible)
documents Filenames, chunk counts, SHA256 hashes — no content
audit_log Event types, timestamps, model used — no content
ChromaDB Encrypted blobs (text_enc field) — unreadable without the key

A database breach alone exposes metadata, not document content. The attacker also needs the .key files from the filesystem.

4. 12-Word Recovery Phrase

Each key is deterministically derived from a 12-word BIP39-style mnemonic phrase. The phrase is shown once during setup and never stored digitally. It is the only way to recover data if the key file is lost.

5. Hybrid AI Routing with Privacy Awareness

PrivateAI scores each query for complexity and routes accordingly:

Complexity Score < Threshold → Ollama (local, private, free)
Complexity Score ≥ Threshold → OpenAI GPT-4o (cloud, more capable)

The UI always shows a LOCAL or CLOUD badge on every response so you know exactly where your query went. The audit log captures every event permanently.

6. Multi-Tenant Row-Level Isolation

All database tables include a user_id column. Every query is scoped by user_id — there is no data sharing between accounts at the application layer. An admin account can see user registration records but cannot access or decrypt any user's document content.

7. Upload Size Limit

File uploads are capped at 1 GB per file to prevent resource exhaustion.


Why This Architecture

Choice Rationale
Streamlit Fastest path to a production-quality data app without a separate frontend. Ideal for AI tooling.
LangChain LCEL Composable chain definition — easy to swap LLM, retriever, or parser without rewriting logic.
ChromaDB Embedded vector store — no extra service required locally, simple to persist on a volume in production.
Fernet encryption Symmetric encryption with built-in authentication (HMAC). Simple, audited, no key management service needed.
Hybrid routing (Ollama + OpenAI) Maximizes privacy (local first) while maintaining quality for complex queries. Users control the threshold.
PostgreSQL on Railway Durable, scalable, free tier available. SQLite fallback means zero friction for local dev.
sentence-transformers Local embedding model (~90MB) — documents can be indexed without any external API call.

Running AI Locally (No Internet Required)

Want to use PrivateAI without sending any queries to OpenAI? You can run a free, open-source AI model entirely on your own machine using Ollama.

Full step-by-step guide for beginners: docs/local-ai-setup.md

Covers: hardware requirements, Windows/Mac/Linux install, choosing the right model for your computer, and connecting it to PrivateAI. No technical background required.


Quick Start — Local Development

Prerequisites

1. Clone and install

git clone https://github.com/virtualryder/private-ai.git
cd private-ai
python -m venv .venv
source .venv/bin/activate      # Windows: .venv\Scripts\activate
pip install -r requirements.txt

2. Configure environment

cp .env.example .env
# Edit .env — at minimum add your OPENAI_API_KEY if you want cloud fallback

3. (Optional) Start Ollama

ollama pull llama3
ollama serve

4. Run the app

streamlit run app.py

Open http://localhost:8501. Create your first account — it will automatically become the admin.


Quick Start — Docker Compose (Full Stack)

Runs PostgreSQL + Ollama + PrivateAI in one command:

cp .env.example .env
# Add OPENAI_API_KEY to .env
docker compose up --build

Then open http://localhost:8501.


Deploy to Railway

Step 1 — Fork and create project

  1. Fork this repo to your GitHub account
  2. Go to railway.appNew Project
  3. Select Deploy from GitHub repo → pick your fork

Step 2 — Add PostgreSQL

In your Railway project → New Service → Database → PostgreSQL

Railway will automatically create a DATABASE_URL variable available to your app.

Step 3 — Set environment variables

In your PrivateAI service → Variables:

Variable Value
DATABASE_URL Auto-populated from PostgreSQL service
OPENAI_API_KEY Your OpenAI API key
DATA_DIR /app/data

Step 4 — Create a Volume

In your PrivateAI service → Volumes → create a volume mounted at /app/data.

This persists:

  • data/users/{user_id}/.key — Fernet encryption keys
  • data/users/{user_id}/vector_store/ — ChromaDB vector stores
  • data/users/{user_id}/uploads/ — Temporary upload staging

Step 5 — Deploy

Click Deploy. Railway builds the Dockerfile and starts the app. First startup downloads the sentence-transformers model (~90MB) — this takes ~60 seconds.


Running Tests

pip install pytest
pytest tests/ -v

Test coverage

tests/test_crypto.py     — Fernet key generation, encrypt/decrypt, recovery phrase (7 tests)
tests/test_ingestion.py  — File loading, chunking, pipeline (6 tests)
tests/test_router.py     — Model routing logic, complexity scoring (8 tests)

Project Structure

private-ai/
├── app.py                    # Entry point — auth gate, sidebar, page routing
├── core/
│   ├── database.py           # DB layer — PostgreSQL (prod) / SQLite (local)
│   ├── auth.py               # User accounts — bcrypt password hashing
│   ├── crypto.py             # Fernet encryption + BIP39 recovery phrases
│   ├── embeddings.py         # Embedding provider (local sentence-transformers / OpenAI)
│   ├── model_router.py       # Hybrid routing — Ollama vs OpenAI
│   ├── audit.py              # Audit event logger
│   └── user_paths.py         # Per-user filesystem paths
├── pages/
│   ├── auth.py               # Login / signup page
│   ├── onboarding.py         # Key generation / restore
│   ├── ingestion_ui.py       # Document upload and knowledge base management
│   ├── chat.py               # Conversational RAG interface
│   ├── settings.py           # Model config, routing threshold, audit log
│   └── admin.py              # Admin panel — user management
├── ingestion/
│   ├── pipeline.py           # Upload → chunk → encrypt → embed → store
│   ├── loader.py             # PDF, DOCX, TXT file loaders
│   └── chunker.py            # Overlapping text chunker
├── rag/
│   ├── chain.py              # LangChain LCEL RAG chain
│   └── retriever.py          # ChromaDB retrieval + Fernet decryption
├── config/
│   ├── settings.yaml         # Default model settings
│   └── permissions.yaml      # Agent permission flags
├── tests/                    # pytest test suite
├── Dockerfile                # Production container
├── docker-compose.yml        # Local dev stack (PostgreSQL + Ollama + app)
├── railway.toml              # Railway deployment config
└── .env.example              # Environment variable template

Security Disclosure

If you discover a security vulnerability, please open a GitHub issue marked [SECURITY] or email directly. Do not post exploit details publicly before a fix is available.


License

MIT License. See LICENSE for details.

About

Self-hosted RAG platform for private document Q&A. Deploy on your own infrastructure — no corporate third party receives your documents. Fernet encryption at rest, per-user key isolation, hybrid local/cloud LLM routing, and a full audit trail. Built with LangChain, ChromaDB, Streamlit, and PostgreSQL.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors