Enterprise-grade private AI document processing platform - Chat with your documents securely using your own infrastructure, powered by Ollama LLM, ChromaDB vector store, and n8n automation workflows.
Private Document AI is a complete self-hosted solution that enables organizations to implement secure document AI capabilities without data leaving their infrastructure. Built with enterprise DevOps practices, this platform provides:
- 🔒 Complete Data Privacy: All processing happens on your infrastructure
- 🚀 Production-Ready Deployment: Automated CI/CD with infrastructure as code
- 🔧 Extensible Workflows: Visual automation with n8n
- ⚡ High Performance: Vector search with ChromaDB and local LLM inference
- 📈 Scalable Architecture: Containerized microservices design
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│   Client Web    │────▶│  n8n Platform   │────▶│   Ollama LLM    │
│   Interface     │     │     :5678       │     │     :11434      │
└─────────────────┘     └─────────────────┘     └─────────────────┘
                                 │                       │
                                 ▼                       ▼
                        ┌─────────────────┐     ┌─────────────────┐
                        │    ChromaDB     │     │     Vector      │
                        │     :8000       │     │   Embeddings    │
                        └─────────────────┘     └─────────────────┘
```
| Component | Technology | Purpose | Port |
|---|---|---|---|
| Orchestration | n8n | Workflow automation & UI | 5678 |
| LLM Engine | Ollama (Llama3 8B) | Text generation & embeddings | 11434 |
| Vector Database | ChromaDB | Document storage & similarity search | 8000 |
| Infrastructure | Terraform + DigitalOcean | Cloud provisioning | - |
| Containerization | Docker Compose | Service orchestration | - |
| CI/CD | GitHub Actions | Automated deployment | - |
- 📄 Document Ingestion: Upload → Text extraction → Chunking → Embedding → Vector storage
- 💬 Question Answering: Query → Embedding → Similarity search → Context retrieval → LLM response
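For intuition, the two flows above can be sketched in plain Python: fixed-size overlapping chunking for ingestion, and cosine-similarity ranking for retrieval. The deployed stack delegates these steps to n8n, Ollama, and ChromaDB; the chunk size, overlap, and helper names below are illustrative, not taken from the repo.

```python
import math

def chunk(text, size=500, overlap=50):
    """Split text into overlapping character chunks (illustrative sizes)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, stored, k=3):
    """Rank stored (chunk_text, embedding) pairs by similarity to the query."""
    ranked = sorted(stored, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

In the real pipeline, embeddings come from Ollama and the ranked lookup is a single ChromaDB similarity query; the retrieved chunks are then passed to the LLM as context.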
- DigitalOcean account with API token
- SSH key pair configured in DigitalOcean
- GitHub repository with Actions enabled
Before deploying, you need to set up SSH keys for secure server access. Follow this step-by-step guide:
```bash
# Generate a new SSH key pair specifically for this project
ssh-keygen -t ed25519 -C "private-ai-deployment" -f ~/.ssh/private-ai

# This creates two files:
#   ~/.ssh/private-ai      (private key - keep secret!)
#   ~/.ssh/private-ai.pub  (public key - safe to share)
```

Alternative for older systems:

```bash
# If ed25519 is not supported, use RSA
ssh-keygen -t rsa -b 4096 -C "private-ai-deployment" -f ~/.ssh/private-ai
```
1. Display your public key:

   ```bash
   cat ~/.ssh/private-ai.pub
   ```

2. Copy the output (starts with `ssh-ed25519` or `ssh-rsa`)

3. Add to DigitalOcean:
   - Go to DigitalOcean Control Panel
   - Navigate to Account → Security → SSH Keys
   - Click "Add SSH Key"
   - Paste your public key content
   - Name it: `private-ai-deployment`
   - Click "Add SSH Key"
4. Get your SSH Key ID:

   ```bash
   # Method 1: Using doctl CLI (if installed)
   doctl compute ssh-key list

   # Method 2: Using curl
   curl -X GET \
     -H "Content-Type: application/json" \
     -H "Authorization: Bearer YOUR_DO_TOKEN" \
     "https://api.digitalocean.com/v2/account/keys"
   ```

   Copy the numeric ID (e.g., `12345678`) - you'll need this for Terraform.
5. Get your private key content:

   ```bash
   cat ~/.ssh/private-ai
   ```

6. Copy the entire content including the `-----BEGIN` and `-----END` lines

7. Add to GitHub Secrets:
   - Go to your repository → Settings → Secrets and variables → Actions
   - Click "New repository secret"
   - Name: `SSH_PRIVATE_KEY`
   - Value: Paste the entire private key content
   - Click "Add secret"
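The curl method above returns JSON from DigitalOcean's `/v2/account/keys` endpoint; the numeric ID can be pulled out programmatically. The response below is abbreviated and the ID shown is made up, so treat this as a sketch of the shape rather than a verbatim API response.

```python
import json

# Abbreviated sample of a GET /v2/account/keys response (IDs are made up).
sample = json.loads("""
{
  "ssh_keys": [
    {"id": 12345678,
     "name": "private-ai-deployment",
     "public_key": "ssh-ed25519 AAAA..."}
  ]
}
""")

def key_id_by_name(response, name):
    """Return the numeric ID of the SSH key with the given name, or None."""
    for key in response.get("ssh_keys", []):
        if key["name"] == name:
            return key["id"]
    return None

print(key_id_by_name(sample, "private-ai-deployment"))  # 12345678
```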
Edit `terraform/main.tf` and replace the SSH key ID:

```hcl
resource "digitalocean_droplet" "private_ai_server" {
  # Replace with your SSH key ID from Step 2
  ssh_keys = ["12345678"]  # ← Replace this number

  # ... rest of configuration
}
```

- ✅ Never commit private keys to Git
- ✅ Use unique keys for each project
- ✅ Store private keys securely
- ✅ Regularly rotate SSH keys
- ❌ Don't share private key content
- ❌ Don't reuse personal SSH keys
| Issue | Solution |
|---|---|
| `Permission denied (publickey)` | Verify key is added to DigitalOcean and ID is correct |
| Bad permissions error | Fix permissions: `chmod 600 ~/.ssh/private-ai` |
| Key not found in DO | Re-upload public key to DigitalOcean account |
| GitHub secret not working | Ensure entire private key is copied, including headers |
```bash
# Test SSH connection to DigitalOcean (after deployment)
ssh -i ~/.ssh/private-ai root@YOUR_DROPLET_IP

# Check key permissions
ls -la ~/.ssh/private-ai*

# Verify public key format
ssh-keygen -l -f ~/.ssh/private-ai.pub
```

```bash
# Fork this repository
git clone https://github.com/YOUR_USERNAME/private-ai.git
cd private-ai
```

Navigate to your repository → Settings → Secrets and variables → Actions:
| Secret Name | Description | Example |
|---|---|---|
| `DIGITALOCEAN_TOKEN` | DigitalOcean API token | `dop_v1_xxxxx` |
| `SSH_PRIVATE_KEY` | Private SSH key content | `-----BEGIN OPENSSH PRIVATE KEY-----` |
Edit `terraform/main.tf` and replace the SSH key ID:

```hcl
resource "digitalocean_droplet" "private_ai_server" {
  ssh_keys = ["YOUR_SSH_KEY_ID"]  # Replace with your key ID

  # ... rest of configuration
}
```

```bash
git add .
git commit -m "feat: configure deployment secrets"
git push origin main
```

🎉 That's it! GitHub Actions will automatically:
- Provision DigitalOcean infrastructure
- Install and configure all services
- Deploy the complete AI stack
After deployment completes, access your services:
- 🌐 n8n Interface: `http://YOUR_DROPLET_IP:5678`
- 📊 ChromaDB API: `http://YOUR_DROPLET_IP:8000`
- 🤖 Ollama API: `http://YOUR_DROPLET_IP:11434`
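A quick way to confirm all three endpoints respond is a small script. This is a sketch assuming the default ports from the table above; `service_urls` and `check` are hypothetical helpers, not part of the repo.

```python
import urllib.request

# Default ports for the three services (see the component table above).
SERVICES = {
    "n8n": 5678,
    "chromadb": 8000,
    "ollama": 11434,
}

def service_urls(host):
    """Map each service name to its base URL on the given host."""
    return {name: f"http://{host}:{port}" for name, port in SERVICES.items()}

def check(url, timeout=5):
    """Return True if the service answers with an HTTP 2xx/3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        return False

print(service_urls("YOUR_DROPLET_IP")["n8n"])  # http://YOUR_DROPLET_IP:5678
# After deployment, loop over service_urls(...) and call check(url)
# to report which services are up.
```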
- Navigate to n8n interface
- Import `n8n_workflows/ingestion_workflow.json`
- Activate the workflow
- Send documents via webhook: `POST http://YOUR_DROPLET_IP:5678/webhook/ingest-document`
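A document could be posted to the ingestion webhook like this. The payload field names (`filename`, `content`) are an assumption; adjust them to whatever the workflow's Webhook node actually expects.

```python
import json
import urllib.request

def build_ingest_request(host, filename, text):
    """Build a POST request for the ingestion webhook.

    The payload shape is an assumption -- match it to the fields the
    n8n Webhook node is configured to read.
    """
    url = f"http://{host}:5678/webhook/ingest-document"
    payload = {"filename": filename, "content": text}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_ingest_request("YOUR_DROPLET_IP", "handbook.txt", "document text here")
print(req.full_url)  # http://YOUR_DROPLET_IP:5678/webhook/ingest-document
# To actually send it (requires the deployed stack):
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status)
```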
- Import `n8n_workflows/qa_workflow.json`
- Activate the chat workflow
- Use the built-in chat interface to ask questions about your documents
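Under the hood, a QA workflow like this one assembles a prompt from the retrieved chunks before calling the LLM. The template below is purely illustrative; the actual prompt lives inside `qa_workflow.json` and may differ.

```python
def build_rag_prompt(question, chunks):
    """Combine retrieved chunks and the user question into one LLM prompt.

    Illustrative template -- the real prompt is defined in the n8n
    qa_workflow and may use different wording.
    """
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

print(build_rag_prompt("What is the refund policy?", ["Refunds within 30 days."]))
```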
```bash
# Clone repository
git clone https://github.com/YOUR_USERNAME/private-ai.git
cd private-ai

# Start services locally
docker-compose up -d

# Access local services
# n8n:      http://localhost:5678
# ChromaDB: http://localhost:8000
# Ollama:   http://localhost:11434
```

The platform includes two pre-built workflows:

- `ingestion_workflow.json`: Document processing pipeline
- `qa_workflow.json`: Question-answering interface
Customize these workflows in the n8n interface or create new ones for your specific use cases.
```bash
# SSH into your server
ssh root@YOUR_DROPLET_IP

# List available models
docker exec ollama_service ollama list

# Pull additional models
docker exec ollama_service ollama pull codellama:7b
docker exec ollama_service ollama pull mistral:7b
```

- All services run on a private Docker network
- Firewall configured to allow only necessary ports (22, 5678, 8000, 11434)
- Consider adding SSL/TLS certificates for production use
- ✅ No data leaves your infrastructure
- ✅ All processing happens locally
- ✅ No external API calls to OpenAI/Anthropic
- ✅ Full control over your data
```bash
# Enable SSL with Let's Encrypt
certbot --nginx -d yourdomain.com

# Set up backup automation
# Configure log rotation
# Implement monitoring with Prometheus/Grafana
# Set up automated security updates
```

```bash
# Check all services status
docker-compose ps

# View logs
docker-compose logs -f n8n
docker-compose logs -f ollama
docker-compose logs -f chroma
```

```bash
# Backup n8n workflows and data
tar -czf backup-n8n-$(date +%Y%m%d).tar.gz ./n8n_data

# Backup ChromaDB vector store
tar -czf backup-chroma-$(date +%Y%m%d).tar.gz ./chroma_data

# Backup Ollama models
tar -czf backup-ollama-$(date +%Y%m%d).tar.gz ./ollama_data
```

| Issue | Solution |
|---|---|
| Connection refused on port 5678 | Check if firewall allows port 5678: `ufw status` |
| Docker containers not starting | Check logs: `docker-compose logs SERVICE_NAME` |
| Out of disk space | Clean up: `docker system prune -a` |
| n8n workflow errors | Verify environment variables in `.env` file |
```bash
# Check system resources
df -h
free -h
docker stats

# Check service connectivity
curl http://localhost:5678
curl http://localhost:8000/api/v1/heartbeat
curl http://localhost:11434/api/tags
```

The automated deployment pipeline includes:
- Infrastructure Provisioning (Terraform)
- Environment Setup (Ubuntu + Docker)
- Application Deployment (Docker Compose)
- Service Verification (Health checks)
- Model Preparation (Ollama model download)
- Automatic: Push to `main` branch
- Manual: Workflow dispatch with custom commands
Edit `terraform/main.tf` to upgrade server size:

```hcl
resource "digitalocean_droplet" "private_ai_server" {
  size = "s-4vcpu-8gb"  # Upgrade from s-2vcpu-4gb
}
```

- GPU Support: Use GPU-enabled droplets for faster inference
- Load Balancing: Deploy multiple instances behind a load balancer
- Caching: Implement Redis for frequently accessed embeddings
- CDN: Use DigitalOcean Spaces for static content delivery
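The caching idea above can be prototyped in-process before introducing Redis: memoize embeddings by a hash of the input text so repeated ingestions or queries skip the expensive embedding call. `EmbeddingCache` is an illustrative sketch, not part of the repo.

```python
import hashlib

class EmbeddingCache:
    """In-process stand-in for a Redis embedding cache (illustrative)."""

    def __init__(self, embed_fn):
        self._embed = embed_fn  # e.g. a function that calls Ollama's embedding API
        self._store = {}

    def get(self, text):
        # Key by content hash so identical text never gets re-embedded.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._embed(text)
        return self._store[key]
```

Swapping `self._store` for a Redis client with the same get/set semantics would make the cache shared across instances, which matters once the stack runs behind a load balancer.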
This project uses a Custom Commercial License:
- ✅ FREE for: Personal use, education, research, and non-commercial projects
- 💰 PAID for: Commercial use, selling services, or creating commercial products
Perfect for businesses looking to:
- 🏢 Enterprise Document AI: Internal knowledge management systems
- 💡 SaaS Products: Build document processing services
- 🔧 Custom Solutions: White-label AI document platforms
- 📊 Consulting Services: AI implementation for clients
Ready to use this commercially? Let's discuss:
- 📧 Email: [YOUR_EMAIL]
- 💼 LinkedIn: [YOUR_LINKEDIN]
- 🐙 GitHub: Contact via issues or discussions
- 💰 Pricing: Flexible licensing based on your use case
Custom enterprise features available:
- Priority support and updates
- Custom integrations and modifications
- On-premise deployment assistance
- Training and consultation services
We welcome contributions! Please see our Contributing Guide for details.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- 📖 Documentation: Wiki
- 🐛 Bug Reports: Issues
- 💬 Discussions: GitHub Discussions
See the LICENSE file for the full licensing terms.
- n8n - Workflow automation platform
- Ollama - Local LLM inference engine
- ChromaDB - Vector database
- DigitalOcean - Cloud infrastructure
- Terraform - Infrastructure as code
🚀 Built with ❤️ for private, secure AI document processing
For enterprise support and custom implementations, please contact us.