🤖 Private Document AI Stack

Deploy Status License: Custom DigitalOcean Docker

Enterprise-grade private AI document processing platform: chat with your documents securely on your own infrastructure, powered by the Ollama LLM runtime, the ChromaDB vector store, and n8n automation workflows.

🎯 Project Overview

Private Document AI is a complete self-hosted solution that enables organizations to implement secure document AI capabilities without data leaving their infrastructure. Built with enterprise DevOps practices, this platform provides:

  • 🔒 Complete Data Privacy: All processing happens on your infrastructure
  • 🚀 Production-Ready Deployment: Automated CI/CD with infrastructure as code
  • 🔧 Extensible Workflows: Visual automation with n8n
  • ⚡ High Performance: Vector search with ChromaDB and local LLM inference
  • 📊 Scalable Architecture: Containerized microservices design

🎯 Architecture

System Components

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Client Web    │───▶│   n8n Platform  │───▶│  Ollama LLM     │
│   Interface     │    │   :5678         │    │  :11434         │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │                       │
                                ▼                       ▼
                       ┌─────────────────┐    ┌─────────────────┐
                       │   ChromaDB      │    │   Vector        │
                       │   :8000         │    │   Embeddings    │
                       └─────────────────┘    └─────────────────┘

Technology Stack

| Component | Technology | Purpose | Port |
|---|---|---|---|
| Orchestration | n8n | Workflow automation & UI | 5678 |
| LLM Engine | Ollama (Llama3 8B) | Text generation & embeddings | 11434 |
| Vector Database | ChromaDB | Document storage & similarity search | 8000 |
| Infrastructure | Terraform + DigitalOcean | Cloud provisioning | - |
| Containerization | Docker Compose | Service orchestration | - |
| CI/CD | GitHub Actions | Automated deployment | - |

Data Flow

  1. 📄 Document Ingestion: Upload → Text extraction → Chunking → Embedding → Vector storage
  2. 💬 Question Answering: Query → Embedding → Similarity search → Context retrieval → LLM response
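The two flows above can be sketched end to end in a few lines. This is a toy illustration, not the deployed pipeline: naive word-window chunking and a bag-of-words cosine similarity stand in for Ollama embeddings and ChromaDB's vector search, purely to show how chunks are stored and how the closest one is retrieved.

```python
import math
from collections import Counter

def chunk(text: str, size: int = 6, overlap: int = 2) -> list[str]:
    """Split text into overlapping word-window chunks."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Ingestion: chunk the document, "embed" each chunk, store the pairs.
doc = "Invoices are paid net 30. Refunds are processed within 5 business days."
store = [(c, embed(c)) for c in chunk(doc)]

# Question answering: embed the query, retrieve the most similar chunk,
# which would then be passed to the LLM as context.
query = "How quickly are refunds processed"
best_chunk, _ = max(store, key=lambda pair: cosine(embed(query), pair[1]))
print(best_chunk)  # the chunk mentioning refund processing
```

In the real stack the same roles are played by the n8n ingestion workflow (chunking), Ollama (embeddings and generation), and ChromaDB (storage and similarity search).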

🚀 Quick Start

Prerequisites

  • DigitalOcean account with API token
  • SSH key pair configured in DigitalOcean
  • GitHub repository with Actions enabled

🔑 SSH Key Setup Guide

Before deploying, you need to set up SSH keys for secure server access. Follow this step-by-step guide:

Step 1: Generate SSH Key Pair

# Generate a new SSH key pair specifically for this project
ssh-keygen -t ed25519 -C "private-ai-deployment" -f ~/.ssh/private-ai

# This creates two files:
# ~/.ssh/private-ai      (private key - keep secret!)
# ~/.ssh/private-ai.pub  (public key - safe to share)

Alternative for older systems:

# If ed25519 is not supported, use RSA
ssh-keygen -t rsa -b 4096 -C "private-ai-deployment" -f ~/.ssh/private-ai

Step 2: Add Public Key to DigitalOcean

  1. Display your public key:

    cat ~/.ssh/private-ai.pub
  2. Copy the output (starts with ssh-ed25519 or ssh-rsa)

  3. Add to DigitalOcean:

    • Go to DigitalOcean Control Panel
    • Navigate to Account → Security → SSH Keys
    • Click "Add SSH Key"
    • Paste your public key content
    • Name it: private-ai-deployment
    • Click "Add SSH Key"
  4. Get your SSH Key ID:

    # Method 1: Using doctl CLI (if installed)
    doctl compute ssh-key list
    
    # Method 2: Using curl
    curl -X GET \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer YOUR_DO_TOKEN" \
      "https://api.digitalocean.com/v2/account/keys"

    Copy the numeric ID (e.g., 12345678) - you'll need this for Terraform.

Step 3: Configure GitHub Secrets

  1. Get your private key content:

    cat ~/.ssh/private-ai
  2. Copy the entire content including -----BEGIN and -----END lines

  3. Add to GitHub Secrets:

    • Go to your repository → Settings → Secrets and variables → Actions
    • Click "New repository secret"
    • Name: SSH_PRIVATE_KEY
    • Value: Paste the entire private key content
    • Click "Add secret"

Step 4: Update Terraform Configuration

Edit terraform/main.tf and replace the SSH key ID:

resource "digitalocean_droplet" "private_ai_server" {
  # Replace with your SSH key ID from Step 2
  ssh_keys = ["12345678"]  # ← Replace this number
  # ... rest of configuration
}

🔒 Security Best Practices

  • ✅ Never commit private keys to Git
  • ✅ Use unique keys for each project
  • ✅ Store private keys securely
  • ✅ Regularly rotate SSH keys
  • ❌ Don't share private key content
  • ❌ Don't reuse personal SSH keys

πŸ› οΈ Troubleshooting SSH Setup

| Issue | Solution |
|---|---|
| Permission denied (publickey) | Verify the key is added to DigitalOcean and the ID is correct |
| Bad permissions error | Fix permissions: `chmod 600 ~/.ssh/private-ai` |
| Key not found in DO | Re-upload the public key to your DigitalOcean account |
| GitHub secret not working | Ensure the entire private key is copied, including headers |

✅ Verification Commands

# Test SSH connection to DigitalOcean (after deployment)
ssh -i ~/.ssh/private-ai root@YOUR_DROPLET_IP

# Check key permissions
ls -la ~/.ssh/private-ai*

# Verify public key format
ssh-keygen -l -f ~/.ssh/private-ai.pub

1. Fork & Configure

# Fork this repository
git clone https://github.com/YOUR_USERNAME/private-ai.git
cd private-ai

2. Set GitHub Secrets

Navigate to your repository → Settings → Secrets and variables → Actions:

| Secret Name | Description | Example |
|---|---|---|
| DIGITALOCEAN_TOKEN | DigitalOcean API token | dop_v1_xxxxx |
| SSH_PRIVATE_KEY | Private SSH key content | -----BEGIN OPENSSH PRIVATE KEY----- |

3. Update Terraform Configuration

Edit terraform/main.tf and replace the SSH key ID:

resource "digitalocean_droplet" "private_ai_server" {
  ssh_keys = ["YOUR_SSH_KEY_ID"]  # Replace with your key ID
  # ... rest of configuration
}

4. Deploy

git add .
git commit -m "feat: configure deployment secrets"
git push origin main

🎉 That's it! GitHub Actions will automatically:

  • Provision DigitalOcean infrastructure
  • Install and configure all services
  • Deploy the complete AI stack

💻 Usage

Access Your Platform

After deployment completes, access your services:

  • 📊 n8n Interface: http://YOUR_DROPLET_IP:5678
  • 🔍 ChromaDB API: http://YOUR_DROPLET_IP:8000
  • 🤖 Ollama API: http://YOUR_DROPLET_IP:11434

Document Ingestion

  1. Navigate to n8n interface
  2. Import n8n_workflows/ingestion_workflow.json
  3. Activate the workflow
  4. Send documents via webhook: POST http://YOUR_DROPLET_IP:5678/webhook/ingest-document
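The webhook call in step 4 can also be scripted. A minimal Python sketch, with caveats: the IP is a placeholder, and the `filename`/`content` payload fields are assumptions of mine — match them to whatever the webhook node in ingestion_workflow.json actually expects.

```python
import json
import urllib.request

DROPLET_IP = "203.0.113.10"  # placeholder; use your droplet's IP

# Hypothetical payload shape -- align these field names with the
# webhook node configured in ingestion_workflow.json.
payload = {
    "filename": "policy.txt",
    "content": "Refunds are processed within 5 business days.",
}

# Build the POST request against the ingestion webhook.
req = urllib.request.Request(
    f"http://{DROPLET_IP}:5678/webhook/ingest-document",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = urllib.request.urlopen(req)  # uncomment to actually send
```

The commented-out `urlopen` call is left disabled so the snippet can be read and adapted without a live server.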

Chat with Documents

  1. Import n8n_workflows/qa_workflow.json
  2. Activate the chat workflow
  3. Use the built-in chat interface to ask questions about your documents

🔧 Development & Customization

Local Development

# Clone repository
git clone https://github.com/YOUR_USERNAME/private-ai.git
cd private-ai

# Start services locally
docker-compose up -d

# Access local services
# n8n: http://localhost:5678
# ChromaDB: http://localhost:8000
# Ollama: http://localhost:11434

Workflow Customization

The platform includes two pre-built workflows:

  • ingestion_workflow.json: Document processing pipeline
  • qa_workflow.json: Question-answering interface

Customize these workflows in the n8n interface or create new ones for your specific use cases.

LLM Model Management

# SSH into your server
ssh root@YOUR_DROPLET_IP

# List available models
docker exec ollama_service ollama list

# Pull additional models
docker exec ollama_service ollama pull codellama:7b
docker exec ollama_service ollama pull mistral:7b

🔒 Security Considerations

Network Security

  • All services run on a private Docker network
  • Firewall configured to allow only necessary ports (22, 5678, 8000, 11434)
  • Consider adding SSL/TLS certificates for production use

Data Privacy

  • ✅ No data leaves your infrastructure
  • ✅ All processing happens locally
  • ✅ No external API calls to OpenAI/Anthropic
  • ✅ Full control over your data

Recommended Production Hardening

# Enable SSL with Let's Encrypt
certbot --nginx -d yourdomain.com

# Set up backup automation
# Configure log rotation
# Implement monitoring with Prometheus/Grafana
# Set up automated security updates

📊 Monitoring & Maintenance

Service Health Checks

# Check all services status
docker-compose ps

# View logs
docker-compose logs -f n8n
docker-compose logs -f ollama
docker-compose logs -f chroma

Backup Strategy

# Backup n8n workflows and data
tar -czf backup-n8n-$(date +%Y%m%d).tar.gz ./n8n_data

# Backup ChromaDB vector store
tar -czf backup-chroma-$(date +%Y%m%d).tar.gz ./chroma_data

# Backup Ollama models
tar -czf backup-ollama-$(date +%Y%m%d).tar.gz ./ollama_data

πŸ› οΈ Troubleshooting

Common Issues

| Issue | Solution |
|---|---|
| Connection refused on port 5678 | Check that the firewall allows port 5678: `ufw status` |
| Docker containers not starting | Check logs: `docker-compose logs SERVICE_NAME` |
| Out of disk space | Clean up: `docker system prune -a` |
| n8n workflow errors | Verify environment variables in the `.env` file |

Debug Commands

# Check system resources
df -h
free -h
docker stats

# Check service connectivity
curl http://localhost:5678
curl http://localhost:8000/api/v1/heartbeat
curl http://localhost:11434/api/tags

🔄 CI/CD Pipeline

The automated deployment pipeline includes:

  1. Infrastructure Provisioning (Terraform)
  2. Environment Setup (Ubuntu + Docker)
  3. Application Deployment (Docker Compose)
  4. Service Verification (Health checks)
  5. Model Preparation (Ollama model download)

Pipeline Triggers

  • Automatic: Push to main branch
  • Manual: Workflow dispatch with custom commands

📈 Scaling & Performance

Vertical Scaling

Edit terraform/main.tf to upgrade server size:

resource "digitalocean_droplet" "private_ai_server" {
  size = "s-4vcpu-8gb"  # Upgrade from s-2vcpu-4gb
}

Performance Optimization

  • GPU Support: Use GPU-enabled droplets for faster inference
  • Load Balancing: Deploy multiple instances behind a load balancer
  • Caching: Implement Redis for frequently accessed embeddings
  • CDN: Use DigitalOcean Spaces for static content delivery
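The caching idea can be prototyped before reaching for Redis: memoize embeddings keyed by a hash of the input text, so repeated queries skip the model call. A minimal in-process sketch, with `fake_embed` standing in for a real Ollama embedding request (swap the dict for a Redis client in production):

```python
import hashlib

_cache: dict[str, list[float]] = {}
calls = 0  # counts how often the "model" is actually invoked

def fake_embed(text: str) -> list[float]:
    """Stand-in for a real embedding call (e.g. Ollama's embeddings endpoint)."""
    global calls
    calls += 1
    return [float(b) for b in hashlib.sha256(text.encode()).digest()[:4]]

def cached_embed(text: str) -> list[float]:
    """Return a cached embedding, computing it only on the first request."""
    key = hashlib.sha256(text.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = fake_embed(text)
    return _cache[key]

v1 = cached_embed("quarterly report")
v2 = cached_embed("quarterly report")  # served from cache, no model call
assert v1 == v2 and calls == 1
```

Hashing the text rather than using it directly as the key keeps cache keys short and fixed-length, which matters once the store is external.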

💼 Commercial Licensing & Business Opportunities

📋 License Overview

This project uses a Custom Commercial License:

  • ✅ FREE for: Personal use, education, research, and non-commercial projects
  • 💰 PAID for: Commercial use, selling services, or creating commercial products

🚀 Commercial Use Cases

Perfect for businesses looking to:

  • 🏢 Enterprise Document AI: Internal knowledge management systems
  • 💡 SaaS Products: Build document processing services
  • 🔧 Custom Solutions: White-label AI document platforms
  • 📊 Consulting Services: AI implementation for clients

💬 Get Commercial License

Ready to use this commercially? Let's discuss:

  • 📧 Email: [YOUR_EMAIL]
  • 💼 LinkedIn: [YOUR_LINKEDIN]
  • 🐙 GitHub: Contact via issues or discussions
  • 💰 Pricing: Flexible licensing based on your use case

Custom enterprise features available:

  • Priority support and updates
  • Custom integrations and modifications
  • On-premise deployment assistance
  • Training and consultation services

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Workflow

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is distributed under a custom license: free for personal, educational, research, and other non-commercial use, while commercial use requires a paid license. See the Commercial Licensing section above and the LICENSE file for details.

πŸ™ Acknowledgments


🚀 Built with ❤️ for private, secure AI document processing

For enterprise support and custom implementations, please contact us.
