AI-Cad RAG Backend Service

A Knowledge Base RAG (Retrieval-Augmented Generation) backend service built with Go, PostgreSQL with the pgvector extension, and Ollama. It answers questions over company FAQ documents by combining vector-embedding retrieval with LLM generation.

Features

  • 📝 Document Ingestion: Upload and automatically embed documents into vector database
  • 🔍 Semantic Search: Fast vector similarity search using HNSW index
  • 🤖 RAG Query: Intelligent question-answering with context retrieval
  • 🚀 High Performance: Clean architecture with Go for low latency
  • 🐳 Docker Ready: Full docker-compose setup with all dependencies

Tech Stack

  • Backend: Go 1.24+
  • Database: PostgreSQL 16 with pgvector extension
  • Vector Index: HNSW (Hierarchical Navigable Small World)
  • Embedding Model: nomic-embed-text (768 dimensions)
  • LLM Model: llama3.2
  • AI Runtime: Ollama
  • HTTP Router: gorilla/mux

Architecture

Client Request
    ↓
POST /api/query
    ↓
1. Generate embedding from query (Ollama: nomic-embed-text)
2. Vector similarity search (Postgres pgvector HNSW, cosine similarity)
3. Build context from top-K similar documents
4. Generate answer with LLM (Ollama: llama3.2)
    ↓
Return JSON response with answer + source documents

Quick Start

Prerequisites

  • Docker and Docker Compose
  • Go 1.24+ (for local development)
  • curl and jq (for testing)

1. Start Services

# Start all services (Postgres, Ollama, API)
docker-compose up -d

# Check logs
docker-compose logs -f

2. Pull Ollama Models

# Pull embedding model (nomic-embed-text)
docker exec -it arsys-ollama ollama pull nomic-embed-text

# Pull LLM model (llama3.2)
docker exec -it arsys-ollama ollama pull llama3.2

# Verify models are installed
docker exec -it arsys-ollama ollama list

3. Verify Services

# Check health endpoint
curl http://localhost:8080/api/health | jq

# Expected output:
# {
#   "status": "healthy",
#   "services": {
#     "database": "healthy",
#     "ollama": "healthy"
#   },
#   "timestamp": "2026-02-17T..."
# }

4. Seed Sample Data

# Seed Indonesian FAQ data
bash scripts/seed_data.sh

5. Test RAG Query

# Ask a question (Indonesian for "What are the company's working hours?")
curl -X POST http://localhost:8080/api/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "Jam kerja perusahaan apa?",
    "top_k": 5
  }' | jq

# Run full test suite
bash scripts/test_api.sh

API Endpoints

GET /api/health

Health check for all services.

Response:

{
  "status": "healthy",
  "services": {
    "database": "healthy",
    "ollama": "healthy"
  },
  "timestamp": "2026-02-17T10:30:00Z"
}

POST /api/ingest

Ingest a single document into the vector database.

Request:

{
  "content": "Jam kerja perusahaan adalah Senin-Jumat 09:00-17:00 WIB",
  "source": "hr_policy",
  "metadata": {
    "category": "working_hours",
    "language": "id"
  }
}

Response:

{
  "document_id": "123e4567-e89b-12d3-a456-426614174000",
  "message": "Document ingested successfully"
}

POST /api/ingest/batch

Ingest multiple documents at once.

Request:

{
  "documents": [
    {
      "content": "Document 1 content...",
      "source": "hr_policy",
      "metadata": {"category": "leave"}
    },
    {
      "content": "Document 2 content...",
      "source": "it_support",
      "metadata": {"category": "vpn"}
    }
  ]
}

Response:

{
  "document_ids": ["uuid1", "uuid2"],
  "count": 2,
  "message": "Batch ingestion completed"
}

POST /api/query

Main RAG endpoint - query the knowledge base.

Request:

{
  "query": "Bagaimana cara mengajukan cuti?",
  "top_k": 5
}

Response:

{
  "answer": "Untuk mengajukan cuti, silakan login ke portal HR...",
  "source_documents": [
    {
      "id": "uuid",
      "content": "Untuk mengajukan cuti, silakan...",
      "source": "hr_policy",
      "similarity": 0.89,
      "metadata": {"category": "leave"},
      "created_at": "2026-02-17T..."
    }
  ],
  "processing_time_ms": 1234
}

Configuration

Configuration is managed through environment variables. Copy .env.example to .env and adjust as needed.

Key Configuration Options

# Database
DB_HOST=localhost
DB_PORT=55432
DB_USER=arsys
DB_PASSWORD=arsys123
DB_NAME=arsys

# Ollama
OLLAMA_BASE_URL=http://localhost:11434
EMBEDDING_MODEL=nomic-embed-text
LLM_MODEL=llama3.2

# RAG Settings
RAG_TOP_K=5                      # Number of similar documents to retrieve
RAG_SIMILARITY_THRESHOLD=0.7     # Minimum similarity score (0-1)
RAG_MAX_CONTEXT_LENGTH=2000      # Max characters for LLM context

# Server
SERVER_PORT=8080
LOG_LEVEL=info

Development

Local Development

# Install dependencies
go mod download

# Run locally (without Docker)
go run ./cmd/api

# Build binary
make build

# Run binary
./bin/api

Project Structure

AI-Cad/
├── cmd/api/                    # Application entry point
├── internal/
│   ├── api/                    # HTTP layer
│   │   ├── handlers/          # Request handlers
│   │   ├── middleware/        # HTTP middleware
│   │   └── router.go          # Route configuration
│   ├── client/ollama/         # Ollama API client
│   ├── config/                # Configuration management
│   ├── models/                # Data models
│   ├── repository/            # Database layer
│   └── service/               # Business logic (RAG flow)
├── pkg/logger/                # Logging utility
├── migrations/                # Database migrations
├── scripts/                   # Helper scripts
├── docker-compose.yaml        # Docker services
├── Dockerfile                 # API container build
└── Makefile                   # Build commands

Available Make Commands

make help           # Show all available commands
make build          # Build Go binary
make run            # Run locally
make docker-up      # Start all Docker services
make docker-down    # Stop all Docker services
make docker-logs    # Follow Docker logs
make models-pull    # Pull Ollama models
make seed           # Seed sample data
make test-api       # Run API tests
make clean          # Clean build artifacts

Database Schema

Documents Table

CREATE TABLE documents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    content TEXT NOT NULL,
    embedding vector(768),              -- nomic-embed-text dimension
    metadata JSONB DEFAULT '{}'::jsonb,
    source VARCHAR(255),
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- HNSW index for fast vector similarity search
CREATE INDEX documents_embedding_idx ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

Index Parameters:

  • m = 16: Connections per layer (balance between recall and memory)
  • ef_construction = 64: Build quality (higher = better but slower)
  • vector_cosine_ops: Cosine similarity (standard for text embeddings)
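Against this schema, a typical top-K lookup looks roughly like the query below (`$1` is the 768-dimensional query embedding; the repository's actual query may differ):

```sql
-- <=> is pgvector's cosine-distance operator; similarity = 1 - distance.
-- The inner ORDER BY ... LIMIT form lets Postgres use the HNSW index;
-- the similarity threshold is then applied to the top-K candidates.
SELECT * FROM (
    SELECT id, content, source, metadata,
           1 - (embedding <=> $1) AS similarity
    FROM documents
    ORDER BY embedding <=> $1
    LIMIT 5                        -- RAG_TOP_K
) t
WHERE similarity >= 0.7;           -- RAG_SIMILARITY_THRESHOLD
```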

How It Works

1. Document Ingestion

When you ingest a document:

  1. Content is sent to Ollama for embedding generation (nomic-embed-text)
  2. Document + embedding + metadata stored in PostgreSQL
  3. HNSW index automatically updated for fast search

2. RAG Query Flow

When you query:

  1. Embedding: Query text → Ollama → 768-dimensional vector
  2. Search: Vector similarity search using HNSW index (cosine similarity)
  3. Retrieve: Top-K most similar documents above threshold
  4. Context: Build context string from retrieved documents
  5. Generate: Send context + query to LLM (llama3.2)
  6. Return: Answer + source documents + processing time

3. Vector Similarity

Using cosine similarity to find relevant documents:

  • Score range: 0 to 1 for typical text embeddings (1 = semantically identical, 0 = unrelated)
  • Default threshold: 0.7 (filters out low-relevance results)
  • HNSW provides approximate nearest neighbor search (fast with good recall)
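For intuition, cosine similarity is the normalized dot product of two embedding vectors. pgvector computes this natively via its `<=>` distance operator; the Go version below is purely illustrative:

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between a and b:
// 1 for identical directions, 0 for orthogonal (or zero) vectors.
func cosineSimilarity(a, b []float64) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosineSimilarity([]float64{1, 0}, []float64{1, 0})) // identical directions
	fmt.Println(cosineSimilarity([]float64{1, 0}, []float64{0, 1})) // orthogonal vectors
}
```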

Troubleshooting

Services Won't Start

# Check service status
docker-compose ps

# View logs
docker-compose logs postgres
docker-compose logs ollama
docker-compose logs api

# Restart services
docker-compose restart

Ollama Models Not Found

# Pull models manually
docker exec -it arsys-ollama ollama pull nomic-embed-text
docker exec -it arsys-ollama ollama pull llama3.2

# Check if models are loaded
docker exec -it arsys-ollama ollama list

Database Connection Issues

# Check PostgreSQL is ready
docker exec -it arsys-postgres pg_isready -U arsys

# Verify pgvector extension
docker exec -it arsys-postgres psql -U arsys -d arsys -c "\dx"

# Check documents table
docker exec -it arsys-postgres psql -U arsys -d arsys -c "\d documents"

API Not Responding

# Check API logs
docker-compose logs -f api

# Verify health endpoint
curl http://localhost:8080/api/health

# Check if port is in use
netstat -an | grep 8080

Slow Query Response

  • First query is slow: Ollama loads models on first use (normal)
  • All queries slow: Check Ollama container resources, consider GPU
  • High similarity threshold: Lower RAG_SIMILARITY_THRESHOLD to get more results
  • Too many documents: HNSW index handles millions efficiently, but check RAM

Performance Tips

  1. Adjust HNSW parameters for your data size:

    • Small dataset (<10K docs): m=8, ef_construction=32
    • Medium (10K-100K): m=16, ef_construction=64 (default)
    • Large (>100K): m=32, ef_construction=128
  2. Tune RAG settings:

    • RAG_TOP_K: Higher = more context but slower
    • RAG_SIMILARITY_THRESHOLD: Lower = more results
    • RAG_MAX_CONTEXT_LENGTH: Balance between context and speed
  3. Ollama optimization:

    • Use GPU for faster inference
    • Keep models loaded (first query is slower)
    • Adjust timeout for large prompts

License

MIT License - feel free to use this project for learning or production!

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Support

For issues or questions, please open an issue on GitHub.


Built with ❤️ using Go, PostgreSQL pgvector, and Ollama

About

Arsys is an AI-powered DevOps assistant that ingests your codebase and logs, then uses RAG (Retrieval-Augmented Generation) to diagnose errors with pinpoint accuracy — referencing exact files, functions, and line numbers. Powered by Ollama + pgvector. Fully self-hosted.
