The Go-RAG-System is a complete, production-ready implementation of a Retrieval-Augmented Generation (RAG) pipeline, written entirely in Go. It combines the raw retrieval power of a PostgreSQL + pgvector vector database with the language generation capabilities of OpenAI GPT-4o and an advanced 17-pattern Agentic Architecture.
Unlike toy implementations, this system is designed to handle real-world complexity:
- It doesn't just generate text: it reasons, plans, executes tools, and self-corrects.
- It doesn't just store data: it indexes, retrieves, and augments with semantic similarity search using 1536-dimensional embeddings.
- It doesn't just run one AI call: it orchestrates multi-agent pipelines with full observability, security authorization, and human approval gates.
The RAG workflow runs through 6 distinct phases before a response is returned:
```
User Query
     │
     ▼
┌─────────────────────────────────────┐
│ Phase 1: Embedding                  │
│ text-embedding-3-small (1536d)      │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│ Phase 2: Vector DB Retrieval        │
│ PostgreSQL + pgvector               │
│ IVFFlat Cosine Similarity Search    │
│ → Returns Top K Document Chunks     │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│ Phase 3: Prompt Augmentation        │
│ Retrieved chunks + Original query   │
│ → Enriched context prompt           │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│ Phase 4: LLM Generation             │
│ GPT-4o processes enriched prompt    │
│ → Generates precise answer          │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│ Phase 5: Tool Calling (if needed)   │
│ LLM triggers Web Search / DB Query  │
│ → External data fetched & merged    │
└─────────────────┬───────────────────┘
                  │
                  ▼
┌─────────────────────────────────────┐
│ Phase 6: Agentic Post-Processing    │
│ Observability → Evaluation          │
│ → Reflection → Final Response       │
└─────────────────────────────────────┘
```
Before running the application, ensure your PostgreSQL instance has pgvector installed and execute the following initialization SQL:
```sql
-- Step 1: Enable the pgvector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Step 2: Create the documents table to store text chunks and their embeddings
CREATE TABLE documents (
    id         SERIAL PRIMARY KEY,
    content    TEXT NOT NULL,
    embedding  VECTOR(1536),  -- 1536 dimensions matches text-embedding-3-small
    metadata   JSONB,         -- Supports hybrid filtering (e.g., source, date)
    created_at TIMESTAMP DEFAULT NOW()
);

-- Step 3: Create an IVFFlat index for fast approximate nearest-neighbor search
-- 'lists' controls the number of cluster centroids (tune based on dataset size)
CREATE INDEX ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

-- Optional: HNSW index for higher recall at slightly more memory cost
-- CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
```

| Index Type | Recall | Speed | Memory | Best For |
|---|---|---|---|---|
| IVFFlat | ~95% | Very Fast | Low | Datasets < 1M rows |
| HNSW | ~99% | Fast | Higher | High-precision requirements |
| No Index (exact) | 100% | Slow | Minimal | Dev / < 100k rows |
- ACID Compliance: Your document store and relational data live in one reliable, transactional database, with no eventual-consistency issues.
- Hybrid Search: Combine vector similarity with SQL `WHERE` clauses. Example: "Find semantically similar documents, but only from the last 30 days, tagged as `compliance`".
- Familiar Tooling: Use standard PostgreSQL tooling (pgAdmin, psql, pg_dump) for backups, monitoring, and schema migrations.
- No Additional Infrastructure: Eliminates the need for a separate dedicated vector database (Pinecone, Weaviate, etc.), reducing cost and ops overhead.
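A minimal sketch of what hybrid search looks like from the Go side, assuming the `documents` schema above. `vectorLiteral` and the JSONB `tag` key are illustrative assumptions; the query string would be passed to `database/sql` with a Postgres driver:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// vectorLiteral renders a []float32 in pgvector's input format: "[0.1,0.2,...]".
func vectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, f := range v {
		parts[i] = strconv.FormatFloat(float64(f), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

// hybridSearchSQL shows the query shape: cosine-distance (<=>) ordering
// combined with ordinary SQL filters on the JSONB metadata column.
// The 'tag' key is a hypothetical metadata field for illustration.
const hybridSearchSQL = `
SELECT content
FROM documents
WHERE metadata->>'tag' = $1
  AND created_at > NOW() - INTERVAL '30 days'
ORDER BY embedding <=> $2::vector
LIMIT $3`

func main() {
	// In real code: db.Query(hybridSearchSQL, "compliance", vectorLiteral(queryVec), 3)
	fmt.Println(vectorLiteral([]float32{0.1, 0.2}))
}
```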
```
Go-RAG-System/
│
├── main.go               # Application entry point; wires all components
├── security.go           # Agent Security & Authorization (RBAC checks)
├── observability.go      # TraceID generation, structured logging, metrics
├── evaluation.go         # Response quality & safety scoring
├── agent_logic.go        # Reflection & Self-Correction loop
├── react_planner.go      # ReAct (Reason + Act) Planning Framework
├── human_in_the_loop.go  # HITL Approval Gates
├── memory.go             # Short-term & Long-term Agent Memory
├── single_agent.go       # Perceive → Reason → Act → Learn loop
├── db_query_agent.go     # NL2SQL: Natural Language to SQL
├── web_search_agent.go   # Autonomous Web Research (SerpAPI)
├── ml_models.go          # Dynamic AI Model Selection Engine
├── multi_agent.go        # Multi-Agent Coordinator (Planner + Executor)
├── tool_caller.go        # Tool & External API Calling Interface
├── rest_api.go           # REST API CRUD + Bulk Operations mapping
├── goal_based_agent.go   # BFS Goal-Based Pathfinding Agent
├── concurrency.go        # Goroutine-based Concurrent Task Execution
├── custom_errors.go      # AgentHallucinationError & structured exceptions
│
├── go.mod                # Go module definition
├── go.sum                # Dependency lock file
├── .env.example          # Environment variable template
├── .github/
│   └── workflows/
│       └── daily-contribution-sync.yml  # Automated daily maintenance
└── README.md
```
This repository implements every major pattern from modern agentic AI research and engineering:
1. Agent Observability (observability.go)
Every agent execution generates a pseudo-UUID TraceID. Structured log events are emitted at every stage (tool call attempts, LLM responses, evaluation scores), providing a complete, reproducible audit trail.
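A minimal sketch of the pattern, assuming nothing about the repository's exact format: a short random hex TraceID plus one structured log line per stage:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// newTraceID returns a short random hex ID, similar in spirit to the
// pseudo-UUID TraceIDs described above (the exact format is an assumption).
func newTraceID() string {
	b := make([]byte, 4)
	rand.Read(b) // crypto-quality randomness; error ignored for brevity
	return hex.EncodeToString(b)
}

// logEvent emits one structured line per pipeline stage and returns it
// so callers (or tests) can inspect the exact output.
func logEvent(traceID, stage, msg string) string {
	line := fmt.Sprintf("[TraceID: %s] %s: %s", traceID, stage, msg)
	fmt.Println(line)
	return line
}

func main() {
	id := newTraceID()
	logEvent(id, "embedding", "query embedded (1536 dims)")
	logEvent(id, "retrieval", "3 chunks returned")
}
```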
2. Agent Evaluation (evaluation.go)
After each response, scores are computed across four dimensions:
- Goal Achievement: Did the agent actually answer the question?
- Task Success Rate: How many sub-steps completed without error?
- Efficiency: Response latency relative to complexity.
- Safety Score: Did the response stay within ethical bounds?
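These dimensions can be modeled as a simple struct with a pass/fail gate. The 0-100 scale and the threshold below are illustrative assumptions, not the repository's exact values:

```go
package main

import "fmt"

// EvalScores mirrors the four dimensions above on an assumed 0-100 scale.
type EvalScores struct {
	GoalAchievement int
	TaskSuccessRate int
	Efficiency      int
	Safety          int
}

// Passed applies a simple gate: every dimension must clear the threshold.
func (e EvalScores) Passed(threshold int) bool {
	return e.GoalAchievement >= threshold &&
		e.TaskSuccessRate >= threshold &&
		e.Efficiency >= threshold &&
		e.Safety >= threshold
}

func main() {
	s := EvalScores{GoalAchievement: 95, TaskSuccessRate: 90, Efficiency: 80, Safety: 100}
	fmt.Println(s.Passed(75)) // passes with these illustrative scores
}
```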
3. Reflection & Self-Correction (agent_logic.go)
When evaluation fails, instead of raising an error, the agent enters a self-correction loop. It analyzes why it failed (e.g., "Irrelevant search results", "Hallucination detected"), updates its strategy ("Add metadata filter", "Use a different tool"), and retries.
```go
// The agent catches its own failure and corrects:
// Attempt 1: Failed → "Irrelevant results"
// Correction: "Narrow query to last 7 days using metadata filter"
// Attempt 2: Success
```

4. ReAct Framework (react_planner.go)
The agent actively reasons in a loop: Thought → Action → Observation → Thought. Each iteration brings it closer to the goal by incorporating real-world feedback from tool executions.
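A toy version of that loop, with both the reasoner and the tool stubbed. The `FINISH` convention and the single `search` action are assumptions made for illustration:

```go
package main

import "fmt"

// step records one Thought → Action → Observation iteration.
type step struct {
	Thought, Action, Observation string
}

// react loops until the (stubbed) reasoner decides it has enough evidence,
// or until maxIters is exhausted.
func react(goal string, maxIters int) []step {
	var trace []step
	observation := ""
	for i := 0; i < maxIters; i++ {
		thought := "need evidence for: " + goal
		action := "search"
		if observation != "" {
			thought = "evidence found, answer"
			action = "FINISH"
		}
		s := step{Thought: thought, Action: action}
		if action == "FINISH" {
			trace = append(trace, s)
			break
		}
		// Stubbed tool result; a real run would execute the search tool here.
		s.Observation = "doc says: refunds within 30 days"
		observation = s.Observation
		trace = append(trace, s)
	}
	return trace
}

func main() {
	for _, s := range react("refund policy", 5) {
		fmt.Printf("Thought=%q Action=%q Obs=%q\n", s.Thought, s.Action, s.Observation)
	}
}
```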
5. Agent Planning / Task Decomposition (react_planner.go)
High-level user goals (e.g., "Prepare a Q1 compliance report") are autonomously broken into executable sub-tasks before any action is taken, preventing cascading failures from ambiguous prompts.
6. Goal-Based Agent (goal_based_agent.go)
Uses BFS (Breadth-First Search) pathfinding to evaluate multiple action paths and select the optimal sequence to reach a desired end state, which is critical for multi-step workflows.
7. Agent Security & Authorization (security.go)
Enforces Principle of Least Privilege at the agent layer. Every tool call is checked against an RBAC table before execution.
```go
// Role: "agent" → Action: "execute" → APPROVED ✅
// Role: "agent" → Action: "delete"  → DENIED ❌ (fallback triggered)
```

8. Human-in-the-Loop (human_in_the_loop.go)
For destructive or irreversible actions, the agent halts and routes to a human approval gate. In production, this triggers a webhook notification to an admin interface and awaits a response.
9. Agent Memory (memory.go)
Dual-layer memory:
- Short-Term: An in-memory key-value store cleared at session end.
- Long-Term: PostgreSQL-backed persistent memory for cross-session recall.
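The short-term layer can be as simple as a map with a session-clear method; this sketch is modeled on the description above rather than copied from `memory.go` (the long-term, Postgres-backed layer would expose the same `Set`/`Get` surface):

```go
package main

import "fmt"

// ShortTermMemory is an in-memory key-value store cleared at session end.
type ShortTermMemory struct {
	data map[string]string
}

func NewShortTermMemory() *ShortTermMemory {
	return &ShortTermMemory{data: make(map[string]string)}
}

func (m *ShortTermMemory) Set(k, v string) { m.data[k] = v }

func (m *ShortTermMemory) Get(k string) (string, bool) {
	v, ok := m.data[k]
	return v, ok
}

// EndSession clears everything, matching the "cleared at session end" rule.
func (m *ShortTermMemory) EndSession() { m.data = make(map[string]string) }

func main() {
	mem := NewShortTermMemory()
	mem.Set("last_query", "refund policy")
	v, _ := mem.Get("last_query")
	fmt.Println(v)
	mem.EndSession()
}
```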
10. Tool Calling (tool_caller.go)
Exposes a structured JSON interface enabling the LLM to invoke real-world tools (calculators, weather APIs, custom REST endpoints), extending its capabilities far beyond its static training data.
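One way to sketch the dispatch side of that interface in Go, assuming an illustrative wire format (`name` plus `arguments`) rather than the repo's exact one:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall is the structured shape an LLM emits to invoke a tool.
// The field names here are illustrative assumptions.
type ToolCall struct {
	Name string            `json:"name"`
	Args map[string]string `json:"arguments"`
}

// dispatch parses a raw tool-call payload and routes it to a registered
// Go function, failing loudly on unknown tools.
func dispatch(raw []byte, tools map[string]func(map[string]string) string) (string, error) {
	var call ToolCall
	if err := json.Unmarshal(raw, &call); err != nil {
		return "", err
	}
	fn, ok := tools[call.Name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", call.Name)
	}
	return fn(call.Args), nil
}

func main() {
	tools := map[string]func(map[string]string) string{
		"weather": func(a map[string]string) string { return "22C in " + a["city"] },
	}
	out, _ := dispatch([]byte(`{"name":"weather","arguments":{"city":"Pretoria"}}`), tools)
	fmt.Println(out)
}
```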
11. Database Query Agent / NL2SQL (db_query_agent.go)
Translates natural language questions into optimized SQL queries, executes them against the PostgreSQL store, and synthesizes a human-readable response from the result set.
12. Web Search Agent (web_search_agent.go)
Connects to SerpAPI to retrieve real-time web results. Parses the JSON payload, extracts the top results, and synthesizes a grounded, citation-backed response.
13. Single Agent Workflow (single_agent.go)
A formalized Perceive → Reason → Act → Learn loop: the fundamental building block for all agentic systems.
14. Multi-Agent System (multi_agent.go)
A Planner Agent decomposes a complex goal and dispatches sub-tasks to specialized Executor Agents running concurrently via goroutines.
15. Concurrent Task Execution (concurrency.go)
Leverages native Go goroutines and `sync.WaitGroup` to execute multiple agent tasks in parallel, with no blocking and no deadlocks.
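The core pattern looks like this; sizing the results channel to the task count means no sender ever blocks, so `wg.Wait()` cannot deadlock:

```go
package main

import (
	"fmt"
	"sync"
)

// runTasks fans out one goroutine per task, waits for all of them,
// then drains the results. Task bodies here are trivial stubs.
func runTasks(tasks []string) []string {
	var wg sync.WaitGroup
	results := make(chan string, len(tasks)) // buffered: sends never block
	for _, t := range tasks {
		wg.Add(1)
		go func(task string) {
			defer wg.Done() // always release, even if the task body panics
			results <- "done: " + task
		}(t)
	}
	wg.Wait()
	close(results)
	var out []string
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	fmt.Println(runTasks([]string{"embed", "retrieve", "evaluate"}))
}
```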
16. Dynamic AI Model Selection (ml_models.go)
Selects the appropriate algorithmic approach (NLP, Neural Network, Linear Regression) at runtime based on the semantic classification of the task.
17. Custom Exceptions (custom_errors.go)
Implements AgentHallucinationError and structured exception types that carry rich context (trace ID, tool name, expected vs. actual values) enabling precise error recovery.
| Dependency | Version | Notes |
|---|---|---|
| Go | 1.22+ | Install from go.dev |
| PostgreSQL | 15+ | With pgvector extension installed |
| OpenAI API Key | n/a | Billing account required |
1. Clone the repository:

   ```bash
   git clone https://github.com/Raphasha27/Go-RAG-System.git
   cd Go-RAG-System
   ```

2. Install Go dependencies:

   ```bash
   go mod tidy
   ```

3. Configure environment variables:

   ```bash
   cp .env.example .env
   ```

   Edit `.env`:

   ```env
   DATABASE_URL="postgres://user:password@localhost:5432/ragdb?sslmode=disable"
   OPENAI_API_KEY="sk-your-openai-api-key"
   ```

4. Initialize the database:

   ```bash
   psql -U your_user -d ragdb -f schema.sql
   ```

5. Run the application:

   ```bash
   go run main.go
   ```

Expected output:
```
[TraceID: a1b2c3d4] RAG System Starting...
[TraceID: a1b2c3d4] Connecting to PostgreSQL...
[TraceID: a1b2c3d4] Embedding query: "What is our refund policy?" (1536 dims)
[TraceID: a1b2c3d4] Retrieved 3 context chunks from pgvector
[TraceID: a1b2c3d4] 🔧 Tool Call Triggered: web_search
[TraceID: a1b2c3d4] Agent Evaluation: Quality=95 Safety=100 ✅ PASSED
```
The system implements the following security controls at the agent layer:
| Control | Implementation | File |
|---|---|---|
| Privilege Limitation | RBAC: agents cannot execute `delete` or `write` | security.go |
| Audit Logging | Every tool call logged with TraceID | observability.go |
| Hallucination Detection | `AgentHallucinationError` thrown and caught | custom_errors.go |
| Human Override Gate | High-risk actions paused for approval | human_in_the_loop.go |
| Input Validation | All queries validated before vectorization | main.go |
```bash
# Run all tests
go test ./...

# Run with verbose output
go test ./... -v

# Run with the race detector (recommended before production)
go test ./... -race
```

Fix: Ensure the pgvector extension is installed in your PostgreSQL instance:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
-- If this fails, install via: sudo apt install postgresql-15-pgvector
```

Cause: A concurrent task may be holding a resource without releasing it.
Fix: Review `concurrency.go` and ensure every goroutine calls `defer wg.Done()`.

Fix: Add exponential backoff retry logic around the OpenAI chat-completion call. The current implementation returns nil on failure; a future PR will add automatic retry.

Cause: The `lists` parameter may be too small for your dataset size.
Fix: pgvector's guidance is roughly `lists = rows / 1000` for up to ~1M rows (and `sqrt(rows)` beyond that). For 1M rows, use `lists = 1000`.
- Add a REST HTTP server (using `net/http` or `gin`) to expose the system as a microservice
- Implement the HNSW index as a configuration option alongside IVFFlat
- Add streaming response support (Server-Sent Events)
- Integrate Weaviate as an alternative vector backend
- Build a CLI tool for document ingestion and index management
- Add Prometheus metrics endpoint for production monitoring
- Chat with your documents / PDFs: Index company policy documents, technical manuals, or legal contracts and query them in plain English.
- Enterprise Knowledge Base: Build a Confluence / SharePoint replacement that actually understands what you're asking.
- Customer Support Bots: Ground AI responses in your actual product documentation to eliminate hallucinations.
- Research Assistants: Index academic papers and ask cross-paper synthesis questions.
- Automated Compliance Reporting: Query regulatory documents and generate gap analysis reports.
| Aspect | Fine-Tuning | RAG |
|---|---|---|
| Cost | Thousands of dollars per run | One-time embedding cost |
| Freshness | Stale after training cutoff | Real-time via DB updates |
| Explainability | Black box | Traceable to source chunks |
| Domain Adaptation | Requires large labeled dataset | Works with any documents |
| Deployment | New model version per update | Index update only |
- Fork the repository.
- Create a feature branch: `git checkout -b feat/your-feature`
- Commit your changes with a conventional commit message: `git commit -m 'feat: add HNSW index support'`
- Push to the branch: `git push origin feat/your-feature`
- Open a Pull Request describing the change and its motivation.
MIT License. See LICENSE for details.
Part of the Future AGI ecosystem, built by Koketso Raphasha (Raphasha27). Kirov Dynamics Technology | Building the Infrastructure of Autonomous Systems.