# Building a Basic RAG Agent with GoodMem

## Overview

This tutorial will guide you through building a complete **Retrieval-Augmented Generation (RAG)** system using GoodMem's vector memory capabilities. By the end of this guide, you'll have a functional Q&A system that can:

- 🔍 **Semantically search** through your documents
- 📝 **Generate contextual answers** using retrieved information 
- 🏗️ **Scale to handle** large document collections

### What is RAG?

RAG combines the power of **retrieval** (finding relevant information) with **generation** (creating natural language responses). This approach allows AI systems to provide accurate, context-aware answers by:

1. **Retrieving** relevant documents from a knowledge base
2. **Augmenting** the query with this context
3. **Generating** a comprehensive answer using both the query and retrieved information
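
The three steps above can be sketched in a few lines of plain Go. This is a toy illustration only, not GoodMem's implementation: it ranks documents by word overlap instead of embeddings, and the names `scoreOverlap`, `retrieve`, and `augment` are hypothetical helpers invented for this sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// scoreOverlap counts how many distinct query words also appear in the document.
// Real RAG systems compare embedding vectors; word overlap keeps this sketch dependency-free.
func scoreOverlap(query, doc string) int {
	docWords := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(doc)) {
		docWords[w] = true
	}
	seen := map[string]bool{}
	score := 0
	for _, w := range strings.Fields(strings.ToLower(query)) {
		if docWords[w] && !seen[w] {
			seen[w] = true
			score++
		}
	}
	return score
}

// retrieve returns the topK documents with the highest overlap score (step 1).
func retrieve(query string, docs []string, topK int) []string {
	ranked := append([]string(nil), docs...) // copy so the caller's slice is not reordered
	sort.SliceStable(ranked, func(i, j int) bool {
		return scoreOverlap(query, ranked[i]) > scoreOverlap(query, ranked[j])
	})
	if topK > len(ranked) {
		topK = len(ranked)
	}
	return ranked[:topK]
}

// augment combines the retrieved passages and the question into one prompt (step 2).
func augment(query string, passages []string) string {
	return "Context:\n" + strings.Join(passages, "\n") + "\n\nQuestion: " + query
}

func main() {
	docs := []string{
		"Employees accrue vacation days every month.",
		"The API rate limit is 100 requests per minute.",
		"Passwords must be rotated every 90 days.",
	}
	query := "how many vacation days do employees get"
	prompt := augment(query, retrieve(query, docs, 1))
	// Step 3 (generation) would send this prompt to an LLM.
	fmt.Println(prompt)
}
```

In the rest of this tutorial, GoodMem takes over the retrieval step with real vector search.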

### Why GoodMem for RAG?

GoodMem provides enterprise-grade vector storage with:
- **Multiple embedder support** for optimal retrieval accuracy
- **Streaming APIs** for real-time responses
- **Advanced post-processing** with reranking and summarization
- **Scalable architecture** for production workloads


## Prerequisites

Before starting, ensure you have:

- ✅ **GoodMem server running** (install with: `curl -s https://get.goodmem.ai | bash`)
- ✅ **Go 1.18+** installed
- ✅ **API key** for your GoodMem instance
- ✅ **OpenAI API key** (for embedder and LLM in complete RAG demo)
- ✅ **Voyage AI API key** (for reranker in complete RAG demo)

## Installation & Setup

First, let's install the required packages:

In [1]:
// Fix for import
%env GOPROXY="https://go-proxy.fury.io/weisi/"
%env GOPRIVATE="github.com/PAIR-Systems-Inc/goodmem"
%env GOSUMDB=off
!* go get github.com/PAIR-Systems-Inc/goodmem/clients/go@v1.0.25
!*cat go.mod

Set: GOPROXY="https://go-proxy.fury.io/weisi/"
Set: GOPRIVATE="github.com/PAIR-Systems-Inc/goodmem"
Set: GOSUMDB="off"


go: added github.com/PAIR-Systems-Inc/goodmem/clients/go v1.0.25


module gonb_a94ab2e3

go 1.25.5

require github.com/PAIR-Systems-Inc/goodmem/clients/go v1.0.25 // indirect


In [2]:
import (
    "context"
    "encoding/json" // Used later to parse the chunking configuration
    "fmt"
    "log"
    "os"
    "time"
    "github.com/janpfeifer/gonb/cache"    // Used by gonb to persist variables across cells
    goodmem_client "github.com/PAIR-Systems-Inc/goodmem/clients/go"
)

// Helper functions for pointer creation
func PtrInt32(v int32) *int32 { return &v }
func PtrBool(v bool) *bool    { return &v }

## Authentication & Configuration

### Why This Matters

GoodMem uses API key authentication to secure your vector memory data. Proper configuration ensures:
- **Secure access** to your GoodMem instance
- **Isolated environments** (development, staging, production)
- **Usage tracking** and access control per API key

### What We'll Do

1. Configure the GoodMem host URL (where your server is running)
2. Set up API key authentication
3. Verify the configuration is correct

### Configuration Options

- **Local development**: `http://localhost:8080` (default)
- **Remote/production**: Your deployed GoodMem URL
- **Environment variables**: Best practice for managing credentials
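
The client configuration used below takes the host and the scheme as separate values, while deployed endpoints are usually written as one URL. A minimal sketch of splitting such a URL with the standard library follows; `splitEndpoint` is a hypothetical helper for illustration, not part of the GoodMem client.

```go
package main

import (
	"fmt"
	"net/url"
)

// splitEndpoint splits a full URL such as "http://localhost:8080" into its
// scheme ("http") and host ("localhost:8080") parts.
// It rejects values that lack either part, for example a bare "localhost:8080".
func splitEndpoint(raw string) (string, string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", "", fmt.Errorf("endpoint %q must include a scheme and a host", raw)
	}
	return u.Scheme, u.Host, nil
}

func main() {
	scheme, host, err := splitEndpoint("http://localhost:8080")
	if err != nil {
		fmt.Println("invalid endpoint:", err)
		return
	}
	fmt.Printf("Scheme=%s Host=%s\n", scheme, host)
}
```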

Let's configure our GoodMem client and test the connection:

In [3]:
// Configuration - Update these values for your setup
var (
    GOODMEM_HOST    = getEnv("GOODMEM_HOST", "localhost:8080")
    GOODMEM_API_KEY = getEnv("GOODMEM_API_KEY", "")
)

func getEnv(key, defaultValue string) string {
    if value := os.Getenv(key); value != "" {
        return value
    }
    return defaultValue
}

%%
fmt.Printf("GoodMem Host: %s\n", GOODMEM_HOST)
if GOODMEM_API_KEY == "" {
    fmt.Println("API Key configured: No - Please set the GOODMEM_API_KEY environment variable")
} else {
    fmt.Println("API Key configured: Yes")
}

GoodMem Host: localhost:8080
API Key configured: Yes


In [4]:
// Create GoodMem API client
func getClient() *goodmem_client.APIClient {
    configuration := goodmem_client.NewConfiguration()
    configuration.Host = GOODMEM_HOST
    configuration.Scheme = "http"
    configuration.DefaultHeader["X-API-Key"] = GOODMEM_API_KEY
    client := goodmem_client.NewAPIClient(configuration)
    return client
}

%%
client := getClient()
ctx := context.Background()
// Test connection by listing spaces
listResponse, httpResp, err := client.SpacesAPI.ListSpaces(ctx).Execute()
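if err != nil {
    // httpResp is nil when the request never reached the server (e.g. connection refused)
    status := 0
    if httpResp != nil {
        status = httpResp.StatusCode
    }
    log.Fatalf("❌ Error connecting to GoodMem: %v (HTTP Status: %d)", err, status)
}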

fmt.Println("✅ Successfully connected to GoodMem!")
if listResponse.Spaces != nil {
    fmt.Printf("   Found %d existing spaces\n", len(listResponse.Spaces))
}

✅ Successfully connected to GoodMem!
   Found 0 existing spaces


## Creating an Embedder

### Why Embedders Matter

An **embedder** is the foundation of semantic search. It converts text into high-dimensional vectors (embeddings) that capture meaning:

```
Text: "vacation policy" → Vector: [0.23, -0.45, 0.67, ...]  (1536 dimensions)
```

These vectors enable:
- **Semantic similarity**: Find conceptually similar content, not just keyword matches
- **Context understanding**: Capture meaning beyond exact word matches
- **Efficient retrieval**: Fast vector comparisons using specialized indexes
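
To make "semantic similarity" concrete, the sketch below computes cosine similarity, a common way to compare two embedding vectors. The three-dimensional vectors are made-up toy values, and which distance metric GoodMem uses internally is not stated in this tutorial.

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between a and b:
// 1 means same direction, 0 means orthogonal, -1 means opposite.
func cosineSimilarity(a, b []float64) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, fmt.Errorf("vectors must be non-empty and of equal length")
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0, fmt.Errorf("cosine similarity is undefined for a zero vector")
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB)), nil
}

func main() {
	// Toy vectors standing in for real 1536-dimensional embeddings.
	vacation := []float64{0.9, 0.1, 0.0}
	timeOff := []float64{0.8, 0.2, 0.1}
	firewall := []float64{0.0, 0.1, 0.9}

	close, _ := cosineSimilarity(vacation, timeOff)
	far, _ := cosineSimilarity(vacation, firewall)
	fmt.Printf("vacation vs time off: %.3f\n", close)
	fmt.Printf("vacation vs firewall: %.3f\n", far)
}
```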

### The RAG Pipeline Flow

```
Documents → Embedder → Vector Storage → Semantic Search → Retrieved Context
```

### Choosing an Embedder

**OpenAI `text-embedding-3-small`** (what we'll use):
- ✅ **High quality**: Excellent for most use cases
- ✅ **Fast**: Low latency for real-time applications  
- ✅ **1536 dimensions**: Good balance of quality and storage
- ✅ **Cost-effective**: $0.02 per 1M tokens

**Other options**:
- **text-embedding-3-large**: Higher quality, 3072 dimensions, more expensive
- **Voyage AI**: Specialized for search, excellent retrieval performance
- **Cohere**: Good multilingual support
- **Local models**: HuggingFace sentence transformers for privacy/offline

### What We'll Do

1. Check if an embedder already exists
2. If not, create an OpenAI embedder with proper authentication
3. Verify the embedder is ready for use

**Note**: You'll need an OpenAI API key set in your environment variable `OPENAI_API_KEY`.

In [5]:
// Create OpenAI text-embedding-3-small embedder
func createOpenAIEmbedder() string {
    openaiApiKey := getEnv("OPENAI_API_KEY", "")
    if openaiApiKey == "" {
        fmt.Println("⚠️  OPENAI_API_KEY environment variable not set")
        fmt.Println("   Please set your OpenAI API key to create an embedder")
        return ""
    }
    
    client := getClient()
    ctx := context.Background()
    
    // Check if embedder already exists
    existingEmbedders, _, err := client.EmbeddersAPI.ListEmbedders(ctx).Execute()
    if err != nil {
        fmt.Printf("❌ Failed to list embedders: %v\n", err)
        return ""
    }
    for _, embedder := range existingEmbedders.Embedders {
        if embedder.ProviderType == "OPENAI" && embedder.ModelIdentifier == "text-embedding-3-small" {
            fmt.Printf("✅ OpenAI embedder already exists\n")
            fmt.Printf("   Display Name: %s\n", embedder.DisplayName)
            fmt.Printf("   Embedder ID: %s\n", embedder.EmbedderId)
            fmt.Printf("   Model: %s\n", embedder.ModelIdentifier)
            fmt.Printf("   Dimensionality: %d\n", embedder.Dimensionality)
            return embedder.EmbedderId
        }
    }
    
    // Create new embedder
    fmt.Println("📝 Creating new OpenAI text-embedding-3-small embedder...")
    
    // Create string variables for NullableString fields
    headerName := "Authorization"
    prefix := "Bearer "
    apiPath := "/embeddings"
    
    credentials := goodmem_client.EndpointAuthentication{
        Kind: goodmem_client.CREDENTIAL_KIND_API_KEY,
        ApiKey: &goodmem_client.ApiKeyAuth{
            InlineSecret: *goodmem_client.NewNullableString(&openaiApiKey),
            HeaderName:   *goodmem_client.NewNullableString(&headerName),
            Prefix:       *goodmem_client.NewNullableString(&prefix),
        },
    }
    
    embedderRequest := goodmem_client.EmbedderCreationRequest{
        DisplayName:         "OpenAI Text Embedding 3 Small",
        ProviderType:        "OPENAI",
        EndpointUrl:         "https://api.openai.com/v1",
        ModelIdentifier:     "text-embedding-3-small",
        Dimensionality:      1536,
        ApiPath:             *goodmem_client.NewNullableString(&apiPath),
        DistributionType:    goodmem_client.DistributionType("DENSE"),
        SupportedModalities: []goodmem_client.Modality{goodmem_client.Modality("TEXT")},
        Credentials:         &credentials,
    }
    
    newEmbedder, httpResp, err := client.EmbeddersAPI.CreateEmbedder(ctx).EmbedderCreationRequest(embedderRequest).Execute()
    if err != nil {
        fmt.Printf("❌ Failed to create embedder: %v (HTTP Status: %d)\n", err, httpResp.StatusCode)
        return ""
    }
    
    fmt.Printf("✅ Successfully created OpenAI embedder\n")
    fmt.Printf("   Display Name: %s\n", newEmbedder.DisplayName)
    fmt.Printf("   Embedder ID: %s\n", newEmbedder.EmbedderId)
    fmt.Printf("   Model: %s\n", newEmbedder.ModelIdentifier)
    fmt.Printf("   Dimensionality: %d\n", newEmbedder.Dimensionality)
    
    return newEmbedder.EmbedderId
}

%%
createOpenAIEmbedder()

📝 Creating new OpenAI text-embedding-3-small embedder...
✅ Successfully created OpenAI embedder
   Display Name: OpenAI Text Embedding 3 Small
   Embedder ID: 258ede3c-af3a-4e8e-87e1-87f5ff9a64be
   Model: text-embedding-3-small
   Dimensionality: 1536


## Creating Your First Space

### What is a Space?

A **Space** in GoodMem is a logical container for organizing related memories (documents). Think of it as a database or collection where you store and retrieve semantically similar content.

Each space has:
- **Associated embedders**: Which models convert text to vectors
- **Chunking configuration**: How documents are split into searchable pieces
- **Access controls**: Public or private, with permission management
- **Metadata labels**: For organization and filtering

### Use Cases for Multiple Spaces

You might create different spaces for:
- **By domain**: Technical docs, HR policies, product specs
- **By environment**: Development, staging, production
- **By customer**: Tenant-specific data in multi-tenant apps
- **By privacy level**: Public FAQ vs. internal knowledge base

### Why Chunking Matters

Whole documents are usually too large to embed and search effectively as single units. Chunking:
- **Improves relevance**: Match specific sections, not entire documents
- **Enables context**: Return focused chunks that answer specific questions  
- **Optimizes retrieval**: Process and compare smaller text segments

**Our chunking strategy**:
- **256 characters**: Short enough for focused context, long enough for meaning
- **25 character overlap**: Ensures concepts spanning chunk boundaries aren't lost
- **Hierarchical separators**: Split on paragraphs first, then sentences, then words
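
The effect of chunk size and overlap is easiest to see with a simplified splitter. The sketch below cuts text into fixed-size character windows and ignores separators entirely, so it is a deliberate simplification of the recursive strategy GoodMem applies; `chunkText` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkText splits text into windows of `size` characters where each window
// repeats the last `overlap` characters of the previous one.
func chunkText(text string, size, overlap int) ([]string, error) {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil, fmt.Errorf("need size > 0 and 0 <= overlap < size")
	}
	runes := []rune(text) // count characters, not bytes
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the final window already reached the end of the text
		}
	}
	return chunks, nil
}

func main() {
	text := strings.Repeat("a", 600)
	chunks, err := chunkText(text, 256, 25)
	if err != nil {
		fmt.Println("chunking failed:", err)
		return
	}
	for i, c := range chunks {
		fmt.Printf("chunk %d: %d characters\n", i+1, len([]rune(c)))
	}
}
```

With a 25-character overlap, each new chunk starts 231 characters after the previous one, so a sentence that straddles a boundary still appears intact in at least one chunk when it is short enough.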

### What We'll Do

1. List available embedders
2. Create a space with our embedder and chunking configuration
3. Add metadata labels for organization
4. Verify the space is ready

Let's create a space for our RAG demo:

In [6]:
// First, let's see what embedders are available
func getEmbedders() []goodmem_client.EmbedderResponse {
    client := getClient()
    ctx := context.Background()
    
    listResponse, httpResp, err := client.EmbeddersAPI.ListEmbedders(ctx).Execute()
    if err != nil {
        log.Fatalf("❌ Error connecting to GoodMem: %v (HTTP Status: %d)", err, httpResp.StatusCode)
    }
    return listResponse.Embedders
}

%%
availableEmbedders := getEmbedders()
fmt.Printf("📋 Available Embedders (%d):\n", len(availableEmbedders))
for i, embedder := range availableEmbedders {
    fmt.Printf("   %d. %s - %s\n", i+1, embedder.DisplayName, embedder.ProviderType)
    fmt.Printf("      Model: %s\n", embedder.ModelIdentifier)
    fmt.Printf("      ID: %s\n", embedder.EmbedderId)
    fmt.Println()
}

var defaultEmbedder *goodmem_client.EmbedderResponse
if len(availableEmbedders) > 0 {
    defaultEmbedder = &availableEmbedders[0]
    fmt.Printf("🎯 Using embedder: %s\n", defaultEmbedder.DisplayName)
} else {
    fmt.Println("⚠️  No embedders found. You may need to configure an embedder first.")
    fmt.Println("   Refer to the documentation: https://docs.goodmem.ai/docs/reference/cli/goodmem_embedder_create/")
}

📋 Available Embedders (1):
   1. OpenAI Text Embedding 3 Small - OPENAI
      Model: text-embedding-3-small
      ID: 258ede3c-af3a-4e8e-87e1-87f5ff9a64be

🎯 Using embedder: OpenAI Text Embedding 3 Small


In [7]:
// Execute to clear gonb cache on demoSpaceId
%%
cache.ResetKey("demoSpaceId")

In [8]:
// Create a space for our RAG demo
const SPACE_NAME = "RAG Demo Knowledge Base (Go)"

// Define chunking configuration that we'll reuse throughout the tutorial
func get_chunking_config() *goodmem_client.ChunkingConfiguration {
    jsonData := `
    {
        "recursive": {
            "chunkSize":           256,
            "chunkOverlap":        25,
            "separators":          ["\n\n", "\n", ". ", " ", ""],
            "keepStrategy":        "KEEP_END",
            "separatorIsRegex":    false,
            "lengthMeasurement":   "CHARACTER_COUNT"
        }
    }`

    var config goodmem_client.NullableChunkingConfiguration
    if err := json.Unmarshal([]byte(jsonData), &config); err != nil {
        log.Fatalf("❌ Invalid chunking configuration JSON: %v", err)
    }
    return config.Get()
}
var DEMO_CHUNKING_CONFIG = get_chunking_config()

func create_demo_space() string {
    client := getClient()
    ctx := context.Background()
    // Check if space already exists
    existingSpaces, _, err := client.SpacesAPI.ListSpaces(ctx).Execute()
    if err != nil {
        log.Fatalf("❌ Error listing spaces: %v", err)
    }
    var demoSpace *goodmem_client.Space
    
    for _, space := range existingSpaces.Spaces {
        if space.Name == SPACE_NAME {
            fmt.Printf("📁 Space '%s' already exists\n", SPACE_NAME)
            fmt.Printf("   Space ID: %s\n", space.SpaceId)
            fmt.Println("   To remove existing space, see https://docs.goodmem.ai/docs/reference/cli/goodmem_space_delete/")
            demoSpace = &space
            return demoSpace.SpaceId
        }
    }
    
    if demoSpace == nil {
        // Configure the space with the first available embedder
        embedders := getEmbedders()
        if len(embedders) == 0 {
            log.Fatal("❌ No embedders found; create an embedder before creating a space")
        }
        defaultEmbedder := embedders[0]
        spaceEmbedders := []goodmem_client.SpaceEmbedderConfig{
            {
                EmbedderId:              defaultEmbedder.EmbedderId,
                DefaultRetrievalWeight:  1.0,
            },
        }
    
        falseValue := false
        falseBool := goodmem_client.NewNullableBool(&falseValue)
        // Create space request
        createRequest := goodmem_client.SpaceCreationRequest{
            Name: SPACE_NAME,
            Labels: map[string]string{
                "purpose":      "rag-demo",
                "environment":  "tutorial",
                "content-type": "documentation",
            },
            SpaceEmbedders:          spaceEmbedders,
            PublicRead:              *falseBool,
            DefaultChunkingConfig:   DEMO_CHUNKING_CONFIG,
        }
        
        // Create the space
        newSpace, httpResp, err := client.SpacesAPI.CreateSpace(ctx).SpaceCreationRequest(createRequest).Execute()
        if err != nil {
            log.Fatalf("❌ Error creating space: %v (HTTP Status: %d)", err, httpResp.StatusCode)
        }
        
        demoSpace = newSpace
        
        fmt.Printf("✅ Created space: %s\n", newSpace.Name)
        fmt.Printf("   Space ID: %s\n", newSpace.SpaceId)
        fmt.Printf("   Embedders: %d\n", len(newSpace.SpaceEmbedders))
        if newSpace.Labels != nil {
            fmt.Printf("   Labels: %v\n", newSpace.Labels)
        }
        fmt.Println("   Chunking Config Saved: 256 chars with 25 overlap")
        fmt.Println("   💡 This chunking config will be reused for all memory creation!")
        return demoSpace.SpaceId
    }
    return ""
}

var demoSpaceId = cache.Cache("demoSpaceId", create_demo_space)

✅ Created space: RAG Demo Knowledge Base (Go)
   Space ID: 50a96fe3-280e-4efe-a4af-adb9103dc827
   Embedders: 1
   Labels: map[content-type:documentation environment:tutorial purpose:rag-demo]
   Chunking Config Saved: 256 chars with 25 overlap
   💡 This chunking config will be reused for all memory creation!


In [9]:
// Verify our space configuration
%%
if demoSpaceId != "" {
    client := getClient()
    ctx := context.Background()
    
    spaceDetails, httpResp, err := client.SpacesAPI.GetSpace(ctx, demoSpaceId).Execute()
    if err != nil {
        log.Fatalf("❌ Error getting space details: %v (HTTP Status: %d)", err, httpResp.StatusCode)
    }
    
    fmt.Println("🔍 Space Configuration:")
    fmt.Printf("   Name: %s\n", spaceDetails.Name)
    fmt.Printf("   Owner ID: %s\n", spaceDetails.OwnerId)
    fmt.Printf("   Public Read: %v\n", spaceDetails.PublicRead)
    fmt.Printf("   Created: %d\n", spaceDetails.CreatedAt)
    if spaceDetails.Labels != nil {
        fmt.Printf("   Labels: %v\n", spaceDetails.Labels)
    }
    
    fmt.Println("\n🤖 Associated Embedders:")
    for _, embedderAssoc := range spaceDetails.SpaceEmbedders {
        fmt.Printf("   Embedder ID: %s\n", embedderAssoc.EmbedderId)
        fmt.Printf("   Retrieval Weight: %.1f\n", embedderAssoc.DefaultRetrievalWeight)
    }
} else {
    fmt.Println("⚠️  No space available for the demo")
}

🔍 Space Configuration:
   Name: RAG Demo Knowledge Base (Go)
   Owner ID: cf5df949-31c6-4c54-af50-f8002107164e
   Public Read: false
   Created: 1765502292472
   Labels: map[content-type:documentation environment:tutorial purpose:rag-demo]

🤖 Associated Embedders:
   Embedder ID: 258ede3c-af3a-4e8e-87e1-87f5ff9a64be
   Retrieval Weight: 1.0


## Adding Documents to Memory

### The Document Processing Pipeline

When you add a document to GoodMem, it goes through several automated steps:

```
1. Ingestion → 2. Chunking → 3. Embedding → 4. Indexing → 5. Ready for Search
```

**What happens**:
1. **Ingestion**: Document content and metadata are stored
2. **Chunking**: Text is split according to your configuration (256 chars, 25 overlap)
3. **Embedding**: Each chunk is converted to a vector by your embedder
4. **Indexing**: Vectors are indexed for fast similarity search
5. **Status**: Document marked as `COMPLETED` and ready for retrieval

### Single vs. Batch Operations

**Single memory creation** (`CreateMemory`):
- ✅ Good for: Real-time ingestion, single documents
- ✅ Synchronous call that returns the new memory and its initial status right away (chunking and embedding continue in the background)
- ⚠️ Higher overhead for bulk operations

**Batch memory creation** (`BatchCreateMemory`):
- ✅ Good for: Bulk imports, initial setup, periodic updates
- ✅ Lower overhead, efficient for multiple documents
- ✅ Async processing - check status via `ListMemories`
- ⚠️ Takes longer to get individual status feedback
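
For large imports it can help to submit documents in several smaller batches rather than one huge request. The generic helper below is a sketch of that idea; `splitIntoBatches` and the batch size of 2 are assumptions for illustration, and this tutorial does not state whether GoodMem enforces a per-request limit.

```go
package main

import "fmt"

// splitIntoBatches groups items into consecutive slices of at most batchSize
// elements. The returned slices share memory with the input slice.
func splitIntoBatches[T any](items []T, batchSize int) [][]T {
	if batchSize <= 0 {
		return nil
	}
	var batches [][]T
	for start := 0; start < len(items); start += batchSize {
		end := start + batchSize
		if end > len(items) {
			end = len(items)
		}
		batches = append(batches, items[start:end])
	}
	return batches
}

func main() {
	files := []string{"a.txt", "b.txt", "c.txt", "d.txt", "e.txt"}
	for i, batch := range splitIntoBatches(files, 2) {
		// Each batch would become one batch-creation request.
		fmt.Printf("batch %d: %v\n", i+1, batch)
	}
}
```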

### Metadata Best Practices

Rich metadata helps with:
- **Filtering**: Retrieve specific document types
- **Source attribution**: Show users where information came from
- **Organization**: Group and manage related documents
- **Debugging**: Track ingestion methods and dates

### What We'll Do

1. Load sample documents from local files
2. Create one document using single memory creation (to demo the API)
3. Create remaining documents using batch operation (more efficient)
4. Monitor processing status until all documents are ready

We'll use sample company documents that represent common business use cases:

In [10]:
// Execute to clear gonb cache on sampleDocs
%%
cache.ResetKey("sampleDocs")

In [11]:
import (
    "io/ioutil"
    "path/filepath"
    "encoding/base64"
    "sort"
)

// Document structure
type Document struct {
    Filename    string
    Content     string
    ContentB64  string
    ContentType string
    IsBinary    bool
}

// loadSampleDocuments loads sample documents from the sample_documents directory.
//
// It discovers all files in the directory and handles:
//   - .txt files: read as plain text
//   - .pdf files: read as binary and base64-encoded
func loadSampleDocuments() []Document {
    documents := []Document{}
    sampleDir := "sample_documents"
    
    // Check if directory exists and read files
    files, err := ioutil.ReadDir(sampleDir)
    if err != nil {
        fmt.Printf("⚠️  Could not read directory %s: %v\n", sampleDir, err)
        return documents
    }
    
    // Sort files for consistent ordering
    sort.Slice(files, func(i, j int) bool {
        return files[i].Name() < files[j].Name()
    })
    
    for _, fileInfo := range files {
        if fileInfo.IsDir() {
            continue
        }
        
        filename := fileInfo.Name()
        fullPath := filepath.Join(sampleDir, filename)
        ext := filepath.Ext(filename)
        
        if ext == ".txt" {
            // Handle text files
            content, err := ioutil.ReadFile(fullPath)
            if err != nil {
                fmt.Printf("⚠️  Error reading %s: %v\n", filename, err)
                continue
            }
            
            documents = append(documents, Document{
                Filename:    filename,
                Content:     string(content),
                ContentType: "text/plain",
                IsBinary:    false,
            })
            
            fmt.Printf("📄 Loaded: %s (%d characters)\n", filename, len(content))
            
        } else if ext == ".pdf" {
            // Handle PDF files with base64 encoding
            binaryContent, err := ioutil.ReadFile(fullPath)
            if err != nil {
                fmt.Printf("⚠️  Error reading %s: %v\n", filename, err)
                continue
            }
            
            contentB64 := base64.StdEncoding.EncodeToString(binaryContent)
            
            documents = append(documents, Document{
                Filename:    filename,
                ContentB64:  contentB64,
                ContentType: "application/pdf",
                IsBinary:    true,
            })
            
            fmt.Printf("📄 Loaded: %s (%d bytes, base64: %d chars)\n", filename, len(binaryContent), len(contentB64))
            
        } else {
            fmt.Printf("⚠️  Skipping unsupported file type: %s\n", filename)
        }
    }
    
    return documents
}

// Load the documents
var sampleDocs = cache.Cache("sampleDocs", loadSampleDocuments)

%%
fmt.Printf("\n📚 Total documents loaded: %d\n", len(sampleDocs))

📄 Loaded: company_handbook.txt (2342 characters)
📄 Loaded: employee_handbook.pdf (399615 bytes, base64: 532820 chars)
📄 Loaded: product_faq.txt (4043 characters)
📄 Loaded: security_policy.txt (4211 characters)
📄 Loaded: technical_documentation.txt (2384 characters)

📚 Total documents loaded: 5


In [12]:
// Execute to clear gonb cache on memoryId
%%
cache.ResetKey("memoryId")

In [13]:
import "strings"

// Create the first memory individually to demonstrate single memory creation
func createMemory() string {
    createSingleMemory := func(spaceId string, document Document) (*goodmem_client.Memory, error) {
        // Create memory request - use appropriate content field based on binary flag
        memoryRequest := goodmem_client.MemoryCreationRequest{
            SpaceId:        spaceId,
            ContentType:    document.ContentType,
            ChunkingConfig: DEMO_CHUNKING_CONFIG,
            Metadata: map[string]interface{}{
                "filename":         document.Filename,
                "source":           "sample_documents",
                "ingestion_method": "single",
            },
        }
        
        // Add content field based on type
        if document.IsBinary {
            memoryRequest.OriginalContentB64 = *goodmem_client.NewNullableString(&document.ContentB64)
        } else {
            memoryRequest.OriginalContent = *goodmem_client.NewNullableString(&document.Content)
        }
        
        // Create the memory
        client := getClient()
        ctx := context.Background()
        memory, httpResp, err := client.MemoriesAPI.CreateMemory(ctx).MemoryCreationRequest(memoryRequest).Execute()
        if err != nil {
            return nil, fmt.Errorf("failed to create memory: %v (HTTP Status: %d)", err, httpResp.StatusCode)
        }
        
        fmt.Printf("✅ Created single memory: %s\n", document.Filename)
        fmt.Printf("   Memory ID: %s\n", memory.MemoryId)
        fmt.Printf("   Content Type: %s\n", document.ContentType)
        fmt.Printf("   Status: %s\n", memory.ProcessingStatus)
        fmt.Println()
        
        return memory, nil
    }
    
    var singleMemory *goodmem_client.Memory
    
    if len(sampleDocs) > 0 {
        firstDoc := sampleDocs[0]
        fmt.Println("📝 Creating first document using CreateMemory API:")
        fmt.Printf("   Document: %s\n", firstDoc.Filename)
        fmt.Printf("   Content Type: %s\n", firstDoc.ContentType)
        fmt.Println("   Method: Individual memory creation")
        fmt.Println()
        
        memory, err := createSingleMemory(demoSpaceId, firstDoc)
        if err != nil {
            fmt.Printf("⚠️  Single memory creation failed: %v\n", err)
        } else {
            singleMemory = memory
            fmt.Println("🎯 Single memory creation completed successfully!")
        }
    } else {
        fmt.Println("⚠️  Cannot create memory: missing space or documents")
    }
    // Avoid a nil pointer dereference when no memory was created
    if singleMemory == nil {
        return ""
    }
    return singleMemory.MemoryId
}

var memoryId = cache.Cache("memoryId", createMemory)

📝 Creating first document using CreateMemory API:
   Document: company_handbook.txt
   Content Type: text/plain
   Method: Individual memory creation

✅ Created single memory: company_handbook.txt
   Memory ID: c0a0b6fb-73c2-4a77-9c66-968dca22ce59
   Content Type: text/plain
   Status: PENDING

🎯 Single memory creation completed successfully!


In [14]:
%%
// Demonstrate retrieving a memory by ID using GetMemory
fmt.Println("📖 Retrieving memory details using GetMemory API:")
fmt.Printf("   Memory ID: %s\n", memoryId)
fmt.Println()

client := getClient()
ctx := context.Background()
// Retrieve the memory without content
retrievedMemory, httpResp, err := client.MemoriesAPI.GetMemory(ctx, memoryId).IncludeContent(false).Execute()
if err != nil {
    log.Fatalf("❌ Error retrieving memory: %v (HTTP Status: %d)", err, httpResp.StatusCode)
}

fmt.Println("✅ Successfully retrieved memory:")
fmt.Printf("   Memory ID: %s\n", retrievedMemory.MemoryId)
fmt.Printf("   Space ID: %s\n", retrievedMemory.SpaceId)
fmt.Printf("   Status: %s\n", retrievedMemory.ProcessingStatus)
fmt.Printf("   Content Type: %s\n", retrievedMemory.ContentType)
fmt.Printf("   Created At: %d\n", retrievedMemory.CreatedAt)
fmt.Printf("   Updated At: %d\n", retrievedMemory.UpdatedAt)

if retrievedMemory.Metadata != nil {
    fmt.Println("\n   📋 Metadata:")
    for key, value := range retrievedMemory.Metadata {
        fmt.Printf("      %s: %v\n", key, value)
    }
}

// Now retrieve with content included
fmt.Println("\n📖 Retrieving memory with content:")
retrievedWithContent, httpResp, err := client.MemoriesAPI.GetMemory(ctx, memoryId).IncludeContent(true).Execute()
if err != nil {
    log.Fatalf("❌ Error retrieving memory with content: %v (HTTP Status: %d)", err, httpResp.StatusCode)
}

if retrievedWithContent.OriginalContent.IsSet() {
    // Decode the base64 encoded content
    decodedContent, err := base64.StdEncoding.DecodeString(*retrievedWithContent.OriginalContent.Get())
    if err != nil {
        log.Fatalf("❌ Error decoding content: %v", err)
    }
    
    contentStr := string(decodedContent)
    fmt.Println("✅ Content retrieved and decoded:")
    fmt.Printf("   Content length: %d characters\n", len(contentStr))
    if len(contentStr) > 200 {
        fmt.Printf("   First 200 chars: %s...\n", contentStr[:200])
    } else {
        fmt.Printf("   Content: %s\n", contentStr)
    }
} else {
    fmt.Println("⚠️  No content available")
}

📖 Retrieving memory details using GetMemory API:
   Memory ID: c0a0b6fb-73c2-4a77-9c66-968dca22ce59

✅ Successfully retrieved memory:
   Memory ID: c0a0b6fb-73c2-4a77-9c66-968dca22ce59
   Space ID: 50a96fe3-280e-4efe-a4af-adb9103dc827
   Status: PENDING
   Content Type: text/plain
   Created At: 1765502298161
   Updated At: 1765502298161

   📋 Metadata:
      source: sample_documents
      filename: company_handbook.txt
      ingestion_method: single

📖 Retrieving memory with content:
✅ Content retrieved and decoded:
   Content length: 2342 characters
   First 200 chars: ACME Corporation Employee Handbook

Welcome to ACME Corporation! This handbook provides essential information about our company policies, procedures, and culture.

COMPANY OVERVIEW
ACME Corporation is...


In [15]:
// Create the remaining documents using batch memory creation
func createBatchMemories(spaceId string, documents []Document) error {
    var memoryRequests []goodmem_client.MemoryCreationRequest
    
    for _, doc := range documents {
        memoryRequest := goodmem_client.MemoryCreationRequest{
            SpaceId:        spaceId,
            ContentType:    doc.ContentType,
            ChunkingConfig: DEMO_CHUNKING_CONFIG,
            Metadata: map[string]interface{}{
                "filename":         doc.Filename,
                "source":           "sample_documents",
                "ingestion_method": "batch",
            },
        }
        
        // Add content field based on type
        if doc.IsBinary {
            memoryRequest.OriginalContentB64 = *goodmem_client.NewNullableString(&doc.ContentB64)
        } else {
            memoryRequest.OriginalContent = *goodmem_client.NewNullableString(&doc.Content)
        }
        
        memoryRequests = append(memoryRequests, memoryRequest)
    }
    
    // Create batch request
    batchRequest := goodmem_client.BatchMemoryCreationRequest{
        Requests: memoryRequests,
    }
    
    fmt.Printf("📦 Creating %d memories using BatchCreateMemory API:\n", len(memoryRequests))

    client := getClient()
    ctx := context.Background()
    // Execute batch creation
    httpResp, err := client.MemoriesAPI.BatchCreateMemory(ctx).BatchMemoryCreationRequest(batchRequest).Execute()
    if err != nil {
        return fmt.Errorf("batch creation failed: %v (HTTP Status: %d)", err, httpResp.StatusCode)
    }
    
    return nil
}

%%
if len(sampleDocs) > 1 {
    // Create the remaining documents (skip the first one we already created)
    remainingDocs := sampleDocs[1:]
    err := createBatchMemories(demoSpaceId, remainingDocs)
    if err != nil {
        fmt.Printf("⚠️  Batch creation error: %v\n", err)
    }
    
    fmt.Println("\n📋 Total Memory Creation Summary:")
    fmt.Println("   📄 Single CreateMemory: 1 document")
    fmt.Printf("   📦 Batch CreateMemory: %d documents submitted\n", len(remainingDocs))
    fmt.Println("   ⏳ Check processing status in the next cell")
} else {
    fmt.Println("⚠️  Cannot create batch memories: insufficient documents or missing space")
}

📦 Creating 4 memories using BatchCreateMemory API:

📋 Total Memory Creation Summary:
   📄 Single CreateMemory: 1 document
   📦 Batch CreateMemory: 4 documents submitted
   ⏳ Check processing status in the next cell


In [16]:
// List all memories in our space to verify they're ready
%%
client := getClient()
ctx := context.Background()
memoriesResponse, httpResp, err := client.MemoriesAPI.ListMemories(ctx, demoSpaceId).Execute()
if err != nil {
    log.Fatalf("❌ Failed to list memories: %v (HTTP Status: %d)", err, httpResp.StatusCode)
}

memories := memoriesResponse.Memories

fmt.Printf("📚 Memories in space '%s':\n", demoSpaceId)
fmt.Printf("   Total memories: %d\n", len(memories))
fmt.Println()

for i, memory := range memories {
    // Fall back to "Unknown" when metadata or the filename key is missing
    filename := "Unknown"
    if fn, ok := memory.Metadata["filename"]; ok {
        filename = fmt.Sprintf("%v", fn)
    }
    
    fmt.Printf("   %d. %s\n", i+1, filename)
    fmt.Printf("      Status: %s\n", memory.ProcessingStatus)
    fmt.Printf("      Created: %d\n", memory.CreatedAt)
    
    if memory.Metadata != nil {
        fmt.Println("      Metadata:")
        for key, value := range memory.Metadata {
            fmt.Printf("         %s: %v\n", key, value)
        }
    }
    fmt.Println()
}

📚 Memories in space '50a96fe3-280e-4efe-a4af-adb9103dc827':
   Total memories: 5

   1. security_policy.txt
      Status: PENDING
      Created: 1765502300413
      Metadata:
         ingestion_method: batch
         source: sample_documents
         filename: security_policy.txt

   2. product_faq.txt
      Status: PENDING
      Created: 1765502300413
      Metadata:
         filename: product_faq.txt
         ingestion_method: batch
         source: sample_documents

   3. technical_documentation.txt
      Status: PENDING
      Created: 1765502300413
      Metadata:
         source: sample_documents
         filename: technical_documentation.txt
         ingestion_method: batch

   4. employee_handbook.pdf
      Status: PENDING
      Created: 1765502300413
      Metadata:
         source: sample_documents
         filename: employee_handbook.pdf
         ingestion_method: batch

   5. company_handbook.txt
      Status: PENDING
      Created: 1765502298161
      Metadata:
         sou

In [17]:
// Monitor processing status for all created memories
func waitForProcessingCompletion(spaceId string, maxWaitSeconds int) bool {
    fmt.Println("⏳ Waiting for document processing to complete...")
    fmt.Println("   💡 Note: Batch memories are processed asynchronously, so we check by listing all memories in the space")
    fmt.Println()
    
    startTime := time.Now()
    maxWaitDuration := time.Duration(maxWaitSeconds) * time.Second

    client := getClient()
    ctx := context.Background()
    for time.Since(startTime) < maxWaitDuration {
        memoriesResponse, httpResp, err := client.MemoriesAPI.ListMemories(ctx, spaceId).Execute()
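        if err != nil {
            // httpResp can be nil when the request never reached the server
            status := 0
            if httpResp != nil {
                status = httpResp.StatusCode
            }
            fmt.Printf("❌ Error checking processing status: %v (HTTP Status: %d)\n", err, status)
            return false
        }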
        
        memories := memoriesResponse.Memories
        
        // Check processing status
        statusCounts := make(map[string]int)
        for _, memory := range memories {
            statusCounts[memory.ProcessingStatus]++
        }
        
        fmt.Printf("📊 Processing status: %v (Total: %d memories)\n", statusCounts, len(memories))
        
        // Check if all are completed
        allCompleted := true
        for _, memory := range memories {
            if memory.ProcessingStatus != "COMPLETED" {
                allCompleted = false
                break
            }
        }
        
        if allCompleted {
            fmt.Println("✅ All documents processed successfully!")
            return true
        }
        
        // Check for failures
        if failedCount, ok := statusCounts["FAILED"]; ok && failedCount > 0 {
            fmt.Printf("❌ %d memories failed processing\n", failedCount)
            return false
        }
        
        time.Sleep(5 * time.Second)
    }
    
    fmt.Printf("⏰ Timeout waiting for processing (waited %ds)\n", maxWaitSeconds)
    return false
}

%%
processingComplete := waitForProcessingCompletion(demoSpaceId, 120)

if processingComplete {
    fmt.Println("🎉 Ready for semantic search and retrieval!")
    fmt.Println("📈 Batch API benefit: Multiple documents submitted in a single API call")
    fmt.Println("🔧 Consistent chunking: All memories use DEMO_CHUNKING_CONFIG")
} else {
    fmt.Println("⚠️  Some documents may still be processing. You can continue with the tutorial.")
}

⏳ Waiting for document processing to complete...
   💡 Note: Batch memories are processed asynchronously, so we check by listing all memories in the space

📊 Processing status: map[COMPLETED:4 PENDING:1] (Total: 5 memories)
📊 Processing status: map[COMPLETED:5] (Total: 5 memories)
✅ All documents processed successfully!
🎉 Ready for semantic search and retrieval!
📈 Batch API benefit: Multiple documents submitted in a single API call
🔧 Consistent chunking: All memories use DEMO_CHUNKING_CONFIG


## Semantic Search & Retrieval

### Why Semantic Search?

**Traditional keyword search**:
- Matches exact words or simple variations
- Misses conceptually similar content with different wording
- Example: "vacation days" won't match "time off policy"

**Semantic search**:
- Understands meaning and context
- Finds conceptually similar content regardless of exact wording
- Example: "vacation days" successfully matches "time off policy"

### How It Works

```
Query: "vacation policy" 
   ↓ (embed with same embedder)
Query Vector: [0.23, -0.45, ...]
   ↓ (compare to all chunk vectors)
Most Similar Chunks: (by cosine similarity)
   1. "TIME OFF POLICY..." (score: -0.604)
   2. "Vacation requests..." (score: -0.544)
   3. "WORK HOURS..." (score: -0.458)
```

### Understanding Relevance Scores

The scores in this tutorial are **negative cosine similarity**, a distance-style metric (note that this differs from the textbook cosine distance, which is 1 minus the similarity):
- **Lower values = more relevant** (e.g., -0.6 is better than -0.4)
- **Range**: Typically -1.0 (most similar) to 0.0 (unrelated)
- **Rule of thumb**: Results below -0.3 are usually relevant
- **Context matters**: Exact scores vary by embedder and content
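
A threshold like this can be applied client-side after retrieval. The sketch below assumes the "lower is more relevant" convention described above; the `result` type, the `filterRelevant` helper, and the sample scores are all hypothetical and only illustrate the filtering logic.

```go
package main

import (
    "fmt"
    "sort"
)

type result struct {
    Text  string
    Score float64 // lower is more relevant
}

// filterRelevant keeps results whose score is at or below threshold and
// returns them sorted from most to least relevant.
func filterRelevant(results []result, threshold float64) []result {
    kept := make([]result, 0, len(results))
    for _, r := range results {
        if r.Score <= threshold {
            kept = append(kept, r)
        }
    }
    sort.SliceStable(kept, func(i, j int) bool { return kept[i].Score < kept[j].Score })
    return kept
}

func main() {
    // Made-up scores for illustration.
    results := []result{
        {Text: "chunk A", Score: -0.21},
        {Text: "chunk B", Score: -0.68},
        {Text: "chunk C", Score: -0.54},
    }
    for _, r := range filterRelevant(results, -0.3) {
        fmt.Printf("%.2f  %s\n", r.Score, r.Text)
    }
}
```

Because exact scores vary by embedder, a threshold is best tuned against a handful of known-good and known-bad queries rather than copied between projects.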

### Streaming API Benefits

GoodMem's streaming API:
- **Real-time results**: Process chunks as they arrive
- **Low latency**: Start showing results immediately
- **Memory efficient**: No need to buffer entire result set
- **Progressive UI**: Update interface as more results come in

### What We'll Do

1. Implement a semantic search function using GoodMem's streaming API
2. Process different event types (chunks, memories, metadata)
3. Display results with relevance scores
4. Test with various queries to see semantic matching in action

Now comes the exciting part! Let's perform semantic search using GoodMem's streaming API. This will:

- **Find relevant chunks** based on semantic similarity
- **Stream results** in real-time
- **Include relevance scores** for ranking
- **Return structured data** for easy processing

In [18]:
// ChunkResult represents a search result chunk
type ChunkResult struct {
    ChunkText      string
    RelevanceScore float64
    MemoryIndex    int32
    ResultSetID    string
    ChunkSequence  int32
}

// Perform semantic search using GoodMem's streaming API
func semanticSearch(query string, spaceId string, maxResults int32) []ChunkResult {
    fmt.Printf("🔍 Searching for: '%s'\n", query)
    fmt.Printf("📁 Space ID: %s\n", spaceId)
    fmt.Printf("📊 Max results: %d\n", maxResults)
    fmt.Println(strings.Repeat("-", 50))

    client := getClient()
    ctx := context.Background()
    // Create streaming client
    streamingClient := goodmem_client.NewStreamingClient(client)
    
    // Create stream request
    streamReq := &goodmem_client.MemoryStreamRequest{
        Message:            query,
        SpaceIDs:           []string{spaceId},
        RequestedSize:      PtrInt32(maxResults),
        FetchMemory:        PtrBool(true),
        FetchMemoryContent: PtrBool(false),
        GenerateAbstract:   PtrBool(false),
        Format:             goodmem_client.FormatNDJSON,
    }
    
    // Perform streaming search
    streamCtx, cancel := context.WithTimeout(ctx, 30*time.Second)
    defer cancel()
    
    stream, err := streamingClient.RetrieveMemoryStream(streamCtx, streamReq)
    if err != nil {
        fmt.Printf("❌ Failed to start streaming: %v\n", err)
        return nil
    }
    
    eventCount := 0
    var retrievedChunks []ChunkResult
    
    for event := range stream {
        eventCount++
        
        if event.RetrievedItem != nil && event.RetrievedItem.Chunk != nil {
            chunkInfo := event.RetrievedItem.Chunk
            chunkData := chunkInfo.Chunk
            
            var chunkText string
            var chunkSeq int32
            
            if text, ok := chunkData["chunkText"]; ok {
                chunkText = fmt.Sprintf("%v", text)
            }
            if seq, ok := chunkData["chunkSequenceNumber"]; ok {
                if seqFloat, ok := seq.(float64); ok {
                    chunkSeq = int32(seqFloat)
                }
            }
            
            result := ChunkResult{
                ChunkText:      chunkText,
                RelevanceScore: chunkInfo.RelevanceScore,
                MemoryIndex:    int32(chunkInfo.MemoryIndex),
                ResultSetID:    chunkInfo.ResultSetId,
                ChunkSequence:  chunkSeq,
            }
            retrievedChunks = append(retrievedChunks, result)
            
            fmt.Printf("📄 Chunk %d:\n", len(retrievedChunks))
            fmt.Printf("   Relevance: %.3f\n", chunkInfo.RelevanceScore)
            // Truncate on rune boundaries so multi-byte characters are not split
            displayText := chunkText
            if runes := []rune(displayText); len(runes) > 200 {
                displayText = string(runes[:200]) + "..."
            }
            fmt.Printf("   Text: %s\n", displayText)
            fmt.Println()
        }
    }
    
    fmt.Printf("✅ Search completed: %d chunks found, %d events processed\n", len(retrievedChunks), eventCount)
    return retrievedChunks
}

%%
// Test semantic search with a sample query
sampleQuery := "What is the vacation policy for employees?"
semanticSearch(sampleQuery, demoSpaceId, 5)

🔍 Searching for: 'What is the vacation policy for employees?'
📁 Space ID: 50a96fe3-280e-4efe-a4af-adb9103dc827
📊 Max results: 5
--------------------------------------------------
📄 Chunk 1:
   Relevance: -0.680
   Text: TIME OFF POLICY
All full-time employees receive:
- 15 days of paid vacation annually (increases to 20 days after 3 years)
- 10 sick days per year
- 8 company holidays
- Personal days as needed with ma...

📄 Chunk 2:
   Relevance: -0.675
   Text: 1.  Eligibility 

 
All regular full-time employees are eligible for vacation benefits. 

 
2.  Accrual 

 
Eligible employees accrue vacation in accordance with the following scheduleix: 

 
Years of...

📄 Chunk 3:
   Relevance: -0.662
   Text: [ORGANIZATION] has established the following vacation plan to provide eligible employees 
time off with pay so that they may be free from their regular duties for a period of rest and 
relaxation with...

📄 Chunk 4:
   Relevance: -0.646
   Text: Vacation Pay: Vacation pay shall be based 

In [19]:
// Let's try a few different queries to see how semantic search works
func testMultipleQueries(spaceId string) {
    testQueries := []string{
        "How do I reset my password?",
        "What are the security requirements for remote work?",
        "API authentication and rate limits",
        "Employee benefits and health insurance",
        "How much does the software cost?",
    }
    
    for i, query := range testQueries {
        fmt.Printf("\n🔍 Test Query %d: %s\n", i+1, query)
        fmt.Println(strings.Repeat("=", 60))
        
        semanticSearch(query, spaceId, 3)
        
        fmt.Println("\n" + strings.Repeat("-", 60))
    }
}

%%
testMultipleQueries(demoSpaceId)


🔍 Test Query 1: How do I reset my password?
🔍 Searching for: 'How do I reset my password?'
📁 Space ID: 50a96fe3-280e-4efe-a4af-adb9103dc827
📊 Max results: 3
--------------------------------------------------
📄 Chunk 1:
   Relevance: -0.370
   Text: password they use to gain access to computers or the Internet, as well as any change to 
such password.  Such notice must be made immediately. 

 
4. Compliance

📄 Chunk 2:
   Relevance: -0.363
   Text: - No reuse of last 12 passwords
- Must be changed every 90 days for privileged accounts
- Multi-factor authentication required for all business systems
- Password managers recommended for personal pas...

📄 Chunk 3:
   Relevance: -0.306
   Text: Each classification level has specific handling, storage, and transmission requirements outlined in our data handling procedures.

PASSWORD POLICY
Strong passwords are essential for system security:
-...

✅ Search completed: 3 chunks found, 8 events processed

----------------------------------------

## Advanced Features

Congratulations! 🎉 You've successfully built a semantic search system using GoodMem. Here's what you've accomplished:

### ✅ What You Built
- **Document ingestion pipeline** with automatic chunking and embedding
- **Semantic search system** with relevance scoring
- **Simple Q&A system** using GoodMem's vector capabilities

### 🚀 Next Steps for Advanced Implementation

#### Reranking
Improve search quality by adding a reranking stage. **Rerankers** are specialized models that re-score search results to improve relevance:

- **Two-stage retrieval**: Fast initial retrieval with embeddings, then precise reranking
- **Better relevance**: Rerankers use cross-attention to understand query-document relationships
- **Reduced costs**: Rerank only top-K results instead of entire corpus
- **Voyage AI reranker**: The `rerank-2.5` model, which we configure in the next section

The combination of fast embedding-based retrieval followed by accurate reranking provides the best balance of speed and quality for production RAG systems.

## Configuring a Reranker

To further improve search quality, we can add a **reranker** to our RAG pipeline. While embedders provide fast semantic search, rerankers use more sophisticated models to re-score the top results for better accuracy.

### Why Use Reranking?

1. **Higher Accuracy**: Rerankers use cross-encoder architectures that directly compare queries and documents
2. **Two-Stage Pipeline**: Fast retrieval with embeddings + precise reranking = optimal performance
3. **Cost Effective**: Only rerank top-K results (e.g., top 20) rather than entire corpus
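
The two-stage idea can be sketched independently of any particular model. In the example below, `twoStage`, `overlap`, and `density` are hypothetical names: the two scorers are toy word-matching functions standing in for an embedder and a cross-encoder reranker, and here a higher score means more relevant. The point is only that the expensive scorer runs on the shortlist of `k` candidates, not on the whole corpus.

```go
package main

import (
    "fmt"
    "sort"
    "strings"
)

type candidate struct {
    Text  string
    Score float64 // higher is more relevant in this sketch
}

// topK returns at most k candidates sorted by descending score.
func topK(cands []candidate, k int) []candidate {
    if k < 0 {
        k = 0
    }
    sorted := append([]candidate(nil), cands...)
    sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Score > sorted[j].Score })
    if k < len(sorted) {
        sorted = sorted[:k]
    }
    return sorted
}

// twoStage scores the whole corpus with cheap, keeps the best k, re-scores
// only those k with precise, and returns the best n.
func twoStage(query string, corpus []string, cheap, precise func(q, doc string) float64, k, n int) []candidate {
    stage1 := make([]candidate, 0, len(corpus))
    for _, doc := range corpus {
        stage1 = append(stage1, candidate{Text: doc, Score: cheap(query, doc)})
    }
    shortlist := topK(stage1, k)
    for i := range shortlist {
        shortlist[i].Score = precise(query, shortlist[i].Text)
    }
    return topK(shortlist, n)
}

// overlap counts query words that appear in doc (toy stand-in for embeddings).
func overlap(q, doc string) float64 {
    d := strings.ToLower(doc)
    n := 0.0
    for _, w := range strings.Fields(strings.ToLower(q)) {
        if strings.Contains(d, w) {
            n++
        }
    }
    return n
}

// density normalizes overlap by document length (toy stand-in for a reranker).
func density(q, doc string) float64 {
    words := len(strings.Fields(doc))
    if words == 0 {
        return 0
    }
    return overlap(q, doc) / float64(words)
}

func main() {
    corpus := []string{
        "Vacation policy: 15 days of paid vacation annually",
        "The cafeteria menu mentions a vacation themed policy lunch every other Friday in summer",
        "API rate limits and authentication",
    }
    for _, c := range twoStage("vacation policy", corpus, overlap, density, 2, 1) {
        fmt.Printf("%.3f  %s\n", c.Score, c.Text)
    }
}
```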

### Voyage AI Reranker

We'll use Voyage AI's `rerank-2.5` model, which provides:
- **State-of-the-art performance** on reranking benchmarks
- **Fast inference** optimized for production use
- **Simple API** that integrates seamlessly with GoodMem

**Note**: You'll need a Voyage AI API key set in your environment variable `VOYAGE_API_KEY`.

In [20]:
// Execute to clear gonb cache on voyageRerankerId
%%
cache.ResetKey("voyageRerankerId")

In [21]:
// Create Voyage AI rerank-2.5 reranker
func createVoyageReranker() string {
    voyageApiKey := getEnv("VOYAGE_API_KEY", "")
    if voyageApiKey == "" {
        fmt.Println("⚠️  VOYAGE_API_KEY environment variable not set")
        fmt.Println("   Please set your Voyage AI API key to create a reranker")
        return ""
    }
    
    client := getClient()
    ctx := context.Background()
    
    // Check if reranker already exists; if listing fails, fall through and try to create one
    existingRerankers, _, listErr := client.RerankersAPI.ListRerankers(ctx).Execute()
    if listErr != nil {
        fmt.Printf("⚠️  Could not list existing rerankers: %v\n", listErr)
    } else if existingRerankers != nil {
        for _, reranker := range existingRerankers.Rerankers {
            if reranker.ProviderType == "VOYAGE" && reranker.ModelIdentifier == "rerank-2.5" {
                fmt.Printf("✅ Voyage reranker already exists\n")
                fmt.Printf("   Display Name: %s\n", reranker.DisplayName)
                fmt.Printf("   Reranker ID: %s\n", reranker.RerankerId)
                fmt.Printf("   Model: %s\n", reranker.ModelIdentifier)
                return reranker.RerankerId
            }
        }
    }
    
    // Create new reranker
    fmt.Println("📝 Creating new Voyage AI rerank-2.5 reranker...")
    
    // Create string variables for NullableString fields
    headerName := "Authorization"
    prefix := "Bearer "
    apiPath := "/rerank"
    
    credentials := goodmem_client.EndpointAuthentication{
        Kind: goodmem_client.CREDENTIAL_KIND_API_KEY,
        ApiKey: &goodmem_client.ApiKeyAuth{
            InlineSecret: *goodmem_client.NewNullableString(&voyageApiKey),
            HeaderName:   *goodmem_client.NewNullableString(&headerName),
            Prefix:       *goodmem_client.NewNullableString(&prefix),
        },
    }
    
    rerankerRequest := goodmem_client.RerankerCreationRequest{
        DisplayName:     "Voyage Rerank 2.5",
        ProviderType:    "VOYAGE",
        EndpointUrl:     "https://api.voyageai.com/v1",
        ModelIdentifier: "rerank-2.5",
        ApiPath:         *goodmem_client.NewNullableString(&apiPath),
        Credentials:     &credentials,
    }
    
    newReranker, httpResp, err := client.RerankersAPI.CreateReranker(ctx).RerankerCreationRequest(rerankerRequest).Execute()
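    if err != nil {
        // httpResp can be nil when the request never reached the server
        status := 0
        if httpResp != nil {
            status = httpResp.StatusCode
        }
        fmt.Printf("❌ Failed to create reranker: %v (HTTP Status: %d)\n", err, status)
        return ""
    }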
    
    fmt.Printf("✅ Successfully created Voyage reranker\n")
    fmt.Printf("   Display Name: %s\n", newReranker.DisplayName)
    fmt.Printf("   Reranker ID: %s\n", newReranker.RerankerId)
    fmt.Printf("   Model: %s\n", newReranker.ModelIdentifier)
    
    return newReranker.RerankerId
}

var voyageRerankerId = cache.Cache("voyageRerankerId", createVoyageReranker)

📝 Creating new Voyage AI rerank-2.5 reranker...
✅ Successfully created Voyage reranker
   Display Name: Voyage Rerank 2.5
   Reranker ID: 950de400-8c06-44c9-98d9-9d7ce953afd9
   Model: rerank-2.5


## Registering an LLM

The final component in our RAG pipeline is the **LLM (Large Language Model)** - the generation component that creates natural language responses using the retrieved and reranked context.

### Role of LLMs in RAG

After retrieving and reranking relevant chunks, the LLM:
1. **Receives the query** and retrieved context
2. **Generates a response** that synthesizes information from multiple sources
3. **Maintains coherence** while staying grounded in the retrieved facts

### OpenAI GPT-4o-mini

We'll use OpenAI's `gpt-4o-mini` model, which provides:
- **Fast inference** with low latency for real-time applications
- **Cost-effective** pricing compared to larger models
- **High quality** responses suitable for most RAG use cases
- **Function calling** support for advanced workflows

**Note**: This uses the same `OPENAI_API_KEY` environment variable as the embedder.

In [22]:
// Execute to clear gonb cache on openaiLlmId
%%
cache.ResetKey("openaiLlmId")

In [23]:
// Register OpenAI GPT-4o-mini LLM
func createOpenAILlm() string {
    openaiApiKey := getEnv("OPENAI_API_KEY", "")
    if openaiApiKey == "" {
        fmt.Println("⚠️  OPENAI_API_KEY environment variable not set")
        fmt.Println("   Please set your OpenAI API key to register an LLM")
        return ""
    }
    
    client := getClient()
    ctx := context.Background()
    
    // Check if LLM already exists; if listing fails, fall through and try to register one
    existingLlms, _, listErr := client.LLMsAPI.ListLLMs(ctx).Execute()
    if listErr != nil {
        fmt.Printf("⚠️  Could not list existing LLMs: %v\n", listErr)
    } else if existingLlms != nil {
        for _, llm := range existingLlms.Llms {
            if llm.ProviderType == "OPENAI" && llm.ModelIdentifier == "gpt-4o-mini" {
                fmt.Printf("✅ OpenAI LLM already exists\n")
                fmt.Printf("   Display Name: %s\n", llm.DisplayName)
                fmt.Printf("   LLM ID: %s\n", llm.LlmId)
                fmt.Printf("   Model: %s\n", llm.ModelIdentifier)
                return llm.LlmId
            }
        }
    }
    
    // Create new LLM
    fmt.Println("📝 Registering new OpenAI GPT-4o-mini LLM...")
    
    // Create string variables for NullableString fields
    headerName := "Authorization"
    prefix := "Bearer "
    apiPath := "/chat/completions"
    
    credentials := goodmem_client.EndpointAuthentication{
        Kind: goodmem_client.CREDENTIAL_KIND_API_KEY,
        ApiKey: &goodmem_client.ApiKeyAuth{
            InlineSecret: *goodmem_client.NewNullableString(&openaiApiKey),
            HeaderName:   *goodmem_client.NewNullableString(&headerName),
            Prefix:       *goodmem_client.NewNullableString(&prefix),
        },
    }
    
    llmRequest := goodmem_client.LLMCreationRequest{
        DisplayName:     "OpenAI GPT-4o Mini",
        ProviderType:    "OPENAI",
        EndpointUrl:     "https://api.openai.com/v1",
        ModelIdentifier: "gpt-4o-mini",
        ApiPath:         *goodmem_client.NewNullableString(&apiPath),
        Capabilities:  goodmem_client.LLMCapabilities{
            SupportsChat:               *goodmem_client.NewNullableBool(PtrBool(true)),
            SupportsCompletion:         *goodmem_client.NewNullableBool(PtrBool(false)),
            SupportsFunctionCalling:    *goodmem_client.NewNullableBool(PtrBool(true)),
            SupportsSystemMessages:     *goodmem_client.NewNullableBool(PtrBool(true)),
            SupportsStreaming:          *goodmem_client.NewNullableBool(PtrBool(true)),
            SupportsSamplingParameters: *goodmem_client.NewNullableBool(PtrBool(true)),
        },
        Credentials: &credentials,
    }
    
    newLlm, httpResp, err := client.LLMsAPI.CreateLLM(ctx).LLMCreationRequest(llmRequest).Execute()
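    if err != nil {
        // httpResp can be nil when the request never reached the server
        status := 0
        if httpResp != nil {
            status = httpResp.StatusCode
        }
        fmt.Printf("❌ Failed to register LLM: %v (HTTP Status: %d)\n", err, status)
        return ""
    }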
    
    fmt.Printf("✅ Successfully registered OpenAI LLM\n")
    fmt.Printf("   Display Name: %s\n", newLlm.Llm.DisplayName)
    fmt.Printf("   LLM ID: %s\n", newLlm.Llm.LlmId)
    fmt.Printf("   Model: %s\n", newLlm.Llm.ModelIdentifier)
    
    return newLlm.Llm.LlmId
}

var openaiLlmId = cache.Cache("openaiLlmId", createOpenAILlm)

📝 Registering new OpenAI GPT-4o-mini LLM...
✅ Successfully registered OpenAI LLM
   Display Name: OpenAI GPT-4o Mini
   LLM ID: 49095c98-e6a0-4d0c-a25c-dfa946f5da49
   Model: gpt-4o-mini


## Enhanced RAG with Reranking and LLM Generation

Now that we have all the components configured (embedder, reranker, and LLM), let's use the complete RAG pipeline! This demonstrates the full power of GoodMem:

1. **Retrieval**: Fast semantic search finds relevant chunks
2. **Reranking**: Voyage AI reranker re-scores results for better relevance  
3. **Generation**: OpenAI GPT-4o-mini generates a coherent response using the reranked context

Compared with retrieval alone, this usually yields more relevant context and an answer the user can read directly.

In [24]:
// RagResult represents the complete RAG response
type RagResult struct {
    Answer         string
    RetrievedItems []RetrievedItem
    EventCount     int
}

type RetrievedItem struct {
    ChunkText      string
    RelevanceScore float64
    MemoryIndex    int32
}

// Complete RAG pipeline with streaming
func ragPipelineStreaming(query string, spaceId string, rerankerId string, llmId string, maxResults int32) *RagResult {
    // Validate the IDs passed in as arguments rather than the package-level variables
    if rerankerId == "" {
        fmt.Println("⚠️  Reranker not available")
        return nil
    }
    if llmId == "" {
        fmt.Println("⚠️  LLM not available")
        return nil
    }
    
    fmt.Printf("🔍 Query: %s\n", query)
    fmt.Println(strings.Repeat("=", 60))
    
    client := getClient()
    ctx := context.Background()
    streamingClient := goodmem_client.NewStreamingClient(client)
    
    // Create advanced streaming request with reranker and LLM
    streamReq := &goodmem_client.AdvancedMemoryStreamRequest{
        Message:             query,
        SpaceIDs:            []string{spaceId},
        RequestedSize:       PtrInt32(maxResults),
        FetchMemory:         PtrBool(true),
        FetchMemoryContent:  PtrBool(false),
        Format:              goodmem_client.FormatNDJSON,
        PostProcessorName:   "com.goodmem.retrieval.postprocess.ChatPostProcessorFactory",
        PostProcessorConfig: map[string]interface{}{
            "llm_id":              llmId,
            "reranker_id":         rerankerId,
            "relevance_threshold": 0.3,
            "max_results":         maxResults,
        },
    }
    
    // Start streaming
    streamCtx, cancel := context.WithTimeout(ctx, 60*time.Second)
    defer cancel()
    
    stream, err := streamingClient.RetrieveMemoryStreamAdvanced(streamCtx, streamReq)
    if err != nil {
        fmt.Printf("❌ Error during RAG: %v\n", err)
        return nil
    }
    
    result := &RagResult{
        RetrievedItems: []RetrievedItem{},
    }
    
    fmt.Println("\n📝 Generated Answer (streaming):")
    fmt.Println(strings.Repeat("-", 60))
    
    for event := range stream {
        result.EventCount++
        
        // Process retrieved chunks
        if event.RetrievedItem != nil && event.RetrievedItem.Chunk != nil {
            chunkInfo := event.RetrievedItem.Chunk
            chunkData := chunkInfo.Chunk
            
            var chunkText string
            if text, ok := chunkData["chunkText"]; ok {
                chunkText = fmt.Sprintf("%v", text)
            }
            
            result.RetrievedItems = append(result.RetrievedItems, RetrievedItem{
                ChunkText:      chunkText,
                RelevanceScore: chunkInfo.RelevanceScore,
                MemoryIndex:    int32(chunkInfo.MemoryIndex),
            })

            fmt.Printf("    Chunk %d: \n", len(result.RetrievedItems))
            fmt.Printf("    Relevance: %.3f\n", chunkInfo.RelevanceScore)
            // Truncate on rune boundaries so multi-byte characters are not split
            displayText := chunkText
            if runes := []rune(displayText); len(runes) > 150 {
                displayText = string(runes[:150]) + "..."
            }
            fmt.Printf("    Text: %s\n\n", displayText)
        }
        
        // Process LLM response chunks
        if event.AbstractReply != nil {
            result.Answer = event.AbstractReply.GetText()
            fmt.Printf("\n    LLM Generated Response:\n %s\n", result.Answer)
        }
    }
    
    fmt.Println("\n" + strings.Repeat("-", 60))
    fmt.Printf("\n✅ RAG Pipeline completed\n")
    fmt.Printf("   Retrieved items: %d\n", len(result.RetrievedItems))
    fmt.Printf("   Events processed: %d\n", result.EventCount)
    
    return result
}

%%
// Test the complete RAG pipeline
testQuery := "What is the vacation policy for employees?"
fmt.Printf("🚀 Testing Enhanced RAG Pipeline\n")
fmt.Printf("   Query: %s\n", testQuery)
fmt.Printf("   Using reranker: %s\n", voyageRerankerId)
fmt.Printf("   Using LLM: %s\n", openaiLlmId)
fmt.Println()

ragPipelineStreaming(
    testQuery,
    demoSpaceId,
    voyageRerankerId,
    openaiLlmId,
    10,
)

🚀 Testing Enhanced RAG Pipeline
   Query: What is the vacation policy for employees?
   Using reranker: 950de400-8c06-44c9-98d9-9d7ce953afd9
   Using LLM: 49095c98-e6a0-4d0c-a25c-dfa946f5da49

🔍 Query: What is the vacation policy for employees?

📝 Generated Answer (streaming):
------------------------------------------------------------
    Chunk 1: 
    Relevance: 0.863
    Text: TIME OFF POLICY
All full-time employees receive:
- 15 days of paid vacation annually (increases to 20 days after 3 years)
- 10 sick days per year
- 8 ...

    Chunk 2: 
    Relevance: 0.730
    Text: Vacation requests should be submitted at least 2 weeks in advance through the HR portal. Sick leave can be used for personal illness or to care for im...

    Chunk 3: 
    Relevance: 0.824
    Text: [ORGANIZATION] has established the following vacation plan to provide eligible employees 
time off with pay so that they may be free from their regula...

    Chunk 4: 
    Relevance: 0.777
    Text: employees can us

## 🎉 Congratulations! What You Built

You've successfully built a complete **Retrieval-Augmented Generation (RAG) system** using GoodMem! Let's recap what you accomplished.

### Components You Configured

| Component | Purpose | Function |
|-----------|---------|----------|
| **Embedder** | Convert text to vectors | Transform documents into semantic embeddings |
| **Space** | Organize and store documents | Logical container with chunking configuration |
| **Memories** | Store searchable content | Documents chunked and indexed for retrieval |
| **Reranker** | Improve search precision | Re-score results for better relevance |
| **LLM** | Generate natural language | Create coherent answers from retrieved context |

### The Complete RAG Pipeline

```
📄 Documents
   ↓ Chunking (256 chars, 25 overlap)
   ↓ Embedding (convert to vectors)
🗄️  Vector Storage (GoodMem Space)
   ↓ 
🔍 User Query
   ↓ Semantic Search (retrieve top-K)
   ↓ Reranking (re-score for precision)
   ↓ Context Selection (most relevant chunks)
🤖 LLM Generation (synthesize answer)
   ↓
✨ Natural Language Answer
```
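
The chunking step in the diagram can be illustrated with a fixed-size sliding window. This is a simplified sketch under stated assumptions: `chunkText` is a hypothetical helper that counts runes and ignores sentence or separator boundaries, whereas the chunker configured on the GoodMem space runs server-side and may split differently.

```go
package main

import "fmt"

// chunkText splits text into windows of at most size runes. Each window
// starts (size - overlap) runes after the previous one, so consecutive
// chunks share overlap runes. It returns nil for invalid parameters.
func chunkText(text string, size, overlap int) []string {
    if size <= 0 || overlap < 0 || overlap >= size {
        return nil
    }
    runes := []rune(text)
    step := size - overlap
    var chunks []string
    for start := 0; start < len(runes); start += step {
        end := start + size
        if end > len(runes) {
            end = len(runes)
        }
        chunks = append(chunks, string(runes[start:end]))
        if end == len(runes) {
            break // the last window reached the end of the text
        }
    }
    return chunks
}

func main() {
    text := "The quick brown fox jumps over the lazy dog"
    for i, c := range chunkText(text, 16, 4) {
        fmt.Printf("%d: %q\n", i, c)
    }
}
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some text twice.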

### Key Concepts You Learned

1. **Embedders**: Transform text into semantic vectors for similarity search
2. **Spaces**: Logical containers for organizing and searching documents
3. **Chunking**: Breaking documents into optimal sizes for retrieval
4. **Semantic Search**: Finding conceptually similar content, not just keyword matches
5. **Reranking**: Two-stage retrieval for better precision
6. **Streaming API**: Real-time, memory-efficient result processing
7. **RAG Architecture**: Combining retrieval and generation for accurate, grounded responses

### Performance Improvements

**Basic search** (retrieval only):
- Fast retrieval using vector similarity
- Good recall, but may include less relevant results

**Enhanced RAG** (with reranker + LLM):
- Reranker re-scores the shortlist, which typically improves precision
- LLM synthesizes information from multiple chunks
- Better user experience with natural language answers
- Answers are grounded in retrieved document content, which reduces (but does not eliminate) hallucinations

### Next Steps & Advanced Topics

**Enhance Your RAG System**:
- **Multiple embedders**: Combine different embedders for better coverage
- **Custom chunking**: Tune chunk size/overlap for your content type
- **Metadata filtering**: Add filters to narrow search by document type, date, etc.
- **Hybrid search**: Combine semantic and keyword search
- **Context augmentation**: Include surrounding chunks for better LLM context

**Production Deployment**:
- **Monitoring**: Track query latency, relevance scores, user feedback
- **Scaling**: Horizontal scaling for high-traffic applications
- **Cost optimization**: Balance quality vs. API costs
- **Caching**: Cache frequent queries for faster responses
- **Error handling**: Robust error handling and retry logic

**Advanced Features**:
- **Multi-space search**: Query across multiple knowledge bases
- **Query expansion**: Transform queries for better retrieval
- **Result aggregation**: Combine and deduplicate results
- **Streaming generation**: Progressive LLM responses for real-time UX
- **Fine-tuning**: Customize models for your specific domain

### Resources

- **Documentation**: [https://docs.goodmem.ai](https://docs.goodmem.ai)
- **Community**: Join discussions and share your implementations
- **Examples**: Explore more advanced use cases and patterns

---

**Great job!** You now have a solid foundation for building production RAG systems with GoodMem. 🚀
