A text RAG (Retrieval Augmented Generation) System based on Redis with REST API endpoints and MCP (Model Context Protocol) server support.
VectorMind is a lightweight vector database service that provides semantic search capabilities using Redis as the backend storage. It creates embeddings from text content and enables similarity-based search operations.
- Dual Interface: Exposes both REST API (port 8080) and MCP server (port 9090) for flexibility
- Vector Storage: Uses Redis with HNSW (Hierarchical Navigable Small World) indexing for efficient similarity search
- Embedding Support: Creates embeddings using, for example, the ai/mxbai-embed-large model
- Document Management: Store documents with optional labels and metadata
- Document Chunking: Automatically split long documents into overlapping chunks for better semantic search
- Similarity Search: Find similar documents based on text queries with configurable distance thresholds and label filtering
VectorMind consists of:
- Redis Server: Stores embeddings and provides vector search capabilities via RediSearch
- VectorMind Service: Go application that handles embedding generation and exposes APIs
- Embedding Model: Uses, for example, ai/mxbai-embed-large for text embeddings (configurable)
graph TD
CLIENT1[Client REST API]:::client
CLIENT2[Client MCP Protocol]:::client
CLIENT1 -->|HTTP POST<br/>/embeddings<br/>/search<br/>/chunk-and-store| API
CLIENT2 -->|MCP Protocol| MCP
subgraph "Docker Compose Environment"
subgraph "VectorMind Service - Ports: 9090, 8080"
MCP[MCP Server<br/>Port: 9090]:::mcpserver
API[REST API<br/>Port: 8080]:::restapi
VM[VectorMind Container]:::vectormind
MCP --> VM
API --> VM
end
VM -.->|MODEL_RUNNER_BASE_URL| MODEL
subgraph "AI Model Layer"
MODEL[Embedding Model<br/>ai/mxbai-embed-large]:::model
end
VM -.->|REDIS_ADDRESS<br/>depends_on| REDIS
subgraph "Storage Layer"
REDIS[(Redis Server<br/>Port: 6379)]:::redis
DATA[(/data Volume)]:::volume
REDIS --> DATA
end
end
classDef vectormind fill:#4A90E2,stroke:#2E5C8A,stroke-width:2px,color:#fff
classDef redis fill:#DC382D,stroke:#A72822,stroke-width:2px,color:#fff
classDef model fill:#10B981,stroke:#059669,stroke-width:2px,color:#fff
classDef mcpserver fill:#8B5CF6,stroke:#6D28D9,stroke-width:2px,color:#fff
classDef restapi fill:#F59E0B,stroke:#D97706,stroke-width:2px,color:#fff
classDef client fill:#6B7280,stroke:#4B5563,stroke-width:2px,color:#fff
classDef volume fill:#EC4899,stroke:#BE185D,stroke-width:2px,color:#fff
- Docker, Docker Model Runner and Docker Agentic Compose
- Using Docker Compose (recommended):
Create a compose.yml file with the following content:
services:
  redis-server:
    image: redis:8.2.3-alpine3.22
    environment:
      - REDIS_ARGS=--save 30 1
    ports:
      - 6379:6379
    volumes:
      - ./data:/data
  vectormind-tests:
    image: k33g/vectormind:0.0.3
    ports:
      - 9090:9090
      - 8080:8080
    environment:
      REDIS_INDEX_NAME: vectormind_index
      REDIS_ADDRESS: redis-server:6379
      REDIS_PASSWORD: ""
      MCP_HTTP_PORT: 9090
      API_REST_PORT: 8080
    models:
      embedding-model:
        endpoint_var: MODEL_RUNNER_BASE_URL
        model_var: EMBEDDING_MODEL
    depends_on:
      redis-server:
        condition: service_started
models:
  embedding-model:
    model: ai/mxbai-embed-large
Run the following command to start VectorMind:
docker compose up -d
This will start:
- Redis server on port 6379
- VectorMind MCP server on port 9090
- VectorMind REST API on port 8080
- Environment Variables:
The compose file automatically configures:
- REDIS_INDEX_NAME: vectormind_index
- REDIS_ADDRESS: redis-server:6379
- MCP_HTTP_PORT: 9090
- API_REST_PORT: 8080
- MODEL_RUNNER_BASE_URL: Set via the models configuration
- EMBEDDING_MODEL: ai/mxbai-embed-large
Check if VectorMind is running:
curl http://localhost:8080/health
Expected response:
{
"status": "healthy",
"server": "mcp-vectormind-server"
}
Get information about the embedding model being used:
curl http://localhost:8080/embedding-model-info
Response:
{
"success": true,
"model_id": "ai/mxbai-embed-large",
"dimension": 1024
}
This endpoint returns:
- success: Boolean indicating whether the request was successful
- model_id: The identifier of the embedding model being used
- dimension: The dimension of the embedding vectors
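A client may want to read the reported dimension before choosing a chunk size. The following is a minimal Go sketch that decodes the response shown above; the `ModelInfo` type and `parseModelInfo` helper are illustrative names and not part of VectorMind.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ModelInfo mirrors the fields of the example /embedding-model-info response.
// The type name is an assumption made for this sketch.
type ModelInfo struct {
	Success   bool   `json:"success"`
	ModelID   string `json:"model_id"`
	Dimension int    `json:"dimension"`
}

// parseModelInfo decodes a response body into a ModelInfo value.
func parseModelInfo(body []byte) (ModelInfo, error) {
	var info ModelInfo
	if err := json.Unmarshal(body, &info); err != nil {
		return ModelInfo{}, fmt.Errorf("decoding model info: %w", err)
	}
	return info, nil
}

func main() {
	// Body copied from the example response; a real client would read it
	// from an HTTP GET on /embedding-model-info.
	body := []byte(`{"success": true, "model_id": "ai/mxbai-embed-large", "dimension": 1024}`)
	info, err := parseModelInfo(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s produces %d-dimensional vectors\n", info.ModelID, info.Dimension)
}
```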
Store text content with optional labels and metadata:
curl -X POST http://localhost:8080/embeddings \
-H "Content-Type: application/json" \
-d '{
"content": "Squirrels run in the forest",
"label": "animals",
"metadata": "id=animals_1"
}'
curl -X POST http://localhost:8080/embeddings \
-H "Content-Type: application/json" \
-d '{
"content": "Birds fly in the sky",
"label": "animals",
"metadata": "id=animals_2"
}'
curl -X POST http://localhost:8080/embeddings \
-H "Content-Type: application/json" \
-d '{
"content": "Frogs swim in the pond",
"label": "animals",
"metadata": "id=animals_3"
}'
curl -X POST http://localhost:8080/embeddings \
-H "Content-Type: application/json" \
-d '{
"content": "Fishes swim in the sea",
"label": "animals",
"metadata": "id=animals_4"
}'
Response:
{"id":"doc:b1c36710-9d94-41cb-abfc-aa404b896d1f","content":"Squirrels run in the forest","label":"animals","metadata":"id=animals_1","created_at":"2025-11-09T08:36:01.962629337Z","success":true}
{"id":"doc:fbc259cc-eb8d-425e-a444-d4f5b26400cb","content":"Birds fly in the sky","label":"animals","metadata":"id=animals_2","created_at":"2025-11-09T08:36:02.093359462Z","success":true}
{"id":"doc:0154fc6d-887b-4af2-a5a7-c8b37183554f","content":"Frogs swim in the pond","label":"animals","metadata":"id=animals_3","created_at":"2025-11-09T08:36:02.247079753Z","success":true}
{"id":"doc:3953dfdd-2a92-48de-b61b-0119c9d106fc","content":"Fishes swim in the sea","label":"animals","metadata":"id=animals_4","created_at":"2025-11-09T08:36:02.367855295Z","success":true}
Find documents similar to a query text:
curl -X POST http://localhost:8080/search \
-H "Content-Type: application/json" \
-d '{
"text": "Which animals swim?",
"max_count": 3,
"distance_threshold": 0.7
}'
curl -X POST http://localhost:8080/search \
-H "Content-Type: application/json" \
-d '{
"text": "Where are the squirrels?",
"max_count": 3,
"distance_threshold": 0.7
}'
curl -X POST http://localhost:8080/search \
-H "Content-Type: application/json" \
-d '{
"text": "What can be found in the pond?",
"max_count": 3,
"distance_threshold": 0.7
}'
Response:
{"results":[{"id":"doc:050c7cee-5891-4052-a3c9-40f2bd3abff7","content":"Fishes swim in the sea","distance":0.5175167322158813},{"id":"doc:efe2868d-3330-452c-ac2a-0e835caecdc9","content":"Frogs swim in the pond","distance":0.6700224280357361}],"success":true}
{"results":[{"id":"doc:14e7a8fb-78e5-4fe7-8969-7559b7cd9752","content":"Squirrels run in the forest","distance":0.48874980211257935}],"success":true}
{"results":[{"id":"doc:efe2868d-3330-452c-ac2a-0e835caecdc9","content":"Frogs swim in the pond","distance":0.6417693495750427}],"success":true}
Parameters:
- text (required): The search query
- max_count (optional): Maximum number of results (default: 5)
- distance_threshold (optional): Maximum distance to filter results (lower = more similar)
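To make the interaction between max_count and distance_threshold concrete, here is a small Go sketch of the same filtering applied on the client side. VectorMind performs this on the server; the `filterResults` helper, the document IDs and the distance for the "Birds" entry are invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// SearchResult holds the fields returned for each match.
type SearchResult struct {
	ID       string
	Content  string
	Distance float64
}

// filterResults keeps results whose distance is at most threshold,
// orders them closest first, and truncates the list to maxCount entries.
// A maxCount of 0 or less means "no limit" in this sketch.
func filterResults(results []SearchResult, maxCount int, threshold float64) []SearchResult {
	kept := make([]SearchResult, 0, len(results))
	for _, r := range results {
		if r.Distance <= threshold {
			kept = append(kept, r)
		}
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].Distance < kept[j].Distance })
	if maxCount > 0 && len(kept) > maxCount {
		kept = kept[:maxCount]
	}
	return kept
}

func main() {
	// Illustrative candidates; distances are rounded or made up.
	candidates := []SearchResult{
		{ID: "doc:2", Content: "Frogs swim in the pond", Distance: 0.67},
		{ID: "doc:3", Content: "Birds fly in the sky", Distance: 0.91},
		{ID: "doc:1", Content: "Fishes swim in the sea", Distance: 0.52},
	}
	for _, r := range filterResults(candidates, 3, 0.7) {
		fmt.Printf("%.2f  %s\n", r.Distance, r.Content)
	}
}
```

With a threshold of 0.7 the "Birds" entry is dropped even though max_count would allow three results, which matches the two-result response shown for "Which animals swim?".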
curl -X POST http://localhost:8080/search_with_label \
-H "Content-Type: application/json" \
-d '{
"text": "What lives in the forest?",
"label": "animals",
"max_count": 5,
"distance_threshold": 0.8
}'
Parameters:
- text (required): The search query
- label (required): The label to filter results by
- max_count (optional): Maximum number of results (default: 5)
- distance_threshold (optional): Maximum distance to filter results (lower = more similar)
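The same request can be built from Go. The sketch below only constructs the HTTP request and does not send it, so it runs without a VectorMind instance; the `LabelSearchRequest` type and `newLabelSearchRequest` helper are assumptions made for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// LabelSearchRequest mirrors the JSON body accepted by /search_with_label.
type LabelSearchRequest struct {
	Text              string  `json:"text"`
	Label             string  `json:"label"`
	MaxCount          int     `json:"max_count,omitempty"`
	DistanceThreshold float64 `json:"distance_threshold,omitempty"`
}

// newLabelSearchRequest encodes the body and prepares a POST request
// against baseURL + "/search_with_label".
func newLabelSearchRequest(baseURL string, body LabelSearchRequest) (*http.Request, error) {
	payload, err := json.Marshal(body)
	if err != nil {
		return nil, fmt.Errorf("encoding search request: %w", err)
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/search_with_label", bytes.NewReader(payload))
	if err != nil {
		return nil, fmt.Errorf("building search request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newLabelSearchRequest("http://localhost:8080", LabelSearchRequest{
		Text:              "What lives in the forest?",
		Label:             "animals",
		MaxCount:          5,
		DistanceThreshold: 0.8,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// Sending the request needs a running VectorMind service:
	// resp, err := http.DefaultClient.Do(req)
}
```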
Chunk a long document into smaller pieces with overlap and store all chunks with the same label and metadata:
# Read the document content and escape it for JSON
DOCUMENT_CONTENT=$(cat document.md | jq -Rs .)
curl -X POST http://localhost:8080/chunk-and-store \
-H "Content-Type: application/json" \
-d "{
\"document\": ${DOCUMENT_CONTENT},
\"label\": \"my-label\",
\"metadata\": \"category=documentation\",
\"chunk_size\": 1024,
\"overlap\": 256
}"
Parameters:
- document (required): The document content to chunk and store
- label (optional): Label to apply to all chunks
- metadata (optional): Metadata to apply to all chunks
- chunk_size (required): Size of each chunk in characters (must be ≤ embedding dimension)
- overlap (required): Number of characters to overlap between chunks (must be < chunk_size)
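The effect of chunk_size and overlap can be illustrated with a short Go sketch of character-based chunking. This is one plausible reading of the parameters, not VectorMind's actual implementation; the `chunkWithOverlap` helper is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkWithOverlap splits text into pieces of at most chunkSize characters,
// where each piece starts (chunkSize - overlap) characters after the
// previous one, so consecutive pieces share `overlap` characters.
func chunkWithOverlap(text string, chunkSize, overlap int) ([]string, error) {
	if chunkSize <= 0 {
		return nil, errors.New("chunk_size must be positive")
	}
	if overlap < 0 || overlap >= chunkSize {
		return nil, errors.New("overlap must be >= 0 and < chunk_size")
	}
	runes := []rune(text) // count characters, not bytes
	step := chunkSize - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + chunkSize
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the last chunk reached the end of the text
		}
	}
	return chunks, nil
}

func main() {
	chunks, err := chunkWithOverlap("abcdefghij", 4, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(chunks)
}
```

The validation mirrors the constraint stated above that overlap must be smaller than chunk_size; otherwise the window would never advance.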
Response:
{
"success": true,
"chunk_ids": ["doc:uuid-1", "doc:uuid-2", "doc:uuid-3"],
"chunks_stored": 3,
"created_at": "2025-11-30T10:30:00Z"
}
This endpoint is useful for:
- Processing long documents that exceed embedding model limits
- Creating overlapping chunks for better context preservation
- Batch storing multiple chunks with consistent labeling
Split a markdown document by sections (headers like #, ##, ###) and store all sections with embeddings. Sections larger than embedding dimension are automatically subdivided while preserving the section header:
# Read the markdown document and escape it for JSON
MARKDOWN_CONTENT=$(cat document.md | jq -Rs .)
curl -X POST http://localhost:8080/split-and-store-markdown-sections \
-H "Content-Type: application/json" \
-d "{
\"document\": ${MARKDOWN_CONTENT},
\"label\": \"documentation\",
\"metadata\": \"project=vectormind\"
}"
Parameters:
- document (required): The markdown document content to split and store
- label (optional): Label to apply to all sections/chunks
- metadata (optional): Metadata to apply to all sections/chunks
Response:
{
"success": true,
"chunk_ids": ["doc:uuid-1", "doc:uuid-2", "doc:uuid-3"],
"chunks_stored": 3,
"created_at": "2025-11-30T10:30:00Z"
}
How it works:
- Splits the markdown document by headers (# ## ### etc.)
- Each section is stored as a separate chunk
- If a section exceeds the embedding dimension, it is automatically subdivided
- Important: When subdivided, each sub-chunk (except the first) will have the section header prepended to preserve context
- All chunks share the same label and metadata
Example: If a section "## Introduction to Vectors" is 3000 characters long and exceeds the embedding dimension (1024), it will be split into 3 sub-chunks:
- ## Introduction to Vectors\n\n[first 1024 chars of content]
- ## Introduction to Vectors\n\n[next 1024 chars of content]
- ## Introduction to Vectors\n\n[remaining content]
This endpoint is useful for:
- Processing structured markdown documentation
- Preserving semantic context through section headers
- Automatic handling of large sections without manual chunking
Split a document by a custom delimiter and store all chunks with embeddings. Chunks larger than embedding dimension are automatically subdivided while preserving the first 2 non-empty lines as context:
# Read the document and escape it for JSON
DOCUMENT_CONTENT=$(cat startrek.txt | jq -Rs .)
curl -X POST http://localhost:8080/split-and-store-with-delimiter \
-H "Content-Type: application/json" \
-d "{
\"document\": ${DOCUMENT_CONTENT},
\"delimiter\": \"-----\",
\"label\": \"star-trek-diseases\",
\"metadata\": \"source=federation-medical-database\"
}"
Parameters:
- document (required): The document content to split and store
- delimiter (required): The delimiter used to split the document (e.g., "-----", "###", etc.)
- label (optional): Label to apply to all chunks
- metadata (optional): Metadata to apply to all chunks
Response:
{
"success": true,
"chunk_ids": ["doc:uuid-1", "doc:uuid-2", "doc:uuid-3"],
"chunks_stored": 3,
"created_at": "2025-11-30T10:30:00Z"
}
How it works:
- Splits the document by the specified delimiter
- Each chunk is stored as a separate document
- If a chunk exceeds the embedding dimension, it is automatically subdivided
- Important: When subdivided, the first 2 non-empty lines of the original chunk are prepended to each sub-chunk (except the first) to preserve context
- All chunks share the same label and metadata
Example: If a chunk starts with:
Disease: Andorian Ice Plague
Provenance: Andoria, Andorian Empire
and exceeds the embedding dimension, it will be split into sub-chunks where each sub-chunk (except the first) will start with:
Disease: Andorian Ice Plague
Provenance: Andoria, Andorian Empire
[remaining content]
This endpoint is useful for:
- Processing structured data with custom delimiters
- Maintaining document context through key identifying lines
- Working with datasets that use consistent separators (CSV-like, log files, etc.)
- Automatic handling of large records without manual chunking
Split a markdown document by headers while preserving hierarchical context. Each chunk includes structured metadata with TITLE, HIERARCHY, and CONTENT fields. Chunks larger than embedding dimension are automatically subdivided:
# Read the markdown document and escape it for JSON
MARKDOWN_CONTENT=$(cat document.md | jq -Rs .)
curl -X POST http://localhost:8080/split-and-store-markdown-with-hierarchy \
-H "Content-Type: application/json" \
-d "{
\"document\": ${MARKDOWN_CONTENT},
\"label\": \"documentation\",
\"metadata\": \"project=vectormind\"
}"
Parameters:
- document (required): The markdown document content to split and store
- label (optional): Label to apply to all chunks
- metadata (optional): Metadata to apply to all chunks
Response:
{
"success": true,
"chunk_ids": ["doc:uuid-1", "doc:uuid-2", "doc:uuid-3"],
"chunks_stored": 3,
"created_at": "2025-11-30T10:30:00Z"
}
How it works:
- Parses the markdown document and extracts headers with their hierarchical relationships
- Each chunk is formatted with:
  - TITLE: The header prefix (e.g., ##) and title
  - HIERARCHY: The full hierarchical path (e.g., Introduction > Getting Started > Installation)
  - CONTENT: The section content
- If a chunk exceeds the embedding dimension, it is automatically subdivided
- All chunks share the same label and metadata
Example chunk format:
TITLE: ## Installation
HIERARCHY: Getting Started > Installation
CONTENT: To install VectorMind, follow these steps...
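One way to derive the HIERARCHY path is to keep a stack of the headers currently in scope. The Go sketch below illustrates the idea under simplifying assumptions: it ignores text before the first header and does not treat `#` lines inside code blocks specially. It is not VectorMind's parser, and the `Section`, `parseHeader` and `splitWithHierarchy` names are invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Section holds one chunk in the TITLE / HIERARCHY / CONTENT layout.
type Section struct {
	Title     string
	Hierarchy string
	Content   string
}

// String renders the section in the chunk format shown above.
func (s Section) String() string {
	return fmt.Sprintf("TITLE: %s\nHIERARCHY: %s\nCONTENT: %s", s.Title, s.Hierarchy, s.Content)
}

// parseHeader returns the header level (1-6) and title, or level 0 if the
// line is not an ATX-style header such as "## Title".
func parseHeader(line string) (int, string) {
	trimmed := strings.TrimLeft(line, "#")
	level := len(line) - len(trimmed)
	if level == 0 || level > 6 || !strings.HasPrefix(trimmed, " ") {
		return 0, ""
	}
	return level, strings.TrimSpace(trimmed)
}

// splitWithHierarchy walks the document line by line and tracks the
// headers in scope so that each section knows its ancestors.
func splitWithHierarchy(doc string) []Section {
	var sections []Section
	var path []string // titles of the current header and its ancestors
	var levels []int  // header level of each entry in path
	var current *Section
	var body []string

	flush := func() {
		if current != nil {
			current.Content = strings.TrimSpace(strings.Join(body, "\n"))
			sections = append(sections, *current)
		}
		body = nil
	}

	for _, line := range strings.Split(doc, "\n") {
		level, title := parseHeader(line)
		if level == 0 {
			body = append(body, line)
			continue
		}
		flush()
		// Drop headers at the same or a deeper level: they are no longer ancestors.
		for len(levels) > 0 && levels[len(levels)-1] >= level {
			levels = levels[:len(levels)-1]
			path = path[:len(path)-1]
		}
		levels = append(levels, level)
		path = append(path, title)
		current = &Section{
			Title:     strings.Repeat("#", level) + " " + title,
			Hierarchy: strings.Join(path, " > "),
		}
	}
	flush()
	return sections
}

func main() {
	doc := "# Getting Started\nIntro text\n## Installation\nTo install VectorMind, follow these steps..."
	for _, s := range splitWithHierarchy(doc) {
		fmt.Println(s)
		fmt.Println()
	}
}
```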
Use cases:
- Processing documentation with deep hierarchical structure
- Maintaining semantic context through parent-child relationships
- Searching within specific document hierarchies
- Preserving document navigation structure in vector databases
Note: This feature is experimental and the chunk format may change in future versions.
VectorMind exposes the following MCP tools:
Provides information about the VectorMind MCP server.
Parameters: None
Create and store an embedding from text content with optional label and metadata.
Parameters:
- content (required): The text content to create an embedding from
- label (optional): Label/tag for the document
- metadata (optional): Metadata for the document
Returns: JSON object with document ID, content, label, metadata, and creation timestamp
Search for similar documents based on text query. Returns documents ordered by similarity (closest first).
Parameters:
- text (required): The text query to search for similar documents
- max_count (optional): Maximum number of results to return (default: 1)
- distance_threshold (optional): Only returns documents with distance <= threshold
Returns: JSON object with array of matching documents including ID, content, label, metadata, distance, and created_at
Get information about the embedding model being used, including the model ID and dimension.
Parameters: None
Returns: JSON object with:
- model_id: The identifier of the embedding model being used
- dimension: The dimension of the embedding vectors
Example response:
{
"model_id": "ai/mxbai-embed-large",
"dimension": 1024
}
Search for similar documents filtered by label. Returns documents ordered by similarity (closest first).
Parameters:
- text (required): The text query to search for similar documents
- label (required): The label to filter documents by
- max_count (optional): Maximum number of results to return (default: 1)
- distance_threshold (optional): Only returns documents with distance <= threshold
Returns: JSON object with array of matching documents including ID, content, label, metadata, distance, and created_at
Chunk a document into smaller pieces with overlap and store all chunks with embeddings. All chunks will share the same label and metadata.
Parameters:
- document (required): The document content to chunk and store
- label (optional): Label to apply to all chunks
- metadata (optional): Metadata to apply to all chunks
- chunk_size (required): Size of each chunk in characters (must be ≤ embedding dimension)
- overlap (required): Number of characters to overlap between consecutive chunks (must be < chunk_size)
Returns: JSON object with:
- success: Boolean indicating whether the operation was successful
- chunk_ids: Array of document IDs for all stored chunks
- chunks_stored: Number of chunks that were stored
- created_at: Timestamp of when the chunks were created
Example response:
{
"success": true,
"chunk_ids": ["doc:abc-123", "doc:def-456", "doc:ghi-789"],
"chunks_stored": 3,
"created_at": "2025-11-30T10:30:00Z"
}
Use cases:
- Processing long documents that exceed embedding model limits
- Creating overlapping chunks for better semantic search
- Batch importing large text files with consistent metadata
Split a markdown document by sections (headers like #, ##, ###) and store all sections with embeddings. Sections larger than embedding dimension are automatically subdivided while preserving the section header.
Parameters:
- document (required): The markdown document content to split and store
- label (optional): Label to apply to all sections/chunks
- metadata (optional): Metadata to apply to all sections/chunks
Returns: JSON object with:
- success: Boolean indicating whether the operation was successful
- chunk_ids: Array of document IDs for all stored chunks
- chunks_stored: Number of chunks that were stored
- created_at: Timestamp of when the chunks were created
Example response:
{
"success": true,
"chunk_ids": ["doc:abc-123", "doc:def-456", "doc:ghi-789"],
"chunks_stored": 3,
"created_at": "2025-11-30T10:30:00Z"
}
How it works:
- Splits the markdown document by headers (# ## ### etc.)
- Each section is stored as a separate chunk
- If a section exceeds the embedding dimension, it is automatically subdivided
- Important: When subdivided, each sub-chunk (except the first) will have the section header prepended to preserve context
- All chunks share the same label and metadata
Example: If a section "## Deep Dive into the Monsters of Aethelgard" is too large, it will be split into multiple sub-chunks, each starting with "## Deep Dive into the Monsters of Aethelgard" to maintain semantic context during similarity searches.
Use cases:
- Processing structured markdown documentation
- Preserving semantic context through section headers
- Automatic handling of large sections without manual chunking
- Maintaining document structure in vector databases
Split a document by a custom delimiter and store all chunks with embeddings. Chunks larger than embedding dimension are automatically subdivided while preserving the first 2 non-empty lines as context.
Parameters:
- document (required): The document content to split and store
- delimiter (required): The delimiter used to split the document (e.g., "-----", "###", etc.)
- label (optional): Label to apply to all chunks
- metadata (optional): Metadata to apply to all chunks
Returns: JSON object with:
- success: Boolean indicating whether the operation was successful
- chunk_ids: Array of document IDs for all stored chunks
- chunks_stored: Number of chunks that were stored
- created_at: Timestamp of when the chunks were created
Example response:
{
"success": true,
"chunk_ids": ["doc:abc-123", "doc:def-456", "doc:ghi-789"],
"chunks_stored": 3,
"created_at": "2025-11-30T10:30:00Z"
}
How it works:
- Splits the document by the specified delimiter
- Each chunk is stored as a separate document
- If a chunk exceeds the embedding dimension, it is automatically subdivided
- Important: When subdivided, the first 2 non-empty lines of the original chunk are prepended to each sub-chunk (except the first) to preserve context
- All chunks share the same label and metadata
Example: Perfect for processing structured data like the Star Trek Federation Medical Database where each disease record is separated by "-----" and starts with identifying information like "Disease: Name" and "Provenance: Location".
Use cases:
- Processing structured data with custom delimiters
- Maintaining document context through key identifying lines
- Working with datasets that use consistent separators
- Medical databases, catalogs, or any structured text data
- Automatic handling of large records without manual chunking
Split a markdown document by headers while preserving hierarchical context. Each chunk includes structured metadata with TITLE, HIERARCHY, and CONTENT fields. Chunks larger than embedding dimension are automatically subdivided.
Parameters:
- document (required): The markdown document content to split and store
- label (optional): Label to apply to all chunks
- metadata (optional): Metadata to apply to all chunks
Returns: JSON object with:
- success: Boolean indicating whether the operation was successful
- chunk_ids: Array of document IDs for all stored chunks
- chunks_stored: Number of chunks that were stored
- created_at: Timestamp of when the chunks were created
Example response:
{
"success": true,
"chunk_ids": ["doc:abc-123", "doc:def-456", "doc:ghi-789"],
"chunks_stored": 3,
"created_at": "2025-11-30T10:30:00Z"
}
How it works:
- Parses the markdown document and extracts headers with their hierarchical relationships
- Each chunk is formatted with:
  - TITLE: The header prefix (e.g., ##) and title
  - HIERARCHY: The full hierarchical path (e.g., Introduction > Getting Started > Installation)
  - CONTENT: The section content
- If a chunk exceeds the embedding dimension, it is automatically subdivided
- All chunks share the same label and metadata
Example chunk format:
TITLE: ## Installation
HIERARCHY: Getting Started > Installation
CONTENT: To install VectorMind, follow these steps...
Use cases:
- Processing documentation with deep hierarchical structure
- Maintaining semantic context through parent-child relationships
- Searching within specific document hierarchies
- Preserving document navigation structure in vector databases
Note: This feature is experimental and the chunk format may change in future versions.
import OpenAI from "openai";
// OpenAI Client
const openai = new OpenAI({
baseURL: "http://localhost:12434/engines/v1",
apiKey: "i-love-docker-model-runner",
});
const VECTORMIND_API = "http://localhost:8080";
const chunks = [
`# Orcs
Orcs are savage, brutish humanoids with dark green skin and prominent tusks.
These fierce warriors inhabit dense forests where they hunt in packs,
using crude but effective weapons forged from scavenged metal and bone.
Their tribal society revolves around strength and combat prowess,
making them formidable opponents for any adventurer brave enough to enter their woodland domain.`,
`# Dragons
Dragons are magnificent and ancient creatures of immense power, soaring through the skies on massive wings.
These intelligent beings possess scales that shimmer like precious metals and breathe devastating elemental attacks.
Known for their vast hoards of treasure and centuries of accumulated knowledge,
dragons command both fear and respect throughout the realm.
Their aerial dominance makes them nearly untouchable in their celestial domain.`,
`# Goblins
Goblins are small, cunning creatures with mottled green skin and sharp, pointed ears.
Despite their diminutive size, they are surprisingly agile swimmers who have adapted to life around ponds and marshlands.
These mischievous beings are known for their quick wit and tendency to play pranks on unwary travelers.
They build elaborate underwater lairs connected by hidden tunnels beneath the murky pond waters.`,
`# Krakens
Krakens are colossal sea monsters with massive tentacles that can crush entire ships with ease.
These legendary creatures dwell in the deepest ocean trenches, surfacing only to hunt or when disturbed.
Their intelligence rivals that of the wisest sages, and their tentacles can stretch for hundreds of feet.
Sailors speak in hushed tones of these maritime titans, whose very presence can create devastating whirlpools
and tidal waves that reshape entire coastlines.`,
];
// Function to create embeddings
async function createEmbedding(content, label = "", metadata = "") {
const response = await fetch(`${VECTORMIND_API}/embeddings`, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
content,
label,
metadata,
}),
});
return await response.json();
}
// Function to search for similar documents
async function searchSimilar(text, maxCount = 5, distanceThreshold = 0.7) {
const response = await fetch(`${VECTORMIND_API}/search`, {
method: "POST",
headers: {
"Content-Type": "application/json",
},
body: JSON.stringify({
text,
max_count: maxCount,
distance_threshold: distanceThreshold,
}),
});
return await response.json();
}
let userInput = "Tell me something about the dragons";
try {
// Create embeddings from chunks
console.log("Creating embeddings...\n");
for (const chunk of chunks) {
const result = await createEmbedding(chunk, "fantasy-creatures", "");
console.log("Created embedding:", result);
}
// Search for similar documents
console.log("\n\nSearching for similar documents...\n");
const searchResult = await searchSimilar(userInput, 1, 0.7);
console.log("Search results:\n", JSON.stringify(searchResult, null, 2));
const documents = searchResult.results.map(r => r.content).join("\n");
const completion = await openai.chat.completions.create({
model: "hf.co/menlo/jan-nano-gguf:q4_k_m",
messages: [
{ role: "system", content: "Using the following documents:" },
{ role: "system", content: "documents:\n"+ documents },
{ role: "user", content: userInput }
],
stream: true,
});
console.log("=".repeat(50));
for await (const chunk of completion) {
process.stdout.write(chunk.choices[0].delta.content || "");
}
} catch (error) {
console.error("Error:", error);
}
package main
import (
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"log"
"net/http"
"strings"
"github.com/openai/openai-go"
"github.com/openai/openai-go/option"
)
const VECTORMIND_API = "http://localhost:8080"
// EmbeddingRequest is the request body for creating an embedding
type EmbeddingRequest struct {
Content string `json:"content"`
Label string `json:"label,omitempty"`
Metadata string `json:"metadata,omitempty"`
}
// EmbeddingResponse is the response returned after creating an embedding
type EmbeddingResponse struct {
ID string `json:"id"`
Content string `json:"content"`
Label string `json:"label"`
Metadata string `json:"metadata"`
CreatedAt string `json:"created_at"`
Success bool `json:"success"`
}
// SearchRequest is the request body for a similarity search
type SearchRequest struct {
Text string `json:"text"`
MaxCount int `json:"max_count,omitempty"`
DistanceThreshold float64 `json:"distance_threshold,omitempty"`
}
// SearchResult is a single search result
type SearchResult struct {
ID string `json:"id"`
Content string `json:"content"`
Distance float64 `json:"distance"`
}
// SearchResponse is the response returned by a search
type SearchResponse struct {
Results []SearchResult `json:"results"`
Success bool `json:"success"`
}
// CreateEmbedding creates an embedding in VectorMind
func CreateEmbedding(content, label, metadata string) (*EmbeddingResponse, error) {
reqBody := EmbeddingRequest{
Content: content,
Label: label,
Metadata: metadata,
}
jsonData, err := json.Marshal(reqBody)
if err != nil {
return nil, fmt.Errorf("marshaling error: %w", err)
}
resp, err := http.Post(VECTORMIND_API+"/embeddings", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
return nil, fmt.Errorf("request error: %w", err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("error reading response: %w", err)
}
var result EmbeddingResponse
if err := json.Unmarshal(body, &result); err != nil {
return nil, fmt.Errorf("unmarshaling error: %w", err)
}
return &result, nil
}
// SearchSimilar searches for similar documents
func SearchSimilar(text string, maxCount int, distanceThreshold float64) (*SearchResponse, error) {
reqBody := SearchRequest{
Text: text,
MaxCount: maxCount,
DistanceThreshold: distanceThreshold,
}
jsonData, err := json.Marshal(reqBody)
if err != nil {
return nil, fmt.Errorf("marshaling error: %w", err)
}
resp, err := http.Post(VECTORMIND_API+"/search", "application/json", bytes.NewBuffer(jsonData))
if err != nil {
return nil, fmt.Errorf("request error: %w", err)
}
defer resp.Body.Close()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("error reading response: %w", err)
}
var result SearchResponse
if err := json.Unmarshal(body, &result); err != nil {
return nil, fmt.Errorf("unmarshaling error: %w", err)
}
return &result, nil
}
func main() {
baseURL := "http://localhost:12434/engines/llama.cpp/v1/"
model := "hf.co/menlo/jan-nano-gguf:q4_k_m"
client := openai.NewClient(
option.WithBaseURL(baseURL),
option.WithAPIKey(""),
)
ctx := context.Background()
chunks := []string{
`# Orcs
Orcs are savage, brutish humanoids with dark green skin and prominent tusks.
These fierce warriors inhabit dense forests where they hunt in packs,
using crude but effective weapons forged from scavenged metal and bone.
Their tribal society revolves around strength and combat prowess,
making them formidable opponents for any adventurer brave enough to enter their woodland domain.`,
`# Dragons
Dragons are magnificent and ancient creatures of immense power, soaring through the skies on massive wings.
These intelligent beings possess scales that shimmer like precious metals and breathe devastating elemental attacks.
Known for their vast hoards of treasure and centuries of accumulated knowledge,
dragons command both fear and respect throughout the realm.
Their aerial dominance makes them nearly untouchable in their celestial domain.`,
`# Goblins
Goblins are small, cunning creatures with mottled green skin and sharp, pointed ears.
Despite their diminutive size, they are surprisingly agile swimmers who have adapted to life around ponds and marshlands.
These mischievous beings are known for their quick wit and tendency to play pranks on unwary travelers.
They build elaborate underwater lairs connected by hidden tunnels beneath the murky pond waters.`,
`# Krakens
Krakens are colossal sea monsters with massive tentacles that can crush entire ships with ease.
These legendary creatures dwell in the deepest ocean trenches, surfacing only to hunt or when disturbed.
Their intelligence rivals that of the wisest sages, and their tentacles can stretch for hundreds of feet.
Sailors speak in hushed tones of these maritime titans, whose very presence can create devastating whirlpools
and tidal waves that reshape entire coastlines.`,
}
// Creation of embeddings
fmt.Println("Creation of embeddings...")
for _, chunk := range chunks {
result, err := CreateEmbedding(chunk, "fantasy-creatures", "")
if err != nil {
fmt.Printf("Error when embedding: %v\n", err)
continue
}
fmt.Printf("Embedding created: ID=%s, Success=%v\n", result.ID, result.Success)
}
// Search for similar documents
fmt.Println("\n\nSearch for similar documents...")
userInput := "Tell me something about the dragons"
searchResult, err := SearchSimilar(userInput, 2, 0.7)
if err != nil {
fmt.Printf("Search error: %v\n", err)
// searchResult is nil on error, so stop here instead of dereferencing it below
return
}
fmt.Printf("Found: %d\n", len(searchResult.Results))
var documents string
for i, result := range searchResult.Results {
fmt.Printf(" %d. Distance: %.4f\n", i+1, result.Distance)
fmt.Printf(" ID: %s\n", result.ID)
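// Print a preview of at most 50 bytes; slicing a shorter string would panic
preview := result.Content
if len(preview) > 50 {
preview = preview[:50]
}
fmt.Printf(" Content: %s...\n", preview)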
documents += result.Content + "\n"
}
fmt.Println(strings.Repeat("-", 50))
fmt.Println("Chat Completion with retrieved documents as context:")
messages := []openai.ChatCompletionMessageParamUnion{
openai.SystemMessage("Using the following documents:"),
openai.SystemMessage("documents:\n" + documents),
openai.UserMessage(userInput),
}
param := openai.ChatCompletionNewParams{
Messages: messages,
Model: model,
Temperature: openai.Opt(0.0),
}
stream := client.Chat.Completions.NewStreaming(ctx, param)
for stream.Next() {
chunk := stream.Current()
// Stream each chunk as it arrives
if len(chunk.Choices) > 0 && chunk.Choices[0].Delta.Content != "" {
fmt.Print(chunk.Choices[0].Delta.Content)
}
}
if err := stream.Err(); err != nil {
log.Fatalln("Error with the completion:", err)
}
}

package main
import (
"context"
"encoding/json"
"fmt"
"log"
"strings"
"github.com/mark3labs/mcp-go/client"
"github.com/mark3labs/mcp-go/client/transport"
"github.com/mark3labs/mcp-go/mcp"
)
type SearchResult struct {
ID string `json:"id"`
Content string `json:"content"`
Distance float64 `json:"distance"`
}
type SearchResponse struct {
Results []SearchResult `json:"results"`
Success bool `json:"success"`
}
var chunks = []string{
`# Orcs
Orcs are savage, brutish humanoids with dark green skin and prominent tusks.
These fierce warriors inhabit dense forests where they hunt in packs,
using crude but effective weapons forged from scavenged metal and bone.
Their tribal society revolves around strength and combat prowess,
making them formidable opponents for any adventurer brave enough to enter their woodland domain.`,
`# Dragons
Dragons are magnificent and ancient creatures of immense power, soaring through the skies on massive wings.
These intelligent beings possess scales that shimmer like precious metals and breathe devastating elemental attacks.
Known for their vast hoards of treasure and centuries of accumulated knowledge,
dragons command both fear and respect throughout the realm.
Their aerial dominance makes them nearly untouchable in their celestial domain.`,
`# Goblins
Goblins are small, cunning creatures with mottled green skin and sharp, pointed ears.
Despite their diminutive size, they are surprisingly agile swimmers who have adapted to life around ponds and marshlands.
These mischievous beings are known for their quick wit and tendency to play pranks on unwary travelers.
They build elaborate underwater lairs connected by hidden tunnels beneath the murky pond waters.`,
`# Krakens
Krakens are colossal sea monsters with massive tentacles that can crush entire ships with ease.
These legendary creatures dwell in the deepest ocean trenches, surfacing only to hunt or when disturbed.
Their intelligence rivals that of the wisest sages, and their tentacles can stretch for hundreds of feet.
Sailors speak in hushed tones of these maritime titans, whose very presence can create devastating whirlpools
and tidal waves that reshape entire coastlines.`,
}
func main() {
ctx := context.Background()
// MCP client initialization
fmt.Println("Initializing MCP StreamableHTTP client...")
// Create HTTP transport
httpURL := "http://localhost:9090/mcp"
httpTransport, err := transport.NewStreamableHTTP(httpURL)
if err != nil {
log.Fatalf("Failed to create HTTP transport: %v", err)
}
// Create client with the transport
mcpClient := client.NewClient(httpTransport)
// Start the client
if err := mcpClient.Start(ctx); err != nil {
log.Fatalf("Failed to start client: %v", err)
}
// Release the connection when main returns
defer mcpClient.Close()
initRequest := mcp.InitializeRequest{}
initRequest.Params.ProtocolVersion = mcp.LATEST_PROTOCOL_VERSION
initRequest.Params.ClientInfo = mcp.Implementation{
Name: "MCP-Go Simple Client Example",
Version: "1.0.0",
}
initRequest.Params.Capabilities = mcp.ClientCapabilities{}
_, err = mcpClient.Initialize(ctx, initRequest)
if err != nil {
log.Fatalf("Failed to initialize: %v", err)
}
// Tools listing
toolsRequest := mcp.ListToolsRequest{}
// Get the list of tools
toolsResult, err := mcpClient.ListTools(ctx, toolsRequest)
if err != nil {
log.Fatalf("Failed to list tools: %v", err)
}
fmt.Println("Available tools:")
for _, tool := range toolsResult.Tools {
fmt.Printf("- %s: %s\n", tool.Name, tool.Description)
}
// Create Embeddings with `create_embedding` MCP tool
fmt.Println("\n\nCreation of embeddings...")
for _, chunk := range chunks {
request := mcp.CallToolRequest{
Params: mcp.CallToolParams{
Name: "create_embedding",
Arguments: map[string]any{
"content": chunk,
"label": "fantasy-creatures",
"metadata": "",
},
},
}
toolResponse, err := mcpClient.CallTool(ctx, request)
if err != nil {
fmt.Printf("Error when embedding: %v\n", err)
continue
}
if toolResponse == nil || len(toolResponse.Content) == 0 {
fmt.Printf("No response from embedding tool\n")
continue
}
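// Use the comma-ok form so an unexpected content type does not panic
textContent, ok := toolResponse.Content[0].(mcp.TextContent)
if !ok {
fmt.Println("Unexpected content type from embedding tool")
continue
}
fmt.Println("Tool response:", textContent.Text)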
}
fmt.Println(strings.Repeat("=", 50))
fmt.Println("Search for similar documents...")
userInput := "Tell me something about the dragons"
searchRequest := mcp.CallToolRequest{
Params: mcp.CallToolParams{
Name: "similarity_search",
Arguments: map[string]any{
"text": userInput,
"max_count": 2,
"distance_threshold": 0.7,
},
},
}
searchResponse, err := mcpClient.CallTool(ctx, searchRequest)
if err != nil {
log.Fatalf("Error search: %v", err)
}
if searchResponse == nil || len(searchResponse.Content) == 0 {
log.Fatalf("No response from search tool")
}
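// Use the comma-ok form so an unexpected content type does not panic
searchContent, ok := searchResponse.Content[0].(mcp.TextContent)
if !ok {
log.Fatalf("Unexpected content type from search tool")
}
searchResult := searchContent.Text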
// Parse the JSON response
var response SearchResponse
err = json.Unmarshal([]byte(searchResult), &response)
if err != nil {
log.Fatalf("Error parsing search result: %v", err)
}
// Loop through results
fmt.Println("\nSearch Results:")
for _, result := range response.Results {
fmt.Printf("\nID: %s\n", result.ID)
fmt.Printf("Distance: %f\n", result.Distance)
fmt.Printf("Content: %s\n", result.Content)
fmt.Println(strings.Repeat("-", 50))
}
}

VectorMind uses a local CI pipeline based on Docker Compose with the following files:
- Main pipeline: compose.ci.yml
  - compose.ci.redis-test-server.yml
  - compose.ci.unit-tests.yml
  - compose.ci.multi-arch-build.yml
  - compose.ci.start-vectormind.yml
  - compose.ci.get-model-info.yml
  - compose.ci.create-embeddings.yml
  - compose.ci.search-embeddings.yml
  - compose.ci.stop-vectormind.yml
  - compose.ci.stop-redis.yml
Start the CI pipeline:
docker compose -f compose.ci.yml up --remove-orphans --build
Stop the CI pipeline (in a clean way):
docker compose -f compose.ci.yml down
The local CI pipeline is orchestrated using Docker Compose and follows this workflow:
graph TD
Start([Start CI Pipeline]):::startNode
Start --> Redis[redis-test-server<br/>Redis Server]:::service
Start --> UnitTests[unit-tests<br/>Run Unit Tests]:::test
UnitTests -->|success| MultiBuild[multi-arch-build<br/>Multi-Architecture Build]:::build
MultiBuild -->|success| StartVM[start-vectormind<br/>Start VectorMind Service]:::service
Redis --> StartVM
StartVM -->|healthy| GetModelInfo[get-model-info<br/>Get Embedding Model Info]:::test
GetModelInfo -->|success| CreateEmb[create-embeddings<br/>Create Test Embeddings]:::test
StartVM -->|healthy| SearchEmb[search-embeddings<br/>Search Embeddings]:::test
CreateEmb -->|success| SearchEmb
SearchEmb -->|success| StopVM[stop-vectormind<br/>Stop VectorMind]:::cleanup
StopVM -->|success| StopRedis[stop-redis<br/>Stop Redis Server]:::cleanup
StopRedis --> Complete([Pipeline Complete]):::endNode
classDef startNode fill:#10B981,stroke:#059669,stroke-width:2px,color:#fff
classDef service fill:#3B82F6,stroke:#2563EB,stroke-width:2px,color:#fff
classDef test fill:#F59E0B,stroke:#D97706,stroke-width:2px,color:#fff
classDef build fill:#8B5CF6,stroke:#7C3AED,stroke-width:2px,color:#fff
classDef cleanup fill:#EF4444,stroke:#DC2626,stroke-width:2px,color:#fff
classDef endNode fill:#6B7280,stroke:#4B5563,stroke-width:2px,color:#fff
- redis-test-server: Starts Redis server for testing
- unit-tests: Runs unit tests in short mode
- multi-arch-build: Builds multi-architecture Docker image (depends on unit-tests success)
- start-vectormind: Starts VectorMind service (depends on multi-arch-build success and redis-test-server)
- get-model-info: Gets embedding model information (depends on start-vectormind healthy)
- create-embeddings: Creates test embeddings (depends on start-vectormind healthy and get-model-info success)
- search-embeddings: Tests embedding search functionality (depends on create-embeddings success and start-vectormind healthy)
- stop-vectormind: Stops VectorMind service (depends on search-embeddings success)
- stop-redis: Stops Redis server (depends on stop-vectormind success)
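
The dependency list above can be read as a directed graph. As a rough illustration only, the sketch below encodes those dependencies in a Go map and derives one valid start order with a topological sort. The names `deps` and `startOrder` are hypothetical and are not part of VectorMind. Docker Compose resolves this ordering itself, so nothing like this is needed to run the pipeline.

```go
package main

import (
	"fmt"
	"sort"
)

// deps mirrors the dependency list above (hypothetical name, for illustration):
// each service maps to the services it depends on.
var deps = map[string][]string{
	"redis-test-server": {},
	"unit-tests":        {},
	"multi-arch-build":  {"unit-tests"},
	"start-vectormind":  {"multi-arch-build", "redis-test-server"},
	"get-model-info":    {"start-vectormind"},
	"create-embeddings": {"start-vectormind", "get-model-info"},
	"search-embeddings": {"create-embeddings", "start-vectormind"},
	"stop-vectormind":   {"search-embeddings"},
	"stop-redis":        {"stop-vectormind"},
}

// startOrder returns one valid start order using Kahn's algorithm.
// Ties are broken alphabetically so the result is deterministic.
func startOrder(deps map[string][]string) ([]string, error) {
	pending := make(map[string]int, len(deps)) // unmet dependency count per service
	dependents := make(map[string][]string)    // reverse edges
	for svc, reqs := range deps {
		pending[svc] = len(reqs)
		for _, r := range reqs {
			dependents[r] = append(dependents[r], svc)
		}
	}

	var ready []string
	for svc, n := range pending {
		if n == 0 {
			ready = append(ready, svc)
		}
	}

	var order []string
	for len(ready) > 0 {
		sort.Strings(ready)
		svc := ready[0]
		ready = ready[1:]
		order = append(order, svc)
		for _, d := range dependents[svc] {
			pending[d]--
			if pending[d] == 0 {
				ready = append(ready, d)
			}
		}
	}

	if len(order) != len(deps) {
		return nil, fmt.Errorf("dependency cycle detected")
	}
	return order, nil
}

func main() {
	order, err := startOrder(deps)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, svc := range order {
		fmt.Printf("%d. %s\n", i+1, svc)
	}
}
```

The sketch linearizes the graph, so it lists redis-test-server before unit-tests even though the two have no dependency on each other and can start in parallel. It also ignores the difference between the "success" and "healthy" conditions, which only Docker Compose evaluates.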