From 7d3f6756ee7b7574cccab7d387a8d47279e951e1 Mon Sep 17 00:00:00 2001
From: bsatapat
Date: Tue, 23 Sep 2025 15:45:45 +0530
Subject: [PATCH] Added BYOK documentation

---
 docs/byok_guide.md | 438 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 438 insertions(+)
 create mode 100644 docs/byok_guide.md

diff --git a/docs/byok_guide.md b/docs/byok_guide.md
new file mode 100644
index 00000000..d81b6b6b
--- /dev/null
+++ b/docs/byok_guide.md
@@ -0,0 +1,438 @@
# BYOK (Bring Your Own Knowledge) Feature Documentation

## Overview

The BYOK (Bring Your Own Knowledge) feature in Lightspeed Core enables users to integrate their own knowledge sources into the AI system through Retrieval-Augmented Generation (RAG). With BYOK, the AI can access custom knowledge bases and provide more accurate, contextual, and domain-specific responses.

---

## Table of Contents

* [What is BYOK?](#what-is-byok)
* [How BYOK Works](#how-byok-works)
* [Prerequisites](#prerequisites)
* [Configuration Guide](#configuration-guide)
    * [Step 1: Prepare Your Knowledge Sources](#step-1-prepare-your-knowledge-sources)
    * [Step 2: Create Vector Database](#step-2-create-vector-database)
    * [Step 3: Configure Embedding Model](#step-3-configure-embedding-model)
    * [Step 4: Configure Llama Stack](#step-4-configure-llama-stack)
    * [Step 5: Enable RAG Tools](#step-5-enable-rag-tools)
* [Supported Vector Database Types](#supported-vector-database-types)
* [Configuration Examples](#configuration-examples)
* [Conclusion](#conclusion)

---

## What is BYOK?

BYOK (Bring Your Own Knowledge) is Lightspeed Core's implementation of Retrieval-Augmented Generation (RAG) that allows you to:

- **Integrate custom knowledge sources**: Add your organization's documentation, manuals, FAQs, or any text-based knowledge
- **Enhance AI responses**: Provide contextual, accurate answers based on your specific domain knowledge
- **Maintain data control**: Keep your knowledge sources within your infrastructure
- **Improve relevance**: Get responses tailored to your organization's context and terminology

## How BYOK Works

The BYOK system operates through a chain of cooperating components:

1. **Agent Orchestrator**: The AI agent acts as the central coordinator, using the LLM as its reasoning engine
2. **Knowledge Search**: When the agent needs external information, it queries your custom vector database
3. **Vector Database**: Your indexed knowledge sources, stored as vector embeddings for semantic search
4. **Embedding Model**: Converts queries and documents into vector representations for similarity matching
5. **Context Integration**: Retrieved knowledge is integrated into the AI's response generation process
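To make the retrieval step concrete, the sketch below performs a knowledge search directly, which is roughly what the agent's knowledge-search tool does internally. This is a minimal sketch, assuming a Llama Stack server on `http://localhost:8321`, the `llama-stack-client` Python package, and a placeholder vector database id (`your-index-id`, matching the configuration later in this guide):

```python
# Minimal sketch: query a BYOK vector database directly via the RAG tool.
# Assumes a Llama Stack server on localhost:8321 and llama-stack-client installed;
# "your-index-id" is a placeholder for the vector_db_id registered in run.yaml.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# The agent performs an equivalent call when it decides external knowledge
# is needed (steps 2-4 above).
result = client.tool_runtime.rag_tool.query(
    content="How do I rotate the service TLS certificates?",
    vector_db_ids=["your-index-id"],
)

# The retrieved chunks come back as interleaved content, ready to be folded
# into the model's prompt (step 5 above).
print(result.content)
```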
The overall flow is summarized in the diagram below:

```mermaid
graph TD
    A[User Query] --> B[AI Agent]
    B --> C{Need External Knowledge?}
    C -->|Yes| D[Knowledge Search Tool]
    C -->|No| E[Generate Response]
    D --> F[Vector Database]
    F --> G[Retrieve Relevant Context]
    G --> H[Integrate Context]
    H --> E
    E --> I[Response to User]
```

---

## Prerequisites

Before implementing BYOK, ensure you have:

### Required Tools
- **rag-content tool**: For creating compatible vector databases
  - Repository: https://github.com/lightspeed-core/rag-content
  - Used for indexing your knowledge sources

### System Requirements
- **Llama Stack**: Compatible vector database backend
- **Embedding Model**: Local or downloadable embedding model
- **LLM Provider**: OpenAI, vLLM, or another supported inference provider

### Knowledge Sources
- Text-based documents (PDF, Markdown, TXT, etc.)
- Structured data that can be converted to text
- Documentation, manuals, FAQs, knowledge bases

---

## Configuration Guide

### Step 1: Prepare Your Knowledge Sources

1. **Collect your documents**: Gather all text-based knowledge sources you want to include
2. **Organize content**: Structure your documents for optimal indexing
3. **Validate formats**: Ensure documents are in supported formats (PDF, TXT, MD, etc.)

### Step 2: Create Vector Database

Use the `rag-content` tool to create a compatible vector database. Refer to https://github.com/lightspeed-core/rag-content for detailed instructions.

**Important Notes:**
- The vector database must be compatible with Llama Stack
- You can generate the vector database using any of the following backends:
  - Llama-Index Faiss Vector Store
  - Llama-Index Postgres (PGVector) Vector Store
  - Llama-Stack Faiss Vector-IO
  - Llama-Stack SQLite-vec Vector-IO
- The same embedding model must be used for both index creation and querying (see the sanity check below)
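A common failure mode is an embedding mismatch between index creation and query time. The following sanity check, a sketch assuming the `sentence-transformers` Python package and the model used throughout this guide, confirms the model loads and produces the dimension you will later declare in `run.yaml`:

```python
# Sanity check: confirm the embedding model loads and that its output
# dimension matches the embedding_dimension declared in run.yaml
# (768 for sentence-transformers/all-mpnet-base-v2).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
embedding = model.encode(["What is BYOK?"])

print(embedding.shape)  # expected: (1, 768)
```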
### Step 3: Configure Embedding Model

Download and configure your embedding model using the embedding-generation step described in the rag-content repository. For example:

```bash
mkdir ./embeddings_model
pdm run python ./scripts/download_embeddings_model.py -l ./embeddings_model/ -r sentence-transformers/all-mpnet-base-v2
```

### Step 4: Configure Llama Stack

Edit your `run.yaml` file to include the BYOK configuration:

```yaml
version: 2
image_name: byok-configuration

# Required APIs for BYOK
apis:
- agents
- inference
- vector_io
- tool_runtime
- safety

models:
  # Your LLM model
  - model_id: your-llm-model
    provider_id: openai # or your preferred provider
    model_type: llm
    provider_model_id: gpt-4o-mini

  # Embedding model for BYOK
  - model_id: sentence-transformers/all-mpnet-base-v2
    metadata:
      embedding_dimension: 768
    model_type: embedding
    provider_id: sentence-transformers
    provider_model_id: /path/to/embedding_models/all-mpnet-base-v2

providers:
  inference:
    # Embedding model provider
    - provider_id: sentence-transformers
      provider_type: inline::sentence-transformers
      config: {}

    # LLM provider (example: OpenAI)
    - provider_id: openai
      provider_type: remote::openai
      config:
        api_key: ${env.OPENAI_API_KEY}

  agents:
    - provider_id: meta-reference
      provider_type: inline::meta-reference
      config:
        persistence_store:
          type: sqlite
          db_path: .llama/distributions/ollama/agents_store.db
        responses_store:
          type: sqlite
          db_path: .llama/distributions/ollama/responses_store.db

  safety:
    - provider_id: llama-guard
      provider_type: inline::llama-guard
      config:
        excluded_categories: []

  # Vector database provider
  vector_io:
    - provider_id: your-knowledge-base
      provider_type: inline::faiss # or remote::pgvector
      config:
        kvstore:
          type: sqlite
          db_path: /path/to/vector_db/faiss_store.db
          namespace: null

  tool_runtime:
    - provider_id: rag-runtime
      provider_type: inline::rag-runtime
      config: {}

# Enable RAG tools
tool_groups:
- provider_id: rag-runtime
  toolgroup_id: builtin::rag

# Vector database registration
vector_dbs:
- embedding_dimension: 768
  embedding_model: sentence-transformers/all-mpnet-base-v2
  provider_id: your-knowledge-base
  vector_db_id: your-index-id # ID used during index generation
```

### Step 5: Enable RAG Tools

The configuration above automatically enables the RAG tools. The system will:

1. **Detect RAG availability**: Automatically identify when knowledge search is available
2. **Enhance prompts**: Encourage the AI to use knowledge search tools
3. **Force knowledge usage**: Rewrite queries so that the knowledge base is consulted
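For clients that talk to Llama Stack directly, the agent-side wiring looks roughly like the sketch below. The `Agent` import path and API vary across `llama-stack-client` versions, and the model and vector database ids are placeholders from Step 4, so treat this as an outline rather than a drop-in script:

```python
# Sketch: an agent that can invoke the builtin::rag knowledge_search tool.
# Assumes a Llama Stack server configured as in Step 4; ids are placeholders.
from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent  # path varies by version

client = LlamaStackClient(base_url="http://localhost:8321")

agent = Agent(
    client,
    model="your-llm-model",
    instructions="Use the knowledge_search tool to answer questions.",
    tools=[
        {
            # Scope the builtin RAG toolgroup to the BYOK vector database.
            "name": "builtin::rag/knowledge_search",
            "args": {"vector_db_ids": ["your-index-id"]},
        }
    ],
)

session_id = agent.create_session("byok-demo")
response = agent.create_turn(
    messages=[{"role": "user", "content": "Summarize our backup policy."}],
    session_id=session_id,
    stream=False,
)
print(response.output_message.content)
```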
---

## Supported Vector Database Types

### 1. FAISS (Recommended)
- **Type**: Local vector database with SQLite metadata
- **Best for**: Small to medium-sized knowledge bases
- **Configuration**: `inline::faiss`
- **Storage**: SQLite database file

```yaml
vector_io:
- provider_id: faiss-knowledge
  provider_type: inline::faiss
  config:
    kvstore:
      type: sqlite
      db_path: /path/to/faiss_store.db
      namespace: null
```

### 2. pgvector (PostgreSQL)
- **Type**: PostgreSQL with the pgvector extension
- **Best for**: Large-scale deployments, shared knowledge bases
- **Configuration**: `remote::pgvector`
- **Requirements**: PostgreSQL with the pgvector extension installed

```yaml
vector_io:
- provider_id: pgvector-knowledge
  provider_type: remote::pgvector
  config:
    host: localhost
    port: 5432
    db: knowledge_db
    user: lightspeed_user
    password: ${env.DB_PASSWORD}
    kvstore:
      type: sqlite
      db_path: .llama/distributions/pgvector/registry.db
```

**pgvector Table Schema:**
- `id` (text): UUID identifier of the chunk
- `document` (jsonb): JSON containing content and metadata
- `embedding` (vector(n)): The embedding vector (n = embedding dimension)

---

## Configuration Examples

### Example 1: OpenAI + FAISS
Complete configuration for an OpenAI LLM with a local FAISS knowledge base:

```yaml
version: 2
image_name: openai-faiss-byok

apis:
- agents
- inference
- vector_io
- tool_runtime
- safety

models:
- model_id: gpt-4o-mini
  provider_id: openai
  model_type: llm
  provider_model_id: gpt-4o-mini

- model_id: sentence-transformers/all-mpnet-base-v2
  metadata:
    embedding_dimension: 768
  model_type: embedding
  provider_id: sentence-transformers
  provider_model_id: /home/user/embedding_models/all-mpnet-base-v2

providers:
  inference:
    - provider_id: sentence-transformers
      provider_type: inline::sentence-transformers
      config: {}
    - provider_id: openai
      provider_type: remote::openai
      config:
        api_key: ${env.OPENAI_API_KEY}

  agents:
    - provider_id: meta-reference
      provider_type: inline::meta-reference
      config:
        persistence_store:
          type: sqlite
          db_path: .llama/distributions/ollama/agents_store.db
        responses_store:
          type: sqlite
          db_path: .llama/distributions/ollama/responses_store.db

  safety:
    - provider_id: llama-guard
      provider_type: inline::llama-guard
      config:
        excluded_categories: []

  vector_io:
    - provider_id: company-docs
      provider_type: inline::faiss
      config:
        kvstore:
          type: sqlite
          db_path: /home/user/vector_dbs/company_docs/faiss_store.db
          namespace: null

  tool_runtime:
    - provider_id: rag-runtime
      provider_type: inline::rag-runtime
      config: {}

tool_groups:
- provider_id: rag-runtime
  toolgroup_id: builtin::rag

vector_dbs:
- embedding_dimension: 768
  embedding_model: sentence-transformers/all-mpnet-base-v2
  provider_id: company-docs
  vector_db_id: company-knowledge-index
```
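Once a configuration like Example 1 is in place, you can start the server (for a standalone Llama Stack, typically `llama stack run run.yaml`; Lightspeed Core deployments may wrap this differently) and confirm the knowledge base registered. A minimal check, assuming the `llama-stack-client` package and that `vector_dbs.list()` is available in your client version:

```python
# Sketch: verify the BYOK vector database registered after starting the stack
# (e.g. with `llama stack run run.yaml`). Assumes llama-stack-client is installed.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# List registered vector databases; company-knowledge-index should appear.
for vector_db in client.vector_dbs.list():
    print(vector_db.identifier, vector_db.embedding_model)
```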
### Example 2: vLLM + pgvector
Configuration for local vLLM inference with a PostgreSQL knowledge base:

```yaml
version: 2
image_name: vllm-pgvector-byok

apis:
- agents
- inference
- vector_io
- tool_runtime
- safety

models:
- model_id: meta-llama/Llama-3.1-8B-Instruct
  provider_id: vllm
  model_type: llm
  provider_model_id: null

- model_id: sentence-transformers/all-mpnet-base-v2
  metadata:
    embedding_dimension: 768
  model_type: embedding
  provider_id: sentence-transformers
  provider_model_id: sentence-transformers/all-mpnet-base-v2

providers:
  inference:
    - provider_id: sentence-transformers
      provider_type: inline::sentence-transformers
      config: {}
    - provider_id: vllm
      provider_type: remote::vllm
      config:
        url: http://localhost:8000/v1/
        api_token: your-token-here

  agents:
    - provider_id: meta-reference
      provider_type: inline::meta-reference
      config:
        persistence_store:
          type: sqlite
          db_path: .llama/distributions/ollama/agents_store.db
        responses_store:
          type: sqlite
          db_path: .llama/distributions/ollama/responses_store.db

  safety:
    - provider_id: llama-guard
      provider_type: inline::llama-guard
      config:
        excluded_categories: []

  vector_io:
    - provider_id: enterprise-knowledge
      provider_type: remote::pgvector
      config:
        host: postgres.company.com
        port: 5432
        db: enterprise_kb
        user: rag_user
        password: ${env.POSTGRES_PASSWORD}
        kvstore:
          type: sqlite
          db_path: .llama/distributions/pgvector/registry.db

  tool_runtime:
    - provider_id: rag-runtime
      provider_type: inline::rag-runtime
      config: {}

tool_groups:
- provider_id: rag-runtime
  toolgroup_id: builtin::rag
  args: null
  mcp_endpoint: null

vector_dbs:
- embedding_dimension: 768
  embedding_model: sentence-transformers/all-mpnet-base-v2
  provider_id: enterprise-knowledge
  vector_db_id: enterprise-docs
```

---

## Conclusion

The BYOK (Bring Your Own Knowledge) feature in Lightspeed Core lets you integrate custom knowledge sources through RAG. By following this guide, you can implement and configure BYOK to ground your AI system's responses in domain-specific knowledge.

For additional support and advanced configurations, refer to:
- [RAG Configuration Guide](rag_guide.md)
- [Llama Stack Documentation](https://llama-stack.readthedocs.io/)
- [rag-content Tool Repository](https://github.com/lightspeed-core/rag-content)

Remember to update your knowledge sources regularly and monitor system performance to maintain optimal BYOK functionality.
\ No newline at end of file