# RAG Knowledge Base

This notebook demonstrates how to use Llama Stack's Vector I/O for retrieval-augmented generation.

## Prerequisites

- Llama Stack running at `http://localhost:5001`
- MyloWare installed: `pip install -e .`
- Knowledge documents in `data/knowledge/`


In [None]:
import sys
sys.path.insert(0, '../src')

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5001")


## 1. List Vector Databases

Check which knowledge bases are available:


In [None]:
# List all vector databases
vector_dbs = client.vector_dbs.list()
for db in vector_dbs:
    print(f"- {db.identifier}: {db.embedding_dimension}d embeddings")


## 2. Register a New Vector Database

Create a new vector database for storing embeddings:


In [None]:
# Register a vector database (idempotent - safe to run multiple times)
client.vector_dbs.register(
    vector_db_id="demo_knowledge",
    embedding_model="all-MiniLM-L6-v2",
    embedding_dimension=384,
    provider_id="faiss",
)
print("Vector database 'demo_knowledge' registered")
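
If you would rather not rely on re-registration being safe, one option is to check the existing databases first. The sketch below uses only the two calls already shown in this notebook (`vector_dbs.list()` and `vector_dbs.register()`). The helper name `ensure_vector_db` is hypothetical, and it assumes each listed database's `identifier` matches the ID that was passed to `register`.

```python
def ensure_vector_db(client, vector_db_id, embedding_model, embedding_dimension, provider_id):
    """Register a vector database only if no database with that ID is listed.

    Hypothetical helper, not part of Llama Stack or MyloWare.
    Returns True if a registration call was made, False if the ID already existed.
    """
    # Assumption: `identifier` equals the ID that was used at registration time.
    existing_ids = {db.identifier for db in client.vector_dbs.list()}
    if vector_db_id in existing_ids:
        return False

    client.vector_dbs.register(
        vector_db_id=vector_db_id,
        embedding_model=embedding_model,
        embedding_dimension=embedding_dimension,
        provider_id=provider_id,
    )
    return True
```

With the client from the first cell, this would be called as `ensure_vector_db(client, "demo_knowledge", "all-MiniLM-L6-v2", 384, "faiss")`.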


## 3. Insert Documents

Add documents to the vector database:


In [None]:
# Sample documents about video production.
# The document ID lives inside `metadata`, which is where Llama Stack's
# chunk format carries it (a chunk is `content` plus `metadata`).
documents = [
    {
        "content": "For ASMR videos, use soft lighting and minimal background noise. Keep transitions gentle and use slow fades rather than hard cuts.",
        "metadata": {"document_id": "video_tips_1", "topic": "asmr", "type": "tips"},
    },
    {
        "content": "Vertical video (9:16) works best for TikTok and Instagram Reels. Horizontal (16:9) is better for YouTube and web embeds.",
        "metadata": {"document_id": "video_tips_2", "topic": "formats", "type": "tips"},
    },
    {
        "content": "Aries (March 21 - April 19): Bold, ambitious fire sign. Video themes: action, leadership, new beginnings. Colors: red, orange.",
        "metadata": {"document_id": "zodiac_aries", "topic": "zodiac", "sign": "aries"},
    },
]

# Insert documents
client.vector_io.insert(
    vector_db_id="demo_knowledge",
    chunks=documents,
)
print(f"Inserted {len(documents)} documents")
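
The sample documents above are short enough to embed whole. Longer documents are usually split into overlapping pieces before insertion, so that each embedding covers a focused passage. The following is a minimal sketch of word-based splitting. The function name and the default sizes are illustrative assumptions, not values taken from Llama Stack or MyloWare.

```python
def split_into_chunks(text, max_words=50, overlap=10):
    """Split text into windows of at most `max_words` words.

    Consecutive windows share `overlap` words so that a sentence cut at a
    boundary still appears intact in one of the two neighbouring windows.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be at least 0 and smaller than max_words")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window already reaches the end of the text.
        if start + max_words >= len(words):
            break
    return chunks


pieces = split_into_chunks("one two three four five six seven", max_words=4, overlap=1)
print(pieces)  # ['one two three four', 'four five six seven']
```

Each returned string can then be used as the `content` of one entry in the list passed to `client.vector_io.insert`.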


## 4. Query the Knowledge Base

Retrieve relevant documents based on semantic similarity:


In [None]:
# Query for relevant documents
results = client.vector_io.query(
    vector_db_id="demo_knowledge",
    query="What aspect ratio should I use for TikTok?",
    params={"max_chunks": 3},
)

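print("Query Results:")
# Pair each chunk with its score instead of indexing into results.scores
for rank, (chunk, score) in enumerate(zip(results.chunks, results.scores), start=1):
    print(f"\n{rank}. Score: {score:.3f}")
    print(f"   Content: {chunk.content[:100]}...")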
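
Retrieval is only the first half of retrieval-augmented generation. The retrieved passages still have to be placed in front of the model alongside the question. The sketch below shows one simple way to assemble such a prompt from plain strings. The function name, the prompt wording, and the character limit are all illustrative assumptions. It deliberately stops short of calling an inference endpoint.

```python
def build_rag_prompt(question, passages, max_chars_per_passage=500):
    """Combine retrieved passages and a question into a single prompt string."""
    context_lines = []
    for number, passage in enumerate(passages, start=1):
        # Trim each passage so one long chunk cannot crowd out the others.
        snippet = passage.strip()[:max_chars_per_passage]
        context_lines.append(f"[{number}] {snippet}")

    context = "\n".join(context_lines) if context_lines else "(no passages retrieved)"
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_rag_prompt(
    "What aspect ratio should I use for TikTok?",
    ["Vertical video (9:16) works best for TikTok and Instagram Reels."],
)
print(prompt)
```

Assuming each chunk's `content` is a plain string, the passages for a real query could be collected with `[chunk.content for chunk in results.chunks]`.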


In [None]:
# How knowledge is loaded in MyloWare

# load_knowledge_documents loads all Markdown files from data/knowledge/ and
# data/projects/{project}/ into the project's vector database.
# You normally do not need to call it yourself, so the call stays commented out:
# load_knowledge_documents(client, "motivational")

print("Knowledge is automatically loaded when workflows start")
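
To give a feel for what loading Markdown files from a directory involves, here is a minimal, standalone sketch. It is not MyloWare's `load_knowledge_documents`. The helper name `read_markdown_documents` and the choice of the relative path as the document ID are assumptions made for illustration. The demo runs against a temporary directory rather than `data/knowledge/`.

```python
import tempfile
from pathlib import Path


def read_markdown_documents(root):
    """Return {document_id: text} for every non-empty .md file under `root`."""
    root = Path(root)
    documents = {}
    for path in sorted(root.rglob("*.md")):
        # Use the path relative to `root`, minus the suffix, as the ID
        # so files with the same name in different folders stay distinct.
        document_id = path.relative_to(root).with_suffix("").as_posix()
        text = path.read_text(encoding="utf-8").strip()
        if text:  # skip empty files
            documents[document_id] = text
    return documents


# Demo on a throwaway directory with one Markdown file, one empty file,
# and one non-Markdown file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "formats.md").write_text("# Formats\nVertical video is 9:16.", encoding="utf-8")
    (Path(tmp) / "empty.md").write_text("", encoding="utf-8")
    (Path(tmp) / "notes.txt").write_text("ignored", encoding="utf-8")
    demo_docs = read_markdown_documents(tmp)

print(demo_docs)
```

Each entry in the returned mapping could then be turned into one or more chunks and inserted with `client.vector_io.insert`, as in section 3.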
