# Minecraft Genie - Embedding & Query Testing

This notebook demonstrates how to work with the Minecraft Lore vector database using ChromaDB and LlamaIndex.

## Overview
- **Purpose**: Load persisted vector embeddings of Minecraft lore and test querying functionality
- **Technology Stack**: ChromaDB for vector storage, LlamaIndex for indexing/querying, OpenAI embeddings
- **Data**: Minecraft lore documents embedded as vectors for semantic search

## Workflow
1. **Setup**: Import necessary modules and configure paths
2. **Load Database**: Connect to the persisted ChromaDB vector store
3. **Query Testing**: Test semantic search capabilities on Minecraft lore data

In [11]:
# Setup
import sys
from pathlib import Path
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import VectorStoreIndex, StorageContext
import chromadb

project_root = Path("..").resolve()
sys.path.insert(0, str(project_root))

## Loading Persisted Vector Database

The next cell connects to an already-created ChromaDB database containing Minecraft lore embeddings. The database was created using the `build_vector_index()` function from the `data.embedder` module.

**Key Components:**
- **ChromaDB**: Vector database storing document embeddings
- **PersistentClient**: Opens the existing on-disk database directory (here `../db/minecraft_lore`)
- **Collection**: Named container ("minecraft_lore") holding the embedded documents
- **VectorStoreIndex**: LlamaIndex interface for querying the vector store

In [12]:
# Create persistent client to connect to existing database
client = chromadb.PersistentClient(path="../db/minecraft_lore")

# Get the existing collection
collection = client.get_collection("minecraft_lore")

# Create vector store with the existing collection
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_vector_store(
    vector_store, storage_context=storage_context
)

## Semantic Search Testing

Now we can test the semantic search capabilities by creating a query engine and asking questions about Minecraft lore. The query engine will:

1. **Embed the query**: Convert the question into a vector using the same embedding model that was used to build the index
2. **Find similar vectors**: Search the database for the document chunks whose vectors are closest to the query vector
3. **Generate response**: Pass the retrieved chunks to the language model as context for the answer
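
The first two steps above can be illustrated with a minimal, dependency-free sketch. It is not how LlamaIndex or ChromaDB implement retrieval: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, the sample chunks are made up for illustration, and the answer-generation step is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the top_k chunks ranked by similarity to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:top_k]


# Hypothetical sample chunks, not taken from the real collection
chunks = [
    "Sand and gravel are blocks affected by gravity.",
    "A brewing stand is used to brew potions.",
    "Villagers trade items for emeralds.",
]

top = retrieve("Which blocks are affected by gravity?", chunks, top_k=1)
print(top[0])
```

In the real pipeline, the retrieved chunks would then be handed to the language model as context. If retrieval returns nothing usable, there is no context to build an answer from.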

In [14]:
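# Build a query engine with the default retriever and response synthesizer
query_engine = index.as_query_engine()

# Ask a question whose answer should be present in the embedded lore
query = "List the blocks that are affected by gravity. They are 11 of them."
results = query_engine.query(query)

print(results)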

Empty Response
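
An empty response does not by itself show whether the relevant text is missing from the collection or simply was not retrieved. One way to separate the two cases is a plain substring check over the stored documents. The sketch below assumes the documents are available as a list of strings. The helper name `snippet_coverage` and the sample data are hypothetical.

```python
def snippet_coverage(documents, snippets):
    """Map each snippet to True if it occurs (case-insensitively) in any document."""
    lowered = [doc.lower() for doc in documents if doc]
    return {s: any(s.lower() in doc for doc in lowered) for s in snippets}


# Hypothetical sample data, not taken from the real collection
documents = [
    "Sand and gravel fall when unsupported.",
    "Use blaze powder to fuel a brewing stand.",
]

coverage = snippet_coverage(documents, ["Blaze Powder", "anvil"])
print(coverage)
```

A snippet that is absent points to a gap in the ingested data. A snippet that is present but never reaches the answer points to the retrieval or generation step instead.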


In [None]:
# Inspect the contents of the ChromaDB collection
print(f"Collection name: {collection.name}")
print(f"Number of documents: {collection.count()}")

# Fetch all documents from the collection and lowercase them once for matching
docs = collection.get(include=["documents"])
documents_lower = [doc.lower() for doc in docs["documents"]]

# Load the gold prompts to validate against the database content
import json
with open("../evaluation/gold_prompts.json", "r", encoding="utf-8") as f:
    gold_prompts = json.load(f)

# For each prompt, report whether every expected snippet appears in some document
for prompt in gold_prompts:
    question = prompt["question"]
    expected_snippets = prompt["expected_answer_contains"]
    # Compute each snippet's match once and reuse it for the status and the details
    snippet_results = [
        (snippet, any(snippet.lower() in doc for doc in documents_lower))
        for snippet in expected_snippets
    ]
    status = "✔" if all(found for _, found in snippet_results) else "✘"
    print(f"[{status}] Question: {question}")
    for snippet, found in snippet_results:
        print(f"  Snippet '{snippet}' found: {found}")

Collection name: minecraft_lore
Number of documents: 239
Query: Give me the trading link
Results: Empty Response
[✘] Question: Give me the trading web link
  Snippet 'https://minecraft.wiki/w/Trading' found: False
[✔] Question: Whats the xp level a villager need to become a Master?
  Snippet '250' found: True
[✔] Question: How many raw chicken are needed to get one emerauld for a novice butcher?
  Snippet '14' found: True
[✔] Question: Whats the probability of having the dried kelp block trade with expert butcher?
  Snippet '100%' found: True
[✔] Question: For the fisherman villager, what does the type of boat traded depends on?
  Snippet 'biome' found: True
  Snippet 'outfit' found: True
[✔] Question: What's the list of the brewing equipment?
  Snippet 'brewing stand' found: True
  Snippet 'water' found: True
  Snippet 'blaze powder' found: True
  Snippet 'water bottle' found: True
[✔] Question: What is the effect of adding gunpowder in a potion?
  Snippet 'splash' found: True
[✔] Que