# 🔧 Caliper-AI: DIY Assistant Demo

This notebook demonstrates the Caliper-AI RAG (Retrieval-Augmented Generation) pipeline for DIY project assistance, from loading the data to running semantic search queries.

## What This Demo Shows
1. **Load DIY data** from CSV
2. **Set up ChromaDB** collection
3. **Generate embeddings** for semantic search
4. **Query the system** for relevant DIY guidance

## System Architecture
- **Data Source**: CSV with DIY Q&As
- **Vector Database**: ChromaDB for semantic search
- **Embeddings**: Mock OpenAI-style vectors
- **Query System**: RAG-based content retrieval


## Step 1: Load DIY Data
Load DIY snippets from CSV and prepare for ChromaDB storage.
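
To make the expected record shape concrete, here is a minimal sketch of turning CSV rows into the `id` / `text` / `metadata` dictionaries that the cells in this notebook read. The column names, the sample rows, and the helper `rows_to_documents` are assumptions for illustration; the real `load_diy_data` and `diy_snippets.csv` may differ.

```python
import csv
import io

# Hypothetical CSV layout and sample rows; the real diy_snippets.csv may differ
SAMPLE_CSV = """id,category,tools_required,ppe_required,text
diy-001,plumbing,adjustable wrench,gloves,Shut off the water supply before removing the faucet handle.
diy-002,painting,roller; painter's tape,mask,Tape trim and outlets before cutting in along the edges.
"""

def rows_to_documents(csv_text: str) -> list[dict]:
    """Turn CSV rows into {'id', 'text', 'metadata'} records (assumed shape)."""
    documents = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        documents.append({
            "id": row["id"],
            "text": row["text"],
            # Everything that is not the id or the text is kept as metadata
            "metadata": {
                "category": row["category"],
                "tools_required": row["tools_required"],
                "ppe_required": row["ppe_required"],
            },
        })
    return documents

docs = rows_to_documents(SAMPLE_CSV)
print(len(docs), sorted({d["metadata"]["category"] for d in docs}))
```

Keeping the text separate from the metadata means the text can be embedded for search while the metadata stays available for display and filtering.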


In [None]:
# Import the data ingestion module
import sys
sys.path.append('../scripts')
from ingest_data import load_diy_data

# Load DIY data
print("📁 Loading DIY data...")
documents = load_diy_data("../data/diy_snippets.csv")

if documents:
    print(f"✅ Loaded {len(documents)} DIY snippets")
    # Sort the unique categories so the printed order is stable across runs
    categories = sorted({doc['metadata']['category'] for doc in documents})
    print(f"   Categories: {categories}")
    
    # Show sample document
    print("\n📄 Sample document:")
    sample = documents[0]
    print(f"   ID: {sample['id']}")
    print(f"   Category: {sample['metadata']['category']}")
    print(f"   Text: {sample['text'][:100]}...")
else:
    print("❌ Failed to load data")


## Step 2: Set Up ChromaDB Collection
Initialize ChromaDB and store the DIY documents.
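
As a rough sketch of what storing documents involves: ChromaDB's `collection.add` takes parallel lists (`ids`, `documents`, `metadatas`, and optionally `embeddings`) rather than a list of records. The helper `to_chroma_payload` below is hypothetical and not part of `setup_chroma`; it only shows how records of the assumed `id` / `text` / `metadata` shape could be split into those lists.

```python
def to_chroma_payload(documents: list[dict]) -> dict:
    """Split document records into the parallel lists that collection.add accepts."""
    return {
        "ids": [doc["id"] for doc in documents],
        "documents": [doc["text"] for doc in documents],
        "metadatas": [doc["metadata"] for doc in documents],
    }

# Hypothetical records in the shape used throughout this notebook
example_docs = [
    {"id": "diy-001", "text": "Shut off the water supply first.",
     "metadata": {"category": "plumbing"}},
    {"id": "diy-002", "text": "Tape trim before painting.",
     "metadata": {"category": "painting"}},
]

payload = to_chroma_payload(example_docs)
# With chromadb installed and a collection in hand, this could be passed along as:
#     collection.add(**payload)
print(payload["ids"])
```

All three lists must have the same length and order, since each position describes one document.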


In [None]:
# Import ChromaDB setup module
from setup_chroma import setup_chroma_collection, store_documents, verify_collection

print("🗄️ Setting up ChromaDB...")

# Set up collection
client, collection = setup_chroma_collection("../chroma_db")

if collection:
    print("✅ ChromaDB collection ready")
    
    # Store documents
    if store_documents(collection, documents):
        print("✅ Documents stored in ChromaDB")
        
        # Verify collection
        verify_collection(collection)
    else:
        print("❌ Failed to store documents")
else:
    print("❌ Failed to set up ChromaDB")


## Step 3: Generate Embeddings
Create vector embeddings for semantic search.
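
Since this demo uses mock vectors rather than a real embedding API, one possible way to build them is sketched below: a deterministic, unit-length vector seeded from a hash of the text. The function `mock_embedding` and the 8-dimension default are assumptions for illustration; the actual `generate_embeddings` module may work differently. Vectors like this carry no semantic meaning, so they are only useful for exercising the pipeline.

```python
import hashlib
import math
import random

def mock_embedding(text: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for an embedding API call: deterministic unit vector."""
    # Seed a private RNG from a stable hash so the same text always maps to the same vector
    seed = int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest()[:8], "big")
    rng = random.Random(seed)
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    # Normalize to unit length, as many embedding APIs do
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

vector = mock_embedding("how to fix a leaky faucet")
print(len(vector))
```

Using `hashlib` instead of the built-in `hash()` keeps the vectors identical across Python sessions, which matters when stored vectors are queried later.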


In [None]:
# Import embedding generation module
from generate_embeddings import generate_embeddings_for_documents, store_embeddings_in_chroma

print("🧠 Generating embeddings...")

# Generate embeddings
documents_with_embeddings = generate_embeddings_for_documents(documents)

if documents_with_embeddings:
    print("✅ Embeddings generated")
    print(f"   Embedding dimensions: {len(documents_with_embeddings[0]['embedding'])}")
    
    # Store embeddings
    if store_embeddings_in_chroma(documents_with_embeddings):
        print("✅ Embeddings stored in ChromaDB")
    else:
        print("❌ Failed to store embeddings")
else:
    print("❌ Failed to generate embeddings")


## Step 4: Test Semantic Search
Query the system to find relevant DIY guidance.
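
Conceptually, semantic search ranks stored vectors by how close they are to the query vector. The sketch below shows that idea with plain cosine similarity over a tiny in-memory index. The names `cosine_similarity` and `top_k_by_cosine` and the toy 2-D vectors are assumptions for illustration; `search_chroma` delegates this work to ChromaDB and may use a different distance metric.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_by_cosine(query_vec, index, k=2):
    """Return the k (doc_id, similarity) pairs most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(doc_id, score) for score, doc_id in scored[:k]]

# Toy 2-D vectors standing in for real embeddings
index = {
    "faucet": [1.0, 0.0],
    "paint": [0.0, 1.0],
    "sanding": [0.7, 0.7],
}
results = top_k_by_cosine([0.9, 0.1], index, k=2)
print([doc_id for doc_id, _ in results])  # faucet first, then sanding
```

A vector database performs the same ranking but uses an approximate index so it does not have to compare the query against every stored vector.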


In [None]:
# Import query system module
from query_system import search_chroma

print("🔍 Testing semantic search...")

# Test queries
test_queries = [
    "how to fix a leaky faucet",
    "what tools do I need for woodworking",
    "how to paint a room properly",
    "safety equipment for sanding"
]

for query in test_queries:
    print(f"\n🔍 Query: '{query}'")
    results = search_chroma(query, top_k=2)
    
    if results:
        print(f"✅ Found {len(results)} relevant snippets")
        for i, result in enumerate(results, 1):
            print(f"   {i}. {result['metadata']['category']} - {result['text'][:80]}...")
            print(f"      Tools: {result['metadata']['tools_required']}")
    else:
        print("❌ No results found")


## Interactive Query
Try your own DIY questions!
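
The `ask_caliper` cell below reports relevance as `1 - distance`. That reading holds when the collection uses cosine distance, where distance is one minus cosine similarity. With an unbounded distance such as squared L2, the same expression can go negative. The helper `relevance_from_distance` is a hypothetical sketch of handling both cases; the `1 / (1 + distance)` mapping for L2 is one common convention, not something the original code does.

```python
def relevance_from_distance(distance: float, metric: str = "cosine") -> float:
    """Map a distance to a 'higher is better' relevance score (illustrative only)."""
    if metric == "cosine":
        # Cosine distance is 1 - cosine similarity, so this recovers the similarity
        return 1.0 - distance
    if metric == "l2":
        # Squashes an unbounded non-negative distance into the range (0, 1]
        return 1.0 / (1.0 + distance)
    raise ValueError(f"unsupported metric: {metric}")

print(f"{relevance_from_distance(0.25):.2f}")
print(f"{relevance_from_distance(3.0, metric='l2'):.2f}")
```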


In [None]:
# Interactive query function
def ask_caliper(question):
    """Print the top matching DIY snippets for a question."""
    results = search_chroma(question, top_k=3)
    
    print(f"\n🔧 Caliper's Answer to: '{question}'")
    print("=" * 50)
    
    if results:
        for i, result in enumerate(results, 1):
            print(f"\n{i}. {result['metadata']['category']} - ID: {result['id']}")
            print(f"   Tools: {result['metadata']['tools_required']}")
            print(f"   PPE: {result['metadata']['ppe_required']}")
            print(f"   Content: {result['text']}")
            # Relevance assumes a cosine-style distance, where smaller means closer
            print(f"   Relevance: {1 - result['distance']:.2f}")
    else:
        print("Sorry, I couldn't find relevant DIY guidance for that question.")

# Try some questions
ask_caliper("how do I install floating shelves?")
ask_caliper("what safety equipment do I need for refinishing furniture?")


## 🎉 Demo Complete!

### What We Built
- **Complete RAG system** for DIY project assistance
- **Semantic search** using vector embeddings
- **ChromaDB integration** for scalable storage
- **Mock AI pipeline** ready for real API integration

### Key Features
- ✅ **Data ingestion** from CSV
- ✅ **Vector storage** in ChromaDB
- ✅ **Semantic search** with embeddings
- ✅ **RAG queries** for relevant content
- ✅ **Status messages** and failure checks at each step

### Next Steps
- Replace the mock embeddings with real OpenAI API calls
- Scale to larger DIY datasets
- Deploy to cloud infrastructure
- Add user interface

**Caliper-AI is ready to help DIY enthusiasts turn overwhelm into action!** 🚀
