# Dynamic In-Context Learning (dICL) Demo

This notebook demonstrates the dICL system by:
1. Taking a user question
2. Finding the top 3 most relevant examples from the database
3. Using those examples as context for generating an answer


## Setup and Imports


In [3]:
import sys
import os
import requests
import pandas as pd
import lancedb
from typing import List, Dict, Any
from dataclasses import dataclass
import logging

# Add the src directory to the path
sys.path.append('/Users/davidhughes/dev/mmgraphrag-odsc-west-2025/src')

# Import our dICL system
from mmgraphrag_odsc_west_2025.lib.dicl_system import DICLSystem
from mmgraphrag_odsc_west_2025.config import config

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

print("Setup complete!")


Setup complete!


## Check Prerequisites

Before we start, let's make sure Ollama is running and accessible.


In [4]:
# Check if Ollama is running
import requests

ollama_url = "http://localhost:11434"
try:
    response = requests.get(f"{ollama_url}/api/tags", timeout=5)
    if response.status_code == 200:
        print("✅ Ollama is running and accessible")
        
        # Check if phi4 model is available
        models = response.json().get('models', [])
        phi4_models = [m for m in models if 'phi4' in m.get('name', '').lower()]
        if phi4_models:
            print(f"✅ Phi4 model found: {phi4_models[0]['name']}")
        else:
            print("⚠️  Phi4 model not found. You may need to pull it with: ollama pull phi4")
    else:
        print("❌ Ollama is not responding properly")
except Exception as e:
    print(f"❌ Cannot connect to Ollama: {e}")
    print("Please make sure Ollama is running: ollama serve")


✅ Ollama is running and accessible
✅ Phi4 model found: phi4-ctx16k:latest


## Initialize the dICL System

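Now let's initialize the dICL system and connect to the database. The initialization cell (`In [5]`) appears after the database-viewing cells below; run it first, because those cells read from `system`.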


## View Database Contents

Let's take a look at what's stored in the LanceDB database.


In [7]:
# View the LanceDB table contents
import pandas as pd

def view_database_contents():
    """Display the contents of the LanceDB database."""
    try:
        # Get all data from the table
        all_data = system.table.to_pandas()
        
        print(f"📊 Database Overview:")
        print(f"   Total examples: {len(all_data)}")
        print(f"   Columns: {list(all_data.columns)}")
        print("\n" + "="*80)
        
        # Show basic statistics
        print(f"\n📈 Data Statistics:")
        print(f"   - Unique IDs: {all_data['id'].nunique()}")
        
        # Group by topic (based on ID prefix)
        topics = all_data['id'].str.split('_').str[0].value_counts()
        print(f"   - Topics: {dict(topics)}")
        
        print("\n" + "="*80)
        print(f"\n📋 Sample Data (first 5 examples):")
        print("\n" + "-"*80)
        
        # Display sample data
        for i, (_, row) in enumerate(all_data.head().iterrows()):
            print(f"\n🔹 Example {i+1}:")
            print(f"   ID: {row['id']}")
            print(f"   Question: {row['input']}")
            print(f"   Answer: {row['output'][:100]}{'...' if len(row['output']) > 100 else ''}")
            print(f"   Vector dimension: {len(row['vector'])}")
            print("-" * 60)
        
        # Show all data in a table format (without vectors for readability)
        print(f"\n📊 Complete Data Table (without vectors):")
        display_data = all_data[['id', 'input', 'output']].copy()
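        # Append an ellipsis only when the answer was actually cut off
        display_data['output'] = display_data['output'].map(
            lambda text: text[:50] + '...' if len(text) > 50 else text
        )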
        display(display_data)
        
        return all_data
        
    except Exception as e:
        print(f"❌ Error viewing database: {e}")
        return None

# Call the function to view the database
database_data = view_database_contents()


📊 Database Overview:
   Total examples: 50
   Columns: ['id', 'vector', 'input', 'output']


📈 Data Statistics:
   - Unique IDs: 50
   - Topics: {'technology': 20, 'bananas': 15, 'functional programming, lambda calculus': 15}


📋 Sample Data (first 5 examples):

--------------------------------------------------------------------------------

🔹 Example 1:
   ID: bananas_0
   Question: What is the botanical name of a banana?
   Answer: The botanical name of a banana is Musa. Bananas belong to the genus Musa, which includes several spe...
   Vector dimension: 768
------------------------------------------------------------

🔹 Example 2:
   ID: bananas_1
   Question: Where do bananas originally come from?
   Answer: Bananas are believed to have originated in Southeast Asia, particularly in regions that include pres...
   Vector dimension: 768
------------------------------------------------------------

🔹 Example 3:
   ID: bananas_2
   Question: What is the primary nutrient found in banan

| | id | input | output |
|---|---|---|---|
| 0 | bananas_0 | What is the botanical name of a banana? | The botanical name of a banana is Musa. Banana... |
| 1 | bananas_1 | Where do bananas originally come from? | Bananas are believed to have originated in Sou... |
| 2 | bananas_2 | What is the primary nutrient found in bananas? | The primary nutrient found in bananas is carbo... |
| 3 | bananas_3 | How do bananas ripen after being harvested? | Bananas continue to ripen after being harveste... |
| 4 | bananas_4 | What are some common varieties of bananas? | Some common varieties of bananas include the C... |
| 5 | bananas_5 | What role do bananas play in global agriculture? | Bananas are one of the most important staple f... |
| 6 | bananas_6 | What is the significance of bananas in terms o... | Bananas are well-known for their high potassiu... |
| 7 | bananas_7 | How do bananas contribute to a balanced diet? | Bananas contribute to a balanced diet by provi... |
| 8 | bananas_8 | What is the difference between bananas and pla... | The main difference between bananas and planta... |
| 9 | bananas_9 | What environmental conditions do banana plants... | Banana plants thrive in warm, tropical climate... |


In [9]:
# Advanced database exploration
def explore_database():
    """Provide additional database exploration options."""
    if database_data is None:
        print("❌ No database data available. Run the previous cell first.")
        return
    
    print("🔍 Advanced Database Exploration:")
    print("\n" + "="*50)
    
    # 1. Search by topic
    print("\n1️⃣ Search by Topic:")
    topics = database_data['id'].str.split('_').str[0].unique()
    for topic in topics:
        topic_data = database_data[database_data['id'].str.startswith(topic)]
        print(f"   {topic}: {len(topic_data)} examples")
    
    # 2. Show longest/shortest examples
    print(f"\n2️⃣ Example Length Analysis:")
    # Compute lengths in a local Series so the shared DataFrame is not modified
    input_lengths = database_data['input'].str.len()
    
    longest_input = database_data.loc[input_lengths.idxmax()]
    shortest_input = database_data.loc[input_lengths.idxmin()]
    
    print(f"   Longest question: {longest_input['input'][:80]}...")
    print(f"   Shortest question: {shortest_input['input']}")
    
    # 3. Vector analysis
    print(f"\n3️⃣ Vector Analysis:")
    vector_dims = [len(row['vector']) for _, row in database_data.iterrows()]
    print(f"   Vector dimension: {vector_dims[0]} (all vectors should be same size)")
    print(f"   Number of vectors: {len(vector_dims)}")
    
    # 4. Interactive search
    print(f"\n4️⃣ Interactive Search:")
    print("   You can search for specific content using pandas:")
    print("   - database_data[database_data['input'].str.contains('your_search_term')]")
    print("   - database_data[database_data['id'].str.startswith('bananas')]")
    
    return database_data

# Run the exploration
explored_data = explore_database()


🔍 Advanced Database Exploration:


1️⃣ Search by Topic:
   bananas: 15 examples
   technology: 20 examples
   functional programming, lambda calculus: 15 examples

2️⃣ Example Length Analysis:
   Longest question: Describe the difference between 'call by value' and 'call by name' evaluation st...
   Shortest question: How are bananas typically harvested?

3️⃣ Vector Analysis:
   Vector dimension: 768 (all vectors should be same size)
   Number of vectors: 50

4️⃣ Interactive Search:
   You can search for specific content using pandas:
   - database_data[database_data['input'].str.contains('your_search_term')]
   - database_data[database_data['id'].str.startswith('bananas')]


In [5]:
# Initialize the dICL system
system = DICLSystem()

# Try to connect to the existing database (using absolute path)
db_path = "/Users/davidhughes/dev/mmgraphrag-odsc-west-2025/dicl_examples"

try:
    system.initialize_database(db_path)
    print("✅ Successfully connected to dICL database")
    print(f"Database location: {system.db_path}")
except Exception as e:
    print(f"❌ Error connecting to database: {e}")
    print("The database may not exist or may not be properly populated.")
    print("Let's try to populate it now...")
    
    # Try to populate the database
    try:
        system.populate_database(db_path)
        print("✅ Database populated successfully!")
        print(f"Database location: {system.db_path}")
    except Exception as populate_error:
        print(f"❌ Error populating database: {populate_error}")
        print("Make sure Ollama is running with the phi4 model")


2025-10-23 07:55:25,747 - mmgraphrag_odsc_west_2025.lib.dicl_system - INFO - Connected to database at /Users/davidhughes/dev/mmgraphrag-odsc-west-2025/dicl_examples
INFO:mmgraphrag_odsc_west_2025.lib.dicl_system:Connected to database at /Users/davidhughes/dev/mmgraphrag-odsc-west-2025/dicl_examples


✅ Successfully connected to dICL database
Database location: /Users/davidhughes/dev/mmgraphrag-odsc-west-2025/dicl_examples


## Step 1: User Input and Similar Example Search

Enter your question below, and the system will find the top 3 most relevant examples from the database.
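
To make the retrieval step concrete, here is a minimal sketch of a top-k similarity search. The notebook does not show how `_search_similar_examples` ranks rows, so this assumes cosine similarity over the stored vectors. The `StoredExample` class, the helper names, and the toy data are all hypothetical and are not part of the dICL system.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class StoredExample:
    """Hypothetical stand-in for one database row."""
    id: str
    input: str
    output: str
    vector: Sequence[float]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_examples(
    query_vector: Sequence[float],
    examples: List[StoredExample],
    k: int = 3,
) -> List[StoredExample]:
    """Return the k stored examples most similar to the query, best match first."""
    ranked = sorted(
        examples,
        key=lambda ex: cosine_similarity(query_vector, ex.vector),
        reverse=True,
    )
    return ranked[:k]


# Toy 3-dimensional vectors for illustration only; the real table stores
# 768-dimensional embeddings, and these rows are not from the real database.
toy_examples = [
    StoredExample("toy_a", "toy question A", "toy answer A", [0.9, 0.1, 0.0]),
    StoredExample("toy_b", "toy question B", "toy answer B", [0.1, 0.9, 0.0]),
    StoredExample("toy_c", "toy question C", "toy answer C", [0.7, 0.7, 0.0]),
    StoredExample("toy_d", "toy question D", "toy answer D", [0.0, 0.0, 1.0]),
]

query_vector = [1.0, 0.0, 0.0]
for ex in top_k_examples(query_vector, toy_examples, k=3):
    print(ex.id, round(cosine_similarity(query_vector, ex.vector), 3))
```

In practice a vector database such as LanceDB performs this ranking internally, so a brute-force loop like this is only useful for building intuition or for sanity-checking results on a small table.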


In [12]:
# Interactive user input
import ipywidgets as widgets
from IPython.display import display, clear_output

# Create input widgets
question_input = widgets.Textarea(
    value="What are the health benefits of bananas?",
    placeholder="Enter your question here...",
    description="Question:",
    layout=widgets.Layout(width='100%', height='80px')
)

submit_button = widgets.Button(
    description="Search for Similar Examples",
    button_style='primary',
    layout=widgets.Layout(width='200px')
)

output_area = widgets.Output()

def on_submit_clicked(b):
    """Handle submit button click."""
    with output_area:
        clear_output(wait=True)
        
        user_question = question_input.value.strip()
        
        if not user_question:
            print("❌ Please enter a question!")
            return
            
        print(f"🔍 User Question: {user_question}")
        print("\n" + "="*50)

        # Search for similar examples
        try:
            similar_examples = system._search_similar_examples(user_question, num_examples=3)
            
            print(f"📊 Found {len(similar_examples)} similar examples:")
            print("\n" + "-"*50)
            
            for i, example in enumerate(similar_examples, 1):
                print(f"\n📝 Example {i}:")
                print(f"   ID: {example.id}")
                print(f"   Question: {example.input}")
                print(f"   Answer: {example.output}")
                print("\n" + "-"*30)
            
            # Store the results globally for the next cell
            globals()['user_question'] = user_question
            globals()['similar_examples'] = similar_examples
            
            print(f"\n✅ Ready for Step 2! Found {len(similar_examples)} examples to use as context.")
                
        except Exception as e:
            print(f"❌ Error searching for examples: {e}")

# Connect the button to the function
submit_button.on_click(on_submit_clicked)

# Display the widgets
print("📝 Enter your question below and click 'Search for Similar Examples':")
display(question_input)
display(submit_button)
display(output_area)


📝 Enter your question below and click 'Search for Similar Examples':


Textarea(value='What are the health benefits of bananas?', description='Question:', layout=Layout(height='80px…

Button(button_style='primary', description='Search for Similar Examples', layout=Layout(width='200px'), style=…

Output()

## Step 2: Dynamic In-Context Learning

Now we'll use the BAML DynamicInContextLearning function with the similar examples as context.
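
For intuition about what "examples as context" means, the sketch below assembles a few-shot prompt from question/answer pairs. The actual prompt template lives in the BAML definition of `DynamicInContextLearning`, which this notebook does not show, so the function name `build_few_shot_prompt` and the prompt wording here are assumptions for illustration only.

```python
from typing import List, Tuple


def build_few_shot_prompt(query: str, examples: List[Tuple[str, str]]) -> str:
    """Format (question, answer) pairs, then the new question, as one prompt string."""
    # Hypothetical instruction line; the real BAML template may differ.
    parts = ["Answer the final question, using the examples as guidance.", ""]
    for i, (question, answer) in enumerate(examples, 1):
        parts.append(f"Example {i}")
        parts.append(f"Q: {question}")
        parts.append(f"A: {answer}")
        parts.append("")
    # The unanswered query goes last so the model completes the final "A:".
    parts.append(f"Q: {query}")
    parts.append("A:")
    return "\n".join(parts)


# Answers shortened from the sample rows shown earlier in the notebook.
retrieved = [
    ("What is the botanical name of a banana?",
     "The botanical name of a banana is Musa."),
    ("Where do bananas originally come from?",
     "Bananas are believed to have originated in Southeast Asia."),
]

print(build_few_shot_prompt("What are the health benefits of bananas?", retrieved))
```

Because the examples are chosen per query rather than fixed in the prompt, the context changes with every question, which is the "dynamic" part of dICL.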


In [13]:
# Import BAML client
import mmgraphrag_odsc_west_2025.baml_client as baml
from mmgraphrag_odsc_west_2025.baml_client.types import DICLInput, Example as BAMLExample

print("🤖 Using Dynamic In-Context Learning with BAML...")
print("\n" + "="*50)

# Check if we have the required variables from Step 1
if 'user_question' not in globals() or 'similar_examples' not in globals():
    print("❌ Please run Step 1 first and click 'Search for Similar Examples'!")
    print("The variables 'user_question' and 'similar_examples' are not available.")
else:
    try:
        # Convert dICL Example objects to BAML Example objects
        print("🔄 Converting examples to BAML format...")
        baml_examples = []
        for example in similar_examples:
            baml_example = BAMLExample(
                id=example.id,
                input=example.input,
                output=example.output
            )
            baml_examples.append(baml_example)
        
        print(f"✅ Converted {len(baml_examples)} examples")
        
        # Create the BAML input object
        dicl_input = DICLInput(
            query=user_question,
            examples=baml_examples
        )
        
        print(f"📤 Sending to BAML:")
        print(f"   Query: {dicl_input.query}")
        print(f"   Number of context examples: {len(dicl_input.examples)}")
        print("\n" + "-"*30)
        
        # Call the DynamicInContextLearning function
        result = baml.b.DynamicInContextLearning(dicl_input)
        
        print("\n🎯 Final Answer:")
        print("\n" + "="*50)
        print(f"{result.answer}")
        
        print("\n\n🧠 Reasoning Process:")
        print("\n" + "="*50)
        print(f"{result.reasoning}")
        
    except Exception as e:
        print(f"❌ Error in dynamic in-context learning: {e}")
        import traceback
        traceback.print_exc()


🤖 Using Dynamic In-Context Learning with BAML...

🔄 Converting examples to BAML format...
✅ Converted 3 examples
📤 Sending to BAML:
   Query: What are the health benefits of bananas?
   Number of context examples: 3

------------------------------

🎯 Final Answer:

The health benefits of consuming bananas include improved digestion due to their high dietary fiber content, enhanced muscle function and cardiovascular health from potassium, better mood regulation through tryptophan which is a precursor to serotonin, and increased energy levels because of natural sugars like glucose, fructose, and sucrose. Additionally, they provide essential vitamins such as vitamin C and B6, contributing to overall well-being.


🧠 Reasoning Process:

To answer the question about the health benefits of bananas, I analyzed similar examples provided in the context. These examples focused on specific nutrients found in bananas and their respective health impacts. For instance, dietary fiber aids digestion, p

## Summary

This demo showed how the dICL system:
1. ✅ Took your question and embedded it
2. ✅ Found the 3 most relevant examples from the database
3. ✅ Used those examples as context for the BAML DynamicInContextLearning function
4. ✅ Generated a contextual answer with reasoning

In this run, the three retrieved examples gave the model topic-specific context, and the reasoning output shows the answer drawing on them.


## Try Different Questions

To test a different query, enter a new question in the Step 1 text box, click "Search for Similar Examples", and then re-run the Step 2 cell. Some suggested questions:

- "What is lambda calculus?"
- "How does machine learning work?"
- "What are the nutritional benefits of bananas?"
- "Explain functional programming concepts"
- "What is artificial intelligence?"
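
To try several of the suggested questions in one pass without the widget, a small loop can help. This is a sketch only: `run_batch` and its two callable parameters are hypothetical names, and the stub functions below stand in for the real retrieval and generation steps so that the block runs on its own.

```python
from typing import Any, Callable, Dict, List


def run_batch(
    questions: List[str],
    search_fn: Callable[[str, int], List[Any]],
    answer_fn: Callable[[str, List[Any]], str],
    num_examples: int = 3,
) -> Dict[str, str]:
    """Retrieve examples for each question, generate an answer, and collect the results."""
    results: Dict[str, str] = {}
    for question in questions:
        examples = search_fn(question, num_examples)
        results[question] = answer_fn(question, examples)
    return results


# Stubs so the sketch is self-contained. In the notebook, search_fn could wrap
# system._search_similar_examples(question, num_examples=n), and answer_fn could
# wrap the Step 2 conversion plus the baml.b.DynamicInContextLearning call.
def stub_search(question: str, n: int) -> List[str]:
    return [f"example {i} for {question!r}" for i in range(1, n + 1)]


def stub_answer(question: str, examples: List[str]) -> str:
    return f"(answer generated from {len(examples)} examples)"


suggested = ["What is lambda calculus?", "What is artificial intelligence?"]
for question, answer in run_batch(suggested, stub_search, stub_answer).items():
    print(f"{question} -> {answer}")
```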
