# pretty_little_summary: Notebook Demo

This notebook demonstrates how to use `pretty_little_summary` to turn any Python variable into a textual description.

**Features shown:**
- Installation from local repo
- OpenRouter configuration for LLM-powered summaries
- Deterministic mode (no API calls)
- Multiple object types: built-ins, numpy arrays, pandas DataFrames, matplotlib figures

## 1. Installation

**IMPORTANT**: After running the installation cells below, **restart your Jupyter kernel** before continuing!

Install the local `pretty_little_summary` library in editable mode:

In [None]:
# Install pretty_little_summary from the local repo in editable mode (run this once).
# %pip installs into the environment of the running kernel.
%pip install -e ..

In [None]:
# Install optional dependencies for this demo
# (python-dotenv provides the `dotenv` module imported in section 2)
%pip install numpy pandas matplotlib python-dotenv

---

**⚠️ RESTART YOUR KERNEL NOW**

After running the installation cells above, restart your Jupyter kernel:
- **Jupyter Notebook**: Kernel → Restart
- **JupyterLab**: Kernel → Restart Kernel
- **VS Code**: Click the "Restart" button in the notebook toolbar

Then continue from the next cell.

---
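
After the restart, a quick importability check can save a confusing `ImportError` later. The sketch below uses only the standard library; `check_installed` is a hypothetical helper name, not part of `pretty_little_summary`.

```python
import importlib.util


def check_installed(names):
    """Return {name: bool} telling whether each top-level package can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Packages this notebook relies on
status = check_installed(["pretty_little_summary", "numpy", "pandas", "matplotlib"])
for name, ok in status.items():
    print(f"{name}: {'ok' if ok else 'MISSING'}")
```

If anything is reported as missing, rerun the installation cells and restart the kernel again.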

## 2. Import and Configuration

Import pretty_little_summary and configure OpenRouter for LLM-powered summaries.

In [None]:
import pretty_little_summary as pls
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Enable inline plotting for matplotlib
%matplotlib inline

### Option A: Configure via environment variable

Set `OPENROUTER_API_KEY` in your environment or `.env` file:
```bash
export OPENROUTER_API_KEY="sk-or-v1-your-key-here"
```

### Option B: Configure programmatically (recommended for notebooks)

In [None]:
# Configure the OpenRouter API key
# The key is read from the environment or a .env file; avoid hard-coding it in the notebook
import os
from dotenv import load_dotenv

# Load from .env file in parent directory
load_dotenv('../.env')

# Configure pretty_little_summary
api_key = os.getenv('OPENROUTER_API_KEY')
if api_key:
    pls.configure(openrouter_api_key=api_key)
    print("✓ OpenRouter configured successfully")
else:
    print("⚠ No API key found - LLM mode will not work (deterministic mode will still work)")
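
If `python-dotenv` is not available, a few lines of standard-library code can stand in for `load_dotenv` on simple files. This is a minimal sketch, assuming a `.env` made of plain `KEY=value` lines; `read_env_file` is a hypothetical helper, not part of `pretty_little_summary` or `python-dotenv`, and it ignores `export` prefixes, variable expansion and quoting edge cases.

```python
import os
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=value lines from a .env-style file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blanks, comments and lines without an assignment
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


# In this sketch an already-exported variable takes precedence over the file.
api_key = os.environ.get("OPENROUTER_API_KEY") or read_env_file("../.env").get("OPENROUTER_API_KEY")
print("API key found" if api_key else "No API key found")
```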

## 3. Built-in Types (Deterministic Mode)

Start with simple built-in types using **deterministic mode** (no API calls).

### Example 1: Dictionary

In [None]:
# Create a dictionary
user_data = {
    'name': 'Alice Johnson',
    'age': 28,
    'email': 'alice@example.com',
    'roles': ['admin', 'developer'],
    'active': True
}

# Get deterministic summary (no LLM call)
result = pls.describe(user_data, explain=False)

print("Content (Summary):")
print(result.content)
print("\nMetadata:")
print(f"  Type: {result.meta['object_type']}")
print(f"  Adapter: {result.meta['adapter_used']}")

### Example 2: List

In [None]:
# Create a list
numbers = list(range(1, 101))  # 1 to 100

result = pls.describe(numbers, explain=False)

print("Content (Summary):")
print(result.content)
print("\nMetadata:")
print(f"  Type: {result.meta['object_type']}")
print(f"  Length: {result.meta.get('metadata', {}).get('length', 'N/A')}")
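
To make "deterministic" concrete: a summary of this kind is computed from the object alone, so the same input always yields the same text. The sketch below is an illustration only; `summarize_builtin` is a hypothetical function and does not reproduce the library's adapters or its output format.

```python
from collections import Counter


def summarize_builtin(obj):
    """Build a short, reproducible description of a dict, list or tuple."""
    if isinstance(obj, dict):
        keys = ", ".join(repr(k) for k in list(obj)[:5])
        return f"dict with {len(obj)} keys (first keys: {keys})"
    if isinstance(obj, (list, tuple)):
        # Count element types by name and sort them for stable output
        type_counts = Counter(type(item).__name__ for item in obj)
        types = ", ".join(f"{name} x{count}" for name, count in sorted(type_counts.items()))
        return f"{type(obj).__name__} of length {len(obj)} (element types: {types})"
    return f"{type(obj).__name__} object"


print(summarize_builtin({"name": "Alice Johnson", "age": 28}))
print(summarize_builtin(list(range(1, 101))))
```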

## 4. NumPy Arrays

Demonstrate pretty_little_summary with numpy arrays.

In [None]:
# Create a numpy array
arr = np.random.randn(100, 5)  # 100 rows, 5 columns

# Deterministic mode
result = pls.describe(arr, explain=False)

print("Deterministic Summary:")
print(result.content)
print("\nMetadata:")
for key, value in result.meta.items():
    if key != 'metadata':
        print(f"  {key}: {value}")

### With LLM Explanation

Now use the LLM to generate a natural language explanation:

In [None]:
# LLM-powered explanation (requires API key)
if api_key:
    result = pls.describe(arr, explain=True)
    
    print("LLM-Powered Summary:")
    print(result.content)
    print("\n" + "="*60)
else:
    print("⚠ Skipping LLM mode - no API key configured")

## 5. Pandas DataFrames

Demonstrate with pandas DataFrames - one of the most common use cases.

In [None]:
# Create a sample sales DataFrame
df = pd.DataFrame({
    'date': pd.date_range('2024-01-01', periods=100, freq='D'),
    'product': np.random.choice(['Widget A', 'Widget B', 'Widget C'], 100),
    'quantity': np.random.randint(1, 50, 100),
    'price': np.random.uniform(10, 100, 100).round(2),
    'region': np.random.choice(['North', 'South', 'East', 'West'], 100)
})

# Add a calculated column
df['revenue'] = df['quantity'] * df['price']

# Show the first few rows
print("Sample Data:")
df.head()

### Deterministic Summary

In [None]:
# Deterministic mode (fast, no API call)
result = pls.describe(df, explain=False)

print("Deterministic Summary:")
print(result.content)
print("\nStructured Metadata:")
print(f"  Shape: {result.meta['shape']}")
print(f"  Columns: {result.meta['columns']}")
print("\nData Types:")
for col, dtype in result.meta['dtypes'].items():
    print(f"    {col}: {dtype}")

### LLM-Powered Explanation

In [None]:
# LLM mode - generates natural language summary
if api_key:
    result = pls.describe(df, explain=True)
    
    print("LLM-Powered Summary:")
    print(result.content)
    print("\n" + "="*60)
    
    # Show code history captured from notebook
    if result.history:
        print("\nCode History (captured from notebook):")
        for i, line in enumerate(result.history[-5:], 1):
            print(f"  {i}. {line}")
else:
    print("⚠ Skipping LLM mode - no API key configured")
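
How `pretty_little_summary` collects the history is not shown in this notebook. As a rough sketch of the general idea, IPython keeps the source of executed cells in the list `In`, and a helper like the hypothetical `recent_cells` below could trim such a list to its last few non-empty entries.

```python
def recent_cells(cell_sources, limit=5):
    """Return the last `limit` non-empty cell sources, oldest first."""
    non_empty = [src.strip() for src in cell_sources if src and src.strip()]
    return non_empty[-limit:] if limit > 0 else []


# In a notebook you could pass IPython's `In` list; a hard-coded stand-in is used here.
example_inputs = ["", "import pandas as pd", "df = pd.DataFrame({'a': [1, 2]})", "  ", "df.head()"]
for number, source in enumerate(recent_cells(example_inputs, limit=2), 1):
    print(f"{number}. {source}")
```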

## 6. Matplotlib Figures

Demonstrate with visualization objects - pretty_little_summary can describe plots!

In [None]:
# Create a matplotlib figure
fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Plot 1: Line plot
x = np.linspace(0, 10, 100)
axes[0].plot(x, np.sin(x), label='sin(x)')
axes[0].plot(x, np.cos(x), label='cos(x)')
axes[0].set_title('Trigonometric Functions')
axes[0].set_xlabel('X')
axes[0].set_ylabel('Y')
axes[0].legend()
axes[0].grid(True)

# Plot 2: Bar chart from our sales data
product_revenue = df.groupby('product')['revenue'].sum()
axes[1].bar(product_revenue.index, product_revenue.values)
axes[1].set_title('Total Revenue by Product')
axes[1].set_xlabel('Product')
axes[1].set_ylabel('Revenue ($)')

plt.tight_layout()
plt.show()

### Describe the Figure

In [None]:
# Deterministic summary of the figure
result = pls.describe(fig, explain=False)

print("Deterministic Summary:")
print(result.content)
print("\nMetadata:")
print(f"  Number of Axes: {result.meta.get('num_axes', 'N/A')}")
print(f"  Figure Size: {result.meta.get('figure_size', 'N/A')}")

In [None]:
# LLM-powered explanation (understands the plot context from history!)
if api_key:
    result = pls.describe(fig, explain=True)
    
    print("LLM-Powered Summary:")
    print(result.content)
    print("\n" + "="*60)
    print("\n💡 Notice: The LLM uses the code history to understand")
    print("   what data was plotted and can describe the visualization!")
else:
    print("⚠ Skipping LLM mode - no API key configured")

## 7. Comparing Modes: Deterministic vs LLM

Let's compare both modes side-by-side with a complex object.

In [None]:
# Create a more complex DataFrame with aggregations
summary_df = df.groupby(['product', 'region']).agg({
    'quantity': ['sum', 'mean'],
    'revenue': ['sum', 'mean', 'max']
}).round(2)

summary_df

In [None]:
# Deterministic mode - fast, structured
result_deterministic = pls.describe(summary_df, explain=False)

print("🔧 DETERMINISTIC MODE (explain=False)")
print("="*60)
print(result_deterministic.content)
print("\nCharacteristics:")
print("  ✓ No API call required")
print("  ✓ Instant results")
print("  ✓ Structured, predictable output")
print("  ✓ Perfect for quick inspection")

In [None]:
# LLM mode - natural language, context-aware
if api_key:
    result_llm = pls.describe(summary_df, explain=True)
    
    print("\n🤖 LLM MODE (explain=True)")
    print("="*60)
    print(result_llm.content)
    print("\nCharacteristics:")
    print("  ✓ Natural language explanation")
    print("  ✓ Context from code history")
    print("  ✓ Semantic understanding")
    print("  ✓ Perfect for LLM consumption")
else:
    print("\n⚠ Skipping LLM mode - no API key configured")

## 8. Using with Custom Objects

`pretty_little_summary` works with any Python object via the `GenericAdapter` fallback.

In [None]:
# Define a custom class
class DataPipeline:
    def __init__(self, name, steps):
        self.name = name
        self.steps = steps
        self.executed = False
        self.results = None
    
    def run(self, data):
        self.executed = True
        self.results = f"Processed {len(data)} items"
        return self.results

# Create an instance
pipeline = DataPipeline(
    name="SalesETL",
    steps=['extract', 'transform', 'load']
)
pipeline.run(df)

# Describe it
result = pls.describe(pipeline, explain=False)

print("Custom Object Summary:")
print(result.content)
print("\nAttributes extracted:")
if 'metadata' in result.meta and 'attributes' in result.meta['metadata']:
    for attr in result.meta['metadata']['attributes'][:10]:
        print(f"  - {attr}")
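
For intuition about what a generic fallback can extract, Python already exposes an object's instance attributes through its `__dict__`. The sketch below is not the library's `GenericAdapter`; `describe_attributes` and the small `Pipeline` class are hypothetical and only show the general idea.

```python
def describe_attributes(obj, max_attrs=10):
    """List an object's public instance attributes as 'name: type' strings."""
    attributes = getattr(obj, "__dict__", {})
    public = {name: value for name, value in attributes.items() if not name.startswith("_")}
    return [f"{name}: {type(value).__name__}" for name, value in list(public.items())[:max_attrs]]


class Pipeline:
    def __init__(self, name, steps):
        self.name = name
        self.steps = steps
        self.executed = False
        self._cache = None  # private by convention, so it is skipped


example = Pipeline("SalesETL", ["extract", "load"])
print(type(example).__name__)
print(describe_attributes(example))
```

Objects without a `__dict__` (for example, instances of classes using `__slots__`) yield an empty list here, which is one reason a real adapter needs more than this.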

## 9. Best Practices

### When to use `explain=False` (Deterministic Mode)

- ✅ Quick debugging and inspection
- ✅ When you don't have an API key
- ✅ When you need instant results
- ✅ When structured metadata is sufficient
- ✅ In CI/CD pipelines or automated tests

### When to use `explain=True` (LLM Mode)

- ✅ Generating documentation
- ✅ Providing context to other LLMs
- ✅ When you need semantic understanding
- ✅ When code history/provenance matters
- ✅ For complex objects that need explanation
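
One way to apply both lists at once is to pick the mode from the presence of an API key and fall back to deterministic output if the LLM call fails. The sketch below takes the describe function as a parameter so that it stays self-contained; `describe_with_fallback` is a hypothetical wrapper, and catching a broad `Exception` is an assumption, because this notebook does not document which errors `pls.describe` can raise.

```python
def describe_with_fallback(describe, obj, api_key=None):
    """Use explain=True when a key is available, otherwise explain=False.

    `describe` is any callable with the same (obj, explain=...) shape
    as the pls.describe calls used in this notebook.
    """
    if api_key:
        try:
            return describe(obj, explain=True)
        except Exception as exc:  # assumption: failure modes are not documented here
            print(f"LLM mode failed ({exc}); falling back to deterministic mode")
    return describe(obj, explain=False)
```

In this notebook it could be called as `describe_with_fallback(pls.describe, df, api_key=api_key)`, which keeps automated runs working when no key is set.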

## 10. Complete Workflow Example

A typical data science workflow using `pretty_little_summary`, run here on synthetic data:

In [None]:
# Step 1: Load data (synthetic records stand in for a real data source)
sales_data = pd.DataFrame({
    'customer_id': np.random.randint(1000, 2000, 500),
    # Lowercase 'h' is the hourly alias; uppercase 'H' is deprecated since pandas 2.2
    'purchase_date': pd.date_range('2024-01-01', periods=500, freq='h'),
    'amount': np.random.lognormal(4, 1, 500).round(2),
    'category': np.random.choice(['Electronics', 'Clothing', 'Food', 'Books'], 500)
})

print("✓ Data loaded")
pls.describe(sales_data, explain=False).content

In [None]:
# Step 2: Data transformation
sales_data['month'] = sales_data['purchase_date'].dt.to_period('M')
monthly_summary = sales_data.groupby(['month', 'category'])['amount'].agg(['sum', 'count', 'mean'])

print("✓ Data transformed")
pls.describe(monthly_summary, explain=False).content

In [None]:
# Step 3: Create visualization
top_categories = sales_data.groupby('category')['amount'].sum().sort_values(ascending=False)

fig, ax = plt.subplots(figsize=(10, 6))
top_categories.plot(kind='barh', ax=ax)
ax.set_title('Total Sales by Category')
ax.set_xlabel('Total Amount ($)')
plt.tight_layout()
plt.show()

print("\n✓ Visualization created")
result = pls.describe(fig, explain=False)
print(result.content)

In [None]:
# Step 4: Generate final report with LLM explanations
if api_key:
    print("📊 ANALYSIS REPORT")
    print("="*60)
    
    print("\n1. Raw Data:")
    print(pls.describe(sales_data, explain=True).content)
    
    print("\n2. Monthly Summary:")
    print(pls.describe(monthly_summary, explain=True).content)
    
    print("\n3. Visualization:")
    print(pls.describe(fig, explain=True).content)
    
    print("\n" + "="*60)
    print("✓ Report generated with full context and explanations")
else:
    print("⚠ Skipping report generation - no API key configured")
    print("   Set OPENROUTER_API_KEY to enable LLM mode")

## Summary

This notebook demonstrated:

1. ✅ **Installation**: editable install of the local repo (`pip install -e ..` from the notebook directory)
2. ✅ **Configuration**: OpenRouter via `pls.configure()` or environment variables
3. ✅ **Built-in types and custom objects**: Dictionaries, lists, and a user-defined class
4. ✅ **NumPy arrays**: Shape, dtype, and statistical summaries
5. ✅ **Pandas DataFrames**: Columns, dtypes, sample data
6. ✅ **Matplotlib figures**: Plot descriptions with code history context
7. ✅ **Deterministic mode**: Fast summaries without API calls (`explain=False`)
8. ✅ **LLM mode**: Natural language explanations (`explain=True`)

**Key takeaway**: All LLM handling lives inside `pretty_little_summary`. You call `pls.describe(obj)` and get back a description.