# Phase 3 Feature Demo: Story Generator Enhancements

This notebook demonstrates the new **Schema Tracking** and **Data Snapshot** capabilities added to the Story Generator in Phase 3.

We will:
1. Create a simple pipeline that modifies the schema (adds a column, then removes one)
2. Run the pipeline to generate a story
3. Display the generated story to see the new enhancements in action

In [1]:
import sys
sys.path.insert(0, r'C:\Users\hodibi\OneDrive - Ingredion\Desktop\Repos\Odibi')

# Verify it worked
import odibi
print(f"✅ ODIBI loaded from: {odibi.__file__}")
# Install the Azure packages if they are missing from this environment:
%pip install azure-identity azure-keyvault-secrets adlfs

✅ ODIBI loaded from: C:\Users\hodibi\OneDrive - Ingredion\Desktop\Repos\Odibi\odibi\__init__.py
Note: you may need to restart the kernel to use updated packages.




In [2]:
# Setup - Create a temporary pipeline config
import os
import yaml
from IPython.display import Markdown, display

config = {
    "project": "Schema Tracking Demo",
    "version": "1.0.0",
    "engine": "pandas",
    
    # Connections
    "connections": {
        "local_data": {
            "type": "local",
            "base_path": "./data"
        }
    },
    
    # Story Config
    "story": {
        "connection": "local_data",
        "path": "stories",
        "max_sample_rows": 5,  # Snapshot feature
        "auto_generate": True
    },
    
    # Pipeline Definition
    "pipelines": [
        {
            "pipeline": "schema_evolution_demo",
            "description": "Demonstrates schema changes between nodes",
            "nodes": [
                # Node 1: Create source data
                {
                    "name": "create_source",
                    "description": "Generates initial sales data",
                    "transform": {
                        "steps": [
                            {
                                "sql": """
                                SELECT 
                                    'A' as product, 
                                    'North' as region, 
                                    100 as sales
                                UNION ALL
                                SELECT 'B', 'North', 150
                                UNION ALL
                                SELECT 'A', 'South', 200
                                """
                            }
                        ]
                    }
                },
                
                # Node 2: Add new column (Schema Change!)
                {
                    "name": "enrich_data",
                    "description": "Adds a new column 'tax'",
                    "depends_on": ["create_source"],
                    "transform": {
                        "steps": [
                            # Calculate 10% tax
                            {"sql": "SELECT *, sales * 0.1 as tax FROM create_source"}
                        ]
                    }
                },
                
                # Node 3: Remove column (Schema Change!)
                {
                    "name": "cleanup_data",
                    "description": "Removes the region column",
                    "depends_on": ["enrich_data"],
                    "transform": {
                        "steps": [
                            {"sql": "SELECT product, sales, tax FROM enrich_data"}
                        ]
                    }
                }
            ]
        }
    ]
}

# Write config to file
os.makedirs("data/stories", exist_ok=True)
with open("demo_project.yaml", "w") as f:
    yaml.dump(config, f)
    
print("Created demo_project.yaml")

Created demo_project.yaml

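The `depends_on` entries in the config above define the order in which nodes must run. As a rough illustration only (this is not ODIBI's own resolver, whose internals are not shown in this notebook), the same ordering can be derived with the standard library's `graphlib`. The `nodes` list below is a trimmed copy of the config, keeping just the names and dependencies:

```python
from graphlib import TopologicalSorter

# Trimmed copy of the node definitions from the config above
# (only the fields needed to work out execution order).
nodes = [
    {"name": "create_source"},
    {"name": "enrich_data", "depends_on": ["create_source"]},
    {"name": "cleanup_data", "depends_on": ["enrich_data"]},
]

# Map each node to the set of nodes it depends on (its predecessors).
graph = {node["name"]: set(node.get("depends_on", [])) for node in nodes}

# static_order() yields every node after all of its predecessors.
execution_order = list(TopologicalSorter(graph).static_order())
print(execution_order)
# ['create_source', 'enrich_data', 'cleanup_data']
```

Because the three nodes form a single chain, there is only one valid order, which matches the "Completed nodes" line in the generated story.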

In [3]:
# Run the pipeline
from odibi.pipeline import PipelineManager

# Initialize manager
manager = PipelineManager.from_yaml("demo_project.yaml")

# Run pipeline
results = manager.run("schema_evolution_demo")

print(f"\nStory generated at: {results.story_path}")


Running pipeline: schema_evolution_demo


✅ SUCCESS - schema_evolution_demo
  Completed: 3 nodes
  Failed: 0 nodes
  Duration: 0.08s
  Story: c:\Users\hodibi\OneDrive - Ingredion\Desktop\Repos\Odibi\walkthroughs\data\stories\schema_evolution_demo_20251119_165610.md

Story generated at: c:\Users\hodibi\OneDrive - Ingredion\Desktop\Repos\Odibi\walkthroughs\data\stories\schema_evolution_demo_20251119_165610.md


In [4]:
# Display the generated story
if results.story_path:
    with open(results.story_path, "r", encoding="utf-8") as f:
        story_content = f.read()
    
    display(Markdown(story_content))
else:
    print("No story generated!")

# Pipeline Run Story: schema_evolution_demo

**Executed:** 2025-11-19T16:56:10.535762
**Completed:** 2025-11-19T16:56:10.611748
**Duration:** 0.08s
**Status:** ✅ Success

---

## Summary

- ✅ **Completed:** 3 nodes
- ❌ **Failed:** 0 nodes
- ⏭️ **Skipped:** 0 nodes
- ⏱️ **Duration:** 0.08s

**Completed nodes:** create_source, enrich_data, cleanup_data

---

## Node: create_source

**Status:** ✅ Success
**Duration:** 0.0336s

**Execution steps:**
- Applied 1 transform steps

**Output schema:**
- Columns (3): product, region, sales
- Rows: 3

**Sample output** (first 3 rows):

| product | region | sales |
| --- | --- | --- |
| A | North | 100 |
| B | North | 150 |
| A | South | 200 |

---

## Node: enrich_data

**Status:** ✅ Success
**Duration:** 0.0264s

**Execution steps:**
- Applied 1 transform steps

**Input schema:**
- Columns (3): product, region, sales

**Sample input** (first 3 rows):

| product | region | sales |
| --- | --- | --- |
| A | North | 100 |
| B | North | 150 |
| A | South | 200 |

**Output schema:**
- Columns (4): product, region, sales, tax

**Schema Changes:**
- 🟢 **Added:** tax
- Rows: 3

**Sample output** (first 3 rows):

| product | region | sales | tax |
| --- | --- | --- | --- |
| A | North | 100 | 10.0 |
| B | North | 150 | 15.0 |
| A | South | 200 | 20.0 |

---

## Node: cleanup_data

**Status:** ✅ Success
**Duration:** 0.0160s

**Execution steps:**
- Applied 1 transform steps

**Input schema:**
- Columns (4): product, region, sales, tax

**Sample input** (first 3 rows):

| product | region | sales | tax |
| --- | --- | --- | --- |
| A | North | 100 | 10.0 |
| B | North | 150 | 15.0 |
| A | South | 200 | 20.0 |

**Output schema:**
- Columns (3): product, sales, tax

**Schema Changes:**
- 🔴 **Removed:** region
- Rows: 3

**Sample output** (first 3 rows):

| product | sales | tax |
| --- | --- | --- |
| A | 100 | 10.0 |
| B | 150 | 15.0 |
| A | 200 | 20.0 |

---


## What to look for:

1. **Node `enrich_data`**:
   - Should show `Schema Changes` section
   - `🟢 Added: tax`

2. **Node `cleanup_data`**:
   - Should show `Schema Changes` section
   - `🔴 Removed: region`

3. **Sample Data**:
   - Each node should show a table of sample rows, capped at `max_sample_rows` (5 as configured). The demo data has only 3 rows, so each table shows all 3.
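
To make the checks above concrete, here is a minimal sketch of the two ideas behind the story output: comparing a node's input and output columns, and capping the sample at a row limit. The helper names `diff_schema` and `take_sample` are hypothetical and written only for this illustration; they are not taken from the ODIBI codebase, which may implement these features differently.

```python
def diff_schema(input_cols, output_cols):
    """Return the columns added and removed between two schemas, in order."""
    added = [col for col in output_cols if col not in input_cols]
    removed = [col for col in input_cols if col not in output_cols]
    return {"added": added, "removed": removed}


def take_sample(rows, max_sample_rows=5):
    """Return at most max_sample_rows rows from the start of the data."""
    return rows[:max_sample_rows]


# Column lists copied from the story above.
source_cols = ["product", "region", "sales"]
enriched_cols = ["product", "region", "sales", "tax"]
cleaned_cols = ["product", "sales", "tax"]

print(diff_schema(source_cols, enriched_cols))
# {'added': ['tax'], 'removed': []}
print(diff_schema(enriched_cols, cleaned_cols))
# {'added': [], 'removed': ['region']}

# The source data has 3 rows, so a limit of 5 returns all of them.
source_rows = [("A", "North", 100), ("B", "North", 150), ("A", "South", 200)]
print(len(take_sample(source_rows, max_sample_rows=5)))
# 3
```

The last call shows why the story reports "first 3 rows" even though `max_sample_rows` is 5: the limit is an upper bound, not a fixed count.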