# Kedro Compose Patterns Demo

This notebook demonstrates the four main composition patterns supported by kedro-compose, using materialized Kedro `Pipeline` objects rather than factory functions.

In [29]:
%load_ext autoreload
%autoreload 2

In [30]:
from kedro.pipeline import Pipeline, node
from kedro_compose import ComposablePipeline, compose_grid, draw_pipeline

# Define some dummy functions for demonstration
def extract_data():
    """Extract raw data."""
    return "raw_data"

def train_linear_model(features):
    """Train linear model."""
    return f"linear_model_{features}"

def train_ensemble_model(features):
    """Train ensemble model."""
    return f"ensemble_model_{features}"

def train_tree_model(features):
    """Train tree model."""
    return f"tree_model_{features}"

## Pattern 1: Simple Branching

One parent pipeline branches into multiple child pipelines.

**Structure**: `parent` → `parent.child1`, `parent.child2`, `parent.child3`

In [31]:
# Create materialized pipelines
data_prep_pipeline = Pipeline([
    node(extract_data, inputs=None, outputs="features", name="extract", namespace="fe1")
])

model1_pipeline = Pipeline([
    node(train_linear_model, inputs="features", outputs="linear_predictions")
])

model2_pipeline = Pipeline([
    node(train_ensemble_model, inputs="features", outputs="ensemble_predictions")
])

model3_pipeline = Pipeline([
    node(train_tree_model, inputs="features", outputs="linear_evaluation")
])

# Create simple branching pattern
parent = ComposablePipeline(data_prep_pipeline)
parent.add_child("Model1", model1_pipeline)
parent.add_child("Model2", model2_pipeline)
parent.add_child("Model3", model3_pipeline)

# Build and examine the result
branching_result = parent.build()
print(draw_pipeline(branching_result))

                                                  +---------+                                                   
                                                  | extract |                                                   
                                                  +---------+                                                   
                                                       *                                                        
                                                       *                                                        
                                                       *                                                        
                                                 +----------+                                                   
                                              ***| features |****                                               
                                       *******   +----------+    *******                        

In [32]:
branching_result

Pipeline([
Node(extract_data, None, 'features', 'extract'),
Node(train_linear_model, 'features', 'Model1.linear_predictions', None),
Node(train_ensemble_model, 'features', 'Model2.ensemble_predictions', None),
Node(train_tree_model, 'features', 'Model3.linear_evaluation', None)
])
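
The repr above suggests the naming rule at work: each child's outputs gain the child's name as a prefix, while the shared `features` input keeps its parent-level name. A minimal pure-Python sketch of that rule follows. `namespace_outputs` is a hypothetical helper written for illustration, not part of kedro-compose, and the real implementation may differ.

```python
def namespace_outputs(child_name, outputs):
    """Prefix each output dataset name with the child's namespace."""
    return [f"{child_name}.{name}" for name in outputs]


# Hypothetical re-creation of the three children registered above:
# child name -> output dataset names declared in its pipeline.
children = {
    "Model1": ["linear_predictions"],
    "Model2": ["ensemble_predictions"],
    "Model3": ["linear_evaluation"],
}

namespaced = {
    child: namespace_outputs(child, outputs)
    for child, outputs in children.items()
}

for child, outputs in namespaced.items():
    # The "features" input is produced by the parent, so it stays unprefixed.
    print(f"{child}: features -> {outputs}")
```

Keeping parent-level inputs unprefixed is what lets all three children read the same `features` dataset without any manual wiring.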

## Pattern 2: Multi-Level Tree

Hierarchical branching structure with multiple levels.

**Structure**: `root` → `root.fe`/`root.ml` → `root.fe.basic`/`root.fe.advanced`/`root.ml.linear`/`root.ml.ensemble`

In [33]:
def extract_data():
    return "raw_data"

def basic_features(data):
    return f"basic_features_{data}"

def advanced_features(data):
    return f"advanced_features_{data}"

def train_linear_model(features):
    return f"linear_model_{features}"

def train_ensemble_model(features):
    return f"ensemble_model_{features}"

root_pipeline = Pipeline([
    node(extract_data, inputs=None, outputs="raw_data", name="extract_root")
])

fe_pipeline = Pipeline([
    node(lambda x: f"preprocessed_{x}", inputs="raw_data", outputs="preprocessed_data", name="preprocess")
])

ml_pipeline = Pipeline([
    node(lambda x: f"ml_ready_{x}", inputs="raw_data", outputs="ml_ready_data", name="prepare_ml")
])

basic_fe_pipeline = Pipeline([
    node(basic_features, inputs="preprocessed_data", outputs="basic_features", name="create_basic_features")
])

advanced_fe_pipeline = Pipeline([
    node(advanced_features, inputs="preprocessed_data", outputs="advanced_features", name="create_advanced_features")
])

linear_ml_pipeline = Pipeline([
    node(train_linear_model, inputs="ml_ready_data", outputs="linear_model", name="train_linear_model")
])

ensemble_ml_pipeline = Pipeline([
    node(train_ensemble_model, inputs="ml_ready_data", outputs="ensemble_model", name="train_ensemble_model")
])

In [34]:
# Create multi-level tree pattern
root = ComposablePipeline(root_pipeline)

# Level 1: Feature Engineering and ML branches
fe_branch = root.add_child("FE", fe_pipeline)
ml_branch = root.add_child("ML", ml_pipeline)

# Level 2: Specific implementations
fe_branch.add_child("Basic", basic_fe_pipeline)
fe_branch.add_child("Advanced", advanced_fe_pipeline)

ml_branch.add_child("Linear", linear_ml_pipeline)
ml_branch.add_child("Ensemble", ensemble_ml_pipeline)

# Build and examine the result
tree_result = root.build()
print(draw_pipeline(tree_result))

                                                                      +--------------+                                                                    
                                                                      | extract_root |                                                                    
                                                                      +--------------+                                                                    
                                                                              *                                                                           
                                                                              *                                                                           
                                                                              *                                                                           
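
A multi-level tree yields one dotted namespace per branch, built by joining the names along the path from the root. The sketch below walks a small tree depth-first to produce those paths. `TreeNode` and `iter_namespaces` are hypothetical stand-ins written for this illustration, not kedro-compose classes, and the sketch assumes (as the grid output later in this notebook suggests) that the root itself contributes no prefix.

```python
class TreeNode:
    """Hypothetical stand-in for one level of a composed pipeline tree."""

    def __init__(self, name):
        self.name = name
        self.children = []

    def add_child(self, name):
        """Attach a named child and return it so calls can be nested."""
        child = TreeNode(name)
        self.children.append(child)
        return child


def iter_namespaces(tree, prefix=""):
    """Yield the dotted namespace of every descendant, depth-first."""
    for child in tree.children:
        path = f"{prefix}.{child.name}" if prefix else child.name
        yield path
        yield from iter_namespaces(child, path)


# Mirror the tree built in the cell above.
root_node = TreeNode("root")
fe = root_node.add_child("FE")
ml = root_node.add_child("ML")
fe.add_child("Basic")
fe.add_child("Advanced")
ml.add_child("Linear")
ml.add_child("Ensemble")

tree_namespaces = list(iter_namespaces(root_node))
print(tree_namespaces)
```

Because each path is derived from the tree, renaming a branch such as `FE` changes every namespace beneath it in one place.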
                                                                      

## Pattern 3: Merge Pattern

Multiple separate pipelines converge into a single combined pipeline.

**Structure**: `pipeline1`, `pipeline2`, `pipeline3` → `merged`

In [35]:
def extract_data():
    """Extract raw data."""
    return "raw_data"

def basic_features():
    """Create basic features."""
    return "basic_features"

def advanced_features():
    """Create advanced features."""
    return "advanced_features"

def combine_features(*args):
    """Combine any number of upstream datasets into one feature set."""
    return "combined_features"

In [36]:
# Create separate pipeline branches
extraction_pipeline = Pipeline([
    node(extract_data, inputs=None, outputs="extracted_data", name="extract_data")
])

feature_pipeline = Pipeline([
    node(basic_features, inputs=None, outputs="processed_features", name="process_features")
])

modeling_pipeline = Pipeline([
    node(advanced_features, inputs=None, outputs="advanced_features", name="train_final_model")
])

evaluation_pipeline = Pipeline([
    node(combine_features, inputs=["extracted_data","processed_features", "advanced_features"], outputs="model_metrics", name="evaluate_final_model")
])

In [37]:
# Create separate branches as ComposablePipelines
branch1 = ComposablePipeline(extraction_pipeline)
branch2 = ComposablePipeline(feature_pipeline)
branch3 = ComposablePipeline(modeling_pipeline)
branch4 = ComposablePipeline(evaluation_pipeline)

# Merge via Kedro's native pipeline combination
merged_result = branch1.build() + branch2.build() + branch3.build() + branch4.build()

print(draw_pipeline(merged_result))

 +--------------+                  +------------------+                 +-------------------+  
 | extract_data |                  | process_features |                 | train_final_model |  
 +--------------+                  +------------------+                 +-------------------+  
         *                                   *                                    *            
         *                                   *                                    *            
         *                                   *                                    *            
+----------------+                +--------------------+                +-------------------+  
| extracted_data |*               | processed_features |                | advanced_features |  
+----------------+ ******         +--------------------+            ****+-------------------+  
                         ******              *               *******                           
                               ******   
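
The merge works because Kedro connects nodes by dataset name: once the four pipelines are summed, `evaluate_final_model` depends on whichever nodes produce its three inputs. The sketch below reproduces that resolution with plain dictionaries and the standard library's `graphlib`. The data layout is a hypothetical simplification for illustration, not Kedro's internal representation.

```python
from graphlib import TopologicalSorter

# Hypothetical plain-data stand-ins: node name -> (inputs, outputs).
branches = [
    {"extract_data": ([], ["extracted_data"])},
    {"process_features": ([], ["processed_features"])},
    {"train_final_model": ([], ["advanced_features"])},
    {
        "evaluate_final_model": (
            ["extracted_data", "processed_features", "advanced_features"],
            ["model_metrics"],
        )
    },
]

# "Merging" is modelled here as a union of node definitions.
merged = {}
for branch in branches:
    merged.update(branch)

# Map each dataset to the node that produces it.
producer = {
    dataset: name
    for name, (_, outputs) in merged.items()
    for dataset in outputs
}

# A node depends on the producers of its inputs.
dependencies = {
    name: {producer[d] for d in inputs if d in producer}
    for name, (inputs, _) in merged.items()
}

run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

The three source nodes have no dependencies, so they may appear in any order; the combining node always comes last.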

## Pattern 4: Grid Composition (Cartesian Product)

Create all combinations of parent and child pipelines.

**Structure**: `grid` → `grid.fe1`/`grid.fe2` → `grid.fe1.model1`, `grid.fe1.model2`, `grid.fe2.model1`, `grid.fe2.model2`
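
The set of leaf namespaces in a grid is the Cartesian product of the branch names at each level. A minimal sketch with `itertools.product` is shown here, using the branch names from the cells that follow. It only enumerates namespace strings and does not call `compose_grid`, whose signature is not shown in this notebook.

```python
from itertools import product

# Branch names at each level of the grid.
fe_branches = ["FE1", "FE2"]
model_branches = ["Linear", "Ensemble"]

# One dotted namespace per (feature engineering, model) combination.
grid_namespaces = [
    f"{fe}.{model}" for fe, model in product(fe_branches, model_branches)
]

for namespace in grid_namespaces:
    print(namespace)

print(f"{len(fe_branches)} FE x {len(model_branches)} models = {len(grid_namespaces)} combinations")
```

Adding a third feature engineering branch or a third model only requires extending one list; the number of combinations grows multiplicatively.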

In [38]:
def extract_data():
    return "raw_data"

def basic_features(data):
    return f"basic_features_{data}"

def advanced_features(data):
    return f"advanced_features_{data}"

def train_linear_model(features):
    return f"linear_model_{features}"

def train_ensemble_model(features):
    return f"ensemble_model_{features}"

def evaluate_model(model):
    """Evaluate model performance."""
    return f"evaluation_of_{model}"


In [39]:
# Create root data pipeline
data_pipeline = Pipeline([
    node(extract_data, inputs=None, outputs="raw_data", name="extract_data")
])

# Create feature engineering pipelines
fe1_pipeline = Pipeline([
    node(basic_features, inputs="raw_data", outputs="fe1_features", name="basic_fe")
])

fe2_pipeline = Pipeline([
    node(advanced_features, inputs="raw_data", outputs="fe2_features", name="advanced_fe")
])


In [40]:

# Grid Composition: Create all combinations of FE x Models
root = ComposablePipeline(data_pipeline)

# Level 1: Feature engineering branches
fe1_branch = root.add_child("FE1", fe1_pipeline)
fe2_branch = root.add_child("FE2", fe2_pipeline)

# Level 2: Model combinations for each FE branch (Grid = FE x Models)
# FE1 combinations
fe1_branch.add_child("Linear", Pipeline([
    node(train_linear_model, inputs="fe1_features", outputs="fe1_linear_pred", name="fe1_linear_model")
]))
fe1_branch.add_child("Ensemble", Pipeline([
    node(train_ensemble_model, inputs="fe1_features", outputs="fe1_ensemble_pred", name="fe1_ensemble_model")
]))

# FE2 combinations
fe2_branch.add_child("Linear", Pipeline([
    node(train_linear_model, inputs="fe2_features", outputs="fe2_linear_pred", name="fe2_linear_model")
]))
fe2_branch.add_child("Ensemble", Pipeline([
    node(train_ensemble_model, inputs="fe2_features", outputs="fe2_ensemble_pred", name="fe2_ensemble_model")
]))

# Build and examine the result
grid_result = root.build()
print("Grid Composition Pattern - Generated Combinations:")
for model_node in grid_result.nodes:
    # Use a distinct loop variable so the imported `node` function is not shadowed
    if 'model' in model_node.name:
        print(f"  {model_node.name}: {model_node.inputs} → {model_node.outputs}")

print("\nTotal combinations generated: 2 FE × 2 Models = 4 model nodes")
print(f"Plus 3 infrastructure nodes = {len(grid_result.nodes)} total nodes")

# Visualize the grid
print(draw_pipeline(grid_result))

Grid Composition Pattern - Generated Combinations:
  FE1.Ensemble.fe1_ensemble_model: ['FE1.fe1_features'] → ['FE1.Ensemble.fe1_ensemble_pred']
  FE1.Linear.fe1_linear_model: ['FE1.fe1_features'] → ['FE1.Linear.fe1_linear_pred']
  FE2.Ensemble.fe2_ensemble_model: ['FE2.fe2_features'] → ['FE2.Ensemble.fe2_ensemble_pred']
  FE2.Linear.fe2_linear_model: ['FE2.fe2_features'] → ['FE2.Linear.fe2_linear_pred']

Total combinations generated: 2 FE × 2 Models = 4 model nodes
Plus 3 infrastructure nodes = 7 total nodes
                                                                           +--------------+                                                                          
                                                                           | extract_data |                                                                          
                                                                           +--------------+                                                               

## Summary

This notebook demonstrated all 4 patterns supported by kedro-compose:

1. **Simple Branching**: One parent → multiple children
2. **Multi-Level Tree**: Hierarchical structure with multiple levels
3. **Merge Pattern**: Combining separate pipelines
4. **Grid Composition**: Cartesian product of pipeline combinations

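The branching, tree, and grid patterns show how kedro-compose generates namespace strings automatically, which removes the need for manual string concatenation and reduces errors in complex pipeline structures. The merge pattern relies on Kedro's native `+` operator to combine the built pipelines.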