# Kedro Compose Patterns Demo

This notebook demonstrates the three main patterns supported by kedro-compose, using materialized Kedro `Pipeline` objects instead of factory functions.

In [110]:
%load_ext autoreload
%autoreload 2


In [111]:
from kedro.pipeline import Pipeline, node
from kedro_compose import ComposablePipeline, compose_grid, draw_pipeline


# Define some dummy functions for demonstration
def extract_data():
    """Extract raw data."""
    return "raw_data"


def train_linear_model(features):
    """Train linear model."""
    return f"linear_model_{features}"


def train_ensemble_model(features):
    """Train ensemble model."""
    return f"ensemble_model_{features}"


def train_tree_model(features):
    """Train tree model."""
    return f"tree_model_{features}"

## Pattern 1: Simple Branching

One parent pipeline branches into multiple child pipelines.

**Structure**: `parent` → `parent.child1`, `parent.child2`, `parent.child3`

In [112]:
# Create materialized pipelines
data_prep_pipeline = Pipeline(
    [
        node(
            extract_data,
            inputs=None,
            outputs="features",
            name="extract",
            namespace="fe1",
        )
    ]
)

model1_pipeline = Pipeline(
    [node(train_linear_model, inputs="features", outputs="linear_predictions")]
)

model2_pipeline = Pipeline(
    [node(train_ensemble_model, inputs="features", outputs="ensemble_predictions")]
)

model3_pipeline = Pipeline(
    [node(train_tree_model, inputs="features", outputs="linear_evaluation")]
)

# Create simple branching pattern
parent = ComposablePipeline(data_prep_pipeline)
parent.add_child("Model1", model1_pipeline)
parent.add_child("Model2", model2_pipeline)
parent.add_child("Model3", model3_pipeline)

# Build and examine the result
branching_result = parent.build()
print(draw_pipeline(branching_result))

                                                  +---------+                                                   
                                                  | extract |                                                   
                                                  +---------+                                                   
                                                       *                                                        
                                                       *                                                        
                                                       *                                                        
                                                 +----------+                                                   
                                              ***| features |****                                               
                                       *******   +----------+    *******                        

In [113]:
branching_result

Pipeline([
Node(extract_data, None, 'features', 'extract'),
Node(train_linear_model, 'features', 'Model1.linear_predictions', None),
Node(train_ensemble_model, 'features', 'Model2.ensemble_predictions', None),
Node(train_tree_model, 'features', 'Model3.linear_evaluation', None)
])
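
The printed pipeline suggests the naming rule at work: each child's outputs gain the child's name as a prefix, while the `features` input supplied by the parent stays unprefixed. The following is a minimal plain-Python sketch of that rule only. The helper `namespace_child` and the tuple layout are hypothetical illustrations, not part of kedro-compose or Kedro, and the library's real implementation may differ.

```python
def namespace_child(child_name, nodes, shared):
    """Prefix dataset names with child_name unless the parent provides them.

    nodes:  list of (func_name, inputs, outputs) tuples (hypothetical layout).
    shared: set of dataset names produced by the parent; these stay unprefixed.
    """

    def rename(name):
        return name if name in shared else f"{child_name}.{name}"

    return [
        (func, [rename(i) for i in inputs], [rename(o) for o in outputs])
        for func, inputs, outputs in nodes
    ]


# One child node that consumes the parent's "features" dataset
model1 = [("train_linear_model", ["features"], ["linear_predictions"])]
renamed = namespace_child("Model1", model1, shared={"features"})
print(renamed)
# [('train_linear_model', ['features'], ['Model1.linear_predictions'])]
```

The result mirrors the `Model1.linear_predictions` entry in the output above: only names that the parent does not provide are rewritten.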

## Pattern 2: Merge Pattern

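Multiple parallel branches converge into a single combined node. In this example the three branches are independent nodes in one pipeline, and `.merge()` appends a node that consumes all of their outputs.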

**Structure**: `pipeline1`, `pipeline2`, `pipeline3` → `merged`

In [114]:
def extract_data():
    """Extract raw data."""
    return "extracted_data"


def process_features():
    """Create processed features."""
    return "processed_features"


def advance_features():
    """Create advanced features."""
    return "advanced_features"


def combine_features(*args):
    return "combined_features"

In [115]:
extraction = node(
    extract_data, inputs=None, outputs="extracted_data", name="extract_data"
)

basic_features = node(
    process_features,
    inputs=None,
    outputs="processed_features",
    name="process_features",
)


advanced_features = node(
    advance_features,
    inputs=None,
    outputs="advanced_features",
    name="advance_features",
)

In [116]:
feature_pipeline = Pipeline(
    [extraction, basic_features, advanced_features]
)  # Three parallel outputs

branch = ComposablePipeline(feature_pipeline)
final = branch.merge(
    func=combine_features, outputs="combined_all_features"
)  # Inputs are detected automatically from the pipeline's outputs

result = final.build()
print(draw_pipeline(result))

 +------------------+                 +------------------+                  +--------------+   
 | process_features |                 | advance_features |                  | extract_data |   
 +------------------+                 +------------------+                  +--------------+   
           *                                    *                                   *          
           *                                    *                                   *          
           *                                    *                                   *          
+--------------------+                +-------------------+                +----------------+  
| processed_features |                | advanced_features |                | extracted_data |  
+--------------------+****            +-------------------+           *****+----------------+  
                          *******               *              *******                         
                                 *****  

In [117]:
result

Pipeline([
Node(advance_features, None, 'advanced_features', 'advance_features'),
Node(extract_data, None, 'extracted_data', 'extract_data'),
Node(process_features, None, 'processed_features', 'process_features'),
Node(combine_features, ['processed_features', 'advanced_features', 'extracted_data'], 'combined_all_features', 'merge_3_outputs')
])
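
The merge node's inputs were not passed explicitly, so they must have been inferred from the pipeline. One plausible way to do that is to collect every output that no node consumes. The sketch below illustrates that idea in plain Python. The helper `free_outputs` and the tuple layout are hypothetical, and kedro-compose may detect inputs differently or order them differently.

```python
def free_outputs(nodes):
    """Return outputs that no node in the list consumes, in first-seen order.

    nodes: list of (func_name, inputs, outputs) tuples (hypothetical layout).
    """
    consumed = {name for _, inputs, _ in nodes for name in inputs}
    result = []
    for _, _, outputs in nodes:
        for name in outputs:
            if name not in consumed and name not in result:
                result.append(name)
    return result


# The three parallel nodes from the feature pipeline above
nodes = [
    ("extract_data", [], ["extracted_data"]),
    ("process_features", [], ["processed_features"]),
    ("advance_features", [], ["advanced_features"]),
]

# The merge node consumes every free output and produces a single dataset
merge_inputs = free_outputs(nodes)
merge_node = ("combine_features", merge_inputs, ["combined_all_features"])
print(merge_node)
```

Once the merge node is added, the only free output left is `combined_all_features`, which is what makes the branches converge.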

## Pattern 3: Nested Tree / Cartesian Product

Create all combinations of parent and child pipelines using nested for-loops.

**Structure**: `grid` → `grid.fe1`/`grid.fe2` → `grid.fe1.model1`, `grid.fe1.model2`, `grid.fe2.model1`, `grid.fe2.model2`
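
The leaf namespaces in the structure above are the Cartesian product of the two levels. As a quick illustration, independent of kedro-compose, they can be enumerated with `itertools.product`. The names `fe_names` and `model_names` are placeholders taken from the structure line, not identifiers from the library.

```python
from itertools import product

# Placeholder level names, matching the structure line above
fe_names = ["fe1", "fe2"]
model_names = ["model1", "model2"]

# Every (feature engineering, model) pair becomes one dotted namespace
namespaces = [
    ".".join(("grid", fe, model)) for fe, model in product(fe_names, model_names)
]
print(namespaces)
# ['grid.fe1.model1', 'grid.fe1.model2', 'grid.fe2.model1', 'grid.fe2.model2']
```

With 2 feature engineering approaches and 2 models this yields 2 × 2 = 4 leaves. The nested for-loops in the cells below build the same combinations as pipeline branches rather than as strings.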

In [118]:
def extract_data():
    return "raw_data"

def basic_features(data):
    return f"basic_features_{data}"

def advanced_features(data):
    return f"advanced_features_{data}"

def train_linear_model(features):
    return f"linear_model_{features}"

def train_ensemble_model(features):
    return f"ensemble_model_{features}"

def evaluate_model(model):
    """Evaluate model performance."""
    return f"evaluation_of_{model}"


In [119]:
# Create root data pipeline
data_pipeline = Pipeline(
    [node(extract_data, inputs=None, outputs="raw_data", name="extract_data")]
)

# Create feature engineering pipelines
fe1_pipeline = Pipeline(
    [node(basic_features, inputs="raw_data", outputs="features", name="basic_fe")]
)

fe2_pipeline = Pipeline([
    node(advanced_features, inputs="raw_data", outputs="features", name="advanced_fe")
])

In [120]:
# Nested Tree / Cartesian Product: Create all combinations using for-loops
root = ComposablePipeline(data_pipeline)

# Define feature engineering and modeling approaches
fe_approaches = {
    "FE1": fe1_pipeline,
    "FE2": fe2_pipeline
}

model_approaches = {
    "Linear": Pipeline([
        node(train_linear_model, inputs="features", outputs="linear_predictions", name="linear_model")
    ]),
    "Ensemble": Pipeline([
        node(train_ensemble_model, inputs="features", outputs="ensemble_predictions", name="ensemble_model")
    ])
}

# Level 1: Create FE branches using for-loop
fe_branches = {}
for fe_name, fe_pipeline in fe_approaches.items():
    fe_branches[fe_name] = root.add_child(fe_name, fe_pipeline)

# Level 2: Create all FE × Model combinations using nested for-loops
for fe_name, fe_branch in fe_branches.items():
    for model_name, model_pipeline in model_approaches.items():
        fe_branch.add_child(model_name, model_pipeline)

# Build and examine the result
grid_result = root.build()
print(draw_pipeline(grid_result))

                                                                                 +--------------+                                                                                
                                                                                 | extract_data |                                                                                
                                                                                 +--------------+                                                                                
                                                                                         *                                                                                       
                                                                                         *                                                                                       
                                                                                         *                    

## Summary

This notebook demonstrated the three main patterns supported by kedro-compose:

1. **Simple Branching**: One parent → multiple children
2. **Merge Pattern**: Combining separate pipelines using `.merge()`
3. **Nested Tree / Cartesian Product**: Nested for-loops that create every FE × Model combination

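The branching and nested patterns show how kedro-compose generates namespace prefixes automatically (for example, `Model1.linear_predictions`), which removes the need for manual string concatenation and reduces errors in complex pipeline structures. The merge pattern shows that a combining node's inputs can be detected without listing them by hand.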