# Pipeline Workflow

CausalPy provides a composable pipeline API that chains causal inference steps into a single, reproducible workflow. Instead of calling experiment construction, sensitivity analysis, and report generation separately, you define them as steps in a pipeline.

In [None]:
import pandas as pd
from sklearn.linear_model import LinearRegression

import causalpy as cp

## Manual approach (before pipeline)

Without the pipeline, a CausalPy analysis calls each step by hand, in sequence:

In [None]:
df = (
    cp.load_data("its")
    .assign(date=lambda x: pd.to_datetime(x["date"]))
    .set_index("date")
)
treatment_time = pd.to_datetime("2017-01-01")

model = cp.create_causalpy_compatible_class(LinearRegression())

# Step 1: Fit the experiment
result = cp.InterruptedTimeSeries(
    df,
    treatment_time,
    formula="y ~ 1 + t",
    model=model,
)

# Step 2: Get effect summary
summary = result.effect_summary()
print(summary.text)
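
To make the estimate above less of a black box, here is a minimal NumPy sketch of the idea behind an interrupted time series fit with `y ~ 1 + t`: fit the trend on pre-treatment data only, extrapolate it as the counterfactual, and read the impact as observed minus predicted. The synthetic data, the variable names, and the use of `np.linalg.lstsq` are assumptions made for illustration; this is not how CausalPy implements `InterruptedTimeSeries`.

```python
import numpy as np

# Synthetic series (illustrative only): linear trend plus a +5 level shift
# starting at index 60.
rng = np.random.default_rng(0)
t = np.arange(100, dtype=float)
treatment_index = 60
y = 2.0 + 0.5 * t + rng.normal(0, 0.1, size=t.size)
y[treatment_index:] += 5.0

pre = t < treatment_index

# Design matrix for y ~ 1 + t; fit by ordinary least squares on the pre-period.
X = np.column_stack([np.ones_like(t), t])
coef, *_ = np.linalg.lstsq(X[pre], y[pre], rcond=None)

# Extrapolate the pre-period fit as the post-period counterfactual.
counterfactual = X[~pre] @ coef
impact = y[~pre] - counterfactual

print(f"Mean post-period impact: {impact.mean():.2f}")
```

Because the model never sees post-treatment data, any systematic gap between the observed series and the extrapolated trend is attributed to the intervention.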

## Pipeline approach

The pipeline wraps these steps into a single, declarative workflow. Each step is configured upfront, and the pipeline validates everything before running.

In [None]:
df = (
    cp.load_data("its")
    .assign(date=lambda x: pd.to_datetime(x["date"]))
    .set_index("date")
)

result = cp.Pipeline(
    data=df,
    steps=[
        cp.EstimateEffect(
            method=cp.InterruptedTimeSeries,
            treatment_time=pd.to_datetime("2017-01-01"),
            formula="y ~ 1 + t",
            model=cp.create_causalpy_compatible_class(LinearRegression()),
        ),
        cp.GenerateReport(include_plots=False),
    ],
).run()

print("Experiment type:", type(result.experiment).__name__)
print("Effect summary available:", result.effect_summary is not None)
print("Report generated:", result.report is not None)
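
The "validate first, then run" behaviour can be illustrated with a toy pipeline in plain Python. Everything in this sketch (`Context`, `Estimate`, `Report`, `run_pipeline`, and the `validate`/`run` method names) is hypothetical and exists only to show the pattern; it makes no claim about how `cp.Pipeline` is built internally.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Context:
    """Hypothetical container that steps read from and write to."""

    data: list
    outputs: dict = field(default_factory=dict)


class Estimate:
    def validate(self, previous_steps: list) -> None:
        pass  # no prerequisites in this toy example

    def run(self, ctx: Context) -> None:
        ctx.outputs["estimate"] = sum(ctx.data) / len(ctx.data)


class Report:
    def validate(self, previous_steps: list) -> None:
        # A report only makes sense if an estimate is produced earlier.
        if not any(isinstance(s, Estimate) for s in previous_steps):
            raise ValueError("Report requires an earlier Estimate step")

    def run(self, ctx: Context) -> None:
        ctx.outputs["report"] = f"estimate = {ctx.outputs['estimate']:.2f}"


def run_pipeline(data: list, steps: list) -> Context:
    # Check the whole configuration before doing any work.
    for i, step in enumerate(steps):
        step.validate(steps[:i])
    ctx = Context(data=data)
    for step in steps:
        step.run(ctx)
    return ctx


ctx = run_pipeline([1.0, 2.0, 3.0], [Estimate(), Report()])
print(ctx.outputs["report"])
```

In this sketch a misconfigured pipeline such as `run_pipeline([1.0], [Report()])` raises `ValueError` before any step runs, which is the benefit of validating upfront.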

## Adding sensitivity analysis

The `SensitivityAnalysis` step runs a suite of diagnostic checks against the fitted experiment. Checks are pluggable: pass the ones you want to run in the `checks` list.

In [None]:
result = cp.Pipeline(
    data=df,
    steps=[
        cp.EstimateEffect(
            method=cp.InterruptedTimeSeries,
            treatment_time=pd.to_datetime("2017-01-01"),
            formula="y ~ 1 + t",
            model=cp.create_causalpy_compatible_class(LinearRegression()),
        ),
        cp.SensitivityAnalysis(
            checks=[
                cp.checks.PlaceboInTime(n_folds=2),
            ]
        ),
        cp.GenerateReport(include_plots=False),
    ],
).run()

print(f"Sensitivity checks run: {len(result.sensitivity_results)}")
for check_result in result.sensitivity_results:
    print(f"  - {check_result.check_name}: {check_result.text[:80]}...")
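
For intuition, a placebo-in-time test can be sketched in plain NumPy: pretend the treatment happened earlier, refit on data before that pseudo-treatment time, and confirm that the estimated "effect" in the placebo window is close to zero. The fold layout below (two pseudo-treatment times with 20-step windows), the helper `mean_impact`, and the synthetic data are assumptions for illustration; the source does not show how `cp.checks.PlaceboInTime` selects its folds.

```python
import numpy as np

# Synthetic series (illustrative only): a real +5 effect begins at index 60.
rng = np.random.default_rng(1)
t = np.arange(100, dtype=float)
treatment_index = 60
y = 2.0 + 0.5 * t + rng.normal(0, 0.1, size=t.size)
y[treatment_index:] += 5.0


def mean_impact(t, y, start, stop):
    """Fit y ~ 1 + t on t < start; return mean(observed - predicted) on [start, stop)."""
    X = np.column_stack([np.ones_like(t), t])
    train = t < start
    window = (t >= start) & (t < stop)
    coef, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
    return float(np.mean(y[window] - X[window] @ coef))


# Two placebo folds, both evaluated entirely before the real treatment.
placebo_effects = [mean_impact(t, y, s, s + 20) for s in (20, 40)]
real_effect = mean_impact(t, y, treatment_index, len(t))

print("Placebo effects:", [round(e, 2) for e in placebo_effects])
print("Real effect:", round(real_effect, 2))
```

Placebo effects that are large relative to the real effect would suggest the model picks up spurious structure rather than a genuine intervention effect.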

## Available checks

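CausalPy provides a range of sensitivity checks, each applicable to specific experiment types. In the table, ITS is interrupted time series, SC is synthetic control, DiD is difference-in-differences, RD is regression discontinuity, and RKink is regression kink: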

| Check | Applicable methods | Description |
|-------|-------------------|-------------|
| `PlaceboInTime` | ITS, SC | Shifts treatment time backward to test for spurious effects |
| `PriorSensitivity` | All Bayesian | Re-fits with different priors |
| `ConvexHullCheck` | SC | Validates treated values are within control range |
| `PersistenceCheck` | ITS (3-period) | Checks if effects persist after intervention ends |
| `PreTreatmentPlaceboCheck` | Staggered DiD | Validates parallel trends via pre-treatment effects |
| `BandwidthSensitivity` | RD, RKink | Re-fits with multiple bandwidths |
| `LeaveOneOut` | SC | Drops each control unit and refits |
| `PlaceboInSpace` | SC | Treats each control as placebo treated |
| `McCraryDensityTest` | RD | Tests for running variable manipulation |
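
As one concrete example of what such a check looks at, the table's description of `ConvexHullCheck` ("treated values are within control range") can be approximated by a per-time-point range test. The arrays and variable names below are made up for illustration, and a simple min/max comparison is an assumed simplification; the actual check in CausalPy may be defined differently.

```python
import numpy as np

# Rows are time points, columns are control units (illustrative values).
controls = np.array(
    [
        [1.0, 2.0, 3.0],
        [2.0, 3.0, 4.0],
        [3.0, 4.0, 5.0],
    ]
)
treated = np.array([2.5, 2.8, 6.0])

# At each time point, is the treated value bracketed by the controls?
lower = controls.min(axis=1)
upper = controls.max(axis=1)
inside = (treated >= lower) & (treated <= upper)

print("Inside control range per time point:", inside.tolist())
print("Share of time points inside:", inside.mean())
```

A weighted combination of controls with non-negative weights summing to one cannot reach values outside this range, so time points that fall outside it signal that a synthetic control would have to extrapolate.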

## Pipeline result

The `PipelineResult` object contains all accumulated outputs:

In [None]:
print("result.experiment      ->", type(result.experiment).__name__)
print("result.effect_summary  ->", type(result.effect_summary).__name__)
print("result.sensitivity_results ->", len(result.sensitivity_results), "checks")
print("result.report          ->", "HTML" if result.report else "None")

The effect summary provides both prose (`.text`) and a table (`.table`):

In [None]:
if result.effect_summary is not None:
    print(result.effect_summary.text)
    display(result.effect_summary.table)