# Transportation Demand Modeling with Consist

If you've run ActivitySim, POLARIS, or a similar travel demand model (TDM), you know the pain:
dozens of scenarios, no clear record of what changed between runs, and results that are hard to
reproduce months later. This notebook shows how Consist addresses those problems with provenance
tracking, reproducible configs, and incremental computation.

**Prerequisites:** Quickstart and the Iterative Workflows tutorial. This notebook focuses on
transportation-specific patterns, not core Consist mechanics.


## What you'll learn

TDM practitioners often ask:

- "What exact parameters produced this VMT forecast I sent to the MPO six months ago?"
- "Did we run the 2045 scenario with the updated BPR parameters or the old ones?"
- "Can I extend this run to 15 iterations without losing a day of compute time?"
- "Which skim matrix fed the trip table in this scenario?"

This notebook shows how Consist answers those questions with provenance,
config tracking, and cache-aware workflows.


## Model Overview

We simulate morning commute mode choice in a stylized 5-zone linear city:

```
Zone 1 ←→ Zone 2 ←→ Zone 3 ←→ Zone 4 ←→ Zone 5
(Low)     (Med)     (High/CBD)  (Med)     (Low)
```

**Feedback loop:**
1. **Logsums**: Compute accessibility from current travel times
2. **Trip Distribution**: Workers choose job locations based on accessibility
3. **Mode Choice**: Each commuter chooses car, transit, or walk
4. **Assignment**: Car trips create congestion
5. **Skim Update**: Congestion slows travel times → back to step 1

The model iterates until mode shares stabilize.
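
The notebook runs a fixed number of iterations rather than stopping early, but "stabilize" can be made concrete with a simple rule. The sketch below is illustrative only: the helper names, the 0.005 tolerance, and the share values are assumptions, not part of the example model.

```python
from __future__ import annotations


def max_share_change(prev: dict[str, float], curr: dict[str, float]) -> float:
    """Largest absolute change in any mode's share between two iterations."""
    modes = set(prev) | set(curr)
    return max(abs(curr.get(m, 0.0) - prev.get(m, 0.0)) for m in modes)


def has_converged(history: list[dict[str, float]], tol: float = 0.005) -> bool:
    """True when every mode share moved by less than `tol` in the last iteration."""
    if len(history) < 2:
        return False
    return max_share_change(history[-2], history[-1]) < tol


# Made-up mode shares for three successive iterations.
history = [
    {"car": 0.70, "transit": 0.20, "walk": 0.10},
    {"car": 0.64, "transit": 0.25, "walk": 0.11},
    {"car": 0.638, "transit": 0.251, "walk": 0.111},
]

print(has_converged(history[:2]))  # False: car share moved by about 6 points
print(has_converged(history))      # True: all shares moved by well under 0.5 points
```

A check like this could be applied to the per-iteration mode shares the notebook collects, for example to decide whether 10 iterations were enough.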


## Workflow Structure (Iteration Loop)

This is the conceptual loop we'll run in each scenario:

```
┌─────────────┐
│    init     │ (iteration 0 only)
└──────┬──────┘
       │ skims, zones, population
       ▼
┌─────────────┐
│   logsums   │◄─── skims, zones
└──────┬──────┘
       │ logsums
       ▼
┌─────────────┐
│ trip_dist   │◄─── logsums, zones, population
└──────┬──────┘
       │ trips
       ▼
┌─────────────┐
│ utilities   │◄─── trips, skims, zones
└──────┬──────┘
       │ utilities
       ▼
┌─────────────┐
│ mode_choice │◄─── utilities, (prev trips_with_modes)
└──────┬──────┘
       │ trips_with_modes
       ▼
┌─────────────┐
│ assignment  │◄─── trips_with_modes
└──────┬──────┘
       │ volumes
       ▼
┌─────────────┐
│ traffic_sim │◄─── volumes, skims
└──────┬──────┘
       │ updated skims ──► next iteration
       ▼
```

For `run(...)` vs `trace(...)`, see the [Concepts Overview](../docs/concepts.md#when-to-use-each-pattern).


## Setup

We initialize the tracker, import the model functions, and define a scenario configuration.


In [None]:
from __future__ import annotations

import sys
from pathlib import Path


def _find_repo_root(start: Path) -> Path:
    for candidate in (start, *start.parents):
        if (candidate / "pyproject.toml").exists():
            return candidate
    raise RuntimeError("Could not locate repo root (missing pyproject.toml)")


REPO_ROOT = _find_repo_root(Path.cwd())
EXAMPLES_DIR = REPO_ROOT / "examples"
EXAMPLES_SRC = EXAMPLES_DIR / "src"

for path in (REPO_ROOT, EXAMPLES_SRC):
    if str(path) not in sys.path:
        sys.path.insert(0, str(path))

In [None]:
import os
from dataclasses import asdict, replace

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from tqdm import tqdm

import consist
from consist import Tracker

from travel_demand_functions import (
    AssignmentParams,
    DestinationChoiceParams,
    ModeChoiceParams,
    TravelDemandScenarioConfig,
    ZoneParams,
    apply_congestion,
    apply_mode_choice,
    compute_mode_shares,
    compute_mode_utilities,
    compute_od_logsums,
    compute_od_volumes,
    create_skims_dataset,
    distribute_trips,
    generate_distances,
    generate_population,
    generate_zones,
    save_skims,
    summarize_iteration,
)

sns.set_theme(style="whitegrid")

In [None]:
RUN_DIR = EXAMPLES_DIR / "runs" / "travel_demand_demo"
SESSION_ID = os.getenv("CONSIST_SESSION_ID", "demo")
DB_PATH = RUN_DIR / f"travel_demand_demo_{SESSION_ID}.duckdb"
# Start from a fresh database so reruns of this notebook do not mix provenance records.
if DB_PATH.exists():
    DB_PATH.unlink()

tracker = Tracker(
    run_dir=RUN_DIR,
    db_path=DB_PATH,
    hashing_strategy="fast",
    project_root=str(RUN_DIR),
)

## Model Configuration

The model has four parameter groups:

| Parameter Group           | Controls                                                     |
|---------------------------|--------------------------------------------------------------|
| `ZoneParams`              | City layout: population, jobs, parking costs, transit access |
| `ModeChoiceParams`        | Utility coefficients, speeds, costs                          |
| `DestinationChoiceParams` | How accessibility affects job location choice                |
| `AssignmentParams`        | BPR congestion function parameters                           |

In typical TDM practice, these parameters live in scattered config files, INI sections,
and hardcoded values. When you return to a run months later, reconstructing exactly
what was used is painful. Consist captures all of this in a queryable format.

**ZoneParams** defines the city structure. The central zone (Zone 3) has the most jobs and highest
parking costs, mimicking a downtown CBD. Outer zones are residential with cheaper or free parking.

**ModeChoiceParams** sets coefficients for a standard multinomial logit. `beta_cost` and
`beta_time` control sensitivity to cost/time; ASCs capture baseline preferences.
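
As a rough illustration of how these coefficients enter a multinomial logit, the sketch below computes systematic utilities, choice probabilities, and the logsum for a single origin-destination pair. The coefficient values, travel times, and costs are made-up numbers, and the helper functions are hypothetical stand-ins rather than the functions in `travel_demand_functions`.

```python
import math


def logit_probabilities(utilities: dict) -> dict:
    """Multinomial logit: P(m) = exp(V_m) / sum_k exp(V_k)."""
    v_max = max(utilities.values())  # subtract the max for numerical stability
    exp_v = {m: math.exp(v - v_max) for m, v in utilities.items()}
    total = sum(exp_v.values())
    return {m: e / total for m, e in exp_v.items()}


def logsum(utilities: dict) -> float:
    """Expected maximum utility across modes: ln(sum_k exp(V_k))."""
    v_max = max(utilities.values())
    return v_max + math.log(sum(math.exp(v - v_max) for v in utilities.values()))


# Assumed coefficients (illustrative, not the ModeChoiceParams defaults).
beta_time = -0.05  # utility per minute
beta_cost = -0.20  # utility per dollar
asc = {"car": 0.0, "transit": -0.5, "walk": -1.0}

# Assumed (time in minutes, cost in dollars) for one OD pair.
attributes = {"car": (15.0, 4.0), "transit": (25.0, 2.0), "walk": (40.0, 0.0)}

utilities = {
    mode: asc[mode] + beta_time * time + beta_cost * cost
    for mode, (time, cost) in attributes.items()
}
probabilities = logit_probabilities(utilities)
accessibility = logsum(utilities)

for mode, p in probabilities.items():
    print(f"{mode}: V={utilities[mode]:.2f}, P={p:.3f}")
print(f"logsum: {accessibility:.3f}")
```

The logsum is never smaller than the best single-mode utility, which is why it serves as an accessibility measure in the feedback loop: adding or improving any mode can only raise it.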

**DestinationChoiceParams** governs how workers choose job locations based on accessibility
(logsum) and employment size.
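
One common way to combine these two ingredients is a logit over destinations with utility `beta_size * ln(jobs) + beta_access * logsum`. The sketch below assumes that functional form (the example model's exact specification may differ) and uses made-up job counts, logsums, and coefficient values.

```python
import math


def destination_probabilities(jobs, logsums, beta_size, beta_access):
    """Logit over destinations using employment size and accessibility.

    Assumed utility for destination j: beta_size * ln(jobs_j) + beta_access * logsum_j.
    """
    utilities = [
        beta_size * math.log(j) + beta_access * ls for j, ls in zip(jobs, logsums)
    ]
    u_max = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - u_max) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up values for workers living in Zone 1 of a 5-zone linear city.
jobs = [500, 1500, 5000, 1500, 500]               # jobs in zones 1..5
logsums_from_zone_1 = [-1.0, -1.5, -2.0, -2.5, -3.0]  # accessibility falls with distance

probs = destination_probabilities(
    jobs, logsums_from_zone_1, beta_size=1.0, beta_access=0.5
)
for zone, p in enumerate(probs, start=1):
    print(f"Zone {zone}: {p:.3f}")
```

With these numbers the central zone attracts the most workers despite being farther away, and Zone 2 beats Zone 4 because it has the same job count but better accessibility.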

**AssignmentParams** controls the BPR (Bureau of Public Roads) function that converts volumes into
congested travel times.
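
The standard BPR form is `t = t0 * (1 + alpha * (v / c) ** beta)`, where `t0` is the free-flow time, `v` the volume, and `c` the capacity. A minimal sketch follows; `alpha=0.15` and `beta=4` are the commonly cited textbook defaults, not necessarily the values in `AssignmentParams`, and the function name is hypothetical.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """Congested travel time from the BPR volume-delay function."""
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)


# Made-up link: 10 minutes at free flow, capacity of 1000 vehicles.
for volume in (0, 500, 1000, 2000):
    minutes = bpr_travel_time(10.0, volume, 1000.0)
    print(f"v/c={volume / 1000.0:.1f}: {minutes:.2f} min")
```

Because of the fourth-power term, delay is small below capacity and grows quickly once volume exceeds it, which is what drives the feedback between car share and travel time in the loop.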


In [None]:
zone_params = ZoneParams()
mode_params = ModeChoiceParams()
dest_params = DestinationChoiceParams()
assignment_params = AssignmentParams()

DEFAULT_SEED = 0

base_config = TravelDemandScenarioConfig(
    n_iterations=10,
    seed=DEFAULT_SEED,
    zone_params=zone_params,
    mode_params=mode_params,
    dest_params=dest_params,
    assignment_params=assignment_params,
)

SCENARIO_NAME = "travel_demand_demo"

SKIM_PERTURBATION = 0.1
DESTINATION_UPDATE_SHARE = 0.2

## Workflow Functions

The next cell defines the model steps. It is long but mirrors typical TDM pipelines:
initialization, trip distribution, mode choice, assignment, and summaries.


In [None]:
@tracker.define_step(outputs=["skims", "persons", "zones"])
def initialize_scenario(*, zone_params, mode_params, skim_perturbation, _consist_ctx):
    zones = generate_zones(zone_params)
    population = generate_population(zones)
    distances = generate_distances(zone_params)

    skims = create_skims_dataset(zones, distances, mode_params)
    skims["time_car_mins"] *= skim_perturbation

    output_dir = _consist_ctx.run_dir
    output_dir.mkdir(parents=True, exist_ok=True)
    input_skims_path = output_dir / "skims_init.zarr"

    save_skims(skims, input_skims_path)

    consist.log_artifact(input_skims_path, key="skims", direction="output")
    consist.log_dataframe(
        population,
        key="persons",
        direction="output",
    )
    consist.log_dataframe(zones, key="zones", direction="output")


@tracker.define_step(outputs=["trips"])
def distribute_trips_step(*, dest_params, seed, update_share, _consist_ctx):
    zones = _consist_ctx.load("zones")
    population = _consist_ctx.load("persons")
    prev_trips_art = _consist_ctx.inputs.get("trips")
    prev_trips = _consist_ctx.load(prev_trips_art) if prev_trips_art else None
    logsums_df = _consist_ctx.load("logsums")
    if "origin" not in logsums_df.columns or "destination" not in logsums_df.columns:
        logsums_df = logsums_df.reset_index()
    logsums = logsums_df.set_index(["origin", "destination"])["logsum"].to_xarray()
    trips = distribute_trips(
        population,
        zones,
        logsums,
        dest_params,
        seed=seed,
        prev_trips=prev_trips,
        update_share=update_share,
    )
    consist.log_dataframe(trips, key="trips")


@tracker.define_step(
    outputs=[
        "mode_shares",
        "iteration_summaries",
        "pmt_totals",
        "mode_shares_plot",
        "iteration_totals_plot",
    ]
)
def summarize_results_step(*, mode_shares, summaries, pmt_totals, _consist_ctx):
    output_dir = _consist_ctx.run_dir
    summary_dir = output_dir / "summary"
    summary_dir.mkdir(parents=True, exist_ok=True)

    mode_shares_df = (
        pd.DataFrame.from_dict(mode_shares, orient="index")
        .sort_index()
        .rename_axis("iteration")
        .reset_index()
    )
    summaries_df = (
        pd.DataFrame.from_dict(summaries, orient="index")
        .sort_values("iteration")
        .reset_index(drop=True)
    )
    pmt_totals_df = (
        pd.DataFrame.from_dict(pmt_totals, orient="index")
        .sort_index()
        .rename_axis("iteration")
        .reset_index()
    )

    mode_shares_path = summary_dir / "mode_shares.csv"
    summaries_path = summary_dir / "iteration_summaries.csv"
    pmt_totals_path = summary_dir / "pmt_totals.csv"

    shares_long = mode_shares_df.melt(
        id_vars="iteration", var_name="mode", value_name="share"
    )
    plt.figure(figsize=(8, 4))
    sns.lineplot(data=shares_long, x="iteration", y="share", hue="mode", marker="o")
    plt.title("Mode Shares by Iteration")
    plt.tight_layout()
    mode_share_plot_path = summary_dir / "mode_shares.png"
    plt.savefig(mode_share_plot_path)
    plt.close()

    pmt_long = pmt_totals_df.melt(
        id_vars="iteration", var_name="mode", value_name="pmt"
    )
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    sns.lineplot(data=summaries_df, x="iteration", y="vmt", marker="o", ax=axes[0])
    axes[0].set_title("Vehicle Miles Traveled")
    sns.lineplot(
        data=pmt_long,
        x="iteration",
        y="pmt",
        hue="mode",
        marker="o",
        ax=axes[1],
    )
    axes[1].set_title("Person Miles Traveled by Mode")
    fig.tight_layout()
    totals_plot_path = summary_dir / "iteration_totals.png"
    fig.savefig(totals_plot_path)
    plt.close(fig)

    consist.log_dataframe(
        mode_shares_df,
        key="mode_shares",
        path=mode_shares_path,
    )
    consist.log_dataframe(
        summaries_df,
        key="iteration_summaries",
        path=summaries_path,
    )
    consist.log_dataframe(
        pmt_totals_df,
        key="pmt_totals",
        path=pmt_totals_path,
    )
    consist.log_artifact(
        mode_share_plot_path,
        key="mode_shares_plot",
        direction="output",
    )
    consist.log_artifact(
        totals_plot_path,
        key="iteration_totals_plot",
        direction="output",
    )


def run_scenario(scenario_config, scenario_run_id):
    mode_shares = {}
    summaries = {}
    pmt_totals = {}

    zone_params = scenario_config.zone_params
    mode_params = scenario_config.mode_params
    dest_params = scenario_config.dest_params
    assignment_params = scenario_config.assignment_params
    seed = scenario_config.seed

    with tracker.scenario(
        scenario_run_id,
        config={
            **asdict(scenario_config),
            "scenario_name": SCENARIO_NAME,
        },
        facet_from=[
            "n_iterations",
            "seed",
            "zone_params",
            "mode_params",
            "dest_params",
            "assignment_params",
            "scenario_name",
        ],
        tags=["examples", "simulation", "travel_demand"],
    ) as scenario:
        # Skip output checks on cache hits for speed; use "eager" to validate files.
        cache_validation = "lazy"
        # Copy cached inputs into the new run_dir on cache misses.
        cache_hydration = "inputs-missing"

        scenario.run(
            name="init",
            fn=initialize_scenario,
            config=asdict(mode_params)
            | asdict(zone_params)
            | {"skim_perturbation": SKIM_PERTURBATION},
            facet_from=["skim_perturbation"],
            inject_context=True,
            fn_args={
                "zone_params": zone_params,
                "mode_params": mode_params,
                "skim_perturbation": SKIM_PERTURBATION,
            },
        )

        for i in tqdm(range(scenario_config.n_iterations)):
            # Inline trace pattern (no wrapper function required).
            with scenario.trace(
                name="logsums",
                run_id=f"{scenario.run_id}_logsums_{i}",
                config=asdict(mode_params),
                inputs={"skims": "skims", "zones": "zones"},
                facet_from=[
                    "beta_cost",
                    "beta_time",
                    "asc_walk",
                    "asc_transit",
                    "fuel_cost_per_mile",
                    "transit_fare",
                ],
                validate_cached_outputs=cache_validation,
                cache_hydration=cache_hydration,
                iteration=i,
            ) as t:
                if not t.is_cached:
                    logsums = compute_od_logsums(
                        t.load(scenario.coupler.require("skims")),
                        t.load(scenario.coupler.require("zones")),
                        mode_params,
                    )
                    consist.log_dataframe(
                        logsums.to_dataframe().reset_index(),
                        key="logsums",
                    )

            scenario.run(
                name="trip_distribution",
                fn=distribute_trips_step,
                run_id=f"{scenario.run_id}_trip_distribution_{i}",
                config={
                    **asdict(dest_params),
                    "update_share": DESTINATION_UPDATE_SHARE,
                },
                inputs={
                    "skims": "skims",
                    "zones": "zones",
                    "persons": "persons",
                    "logsums": "logsums",
                },
                optional_input_keys=["trips"],
                facet_from=["beta_size", "beta_access", "update_share"],
                inject_context=True,
                fn_args={
                    "dest_params": dest_params,
                    "seed": seed + i,
                    "update_share": DESTINATION_UPDATE_SHARE,
                },
                iteration=i,
            )

            # `load_inputs=True` hydrates artifacts into function args by name.
            scenario.run(
                name="calculate_utilities",
                fn=compute_mode_utilities,
                run_id=f"{scenario.run_id}_utilities_{i}",
                config=asdict(mode_params),
                inputs={"trips": "trips", "skims": "skims", "zones": "zones"},
                facet_from=[
                    "beta_cost",
                    "beta_time",
                    "asc_walk",
                    "asc_transit",
                ],
                validate_cached_outputs=cache_validation,
                cache_hydration=cache_hydration,
                load_inputs=True,
                fn_args={"mode_params": mode_params},
                outputs=["utilities"],
                iteration=i,
            )

            mode_choice_result = scenario.run(
                name="mode_choice",
                fn=apply_mode_choice,
                run_id=f"{scenario.run_id}_mode_choice_{i}",
                config={"seed": seed},
                inputs={"utilities_df": "utilities"},
                fn_args={"seed": seed + i},
                outputs=["trips_with_modes"],
                iteration=i,
            )

            scenario.run(
                name="assignment",
                fn=compute_od_volumes,
                run_id=f"{scenario.run_id}_assignment_{i}",
                inputs={"trips": "trips_with_modes"},
                outputs=["volumes"],
                iteration=i,
            )

            scenario.run(
                name="traffic_simulation",
                fn=apply_congestion,
                run_id=f"{scenario.run_id}_traffic_simulation_{i}",
                config=asdict(assignment_params),
                inputs={"volumes": "volumes", "skims": "skims"},
                facet_from=["bpr_alpha", "bpr_beta", "base_capacity"],
                fn_args={"assignment_params": assignment_params},
                outputs=["skims"],
                iteration=i,
            )

            trips_with_modes = tracker.load(
                mode_choice_result.outputs["trips_with_modes"]
            )
            shares = compute_mode_shares(trips_with_modes)
            mode_shares[i] = pd.Series(shares)
            distance_by_mode = trips_with_modes.groupby("mode")["distance_miles"].sum()
            pmt_totals[i] = distance_by_mode.sort_index()
            summaries[i] = summarize_iteration(i, trips_with_modes, shares, 0, False)

        summary_result = scenario.run(
            name="summaries",
            fn=summarize_results_step,
            inputs={"trips_with_modes": "trips_with_modes"},
            inject_context=True,
            validate_cached_outputs=cache_validation,
            cache_hydration=cache_hydration,
            fn_args={
                "mode_shares": mode_shares,
                "summaries": summaries,
                "pmt_totals": pmt_totals,
            },
        )

        mode_shares_df = tracker.load(summary_result.outputs["mode_shares"])
        summaries_df = tracker.load(summary_result.outputs["iteration_summaries"])
        pmt_totals_df = tracker.load(summary_result.outputs["pmt_totals"])
        mode_share_plot_path = Path(
            tracker.resolve_uri(summary_result.outputs["mode_shares_plot"].uri)
        )
        totals_plot_path = Path(
            tracker.resolve_uri(summary_result.outputs["iteration_totals_plot"].uri)
        )

    return {
        "scenario_run_id": scenario_run_id,
        "mode_shares_df": mode_shares_df,
        "summaries_df": summaries_df,
        "pmt_totals_df": pmt_totals_df,
        "mode_share_plot_path": mode_share_plot_path,
        "totals_plot_path": totals_plot_path,
    }


## Baseline Scenario

Let's run a 10-iteration baseline and verify the model converges before comparing scenarios.


In [None]:
base_run_id = f"{SCENARIO_NAME}_{SESSION_ID}"
base_results = run_scenario(base_config, base_run_id)

analysis = base_results

mode_shares_df = analysis["mode_shares_df"]
summaries_df = analysis["summaries_df"]
pmt_totals_df = analysis["pmt_totals_df"]
mode_share_plot_path = analysis["mode_share_plot_path"]
totals_plot_path = analysis["totals_plot_path"]
SCENARIO_RUN_ID = analysis["scenario_run_id"]
final_iteration = int(summaries_df["iteration"].max())


### Convergence results

Mode shares should stabilize as the feedback loop converges.


In [None]:
from IPython.display import Image

Image(filename=str(mode_share_plot_path))

## Scenario Comparison: Higher Parking Costs

Now run a policy scenario with higher CBD parking costs and compare to the baseline. This is a
common TDM workflow: a scenario change followed by summary comparisons.


In [None]:
high_parking_config = replace(
    base_config,
    zone_params=ZoneParams(parking_costs=(0.0, 8.0, 25.0, 8.0, 0.0)),
)
high_parking_run_id = f"{SCENARIO_NAME}_{SESSION_ID}_high_parking"
high_parking_results = run_scenario(high_parking_config, high_parking_run_id)


base_shares = base_results["mode_shares_df"].set_index("iteration").loc[final_iteration]
hp_shares = (
    high_parking_results["mode_shares_df"].set_index("iteration").loc[final_iteration]
)
mode_share_comparison = pd.DataFrame(
    {
        "base": base_shares,
        "high_parking": hp_shares,
    }
)
mode_share_comparison["delta"] = (
    mode_share_comparison["high_parking"] - mode_share_comparison["base"]
)

base_pmt = base_results["pmt_totals_df"].set_index("iteration").loc[final_iteration]
hp_pmt = (
    high_parking_results["pmt_totals_df"].set_index("iteration").loc[final_iteration]
)
pmt_comparison = pd.DataFrame(
    {
        "base": base_pmt,
        "high_parking": hp_pmt,
    }
)
pmt_comparison["delta"] = pmt_comparison["high_parking"] - pmt_comparison["base"]

summary_cols = ["vmt", "total_trips", "car_share", "transit_share", "walk_share"]
base_summary = (
    base_results["summaries_df"]
    .set_index("iteration")
    .loc[final_iteration, summary_cols]
)
hp_summary = (
    high_parking_results["summaries_df"]
    .set_index("iteration")
    .loc[final_iteration, summary_cols]
)
summary_comparison = pd.DataFrame(
    {
        "base": base_summary,
        "high_parking": hp_summary,
    }
)
summary_comparison["delta"] = (
    summary_comparison["high_parking"] - summary_comparison["base"]
)

display(mode_share_comparison)
display(pmt_comparison)
display(summary_comparison)


### Interpreting the scenario deltas

The tables above should align with standard expectations: higher CBD parking costs reduce car use,
raise transit share, and lower VMT. Let's quantify the shift.


In [None]:
car_delta_pp = mode_share_comparison.loc["car", "delta"] * 100
transit_delta_pp = mode_share_comparison.loc["transit", "delta"] * 100
walk_delta_pp = mode_share_comparison.loc["walk", "delta"] * 100
vmt_delta = summary_comparison.loc["vmt", "delta"]
vmt_base = summary_comparison.loc["vmt", "base"]
vmt_pct = (vmt_delta / vmt_base) * 100 if vmt_base else float("nan")

# Report signed changes so the direction of each shift is explicit.
print(
    f"High parking changes car share by {car_delta_pp:+.1f} pp, "
    f"transit by {transit_delta_pp:+.1f} pp, and walk by {walk_delta_pp:+.1f} pp."
)
print(f"VMT changes by {vmt_pct:+.1f}% (Δ={vmt_delta:+.2f}).")


## Provenance for Scenario Management

This is the core scenario-management payoff. We'll use Consist's provenance to answer:
what changed between scenarios, how an output was produced, and which parameters drove it.

### Scenario config diff
Consist stores a flattened, queryable config facet for each run. Below we compare the
baseline and policy scenarios and show only the keys that differ.

We use `tracker.get_config_values(...)` for a clean, typed view of the facet data.


In [None]:
baseline_config = tracker.get_config_values(SCENARIO_RUN_ID, namespace="scenario")
policy_config = tracker.get_config_values(high_parking_results["scenario_run_id"], namespace="scenario")

diff_rows = []
for key in sorted(set(baseline_config) | set(policy_config)):
    if baseline_config.get(key) != policy_config.get(key):
        diff_rows.append({
            "key": key,
            "baseline": baseline_config.get(key),
            "policy": policy_config.get(key),
        })

diff_df = pd.DataFrame(diff_rows)
if diff_df.empty:
    print("No differences found between scenario facets.")
diff_df
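
To make "flattened" concrete, the sketch below turns a nested config into dotted keys and diffs two of them using only the standard library. The dotted-key format and the helper names are assumptions for illustration (Consist's internal facet layout may differ), and the baseline parking costs are hypothetical; only the policy values come from this notebook.

```python
def flatten(config: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into a single level with dotted key paths."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_configs(a: dict, b: dict) -> dict:
    """Map each differing flattened key to its (a, b) pair of values."""
    flat_a, flat_b = flatten(a), flatten(b)
    return {
        key: (flat_a.get(key), flat_b.get(key))
        for key in sorted(set(flat_a) | set(flat_b))
        if flat_a.get(key) != flat_b.get(key)
    }


baseline = {
    "n_iterations": 10,
    "seed": 0,
    # Hypothetical baseline values; the real defaults live in ZoneParams.
    "zone_params": {"parking_costs": (0.0, 5.0, 15.0, 5.0, 0.0)},
}
policy = {
    "n_iterations": 10,
    "seed": 0,
    "zone_params": {"parking_costs": (0.0, 8.0, 25.0, 8.0, 0.0)},
}

differences = diff_configs(baseline, policy)
print(differences)
```

Only `zone_params.parking_costs` appears in the result, which is the same "show only what differs" view the cell above builds from tracked facets.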


### Lineage tree: how a trip table was produced

This trace shows exactly which skim matrix (and which iteration) fed into the mode choice step
that produced the trip table. In real projects, this is the difference between guessing and knowing.


In [None]:
from rich import print as rprint
from rich.tree import Tree

def _add_lineage(branch, node):
    """Recursively add an artifact, its producing run, and that run's inputs to a rich Tree."""
    artifact = node["artifact"]
    art_label = f"{artifact.key} ({artifact.id})"
    art_branch = branch.add(art_label)
    run_node = node.get("producing_run")
    if not run_node:
        return
    run = run_node["run"]
    run_label = f"{run.model_name} run={run.id} iter={run.iteration}"
    run_branch = art_branch.add(run_label)
    for child in run_node.get("inputs", []):
        _add_lineage(run_branch, child)

mode_choice_run = next(
    run
    for run in tracker.find_runs(
        parent_id=SCENARIO_RUN_ID,
        model="mode_choice",
        status="completed",
    )
    if run.iteration == final_iteration
)
mode_choice_artifacts = tracker.get_artifacts_for_run(mode_choice_run.id)
trip_table_art = mode_choice_artifacts.outputs["trips_with_modes"]

lineage = tracker.get_artifact_lineage(trip_table_art.id, max_depth=4)
if lineage:
    tree = Tree("lineage")
    _add_lineage(tree, lineage)
    rprint(tree)


The tree above shows that the trip table from the final iteration depends on utilities
computed from skims updated in the prior iteration, which trace back to initialization.
Any change to init parameters would invalidate this entire chain.
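
The invalidation behavior can be pictured with a content-addressed key chain: if each step's key hashes its own config together with the keys of its inputs, a change at the top propagates to everything downstream. This is a conceptual sketch with hypothetical names and made-up parameter values, not a description of Consist's actual hashing scheme.

```python
import hashlib
import json


def step_key(name: str, config: dict, input_keys: dict) -> str:
    """Hash a step's name, config, and the keys of the artifacts it consumes."""
    payload = json.dumps(
        {"step": name, "config": config, "inputs": input_keys}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def chain_keys(init_config: dict) -> dict:
    """Keys for a three-step chain: init -> logsums -> trip_distribution."""
    init = step_key("init", init_config, {})
    logsums = step_key("logsums", {"beta_time": -0.05}, {"skims": init})
    trips = step_key("trip_distribution", {"beta_access": 0.5}, {"logsums": logsums})
    return {"init": init, "logsums": logsums, "trip_distribution": trips}


baseline = chain_keys({"skim_perturbation": 0.1})
changed_init = chain_keys({"skim_perturbation": 0.2})

# Steps whose key differs must be recomputed.
invalidated = [step for step in baseline if baseline[step] != changed_init[step]]
print(invalidated)  # ['init', 'logsums', 'trip_distribution']
```

Even though the `logsums` and `trip_distribution` configs are identical in both chains, their keys change because an upstream input changed.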


### Trace a specific output back to its inputs

VMT is derived from the final iteration's trip table. The cell below ties that
summary metric to the exact trip table artifact, the mode choice run that produced it,
and the utilities parameters used upstream.


In [None]:
vmt_value = (
    base_results["summaries_df"]
    .set_index("iteration")
    .loc[final_iteration, "vmt"]
)

mode_choice_run = next(
    run
    for run in tracker.find_runs(
        parent_id=SCENARIO_RUN_ID,
        model="mode_choice",
        status="completed",
    )
    if run.iteration == final_iteration
)
trip_table_art = tracker.get_artifacts_for_run(mode_choice_run.id).outputs["trips_with_modes"]

utilities_run = next(
    run
    for run in tracker.find_runs(
        parent_id=SCENARIO_RUN_ID,
        model="calculate_utilities",
        status="completed",
    )
    if run.iteration == final_iteration
)
utilities_config = tracker.get_config_values(
    utilities_run.id, namespace="calculate_utilities"
)

focus_keys = [
    "beta_cost",
    "beta_time",
    "asc_transit",
    "asc_walk",
    "fuel_cost_per_mile",
    "transit_fare",
]
params_subset = {k: utilities_config.get(k) for k in focus_keys}

pd.DataFrame([
    {
        "iteration": final_iteration,
        "vmt": vmt_value,
        "trip_table_artifact_id": str(trip_table_art.id),
        "mode_choice_run_id": mode_choice_run.id,
        "utilities_run_id": utilities_run.id,
        **params_subset,
    }
])


## Matrix views and cross-scenario analysis

Consist can consolidate matrix artifacts (like `skims`) into a lazy xarray Dataset across runs.
That makes it easy to compare skim values from two scenarios without manually managing files.


In [None]:
from consist.core.matrix import MatrixViewFactory

baseline_run = tracker.find_latest_run(
    parent_id=SCENARIO_RUN_ID,
    model="traffic_simulation",
    status="completed",
)
parking_run = tracker.find_latest_run(
    parent_id=high_parking_results["scenario_run_id"],
    model="traffic_simulation",
    status="completed",
)

matrix = MatrixViewFactory(tracker).load_matrix_view(
    "skims",
    variables=["time_car_mins"],
    run_ids=[baseline_run.id, parking_run.id],
)

baseline_skims = matrix.sel(run_id=baseline_run.id)
parking_skims = matrix.sel(run_id=parking_run.id)

downtown_zone = 3
baseline_to_cbd = baseline_skims.sel(destination=downtown_zone)["time_car_mins"]
parking_to_cbd = parking_skims.sel(destination=downtown_zone)["time_car_mins"]

compare = pd.DataFrame(
    {
        "origin": baseline_to_cbd.origin.values,
        "baseline_mins": baseline_to_cbd.values,
        "high_parking_mins": parking_to_cbd.values,
    }
)
compare["delta_mins"] = compare["high_parking_mins"] - compare["baseline_mins"]
compare.sort_values("delta_mins", ascending=False)

## Extending and resuming runs

Stakeholders often ask for "just a few more iterations." Here we extend the baseline run from 10
iterations to 15 and let Consist reuse cached steps from iterations 0-9.


In [None]:
# Log cache hits/misses during the incremental run.
os.environ["CONSIST_CACHE_DEBUG"] = "1"
extended_config = replace(base_config, n_iterations=15)
extended_run_id = f"{SCENARIO_NAME}_{SESSION_ID}_extended"
extended_results = run_scenario(extended_config, extended_run_id)


In [None]:
# Quick cache-hit check for the extended run.
cached_runs = [
    run
    for run in tracker.find_runs(parent_id=extended_results["scenario_run_id"])
    if run.meta.get("cache_hit")
]
[(run.model_name, run.iteration, run.id) for run in cached_runs]
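
Why extending from 10 to 15 iterations can reuse earlier work is easiest to see in a toy memoized loop. In the sketch below each iteration's cache key is derived from the previous iteration's key plus the step config, so a longer run with the same config retraces the same keys. Everything here (function names, the string config, the in-memory dict) is a simplified assumption, not Consist's implementation.

```python
import hashlib

cache = {}  # key -> stand-in for a stored output


def run_iteration(prev_key: str, step_config: str):
    """Return (key, cache_hit) for one iteration, computing only on a miss."""
    key = hashlib.sha256(f"{prev_key}|{step_config}".encode("utf-8")).hexdigest()
    hit = key in cache
    if not hit:
        cache[key] = f"output derived from {prev_key[:8]}"  # placeholder for real work
    return key, hit


def run_loop(n_iterations: int, step_config: str, init_key: str = "init") -> int:
    """Run the loop and return the number of cache hits."""
    key = init_key
    hits = 0
    for _ in range(n_iterations):
        key, hit = run_iteration(key, step_config)
        hits += int(hit)
    return hits


first = run_loop(10, "bpr_alpha=0.15")     # cold cache
extended = run_loop(15, "bpr_alpha=0.15")  # same config, five more iterations
changed = run_loop(15, "bpr_alpha=0.30")   # different config

print(first, extended, changed)  # 0 10 0
```

The extended run hits the cache for the first 10 iterations and computes only the last 5, while a config change produces new keys from the first iteration onward.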

## Materializing artifacts for handoff

In practice, you may need to deliver scenario outputs to a partner agency or share a
clean directory for handoff. Cache hits do not create new files in the extended run
directory, but the artifacts still exist and can be materialized on demand.

See: [Caching and Hydration](../docs/caching-and-hydration.md) for more detail.


In [None]:
from consist.core.materialize import materialize_artifacts

cached_run = next(
    run
    for run in tracker.find_runs(
        parent_id=extended_results["scenario_run_id"],
        model="mode_choice",
        status="completed",
    )
    if run.iteration == 3 and run.meta.get("cache_hit")
)
trip_table = tracker.get_artifacts_for_run(cached_run.id).outputs["trips_with_modes"]

handoff_dir = RUN_DIR / "handoff"
materialized = materialize_artifacts(
    tracker,
    [(trip_table, handoff_dir / "trips_iter3.parquet")],
)
print(materialized)
sorted(handoff_dir.glob("*.parquet"))


## Calibration and sensitivity patterns (optional)

For parameter sweeps and calibration-style workflows, see
[Parameter Sweeps](../examples/01_parameter_sweep_monte_carlo.ipynb).


## Integration patterns

When coupling models (e.g., travel demand with BEAM, ActivitySim, or a grid model), Consist
tracks exactly which outputs from the travel model became inputs to the downstream model.
That means you can answer, "Which trip table version produced this load forecast?" even
across model boundaries.

See the Usage Guide for integration patterns; a worked coupling example is planned.


## Summary

Consist gives transportation modelers reproducibility, scenario management, and incremental
computation for iterative TDM workflows. You can trace outputs back through the full lineage,
compare scenarios with confidence, and extend runs without starting over.
