# SmartBuildSim Workflow Notebook

This notebook mirrors the [`examples/scripts/run_example.py`](../scripts/run_example.py) workflow using the public SmartBuildSim APIs.
It walks through synthetic data generation, model training, anomaly detection, clustering, reinforcement learning, and visualisation.

## Install dependencies

Run the same installation command documented in the [Quickstart guide](../../docs/quickstart.md) before executing the notebook:

```bash
pip install -e ".[dev]"
```

The extras include Jupyter, plotting, and testing dependencies so every cell can run without additional setup.

## Configure scenario and outputs

This example uses the built-in `office-small` preset and writes all artefacts to `examples/outputs/`, resolved relative to the kernel's current working directory.
Re-running a cell regenerates its artefacts and overwrites any existing files.
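
The overwrite behaviour can be checked in isolation. The following is a minimal sketch that uses a temporary directory as a stand-in for `examples/outputs/`; the file contents are placeholders, not real SmartBuildSim output.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for examples/outputs/
    out = Path(tmp) / "examples" / "outputs"

    contents = []
    for run in ("first run", "second run"):
        # exist_ok=True makes directory creation safe to repeat
        out.mkdir(parents=True, exist_ok=True)
        artefact = out / "dataset.csv"
        # write_text truncates any existing file before writing
        artefact.write_text(run)
        contents.append(artefact.read_text())

    # Only one file exists afterwards: the second run replaced the first
    files_after = sorted(p.name for p in out.iterdir())
```

Because each run replaces the previous file, copy `examples/outputs/` elsewhere first if earlier results need to be kept.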

In [None]:
from pathlib import Path

import pandas as pd

from smartbuildsim.data.generator import DataGeneratorConfig, generate_dataset
from smartbuildsim.models.anomaly import AnomalyDetectionConfig, detect_anomalies
from smartbuildsim.models.clustering import ClusteringConfig, cluster_zones
from smartbuildsim.models.forecasting import ForecastingConfig, train_forecasting_model
from smartbuildsim.models.rl import RLConfig, train_policy
from smartbuildsim.scenarios.presets import get_scenario
from smartbuildsim.viz.plots import PlotConfig, plot_time_series

In [None]:
# Create the output directory; safe to re-run because exist_ok=True
output_dir = Path('examples/outputs')
output_dir.mkdir(parents=True, exist_ok=True)

# Load the preset and build one config object per workflow step
scenario = get_scenario('office-small')
data_config = DataGeneratorConfig(**scenario.data.dict())
forecast_config = ForecastingConfig(**scenario.forecasting.dict())
anomaly_config = AnomalyDetectionConfig(**scenario.anomaly.dict())
cluster_config = ClusteringConfig(**scenario.clustering.dict())
rl_config = RLConfig(**scenario.rl.dict())
# Plot the same sensor that the forecasting step targets
plot_config = PlotConfig(sensor=scenario.forecasting.sensor)

## Generate synthetic telemetry

Pass the `DataGeneratorConfig` to `generate_dataset` to produce deterministic building telemetry, then save it as `dataset.csv` for downstream steps.
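
A common way to make synthetic data deterministic is to draw all noise from a seeded random number generator. The sketch below illustrates that idea with the standard library only. The `sample_temperature` function, its parameters, and the signal shape are hypothetical, and whether SmartBuildSim seeds its generator this way is an assumption.

```python
import math
import random


def sample_temperature(seed, steps=24):
    """Hypothetical generator: daily sine cycle plus seeded Gaussian noise."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    return [
        21.0 + 2.0 * math.sin(2 * math.pi * hour / 24) + rng.gauss(0.0, 0.3)
        for hour in range(steps)
    ]


run_a = sample_temperature(seed=42)
run_b = sample_temperature(seed=42)
run_c = sample_temperature(seed=7)

# The same seed reproduces the series exactly; a different seed does not
same_seed_matches = run_a == run_b
different_seed_differs = run_a != run_c
```

Reproducibility of this kind is what lets downstream metrics be compared across runs.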

In [None]:
dataset = generate_dataset(scenario.building, data_config)
dataset_path = output_dir / 'dataset.csv'
dataset.to_csv(dataset_path, index=False)
dataset.head()

## Train forecasting model

Fit the forecasting model defined by the preset, then inspect the first five predictions alongside the RMSE (root-mean-square error).
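
For reference, RMSE is the square root of the mean squared difference between actual and predicted values. The sketch below works through a small example with made-up numbers; the `rmse` helper is illustrative and is not SmartBuildSim's implementation.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values: every prediction is off by exactly 0.5
actual = [20.0, 21.0, 22.0, 23.0]
predicted = [20.5, 20.5, 22.5, 22.5]
error = rmse(actual, predicted)  # 0.5
```

RMSE is expressed in the same unit as the sensor, so a value of 0.5 on a temperature sensor means a typical error of about half a degree.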

In [None]:
forecast_result = train_forecasting_model(dataset, forecast_config)
forecast_summary = {
    'rmse': forecast_result.rmse,
    'predictions': forecast_result.predictions[:5].tolist(),
}
forecast_summary

## Detect anomalies

The anomaly detector flags unusual telemetry points. The resulting dataframe, `anomaly_result.data`, is reused later for plotting; the cell reports its row count.
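
The detector configured by the preset is not described here, so the following sketch uses a simple z-score rule purely to illustrate what flagging a telemetry point means. The `flag_anomalies` helper, the threshold, and the readings are all hypothetical.

```python
from statistics import mean, pstdev


def flag_anomalies(values, threshold=2.0):
    """Flag values whose distance from the mean exceeds threshold std devs."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant series has no outliers
        return [False] * len(values)
    return [abs(v - mu) / sigma > threshold for v in values]


# Made-up temperature readings with one obvious spike
readings = [21.0, 21.2, 20.9, 21.1, 35.0, 21.0, 20.8, 21.3]
flags = flag_anomalies(readings)
anomalous = [v for v, is_flagged in zip(readings, flags) if is_flagged]
```

In this sample only the 35.0 reading is flagged. A z-score rule is a baseline; the preset's detector may use a different technique.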

In [None]:
anomaly_result = detect_anomalies(dataset, anomaly_config)
len(anomaly_result.data)

## Cluster building zones

Group zones using the clustering configuration shipped with the scenario.
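
To illustrate what grouping zones can look like, the sketch below runs a two-cluster k-means on one summary value per zone. The choice of k-means, the zone names, and the values are assumptions made for illustration; the preset's clustering configuration may select a different algorithm and feature set.

```python
def kmeans_1d(values, iterations=10):
    """Two-cluster k-means on scalars, initialised at the min and max."""
    centres = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre
        labels = [
            min(range(2), key=lambda c: abs(v - centres[c])) for v in values
        ]
        # Update step: move each centre to the mean of its members
        for c in range(2):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels, centres


# Hypothetical mean temperature per zone
zone_means = {"lobby": 19.8, "office_a": 22.1, "office_b": 22.4, "server_room": 18.9}
labels, centres = kmeans_1d(list(zone_means.values()))
assignments = dict(zip(zone_means, labels))
```

Here the two cooler zones share one cluster and the two offices share the other.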

In [None]:
cluster_result = cluster_zones(dataset, cluster_config)
cluster_result.assignments.head()

## Train reinforcement learning policy

Optimise the RL control policy and report the mean episodic reward.
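
Mean episodic reward is commonly computed by summing the rewards within each episode and then averaging those totals across episodes. The sketch below shows that arithmetic with made-up rewards; whether `average_reward()` aggregates in exactly this way is an assumption.

```python
# Made-up per-step rewards for three episodes
episode_rewards = [
    [-3.0, -2.0, -1.0],
    [-2.0, -1.0, -1.0],
    [-1.0, -0.5, -0.5],
]

# Total reward (return) collected in each episode
episode_returns = [sum(rewards) for rewards in episode_rewards]

# Average of the per-episode totals
mean_reward = sum(episode_returns) / len(episode_returns)
```

A rising mean across training runs suggests the policy is improving, although individual episodes can still vary widely.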

In [None]:
rl_result = train_policy(rl_config)
rl_result.average_reward()

## Visualise sensor telemetry

Plot the forecasting target sensor (`scenario.forecasting.sensor`) with anomaly annotations. The figure is written to `examples/outputs/sensor_plot.png`.

In [None]:
plot_path = output_dir / 'sensor_plot.png'
plot_time_series(dataset, plot_config, plot_path, anomalies=anomaly_result.data)
plot_path

## Summarise outputs

Collect key metrics and artefact locations for quick reference.
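
One detail to keep in mind: the `clusters` column holds a list of records inside a single CSV cell, so it is stored as text rather than as structured data. The sketch below reproduces the effect with the standard `csv` module and recovers the list with `ast.literal_eval`. The record contents are made up, and pandas is expected to serialise the cell in a similar way, which is an assumption to verify on the real `summary.csv`.

```python
import ast
import csv
import io

# Made-up cluster records, shaped like to_dict(orient='records') output
records = [{"zone": "lobby", "cluster": 0}, {"zone": "office_a", "cluster": 1}]

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["forecast_rmse", "clusters"])
writer.writerow([0.5, records])  # non-string values are written via str()

# Read the row back: the nested list comes back as a plain string
buffer.seek(0)
row = next(csv.DictReader(buffer))
raw_cell = row["clusters"]

# literal_eval safely parses the Python-literal text into a list again
restored = ast.literal_eval(raw_cell)
```

If the assignments need to be consumed by other tools, writing them to a separate CSV file is usually easier to work with than a nested cell.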

In [None]:
summary = pd.DataFrame(
    {
        'forecast_rmse': [forecast_result.rmse],
        'avg_rl_reward': [rl_result.average_reward()],
        'clusters': [cluster_result.assignments.to_dict(orient='records')],
    }
)
summary_path = output_dir / 'summary.csv'
summary.to_csv(summary_path, index=False)
print(f'Dataset saved to {dataset_path}')
print(f'Forecast summary: {forecast_summary}')
print(f'Anomaly output rows: {len(anomaly_result.data)}')
print(f'Cluster assignments saved with {len(cluster_result.assignments)} entries')
print(f'RL mean reward: {rl_result.average_reward():.3f}')
print(f'Plot saved to {plot_path}')
print(f'Summary saved to {summary_path}')
summary