# Example 3: Hierarchical Seasonal Selection

This notebook demonstrates **hierarchical candidate generation**: features are computed at **daily** resolution, but selection operates at the **monthly** level under seasonal constraints.

Why hierarchical? Scoring months by their daily composition gives a finer-grained quality signal than month-level statistics alone. Enforcing one month per season also guarantees seasonal coverage, which an unconstrained optimization might sacrifice for aggregate fidelity.

Key concepts:
- `GroupQuotaHierarchicalCombiGen`: constrained candidate generation with seasonal quotas
- Pareto front visualization: understanding trade-offs between objectives

In [None]:
import pandas as pd
import energy_repset as rep
import energy_repset.diagnostics as diag

In [None]:
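# Load the time series; the first CSV column becomes the datetime index
url = "https://tubcloud.tu-berlin.de/s/pKttFadrbTKSJKF/download/time-series-lecture-2.csv"
df_raw = pd.read_csv(url, index_col=0, parse_dates=True).rename_axis('variable', axis=1)
# Exclude the 'prices' column from the analysis
df_raw = df_raw.drop(columns='prices')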

## Problem context with daily slicing

We slice at the **day** level (365 daily slices). Features are computed per day, which gives the objective functions far more granular data than month-level features would.

In [None]:
child_slicer = rep.TimeSlicer(unit="day")
context = rep.ProblemContext(df_raw=df_raw, slicer=child_slicer)
print(f"{len(context.get_unique_slices())} daily slices")

In [None]:
feature_engineer = rep.StandardStatsFeatureEngineer()
context = feature_engineer.run(context)
print(f"Features computed for {len(context.df_features)} daily periods")

## Objectives: Wasserstein + Correlation

Two complementary fidelity metrics:
- **Wasserstein**: are the value distributions of each variable preserved?
- **Correlation**: are the dependencies *between* variables preserved?
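As a rough illustration of what these two metrics measure, the sketch below computes a one-dimensional Wasserstein distance between two empirical samples and the gap between two Pearson correlations, using only the standard library. The helper names `wasserstein_1d` and `pearson`, the variable names `load` and `wind`, and the toy numbers are assumptions for illustration; `rep.WassersteinFidelity` and `rep.CorrelationFidelity` may normalize or aggregate differently.

```python
from bisect import bisect_right
from math import sqrt


def wasserstein_1d(a, b):
    """Area between two empirical CDFs (1-D Wasserstein-1 distance)."""
    a, b = sorted(a), sorted(b)
    points = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(points, points[1:]):
        cdf_a = bisect_right(a, left) / len(a)
        cdf_b = bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def pearson(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    std_x = sqrt(sum((u - mean_x) ** 2 for u in x))
    std_y = sqrt(sum((v - mean_y) ** 2 for v in y))
    return cov / (std_x * std_y)


# Hypothetical toy data: a "full" series and a 4-value "selection" from it
load_full = [40, 42, 45, 50, 55, 60, 58, 47]
wind_full = [30, 28, 25, 20, 12, 10, 11, 22]
picked = [0, 3, 5, 7]  # positions of the selected values
load_sel = [load_full[i] for i in picked]
wind_sel = [wind_full[i] for i in picked]

dist_gap = wasserstein_1d(load_full, load_sel)
corr_gap = abs(pearson(load_full, wind_full) - pearson(load_sel, wind_sel))
print(f"Wasserstein gap (load): {dist_gap:.3f}")
print(f"Correlation gap (load vs wind): {corr_gap:.3f}")
```

A small distribution gap does not imply a small correlation gap, which is why the two metrics are tracked as separate objectives.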

The `ParetoMaxMinStrategy` picks the Pareto-optimal combination that maximizes the worst-performing objective, which makes it a robust, balanced choice.
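One way to read this rule, sketched under assumptions: treat every score as "lower is better", rescale each objective to [0, 1] across all candidates, keep the non-dominated candidates, and pick the one whose worst rescaled score is smallest. The function names, the min-max rescaling, and the toy scores are hypothetical; the library's `ParetoMaxMinStrategy` may normalize or break ties differently.

```python
def dominates(p, q):
    """p dominates q: no worse on every objective, strictly better on one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def pareto_front(scores):
    """Keep only candidates that no other candidate dominates."""
    return {
        name: s
        for name, s in scores.items()
        if not any(dominates(other, s) for other in scores.values())
    }


def rescale(scores):
    """Min-max rescale each objective to [0, 1] across all candidates."""
    columns = list(zip(*scores.values()))
    lows = [min(c) for c in columns]
    spans = [(max(c) - min(c)) or 1.0 for c in columns]
    return {
        name: tuple((v - lo) / sp for v, lo, sp in zip(s, lows, spans))
        for name, s in scores.items()
    }


def pareto_max_min(scores):
    """Among Pareto-optimal candidates, minimize the worst rescaled score."""
    front = pareto_front(scores)
    scaled = rescale(scores)
    return min(front, key=lambda name: max(scaled[name]))


# Hypothetical (wasserstein, correlation) scores, lower is better
scores = {
    "A": (0.10, 0.90),  # best on the first objective, worst on the second
    "B": (0.30, 0.35),  # balanced
    "C": (0.80, 0.10),  # best on the second objective, worst on the first
    "D": (0.50, 0.60),  # dominated by B
}
print(sorted(pareto_front(scores)))  # ['A', 'B', 'C']
print(pareto_max_min(scores))        # B
```

The balanced candidate wins because its worst objective is still acceptable, whereas the two extremes each pay heavily on one axis.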

In [None]:
objective_set = rep.ObjectiveSet({
    'wasserstein': (0.5, rep.WassersteinFidelity()),
    'correlation': (0.5, rep.CorrelationFidelity()),
})
policy = rep.ParetoMaxMinStrategy()

## Hierarchical combination generator

`GroupQuotaHierarchicalCombiGen` does two things:

1. **Seasonal quotas**: enforces exactly 1 month per season (winter, spring, summer, fall), so the 4 selected months are structurally diverse
2. **Hierarchical evaluation**: each candidate "month" is expanded to its constituent days for scoring
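The expansion step can be pictured with the standard library alone. This sketch assumes a non-leap year (2015 is a placeholder, not taken from the dataset) and a hypothetical helper `days_in_months`; how the library performs the mapping internally is not shown here.

```python
import calendar
from datetime import date


def days_in_months(year, months):
    """Expand month numbers into the list of calendar days they contain."""
    return [
        date(year, month, day)
        for month in months
        for day in range(1, calendar.monthrange(year, month)[1] + 1)
    ]


# Hypothetical candidate: one month per season (Jan, Apr, Jul, Oct)
candidate = [1, 4, 7, 10]
days = days_in_months(2015, candidate)
print(len(days))  # 31 + 30 + 31 + 31 = 123
```

The objectives are then scored on these child days rather than on one aggregate value per month.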

With 3 months per season and 1 pick each, we get $3^4 = 81$ candidate combinations, far fewer than the unconstrained $\binom{12}{4} = 495$.
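The two counts can be checked by brute force. The season-to-month mapping below assumes meteorological seasons (December to February as winter, and so on); the library's own mapping may differ.

```python
from itertools import product
from math import comb

# Assumed meteorological seasons, months given as calendar numbers
seasons = {
    "winter": [12, 1, 2],
    "spring": [3, 4, 5],
    "summer": [6, 7, 8],
    "fall": [9, 10, 11],
}

# One month from each season: the Cartesian product of the four groups
constrained = list(product(*seasons.values()))
print(len(constrained))  # 81
print(comb(12, 4))       # 495
```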

In [None]:
combi_gen = rep.GroupQuotaHierarchicalCombiGen.from_slicers_with_seasons(
    parent_k=4,
    dt_index=df_raw.index,
    child_slicer=child_slicer,
    group_quota={'winter': 1, 'spring': 1, 'summer': 1, 'fall': 1}
)

days = context.get_unique_slices()
print(f"{combi_gen.count(days)} candidate combinations")
print("Each = 4 months (1 per season), evaluated on ~120 days total")

## Run the workflow

In [None]:
search_algorithm = rep.ObjectiveDrivenCombinatorialSearchAlgorithm(objective_set, policy, combi_gen)
representation_model = rep.KMedoidsClustersizeRepresentation()

workflow = rep.Workflow(feature_engineer, search_algorithm, representation_model)
experiment = rep.RepSetExperiment(context, workflow)
result = experiment.run()

In [None]:
# Identify which months were selected
selected_months = sorted({day.asfreq('M') for day in result.selection})
print(f"Selected months: {selected_months}")
print(f"Total days in selection: {len(result.selection)}")
print(f"Scores: {result.scores}")

## Pareto front analysis

The scatter plot shows all 81 evaluated combinations in objective space. The highlighted Pareto front contains the non-dominated solutions: no other combination is at least as good on both objectives and strictly better on one. The selected combination is marked.

In [None]:
fig = diag.ParetoScatter2D(
    objective_x='wasserstein', objective_y='correlation'
).plot(search_algorithm=search_algorithm, selected_combination=result.selection)
fig.update_layout(title='Pareto Front: Wasserstein vs Correlation')
fig.show()

In [None]:
fig = diag.ParetoParallelCoordinates().plot(search_algorithm=search_algorithm)
fig.update_layout(title='Pareto Front: Parallel Coordinates')
fig.show()

## Score contributions and weights

In [None]:
fig = diag.ScoreContributionBars().plot(result.scores, normalize=True)
fig.update_layout(title='Score Component Contributions (Normalized)')
fig.show()

In [None]:
fig = diag.ResponsibilityBars().plot(result.weights, show_uniform_reference=True)
fig.update_layout(title='Responsibility Weights')
fig.show()

## Distribution fidelity per variable

Overlaying the empirical cumulative distribution functions (ECDFs) for each variable shows how well the selection reproduces the full-year distribution.
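For reference, an ECDF and the largest vertical gap between two of them (the Kolmogorov-Smirnov statistic) can be computed in a few lines. The helpers `ecdf` and `max_ecdf_gap` and the toy samples are hypothetical and independent of `diag.DistributionOverlayECDF`.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F, where F(x) is the share of sample values <= x."""
    ordered = sorted(sample)
    return lambda x: bisect_right(ordered, x) / len(ordered)


def max_ecdf_gap(full, selection):
    """Largest vertical distance between the two ECDFs."""
    f_full, f_sel = ecdf(full), ecdf(selection)
    # Both ECDFs are step functions, so checking the jump points suffices
    return max(abs(f_full(x) - f_sel(x)) for x in set(full) | set(selection))


# Hypothetical toy samples
full = [1, 2, 3, 4, 5, 6, 7, 8]
selection = [2, 4, 6, 8]
print(ecdf(full)(4))                  # 0.5
print(max_ecdf_gap(full, selection))  # 0.125
```

In the overlay plots, a small gap shows up as two curves that stay close together across the whole value range.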

In [None]:
selected_indices = child_slicer.get_indices_for_slice_combi(df_raw.index, result.selection)
df_selection = df_raw.loc[selected_indices]

for var in df_raw.columns:
    fig = diag.DistributionOverlayECDF().plot(df_raw[var], df_selection[var])
    fig.update_layout(title=f'ECDF Overlay: {var}')
    fig.show()

## Feature space with selection

In [None]:
cols = list(context.df_features.columns[:2])
fig = diag.FeatureSpaceScatter2D().plot(
    context.df_features, x=cols[0], y=cols[1], selection=result.selection
)
fig.update_layout(title='Feature Space with Selection')
fig.show()