# Notebook 2: Fixed-Size Conditioning for Structural Degrees of Freedom

This notebook performs **analytic size conditioning** in preparation for computing structural degrees of freedom in photovoltaic system configurations.

It consumes system-level structural context and configuration representations produced upstream in **Repo 3, Notebook 1**, where the admissible configuration state space was constructed without conditioning or inference. Here, system size is explicitly treated as **fixed by analysis**, rather than as a configuration dimension.

The purpose of this notebook is to:
- Define and validate size-based conditioning primitives
- Construct size indices and size bands suitable for stratified analysis
- Diagnose size variability and boundary behavior within and across bands
- Establish the analytic framework required to evaluate structural degrees of freedom at fixed size

This notebook does **not** compute structural degrees of freedom, define regimes, assess deviation surfaces, or assign risk. Its role is strictly to prepare the conditioned analytical substrate on which such computations can be performed consistently and comparably in subsequent stages.


## Phase 5 — Fixed-Size Conditioning

This phase defines what it means to “hold system size fixed” for the purposes of
structural analysis in Repo 3.

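System size is conditioned relative to its installation-year cohort in order to
account for historical scale shifts and changing market baselines. Concretely,
each system's size is expressed as a ratio to the expected size for its cohort,
so a value of 1.0 means the system matches its cohort baseline. Conditioning is
purely descriptive and establishes comparability, not evaluation.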

No judgments regarding abnormality, deviation, efficiency, or risk are made in
this phase. Size conditioning serves only to place systems of different absolute
sizes onto a common, dimensionless scale suitable for downstream structural
analysis.

All subsequent phases operate on size-conditioned representations rather than
raw system size.
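
As a minimal illustration of the cohort-relative scale described in this phase, the sketch below builds a toy table and divides each system's size by a per-cohort baseline. The data are invented, the `toy_*` column names are hypothetical, and using the cohort median as the baseline is an assumption made only for this example. In this notebook the baseline arrives precomputed in `baseline_expected_system_size_kw`.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame(
    {
        "installation_year_cohort": [2012, 2012, 2012, 2020, 2020, 2020],
        "system_size_kw": [3.0, 4.0, 6.0, 6.0, 8.0, 12.0],
    }
)

# Assumption: the cohort baseline is the cohort median size.
# The real baseline is produced upstream and may be defined differently.
toy["toy_baseline_kw"] = (
    toy.groupby("installation_year_cohort")["system_size_kw"].transform("median")
)

# Dimensionless ratio of observed size to cohort baseline.
toy["toy_size_index"] = toy["system_size_kw"] / toy["toy_baseline_kw"]

print(toy)
```

In this toy table the same 6 kW system has an index of 1.5 in the 2012 cohort and 0.75 in the 2020 cohort. The index is therefore comparable across cohorts in a way that raw kW is not.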

In [None]:
# Size conditioning variable:
# cohort-relative size index (dimensionless)

# Defensive checks
required_cols = [
    "system_size_kw",
    "baseline_expected_system_size_kw",
    "installation_year_cohort",
]

missing = [c for c in required_cols if c not in df_system_context.columns]
assert not missing, f"Missing required columns for size conditioning: {missing}"

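# Guard the denominator: a zero baseline would yield an infinite ratio,
# which would pass both the NaN check and the positivity check below.
assert (df_system_context["baseline_expected_system_size_kw"] > 0).all(), (
    "baseline_expected_system_size_kw must be strictly positive."
)

# Construct size index: ratio to cohort expectation
df_system_context["size_index"] = (
    df_system_context["system_size_kw"]
    / df_system_context["baseline_expected_system_size_kw"]
)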

# Sanity checks
assert df_system_context["size_index"].notna().all(), (
    "size_index contains NaN values."
)

assert (df_system_context["size_index"] > 0).all(), (
    "size_index must be strictly positive."
)

# Quick distribution check (no interpretation)
df_system_context["size_index"].describe()


In [None]:
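# Size band construction (descriptive)
import pandas as pd  # harmless if already imported earlier in the notebook

# Number of quantile bands (adjustable, but fixed for now)
N_BANDS = 10

# Quantile bands over size_index, labelled by integer code (0 = smallest).
# Note: duplicates="drop" merges tied quantile edges, so fewer than N_BANDS
# bands may be produced when size_index contains many repeated values.
df_system_context["size_band"] = pd.qcut(
    df_system_context["size_index"],
    q=N_BANDS,
    labels=False,
    duplicates="drop",
)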

# Sanity checks
assert df_system_context["size_band"].notna().all(), (
    "size_band contains NaN values."
)

df_system_context["size_band"].value_counts().sort_index()
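
One way to inspect band boundaries and within-band spread, in line with the diagnostic goals listed at the top of this notebook, is sketched below on synthetic data. The lognormal sample, the seed, and the names `toy_index`, `summary`, and `tied` are assumptions for illustration and are not part of the pipeline. Passing `retbins=True` makes `pd.qcut` return the bin edges alongside the band codes, which also shows how many bands were actually produced.

```python
import numpy as np
import pandas as pd

# Synthetic, ratio-like data; the distribution and seed are arbitrary.
rng = np.random.default_rng(0)
toy_index = pd.Series(rng.lognormal(mean=0.0, sigma=0.5, size=1000))

N_BANDS = 10
bands, edges = pd.qcut(
    toy_index, q=N_BANDS, labels=False, duplicates="drop", retbins=True
)

# Bands actually produced; can be below N_BANDS when quantile edges tie.
n_realized = len(edges) - 1

# Per-band count and range of the index.
summary = toy_index.groupby(bands).agg(["count", "min", "max"])
summary.index.name = "size_band"
summary["width"] = summary["max"] - summary["min"]
print(n_realized)
print(summary)

# Heavily tied data: four bands requested, fewer produced.
tied = pd.Series([1, 1, 1, 1, 1, 1, 2, 3, 4, 5], dtype=float)
tied_bands, tied_edges = pd.qcut(
    tied, q=4, labels=False, duplicates="drop", retbins=True
)
print(len(tied_edges) - 1)
```

Quantile bands hold roughly equal counts by construction, so band widths can differ substantially, especially in the tails. Comparing `n_realized` with `N_BANDS` is a cheap way to notice when ties have silently reduced the number of bands.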
