# PSP HPS Workflow Parity Demo

This notebook reproduces the size-biased fit for BC permanent sample plot (PSP)
`4000002_PSP1_v1_p1` using the `dbhdistfit` horizontal point sampling (HPS)
workflow. It mirrors the reference manuscript figures by overlaying the fitted
Weibull curve on the expanded stand table and plotting the residuals alongside.

In [None]:
import sys
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Ensure local src/ package is importable when running from examples/
sys.path.append(str(Path('..', 'src').resolve()))

In [None]:
from dbhdistfit.weighting import hps_expansion_factor
from dbhdistfit.workflows import fit_hps_inventory

## Load HPS Tallies

The example bundle ships pre-aggregated tallies for the first measurement of
`4000002-PSP1`. The cells below read two columns: `dbh_cm`, the DBH class
midpoint in centimetres, and `tally`, the tree count for that class.

In [None]:
data_path = Path('hps_baf12/4000002_PSP1_v1_p1.csv')
data = pd.read_csv(data_path)
data.head()

## Fit the HPS Workflow

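`fit_hps_inventory` expands the tallies by the HPS factor, assigns weights, and
fits the default candidate set (Weibull + Gamma). It returns one result per
candidate; the cell below keeps the one with the smallest residual sum of
squares (`gof['rss']`).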

In [None]:
dbh = data['dbh_cm'].to_numpy()
tally = data['tally'].to_numpy()

results = fit_hps_inventory(dbh, tally, baf=12.0)
best = min(results, key=lambda result: result.gof['rss'])
best
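
The selection step relies only on each result's `gof['rss']`. As a minimal
sketch of what that comparison amounts to, the helper below computes RSS for two
candidate curves and keeps the smaller one. It assumes RSS is the plain,
unweighted sum of squared residuals; the notebook does not say whether
`dbhdistfit` applies weights to it. The names `rss_sketch` and `candidates`, and
all numbers, are illustrative and are not output from the package.

```python
def rss_sketch(observed, fitted):
    """Plain residual sum of squares: sum of (observed - fitted) ** 2."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))


# Illustrative values only, not taken from the PSP plot.
observed = [120.0, 80.0, 30.0]
candidates = {
    "weibull": [118.0, 83.0, 29.0],
    "gamma": [110.0, 90.0, 35.0],
}

# Score every candidate curve, then keep the name with the lowest RSS.
scores = {name: rss_sketch(observed, curve) for name, curve in candidates.items()}
best_name = min(scores, key=scores.get)
```

With these made-up curves the first candidate wins because its squared
residuals sum to a smaller total, which is the same rule `min(...)` applies to
the real results.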

## Prepare Plotting Data

The stand table is the tally scaled by the HPS expansion factor. Diagnostics
from the fit already provide the fitted curve and residuals in stand-table
units.

In [None]:
expansion = hps_expansion_factor(dbh, baf=12.0)
stand_table = tally * expansion
fitted = best.diagnostics['fitted']
residuals = best.diagnostics['residuals']

dict(
    distribution=best.distribution,
    rss=best.gof['rss'],
    parameters=best.parameters,
)
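
For readers unfamiliar with the expansion step: in the standard metric
horizontal point sampling relation, each tallied tree represents the BAF divided
by its own basal area, in stems per hectare. The sketch below assumes BAF in
m²/ha and DBH in centimetres. Whether `hps_expansion_factor` uses exactly this
form is an assumption, and `hps_expansion_sketch` and the example values are
hypothetical.

```python
import math


def hps_expansion_sketch(dbh_cm, baf):
    """Stems per hectare represented by one tallied tree in each DBH class.

    Assumes metric units: BAF in m^2/ha and DBH in cm, so the basal area of
    one tree is pi * (dbh_cm / 200) ** 2 square metres.
    """
    return [baf / (math.pi * (d / 200.0) ** 2) for d in dbh_cm]


# Illustrative midpoints and tallies only, not values from the PSP plot.
example_dbh = [10.0, 20.0, 40.0]
example_tally = [3, 2, 1]

example_factors = hps_expansion_sketch(example_dbh, baf=12.0)
# Stand table = tally scaled by the per-class expansion factor.
example_stand_table = [t * f for t, f in zip(example_tally, example_factors)]
```

Because the factor falls with the square of DBH, halving the diameter
quadruples the stems per hectare that a single tallied tree represents. This is
why small classes dominate the expanded stand table even when their raw tallies
are modest.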

## Overlay Stand Table and Fitted Curve

The left panel reproduces the manuscript-style comparison between the expanded
stand table and the fitted curve. The curve drawn is the lowest-RSS candidate
selected above, and the legend names its distribution. The right panel shows the
residual at each DBH midpoint.

In [None]:
# The 'seaborn-v0_8-*' style names require Matplotlib 3.6 or newer.
plt.style.use('seaborn-v0_8-muted')
fig, axes = plt.subplots(1, 2, figsize=(14, 5), gridspec_kw={'wspace': 0.25})
ax_table, ax_resid = axes

# Observed stand table with the fitted curve overlaid
ax_table.bar(dbh, stand_table, width=0.9, alpha=0.6, label='Observed stand table')
ax_table.plot(
    dbh, fitted, color='C1', linewidth=2.5,
    label=f"Fitted {best.distribution.title()}",
)
ax_table.set_xlabel('DBH (cm)')
ax_table.set_ylabel('Expanded stems per hectare')
ax_table.set_title('HPS Stand Table vs Fitted Curve')
ax_table.legend(frameon=False)

# Residuals
ax_resid.axhline(0, color='0.4', linewidth=1, linestyle='--')
ax_resid.bar(dbh, residuals, width=0.9, color='C3', alpha=0.8)
ax_resid.set_xlabel('DBH (cm)')
ax_resid.set_ylabel('Residual (observed - fitted)')
ax_resid.set_title('Residual Diagnostics')

for axis in axes:
    axis.set_xlim(dbh.min() - 1, dbh.max() + 1)

fig.suptitle('BC PSP 4000002-PSP1 (BAF 12) HPS Fit', fontsize=16, y=1.02)
plt.show()

The curve and residuals reproduce the regression target locked in
`tests/test_hps_parity.py`, confirming parity with the reference workflow.
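
One way to check this agreement inside the notebook, rather than by eye, is an
element-wise tolerance comparison. The sketch below is hypothetical: the helper
name `matches_reference`, the tolerances, and the sample numbers are
assumptions, and the contents of the test module are not reproduced here.

```python
import math


def matches_reference(values, reference, rel_tol=1e-6, abs_tol=1e-9):
    """Return True when both sequences have equal length and agree element-wise."""
    values, reference = list(values), list(reference)
    if len(values) != len(reference):
        return False
    return all(
        math.isclose(v, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for v, r in zip(values, reference)
    )


# Illustrative numbers only; real reference values live in the test module.
reference_curve = [118.0, 83.0, 29.0]
recomputed_curve = [118.0000001, 83.0, 29.0]
curve_ok = matches_reference(recomputed_curve, reference_curve)
```

The same helper accepts NumPy arrays such as `fitted` or `residuals`, since it
only iterates over the values. The tolerances should be set to whatever the
locked regression test actually uses.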