# Growthcurves Analysis Tutorial

This tutorial demonstrates how to fit growth models and extract growth statistics using the growthcurves package.

The analysis workflow includes:
1. Generating or loading growth data
2. Fitting **mechanistic** models (ODE-based, parametric)
3. Fitting **phenomenological** models (parametric and non-parametric)
4. Extracting growth statistics from all fits
5. Saving results for visualization

For visualization of the results, see the companion notebook: `plotting.ipynb`

In [None]:
from pprint import pprint

import numpy as np
import pandas as pd

import growthcurves as gc

## Generate synthetic data

This cell generates synthetic growth data from a noise-free logistic curve on top of a constant baseline OD of 0.05, sampled every 12 minutes (440 points, 0 to 87.8 h).

In [2]:
# Generate synthetic growth data from logistic function
# The seed only matters if random noise is added; the curve below is deterministic
np.random.seed(42)

# Parameters for synthetic growth curve
n_points = 440
measurement_interval_minutes = 12
time = np.array([(measurement_interval_minutes * n) / 60 for n in range(n_points)])


def logistic_growth(t, baseline, N0, K, mu, lag):
    """Logistic growth model with smooth transition through lag phase"""
    # Standard logistic formula shifted in time by `lag`: the growth term
    # equals N0 at t = lag and saturates at K. Its inflection point
    # (growth = K / 2) lies at t = lag + ln((K - N0) / N0) / mu, not at t = lag.
    growth = K / (1 + ((K - N0) / N0) * np.exp(-mu * (t - lag)))
    return baseline + growth


# Generate clean logistic curve
data = logistic_growth(time, 0.05, 0.05, 0.45, 0.15, 30.0)
data = data.tolist()

## How Growth Parameters Are Calculated

The table below summarizes how the main reported growth statistics are calculated across model classes.

| Output key | Meaning | How it is calculated |
|---|---|---|
| `max_od` | Maximum observed/fitted OD | Maximum OD over the valid data range |
| `mu_max` | Maximum specific growth rate (μ_max) | Maximum of `d(ln N)/dt` from the fitted model (or local fit for non-parametric) |
| `intrinsic_growth_rate` | Intrinsic model rate parameter | For mechanistic models: fitted intrinsic `μ`; for phenomenological/non-parametric: `None` |
| `doubling_time` | Doubling time in hours | `ln(2) / mu_max` |
| `time_at_umax` | Time at maximum specific growth | Time at which the specific growth rate `d(ln N)/dt` reaches its maximum (`mu_max`) |
| `od_at_umax` | OD at time of μ_max | Model-predicted OD at `time_at_umax` |
| `exp_phase_start`, `exp_phase_end` | Exponential phase boundaries | From threshold or tangent phase-boundary method in `extract_stats()` |
| `model_rmse` | Fit error | RMSE between observed OD and model-predicted OD over the model fit window |
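
The formulas in the table can be checked by hand on a sampled curve. The sketch below is not the library's implementation: it assumes a noise-free logistic curve like the one generated above (redefined here so the block runs on its own) and approximates `d(ln N)/dt` with `np.gradient`. The name `specific_rate` is illustrative.

```python
import numpy as np

# Rebuild the tutorial's synthetic curve: logistic growth on a 0.05 baseline
t = np.arange(440) * 12 / 60  # hours, one point every 12 minutes
od = 0.05 + 0.45 / (1 + ((0.45 - 0.05) / 0.05) * np.exp(-0.15 * (t - 30.0)))

# Specific growth rate: finite-difference estimate of d(ln N)/dt
specific_rate = np.gradient(np.log(od), t)

i_max = int(np.argmax(specific_rate))
mu_max = specific_rate[i_max]        # maximum specific growth rate (1/h)
doubling_time = np.log(2) / mu_max   # hours
time_at_umax = t[i_max]              # hours
od_at_umax = od[i_max]
max_od = od.max()

print(f"mu_max={mu_max:.4f}  doubling_time={doubling_time:.2f}")
print(f"time_at_umax={time_at_umax:.1f}  od_at_umax={od_at_umax:.3f}  max_od={max_od:.3f}")
```

Because this works on raw finite differences rather than a fitted model, the numbers should be close to, but not identical with, the statistics the package reports later in the tutorial.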

For this tutorial:
- Mechanistic comparisons use mechanistic parametric fits.
- Phenomenological comparisons include both phenomenological parametric and non-parametric fits.


## Extract growth stats from the dataset

The `extract_stats()` function (called as `gc.utils.extract_stats()` in the cells below) returns these key metrics:

- `max_od`: Maximum OD value within the fitted window
- `mu_max`: **Observed** maximum specific growth rate μ_max (hour⁻¹) - calculated from the fitted curve
- `intrinsic_growth_rate`: **Model parameter** for the intrinsic growth rate (mechanistic models only; `None` for phenomenological parametric and non-parametric fits)
- `doubling_time`: Time to double the population at peak growth (hours)
- `exp_phase_start`: When exponential phase begins (hours)
- `exp_phase_end`: When exponential phase ends (hours)
- `time_at_umax`: Time when μ reaches its maximum (hours)
- `od_at_umax`: OD value at time of maximum μ
- `fit_t_min`: Start of fitting window (hours)
- `fit_t_max`: End of fitting window (hours)
- `fit_method`: Identifier for the method used
- `model_rmse`: Root mean squared error

Descriptive statistics are read directly from the fitted model where a matching parameter exists; otherwise they are calculated from the fitted curve. The tables below show how each statistic is obtained for each approach:

## MECHANISTIC MODELS

| Name | Model | Equation | Exp Start | Exp End | Intrinsic μ | μ max | Carrying Capacity | Fit |
|------|-------|----------|-----------|---------|-------------|-------|-------------------|-----|
| Logistic | parametric | `dN/dt = μ * (1 - N / K) * N` | threshold/<br>tangent | threshold/<br>tangent | μ | max dln(N)/dt | K | entire curve |
| Gompertz | parametric | `dN/dt = μ * math.log(K / N) * N` | threshold/<br>tangent | threshold/<br>tangent | μ | max dln(N)/dt | K | entire curve |
| Richards | parametric | `dN/dt = μ * (1 - (N / K)**beta) * N` | threshold/<br>tangent | threshold/<br>tangent | μ | max dln(N)/dt | K | entire curve |
| Baranyi | parametric | `dN/dt= μ * math.exp(μ * t) / (math.exp(h0) - 1 + math.exp(μ * t)) * (1 - N / K) * N` | threshold/<br>tangent | threshold/<br>tangent | μ | max dln(N)/dt | K | entire curve |
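
The ODEs in the table can be integrated numerically to see the curves they produce. The sketch below is an illustration, not the package's solver: it uses a hand-written fixed-step RK4 integrator (`rk4` is a name introduced here), arbitrary parameter values, and checks the logistic and Gompertz rows against their closed-form solutions.

```python
import numpy as np


def rk4(rhs, n0, t):
    """Integrate dN/dt = rhs(N) on the time grid t with classical RK4."""
    n = np.empty_like(t)
    n[0] = n0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        k1 = rhs(n[i])
        k2 = rhs(n[i] + 0.5 * h * k1)
        k3 = rhs(n[i] + 0.5 * h * k2)
        k4 = rhs(n[i] + h * k3)
        n[i + 1] = n[i] + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return n


# Illustrative parameter values (assumptions, not fitted results)
mu, K, N0 = 0.15, 0.45, 0.001
t = np.linspace(0.0, 87.8, 440)

# Right-hand sides copied from the Logistic and Gompertz rows of the table
logistic = rk4(lambda n: mu * (1 - n / K) * n, N0, t)
gompertz = rk4(lambda n: mu * np.log(K / n) * n, N0, t)

# Closed-form solutions of the same two ODEs, for comparison
logistic_exact = K / (1 + ((K - N0) / N0) * np.exp(-mu * t))
gompertz_exact = K * np.exp(np.log(N0 / K) * np.exp(-mu * t))

print("max |error| logistic:", np.max(np.abs(logistic - logistic_exact)))
print("max |error| gompertz:", np.max(np.abs(gompertz - gompertz_exact)))
```

Both errors should be very small, which confirms that the integrated curves follow the equations as written in the table.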

## PHENOMENOLOGICAL MODELS

| Name | Model | Equation | Exp Start | Exp End | Intrinsic μ | μ max | Max OD | Fit |
|------|-------|----------|-----------|---------|-------------|-------|--------|-----|
| Linear | non-parametric | `ln(N(t)) = ln(N0) + b * t` | threshold/<br>tangent | threshold/<br>tangent | n.a. | b | max OD raw | only window |
| Spline | non-parametric | `ln(N(t)) = spline(t)` | threshold/<br>tangent | threshold/<br>tangent | n.a. | max of derivative of spline | max OD raw | only log phase |
| Logistic (phenom) | parametric | `ln(N(t)/N0) = A / (1 + exp(4 * μ_max * (λ - t) / A + 2))` | λ | threshold/<br>tangent | n.a. | μ_max | K | entire curve |
| Gompertz (phenom) | parametric | `ln(N(t)/N0) = A * exp(-exp(μ_max * exp(1) * (λ - t) / A + 1))` | λ | threshold/<br>tangent | n.a. | μ_max | K | entire curve |
| Gompertz (modified) | parametric | `ln(N(t)/N0) = A * exp(-exp(μ_max * exp(1) * (λ - t) / A + 1)) + A * exp(α * (t - t_shift))` | λ | threshold/<br>tangent | n.a. | μ_max | K | entire curve |
| Richards (phenom) | parametric | `ln(N(t)/N0) = A * (1 + ν * exp(1 + ν + μ_max * (1 + ν)**(1/ν) * (λ - t) / A))**(-1/ν)` | λ | threshold/<br>tangent | n.a. | μ_max | K | entire curve |
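
In the parametric forms above, `μ_max` and `λ` have a direct geometric meaning: the steepest slope of `ln(N/N0)` and the time at which the tangent at that point crosses zero. The sketch below checks this numerically for the logistic and Gompertz rows. It is a standalone illustration (the function names are introduced here, not taken from the package), using parameter values rounded from the phenomenological logistic fit printed later in this tutorial.

```python
import numpy as np


def phenom_logistic(t, A, mu_max, lam):
    """ln(N/N0) for the logistic form in the table above."""
    return A / (1 + np.exp(4 * mu_max * (lam - t) / A + 2))


def phenom_gompertz(t, A, mu_max, lam):
    """ln(N/N0) for the Gompertz form in the table above."""
    return A * np.exp(-np.exp(mu_max * np.e * (lam - t) / A + 1))


# Rounded from the phenomenological logistic fit shown later
A, mu_max, lam = 2.30, 0.080, 21.9
t = np.linspace(0.0, 87.8, 8781)  # fine grid (0.01 h) for numerical derivatives

for model in (phenom_logistic, phenom_gompertz):
    y = model(t, A, mu_max, lam)
    slope = np.gradient(y, t)
    i = int(np.argmax(slope))
    # Tangent at the steepest point, extended down to y = 0 (i.e. N = N0)
    tangent_zero = t[i] - y[i] / slope[i]
    print(f"{model.__name__}: max slope={slope[i]:.4f}, tangent hits 0 at t={tangent_zero:.2f}")
```

Both models should report a maximum slope close to `mu_max` and a tangent intercept close to `lam`, which is why the tables list `λ` as the start of the exponential phase for these fits.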

### Understanding Growth Rates: Intrinsic vs. Observed

**Important distinction:**

- **`mu_max`** (μ_max): The **observed** maximum specific growth rate calculated from the fitted curve as max(d(ln N)/dt). This is what you measure from the data.

- **`intrinsic_growth_rate`**: The **model parameter** representing intrinsic growth capacity:
  - **Mechanistic models**: The fitted rate parameter `mu` (see the `params` entry of the fit results below)
  - **Phenomenological models (parametric and non-parametric)**: Returns `None`; for the parametric forms, the fitted `mu_max` is reported as `mu_max` instead
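
For the synthetic data in this tutorial, the gap between the two rates (roughly 0.078 versus 0.150 in the logistic results below) is consistent with the constant baseline OD added to the curve. The sketch below assumes the data-generating function used above and evaluates the specific growth rate analytically, with and without the baseline. It makes no claim about how the library computes `mu_max` internally.

```python
import numpy as np

baseline, N0, K, mu, lag = 0.05, 0.05, 0.45, 0.15, 30.0
t = np.arange(440) * 12 / 60  # hours

# Logistic growth term and its time derivative, dN/dt = mu * (1 - N / K) * N
growth = K / (1 + ((K - N0) / N0) * np.exp(-mu * (t - lag)))
d_growth = mu * (1 - growth / K) * growth

# Specific growth rate of the growth term alone: approaches mu while N << K
rate_without_baseline = d_growth / growth
# Specific growth rate of the measured signal, which includes the baseline
rate_with_baseline = d_growth / (baseline + growth)

print(f"intrinsic mu:                {mu:.4f}")
print(f"max rate without baseline:   {rate_without_baseline.max():.4f}")
print(f"max rate with 0.05 baseline: {rate_with_baseline.max():.4f}")
```

The baseline inflates the denominator of `d(ln N)/dt` most strongly while the population is small, so the observed maximum is reached later and is lower than the intrinsic rate.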

## Mechanistic Models

Mechanistic models are ODE-based parametric models that encode growth dynamics as differential equations.

### Fit Models

In [3]:
# Fit mechanistic models
fit_mech_logistic = gc.parametric.fit_parametric(time, data, method="mech_logistic")
fit_mech_gompertz = gc.parametric.fit_parametric(time, data, method="mech_gompertz")
fit_mech_richards = gc.parametric.fit_parametric(time, data, method="mech_richards")
fit_mech_baranyi = gc.parametric.fit_parametric(time, data, method="mech_baranyi")

# Combine fits into a dictionary
mechanistic_fits = {
    "mech_logistic": fit_mech_logistic,
    "mech_gompertz": fit_mech_gompertz,
    "mech_richards": fit_mech_richards,
    "mech_baranyi": fit_mech_baranyi,
}

# Display example fit result
print("=== Logistic Fit Result ===")
pprint(fit_mech_logistic, indent=2)

=== Logistic Fit Result ===
{ 'model_type': 'mech_logistic',
  'params': { 'K': np.float64(0.4499768424093958),
              'N0': np.float64(0.000625037163808368),
              'fit_t_max': 87.8,
              'fit_t_min': 0.0,
              'mu': np.float64(0.14992595235745404),
              'y0': np.float64(0.050009668374832776)}}


### Extract Growth Statistics

In [4]:
# Extract stats from each mechanistic fit
stats_mech_logistic = gc.utils.extract_stats(fit_mech_logistic, time, data)
stats_mech_gompertz = gc.utils.extract_stats(fit_mech_gompertz, time, data)
stats_mech_richards = gc.utils.extract_stats(fit_mech_richards, time, data)
stats_mech_baranyi = gc.utils.extract_stats(fit_mech_baranyi, time, data)

# Combine stats into a dictionary
mechanistic_stats = {
    "mech_logistic": stats_mech_logistic,
    "mech_gompertz": stats_mech_gompertz,
    "mech_richards": stats_mech_richards,
    "mech_baranyi": stats_mech_baranyi,
}

# Display growth statistics for logistic fit
print("=== Logistic Growth Statistics ===")
pprint(stats_mech_logistic, indent=2)

# Create comparison dataframe
print("\n=== Mechanistic Models Comparison ===")
mechanistic_df = pd.DataFrame(mechanistic_stats).T[
    [
        "mu_max",
        "intrinsic_growth_rate",
        "doubling_time",
        "time_at_umax",
        "exp_phase_start",
        "exp_phase_end",
        "model_rmse",
    ]
]
mechanistic_df

=== Logistic Growth Statistics ===
{ 'N0': 0.050009668374832776,
  'doubling_time': 8.86118100208731,
  'exp_phase_end': 59.49048112187557,
  'exp_phase_start': 12.911739968062367,
  'fit_method': 'model_fitting_mech_logistic',
  'fit_t_max': 87.8,
  'fit_t_min': 0.0,
  'intrinsic_growth_rate': 0.14992595235745404,
  'max_od': 0.49998651078422857,
  'model_rmse': 4.597890449519072e-05,
  'mu_max': 0.07822288929620892,
  'od_at_umax': 0.15665653300736812,
  'time_at_umax': 36.07014028056112}

=== Mechanistic Models Comparison ===


| | mu_max | intrinsic_growth_rate | doubling_time | time_at_umax | exp_phase_start | exp_phase_end | model_rmse |
|---|---|---|---|---|---|---|---|
| mech_logistic | 0.078223 | 0.149926 | 8.861181 | 36.07014 | 12.91174 | 59.490481 | 4.6e-05 |
| mech_gompertz | 0.081681 | 0.055765 | 8.486063 | 22.873747 | 6.518301 | 65.025168 | 0.02491 |
| mech_richards | 0.081759 | 0.187789 | 8.477915 | 35.542285 | 15.607974 | 59.733666 | 0.001558 |
| mech_baranyi | 0.077833 | 0.149806 | 8.905564 | 36.246092 | 12.866155 | 59.527605 | 4e-05 |


## Phenomenological Models - Parametric

These are phenomenological parametric models fitted in ln-space, that is, to `ln(N(t)/N0)`.

### Fit Models

In [5]:
# Fit phenomenological parametric models
fit_phenom_logistic = gc.parametric.fit_parametric(time, data, method="phenom_logistic")
fit_phenom_gompertz = gc.parametric.fit_parametric(time, data, method="phenom_gompertz")
fit_phenom_gompertz_modified = gc.parametric.fit_parametric(
    time, data, method="phenom_gompertz_modified"
)
fit_phenom_richards = gc.parametric.fit_parametric(time, data, method="phenom_richards")

# Combine fits into a dictionary
phenom_param_fits = {
    "phenom_logistic": fit_phenom_logistic,
    "phenom_gompertz": fit_phenom_gompertz,
    "phenom_gompertz_modified": fit_phenom_gompertz_modified,
    "phenom_richards": fit_phenom_richards,
}

# Display example fit
print("=== Phenomenological Logistic Fit ===")
pprint(fit_phenom_logistic, indent=2)

=== Phenomenological Logistic Fit ===
{ 'model_type': 'phenom_logistic',
  'params': { 'A': np.float64(2.299980001980642),
              'N0': np.float64(0.050273088558506006),
              'fit_t_max': 87.8,
              'fit_t_min': 0.0,
              'lam': np.float64(21.9238038781274),
              'mu_max': np.float64(0.07997940767621518)}}


### Extract Growth Statistics

In [6]:
# Extract stats from each phenomenological parametric fit
stats_phenom_logistic = gc.utils.extract_stats(
    fit_phenom_logistic, time, data, phase_boundary_method="tangent"
)
stats_phenom_gompertz = gc.utils.extract_stats(
    fit_phenom_gompertz, time, data, phase_boundary_method="tangent"
)
stats_phenom_gompertz_modified = gc.utils.extract_stats(
    fit_phenom_gompertz_modified, time, data, phase_boundary_method="tangent"
)
stats_phenom_richards = gc.utils.extract_stats(
    fit_phenom_richards, time, data, phase_boundary_method="tangent"
)

# Combine stats into a dictionary
phenom_param_stats = {
    "phenom_logistic": stats_phenom_logistic,
    "phenom_gompertz": stats_phenom_gompertz,
    "phenom_gompertz_modified": stats_phenom_gompertz_modified,
    "phenom_richards": stats_phenom_richards,
}

# Display example stats
print("=== Phenomenological Logistic Stats ===")
pprint(stats_phenom_logistic, indent=2)

# Create comparison dataframe
print("\n=== Phenomenological Parametric Models Comparison ===")
phenom_param_df = pd.DataFrame(phenom_param_stats).T[
    [
        "mu_max",
        "intrinsic_growth_rate",
        "doubling_time",
        "time_at_umax",
        "exp_phase_start",
        "exp_phase_end",
        "model_rmse",
    ]
]
phenom_param_df

=== Phenomenological Logistic Stats ===
{ 'N0': 0.050273088558506006,
  'doubling_time': 8.66657056733965,
  'exp_phase_end': 50.65869644991835,
  'exp_phase_start': 21.9238038781274,
  'fit_method': 'model_fitting_phenom_logistic',
  'fit_t_max': 87.8,
  'fit_t_min': 0.0,
  'intrinsic_growth_rate': None,
  'max_od': 0.5005310454700512,
  'model_rmse': 0.0008165377630810787,
  'mu_max': 0.07997940767621518,
  'od_at_umax': 0.1580573708100321,
  'time_at_umax': 36.246092184368734}

=== Phenomenological Parametric Models Comparison ===


| | mu_max | intrinsic_growth_rate | doubling_time | time_at_umax | exp_phase_start | exp_phase_end | model_rmse |
|---|---|---|---|---|---|---|---|
| phenom_logistic | 0.079979 | None | 8.666571 | 36.246092 | 21.923804 | 50.658696 | 0.000817 |
| phenom_gompertz | 0.092212 | None | 7.516896 | 33.782766 | 25.07681 | 48.666283 | 0.005205 |
| phenom_gompertz_modified | 0.089873 | None | 7.712561 | 33.430862 | 23.467578 | 48.628295 | 0.00408 |
| phenom_richards | 0.078659 | None | 8.812088 | 36.597996 | 21.241111 | 50.891908 | 0.000506 |


## Phenomenological Models - Non-Parametric

These are phenomenological non-parametric fits that estimate growth features directly from local trends and smoothing.
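
To make "local trends" concrete, the following sketch shows one way a sliding-window estimate can work: fit a straight line to `ln(OD)` in every window of 7 consecutive points and keep the window with the steepest slope. This is an assumption about the general technique, not a copy of the package's `sliding_window` method, and it regenerates the tutorial's synthetic curve so it runs on its own.

```python
import numpy as np

t = np.arange(440) * 12 / 60  # hours
od = 0.05 + 0.45 / (1 + 8.0 * np.exp(-0.15 * (t - 30.0)))  # tutorial's synthetic curve
log_od = np.log(od)

window_points = 7
best = None
for start in range(len(t) - window_points + 1):
    sl = slice(start, start + window_points)
    # Least-squares line in log space: ln(OD) = intercept + slope * t
    slope, intercept = np.polyfit(t[sl], log_od[sl], 1)
    if best is None or slope > best["slope"]:
        best = {
            "slope": slope,
            "intercept": intercept,
            "fit_t_min": t[sl][0],
            "fit_t_max": t[sl][-1],
        }

print(best)
```

The steepest slope plays the role of `mu_max`, and the window limits correspond to the `fit_t_min` and `fit_t_max` entries that appear in the fit result.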

### Fit Models

In [7]:
# Fit non-parametric models
fit_spline = gc.non_parametric.fit_non_parametric(
    time,
    data,
    method="spline",
    spline_s=0.2,
)

fit_sliding_window = gc.non_parametric.fit_non_parametric(
    time,
    data,
    method="sliding_window",
    window_points=7,
)

# Combine fits into a dictionary
phenom_nonparam_fits = {
    "spline": fit_spline,
    "sliding_window": fit_sliding_window,
}

# Display non-parametric fit results
pprint(phenom_nonparam_fits, indent=2)

{ 'sliding_window': { 'model_type': 'sliding_window',
                      'params': { 'fit_t_max': 36.6,
                                  'fit_t_min': 35.4,
                                  'intercept': -4.663280857018275,
                                  'slope': 0.0778949612485742,
                                  'time_at_umax': 36.0,
                                  'window_points': 7}},
  'spline': { 'model_type': 'spline',
              'params': { 'fit_t_max': 54.0,
                          'fit_t_min': 18.4,
                          'mu_max': 0.07558066565298467,
                          'spline_s': 0.2,
                          'tck_c': [ -2.802541072553616,
                                     -2.6787962449355196,
                                     -1.0077796166455106,
                                     -0.8859363580093346,
                                     0.0,
                                     0.0,
                                     0.0,
             

### Extract Growth Statistics

In [8]:
# Extract stats from each non-parametric fit
stats_spline = gc.utils.extract_stats(
    fit_spline,
    time,
    data,
    phase_boundary_method="tangent",
)

stats_sliding_window = gc.utils.extract_stats(
    fit_sliding_window,
    time,
    data,
    phase_boundary_method="tangent",
)

# Combine stats into a dictionary
phenom_nonparam_stats = {
    "spline": stats_spline,
    "sliding_window": stats_sliding_window,
}

# Create comparison dataframe
print("=== Phenomenological Non-Parametric Models Comparison ===")
phenom_nonparam_df = pd.DataFrame(phenom_nonparam_stats).T[
    [
        "mu_max",
        "intrinsic_growth_rate",
        "doubling_time",
        "time_at_umax",
        "exp_phase_start",
        "exp_phase_end",
        "model_rmse",
    ]
]
phenom_nonparam_df

=== Phenomenological Non-Parametric Models Comparison ===


| | mu_max | intrinsic_growth_rate | doubling_time | time_at_umax | exp_phase_start | exp_phase_end | model_rmse |
|---|---|---|---|---|---|---|---|
| spline | 0.075581 | None | 9.170959 | 36.110553 | 21.119377 | 51.404199 | 0.006 |
| sliding_window | 0.077895 | None | 8.898485 | 36.0 | 21.566885 | 50.951931 | 1.4e-05 |


## Customizing Phase Boundary Detection

Two methods are available for determining exponential phase boundaries:

#### 1. **Threshold Method**
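- Tracks the instantaneous specific growth rate μ(t)
- `exp_phase_start`: First time at which μ exceeds a set fraction of μ_max (default: 15%)
- `exp_phase_end`: First time after the peak at which μ drops back below the threshold fraction
- The fractions are passed to `extract_stats()` as `lag_frac` and `exp_frac`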

#### 2. **Tangent Method**
- Constructs a tangent line in log space at the point of maximum growth rate
- Extends this tangent until it intersects the baseline (`exp_phase_start`) and the plateau (`exp_phase_end`)
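
Both rules can be written in a few lines of NumPy. The sketch below is a simplified reading of the two descriptions above, applied to the raw synthetic curve with finite differences. The function names, and the choice of the minimum and maximum of `ln(OD)` as baseline and plateau, are assumptions made for illustration, not the package's internals.

```python
import numpy as np

t = np.arange(440) * 12 / 60  # hours
od = 0.05 + 0.45 / (1 + 8.0 * np.exp(-0.15 * (t - 30.0)))  # tutorial's synthetic curve
log_od = np.log(od)
mu = np.gradient(log_od, t)  # instantaneous specific growth rate
i_peak = int(np.argmax(mu))


def threshold_bounds(frac=0.15):
    """First crossing above frac * mu_max, and first drop below it after the peak."""
    cutoff = frac * mu[i_peak]
    start = t[np.argmax(mu >= cutoff)]
    # Note: if mu never drops below the cutoff, argmax returns 0 (the peak itself)
    end = t[i_peak + np.argmax(mu[i_peak:] < cutoff)]
    return start, end


def tangent_bounds():
    """Intersect the log-space tangent at the peak with baseline and plateau."""
    t_peak, y_peak, slope = t[i_peak], log_od[i_peak], mu[i_peak]
    start = t_peak + (log_od.min() - y_peak) / slope
    end = t_peak + (log_od.max() - y_peak) / slope
    return start, end


print("threshold (10%):", threshold_bounds(0.10))
print("threshold (30%):", threshold_bounds(0.30))
print("tangent:        ", tangent_bounds())
```

A lower threshold fraction widens the exponential phase, and the tangent rule gives the narrowest window of the three. The values should land near, though not exactly on, those reported for the spline fit below, since this sketch differentiates the raw curve rather than a fitted model.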

In [9]:
# Compare phase-boundary methods on the same fit
phase_boundary_rows = []

# Tangent method
stats_tangent = gc.utils.extract_stats(
    fit_spline,
    time,
    data,
    phase_boundary_method="tangent",
)
phase_boundary_rows.append(
    {
        "label": "tangent",
        "method": "tangent",
        "lag_frac": np.nan,
        "exp_frac": np.nan,
        "stats": stats_tangent,
    }
)

# Threshold method at different cutoffs
for frac, label in [(0.10, "threshold_low"), (0.30, "threshold_high")]:
    stats_threshold = gc.utils.extract_stats(
        fit_spline,
        time,
        data,
        phase_boundary_method="threshold",
        lag_frac=frac,
        exp_frac=frac,
    )
    phase_boundary_rows.append(
        {
            "label": label,
            "method": "threshold",
            "lag_frac": frac,
            "exp_frac": frac,
            "stats": stats_threshold,
        }
    )

# Create comparison dataframe
print("=== Phase Boundary Method Comparison ===")
phase_boundary_df = pd.DataFrame(
    [
        {
            "label": row["label"],
            "method": row["method"],
            "lag_frac": row["lag_frac"],
            "exp_frac": row["exp_frac"],
            "exp_phase_start": row["stats"]["exp_phase_start"],
            "exp_phase_end": row["stats"]["exp_phase_end"],
        }
        for row in phase_boundary_rows
    ]
)
phase_boundary_df

=== Phase Boundary Method Comparison ===


| | label | method | lag_frac | exp_frac | exp_phase_start | exp_phase_end |
|---|---|---|---|---|---|---|
| 0 | tangent | tangent | NaN | NaN | 21.119377 | 51.404199 |
| 1 | threshold_low | threshold | 0.1 | 0.1 | 9.854515 | 62.508161 |
| 2 | threshold_high | threshold | 0.3 | 0.3 | 18.178667 | 54.15848 |
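
The workflow at the top lists saving results for visualization as its final step. A minimal sketch of that step follows, assuming stats dictionaries like those built in the cells above. Two of them are abbreviated here, with values copied from the printed output, so the block runs on its own. The file name `growth_stats.csv` and the temporary directory are placeholders, not something `plotting.ipynb` is known to expect.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Abbreviated stand-ins for the stats dictionaries computed above
all_stats = {
    "mech_logistic": {"mu_max": 0.078223, "doubling_time": 8.861181, "model_rmse": 4.6e-05},
    "phenom_logistic": {"mu_max": 0.079979, "doubling_time": 8.666571, "model_rmse": 0.000817},
}

# One row per fit, one column per statistic
stats_df = pd.DataFrame(all_stats).T
stats_df.index.name = "fit"

out_path = Path(tempfile.mkdtemp()) / "growth_stats.csv"  # placeholder location
stats_df.to_csv(out_path)

# Read the file back to confirm the round trip
reloaded = pd.read_csv(out_path, index_col="fit")
print(reloaded)
```

In the notebook itself, `mechanistic_stats`, `phenom_param_stats`, and `phenom_nonparam_stats` could be merged with `{**a, **b, **c}` before building the DataFrame, so that all fits end up in one file.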
