<img width="50" src="https://carbonplan-assets.s3.amazonaws.com/monogram/dark-small.png" style="margin-left:0px;margin-top:20px"/>

# FIA Biomass Model

_by Jeremy Freeman (CarbonPlan), September 19, 2020_

This notebook shows examples of fitting predictive biomass growth curves from FIA
data.


In [None]:
import numpy as np
import pandas as pd
from carbonplan_forest_risks import load, setup, plot, fit

In [None]:
setup.plotting(remote=True)

First we load the data. To speed things up, we'll just load data from
California. We load the raw FIA data, as well as two climatic variables, `tmean`
and `ppt`, from the TerraClimate dataset.


In [None]:
df = load.fia(store="az", states="CA")
df = load.terraclim(
    store="az",
    tlim=(2000, 2020),
    variables=["tmean", "ppt"],
    df=df,
)

We'll now pick a single forest type and plot biomass vs. age, colored by each of
our climatic variables.


In [None]:
inds = df["type_code"] == 221
x = df[inds]["age"]
y = df[inds]["biomass"]
f = [df[inds]["tmean_mean"], df[inds]["ppt_mean"]]
(
    plot.xy(x=x, y=y, color=f[0], cmap="magma", xlim=[0, 250], ylim=[0, 600])
    | plot.xy(
        x=x, y=y, color=f[1], cmap="viridis", xlim=[0, 250], ylim=[0, 600]
    )
).resolve_scale(color="independent")

To fit the model to these data, we use the `fit.biomass` function.


In [None]:
model = fit.biomass(x=x, y=y, f=f, noise="gamma")
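
To give a feel for what fitting with Gamma noise means, here is a minimal sketch of a Gamma negative log-likelihood evaluated around a growth curve. The curve shape (`growth_curve`), its parameters, and the fixed `shape` value are assumptions made for illustration; they are not taken from `fit.biomass`, whose internals may differ. A fit would search for the parameters that minimize this quantity.

```python
import math
import numpy as np


def growth_curve(age, a, b, c):
    """Hypothetical logistic-style curve that saturates at `a` as age grows."""
    return a / (1.0 + c * np.exp(-b * age))


def gamma_nll(y, mu, shape):
    """Negative log-likelihood of y under a Gamma with mean `mu` and fixed shape."""
    scale = mu / shape  # a Gamma with this scale has mean shape * scale = mu
    logpdf = (
        (shape - 1.0) * np.log(y)
        - y / scale
        - shape * np.log(scale)
        - math.lgamma(shape)
    )
    return -np.sum(logpdf)


# Synthetic data drawn from the assumed model (not FIA data).
rng = np.random.default_rng(0)
age = np.linspace(1, 250, 500)
shape = 4.0
mu_true = growth_curve(age, 400.0, 0.05, 10.0)
y_synth = rng.gamma(shape, mu_true / shape)

# The likelihood prefers the generating curve over one with half the asymptote.
nll_true = gamma_nll(y_synth, mu_true, shape)
nll_wrong = gamma_nll(y_synth, growth_curve(age, 200.0, 0.05, 10.0), shape)
```

Because the Gamma scale is tied to the mean, the implied standard deviation is `mu / sqrt(shape)`, so the spread grows as the curve rises.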

We can evaluate `r2` on the training data.


In [None]:
model.r2(x, f, y)
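
For reference, the standard coefficient of determination can be computed by hand as below. This is a sketch of the usual definition, written under the assumption that `model.r2` reports something comparable; the helper name `r2_score` and the toy arrays are illustrative only.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination, skipping pairs where either value is NaN."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    keep = ~(np.isnan(y_true) | np.isnan(y_pred))
    y_true, y_pred = y_true[keep], y_pred[keep]
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return 1.0 - ss_res / ss_tot


y_obs = np.array([10.0, 20.0, 30.0, 40.0])
y_hat = np.array([12.0, 18.0, 33.0, 37.0])
score = r2_score(y_obs, y_hat)  # 1 - 26 / 500 = 0.948
```

A value of 1 means the predictions reproduce the observations exactly, while 0 means they do no better than predicting the mean.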

And we can plot the fitted curves. When plotting, we show curves for different
levels of the climatic variables, to show the form of the dependency.


In [None]:
xlim = [0, 250]
ylim = [0, 700]
(
    (
        plot.xy(x=x, y=y, color=f[0], cmap="magma", xlim=xlim, ylim=ylim)
        + plot.line(
            x=x,
            y=model.predict(x, f, [90, 50]),
            color=np.nanpercentile(f[0], 90),
        )
        + plot.line(
            x=x,
            y=model.predict(x, f, [10, 50]),
            color=np.nanpercentile(f[0], 10),
        )
    )
    | (
        plot.xy(x=x, y=y, color=f[1], cmap="viridis", xlim=xlim, ylim=ylim)
        + plot.line(
            x=x,
            y=model.predict(x, f, [50, 10]),
            color=np.nanpercentile(f[1], 10),
        )
        + plot.line(
            x=x,
            y=model.predict(x, f, [50, 90]),
            color=np.nanpercentile(f[1], 90),
        )
    )
).resolve_scale(color="independent")
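
The pairs passed to `model.predict` above (for example `[90, 50]`) appear to be percentile levels for the two climatic variables, matching the `np.nanpercentile` calls used for the line colors; that reading is an inference from the cell, not something the notebook states. The short sketch below shows how `np.nanpercentile` turns such levels into values while ignoring missing entries. The `tmean_demo` array is made up for illustration.

```python
import numpy as np

# Toy temperature values with one missing entry (not real TerraClimate data).
tmean_demo = np.array([8.0, 10.0, np.nan, 12.0, 14.0, 16.0])

# NaN entries are dropped before the percentiles are interpolated.
low, mid, high = np.nanpercentile(tmean_demo, [10, 50, 90])

# Plain np.percentile would instead propagate the NaN.
naive = np.percentile(tmean_demo, 50)
```

Here `low`, `mid`, and `high` come out as 8.8, 12.0, and 15.2, while `naive` is NaN.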

As a check on model validity, we can plot the raw data next to a sample from
the generative process underlying the model using the fitted parameters
(specifically, the fitted growth curve and the Gamma noise model). The sample
should look qualitatively similar to the actual data. In particular, note how the
noise grows with age, and that there are no negative values.


In [None]:
xlim = [0, 250]
ylim = [-200, 700]
(
    (
        plot.xy(x=x, y=y, xlim=xlim, ylim=ylim)
        + plot.line(x=x, y=model.predict(x, f, [50, 50]))
    )
    | (
        plot.xy(x=x, y=model.sample(x, f), xlim=xlim, ylim=ylim)
        + plot.line(x=x, y=model.predict(x, f, [50, 50]))
    )
)

We can set the noise to `'normal'` instead of `'gamma'` and see that the sampled
data no longer match the real data. While the fitted curve is similar, the
variability is too high for low ages, and there are negative predictions where
there shouldn't be! These behaviors help justify the choice of a Gamma
distribution.


In [None]:
model = fit.biomass(x=x, y=y, f=f, noise="normal")

In [None]:
xlim = [0, 250]
ylim = [-200, 700]
(
    (
        plot.xy(x=x, y=y, xlim=xlim, ylim=ylim)
        + plot.line(x=x, y=model.predict(x, f, [50, 50]))
    )
    | (
        plot.xy(x=x, y=model.sample(x, f), xlim=xlim, ylim=ylim)
        + plot.line(x=x, y=model.predict(x, f, [50, 50]))
    )
)
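
The contrast between the two noise models can also be reproduced without the package. The sketch below draws synthetic samples around a made-up growth curve, once with Gamma noise whose scale follows the mean and once with normal noise of constant standard deviation. The curve, the `shape` value, and the standard deviation of 100 are all assumptions chosen for illustration and may not match how `fit.biomass` parameterizes either option.

```python
import numpy as np

rng = np.random.default_rng(42)
age = np.linspace(1, 250, 2000)

# Hypothetical saturating growth curve (not the fitted model).
mu = 400.0 * (1.0 - np.exp(-0.02 * age))

# Gamma noise: mean mu, standard deviation mu / sqrt(shape), always positive.
shape = 4.0
gamma_draws = rng.gamma(shape, mu / shape)

# Normal noise: mean mu, the same standard deviation at every age.
normal_draws = rng.normal(mu, 100.0)

n_neg_gamma = int(np.sum(gamma_draws < 0))
n_neg_normal = int(np.sum(normal_draws < 0))

# Compare the residual spread in young and old stands under Gamma noise.
young, old = age < 50, age > 200
spread_young = np.std(gamma_draws[young] - mu[young])
spread_old = np.std(gamma_draws[old] - mu[old])
```

In this setup the Gamma draws are never negative and their spread widens with age, while the normal draws dip below zero at low ages, where the mean is small relative to the fixed standard deviation.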