In [None]:
# import sys
# sys.path.append('../src')
import numpy as np
from numpy import random
from scipy.stats import norm
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression

# Custom Loss Functions for Fuel Moisture Models

*Author:* Jonathon Hirschi

A loss function measures how well a statistical model fits the data. Loss functions are used to train the parameters of machine learning models.

The purpose of this notebook is to discuss training fuel moisture models with various loss functions to try to account for the nonlinear effect of fuel moisture on wildfire rate of spread.

## Example with Weighted Least Squares

To illustrate the effect of changing loss functions, consider a simple linear model with one predictor for $n$ samples:

$$f(x_i, \pmb\beta) = \beta_0 + \beta_1 x_i, \text{ for } i = 1, \ldots, n$$

Ordinary least squares (OLS) is the most basic method of estimating the $\beta$ parameter values. The method minimizes the residual sum of squares (RSS), the loss function in this case. Equal weight is given to each residual value in the loss function,

$$r_i = y_i - f(x_i, \pmb \beta)$$

$$\hat{\pmb\beta}_{\text{OLS}} = \underset{\pmb\beta}{\operatorname{argmin}} \sum_{i=1}^n r_i^2$$

Weighted least squares minimizes the weighted residual sum of squares, with a weight $w_i$ applied to each squared residual. In principle, the weights can be any nonnegative values, whatever their source.

$$\hat{\pmb\beta}_{W} = \underset{\pmb\beta}{\operatorname{argmin}} \sum_{i=1}^n w_i r_i^2$$
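
For a linear model, the weighted problem above has a closed-form solution through the weighted normal equations, $(X^\top W X)\,\pmb\beta = X^\top W \pmb y$ with $W = \mathrm{diag}(w_1, \ldots, w_n)$. The following is a minimal sketch of that computation, not the fitting routine used elsewhere in this notebook. The function name `weighted_least_squares` and the toy arrays `x_demo` and `y_demo` are made up for illustration.

```python
import numpy as np

def weighted_least_squares(x, y, w):
    """Return (beta_0, beta_1) minimizing sum_i w_i * (y_i - beta_0 - beta_1 * x_i)**2."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # design matrix with an intercept column
    XtW = X.T * w                              # equals X.T @ diag(w) without building diag(w)
    return np.linalg.solve(XtW @ X, XtW @ y)   # solve the weighted normal equations

# Hypothetical toy data: the first three points lie exactly on y = 1 + 2x,
# and the last point sits above that line.
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = np.array([1.0, 3.0, 5.0, 10.0])

beta_ols = weighted_least_squares(x_demo, y_demo, np.ones(4))                    # equal weights
beta_w = weighted_least_squares(x_demo, y_demo, np.array([1.0, 1.0, 1.0, 0.0]))  # ignore last point

print("equal weights:       ", beta_ols)
print("last point weight 0: ", beta_w)
```

With equal weights the result is the OLS fit. Setting the last weight to zero removes that point's influence, so the fit recovers the line through the first three points. Multiplying all weights by the same positive constant leaves the solution unchanged, since the constant cancels on both sides of the normal equations.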

In the following example, the OLS model for simulated data is compared with two weighting schemes. In both schemes, each weight is a normal density evaluated at the observed response $y_i$, with the density centered near the sample mean of the response. One density has a large variance and the other a much smaller one. Under this formulation, as the variance grows the weights become nearly constant across observations, so the fitted parameters approach the OLS estimates.
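
The limiting behavior described above can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the analysis that follows: it simulates data of the same form, uses an unnormalized Gaussian weight (a constant factor does not change the minimizer), and solves the weighted normal equations directly. The helper `fit_line`, the scale multipliers, and the use of `np.random.default_rng(123)` are illustrative choices, so the simulated values will not match the cells below.

```python
import numpy as np

def fit_line(x, y, w):
    """Weighted least squares fit of y = b0 + b1 * x; returns array [b0, b1]."""
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

rng = np.random.default_rng(123)  # assumed generator, not the notebook's random.seed stream
x_sim = np.linspace(0, 100, 200)
y_sim = 100 + x_sim + rng.normal(0, 10, 200)

beta_ols = fit_line(x_sim, y_sim, np.ones_like(y_sim))
center = np.mean(y_sim)
spread = np.std(y_sim)

# Largest absolute difference between weighted and OLS coefficients,
# for Gaussian weights whose scale is a multiple of the response std.
gaps = {}
for factor in (0.5, 2.0, 10.0, 1000.0):
    scale = factor * spread
    w = np.exp(-0.5 * ((y_sim - center) / scale) ** 2)  # unnormalized normal density
    gaps[factor] = np.max(np.abs(fit_line(x_sim, y_sim, w) - beta_ols))
    print(f"scale = {factor:>6} * std(y): max coefficient gap = {gaps[factor]:.6f}")
```

For the largest scale the weights are almost identical across observations, so the coefficient gap should be close to zero, while the narrow scale should produce a visibly different line.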

In [None]:
# Simulate Data
random.seed(123)
npts = 200
x = np.linspace(0, 100, npts)

y = 100+x+random.normal(0, 10, npts)

In [None]:
# Fit OLS model
xx = x.reshape(len(x), 1)
w1 = np.ones(len(x)) # adding weights of 1 for illustrative purposes
model1 = LinearRegression().fit(xx, y, sample_weight=w1)
preds1 = model1.predict(xx)

In [None]:
# Set up weighting distributions
m = np.floor(np.mean(y)) # center weights at central tendency of response data 
s = np.std(y)
rv = norm(loc=m, scale=s/2)  # "large variance" weights: scale is half the std of y
rv2 = norm(loc=m, scale=1)  # "small variance" weights: extreme distribution for illustrative purposes

In [None]:
# Fit Weighted LS
w2 = rv.pdf(y)
w2 = w2 / np.sum(w2) # normalize weights to sum to 1, not strictly necessary in this example 
model2 = LinearRegression().fit(xx, y, sample_weight=w2)
preds2 = model2.predict(xx)

w3 = rv2.pdf(y)
w3 = w3 / np.sum(w3) # normalize weights to sum to 1, not strictly necessary in this example 
model3 = LinearRegression().fit(xx, y, sample_weight=w3)
preds3 = model3.predict(xx)

In [None]:
# Plot regression lines
sns.set(style='whitegrid')
ax = sns.lineplot(x=x, y=preds1, label="OLS")
ax.set(xlabel="X", ylabel="f(X)")
sns.lineplot(x=x, y=preds2, label = "Weighted (large variance)")
sns.lineplot(x=x, y=preds3, label = "Weighted (small variance)")
sns.scatterplot(x=x, y=y, alpha=.7)
plt.legend()
plt.title("OLS vs Weighted")

In [None]:
# Plot Weighting Distributions
yy = np.sort(y)
fig, (ax1, ax2) = plt.subplots(2, figsize=(6, 6))
fig.suptitle('Weight Distributions')
ax1.plot(rv.pdf(yy), yy, color=sns.color_palette()[1])
ax1.set_title("Large Variance")
ax1.set_xticklabels([]);
# ax1.set_yticklabels([]);
ax2.plot(rv2.pdf(yy), yy, color=sns.color_palette()[2])
ax2.set_title("Small Variance")
ax2.set_xticklabels([]);
# ax2.set_yticklabels([]);
# plt.show

The weighted least squares model with the small variance gives almost no weight to observations whose response is far from the sample mean, so the fit is driven by the few points with $y_i$ close to that mean.
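
To put rough numbers on this, the sketch below compares the weight of an observation one standard deviation away from the center with the weight of an observation at the center, for both weighting schemes. The value `s_assumed = 30.0` is an assumed standard deviation chosen only for illustration, not a value computed from the simulated data, and `relative_weight` is a hypothetical helper.

```python
import numpy as np

def relative_weight(distance, scale):
    """Gaussian weight at `distance` from the center, relative to the weight at the center."""
    return np.exp(-0.5 * (distance / scale) ** 2)

s_assumed = 30.0  # assumed standard deviation of the response, for illustration only

rel_large = relative_weight(s_assumed, s_assumed / 2)  # "large variance" scheme: scale = s/2
rel_small = relative_weight(s_assumed, 1.0)            # "small variance" scheme: scale = 1

print(f"large variance, one std from center: {rel_large:.4f} of the peak weight")
print(f"small variance, one std from center: {rel_small:.3e} of the peak weight")
```

Under the large-variance scheme such an observation keeps a noticeable fraction of the peak weight. Under the small-variance scheme its relative weight is so close to zero that it has effectively no influence on the fit.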