# Model Tutorial: Linear Regression

This notebook demonstrates how to train the linear regression models used in this project and how to generate predictions from them. We first show the basic code, then reproduce the results with a custom class, `LM`, which gives multiple models a consistent interface.

## Model Description
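
As a brief summary, assuming the standard ordinary least squares formulation that scikit-learn's `LinearRegression` implements, the model predicts the response as a linear combination of the four predictors selected later in this notebook (`Ed`, `Ew`, `rain`, `hour`). The symbols $\beta_j$, $n$, and $\hat{y}_i$ are notation introduced here for illustration:

$$
\hat{y} = \beta_0 + \beta_1\,\mathrm{Ed} + \beta_2\,\mathrm{Ew} + \beta_3\,\mathrm{rain} + \beta_4\,\mathrm{hour}
$$

The coefficients are chosen to minimize the sum of squared residuals over the $n$ training observations:

$$
\min_{\beta_0,\dots,\beta_4} \; \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
$$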



## Setup

In [None]:
import sys
sys.path.append('../src')
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
import tensorflow as tf
import yaml
# Local modules
from data_funcs import train_test_split_spacetime
from fmda_models import LM
from metrics import ros, rmse
import reproducibility

## Data Read and Split

In [None]:
df = pd.read_pickle("../data/rocky_2023_05-09.pkl")
# Drop rows containing missing values, including NA fm
df = df.dropna()

In [None]:
# Set seed for reproducibility
reproducibility.set_seed(123)

# Create Data
X_train, X_test, y_train, y_test = train_test_split_spacetime(df)

# Subset Columns
X_train=X_train[["Ed", "Ew", "rain", "hour"]]
X_test=X_test[["Ed", "Ew", "rain", "hour"]]

## Manually Coded Linear Regression

In [None]:
# Create model instance
lm = LinearRegression()
# Fit model on the training set
lm.fit(X_train, y_train)
# Predict on the test set
preds = lm.predict(X_test)

In [None]:
print("Test RMSE:", rmse(preds, y_test))
print("Test RMSE (ROS):", rmse(ros(preds), ros(y_test)))
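
As a cross-check on the error metric, the sketch below computes RMSE two ways on synthetic data, using the `mean_squared_error` and `r2_score` functions imported in Setup. It assumes the local `rmse` helper follows the standard definition (square root of the mean squared error), which this notebook does not confirm. The data and the names `lm_demo`, `preds_demo`, `rmse_sklearn`, and `rmse_manual` are illustrative and are not part of the project.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Synthetic stand-in data (not the project's dataset)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 1.0 + X @ np.array([0.5, -0.2, 0.3, 0.1]) + rng.normal(scale=0.1, size=200)

# Simple ordered split, used here only for illustration
X_tr, X_te = X[:150], X[150:]
y_tr, y_te = y[:150], y[150:]

lm_demo = LinearRegression().fit(X_tr, y_tr)
preds_demo = lm_demo.predict(X_te)

# RMSE is the square root of the mean squared error
rmse_sklearn = np.sqrt(mean_squared_error(y_te, preds_demo))
rmse_manual = np.sqrt(np.mean((preds_demo - y_te) ** 2))
r2 = r2_score(y_te, preds_demo)

print("RMSE (sklearn):", rmse_sklearn)
print("RMSE (manual): ", rmse_manual)
print("R^2:", r2)
```

The two RMSE values should agree up to floating-point error; if the project's `rmse` differs from them on the same inputs, it likely uses a different definition.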

## Reproduce using LM Class

We now use a class, `LM`, that reproduces the code above. The class exists so that different machine learning models expose the same methods, which keeps the calling code concise.

The `LM` class uses all defaults with no hyperparameter tuning.
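
To illustrate the idea of a shared interface, here is a minimal sketch of what such a wrapper might look like. This is a hypothetical stand-in, not the actual `LM` implementation from `fmda_models`: the class name `LMSketch`, the assumption that `params` holds `LinearRegression` keyword arguments, and the return value of `eval` are all assumptions made for this example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


class LMSketch:
    """Hypothetical stand-in for the project's LM class (not its actual code)."""

    def __init__(self, params=None):
        # Assumption: params is a dict of LinearRegression keyword arguments
        self.params = dict(params or {})
        self.model = LinearRegression(**self.params)

    def fit(self, X_train, y_train, weights=None):
        # Optional weights are forwarded as scikit-learn's sample_weight
        self.model.fit(X_train, y_train, sample_weight=weights)
        return self

    def predict(self, X):
        return self.model.predict(X)

    def eval(self, X_test, y_test):
        # Print and return the test RMSE
        preds = self.predict(X_test)
        rmse_val = float(np.sqrt(np.mean((preds - np.asarray(y_test)) ** 2)))
        print("Test RMSE:", rmse_val)
        return rmse_val


# Usage on noise-free synthetic data, so the fit should be essentially exact
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 1))
y_demo = 2.0 + 3.0 * X_demo[:, 0]

sketch = LMSketch({"fit_intercept": True})
sketch.fit(X_demo, y_demo)
demo_rmse = sketch.eval(X_demo, y_demo)
```

Any other model wrapped with the same `fit`, `predict`, and `eval` methods could then be swapped in without changing the calling code.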

In [None]:
with open('params.yaml', 'r') as file:
    params = yaml.safe_load(file)["lm"]

params

In [None]:
model = LM(params)
model.fit(X_train, y_train)
fitted = model.predict(X_train)
model.eval(X_test, y_test)

## Using Custom Loss

scikit-learn's `LinearRegression` supports weighted least squares through the `sample_weight` argument of `fit`:

In [None]:
# Weights decay exponentially with the response, so larger values of y_train get less weight
weights = tf.exp(tf.multiply(-0.01, y_train))

In [None]:
# Create model instance
lmw = LinearRegression()
# Fit model, passing the weights explicitly as sample_weight
lmw.fit(X_train, y_train, sample_weight=weights)

In [None]:
preds = lmw.predict(X_test)
print("Test RMSE:", rmse(preds, y_test))
print("Test RMSE (ROS):", rmse(ros(preds), ros(y_test)))
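
To make the effect of the weights concrete, the sketch below checks on synthetic data that fitting with `sample_weight` matches ordinary least squares after scaling each row by the square root of its weight, which is the standard identity for weighted least squares. The data and the names `wls`, `sw`, `X1`, and `beta` are illustrative, and NumPy is used for the weights here instead of TensorFlow.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data (not the project's dataset)
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = 10.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=100)

# Same form of weights as above: larger responses get smaller weights
w = np.exp(-0.01 * y)

# Weighted fit via scikit-learn's sample_weight argument
wls = LinearRegression().fit(X, y, sample_weight=w)

# Equivalent ordinary least squares on rows scaled by sqrt(w),
# with an explicit intercept column so that it is scaled as well
sw = np.sqrt(w)
X1 = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)

print("sklearn intercept, coef:", wls.intercept_, wls.coef_)
print("manual intercept, coef: ", beta[0], beta[1:])
```

Both approaches minimize the same objective, the sum of `w[i] * (y[i] - prediction[i]) ** 2`, so the fitted coefficients should agree up to numerical precision.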

### Using Custom Class

In [None]:
model = LM(params)
model.fit(X_train, y_train, weights)
preds = model.predict(X_test)
model.eval(X_test, y_test)