# `009-linreg-learner`

Task: Fit a linear regression by gradient descent.

## Setup

In [None]:
from fastai.vision.all import *

This function makes a `DataLoaders` object out of an array dataset.

In [None]:
def make_dataloaders(x, y_true, splitter, batch_size):
    data = L(zip(x, y_true))
    train_indices, valid_indices = splitter(data)
    return DataLoaders(
        DataLoader(data[train_indices], batch_size=batch_size, shuffle=True),
        DataLoader(data[valid_indices], batch_size=batch_size)
    )

Here are utility functions to plot the first axis of a dataset and a model's predictions.

In [None]:
def plot_data(x, y): plt.scatter(x[:, 0], y[:, 0], s=1)
def plot_model(x, model):
    # Sort the rows; the default dim=-1 would leave an (n, 1) column unchanged.
    x = torch.sort(x, dim=0)[0]
    y_pred = model(x).detach()
    plt.plot(x[:, 0], y_pred[:, 0], 'r')

## Task

Remember this? Suppose we have a dataset with just a single feature `x` and continuous outcome variable `y`.

In [None]:
torch.manual_seed(0)
x = torch.rand(500, 1)
noise = torch.rand_like(x) * .5
y_true = 4 * x - 1 + noise

plot_data(x, y_true)

Let's fit a line to that!

In notebook `006` we manually wrote out `y_pred = weights * x + bias`, and manually took a step that reduced the mean squared error `mse_loss = (y_pred - y_true).pow(2).mean()`. In this notebook, we'll use `nn.Linear` and fastai's `Learner` class.
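As a reminder, one manual gradient-descent step of that kind might look like the sketch below. It builds its own copy of the data (named `x_demo` and `y_demo` so nothing in this notebook is overwritten). The zero initialization and the `learning_rate` value are assumptions for illustration, not necessarily what notebook `006` used.

```python
import torch

# A standalone copy of the toy data, constructed the same way as in this notebook.
torch.manual_seed(0)
x_demo = torch.rand(500, 1)
y_demo = 4 * x_demo - 1 + torch.rand_like(x_demo) * .5

# Parameters of the line, tracked by autograd (zero init is an assumption).
weights = torch.zeros(1, requires_grad=True)
bias = torch.zeros(1, requires_grad=True)
learning_rate = 0.1  # assumed value, chosen only for this sketch

def compute_mse():
    y_pred = weights * x_demo + bias
    return (y_pred - y_demo).pow(2).mean()

loss_before = compute_mse()
loss_before.backward()  # fills weights.grad and bias.grad

with torch.no_grad():
    # Step each parameter against its gradient, then clear the gradients.
    weights -= learning_rate * weights.grad
    bias -= learning_rate * bias.grad
    weights.grad.zero_()
    bias.grad.zero_()

loss_after = compute_mse()
print(loss_before.item(), loss_after.item())
```

The rest of this notebook hands the parameters, the loss, and the update step over to `nn.Linear`, a loss module, and the `Learner`.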

First we'll make a fastai-compatible `DataLoaders` from this dataset. You already know everything you need to understand how this works, but don't worry about the details the first time through.

In [None]:
splitter = RandomSplitter(valid_pct=0.2, seed=42)
batch_size = 5
dataloaders = make_dataloaders(x, y_true, splitter, batch_size=batch_size)
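To get a feel for the sizes involved, here is a plain-PyTorch sketch of what a random 80/20 split amounts to. It is not fastai's `RandomSplitter` implementation, so the particular indices it picks will differ; only the counts are meant to carry over. The names `n_items`, `shuffled`, and `batches_per_epoch` are made up for this illustration.

```python
import torch

n_items, valid_pct, batch_size = 500, 0.2, 5

# Shuffle all the indices, then carve off the first 20% for validation.
generator = torch.Generator().manual_seed(42)
shuffled = torch.randperm(n_items, generator=generator)
n_valid = int(valid_pct * n_items)
valid_indices = shuffled[:n_valid]
train_indices = shuffled[n_valid:]

# Each epoch walks through the training set once, batch_size items at a time.
batches_per_epoch = len(train_indices) // batch_size
print(len(train_indices), len(valid_indices), batches_per_epoch)
```

With a batch size this small, each epoch takes many small gradient steps.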

## Solution

Use the `one_batch` method to inspect one batch of the `train` dataloader. Be sure that you can explain the shapes of everything you see. (Look above to see the `batch_size` that this dataloader uses.)

In [None]:
batch = ...
...
X_batch

In [None]:
y_batch

**Fill in the blanks to construct a `model`**:

```
model = nn.Linear(in_features=..., out_features=..., bias=...)
```

* For `in_features`, think about the shape of the input data.
* For `out_features`, think about the shape of the output data.
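If the relationship between these arguments and tensor shapes is hazy, here is a small sketch that uses deliberately different sizes from this task (3 input features, 2 outputs). The names `toy_model` and `toy_batch` are made up for the illustration.

```python
import torch
from torch import nn

# A linear layer mapping 3 features to 2 outputs (not the sizes for this task).
toy_model = nn.Linear(in_features=3, out_features=2, bias=True)

# A batch of 4 items, each with 3 features.
toy_batch = torch.rand(4, 3)
toy_out = toy_model(toy_batch)

print(toy_out.shape)           # one row per item, one column per output
print(toy_model.weight.shape)  # (out_features, in_features)
print(toy_model.bias.shape)    # (out_features,)
```

The batch dimension passes through unchanged; only the last dimension goes from `in_features` to `out_features`.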

In [None]:
model = nn.Linear(in_features=..., out_features=..., bias=...)

To check that we got it right, **call the `model` with the input data from the example batch**.

In [None]:
y_pred = ...
y_pred

Let's look at what the model currently predicts on all the data.

In [None]:
plot_data(x, y_true)
plot_model(x, model)

Pretty bad, huh? Let's evaluate the error on the batch we got:

In [None]:
mse_loss = (y_pred - y_batch).pow(2).mean()
mse_loss

**Create a `loss_func` by instantiating an `nn.MSELoss`.**

In [None]:
loss_func = ...

**Evaluate `loss_func` on the example batch.** Check that the output exactly matches the `mse_loss` computed above.

Note: PyTorch loss functions take the predictions first (PyTorch calls this argument `input`), then the targets. `sklearn` loss functions (metrics) use the reverse order: `y_true`, then `y_pred`.
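Here is a toy illustration of that argument order on made-up numbers (the `toy_` names are hypothetical and are not the batch from this notebook). It compares the loss module against the formula written out by hand.

```python
import torch
from torch import nn

toy_preds = torch.tensor([[1.0], [2.0], [4.0]])
toy_targets = torch.tensor([[1.0], [3.0], [2.0]])

# Written out by hand: squared differences 0, 1, 4, so the mean is 5/3.
toy_manual = (toy_preds - toy_targets).pow(2).mean()

# The module form: predictions first, then targets.
toy_loss_func = nn.MSELoss()
toy_module = toy_loss_func(toy_preds, toy_targets)

print(toy_manual, toy_module)  # both are 5/3, about 1.667
```

For mean squared error the value is the same in either order, but keeping to predictions-then-targets is a good habit because other losses, such as cross-entropy, are not symmetric.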

In [None]:
...

**Construct a `Learner`.**

* Use the `dataloaders`, `model`, and `loss_func` constructed above.
* Use `SGD` as the `opt_func`.
* The default for `metrics` is fine, so you can omit it. (If you want to, you may add Mean Absolute Error (`mae`).)

In [None]:
...

**Fit the Learner for 10 epochs at the default learning rate.**

Plot the loss when it's finished.

In [None]:
...
learner.recorder.plot_loss()

**Now let's look at what the model predicts.**

In [None]:
plot_data(x, y_true)
plot_model(x, model)

**Not there yet! Try different learning rates in the `learner.fit` call to see if you can get it to train to convergence in 10 epochs.**

Remember to Restart and Run All to check that you're starting with a clean model.

## Analysis

Inspect the `weight` and `bias` attributes of `model`. How close are they to the ideal values? Explain.

In [None]:
...

## Extension (optional)

Suppose we rerun this notebook hundreds of times with different random seeds. What is the expected value of the validation loss? 

Answer this by looking at the way that `y_true` was constructed.

(Assume that the model gets enough training data that `weight` and `bias` get exactly the right values. It turns out that this assumption isn't actually needed, but it makes it easier to think about where the error comes from.)