<a href="https://colab.research.google.com/github/darstech/ML-Foundation/blob/main/notebooks/batch_regression_gradient.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Ref: [Jon Krohn's GitHub](https://github.com/jonkrohn/ML-foundations/blob/master/notebooks/batch-regression-gradient.ipynb)

# Gradient of Cost on a Batch of Data

In this notebook, we expand on the partial derivative calculus of the [*Single Point Regression Gradient* notebook](https://github.com/darstech/ML-Foundation/blob/main/notebooks/single_point_regression_gradient.ipynb) to: 

* Calculate the gradient of mean squared error on a batch of data
* Visualize gradient descent in action

In [1]:
import torch
import matplotlib.pyplot as plt

In [2]:
xs = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7.])
ys = torch.tensor([1.86, 1.31, .62, .33, .09, -.67, -1.23, -1.37])

In [4]:
def regression(my_x, my_m, my_b):
  # Linear model y-hat = m*x + b, applied elementwise across the batch
  return my_m*my_x + my_b

In [6]:
m = torch.tensor([0.9]).requires_grad_()
b = torch.tensor([0.1]).requires_grad_()

**Step 1**: Forward pass

In [9]:
yhat = regression(xs, m, b)
yhat

tensor([0.1000, 1.0000, 1.9000, 2.8000, 3.7000, 4.6000, 5.5000, 6.4000],
       grad_fn=<AddBackward0>)

**Step 2**: Compare $\hat{y}$ with true $y$ to calculate cost $C$

As in the [*Regression in PyTorch* notebook](https://github.com/darstech/ML-Foundation/blob/main/notebooks/regression_in_pytorch.ipynb), let's use mean squared error (MSE), which averages quadratic cost across the $n$ data points in the batch: $$C = \frac{1}{n} \sum_{i=1}^n (\hat{y}_i-y_i)^2 $$

In [10]:
def mse(my_yhat, my_y):
  # Sum the squared residuals over the batch, then divide by the batch size n
  sigma = torch.sum((my_yhat - my_y)**2)
  return sigma/len(my_y)

In [11]:
C = mse(yhat, ys)
C

tensor(19.6755, grad_fn=<DivBackward0>)
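
As a sanity check on the hand-written `mse`, the cost can be compared against PyTorch's built-in `torch.nn.functional.mse_loss`, whose default `reduction='mean'` averages the squared errors in the same way. The sketch below restates the data and starting parameter values from the cells above so that it runs on its own; the `_check`-suffixed names and `C_manual` / `C_builtin` are introduced here purely for illustration.

```python
import torch
import torch.nn.functional as F

# Data and starting parameters restated from the cells above
xs_check = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7.])
ys_check = torch.tensor([1.86, 1.31, .62, .33, .09, -.67, -1.23, -1.37])
m_check = torch.tensor([0.9])
b_check = torch.tensor([0.1])

# Forward pass with the same linear model
yhat_check = m_check*xs_check + b_check

# Hand-written MSE, mirroring mse() above
C_manual = torch.sum((yhat_check - ys_check)**2) / len(ys_check)

# Built-in MSE; the default reduction='mean' divides by the number of elements
C_builtin = F.mse_loss(yhat_check, ys_check)

print(C_manual, C_builtin)
```

Both values should agree with the cost computed above, up to floating-point rounding.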

**Step 3**: Use autodiff to calculate the gradient of $C$ w.r.t. the parameters $m$ and $b$

In [12]:
C.backward()

In [13]:
m.grad

tensor([36.3050])

In [14]:
b.grad

tensor([6.2650])
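
These values can be checked without autodiff. Substituting $\hat{y}_i = mx_i + b$ into $C$ and applying the chain rule gives $\frac{\partial C}{\partial m} = \frac{2}{n}\sum_{i=1}^n (\hat{y}_i - y_i)x_i$ and $\frac{\partial C}{\partial b} = \frac{2}{n}\sum_{i=1}^n (\hat{y}_i - y_i)$. The sketch below evaluates both sums in plain Python on the same data, so the check does not depend on PyTorch at all; the names `xs_list`, `ys_list`, `m0`, `b0`, `residuals`, `dC_dm`, and `dC_db` are illustrative and not part of the notebook.

```python
# Same data and starting parameters as above, as plain Python floats
xs_list = [0., 1., 2., 3., 4., 5., 6., 7.]
ys_list = [1.86, 1.31, .62, .33, .09, -.67, -1.23, -1.37]
m0, b0 = 0.9, 0.1
n = len(xs_list)

# Residual (y-hat minus y) for each data point
residuals = [(m0*x + b0) - y for x, y in zip(xs_list, ys_list)]

# Closed-form partial derivatives of mean squared error
dC_dm = 2/n * sum(r*x for r, x in zip(residuals, xs_list))
dC_db = 2/n * sum(residuals)

print(round(dC_dm, 4), round(dC_db, 4))
```

Up to rounding, the two results match `m.grad` and `b.grad` as reported by autodiff above.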