# Bias-Variance Tradeoff + Kernels

When modelling, we are trying to create a useful prediction that can help us in the future. We have seen that we need a train-test split to keep ourselves honest, so that we do not simply tune the model to the data it was fit on. Another perspective on this problem of overfitting versus underfitting is the bias-variance tradeoff. We can decompose the mean squared error of a model into bias, variance, and irreducible noise ($\sigma^2$) to investigate further.

$ E[(y-\hat{f}(x))^2] = Bias(\hat{f}(x))^2 + Var(\hat{f}(x)) + \sigma^2$
  
  
$Bias(\hat{f}(x)) = E[\hat{f}(x)-f(x)]$  
$Var(\hat{f}(x)) = E[\hat{f}(x)^2] - \big(E[\hat{f}(x)]\big)^2$
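
The decomposition can be checked numerically. The sketch below is an illustration only: the true function `true_f`, the noise level `sigma`, the test point `x0`, and the choice of a straight-line model are all assumptions made for the demo, not part of the lab data. It refits the model on many simulated training sets and compares the empirical squared error at `x0` with the sum of the three terms.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_f(x):
    # Assumed ground-truth function for this demo
    return np.sin(x)

sigma = 0.5       # assumed noise standard deviation
x0 = 1.0          # point at which the error is decomposed
n_train = 30      # size of each simulated training set
n_trials = 5000   # number of simulated training sets

preds = np.empty(n_trials)
for t in range(n_trials):
    x_train = rng.uniform(0, 3, n_train)
    y_train = true_f(x_train) + rng.normal(0, sigma, n_train)
    # Fit a straight line; np.polyfit returns the highest power first
    slope, intercept = np.polyfit(x_train, y_train, deg=1)
    preds[t] = slope * x0 + intercept

# Fresh noisy observations of y at x0, one per trial
y0 = true_f(x0) + rng.normal(0, sigma, n_trials)

mse = np.mean((y0 - preds) ** 2)
bias_sq = (np.mean(preds) - true_f(x0)) ** 2
var = np.mean(preds ** 2) - np.mean(preds) ** 2
decomposed = bias_sq + var + sigma ** 2

print('Empirical MSE:          {:.4f}'.format(mse))
print('Bias^2 + Var + sigma^2: {:.4f}'.format(decomposed))
```

The two printed numbers should agree up to sampling error, which shrinks as `n_trials` grows.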

## 1. Split the data into a test and train set.

In [None]:
import pandas as pd

In [None]:
df = pd.read_excel('movie_data_detailed_with_ols.xlsx')

X = df[['budget', 'imdbRating',
       'Metascore', 'imdbVotes']]
y = df['domgross']

df.head()

In [None]:
#Your code here
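
One possible approach is sketched below on synthetic stand-in arrays rather than the movie data. The helper name `simple_train_test_split` and its parameters are hypothetical; in practice scikit-learn's `train_test_split` is the usual choice.

```python
import numpy as np

def simple_train_test_split(X, y, test_size=0.2, seed=0):
    """Shuffle row indices and split X and y into train and test parts."""
    X = np.asarray(X)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_test = int(round(test_size * len(X)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]

# Synthetic stand-in for the feature matrix and target (10 rows, 4 columns)
X_demo = np.arange(40, dtype=float).reshape(10, 4)
y_demo = np.arange(10, dtype=float)

X_train, X_test, y_train, y_test = simple_train_test_split(X_demo, y_demo)
print(X_train.shape, X_test.shape)  # (8, 4) (2, 4)
```

Shuffling the indices once and slicing both arrays with the same index sets keeps each row of `X` aligned with its target in `y`.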

## 2. Fit a regression model to the training data.

In [None]:
#Your code here
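
A minimal sketch of an ordinary least-squares fit using only NumPy is shown here. The helper names `fit_linear_regression` and `predict_linear` are hypothetical, and the data is synthetic so that the recovered coefficients can be checked against known values.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Return least-squares coefficients, intercept first."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    """Apply fitted coefficients (intercept first) to new rows."""
    X = np.asarray(X, dtype=float)
    return coef[0] + X @ coef[1:]

# Synthetic check: y = 1 + 2*x1 - 3*x2 with no noise
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 1 + 2 * X_demo[:, 0] - 3 * X_demo[:, 1]

coef = fit_linear_regression(X_demo, y_demo)
y_hat_demo = predict_linear(coef, X_demo)
print(np.round(coef, 6))
```

On the real data the same two calls would be made with the training split for fitting and the test split for prediction.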

## 3. Calculating Bias
Write a formula to calculate the bias of a model's predictions given the actual data. 
(The expected value can simply be taken as the mean or average value.)

In [None]:
def bias(y, y_hat):
    #Your code here
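
One way to read the bias formula, taking the expected value as a sample mean and using the observed `y` in place of the unknown $f(x)$, is sketched below. This is one interpretation, not the only valid one.

```python
import numpy as np

def bias(y, y_hat):
    """Average signed error of the predictions: mean of (y_hat - y)."""
    return np.mean(np.asarray(y_hat, dtype=float) - np.asarray(y, dtype=float))

# Predictions that are uniformly 0.5 too high
demo_bias = bias([1.0, 2.0, 3.0], [1.5, 2.5, 3.5])
print(demo_bias)  # 0.5
```

A positive value means the model over-predicts on average, and a negative value means it under-predicts.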

## 4. Calculating Variance
Write a formula to calculate the variance of a model's predictions (or any set of data).

In [None]:
def variance(y_hat):
    #Your code here
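
A sketch following the variance formula from the introduction, $E[\hat{f}(x)^2] - (E[\hat{f}(x)])^2$, with expectations taken as sample means:

```python
import numpy as np

def variance(y_hat):
    """Mean of squares minus square of the mean (population variance)."""
    y_hat = np.asarray(y_hat, dtype=float)
    return np.mean(y_hat ** 2) - np.mean(y_hat) ** 2

values = [1.0, 2.0, 3.0, 4.0]
demo_var = variance(values)
print(demo_var, np.var(values))
```

The result matches `np.var` with its default `ddof=0`, which computes the same population variance.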

## 5. Use your functions to calculate the bias and variance of your model.

In [None]:
b = #Your code here
v = #Your code here
print('Bias: {} \nVariance: {}'.format(b,v))

## Locally Weighted Linear Regression using Kernels

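Locally weighted linear regression fits a model to the data by weighting nearby points more heavily than those far away. Compared with a single global fit, this can reduce mean squared error by trading bias against variance: the narrower the neighbourhood, the more closely the fit follows local structure. A common kernel is the Gaussian, which assigns weights by:   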

$ w^{(i)} = \exp\Big(-\frac{\|x^{(i)}-x\|^2}{2k^2}\Big)$

Notice the constant $k$, which controls how heavily nearby points are weighted relative to distant ones.

## 6. Write a kernel function 
Use the Gaussian function defined above to write a kernel weighting function. The function should take in the point to predict, X and a value of k, and return a weight for each training point. A second function should then use these weights, together with y, to return a prediction for each test point by minimizing the weighted MSE.

In [None]:
def kernel(point_to_predict, X, k=1):
    #Your code here
    return weights
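
A minimal sketch of such a weighting function follows. It assumes the standard Gaussian form, with the squared Euclidean distance in the exponent, and accepts either a 1-D or 2-D `X`.

```python
import numpy as np

def kernel(point_to_predict, X, k=1.0):
    """Gaussian weight of every row of X relative to point_to_predict."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(-1, 1)  # treat a 1-D input as a single feature column
    point = np.asarray(point_to_predict, dtype=float).reshape(1, -1)
    sq_dist = np.sum((X - point) ** 2, axis=1)
    return np.exp(-sq_dist / (2 * k ** 2))

# Three training points on a line, queried at the middle one
weights = kernel([1.0], [[0.0], [1.0], [2.0]], k=1.0)
print(weights)
```

The query point itself gets weight 1, and the weights fall off symmetrically with distance. A smaller `k` makes the fall-off steeper.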

In [None]:
def locally_weighted_linear_regression(X_test, X_train, y_train, k=1.0):
    """Pseudocode:
    - Iterate through each point in X_test
    - Call the kernel function above to reweight the (normalized) training data
    - Call a linear regression function to optimize coefficient weights to minimize MSE
    - Output a prediction y_hat_i and append it to your vector of predictions, y_hats
    """
    """
    #Your code here
    return y_hats
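
The pseudocode above can be sketched as follows. This is one possible implementation under stated assumptions: `gaussian_weights` is a hypothetical helper using squared Euclidean distance, inputs are 2-D arrays, and the features are assumed to be on comparable scales already (no normalization is performed here). The weighted fit is done by scaling each row by the square root of its weight and then solving ordinary least squares.

```python
import numpy as np

def gaussian_weights(point, X, k):
    """Gaussian weight of every row of X relative to a single point."""
    sq_dist = np.sum((X - point) ** 2, axis=1)
    return np.exp(-sq_dist / (2 * k ** 2))

def locally_weighted_linear_regression(X_test, X_train, y_train, k=1.0):
    X_test = np.asarray(X_test, dtype=float)
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Design matrix with an intercept column
    A = np.column_stack([np.ones(len(X_train)), X_train])
    y_hats = []
    for point in X_test:
        sw = np.sqrt(gaussian_weights(point, X_train, k))
        # Weighted least squares: scale rows by sqrt(weight), then solve
        theta, *_ = np.linalg.lstsq(A * sw[:, None], y_train * sw, rcond=None)
        y_hats.append(theta[0] + point @ theta[1:])
    return np.array(y_hats)

# Exactly linear data: every local fit should recover y = 3x + 1
X_demo = np.linspace(0, 1, 20).reshape(-1, 1)
y_demo = 3 * X_demo[:, 0] + 1
y_hats = locally_weighted_linear_regression(
    np.array([[0.25], [0.75]]), X_demo, y_demo, k=0.5)
print(y_hats)
```

A separate model is fitted for every test point, so prediction cost grows with both the number of test points and the size of the training set. If `k` is very small relative to the feature scale, all weights can underflow to zero and the fit becomes meaningless.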

## 7. Plotting Bias Variance Curves
Define a function that takes in 4 inputs: min_kernel_size, max_kernel_size, x and y. From these, the function should create graphs of the bias and variance of the regression models versus their kernel size.

* Split the data into train and test samples (You can also add this in as an optional 5th input for your function.)
* Iterate over the kernel range provided, starting at the min_kernel_size and going up through the max_kernel_size, along with at least 100 evenly spaced kernel sizes in between. 
* For each kernel size, train a locally weighted linear regression model. 
* Then for each kernel model, calculate the bias and variance using your above functions.
* Create 5 lists: kernel size, bias, variance, train error and test error.
* Plot kernel size vs bias 
* Plot kernel size vs variance
* Plot kernel size vs train error
* Plot kernel size vs test error

In [None]:
def plot_bias_var_curves(min_kernel, max_kernel, x, y):
    #Plot curves
    #Your code here
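
The steps listed above can be sketched as below. This is an illustrative outline, not a reference solution: `lwlr_predict` and `bias_var_curves` are hypothetical helpers, the Gaussian kernel uses squared Euclidean distance, bias and variance are computed on the test predictions, and the features are assumed to be scaled so that the kernel sizes are meaningful. The numeric work is kept separate from the plotting so it can be inspected on its own.

```python
import numpy as np

def lwlr_predict(X_test, X_train, y_train, k):
    """Locally weighted linear regression with Gaussian weights."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    preds = []
    for point in X_test:
        w = np.exp(-np.sum((X_train - point) ** 2, axis=1) / (2 * k ** 2))
        sw = np.sqrt(w)
        theta, *_ = np.linalg.lstsq(A * sw[:, None], y_train * sw, rcond=None)
        preds.append(theta[0] + point @ theta[1:])
    return np.array(preds)

def bias_var_curves(min_kernel, max_kernel, x, y,
                    test_size=0.25, n_sizes=102, seed=0):
    """Return kernel sizes plus bias, variance, train MSE and test MSE lists."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x.reshape(-1, 1)
    # Shuffle once and split into train and test rows
    idx = np.random.default_rng(seed).permutation(len(x))
    n_test = int(round(test_size * len(x)))
    test, train = idx[:n_test], idx[n_test:]
    # Both endpoints plus 100 evenly spaced sizes in between by default
    sizes = np.linspace(min_kernel, max_kernel, n_sizes)
    biases, variances, train_err, test_err = [], [], [], []
    for k in sizes:
        pred_test = lwlr_predict(x[test], x[train], y[train], k)
        pred_train = lwlr_predict(x[train], x[train], y[train], k)
        biases.append(np.mean(pred_test - y[test]))
        variances.append(np.mean(pred_test ** 2) - np.mean(pred_test) ** 2)
        train_err.append(np.mean((pred_train - y[train]) ** 2))
        test_err.append(np.mean((pred_test - y[test]) ** 2))
    return sizes, biases, variances, train_err, test_err

def plot_bias_var_curves(min_kernel, max_kernel, x, y):
    # Imported here so the numeric helpers above run without matplotlib
    import matplotlib.pyplot as plt
    sizes, biases, variances, train_err, test_err = bias_var_curves(
        min_kernel, max_kernel, x, y)
    curves = [(biases, 'Bias'), (variances, 'Variance'),
              (train_err, 'Train error'), (test_err, 'Test error')]
    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    for ax, (values, label) in zip(axes.ravel(), curves):
        ax.plot(sizes, values)
        ax.set_xlabel('Kernel size')
        ax.set_ylabel(label)
    fig.tight_layout()
    plt.show()
```

Each kernel size requires one local fit per train and test point, so the loop can be slow on large datasets. Reducing `n_sizes` is a quick way to get a first look at the curves.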

## Call your function!

In [None]:
# Keyword arguments throughout: positional arguments cannot follow keyword ones
plot_bias_var_curves(min_kernel=0.001, max_kernel=1, x=X, y=y)

## 8. Describe what you notice and observe.

Reflect on what you noticed here.