## Codio Activity 7.5: Calculating Multiple Loss Functions

**Expected Time = 60 minutes**

**Total Points = 20**

A third loss function mentioned is the Huber loss function.  This is notable for its resistance to extreme values and is defined as a piecewise function:


$${\displaystyle L_{\delta }(y,f(x))={\begin{cases}{\frac {1}{2}}(y-f(x))^{2}&{\textrm {for}}|y-f(x)|\leq \delta ,\\\delta \,(|y-f(x)|-{\frac {1}{2}}\delta ),&{\textrm {otherwise.}}\end{cases}}}$$

In this activity, you will compute and compare the results of minimizing the mean squared error, mean absolute error, and Huber loss functions.
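To make the piecewise definition concrete, the sketch below evaluates the Huber loss on three hand-picked residuals and compares it with half the squared error. The helper name `huber` and the residual values are illustrative assumptions, not part of the activity.

```python
import numpy as np

def huber(residual, delta=1.5):
    """Element-wise Huber loss for an array of residuals y - f(x)."""
    abs_r = np.abs(residual)
    # Quadratic branch for |r| <= delta, linear branch otherwise
    return np.where(abs_r <= delta, 0.5 * abs_r**2, delta * (abs_r - 0.5 * delta))

# Illustrative residuals: small, exactly at delta, and an extreme value
residuals = np.array([0.5, 1.5, 10.0])

huber_values = huber(residuals)      # [0.125, 1.125, 13.875]
half_squared = 0.5 * residuals**2    # [0.125, 1.125, 50.0]

print(huber_values)
print(half_squared)
```

The two losses agree up to `delta`, and beyond it the Huber loss grows only linearly, which is why a single extreme value contributes far less than it would under squared error.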

NOTE: If the formula is not rendering correctly (overlapping text), double-click in this cell and then Shift-Enter to reload the cell.


## Index:

- [Problem 1](#Problem-1)
- [Problem 2](#Problem-2)
- [Problem 3](#Problem-3)

In [2]:
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
import plotly.graph_objects as go
from scipy.optimize import minimize

### The tips data

For this exercise, the tips dataset from the lectures will be used, and you are to predict the tip amount given the total bill.  

In [5]:
tips = sns.load_dataset('tips')

In [6]:
tips.head()

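| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |
| 3 | 23.68 | 3.31 | Male | No | Sun | Dinner | 2 |
| 4 | 24.59 | 3.61 | Female | No | Sun | Dinner | 4 |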


In [9]:
X = tips['total_bill']
y = tips['tip']

[Back to top](#Index:) 

## Problem 1

### Create a Huber Loss function

**10 Points**

Using the formula for the Huber loss, repeated below, complete the function so that it returns the sum of the Huber loss over all observations.

$${\displaystyle L_{\delta }(y,f(x))={\begin{cases}{\frac {1}{2}}(y-f(x))^{2}&{\textrm {for}}|y-f(x)|\leq \delta ,\\\delta \,(|y-f(x)|-{\frac {1}{2}}\delta ),&{\textrm {otherwise.}}\end{cases}}}$$


The `huber_loss` function should take two arguments:

- `theta`: a float value to use as the parameter of the regression model.
- `delta`: the delta value in the Huber loss. Give this parameter a default value of `1.5`.

Inside the function, define `y_pred` as the product of `theta` and `X`. Next, define `y_err` as the absolute value of the difference between `y` and `y_pred`.

Use the code below to return the value of the error from the Huber loss formula:

```
sum(np.where(y_err <= delta, 1/2*(y_err)**2, delta*(y_err - 1/2*delta)))
```


In [25]:
#GRADED
def huber_loss(theta, delta = 1.5):
    """
    This function accepts a value for theta
    and returns the sum of the Huber loss.

    Arguments
    ---------
    theta: float
           Value to use for the parameter
           of the regression model.

    delta: float
           Value for delta in the Huber loss

    Returns
    -------
    huber: float
         Sum of the Huber loss
    """
    # YOUR CODE HERE
    y_pred = X * theta                 # predictions from the one-parameter model
    y_err = abs(y - y_pred)            # absolute residuals
    # Quadratic penalty for small residuals, linear penalty for large ones
    h_loss = sum(np.where(y_err <= delta, 1/2*(y_err)**2, delta*(y_err - 1/2*delta)))

    return h_loss

huber_loss(8)

56561.369999999995
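A value this large is what the formula predicts when `theta` is far from a sensible slope: every residual then exceeds `delta`, so only the linear branch applies and the sum collapses to `delta * (sum(|r|) - n * delta / 2)`. The sketch below checks this identity on a small made-up dataset; `X_toy` and `y_toy` are hypothetical stand-ins, not the tips data.

```python
import numpy as np

# Hypothetical toy data standing in for total_bill and tip
X_toy = np.array([10.0, 20.0, 30.0])
y_toy = np.array([1.5, 3.0, 4.5])
delta = 1.5
theta = 8.0  # deliberately far too large, as in huber_loss(8)

y_err = np.abs(y_toy - theta * X_toy)

# Piecewise definition, as used in the activity
piecewise = np.sum(np.where(y_err <= delta, 0.5 * y_err**2, delta * (y_err - 0.5 * delta)))

# Closed form that holds only when every residual is larger than delta
closed_form = delta * (y_err.sum() - 0.5 * delta * len(y_err))

print(piecewise, closed_form)
```

A check like this can be a quick way to catch a swapped branch in `np.where`, since the two numbers would then disagree.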

[Back to top](#Index:) 

## Problem 2

### Minimizing Huber Loss

**10 Points**

Use the `minimize` function imported from `scipy.optimize` to find the value of `theta` that minimizes `huber_loss`, starting from the initial guess `x0 = .5`. Assign the result to `minimum_theta`.

Next, use the attribute `minimum_theta.x[0]` together with the built-in `float` (the `np.float` alias was removed in NumPy 1.24) to calculate the value of `theta_huber` below.


In [31]:
### GRADED

# Use the `minimize` function with an initial guess of 0.5 and assign the value to `minimum_theta` variable
minimum_theta = None

theta_huber = ''

# YOUR CODE HERE
minimum_theta = minimize(huber_loss, x0 = 0.5)
print(minimum_theta)
theta_huber = float(minimum_theta.x[0])

# Answer check
print(type(theta_huber))
print(theta_huber)

  message: Optimization terminated successfully.
  success: True
   status: 0
      fun: 126.1752379831355
        x: [ 1.463e-01]
      nit: 5
      jac: [ 0.000e+00]
 hess_inv: [[ 1.266e-05]]
     nfev: 16
     njev: 8
<class 'float'>
0.14626752601211537
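The graded `huber_loss` reads `X` and `y` from the notebook's global scope. As an alternative sketch, the data and `delta` can be passed explicitly through the `args` parameter of `minimize`. The function name `huber_loss_explicit` and the synthetic data (true slope 0.15 with a few injected outliers) are assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import minimize

def huber_loss_explicit(theta, X, y, delta=1.5):
    """Sum of Huber losses for the one-parameter model y ~ theta * X."""
    y_err = np.abs(y - theta * X)
    return np.sum(np.where(y_err <= delta, 0.5 * y_err**2, delta * (y_err - 0.5 * delta)))

# Synthetic data (hypothetical): slope 0.15 plus noise, with five large outliers
rng = np.random.default_rng(0)
X_toy = rng.uniform(5, 50, size=200)
y_toy = 0.15 * X_toy + rng.normal(0, 0.5, size=200)
y_toy[:5] += 20

# Extra positional arguments after theta are supplied via `args`
result = minimize(huber_loss_explicit, x0=0.5, args=(X_toy, y_toy, 1.5))
theta_toy = float(result.x[0])

print(theta_toy)
```

Passing the data explicitly makes it easier to reuse the same function with a different dataset or a different `delta` without editing globals.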
