# Loss Lab

### Introduction

In this lesson we'll build out the error component of our simple linear regression model.  To do this, we need to know the model's coefficient and y-intercept.  We'll also need input data to predict on and target values to compare the model's predictions against.  We'll write our methods and encapsulate the related data using object orientation.

### Loading our hypothesis class

First, copy and paste the hypothesis class from the previous lab.

In [1]:
class Hypothesis:
    def __init__(self, coef, intercept, inputs):
        self.coef_ = coef
        self.intercept_ = intercept
        self.x_values = inputs
        
    def predict(self):
        # Apply y = coef * x + intercept to each input value.
        self.y_values = []
        for x_value in self.x_values:  # avoid shadowing the built-in input()
            y_value = self.coef_ * x_value + self.intercept_
            self.y_values.append(y_value)
        return self.y_values
    

In [2]:
import numpy as np
coef = 0.39
intercept = 153
inputs = np.array([800, 1500, 2000, 3500, 4000])


hypothesis = Hypothesis(coef, intercept, inputs)
hypothesis.__dict__
# {'coef_': 0.39, 'intercept_': 153, 'x_values': array([ 800, 1500, 2000, 3500, 4000])}

{'coef_': 0.39,
 'intercept_': 153,
 'x_values': array([ 800, 1500, 2000, 3500, 4000])}

In [3]:
hypothesis.predict()
# [465.0, 738.0, 933.0, 1518.0, 1713.0]

[465.0, 738.0, 933.0, 1518.0, 1713.0]
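
As a cross-check, the loop in `predict` can also be written as a single NumPy expression.  This is a minimal sketch, assuming `inputs` is a NumPy array as above; the name `vectorized_predictions` is introduced here for illustration and is not part of the lab's classes.

```python
import numpy as np

coef = 0.39
intercept = 153
inputs = np.array([800, 1500, 2000, 3500, 4000])

# Broadcasting applies y = coef * x + intercept to every input at once.
# `vectorized_predictions` is a hypothetical name used only in this sketch.
vectorized_predictions = coef * inputs + intercept

# Compare against the values returned by hypothesis.predict() above.
print(np.allclose(vectorized_predictions, [465.0, 738.0, 933.0, 1518.0, 1713.0]))
```

The comparison uses `np.allclose` rather than `==` because `0.39` is not exactly representable in floating point.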

The `Hypothesis` class will remain the sole class in charge of making predictions.  Next we'll add a `Loss` class, which will be in charge of calculating errors.

### Creating the Loss Class

In [4]:
class Loss():
    def __init__(self, hypothesis, targets):
        self.hypothesis = hypothesis
        self.y_values = targets

Think about what it takes to calculate the error at a given point.  We need to know:
* the observed target values, and 
* the hypothesis component that makes predictions from our feature data.  

Update the `Loss` class so that we can initialize it with an instance of our `Hypothesis` class and a vector of target variables (`targets`).  
> The Hypothesis class will continue to hold all of the information related to our y-intercept, coefficient, inputs, and predictions.

So first we'll recreate an instance of a hypothesis.

In [5]:
import numpy as np
coef = 0.39
intercept = 153

inputs = np.array([800, 1500, 2000, 3500, 4000])
hypothesis = Hypothesis(coef, intercept, inputs)

Then we need to initialize the Loss instance with our hypothesis instance and target variables.

In [6]:
targets = np.array([330, 780, 1130, 1310, 1780])

In [7]:
loss = Loss(hypothesis, targets)
loss.__dict__

# {'hypothesis': <__main__.Hypothesis at 0x10cd59400>,
#  'y_values': array([ 330,  780, 1130, 1310, 1780])}

{'hypothesis': <__main__.Hypothesis at 0x155c1746048>,
 'y_values': array([ 330,  780, 1130, 1310, 1780])}

With the hypothesis instance (which carries the parameters and the inputs) and the target values, our Loss instance has the information it needs to calculate errors.

### Calculating Errors

Write a method called `errors` that returns a list containing the error for each data point we passed in, where each error is the target value minus the predicted value.

In [8]:
class Loss():
    def __init__(self, hypothesis, targets):
        self.hypothesis = hypothesis
        self.y_values = targets
        
    def errors(self):
        # Error = observed target minus predicted value, one per data point.
        outcomes = self.y_values
        expecteds = self.hypothesis.predict()
        return [outcome - expected for outcome, expected in zip(outcomes, expecteds)]

In [9]:
targets = np.array([330, 780, 1130, 1310, 1780])
targets

array([ 330,  780, 1130, 1310, 1780])

In [10]:
loss = Loss(hypothesis, targets)
loss.__dict__

# {'hypothesis': <__main__.Hypothesis at 0x10cd59400>,
#  'y_values': array([ 330,  780, 1130, 1310, 1780])}

{'hypothesis': <__main__.Hypothesis at 0x155c1746048>,
 'y_values': array([ 330,  780, 1130, 1310, 1780])}

In [11]:
loss.errors()
# [-135.0, 42.0, 197.0, -208.0, 67.0]

[-135.0, 42.0, 197.0, -208.0, 67.0]
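
Because the targets are a NumPy array, the same errors can be checked with a single array subtraction.  The sketch below assumes the predictions shown earlier in this lab and uses the same sign convention as `errors` (target minus prediction); `predictions` and `vectorized_errors` are illustrative names, not part of the `Loss` class.

```python
import numpy as np

targets = np.array([330, 780, 1130, 1310, 1780])
# Values returned by hypothesis.predict() earlier in the lab.
predictions = np.array([465.0, 738.0, 933.0, 1518.0, 1713.0])

# Element-wise subtraction: one error per data point (target - prediction).
vectorized_errors = targets - predictions

# Compare against the list returned by loss.errors().
print(np.allclose(vectorized_errors, [-135.0, 42.0, 197.0, -208.0, 67.0]))
```

A negative error means the model predicted too high for that point, and a positive error means it predicted too low.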

Then write a method called `squared_errors` that squares each element returned by our `errors` method, and a method called `rss` that sums those squares.

> For this method, break down the steps involved in matrix-vector multiplication (multiply element by element, then sum the products) and see how you can use them to build the `rss` method.

In [12]:
class Loss():
    def __init__(self, hypothesis, targets):
        self.hypothesis = hypothesis
        self.y_values = targets
        
    def errors(self):
        # Error = observed target minus predicted value, one per data point.
        outcomes = self.y_values
        expecteds = self.hypothesis.predict()
        return [outcome - expected for outcome, expected in zip(outcomes, expecteds)]
    
    def squared_errors(self):
        return [error**2 for error in self.errors()]
    
    def rss(self):
        # Residual sum of squares: the sum of the squared errors.
        return sum(self.squared_errors())

In [13]:
targets

array([ 330,  780, 1130, 1310, 1780])

In [14]:
loss = Loss(hypothesis, targets)
loss.__dict__

# {'hypothesis': <__main__.Hypothesis at 0x10cd59400>,
#  'y_values': array([ 330,  780, 1130, 1310, 1780])}

{'hypothesis': <__main__.Hypothesis at 0x155c1746048>,
 'y_values': array([ 330,  780, 1130, 1310, 1780])}

In [15]:
loss.rss()
# 106551.0

106551.0
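
The hint about matrix-vector multiplication can be sketched as a dot product of the error vector with itself: multiply element by element, then sum.  This is one possible cross-check rather than the lab's required solution; `rss_via_dot` and `rss_via_loop` are names made up for this example.

```python
import numpy as np

# Values returned by loss.errors() above.
errors = np.array([-135.0, 42.0, 197.0, -208.0, 67.0])

# Dot product of the error vector with itself: sum of element-wise products.
rss_via_dot = errors @ errors  # equivalent to np.dot(errors, errors)

# The same computation spelled out, mirroring squared_errors() and rss().
rss_via_loop = sum(error**2 for error in errors)

print(rss_via_dot == 106551.0, np.isclose(rss_via_loop, rss_via_dot))
```

Both approaches agree with `loss.rss()`.  The dot product form avoids building an intermediate list of squared errors.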

### Summary

In this lesson, we built out the error component of our simple linear regression model.  To do this, we created an instance of the Hypothesis class, which is responsible for the parameters of our linear regression model as well as for making predictions from these parameters and our input data.  We then added a Loss class, which calculates the errors, squared errors, and residual sum of squares of our hypothesis instance for a given data set.