# Fitting a Linear Regression Function

### Introduction

In the previous section, we discussed our formula for linear regression, $y = mx + b$, and we saw how we can use the SciKit Learn library to discover estimated values for $m$ and $b$.  But how does SciKit Learn, or any linear regression model, come up with these numbers?  In this lesson we'll start finding out.

### How SciKit Learn "Fits"

SciKit Learn finds values for $m$ and $b$ when we *fit* our linear regression model to the data.  Let's see again how we do this.

In [1]:
inputs = [800, 1500, 2000, 3500, 4000]
# SciKit Learn expects 2D inputs: one row per observation, one column per feature
sklearn_inputs = [
    [800],
    [1500],
    [2000],
    [3500],
    [4000],
]
outcomes = [330, 780, 1130, 1310, 1780]

In [2]:
from sklearn.linear_model import LinearRegression

# create the initial model
regression = LinearRegression()
# fit the model to the data, which estimates the coefficient and the intercept
regression.fit(sklearn_inputs, outcomes)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
         normalize=False)

Once we call the `fit` method, the model has come up with numbers for both the coefficient (our $m$) and the intercept (our $b$).

In [3]:
regression.coef_

array([0.38675261])

In [4]:
regression.intercept_

153.26385079539216
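To connect these two numbers back to $y = mx + b$, here is a minimal sketch that plugs them into the formula by hand. The values of `m` and `b` are copied from the outputs above, and the `predict` helper is a hypothetical name introduced only for this illustration.

```python
# Fitted values copied from the regression.coef_ and regression.intercept_ outputs above
m = 0.38675261
b = 153.26385079539216

inputs = [800, 1500, 2000, 3500, 4000]
outcomes = [330, 780, 1130, 1310, 1780]

def predict(x):
    """Predicted outcome for input x using the line y = m*x + b."""
    return m * x + b

# Compare each observed outcome with what the line predicts for that input
for x, y in zip(inputs, outcomes):
    print(f"input={x}  observed={y}  predicted={predict(x):.1f}")
```

Calling `regression.predict(sklearn_inputs)` on the fitted model should give the same predictions, up to rounding of the copied coefficient.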

When we `fit` the line, what we are trying to do is find the line that best matches the data that we pass to the model: our inputs and outcomes.  By "best matches" we mean the line that comes closest to the data.  So how do we get the line that comes closest to the data?

These are the steps.

1. Start with an initial model: that is, initial values for $m$ and $b$ (these numbers can be anything)
2. Evaluate the model by calculating how closely its predictions match our observed data
3. Update the parameters of our linear regression model and evaluate this updated model
4. Stop when we have a linear regression model that comes as close as possible to the data
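
The four steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not a description of how SciKit Learn computes its answer internally: "closeness" is assumed to be the sum of squared differences between predictions and outcomes, the update rule is a simple try-a-nudge-and-keep-it-if-it-helps search, and the name `total_squared_error`, the starting step sizes, and the stopping threshold are all hypothetical choices made for this example.

```python
inputs = [800, 1500, 2000, 3500, 4000]
outcomes = [330, 780, 1130, 1310, 1780]

def total_squared_error(m, b):
    """Assumed measure of closeness: sum of squared gaps between m*x + b and the outcomes."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(inputs, outcomes))

# Step 1: start with an initial model (any values will do)
m, b = 0.0, 0.0
m_step, b_step = 0.1, 100.0  # illustrative step sizes, roughly matched to each parameter's scale

# Step 2: evaluate the initial model
best_error = total_squared_error(m, b)

for _ in range(100_000):  # safety cap on the number of rounds
    # Step 4: stop once the nudges are too small to matter
    if m_step < 1e-8:
        break
    improved = False
    # Step 3: nudge one parameter at a time and keep the nudge only if the error drops
    for dm, db in [(m_step, 0.0), (-m_step, 0.0), (0.0, b_step), (0.0, -b_step)]:
        error = total_squared_error(m + dm, b + db)
        if error < best_error:
            m, b, best_error = m + dm, b + db, error
            improved = True
    if not improved:
        # No nudge helped at this size, so try smaller nudges
        m_step /= 2
        b_step /= 2

print(m, b)
```

Under these assumptions, the search should settle close to the coefficient and intercept that `fit` reported earlier, even though it starts from an arbitrary guess.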

Next, we'll explore the steps of building the initial model and evaluating that model.