# A different approach to regression

### Introduction

Let's return to our problem of predicting T-shirt sales.

| ad spending | t-shirts |
| ------------- |:-------------:|
| 800 | 330 |
| 1500 | 780 |
| 2000 | 1130 |
| 3500 | 1310 |
| 4000 | 1780 |

```python
inputs = [800, 1500, 2000, 3500, 4000]
outcomes = [330, 780, 1130, 1310, 1780]
```

Now let's just look at one of the rows of data.

| ad spending | t-shirts |
| ------------- |:-------------:|
| 800 | 330 |

When we do linear regression, really what we are trying to do is the following: 

> Find the impact of our **independent variable**, here ad-spending, on our **dependent variable** of t-shirts.  

We've previously written this as an equation.

$m*800 = 330$

where:

* $800$ is the amount of ad spending
* $330$ is the related number of T-shirts sold
* $m$ is our coefficient - the impact ad spending has on T-shirts

So this coefficient $m$ is an example of a parameter we try to solve for when we perform linear regression.

When we only have one row of data, solving for $m$ is fairly straightforward.

$m*800 = 330$

$m * \dfrac{800}{800} = \dfrac{330}{800}$

$m = \dfrac{330}{800} = 0.4125$
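The same one-row calculation can be checked in Python. This is a minimal sketch using the first row of the table (800 in ad spending, 330 T-shirts sold); the variable names `ad_spending` and `tshirts_sold` are chosen here for illustration only.

```python
# One observation from the table: 800 in ad spending, 330 T-shirts sold
ad_spending = 800
tshirts_sold = 330

# Divide both sides of m * ad_spending = tshirts_sold by ad_spending to isolate m
m = tshirts_sold / ad_spending
print(m)  # 0.4125
```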

### Moving to multiple observations

Of course, the reason we can't simply use algebra for regression is that we have not just one observation but many rows of observations.

| ad spending | t-shirts |
| ------------- |:-------------:|
| 800 | 330 |
| 1500 | 780 |
| 2000 | 1130 |
| 3500 | 1310 |
| 4000 | 1780 |

And we want to find *a single coefficient value* that, when multiplied by each value of our independent variable, equals the corresponding value of our dependent variable.

$$800*m = 330 $$
$$1500*m = 780 $$
$$2000*m = 1130 $$
$$3500*m = 1310 $$
$$4000*m = 1780$$

It makes sense to assume that $m$ is the same across equations.  This reflects our assumption that the number of T-shirts sold per unit of ad spending is constant across our different observations.
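To see why a single $m$ is a real constraint, we can solve each equation on its own and compare the results. The sketch below reuses the `inputs` and `outcomes` lists from the start of the lesson; `per_row_m` is a name introduced here for illustration.

```python
inputs = [800, 1500, 2000, 3500, 4000]
outcomes = [330, 780, 1130, 1310, 1780]

# Solve each equation separately: m = outcome / input
per_row_m = [outcome / ad for ad, outcome in zip(inputs, outcomes)]

# Print the value of m that each row would need on its own
for ad, m in zip(inputs, per_row_m):
    print(ad, round(m, 4))
```

Each row asks for a slightly different value of $m$, so no single value satisfies all five equations exactly. The best we can do is pick one value that comes close for every row.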

Notice that solving a system of equations looks different from the trial and error approach that we saw before.  And it is different.  This approach is more algebraic, and it is referred to as the *analytic solution to regression*.

We'll learn more about it later, but first we need to learn some more fundamentals.

### Introducing Linear Algebra

Let's look again at our equations from before.

$$800*m = 330 $$
$$1500*m = 780 $$
$$2000*m = 1130 $$
$$3500*m = 1310 $$
$$4000*m = 1780$$

Our approach to solving regression is to find a single value of $m$ that solves or comes close to solving our equations above. 
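One way to make "comes close to solving" concrete is to measure, for a candidate $m$, how far each left-hand side lands from its right-hand side. The sketch below assumes the sum of squared gaps as the measure of closeness, which is a choice made here for illustration and not something this lesson specifies. The helper names `errors_for` and `sum_squared_error`, and the candidate values, are likewise hypothetical.

```python
inputs = [800, 1500, 2000, 3500, 4000]
outcomes = [330, 780, 1130, 1310, 1780]

def errors_for(m):
    """Return the gap (actual - predicted) for each equation input * m = outcome."""
    return [outcome - ad * m for ad, outcome in zip(inputs, outcomes)]

def sum_squared_error(m):
    """Total squared gap across all five equations (assumed closeness measure)."""
    return sum(e ** 2 for e in errors_for(m))

# Compare a few hypothetical candidates; a smaller total means a closer fit
for candidate in [0.40, 0.45, 0.50]:
    print(candidate, round(sum_squared_error(candidate)))
```

Under this measure, a candidate that misses every equation by a little can beat one that solves a single equation exactly and misses the others badly.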

Now this problem of having multiple equations and trying to find a coefficient that satisfies all of them arises throughout mathematics.  It's called "solving a system of equations", and an entire field of mathematics has grown up around it.  That field is called linear algebra.

To understand machine learning, we won't have to take an entire course in linear algebra, but we will need to learn some of the basics.  Doing so will allow us to understand some of the concepts in machine learning that come from linear algebra, and it will also allow us to understand the notation of linear algebra, which is how many data scientists speak about and understand machine learning algorithms.

By using linear algebra we can express our entire system of equations below...

$$800*x = 330 $$
$$1500*x = 780 $$
$$2000*x = 1130 $$
$$3500*x = 1310 $$
$$4000*x = 1780$$

> We replaced the variable $m$ with $x$ to follow convention.

as the following: 

$ax = b$

where $a$ is the vector:

$a = \begin{pmatrix}
    800 \\
    1500 \\
    2000 \\
    3500 \\
    4000 \\
\end{pmatrix}$

$x$ is a scalar. 

and $b$ is the vector:

$b =  \begin{pmatrix}
330   \\
780 \\
1130 \\
1310 \\
1780 \\
\end{pmatrix}$ 
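As a rough preview of what $ax = b$ means in practice, the sketch below stores $a$ and $b$ as plain Python lists and multiplies every entry of $a$ by the same number $x$. The value `0.45` is a hypothetical candidate picked for illustration, not a solution derived in this lesson, and the names `ax` and `gaps` are introduced here.

```python
a = [800, 1500, 2000, 3500, 4000]   # ad spending, one entry per observation
b = [330, 780, 1130, 1310, 1780]    # T-shirts sold, one entry per observation
x = 0.45                            # hypothetical candidate value for the scalar

# Scalar times vector: multiply every entry of a by the same number x
ax = [a_i * x for a_i in a]

# ax = b holds exactly only if every entry matches, so inspect the gaps
gaps = [b_i - ax_i for b_i, ax_i in zip(b, ax)]
print(ax)
print(gaps)
```

The single expression $ax$ stands for all five left-hand sides at once, which is what lets us write the whole system as one equation.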





But we're getting a little ahead of ourselves.  We don't yet know what a scalar or a vector is, or why we would want to use them.  So that is where we will go next.

### Summary

In this lesson, we saw that when we are given a set of observations

| ad spending | t-shirts |
| ------------- |:-------------:|
| 800 | 330 |
| 1500 | 780 |
| 2000 | 1130 |
| 3500 | 1310 |
| 4000 | 1780 |

we can translate the problem of linear regression, where we try to discover the coefficient (or coefficients) that predict a target, into the problem of solving a system of linear equations, where we look for the coefficients that solve or come close to solving the system:

$$800*x = 330 $$

$$1500*x = 780 $$

$$2000*x = 1130 $$

$$3500*x = 1310 $$

$$4000*x = 1780$$

This approach (which we'll later explore further) is called the analytic solution to regression.

We can rewrite a system of equations using vectors.  Below we let our features equal the vector $a$, our targets equal the vector $b$, and our coefficient equal the scalar $x$, so that the whole system reads $ax = b$,

where $a$ is the vector:

$a = \begin{pmatrix}
    800 \\
    1500 \\
    2000 \\
    3500 \\
    4000 \\
\end{pmatrix}$

$x$ is a scalar. 

and $b$ is the vector:

$b =  \begin{pmatrix}
330   \\
780 \\
1130 \\
1310 \\
1780 \\
\end{pmatrix}$ 

In the lessons that follow, we'll try to better understand how we can re-express and understand our problem of linear regression using vectors and matrices.