# Using Simple Algebra

### Viewing our Data

Let's return to our problem of predicting T-shirt sales.

| ad spending | t-shirts |
| ----------- |:--------:|
| 800         | 330      |
| 1500        | 780      |
| 2000        | 1130     |
| 3500        | 1310     |
| 4000        | 1780     |

In [18]:
import numpy as np
ad_spending = np.array([800, 1500, 2000, 3500, 400])
sales_spending = np.array([200, 100, 1000, 500, 100])
outcomes = np.array([3300, 7800, 11300, 13100, 1200])

In [19]:
outcomes - (3*ad_spending + 4*sales_spending)

array([ 100, 2900, 1300,  600, -400])

Now let's just look at one of the rows of data.

| ad spending | t-shirts |
| ----------- |:--------:|
| 800         | 330      |

$$\theta_1*800 = 330$$

Dividing both sides by 800 isolates $\theta_1$:

$ \theta_1 * \dfrac{800}{800} = \dfrac{330}{800} $

$\theta_1 = \dfrac{330}{800} = 0.4125$
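The same one-row calculation can be sketched in Python. This is a minimal illustration that takes its numbers from the single table row above (800 in ad spending, 330 T-shirts); the variable names `ad_spend` and `tshirts` are chosen here for illustration and are not part of the lesson's notebook.

```python
# Single observation from the table: 800 in ad spending, 330 T-shirts sold.
ad_spend = 800
tshirts = 330

# Dividing both sides of theta_1 * ad_spend = tshirts by ad_spend isolates theta_1.
theta_1 = tshirts / ad_spend
print(theta_1)
```

With only one observation there is exactly one value of `theta_1` that satisfies the equation.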

### Working with Multiple Observations

Of course, the reason we can't simply use algebra for regression is that we have not just one observation but many rows of observations.

| ad spending | t-shirts |
| ----------- |:--------:|
| 800         | 330      |
| 1500        | 780      |
| 2000        | 1130     |
| 3500        | 1310     |
| 4000        | 1780     |

And we want to find *a single coefficient value* that, when multiplied by each value of our independent variable, equals the corresponding value of our dependent variable.

In [22]:
theta_1 = .4125
800*theta_1

330.0

$$800*\theta_1 = 330 $$
$$1500*\theta_1 = 780 $$
$$2000*\theta_1 = 1130 $$
$$3500*\theta_1 = 1310 $$
$$4000*\theta_1 = 1780$$
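One way to see why no single $\theta_1$ can satisfy all five equations is to solve each equation on its own and compare the results. The sketch below does this with NumPy, using the table values; the names `ad_spending`, `tshirts`, and `per_row_theta` are illustrative assumptions rather than names from the notebook.

```python
import numpy as np

# Values from the table: ad spending and T-shirts sold for each observation.
ad_spending = np.array([800, 1500, 2000, 3500, 4000])
tshirts = np.array([330, 780, 1130, 1310, 1780])

# Solve each equation separately: theta_1 = tshirts_i / ad_spending_i.
per_row_theta = tshirts / ad_spending
print(per_row_theta)

# If one theta_1 satisfied every equation, all entries would be equal.
all_equal = np.allclose(per_row_theta, per_row_theta[0])
print(all_equal)
```

Each row yields a different value (0.4125 for the first row, 0.52 for the second, and so on), so the value that fits the first observation exactly does not fit the others.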

In [31]:
a_adv = np.array([800, 1500, 2000, 3500, 4000])
a_sales = np.array([100, 200, 500, 100, 40])
theta_1 = .5
theta_2 = 1
b = np.array([330, 780, 1130, 1310, 1780])

In [32]:
a_sales*theta_2

array([100, 200, 500, 100,  40])

In [35]:
a_adv = np.array([800, 1500, 2000, 3500, 4000])
a_adv

array([ 800, 1500, 2000, 3500, 4000])

In [40]:
a_sales

array([100, 200, 500, 100,  40])

In [None]:
$a - b$

In [43]:
a_adv - a_sales

# 880, 1650

array([ 700, 1300, 1500, 3400, 3960])

In [34]:
a_adv*theta_1

array([ 400.,  750., 1000., 1750., 2000.])

In [33]:
a_adv*theta_1 + a_sales 

array([ 500.,  950., 1500., 1850., 2040.])

### A system of equations

> A **system of equations** is a collection of two or more equations with the same set of unknowns.

$$800*\theta_1 = 330 $$
$$1500*\theta_1 = 780 $$
$$2000*\theta_1 = 1130 $$
$$3500*\theta_1 = 1310 $$
$$4000*\theta_1 = 1780$$

can be represented as the following: 

$a\theta_1 = b$

Where $a$ is the vector: 

$a = \begin{pmatrix}
    800 \\
    1500 \\
    2000 \\
    3500 \\
    4000 \\
\end{pmatrix}$

$\theta_1$ is a scalar. 

and $b$ is the vector:

$b =  \begin{pmatrix}
330   \\
780 \\
1130 \\
1310 \\
1780 \\
\end{pmatrix}$ 
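The vector form $a\theta_1 = b$ maps directly onto NumPy arrays: multiplying the array `a` by a scalar scales every entry, and subtracting the result from `b` shows how far each equation is from holding. The sketch below assumes the candidate value `theta_1 = 0.5` used in an earlier cell; `predictions` and `residuals` are names introduced here for illustration.

```python
import numpy as np

# The vectors a and b from the system a * theta_1 = b.
a = np.array([800, 1500, 2000, 3500, 4000])
b = np.array([330, 780, 1130, 1310, 1780])

# A candidate scalar; 0.5 is just a guess, not a solution.
theta_1 = 0.5

# Scalar times vector gives one predicted value per observation.
predictions = a * theta_1

# A residual of zero would mean that equation holds exactly.
residuals = b - predictions
print(residuals)
```

None of the residuals is zero for this guess, which is consistent with the system having more equations than a single unknown can generally satisfy.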



