## the absolute trick

$y = w_{1}x + w_{2}$ is the equation of a line, where $w_{1}$ is the slope and $w_{2}$ is the y-intercept, i.e. the point where the line crosses the y-axis. 
- if we add to or subtract from $w_{1}$, the slope increases or decreases; similarly, if we add to or subtract from $w_{2}$, the y-intercept (and with it the whole line) moves up or down. 
- if we have some point $(p,q)$ and wish to move the line closer to it, we: 
    1. add a small value $\alpha$ to $w_{2}$ 
    2. add $p\alpha$ to $w_{1}$
    which in turn gives us the new equation 
    $y = \left(w_{1} + p\alpha\right)x + (w_{2} + \alpha)$
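A minimal sketch of one absolute-trick step in plain Python. The function name `absolute_trick` and the sample numbers are illustrative. The notes only show the "add" case; subtracting when the point lies below the line is an assumption about how the trick is meant to be applied.

```python
def absolute_trick(w1, w2, p, q, alpha):
    """Move the line y = w1*x + w2 one small step toward the point (p, q)."""
    q_line = w1 * p + w2  # height of the line at x = p
    if q > q_line:
        # point is above the line: add, as in the steps above
        return w1 + p * alpha, w2 + alpha
    if q < q_line:
        # assumption: point is below the line, so subtract instead
        return w1 - p * alpha, w2 - alpha
    return w1, w2  # point is already on the line


# hypothetical example: the line y = 2x + 3 and the point (5, 15)
w1, w2 = absolute_trick(w1=2.0, w2=3.0, p=5.0, q=15.0, alpha=0.01)
print(w1, w2)
```

Note that the size of the step depends only on $p$ and $\alpha$, not on how far the point is from the line.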

## square trick 

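$y = \left(w_{1} + p(q-q')\alpha\right)x + (w_{2} + (q-q')\alpha)$ 

here $q'$ is the value of the line at $x = p$. Scaling the step by $(q-q')$ moves the line faster towards the point $(p,q)$ when it is far away, and the sign of $(q-q')$ picks the direction automatically.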
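The square-trick update can be sketched the same way. Here `square_trick` and `q_line` (standing in for $q'$) are illustrative names, and the numbers are made up for the example.

```python
def square_trick(w1, w2, p, q, alpha):
    """One square-trick step for the line y = w1*x + w2 and the point (p, q)."""
    q_line = w1 * p + w2   # q' : the line's value at x = p
    diff = q - q_line      # signed vertical distance from the line to the point
    return w1 + p * diff * alpha, w2 + diff * alpha


# hypothetical example: the line y = 2x + 3 and the point (5, 15)
new_w1, new_w2 = square_trick(w1=2.0, w2=3.0, p=5.0, q=15.0, alpha=0.01)
print(new_w1, new_w2)
```

Because `diff` is signed, no separate case is needed for points below the line.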

## gradient descent

- take the partial derivative $\frac{\partial}{\partial w_{i}}$ of the error function with respect to each weight. The gradient points in the direction in which the function increases the most, so we take its negative and move a small step: $w_{i} \to w_{i} - \alpha \frac{\partial}{\partial w_{i}} Error$
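As a rough sketch, the update rule can be applied to a one-variable line fit. The error used here, $\frac{1}{2m}\sum (y - \hat{y})^2$, is an assumed choice (the notes do not fix one at this point), and the data, step size and step count are made up for illustration.

```python
def gradient_descent(xs, ys, alpha=0.05, steps=5000):
    """Fit y = w1*x + w2 by gradient descent on (1/2m) * sum((y - y_hat)^2)."""
    w1, w2 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        residuals = [y - (w1 * x + w2) for x, y in zip(xs, ys)]
        # partial derivatives of the assumed error with respect to w1 and w2
        grad_w1 = -sum(r * x for r, x in zip(residuals, xs)) / m
        grad_w2 = -sum(residuals) / m
        # step against the gradient: w_i -> w_i - alpha * dError/dw_i
        w1 -= alpha * grad_w1
        w2 -= alpha * grad_w2
    return w1, w2


# hypothetical data lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
fit_w1, fit_w2 = gradient_descent(xs, ys)
print(round(fit_w1, 3), round(fit_w2, 3))
```

With this data the weights should settle close to slope 2 and intercept 1; a step size that is too large would make the updates diverge instead.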

## error function
 - mean absolute error, mean squared error 
 
![absmeanerror](img/abs_mean_error.png)
 
![meansquareerror](img/mean_squared_error.png)

we want to minimize the mean squared error or the mean absolute error 
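Both errors are easy to compute by hand. The sketch below uses the plain averages $\frac{1}{m}\sum|y-\hat{y}|$ and $\frac{1}{m}\sum(y-\hat{y})^2$; some courses put an extra factor of $\frac{1}{2}$ in front of the squared error, so the exact scaling here is an assumption. The sample values are made up.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |y - y_hat| over all points."""
    return sum(abs(y - yh) for y, yh in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of (y - y_hat)^2 over all points (no extra 1/2 factor)."""
    return sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / len(y_true)


# hypothetical targets and predictions
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
print(mae, mse)
```

The squared error punishes the one large miss (off by 2) much more than the small one (off by 1), which the absolute error does not.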

In [49]:
# use BMI data to predict the life expectancy for a BMI of 21.07931

import pandas as pd
import os
from sklearn.linear_model import LinearRegression 

In [62]:
bmi_path = os.path.join(os.path.pardir, 'data', 'bmi_and_life_expectancy.csv')
# if the csv had no header row, you would pass column names with names=['list', 'of', 'names']
bmi_data = pd.read_csv(bmi_path)
bmi_data.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 163 entries, 0 to 162
Data columns (total 3 columns):
Country            163 non-null object
Life expectancy    163 non-null float64
BMI                163 non-null float64
dtypes: float64(2), object(1)
memory usage: 3.9+ KB


In [69]:
lr_model = LinearRegression()
# fit() expects a 2D feature array; with a single feature we reshape to (n_samples, 1)
lr_model.fit(bmi_data.BMI.values.reshape(-1, 1), bmi_data['Life expectancy'])

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

In [70]:
# predict() also expects a 2D array: one row per sample, one column per feature
lr_model.predict([[21.07931]])

array([60.31564716])

## higher dimensions 

|   |$x_{1}$ | $x_{2}$|...|$x_{n}$ |$\hat{y}$  |
|---|---|---|---|---|---| 
|   |size| school quality| ... | rooms | price|
|$house_{1}$|900| 4 | ... | 10| 9,000,000|
|$house_{n}$| 1000 | 7| ... | 11 | 11,000,000| 

in an n-dimensional space (n-1 features plus the predicted value), the prediction is an (n-1)-dimensional hyperplane

$$ \hat{y} = w_{1}x_{1} + \dots + w_{n-1}x_{n-1} + w_{n} $$
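The prediction formula is a weighted sum of the features plus an intercept. A minimal sketch, where `predict`, `weights` and `intercept` are illustrative names and the numbers are made up:

```python
def predict(weights, intercept, features):
    """y_hat = w_1*x_1 + ... + w_{n-1}*x_{n-1} + w_n, with w_n passed as `intercept`."""
    return sum(w * x for w, x in zip(weights, features)) + intercept


# hypothetical model with two features
y_hat = predict(weights=[2.0, 0.5], intercept=1.0, features=[3.0, 4.0])
print(y_hat)
```

This is the same computation a fitted `LinearRegression` performs with its `coef_` and `intercept_` attributes.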

In [99]:
# use the housing data to predict the price
# https://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.names
import pandas as pd
import os
from sklearn.linear_model import LinearRegression 

In [79]:
housing_path = os.path.join(os.path.pardir, 'data', 'housing_data.csv')
names = ['crim', 'zn', 'indus', 'chas', 'nox', 'rm', 'age', 'dist', 'rad', 'tax', 'ptratio', 'b', 'lstat', 'medv']
#len(names)
# the file uses whitespace as a delimiter, so we pass a whitespace regex as the separator
# (delim_whitespace=True does the same but is deprecated in recent pandas versions)
# import data
df = pd.read_csv(housing_path, names=names, sep=r'\s+')
#df.info()

In [104]:
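# split the data into target and features
# the y values (target): medv, the median home value
df_price = df.loc[:,'medv']
# the x_1 ... x_13 values (features): the first 13 columns
df_house_data = df.iloc[:,0:13]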
df_house_data.info()

model = LinearRegression()
model.fit(df_house_data, df_price)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 506 entries, 0 to 505
Data columns (total 13 columns):
crim       506 non-null float64
zn         506 non-null float64
indus      506 non-null float64
chas       506 non-null int64
nox        506 non-null float64
rm         506 non-null float64
age        506 non-null float64
dist       506 non-null float64
rad        506 non-null int64
tax        506 non-null float64
ptratio    506 non-null float64
b          506 non-null float64
lstat      506 non-null float64
dtypes: float64(11), int64(2)
memory usage: 51.5 KB


In [105]:
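# score() returns the coefficient of determination R^2 (here measured on the training data)
print("the score: {0:.4f}".format(model.score(df_house_data, df_price)))
# feature values of the house whose price we want to predict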
sample_house = [[2.29690000e-01, 0.00000000e+00, 1.05900000e+01, 0.00000000e+00, 4.89000000e-01,
                6.32600000e+00, 5.25000000e+01, 4.35490000e+00, 4.00000000e+00, 2.77000000e+02,
                1.86000000e+01, 3.94870000e+02, 1.09700000e+01]]
model.predict(sample_house)

the score: 0.7406


array([23.68284712])

### regularization
- l1: computationally inefficient, sparse outputs, feature selection
- l2: computationally efficient, non-sparse outputs, no feature selection 
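For reference, a sketch of what the two penalties look like when added to an error value. The helper names, the strength parameter `lam` and the sample weights are all illustrative assumptions; the intercept is assumed to be left out of `weights`, as is common practice.

```python
def l1_penalty(weights):
    """Sum of absolute weights; tends to push some weights to exactly zero."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """Sum of squared weights; shrinks weights but rarely makes them exactly zero."""
    return sum(w * w for w in weights)


def regularized_error(error, weights, lam, kind="l2"):
    """Add lam times the chosen penalty to an already computed error value."""
    penalty = l1_penalty(weights) if kind == "l1" else l2_penalty(weights)
    return error + lam * penalty


# hypothetical weights and error
weights = [1.0, -2.0, 0.0]
print(l1_penalty(weights), l2_penalty(weights))
print(regularized_error(0.5, weights, lam=0.1, kind="l1"))
```

In scikit-learn the corresponding models are `Lasso` (l1) and `Ridge` (l2) in `sklearn.linear_model`.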