## Neural Networks

- Neural networks transform one set of numbers into another
- The input numbers are the predictors
- The output number is the target

## Example: NBA MVP

- Predictors: PTS, AST
- Target: MVP votes

In [40]:
import pandas as pd
mvp = pd.read_csv("mvps_small.csv", index_col=0)
mvp.head(2)

| | Player | Year | PTS | AST | MVP Votes |
|---|---|---|---|---|---|
| 5562 | Stephen Curry | 2016 | 30.1 | 6.7 | 1310.0 |
| 6300 | Kevin Durant | 2014 | 32.0 | 5.5 | 1232.0 |


## Linear Regression

- $\hat{y} = m_{1}x_{1} + m_{2}x_{2} + b$
- $Votes = m_{1} * PTS + m_{2} * AST + b$
- The algorithm finds the $m_{1}$, $m_{2}$, and $b$ that minimize the squared error between the predicted and actual votes

In [42]:
from sklearn.linear_model import LinearRegression

predictors = mvp[["PTS", "AST"]]
target = mvp["MVP Votes"]

lr = LinearRegression()
lr.fit(predictors, target)
print(f"y = {lr.coef_[0]} * PTS + {lr.coef_[1]} * AST + {lr.intercept_}")

y = 3.040392304416997 * PTS + 0.9527168333392471 * AST + -20.261659338283422


In [43]:
print(lr.coef_[0] * 30.1 + lr.coef_[1] * 6.7 + lr.intercept_)

lr.predict(predictors.head(1))

77.63735180804115


array([77.63735181])

## Neural networks extend linear regression

- Non-linearity
- Layers
- Multiple units

## Non-linearity

* ReLU (rectified linear unit) function: negative values become 0, positive values pass through unchanged
* $\max(0, m_{1}x_{1} + m_{2}x_{2} + b)$
* $relu(m_{1}x_{1} + m_{2}x_{2} + b)$
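A minimal sketch of ReLU applied to a single linear unit. The weights and bias here are hypothetical values chosen so that the linear output is negative; they are not the fitted regression coefficients.

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x): negatives become 0, positives are unchanged.
    return np.maximum(0, x)

# Hypothetical weights and bias, for illustration only.
m1, m2, b = 3.0, 1.0, -100.0

# Stephen Curry's 2016 predictors from the table above.
pts, ast = 30.1, 6.7

linear_output = m1 * pts + m2 * ast + b  # slightly below zero with these values
activated = relu(linear_output)          # clipped to 0 by ReLU

print(linear_output, activated)
```

With a less negative bias the linear output would be positive and ReLU would return it unchanged.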

## Layers

* $\hat{y} = m_{3} * relu(m_{1}x_{1} + m_{2}x_{2} + b_{1}) + b_{2}$
* $\hat{y} = m_{5} * relu(m_{1}x_{1} + m_{2}x_{2} + b_{1}) + m_{6} * relu(m_{3}x_{1} + m_{4}x_{2} + b_{2}) + b_{3}$
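The two-unit network can be sketched as a forward pass in NumPy. This assumes one hidden layer with two ReLU units feeding a single output; every parameter value below is hypothetical, and the matrix names `W1`, `b1`, `W2`, `b2` are illustrative groupings of the individual weights and biases.

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Hypothetical parameters, for illustration only.
W1 = np.array([[0.5, 0.2],     # hidden unit 1: weights on PTS and AST
               [-0.1, 0.8]])   # hidden unit 2: weights on PTS and AST
b1 = np.array([-10.0, 1.0])    # one bias per hidden unit
W2 = np.array([2.0, 3.0])      # output weight for each hidden unit
b2 = 4.0                       # single output bias

x = np.array([30.1, 6.7])      # PTS and AST for one player

hidden = relu(W1 @ x + b1)     # layer 1: two linear units followed by ReLU
y_hat = W2 @ hidden + b2       # layer 2: weighted sum of the hidden units

print(hidden, y_hat)
```

Each row of `W1` plays the role of one hidden unit, so adding units means adding rows to `W1` and entries to `b1` and `W2`.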

<div>
<img src="img/nn500.png" align="left"/>
</div>


## Gradient Descent

* $\hat{y} = W_{2} * relu(W_{1}X + b_{1}) + b_{2}$
* $(y - \hat{y})^2$ gives us the squared error between the target and the prediction.
* Gradient descent searches for the values of $W_{1}$, $W_{2}$, $b_{1}$, and $b_{2}$ that produce the least error.
* It does this iteratively: we check whether the current predictions are higher or lower than the target, then adjust the parameters slightly to lower or raise the predictions.
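A rough sketch of gradient descent for the one-hidden-layer network above, with gradients written out by hand. The data is synthetic (not the MVP dataset), and the hidden size, learning rate, and step count are arbitrary assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 2 predictors, 50 examples stored as columns.
X = rng.normal(size=(2, 50))
y = 3.0 * X[0] + 1.0 * X[1] - 2.0   # made-up target

# Randomly initialized parameters; 4 hidden units is an arbitrary choice.
W1 = rng.normal(scale=0.5, size=(4, 2))
b1 = np.zeros((4, 1))
W2 = rng.normal(scale=0.5, size=(1, 4))
b2 = np.zeros((1, 1))

learning_rate = 0.01
n = X.shape[1]
losses = []

for step in range(200):
    # Forward pass: y_hat = W2 * relu(W1 X + b1) + b2
    z = W1 @ X + b1
    h = np.maximum(0, z)
    y_hat = W2 @ h + b2

    # Positive error means the prediction is too high, negative means too low.
    err = y_hat - y
    losses.append(np.mean(err ** 2))

    # Backward pass: gradient of the mean squared error for each parameter.
    d_yhat = 2 * err / n
    dW2 = d_yhat @ h.T
    db2 = d_yhat.sum(axis=1, keepdims=True)
    dz = (W2.T @ d_yhat) * (z > 0)   # ReLU passes gradient only where z > 0
    dW1 = dz @ X.T
    db1 = dz.sum(axis=1, keepdims=True)

    # Move each parameter a small step in the direction that lowers the error.
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

In practice a library such as PyTorch computes these gradients automatically; writing them by hand here only makes the update rule visible.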

<div>
<img src="img/descent.png" align="left"/>
</div>
