# Introduction to Machine Learning Algorithms: REVIEW

### Review

* non-parametric 
    * <font color=red>kNN</font>
    * (w, b) = (X, y)
* parametric
    * $(w, b) = \dots mean(\dots X, \dots y)$
        * makes assumptions about the relationship between $X, y$
    * <font color=red>Linear Regression</font>
        * We assume a true function, 
            * $y = f(x; w, b) = wx + b$
    
* "blend"
    * neural networks, much more like non-parametric
        * remember compressed versions of data
        * (w, b) ~= compressed (X, y)
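
The split above can be sketched in a few lines. This is only an illustration on made-up toy data (the numbers and the names `nonparametric_model` / `parametric_model` are assumptions, not from the notes): the non-parametric model keeps the data, the parametric one boils it down to `(w, b)` using means.

```python
from statistics import mean

# Hypothetical toy data: one feature, one target
X = [1.0, 2.0, 3.0]
y = [3.0, 5.0, 7.0]

# Non-parametric: the "model" is the data itself
nonparametric_model = (X, y)

# Parametric: assume y = w*x + b and reduce the data to two numbers,
# here with the least-squares formulas, which are built from means
x_bar, y_bar = mean(X), mean(y)
covariance = mean((x - x_bar) * (t - y_bar) for x, t in zip(X, y))
variance = mean((x - x_bar) ** 2 for x in X)
w = covariance / variance
b = y_bar - w * x_bar
parametric_model = (w, b)
```

For this toy data the fit is $w = 2$, $b = 1$; the three rows have been replaced by two numbers.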

## $k$ Nearest Neighbors

In [5]:
X = [
    (180, 13), # movie_length, ticket_price
    (120, 14),
    (90, 9),
]

y = [ # like_film
    False,
    True,
    True
]

### The Algorithm

In [4]:
model = (X, y) # remember everything!

### The Prediction

In [6]:
x_new  = (100, 12.50)

$k = 2$ means, "find the 2-most-similar people",

In [7]:
k = 2

### Rank every point in the database by similarity to `x_new`,

How do we rank?

Take the absolute difference in runtime and the absolute difference in price, then add them up,

In [20]:
abs(100 - 180) + abs(12.50 - 13)

80.5

...this number gets bigger the more a stored row's features ($X$) differ from `x_new`, so a smaller value means "more similar".
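
This sum of absolute differences is usually called the Manhattan distance. A minimal sketch of it applied to every row, assuming the same `X` and `x_new` as above (the helper name `manhattan` is made up for illustration):

```python
def manhattan(a, b):
    """Sum of absolute differences between two equal-length feature tuples."""
    return sum(abs(ai - bi) for ai, bi in zip(a, b))

X = [(180, 13), (120, 14), (90, 9)]  # movie_length, ticket_price
x_new = (100, 12.50)

distances = [manhattan(x, x_new) for x in X]
print(distances)  # [80.5, 21.5, 13.5]
```

Note that runtime dominates the total here simply because its numbers are larger than the prices.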

In [15]:
X

[(180, 13), (120, 14), (90, 9)]

In [24]:
ranked = sorted([
    (abs(x_new[0] - runtime) + abs(x_new[1] - price), like) 
    for (runtime, price), like in zip(X, y)
])

In [26]:
ranked # (X's distance-from-x_new, y for that X entry)

[(13.5, True), (21.5, True), (80.5, False)]

### Choose $k$ points

Given they are sorted, we want the first $k$,

In [29]:
from statistics import mode

In [30]:
mode([y for rank, y in ranked[:k] ])

True

$\operatorname{mode}(y^{(1)}, \dots, y^{(k)})$ over the $k$ points with the smallest $|x_0 - x_0^{new}| + |x_1 - x_1^{new}|$
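
Putting the rank-then-vote steps together, one possible sketch as a single function. The name `knn_predict` and its signature are assumptions for illustration; the data is the same as above.

```python
from statistics import mode

def knn_predict(X, y, x_new, k):
    """Majority label among the k stored rows closest to x_new."""
    def distance(x):
        # sum of absolute feature differences
        return sum(abs(a - b) for a, b in zip(x, x_new))

    # rank (row, label) pairs by the row's distance to x_new
    ranked = sorted(zip(X, y), key=lambda pair: distance(pair[0]))
    return mode(label for _, label in ranked[:k])

X = [(180, 13), (120, 14), (90, 9)]  # movie_length, ticket_price
y = [False, True, True]              # like_film

print(knn_predict(X, y, (100, 12.50), k=2))  # True
```

With `k=1` and a query equal to the first row, `(180, 13)`, the same function returns `False`, since the closest stored row is that row itself.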

Aside, 

In [32]:
mode( like for dist, like in sorted([
    (abs(x_new[0] - runtime) + abs(x_new[1] - price), like) 
    for (runtime, price), like in zip(X, y)
])[:k])

True

Aside,

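```sql
-- MODE(y) stands for "most common value"; the exact spelling varies by database
SELECT MODE(y)
FROM (
    SELECT y
    FROM database
    ORDER BY
        ABS(x0 - xnew_0) + ABS(x1 - xnew_1)
    LIMIT 2
) AS nearest
```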

---

# Appendix: EXTRA

Recall the simple linear regression algorithm we have seen,

```python 
history = []
for w_guess in range(0, 10):
    for b_guess in range(0, 10):
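        # assumes X is a list of single numbers and y the matching targets
        predictions = [w_guess * x + b_guess for x in X]
        error = sum(abs(obs - pred) for obs, pred in zip(y, predictions))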
        
        history.append(  
            (error, (w_guess, b_guess))
        )
```

## Linear Regression

In [34]:
X = [
    (180, 13), # movie_length, ticket_price
    (120, 14),
    (90, 9),
]

y = [19.3, 13.4, 9.9] # spend on sweets

In [102]:
[0.1 * x0 + 0.1 * x1 for x0, x1 in X]

[19.3, 13.4, 9.9]

In [35]:
from random import random

In [135]:

def total_loss(w0, w1):
    """Square root of the summed squared errors for weights (w0, w1); no intercept."""
    yhats = [(w0 * x0 + w1 * x1) for x0, x1 in X]
    return sum([ (obs - pred)**2 for obs, pred in zip(y, yhats) ]) ** 0.5


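Does the loss increase if we increase `w0`? (Here `w0` and `w1` hold the current guesses; going by the cell numbers, the cell that sets them appears further down but was run first.)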

In [228]:
(total_loss(w0 + 0.05, w1) - total_loss(w0, w1))

0.10478651677013318

...yes. Dividing by the step size turns this into a rate of change, an approximate slope,

In [229]:
(total_loss(w0 + 0.05, w1) - total_loss(w0, w1)) /0.05

2.0957303354026635
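
To see what this estimate is doing, here is a small sketch on a function whose slope is known exactly. The names `forward_difference` and `square` are made up for illustration: the true slope of $w^2$ at $w = 3$ is $6$, and the estimate works out to $6 + \delta$, so it improves as the step shrinks.

```python
def forward_difference(f, w, delta):
    """Approximate slope of f at w from one extra evaluation at w + delta."""
    return (f(w + delta) - f(w)) / delta

def square(w):
    return w ** 2  # true slope is 2 * w

# the estimate approaches the true slope (6 at w = 3) as delta shrinks
estimates = [forward_difference(square, 3.0, delta) for delta in (0.5, 0.05, 0.005)]
print(estimates)
```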

We can use this information to update `w0`: if the loss goes up when `w0` goes up, move `w0` down a little (and the other way round),

In [221]:


w0, w1 = random(), random()  # random starting guesses
history = [(total_loss(w0, w1), (w0, w1))]

while True:
    # approximate slope of the loss in each weight (step size 0.05)
    dir_w0 = (total_loss(w0 + 0.05, w1) - total_loss(w0, w1)) / 0.05
    dir_w1 = (total_loss(w0, w1 + 0.05) - total_loss(w0, w1)) / 0.05

    # step against the slope, scaled by a small learning rate
    w0 -= 0.0001 * dir_w0
    w1 -= 0.0001 * dir_w1

    history.append((total_loss(w0, w1), (w0, w1)))

    last_error = history[-2][0]
    this_error = history[-1][0]

    # stop once the relative change in the loss is small
    diff = abs(this_error - last_error) / (this_error + last_error)
    if diff < 0.01:
        break

In [222]:
history[0]

(17.872459176422442, (0.18277138251914227, 0.026034501047746184))

In [223]:
min(history)

(1.3288105175012592, (0.11249841408694664, 0.019881661605428))

In [224]:
error, (w0_best, w1_best) = min(history)

In [225]:
y

[19.3, 13.4, 9.9]

In [226]:
[ round(w0_best * x0 + w1_best * x1, 1) for x0, x1 in X ]

[20.5, 13.8, 10.3]

---

The update rule is, 

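$ w_0^{next} = w_0^{prev} - \lambda \frac{dL(w_0)}{dw_0}$

where $\lambda$ is the learning rate (`0.0001` in the loop above) and the slope is approximated with a small step $\delta$ (`0.05` above),

$ \frac{dL(w_0)}{dw_0} \approx \frac{L(w_0 + \delta) - L(w_0)}{\delta}$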
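
The two formulas can be tried on a one-weight toy problem. This is a sketch under assumptions, not the notebook's code: the loss $(w - 3)^2$, the function names, and the settings `learning_rate=0.1`, `steps=100` are all made up for illustration.

```python
def loss(w):
    return (w - 3.0) ** 2  # hypothetical loss with its minimum at w = 3

def slope(f, w, delta=0.05):
    """Forward-difference approximation of dL/dw."""
    return (f(w + delta) - f(w)) / delta

def descend(f, w, learning_rate=0.1, steps=100):
    """Repeat the update rule: w_next = w_prev - learning_rate * slope."""
    for _ in range(steps):
        w = w - learning_rate * slope(f, w)
    return w

w_final = descend(loss, w=0.0)
print(w_final)  # close to 3
```

The result settles near $2.975$ rather than exactly $3$: the forward difference overestimates the slope of this loss by $\delta$, which shifts the resting point by $\delta / 2$.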