# KNeighborsClassifier with scikit-learn

![Creative Commons License](https://i.creativecommons.org/l/by/4.0/88x31.png)  
This work by Jephian Lin is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

In [1]:
import numpy as np
import matplotlib.pyplot as plt

## Code
```python
from sklearn.neighbors import KNeighborsClassifier
model = KNeighborsClassifier(<parameters>)
model.fit(X, y)
y_new = model.predict(X_test)
```

[Official Reference](https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html#sklearn.neighbors.KNeighborsClassifier.kneighbors)
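
As a concrete instance of the template above, here is a minimal sketch on a tiny made-up dataset. The arrays `X`, `y`, `X_test` and the choice `n_neighbors=3` are illustrative assumptions, not part of the original notes.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data (made up for illustration): two well-separated groups on a line
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

# Each test point receives the majority label among its 3 nearest training points
X_test = np.array([[1.5], [10.5]])
y_new = model.predict(X_test)
print(y_new)
```

Here `1.5` sits inside the first group and `10.5` inside the second, so the two predictions should be the labels of those groups.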

## Parameters
- `n_neighbors`: number of neighbors that vote (default `5`); when predicting on the training data, each point counts as one of its own neighbors
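
The role of `n_neighbors`, and the fact that a training point is counted among its own neighbors, can be checked with a minimal sketch. The toy data and the two "noisy" labels are assumptions made for illustration only.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data (made up): the points at 2.0 and 12.0 carry labels that
# disagree with their surroundings
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y = np.array([0, 0, 1, 1, 1, 0])

# With one neighbor, every training point is its own nearest neighbor
model1 = KNeighborsClassifier(n_neighbors=1).fit(X, y)
dist, ind = model1.kneighbors(X)
print(ind.ravel())    # index of the nearest training point for each row of X
print(dist.ravel())   # the corresponding distances
score1 = model1.score(X, y)

# With three neighbors, the two noisy points are outvoted by the points around them
model3 = KNeighborsClassifier(n_neighbors=3).fit(X, y)
score3 = model3.score(X, y)
print(score1, score3)
```

A perfect training score with `n_neighbors=1` therefore says little about how the model behaves on new data.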

## Attributes
- `classes_`: an array of shape `(n_classes,)` holding the class labels seen during `fit`  
(usually `0, ..., n_classes-1`)
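
The labels need not be integers. The sketch below uses made-up string labels to show that `classes_` stores the sorted distinct labels, and that the columns of `predict_proba` follow the same order. The data and label names are assumptions for illustration.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Toy data with string labels (made up for illustration)
X = np.array([[0.0], [1.0], [10.0], [11.0]])
y = np.array(["cat", "cat", "dog", "dog"])

model = KNeighborsClassifier(n_neighbors=1).fit(X, y)
print(model.classes_)

# One row per query point, one column per entry of model.classes_
proba = model.predict_proba(np.array([[0.5]]))
print(proba)
```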

## Sample data

##### Exercise 1
Let  
```python
mu1 = np.array([2.5,0])
cov1 = np.array([[1.1,-1],
                [-1,1.1]])
mu2 = np.array([-2.5,0])
cov2 = np.array([[1.1,1],
                [1,1.1]])
X = np.vstack([np.random.multivariate_normal(mu1, cov1, 100), 
               np.random.multivariate_normal(mu2, cov2, 100)])
y = np.array([0]*100 + [1]*100)
```

###### 1(a)
Use `plt.scatter` to plot the points in `X` , colored by `y` .  
Fit a `KNeighborsClassifier` to `X` and `y` , then print `model.classes_` .  
Can you guess these values by the definition of `y` ?

In [None]:
### your answer here

###### 1(b)
Redo 1(a) with the setting `n_neighbors=1` .

In [None]:
### your answer here

In [7]:
KNeighborsClassifier?

In [2]:
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target

In [4]:
from sklearn.neighbors import KNeighborsClassifier
model = KNeighborsClassifier()
model.fit(X, y)
y_new = model.predict(X)

In [8]:
model.classes_

array([0, 1, 2])
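
The cells above predict on the same data the model was fitted to. A minimal sketch of a held-out evaluation follows; the use of `train_test_split` and the values `test_size=0.3`, `random_state=0` are assumptions chosen for illustration, not settings from the original notes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()

# Hold out 30% of the samples, keeping the class proportions the same
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0, stratify=iris.target)

model = KNeighborsClassifier()  # default n_neighbors=5
model.fit(X_train, y_train)

# score() returns the fraction of correctly classified samples
train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(train_acc, test_acc)
```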

##### Exercise 2
Let  
```python
from sklearn.preprocessing import PolynomialFeatures

x1 = np.arange(5)
X = np.vstack([x1]).T

model = PolynomialFeatures(degree=3, include_bias=False)
X_ex = model.fit_transform(X)
```

###### 2(a)
Understand the relation between `X` and `X_ex` .  
Can you generate `X_ex` by broadcasting instead of using `PolynomialFeatures` ?

In [None]:
### your answer here

###### 2(b)
Switch the setting to `include_bias=True` .  
Understand the relation between `X` and `X_ex` .  

In [None]:
### your answer here

###### 2(c)
Let  
```python
x1 = np.arange(5)
x2 = np.arange(5,10)
X = np.vstack([x1,x2]).T

model = PolynomialFeatures(degree=2, include_bias=False)
X_ex = model.fit_transform(X)
```
Print `model.powers_` and understand the relation between `X` and `X_ex` .  

In [None]:
### your answer here

##### Exercise 3
Let  
```python
r = 100 * np.random.rand(100)
area = 4*np.pi*r**2 + 0.5*np.random.randn(100)
```
be a collection of data of 100 balls,  
where `r` stores the radii and  
`area` stores the surface areas.  
Suppose you know nothing about the formula for the surface area of a sphere.  
How would you guess their relation?

In [None]:
### your answer here

##### Exercise 4
Let  
```python
r = 100 * np.random.rand(100)
volume = 4/3*np.pi*r**3 + 0.5*np.random.randn(100)
```
be a collection of data of 100 balls,  
where `r` stores the radii and  
`volume` stores the volumes.  
Suppose you know nothing about the formula for the volume of a sphere.  
How would you guess their relation?

In [None]:
### your answer here

## Experiments

##### Exercise 5
Let  
```python
x = np.arange(10)
y = 0.1*x**2 + 0.2*x + 0.3 + 0.5*np.random.randn(10)
X = x[:,np.newaxis]
x_test = np.linspace(0,10,20)
X_test = x_test[:,np.newaxis]
```
For `k = 0, ..., 4`, run the polynomial regression model with `degree=k`.  
Let `scores` be a list storing their scores.  
Plot the scores.  
Which degree is an appropriate guess?

In [None]:
### your answer here

##### Exercise 6
Let  
```python
x = np.arange(10)
y = 0.1*x**2 + 0.2*x + 0.3 + 0.5*np.random.randn(10)
X = x[:,np.newaxis]

model = PolynomialRegression(2)
model.fit(X, y)
y_new = model.predict(X)

a0 = model[1].intercept_
a1,a2 = model[1].coef_
```
The prediction `y_new` is supposed to be the same as `a0 + a1*x + a2*x**2` .  
Check if it is true.
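
The snippet above calls `PolynomialRegression` , which this notebook does not define. A minimal sketch follows, assuming it is a pipeline of `PolynomialFeatures` and `LinearRegression` ; the helper body and the choice `include_bias=False` are assumptions, picked so that `model[1].coef_` has exactly two entries, as the unpacking `a1,a2 = model[1].coef_` expects.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

def PolynomialRegression(degree=2, **kwargs):
    """Hypothetical helper: polynomial features followed by a linear fit."""
    # include_bias=False leaves the constant term to LinearRegression,
    # so for one input feature coef_ has `degree` entries (requires degree >= 1)
    return make_pipeline(PolynomialFeatures(degree, include_bias=False),
                         LinearRegression(**kwargs))

x = np.arange(10)
X = x[:, np.newaxis]
y = 0.1*x**2 + 0.2*x + 0.3 + 0.5*np.random.randn(10)

model = PolynomialRegression(2).fit(X, y)
print(model[1].coef_.shape)   # model[1] is the LinearRegression step
```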

In [None]:
### your answer here