# Homework 1: Classification, Perceptrons

Here, we'll continue on the theme of "supervised learning".

<img src="../figures/ml-key-ideas.jpg" width=400>

In the tutorial, we went through all of the components of the above diagram for a regression problem.

Now we'll revisit the same steps, but for a **binary classification** example.

**Table of contents:**
- <span style="color:dimgray"> **Q1:** Data visualization </span>
- <span style="color:purple"> **Q2:** Hypothesis class: perceptron  </span>
- <span style="color:blue"> **Q3:** Loss function, binary cross entropy </span>
- <span style="color:turquoise"> **Q4:** Learning (fitting the model) </span>
- <span style="color:gold"> **Q5:** Final model</span>

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

### 0. Dataset generation

In [None]:
def get_data(N):

    assert N % 2 == 0  # the function assumes N is even (N//2 points per class)
    
    # The centers of the Gaussian blobs
    c1 = np.array([0,0])
    c2 = np.array([2,1])
    
    s1 = .333
    s2 = .45

    X1 = c1 + s1*np.random.randn(N//2,2)
    X2 = c2 + s2*np.random.randn(N//2,2)

    y1 = np.zeros(N//2)
    y2 = np.ones(N//2)

    X = np.concatenate([X1,X2],axis=0)
    y = np.hstack([y1,y2])
 
    # Shuffle the data
    idx = np.arange(N)
    np.random.shuffle(idx)

    return X[idx], y[idx]

In [None]:
X_feat,y = get_data(100)

In [None]:
X_feat[:3]

In [None]:
y[:3]

### <span style="color:dimgray"> 1. Data visualization </span>

**TO DO:** Visualize the dataset

### <span style="color:purple"> 2. Hypothesis class: perceptron  </span>

The perceptron combines a linear model with a sigmoid activation, which bounds the outputs to the interval (0,1), so we can interpret the output probabilistically.

$$f(X;w) = \sigma(X w),$$

where $\sigma(z) = \frac{1}{1+ \exp(-z)}$.
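
Two properties of the sigmoid, both of which follow directly from the definition above, may come in handy for the later questions (the gradient derivation in particular):

$$
\sigma(-z) = 1 - \sigma(z), \qquad \frac{d\sigma}{dz} = \sigma(z)\,\bigl(1 - \sigma(z)\bigr).
$$

The first identity also implies $\sigma(0) = 1/2$, i.e. a linear score of zero corresponds to a 50/50 prediction.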


**TO DO:** Code up the perceptron

In [None]:
def sigmoid(z):
    '''
    TO DO: Your code here
    '''
    return 


def perceptron(X,w):
    '''
    TO DO: Your code here
    '''
    return 

In [None]:
n,d = X_feat.shape
X = np.column_stack([np.ones(n),X_feat])

w0 = np.random.randn(d+1)

**Question:** What do you expect the accuracy to be before fitting?

**Your answer:**

**TO DO:** Test your hypothesis: what is the accuracy of your perceptron with this `w0`?

In [None]:
# Visualizing the predictions

f = perceptron(X,w0)

plt.hist(f[y==0],20,(0,1),color='r',alpha=.5,label='y=0')
plt.hist(f[y==1],20,(0,1),color='b',alpha=.5,label='y=1')
plt.legend(fontsize=15)
plt.xlabel(r'$f(X;w) = \sigma(X w)$',fontsize=15)  # raw string so '\s' is not treated as an escape
plt.ylabel('Entries',fontsize=15)
plt.show()

In [None]:
'''
TO DO: Calculate accuracy
'''

acc = None  # TO DO: fill in
print('acc is',acc)

### <span style="color:blue"> 3. Loss function, binary cross entropy </span>

Interpreting the output probabilistically, we defined the binary cross entropy (BCE) loss in class:

$$
\begin{align}
\mathcal{L} &= \frac{1}{n}\sum_i - y^{(i)} \log [ f(x^{(i)};w) ] - (1-y^{(i)}) \log [ 1-  f(x^{(i)};w)  ] \\
&= \frac{1}{n}\sum_i - y^{(i)} \log [ \sigma(w^T x^{(i)})] - (1-y^{(i)}) \log [ 1- \sigma(w^T x^{(i)}) ]
\end{align}
$$
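
To get a feel for the scale of the individual terms in the sum, here is a minimal sketch that evaluates the contribution of a single example with scalar inputs. The helper name `per_sample_bce` is hypothetical and not part of the assignment; the vectorized, averaged version is still left for the TO DO below.

```python
import math

def per_sample_bce(y_true, p):
    """Cross-entropy contribution of one example (hypothetical helper, scalar inputs).

    y_true is the label (0 or 1) and p is the predicted probability, strictly inside (0,1).
    """
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

confident_right = per_sample_bce(1, 0.9)  # label 1, prediction close to 1
confident_wrong = per_sample_bce(1, 0.1)  # label 1, prediction close to 0
print(round(confident_right, 3), round(confident_wrong, 3))
```

A confident correct prediction costs roughly 0.105, while a confident wrong one costs roughly 2.303: the loss penalizes confident mistakes much more heavily than it rewards confident hits.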

**TO DO:** Code up the BCE loss.

In [None]:
def bce(y,y_pred):
    '''
    TO DO: Implement
    '''
    return 

**Q:** What would you expect the loss to be before training?

**Tip:** It's always a good idea to ask yourself this question before training any model!

**Your answer:**

**Test your hypothesis!**

Note: if your result doesn't agree with your expectation, try initializing `w0` a few times and take the average.

(And maybe dial down the variance when initializing `w0` a bit.)

In [None]:
bce(y,f)

**Does this prediction agree with your expectations?**

### <span style="color:turquoise"> 4. Learning (fitting the model) </span>

Find the minimum of the loss function for this dataset.
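
As a reminder of the general recipe, gradient descent repeatedly applies the update $w \leftarrow w - \alpha \, \partial\mathcal{L}/\partial w$. The sketch below applies that update to a toy one-dimensional loss, $L(w) = (w-3)^2$, rather than to the BCE loss, so it only illustrates the shape of a training loop. All names (`grad_toy`, `w_toy`, `alpha_toy`, `toy_losses`) and the chosen step size are assumptions made for this illustration.

```python
def grad_toy(w):
    # Gradient of the toy loss L(w) = (w - 3)^2; it stands in for the real dL/dw
    return 2.0 * (w - 3.0)

w_toy = 0.0       # starting point (arbitrary)
alpha_toy = 0.1   # learning rate (arbitrary choice for this toy problem)
toy_losses = []

for step in range(100):
    toy_losses.append((w_toy - 3.0) ** 2)        # record the loss before the update
    w_toy = w_toy - alpha_toy * grad_toy(w_toy)  # gradient descent update

print(w_toy, toy_losses[-1])
```

Recording the loss at every step, as done here with `toy_losses`, is what makes it possible to plot the training curve and judge whether the optimization has converged.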


In [None]:
def dLdw(y,X,w):
    '''
    TO DO: Code up the formula you derived
    '''

    return 

In [None]:
losses = []

w = np.copy(w0)

alpha=.01

for i in range(100000):

    '''
    TO DO: Code up a training loop
    using the dLdw you defined above
    '''

plt.plot(losses)
plt.xlabel('Iterations',fontsize=15)
plt.ylabel('BCE Loss',fontsize=15)
# plt.xscale('log')

<span style="color:blue"> Nice! It looks like we've reached the minimum. </span>

### <span style="color:gold"> 5. Final model</span>

**To do:**
- [ ] Visualize the decision boundary (model prediction vs. $(x_0,x_1)$)
    - Hint: `np.meshgrid` and `plt.pcolormesh` might be helpful!
- [ ] Calculate the accuracy after training
- [ ] Draw the ROC curve
    - Hint: `np.add.accumulate` might be helpful!
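
If the NumPy functions named in the hints are unfamiliar, the following minimal sketch shows what they do on tiny toy inputs. It is not a solution to the tasks above: the grid, the function evaluated on it, and all variable names are made up for illustration.

```python
import numpy as np

# Build a small 2D grid: xx and yy each have shape (len(y_axis), len(x_axis))
x_axis = np.linspace(-1, 1, 3)
y_axis = np.linspace(0, 1, 2)
xx, yy = np.meshgrid(x_axis, y_axis)

# Flatten the grid into an (n_points, 2) array of coordinates, one row per grid point
grid_points = np.column_stack([xx.ravel(), yy.ravel()])

# Evaluate some function on every grid point (here simply x + y), then reshape the
# result back to the grid shape so it could be passed to plt.pcolormesh(xx, yy, zz)
zz = (grid_points[:, 0] + grid_points[:, 1]).reshape(xx.shape)

# np.add.accumulate computes a running sum (same result as np.cumsum for 1D input)
running = np.add.accumulate(np.array([1, 0, 1, 1]))

print(xx.shape, grid_points.shape, zz.shape, running)
```

The same flatten, evaluate, reshape pattern works for any function of two features, and a running sum over sorted labels is one way to count how many examples of a class lie on one side of a moving threshold.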

**How does it look?**

In [None]:
# Calculate the accuracy

acc = None  # TO DO: fill in
print('acc',acc)

In [None]:
# Calculate the ROC curve


In [None]:
# Draw the ROC curve


**What do you think?** (Sanity check: is the AUC of your ROC curve > 0.5? 😉)