
# Keras Basics
We will learn about
* Dense layers
* Categorical cross-entropy

A toy example showing how to train a classifier with Keras and then use it. The data are drawn from three 2-D Gaussian distributions, one per class.

In [0]:
## DATA GENERATION
import numpy as np

def generateX(cls):
    '''
    Inputs:
        cls: class {0, 1, 2}
    Outputs:
        x: a sample from cls; a np array of shape (2,)
    '''
    assert cls in [0,1,2]
    if cls==0:
        x = np.random.normal(np.array([0,0]),100)
    elif cls==1:
        x = np.random.normal(np.array([200,200]),100)
    elif cls==2:
        x = np.random.normal(np.array([-200,200]),100)
    return x
Nx = 2 # shape of a sample is (2,)
Ny = 3 # 3 classes

Could you write a function to generate N samples from each of the three classes?

In [0]:
def generateXY(N):
    '''
    Inputs:
        N: no. of samples of each class
    Outputs:
        X: np array of samples; shape = (3*N, 2)
        Y: np array of class labels; shape = (3*N, 1)
    '''
    # YOUR CODE HERE
    classes = [0, 1, 2]
    X = np.empty((3*N, 2))
    Y = np.empty((3*N, 1))
    for c in classes:
        for i in range(N):
            # Samples of class c occupy rows c*N .. (c+1)*N - 1
            X[i + c*N] = generateX(c)
            Y[i + c*N] = c
    return X, Y

In [106]:
def test_generateXY():
    X_train, Y_train = generateXY(50)
    assert X_train.shape==(150,2)
    assert Y_train.shape==(150,1)
    print('Test passed', '\U0001F44D')
test_generateXY()

Test passed 👍
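The nested loops in generateXY can also be written without explicit Python loops. The following is a minimal sketch, not part of the original assignment: the name `generate_xy_vectorized`, the `seed` argument, and the use of `np.random.default_rng` (instead of the global `np.random.normal`) are assumptions made for illustration. The class means and the standard deviation of 100 are copied from generateX.

```python
import numpy as np

# Class means copied from generateX; every class uses a standard deviation of 100.
CLASS_MEANS = np.array([[0, 0], [200, 200], [-200, 200]], dtype=float)

def generate_xy_vectorized(N, seed=None):
    """Hypothetical vectorized variant of generateXY (illustration only)."""
    rng = np.random.default_rng(seed)
    # Labels laid out as N zeros, then N ones, then N twos.
    labels = np.repeat(np.arange(len(CLASS_MEANS)), N)
    # Indexing the means by label gives one mean per row; normal() broadcasts over it.
    X = rng.normal(loc=CLASS_MEANS[labels], scale=100.0)
    Y = labels.reshape(-1, 1).astype(float)
    return X, Y

X_demo, Y_demo = generate_xy_vectorized(50, seed=0)
print(X_demo.shape, Y_demo.shape)  # (150, 2) (150, 1)
```

The row layout matches generateXY, so the result should be usable wherever X and Y from generateXY are expected.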


### One-hot encoding

Now each row of Y is a label of the form [0], [1] or [2]. We want to convert these to [1,0,0], [0,1,0] and [0,0,1], respectively. 
Could you write code that converts a single label y into its one-hot encoding (a vector with 3 entries)?

In [0]:
def oneHot(y, Ny):
    '''
    Input:
        y: an int in {0, 1, 2}
        Ny: Number of classes, e.g., 3 here.
    Output:
        Y: a one-hot vector of length Ny (=3)
    '''
    # YOUR CODE HERE
    y = int(y)
    y_onehot = np.eye(Ny)
    Y = y_onehot[y]
    return Y


In [108]:
def test_oneHot():
    assert np.all(oneHot(0,3)==np.array([1,0,0]))
    assert np.all(oneHot(1,3)==np.array([0,1,0]))
    assert np.all(oneHot(2,3)==np.array([0,0,1]))
    print('Test passed', '\U0001F44D')
test_oneHot()

Test passed 👍
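oneHot converts one label at a time, so encoding a whole label column needs a loop. As a hedged sketch, the same row-lookup trick on `np.eye(Ny)` can encode all labels at once; the helper name `one_hot_batch` is hypothetical and not part of the assignment.

```python
import numpy as np

def one_hot_batch(Y, Ny):
    """Hypothetical helper: one-hot encode a whole column of labels at once."""
    # Flatten the (n, 1) label column and cast to int so it can index rows.
    labels = np.asarray(Y).astype(int).ravel()
    # Row k of the identity matrix is the one-hot vector for class k.
    return np.eye(Ny)[labels]

Y_demo = np.array([[0], [2], [1], [2]])
print(one_hot_batch(Y_demo, 3))
```

Each output row should equal what oneHot returns for the corresponding label.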


### Input Normalization
The components of X are unbounded and here span several hundred units. We rescale them to a narrow range around zero, which makes training more stable and hyper-parameter tuning easier.
This is normally done by shifting and scaling each feature of X to have zero mean and unit standard deviation (std).

$X \leftarrow \frac{X-\mathrm{mean}(X)}{\mathrm{std}(X)}$, where the subtraction and division are element-wise and the mean and std are computed per feature (column)

Could you use training samples to find mean and std, and normalize your X_train with that?

In [0]:
def findMeanStddev(X):
    '''
    Input: 
        X: a matrix of size (no. of samples, dimension of each sample)
    Output:
        mean: mean of samples in X; shape is (dimension of each sample,)
        stddev: element-wise std dev of sample in X; shape is (dimension of each sample,)
    '''
    # YOUR CODE HERE
    mean = np.sum(X,axis = 0)/X.shape[0]
    X1 = X - mean
    stddev = np.sqrt(np.sum(X1*X1,axis = 0)/X.shape[0])
    #print(mean,stddev)
    return mean, stddev

In [110]:
def test_findMeanStddev():
    X = np.array([[3,2,6],[7,4,2],[3,5,1]])
    mean, stddev = findMeanStddev(X)
    assert np.isclose(mean, np.array([4.33, 3.66, 3.]), atol=0.1).all()
    assert np.isclose(stddev, np.array([1.88, 1.24, 2.16]), atol=0.1).all()
    print('Test passed', '\U0001F44D')
test_findMeanStddev()

Test passed 👍


In [0]:
def normalizeX(X, mean, stddev):
    '''
    Input:
        X: a matrix of size (no. of samples, dimension of each sample)
        mean: per-feature mean, e.g. from findMeanStddev; shape is (dimension of each sample,)
        stddev: per-feature std dev, e.g. from findMeanStddev; shape is (dimension of each sample,)
    Output:
        Xn: X modified to have 0 mean and 1 std dev
    '''
    # YOUR CODE HERE
    Xn = (X - mean)/(stddev + 1e-8)
    return Xn

In [112]:
def test_normalizeX():
    X = np.ones((3,3))
    m,s = findMeanStddev(X)
    assert np.all(m==np.ones(3))
    assert np.all(s==np.zeros(3))
    assert np.all(normalizeX(X,m,s)==0*X)
    # test on random X
    X = np.random.random((5,3))
    m,s = findMeanStddev(X)
    Xn = normalizeX(X,m,s)
    mn, sn = findMeanStddev(Xn)
    assert np.allclose(mn, np.zeros(3))
    assert np.allclose(sn, np.ones(3))
    print('Test passed', '\U0001F44D')
test_normalizeX()

Test passed 👍
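As a small illustration of the normalization formula, the sketch below applies training-set statistics to both a training set and a separate test set. The synthetic data, the variable names, and the use of `np.random.default_rng` are assumptions for this example only. `X.mean(axis=0)` and `X.std(axis=0)` (population std, `ddof=0`) compute the same quantities as findMeanStddev.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for training and test data (illustration only).
X_train = rng.normal(loc=[0.0, 200.0], scale=100.0, size=(150, 2))
X_test = rng.normal(loc=[0.0, 200.0], scale=100.0, size=(30, 2))

# Statistics come from the training set only.
mean_train = X_train.mean(axis=0)
stddev_train = X_train.std(axis=0)  # ddof=0, as in findMeanStddev

# The same shift and scale are applied to both sets.
Xn_train = (X_train - mean_train) / (stddev_train + 1e-8)
Xn_test = (X_test - mean_train) / (stddev_train + 1e-8)

print(Xn_train.mean(axis=0), Xn_train.std(axis=0))  # approximately 0 and 1
print(Xn_test.mean(axis=0), Xn_test.std(axis=0))    # close to, but not exactly, 0 and 1
```

The test set ends up only approximately standardized, which is expected: it is mapped through the same transformation the network saw during training.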


### Plotting
Could you plot all the samples in X_train with different colors for different classes?

In [0]:
import matplotlib.pyplot as plt
colors = ['b', 'g', 'r', 'c', 'm', 'y', 'k']
def plotXY(X, Y):
    '''
    Inputs:
        X: a matrix of size (no. of samples, dimension of each sample)
        Y: a matrix of size (no. of samples, no. of classes) - these are one-hot vectors
    Action:
        Plots the samples in X, their color depends on Y
    '''
    Ny = Y.shape[1]
    for cls in range(Ny):
        idx = np.where(Y[:,cls]==1)[0]
        plt.plot(X[idx,0], X[idx,1], colors[cls]+'.')


## Creating the Network
We now create the network with dense layers: 
$y = f(Wx + b)$

ReLU activation: 
$f(h) = \max(0, h)$, i.e. $f(h) = h$ for $h > 0$ and $f(h) = 0$ for $h \le 0$

Softmax activation: 
$f(h_i) = \frac{\exp(h_i)}{\sum_j \exp(h_j)}$

Categorical cross-entropy loss:
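$\mathcal{L} = -\sum_t y^d_t \log y_t$, where $y^d_t$ is the desired (one-hot) output and $y_t$ is the predicted probability for class $t$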

Stochastic Gradient Descent:
$w_{ij} \leftarrow w_{ij} - \eta \frac{\partial \mathcal{L}}{\partial w_{ij}}$, where $\eta$ is the learning rate
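The formulas above can be tried out directly in NumPy. This is a minimal sketch under stated assumptions, not how Keras implements them: it uses a single softmax layer without bias or hidden layer, averages the loss over samples, and takes full-batch gradient steps. All names (`relu`, `softmax`, `cross_entropy`, `eta`, the toy data) are chosen for illustration.

```python
import numpy as np

def relu(h):
    # f(h) = max(0, h), applied element-wise
    return np.maximum(h, 0.0)

def softmax(h):
    # Subtracting the row maximum leaves the result unchanged but avoids overflow.
    e = np.exp(h - h.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(y_true, y_pred, eps=1e-12):
    # -sum_t y^d_t log y_t per sample, averaged over samples
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=-1))

print(relu(np.array([-1.0, 0.0, 2.0])))  # [0. 0. 2.]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 2))              # 4 toy samples with 2 features
y_true = np.eye(3)[[0, 1, 2, 1]]         # one-hot targets for 3 classes
W = rng.normal(scale=0.1, size=(2, 3))   # weights of one softmax layer
eta = 0.1                                # learning rate

losses = []
for step in range(3):
    y_pred = softmax(x @ W)
    losses.append(cross_entropy(y_true, y_pred))
    # For softmax followed by cross-entropy, dL/dh = y_pred - y_true.
    grad_W = x.T @ (y_pred - y_true) / len(x)
    W -= eta * grad_W                    # gradient descent update
print(losses)
```

With a small learning rate the printed losses should decrease from step to step.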

In [0]:
import keras
from keras.layers import Input, Dense
from keras.models import Model
from keras import optimizers

def makeNN(Nx, Nh, Ny):
    '''
    Input:
        Nx: int; no. of input nodes; size of each sample; i.e., X.shape[1]
        Nh: int; no. of hidden neurons
        Ny: int; no. of output nodes; size of output; i.e., Y.shape[1]
    Output:
        model: keras NN model with Input layer, Dense hidden layer with Nh neurons
                and ReLU non-linearity, Dense output layer with softmax non-linearity,
                loss function categorical-crossentropy, optimizer SGD.
    '''
    # YOUR CODE HERE
    # Functional API: each layer is called on the output of the previous one.
    # (model.add is only available on keras.models.Sequential, not on Model.)
    x = Input(shape=(Nx,))
    h = Dense(Nh, activation='relu')(x)
    y = Dense(Ny, activation='softmax')(h)
    model = Model(inputs=x, outputs=y)
    # Older Keras releases take lr=0.001 instead of learning_rate=0.001.
    model.compile(optimizer=optimizers.SGD(learning_rate=0.001),
                  loss='categorical_crossentropy', metrics=['accuracy'])
    return model
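To get a feel for how Nh changes the size of this network, the number of trainable parameters can be counted by hand. This is a small sketch assuming each Dense layer has a bias term (the Keras default); the helper name `dense_param_count` is hypothetical.

```python
def dense_param_count(n_in, n_out):
    # One weight per input-output pair plus one bias per output unit.
    return n_in * n_out + n_out

Nx, Nh, Ny = 2, 10, 3
hidden = dense_param_count(Nx, Nh)   # 2*10 + 10 = 30
output = dense_param_count(Nh, Ny)   # 10*3 + 3 = 33
print(hidden + output)               # 63
```

The total should agree with the parameter count reported by `model.summary()` for `makeNN(2, 10, 3)`.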

### Plotting the model

In [0]:
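def plotModel(model):
    from keras.utils import plot_model
    from IPython.display import Image
    # plot_model needs pydot and graphviz to be installed.
    plot_model(model, show_shapes=True, show_layer_names=True, to_file='model.png')
    # Return the image so that the notebook displays it.
    return Image(retina=True, filename='model.png')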

### Training


In [0]:
def trainNN(model, X_train, Y_train, Nepochs):
    '''
    Action:
        Train model with model.fit
    '''
    # YOUR CODE HERE
    model.fit(X_train,Y_train,epochs=Nepochs,verbose = 0)


In [0]:
def trainModel(N, Nh, Nepochs):
    '''
    generateXY, normalizeX, oneHot, makeNN, trainNN
    Input:
        N: int; no. of training samples per class
        Nh: int; no. of neurons in hidden layer
        Nepochs: int; no. of training epochs
    Output:
        model: keras NN model trained with the training data
        mean_train, stddev_train: mean and stddev of training data - you will 
                            need this for normalizing your test data
    '''
    # YOUR CODE HERE
    X, Y = generateXY(N)
    Ny = 3
    Y_onehot = np.empty((Y.shape[0], Ny))
    for i in range(Y.shape[0]):
        # Y[i, 0] is a scalar label; calling int() on the length-1 row Y[i]
        # is deprecated in recent NumPy versions.
        Y_onehot[i] = oneHot(Y[i, 0], Ny)
    mean_train, stddev_train = findMeanStddev(X)
    X = normalizeX(X, mean_train, stddev_train)
    model = makeNN(X.shape[1], Nh, Y_onehot.shape[1])
    trainNN(model, X, Y_onehot, Nepochs)
    return model, mean_train, stddev_train

### Evaluation
Could you:
- Generate 20 samples from each class
- Normalize them with mean_train and stddev_train
- Get Y_test as one hot encoded labels

In [0]:
def testModel(model, Ntest, mean_train, stddev_train):
    '''
    generateXY for test, normalize, onehot, evaluate the model
    Inputs:
        model: trained Keras NN model
        Ntest: int; number of test samples per class
        mean_train, stddev_train: mean and stddev of the training data,
                            used to normalize the test data
    Output:
        accuracy: float; accuracy on the test data
        CM: confusion matrix on the test data (rows: true class, columns: predicted class)
    '''
    # YOUR CODE HERE
    from sklearn.metrics import confusion_matrix
    Ny = 3
    X_test, Y_test = generateXY(Ntest)
    X_test = normalizeX(X_test, mean_train, stddev_train)
    Y_test_onehot = np.zeros((Y_test.shape[0], Ny))
    for i in range(Y_test.shape[0]):
        # Pass the scalar label, not the length-1 row.
        Y_test_onehot[i] = oneHot(Y_test[i, 0], Ny)
    loss, accuracy = model.evaluate(X_test, Y_test_onehot, verbose=0)
    y_pred = model.predict(X_test)
    CM = confusion_matrix(Y_test_onehot.argmax(axis=1), y_pred.argmax(axis=1))
    return accuracy, CM


In [120]:
model, mean_train, stddev_train = trainModel(50, 10, 100)
accuracy, CM = testModel(model, 10, mean_train, stddev_train)
print('Accuracy:',accuracy)
print('CM:',CM)

Accuracy: 0.6000000238418579
CM: [[10  0  0]
 [ 0  1  9]
 [ 3  0  7]]
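The confusion matrix carries more information than the single accuracy number. The sketch below takes the matrix printed above and derives the accuracy and per-class recall and precision from it, assuming the scikit-learn convention that rows are true classes and columns are predicted classes. The variable names are chosen for illustration.

```python
import numpy as np

# Confusion matrix copied from the output above.
CM = np.array([[10, 0, 0],
               [0, 1, 9],
               [3, 0, 7]])

# Correct predictions sit on the diagonal.
accuracy = np.trace(CM) / CM.sum()
# Recall: fraction of each true class (row) that was predicted correctly.
recall = np.diag(CM) / CM.sum(axis=1)
# Precision: fraction of each predicted class (column) that was correct.
precision = np.diag(CM) / np.maximum(CM.sum(axis=0), 1)

print(accuracy)   # 0.6
print(recall)     # [1.  0.1 0.7]
print(precision)
```

Here 18 of 30 test samples are classified correctly, and the recall of 0.1 for class 1 shows that most of its samples are predicted as class 2.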


# ADVANCED QUESTIONS



### Effect of changing Nh
### Effect of changing Nepochs
### Effect of changing N, no. of training samples

Can you observe overfitting? 

Can you do hyperparameter tuning here? 

To normalize test data, why do we use the mean and stddev of training data?


In [121]:
# Effect of Nh
Nh_all = [1, 3, 5, 6, 7, 8, 9, 10, 15, 17]
Nepochs = 20
N = 50
Ntest = 20
acc_list = []
for Nh in Nh_all:
    model, mean_train, stddev_train = trainModel(N, Nh, Nepochs)
    accuracy, CM = testModel(model, Ntest, mean_train, stddev_train)
    acc_list.append(accuracy)
print(np.array(acc_list))

[0.30000001 0.45       0.83333334 0.11666667 0.28333333 0.46666667
 0.15       0.55       0.21666667 0.59999999]
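The accuracies above come from a single training run per value of Nh, so run-to-run randomness (weight initialization, sampled data) may hide any trend. One possible approach, sketched here under assumptions, is to repeat each setting several times and report the mean and standard deviation. The helper `sweep_mean_std` and the stand-in `fake_run` are hypothetical; `fake_run` returns random numbers only so that the sketch executes without Keras and does not represent real results.

```python
import numpy as np

def sweep_mean_std(values, run_once, repeats=5):
    """Hypothetical helper: call run_once(v) `repeats` times for each v."""
    results = {}
    for v in values:
        scores = np.array([run_once(v) for _ in range(repeats)])
        results[v] = (scores.mean(), scores.std())
    return results

# Stand-in for one train-and-test run; NOT real accuracies.
rng = np.random.default_rng(0)
def fake_run(Nh):
    return float(np.clip(rng.normal(0.5, 0.2), 0.0, 1.0))

summary = sweep_mean_std([1, 5, 10], fake_run, repeats=4)
for Nh, (mean_acc, std_acc) in summary.items():
    print(Nh, round(mean_acc, 3), round(std_acc, 3))
```

In the notebook, `run_once` could be a small function that calls trainModel(N, Nh, Nepochs), passes the returned model and statistics to testModel, and returns the accuracy.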
