## Homework 1: Search hyperparameters for a deep feedforward neural network
In this homework, you will:
1. Practice using Keras to implement a neural network and search its hyperparameters for handwritten digit classification.
2. Work within the provided framework: the main function and each function's name, inputs, and outputs are already defined. You are required to complete the create_NN, nn_params_search, retrain_best_nn, and performance_acc functions.
3. Add your code only in blocks marked as follows, and do not change anything else.

```python

    ## add your code here
    
    ##
```

### Student information
    1. Your name: Amanda Ward
    2. Department: CDA Computer Science
    3. Graduate

### TA grading: XX/100
    1. 1.1: ?/10
    2. 1.2: ?/10
    3. create_NN: ?/30
    4. nn_params_search: ?/30
    5. retrain_best_nn: ?/10
    6. performance_acc: ?/10

In [1]:
# Homework 1
import keras
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix   
import numpy as np

Using TensorFlow backend.


In [2]:
# Load data and data standardization
def load_data():
    '''Load the MNIST dataset'''
    
    (X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
    return (X_train, y_train, X_test, y_test)

def data_std(X_train, X_test):
    '''Data standardization
    
    Parameters
    ----------
    X_train: original training set
    X_test: original test set
    
    Returns
    -------
    X_train_std: rescaled training set
    X_test_std: rescaled test set
    '''
    sc = StandardScaler()
    sc.fit(X_train)
    X_train_std = sc.transform(X_train)
    X_test_std = sc.transform(X_test)
    
    return X_train_std, X_test_std
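
The standardization step above can be sketched in plain NumPy to show what happens to each column. This is a minimal illustration, not the scikit-learn implementation: the helper name `standardize` and the `eps` threshold are assumptions made for the example. It follows the same idea as `data_std`: the mean and standard deviation are computed on the training set only and then reused for the test set, and constant columns (for example, image pixels that never change in the training data) are left at zero instead of being divided by zero.

```python
import numpy as np

def standardize(train, test, eps=1e-12):
    """Column-wise z-scoring using statistics from the training set only.

    Hypothetical helper for illustration; eps is an assumed threshold.
    """
    mean = train.mean(axis=0)
    std = train.std(axis=0)  # population standard deviation (ddof=0)
    # Constant columns have std == 0; dividing by 1 instead leaves them at 0.
    scale = np.where(std < eps, 1.0, std)
    return (train - mean) / scale, (test - mean) / scale

# Toy data: the third column is constant, like an always-black pixel.
train = np.array([[0.0, 10.0, 5.0],
                  [2.0, 20.0, 5.0],
                  [4.0, 30.0, 5.0]])
test = np.array([[2.0, 40.0, 5.0]])

train_std, test_std = standardize(train, test)
```

Reusing the training statistics on the test set keeps information about the test data out of the preprocessing step.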

In [3]:
def create_NN(n_features = 784, n_outputs = 10): # 30 points
    '''create a deep feedforward neural network using keras
    
    Parameters
    -----------
    n_features: the number of input features/units
    n_outputs: the number of output units
    
    Returns
    -------
    myNN: the neural network model
    
    '''
    ## add your code here
    myNN = keras.models.Sequential()
    
    # aw add a hidden layer
    myNN.add(keras.layers.Dense(
        units=10,
        input_dim=n_features,
        kernel_initializer='glorot_uniform',
        bias_initializer='zeros',
        activation='sigmoid'))
    
    # aw create an output layer
    myNN.add(keras.layers.Dense(
        units=n_outputs,
        input_dim=10,
        kernel_initializer='glorot_uniform',
        bias_initializer='zeros',
        activation='softmax'))

 
    sgd_optimizer = keras.optimizers.SGD(lr=0.001, decay=1e-7, momentum=.9)
    myNN.compile(optimizer=sgd_optimizer, loss='categorical_crossentropy')
    


    ##
    
    return myNN
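
As a quick sanity check on the network defined above, the number of trainable parameters in a fully connected layer is the number of weights (inputs times units) plus one bias per unit. The sketch below applies that formula to the two layers of `create_NN` with the default sizes; the helper name `dense_param_count` is hypothetical and only used for this illustration.

```python
def dense_param_count(n_inputs, units):
    """Weights (n_inputs * units) plus one bias per unit."""
    return n_inputs * units + units

hidden = dense_param_count(784, 10)  # hidden layer: 784 inputs, 10 units
output = dense_param_count(10, 10)   # output layer: 10 inputs, 10 units
total = hidden + output
print(hidden, output, total)  # prints: 7850 110 7960
```

These values agree with the `Param #` column that `myNN.summary()` prints for this model.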

In [4]:
def nn_params_search(nn, X_train, y_train): # 30 points
    '''Search for the best parameters
    
    Parameters
    ----------
    nn: the classifier to tune (a KerasClassifier wrapping create_NN)
    X_train: features
    y_train: target of the input

    
    Returns
    -------
    best_params_
    
    Example grid: (you can customize the search grid yourself)
    param_grid = [{'batch_size': [64, 128], 'epochs' : [10, 30, 50]}]
        
    '''
    ## add your code here
    
    param_grid = [{
                'batch_size' : [64, 128],  
                'epochs' : [10, 30, 50]}]
    # initialize the GridSearchCV object; cv is the number of folds
    gs_cv = GridSearchCV(estimator = nn, cv = 3, param_grid = param_grid , 
                     scoring = 'accuracy', verbose = 1) # verbose = 2
    #model training
    gs_cv.fit(X_train, y_train)
    ##
    
    return gs_cv.best_params_
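
To make the cost of the search concrete, the sketch below enumerates the candidates that an exhaustive grid search tries for the example grid, using only the standard library. It is an illustration of the counting, not the scikit-learn code path: the variable names `candidates`, `cv_folds`, and `n_fits` are assumptions made for the example.

```python
from itertools import product

param_grid = {'batch_size': [64, 128], 'epochs': [10, 30, 50]}
cv_folds = 3  # same value as cv=3 in the GridSearchCV call

# Every combination of one value per hyperparameter is one candidate.
names = sorted(param_grid)
candidates = [dict(zip(names, values))
              for values in product(*(param_grid[name] for name in names))]

# Each candidate is trained once per cross-validation fold.
n_fits = len(candidates) * cv_folds

for candidate in candidates:
    print(candidate)
print('candidates:', len(candidates), 'fits:', n_fits)
```

With 2 batch sizes and 3 epoch settings there are 6 candidates, so 3-fold cross-validation needs 18 training runs, which matches the "Fitting 3 folds for each of 6 candidates, totalling 18 fits" line in the output. Adding one more hyperparameter with k values multiplies this count by k.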

In [5]:
def retrain_best_nn(best_params, X_train, y_train): # 10 points
    '''
    Retrain classifier using the best parameters
    
    Parameters
    ----------
    best_params: dict of the best hyperparameters ('batch_size' and 'epochs')
    X_train: data input of the training set
    y_train: target of the input
    
    Returns
    ---------
    bestNN: the nn classifier trained using the best parameters
    
    '''
    ## add your code here  
    
    #myNN2 = create_NN(n_features = 784, n_outputs = 10)
    
    n_features = 784
    n_outputs = 10
    
    bestNN = keras.models.Sequential()
    
    # aw add a hidden layer
    bestNN.add(keras.layers.Dense(
         units=10,
         input_dim=n_features,
         kernel_initializer='glorot_uniform',
         bias_initializer='zeros',
         activation='sigmoid'))
    
    # aw create an output layer
    bestNN.add(keras.layers.Dense(
         units=n_outputs,
         input_dim=10,
         kernel_initializer='glorot_uniform',
         bias_initializer='zeros',
         activation='softmax'))

    sgd_optimizer = keras.optimizers.SGD(lr=0.001, decay=1e-7, momentum=.9)
    bestNN.compile(optimizer=sgd_optimizer, loss='categorical_crossentropy')
    size = best_params.get("batch_size")
    epoch = best_params.get("epochs")
    # train on the data passed in, not on the global X_train_1
    history = bestNN.fit(X_train, y_train,
                        batch_size=size, epochs=epoch,
                        verbose=1,
                        validation_split=0.1)
    ##
    return bestNN

In [6]:
def performance_acc(y, y_pred): # 10 points
    ''' calculate the confusion matrix and average accuracy
    
        Parameters
        ----------
        y: real target
        y_pred: prediction
        
        Returns
        -------
        cm: confusion matrix
        acc: accuracy
    '''
    ## add your code here
    

    # use the function's arguments rather than global variables,
    # so the result reflects the predictions that were passed in
    y = np.asarray(y)
    y_pred = np.asarray(y_pred)
    acc = np.sum(y == y_pred) / y.shape[0]

    cm = confusion_matrix(y, y_pred)


    ##
    
    return cm, acc
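
The two metrics returned above can be worked through by hand on a tiny example. The sketch below builds a confusion matrix with plain NumPy, using the convention that rows are true labels and columns are predicted labels, and reads the accuracy off the diagonal. The helper name `confusion_and_accuracy` and the toy labels are assumptions made for this illustration.

```python
import numpy as np

def confusion_and_accuracy(y_true, y_pred, n_classes):
    """Confusion matrix (rows: true class, columns: predicted class) and accuracy."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # Add 1 to cell (true, predicted) for every sample, including repeats.
    np.add.at(cm, (y_true, y_pred), 1)
    # Correct predictions lie on the diagonal.
    acc = np.trace(cm) / cm.sum()
    return cm, acc

y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

cm, acc = confusion_and_accuracy(y_true, y_pred, n_classes=3)
print(cm)
print('accuracy:', acc)
```

Here 4 of the 6 predictions are correct, so the accuracy is 4/6. The off-diagonal entries show which classes are confused: one sample of class 0 was predicted as 1, and one sample of class 2 was predicted as 0.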

In [7]:
from keras.wrappers.scikit_learn import KerasClassifier

if __name__ == '__main__':
    
    #Task 1. load the dataset
    (X_train, y_train, X_test, y_test) = load_data()
    
    # 1.1 reshape the training and test sets to N * 784. 10 points
    ## add your code here
    X_train_1 = np.reshape(X_train, [X_train.shape[0], X_train.shape[1] * X_train.shape[2]]) #aw
    X_test_1 = np.reshape(X_test, [X_test.shape[0], X_test.shape[1] * X_test.shape[2]]) #aw
    print('1. X_train: {}, X_train_1: {}'.format(X_train.shape, X_train_1.shape))
    ##
    
    # 1.2 transform y_train to one-hot vectors using keras.utils.to_categorical. 10 points
    ## add your code here
    y_train_onehot = keras.utils.to_categorical(y_train) #aw
    print('y_train_onehot: {}'.format(y_train_onehot.shape))
    ##
    
    #Task 2. create a deep feedforward neural network
    myNN = create_NN(X_train_1.shape[1], y_train_onehot.shape[1])
    myNN.summary()
    myNN1 = KerasClassifier(build_fn = create_NN, batch_size = 64, epochs = 50)
    
    #Task 3. Search best parameters, and report the performance
    best_params = nn_params_search(myNN1, X_train_1, y_train)
    print('Best parameters: ', best_params)
    
    bestNN = retrain_best_nn(best_params, X_train_1, y_train_onehot)
    y_test_pred = bestNN.predict_classes(X_test_1)
    cm, acc = performance_acc(y_test, y_test_pred)
    print('Confusion matrix:\n', cm)
    print('Accuracy =    {:.3f}%'.format(acc*100))
    
    #Task 4. Search best nn parameters after data standardization, and report the performance
    X_train_std, X_test_std = data_std(X_train_1, X_test_1)
    
    best_params =  nn_params_search(myNN1, X_train_std, y_train)
    print('Best parameters: ', best_params)
    
    bestNN_std = retrain_best_nn(best_params, X_train_std, y_train_onehot)
    y_test_std_pred = bestNN_std.predict_classes(X_test_std)
    cm1, acc1 = performance_acc(y_test, y_test_std_pred)
    print('Confusion matrix:\n', cm1)
    print('Accuracy =    {:.3f}%'.format(acc1*100))

1. X_train: (60000, 28, 28), X_train_1: (60000, 784)
y_train_onehot: (60000, 10)
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 10)                7850      
_________________________________________________________________
dense_2 (Dense)              (None, 10)                110       
Total params: 7,960
Trainable params: 7,960
Non-trainable params: 0
_________________________________________________________________
Fitting 3 folds for each of 6 candidates, totalling 18 fits


[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.



[Parallel(n_jobs=1)]: Done  18 out of  18 | elapsed:  7.5min finished


Best parameters:  {'batch_size': 128, 'epochs': 50}
Train on 54000 samples, validate on 6000 samples

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.



[Parallel(n_jobs=1)]: Done  18 out of  18 | elapsed:  8.5min finished


Best parameters:  {'batch_size': 64, 'epochs': 50}
Train on 54000 samples, validate on 6000 samples