# Task 0 - Normalization Constants
Write the function def normalization_constants(X): that calculates the normalization (standardization) constants of a matrix:
- X is the numpy.ndarray of shape (m, nx) to normalize
    - m is the number of data points
    - nx is the number of features
- Returns: the mean and standard deviation of each feature, respectively

In [8]:
import numpy as np


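def normalization_constants(X):
    """Return the per-feature mean and standard deviation of X."""
    # axis=0 reduces over the m data points, leaving one value per feature
    ave = np.mean(X, axis=0)
    std = np.std(X, axis=0)
    return ave, std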

### Comments
- `axis=0` reduces along the rows (the data-point axis), so each statistic is computed down one column of X
- The result is one mean and one standard deviation per feature, each with shape (nx,)
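
To make the effect of `axis=0` concrete, here is a small sketch on a made-up 3x2 matrix (the values are illustrative only, not part of the task). It compares reducing over data points with reducing over features.

```python
import numpy as np

# Hypothetical matrix: 3 data points (rows), 2 features (columns)
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

col_mean = np.mean(X, axis=0)  # reduces over rows: one value per feature
row_mean = np.mean(X, axis=1)  # reduces over columns: one value per data point

print(col_mean.shape)  # (2,)
print(row_mean.shape)  # (3,)
print(col_mean)        # [ 2. 20.]
```

For normalization constants the per-feature result (`axis=0`) is the one that is wanted, since each feature has its own scale.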

In [10]:
import numpy as np

if __name__ == '__main__':
    np.random.seed(0)
    a = np.random.normal(0, 2, size=(100, 1))
    b = np.random.normal(2, 1, size=(100, 1))
    c = np.random.normal(-3, 10, size=(100, 1))
    X = np.concatenate((a, b, c), axis=1)
    m, s = normalization_constants(X)
    print(m)
    print(s)

[ 0.11961603  2.08201297 -3.59232261]
[2.01576449 1.034667   9.52002619]


# Task 1 - Normalize
Write the function def normalize(X, m, s): that normalizes (standardizes) a matrix:
- X is the numpy.ndarray of shape (d, nx) to normalize
    - d is the number of data points
    - nx is the number of features
- m is a numpy.ndarray of shape (nx,) that contains the mean of all features of X
- s is a numpy.ndarray of shape (nx,) that contains the standard deviation of all features of X
- Returns: The normalized X matrix

In [11]:
import numpy as np


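def normalize(X, m, s):
    """Standardize X feature-wise using the mean m and standard deviation s."""
    # m and s have shape (nx,) and broadcast across the rows of X
    return (X - m) / s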


In [13]:
import numpy as np

if __name__ == '__main__':
    np.random.seed(0)
    a = np.random.normal(0, 2, size=(100, 1))
    b = np.random.normal(2, 1, size=(100, 1))
    c = np.random.normal(-3, 10, size=(100, 1))
    X = np.concatenate((a, b, c), axis=1)
    m, s = normalization_constants(X)
    print(X[:10])
    X = normalize(X, m, s)
    print(X[:10])
    m, s = normalization_constants(X)
    print(m)
    print(s)

[[  3.52810469   3.8831507   -6.69181838]
 [  0.80031442   0.65224094  -5.39379178]
 [  1.95747597   0.729515     7.99659596]
 [  4.4817864    2.96939671   3.55263731]
 [  3.73511598   0.82687659   3.40131526]
 [ -1.95455576   3.94362119 -19.16956044]
 [  1.90017684   1.58638102  -3.24326124]
 [ -0.30271442   1.25254519 -10.38030909]
 [ -0.2064377    3.92294203  -0.20075401]
 [  0.821197     3.48051479  -3.9815039 ]]
[[ 1.69091612  1.74078977 -0.32557639]
 [ 0.33768746 -1.38186686 -0.18922943]
 [ 0.91174338 -1.3071819   1.21732003]
 [ 2.16402779  0.85765153  0.75051893]
 [ 1.79361228 -1.21308245  0.73462381]
 [-1.02897526  1.79923417 -1.63625998]
 [ 0.88331787 -0.47902557  0.03666601]
 [-0.20951378 -0.80167608 -0.71302183]
 [-0.1617519   1.77924787  0.35625623]
 [ 0.34804709  1.35164437 -0.04088028]]
[ 2.44249065e-17 -4.99600361e-16  1.46549439e-16]
[1. 1. 1.]
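
The means printed above are on the order of 1e-16 rather than exactly zero, which is ordinary floating-point rounding. A tolerance-based comparison is one reasonable way to confirm that the data is standardized. The sketch below uses made-up data and inlines the two steps instead of calling the functions above.

```python
import numpy as np

np.random.seed(0)
# Hypothetical data: 100 points, 3 features
X = np.random.normal(-3, 10, size=(100, 3))

m = np.mean(X, axis=0)
s = np.std(X, axis=0)
Z = (X - m) / s  # same arithmetic as normalize(X, m, s)

# Compare with a tolerance instead of testing for exact equality
mean_ok = np.allclose(np.mean(Z, axis=0), 0.0)
std_ok = np.allclose(np.std(Z, axis=0), 1.0)
print(mean_ok, std_ok)
```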


# Task 2 - Shuffle Data
Write the function def shuffle_data(X, Y): that shuffles the data points in two matrices the same way:
- X is the first numpy.ndarray of shape (m, nx) to shuffle
    - m is the number of data points
    - nx is the number of features in X
- Y is the second numpy.ndarray of shape (m, ny) to shuffle
    - m is the same number of data points as in X
    - ny is the number of features in Y
- Returns: the shuffled X and Y matrices

In [31]:
import numpy as np


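def shuffle_data(X, Y):
    """Shuffle the rows of X and Y with the same random permutation."""
    m = X.shape[0]
    shuf_vect = np.random.permutation(m)
    print(shuf_vect)  # debug output: shows the permutation that was drawn
    x = X[shuf_vect]
    y = Y[shuf_vect]
    return x, y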


### Comments
- A single permutation vector drives the shuffle because each row of X must stay paired with its corresponding row of Y
- In this example the generated vector is [2 0 1 3 4]
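
As a quick check that one shared permutation keeps the pairs aligned, the sketch below builds made-up arrays in which every label is ten times its feature. It then verifies that the relationship survives the shuffle. The arrays and the `paired` flag are illustrative assumptions, not part of the task.

```python
import numpy as np

X = np.arange(5).reshape(5, 1)  # hypothetical features: 0..4
Y = X * 10                      # hypothetical labels paired with X

perm = np.random.permutation(X.shape[0])
X_s, Y_s = X[perm], Y[perm]     # index both arrays with the same vector

# Each row of Y_s still belongs to the matching row of X_s
paired = np.array_equal(Y_s, X_s * 10)
print(paired)  # True
```

Shuffling X and Y with two independent permutations would break this pairing.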

In [33]:
import numpy as np

if __name__ == '__main__':
    X = np.array([[1, 2],
                  [3, 4],
                  [5, 6],
                  [7, 8],
                  [9, 10]])
    Y = np.array([[11, 12],
                  [13, 14],
                  [15, 16],
                  [17, 18],
                  [19, 20]])

    np.random.seed(0)
    X_shuffled, Y_shuffled = shuffle_data(X, Y)

    print(X_shuffled)
    print(Y_shuffled)

[2 0 1 3 4]
[[ 5  6]
 [ 1  2]
 [ 3  4]
 [ 7  8]
 [ 9 10]]
[[15 16]
 [11 12]
 [13 14]
 [17 18]
 [19 20]]


# Task 3 - Mini-Batch
Write the function def train_mini_batch(X_train, Y_train, X_valid, Y_valid, batch_size=32, epochs=5, load_path="/tmp/model.ckpt", save_path="/tmp/model.ckpt"): that trains a loaded neural network model using mini-batch gradient descent:
- X_train is a numpy.ndarray of shape (m, 784) containing the training data
    - m is the number of data points
    - 784 is the number of input features
- Y_train is a one-hot numpy.ndarray of shape (m, 10) containing the training labels
    - 10 is the number of classes the model should classify
- X_valid is a numpy.ndarray of shape (m, 784) containing the validation data
- Y_valid is a one-hot numpy.ndarray of shape (m, 10) containing the validation labels
- batch_size is the number of data points in a batch
- epochs is the number of times the training should pass through the whole dataset
- load_path is the path from which to load the model
- save_path is the path to where the model should be saved after training
- Returns: the path where the model was saved
- Your training function should allow for a smaller final batch (that is, it should use the entire training set even when m is not a multiple of batch_size)
1) Import the meta graph and restore the session
2) Get the following tensors and ops from the restored collection
    - x is a placeholder for the input data
    - y is a placeholder for the labels
    - accuracy is an op to calculate the accuracy of the model
    - loss is an op to calculate the cost of the model
    - train_op is an op to perform one pass of gradient descent on the model
3) Loop over the epochs:
    - shuffle the data
    - loop over the batches:
        - get X_batch and Y_batch from the data
        - train your model
4) Save the session
- You should use shuffle_data = __import__('2-shuffle_data').shuffle_data
- Before the first epoch and after every subsequent epoch, the following should be printed:
    - After {epoch} epochs: where {epoch} is the current epoch
    - \tTraining Cost: {train_cost} where {train_cost} is the cost of the model on the entire training set
    - \tTraining Accuracy: {train_accuracy} where {train_accuracy} is the accuracy of the model on the entire training set
    - \tValidation Cost: {valid_cost} where {valid_cost} is the cost of the model on the entire validation set
    - \tValidation Accuracy: {valid_accuracy} where {valid_accuracy} is the accuracy of the model on the entire validation set
- After every 100 steps of gradient descent within an epoch, the following should be printed:
    - \tStep {step_number}: where {step_number} is the number of times gradient descent has been run in the current epoch
    - \t\tCost: {step_cost} where {step_cost} is the cost of the model on the current mini-batch
    - \t\tAccuracy: {step_accuracy} where {step_accuracy} is the accuracy of the model on the current mini-batch
- Advice: the range function, with batch_size as its step value, can help you handle the loop over the dataset
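
The batching advice can be sketched in plain NumPy, leaving the TensorFlow session out entirely. The helper name `iterate_batches` and the 70-point dataset are assumptions made for illustration; the sketch only shows how `range` with `batch_size` as its step produces a smaller final batch.

```python
import numpy as np


def iterate_batches(X, Y, batch_size=32):
    """Yield (step, X_batch, Y_batch); the last batch may be smaller."""
    m = X.shape[0]
    for step, start in enumerate(range(0, m, batch_size), start=1):
        end = start + batch_size  # slicing past m is safe in NumPy
        yield step, X[start:end], Y[start:end]


# Hypothetical data: 70 points, so the batches hold 32, 32, and 6 points
X = np.zeros((70, 784))
Y = np.zeros((70, 10))
sizes = [xb.shape[0] for _, xb, _ in iterate_batches(X, Y, batch_size=32)]
print(sizes)  # [32, 32, 6]
```

In a training loop along these lines, the step counter restarts at 1 in every epoch, and a condition such as `step % 100 == 0` marks where the per-step cost and accuracy report would go.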