In [7]:
from google.colab import drive
drive.mount('/content/drive/')

Mounted at /content/drive/


In [8]:
import h5py
from matplotlib import pyplot as plt
import numpy as np

# read the training and testing arrays; the file is closed automatically
with h5py.File('/content/drive/My Drive/images.h5', 'r') as f:
    images_train = f['Train/images'][...]
    labels_train = f['Train/labels'][...]

    images_test = f['Test/images'][...]
    labels_test = f['Test/labels'][...]

num_train = images_train.shape[0]
images_train_flatten = images_train.reshape(num_train, 100*100*3)
print("Number Training Images: ", num_train)
print("Shape of Flattened Training Images Array: ", images_train_flatten.shape)

num_test = images_test.shape[0]
images_test_flatten = images_test.reshape(num_test, 100*100*3)
print("Number Testing Images: ", num_test)
print("Shape of Flattened Testing Images Array: ", images_test_flatten.shape)

train_set_x = images_train_flatten/255.
test_set_x = images_test_flatten/255.

Number Training Images:  4405
Shape of Flattened Training Images Array:  (4405, 30000)
Number Testing Images:  1617
Shape of Flattened Testing Images Array:  (1617, 30000)


A possible improvement to the model is a different activation function. After some research, we decided to try two candidates: the hyperbolic tangent (tanh) function and the ReLU function.
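
To make the difference between the candidates concrete, the sketch below evaluates both on a few sample inputs. The sigmoid is included only as an assumed baseline for comparison (the model's original activation is not shown in this notebook), and the names `sigmoid`, `z`, `sig_out`, `tanh_out`, and `relu_out` are illustrative.

```python
import numpy as np

def sigmoid(z):
    # assumed baseline activation, output in (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# a few sample pre-activation values
z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

sig_out = sigmoid(z)           # always between 0 and 1
tanh_out = np.tanh(z)          # between -1 and 1, negative for negative z
relu_out = np.maximum(0.0, z)  # 0 for negative z, z otherwise

print("sigmoid:", sig_out)
print("tanh:   ", tanh_out)
print("relu:   ", relu_out)
```

The point to notice is the output range: tanh can be negative and ReLU is unbounded above, so neither can be read directly as a probability the way a sigmoid output can.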

In [0]:
def tanh_activation(z):
  
  s = np.tanh(z)
  
  return s
  

In [0]:
def relu(z):
  
  # element-wise ReLU: negative entries become 0, all others pass through
  s = np.maximum(0, z)
    
  return s

To ensure the model works with the chosen function, every call to the activation function must use that function, and the prediction thresholds must be changed to match its output range. As currently written, the code is set up for the tanh function: it calls tanh_activation and thresholds predictions at 0, the midpoint of tanh's range.
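
As a small illustration of why the threshold and the cost both depend on the activation's range, the sketch below applies the cross-entropy formula to raw tanh outputs and to a rescaled version. The rescaling `(A + 1) / 2` is an assumption introduced here for illustration, not something the notebook's code does; the toy arrays `z` and `Y` and the helper `cross_entropy` are likewise hypothetical.

```python
import numpy as np

# toy pre-activations and labels for four "images", shape (1, 4)
z = np.array([[-1.5, -0.2, 0.3, 2.0]])
Y = np.array([[0, 0, 1, 1]])

def cross_entropy(A, Y):
    # same cross-entropy expression as in forward_propagate
    m = Y.shape[1]
    return (-1.0 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

A_tanh = np.tanh(z)  # values in (-1, 1), some negative

# log of a negative activation is nan, so the raw tanh cost is nan
with np.errstate(invalid="ignore", divide="ignore"):
    cost_raw = cross_entropy(A_tanh, Y)

# assumed fix: map tanh output from (-1, 1) to (0, 1) before the cost
A_scaled = (A_tanh + 1.0) / 2.0
cost_scaled = cross_entropy(A_scaled, Y)

# thresholding tanh at 0 is the same decision as thresholding A_scaled at 0.5
pred_tanh = (A_tanh > 0).astype(int)
pred_scaled = (A_scaled > 0.5).astype(int)

print("raw tanh cost:", cost_raw)
print("rescaled cost:", cost_scaled)
print("predictions agree:", np.array_equal(pred_tanh, pred_scaled))
```

The `A - Y` term used in backward_propagate is the gradient that goes with a sigmoid output and cross-entropy cost, so a different activation would in general need its own derivative there as well.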

In [0]:
def initialize_model(dim):

    
    w = np.zeros(shape=(dim, 1), dtype=float)
    b = 0
    
    return w,b
  
def forward_propagate(X, Y, w, b):
  
    """
    Returns array of activations (one per image)
    and cost (scalar).
    
    input:  X     array of flattened images (num_pixels,num_images)
            Y     array of labels, 1 or 0 (num_images)
            w     array of weights (num_pixels,1)
            b     scalar bias (float)
    output: A     array of activations (num_images)
            cost  loss (float)
    """
  
    # get number of images under consideration
    num_images = X.shape[1]
  
    # calculate the activation for each image
    A = tanh_activation(np.dot(w.T, X) + b) # ACTIVATION FUNCTION HERE
  
    # calculate the cost (scalar) using cross-entropy
    cost = (-1. / num_images) * np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))
  
    return A, cost    
  
def backward_propagate(X, Y, A):
  
    """
    Returns a dictionary of derivatives of cost function.
    
    input:  X      array of flattened images (num_pixels,num_images)
            Y      array of labels, 1 or 0 (num_images)
            A      array of activations (num_images)
    output: grads  dictionary with keys dw and db
    """
    
    # get number of images under consideration
    num_images = X.shape[1]
  
    # derivative of cost function wrt w, array of shape (num_pixels,1)
    dw = (1./num_images)*np.dot(X,((A-Y).T))
    
    # derivative of cost function wrt b (scalar)
    db = (1./num_images)*np.sum(A-Y,axis=1)
  
    # create dictionary of derivatives (gradients)
    grads = {"dw": dw, "db": db}
  
    return grads  

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    
    """
    Optimize array of weights and scalar bias.
    
    input:  w               array of weights (num_pixels,1)
            b               scalar bias (float)
            X               array of flattened images (num_pixels,num_images)
            Y               array of labels, 1 or 0 (num_images)
            num_iterations  number of iterations for optimization (scalar)
            learning_rate   gradient multiplier (scalar)
            print_cost      boolean controlling user feedback
    output: params          w and b after num_iterations of optimization
            grads           dictionary with keys dw and db
            costs           history of cost during optimization (list)
    """
    
    costs = []
    
    # iterate
    for i in range(num_iterations):
        
        # forward propagation
        A, cost = forward_propagate(X, Y, w, b)
        
        # backward propagation
        grads = backward_propagate(X, Y, A)
        dw = grads["dw"]
        db = grads["db"]
        
        # update array of weights and scalar bias
        w = w - learning_rate*dw
        b = b -  learning_rate*db
        
        # save the costs (every 100th)
        if i % 100 == 0:
            costs.append(cost)
        
        # print the cost every 10 iterations
        if print_cost and i % 10 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
    
    # save optimized w and b in dictionary
    params = {"w": w,
              "b": b}
    
    # save dw and db
    grads = {"dw": dw,
             "db": db}
    
    return params, grads, costs
  
def predict(w, b, X):
    '''
    Given a set of flattened images, predict their labels.
    
    input:   w               array of weights (num_pixels,1)
             b               scalar bias (float)
             X               array of flattened images (num_pixels,num_images)
    output:  Y_prediction    array of predictions (num_images)
    '''
    
    # get number of images
    num_images = X.shape[1]
    
    # initialize prediction array
    Y_prediction = np.zeros((1,num_images))
    
    # calculate activation (probability) for each image
    A = tanh_activation(np.dot(w.T, X) + b) # ACTIVATION FUNCTION HERE
    
    # make predictions
    Y_prediction[A>0] = 1 # Labels must reflect activation function here
    Y_prediction[A<=0] = 0
    
    return Y_prediction
  
def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    """
    A wrapper for gradient descent.
    
    input:   X_train         array of flattened images for training (num_pixels,num_train)
             Y_train         array of training labels (num_train)
             X_test          array of flattened images for testing (num_pixels,num_test)
             Y_test          array of testing labels (num_test)
             num_iterations  number of iterations for optimization (scalar)
             learning_rate   gradient multiplier (scalar)
             print_cost      boolean controlling user feedback
    output:  d               a dictionary of parameters and costs
    """
    
    # initialize
    w, b = initialize_model(X_train.shape[0])
    
    # gradient descent with training set (optimization)
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
    
    # get w and b
    w = parameters["w"]
    b = parameters["b"]
    
    # predict on testing and training set
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)

    # print errors
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

    # save some parameters into a dictionary
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test, 
         "Y_prediction_train" : Y_prediction_train, 
         "w" : w, 
         "b" : b,
         "learning_rate" : learning_rate,
         "num_iterations": num_iterations}
    
    return d

In [29]:
d = model(train_set_x.T, labels_train, test_set_x.T, labels_test, num_iterations = 200, learning_rate = 0.002, print_cost = True)

Cost after iteration 0: nan
Cost after iteration 10: nan
Cost after iteration 20: nan
Cost after iteration 30: 3.191071
Cost after iteration 40: nan
Cost after iteration 50: nan
Cost after iteration 60: nan
Cost after iteration 70: 1.434102
Cost after iteration 80: 2.223786
Cost after iteration 90: nan
Cost after iteration 100: 1.751306
Cost after iteration 110: 3.880849
Cost after iteration 120: nan
Cost after iteration 130: nan
Cost after iteration 140: nan
Cost after iteration 150: 1.829955
Cost after iteration 160: nan
Cost after iteration 170: nan
Cost after iteration 180: nan
Cost after iteration 190: 1.807058
train accuracy: 74.91486946651531 %
test accuracy: 31.787260358688926 %
