# Image Classification Using Logistic Regression 

![title](cat2.jpg)

In this project we build an image classifier that identifies cat images with 70% accuracy. 
It replicates a project from Andrew Ng's course "Neural Networks and Deep Learning" on Coursera.

**Statement of Problem**

We are given data in h5 files containing images, split into train and test datasets. Our task is to build a simple image-recognition algorithm that classifies an image as either "cat" or "non-cat". 

**We are going to follow these steps:**

- Obtain the train and test datasets (stored in h5 files)
- Vectorize the datasets to speed up the learning process
- Learn the logistic regression model parameters using the train dataset
- Use the test dataset to test the model

**Let us go!**





## Step 1: Import the necessary libraries 

In this first step we are going to import all the libraries/modules/packages that we will need when building the classifier.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset

%matplotlib inline

## Step 2: Loading, Exploring and Transforming the Data

In this step we load the data, explore what it looks like, and make the transformations needed to optimize the learning process. Specifically, we vectorize (flatten) each image into a column vector and normalize the pixel values.
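
To make the flattening and normalization concrete, here is a minimal sketch on a tiny synthetic array. The array `images` is a made-up stand-in for the real dataset (2 "images" of 4x4 pixels with 3 channels), not data from the h5 files.

```python
import numpy as np

# Hypothetical stand-in for the image array: shape (examples, height, width, channels)
images = np.arange(2 * 4 * 4 * 3, dtype=np.uint8).reshape(2, 4, 4, 3)

# Unroll each image into one column: result has shape (height*width*channels, examples)
flat = images.reshape(images.shape[0], -1).T

# Scale pixel values from [0, 255] to [0, 1]
scaled = flat / 255.0

print(flat.shape)
print(scaled.min(), scaled.max())
```

Each column of `flat` holds one image, which is the layout the functions in Step 3 expect.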

In [None]:
# Load the data
train_set_x_orig, train_set_y, test_set_x_orig,test_set_y, classes = load_dataset()

In [None]:
# check an example of the data (picture)
index = 19
plt.imshow(train_set_x_orig[index])
print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") +  "' picture.")

In [None]:
# Find out the number of train and test examples in the datasets as well as the dimensions of the pictures
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]

print('Number of Training Example: m_train = ' + str(m_train))
print('Number of Testing Example: m_test = ' + str(m_test))
print("Each Image is of size: ("+str(num_px)+","+str(num_px)+","+"3"+")")
print('Shape of train_set_x: '+str(train_set_x_orig.shape))
print('Shape of train_set_y: '+str(train_set_y.shape))
print('Shape of test_set_x: '+str(test_set_x_orig.shape))
print('Shape of test_set_y: '+str(test_set_y.shape))

In [None]:
# Reshaping the train and test examples-unroll the pictures to a vector (num_px*num_px*3,1)
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0],-1).T
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0],-1).T
print(train_set_x_flatten.shape)
print(test_set_x_flatten.shape)
# Normalize the examples: scale pixel values from [0, 255] to [0, 1]
train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.

## Step 3: Building the Algorithm 

Now that we have prepared the data, we need to feed it into the various steps of the algorithm. Below is a **mathematical** summary for the algorithm: 

For one example $x^{(i)}$:
 $$z^{(i)} = w^T x^{(i)} + b \tag{1}$$
 $$\hat{y}^{(i)} = a^{(i)} = sigmoid(z^{(i)})\tag{2}$$ 
 $$ \mathcal{L}(a^{(i)}, y^{(i)}) =  - y^{(i)}  \log(a^{(i)}) - (1-y^{(i)} )  \log(1-a^{(i)})\tag{3}$$

The cost is then computed by averaging the loss over all training examples:
 $$ J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)})\tag{4}$$
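
Gradient descent also needs the gradients of the cost. As a short derivation sketch from equations (1) to (3), writing $X$ for the matrix whose columns are the examples $x^{(i)}$, $A$ for the row vector of activations $a^{(i)}$ and $Y$ for the row vector of labels $y^{(i)}$ (this matrix notation is an assumption chosen to match the arrays used in the code):

 $$ \frac{\partial \mathcal{L}(a^{(i)}, y^{(i)})}{\partial z^{(i)}} = a^{(i)} - y^{(i)} $$
 $$ \frac{\partial J}{\partial w} = \frac{1}{m} \sum_{i=1}^m x^{(i)} (a^{(i)} - y^{(i)}) = \frac{1}{m} X (A - Y)^T $$
 $$ \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)} - y^{(i)}) $$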
 
We are going to build support functions which will ultimately be combined to give the model (classifier).

### Step 3a: Sigmoid Function 

In [None]:
def sigmoid(z):
    """
    Compute the sigmoid of z 
    
    Arguments:
    z - A scalar or a matrix or a vector
    
    Returns:
    s - sigmoid of z 
    """
    s = 1/(1+np.exp(-z))
    return s
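
For very negative inputs, `np.exp(-z)` overflows and NumPy emits a runtime warning, even though the result still rounds to 0. The sketch below shows one common way to avoid that. `stable_sigmoid` is a hypothetical helper, not part of the original notebook, and it always returns an array with at least one dimension.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that never exponentiates a large positive number (hypothetical helper)."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the usual formula is safe
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, rewrite as exp(z) / (1 + exp(z)) so the exponent stays negative
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

values = stable_sigmoid([-1000.0, 0.0, 2.0, 1000.0])
print(values)
```

For the pixel data in this project the plain formula is adequate, since the weights start at zero and the inputs are scaled to [0, 1].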


### Step 3b: Initializing the Parameters

In [None]:
def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.
    
    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)
    
    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    w = np.zeros((dim,1))
    b = 0
    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))
    
    return w, b

### Step 3c. Forward and Backward Propagation

In [None]:
def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    grads -- dictionary with "dw" (gradient of the cost with respect to w, same shape as w)
             and "db" (gradient of the cost with respect to b, a scalar)
    cost -- negative log-likelihood cost for logistic regression
    """
    
    m = X.shape[1]
    
    # Forward propagation 
    z = np.dot(w.T,X)+b
    A = sigmoid(z)  
    cost = (-1/m)*np.sum((Y*np.log(A)+(1-Y)*np.log(1-A)))
    
    # Backward propagation 
    dw = 1/m*np.dot(X,((A-Y).T))
    db = 1/m*np.sum(A-Y)
    
    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())
    
    grads = {"dw": dw,
             "db": db}
    
    return grads, cost
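
One way to gain confidence in the backward propagation formulas is to compare them with finite differences. The sketch below does this on small random data. `cost_and_grads` is a compact, hypothetical restatement of `propagate` so that the block runs on its own, and `numerical_grad_w` is likewise an illustrative helper.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_and_grads(w, b, X, Y):
    """Compact restatement of propagate(): returns cost, dw, db."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    return cost, dw, db

def numerical_grad_w(w, b, X, Y, eps=1e-6):
    """Central-difference estimate of dJ/dw, one coordinate at a time."""
    grad = np.zeros_like(w)
    for j in range(w.shape[0]):
        step = np.zeros_like(w)
        step[j, 0] = eps
        cost_plus, _, _ = cost_and_grads(w + step, b, X, Y)
        cost_minus, _, _ = cost_and_grads(w - step, b, X, Y)
        grad[j, 0] = (cost_plus - cost_minus) / (2 * eps)
    return grad

# Small random problem: 4 features, 6 examples (synthetic, not the cat data)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))
Y = (rng.random((1, 6)) > 0.5).astype(float)
w = rng.normal(size=(4, 1)) * 0.1
b = 0.2

_, dw, db = cost_and_grads(w, b, X, Y)
max_diff = np.max(np.abs(dw - numerical_grad_w(w, b, X, Y)))
print("largest difference between analytic and numerical dw:", max_diff)
```

A difference close to zero suggests the analytic gradient matches the cost function.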
   

### Step 3d. Optimizing the Parameters

In [None]:
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    """
    This function optimizes w and b by running a gradient descent algorithm
    
    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the cost every 100 iterations
    
    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.
    
    The loop repeats two steps:
        1) Calculate the cost and the gradient for the current parameters using propagate().
        2) Update the parameters using the gradient descent rule for w and b.
    """
    
    costs = []
    
    for i in range(num_iterations):
        grads, cost = propagate(w,b,X,Y)
        dw = grads["dw"]
        db = grads["db"]
        w = w-(learning_rate*dw)
        b = b-(learning_rate*db)
        if i % 100 == 0:
            costs.append(cost)
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))
    params = {"w": w,
              "b": b}
    
    grads = {"dw": dw,
             "db": db}
    
    return params, grads, costs
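
To see the update rule at work without the image data, here is a minimal sketch of the same loop on a toy one-feature problem. The function `gradient_descent` and the toy arrays are assumptions made for illustration; they restate the steps of `optimize` in a self-contained form.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_descent(X, Y, num_iterations=200, learning_rate=0.1):
    """Illustrative restatement of the optimize() loop, recording the cost at every step."""
    m = X.shape[1]
    w = np.zeros((X.shape[0], 1))
    b = 0.0
    costs = []
    for _ in range(num_iterations):
        # Forward pass and cost for the current parameters
        A = sigmoid(np.dot(w.T, X) + b)
        costs.append(-np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m)
        # Gradient descent update for w and b
        w = w - learning_rate * np.dot(X, (A - Y).T) / m
        b = b - learning_rate * np.sum(A - Y) / m
    return w, b, costs

# Toy data: a single feature, label 1 when the feature is positive
X = np.array([[-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]])
Y = np.array([[0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])

w, b, costs = gradient_descent(X, Y)
print("first cost:", costs[0])
print("last cost:", costs[-1])
```

With zero-initialized parameters every activation starts at 0.5, so the first cost equals log(2); the cost then falls as the weight grows in the direction that separates the two classes.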


### Step 3e. Predicting (Finding the Predicted values)

In [None]:
def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)
    
    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    
    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    
    m = X.shape[1]
    w = w.reshape(X.shape[0], 1)
    # Probability that each example is a cat
    A = sigmoid(np.dot(w.T, X) + b)
    # Threshold at 0.5: probabilities above 0.5 become 1, the rest 0
    Y_prediction = (A > 0.5).astype(float)
    assert(Y_prediction.shape == (1, m))
    
    return Y_prediction


## Step 4. Merging all the support functions to form the model 

This is the final step, in which we combine all the functions above to form the model.

In [None]:
def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    """
    Builds the logistic regression model by calling the functions implemented above
    
    Arguments:
    X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
    Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
    X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
    Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
    num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
    learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
    print_cost -- Set to true to print the cost every 100 iterations
    
    Returns:
    d -- dictionary containing information about the model.
    """
    # initialize parameters with zeros
    w, b = initialize_with_zeros(X_train.shape[0])
    # Gradient descent
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost=print_cost)
    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]
    # Predict test/train set examples
    Y_prediction_test = predict(w,b,X_test)
    Y_prediction_train = predict(w,b,X_train)
    # Print train/test accuracy
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test, 
         "Y_prediction_train" : Y_prediction_train, 
         "w" : w, 
         "b" : b,
         "learning_rate" : learning_rate,
         "num_iterations": num_iterations}
    
    return d
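
The accuracy expression inside `model` works because predictions and labels are both 0 or 1, so the mean absolute difference is the fraction of wrong predictions. A minimal sketch with made-up arrays (illustrative values, not results from the model) shows that it agrees with a direct count of matches:

```python
import numpy as np

# Made-up predictions and labels, shaped (1, number of examples) as in the notebook
predictions = np.array([[1.0, 0.0, 1.0, 1.0]])
labels = np.array([[1, 0, 0, 1]])

# Formula used in model(): 100 minus the percentage of mismatches
accuracy_from_errors = 100 - np.mean(np.abs(predictions - labels)) * 100

# Equivalent formulation: percentage of positions where prediction equals label
accuracy_from_matches = np.mean(predictions == labels) * 100

print(accuracy_from_errors, accuracy_from_matches)
```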


    

## Step 5. Results


In [None]:
d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)

In [None]:
# Plot learning curve (with costs)
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()

In [None]:
# Trying different learning rates
learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print ("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, print_cost = False)
    print ('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations (hundreds)')

legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()

In [None]:
# Trying with our own image
my_image = "douglas.jpg"
# Preprocess the image to match the model input: RGB, num_px x num_px, scaled to [0, 1].
# scipy.misc.imresize was removed in SciPy 1.3, so PIL does the resizing here.
image = Image.open(my_image).convert("RGB")
resized = np.array(image.resize((num_px, num_px)))
my_image = resized.reshape((1, num_px*num_px*3)).T / 255.
my_predicted_image = predict(d["w"], d["b"], my_image)

plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", the algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image))].decode("utf-8") + "\" picture.")


    
# Conclusion 
    
Thank you all for following this step-by-step explanation of how to implement logistic regression. See you in my next project! You can follow me on Twitter: @dkibowen. 