# Logistic Regression as a Single Neuron of a Neural Network

Welcome to your first programming assignment! You will build a logistic regression classifier to recognize cats. This assignment steps you through how to do this with a neural network mindset, where logistic regression represents a single neuron.

**Instructions:**
- Do not use loops (for/while) in your code, unless the instructions explicitly ask you to do so.

**You will learn to:**
- Build the general architecture of a learning algorithm, including:
    - Initializing parameters
    - Calculating the cost function and its gradient
    - Using an optimization algorithm (gradient descent) 
- Gather all three functions above into a main model function, in the right order.

## 1 -  Loading Packages ##
First, let's run the cell below to import all the packages that you will need during this assignment. 

In [59]:
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
import glob
from sklearn.model_selection import train_test_split
from load_dataset import *

# A magic command cannot carry a trailing comment, so this note sits on its own line.
# Render matplotlib figures inline in the notebook.
%matplotlib inline


## 2 - Utility functions to convert images into datasets ##
The following functions convert the cat and dog images in the dataset folder into labeled NumPy arrays.

In [60]:
def convert_image_to_data(image, WIDTH, HEIGHT):
    image_resized = Image.open(image).resize((WIDTH, HEIGHT))
    image_array = np.array(image_resized).T
    return image_array

In [61]:

def create_train_test_data(num_pix,test_size=0.2):
    cat_files = glob.glob("datasets/cat*")
    dog_files = glob.glob("datasets/dog*")

    # Restrict cat and dog files here for testing
    cat_list = [convert_image_to_data(i, num_pix, num_pix) for i in cat_files]
    dog_list = [convert_image_to_data(i, num_pix, num_pix) for i in dog_files]

    y_cat = np.zeros(len(cat_list))
    y_dog = np.ones(len(dog_list))

    X = np.concatenate([cat_list, dog_list])
    y = np.concatenate([y_cat, y_dog])

    # Use the test_size argument rather than a hardcoded 0.2; stratify keeps the class ratio in both splits
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size, random_state=1, stratify=y)
    return X_train, X_test, y_train, y_test
   

### 2.1 - Load data set ###
Use the utility functions provided above to load `train_set_x`, `train_set_y`, `test_set_x`, and `test_set_y`.
Set the `num_pix` to 64 and keep the `test_size` as the default value.

In [62]:

num_pix = 64  #  image is 64 pixels 

datasets = load_dataset("car_dataset.hdf5")

# The data below is used to train the model. Indexing into the loaded dataset returns the arrays.
# The training data is loaded first, followed by the dev data.
x = np.array(datasets["train"]["X"])  # training images
y = np.array(datasets["train"]["Y"])  # training labels

# The dev split serves as the test data used to measure accuracy.
x1 = np.array(datasets["dev"]["X"])
y1 = np.array(datasets["dev"]["Y"])


#### 2.1.1 - Print the shapes ###
Print the shape of the following variables
- Number of training examples: m_train
- Number of testing examples: m_test
- Height/Width of each image: num_px
- train_set_x shape
- train_set_y shape
- test_set_x shape
- test_set_y shape

In [63]:
# Print the shapes of the different arrays
num_pix = 64
print("Number of training examples: ", x.shape[0])  # number of training images; x is the training set
print("Number of testing examples: ", x1.shape[0])  # number of test images used to measure model accuracy
print("Height/Width of each image:", num_pix)  # pixel height and width
print (x.shape)
print (y.shape)
print (x1.shape)
print (y1.shape)


Number of training examples:  517
Number of testing examples:  172
Height/Width of each image: 64
(517, 64, 64, 3)
(517,)
(172, 64, 64, 3)
(172,)


#### 2.1.2 - Fixing ill-shape vectors ####
It is possible that train_set_y and test_set_y are ill-shaped, for example rank-1 arrays of shape (number of examples,). Reshape them so that each is a matrix of shape (1, number of examples).

In [64]:
# Reshaping the vectors
y_flat = y.reshape(y.shape[0], -1).T
y1_flat = y1.reshape(y1.shape[0], -1).T

yp_train_set = y_flat
yc_test_set = y1_flat

# This is a cat/dog dataset

print ("train_set_y_f shape: " + str(y_flat.shape))  # str() converts the shape tuple so it can be concatenated
print ("test_set_y_f shape: " + str(y1_flat.shape))

train_set_y_f shape: (1, 517)
test_set_y_f shape: (1, 172)
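
To illustrate why the explicit (1, number of examples) shape matters, here is a minimal sketch with made-up labels and activations. The names `y_rank1`, `A_demo`, `y_column`, and `y_row` are hypothetical and exist only for this illustration. It shows that a rank-1 array is unchanged by a transpose, and that a column-shaped label array silently broadcasts into a square matrix when subtracted from a row of activations.

```python
import numpy as np

# Hypothetical labels and activations for m = 4 examples (illustration only).
y_rank1 = np.array([0, 1, 1, 0])             # shape (4,): a rank-1 array
A_demo = np.array([[0.2, 0.9, 0.6, 0.1]])    # shape (1, 4), like the output of sigmoid

# Transposing a rank-1 array does nothing, so it is neither a row nor a column.
print(y_rank1.T.shape)

# A column-shaped label array (4, 1) broadcasts against (1, 4) into a (4, 4) matrix.
y_column = y_rank1.reshape(-1, 1)
diff_bad = A_demo - y_column
print(diff_bad.shape)

# The reshape-then-transpose pattern used in the cell above yields (1, 4),
# which lines up element by element with A_demo.
y_row = y_rank1.reshape(y_rank1.shape[0], -1).T
diff_good = A_demo - y_row
print(y_row.shape, diff_good.shape)
```

Keeping every label array in the (1, m) layout means expressions such as `A - Y` stay elementwise, with no accidental broadcasting.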


### 2.2 - Flatten the data
Reshape the training and test datasets so that each image is flattened into a single column vector of shape (num_px ∗ num_px ∗ 3, 1). Check the shapes of train_set_x_flatten and test_set_x_flatten.

In [65]:
# flatten each image into one column: reshape to (m, -1), then transpose to (num_px*num_px*3, m)
x_flat = x.reshape(x.shape[0], -1).T

x1_flat = x1.reshape(x1.shape[0], -1).T


print ("train_set_x shape: " + str(x_flat.shape))

print ("test_set_x shape: " + str(x1_flat.shape))

train_set_x shape: (12288, 517)
test_set_x shape: (12288, 172)
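
The flattening step can be checked on a tiny array. The sketch below uses a hypothetical batch of two 2x2 "images" with 3 channels (the names `batch`, `flat`, and `wrong` are assumptions for this example) and confirms that each column of the flattened matrix holds exactly one image. It also shows a tempting alternative reshape that produces the right shape but mixes pixels from different images.

```python
import numpy as np

# Hypothetical batch: 2 images, each 2x2 pixels with 3 channels.
batch = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)

# Same pattern as the cell above: (m, h, w, c) -> (m, h*w*c) -> (h*w*c, m)
flat = batch.reshape(batch.shape[0], -1).T
print(flat.shape)

# Column j holds every pixel value of image j.
print(np.array_equal(flat[:, 0], batch[0].ravel()))
print(np.array_equal(flat[:, 1], batch[1].ravel()))

# Reshaping straight to (h*w*c, m) gives the same shape but interleaves the images.
wrong = batch.reshape(-1, batch.shape[0])
print(wrong.shape, np.array_equal(wrong[:, 0], batch[0].ravel()))
```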


### 2.3 - Standardize the data
Divide every entry of the dataset by 255, the maximum value of an 8-bit pixel channel, so that all features lie in the range [0, 1].

In [66]:
# divide the flattened image data by 255 (these hold X data, so they are named xp_*, not yp_*)
xp_train_set = x_flat/255
xp_test_set = x1_flat/255

## 3 - Implementing the Helper Functions ## 

### 3.1 - Sigmoid function
Implement `sigmoid()`. 

In [67]:
# sigmoid function; z can be a scalar or a numpy array of any shape
def sigmoid(z):
    s = 1/(1 + np.exp(-z))  # np.exp is applied elementwise
    return s
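
A quick sanity check of the sigmoid can catch sign or parenthesis mistakes early. The sketch below redefines the same function so that it runs on its own, then checks two properties that follow from the formula: the value at zero is one half, and the function satisfies sigmoid(-z) = 1 - sigmoid(z). The test vector `z_demo` is an arbitrary choice.

```python
import numpy as np

def sigmoid(z):
    # Same formula as the cell above, repeated so this example is self-contained.
    return 1 / (1 + np.exp(-z))

z_demo = np.array([-2.0, 0.0, 2.0])
s_demo = sigmoid(z_demo)
print(s_demo)

# The middle entry corresponds to z = 0, where the sigmoid equals 0.5.
print(s_demo[1])

# Symmetry: sigmoid(-z) equals 1 - sigmoid(z) for every entry.
print(np.allclose(sigmoid(-z_demo), 1 - s_demo))
```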

### 3.2 - Implement initialize_with_zeros
Write a function that initializes `w` as a vector of zeros and sets `b` to zero.

In [68]:
# initialize w as a column vector of zeros and b as zero
def initialize_with_zeros(dim):  # dim is the number of input features
    w = np.zeros(shape=(dim, 1))
    b = 0.0

    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))

    return w, b

### 3.3 - Forward propagation

Implement forward propagation to calculate $A$ and cost.

Forward Propagation:
- You get X
- You compute $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, ..., a^{(m-1)}, a^{(m)})$
- You calculate the cost function: $J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{(i)})+(1-y^{(i)})\log(1-a^{(i)})\right]$


In [69]:
# forward propagation: compute the activations A and the cost
def forward_propogation(w,b,X,Y) :  # w: weights (n, 1), b: bias, X: data (n, m), Y: labels (1, m)
    
    m = X.shape[1]  # number of examples
   
    Z= np.dot(w.T,X)+b
    A = sigmoid(Z)   # sigmoid() is defined above
    cost = (-1/m) * np.sum(Y* np.log(A)+ (1- Y) * np.log(1-A))  # average cross-entropy loss over the m examples
    cost = np.squeeze(cost)
    assert(cost.shape == ())
    
    return cost,A,m
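
As a worked example of the cost formula, consider what happens with all-zero parameters. Every activation is then sigmoid(0) = 0.5, so each term of the sum is log(0.5) and the cost equals log(2) ≈ 0.693147 regardless of the data. The sketch below verifies this on hypothetical toy data (`X_toy`, `Y_toy`, `w0`, `b0` are made up for the illustration).

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical toy data: 3 features, 4 examples.
X_toy = np.array([[0.1, 0.5, 0.9, 0.3],
                  [0.2, 0.4, 0.8, 0.7],
                  [0.6, 0.0, 0.1, 1.0]])
Y_toy = np.array([[0, 1, 1, 0]])

# Zero-initialized parameters, as returned by an all-zeros initializer.
w0 = np.zeros((3, 1))
b0 = 0.0

m_toy = X_toy.shape[1]
A0 = sigmoid(np.dot(w0.T, X_toy) + b0)   # every entry is 0.5
cost0 = (-1 / m_toy) * np.sum(Y_toy * np.log(A0) + (1 - Y_toy) * np.log(1 - A0))

print(A0)
print(cost0, np.log(2))
```

This is a useful first check when training starts from zeros: the cost reported at iteration 0 should be log(2).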

### 3.4 - Backward propagation

Implement backward propagation to compute the gradients $dw$ and $db$.
Here are the two formulas you will be using: 

$$ dw = \frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T\tag{7}$$
$$ db = \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)}-y^{(i)})\tag{8}$$

In [70]:
# backward propagation: compute the gradients of the cost with respect to w and b
def backward_propogation(w,b,X,Y) :  # takes the same arguments as forward_propogation
    m = X.shape[1]  
    Z= np.dot(w.T,X)+b
    A = sigmoid(Z) 
    
    dz = A - Y  # derivative of the cost with respect to Z, up to the 1/m factor
    dw = (1/m) * np.dot(X,dz.T)
    db = (1/m) * np.sum(dz)
    
    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    
    return dw,db
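
One way to gain confidence in the two gradient formulas is to compare them against finite differences of the cost. The sketch below is a standalone check, not part of the assignment code: `cost_fn` and `analytic_grads` are hypothetical helpers that mirror the formulas above, and the data is random. The `for` loop is used only for the check, where one parameter is perturbed at a time.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_fn(w, b, X, Y):
    # Cross-entropy cost J, matching the forward propagation formula.
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    return (-1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

def analytic_grads(w, b, X, Y):
    # dw = (1/m) X (A - Y)^T and db = (1/m) sum(A - Y)
    m = X.shape[1]
    dz = sigmoid(np.dot(w.T, X) + b) - Y
    return (1 / m) * np.dot(X, dz.T), (1 / m) * np.sum(dz)

rng = np.random.default_rng(0)
X_chk = rng.random((3, 5))
Y_chk = np.array([[0, 1, 0, 1, 1]])
w_chk = 0.1 * rng.normal(size=(3, 1))
b_chk = 0.05

dw_chk, db_chk = analytic_grads(w_chk, b_chk, X_chk, Y_chk)

# Central finite differences, perturbing one parameter at a time.
eps = 1e-6
dw_num = np.zeros_like(w_chk)
for k in range(w_chk.shape[0]):
    step = np.zeros_like(w_chk)
    step[k, 0] = eps
    dw_num[k, 0] = (cost_fn(w_chk + step, b_chk, X_chk, Y_chk)
                    - cost_fn(w_chk - step, b_chk, X_chk, Y_chk)) / (2 * eps)
db_num = (cost_fn(w_chk, b_chk + eps, X_chk, Y_chk)
          - cost_fn(w_chk, b_chk - eps, X_chk, Y_chk)) / (2 * eps)

print(np.max(np.abs(dw_chk - dw_num)), abs(db_chk - db_num))
```

If the analytic and numerical gradients disagree by much more than the finite-difference error, the usual suspects are a missing 1/m factor or a transposed operand in the dot product.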

### 3.5 - Optimization
- You have initialized your parameters.
- You are also able to compute a cost function and its gradient.
- Now, you want to update the parameters using gradient descent.

Write down the optimization function. The goal is to learn $w$ and $b$ by minimizing the cost function $J$. 

In [71]:
# gradient descent: repeatedly update w and b to reduce the cost
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost= False) :
    
    costs = []  # cost recorded every 100 iterations
    for i in range(num_iterations) :

        cost,A,m = forward_propogation(w,b,X,Y)
        dw,db = backward_propogation(w,b,X,Y)

        w = w-learning_rate*dw  # step against the gradient, scaled by the learning rate
        b = b-learning_rate*db

        if i % 100 == 0:
            costs.append(cost)

        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i,cost))

    return dw,db,w,b,costs
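
The update rule can be watched in action on a problem small enough to inspect by hand. The following is a minimal sketch under stated assumptions: a hypothetical one-feature dataset (`X_gd`, `Y_gd`), a learning rate of 0.5 chosen for this toy problem only, and 200 iterations. It records the cost at every step to show that it starts at log(2) and decreases.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Hypothetical toy problem: one feature; the label is 1 when the feature is large.
X_gd = np.array([[0.1, 0.2, 0.3, 0.7, 0.8, 0.9]])
Y_gd = np.array([[0, 0, 0, 1, 1, 1]])
m_gd = X_gd.shape[1]

w_gd = np.zeros((1, 1))
b_gd = 0.0
lr = 0.5          # assumed value, suitable for this toy problem only
history = []

for i in range(200):
    # Forward pass: activations and cost
    A = sigmoid(np.dot(w_gd.T, X_gd) + b_gd)
    cost = (-1 / m_gd) * np.sum(Y_gd * np.log(A) + (1 - Y_gd) * np.log(1 - A))
    history.append(cost)

    # Backward pass and parameter update
    dz = A - Y_gd
    w_gd = w_gd - lr * (1 / m_gd) * np.dot(X_gd, dz.T)
    b_gd = b_gd - lr * (1 / m_gd) * np.sum(dz)

print(history[0], history[-1])
```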
    

### 3.6 - Prediction

Implement the `predict()` function. There are two steps to computing predictions:

1. Calculate $\hat{Y} = A = \sigma(w^T X + b)$

2. Convert the entries of $A$ into 0 (if activation <= 0.5) or 1 (if activation > 0.5), and store the predictions in a vector `Y_prediction`. If you wish, you can use an `if`/`else` statement in a `for` loop (though there is also a way to vectorize this).

In [72]:
# predict labels (0 or 1) for the examples in X_test
def predict(w, b, X_test):

    m = X_test.shape[1]  # number of examples

    w = w.reshape(X_test.shape[0], 1)

    Z= np.dot(w.T,X_test) + b
    A = sigmoid(Z)

    Y_prediction = np.around(A)  # rounds each activation to 0 or 1; exactly 0.5 rounds to 0

    assert(Y_prediction.shape == (1, m))

    return Y_prediction
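
The thresholding step can be vectorized in more than one way. The sketch below compares an explicit comparison against `np.around` on a few hypothetical activations (`A_pred` is made up for the example). Both agree here because `np.around` rounds halves to the nearest even value, so an activation of exactly 0.5 becomes 0, which matches the "<= 0.5 gives 0" rule.

```python
import numpy as np

# Hypothetical activations, including the boundary value 0.5.
A_pred = np.array([[0.1, 0.5, 0.51, 0.99]])

# Explicit comparison: strictly greater than 0.5 becomes 1.0, everything else 0.0.
Y_threshold = (A_pred > 0.5).astype(float)

# Rounding: np.around rounds halves to the nearest even value, so 0.5 -> 0.0.
Y_round = np.around(A_pred)

print(Y_threshold)
print(Y_round)
print(np.array_equal(Y_threshold, Y_round))
```

The explicit comparison states the decision rule directly, which can be easier to read and to adapt if a threshold other than 0.5 is ever needed.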

## 4 - Regression Model ##
Implement the model function. Use the following notation:
- Y_prediction for your predictions on the test set
- Y_prediction_train for your predictions on the train set
- w, costs, grads for the outputs of optimize()

In [73]:
# regression model built from the helper functions defined earlier
def reg_model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
       
    # initialize the parameters with zeros
    n= X_train.shape[0]
    w, b = initialize_with_zeros(dim=n)  # one weight per input feature
    

    # run gradient descent; the arguments must be passed in this order
    dw,db,w,b,costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
    
    # predict labels for the test and training sets
    Y_prediction_test = predict(w,b,X_test)
    Y_prediction_train = predict(w,b,X_train)

    # accuracy is 100 minus the mean absolute difference (in percent) between predictions and labels
    print("accuracy_train: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("accuracy_test: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

    
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test, 
         "Y_prediction_train" : Y_prediction_train, 
         "w" : w, 
         "b" : b,
         "learning_rate" : learning_rate,
         "num_iterations": num_iterations}
    
    return d
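
The accuracy expression used in the model function can be unpacked on a small example. In the sketch below, `Y_true` and `Y_hat` are hypothetical label and prediction arrays. When both arrays contain only 0 and 1, the absolute difference is 1 for each mismatch and 0 otherwise, so its mean is the error rate. The expression stops being an accuracy if either array holds other values, for example labels that were rescaled.

```python
import numpy as np

# Hypothetical labels and predictions for 5 examples (2 mismatches).
Y_true = np.array([[1, 0, 1, 1, 0]])
Y_hat = np.array([[1.0, 0.0, 0.0, 1.0, 1.0]])

# Expression used in the model function: 100 minus the error rate in percent.
accuracy = 100 - np.mean(np.abs(Y_hat - Y_true)) * 100

# Equivalent formulation: percentage of entries where prediction equals label.
accuracy_alt = np.mean(Y_hat == Y_true) * 100

print(accuracy, accuracy_alt)
```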
    

## 5. Run the regression model function to train your model.
### 5.1 - Setting parameters (part 1)
Set the `num_iterations` to 5000 and `learning_rate` to 0.0005

In [74]:

# Use the 'train' split for training and the 'test' split for evaluation
x = np.array(datasets["train"]["X"])
x_flat = x.reshape(x.shape[0], -1).T
xp1_train_set = x_flat/255
print (xp1_train_set.shape)  # print the shape

y = np.array(datasets["train"]["Y"])
y_flat = y.reshape(y.shape[0], -1).T
yp1_train_set = y_flat
print (yp1_train_set.shape)


x1 = np.array(datasets["test"]["X"])
x1_flat = x1.reshape(x1.shape[0], -1).T
xp1_test_set = x1_flat/255
print (xp1_test_set.shape)

y1 = np.array(datasets["test"]["Y"])
y1_flat = y1.reshape(y1.shape[0], -1).T
yp1_test_set = y1_flat  # labels are already 0/1; only pixel values are divided by 255
print (yp1_test_set.shape)

# reg_model returns a dictionary of results, so it is stored in d rather than costs
d = reg_model(xp1_train_set, yp1_train_set, xp1_test_set, yp1_test_set,  num_iterations = 5000, learning_rate = 0.0005, print_cost = True)


(12288, 517)
(1, 517)
(12288, 173)
(1, 173)
Cost after iteration 0: 0.693147
Cost after iteration 100: 0.548509
Cost after iteration 200: 0.496297
Cost after iteration 300: 0.459536
Cost after iteration 400: 0.432244
Cost after iteration 500: 0.411066
Cost after iteration 600: 0.394034
Cost after iteration 700: 0.379940
Cost after iteration 800: 0.368004
Cost after iteration 900: 0.357705
Cost after iteration 1000: 0.348679
Cost after iteration 1100: 0.340667
Cost after iteration 1200: 0.333477
Cost after iteration 1300: 0.326965
Cost after iteration 1400: 0.321019
Cost after iteration 1500: 0.315553
Cost after iteration 1600: 0.310498
Cost after iteration 1700: 0.305796
Cost after iteration 1800: 0.301404
Cost after iteration 1900: 0.297283
Cost after iteration 2000: 0.293401
Cost after iteration 2100: 0.289732
Cost after iteration 2200: 0.286254
Cost after iteration 2300: 0.282948
Cost after iteration 2400: 0.279796
Cost after iteration 2500: 0.276785
Cost after iteration 2600: 0.273

**Expected Output**: 

<table style="width:40%"> 
    
    <tr>
        <td> **Train Accuracy**  </td> 
        <td> 91.25 % </td>
    </tr>

    <tr>
        <td>**Test Accuracy** </td> 
        <td> 60.0 % </td>
    </tr>
</table> 




Let's also plot the cost function over the training iterations.

In [None]:

# plot for learning rate = 0.0005 and num_iterations = 5000, evaluated on the 'test' split
d = reg_model(xp1_train_set, yp1_train_set, xp1_test_set, yp1_test_set,  num_iterations = 5000, learning_rate = 0.0005, print_cost = True)
costs =np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.title("Plot 1")
plt.xlabel('Iterations (per hundred)')
plt.show()



Cost after iteration 0: 0.693147
Cost after iteration 100: 0.548509
Cost after iteration 200: 0.496297
Cost after iteration 300: 0.459536
Cost after iteration 400: 0.432244
Cost after iteration 500: 0.411066
Cost after iteration 600: 0.394034
Cost after iteration 700: 0.379940
Cost after iteration 800: 0.368004
Cost after iteration 900: 0.357705
Cost after iteration 1000: 0.348679
Cost after iteration 1100: 0.340667
Cost after iteration 1200: 0.333477


### 5.2 - Setting parameters (part 2)
Set the `num_iterations` to 2000 and `learning_rate` to 0.005 and run the model again. Plot the cost function.

In [None]:
d=reg_model(xp1_train_set, yp1_train_set, xp1_test_set, yp1_test_set,  num_iterations = 2000, learning_rate = 0.005, print_cost = True)
costs =np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.title("Plot 2")
plt.xlabel('Iterations (per hundred)')
plt.show()


### 5.3 - Analysis
Compare the cost function plots of part_1 and part_2. Write your observation below.

#### Write Your observation here

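Plot 1 and Plot 2 differ in both learning rate and iteration count. Plot 1 uses a learning rate of 0.0005, ten times lower than the 0.005 used for Plot 2, and runs for 5000 iterations compared with 2000 iterations for Plot 2.
1. The learning rate and the number of iterations are set independently, but they trade off: a higher learning rate takes larger steps, so fewer iterations are needed to reach a given cost.
2. A lower learning rate approaches the minimum of the cost function in smaller, more stable steps, at the price of needing more iterations. A learning rate that is too high can make the cost oscillate or diverge.
3. The accuracy of model 2 is greater than that of model 1. This reflects the combination of learning rate and number of iterations rather than the iteration count alone.
4. As the problem becomes more complex, a lower learning rate combined with a higher number of iterations tends to give more reliable results.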
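
The interaction between learning rate and iteration count can be illustrated on a toy problem. This is a sketch under stated assumptions, not a rerun of the experiment above: the dataset is a hypothetical one-feature example, `run_gd` is a helper written for this illustration, and only the two learning rates (0.0005 and 0.005) are taken from the assignment. With the same number of iterations, the larger of these two rates ends at a lower cost because each step covers more ground.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def run_gd(X, Y, learning_rate, num_iterations):
    """Hypothetical helper: plain gradient descent, returning the final cost."""
    m = X.shape[1]
    w = np.zeros((X.shape[0], 1))
    b = 0.0
    for _ in range(num_iterations):
        dz = sigmoid(np.dot(w.T, X) + b) - Y
        w = w - learning_rate * (1 / m) * np.dot(X, dz.T)
        b = b - learning_rate * (1 / m) * np.sum(dz)
    # Cost evaluated with the final parameters
    A = sigmoid(np.dot(w.T, X) + b)
    return (-1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

# Hypothetical toy data: one feature, six examples.
X_lr = np.array([[0.1, 0.2, 0.3, 0.7, 0.8, 0.9]])
Y_lr = np.array([[0, 0, 0, 1, 1, 1]])

cost_small = run_gd(X_lr, Y_lr, learning_rate=0.0005, num_iterations=2000)
cost_large = run_gd(X_lr, Y_lr, learning_rate=0.005, num_iterations=2000)

print("lr=0.0005:", cost_small)
print("lr=0.005: ", cost_large)
```

On real data the picture depends on the scale of the features and the shape of the cost surface, so the comparison should be repeated with the actual training set before drawing conclusions.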