### Introduction
In this notebook we implement binary classification with a minimal neural network: a single node, which is equivalent to logistic regression. We keep the use of ML libraries such as sklearn to a minimum. The main aim is to implement the math ourselves in raw Python, using NumPy only for array operations.
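
For reference, the math behind a single sigmoid node can be written out as below. The notation is an assumption chosen to match the variable names used in the code cells of this notebook (`w`, `b`, `X`, `A`, `m`, with `X` of shape n_x × m). The loss shown is the usual binary cross-entropy; the notebook does not name it explicitly, but its gradients match the update step in the training loop.

```latex
\begin{aligned}
z^{(i)} &= w^{\top} x^{(i)} + b, \qquad
a^{(i)} = \sigma\left(z^{(i)}\right) = \frac{1}{1 + e^{-z^{(i)}}} \\
J(w, b) &= -\frac{1}{m} \sum_{i=1}^{m}
  \left[ y^{(i)} \log a^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{(i)}\right) \right] \\
\frac{\partial J}{\partial w} &= \frac{1}{m} X (A - y), \qquad
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left(a^{(i)} - y^{(i)}\right)
\end{aligned}
```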

#### Sonar Dataset Prediction
Given 60 features, we classify whether a material is rock ('R') or metal ('M').

In [276]:
# Import the necessary libraries
import numpy as np
import pandas as pd

In [277]:
# Load the data set, which is in CSV format and doesn't have a header row
sonar = pd.read_csv('./datasets/sonar.csv', header=None)

In [278]:
# Verify that the data was imported correctly
sonar.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,51,52,53,54,55,56,57,58,59,60
0,0.02,0.0371,0.0428,0.0207,0.0954,0.0986,0.1539,0.1601,0.3109,0.2111,...,0.0027,0.0065,0.0159,0.0072,0.0167,0.018,0.0084,0.009,0.0032,R
1,0.0453,0.0523,0.0843,0.0689,0.1183,0.2583,0.2156,0.3481,0.3337,0.2872,...,0.0084,0.0089,0.0048,0.0094,0.0191,0.014,0.0049,0.0052,0.0044,R
2,0.0262,0.0582,0.1099,0.1083,0.0974,0.228,0.2431,0.3771,0.5598,0.6194,...,0.0232,0.0166,0.0095,0.018,0.0244,0.0316,0.0164,0.0095,0.0078,R
3,0.01,0.0171,0.0623,0.0205,0.0205,0.0368,0.1098,0.1276,0.0598,0.1264,...,0.0121,0.0036,0.015,0.0085,0.0073,0.005,0.0044,0.004,0.0117,R
4,0.0762,0.0666,0.0481,0.0394,0.059,0.0649,0.1209,0.2467,0.3564,0.4459,...,0.0031,0.0054,0.0105,0.011,0.0015,0.0072,0.0048,0.0107,0.0094,R


In [279]:
# The first 60 columns are features; the last column holds the labels
X, y = sonar.iloc[:, :60], sonar.iloc[:, 60]

In [280]:
# Standardize each feature to zero mean and unit variance
X_mean = np.mean(X, axis=0)
X_std = np.std(X, axis=0)
X = (X - X_mean) / X_std
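
The cell above computes the mean and standard deviation over the whole data set. A common alternative, which is not what this notebook does, is to compute the statistics from the training rows only and reuse them for the test rows, so that no test information influences preprocessing. The sketch below illustrates that idea on random toy data; `fit_standardizer`, `apply_standardizer`, and the `*_demo` names are hypothetical helpers, not part of the notebook.

```python
import numpy as np

def fit_standardizer(X_train):
    """Return per-feature mean and std computed from training rows only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant features
    return mean, std

def apply_standardizer(X, mean, std):
    """Scale rows of X with previously fitted statistics."""
    return (X - mean) / std

# Toy data (assumption): 8 examples as rows, 3 features as columns
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=5.0, scale=2.0, size=(8, 3))
train_demo, test_demo = X_demo[:6], X_demo[6:]

mean, std = fit_standardizer(train_demo)
train_scaled = apply_standardizer(train_demo, mean, std)
test_scaled = apply_standardizer(test_demo, mean, std)
```

With this approach the training features have exactly zero mean and unit variance, while the test features are only approximately standardized.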

In [281]:
# We use sklearn's train_test_split to split the data into training and test sets.
# Both sets should have nearly the same proportion of each label, so we pass
# stratify=y as an argument
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, test_size=0.25)

In [282]:
# n_x x m format
# Transpose the features so that each column represents an example (instance) and each row
# represents a feature; reshape the labels into column vectors of shape (m, 1)
X_train, X_test = X_train.values.T, X_test.values.T
y_train = y_train.values.reshape(-1, 1)
y_test = y_test.values.reshape(-1, 1)

In [283]:
# Encode the string labels as numbers: 'R' -> 1, 'M' -> 0
y_train = np.where(y_train == 'R', 1, 0)
y_test = np.where(y_test == 'R', 1, 0)

In [284]:
# We use the sigmoid function as the activation function
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
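
For inputs that are large and negative, `np.exp(-x)` overflows and NumPy emits a runtime warning, although the result still rounds to 0. A minimal sketch of one common workaround is shown below: it evaluates the sigmoid through whichever of two algebraically equivalent forms keeps the exponent non-positive. `stable_sigmoid` is a hypothetical name and the notebook itself uses the plain version above.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that avoids overflow in np.exp for large |x| (array input)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) <= 1, so the usual form is safe
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, use exp(x) / (1 + exp(x)), where exp(x) < 1
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out

demo = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
```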

In [294]:
def compute_linear_sum(w, b, X):
    return (w.T.dot(X) + b).T

In [295]:
# Thin wrapper so that the activation function can be swapped in one place
def apply_activation_function(z):
    return sigmoid(z)

In [296]:
# Compute the predictions using given weights
def compute_yhat(w, b, X):
    return apply_activation_function(compute_linear_sum(w, b, X))
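
As a quick shape check of the forward pass, the sketch below repeats the same computation on toy data. The sizes (3 features, 4 examples) and the `*_demo` names are assumptions for illustration only. With all-zero weights the linear sum is 0 for every example, so every prediction is 0.5.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical toy sizes: 3 features (rows), 4 examples (columns)
n_x, m = 3, 4
X_demo = np.arange(n_x * m, dtype=float).reshape(n_x, m)
w_demo = np.zeros((n_x, 1))
b_demo = 0.0

# w.T.dot(X) has shape (1, m); transposing gives one row per example
z_demo = (w_demo.T.dot(X_demo) + b_demo).T
yhat_demo = sigmoid(z_demo)
```

The `(m, 1)` output shape matches the label vectors, which is what lets `A - y` in the training loop be computed elementwise.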

In [297]:
# Set initial parameters to zero
def intialize_weights(dim):
    w = np.zeros(shape=(dim, 1))
    b = 0
    assert w.shape == (dim, 1)
    return (w, b)

In [298]:
# Learns the weights and bias with batch gradient descent
def model(X, y, num_iterations=1000, learning_rate=0.01):
    
    # Initialize the network
    m = X.shape[1]
    w, b = intialize_weights(X.shape[0])
    
    for i in range(num_iterations):
        A = compute_yhat(w, b, X)
        
        # Gradients of the mean cross-entropy loss with respect to w and b
        dw = X.dot(A - y) / m
        db = np.sum(A - y) / m
        
        # Update the weights
        w = w - learning_rate * dw
        b = b - learning_rate * db
        
    return (w, b)
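
The training loop above never evaluates the loss itself, only its gradients. To monitor convergence, one option is to compute the mean binary cross-entropy from the activations. The sketch below is a hypothetical helper (`compute_cost` and the `eps` clipping value are assumptions, not part of the notebook) that expects `A` and `y` as `(m, 1)` arrays like those used in `model`.

```python
import numpy as np

def compute_cost(A, y, eps=1e-12):
    """Mean binary cross-entropy between predictions A and labels y."""
    A = np.clip(A, eps, 1 - eps)  # keep log() away from 0
    return float(-np.mean(y * np.log(A) + (1 - y) * np.log(1 - A)))

# Toy check: predicting 0.5 for every example gives a cost of ln(2)
A_demo = np.array([[0.5], [0.5]])
y_demo = np.array([[1], [0]])
cost_demo = compute_cost(A_demo, y_demo)
```

Inside the loop, a call such as `compute_cost(A, y)` every hundred iterations would show whether the chosen learning rate is actually reducing the loss.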

In [299]:
# Applies the learned weights and thresholds the probabilities at 0.5 to get class labels
def predict(X, w, b):
    y_predict = compute_yhat(w, b, X)
    return np.where(y_predict <= 0.5, 0, 1)

In [301]:
# Train on the training set with the default hyperparameters
w, b = model(X_train, y_train)

# Training Accuracy
100 * np.where(y_train == predict(X_train, w, b), 1, 0).sum() / y_train.shape[0]

83.97435897435898

In [302]:
# Test Accuracy
100 * np.where(y_test == predict(X_test, w, b), 1, 0).sum() / y_test.shape[0]

80.76923076923077
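
The two accuracy expressions above repeat the same computation. A small helper can express it once; the sketch below uses the hypothetical name `accuracy_percent` and toy arrays in place of the notebook's data.

```python
import numpy as np

def accuracy_percent(y_true, y_pred):
    """Percentage of positions where the predicted label equals the true label."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    return 100.0 * np.mean(y_true == y_pred)

# Toy labels (assumption): 3 of the 4 predictions are correct
y_true_demo = np.array([[1], [0], [1], [1]])
y_pred_demo = np.array([[1], [0], [0], [1]])
acc = accuracy_percent(y_true_demo, y_pred_demo)
```

In the notebook this would be called as `accuracy_percent(y_test, predict(X_test, w, b))`. Because the train/test split is random, the exact percentages can differ between runs.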