# Introduction To Neural Networks
Learn linear regression, perceptrons, and gradient descent.

## L5-9 AND Perceptron

### Find the weights and bias for AND perceptron
1. Plotting the points [(0,0), (0,1), (1,0), (1,1)] on a graph shows that they form a square.
2. The diagonal that does not pass through the origin has the equation x + y = 1.
3. The points on that line, [(0,1), (1,0)], are `False` outputs for the perceptron.
4. So the required line is parallel to x + y = 1 and shifted away from the origin.
5. With both weights set to 1, any bias b with -2 <= b < -1 works, for example -1.1, -2, or -1.0001.
6. If the bias goes below -2, the line moves past the point (1, 1), which must be `True`.
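
The bias range described in the steps above can be checked directly. The sketch below assumes both weights are fixed at 1.0 and uses a hypothetical helper, `solves_and`, that is not part of the lesson code. It tests a few candidate biases against the AND truth table.

```python
def solves_and(bias, weight1=1.0, weight2=1.0):
    """Return True if a step perceptron with these parameters reproduces AND."""
    truth_table = {(0, 0): False, (0, 1): False, (1, 0): False, (1, 1): True}
    return all(
        (weight1 * x1 + weight2 * x2 + bias >= 0) == expected
        for (x1, x2), expected in truth_table.items()
    )

# Candidates on both sides of the two boundaries (-1 and -2)
for candidate in [-1.0, -1.0001, -1.1, -2.0, -2.1]:
    print(candidate, solves_and(candidate))
```

A bias of exactly -1 fails because (0, 1) and (1, 0) then sit on the line and the `>= 0` activation labels them `True`.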

In [None]:
import pandas as pd

In [None]:
# set weight and bias
weight1 = 1.0
weight2 = 1.0
bias = -1.1

In [None]:
# take inputs (ideally for real cases, these would be read from some source later)
test_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
# the expected outputs are already known from the AND truth table
correct_outputs = [False, False, False, True]

In [None]:
outputs = []
for test_input, correct_output in zip(test_inputs, correct_outputs):
    linear_combination = weight1*test_input[0] + weight2*test_input[1] + bias
    output = int(linear_combination >= 0)
    is_correct_output = 'Yes' if output == correct_output else 'No'
    output_str = 'True' if output == 1 else 'False'
    outputs.append([test_input[0], test_input[1], linear_combination, output, output_str, correct_output, is_correct_output])

# print(outputs)

In [None]:
# Print the output
num_wrong = len([output[6] for output in outputs if output[6] == 'No'])
output_frame = pd.DataFrame(outputs, columns=['Input 1', 'Input 2', 'Linear Combination', 'Activation Output', 'Output Label', 'Expected Output', 'Is Correct'])
print(output_frame.to_string(index=False))

In [None]:
# Check how many are correct and proceed
if not num_wrong:
    print('Nice!  You got it all correct.\n')
else:
    print('You got {} wrong.  Keep trying!\n'.format(num_wrong))

## L5-10 OR and NOT perceptrons

### Find the weights and bias for OR perceptron
1. Plotting the points [(0,0), (0,1), (1,0), (1,1)] on a graph shows that they form a square.
2. The diagonal that does not pass through the origin has the equation x + y = 1.
3. The points on that line, [(0,1), (1,0)], are `True` outputs for the perceptron.
4. So the required line is x + y = 1 itself, or a parallel line shifted towards the origin.
5. With both weights set to 1, any bias b with -1 <= b < 0 works, for example -1.0, -0.9, or -0.0001.
6. If the bias reaches 0, the point (0, 0), which must be `False`, lands on the `True` side of the line.
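
To see where the usable biases start and stop, one option is to scan a range of values. This is only a sketch: the helper `solves_or` and the scan range of -1.5 to 0.5 are assumptions made for illustration, and both weights are held at 1.0.

```python
def solves_or(bias, weight1=1.0, weight2=1.0):
    """Return True if a step perceptron with these parameters reproduces OR."""
    truth_table = {(0, 0): False, (0, 1): True, (1, 0): True, (1, 1): True}
    return all(
        (weight1 * x1 + weight2 * x2 + bias >= 0) == expected
        for (x1, x2), expected in truth_table.items()
    )

# Scan biases from -1.5 to 0.5 in steps of 0.1; building each value from an
# integer number of tenths avoids accumulating floating-point drift.
valid_biases = [k / 10 for k in range(-15, 6) if solves_or(k / 10)]
print(valid_biases)
```

The scan keeps -1.0 but rejects 0.0, which reflects the `>= 0` activation: a point lying exactly on the line is labelled `True`.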

In [None]:
import pandas as pd

# set weights and bias
weight1 = 1.0
weight2 = 1.0
bias = -0.00001

# take inputs
test_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
correct_outputs = [False, True, True, True]

outputs = []
for test_input, correct_output in zip(test_inputs, correct_outputs):
    linear_combination = weight1*test_input[0] + weight2*test_input[1] + bias
    output = int(linear_combination >= 0)
    output_str = 'True' if output == 1 else 'False'
    is_correct_output = 'Yes' if output == correct_output else 'No'
    outputs.append([test_input[0], test_input[1], linear_combination, output, output_str, correct_output, is_correct_output])

num_wrong = len([output[6] for output in outputs if output[6] == 'No'])
if not num_wrong:
    print('Nice!  You got it all correct.\n')
else:
    print('You got {} wrong.  Keep trying!\n'.format(num_wrong))

output_frame = pd.DataFrame(outputs, columns=['Input 1', 'Input 2', 'Linear Combination', 'Activation Output', 'Output Label', 'Expected Output', 'Is Correct'])
print(output_frame.to_string(index=False))

### Find the weights and bias for NOT perceptron
1. NOT only cares about one input and ignores the rest. Here it negates the first input, so the second weight is 0.
2. Plotting the points on a graph shows that [(0,x), (1,x)] form two regions.
3. A dividing line between them is x = 0.5.
4. The points to the right of the line, [(1, x)], are `False` outputs for the perceptron.
5. So the required line will be x = 0.5 or a parallel line x = k where 0 < k < 1.
6. We can take any positive bias such as 1.0, 0.9, or 0.0001, together with a first weight slightly more negative than the negated bias.
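
The relationship between the weight and the bias can be checked with a small sketch. The helper `solves_not` is hypothetical and assumes, as in this notebook, that NOT acts on the first input while the second input is ignored through a zero weight.

```python
def solves_not(weight1, bias, weight2=0.0):
    """Return True if the step perceptron outputs NOT of the first input."""
    inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    return all(
        (weight1 * x1 + weight2 * x2 + bias >= 0) == (x1 == 0)
        for x1, x2 in inputs
    )

print(solves_not(-1.00001, 1.0))  # weight slightly more negative than -bias
print(solves_not(-1.0, 1.0))      # weight equal to -bias: input 1 lands on the line
```

The second call fails because the linear combination for an input of 1 is exactly 0, which the `>= 0` activation treats as `True`.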

In [None]:
import pandas as pd

# set the weights and bias
weight1 = -1.00001
weight2 = 0.0
bias = 1.0

# Take the inputs
test_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
correct_outputs = [True, True, False, False]

outputs = []
for test_input,correct_output in zip(test_inputs, correct_outputs):
    linear_combination = weight1*test_input[0] + weight2*test_input[1] + bias
    output = int(linear_combination >= 0)
    output_str = 'True' if output == 1 else 'False'
    # print(output, ':', correct_output, ':', output == correct_output)
    is_correct_output = 'Yes' if output == correct_output else 'No'
    outputs.append([test_input[0], test_input[1], linear_combination, output, output_str, correct_output, is_correct_output])

num_wrong = len([output[6] for output in outputs if output[6] == 'No'])
if not num_wrong:
    print('Nice!  You got it all correct.\n')
else:
    print('You got {} wrong.  Keep trying!\n'.format(num_wrong))
    
output_frame = pd.DataFrame(outputs, columns = ['Input 1', 'Input 2', 'Linear Combination', 'Activation Output', 'Output Label', 'Expected Output', 'Is Correct'])
print(output_frame.to_string(index=False))

## L5-11 XOR perceptron
1. XOR is not linearly separable, so a single perceptron cannot represent it. We need to chain multiple perceptrons.
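
A brute-force search illustrates the point, although it is not a proof. The sketch below tries every combination of two weights and a bias drawn from an arbitrary grid (-3.0 to 3.0 in steps of 0.5, an assumption made for this example) and counts how many reproduce a given truth table with a single step perceptron.

```python
from itertools import product

def step_outputs(weight1, weight2, bias):
    """Outputs of one step perceptron on the four binary input pairs."""
    inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
    return [weight1 * x1 + weight2 * x2 + bias >= 0 for x1, x2 in inputs]

# Candidate values -3.0, -2.5, ..., 3.0 for each weight and for the bias
grid = [k / 2 for k in range(-6, 7)]

def count_solutions(target):
    """Count grid points whose perceptron output matches the target table."""
    return sum(
        step_outputs(w1, w2, b) == target
        for w1, w2, b in product(grid, repeat=3)
    )

and_target = [False, False, False, True]
xor_target = [False, True, True, False]
print('AND solutions:', count_solutions(and_target))
print('XOR solutions:', count_solutions(xor_target))
```

The search finds parameter sets for AND but none for XOR. The underlying reason is algebraic: XOR needs b < 0, w1 + b >= 0 and w2 + b >= 0, which together force w1 + w2 + b > 0 and so contradict the requirement that (1, 1) be `False`.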

### Helper Methods
The logic gates above are wrapped in helper functions so they can be chained to build XOR.

In [None]:
def step_perceptron(test_input=(0.0, 0.0), weights=(0.0, 0.0), bias=0.0):
    # weighted sum of the two inputs plus the bias, passed through a step activation
    linear_combination = weights[0]*test_input[0] + weights[1]*test_input[1] + bias
    return linear_combination >= 0

def and_perceptron(test_input=(0.0, 0.0)):
    return step_perceptron(test_input, weights=(1.0, 1.0), bias=-1.001)

def or_perceptron(test_input=(0.0, 0.0)):
    return step_perceptron(test_input, weights=(1.0, 1.0), bias=-0.001)

def not_perceptron(test_input=(0.0, 0.0)):
    # negates the first input; the second input is ignored
    return step_perceptron(test_input, weights=(-1.001, 0.0), bias=1.0)

In [None]:
import pandas as pd

# take inputs
test_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
correct_outputs = [False, True, True, False]

outputs = []
for test_input, correct_output in zip(test_inputs, correct_outputs):
    inp_1 = int(and_perceptron([test_input[0], int(not_perceptron([test_input[1], 0]))]))
    inp_2 = int(and_perceptron([test_input[1], int(not_perceptron([test_input[0], 0]))]))
    # print(test_input[0], test_input[1], inp_1, inp_2, correct_output)
    output = or_perceptron([inp_1, inp_2])
    output_str = 'True' if output == 1 else 'False'
    is_correct_output = 'Yes' if output == correct_output else 'No'
    outputs.append([test_input[0], test_input[1], output, output_str, correct_output, is_correct_output])
    
num_wrong = len([output[5] for output in outputs if output[5] == 'No'])
if not num_wrong:
    print('Nice!  You got it all correct.\n')
else:
    print('You got {} wrong.  Keep trying!\n'.format(num_wrong))
    
output_frame = pd.DataFrame(outputs, columns = ['Input 1', 'Input 2', 'Activation Output', 'Output Label', 'Expected Output', 'Is Correct'])
print(output_frame.to_string(index=False))

## L5-12 Sigmoid Neural Network

In [None]:
import numpy as np

def sigmoid(x):
    return 1/(1 + np.exp(-x))

inputs = np.array([0.7, -0.3])
weights = np.array([0.1, 0.8])
bias = -0.1

# linear_combination = weights[0]*inputs[0] + weights[1]*inputs[1] + bias
linear_combination = np.dot(inputs, weights) + bias
output = sigmoid(linear_combination)

print('Output: ')
print(output)
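
For reference, the arithmetic performed by the cell above can be worked by hand. The symbols h (linear combination) and ŷ (output) are introduced here for illustration, and the final value is rounded to four decimal places.

```latex
\begin{aligned}
h &= w_1 x_1 + w_2 x_2 + b = (0.1)(0.7) + (0.8)(-0.3) + (-0.1) = -0.27 \\
\hat{y} &= \sigma(h) = \frac{1}{1 + e^{-h}} = \frac{1}{1 + e^{0.27}} \approx 0.4329
\end{aligned}
```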

## L5-15 Gradient Descent

In [None]:
import numpy as np

# define the sigmoid function
def sigmoid(x):
    return 1/(1 + np.exp(-x))

# define the derivative of sigmoid function
def sigmoid_prime(x):
    return sigmoid(x)*(1 - sigmoid(x))

# Input data
x = np.array([0.1, 0.3])

# Target
y = 0.2

# Input weights
w = np.array([-0.8, 0.5])

# Learning rate
learn_rate = 0.5

# NN output i.e., prediction
nn_output = sigmoid(np.dot(x, w))

# derive the error
error = y - nn_output

# error term (lowercase delta)
error_term = error*sigmoid_prime(np.dot(x, w))

# gradient descent step
del_w = learn_rate*error_term*x

print('Neural Network output:')
print(nn_output)
print('Amount of Error:')
print(error)
print('Change in Weights:')
print(del_w)
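
One way to sanity-check the weight step is to compare it with a numerical derivative. The sketch below assumes the error being minimized is E = 0.5 * (y - output)^2 for the single record, and uses plain Python lists instead of NumPy so that it stands alone. The helper `squared_error` and the step size `eps` are illustrative choices.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def squared_error(w, x, y):
    """E = 0.5 * (y - sigmoid(w . x))^2 for a single record."""
    h = sum(wi * xi for wi, xi in zip(w, x))
    return 0.5 * (y - sigmoid(h)) ** 2

x, y, w, learn_rate = [0.1, 0.3], 0.2, [-0.8, 0.5], 0.5

# Analytic step, same formula as the cell above
h = sum(wi * xi for wi, xi in zip(w, x))
output = sigmoid(h)
error_term = (y - output) * output * (1 - output)
del_w = [learn_rate * error_term * xi for xi in x]

# Numerical step: -learn_rate * dE/dw_i estimated with central differences
eps = 1e-6
numeric_del_w = []
for i in range(len(w)):
    w_plus, w_minus = list(w), list(w)
    w_plus[i] += eps
    w_minus[i] -= eps
    grad_i = (squared_error(w_plus, x, y) - squared_error(w_minus, x, y)) / (2 * eps)
    numeric_del_w.append(-learn_rate * grad_i)

print(del_w)
print(numeric_del_w)
```

The two lists should agree to many decimal places, which indicates that the update moves the weights down the gradient of the squared error.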

## L5-16 Implement Gradient Descent
1. First read the data and standardize it (data cleanup).
2. Initialize random weights and choose a small learning rate.
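
Standardizing a column means subtracting its mean and dividing by its standard deviation. The sketch below shows the idea with the standard library only. The scores are made up for illustration, and `statistics.stdev` is the sample standard deviation, which matches the default of the pandas `.std()` call used in the cell that follows.

```python
from statistics import mean, stdev

# Made-up GRE-like scores, for illustration only
scores = [380.0, 660.0, 800.0, 640.0, 520.0]

mu, sigma = mean(scores), stdev(scores)  # sample standard deviation
standardized = [(s - mu) / sigma for s in scores]

print(standardized)
print(mean(standardized), stdev(standardized))
```

After the transformation the column has a mean of approximately 0 and a standard deviation of approximately 1, which keeps the inputs to the sigmoid in a moderate range.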

In [1]:
import numpy as np
import pandas as pd

verbose = True
max_row_disp = 5
training_percent = 0.9
random_seed = 42

def read_csv():
    admissions = pd.read_csv('assets/binaryL5-16.csv')
#     print(admissions.to_string(index=False, max_rows=max_row_disp)) if verbose else print('')
    return admissions

def modify_data():
    # make dummy variables for rank
    # dtype=float keeps the dummy columns numeric regardless of the pandas version
    data = pd.concat([admissions, pd.get_dummies(admissions['rank'], prefix='rank', dtype=float)], axis=1)
    data = data.drop('rank', axis=1)
#     print(data.to_string(index=False, max_rows=max_row_disp)) if verbose else print('')
    
    # Standardize the data for gre and gpa
    for field in ['gre', 'gpa']:
        mean, std = data[field].mean(), data[field].std()
        data.loc[:,field] = (data[field]-mean)/std
    
#     print(data.to_string(index=False, max_rows=max_row_disp)) if verbose else print('')
    
    # Split off random x% of data for testing set.
    np.random.seed(random_seed)
    sample = np.random.choice(data.index, size=int(len(data)*training_percent), replace=False)
    # .ix was removed from pandas; .loc selects the sampled rows by index label
    data, test_data = data.loc[sample], data.drop(sample)
    
#     print("Training Data") if verbose else print('')
#     print(data.to_string(index=False, max_rows=max_row_disp)) if verbose else print('')
#     print("Testing Data") if verbose else print('')
#     print(test_data.to_string(index=False, max_rows=max_row_disp)) if verbose else print('')
    
    # split into features and targets
    features, targets = data.drop('admit', axis=1), data['admit']
    features_test, targets_test = test_data.drop('admit', axis=1), test_data['admit']
    return features, targets, features_test, targets_test

    
admissions = read_csv()
features, targets, features_test, targets_test = modify_data()

# print(features.to_string(index=False, max_rows=max_row_disp)) if verbose else print('')
# print(targets.to_string(index=False, max_rows=max_row_disp)) if verbose else print('')
# print(features_test.to_string(index=False, max_rows=max_row_disp)) if verbose else print('')
# print(targets_test.to_string(index=False, max_rows=max_row_disp)) if verbose else print('')

def sigmoid(x):
    return 1/(1 + np.exp(-x))

# use same random seed to make debugging easier
np.random.seed(random_seed)

n_records, n_features = features.shape
last_loss = None

print('records: ', n_records, ', features: ', n_features) if verbose else print('')

# Initialize random weights scaled by 1/sqrt(number of features) to keep the input to the sigmoid small
weights = np.random.normal(scale=1/n_features**.5, size=n_features)
print('initial scaled weights: ', weights) if verbose else print('')

# Neural Network hyper parameters
epochs = 1000
learn_rate = 0.5

for e in range(epochs):
    del_w = np.zeros(weights.shape)
    for x,y in zip(features.values, targets):
        output = sigmoid(np.dot(x, weights))
        error = y - output
        error_term = error*output*(1-output)
        del_w += learn_rate*error_term*x
    weights += del_w
    
    # Printing out the mean square error on the training set
    if e % (epochs / 10) == 0:
        out = sigmoid(np.dot(features, weights))
        loss = np.mean((out - targets) ** 2)
        if last_loss and last_loss < loss:
            print("Train loss: ", loss, "  WARNING - Loss Increasing")
        else:
            print("Train loss: ", loss)
        last_loss = loss

# Calculate accuracy on test data
test_out = sigmoid(np.dot(features_test, weights))
predictions = test_out > 0.5
accuracy = np.mean(predictions == targets_test)
print("Prediction accuracy: {:.3f}".format(accuracy))

records:  360 , features:  6
initial scaled weights:  [ 0.2027827  -0.05644616  0.26441774  0.62177434 -0.09559271 -0.09558601]
Train loss:  0.286196010415
Train loss:  0.257761346594
Train loss:  0.257722034703
Train loss:  0.257722752309
Train loss:  0.257722752309
Prediction accuracy: 0.725
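
The training loop above adds the summed per-record updates directly to the weights. A common variant divides the accumulated update by the number of records, so the step size does not grow with the dataset. The sketch below shows that variant on a tiny made-up dataset, using only the standard library. The data, the helper names, and the epoch count are assumptions for illustration, not the admissions data or the notebook's results.

```python
import math
import random

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def predict(x, weights):
    return sigmoid(sum(xi * wi for xi, wi in zip(x, weights)))

def mse(features, targets, weights):
    errors = [(predict(x, weights) - y) ** 2 for x, y in zip(features, targets)]
    return sum(errors) / len(errors)

def train(features, targets, epochs, learn_rate=0.5, seed=42):
    rng = random.Random(seed)
    n_records, n_features = len(features), len(features[0])
    # Same initialization scale as above: standard deviation 1/sqrt(n_features)
    weights = [rng.gauss(0, n_features ** -0.5) for _ in range(n_features)]
    for _ in range(epochs):
        del_w = [0.0] * n_features
        for x, y in zip(features, targets):
            output = predict(x, weights)
            error_term = (y - output) * output * (1 - output)
            for i, xi in enumerate(x):
                del_w[i] += error_term * xi
        # Average the accumulated update over the records before applying it
        weights = [w + learn_rate * d / n_records for w, d in zip(weights, del_w)]
    return weights

# Made-up standardized features and binary targets, for illustration only
features = [(-1.0, 0.5), (0.5, -1.0), (1.0, 1.0), (-0.5, -0.5)]
targets = [0, 0, 1, 0]

loss_before = mse(features, targets, train(features, targets, epochs=0))
loss_after = mse(features, targets, train(features, targets, epochs=200))
print('loss before:', loss_before)
print('loss after: ', loss_after)
```

Because the same seed is used for both calls, the two losses start from identical weights, so the difference comes only from the training steps.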
