# Neural Network - Ionosphere Dataset

#### Title of the data set: Ionosphere Data Set

#### Number of instances: 351

#### Number of attributes: 35 (34 predictive, 1 class label)

#### Dataset Information:
This radar data was collected by a system in Goose Bay, Labrador. This system consists of a phased array of 16 high-frequency antennas with a total transmitted power on the order of 6.4 kilowatts. See the paper for more details. The targets were free electrons in the ionosphere. "Good" radar returns are those showing evidence of some type of structure in the ionosphere. "Bad" returns are those that do not; their signals pass through the ionosphere.

Received signals were processed using an autocorrelation function whose arguments are the time of a pulse and the pulse number. There were 17 pulse numbers for the Goose Bay system. Instances in this database are described by 2 attributes per pulse number (17 × 2 = 34 attributes), corresponding to the complex values returned by the function resulting from the complex electromagnetic signal.

#### Attribute Information:
-- All 34 are continuous
-- The 35th attribute is either "good" or "bad" according to the definition summarized above. This is a binary classification task.
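
As a rough illustration of the "2 attributes per pulse number" layout, the sketch below groups a 34-value row into 17 complex numbers. It assumes that the two attributes of each pulse number sit in adjacent columns, which the description above does not state, and it uses placeholder values rather than real radar data.

```python
import numpy as np

# Placeholder row with 34 attribute values (not real radar data)
row = np.linspace(-1.0, 1.0, 34)

# Assumption: columns (0, 1) belong to pulse 0, (2, 3) to pulse 1, and so on
pairs = row.reshape(17, 2)

# Treat each pair as the real and imaginary part of one complex value
complex_values = pairs[:, 0] + 1j * pairs[:, 1]

print(complex_values.shape)  # one complex value per pulse number
```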
  


Import the libraries

In [50]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt 
from sklearn.preprocessing import OneHotEncoder
from sklearn.model_selection import train_test_split
pd.set_option('display.max_rows', 200)

Import data from the data file

In [51]:
df = pd.read_csv(r'C:\Stuff\KU Study\EECS 738 Machine Learning\Projects\EECS738-Project-3---Says-One-Neuron-To-Another-\data\ionosphere.data', header=None)

In [52]:
df.sample(5)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,25,26,27,28,29,30,31,32,33,34
194,1,0,1.0,-0.09524,-1.0,-1.0,-1.0,-1.0,1.0,0.31746,...,0.36508,1.0,1.0,1.0,0.50794,-1.0,-0.3254,-1.0,0.72831,b
28,1,0,1.0,0.0838,1.0,0.17387,1.0,-0.13308,0.98172,0.6452,...,1.0,0.83899,1.0,0.74822,1.0,0.64358,1.0,0.52479,1.0,g
79,0,0,1.0,1.0,1.0,-1.0,1.0,1.0,-1.0,1.0,...,-1.0,1.0,1.0,1.0,1.0,1.0,-1.0,-1.0,1.0,b
19,0,0,1.0,-1.0,0.0,0.0,0.0,0.0,1.0,1.0,...,1.0,1.0,1.0,1.0,-1.0,1.0,1.0,1.0,1.0,b
126,1,0,0.5984,0.40332,0.82809,0.80521,0.76001,0.70709,0.8401,-0.10984,...,-0.30063,1.0,0.17076,0.62958,0.42677,0.87757,0.81007,0.81979,0.68822,b


Store the attribute values (columns 0 to 33) in X.

In [73]:
X= df.drop(df.columns[34], axis=1)
X

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,24,25,26,27,28,29,30,31,32,33
0,1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1.00000,0.03760,...,0.56811,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300
1,1,0,1.00000,-0.18829,0.93035,-0.36156,-0.10868,-0.93597,1.00000,-0.04549,...,-0.20332,-0.26569,-0.20468,-0.18401,-0.19040,-0.11593,-0.16626,-0.06288,-0.13738,-0.02447
2,1,0,1.00000,-0.03365,1.00000,0.00485,1.00000,-0.12062,0.88965,0.01198,...,0.57528,-0.40220,0.58984,-0.22145,0.43100,-0.17365,0.60436,-0.24180,0.56045,-0.38238
3,1,0,1.00000,-0.45161,1.00000,1.00000,0.71216,-1.00000,0.00000,0.00000,...,1.00000,0.90695,0.51613,1.00000,1.00000,-0.20099,0.25682,1.00000,-0.32382,1.00000
4,1,0,1.00000,-0.02401,0.94140,0.06531,0.92106,-0.23255,0.77152,-0.16399,...,0.03286,-0.65158,0.13290,-0.53206,0.02431,-0.62197,-0.05707,-0.59573,-0.04608,-0.65697
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
346,1,0,0.83508,0.08298,0.73739,-0.14706,0.84349,-0.05567,0.90441,-0.04622,...,0.95378,-0.04202,0.83479,0.00123,1.00000,0.12815,0.86660,-0.10714,0.90546,-0.04307
347,1,0,0.95113,0.00419,0.95183,-0.02723,0.93438,-0.01920,0.94590,0.01606,...,0.94520,0.01361,0.93522,0.04925,0.93159,0.08168,0.94066,-0.00035,0.91483,0.04712
348,1,0,0.94701,-0.00034,0.93207,-0.03227,0.95177,-0.03431,0.95584,0.02446,...,0.93988,0.03193,0.92489,0.02542,0.92120,0.02242,0.92459,0.00442,0.92697,-0.00577
349,1,0,0.90608,-0.01657,0.98122,-0.01989,0.95691,-0.03646,0.85746,0.00110,...,0.91050,-0.02099,0.89147,-0.07760,0.82983,-0.17238,0.96022,-0.03757,0.87403,-0.16243


Store the class labels (column 34) in Y. "g" represents good and "b" represents bad.

In [74]:
Y=df[34]
#Y=Y.astype(float)
Y

0      g
1      b
2      g
3      b
4      g
      ..
346    g
347    g
348    g
349    g
350    g
Name: 34, Length: 351, dtype: object

Use one-hot encoding to convert the class labels into numerical values.

In [75]:
# Note: scikit-learn 1.2 renamed this argument to sparse_output; sparse was removed in 1.4
one_hot_encoder = OneHotEncoder(sparse=False)

Y = one_hot_encoder.fit_transform(np.array(Y).reshape(-1, 1))

array([[0., 1.],
       [1., 0.],
       [0., 1.],
       ...,
       [0., 1.],
       [0., 1.],
       [0., 1.]])
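
To make the encoding concrete, here is a minimal NumPy-only sketch of the same idea. It is not the OneHotEncoder implementation; it only mirrors the mapping seen in the output, where the classes are ordered alphabetically so that "b" becomes [1, 0] and "g" becomes [0, 1]. The five labels are the first five values of Y shown earlier.

```python
import numpy as np

labels = np.array(["g", "b", "g", "b", "g"])  # first five class labels

# np.unique returns the sorted classes and, for each label, its index in that list
classes, codes = np.unique(labels, return_inverse=True)

# Row i of the identity matrix is the one-hot vector for class index i
one_hot = np.eye(len(classes))[codes]

print(classes)
print(one_hot)
```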

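Convert the DataFrame to a NumPy array. Note that astype(int) truncates every continuous attribute toward zero, so most values become 0, as the output shows; astype(float) would keep the original values.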

In [77]:
X=X.astype(int)
X = X.to_numpy()
X

array([[1, 0, 0, ..., 0, 0, 0],
       [1, 0, 1, ..., 0, 0, 0],
       [1, 0, 1, ..., 0, 0, 0],
       ...,
       [1, 0, 0, ..., 0, 0, 0],
       [1, 0, 0, ..., 0, 0, 0],
       [1, 0, 0, ..., 0, 0, 0]])
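
The effect of the integer cast can be checked on a handful of values. The sketch below uses a few attribute values of the kind shown in the table of X (the array name `row` is only for illustration) and compares an integer cast with a float cast.

```python
import numpy as np

# A few continuous attribute values in the range [-1, 1]
row = np.array([0.99539, -0.05889, 0.85243, -0.45161, 1.0, -1.0])

as_int = row.astype(int)      # truncates toward zero
as_float = row.astype(float)  # keeps the values unchanged

print(as_int)
print(as_float)
```

Only values that are exactly 1 or -1 survive the integer cast; everything else collapses to 0.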

Split the data set into train/validation/test

In [78]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.15)
X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train, test_size=0.1)
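
The sizes of the three subsets follow from the two fractions. The sketch below works them out by hand, assuming the rounding rule that train_test_split applies to a float test_size (the test count is rounded up and the remainder goes to the training side). The helper name `split_sizes` is hypothetical.

```python
import math

def split_sizes(n_samples, test_fraction):
    """Return (n_remaining, n_held_out) for a fractional hold-out split."""
    n_held_out = math.ceil(test_fraction * n_samples)  # hold-out count is rounded up
    return n_samples - n_held_out, n_held_out

# First split: 15% of the 351 instances go to the test set
n_rest, n_test = split_sizes(351, 0.15)

# Second split: 10% of the remaining instances go to the validation set
n_train, n_val = split_sizes(n_rest, 0.1)

print(n_train, n_val, n_test)
```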

## Neural network implementation
### The neural network consists of:
        - An input layer, X
        - An arbitrary number of hidden layers
        - An output layer, Y
        - A set of weights (W) and biases (B) between each layer
        - A choice of activation function for each hidden layer, σ
        


Weight initialization defines the initial values of the network's parameters before training. The weights are initialized from nodes, a list of integers in which each entry is the number of nodes in one layer, so the length of the list is the number of layers. Every weight is drawn uniformly at random between -1 and 1. Each weight matrix has one extra column for the bias: the bias input is the constant 1, and its weight is initialized randomly like the others. The function returns a list with one weight matrix per pair of adjacent layers.

In [79]:
def Initialize_Weight(nodes):
    layers, weights = len(nodes), []
    
    for i in range(1, layers):
        wt = [[np.random.uniform(-1, 1) for k in range(nodes[i-1] + 1)] for j in range(nodes[i])]
        weights.append(np.matrix(wt))
    
    return weights
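
To see which shapes come out of this scheme, here is a vectorized sketch of the same idea. It is an alternative formulation rather than the function above: it uses plain arrays instead of np.matrix, and the name `init_weights_vectorized` and the seeded generator are assumptions made for the example.

```python
import numpy as np

def init_weights_vectorized(nodes, rng):
    """One (nodes[i], nodes[i-1] + 1) array per pair of adjacent layers."""
    return [
        rng.uniform(-1, 1, size=(nodes[i], nodes[i - 1] + 1))  # +1 column for the bias
        for i in range(1, len(nodes))
    ]

rng = np.random.default_rng(0)  # seeded only to make the example repeatable
weights = init_weights_vectorized([34, 5, 10, 2], rng)

shapes = [w.shape for w in weights]
print(shapes)
```

For 34 features, hidden layers of 5 and 10 nodes, and 2 classes, this gives three matrices, each with one more column than the layer feeding it has nodes.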

The Neural_Network function trains the network for a given number of iterations. The input parameters are:
        - X_train, Y_train: training data and target values
        - X_val, Y_val: validation data and target values (optional)
        - iters: number of iterations, default=10
        - nodes: list of integers giving the number of nodes in each layer
        - rate: learning rate of the back-propagation algorithm, default=0.15

In [80]:
def Neural_Network(X_train, Y_train, X_val=None, Y_val=None, iters=10, nodes=[], rate=0.15):
    weights = Initialize_Weight(nodes)

    for iteration in range(1, iters+1):
        weights = Train_Network(X_train, Y_train, rate, weights)

        # Print the training and validation accuracy after every 10 iterations
        if iteration % 10 == 0:
            print("Iteration {}".format(iteration))
            print("Training Accuracy:{}".format(accuracy(X_train, Y_train, weights)))
            # Report validation accuracy only when validation data was supplied
            if X_val is not None and Y_val is not None:
                print("Validation Accuracy:{}".format(accuracy(X_val, Y_val, weights)))

    return weights

Forward_Propagation function: Each layer receives an input and calculates its output by passing the dot product of the input and the layer's weights to the Sigmoid function. The output of the current layer, with the bias input 1 prepended, is the input for the next layer. The output of the last layer is the prediction.

In [81]:
def Forward_Propagation(x, weights, layers):
    output, current_input = [x], x

    for j in range(layers):
        activation = Sigmoid(np.dot(current_input, weights[j].T))
        output.append(activation)
        current_input = np.append(1, activation)
    
    return output
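
The forward pass can also be traced on a tiny network with plain arrays. The sketch below is a simplified restatement, not the function above: the names `sigmoid` and `forward` are local to the example, and the weights are set to zero purely so that every activation is predictable (sigmoid(0) = 0.5).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights):
    """Return the input followed by the activation of every layer."""
    outputs = [x]
    current = np.append(1.0, x)               # prepend the bias input
    for w in weights:
        activation = sigmoid(w @ current)     # weighted sum, then sigmoid
        outputs.append(activation)
        current = np.append(1.0, activation)  # bias input for the next layer
    return outputs

x = np.array([0.2, -0.4, 0.6, 1.0])                # 4 features
weights = [np.zeros((3, 5)), np.zeros((2, 4))]     # 4 -> 3 -> 2 nodes, zero weights
outputs = forward(x, weights)

print(outputs[1])   # hidden layer activations
print(outputs[-1])  # network prediction
```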

Backward_Propagation function: It propagates the error backwards and adjusts the weights. For each layer, starting from the last, delta is the layer's error multiplied element-wise by the sigmoid derivative of that layer's output. The layer's weights are then updated by adding rate times the product of delta and the previous layer's output (with the bias input prepended). The error for the previous layer is obtained by removing the bias column from the layer's weights and multiplying the result by delta.

In [82]:
def Backward_Propagation(y, output, weights, layers, rate):
    outputFinal = output[-1]
    error = np.matrix(y - outputFinal) # Calculate the error at the last output
    
    # Back propagate the error
    for j in range(layers, 0, -1):
        currOutput = output[j]
        
        if(j > 1):
            # Prepend the bias input (1) to the previous layer's output
            prevOutput = np.append(1, output[j-1])
        else:
            # The network input already contains the bias input
            prevOutput = output[0]
        
        delta = np.multiply(error, sigmoidDerivative(currOutput))
        weights[j-1] += rate * np.multiply(delta.T, prevOutput)

        wt = np.delete(weights[j-1], [0], axis=1) # Remove the bias column from the weights
        error = np.dot(delta, wt) # Calculate the error for the previous layer
    
    return weights

Train_Network: It performs forward and backward propagation for every sample in X_train and Y_train, using the given rate and weights, and returns the updated weights. The learning rate is passed on to Backward_Propagation explicitly instead of being read from a global variable.

In [83]:
def Train_Network(X, Y, rate, weights):
    layers = len(weights)
    for i in range(len(X)):
        x = X[i]
        y = Y[i]
        x = np.matrix(np.append(1, x)) # Prepend the bias input (1) to the feature vector
        
        output = Forward_Propagation(x, weights, layers)
        weights = Backward_Propagation(y, output, weights, layers, rate)

    return weights
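
The weight update used in back-propagation can be followed by hand on a single sigmoid layer. The sketch below is a worked example under simplifying assumptions (one sample, one layer, weights starting at zero, plain arrays); all names are local to the example. It applies one update with rate 0.15 and compares the squared error before and after.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, 0.5, -0.5])  # bias input 1 followed by two feature values
y = np.array([0.0, 1.0])        # one-hot target
W = np.zeros((2, 3))            # 2 output nodes, 3 inputs (including the bias)
rate = 0.15

out_before = sigmoid(W @ x)                     # [0.5, 0.5]
error = y - out_before                          # [-0.5, 0.5]
delta = error * out_before * (1 - out_before)   # error times sigmoid derivative
W = W + rate * np.outer(delta, x)               # one weight update

out_after = sigmoid(W @ x)
loss_before = np.sum((y - out_before) ** 2)
loss_after = np.sum((y - out_after) ** 2)

print(delta)
print(loss_before, loss_after)
```

After the update, the output for the target class moves up and the other moves down, so the squared error for this sample shrinks.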

Activation Function: Sigmoid
This function maps the weighted sum of a node's inputs to an output between 0 and 1, which is then used as an input to the nodes of the next layer. sigmoidDerivative expects the sigmoid output rather than the raw input, because the derivative of the sigmoid can be written as s(1 - s), where s is the sigmoid output.

In [84]:
def Sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoidDerivative(x):
    return np.multiply(x, 1-x)
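
That the derivative can be computed from the sigmoid output alone can be checked numerically. The sketch below compares s(1 - s) with a central finite difference of the sigmoid; the function names and the step size `h` are choices made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative_from_output(s):
    # s is assumed to be a sigmoid output, not the raw input z
    return s * (1 - s)

z = np.array([-2.0, 0.0, 1.5])
h = 1e-6

# Central finite-difference estimate of d(sigmoid)/dz
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)

# Closed form, evaluated on the sigmoid output
analytic = sigmoid_derivative_from_output(sigmoid(z))

print(numeric)
print(analytic)
```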

Predict function: The input is passed through the network with Forward_Propagation, and the result is an array with one value per class. The class with the maximum value is the predicted class, and it is returned as a one-hot list.


In [85]:
def Predict(item, weights):
    layers = len(weights)
    item = np.append(1, item)
    
    #forward propagation
    output = Forward_Propagation(item, weights, layers)
    
    outputFinal = output[-1].A1
    index = findMax(outputFinal)

    y = [0 for i in range(len(outputFinal))]
    y[index] = 1

    return y

findMax(): It finds the maximum value in the array and returns its index; Predict then sets the entry at that index to 1.

In [86]:
def findMax(output):
    m, index = output[0], 0
    for i in range(1, len(output)):
        if(output[i] > m):
            m, index = output[i], i
    
    return index
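
The same index search is available in NumPy as np.argmax, which also returns the first index when several entries share the maximum. The sketch below shows a possible shorter variant of the "find the maximum, then build a one-hot list" step; the helper name `one_hot_of_max` is hypothetical.

```python
import numpy as np

def one_hot_of_max(output):
    """Return a list with 1 at the position of the largest value and 0 elsewhere."""
    index = int(np.argmax(output))  # first index of the maximum
    y = [0] * len(output)
    y[index] = 1
    return y

print(one_hot_of_max(np.array([0.2, 0.7])))
print(one_hot_of_max(np.array([0.5, 0.5])))  # tie: the first index wins
```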

accuracy function: It calculates the fraction of samples whose predicted class matches the actual class.

In [87]:
def accuracy(X, Y, weights):
    correct_classification = 0

    for i in range(len(X)):
        x, y = X[i], list(Y[i])
        prediction = Predict(x, weights)

        if(y == prediction):
            correct_classification += 1

    return correct_classification / len(X)
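
When the network outputs for all samples are already collected in one array, the same fraction can be computed without a Python loop. The sketch below assumes such an array of scores (made-up values here) and one-hot targets; the name `accuracy_from_scores` is hypothetical.

```python
import numpy as np

def accuracy_from_scores(scores, targets):
    """Fraction of rows whose largest score is at the position of the target's 1."""
    predicted = np.argmax(scores, axis=1)
    actual = np.argmax(targets, axis=1)
    return float(np.mean(predicted == actual))

# Made-up network outputs for four samples and their one-hot targets
scores = np.array([[0.2, 0.8], [0.9, 0.1], [0.4, 0.6], [0.3, 0.7]])
targets = np.array([[0, 1], [1, 0], [1, 0], [0, 1]])

result = accuracy_from_scores(scores, targets)
print(result)  # three of the four predictions match
```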

Passing the features, classes, layers, rates and iterations to the Neural_Network.

In [88]:

features = len(X[0])
classes = len(Y[0]) 
layers = [features, 5, 10, classes]
rate, iterations = 0.15, 100 #We are training the Neural_Network up to 100 iterations for this dataset as the accuracy converges after that

weights = Neural_Network(X_train, Y_train, X_val, Y_val, iters=iterations, nodes=layers, rate=rate)

Iteration 10
Training Accuracy:0.8395522388059702
Validation Accuracy:0.8666666666666667
Iteration 20
Training Accuracy:0.8843283582089553
Validation Accuracy:0.8666666666666667
Iteration 30
Training Accuracy:0.9104477611940298
Validation Accuracy:0.8666666666666667
Iteration 40
Training Accuracy:0.9216417910447762
Validation Accuracy:0.8333333333333334
Iteration 50
Training Accuracy:0.9216417910447762
Validation Accuracy:0.8666666666666667
Iteration 60
Training Accuracy:0.9253731343283582
Validation Accuracy:0.8666666666666667
Iteration 70
Training Accuracy:0.9328358208955224
Validation Accuracy:0.9
Iteration 80
Training Accuracy:0.9328358208955224
Validation Accuracy:0.9
Iteration 90
Training Accuracy:0.9328358208955224
Validation Accuracy:0.9
Iteration 100
Training Accuracy:0.9328358208955224
Validation Accuracy:0.9
Iteration 110
Training Accuracy:0.9365671641791045
Validation Accuracy:0.9
Iteration 120
Training Accuracy:0.9365671641791045
Validation Accuracy:0.9
Iteration 130
...

Calculate the accuracy on the test set after all the iterations.

In [89]:
print("Testing Accuracy: {}".format(accuracy(X_test, Y_test, weights)))

Testing Accuracy: 0.8867924528301887


Let's predict one sample from the test set and compare the predicted class with the actual class.

In [90]:
print("Actual: ",X_test[0], list(Y_test[0]))
print("Predicted: ",Predict(X_test[0], weights))

Actual:  [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0] [0.0, 1.0]
Predicted:  [0, 1]
