# Neural Network Training

This project involves training two networks. The first tackles a classification task and the second a regression task.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import altair as alt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from tensorflow import keras

## Binary Classification Network



### Questions
**1. Run your training and evaluation code 3 times and record the number of training epochs needed, the final training accuracy and the final testing accuracy. Remember that with early stopping, your final training accuracy is at the point of lowest loss (likely 3 epochs back from when the training actually stopped).**

| Number of epochs | Final training accuracy | Final testing accuracy |
|---|---|---|
| 29 | 0.9863 | 0.9855 |
| 19 | 0.9945 | 0.9964 |
| 22 | 0.9818 | 0.9709 |
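
The note in question 1 about the best epoch lying a few epochs before the stopping point can be illustrated with a small simulation. This is a minimal sketch of patience-based early stopping, not the Keras implementation itself; the function name `simulate_early_stopping` and the loss values are hypothetical and chosen only for illustration.

```python
def simulate_early_stopping(losses, patience=3):
    """Return (stopped_epoch, best_epoch), both 1-based, for a loss sequence.

    Simplified model: training stops once the monitored loss has failed
    to improve on its best value for `patience` epochs in a row.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best_loss:
            # New best value: remember it and reset the patience counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs without triggering the stop condition
    return len(losses), best_epoch


# Hypothetical per-epoch training losses (not taken from the runs recorded here)
example_losses = [0.60, 0.41, 0.30, 0.22, 0.25, 0.24, 0.23, 0.21]
stopped, best = simulate_early_stopping(example_losses, patience=3)
print(f"stopped after epoch {stopped}, lowest loss at epoch {best}")
```

With `patience=3` the lowest loss sits three epochs before the epoch at which training halts, which is why the accuracy to record is the one reported three epochs back.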

**2. Are the results reasonably consistent?**
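Yes. Across the three runs, training accuracy stayed between 0.9818 and 0.9945 and testing accuracy between 0.9709 and 0.9964, although the number of epochs varied from 19 to 29.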

**3. Are you seeing significant differences in accuracy between training and testing. What might this mean?**
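No, there are no significant differences in accuracy between training and testing. The largest gap across the three runs is about 0.01, which suggests the network is not overfitting the training data.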

**4. Increase the learning rate to 0.5 and re-record the results from 3 runs (as in question 1).**

| Number of epochs | Final training accuracy | Final testing accuracy |
|---|---|---|
| 7 | 0.9462 | 0.9091 |
| 19 | 0.9973 | 0.9891 |
| 13 | 0.9727 | 0.9818 |

**5. Are these results appreciably different from the earlier results on the first network?** Yes. Fewer epochs were needed on average with this learning rate (7 to 19, compared with 19 to 29 before). The gap between training and testing accuracy is also larger (up to about 0.04, compared with about 0.01 before), and the results vary more from run to run.
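
The comparison between the two sets of runs can be made concrete by summarising the recorded numbers. The sketch below uses only the values recorded in the answers to questions 1 and 4; the helper name `summarise` and the dictionary keys are illustrative choices, not part of the assignment.

```python
from statistics import mean, stdev

# (epochs, training accuracy, testing accuracy) as recorded for questions 1 and 4
first_runs = [(29, 0.9863, 0.9855), (19, 0.9945, 0.9964), (22, 0.9818, 0.9709)]
lr_05_runs = [(7, 0.9462, 0.9091), (19, 0.9973, 0.9891), (13, 0.9727, 0.9818)]


def summarise(runs):
    """Summarise a list of (epochs, train_acc, test_acc) tuples."""
    epochs, train_acc, test_acc = zip(*runs)
    gaps = [abs(tr - te) for tr, te in zip(train_acc, test_acc)]
    return {
        "mean_epochs": mean(epochs),
        "mean_test_acc": mean(test_acc),
        "test_acc_std": stdev(test_acc),
        "largest_gap": max(gaps),
    }


for label, runs in [("first runs", first_runs), ("learning rate 0.5", lr_05_runs)]:
    stats = summarise(runs)
    print(label, {name: round(value, 4) for name, value in stats.items()})
```

Three runs per setting is a very small sample, so the standard deviations here are only a rough indication of run-to-run spread.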

In [None]:
# Read in our classification dataset
df = pd.read_csv('/content/drive/MyDrive/HDS/Datasets/BankNoteAuthentication.csv')
df.head()

Unnamed: 0,variance,skewness,curtosis,entropy,class
0,3.6216,8.6661,-2.8073,-0.44699,0
1,4.5459,8.1674,-2.4586,-1.4621,0
2,3.866,-2.6383,1.9242,0.10645,0
3,3.4566,9.5228,-4.0112,-3.5944,0
4,0.32924,-4.4552,4.5718,-0.9888,0


In [None]:
# Pull out features and targets
features = df.iloc[:, :-1].values
targets = df.iloc[:,-1].values

# Split off a test set
train_features, test_features, train_targets, test_targets = train_test_split(features, targets, test_size=0.2, random_state=42)

# Scale the features to the range 0-1
scaler = MinMaxScaler().fit(train_features)
train_features = scaler.transform(train_features)
test_features = scaler.transform(test_features)
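
The scaler is fitted on the training features only and then applied to both sets, so the test set never influences the scaling. A minimal NumPy sketch of the same arithmetic, assuming the default 0-1 range and using hypothetical toy values, shows one consequence: scaled test values can fall slightly outside the range 0-1.

```python
import numpy as np

# Hypothetical toy data: two features, values chosen only for illustration
train = np.array([[1.0, 10.0], [3.0, 20.0], [5.0, 40.0]])
test = np.array([[2.0, 50.0], [6.0, 10.0]])

# "Fit": learn the per-column minimum and range from the training rows only
col_min = train.min(axis=0)
col_range = train.max(axis=0) - col_min

# "Transform": apply the same training statistics to both sets
train_scaled = (train - col_min) / col_range
test_scaled = (test - col_min) / col_range

print("train min/max per column:", train_scaled.min(axis=0), train_scaled.max(axis=0))
print("scaled test rows:")
print(test_scaled)
```

Training columns always span exactly 0 to 1, while a test value larger than anything seen in training maps above 1. That is expected and usually harmless for a small network like the one used here.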


In [None]:
# Put code in this cell and create new cells below for more code
# Our networks are always sequential
nn = keras.Sequential() # Create the network

# Add the input layer and specify its activation function
# We have 4 features so the input shape is (4, )
# Let's start with 2 neurons on this layer using swish
nn.add(keras.layers.Dense(2, input_shape=(4,), activation='swish'))

# Add the second layer and specify its activation function
# We want one output
# The output should be between 0 and 1
# so we use a sigmoid activation function
nn.add(keras.layers.Dense(1, activation='sigmoid'))

# Specify the optimizer (here RMSprop with a learning rate of 0.5),
# the loss (error) function, and the metrics to report
# while the network is training
nn.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.5),
           loss=keras.losses.BinaryCrossentropy(),
           metrics=[keras.metrics.BinaryAccuracy()])

# Early stopping halts training once the training loss has not improved for 3 epochs in a row
callback = keras.callbacks.EarlyStopping(monitor='loss', patience=3)

# Finally train the network
print("Training History:")
nn.fit(train_features, train_targets, epochs=50, callbacks=[callback])
# Evaluate its performance on the test set
print("Testing Results:")
nn.evaluate(test_features, test_targets)

Training History:
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Testing Results:


[0.02578314207494259, 0.9818181991577148]
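
The two numbers returned by `nn.evaluate` are the loss (binary cross-entropy) followed by the metric (binary accuracy). As a rough sketch of what these quantities measure, the NumPy functions below compute both by hand. The function names, the clipping constant `eps`, and the example probabilities are illustrative assumptions, not values from the network above.

```python
import numpy as np


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    per_sample = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(per_sample))


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    predicted = (y_prob > threshold).astype(int)
    return float(np.mean(predicted == y_true))


# Hypothetical labels and sigmoid outputs for four samples
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.6, 0.8, 0.9])

print("loss:", round(binary_cross_entropy(y_true, y_prob), 4))
print("accuracy:", binary_accuracy(y_true, y_prob))
```

The loss penalises confident wrong predictions heavily, whereas accuracy only counts which side of 0.5 each prediction falls on, so the two can move somewhat independently.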

## Part Two - Regression Network





### Questions
**1. Run your training and evaluation code 3 times and record the number of training epochs needed, the final training accuracy and the final testing accuracy. Remember that with early stopping, your final training accuracy is at the point of lowest loss (likely 3 epochs back from when the training actually stopped).**

The values below are mean absolute error (MAE), the metric reported by this network, so lower is better.

| Number of epochs | Final training MAE | Final testing MAE |
|---|---|---|
| 19 | 0.1066 | 0.0791 |
| 13 | 0.1379 | 0.1066 |
| 7 | 0.1594 | 0.1370 |


**2. Are the results reasonably consistent?** Yes, the results are reasonably consistent: the testing error stayed between about 0.08 and 0.14 across the three runs.

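**3. Are you seeing significant differences in accuracy between training and testing. What might this mean?** No, there aren't any significant differences between training and testing. In each run the two errors are within about 0.03 of each other, which suggests the network is not overfitting the training data.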

**4. Create a plot of the actual output of one of your runs compared to the target output. An example of such a plot is in the Regression Learning Example notebook.**

**5. Copy your training code to a new cell. Increase the first and second layers to 64 neurons and re-record the results from 3 runs (as in question 1).**

As before, the values are mean absolute error (MAE), so lower is better.

| Number of epochs | Final training MAE | Final testing MAE |
|---|---|---|
| 12 | 0.2507 | 0.2528 |
| 17 | 0.2574 | 0.2346 |
| 20 | 0.2394 | 0.2538 |


**6. Are these results appreciably different from the earlier results on the first network? If so, why might that be the case?**
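Yes, the first network performed much better: its training and testing errors (roughly 0.08 to 0.16) were well below those of the larger network (roughly 0.23 to 0.26). This may be because the greater number of neurons in the second network (64 per layer) makes it harder for the model to learn with the same learning rate and early stopping settings.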
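
To put the size difference between the two regression networks in numbers, the trainable parameters of a stack of dense layers can be counted directly: each layer has one weight per input-output pair plus one bias per output. This is a small illustrative helper (the name `dense_param_count` is hypothetical), not part of the training code.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for fully connected layers.

    layer_sizes lists the input width followed by the units in each layer,
    e.g. [1, 8, 4, 1] for one input, hidden layers of 8 and 4, and one output.
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


small = dense_param_count([1, 8, 4, 1])     # (1+1)*8 + (8+1)*4 + (4+1)*1 = 57
large = dense_param_count([1, 64, 64, 1])   # (1+1)*64 + (64+1)*64 + (64+1)*1 = 4353
print(f"small network: {small} parameters, large network: {large} parameters")
```

The larger network has roughly 76 times as many parameters to adjust, while the learning rate and patience are unchanged between the two experiments.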

**7. Again create a plot of the actual output of one of your runs with this new larger network compared to the target output.** 

Record your answers in text cells.

In [None]:
# Create the dataset
x = np.linspace(0,2*np.pi,1001)
y = np.sin(x)*np.cos(x)
df = pd.DataFrame({"Input":x, "Target":y})
alt.Chart(df, title="cos(x)*sin(x)").mark_line().encode(
    x='Input:Q',
    y='Target:Q'
)
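
As a reference point for reading the error values in this part, it can help to know how well a trivial model would do. The sketch below rebuilds the same dataset, checks the identity sin(x)·cos(x) = ½·sin(2x), and computes the error of a hypothetical baseline that always predicts 0. The baseline is an added point of comparison, not something the assignment asks for.

```python
import numpy as np

# Same dataset as in the notebook cell above
x = np.linspace(0, 2 * np.pi, 1001)
y = np.sin(x) * np.cos(x)

# The target is a sine wave with half the amplitude and twice the frequency
assert np.allclose(y, 0.5 * np.sin(2 * x))

# Hypothetical baseline: always predict 0 (the target averages to about 0)
baseline_mae = float(np.mean(np.abs(y - 0.0)))
baseline_mse = float(np.mean((y - 0.0) ** 2))

print(f"baseline MAE: {baseline_mae:.3f}, baseline MSE: {baseline_mse:.3f}")
```

A network whose testing MAE is close to the baseline MAE (about 1/π) has learned little more than a flat line, whereas a much smaller MAE indicates it is following the shape of the curve.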

In [None]:
# Pull out features and targets
features = df.iloc[:, :-1].values
targets = df.iloc[:,-1].values

# Split off a test set
train_features, test_features, train_targets, test_targets = train_test_split(features, targets, test_size=0.2, random_state=42)

# Skip scaling as there is only one feature and it is in a reasonable range

In [None]:
# Put code in this cell and create new cells below for more code
# Our networks are always sequential
nn = keras.Sequential() # Create the network

# Add the input layer and specify its activation function
# We have 1 feature so the input shape is (1, )
# Let's start with 8 neurons using swish
nn.add(keras.layers.Dense(8, input_shape=(1,), activation='swish'))

# Then add another hidden layer with 4 neurons
nn.add(keras.layers.Dense(4, activation='swish'))

# Add the output layer and specify its activation function
# We want one output with no limits on the range
# so we use a linear activation function
nn.add(keras.layers.Dense(1, activation='linear'))

# Specify the optimizer, the loss (error) function, and the metrics
# to report while the network is training
nn.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.05),
           loss=keras.losses.MeanSquaredError(),
           metrics=[keras.metrics.MeanAbsoluteError()])

# Early stopping halts training once the training loss has not improved for 3 epochs in a row
callback = keras.callbacks.EarlyStopping(monitor='loss', patience=3)

# Finally train the network
print("Training History:")
nn.fit(train_features, train_targets, epochs=50, callbacks=[callback])
# Evaluate the results
print("Testing Results:")
nn.evaluate(test_features, test_targets)

Training History:
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Testing Results:


[0.03143736720085144, 0.13701750338077545]

In [None]:
# Look at the predictions
# Reshape x to (n_samples, 1) so it matches the network's input shape
actual = nn.predict(x.reshape(-1, 1)).flatten()
df["Actual"] = actual
alt.Chart(df, title="cos(x)*sin(x)").transform_fold(
    ['Target', 'Actual'],
    as_=['Result', 'Output']
).mark_line().encode(
    x='Input:Q',
    y='Output:Q',
    color='Result:N'
)

In [None]:
# Put code in this cell and create new cells below for more code
# Our networks are always sequential
nn = keras.Sequential() # Create the network

# Add the input layer and specify its activation function
# We have 1 feature so the input shape is (1, )
# The first hidden layer now has 64 neurons using swish
nn.add(keras.layers.Dense(64, input_shape=(1,), activation='swish'))

# Then add another hidden layer, also with 64 neurons
nn.add(keras.layers.Dense(64, activation='swish'))

# Add the output layer and specify its activation function
# We want one output with no limits on the range
# so we use a linear activation function
nn.add(keras.layers.Dense(1, activation='linear'))

# Specify the optimizer, the loss (error) function, and the metrics
# to report while the network is training
nn.compile(optimizer=keras.optimizers.RMSprop(learning_rate=0.05),
           loss=keras.losses.MeanSquaredError(),
           metrics=[keras.metrics.MeanAbsoluteError()])

# Early stopping halts training once the training loss has not improved for 3 epochs in a row
callback = keras.callbacks.EarlyStopping(monitor='loss', patience=3)

# Finally train the network
print("Training History:")
nn.fit(train_features, train_targets, epochs=50, callbacks=[callback])
# Evaluate the results
print("Testing Results:")
nn.evaluate(test_features, test_targets)

Training History:
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Testing Results:


[0.09439661353826523, 0.253794401884079]

In [None]:
# Look at the predictions
# Reshape x to (n_samples, 1) so it matches the network's input shape
actual = nn.predict(x.reshape(-1, 1)).flatten()
df["Actual"] = actual
alt.Chart(df, title="cos(x)*sin(x)").transform_fold(
    ['Target', 'Actual'],
    as_=['Result', 'Output']
).mark_line().encode(
    x='Input:Q',
    y='Output:Q',
    color='Result:N'
)