## MLP for Binary Classification

In this lab, you will use the Ionosphere binary (two-class) classification dataset to demonstrate an MLP for binary classification.

This dataset involves predicting whether a structure is in the atmosphere or not given radar returns.

The dataset will be downloaded automatically using Pandas, but you can learn more in the links below.

[Ionosphere Dataset (csv)](https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv)

[Ionosphere Dataset Description (names)](https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.names)


Your task for this lab is to develop a Keras-based Multi-Layer Perceptron model for this dataset. Remember that the output layer must match the task: for binary classification with a sigmoid activation, a single output neuron is enough.

Some of the code is provided below; you need to complete the rest on your own.



In [19]:
# Importing Libraries

# Your code to import the read_csv function from pandas
# Your code to import the train_test_split function from sklearn. Follow link https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from sklearn.model_selection import train_test_split 


In [3]:
import pandas as pd

# Read the dataset from the path below. Store the data in a pandas dataframe named 'df'

Link to API - https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

In [4]:
path = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/ionosphere.csv'
df=pd.read_csv(path)
# Your code to read the csv from the above path.

Look at a sample of the dataset. Print the first few rows using the dataframe.head() method.

Link to API:  https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.head.html

In [5]:
# Your code to print first few rows of the dataset.
df.head()

Unnamed: 0,1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1.1,0.03760,...,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300,g
0,1,0,1.0,-0.18829,0.93035,-0.36156,-0.10868,-0.93597,1.0,-0.04549,...,-0.26569,-0.20468,-0.18401,-0.1904,-0.11593,-0.16626,-0.06288,-0.13738,-0.02447,b
1,1,0,1.0,-0.03365,1.0,0.00485,1.0,-0.12062,0.88965,0.01198,...,-0.4022,0.58984,-0.22145,0.431,-0.17365,0.60436,-0.2418,0.56045,-0.38238,g
2,1,0,1.0,-0.45161,1.0,1.0,0.71216,-1.0,0.0,0.0,...,0.90695,0.51613,1.0,1.0,-0.20099,0.25682,1.0,-0.32382,1.0,b
3,1,0,1.0,-0.02401,0.9414,0.06531,0.92106,-0.23255,0.77152,-0.16399,...,-0.65158,0.1329,-0.53206,0.02431,-0.62197,-0.05707,-0.59573,-0.04608,-0.65697,g
4,1,0,0.02337,-0.00592,-0.09924,-0.11949,-0.00763,-0.11824,0.14706,0.06637,...,-0.01535,-0.0324,0.09223,-0.07859,0.00732,0.0,0.0,-0.00039,0.12011,b
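
The column labels in the `head()` output above are data values (`0.99539`, `-0.05889`, ..., `g`), which suggests that `read_csv` treated the first record of the file as a header row. A minimal sketch of reading a header-less CSV follows, using a small made-up sample rather than the real file; `sample_csv` and `sample_df` are hypothetical names that are not part of the lab.

```python
from io import StringIO

import pandas as pd

# Hypothetical two-row sample in the same layout as the real file:
# numeric features followed by a class letter, with no header row.
sample_csv = StringIO("1,0,0.5,g\n1,0,-0.2,b\n")

# header=None keeps the first line as data and labels the columns 0, 1, 2, ...
sample_df = pd.read_csv(sample_csv, header=None)

print(sample_df.shape)
print(sample_df.iloc[:, -1].tolist())
```

If `header=None` were passed when reading the lab's `path`, the first record should be kept as data, so the row count would be one higher than shown in this notebook, and the label column would be selected by position instead of by the name `'g'`.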


Print summary statistics of the dataset. Use dataframe.describe() from the pandas library (dataframe.info() reports column types and non-null counts instead).


In [6]:
# Your code to print summary statistics of the dataframe
df.describe()

Unnamed: 0,1,0,0.99539,-0.05889,0.85243,0.02306,0.83398,-0.37708,1.1,0.03760,...,0.56811,-0.51171,0.41078,-0.46168,0.21266,-0.34090,0.42267,-0.54487,0.18641,-0.45300
count,350.0,350.0,350.0,350.0,350.0,350.0,350.0,350.0,350.0,350.0,...,350.0,350.0,350.0,350.0,350.0,350.0,350.0,350.0,350.0,350.0
mean,0.891429,0.0,0.64033,0.044667,0.60035,0.116154,0.549284,0.120779,0.510453,0.181756,...,0.395643,-0.069928,0.542015,-0.068417,0.378919,-0.027013,0.352313,-0.002248,0.349829,0.015816
std,0.311546,0.0,0.498059,0.442032,0.520431,0.461443,0.493124,0.520816,0.507117,0.484482,...,0.579206,0.508675,0.516896,0.550411,0.576642,0.508425,0.572289,0.513491,0.523339,0.468338
min,0.0,0.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,...,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0,-1.0
25%,1.0,0.0,0.471517,-0.065388,0.412555,-0.024868,0.209105,-0.053483,0.086785,-0.049003,...,0.0,-0.323745,0.283612,-0.428992,0.0,-0.234935,0.0,-0.239347,0.0,-0.161013
50%,1.0,0.0,0.870795,0.0167,0.80862,0.02117,0.728,0.015085,0.68243,0.01755,...,0.549175,-0.014915,0.70853,-0.017685,0.499215,0.0,0.446875,0.0,0.413115,0.0
75%,1.0,0.0,1.0,0.194727,1.0,0.335317,0.970445,0.451572,0.950555,0.536192,...,0.907165,0.157922,0.999972,0.154862,0.884572,0.154218,0.85949,0.200935,0.816778,0.172105
max,1.0,0.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,...,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0


Print the shape of the dataframe. Select a suitable API call from the pandas library.

In [9]:
# Your code to print the shape of the dataset
df.shape

(350, 35)

# Separate the input and output from the dataframe. The input is all columns except the last one. The output is the last column.


In [15]:
X = df.values[:, :-1]
X
# Your code to get y - Hint y = df.values[:, some parameters]

array([[1, 0, 1.0, ..., -0.06288, -0.13738, -0.02447],
       [1, 0, 1.0, ..., -0.2418, 0.56045, -0.38238],
       [1, 0, 1.0, ..., 1.0, -0.32382, 1.0],
       ...,
       [1, 0, 0.94701, ..., 0.00442, 0.92697, -0.00577],
       [1, 0, 0.90608, ..., -0.03757, 0.87403, -0.16243],
       [1, 0, 0.8471, ..., -0.06678, 0.85764, -0.06151]], dtype=object)

In [21]:
Y = df['g']  # the last column is named 'g' here because the first record was read as the header
Y

0      b
1      g
2      b
3      g
4      b
      ..
345    g
346    g
347    g
348    g
349    g
Name: g, Length: 350, dtype: object

In [23]:
Y.shape

(350,)

In the following cell, we convert everything in X to 'float32' and encode the letters in Y as integers.

In [28]:
X = X.astype('float32')
Y = LabelEncoder().fit_transform(Y)
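
To make the encoding step concrete, here is a small sketch of what `LabelEncoder` does with letter labels. The `labels` list is a hypothetical sample in the same `'g'`/`'b'` format as the last column, not the real data.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical handful of labels in the same 'g'/'b' format as the last column
labels = ['b', 'g', 'b', 'g', 'g']

encoder = LabelEncoder()
encoded = encoder.fit_transform(labels)

# classes_ holds the sorted unique labels; the position of a label
# in this array is the integer code assigned to it.
print(list(encoder.classes_))
print(encoded.tolist())
```

Because the classes are sorted, `'b'` is encoded as 0 and `'g'` as 1, which matters later when interpreting the sigmoid output of the network.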


Print the general information of X and Y in the following cell.

In [30]:
# Your code to print X
# Your code to print y
# your code to print shape of X. Remember X is a numpy array
# your code to print shape of y. Remember y is a numpy array


In [32]:
X.shape

(350, 34)

In [33]:
Y.shape

(350,)

* Separate X and y into training and test sets with a ratio of your choice.
* Print the shapes of the resulting arrays.
* Get the number of features from X_train. Remember that the number of features is the number of inputs.

Use the sklearn train_test_split function.
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html


In [34]:
# Your code to separate the data into training and test sets.

# Your code to print shape of X_train
# Your code to print shape of X_test
# Your code to print shape of y_train
# Your code to print shape of y_test

X_train, X_test, y_train, y_test = train_test_split(X,Y, 
                                   random_state=104,  
                                   test_size=0.25,  
                                   shuffle=True) 

n_features = X_train.shape[1]
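
As a rough check of what the split produces, the sketch below applies the same `train_test_split` arguments to placeholder arrays with the shapes used in this lab (350 rows, 34 features). The names `X_demo`, `y_demo`, `X_tr`, `X_te`, `y_tr`, and `y_te` are hypothetical and the arrays contain zeros, not the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays with the same shapes as X and Y in this lab
X_demo = np.zeros((350, 34), dtype='float32')
y_demo = np.zeros(350, dtype='int64')

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, random_state=104, shuffle=True
)

print(X_tr.shape, X_te.shape)
print(y_tr.shape, y_te.shape)
print(X_tr.shape[1])  # number of input features
```

With `test_size=0.25` the test set size is rounded up, so 350 rows split into 262 training rows and 88 test rows.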

# Creating a Multi-layer Perceptron using Keras.
We have added the first and last layers. Create the hidden layers of your choice.
You can choose any number of hidden layers and any activation functions.
https://keras.io/api/layers/core_layers/dense/

In [40]:
import tensorflow as tf
model = Sequential()
model.add(Dense(10, activation='relu', input_shape=(n_features,)))
model.add(Dense(100, activation='relu'))
#
# Add as many layers with activation functions of your choice
#
model.add(Dense(1, activation='sigmoid'))
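
To get a feel for the size of the network defined above, the sketch below counts its trainable parameters by hand. It assumes 34 input features (the second dimension of X) and the three Dense layers shown; `layer_sizes` and `dense_params` are hypothetical names introduced only for this illustration.

```python
# Layer widths of the model above: 34 inputs, then Dense(10), Dense(100), Dense(1)
layer_sizes = [34, 10, 100, 1]


def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


# Pair each layer's input width with its output width
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)
```

Adding or widening hidden layers changes these counts, which is one way to compare the capacity of different architectures you try.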

In the next cell, we train the above neural network model and test its accuracy. As this concept has not yet been covered in class, just run the code to check the accuracy.

In [42]:
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=150, batch_size=32, verbose=0)

loss, acc = model.evaluate(X_test, y_test, verbose=0)
print('Test Accuracy: %.3f' % acc)

Test Accuracy: 0.943


**How much accuracy did you get? Compare your accuracy with your peers.**
**Now, change your model and activation functions to get better accuracy than your peers.**

## **Important:** Document in your lab logbook the accuracy of the improved model. Do not include any code or explanations in your lab logbook. Simply record the accuracy. For example, if the obtained accuracy is 0.98, then enter "0.98" in your lab logbook.

## In addition to the accuracy, also document the output of the neural network as provided in Task 2.



Next, we have provided the code to predict on an unknown value.
We will cover these concepts later in the class. For now, just run the code to see the prediction.

In [43]:
row = [1,0,0.99539,-0.05889,0.85243,0.02306,
       0.83398,-0.37708,1,0.03760,0.85243,-0.17755,
       0.59755,-0.44945,0.60536,-0.38223,0.84356,
       -0.38542,0.58212,-0.32192,0.56971,-0.29674,0.36946,
       -0.47357,0.56811,-0.51171,0.41078,-0.46168,0.21266,
       -0.34090,0.42267,-0.54487,0.18641,-0.45300]
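import numpy as np

# Wrap the row in a 2-D float array: one sample with n_features columns
yhat = model.predict(np.array([row], dtype='float32'))
# predict returns an array of shape (1, 1); index it to get the scalar probability
print('Predicted: %.3f' % yhat[0, 0])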

Predicted: 1.000
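
The predicted value is the output of the sigmoid neuron, which can be read as the estimated probability of the class encoded as 1. The sketch below shows one way to turn such outputs into class labels. It assumes a decision threshold of 0.5 and the alphabetical `LabelEncoder` mapping (0 for `'b'`, 1 for `'g'`); the `probs` values are made up for illustration and are not outputs of the trained model.

```python
import numpy as np

# Hypothetical sigmoid outputs, shaped like the (n_samples, 1) array returned by predict
probs = np.array([[0.97], [0.12], [0.50]])

# Assumed decision threshold of 0.5
class_ids = (probs[:, 0] >= 0.5).astype(int)

# Assumed mapping from LabelEncoder's alphabetical ordering: 0 -> 'b', 1 -> 'g'
class_names = np.array(['b', 'g'])[class_ids]

print(class_ids.tolist())
print(class_names.tolist())
```

Under these assumptions, a prediction of 1.000 for the row above would correspond to the class `'g'`.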


### Try building the same model with the Keras Functional API!
Refer to [Keras](https://keras.io/) for more details and tutorials.
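
As a starting point, here is a minimal sketch of the same three-layer architecture written with the Functional API. It assumes 34 input features, as in this lab; the name `functional_model` is hypothetical, and the hidden layers are only one possible choice.

```python
from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense

n_features = 34  # matches X.shape[1] in this lab

# Each layer is called on the output of the previous one
inputs = Input(shape=(n_features,))
x = Dense(10, activation='relu')(inputs)
x = Dense(100, activation='relu')(x)
outputs = Dense(1, activation='sigmoid')(x)

# The model is defined by its input and output tensors
functional_model = Model(inputs=inputs, outputs=outputs)
functional_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
functional_model.summary()
```

Training and evaluation would use the same `fit` and `evaluate` calls as the Sequential model.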