<a href="https://colab.research.google.com/github/uteyechea/neural-network-from-scratch/blob/main/ANN_MultiLayer_MultiLabelClasses.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Part 1: Import all necessary dependencies

In [1]:
!pip install scikit-multilearn #Data split train/test sets



In [2]:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from sklearn.metrics import confusion_matrix
from skmultilearn.model_selection import iterative_train_test_split

#Part 2: Import data and make train/test sets

We will be working with ECG data, looking to classify normal ECGs vs. not-normal ECGs.

##2.1 Download ECG data


In [3]:
def read_cvs_with_html_tags(data_url='https://github.com/uteyechea/neural-network-from-scratch/blob/main/ecg_data/ecg.csv'): 
  # This function only works for my very peculiar data structure.
  """
  Arguments:
  data_url -- csv file url

  Returns:
  X -- Training data
  y -- Labels
  """ 
  raw_data=pd.read_html(data_url) # Looking for <table> tag
  unformatted_data=raw_data[0][1].str.split(";",expand=True) #read_html output needs a little work: split each row on ';'
  #By now you have a standard pandas DataFrame, but it still needs some work...
  #Make the first row the new header (optional, depending on your csv file)
  new_header=unformatted_data.iloc[0,:]
  data=unformatted_data[1:]
  data.columns=new_header

  X=data.iloc[:,0:-5]
  y=data.iloc[:,-4:] #'Clase' is Spanish for class label; here the last 4 columns are used for the one-hot representation
  #y=y.values.reshape(y.shape[0],1)

  X=np.array(X)
  y=np.array(y)

  X=X.astype(float) #np.float was removed in NumPy 1.24; the builtin float is equivalent
  y=y.astype(float)

  #y=y/max(y)-(1/max(y)) Normalize class labels

  return X,y
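
The column slicing used in `read_cvs_with_html_tags` can be illustrated on a small in-memory table. This is only a minimal sketch, assuming a semicolon-separated layout with feature columns first, then one `Clase` column, then four one-hot columns. The column names `f1`, `f2`, `c0`..`c3` and all values are made up for illustration and are not taken from the ECG file.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the ECG file: 2 features, 'Clase', 4 one-hot columns
csv_text = """f1;f2;Clase;c0;c1;c2;c3
0.1;0.2;1;1;0;0;0
0.3;0.4;2;0;1;0;0
0.5;0.6;4;0;0;0;1
"""

data = pd.read_csv(io.StringIO(csv_text), sep=";")

# Same slicing as in the function above
X_demo = data.iloc[:, 0:-5].to_numpy(dtype=float)  # all columns before the last five
y_demo = data.iloc[:, -4:].to_numpy(dtype=float)   # last four columns (one-hot labels)

print(X_demo.shape, y_demo.shape)
```

Note that `0:-5` drops the last five columns, so the single-valued class column is excluded from both `X_demo` and `y_demo`.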

##2.2 Split train/test data

In [4]:
def split_data(X,y,test_size=0.5):
  """
  Arguments:
  X -- data features
  y -- data labels
  test_size -- Fraction of the data assigned to the test set, e.g. test_size=0.7 puts 70% in the test set and 30% in the training set.

  Returns:
  X_train -- train data features
  y_train -- train data labels
  X_test -- test data features
  y_test -- test data labels
  N -- number of distinct label values in y_test
  """

  X_train, y_train, X_test, y_test = iterative_train_test_split(X, y, test_size) 
  
  #Reshape data to fit expected data structure for nn model
  #X_train=X_train.reshape(X_train.shape[1],X_train.shape[0])
  #X_test=X_test.reshape(X_test.shape[1],X_test.shape[0])
  #y_train=y_train.reshape(y_train.shape[1],y_train.shape[0])
  #y_test=y_test.reshape(y_test.shape[1],y_test.shape[0])

  y_test=y_test.astype(int) #cross entropy loss in NN model expects labels as ints
  y_train=y_train.astype(int) #cross entropy loss in NN model expects labels as ints

  N=len(np.unique(y_test)) #Number of distinct label values; for one-hot labels this is 2 (0 and 1), not the number of classes

  return X_train, y_train, X_test, y_test,N
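
A note on the returned `N`: with one-hot labels, `np.unique` only ever sees the values 0 and 1, so it does not count classes. The sketch below uses a made-up array `y_onehot` (hypothetical, not the ECG labels) to show two ways of getting the class count instead.

```python
import numpy as np

# Hypothetical one-hot labels: 5 samples, 4 classes
y_onehot = np.array([[1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1],
                     [1, 0, 0, 0]])

n_distinct_values = len(np.unique(y_onehot))             # counts the values 0 and 1
n_classes = y_onehot.shape[1]                            # one column per class
n_present = len(np.unique(np.argmax(y_onehot, axis=1)))  # classes that actually occur

print(n_distinct_values, n_classes, n_present)
```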

In [5]:
def get_data_distribution(y_test,y_train):
  #Works for labels of shape (n,1) and for one-hot labels of shape (n,k): each unique row is one class

  y_test_distribution={}
  y_train_distribution={}

  for label in np.unique(y_test,axis=0):
    #Count the rows that match the label in every column; wrap in a list so pd.DataFrame below gets one row
    y_test_distribution['class='+str(label)]=[np.count_nonzero((y_test == label).all(axis=1))]
  
  for label in np.unique(y_train,axis=0):
    y_train_distribution['class='+str(label)]=[np.count_nonzero((y_train == label).all(axis=1))]

  #plot
  y_test=pd.DataFrame(y_test_distribution)
  y_train=pd.DataFrame(y_train_distribution)
  fig = go.Figure(data=[
  go.Bar(name='Train', x=y_train.columns, y=y_train.iloc[0,:]),
  go.Bar(name='Test', x=y_test.columns, y=y_test.iloc[0,:])
  ])
  # Change the bar mode
  fig.update_layout(barmode='stack',
                    title_text='Train/Test data distribution')
  fig.show()    
  
  return y_train_distribution,y_test_distribution

##2.3 Get ECG data

In [6]:
X,y=read_cvs_with_html_tags()
X_train, y_train, X_test, y_test,N=split_data(X,y,test_size=0.3)
y_train_distribution,y_test_distribution=get_data_distribution(y_test,y_train)

#Part 3: ANN model

We have a small amount of data, which is further reduced by splitting it into a training set and a test set. With so little data, we must be careful not to create an overly complex model, which could overfit. To confirm whether your model has too many parameters, however, you need to do some experimentation.

We are going to use an architecture based on a single Dense layer with 128 neurons and a ReLU (Rectified Linear Unit) activation function. A Dense layer with a softmax activation function will be used as the output layer.

To know whether our model is learning properly, we will use a categorical cross-entropy loss function, because we have four one-hot encoded class labels. To report its performance, we will adopt the mean squared error as the metric.
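
To make the loss and the metric concrete, here is a minimal NumPy sketch of softmax, categorical cross-entropy and mean squared error on one-hot targets. The function names, the `eps` clipping value and the example `logits` are assumptions chosen for illustration; Keras' internal implementation may differ in detail.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability; the result is unchanged
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_true, y_prob, eps=1e-7):
    # Mean over samples of -sum_k y_k * log(p_k); clip to avoid log(0)
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return -np.mean(np.sum(y_true * np.log(y_prob), axis=1))

def mean_squared_error(y_true, y_prob):
    # Mean over all samples and all classes of the squared difference
    return np.mean((y_true - y_prob) ** 2)

# Made-up raw outputs for 2 samples and 4 classes
logits = np.array([[2.0, 0.5, 0.1, -1.0],
                   [0.0, 0.0, 3.0, 0.0]])
y_true = np.array([[1, 0, 0, 0],
                   [0, 0, 1, 0]], dtype=float)

probs = softmax(logits)
cce = categorical_cross_entropy(y_true, probs)
mse = mean_squared_error(y_true, probs)
print(cce, mse)
```

With one-hot targets, the cross-entropy only looks at the probability assigned to the true class, while the MSE also penalizes how the remaining probability is spread over the wrong classes.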

##3.1 Model architecture

In [7]:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(128, input_shape=(X_train.shape[1],) ,activation='relu', name='dense_1'))
#model.add(Dense(64, activation='relu', name='dense_2'))
#model.add(Dense(128, activation='relu', name='dense_3'))
model.add(Dense(4, activation='softmax', name='dense_output'))


#loss = tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['mse'])

#Switch to one-hot representation and use loss=categorical_crossentropy

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 128)               5120      
_________________________________________________________________
dense_output (Dense)         (None, 4)                 516       
Total params: 5,636
Trainable params: 5,636
Non-trainable params: 0
_________________________________________________________________
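
The parameter counts in the summary can be checked by hand: a Dense layer has one weight per input per unit, plus one bias per unit. The helper `dense_param_count` below is a hypothetical name used only for this check. The number of input features (39) is inferred from the summary itself (5120 / 128 - 1), not read from the data.

```python
def dense_param_count(n_inputs, n_units):
    # n_inputs weights plus 1 bias for each unit
    return (n_inputs + 1) * n_units

# Input width implied by the 5120 parameters reported for dense_1
n_features = 5120 // 128 - 1

hidden_params = dense_param_count(n_features, 128)  # dense_1
output_params = dense_param_count(128, 4)           # dense_output
total_params = hidden_params + output_params

print(n_features, hidden_params, output_params, total_params)
```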


##3.2 Train model

In [8]:
history = model.fit(X_train, y_train, epochs=1000, validation_split=0.05)

Epoch 1/1000
Epoch 2/1000
...
Epoch 72/1000
(output truncated)
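
According to the Keras documentation, `validation_split` takes the validation samples from the end of the arrays passed to `fit`, before any shuffling. The sketch below mimics that behavior in NumPy; the helper name `tail_validation_split` and the exact rounding of the split index are assumptions, so treat it as an illustration rather than Keras' actual code.

```python
import math

import numpy as np

def tail_validation_split(X, y, validation_split):
    # Keep the first part for training and hold out the tail for validation
    split_at = int(math.floor(len(X) * (1.0 - validation_split)))
    return (X[:split_at], y[:split_at]), (X[split_at:], y[split_at:])

# Made-up data: 20 samples with 2 features each
X_demo = np.arange(40).reshape(20, 2)
y_demo = np.arange(20)

(train_X, train_y), (val_X, val_y) = tail_validation_split(X_demo, y_demo, 0.05)
print(len(train_X), len(val_X))
```

Because the held-out samples are simply the last rows, a training set that is ordered by class would give a validation set that does not represent all classes. With a 5% split of a small dataset, the validation loss is also based on very few samples.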

##3.3 Plot training/validation loss

In [9]:
fig = go.Figure()
fig.add_trace(go.Scattergl(y=history.history['loss'],
                    name='Train'))
fig.add_trace(go.Scattergl(y=history.history['val_loss'],
                    name='Validation'))

fig.update_layout(height=500, width=700,
                  xaxis_title='Epoch',
                  yaxis_title='Loss')
fig.show()

#Part 4: Evaluate model

To properly assess whether our model is capable of working in a real-world scenario, we must evaluate it on our test set. We do this below by calling the evaluate method with the features and targets from the test set.

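##4.1 Estimate loss and MS error

The categorical cross-entropy loss and the Mean Squared Error (MSE) are computed by evaluating the model on the X test set against the y test set. The model was compiled with loss='categorical_crossentropy' and metrics=['mse'], so evaluate returns these two values in that order.

In [10]:
loss_nn, mse_nn = model.evaluate(X_test, y_test)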



##4.2 Compute Confusion Matrix

In [11]:
predictions = model.predict(X_test)

In [12]:
predictions=np.around(predictions,0) 
predictions= predictions.astype(int) #one-hot representation uses binary
#predictions
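
One caveat about rounding before decoding: if no class probability reaches 0.5, the rounded row is all zeros and `np.argmax` then returns class 0. The sketch below uses made-up probabilities (not model output) to compare rounding followed by `argmax` with `argmax` applied directly to the probabilities.

```python
import numpy as np

# Made-up softmax outputs for 2 samples and 4 classes
probs = np.array([[0.30, 0.45, 0.25, 0.00],   # no class reaches 0.5
                  [0.05, 0.05, 0.85, 0.05]])  # confident prediction

rounded = np.around(probs, 0).astype(int)
via_rounding = np.argmax(rounded, axis=1)  # first row rounds to all zeros
via_argmax = np.argmax(probs, axis=1)      # picks the most probable class

print(via_rounding, via_argmax)
```

The two approaches agree whenever the winning probability is above 0.5, so applying `np.argmax` to the raw probabilities is the safer way to decode.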

In [13]:
# Confusion matrix
#sklearn.metrics.confusion_matrix(y_true, y_pred, *, labels=None, sample_weight=None, normalize=None)
confusion_matrix(np.argmax(y_test,axis=1),np.argmax(predictions,axis=1)) #decode one-hot encoding

array([[57,  1,  0,  0],
       [ 1, 26,  0,  0],
       [ 0,  0, 28,  0],
       [ 1,  0,  0, 25]])
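
Summary metrics can be read off the confusion matrix above. In scikit-learn's convention, rows are true classes and columns are predicted classes. The following sketch computes overall accuracy and per-class recall and precision from the matrix as printed; the variable names are chosen here for illustration.

```python
import numpy as np

# Confusion matrix copied from the output above (rows: true, columns: predicted)
cm = np.array([[57,  1,  0,  0],
               [ 1, 26,  0,  0],
               [ 0,  0, 28,  0],
               [ 1,  0,  0, 25]])

accuracy = np.trace(cm) / cm.sum()        # correct predictions / all samples
recall = np.diag(cm) / cm.sum(axis=1)     # per true class
precision = np.diag(cm) / cm.sum(axis=0)  # per predicted class

print(accuracy, recall, precision)
```

Here 136 of the 139 test samples lie on the diagonal, which gives an accuracy of about 0.978.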