# Neural Networks

Neural networks have grown in popularity because they are more flexible and customisable than many traditional machine learning models.

The basic element of a neural network is the perceptron. It receives inputs, multiplies them by weights, sums the products, and passes the result to an activation function (logistic, ReLU, etc.), which in turn produces an output.
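As a rough illustration of the description above, a single perceptron can be sketched in plain Python. The input values, weights, and the `bias` term are made-up assumptions for the example (the text above does not mention a bias); in a real network the weights are learned during training.

```python
import math

def relu(z):
    """Rectified linear unit: keeps positive values, clips negatives to 0."""
    return max(0.0, z)

def logistic(z):
    """Logistic (sigmoid) function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

# Illustrative values only (assumptions, not taken from the dataset).
inputs = [0.5, 0.2, 0.1]
weights = [0.4, -0.6, 0.9]
bias = 0.1

print(perceptron(inputs, weights, bias, relu))
print(perceptron(inputs, weights, bias, logistic))
```

Swapping the activation function changes only the last step: the weighted sum is the same, but ReLU returns it unchanged when positive while the logistic function maps it to a value between 0 and 1.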

A neural network is built by stacking layers (levels) made up of these perceptrons. There are three types of layer: input, hidden, and output.

- Input layer: receives the data
- Hidden layer(s): there can be one or more, and this is where most of the computation takes place
- Output layer: returns the result
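To make the layer structure concrete, here is a minimal sketch of data flowing from an input layer through one hidden layer to an output layer. The helper name `layer_forward` and all weights and biases are hypothetical values chosen for readability, not anything learned from data.

```python
def relu(z):
    return max(0.0, z)

def layer_forward(inputs, weights, biases, activation):
    """Compute the output of one fully connected layer.

    weights[j] holds the incoming weights of perceptron j in this layer.
    """
    return [
        activation(sum(x * w for x, w in zip(inputs, w_j)) + b_j)
        for w_j, b_j in zip(weights, biases)
    ]

# Input layer: simply holds the data (2 features here).
x = [1.0, 2.0]

# Hidden layer with 3 perceptrons (made-up weights and biases).
hidden_w = [[0.5, -0.5], [0.25, 0.25], [-1.0, 1.0]]
hidden_b = [0.0, 0.5, 0.0]

# Output layer with 1 perceptron and no activation (identity).
output_w = [[1.0, 1.0, 1.0]]
output_b = [0.0]

hidden = layer_forward(x, hidden_w, hidden_b, relu)
output = layer_forward(hidden, output_w, output_b, lambda z: z)

print(hidden)  # [0.0, 1.25, 1.0]
print(output)  # [2.25]
```

Each layer's output becomes the next layer's input, which is why adding more hidden layers only means repeating the `layer_forward` call.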

## Predicting Diabetes

We are going to use the same dataset used for classification, and assign each person a label indicating whether they are diabetic or not.

### Loading the data

In [1]:
import pandas as pd

data = pd.read_csv('diabetes.csv')
data.head()

| | PatientID | Pregnancies | PlasmaGlucose | DiastolicBloodPressure | TricepsThickness | SerumInsulin | BMI | DiabetesPedigree | Age | Diabetic |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1354778 | 0 | 171 | 80 | 34 | 23 | 43.509726 | 1.213191 | 21 | 0 |
| 1 | 1147438 | 8 | 92 | 93 | 47 | 36 | 21.240576 | 0.158365 | 23 | 0 |
| 2 | 1640031 | 7 | 115 | 47 | 52 | 35 | 41.511523 | 0.079019 | 23 | 0 |
| 3 | 1883350 | 9 | 103 | 78 | 25 | 304 | 29.582192 | 1.28287 | 43 | 1 |
| 4 | 1424119 | 1 | 85 | 59 | 27 | 35 | 42.604536 | 0.549542 | 22 | 0 |


### Check data

The next step is to check the state of the data, starting with some basic statistics.

In [2]:
print(data.shape)
print(data.describe().transpose())

(15000, 10)
                          count          mean            std           min  \
PatientID               15000.0  1.502922e+06  289253.443471  1.000038e+06   
Pregnancies             15000.0  3.224533e+00       3.391020  0.000000e+00   
PlasmaGlucose           15000.0  1.078569e+02      31.981975  4.400000e+01   
DiastolicBloodPressure  15000.0  7.122067e+01      16.758716  2.400000e+01   
TricepsThickness        15000.0  2.881400e+01      14.555716  7.000000e+00   
SerumInsulin            15000.0  1.378521e+02     133.068252  1.400000e+01   
BMI                     15000.0  3.150965e+01       9.759000  1.820051e+01   
DiabetesPedigree        15000.0  3.989677e-01       0.377944  7.804379e-02   
Age                     15000.0  3.013773e+01      12.089703  2.100000e+01   
Diabetic                15000.0  3.333333e-01       0.471420  0.000000e+00   

                                 25%           50%           75%           max  
PatientID               1.252866e+06  1.505508e+

We can also check if there are any nulls.

In [3]:
print(data.isnull().sum())

PatientID                 0
Pregnancies               0
PlasmaGlucose             0
DiastolicBloodPressure    0
TricepsThickness          0
SerumInsulin              0
BMI                       0
DiabetesPedigree          0
Age                       0
Diabetic                  0
dtype: int64


### Selecting Labels and Features

Now we can define the features and the target label. PatientID is excluded because it is only an identifier.

In [4]:
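target = ['Diabetic']    # label to predict
exclude = ['PatientID']  # identifier, not a predictive feature

# Every remaining column is a feature. Note that a set difference does not
# preserve the original column order.
features = list(set(data.columns) - set(target) - set(exclude))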

### Data Scaling

Neural networks generally train better when all features are on the same scale. This can be done with the MinMaxScaler found in sklearn, which rescales each feature to the range 0 to 1.
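Min-max scaling maps each value `x` of a feature to `(x - min) / (max - min)`. The sketch below applies that formula by hand to the Age values of the five rows shown by `data.head()`. The helper `min_max_scale` is a hypothetical name, and returning 0.0 for a constant column is a choice made here to avoid division by zero. Because the minimum and maximum are taken from these five rows only, the results differ from scaling the full dataset.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Age column of the five rows displayed earlier.
ages = [21, 23, 23, 43, 22]
scaled_ages = min_max_scale(ages)

print(scaled_ages)
```

The smallest value (21) maps to 0 and the largest (43) maps to 1, with everything else falling proportionally in between.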

In [5]:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
scaled = scaler.fit_transform(data[features])
print(scaled)

[[8.58108108e-01 3.13953488e-01 0.00000000e+00 ... 6.68952179e-01
  5.10511281e-01 1.14649682e-02]
 [3.24324324e-01 4.65116279e-01 5.71428571e-01 ... 8.03524571e-02
  3.61229438e-02 2.80254777e-02]
 [4.79729730e-01 5.23255814e-01 5.00000000e-01 ... 6.16137348e-01
  4.38385837e-04 2.67515924e-02]
 ...
 [3.31081081e-01 4.18604651e-01 0.00000000e+00 ... 1.29558076e-02
  1.56958511e-01 5.47770701e-02]
 [5.94594595e-01 1.27906977e-01 0.00000000e+00 ... 4.20555241e-02
  1.00835769e-01 1.87261146e-01]
 [4.72972973e-01 4.65116279e-01 2.14285714e-01 ... 4.76155567e-01
  3.11749422e-02 6.34394904e-01]]


### Data Splitting

Now we can split the data into training and test sets, making sure to use the scaled data for the features. Here 30% of the rows are held out for testing.

In [6]:
from sklearn.model_selection import train_test_split

X = scaled
y = data[target[0]].values
print(y)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
print('Training Instances : ', X_train.shape)
print('Test Instances : ', X_test.shape)

[0 0 0 ... 0 0 1]
Training Instances :  (10500, 8)
Test Instances :  (4500, 8)


### Training the Model

In this case we will use a neural network classifier named MLPClassifier (multi-layer perceptron). It accepts a number of parameters; here we set:
- hidden_layer_sizes to (8, 8, 8), which means 3 hidden layers with 8 perceptrons in each layer
- activation to 'relu', which determines the activation function of the hidden layers
- solver to 'adam', which is the algorithm used for weight optimization
- max_iter to 1000, which is the maximum number of iterations (epochs for the 'adam' solver); training stops earlier if the model converges

For more information: **[MLPClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html)**
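To get a feel for what hidden_layer_sizes=(8, 8, 8) implies, the sketch below lists the weight matrix shapes between consecutive layers and counts the trainable parameters. It assumes 8 input features (as in this dataset), a single output unit for the binary label, and one bias per perceptron; `layer_shapes` and `count_parameters` are hypothetical helper names, not part of sklearn.

```python
def layer_shapes(n_inputs, hidden_layer_sizes, n_outputs):
    """Return (n_in, n_out) for each pair of consecutive layers."""
    sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))

def count_parameters(shapes):
    """Weights (n_in * n_out) plus one bias per perceptron (n_out)."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)

shapes = layer_shapes(n_inputs=8, hidden_layer_sizes=(8, 8, 8), n_outputs=1)

print(shapes)                    # [(8, 8), (8, 8), (8, 8), (8, 1)]
print(count_parameters(shapes))  # 225
```

Changing the tuple, for example to (8, 6, 4), changes both the number of layers and the number of parameters the solver has to fit.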

After the model is trained, we can predict using the test data set.

In [7]:
# Other hidden layer combinations to try: (8, 12, 8, 4) and (8, 6, 4).
# Open question: which combination of hidden layers works best?

from sklearn.neural_network import MLPClassifier

mlp = MLPClassifier(hidden_layer_sizes=(8, 8, 8), activation='relu', solver='adam', max_iter=1000)
mlp.fit(X_train, y_train)

predictions = mlp.predict(X_test)

### Evaluating the Model

After we predict the values, we can use any metric we like to evaluate the performance of the model. In this case we create a confusion matrix and a classification report.

In [8]:
from sklearn.metrics import classification_report, confusion_matrix

print(confusion_matrix(y_test, predictions))
print(classification_report(y_test, predictions))

[[2765  221]
 [ 225 1289]]
              precision    recall  f1-score   support

           0       0.92      0.93      0.93      2986
           1       0.85      0.85      0.85      1514

    accuracy                           0.90      4500
   macro avg       0.89      0.89      0.89      4500
weighted avg       0.90      0.90      0.90      4500
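The figures in the classification report can be derived directly from the confusion matrix. The following sketch recomputes accuracy and the precision, recall, and F1 score of class 1 (diabetic) from the four counts printed above, reading rows as true classes and columns as predicted classes.

```python
# Confusion matrix from the output above:
# rows are true classes, columns are predicted classes.
cm = [[2765, 221],
      [225, 1289]]

tn, fp = cm[0]  # true negatives, false positives
fn, tp = cm[1]  # false negatives, true positives

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # of those predicted diabetic, how many are
recall = tp / (tp + fn)     # of those actually diabetic, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.90 precision=0.85 recall=0.85 f1=0.85
```

These values match the row for class 1 and the accuracy line in the report, which is a useful sanity check when reading either output.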

