
# Artificial Neural Network

Artificial neural networks (ANNs), also called connectionist systems, are computing systems loosely inspired by the biological neural networks that make up animal brains. Such systems "learn" to perform tasks by considering examples, generally without being programmed with task-specific rules.

## 1. ANN: Classification with Keras

**Import Libraries**

In [0]:
# Import Library for Data Manipulation
import pandas as pd
import numpy as np
import keras

In [0]:
# Import Library for Visualization
import matplotlib.pyplot as plt
import seaborn as sns

**Import Data**

In [0]:
# Import Dataset
df_cancer = pd.read_csv('https://raw.githubusercontent.com/dianrdn/data/master/breast_cancer.csv', sep=',')
df_cancer

In [0]:
# Prints the Dataset Information
df_cancer.info()

In [0]:
# Prints Descriptive Statistics
df_cancer.describe().transpose()

**Data Preparation**

Set Features and Target

In [0]:
# Separating Features (Independent Variables) and Target (Dependent Variable)

feature = df_cancer.drop(['id', 'diagnosis', 'Unnamed: 32'], axis=1)
target = df_cancer['diagnosis']

Encode Categorical Data

In [0]:
# Import Module
from sklearn.preprocessing import LabelEncoder

# Encoder
labelencoder = LabelEncoder()

# Encode Categorical Data
target = labelencoder.fit_transform(target)
target
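
To make the encoding step concrete, here is a minimal sketch on a hypothetical four-item list of labels. The values `'M'` and `'B'` are assumed for illustration and should be checked against the actual `diagnosis` column. `LabelEncoder` sorts the distinct labels and assigns integer codes in that order, and `classes_` records which label received which code.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical labels, for illustration only
sample_labels = ['M', 'B', 'B', 'M']

encoder = LabelEncoder()
encoded = encoder.fit_transform(sample_labels)

# classes_ holds the original labels; each label's position is its integer code
mapping = {str(label): int(code) for code, label in enumerate(encoder.classes_)}

print(encoded.tolist())  # [1, 0, 0, 1]
print(mapping)           # {'B': 0, 'M': 1}
```

Checking `classes_` this way is useful later, because precision and recall treat the label encoded as 1 as the positive class.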

Set Training and Testing Data

In [0]:
# Set Training and Testing Data (80:20)
from sklearn.model_selection import train_test_split, cross_val_score
feature_train, feature_test, target_train, target_test = train_test_split(feature, target, shuffle=True, test_size=0.2, random_state=42)

# Show the Training and Testing Data
print(feature_train.shape)
print(feature_test.shape)
print(target_train.shape)
print(target_test.shape)
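
The sketch below shows how an 80:20 split behaves on a small synthetic array. The data and the `stratify` argument are assumptions added for illustration; `stratify` is not used in the cell above. It is an optional setting that keeps the class proportions the same in both subsets.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 rows, 2 features, 70 samples of class 0 and 30 of class 1
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 70 + [1] * 30)

# stratify=y is an optional addition that preserves the 70:30 class ratio
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42, stratify=y
)

print(X_train.shape, X_test.shape)       # (80, 2) (20, 2)
print(int(y_train.sum()), int(y_test.sum()))  # class-1 counts in each subset
```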

**Modeling ANN**

How do we decide the number of units in a hidden layer? Many practitioners describe it as an art that comes with experience and expertise. A simple rule of thumb for beginners is to add the number of feature columns and target columns and divide by 2: (30 + 1) / 2 = 15.5, rounded up to 16. Hence, units = 16.
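
The same arithmetic can be written as a short calculation. The variable names are illustrative, and the rule itself is only a starting point, not a guarantee of a good architecture.

```python
import math

n_features = 30  # number of input columns
n_outputs = 1    # number of target columns

# Average of inputs and outputs, rounded up to a whole number of units
units = math.ceil((n_features + n_outputs) / 2)
print(units)  # 16
```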

In [0]:
# Import Modules
from keras.models import Sequential
from keras.layers import Dense

# Initialising the ANN
classifier = Sequential()

# Adding the input layer and the first hidden layer
classifier.add(Dense(units = 16, kernel_initializer = 'uniform', activation = 'relu', input_dim = 30))

# Adding the second hidden layer
classifier.add(Dense(units = 16, kernel_initializer = 'uniform', activation = 'relu'))

# Adding the output layer (sigmoid outputs a probability for binary classification)
classifier.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))

# Compiling the ANN
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
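
Binary cross-entropy is defined on probabilities between 0 and 1, and a sigmoid is the usual way to map a raw score into that range. The following is a minimal NumPy sketch of both functions, written for illustration; it is not Keras's internal implementation, and the scores and labels are made up.

```python
import numpy as np

def sigmoid(z):
    """Map raw scores to probabilities in the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-likelihood of the true labels under y_prob."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

# Hypothetical raw scores and true labels
scores = np.array([-2.0, 0.0, 3.0])
y_true = np.array([0.0, 1.0, 1.0])

probs = sigmoid(scores)
loss = binary_crossentropy(y_true, probs)
print(probs)
print(loss)
```

A confident correct prediction contributes a loss near 0, while a prediction of 0.5 contributes about 0.69 regardless of the true label.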

In [0]:
# Fitting the ANN to the Training Set
classifier.fit(feature_train, target_train, batch_size = 10, epochs = 100)

In [0]:
# Predicting the Test Set Results
target_predicted = classifier.predict(feature_test)
target_predicted = (target_predicted > 0.5)
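
As a small illustration of the thresholding step, the sketch below converts a hypothetical column of predicted probabilities into class labels. The array stands in for the model output; flattening with `ravel()` and casting to integers are optional conveniences rather than part of the cell above.

```python
import numpy as np

# Hypothetical model output: one probability per row, shape (n_samples, 1)
probabilities = np.array([[0.91], [0.08], [0.50], [0.73]])

# Values strictly greater than 0.5 become class 1, everything else class 0
labels = (probabilities > 0.5).astype(int).ravel()
print(labels)  # [1 0 0 1]
```

Note that a probability of exactly 0.5 falls into class 0 because the comparison is strict.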

**Model Evaluation**

In [0]:
# Import Library
import sklearn.metrics as metrics

# Confusion Matrix
cm = metrics.confusion_matrix(target_test, target_predicted)
cm
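
For a binary problem, scikit-learn lays out the confusion matrix with true labels as rows and predicted labels as columns, so the four cells can be unpacked with `ravel()`. The labels below are hypothetical and serve only to show the layout.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical true and predicted labels
y_true = [0, 0, 0, 1, 1, 1, 1, 0]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0]

# Row order is [actual 0, actual 1]; column order is [predicted 0, predicted 1]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)  # 3 1 1 3
```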

In [0]:
# Accuracy, Precision, Recall, F1 Score, Cohen's Kappa
acc = metrics.accuracy_score(target_test, target_predicted)
prec = metrics.precision_score(target_test, target_predicted)
rec = metrics.recall_score(target_test, target_predicted)
f1 = metrics.f1_score(target_test, target_predicted)
kappa = metrics.cohen_kappa_score(target_test, target_predicted)

# Show the Evaluation Metrics
print('Accuracy:', acc)
print('Precision:', prec)
print('Recall:', rec)
print('F1 Score:', f1)
print("Cohen's Kappa Score:", kappa)
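
Each of these metrics can be derived from the four confusion-matrix counts. The sketch below uses hypothetical counts, chosen only to show the formulas; they are not results from this notebook.

```python
# Hypothetical confusion-matrix counts, for illustration only
tp, fp, fn, tn = 40, 2, 3, 69
n = tp + fp + fn + tn

accuracy = (tp + tn) / n
precision = tp / (tp + fp)   # share of predicted positives that are correct
recall = tp / (tp + fn)      # share of actual positives that are found
f1 = 2 * precision * recall / (precision + recall)

# Cohen's kappa compares observed agreement with agreement expected by chance
p_observed = accuracy
p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
kappa = (p_observed - p_chance) / (1 - p_chance)

print(round(accuracy, 4), round(precision, 4), round(recall, 4))
print(round(f1, 4), round(kappa, 4))
```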

## 2. ANN: Multi-layer Perceptron Regressor

The Boston Housing dataset records house prices in the Boston area. Alongside the median home value (`medv`), it provides attributes such as the per-capita crime rate by town (CRIM), the proportion of non-retail business acres per town (INDUS), and the proportion of owner-occupied units built before 1940 (AGE), among others. https://www.kaggle.com/c/boston-housing

**Import Libraries**

In [0]:
# Import Library for Data Manipulation
import pandas as pd
import numpy as np

In [0]:
# Import Library for Visualization
import matplotlib.pyplot as plt
import seaborn as sns

**Import Data**

In [0]:
# Import Dataset
df_housing = pd.read_csv('https://raw.githubusercontent.com/dianrdn/data/master/boston_housing.csv', sep=';')
df_housing

In [0]:
# Prints the Dataset Information
df_housing.info()

In [0]:
# Prints Descriptive Statistics
df_housing.describe().transpose()

**Data Preparation**

In [0]:
# Import Module
from sklearn.preprocessing import MinMaxScaler

# Initialize Min-Max Scaler
mm_scaler = MinMaxScaler()
column_names = df_housing.columns.tolist()
column_names.remove('medv')

# Scale every feature column to the [0, 1] range (the target 'medv' is left unscaled)
df_housing[column_names] = mm_scaler.fit_transform(df_housing[column_names])
df_housing.sort_index(inplace=True)
df_housing.head()
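
Min-max scaling maps each column to the range [0, 1] using (x - min) / (max - min). The sketch below, on a made-up single column, checks that the manual formula agrees with `MinMaxScaler`.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Made-up single-column data
values = np.array([[2.0], [4.0], [10.0]])

# Manual min-max formula, applied per column
col_min = values.min(axis=0)
col_max = values.max(axis=0)
manual = (values - col_min) / (col_max - col_min)

# The same transformation via scikit-learn
scaled = MinMaxScaler().fit_transform(values)

print(manual.ravel())  # [0.   0.25 1.  ]
print(np.allclose(manual, scaled))  # True
```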

Set Features and Target

In [0]:
# Separating Features (Independent Variables) and Target (Dependent Variable)

feature = df_housing.drop(['medv'], axis=1)
target = df_housing['medv']

Set Training and Testing Data

In [0]:
# Set Training and Testing Data (80:20)
from sklearn.model_selection import train_test_split, cross_val_score
feature_train, feature_test, target_train, target_test = train_test_split(feature, target, shuffle=True, test_size=0.2, random_state=42)

# Show the Training and Testing Data
print(feature_train.shape)
print(feature_test.shape)
print(target_train.shape)
print(target_test.shape)
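
A common variation, not the approach taken in this notebook, is to fit the scaler on the training rows only and then apply it to the test rows, so that no information from the test set influences preprocessing. The sketch below illustrates this on synthetic data; all names and values are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic data: 50 rows, 3 features
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(50, 3))
y = rng.uniform(5, 50, size=50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42
)

# Learn min and max from the training rows only
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Reuse the training min and max for the test rows
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.min(), X_train_scaled.max())
print(X_test_scaled.shape)
```

With this approach, scaled test values can fall slightly outside [0, 1] when a test row exceeds the range seen during training.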

**Modeling ANN**

In [0]:
# Import Module
from sklearn.neural_network import MLPRegressor

# Initialising the ANN (one hidden layer with 70 units)
mlp = MLPRegressor(hidden_layer_sizes=(70,), activation='relu', solver='adam', alpha=0.0001, batch_size='auto', learning_rate='constant', max_iter=50000, verbose = True)

In [0]:
# Fitting the ANN to the Training Set
mlp.fit(feature_train, target_train)
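
After fitting, the structure of an `MLPRegressor` can be inspected through its `n_layers_` and `coefs_` attributes. The sketch below trains a small model on synthetic data; the 13 input features, the `max_iter` value, and the `random_state` are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Synthetic regression data: 60 rows, 13 features (assumed feature count)
rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 13))
y = X.sum(axis=1)

# One hidden layer with 70 units; random_state makes the run repeatable
model = MLPRegressor(hidden_layer_sizes=(70,), max_iter=500, random_state=0)
model.fit(X, y)

# n_layers_ counts the input, hidden, and output layers
print(model.n_layers_)

# One weight matrix per connection between consecutive layers
print([w.shape for w in model.coefs_])
```

With these settings the weight matrices have shapes `(13, 70)` and `(70, 1)`: inputs to hidden layer, then hidden layer to the single output.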

**Predict Data**

In [0]:
# Predicted Data
target_predicted = mlp.predict(feature_test)
target_predicted = pd.DataFrame(target_predicted)

# Actual Data
target_test = pd.DataFrame(target_test)
target_test.reset_index(drop=True, inplace=True)

# Predicted vs Actual Data
predvsactual = pd.concat([target_predicted, target_test], ignore_index=True, axis=1)
predvsactual.columns = ['predicted', 'actual']
predvsactual

In [0]:
# Set Graph Size
plt.rcParams['figure.figsize'] = (16, 8)

# Visualize Actual vs Predicted Median Home Values
plt.plot(target_test, color='blue', label='actual')
plt.plot(target_predicted, color='red', label='prediction')
plt.xlabel('Test sample')
plt.ylabel('medv ($1000s)')
plt.title('Median Value of Owner-Occupied Homes in $1000s')
plt.legend()
plt.show()

**Model Evaluation**

In [0]:
# Import Module
from sklearn.metrics import mean_squared_error

# Calculate MSE
train_mse = mean_squared_error(target_train, mlp.predict(feature_train))
test_mse = mean_squared_error(target_test, mlp.predict(feature_test))

# Show MSE
print("Train MSE:", np.round(train_mse,2))
print("Test MSE:", np.round(test_mse,2))
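
MSE is expressed in squared units of the target, so it is often reported together with its square root (RMSE) and the coefficient of determination (R²). The sketch below computes all three on made-up values; these numbers are illustrative and are not results from the model above.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Made-up actual and predicted values, for illustration only
actual = np.array([24.0, 21.6, 34.7, 33.4])
predicted = np.array([25.0, 20.6, 32.7, 33.4])

mse = mean_squared_error(actual, predicted)
rmse = np.sqrt(mse)             # same units as the target
r2 = r2_score(actual, predicted)  # 1.0 would be a perfect fit

print("MSE:", np.round(mse, 2))
print("RMSE:", np.round(rmse, 2))
print("R2:", np.round(r2, 2))
```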