# Artificial Neural Network (ANN) Methodology

##### Author information
- Name: Silvia Garces Rigol
- email address: silvia.grcs0200@gmail.com
- GitHub: https://github.com/SilviaGarces

#### Part 1. Brief background of ANN
Before artificial neural networks became widely used, researchers relied on other approaches, each with its own constraints:
1. Rule-based Systems. These systems relied on a set of predefined rules and logic to make decisions or perform tasks. Experts in a particular domain would manually design these rules based on their knowledge and understanding of the problem. However, they had limitations, as they were often unable to handle complex or ambiguous situations where rules may conflict or be incomplete. [1]

2. Expert Systems. A type of AI system designed to mimic human expertise in specific domains. They incorporated knowledge from human experts into a knowledge base and used inference engines to reason and make decisions based on that knowledge. These systems typically used symbolic representations and rules, but they lacked the ability to learn and adapt from data. [2]

3. Statistical Methods: These methods involved analyzing and modeling data using mathematical techniques such as regression, clustering, and classification. Statistical models were built based on assumptions about the underlying data distribution, and parameters were estimated from the available data. However, these models often struggled with complex patterns and non-linear relationships. [3]

4. Feature Engineering: Prior to ANNs, feature engineering played a crucial role in machine learning. It involved manually selecting and engineering relevant features from the input data that could be used by learning algorithms to make predictions or decisions. Feature engineering required domain expertise and often consumed a significant amount of time and effort. [4]

5. Limited Computational Power: Before the widespread availability of powerful computing resources, training complex models and processing large amounts of data were challenging tasks. This limitation made it difficult to build and train models that could handle complex problems effectively.

The artificial neural network (ANN) was introduced in 1943 as a computational model inspired by the structure and function of the human brain.
A biological neuron has several parts: the dendrites receive the incoming signals, the cell body (soma) sums them, and the axon fires if the sum is large enough to activate it. The outgoing signal then becomes the input for other neurons.
ANNs are used to learn from input data and to make predictions or classifications based on it. They can approximate complex functions and find patterns in data that may not be immediately obvious or easily quantifiable by humans.
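
The "sum the signals, fire if the sum is large enough" behaviour can be sketched as a threshold unit. This is only an illustrative sketch: the function name, the weights and the threshold are made up for the example and are not taken from the original 1943 model.

```python
def threshold_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# With unit weights and a threshold of 2, the unit behaves like a logical AND.
and_outputs = [threshold_neuron([a, b], [1, 1], threshold=2)
               for a in (0, 1) for b in (0, 1)]
print(and_outputs)  # [0, 0, 0, 1]
```

Lowering the threshold to 1 turns the same unit into a logical OR, which shows how the firing condition alone changes what the neuron computes.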

ANNs can be used in various applications, including image classification, natural language processing, speech recognition, and predictive analytics. They have been applied successfully in many fields, such as marketing, healthcare, finance, and robotics, where they support decision-making by providing insights and predictions based on data. [5]


#### Part 2. Key concept of ANN
As explained before, ANNs are a computational model inspired by the structure and function of the human brain. They consist of interconnected nodes, called *neurons*, arranged in layers. The *input layer* receives the input data, and subsequent layers process this data through a series of nonlinear transformations before outputting a result.
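
The layer-by-layer processing described above can be sketched in plain Python. The network size, the weights and the bias terms below are hypothetical values chosen for illustration, and the helper name `dense` is an assumption rather than a library function.

```python
import math


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: each row of `weights` feeds one neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


# Hypothetical weights for a network with 2 inputs, 2 hidden neurons and 1 output.
hidden_w = [[0.5, -0.2], [0.3, 0.8]]
hidden_b = [0.0, 0.1]
output_w = [[1.0, -1.0]]
output_b = [0.0]

x = [1.0, 2.0]                                       # input layer values
hidden = dense(x, hidden_w, hidden_b, relu)          # hidden layer
output = dense(hidden, output_w, output_b, sigmoid)  # output layer
print(hidden, output)
```

Each layer only sees the outputs of the layer before it, which is why stacking layers composes several nonlinear transformations.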

The main strength of ANNs lies in their ability to learn and generalize from data, which makes them particularly well-suited for tasks such as image classification, speech recognition, and natural language processing.

The mathematical equation that explains the behavior of a single neuron in an ANN is the weighted sum of inputs, followed by a nonlinear activation function:

```
y = f(w1 * x1 + w2 * x2 + ... + wn * xn)
```
where **y** is the output of the neuron, **x1** to **xn** are the inputs, **w1** to **wn** are the weights assigned to each input, and **f** is the activation function.
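
As a small worked instance of this formula, the sketch below evaluates one neuron with two common choices for **f**. The inputs and weights are made-up numbers, and the function name `neuron` is only used for this example.

```python
import math


def neuron(inputs, weights, activation):
    """Compute y = f(w1*x1 + ... + wn*xn) for a single neuron."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return activation(weighted_sum)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


x = [1.0, 2.0, 3.0]    # example inputs (made up)
w = [0.5, -1.0, 0.5]   # example weights (made up)

print(neuron(x, w, sigmoid))  # weighted sum is 0.0, so sigmoid gives 0.5
print(neuron(x, w, relu))     # relu(0.0) is 0.0
```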

The weights w1 to wn are adjusted during the learning process by an optimizer such as stochastic gradient descent (SGD), which relies on backpropagation to obtain the gradients. **Backpropagation**, popularized in 1986, computes the gradient of the cost function with respect to the weights; the optimizer then uses this gradient to update the weights and reduce the error between the predicted and actual outputs.
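
The weight update can be illustrated for the simplest possible case, a single sigmoid neuron trained on one example. This is a sketch under stated assumptions: the loss is taken to be binary cross-entropy, for which the chain rule gives the gradient `(y_hat - target) * x_i`, and the data, learning rate and function names are invented for the example. A real network applies the same chain rule through every layer.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, inputs):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))


def gradient_step(weights, inputs, target, learning_rate):
    """One gradient-descent update for a single sigmoid neuron.

    Assumes binary cross-entropy loss, whose gradient with respect to
    weight w_i is (y_hat - target) * x_i.
    """
    error = predict(weights, inputs) - target
    return [w - learning_rate * error * x for w, x in zip(weights, inputs)]


weights = [0.0, 0.0]
inputs, target = [1.0, 2.0], 1   # one made-up training example
errors = []
for _ in range(20):
    errors.append(abs(target - predict(weights, inputs)))
    weights = gradient_step(weights, inputs, target, learning_rate=0.1)

print(errors[0], errors[-1])  # the error shrinks as the weights are updated
```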

Overall, the ability of ANNs to learn and generalize from data, combined with their ability to approximate complex functions and find patterns that may not be obvious to humans, make them a powerful tool for a wide range of applications.

#### Part 3. Example
I used the Pima Indians Diabetes dataset, which can be downloaded from Kaggle: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database

This dataset is small and simple, so it is a convenient way to build a model and predict some outputs.

In [1]:
import numpy as np
import pandas as pd

In [14]:
diabetes=pd.read_csv('diabetes.csv')
diabetes.head()

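| | Pregnancies | Glucose | BloodPressure | SkinThickness | Insulin | BMI | DiabetesPedigreeFunction | Age | Outcome |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |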


In [4]:
data=diabetes.dropna()

In [5]:
data_x=data.drop(columns=['Outcome'])
data_y=data.Outcome

In [6]:
import tensorflow as tf
# Import Keras through TensorFlow only, so the model and the optimizer come from the same package
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from sklearn.metrics import accuracy_score
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

In [7]:
#1: Separate into training and test data
train_x, test_x, train_y, test_y = train_test_split(data_x, data_y, test_size=0.2, random_state=42)
#Normalize the data
min_max_scaler = MinMaxScaler()
train_norm = min_max_scaler.fit_transform(train_x)
test_norm = min_max_scaler.transform(test_x)
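
What the scaling step does can be sketched without scikit-learn. The helper names and the sample rows are made up for illustration, and the sketch assumes no column is constant (otherwise the division would fail). The point it shows is that the minimum and maximum are learned from the training rows only and then reused for the test rows.

```python
def fit_min_max(rows):
    """Learn the per-column minimum and maximum from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, mins, maxs):
    """Scale each column with (x - min) / (max - min), using training statistics."""
    return [[(x - lo) / (hi - lo) for x, lo, hi in zip(row, mins, maxs)]
            for row in rows]


train_rows = [[85.0, 21.0], [148.0, 50.0], [183.0, 32.0]]  # made-up values
test_rows = [[200.0, 21.0]]

mins, maxs = fit_min_max(train_rows)
train_scaled = apply_min_max(train_rows, mins, maxs)
test_scaled = apply_min_max(test_rows, mins, maxs)
print(train_scaled)
print(test_scaled)  # the first value is above 1.0 because 200 exceeds the training maximum
```

Reusing the training statistics keeps information about the test set out of the preprocessing, at the cost that test values can fall slightly outside the 0 to 1 range.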

In [8]:
#2:Create ANN
model = Sequential()
model.add(Dense(312, input_dim=8, activation='relu')) # first hidden layer; input_dim=8 declares the 8 input features
model.add(Dense(256, activation='relu')) # hidden layer
model.add(Dense(128, activation='relu')) # hidden layer
model.add(Dense(64, activation='relu')) # hidden layer 
model.add(Dense(1, activation='sigmoid')) # output layer

In [9]:
#3: Compile model
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['AUC'])
# We choose binary_crossentropy because our outcome is 0 or 1
# Adam is a widely used optimizer that adapts the step size for each weight
# The metric tracked here is AUC; use metrics=['accuracy'] to track accuracy instead
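
To see why binary cross-entropy suits a 0 or 1 outcome, the loss can be computed by hand. This is a plain-Python sketch of the standard formula, not the Keras implementation; the clipping constant `eps` and the example probabilities are assumptions made for illustration.

```python
import math


def binary_cross_entropy(targets, probabilities, eps=1e-7):
    """Average of -(y*log(p) + (1-y)*log(1-p)) over all examples."""
    total = 0.0
    for y, p in zip(targets, probabilities):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(targets)


targets = [1, 0, 1, 0]
mostly_right = [0.9, 0.1, 0.8, 0.2]   # made-up predicted probabilities
mostly_wrong = [0.1, 0.9, 0.2, 0.8]

loss_right = binary_cross_entropy(targets, mostly_right)
loss_wrong = binary_cross_entropy(targets, mostly_wrong)
print(loss_right, loss_wrong)  # confident wrong answers are penalised much more
```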

In [None]:
#4: Fit the model 
history = model.fit(train_norm, train_y, batch_size=32, epochs=200)

In [13]:
# The model was compiled with metrics=['AUC'], so evaluate() returns the loss and the AUC
train_loss, train_auc = model.evaluate(train_norm, train_y)
print('Training AUC:', train_auc)

Training AUC: 1.0


In [12]:
#5: Make predictions of test data
test_pred = model.predict(test_norm)
pred = np.where(test_pred > 0.5, 1, 0).flatten()
test_acc = accuracy_score(test_y, pred)
print('Test accuracy:', test_acc)

Test accuracy: 0.6818181818181818


#### The model
This is the code for creating an ANN with four hidden layers (312, 256, 128 and 64 units) and a single sigmoid output. After creating the ANN we set the loss function and the optimizer; several are available, and the right choice depends on the data.

After that we check the loss and AUC of our model; we can also check the accuracy by changing the metrics. Once the training is done, we check the score of the model on the training data and then make predictions on the test data.
Here the ANN is used for **classification**: we need to decide whether the outcome is 1 (the patient has diabetes) or 0 (the patient does not).
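
AUC and accuracy measure different things, which the following sketch illustrates. The two functions are simplified plain-Python versions written for this example (they are not the Keras or scikit-learn implementations), and the scores are made up.

```python
def accuracy(targets, probabilities, threshold=0.5):
    """Share of examples whose thresholded prediction matches the label."""
    predictions = [1 if p > threshold else 0 for p in probabilities]
    return sum(int(p == y) for p, y in zip(predictions, targets)) / len(targets)


def auc(targets, probabilities):
    """Probability that a random positive is scored above a random negative."""
    positives = [p for p, y in zip(probabilities, targets) if y == 1]
    negatives = [p for p, y in zip(probabilities, targets) if y == 0]
    wins = sum(1.0 if pos > neg else 0.5 if pos == neg else 0.0
               for pos in positives for neg in negatives)
    return wins / (len(positives) * len(negatives))


targets = [1, 1, 0, 0]
probabilities = [0.45, 0.40, 0.30, 0.10]  # made-up scores, all below 0.5

print(accuracy(targets, probabilities))  # 0.5
print(auc(targets, probabilities))       # 1.0
```

The positives are ranked above the negatives, so the AUC is perfect, yet every score falls below the 0.5 threshold and half of the predictions are wrong. A high AUC therefore does not guarantee a high accuracy.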

Once we have the accuracy on the test data, we can judge whether it is adequate. Here the training score of 1.0 against a test accuracy of about 0.68 suggests that the model is overfitting. After building the model we can tune the **hyperparameters** (such as the *batch size* and the number of *epochs*), the **layers**, and the choice of **loss** and **optimizer** functions.

#### Part 4. References
[1] https://en.wikipedia.org/wiki/Rule-based_system 

[2] https://www.techtarget.com/searchenterpriseai/definition/expert-system

[3] M. Grebovic, L. Filipovic, I. Katnic, M. Vukotic and T. Popovic, "Overcoming Limitations of Statistical Methods with Artificial Neural Networks," 2022 International Arab Conference on Information Technology (ACIT), Abu Dhabi, United Arab Emirates, 2022, pp. 1-6, doi: 10.1109/ACIT57182.2022.9994218.

[4] https://towardsdatascience.com/what-is-feature-engineering-importance-tools-and-techniques-for-machine-learning-2080b0269f10

[5] https://www.ibm.com/topics/neural-networks


