# Neural Networks

**How do they work?**

A neuron (or node) is the basic unit of a network. It takes in one or more inputs, multiplies each input by a parameter (a weight), sums the weighted inputs together with a bias term (often represented as a constant input of 1 with its own learned weight), and then feeds the result into an activation function.
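The computation performed by a single neuron can be sketched in a few lines of NumPy. This is only an illustration: the input values, weights, bias, and the choice of a sigmoid activation are assumptions, not values taken from a trained model.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    z = np.dot(inputs, weights) + bias
    return sigmoid(z)

# Illustrative values (assumptions, not learned parameters)
inputs = np.array([0.5, -1.0, 2.0])
weights = np.array([0.4, 0.3, -0.2])
bias = 0.1

output = neuron(inputs, weights, bias)
print(output)
```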

Feedforward neural network: Also called a multilayer perceptron, this is the simplest type of artificial neural network.

The name comes from the fact that an observation's feature values are fed forward through the network, with each layer successively transforming them, with the goal that the output at the end matches the target's value.

- Input layer: Each unit contains an observation's value for a single feature (100 features = 100 nodes).
- Output layer: The end of the neural network. It transforms the output of the hidden layers into values useful for the task at hand.
- Hidden layers: Sit between the input and output layers. They successively transform the feature values from the input layer into something that, once processed by the output layer, resembles the target class.

(Neural networks with many hidden layers are called deep networks, and using them is called deep learning.)
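As a rough sketch of how the three kinds of layers fit together, the snippet below pushes one observation through a tiny network with NumPy. The layer sizes, the random parameter values, and the ReLU and sigmoid activations are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 4 features, 3 hidden units, 1 output unit
n_features, n_hidden, n_outputs = 4, 3, 1

# Input layer: one value per feature for a single observation
x = rng.normal(size=n_features)

# Hidden layer parameters (random here; in practice they are learned)
W_hidden = rng.normal(scale=0.1, size=(n_features, n_hidden))
b_hidden = np.zeros(n_hidden)

# Output layer parameters
W_out = rng.normal(scale=0.1, size=(n_hidden, n_outputs))
b_out = np.zeros(n_outputs)

# Feed the observation forward, layer by layer
hidden = np.maximum(0, x @ W_hidden + b_hidden)            # ReLU activation
prediction = 1 / (1 + np.exp(-(hidden @ W_out + b_out)))   # sigmoid activation

print(hidden.shape, prediction.shape)
```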


Forward propagation: Parameters are initialized as small random values drawn from a Gaussian (normal) or uniform distribution. Once an observation is fed through the network, the outputted value is compared with the observation's true value using a loss function.
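A minimal sketch of this kind of initialization with NumPy is shown below. The layer sizes and the 0.05 scale are arbitrary assumptions chosen for illustration; deep learning libraries ship their own initialization schemes.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

n_inputs, n_units = 10, 16  # illustrative layer sizes (assumption)

# Option 1: small values drawn from a Gaussian (normal) distribution
weights_normal = rng.normal(loc=0.0, scale=0.05, size=(n_inputs, n_units))

# Option 2: small values drawn from a uniform distribution
weights_uniform = rng.uniform(low=-0.05, high=0.05, size=(n_inputs, n_units))

# Biases are commonly started at zero
biases = np.zeros(n_units)

print(weights_normal.shape, weights_uniform.min(), weights_uniform.max())
```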

Backpropagation: An algorithm that goes backward through the network, identifying how much each parameter contributed to the error between the predicted and true values. For each parameter, the optimization algorithm then determines how much the weight should be adjusted to improve the output.

Neural networks learn by repeating this process of forward propagation and backpropagation for every observation in the training data, multiple times.

Epoch: One pass in which all observations have been sent through the network. Training typically consists of multiple epochs, iteratively updating the values of the parameters.
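The loop of forward propagation, backpropagation, and parameter updates over several epochs can be sketched with a single sigmoid unit. This is a toy illustration, not how a deep learning library implements training: the data, the learning rate, and the number of epochs are assumptions, and with only one unit the "backward pass" reduces to one gradient expression.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumption): the label is 1 when the two features sum to a positive number
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(float)

# Parameters start as small random values
weights = rng.normal(scale=0.01, size=2)
bias = 0.0
learning_rate = 0.1

def mean_loss(X, y, weights, bias):
    """Binary cross-entropy averaged over all observations."""
    p = 1 / (1 + np.exp(-(X @ weights + bias)))
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

loss_before = mean_loss(X, y, weights, bias)

for epoch in range(5):                  # one epoch = one pass over all observations
    for x_i, y_i in zip(X, y):
        # Forward propagation: compute the prediction for one observation
        p = 1 / (1 + np.exp(-(x_i @ weights + bias)))
        # Backward step: for a sigmoid unit with binary cross-entropy,
        # the gradient of the loss w.r.t. the weighted sum is (p - y)
        error = p - y_i
        # Update each parameter in proportion to its contribution to the error
        weights -= learning_rate * error * x_i
        bias -= learning_rate * error

loss_after = mean_loss(X, y, weights, bias)
print(loss_before, loss_after)
```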






# Preprocessing data for Neural Networks

Use a standard scaler (StandardScaler) so that each feature has a mean of 0 and a standard deviation of 1.

This step is very important for neural networks.

Neural networks often behave poorly when feature values are much larger than parameter values. Furthermore, an observation's feature values are combined as they pass through individual units, so **it is very important that all features are on the same scale**.

Choosing an architecture for a neural network is an art.

Constructing a feedforward neural network. Each unit in its layers:

- 1-. Receives a number of inputs.
- 2-. Weights each input by a parameter value.
- 3-. Sums together all weighted inputs along with a bias term.
- 4-. Applies an activation function.
- 5-. Sends the output on to the units in the next layer.

For each layer in the hidden and output layers we must define the number of units to include in the layer and the activation function.

The more units we have in a layer, the more complex the patterns our network is able to learn.

However, more units can also make our model overfit the training data in a way that is detrimental to performance on the test data.

ReLU: Rectified Linear Unit, a popular activation function. f(z) = max(0, z), where z is the sum of the weighted inputs and the bias.
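The ReLU formula translates directly into NumPy. The input values below are arbitrary examples.

```python
import numpy as np

def relu(z):
    """Rectified Linear Unit: f(z) = max(0, z), applied element-wise."""
    return np.maximum(0, z)

# Arbitrary example values for the weighted sums
z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])

# Negative values become 0; positive values pass through unchanged
activated = relu(z)
print(activated)
```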

We need to define the number of hidden layers to use in the network. More layers allow the network to learn more complex relationships.

We have to define the structure and the activation function of the output layer.

Output layer patterns:

- Binary classification: one unit with a sigmoid activation function
- Multiclass classification: k units (k = number of classes) and a softmax activation function
- Regression: one unit with no activation function
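To make the two output activations above concrete, here is a small NumPy sketch of sigmoid and softmax. The example scores are assumptions; subtracting the maximum inside softmax is a common numerical-stability trick and does not change the result.

```python
import numpy as np

def sigmoid(z):
    """Maps a single score to a probability between 0 and 1 (binary classification)."""
    return 1 / (1 + np.exp(-z))

def softmax(z):
    """Maps k scores to k probabilities that sum to 1 (multiclass classification)."""
    shifted = z - np.max(z)   # avoids overflow in exp for large scores
    exp = np.exp(shifted)
    return exp / exp.sum()

binary_probability = sigmoid(0.0)
class_probabilities = softmax(np.array([2.0, 1.0, 0.1]))  # illustrative scores for 3 classes

print(binary_probability)
print(class_probabilities, class_probabilities.sum())
```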

Loss function:

- Binary classification: binary cross-entropy
- Multiclass classification: categorical cross-entropy
- Regression: mean squared error
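The three loss functions listed above can be written out with NumPy as follows. This is a sketch for intuition only, with made-up example values; the clipping constant is an assumption used to avoid taking the logarithm of zero.

```python
import numpy as np

def binary_cross_entropy(y_true, p):
    """y_true holds 0/1 labels, p holds predicted probabilities of class 1."""
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def categorical_cross_entropy(y_true_one_hot, probs):
    """Rows of y_true_one_hot are one-hot labels, rows of probs are softmax outputs."""
    probs = np.clip(probs, 1e-12, 1.0)
    return -np.mean(np.sum(y_true_one_hot * np.log(probs), axis=1))

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return np.mean((y_true - y_pred) ** 2)

# Made-up example values
bce = binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.1]))
cce = categorical_cross_entropy(np.array([[1.0, 0.0, 0.0]]), np.array([[0.7, 0.2, 0.1]]))
mse = mean_squared_error(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 5.0]))

print(bce, cce, mse)
```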

Determine the optimizer:

An optimizer can be thought of as our strategy for walking around the loss function to find the parameter values that produce the lowest error.
Common choices: stochastic gradient descent (with or without momentum), root mean square propagation (RMSProp), and adaptive moment estimation (Adam).
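The idea of an optimizer walking around a loss function can be illustrated with gradient descent with momentum on a one-parameter toy loss. The loss (w - 3)^2, the learning rate, and the momentum value are assumptions chosen so the minimum is known to be at w = 3.

```python
def gradient(w):
    """Gradient of the toy loss (w - 3)^2 with respect to w."""
    return 2 * (w - 3.0)

w = 0.0               # starting parameter value
velocity = 0.0        # running accumulation of past gradients
learning_rate = 0.1   # step size (assumption)
momentum = 0.9        # fraction of the previous velocity that is kept (assumption)

for step in range(300):
    # Blend the previous direction with the current downhill direction
    velocity = momentum * velocity - learning_rate * gradient(w)
    w += velocity

print(w)  # ends up close to 3, the parameter value with the lowest loss
```

Setting momentum to 0 in this sketch gives plain gradient descent.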

We can select one or more metrics to evaluate performance, such as accuracy.

In [19]:
# Load libraries 

import numpy as np
from sklearn.preprocessing import StandardScaler

In [24]:
# Create feature

features = np.array([[-100.1, 3240.1],
                     [-200.2, -234.1],
                     [5000.5, 150.1],
                     [6000.6, -125.1],
                     [9000.9, -673.1]])
features

array([[-100.1, 3240.1],
       [-200.2, -234.1],
       [5000.5,  150.1],
       [6000.6, -125.1],
       [9000.9, -673.1]])

In [20]:
# Create scaler

scaler = StandardScaler()

In [23]:
# Transform the feature

features_standardized = scaler.fit_transform(features)
features_standardized

array([[-1.12541308,  1.96429418],
       [-1.15329466, -0.50068741],
       [ 0.29529406, -0.22809346],
       [ 0.57385917, -0.42335076],
       [ 1.40955451, -0.81216255]])

In [25]:
print("Mean: ", round(features_standardized[:,0].mean()))
print("Standard Deviation: ", round(features_standardized[:,0].std()))

Mean:  0.0
Standard Deviation:  1.0
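What the scaler computes can be reproduced by hand: subtract each column's mean and divide by its standard deviation. The sketch below uses plain NumPy on the same feature matrix and should give the same values as the output shown above.

```python
import numpy as np

features = np.array([[-100.1, 3240.1],
                     [-200.2, -234.1],
                     [5000.5, 150.1],
                     [6000.6, -125.1],
                     [9000.9, -673.1]])

# Column-wise mean and population standard deviation (ddof=0)
means = features.mean(axis=0)
stds = features.std(axis=0)

# Standardize: each column ends up with mean 0 and standard deviation 1
standardized_by_hand = (features - means) / stds
print(standardized_by_hand)
```

In practice the mean and standard deviation are learned from the training data only and then reused to transform the test data.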


# Designing a Neural Network

Using the Keras Sequential model, which builds a network by stacking layers one after another.



In [36]:
# Load libraries

from keras import models
from keras import layers

In [37]:
# Start neural Network

network = models.Sequential()

In [38]:
# Add fully connected layer with a ReLU activation function

network.add(layers.Dense(units = 16, activation = "relu", input_shape=(10,)))

In [39]:
# Add fully connected layer with a ReLU activation function

network.add(layers.Dense(units = 16, activation = "relu"))

In [40]:
# Add fully connected layer with a sigmoid activation function

network.add(layers.Dense(units = 1, activation = "sigmoid"))

In [41]:
# Compile Neural network

network.compile(loss = "binary_crossentropy",
                optimizer = "rmsprop",
                metrics = ["accuracy"])
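To get a feel for the size of this architecture, the number of trainable parameters can be counted by hand: a fully connected layer has one weight per input-unit pair plus one bias per unit. The helper name below is hypothetical and the snippet does not use Keras; it only does the arithmetic for the 10-16-16-1 network defined above.

```python
def dense_parameters(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# 10 input features, two hidden layers of 16 units, 1 output unit
layer_sizes = [10, 16, 16, 1]

counts = [dense_parameters(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)

print(counts, total)  # [176, 272, 17] 465
```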

# Training a Binary Classifier



In [42]:
# Load libraries

import numpy as np
from keras import models
from keras import layers
from keras.datasets import imdb
from keras.preprocessing.text import Tokenizer

In [44]:
# Set random seed

np.random.seed(0)

In [45]:
# Set the number of features we want

number_features = 1000

In [46]:
# Load data and target vector from movie review data

(data_train, target_train),(data_test, target_test) = imdb.load_data(num_words = number_features)

Downloading data from https://s3.amazonaws.com/text-datasets/imdb.npz


In [47]:
# Convert movie review data to one-hot encoded feature matrix

tokenizer = Tokenizer(num_words = number_features)

features_train = tokenizer.sequences_to_matrix(data_train, mode= "binary")
features_test = tokenizer.sequences_to_matrix(data_test, mode = "binary")
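As an approximation of what the binary mode does, each review (a list of word indices) becomes one row in which a 1 marks every word index that appears at least once. The helper below is a hypothetical NumPy re-implementation for illustration, not the Keras code, and the toy sequences are made up.

```python
import numpy as np

def sequences_to_binary_matrix(sequences, num_words):
    """Hypothetical helper: one row per sequence, 1.0 where a word index is present."""
    matrix = np.zeros((len(sequences), num_words))
    for row, sequence in enumerate(sequences):
        for word_index in sequence:
            if word_index < num_words:      # ignore words outside the vocabulary
                matrix[row, word_index] = 1.0
    return matrix

# Two made-up "reviews" encoded as word indices, with a vocabulary of 5 words
toy_sequences = [[1, 3, 3], [0, 2]]
encoded = sequences_to_binary_matrix(toy_sequences, num_words=5)
print(encoded)
```

Because a row can contain several ones, this is a multi-hot (presence or absence) encoding of each review; repeated words are not counted.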

In [48]:
# Start Neural Network

network = models.Sequential()

In [52]:
# Add fully connected layer with a ReLU activation function

network.add(layers.Dense(units = 16,activation = "relu", input_shape = (number_features,)))

In [53]:
# Add fully connected layer with a ReLU activation function

network.add(layers.Dense(units = 16, activation = "relu"))

In [54]:
# Add fully connected layer with a sigmoid activation function

network.add(layers.Dense(units = 1, activation = "sigmoid"))

In [55]:
# Compile neural network 

network.compile(loss = "binary_crossentropy", optimizer = "rmsprop", metrics = ["accuracy"])

In [57]:
# Train a Neural Network

history = network.fit(features_train, target_train, epochs = 3, verbose = 1, batch_size = 100, validation_data = (features_test, target_test))



Train on 25000 samples, validate on 25000 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3


In [58]:
features_train.shape

(25000, 1000)

We use 50,000 movie reviews (25,000 for training and 25,000 for testing), each categorized as positive or negative.

We convert the text of each review into 1,000 binary features, each indicating the presence of one of the 1,000 most frequent words. This gives 25,000 training observations with 1,000 features each, which we use to predict whether a movie review is positive or negative.

There are six parameters of interest in the fit method:

- 1-. features
- 2-. target vector
- 3-. epochs (number of passes over the training data)
- 4-. verbose (how much information is printed during training: 0 for none, 1 for a progress bar, 2 for one line per epoch)
- 5-. batch_size (number of observations to propagate through the network before updating the parameters)
- 6-. validation_data (a held-out test set used to evaluate the model, given as a tuple of test features and target vector)
- + validation_split (alternatively, the fraction of the training data to hold out for evaluation)
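The effect of batch_size and epochs on the number of parameter updates can be worked out with simple arithmetic. The numbers below match the fit call above; the validation_split value of 0.2 is an assumption used only to show how the held-out portion would be sized.

```python
import math

n_observations = 25000
batch_size = 100
epochs = 3

# Parameters are updated once per batch
updates_per_epoch = math.ceil(n_observations / batch_size)
total_updates = updates_per_epoch * epochs
print(updates_per_epoch, total_updates)  # 250 750

# With validation_split instead of validation_data (0.2 is an assumed value),
# a fraction of the training data is set aside for evaluation
validation_split = 0.2
n_validation = int(n_observations * validation_split)
n_training = n_observations - n_validation
print(n_training, n_validation)  # 20000 5000
```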
