# NEURAL NETWORK

### FACEBOOK PREDICTION

This practical exercise consists of predicting whether a Facebook user will like a post presented to them.

In [1]:
import pandas as pd
import numpy as np
import math
import keras

df = pd.read_csv("Facebook1.csv", sep = ",", encoding = "utf-8")  # Load the CSV data

Let's take a look at the data.

In [2]:
df.shape

(160407, 34)

Let's look at the summary statistics and the data types.

In [3]:
df.describe().T
df.dtypes

ID                           int64
Name                        object
Age                          int64
Gender                       int64
Marital Status               int64
Town                         int64
Nationality                  int64
Education                    int64
Hobby                        int64
Occupation                   int64
Contact                      int64
Religion                     int64
Language                     int64
Group type                   int64
Post number per day          int64
Who posted ?                 int64
When ?                       int64
Relationship                 int64
Age of poster                int64
Gender of poster             int64
Post Nature                  int64
Subject                      int64
Reaction of Other            int64
Shared                       int64
Tags                         int64
Privacy                      int64
Location                     int64
Feeling                      int64
Post Language       

Now, let's split the data into the features and the target to predict.

In [4]:
pred = df[["Like"]]
feat = df.drop(["ID","Name", "Like"], axis = 1)
X = feat
Y = pred

## **Sampling**

We use a hold-out validation protocol to evaluate our model: we split the sample into **75%** for training and **25%** for testing. We denote by `x_train` the training features and by `x_test` the features used to test the model. The corresponding targets for `x_train` and `x_test` are `y_train` and `y_test`. We use the **train_test_split** function from `sklearn.model_selection`.

In [5]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size = 0.25, random_state = 0)
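The sizes of the two subsets follow directly from the row count reported by `df.shape`. The short sketch below works through the arithmetic, assuming that the test count is rounded up and the remaining rows go to the training set (this rounding rule is an assumption about how the split is computed).

```python
import math

n_samples = 160407   # number of rows reported by df.shape
test_size = 0.25     # fraction passed to train_test_split

# Assumption: the test count is rounded up, the rest is used for training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 120305 40102
```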



## **Construction of our Neural Network model**

Steps of neural network learning that we will implement:

1- Initialize the weights randomly from a uniform distribution;

2- Put the observations of our dataset on the input layer of the network;

3- Propagate the input values to the output: neurons are activated and predict whether a user will like the post or not;

4- Compare the predicted results to the actual values of our dataset;

5- Adjust the weights of the neurons responsible for the errors found in the previous step;

6- Repeat all previous steps as many times as desired and keep the weights that produced the lowest error.
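The six steps above can be sketched in plain NumPy for a single sigmoid neuron. This is only an illustration on hypothetical toy data (the arrays `X` and `y`, the learning rate, and the number of epochs are all assumptions), not the Keras model built in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 observations, 3 features, binary target in {0, 1}.
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(y_true, p):
    # Binary cross-entropy, with clipping to avoid log(0).
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Step 1: initialize the weights from a uniform distribution.
w = rng.uniform(-0.05, 0.05, size=3)
b = 0.0
learning_rate = 0.5

best_loss, best_w, best_b = np.inf, w.copy(), b
losses = []
for epoch in range(50):
    # Steps 2-3: present the observations and propagate them to the output.
    p = sigmoid(X @ w + b)
    # Step 4: compare the predictions with the actual values.
    loss = bce(y, p)
    losses.append(loss)
    # Step 6 (bookkeeping): keep the weights with the lowest error so far.
    if loss < best_loss:
        best_loss, best_w, best_b = loss, w.copy(), b
    # Step 5: adjust the weights along the negative gradient of the loss.
    grad = p - y
    w -= learning_rate * (X.T @ grad) / len(y)
    b -= learning_rate * grad.mean()

print(f"first loss: {losses[0]:.4f}, best loss: {best_loss:.4f}")
```

Keras performs the same cycle internally, with mini-batches and an optimizer such as `rmsprop` in place of the plain gradient step used here.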

Throughout our study, we will make modeling choices for our neural network and justify them.

Our network has an **input layer** and an **output layer**. For the input layer, we start with a single node, initialize the connection weights from a uniform distribution, and use the `relu` activation function. We set `input_shape=(31,)` because we have 31 input variables.

Other activation functions are also available: sigmoid, softmax, softplus, tanh, selu, elu, LeakyReLU.
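To make the differences between these functions concrete, here is a small NumPy sketch of a few of them. The function names are local helpers written for this illustration, and the slope `alpha=0.01` for the leaky variant is an illustrative value, not necessarily the default used by Keras.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # Like relu, but keeps a small slope for negative inputs.
    return np.where(z > 0, z, alpha * z)

def sigmoid(z):
    # Squashes any real value into (0, 1); usual choice for a binary output.
    return 1.0 / (1.0 + np.exp(-z))

def softplus(z):
    # Smooth approximation of relu.
    return np.log1p(np.exp(z))

def softmax(z):
    # Turns a vector of scores into probabilities that sum to 1.
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([-2.0, 0.0, 2.0])
for name, fn in [("relu", relu), ("leaky_relu", leaky_relu),
                 ("sigmoid", sigmoid), ("tanh", np.tanh),
                 ("softplus", softplus), ("softmax", softmax)]:
    print(name, fn(z))
```

Only `sigmoid` and `softmax` produce outputs that can be read as probabilities, which matters for the choice of output activation in a classifier.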

In [6]:
from keras.models import Sequential
from keras.layers import Dense, Dropout

network = Sequential()
network.add(Dense(1, kernel_initializer='uniform', activation='relu', bias_initializer='zero', input_shape=(31,)))
network.add(Dense(1, kernel_initializer='uniform', activation='relu'))



In [7]:
network.summary()   # Show the layers and the total number of parameters of our network
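The parameter count of a dense layer can be checked by hand: one weight per (input, unit) pair plus one bias per unit. The helper `dense_params` below is a hypothetical function written for this check; it applies the formula to the two layers defined above.

```python
def dense_params(n_inputs, n_units):
    # One weight per (input, unit) pair plus one bias per unit.
    return n_inputs * n_units + n_units

first_layer = dense_params(31, 1)   # 31 weights + 1 bias
second_layer = dense_params(1, 1)   # 1 weight + 1 bias
total = first_layer + second_layer

print(first_layer, second_layer, total)  # 32 2 34
```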

Now we will compile our neural network.

To compile our network, we use the `rmsprop` optimizer, `binary_crossentropy` as the loss function, and the `accuracy` metric.

In [8]:
network.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
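Binary cross-entropy assumes targets coded as 0 or 1 and predictions that are probabilities in (0, 1). The NumPy sketch below is a minimal version of the formula, not the Keras implementation. It uses hypothetical predictions and labels to show that the loss stays positive with 0/1 targets but can become negative when the targets are coded differently, for example as 1 and 2, which is how the `Like` column appears in the sample rows printed later in this notebook.

```python
import numpy as np

def binary_crossentropy(y_true, p, eps=1e-7):
    # Clip the predictions so that log() never receives 0.
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Hypothetical predicted probabilities and labels.
p = np.array([0.9, 0.8, 0.7, 0.6])
labels_01 = np.array([1.0, 1.0, 0.0, 1.0])   # coded as 0 / 1
labels_12 = labels_01 + 1                     # same labels coded as 1 / 2

loss_01 = binary_crossentropy(labels_01, p)
loss_12 = binary_crossentropy(labels_12, p)

print(f"0/1 labels: {loss_01:.4f}")   # positive
print(f"1/2 labels: {loss_12:.4f}")   # negative
```

If the labels are indeed coded 1 and 2, remapping them to 0 and 1 (for example by subtracting 1) and using a `sigmoid` output activation would be one way to keep the loss meaningful.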

Let's fit the network now.

The batch size is the number of samples processed before the connection weights of the model are updated (try smaller batch sizes first, such as 32 or 64; powers of 2 are customary).

The number of epochs is the number of complete passes through the training dataset.

`workers` is the total number of processes used to train the model; it only matters when computing in parallel, and it is not set in the call below.
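The batch size also determines how many weight updates happen in one epoch: the number of training samples divided by the batch size, rounded up. The sketch below works this out for a batch size of 32, assuming a 75%/25% split of the 160407 rows with the test count rounded up (the sample counts are derived under that assumption, not read from the data).

```python
import math

n_train = 120305   # assumed training rows (75% of 160407)
n_test = 40102     # assumed test rows (25% of 160407, rounded up)
batch_size = 32

# The last batch may be smaller than batch_size, hence the rounding up.
steps_per_epoch = math.ceil(n_train / batch_size)
evaluation_steps = math.ceil(n_test / batch_size)

print(steps_per_epoch, evaluation_steps)  # 3760 1254
```

These values agree with the step counts shown in the progress bars of the training and evaluation logs.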

In [16]:
network.fit(x_train, y_train, batch_size=32, epochs=100)

Epoch 1/100
[1m3760/3760[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 2ms/step - accuracy: 0.4966 - loss: -8.0253
Epoch 2/100
[1m3760/3760[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 1ms/step - accuracy: 0.4956 - loss: -8.0409
Epoch 3/100
[1m3760/3760[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 1ms/step - accuracy: 0.4998 - loss: -7.9744
Epoch 4/100
[1m3760/3760[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 2ms/step - accuracy: 0.4949 - loss: -8.0526
Epoch 5/100
[1m3760/3760[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 1ms/step - accuracy: 0.4950 - loss: -8.0513
Epoch 6/100
[1m3760/3760[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 1ms/step - accuracy: 0.4978 - loss: -8.0060
Epoch 7/100
[1m3760/3760[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m8s[0m 2ms/step - accuracy: 0.4956 - loss: -8.0415
Epoch 8/100
[1m3760/3760[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 2ms/step - accuracy: 0.4953 - loss: -8.0465
Epoch 9/

<keras.src.callbacks.history.History at 0x1c1bfd77a50>

Let's set aside a few rows of the test set for prediction.

In [17]:
x_new = x_test[:2]  # Select the first two rows of x_test
y_new = y_test[:2]
x_new

Unnamed: 0,Age,Gender,Marital Status,Town,Nationality,Education,Hobby,Occupation,Contact,Religion,...,Shared,Tags,Privacy,Location,Feeling,Post Language,Device,Marital Status of Poster,Followers,Origin of Post
57152,85,1,6,2,3,2,10,2,7,1,...,6,1,3,2,7,4,1,2,1,1
113160,63,2,5,5,2,1,7,1,5,1,...,7,1,4,1,1,4,3,3,3,4


In [18]:
y_new

Unnamed: 0,Like
57152,1
113160,2


## **Testing the model**

Now we will evaluate our network on the test set.

In [19]:
loss, score = network.evaluate(x_test, y_test)
print("loss: ", loss)
print("score: ", score)

1254/1254 ━━━━━━━━━━━━━━━━━━━━ 2s 2ms/step - accuracy: 0.4986 - loss: -7.9941
loss:  -8.002863883972168
score:  0.49790534377098083


## **Prediction**

In [20]:
o = network.predict(x_new)
o

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 102ms/step


array([[2.0684192],
       [2.4021704]], dtype=float32)

Comparison with the actual values

In [21]:
e = y_new - o
e

Unnamed: 0,Like
57152,-1.068419
113160,-0.40217
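The two errors above can be summarized in a single number such as the mean absolute error. The sketch below recomputes the errors from the printed values with NumPy; the arrays are copied by hand from the outputs above, and the choice of mean absolute error as a summary is an assumption, not something the notebook prescribes.

```python
import numpy as np

# Values copied from the outputs above.
y_true = np.array([1.0, 2.0])               # actual `Like` values
y_pred = np.array([2.0684192, 2.4021704])   # network outputs

errors = y_true - y_pred
mae = float(np.mean(np.abs(errors)))

print(errors)
print(f"mean absolute error: {mae:.4f}")
```

Both raw outputs lie above 2, outside the range of a probability, so they cannot be read directly as the chance of a like.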
