# Unit 7 - Neural Networks Exercise

The dataset in Bank.csv (inside Bank.zip) contains information on
4,521 customers who were offered a term deposit by a Portuguese
bank (the zip also contains a text file, Bank-names.txt, with a
full description of every variable).
Using this dataset, build a neural network that predicts, from the
customer's characteristics, whether the customer will subscribe to the
deposit or not.

When carrying out this exercise, bear in mind that:

- The target variable of our model is “y”, which takes the value
“yes” if the customer subscribed to the deposit and “no” otherwise.

- Note that there are several qualitative variables that you must
transform before fitting the model.

- Do not forget to normalize the data before feeding it to the model.

- Remember to specify the number of hidden layers and neurons
used, as well as the permitted error threshold and the training
algorithm chosen. You may run and present variations of the
model in order to obtain an optimal fit.

- You must hold out a percentage of the dataset to validate the results of
the estimated neural network.




**NOTE:**

See this issue regarding the TensorFlow NUMA node warnings:

https://github.com/tensorflow/tensorflow/issues/42738

Running the following in a console:

    for a in /sys/bus/pci/devices/*; do echo 0 | sudo tee -a $a/numa_node; done

makes those warnings disappear (they may come back each time the computer is restarted).

https://stackoverflow.com/questions/44232898/memoryerror-in-tensorflow-and-successful-numa-node-read-from-sysfs-had-negativ

In [30]:
import tensorflow as tf
gpu_devices = tf.config.experimental.list_physical_devices('GPU')
gpu_devices

[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


Import the dependencies


In [31]:
import pandas as pd # to load and manipulate data and for One-Hot Encoding
import numpy as np # to calculate the mean and standard deviation
import matplotlib.pyplot as plt 
import matplotlib.colors as colors

from sklearn.model_selection import train_test_split 
from sklearn.preprocessing import scale  
from sklearn.model_selection import GridSearchCV # this will do cross validation (grid search cross validation)

from sklearn.metrics import accuracy_score, confusion_matrix, classification_report, ConfusionMatrixDisplay

# Keras
import keras

from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Input
from keras.layers import Dropout
#from keras.utils import to_categorical
from keras.utils import plot_model

## Step 1: Import the data

In [3]:
## import data
bank_raw = pd.read_csv(r"./Bank/bank.csv",sep=';')

## Step 2: Explore and prepare the data

In [4]:
# explore and prepare data
bank_raw.head()

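| | age | job | marital | education | default | balance | housing | loan | contact | day | month | duration | campaign | pdays | previous | poutcome | y |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30 | unemployed | married | primary | no | 1787 | no | no | cellular | 19 | oct | 79 | 1 | -1 | 0 | unknown | no |
| 1 | 33 | services | married | secondary | no | 4789 | yes | yes | cellular | 11 | may | 220 | 1 | 339 | 4 | failure | no |
| 2 | 35 | management | single | tertiary | no | 1350 | yes | no | cellular | 16 | apr | 185 | 1 | 330 | 1 | failure | no |
| 3 | 30 | management | married | tertiary | no | 1476 | yes | yes | unknown | 3 | jun | 199 | 4 | -1 | 0 | unknown | no |
| 4 | 59 | blue-collar | married | secondary | no | 0 | yes | no | unknown | 5 | may | 226 | 1 | -1 | 0 | unknown | no |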


In [5]:
bank_raw.describe()

| | age | balance | day | duration | campaign | pdays | previous |
|---|---|---|---|---|---|---|---|
| count | 4521.0 | 4521.0 | 4521.0 | 4521.0 | 4521.0 | 4521.0 | 4521.0 |
| mean | 41.170095 | 1422.657819 | 15.915284 | 263.961292 | 2.79363 | 39.766645 | 0.542579 |
| std | 10.576211 | 3009.638142 | 8.247667 | 259.856633 | 3.109807 | 100.121124 | 1.693562 |
| min | 19.0 | -3313.0 | 1.0 | 4.0 | 1.0 | -1.0 | 0.0 |
| 25% | 33.0 | 69.0 | 9.0 | 104.0 | 1.0 | -1.0 | 0.0 |
| 50% | 39.0 | 444.0 | 16.0 | 185.0 | 2.0 | -1.0 | 0.0 |
| 75% | 49.0 | 1480.0 | 21.0 | 329.0 | 3.0 | -1.0 | 0.0 |
| max | 87.0 | 71188.0 | 31.0 | 3025.0 | 50.0 | 871.0 | 25.0 |


In [6]:
bank_raw.describe(include='object')

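| | job | marital | education | default | housing | loan | contact | month | poutcome | y |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 4521 | 4521 | 4521 | 4521 | 4521 | 4521 | 4521 | 4521 | 4521 | 4521 |
| unique | 12 | 3 | 4 | 2 | 2 | 2 | 3 | 12 | 4 | 2 |
| top | management | married | secondary | no | yes | no | cellular | may | unknown | no |
| freq | 969 | 2797 | 2306 | 4445 | 2559 | 3830 | 2896 | 1398 | 3705 | 4000 |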


In [7]:
# check whether the categorical variables contain unexpected or missing values
print(bank_raw['job'].unique())
print(bank_raw['marital'].unique())
print(bank_raw['education'].unique())
print(bank_raw['default'].unique())
print(bank_raw['housing'].unique())
print(bank_raw['loan'].unique())
print(bank_raw['contact'].unique())
print(bank_raw['month'].unique())
print(bank_raw['poutcome'].unique())
print(bank_raw['y'].unique())

['unemployed' 'services' 'management' 'blue-collar' 'self-employed'
 'technician' 'entrepreneur' 'admin.' 'student' 'housemaid' 'retired'
 'unknown']
['married' 'single' 'divorced']
['primary' 'secondary' 'tertiary' 'unknown']
['no' 'yes']
['no' 'yes']
['no' 'yes']
['cellular' 'unknown' 'telephone']
['oct' 'may' 'apr' 'jun' 'feb' 'aug' 'jan' 'jul' 'nov' 'sep' 'mar' 'dec']
['unknown' 'failure' 'other' 'success']
['no' 'yes']


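At first glance there are no problems: no unexpected labels or missing values, although several variables use the placeholder category 'unknown'.

Next, we separate the dependent variable from the rest (this first attempt is left commented out; the split is done after the encoding):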

In [8]:
#X = bank_raw.drop('y',axis=1).copy()
#X.head()

Neural networks cannot take categorical variables directly, so we convert them to numeric variables with the technique known as "**one-hot encoding**":<br>
(The month variable is instead encoded as a number from 1 to 12, so that it does not add 12 columns.)

In [9]:
bank_encoded = pd.get_dummies(bank_raw, columns=['job','marital','education','default','housing','loan','contact','poutcome','y'], dtype='int64')
bank_encoded.head()

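| | age | balance | day | month | duration | campaign | pdays | previous | job_admin. | job_blue-collar | ... | loan_yes | contact_cellular | contact_telephone | contact_unknown | poutcome_failure | poutcome_other | poutcome_success | poutcome_unknown | y_no | y_yes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30 | 1787 | 19 | oct | 79 | 1 | -1 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |
| 1 | 33 | 4789 | 11 | may | 220 | 1 | 339 | 4 | 0 | 0 | ... | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 2 | 35 | 1350 | 16 | apr | 185 | 1 | 330 | 1 | 0 | 0 | ... | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| 3 | 30 | 1476 | 3 | jun | 199 | 4 | -1 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 |
| 4 | 59 | 0 | 5 | may | 226 | 1 | -1 | 0 | 0 | 1 | ... | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 |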
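
As a minimal illustration of what `pd.get_dummies` does, the sketch below encodes a single categorical column. The `toy` frame and its rows are made up for the example and are not taken from bank.csv.

```python
import pandas as pd

# Toy frame with made-up rows (an assumption for illustration, not bank.csv data)
toy = pd.DataFrame({
    "age": [30, 45, 52],
    "marital": ["married", "single", "divorced"],
})

# Each category of 'marital' becomes its own 0/1 column;
# numeric columns such as 'age' are left untouched.
toy_encoded = pd.get_dummies(toy, columns=["marital"], dtype="int64")
print(toy_encoded)
```

Every row has exactly one 1 among the `marital_*` columns, which is where the name "one-hot" comes from.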


In [10]:
#X_encoded = pd.get_dummies(X, columns=['job','marital','education','default','housing','loan','contact','poutcome'], dtype='int64')
X_encoded = bank_encoded.drop('y_no',axis=1).drop('y_yes',axis=1).copy()
X_encoded.head()

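| | age | balance | day | month | duration | campaign | pdays | previous | job_admin. | job_blue-collar | ... | housing_yes | loan_no | loan_yes | contact_cellular | contact_telephone | contact_unknown | poutcome_failure | poutcome_other | poutcome_success | poutcome_unknown |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30 | 1787 | 19 | oct | 79 | 1 | -1 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| 1 | 33 | 4789 | 11 | may | 220 | 1 | 339 | 4 | 0 | 0 | ... | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 2 | 35 | 1350 | 16 | apr | 185 | 1 | 330 | 1 | 0 | 0 | ... | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 3 | 30 | 1476 | 3 | jun | 199 | 4 | -1 | 0 | 0 | 0 | ... | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
| 4 | 59 | 0 | 5 | may | 226 | 1 | -1 | 0 | 0 | 1 | ... | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |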


In [11]:
#y = bank_raw['y'].replace({'no':0, 'yes':1}).astype(int)
#y.head()

#from sklearn.preprocessing import LabelEncoder

#y = bank_raw['y'].copy()
#encoder = LabelEncoder()
#encoder.fit(y)
#y_encoded = encoder.transform(y)
#y_encoded.shape

#y_encoded = pd.get_dummies(bank_raw['y'], columns=['y'], dtype='int64')
#y_encoded.head()

y_encoded = bank_encoded[['y_no','y_yes']].copy()
y_encoded.head()

| | y_no | y_yes |
|---|---|---|
| 0 | 1 | 0 |
| 1 | 1 | 0 |
| 2 | 1 | 0 |
| 3 | 1 | 0 |
| 4 | 1 | 0 |


Now we convert the month names to numbers ("jan" -> 1, "feb" -> 2, ...)

In [12]:
#encoding month (name to number)
from datetime import datetime
X_encoded['month'] = X_encoded['month'].apply(lambda m : datetime.strptime(m, '%b').month)
#X_encoded['month']
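
`datetime.strptime(m, '%b')` parses abbreviated month names according to the process's current `LC_TIME` locale, so it would fail if that locale were switched to a non-English one. A possible locale-independent alternative, sketched here with a hypothetical `MONTH_TO_NUMBER` dictionary and a made-up `months` series, is an explicit mapping:

```python
import pandas as pd

# Hypothetical lookup table: the lowercase abbreviations used in bank.csv
MONTH_TO_NUMBER = {
    "jan": 1, "feb": 2, "mar": 3, "apr": 4, "may": 5, "jun": 6,
    "jul": 7, "aug": 8, "sep": 9, "oct": 10, "nov": 11, "dec": 12,
}

# Made-up sample standing in for X_encoded['month']
months = pd.Series(["oct", "may", "apr", "jun"])

# map() returns NaN for any label missing from the dictionary,
# which makes unexpected values easy to spot
month_numbers = months.map(MONTH_TO_NUMBER)
print(month_numbers.tolist())
```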

In [13]:
X_encoded.head()

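| | age | balance | day | month | duration | campaign | pdays | previous | job_admin. | job_blue-collar | ... | housing_yes | loan_no | loan_yes | contact_cellular | contact_telephone | contact_unknown | poutcome_failure | poutcome_other | poutcome_success | poutcome_unknown |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 30 | 1787 | 19 | 10 | 79 | 1 | -1 | 0 | 0 | 0 | ... | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
| 1 | 33 | 4789 | 11 | 5 | 220 | 1 | 339 | 4 | 0 | 0 | ... | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 2 | 35 | 1350 | 16 | 4 | 185 | 1 | 330 | 1 | 0 | 0 | ... | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 |
| 3 | 30 | 1476 | 3 | 6 | 199 | 4 | -1 | 0 | 0 | 0 | ... | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
| 4 | 59 | 0 | 5 | 5 | 226 | 1 | -1 | 0 | 0 | 1 | ... | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |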


All that remains is to normalize the numeric variables.

But first we split the data into training and test sets, to avoid data leakage (information from the test set influencing the preprocessing or the training of the model). Note that the cell below standardizes each set separately, using that set's own mean and standard deviation.

In [14]:
%%time
X_train, X_test, y_train, y_test=train_test_split(X_encoded, y_encoded, test_size=0.25, random_state=99)

X_train_scaled = scale(X_train)
X_test_scaled = scale(X_test)

CPU times: user 12.8 ms, sys: 100 μs, total: 12.9 ms
Wall time: 12.2 ms
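
The cell above standardizes the test set with its own statistics. A common alternative, shown here only as a sketch on synthetic data (the `*_demo` arrays are made up), is to fit a `StandardScaler` on the training set and reuse its mean and standard deviation for the test set, so that both sets go through exactly the same transformation:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for X_train / X_test (assumption: not the bank data)
rng = np.random.default_rng(0)
X_train_demo = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
X_test_demo = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

scaler = StandardScaler()
# Learn mean and standard deviation from the training set only
X_train_demo_scaled = scaler.fit_transform(X_train_demo)
# Apply the training statistics to the test set
X_test_demo_scaled = scaler.transform(X_test_demo)

print(X_train_demo_scaled.mean(axis=0).round(3))
print(X_test_demo_scaled.mean(axis=0).round(3))
```

The scaled test set will generally not have mean exactly 0, which is expected. Applying this to the notebook would change the scaled test values, so the results reported below would not necessarily be reproduced.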


In [15]:
print(X_train_scaled.shape)
print(X_test_scaled.shape)
print(y_train.shape)
print(y_test.shape)

(3390, 40)
(1131, 40)
(3390, 2)
(1131, 2)


## Step 3: Train the model

We build three networks, based on what we already saw in R using Keras:

In [16]:
def classifier_testing(clf, X_train, X_test, y_train, y_test, epochs=100, verbose=0):
    # Training
    clf.fit(X_train, y_train,  epochs=epochs, verbose=verbose)
  
    #Predictions
    y_pred = clf.predict(X_test)

    # collapse the two output columns into a single label: 0 (first column = no) or 1 (second column = yes)
    y_pred=np.argmax(y_pred, axis=1)
    y_test=np.argmax(y_test, axis=1)

    #Accuracy
    clf_accuracy_score = round(accuracy_score(y_test, y_pred)*100, 2)
    print("Accuracy Score:\n", clf_accuracy_score, "\n")

    #Confusion Matrix
    conf_mtx = confusion_matrix(y_test, y_pred)
    print("Confusion Matrix:\n", conf_mtx, "\n")
    
    #Classification Report
    class_rep = classification_report(y_test, y_pred)
    print("Classification Report:\n", class_rep, "\n")
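
The `np.argmax(..., axis=1)` step in `classifier_testing` can be illustrated in isolation. The arrays below are made-up values standing in for the network output and for the one-hot targets; they are not results from the notebook.

```python
import numpy as np

# Made-up network outputs: one row per customer, columns = (no, yes).
# With two sigmoid units the two scores need not add up to 1.
y_pred_scores = np.array([
    [0.91, 0.12],
    [0.30, 0.75],
    [0.55, 0.40],
])

# Made-up one-hot targets in the same (y_no, y_yes) layout
y_true_one_hot = np.array([
    [1, 0],
    [0, 1],
    [0, 1],
])

# argmax picks the index of the largest column: 0 = no, 1 = yes
y_pred_labels = np.argmax(y_pred_scores, axis=1)
y_true_labels = np.argmax(y_true_one_hot, axis=1)

print(y_pred_labels)  # [0 1 0]
print(y_true_labels)  # [0 1 1]
```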

In [17]:
# define classification model
def simple_classification_model():
    # create model
    
    model = Sequential(name = "keras_40_2")
    model.add(Input(shape=(X_train_scaled.shape[1],)))
    model.add(Dense(40, activation='relu'))
    model.add(Dense( 2, activation='sigmoid'))
    
    # compile model
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

In [34]:
# define classification model
def medium_classification_model():
    # create model
    
    model = Sequential(name = "keras_40_10_2")
    model.add(Input(shape=(X_train_scaled.shape[1],)))
    
    model.add(Dense(40, activation='relu'))
    model.add(Dropout(rate=0.8))
    model.add(Dense(10, activation='relu'))
    model.add(Dropout(rate=0.4))
    model.add(Dense( 2, activation='sigmoid'))
    
    # compile model
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model

In [39]:
# define classification model
def complex_classification_model():
    # create model
    
    model = Sequential(name = "keras_complex")
    model.add(Input(shape=(X_train_scaled.shape[1],)))
    
    model.add(Dense(40, activation='relu'))
    model.add(Dropout(rate=0.9))
    model.add(Dense(20, activation='relu'))
    model.add(Dropout(rate=0.6))
    model.add(Dense(10, activation='relu'))
    model.add(Dropout(rate=0.4))
    model.add(Dense( 4, activation='relu'))
    model.add(Dense( 2, activation='sigmoid'))
    
    
    # compile model
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return model
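
To give a feel for what the `rate` argument of the `Dropout` layers above means, here is a small NumPy sketch of inverted dropout as it is applied during training: a fraction `rate` of the units is zeroed and the surviving ones are scaled by `1 / (1 - rate)`. The helper `inverted_dropout` and the all-ones `activations` array are illustrative assumptions, not Keras internals.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero each unit with probability `rate` and rescale the survivors."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(42)
# 1000 made-up samples passing through a 40-unit layer, all activations = 1
activations = np.ones((1000, 40))

dropped = inverted_dropout(activations, rate=0.9, rng=rng)

# Average number of units that survive per sample (about 4 of 40 at rate=0.9)
kept_per_row = (dropped > 0).sum(axis=1).mean()
print(kept_per_row)
# The rescaling keeps the expected activation close to the original value
print(dropped.mean())
```

With `rate=0.9` on a 40-unit layer, only about four units are active for each training sample, which is a very strong form of regularization.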

In [20]:
# build the models
simple_model = simple_classification_model()

2025-03-09 18:16:13.674733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:2021] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 3402 MB memory:  -> device: 0, name: NVIDIA GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1


In [21]:
%%time
# train the simple_model and evaluate it
classifier_testing(simple_model, X_train_scaled, X_test_scaled, y_train, y_test, epochs=500)

I0000 00:00:1741540574.901356  112093 service.cc:146] XLA service 0x7f8f64006740 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1741540574.901382  112093 service.cc:154]   StreamExecutor device (0): NVIDIA GeForce GTX 1050, Compute Capability 6.1
2025-03-09 18:16:14.923455: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2025-03-09 18:16:15.009554: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:531] Loaded cuDNN version 90101
I0000 00:00:1741540575.697701  112093 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.


36/36 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step
Accuracy Score:
 87.0 

Confusion Matrix:
 [[930  69]
 [ 78  54]] 

Classification Report:
               precision    recall  f1-score   support

           0       0.92      0.93      0.93       999
           1       0.44      0.41      0.42       132

    accuracy                           0.87      1131
   macro avg       0.68      0.67      0.68      1131
weighted avg       0.87      0.87      0.87      1131
 

CPU times: user 1min 21s, sys: 6.38 s, total: 1min 27s
Wall time: 56.3 s
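
The figures in the classification report can be recomputed by hand from the confusion matrix above. The sketch below does so and also computes, as an added reference point, the accuracy of always predicting "no" on this test set (999 of 1131 rows); the variable names are chosen for the example.

```python
import numpy as np

# Confusion matrix reported above: rows = true class, columns = predicted class
conf_mtx = np.array([[930, 69],
                     [78, 54]])

tn, fp, fn, tp = conf_mtx.ravel()
total = conf_mtx.sum()

accuracy = (tn + tp) / total          # 984 / 1131
precision_yes = tp / (tp + fp)        # 54 / 123
recall_yes = tp / (tp + fn)           # 54 / 132
majority_baseline = (tn + fp) / total # 999 / 1131, always predicting "no"

print(f"accuracy          = {accuracy:.4f}")
print(f"precision (yes)   = {precision_yes:.4f}")
print(f"recall (yes)      = {recall_yes:.4f}")
print(f"majority baseline = {majority_baseline:.4f}")
```

Because the classes are imbalanced, accuracy alone is a weak guide here: the 87.0% obtained in this run is slightly below the roughly 88.3% of the trivial majority-class prediction, so the precision and recall of the "yes" class are worth watching as well.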


classifier_testing(simple_model, X_train_scaled, X_test_scaled, y_train, y_test, **epochs=50**)<br>
-> Accuracy Score: 0.8947833775419982

In [22]:
#plot_model(simple_model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

In [35]:
# build the models
medium_model = medium_classification_model()

In [36]:
%%time

# train the medium_model and evaluate it
classifier_testing(medium_model, X_train_scaled, X_test_scaled, y_train, y_test, epochs=1000)

36/36 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
Accuracy Score:
 90.1 

Confusion Matrix:
 [[973  26]
 [ 86  46]] 

Classification Report:
               precision    recall  f1-score   support

           0       0.92      0.97      0.95       999
           1       0.64      0.35      0.45       132

    accuracy                           0.90      1131
   macro avg       0.78      0.66      0.70      1131
weighted avg       0.89      0.90      0.89      1131
 

CPU times: user 2min 45s, sys: 11.7 s, total: 2min 57s
Wall time: 1min 56s


classifier_testing(medium_model, X_train_scaled, X_test_scaled, y_train, y_test, **epochs=50**)<br>
-> Accuracy Score: 0.8726790450928382

In [40]:
# build the models
complex_model = complex_classification_model()

In [41]:
%%time

# train the complex_model and evaluate it
classifier_testing(complex_model, X_train_scaled, X_test_scaled, y_train, y_test, epochs=1000)

36/36 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
Accuracy Score:
 89.92 

Confusion Matrix:
 [[978  21]
 [ 93  39]] 

Classification Report:
               precision    recall  f1-score   support

           0       0.91      0.98      0.94       999
           1       0.65      0.30      0.41       132

    accuracy                           0.90      1131
   macro avg       0.78      0.64      0.68      1131
weighted avg       0.88      0.90      0.88      1131
 

CPU times: user 2min 55s, sys: 11.1 s, total: 3min 6s
Wall time: 2min 5s


## Step 4: Evaluate the model

We compare the results obtained with the most complex model, varying its dropout rates:

    model.add(Dense(40, activation='relu'))
    model.add(Dropout(rate=0.4))
    model.add(Dense(20, activation='relu'))
    model.add(Dropout(rate=0.3))
    model.add(Dense(10, activation='relu'))
    model.add(Dropout(rate=0.15))
    model.add(Dense( 4, activation='relu'))
    model.add(Dense( 2, activation='sigmoid'))

classifier_testing(complex_model, X_train_scaled, X_test_scaled, y_train, y_test, **epochs=200**)<br>
-> Accuracy Score: 0.8885941644562334

classifier_testing(complex_model, X_train_scaled, X_test_scaled, y_train, y_test, **epochs=1000**)<br>
-> Accuracy Score: **89.57**<br>
<br>
Confusion Matrix:<br>
 [[960  39]<br>
 [ 79  53]]

    model.add(Dense(40, activation='relu'))
    model.add(Dropout(rate=0.8))
    model.add(Dense(20, activation='relu'))
    model.add(Dropout(rate=0.4))
    model.add(Dense(10, activation='relu'))
    model.add(Dropout(rate=0.2))
    model.add(Dense( 4, activation='relu'))
    model.add(Dense( 2, activation='sigmoid'))

classifier_testing(complex_model, X_train_scaled, X_test_scaled, y_train, y_test, **epochs=1000**)<br>
Accuracy Score:<br>
 **90.00884173297966**<br>
<br>
Confusion Matrix:<br>
 [[968  31]<br>
 [ 82  50]]
    

Increasing the dropout rates even further improves accuracy again, if only slightly, to just over 90%:

    model.add(Dense(40, activation='relu'))
    model.add(Dropout(rate=0.9))
    model.add(Dense(20, activation='relu'))
    model.add(Dropout(rate=0.6))
    model.add(Dense(10, activation='relu'))
    model.add(Dropout(rate=0.3))
    model.add(Dense( 4, activation='relu'))
    model.add(Dense( 2, activation='sigmoid'))


classifier_testing(complex_model, X_train_scaled, X_test_scaled, y_train, y_test, **epochs=1000**)<br>
Accuracy Score:<br>
 **90.19**<br>
<br>
Confusion Matrix:<br>
 [[969  30]<br>
 [ 81  51]]
    

Lowering the dropout rates instead gives a worse final result:

    model.add(Dense(40, activation='relu'))
    model.add(Dropout(rate=0.30))
    model.add(Dense(20, activation='relu'))
    model.add(Dropout(rate=0.15))
    model.add(Dense(10, activation='relu'))
    model.add(Dropout(rate=0.05))
    model.add(Dense( 4, activation='relu'))
    model.add(Dense( 2, activation='sigmoid'))

classifier_testing(complex_model, X_train_scaled, X_test_scaled, y_train, y_test, **epochs=1000**)<br>
Accuracy Score:<br>
 **88.51**<br>
<br>
Confusion Matrix:<br>
 [[960  39]<br>
 [ 91  41]]
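
To compare the four 1000-epoch runs side by side, the sketch below recomputes accuracy together with the precision and recall of the "yes" class from the confusion matrices reported above. The `runs` dictionary and its keys (the three dropout rates of each configuration) are naming choices made for this example.

```python
# Confusion matrices copied from the runs above, keyed by their dropout rates
runs = {
    "0.40/0.30/0.15": [[960, 39], [79, 53]],
    "0.80/0.40/0.20": [[968, 31], [82, 50]],
    "0.90/0.60/0.30": [[969, 30], [81, 51]],
    "0.30/0.15/0.05": [[960, 39], [91, 41]],
}

summary = {}
for rates, matrix in runs.items():
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    summary[rates] = {
        "accuracy": (tn + tp) / total,
        "precision_yes": tp / (tp + fp),
        "recall_yes": tp / (tp + fn),
    }
    print(
        f"{rates}: accuracy={summary[rates]['accuracy']:.4f} "
        f"precision_yes={summary[rates]['precision_yes']:.4f} "
        f"recall_yes={summary[rates]['recall_yes']:.4f}"
    )
```

The ranking by accuracy and the ranking by recall of the "yes" class do not coincide: the first configuration has the highest recall despite a lower accuracy. Since no random seed is fixed for Keras in this notebook, differences of a few tenths of a point between runs may partly reflect random initialization rather than the dropout rates alone.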
