# Exercise: Should we say goodbye to that customer?

Use our ANN model to predict whether the customer with the following information will leave the bank:

Geography: France

Credit score: 600

Gender: Male

Age: 40 years

Tenure: 3 years

Balance: $60,000

Number of products: 2

Does this customer have a credit card? Yes

Is this customer an active member? Yes

Estimated salary: $50,000

So, should we say goodbye to that customer?

The solution is provided in the video at the end of the assignment, but I recommend that you try to solve it on your own.

Enjoy deep learning!

In [25]:
import pandas as pd
import torch
import torch.optim as optim
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import torch.nn as nn

In [26]:
df = pd.read_csv('./Churn_Modelling.csv')
df.head()  # call the method; a bare `df.head` only displays the bound method

      RowNumber  CustomerId    Surname  CreditScore Geography  Gender  Age  \
0             1    15634602   Hargrave          619    France  Female   42   
1             2    15647311       Hill          608     Spain  Female   41   
2             3    15619304       Onio          502    France  Female   42   
3             4    15701354       Boni          699    France  Female   39   
4             5    15737888   Mitchell          850     Spain  Female   43   

      Tenure    Balance  NumOfPro

## Encode Categorical Data

In [27]:
df = pd.get_dummies(df, columns=['Geography'])
gender_enc = {
    'Male': 1,
    'Female': 0,
}
df['Gender'] = df['Gender'].map(gender_enc)
print(df)

      RowNumber  CustomerId    Surname  CreditScore  Gender  Age  Tenure  \
0             1    15634602   Hargrave          619       0   42       2   
1             2    15647311       Hill          608       0   41       1   
2             3    15619304       Onio          502       0   42       8   
3             4    15701354       Boni          699       0   39       1   
4             5    15737888   Mitchell          850       0   43       2   
...         ...         ...        ...          ...     ...  ...     ...   
9995       9996    15606229   Obijiaku          771       1   39       5   
9996       9997    15569892  Johnstone          516       1   35      10   
9997       9998    15584532        Liu          709       0   36       7   
9998       9999    15682355  Sabbatini          772       1   42       3   
9999      10000    15628319     Walker          792       0   28       4   

        Balance  NumOfProducts  HasCrCard  IsActiveMember  EstimatedSalary  \
0        
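
As a minimal sketch of what the two encoding steps do, the block below applies them to a made-up three-row frame (`toy` and its values are illustrative and not taken from `Churn_Modelling.csv`). It also shows a check that may be worth adding: `Series.map` returns `NaN` for any label missing from the dictionary, so asserting that no nulls remain can catch unexpected categories. Passing `dtype=int` is an optional choice made here so the dummy columns show up as 0/1.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV
toy = pd.DataFrame({
    "Geography": ["France", "Spain", "Germany"],
    "Gender": ["Female", "Male", "Male"],
})

# One-hot encode Geography: one 0/1 column per country
toy = pd.get_dummies(toy, columns=["Geography"], dtype=int)

# Map Gender to a single binary column
toy["Gender"] = toy["Gender"].map({"Male": 1, "Female": 0})

# map() returns NaN for labels that are missing from the dictionary
assert toy["Gender"].notna().all()
print(toy)
```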

## Split the dataset into training and test sets

In [28]:
# Define features and target
X = df[['CreditScore', 'Gender', 'Geography_France', 'Geography_Spain', 'Geography_Germany',
        'Age', 'Tenure', 'Balance', 'NumOfProducts', 'IsActiveMember', 'HasCrCard']]
y = df['Exited']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
type(X_train)

pandas.core.frame.DataFrame
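
The confusion matrix further down counts 393 churners among the 2000 test customers, so the classes are imbalanced (roughly 20% positives). One option, not used in this notebook, is to pass `stratify=y` to `train_test_split` so both splits keep the same class ratio. The sketch below uses synthetic arrays (`X_demo`, `y_demo` are made up) only to show the effect.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in: 1000 rows with 20% positives (not the real dataset)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 3))
y_demo = np.array([1] * 200 + [0] * 800)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Both splits keep the positive rate of the full data
print(y_tr.mean(), y_te.mean())
```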

## Scale variables

In [29]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on the training set only; returns a NumPy array
X_test_scaled = scaler.transform(X_test)        # reuse the training mean and std

type(X_train_scaled)
type(y_train.values)

numpy.ndarray
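
`StandardScaler` standardizes each column as z = (x - mean) / std, using the mean and the population standard deviation learned from the training data. The following sketch checks that against a manual computation on toy numbers (the values and the name `scaler_demo` are illustrative only).

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy training and test columns (illustrative values only)
train = np.array([[600.0], [700.0], [800.0]])
test = np.array([[650.0]])

scaler_demo = StandardScaler().fit(train)

# Manual z-score using the training mean and population std (ddof=0)
mu, sigma = train.mean(axis=0), train.std(axis=0)
manual = (test - mu) / sigma

assert np.allclose(scaler_demo.transform(test), manual)
print(manual)
```

The test row is scaled with the training statistics, which is why the notebook calls `fit_transform` on the training set and only `transform` on the test set.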

## Convert to tensor

In [30]:
X_train_tensor = torch.tensor(X_train_scaled, dtype=torch.float32)
X_test_tensor = torch.tensor(X_test_scaled, dtype=torch.float32)
# The labels are one-dimensional, so reshape them to (n, 1) to match the model output
y_train_tensor = torch.tensor(y_train.values.reshape(-1, 1), dtype=torch.float32)
y_test_tensor = torch.tensor(y_test.values.reshape(-1, 1), dtype=torch.float32)
print(X_train_tensor.shape[0])
len(X_train_tensor)

8000


8000

## Build ANN

In [31]:
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.layer1 = nn.Linear(11,6)
        self.relu = nn.ReLU()
        self.dropout1 = nn.Dropout(p=0.1)  # each activation is zeroed with probability 0.1 during training
        self.layer2 = nn.Linear(6, 6)
        self.dropout2 = nn.Dropout(p=0.1)
        self.output_layer = nn.Linear(6, 1)

        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = self.layer1(x)
        x = self.relu(x)
        x = self.dropout1(x)

        x = self.layer2(x)
        x = self.relu(x)
        x = self.dropout2(x)

        x = self.output_layer(x)
        x = self.sigmoid(x)
        return x

model = NeuralNetwork()

# Define the loss function and optimizer
criterion = nn.BCELoss()  # binary cross-entropy loss
optimizer = optim.Adam(model.parameters())  # Adam optimizer with default settings
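
The network above ends in a `Sigmoid` and is trained with `nn.BCELoss`. A common alternative, not the one used in this notebook, is to return raw logits and use `nn.BCEWithLogitsLoss`, which PyTorch documents as combining the sigmoid and the cross-entropy in a numerically more stable way. The sketch below, on random made-up logits and labels, only shows that the two formulations give the same loss value.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(8, 1)                      # raw outputs before any sigmoid
targets = torch.randint(0, 2, (8, 1)).float()   # made-up 0/1 labels

# Sigmoid followed by BCELoss, as in the notebook
loss_a = nn.BCELoss()(torch.sigmoid(logits), targets)
# Same quantity computed directly from the logits
loss_b = nn.BCEWithLogitsLoss()(logits, targets)

assert torch.allclose(loss_a, loss_b, atol=1e-6)
```

With that variant, probabilities at prediction time would come from applying `torch.sigmoid` to the model output.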

## Train ANN

In [32]:
epochs = 100
batch_size = 10
model.train()  
for epoch in range(epochs):
    for i in range(0, X_train_tensor.shape[0], batch_size):
        inputs = X_train_tensor[i:i+batch_size]
        labels = y_train_tensor[i:i+batch_size]
        
        optimizer.zero_grad()   # Reset gradients
        outputs = model(inputs)
        loss = criterion(outputs, labels)  # Compute loss
        loss.backward()  # Backpropagation
        optimizer.step()  # Update weights

    # Report the loss of the last mini-batch every 10 epochs
    if (epoch+1) % 10 == 0:
        print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')

Epoch [10/100], Loss: 0.5805
Epoch [20/100], Loss: 0.4393
Epoch [30/100], Loss: 0.3640
Epoch [40/100], Loss: 0.3461
Epoch [50/100], Loss: 0.4533
Epoch [60/100], Loss: 0.3742
Epoch [70/100], Loss: 0.5224
Epoch [80/100], Loss: 0.3643
Epoch [90/100], Loss: 0.3603
Epoch [100/100], Loss: 0.5730
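
The values printed above are the loss of the last mini-batch of 10 rows in each reported epoch, which is one reason they jump around (0.3461 at epoch 40, 0.5224 at epoch 70). A possible variant, sketched here on synthetic data with a small stand-in model (all `_demo` names are hypothetical), uses `DataLoader` to shuffle the batches each epoch and reports the mean loss over the whole epoch.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
# Synthetic stand-in for X_train_tensor / y_train_tensor
X_demo = torch.randn(200, 11)
y_demo = (X_demo[:, :1] > 0).float()

loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=10, shuffle=True)
model_demo = nn.Sequential(nn.Linear(11, 6), nn.ReLU(), nn.Linear(6, 1), nn.Sigmoid())
criterion_demo = nn.BCELoss()
optimizer_demo = torch.optim.Adam(model_demo.parameters())

epoch_losses = []
model_demo.train()
for epoch in range(5):
    running = 0.0
    for inputs, labels in loader:
        optimizer_demo.zero_grad()
        loss = criterion_demo(model_demo(inputs), labels)
        loss.backward()
        optimizer_demo.step()
        running += loss.item() * inputs.size(0)   # weight by batch size
    # Mean loss over every sample seen in this epoch
    epoch_losses.append(running / len(loader.dataset))
    print(f"Epoch {epoch + 1}: mean loss {epoch_losses[-1]:.4f}")
```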


## Evaluate and compute final predictions

In [33]:
model.eval()  
with torch.no_grad(): 
    outputs = model(X_test_tensor)
    loss = criterion(outputs, y_test_tensor)
    print(f'Test Loss: {loss.item():.4f}')
    outputs = (outputs>0.5)
    print(outputs)

Test Loss: 0.3350
tensor([[False],
        [False],
        [False],
        ...,
        [ True],
        [False],
        [False]])


## Build the confusion matrix

In [34]:
from sklearn.metrics import confusion_matrix
print(y_test_tensor)
cm = confusion_matrix(y_test_tensor.numpy(), outputs.numpy())  # pass NumPy arrays to scikit-learn
print("Confusion Matrix")
print(cm)

tensor([[0.],
        [0.],
        [0.],
        ...,
        [1.],
        [1.],
        [1.]])
Confusion Matrix
[[1533   74]
 [ 201  192]]


In [35]:
print("Accuracy:")  # (true negatives + true positives) / total predictions
(cm[0][0]+cm[1][1])/cm.sum()

Accuracy:


np.float64(0.8625)
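
Accuracy alone does not show how the errors are split between the two classes. As a short worked example, precision and recall for the "exited" class can be read from the same matrix, using scikit-learn's convention that rows are true classes and columns are predicted classes (`cm_demo` simply re-enters the numbers printed above).

```python
import numpy as np

# Confusion matrix printed above: rows = true class, columns = predicted class
cm_demo = np.array([[1533, 74],
                    [201, 192]])
tn, fp, fn, tp = cm_demo.ravel()

accuracy = (tn + tp) / cm_demo.sum()
precision = tp / (tp + fp)   # share of predicted churners who really left
recall = tp / (tp + fn)      # share of actual churners the model caught

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

Recall comes out below 0.5, meaning the model misses more than half of the customers who actually left, even though accuracy is 0.8625.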

## Predict the exercise customer

In [50]:
# Feature values for the exercise customer
customer = {
    'CreditScore': [600],
    'Gender': [1],
    'Geography_France': [1],
    'Geography_Spain': [0],
    'Geography_Germany': [0],
    'Age': [40],
    'Tenure': [3],
    'Balance': [60000],
    'NumOfProducts': [2],
    'IsActiveMember': [1],
    'HasCrCard': [1],
}

X_exercise = pd.DataFrame(customer)
print(X_exercise)
print(type(X_exercise.values))
# Note: these are raw, unscaled values, while the network was trained on StandardScaler output
X_exercise_tensor = torch.tensor(X_exercise.values, dtype=torch.float32)
print(X_test_tensor)  # scaled test set, printed for comparison

   CreditScore  Gender  Geography_France  Geography_Spain  Geography_Germany  \
0          600       1                 1                0                  0   

   Age  Tenure  Balance  NumOfProducts  IsActiveMember  HasCrCard  
0   40       3    60000              2               1          1  
<class 'numpy.ndarray'>
tensor([[-0.5775,  0.9132, -0.9985,  ...,  0.8084, -1.0258, -1.5404],
        [-0.2973,  0.9132,  1.0015,  ...,  0.8084,  0.9748,  0.6492],
        [-0.5256, -1.0950, -0.9985,  ...,  0.8084, -1.0258,  0.6492],
        ...,
        [ 0.8131, -1.0950,  1.0015,  ..., -0.9167, -1.0258,  0.6492],
        [ 0.4188,  0.9132,  1.0015,  ..., -0.9167, -1.0258,  0.6492],
        [-0.2454,  0.9132, -0.9985,  ..., -0.9167,  0.9748,  0.6492]])


In [51]:
model.eval()  
with torch.no_grad(): 
    outputs = model(X_exercise_tensor)
    outputs = (outputs>0.5)
    print(outputs)

tensor([[True]])
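
The network was trained on the output of `scaler.fit_transform`, so a new customer would normally go through the same fitted scaler before prediction; raw values such as a balance of 60000 lie far outside the standardized range the model saw during training. The sketch below shows that pipeline with a hypothetical helper, `predict_churn`. To stay runnable on its own it uses an untrained stand-in model and a scaler fitted on synthetic numbers, so the printed probability is meaningless; only the order of the steps matters.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.preprocessing import StandardScaler

def predict_churn(model, scaler, row):
    """Hypothetical helper: scale one raw feature row and return P(exit)."""
    scaled = scaler.transform(np.asarray(row, dtype=np.float64).reshape(1, -1))
    model.eval()
    with torch.no_grad():
        return model(torch.tensor(scaled, dtype=torch.float32)).item()

# Stand-ins so the sketch runs on its own (not the trained objects above)
rng = np.random.default_rng(0)
scaler_demo = StandardScaler().fit(rng.normal(50_000, 20_000, size=(100, 11)))
torch.manual_seed(0)
model_demo = nn.Sequential(nn.Linear(11, 6), nn.ReLU(), nn.Linear(6, 1), nn.Sigmoid())

# Same feature order as the exercise cell
customer_row = [600, 1, 1, 0, 0, 40, 3, 60000, 2, 1, 1]
prob = predict_churn(model_demo, scaler_demo, customer_row)
print(f"P(exit) = {prob:.3f} -> leaves: {prob > 0.5}")
```

With the notebook's own objects, the equivalent call would be along the lines of `predict_churn(model, scaler, X_exercise.values[0])`.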
