# Telco Customer Churn Prediction



**Problem**: Develop a machine learning model that predicts which customers are likely to churn.

The Telco customer churn data contains information about 7043 customers who were provided with home telephone and Internet services by a telecom company in California during a specific quarter of a given year.

**21 Variables, 7043 Observations**

Each row represents a unique customer.

Variables include information about the services customers are subscribed to, details about their accounts, contracts, etc.

**Churn**: Whether the customer churned, i.e. left within the last month or quarter (Yes or No; stored as 1 and 0 in the CSV used here)

**MonthlyCharges**: The amount billed to the customer on a monthly basis

**TotalCharges**: The total amount billed to the customer

**customerID**: Unique customer identifier

**gender**: The customer's gender (Female, Male)

**SeniorCitizen**: Whether the customer is a senior citizen (1, 0)

**Partner**: Whether the customer has a partner (Yes, No)

**Dependents**: Whether the customer has dependents (Yes, No) (Children, mother, father, grandmother)

**tenure**: The number of months the customer has been with the company

**PhoneService**: Whether the customer has phone service (Yes, No)

**MultipleLines**: Whether the customer has multiple lines (Yes, No, No phone service)

**InternetService**: The customer's type of internet service (DSL, Fiber optic, No)

**OnlineSecurity**: Whether the customer has online security (Yes, No, No internet service)

**OnlineBackup**: Whether the customer has online backup (Yes, No, No internet service)

**DeviceProtection**: Whether the customer has device protection (Yes, No, No internet service)

**TechSupport**: Whether the customer receives technical support (Yes, No, No internet service)

**StreamingTV**: Whether the customer has TV streaming (Yes, No, No internet service). This indicates whether the customer uses their Internet service to stream television programs from a third-party provider.

**StreamingMovies**: Whether the customer has movie streaming (Yes, No, No internet service). This indicates whether the customer uses their Internet service to stream movies from a third-party provider.

**Contract**: The customer's contract duration (Month-to-month, One year, Two year)

**PaperlessBilling**: Whether the customer has paperless billing (Yes, No)

**PaymentMethod**: The customer's payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))

In [None]:
import keras
import tensorflow as tf
print("Keras Current Version:", keras.__version__, "Tensorflow Current Version:", tf.__version__)

Keras Current Version: 2.15.0 Tensorflow Current Version: 2.15.0


In [None]:
# !pip uninstall tf-keras
# !pip install keras-tuner
# !pip install tensorflow==2.16.1

# Imports

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import random
from joblib import dump, load

from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split

import tensorflow as tf
from tensorflow import keras

from tensorflow.keras.layers import Dense, Dropout, BatchNormalization, Input, Activation
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.regularizers import l2
from tensorflow.keras.optimizers import SGD, RMSprop, Adam
from keras.optimizers.schedules import ExponentialDecay
from tensorflow.keras.metrics import Precision, Recall
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.layers import ReLU, LeakyReLU, PReLU
from keras_tuner import RandomSearch
from keras_tuner.engine.hyperparameters import HyperParameters


random.seed(46)
np.random.seed(46)
tf.random.set_seed(46)

pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', lambda x: '%.5f' % x)


# Functions


In [None]:
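def grab_col_names(dataframe, cat_th=10, car_th=20):
    """Return (cat_cols, num_cols, cat_but_car) for the given dataframe.

    cat_th: non-object columns with fewer unique values than this are treated as categorical.
    car_th: object columns with more unique values than this are treated as high-cardinality.
    """
    # Object-typed columns are categorical candidates.
    cat_cols = [col for col in dataframe.columns if dataframe[col].dtypes == "O"]

    # Numeric columns with few distinct values behave like categories (e.g. SeniorCitizen).
    num_but_cat = [col for col in dataframe.columns if
                   dataframe[col].nunique() < cat_th and dataframe[col].dtypes != "O"]

    # Object columns with many distinct values are high-cardinality (e.g. customerID).
    cat_but_car = [col for col in dataframe.columns if
                   dataframe[col].nunique() > car_th and dataframe[col].dtypes == "O"]

    cat_cols = cat_cols + num_but_cat
    cat_cols = [col for col in cat_cols if col not in cat_but_car]

    # Numeric columns are the non-object columns that were not reclassified as categorical.
    num_cols = [col for col in dataframe.columns if dataframe[col].dtypes != "O"]
    num_cols = [col for col in num_cols if col not in num_but_cat]

    return cat_cols, num_cols, cat_but_car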


def prepare_datasets(X_train, X_val, y_train, y_val, batch_size=None):
    if batch_size is None:
        batch_size = len(X_train)
    train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
    train_dataset = train_dataset.shuffle(buffer_size=len(X_train)).batch(batch_size)
    val_dataset = tf.data.Dataset.from_tensor_slices((X_val, y_val))
    val_dataset = val_dataset.batch(batch_size)
    return train_dataset, val_dataset

def plot_training_history(history, train_loss='loss', train_metric='accuracy', val_loss='val_loss', val_metric='val_accuracy'):

    #Loss
    plt.figure(figsize=(10, 5))
    plt.plot(history.history[train_loss], label='Training Loss')
    plt.plot(history.history[val_loss], label='Validation Loss')
    plt.title('Training and Validation Loss Over Epochs')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.show()

    # Metrics
    plt.figure(figsize=(10, 5))
    plt.plot(history.history[train_metric], label=f"Training: {train_metric}")
    plt.plot(history.history[val_metric], label=f"Validation: {val_metric}")
    plt.title(f'Training and Validation {train_metric} Over Epochs')
    plt.xlabel('Epochs')
    plt.ylabel(train_metric)
    plt.legend()
    plt.show()

def get_best_epoch_details(history, metric="val_loss", mode=min):
    metric_values = history.history[metric]
    min_metric_value_index = metric_values.index(mode(metric_values))
    best_epoch = min_metric_value_index + 1

    metrics = []
    values = []

    for key, value in history.history.items():
        metrics.append(key)
        values.append(value[min_metric_value_index])

    data = {'Metric': metrics, 'Value': values}
    df = pd.DataFrame(data)
    df['Value'] = df['Value'].map('{:.4f}'.format)
    best_epoch_data = pd.DataFrame({'Metric': ['best_epoch'], 'Value': [str(best_epoch)]})
    df = pd.concat([df, best_epoch_data], ignore_index=True)
    return df

def print_hyperparameters(hyperparameters):
    hp_df = pd.DataFrame(list(hyperparameters.items()), columns=['Hyperparameter', 'Value'])
    print(hp_df)


def dataproprocessing(dataframe):

    # Fill missing TotalCharges values with the column median.
    # Assigning the result back avoids the chained in-place fillna pattern.
    dataframe["TotalCharges"] = dataframe["TotalCharges"].fillna(dataframe["TotalCharges"].median())

    # feature engineering
    dataframe.loc[(dataframe["tenure"] >= 0) & (dataframe["tenure"] <= 12), "NEW_TENURE_YEAR"] = "0-1 Year"
    dataframe.loc[(dataframe["tenure"] > 12) & (dataframe["tenure"] <= 24), "NEW_TENURE_YEAR"] = "1-2 Year"
    dataframe.loc[(dataframe["tenure"] > 24) & (dataframe["tenure"] <= 36), "NEW_TENURE_YEAR"] = "2-3 Year"
    dataframe.loc[(dataframe["tenure"] > 36) & (dataframe["tenure"] <= 48), "NEW_TENURE_YEAR"] = "3-4 Year"
    dataframe.loc[(dataframe["tenure"] > 48) & (dataframe["tenure"] <= 60), "NEW_TENURE_YEAR"] = "4-5 Year"
    dataframe.loc[(dataframe["tenure"] > 60) & (dataframe["tenure"] <= 72), "NEW_TENURE_YEAR"] = "5-6 Year"

    dataframe["NEW_Engaged"] = dataframe["Contract"].apply(lambda x: 1 if x in ["One year", "Two year"] else 0)

    dataframe["NEW_noProt"] = dataframe.apply(lambda x: 1 if (x["OnlineBackup"] != "Yes") or (x["DeviceProtection"] != "Yes") or (
                x["TechSupport"] != "Yes") else 0, axis=1)

    dataframe["NEW_Young_Not_Engaged"] = dataframe.apply(lambda x: 1 if (x["NEW_Engaged"] == 0) and (x["SeniorCitizen"] == 0) else 0,
                                          axis=1)

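    # Count the services marked 'Yes'. InternetService holds DSL / Fiber optic / No,
    # so it never equals 'Yes' and adds nothing to this count.
    dataframe['NEW_TotalServices'] = (dataframe[['PhoneService', 'InternetService', 'OnlineSecurity',
                                  'OnlineBackup', 'DeviceProtection', 'TechSupport',
                                  'StreamingTV', 'StreamingMovies']] == 'Yes').sum(axis=1)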

    dataframe["NEW_FLAG_ANY_STREAMING"] = dataframe.apply(
        lambda x: 1 if (x["StreamingTV"] == "Yes") or (x["StreamingMovies"] == "Yes") else 0, axis=1)

    dataframe["NEW_FLAG_AutoPayment"] = dataframe["PaymentMethod"].apply(
        lambda x: 1 if x in ["Bank transfer (automatic)", "Credit card (automatic)"] else 0)

    dataframe["NEW_AVG_Charges"] = dataframe["TotalCharges"] / (dataframe["tenure"] + 1)

    dataframe["NEW_Increase"] = dataframe["NEW_AVG_Charges"] / dataframe["MonthlyCharges"]

    dataframe["NEW_AVG_Service_Fee"] = dataframe["MonthlyCharges"] / (dataframe['NEW_TotalServices'] + 1)

    cat_cols, num_cols, cat_but_car = grab_col_names(dataframe)

    cat_cols.remove("Churn")

    dataframe = pd.get_dummies(dataframe, columns=cat_cols, drop_first=True, dtype=int)

    scaler = MinMaxScaler()

    dataframe[num_cols] = scaler.fit_transform(dataframe[num_cols])

    dump(scaler, 'scaler.joblib')

    dataframe.columns = [col.replace(' ', '_').upper() for col in dataframe.columns]

    y = dataframe["CHURN"]
    X = dataframe.drop(["CHURN", "CUSTOMERID"], axis=1)

    return X, y

# Data Preprocessing

In [None]:
df = pd.read_csv("/content/telco_customer_churn.csv")

In [None]:
df.head()

Unnamed: 0,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges,Churn
0,7590-VHVEG,Female,0,Yes,No,1,No,No phone service,DSL,No,Yes,No,No,No,No,Month-to-month,Yes,Electronic check,29.85,29.85,0
1,5575-GNVDE,Male,0,No,No,34,Yes,No,DSL,Yes,No,Yes,No,No,No,One year,No,Mailed check,56.95,1889.5,0
2,3668-QPYBK,Male,0,No,No,2,Yes,No,DSL,Yes,Yes,No,No,No,No,Month-to-month,Yes,Mailed check,53.85,108.15,1
3,7795-CFOCW,Male,0,No,No,45,No,No phone service,DSL,Yes,No,Yes,Yes,No,No,One year,No,Bank transfer (automatic),42.3,1840.75,0
4,9237-HQITU,Female,0,No,No,2,Yes,No,Fiber optic,No,No,No,No,No,No,Month-to-month,Yes,Electronic check,70.7,151.65,1


In [None]:
df["Churn"].value_counts() * 100 / len(df)

Churn
0   73.46301
1   26.53699
Name: count, dtype: float64
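With roughly 73% of customers not churning, accuracy alone is a weak yardstick: a model that never predicts churn already scores about 73%. A minimal sketch of that baseline, assuming the class counts reported further down in this notebook (5174 non-churners, 1869 churners); the variable names are illustrative only:

```python
# Class counts taken from the notebook's own churn counts.
n_stay, n_churn = 5174, 1869
total = n_stay + n_churn

# Accuracy of a trivial classifier that always predicts "no churn" (class 0).
majority_baseline_accuracy = n_stay / total

# Its recall on the churn class is zero, because it never flags a churner.
majority_baseline_recall = 0 / n_churn

print(f"Baseline accuracy: {majority_baseline_accuracy:.4f}")
print(f"Baseline recall:   {majority_baseline_recall:.4f}")
```

This is why precision, recall, F1 and AUC are tracked alongside accuracy in the experiments that follow.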

In [None]:
X, y = dataproprocessing(df)

In [None]:
X.head()

Unnamed: 0,TENURE,MONTHLYCHARGES,TOTALCHARGES,NEW_AVG_CHARGES,NEW_INCREASE,NEW_AVG_SERVICE_FEE,GENDER_MALE,PARTNER_YES,DEPENDENTS_YES,PHONESERVICE_YES,MULTIPLELINES_NO_PHONE_SERVICE,MULTIPLELINES_YES,INTERNETSERVICE_FIBER_OPTIC,INTERNETSERVICE_NO,ONLINESECURITY_NO_INTERNET_SERVICE,ONLINESECURITY_YES,ONLINEBACKUP_NO_INTERNET_SERVICE,ONLINEBACKUP_YES,DEVICEPROTECTION_NO_INTERNET_SERVICE,DEVICEPROTECTION_YES,TECHSUPPORT_NO_INTERNET_SERVICE,TECHSUPPORT_YES,STREAMINGTV_NO_INTERNET_SERVICE,STREAMINGTV_YES,STREAMINGMOVIES_NO_INTERNET_SERVICE,STREAMINGMOVIES_YES,CONTRACT_ONE_YEAR,CONTRACT_TWO_YEAR,PAPERLESSBILLING_YES,PAYMENTMETHOD_CREDIT_CARD_(AUTOMATIC),PAYMENTMETHOD_ELECTRONIC_CHECK,PAYMENTMETHOD_MAILED_CHECK,NEW_TENURE_YEAR_1-2_YEAR,NEW_TENURE_YEAR_2-3_YEAR,NEW_TENURE_YEAR_3-4_YEAR,NEW_TENURE_YEAR_4-5_YEAR,NEW_TENURE_YEAR_5-6_YEAR,SENIORCITIZEN_1,NEW_ENGAGED_1,NEW_NOPROT_1,NEW_YOUNG_NOT_ENGAGED_1,NEW_TOTALSERVICES_1,NEW_TOTALSERVICES_2,NEW_TOTALSERVICES_3,NEW_TOTALSERVICES_4,NEW_TOTALSERVICES_5,NEW_TOTALSERVICES_6,NEW_TOTALSERVICES_7,NEW_FLAG_ANY_STREAMING_1,NEW_FLAG_AUTOPAYMENT_1
0,0.01389,0.11542,0.00128,0.00414,0.00041,0.2071,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0
1,0.47222,0.38507,0.21587,0.03227,0.00677,0.18441,1,0,0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0
2,0.02778,0.35423,0.01031,0.01935,0.00282,0.15883,1,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,1,0,0,1,0,0,0,0,0,0
3,0.625,0.2393,0.21024,0.02221,0.00674,0.06353,1,0,0,0,1,0,0,0,0,1,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,1,1,0,0,0,1,0,0,0,0,0,1
4,0.02778,0.52189,0.01533,0.0298,0.00346,0.88119,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0


In [None]:
X.shape

(7043, 50)

In [None]:
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

train_ds, val_ds = prepare_datasets(X_train, X_val, y_train, y_val, batch_size=32)
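Note that `dataproprocessing` fits `MinMaxScaler` on all rows before `train_test_split` runs, so the validation rows contribute to the scaling minimum and maximum. One common alternative is to learn those statistics from the training rows only and reuse them on the validation rows. The sketch below shows the arithmetic in plain Python on made-up tenure values; `fit_min_max` and `apply_min_max` are hypothetical helpers, not part of this notebook or of scikit-learn:

```python
def fit_min_max(train_values):
    """Learn the scaling statistics from the training values only."""
    return min(train_values), max(train_values)


def apply_min_max(values, lo, hi):
    """Scale values with previously learned statistics: (x - lo) / (hi - lo)."""
    span = hi - lo
    if span == 0:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


# Made-up tenure values (months), for illustration only.
train_tenure = [1, 12, 34, 60]
val_tenure = [2, 45, 72]

lo, hi = fit_min_max(train_tenure)
train_scaled = apply_min_max(train_tenure, lo, hi)
val_scaled = apply_min_max(val_tenure, lo, hi)

print(train_scaled)
print(val_scaled)  # a value above the training maximum scales to more than 1
```

With scikit-learn this corresponds to calling `fit` on the training split and `transform` on both splits.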

# Base Model with Binary Log Loss

In [None]:
X_train.shape[0]

5634

In [None]:
X_val.shape[0]

1409

In [None]:
base_model = Sequential([

    Input(shape=(train_ds.element_spec[0].shape[1],)),
    Dense(50, activation='relu', kernel_regularizer=l2(0.001)),
    BatchNormalization(),
    Dropout(0.5),
    Dense(1, activation='sigmoid')])

optimizer = Adam(learning_rate=0.001)

base_model.compile(optimizer=optimizer,
                   loss="binary_crossentropy",
                   metrics=["accuracy", "precision", "recall", "auc"])

early_stopping = EarlyStopping(monitor='val_loss',
                               patience=20,
                               verbose=1,
                               restore_best_weights=True)

base_model_history = base_model.fit(train_ds,
                                    epochs=1000,
                                    validation_data=val_ds,
                                    verbose=1,
                                    callbacks=[early_stopping])

Epoch 1/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 7s 19ms/step - accuracy: 0.5894 - auc: 0.6268 - loss: 0.8570 - precision: 0.3467 - recall: 0.5738 - val_accuracy: 0.7757 - val_auc: 0.8240 - val_loss: 0.5118 - val_precision: 0.6418 - val_recall: 0.3458
Epoch 2/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7517 - auc: 0.7715 - loss: 0.5698 - precision: 0.5432 - recall: 0.5539 - val_accuracy: 0.7885 - val_auc: 0.8446 - val_loss: 0.4632 - val_precision: 0.6582 - val_recall: 0.4182
Epoch 3/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7627 - auc: 0.7963 - loss: 0.5203 - precision: 0.5654 - recall: 0.4993 - val_accuracy: 0.8020 - val_auc: 0.8472 - val_loss: 0.4542 - val_precision: 0.6487 - val_recall: 0.5496
Epoch 4/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7797 - auc: 0.8137 - loss: 0.4912 - precision: 0.5892 - recall:

In [None]:
get_best_epoch_details(base_model_history, metric="val_loss", mode=min)

Unnamed: 0,Metric,Value
0,accuracy,0.8071
1,auc,0.8497
2,loss,0.4253
3,precision,0.6885
4,recall,0.4993
5,val_accuracy,0.802
6,val_auc,0.8506
7,val_loss,0.4254
8,val_precision,0.6577
9,val_recall,0.5255
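The base model relies on `EarlyStopping(monitor='val_loss', patience=20, restore_best_weights=True)`: training halts once the validation loss has gone 20 epochs without improving, and the weights of the best epoch are restored. The following is a simplified sketch of that patience rule in plain Python on a made-up loss curve; `simulate_early_stopping` is a hypothetical helper and not Keras's implementation (it ignores options such as `min_delta` and `start_from_epoch`):

```python
def simulate_early_stopping(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based, for a list of validation losses."""
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best value: remember it and reset the patience counter.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                # Patience exhausted: training would stop after this epoch.
                return epoch, best_epoch

    # The loss curve ended before patience ran out.
    return len(val_losses), best_epoch


# Made-up validation losses, for illustration only.
toy_losses = [0.50, 0.40, 0.45, 0.43, 0.41]
print(simulate_early_stopping(toy_losses, patience=2))  # (4, 2)
```

In the toy run, training stops after epoch 4, but the weights that would be kept are those of epoch 2, which is the same distinction `get_best_epoch_details` reports for the real history.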


In [None]:
val_loss, val_accuracy, val_precision, val_recall, val_auc = base_model.evaluate(val_ds, verbose=0)
f1_score = 2 * (val_precision * val_recall) / (val_precision + val_recall)
print(f"Validation Loss: {val_loss}")
print(f"Validation Accuracy: {val_accuracy}")
print(f"Validation AUC: {val_auc}")
print(f"Validation Precision: {val_precision}")
print(f"Validation Recall: {val_recall}")
print(f"Validation F1-Score: {f1_score}")

Validation Loss: 0.42540422081947327
Validation Accuracy: 0.8019872307777405
Validation AUC: 0.8505944013595581
Validation Precision: 0.6577181220054626
Validation Recall: 0.525469183921814
Validation F1-Score: 0.5842026923200292
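The F1 score printed above is the harmonic mean of precision and recall, so it stays low whenever either of the two is low. A small sketch that reproduces the value from the precision and recall reported for the base model; `f1_from_precision_recall` is a hypothetical helper that also guards against the case where both inputs are zero:

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as printed for the base model above.
base_f1 = f1_from_precision_recall(0.6577181220054626, 0.525469183921814)
print(f"{base_f1:.4f}")  # 0.5842
```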


In [None]:
df["Churn"].value_counts() * 100 / len(df)

Churn
0   73.46301
1   26.53699
Name: count, dtype: float64

# Weighted Cross-Entropy Loss

In [None]:
len(df[df['Churn'] == 1])

1869

In [None]:
len(df[df['Churn'] == 0])

5174

In [None]:
class_weight_for_0 = 1.0 / len(df[df['Churn'] == 0])

class_weight_for_1 = 1.0 / len(df[df['Churn'] == 1])

In [None]:
class_weight_for_0

0.00019327406262079628

In [None]:
class_weight_for_1

0.0005350454788657035

In [None]:
class_weights = {0: class_weight_for_0, 1: class_weight_for_1}
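The inverse-frequency weights above are on the order of 1e-4, which helps explain why the training loss logged with `class_weight` is so much smaller than the validation loss in the runs below. A commonly used variant rescales both weights by `total / 2`: the ratio between the classes stays the same, but the weighted sample count still adds up to the dataset size. A minimal sketch, assuming the class counts shown above; `scaled_class_weights` is a hypothetical helper:

```python
def scaled_class_weights(n_negative, n_positive):
    """Inverse-frequency class weights rescaled by total / 2."""
    total = n_negative + n_positive
    return {
        0: (1.0 / n_negative) * (total / 2.0),
        1: (1.0 / n_positive) * (total / 2.0),
    }


# Counts of non-churners (0) and churners (1) from the cells above.
weights = scaled_class_weights(n_negative=5174, n_positive=1869)
print(weights)
```

The resulting dictionary can be passed to `fit` through `class_weight` in the same way as `class_weights`.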

In [None]:
base_model = Sequential([
    Input(shape=(train_ds.element_spec[0].shape[1],)),
    Dense(50, activation='relu', kernel_regularizer=l2(0.001)),
    BatchNormalization(),
    Dropout(0.5),
    Dense(1, activation='sigmoid')])

optimizer = Adam(learning_rate=0.001)

base_model.compile(optimizer=optimizer,
                   loss="binary_crossentropy",
                   metrics=["accuracy", "precision", "recall", "auc"])

early_stopping = EarlyStopping(monitor='val_loss',
                               patience=20,
                               verbose=1,
                               restore_best_weights=True,
                               mode='min')

base_model_history = base_model.fit(train_ds,
                                    epochs=1000,
                                    validation_data=val_ds,
                                    verbose=1,
                                    callbacks=[early_stopping],
                                    class_weight=class_weights)

Epoch 1/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 7s 17ms/step - accuracy: 0.5751 - auc: 0.6138 - loss: 0.0318 - precision: 0.3375 - recall: 0.6152 - val_accuracy: 0.7182 - val_auc: 0.8067 - val_loss: 0.6430 - val_precision: 0.4819 - val_recall: 0.8552
Epoch 2/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6739 - auc: 0.7578 - loss: 0.0025 - precision: 0.4316 - recall: 0.7428 - val_accuracy: 0.7459 - val_auc: 0.8452 - val_loss: 0.6043 - val_precision: 0.5127 - val_recall: 0.8097
Epoch 3/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7299 - auc: 0.8148 - loss: 2.3696e-04 - precision: 0.4942 - recall: 0.7774 - val_accuracy: 0.7949 - val_auc: 0.8413 - val_loss: 0.5131 - val_precision: 0.6667 - val_recall: 0.4504
Epoch 4/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7193 - auc: 0.8122 - loss: 1.6252e-04 - precision: 0.4829 -

In [None]:
get_best_epoch_details(base_model_history, metric="val_loss", mode=min)

Unnamed: 0,Metric,Value
0,accuracy,0.7263
1,auc,0.8113
2,loss,0.0002
3,precision,0.4905
4,recall,0.7901
5,val_accuracy,0.7743
6,val_auc,0.8461
7,val_loss,0.4475
8,val_precision,0.5565
9,val_recall,0.7265


In [None]:
val_loss, val_accuracy, val_precision, val_recall, val_auc = base_model.evaluate(val_ds, verbose=0)
f1_score = 2 * (val_precision * val_recall) / (val_precision + val_recall)
print(f"Validation Loss: {val_loss}")
print(f"Validation Accuracy: {val_accuracy}")
print(f"Validation AUC: {val_auc}")
print(f"Validation Precision: {val_precision}")
print(f"Validation Recall: {val_recall}")
print(f"Validation F1-Score: {f1_score}")

Validation Loss: 0.4474751055240631
Validation Accuracy: 0.7743080258369446
Validation AUC: 0.8461032509803772
Validation Precision: 0.5564681887626648
Validation Recall: 0.7265415787696838
Validation F1-Score: 0.6302325775373169


In [None]:
# weighted binary log loss
# Validation Loss: 0.4474751055240631
# Validation Accuracy: 0.7743080258369446
# Validation AUC: 0.8461032509803772
# Validation Precision: 0.5564681887626648
# Validation Recall: 0.7265415787696838
# Validation F1-Score: 0.6302325775373169

# binary log loss
# Validation Loss: 0.42540422081947327
# Validation Accuracy: 0.8019872307777405
# Validation AUC: 0.8505944013595581
# Validation Precision: 0.6577181220054626
# Validation Recall: 0.525469183921814
# Validation F1-Score: 0.5842026923200292

# Weighted Cross-Entropy Loss and Monitoring With AUC

In [None]:
class_weights = {0: class_weight_for_0, 1: class_weight_for_1}

base_model = Sequential([
    Input(shape=(train_ds.element_spec[0].shape[1],)),
    Dense(50, activation='relu', kernel_regularizer=l2(0.001)),
    BatchNormalization(),
    Dropout(0.5),
    Dense(1, activation='sigmoid')])

optimizer = Adam(learning_rate=0.001)

base_model.compile(optimizer=optimizer,
                   loss="binary_crossentropy",
                   metrics=["accuracy", "precision", "recall", "auc"])

early_stopping = EarlyStopping(monitor='val_auc',
                               patience=20,
                               verbose=1,
                               restore_best_weights=True,
                               mode='max')

base_model_history = base_model.fit(train_ds,
                                    epochs=1000,
                                    validation_data=val_ds,
                                    verbose=1,
                                    callbacks=[early_stopping],
                                    class_weight=class_weights)

Epoch 1/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 6s 16ms/step - accuracy: 0.5390 - auc: 0.5821 - loss: 0.0314 - precision: 0.3157 - recall: 0.6036 - val_accuracy: 0.7715 - val_auc: 0.8242 - val_loss: 0.6064 - val_precision: 0.5581 - val_recall: 0.6568
Epoch 2/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6786 - auc: 0.7614 - loss: 0.0024 - precision: 0.4312 - recall: 0.7692 - val_accuracy: 0.7949 - val_auc: 0.8349 - val_loss: 0.5807 - val_precision: 0.6228 - val_recall: 0.5710
Epoch 3/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7178 - auc: 0.8091 - loss: 2.4042e-04 - precision: 0.4753 - recall: 0.7777 - val_accuracy: 0.7949 - val_auc: 0.8430 - val_loss: 0.4823 - val_precision: 0.6567 - val_recall: 0.4718
Epoch 4/1000
177/177 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7312 - auc: 0.8221 - loss: 1.5943e-04 - precision: 0.4921 -

In [None]:
get_best_epoch_details(base_model_history, metric="val_auc", mode=max)

Unnamed: 0,Metric,Value
0,accuracy,0.7251
1,auc,0.8132
2,loss,0.0002
3,precision,0.489
4,recall,0.7888
5,val_accuracy,0.7218
6,val_auc,0.8528
7,val_loss,0.5347
8,val_precision,0.4855
9,val_recall,0.8525


In [None]:
val_loss, val_accuracy, val_precision, val_recall, val_auc = base_model.evaluate(val_ds, verbose=0)
f1_score = 2 * (val_precision * val_recall) / (val_precision + val_recall)
print(f"Validation Loss: {val_loss}")
print(f"Validation Accuracy: {val_accuracy}")
print(f"Validation AUC: {val_auc}")
print(f"Validation Precision: {val_precision}")
print(f"Validation Recall: {val_recall}")
print(f"Validation F1-Score: {f1_score}")

Validation Loss: 0.534650981426239
Validation Accuracy: 0.7217885255813599
Validation AUC: 0.8527953028678894
Validation Precision: 0.4854961931705475
Validation Recall: 0.8525469303131104
Validation F1-Score: 0.6186770544264993
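One reason to monitor `val_auc` is that, unlike accuracy, precision and recall, it does not depend on the 0.5 decision threshold: ROC AUC equals the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner, with ties counted as one half. The sketch below computes that quantity by brute force on made-up labels and scores; `pairwise_auc` is a hypothetical helper. Keras estimates AUC from a fixed grid of thresholds instead, so its value can differ slightly from this exact pairwise count.

```python
def pairwise_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Made-up labels and predicted churn probabilities, for illustration only.
toy_auc = pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(toy_auc)  # 0.75
```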


In [None]:
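# weighted binary log loss, monitored with val_auc
# Validation Loss: 0.534650981426239
# Validation Accuracy: 0.7217885255813599
# Validation AUC: 0.8527953028678894
# Validation Precision: 0.4854961931705475
# Validation Recall: 0.8525469303131104
# Validation F1-Score: 0.6186770544264993

# weighted binary log loss
# Validation Loss: 0.4474751055240631
# Validation Accuracy: 0.7743080258369446
# Validation AUC: 0.8461032509803772
# Validation Precision: 0.5564681887626648
# Validation Recall: 0.7265415787696838
# Validation F1-Score: 0.6302325775373169

# binary log loss
# Validation Loss: 0.42540422081947327
# Validation Accuracy: 0.8019872307777405
# Validation AUC: 0.8505944013595581
# Validation Precision: 0.6577181220054626
# Validation Recall: 0.525469183921814
# Validation F1-Score: 0.5842026923200292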

# Hyperparameter Optimization


## Search Space

In [None]:
def build_model(hp):
    model = Sequential()
    model.add(Input(shape=(train_ds.element_spec[0].shape[1],)))

    # Hidden layers with advanced activation functions, l2, Dropout
    for i in range(hp.Int('num_layers', 1, 10)):
        # Add Dense layer
        model.add(Dense(
            units=hp.Int('units_' + str(i + 1), min_value=32, max_value=512, step=16),
            kernel_regularizer=l2(hp.Float('l2_' + str(i + 1), min_value=0.0001, max_value=0.01, sampling='log'))
        ))

        # Activation layer choice
        activation_choice = hp.Choice('activation_' + str(i + 1), values=['relu', 'leaky_relu', 'prelu'])

        if activation_choice == 'relu':
            model.add(ReLU())
        elif activation_choice == 'leaky_relu':
            model.add(LeakyReLU(negative_slope=0.01))
        elif activation_choice == 'prelu':
            model.add(PReLU())
        else:
            model.add(Activation(activation_choice))

        # Batch Normalization and Dropout
        model.add(BatchNormalization())
        model.add(Dropout(hp.Float('dropout_' + str(i + 1), min_value=0.0, max_value=0.5, step=0.1)))

    model.add(Dense(1, activation='sigmoid'))

    # Optimizer: Adam with tuning for beta1 and beta2
    optimizer = Adam(
        beta_1=hp.Float('beta1', min_value=0.85, max_value=0.99, step=0.01),
        beta_2=hp.Float('beta2', min_value=0.995, max_value=0.999, step=0.001)
    )

    model.compile(optimizer=optimizer,
                  loss="binary_crossentropy",
                  metrics=["accuracy", "precision", "recall", "auc"])

    return model
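The L2 strength in the search space is declared with `sampling='log'`, which spreads the trials evenly across orders of magnitude rather than evenly across raw values. The sketch below illustrates the idea in plain Python for the same range (1e-4 to 1e-2); `sample_log_uniform` is a hypothetical helper, not Keras Tuner's implementation:

```python
import math
import random


def sample_log_uniform(rng, min_value, max_value):
    """Draw a value whose logarithm is uniform on [log(min_value), log(max_value)]."""
    return math.exp(rng.uniform(math.log(min_value), math.log(max_value)))


rng = random.Random(46)
l2_samples = [sample_log_uniform(rng, 1e-4, 1e-2) for _ in range(1000)]

# 1e-3 is the geometric midpoint of the range, so about half the draws fall below it.
share_below_1e3 = sum(s < 1e-3 for s in l2_samples) / len(l2_samples)
print(f"share of samples below 1e-3: {share_below_1e3:.2f}")
```

With linear sampling on the same range, only about 9% of the draws would fall below 1e-3, so small regularization strengths would rarely be tried.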


## Random Search

In [None]:
class_weights = {0: class_weight_for_0, 1: class_weight_for_1}

random_search_tuner = RandomSearch(
    build_model,
    objective='val_loss',
    max_trials=30,
    executions_per_trial=1,
    overwrite=True)

early_stopping = EarlyStopping(
    monitor='val_auc',
    patience=10,
    verbose=1,
    restore_best_weights=True,
    mode='max')

model_checkpoint = ModelCheckpoint(
    'final_tuned_model.keras',
    monitor='val_auc',
    verbose=0,
    save_best_only=True)

random_search_tuner.search(train_ds,
                           epochs=250,

                           validation_data=val_ds,

                           callbacks=[early_stopping, model_checkpoint],

                           class_weight=class_weights)


Trial 30 Complete [00h 01m 15s]
val_loss: 0.4300013482570648

Best val_loss So Far: 0.42378589510917664
Total elapsed time: 00h 19m 58s
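Conceptually, the random search above draws `max_trials` independent configurations from the search space, scores each one, and keeps the best. The sketch below shows that loop in plain Python under stated assumptions: `random_search`, `sample_config` and `toy_objective` are hypothetical names, and the objective is a made-up stand-in for a validation loss, not anything measured in this notebook.

```python
import random


def random_search(objective, sample_config, max_trials, seed):
    """Evaluate max_trials random configurations and return the lowest-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(max_trials):
        config = sample_config(rng)
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def sample_config(rng):
    """Draw one configuration from a small, made-up search space."""
    return {
        "num_layers": rng.randint(1, 10),
        "dropout": rng.choice([0.0, 0.1, 0.2, 0.3, 0.4, 0.5]),
    }


def toy_objective(config):
    """Made-up stand-in for a validation loss (lower is better)."""
    return (config["num_layers"] - 3) ** 2 + (config["dropout"] - 0.3) ** 2


best_config, best_score = random_search(toy_objective, sample_config, max_trials=30, seed=46)
print(best_config, best_score)
```

Because every trial is independent, the quality of the result depends on how many trials the budget allows relative to the size of the search space.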


In [None]:
best_hps = random_search_tuner.get_best_hyperparameters(num_trials=1)[0]

print_hyperparameters(best_hps.values)

   Hyperparameter       Value
0      num_layers           4
1         units_1         480
2            l2_1     0.00130
3    activation_1  leaky_relu
4       dropout_1     0.20000
5           beta1     0.91000
6           beta2     0.99800
7         units_2          32
8            l2_2     0.00010
9    activation_2        relu
10      dropout_2     0.00000
11        units_3          32
12           l2_3     0.00010
13   activation_3        relu
14      dropout_3     0.00000
15        units_4          32
16           l2_4     0.00010
17   activation_4        relu
18      dropout_4     0.00000


In [None]:
dump(best_hps, 'best_hps.joblib')

['best_hps.joblib']

In [None]:
best_model = random_search_tuner.get_best_models(num_models=1)[0]



In [None]:
best_model.summary()

In [None]:
val_loss, val_accuracy, val_precision, val_recall, val_auc = best_model.evaluate(val_ds, verbose=0)
f1_score = 2 * (val_precision * val_recall) / (val_precision + val_recall)
print(f"Validation Loss: {val_loss}")
print(f"Validation Accuracy: {val_accuracy}")
print(f"Validation AUC: {val_auc}")
print(f"Validation Precision: {val_precision}")
print(f"Validation Recall: {val_recall}")
print(f"Validation F1-Score: {f1_score}")

Validation Loss: 0.42378589510917664
Validation Accuracy: 0.7948899865150452
Validation AUC: 0.8497482538223267
Validation Precision: 0.599056601524353
Validation Recall: 0.6809651255607605
Validation F1-Score: 0.6373902024366314


In [None]:
# weighted binary log loss monitor with auc
# Validation Loss: 0.534650981426239
# Validation Accuracy: 0.7217885255813599
# Validation AUC: 0.8527953028678894
# Validation Precision: 0.4854961931705475
# Validation Recall: 0.8525469303131104
# Validation F1-Score: 0.6186770544264993

# weighted binary log loss
# Validation Loss: 0.4474751055240631
# Validation Accuracy: 0.7743080258369446
# Validation AUC: 0.8461032509803772
# Validation Precision: 0.5564681887626648
# Validation Recall: 0.7265415787696838
# Validation F1-Score: 0.6302325775373169

# binary log loss
# Validation Loss: 0.42540422081947327
# Validation Accuracy: 0.8019872307777405
# Validation AUC: 0.8505944013595581
# Validation Precision: 0.6577181220054626
# Validation Recall: 0.525469183921814
# Validation F1-Score: 0.5842026923200292

# Retrain for Entire Dataset

In [None]:
def build_model(hp):
    model = Sequential()
    model.add(Input(shape=(train_ds.element_spec[0].shape[1],)))

    # Hidden layers with advanced activation functions, l2, Dropout
    for i in range(hp.Int('num_layers', 1, 10)):
        # Add Dense layer
        model.add(Dense(
            units=hp.Int('units_' + str(i + 1), min_value=32, max_value=512, step=16),
            kernel_regularizer=l2(hp.Float('l2_' + str(i + 1), min_value=0.0001, max_value=0.01, sampling='log'))
        ))

        # Activation layer choice
        activation_choice = hp.Choice('activation_' + str(i + 1), values=['relu', 'leaky_relu', 'prelu'])

        if activation_choice == 'relu':
            model.add(ReLU())
        elif activation_choice == 'leaky_relu':
            model.add(LeakyReLU(negative_slope=0.01))
        elif activation_choice == 'prelu':
            model.add(PReLU())
        else:
            model.add(Activation(activation_choice))

        # Batch Normalization and Dropout
        model.add(BatchNormalization())
        model.add(Dropout(hp.Float('dropout_' + str(i + 1), min_value=0.0, max_value=0.5, step=0.1)))

    model.add(Dense(1, activation='sigmoid'))

    # Optimizer: Adam with tuning for beta1 and beta2
    optimizer = Adam(
        beta_1=hp.Float('beta1', min_value=0.85, max_value=0.99, step=0.01),
        beta_2=hp.Float('beta2', min_value=0.995, max_value=0.999, step=0.001)
    )

    model.compile(optimizer=optimizer,
                  loss="binary_crossentropy",
                  metrics=["accuracy", "precision", "recall", "auc"])

    return model

In [None]:
df = pd.read_csv("/content/telco_customer_churn.csv")

In [None]:
X, y = dataproprocessing(df)

In [None]:
dataset = tf.data.Dataset.from_tensor_slices((X, y)).shuffle(buffer_size=len(X)).batch(len(X))

In [None]:
best_hps = load('best_hps.joblib')

final_tuned_model = build_model(best_hps)

In [None]:
early_stopping = EarlyStopping(
    monitor='loss',
    patience=5,
    verbose=1,
    restore_best_weights=True)

model_checkpoint = ModelCheckpoint(
    'final_tuned_all_data_model.keras',
    monitor='loss',
    verbose=0,
    save_best_only=True)

final_history = final_tuned_model.fit(dataset,
            epochs=100,
            verbose=1,
            callbacks=[early_stopping, model_checkpoint])

Epoch 1/100
1/1 ━━━━━━━━━━━━━━━━━━━━ 6s 6s/step - accuracy: 0.5040 - auc: 0.5094 - loss: 1.0454 - precision: 0.2762 - recall: 0.5361
Epoch 2/100
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 94ms/step - accuracy: 0.5918 - auc: 0.6627 - loss: 0.9047 - precision: 0.3634 - recall: 0.7159
Epoch 3/100
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 97ms/step - accuracy: 0.6409 - auc: 0.7473 - loss: 0.8282 - precision: 0.4099 - recall: 0.8036
Epoch 4/100
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 88ms/step - accuracy: 0.6648 - auc: 0.7780 - loss: 0.7945 - precision: 0.4311 - recall: 0.8240
Epoch 5/100
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 95ms/step - accuracy: 0.6787 - auc: 0.7933 - loss: 0.7728 - precision: 0.4435 - recall: 0.8277
Epoch 6/100
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 91ms/step - accuracy: 0.6811 - auc: 0.8002 - loss: 0.7572 - precision: 0.4458 - rec

In [None]:
get_best_epoch_details(final_history, metric="loss", mode=min)

Unnamed: 0,Metric,Value
0,accuracy,0.8636
1,auc,0.9283
2,loss,0.4236
3,precision,0.7328
4,recall,0.7646
5,best_epoch,100.0


In [None]:
final_tuned_model.save('final_tuned_all_data_model.keras')

# Prediction

## Imports

In [None]:
# !pip uninstall tf-keras
# !pip install keras-tuner
# !pip install tensorflow==2.16.1

Found existing installation: tf_keras 2.15.1
Uninstalling tf_keras-2.15.1:
  Would remove:
    /usr/local/lib/python3.10/dist-packages/tf_keras-2.15.1.dist-info/*
    /usr/local/lib/python3.10/dist-packages/tf_keras/*
Proceed (Y/n)? y
  Successfully uninstalled tf_keras-2.15.1
Collecting keras-tuner
  Downloading keras_tuner-1.4.7-py3-none-any.whl (129 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 129.1/129.1 kB 6.3 MB/s eta 0:00:00
Collecting kt-legacy (from keras-tuner)
  Downloading kt_legacy-1.0.5-py3-none-any.whl (9.6 kB)
Installing collected packages: kt-legacy, keras-tuner
Successfully installed keras-tuner-1.4.7 kt-legacy-1.0.5
Collecting tensorflow==2.16.1
  Downloading tensorflow-2.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (589.8 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 589.8/589.8 MB 1.6 MB/s eta 0:00:00
Collecting h5py>=3.10.0 (from tensorflow==2.16.1)
  Downlo

In [None]:
import keras
import tensorflow as tf
print("Keras Current Version:", keras.__version__, "Tensorflow Current Version:", tf.__version__)

Keras Current Version: 3.3.3 Tensorflow Current Version: 2.16.1


In [None]:
import numpy as np
import pandas as pd

import random
from joblib import dump, load

from sklearn.preprocessing import MinMaxScaler

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import load_model

def grab_col_names(dataframe, cat_th=10, car_th=20):
    cat_cols = [col for col in dataframe.columns if dataframe[col].dtypes == "O"]
    num_but_cat = [col for col in dataframe.columns if dataframe[col].nunique() < cat_th and dataframe[col].dtypes != "O"]
    cat_but_car = [col for col in dataframe.columns if dataframe[col].nunique() > car_th and dataframe[col].dtypes == "O"]
    cat_cols = cat_cols + num_but_cat
    cat_cols = [col for col in cat_cols if col not in cat_but_car]
    num_cols = [col for col in dataframe.columns if dataframe[col].dtypes != "O"]
    num_cols = [col for col in num_cols if col not in num_but_cat]
    return cat_cols, num_cols, cat_but_car
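
The `cat_th` threshold decides whether a low-cardinality numeric column is treated as categorical, which matters later because the prediction code calls `grab_col_names` with `cat_th=5` while training uses the default of 10. The sketch below repeats the function on a small made-up frame (`toy` and its column names are illustrative assumptions, not part of the dataset) to show how the split changes with the threshold.

```python
import pandas as pd


def grab_col_names(dataframe, cat_th=10, car_th=20):
    cat_cols = [col for col in dataframe.columns if dataframe[col].dtypes == "O"]
    num_but_cat = [col for col in dataframe.columns
                   if dataframe[col].nunique() < cat_th and dataframe[col].dtypes != "O"]
    cat_but_car = [col for col in dataframe.columns
                   if dataframe[col].nunique() > car_th and dataframe[col].dtypes == "O"]
    cat_cols = cat_cols + num_but_cat
    cat_cols = [col for col in cat_cols if col not in cat_but_car]
    num_cols = [col for col in dataframe.columns if dataframe[col].dtypes != "O"]
    num_cols = [col for col in num_cols if col not in num_but_cat]
    return cat_cols, num_cols, cat_but_car


# Made-up 12-row frame; the string column is forced to object dtype
# because the function identifies text columns by dtype "O".
toy = pd.DataFrame({
    "contract": pd.Series(["Month-to-month", "One year", "Two year"] * 4, dtype=object),
    "senior": [0, 1] * 6,                            # 2 unique values
    "services": [0, 1, 2, 3, 4, 5] * 2,              # 6 unique values
    "charges": [20.0 + 5 * i for i in range(12)],    # 12 unique values
})

cat10, num10, _ = grab_col_names(toy, cat_th=10)
cat5, num5, _ = grab_col_names(toy, cat_th=5)

print("cat_th=10:", cat10, num10)
print("cat_th=5: ", cat5, num5)
```

In this toy frame, `services` counts as categorical with `cat_th=10` but as numeric with `cat_th=5`, so the two settings can send the same column down different encoding paths.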

## New Customers

In [None]:
new_customers_df = pd.read_csv("/content/new_customers.csv")

In [None]:
new_customers_df.head()

Unnamed: 0,customerID,gender,SeniorCitizen,Partner,Dependents,tenure,PhoneService,MultipleLines,InternetService,OnlineSecurity,OnlineBackup,DeviceProtection,TechSupport,StreamingTV,StreamingMovies,Contract,PaperlessBilling,PaymentMethod,MonthlyCharges,TotalCharges
0,7590-VHVEG,Female,0,Yes,No,1,No,No phone service,DSL,No,Yes,No,No,No,No,Month-to-month,Yes,Electronic check,29.85,29.85
1,5575-GNVDE,Male,0,No,No,34,Yes,No,DSL,Yes,No,Yes,No,No,No,One year,No,Mailed check,56.95,1889.5
2,3668-QPYBK,Male,0,No,No,2,Yes,No,DSL,Yes,Yes,No,No,No,No,Month-to-month,Yes,Mailed check,53.85,108.15
3,7795-CFOCW,Male,0,No,No,45,No,No phone service,DSL,Yes,No,Yes,Yes,No,No,One year,No,Bank transfer (automatic),42.3,1840.75
4,9237-HQITU,Female,0,No,No,2,Yes,No,Fiber optic,No,No,No,No,No,No,Month-to-month,Yes,Electronic check,70.7,151.65


In [None]:
new_customers_df.shape

(10, 20)

## Load Scaler & Final Model

In [None]:
scaler = load('scaler.joblib')

loaded_final_tuned_model = load_model("/content/final_tuned_all_data_model.keras", compile=False)

In [None]:
loaded_final_tuned_model.predict(new_customers_df)

In [None]:
def data_proprocess_prediction(dataframe, scaler):

    # feature engineering
    dataframe.loc[(dataframe["tenure"] >= 0) & (dataframe["tenure"] <= 12), "NEW_TENURE_YEAR"] = "0-1 Year"
    dataframe.loc[(dataframe["tenure"] > 12) & (dataframe["tenure"] <= 24), "NEW_TENURE_YEAR"] = "1-2 Year"
    dataframe.loc[(dataframe["tenure"] > 24) & (dataframe["tenure"] <= 36), "NEW_TENURE_YEAR"] = "2-3 Year"
    dataframe.loc[(dataframe["tenure"] > 36) & (dataframe["tenure"] <= 48), "NEW_TENURE_YEAR"] = "3-4 Year"
    dataframe.loc[(dataframe["tenure"] > 48) & (dataframe["tenure"] <= 60), "NEW_TENURE_YEAR"] = "4-5 Year"
    dataframe.loc[(dataframe["tenure"] > 60) & (dataframe["tenure"] <= 72), "NEW_TENURE_YEAR"] = "5-6 Year"

    dataframe["NEW_Engaged"] = dataframe["Contract"].apply(lambda x: 1 if x in ["One year", "Two year"] else 0)

    dataframe["NEW_noProt"] = dataframe.apply(lambda x: 1 if (x["OnlineBackup"] != "Yes") or (x["DeviceProtection"] != "Yes") or (
                x["TechSupport"] != "Yes") else 0, axis=1)

    dataframe["NEW_Young_Not_Engaged"] = dataframe.apply(lambda x: 1 if (x["NEW_Engaged"] == 0) and (x["SeniorCitizen"] == 0) else 0,
                                          axis=1)

    dataframe['NEW_TotalServices'] = (dataframe[['PhoneService', 'InternetService', 'OnlineSecurity',
                                  'OnlineBackup', 'DeviceProtection', 'TechSupport',
                                  'StreamingTV', 'StreamingMovies']] == 'Yes').sum(axis=1)

    dataframe["NEW_FLAG_ANY_STREAMING"] = dataframe.apply(
        lambda x: 1 if (x["StreamingTV"] == "Yes") or (x["StreamingMovies"] == "Yes") else 0, axis=1)

    dataframe["NEW_FLAG_AutoPayment"] = dataframe["PaymentMethod"].apply(
        lambda x: 1 if x in ["Bank transfer (automatic)", "Credit card (automatic)"] else 0)

    dataframe["NEW_AVG_Charges"] = dataframe["TotalCharges"] / (dataframe["tenure"] + 1)

    dataframe["NEW_Increase"] = dataframe["NEW_AVG_Charges"] / dataframe["MonthlyCharges"]

    dataframe["NEW_AVG_Service_Fee"] = dataframe["MonthlyCharges"] / (dataframe['NEW_TotalServices'] + 1)

    cat_cols, num_cols, cat_but_car = grab_col_names(dataframe, cat_th=5)

    cat_cols.remove("customerID")

    dataframe = pd.get_dummies(dataframe, columns=cat_cols, drop_first=True, dtype=int)

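    # NOTE: fit_transform refits the loaded scaler on the new customers, so the
    # training-time min/max are not reused; scaler.transform would keep them.
    dataframe[num_cols] = scaler.fit_transform(dataframe[num_cols])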

    dataframe.columns = [col.replace(' ', '_').upper() for col in dataframe.columns]

    X = dataframe.drop(["CUSTOMERID"], axis=1)

    return X
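
The function above calls `scaler.fit_transform` on the new customers, which recomputes the minimum and maximum from those ten rows instead of reusing the ranges learned during training. The sketch below, on made-up frames (`train` and `new` are illustrative, not the notebook's data), contrasts the two calls; it is meant only to show the difference, not to reproduce the notebook's results.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative data: the training set spans tenure 0..72, the new batch does not.
train = pd.DataFrame({"tenure": [0, 36, 72], "MonthlyCharges": [20.0, 70.0, 120.0]})
new = pd.DataFrame({"tenure": [1, 34], "MonthlyCharges": [29.85, 56.95]})

# Learn min/max from the training data only.
scaler = MinMaxScaler().fit(train)

# Reusing the fitted scaler keeps new rows on the training scale.
reused = pd.DataFrame(scaler.transform(new), columns=new.columns)

# Refitting on the new rows stretches them to [0, 1] regardless of training.
refit = pd.DataFrame(MinMaxScaler().fit_transform(new), columns=new.columns)

print(reused)
print(refit)
```

With the reused scaler, a tenure of 1 maps to 1/72, whereas refitting maps the smallest tenure in the batch to 0 and the largest to 1, whatever their actual values.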

In [None]:
new_customers_processed = data_proprocess_prediction(new_customers_df, scaler)

In [None]:
new_customers_processed

Unnamed: 0,TENURE,MONTHLYCHARGES,TOTALCHARGES,NEW_AVG_CHARGES,NEW_INCREASE,NEW_AVG_SERVICE_FEE,GENDER_MALE,PARTNER_YES,DEPENDENTS_YES,PHONESERVICE_YES,...,NEW_TENURE_YEAR_2-3_YEAR,NEW_TENURE_YEAR_3-4_YEAR,NEW_TENURE_YEAR_5-6_YEAR,NEW_ENGAGED_1,NEW_YOUNG_NOT_ENGAGED_1,NEW_TOTALSERVICES_3,NEW_TOTALSERVICES_4,NEW_TOTALSERVICES_5,NEW_FLAG_ANY_STREAMING_1,NEW_FLAG_AUTOPAYMENT_1
0,0.0,0.001332,0.0,0.0,0.0,0.17558,0,1,0,0,...,0,0,0,0,1,0,0,0,0,0
1,0.540984,0.362425,0.537766,0.433472,0.891878,0.14783,1,0,0,1,...,1,0,0,1,0,1,0,0,0,0
2,0.016393,0.321119,0.022642,0.234433,0.337384,0.116549,1,0,0,1,...,0,0,0,0,1,1,0,0,0,0
3,0.721311,0.167222,0.523669,0.278448,0.888021,0.0,1,0,0,0,...,0,1,0,1,0,1,0,0,0,1
4,0.016393,0.545636,0.035222,0.395345,0.428056,1.0,0,0,0,1,...,0,0,0,0,1,0,0,0,0,0
5,0.114754,0.931379,0.228637,0.846084,0.826014,0.377598,0,0,0,1,...,0,0,0,0,1,0,1,0,1,0
6,0.344262,0.790806,0.555088,0.774948,0.898453,0.47225,1,0,1,1,...,0,0,0,0,1,1,0,0,1,1
7,0.147541,0.0,0.07867,0.138944,0.84128,0.173562,0,0,0,0,...,0,0,0,0,1,0,0,0,0,0
8,0.442623,1.0,0.872213,1.0,1.0,0.27817,0,1,0,1,...,1,0,0,0,1,0,0,1,1,0
9,1.0,0.351765,1.0,0.448771,0.967652,0.139758,1,0,1,1,...,0,0,1,1,0,1,0,0,0,1


In [None]:
loaded_final_tuned_model.predict(new_customers_processed)

In [None]:
def dataproprocessing(dataframe):

    # cat_cols, num_cols, cat_but_car = grab_col_names(dataframe)

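    # Assign the result back instead of using a chained in-place fillna,
    # which newer pandas versions may not apply to the original frame.
    dataframe["TotalCharges"] = dataframe["TotalCharges"].fillna(dataframe["TotalCharges"].median())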

    # feature engineering
    dataframe.loc[(dataframe["tenure"] >= 0) & (dataframe["tenure"] <= 12), "NEW_TENURE_YEAR"] = "0-1 Year"
    dataframe.loc[(dataframe["tenure"] > 12) & (dataframe["tenure"] <= 24), "NEW_TENURE_YEAR"] = "1-2 Year"
    dataframe.loc[(dataframe["tenure"] > 24) & (dataframe["tenure"] <= 36), "NEW_TENURE_YEAR"] = "2-3 Year"
    dataframe.loc[(dataframe["tenure"] > 36) & (dataframe["tenure"] <= 48), "NEW_TENURE_YEAR"] = "3-4 Year"
    dataframe.loc[(dataframe["tenure"] > 48) & (dataframe["tenure"] <= 60), "NEW_TENURE_YEAR"] = "4-5 Year"
    dataframe.loc[(dataframe["tenure"] > 60) & (dataframe["tenure"] <= 72), "NEW_TENURE_YEAR"] = "5-6 Year"

    dataframe["NEW_Engaged"] = dataframe["Contract"].apply(lambda x: 1 if x in ["One year", "Two year"] else 0)

    dataframe["NEW_noProt"] = dataframe.apply(lambda x: 1 if (x["OnlineBackup"] != "Yes") or (x["DeviceProtection"] != "Yes") or (
                x["TechSupport"] != "Yes") else 0, axis=1)

    dataframe["NEW_Young_Not_Engaged"] = dataframe.apply(lambda x: 1 if (x["NEW_Engaged"] == 0) and (x["SeniorCitizen"] == 0) else 0,
                                          axis=1)

    dataframe['NEW_TotalServices'] = (dataframe[['PhoneService', 'InternetService', 'OnlineSecurity',
                                  'OnlineBackup', 'DeviceProtection', 'TechSupport',
                                  'StreamingTV', 'StreamingMovies']] == 'Yes').sum(axis=1)

    dataframe["NEW_FLAG_ANY_STREAMING"] = dataframe.apply(
        lambda x: 1 if (x["StreamingTV"] == "Yes") or (x["StreamingMovies"] == "Yes") else 0, axis=1)

    dataframe["NEW_FLAG_AutoPayment"] = dataframe["PaymentMethod"].apply(
        lambda x: 1 if x in ["Bank transfer (automatic)", "Credit card (automatic)"] else 0)

    dataframe["NEW_AVG_Charges"] = dataframe["TotalCharges"] / (dataframe["tenure"] + 1)

    dataframe["NEW_Increase"] = dataframe["NEW_AVG_Charges"] / dataframe["MonthlyCharges"]

    dataframe["NEW_AVG_Service_Fee"] = dataframe["MonthlyCharges"] / (dataframe['NEW_TotalServices'] + 1)

    cat_cols, num_cols, cat_but_car = grab_col_names(dataframe)

    cat_cols.remove("Churn")

    dataframe = pd.get_dummies(dataframe, columns=cat_cols, drop_first=True, dtype=int)

    scaler = MinMaxScaler()

    dataframe[num_cols] = scaler.fit_transform(dataframe[num_cols])

    # dump(scaler, 'scaler.joblib')

    dataframe.columns = [col.replace(' ', '_').upper() for col in dataframe.columns]

    y = dataframe["CHURN"]
    X = dataframe.drop(["CHURN", "CUSTOMERID"], axis=1)

    return X, y
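
The six `.loc` assignments that build `NEW_TENURE_YEAR` can also be written as a single binning step. The following is a minimal sketch using `pd.cut` on a made-up tenure series; the bin edges and labels mirror the ones in the function above, while the variable names are assumptions for illustration.

```python
import pandas as pd

# Made-up tenure values covering every bin boundary.
tenure = pd.Series([0, 1, 12, 13, 34, 45, 60, 61, 72], name="tenure")

bins = [0, 12, 24, 36, 48, 60, 72]
labels = ["0-1 Year", "1-2 Year", "2-3 Year", "3-4 Year", "4-5 Year", "5-6 Year"]

# Intervals are right-closed: (0, 12], (12, 24], ...
# include_lowest=True makes the first interval also contain 0.
tenure_year = pd.cut(tenure, bins=bins, labels=labels, include_lowest=True)

print(pd.DataFrame({"tenure": tenure, "NEW_TENURE_YEAR": tenure_year}))
```

Note that `pd.cut` returns a categorical column rather than plain strings, so it would need a cast (for example `.astype(object)`) before being passed to code that detects categorical columns by object dtype, as `grab_col_names` does.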

In [None]:
original_df = pd.read_csv("/content/telco_customer_churn.csv")

In [None]:
original_X, y = dataproprocessing(original_df)

In [None]:
original_X.shape

(7043, 50)

In [None]:
new_customers_processed.shape

(10, 35)
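
The difference between 50 and 35 columns is what one would expect when `pd.get_dummies` is applied to a small batch that does not contain every category seen in training. A small sketch on made-up data (`full` and `sample` are illustrative frames) shows the effect:

```python
import pandas as pd

# Illustrative "training" frame with all three contract types.
full = pd.DataFrame({"Contract": ["Month-to-month", "One year", "Two year", "One year"]})

# A small batch that happens to contain no "Two year" customers.
sample = full.iloc[:2].copy()

full_dummies = pd.get_dummies(full, columns=["Contract"], drop_first=True, dtype=int)
sample_dummies = pd.get_dummies(sample, columns=["Contract"], drop_first=True, dtype=int)

# Columns the model saw in training but that the batch cannot produce.
missing = sorted(set(full_dummies.columns) - set(sample_dummies.columns))

print(list(full_dummies.columns))
print(list(sample_dummies.columns))
print(missing)
```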

In [None]:
def compare_columns(original_df, new_df):
    columns_original_df = set(original_df.columns)
    columns_new_df = set(new_df.columns)
    only_in_original_df = columns_original_df - columns_new_df
    only_in_new_df = columns_new_df - columns_original_df
    return list(only_in_original_df), list(only_in_new_df)

In [None]:
only_in_original, only_in_new_df = compare_columns(original_X, new_customers_processed)

In [None]:
only_in_original

['NEW_TOTALSERVICES_1',
 'TECHSUPPORT_NO_INTERNET_SERVICE',
 'ONLINESECURITY_NO_INTERNET_SERVICE',
 'NEW_TOTALSERVICES_7',
 'INTERNETSERVICE_NO',
 'ONLINEBACKUP_NO_INTERNET_SERVICE',
 'DEVICEPROTECTION_NO_INTERNET_SERVICE',
 'NEW_TOTALSERVICES_6',
 'NEW_TOTALSERVICES_2',
 'STREAMINGMOVIES_NO_INTERNET_SERVICE',
 'CONTRACT_TWO_YEAR',
 'STREAMINGTV_NO_INTERNET_SERVICE',
 'NEW_NOPROT_1',
 'NEW_TENURE_YEAR_4-5_YEAR',
 'SENIORCITIZEN_1']

In [None]:
only_in_new_df

[]

In [None]:
len(only_in_original)

15

In [None]:
original_X[only_in_original].head()

Unnamed: 0,NEW_TOTALSERVICES_1,TECHSUPPORT_NO_INTERNET_SERVICE,ONLINESECURITY_NO_INTERNET_SERVICE,NEW_TOTALSERVICES_7,INTERNETSERVICE_NO,ONLINEBACKUP_NO_INTERNET_SERVICE,DEVICEPROTECTION_NO_INTERNET_SERVICE,NEW_TOTALSERVICES_6,NEW_TOTALSERVICES_2,STREAMINGMOVIES_NO_INTERNET_SERVICE,CONTRACT_TWO_YEAR,STREAMINGTV_NO_INTERNET_SERVICE,NEW_NOPROT_1,NEW_TENURE_YEAR_4-5_YEAR,SENIORCITIZEN_1
0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
4,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0


In [None]:
new_customers_processed

Unnamed: 0,TENURE,MONTHLYCHARGES,TOTALCHARGES,NEW_AVG_CHARGES,NEW_INCREASE,NEW_AVG_SERVICE_FEE,GENDER_MALE,PARTNER_YES,DEPENDENTS_YES,PHONESERVICE_YES,...,NEW_TENURE_YEAR_2-3_YEAR,NEW_TENURE_YEAR_3-4_YEAR,NEW_TENURE_YEAR_5-6_YEAR,NEW_ENGAGED_1,NEW_YOUNG_NOT_ENGAGED_1,NEW_TOTALSERVICES_3,NEW_TOTALSERVICES_4,NEW_TOTALSERVICES_5,NEW_FLAG_ANY_STREAMING_1,NEW_FLAG_AUTOPAYMENT_1
0,0.0,0.001332,0.0,0.0,0.0,0.17558,0,1,0,0,...,0,0,0,0,1,0,0,0,0,0
1,0.540984,0.362425,0.537766,0.433472,0.891878,0.14783,1,0,0,1,...,1,0,0,1,0,1,0,0,0,0
2,0.016393,0.321119,0.022642,0.234433,0.337384,0.116549,1,0,0,1,...,0,0,0,0,1,1,0,0,0,0
3,0.721311,0.167222,0.523669,0.278448,0.888021,0.0,1,0,0,0,...,0,1,0,1,0,1,0,0,0,1
4,0.016393,0.545636,0.035222,0.395345,0.428056,1.0,0,0,0,1,...,0,0,0,0,1,0,0,0,0,0
5,0.114754,0.931379,0.228637,0.846084,0.826014,0.377598,0,0,0,1,...,0,0,0,0,1,0,1,0,1,0
6,0.344262,0.790806,0.555088,0.774948,0.898453,0.47225,1,0,1,1,...,0,0,0,0,1,1,0,0,1,1
7,0.147541,0.0,0.07867,0.138944,0.84128,0.173562,0,0,0,0,...,0,0,0,0,1,0,0,0,0,0
8,0.442623,1.0,0.872213,1.0,1.0,0.27817,0,1,0,1,...,1,0,0,0,1,0,0,1,1,0
9,1.0,0.351765,1.0,0.448771,0.967652,0.139758,1,0,1,1,...,0,0,1,1,0,1,0,0,0,1


In [None]:
for col in only_in_original:
    if col not in new_customers_processed.columns:
        new_customers_processed[col] = 0
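
The loop above appends the missing columns at the end of the frame, so the column order can end up different from the order used in training. If the model consumes its inputs by position, the order matters as much as the column count. One possible alternative, sketched here on made-up data (`training_columns` and `new_X` are illustrative names), is `DataFrame.reindex`, which adds the missing columns and restores the training order in a single step:

```python
import pandas as pd

# Column order as it might have been saved at training time (illustrative).
training_columns = ["TENURE", "CONTRACT_ONE_YEAR", "CONTRACT_TWO_YEAR", "SENIORCITIZEN_1"]

# New batch: fewer columns, and in a different order.
new_X = pd.DataFrame({"CONTRACT_ONE_YEAR": [0, 1], "TENURE": [0.0, 0.54]})

# Add absent columns filled with 0 and put every column in training order.
aligned = new_X.reindex(columns=training_columns, fill_value=0)

print(aligned)
```

`reindex` would also drop any column that is not in `training_columns`, which may or may not be desirable depending on how unseen categories should be handled.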

In [None]:
pd.set_option('display.max_columns', None)

In [None]:
new_customers_processed

Unnamed: 0,TENURE,MONTHLYCHARGES,TOTALCHARGES,NEW_AVG_CHARGES,NEW_INCREASE,NEW_AVG_SERVICE_FEE,GENDER_MALE,PARTNER_YES,DEPENDENTS_YES,PHONESERVICE_YES,MULTIPLELINES_NO_PHONE_SERVICE,MULTIPLELINES_YES,INTERNETSERVICE_FIBER_OPTIC,ONLINESECURITY_YES,ONLINEBACKUP_YES,DEVICEPROTECTION_YES,TECHSUPPORT_YES,STREAMINGTV_YES,STREAMINGMOVIES_YES,CONTRACT_ONE_YEAR,PAPERLESSBILLING_YES,PAYMENTMETHOD_CREDIT_CARD_(AUTOMATIC),PAYMENTMETHOD_ELECTRONIC_CHECK,PAYMENTMETHOD_MAILED_CHECK,NEW_TENURE_YEAR_1-2_YEAR,NEW_TENURE_YEAR_2-3_YEAR,NEW_TENURE_YEAR_3-4_YEAR,NEW_TENURE_YEAR_5-6_YEAR,NEW_ENGAGED_1,NEW_YOUNG_NOT_ENGAGED_1,NEW_TOTALSERVICES_3,NEW_TOTALSERVICES_4,NEW_TOTALSERVICES_5,NEW_FLAG_ANY_STREAMING_1,NEW_FLAG_AUTOPAYMENT_1,NEW_TOTALSERVICES_1,TECHSUPPORT_NO_INTERNET_SERVICE,ONLINESECURITY_NO_INTERNET_SERVICE,NEW_TOTALSERVICES_7,INTERNETSERVICE_NO,ONLINEBACKUP_NO_INTERNET_SERVICE,DEVICEPROTECTION_NO_INTERNET_SERVICE,NEW_TOTALSERVICES_6,NEW_TOTALSERVICES_2,STREAMINGMOVIES_NO_INTERNET_SERVICE,CONTRACT_TWO_YEAR,STREAMINGTV_NO_INTERNET_SERVICE,NEW_NOPROT_1,NEW_TENURE_YEAR_4-5_YEAR,SENIORCITIZEN_1
0,0.0,0.001332,0.0,0.0,0.0,0.17558,0,1,0,0,1,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0.540984,0.362425,0.537766,0.433472,0.891878,0.14783,1,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,0,1,0,1,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0.016393,0.321119,0.022642,0.234433,0.337384,0.116549,1,0,0,1,0,0,0,1,1,0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0.721311,0.167222,0.523669,0.278448,0.888021,0.0,1,0,0,0,1,0,0,1,0,1,1,0,0,1,0,0,0,0,0,0,1,0,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0.016393,0.545636,0.035222,0.395345,0.428056,1.0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,0.114754,0.931379,0.228637,0.846084,0.826014,0.377598,0,0,0,1,0,1,1,0,0,1,0,1,1,0,1,0,1,0,0,0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,0.344262,0.790806,0.555088,0.774948,0.898453,0.47225,1,0,1,1,0,1,1,0,1,0,0,1,0,0,1,1,0,0,1,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,0.147541,0.0,0.07867,0.138944,0.84128,0.173562,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,0.442623,1.0,0.872213,1.0,1.0,0.27817,0,1,0,1,0,1,1,0,0,1,1,1,1,0,1,0,1,0,0,1,0,0,0,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,1.0,0.351765,1.0,0.448771,0.967652,0.139758,1,0,1,1,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,1,1,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [None]:
new_customers_processed.shape

(10, 50)

In [None]:
original_X.shape

(7043, 50)

In [None]:
loaded_final_tuned_model.predict(new_customers_processed)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 129ms/step


array([[0.38229764],
       [0.4074176 ],
       [0.3631357 ],
       [0.36696312],
       [0.40483233],
       [0.39115614],
       [0.32096267],
       [0.37224138],
       [0.38878006],
       [0.3204678 ]], dtype=float32)
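
Assuming the model's single output is a churn probability (for example from a sigmoid unit, which this section does not show), the array above can be turned into class labels by applying a cutoff. The sketch below uses a threshold of 0.5 purely as an assumption; in practice the cutoff would be chosen from the precision and recall trade-off.

```python
import numpy as np

# Probabilities copied from the prediction output above.
probabilities = np.array([
    [0.38229764], [0.4074176], [0.3631357], [0.36696312], [0.40483233],
    [0.39115614], [0.32096267], [0.37224138], [0.38878006], [0.3204678],
], dtype=np.float32)

threshold = 0.5  # assumed cutoff, not taken from the notebook

# Flatten the (10, 1) array and mark rows at or above the threshold as churn.
labels = (probabilities.ravel() >= threshold).astype(int)

print(labels)
```

With this cutoff every customer in the batch is labeled as non-churn, since all ten probabilities fall below 0.5.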

# Let's wrap it up

In [None]:
# !pip uninstall tf-keras
# !pip install keras-tuner
# !pip install tensorflow==2.16.1

In [None]:
import keras
import tensorflow as tf
print("Keras Current Version:", keras.__version__, "Tensorflow Current Version:", tf.__version__)

Keras Current Version: 3.3.3 Tensorflow Current Version: 2.16.1


In [None]:
import numpy as np
import pandas as pd

import random
from joblib import dump, load

from sklearn.preprocessing import MinMaxScaler

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import load_model

def grab_col_names(dataframe, cat_th=10, car_th=20):
    cat_cols = [col for col in dataframe.columns if dataframe[col].dtypes == "O"]
    num_but_cat = [col for col in dataframe.columns if dataframe[col].nunique() < cat_th and dataframe[col].dtypes != "O"]
    cat_but_car = [col for col in dataframe.columns if dataframe[col].nunique() > car_th and dataframe[col].dtypes == "O"]
    cat_cols = cat_cols + num_but_cat
    cat_cols = [col for col in cat_cols if col not in cat_but_car]
    num_cols = [col for col in dataframe.columns if dataframe[col].dtypes != "O"]
    num_cols = [col for col in num_cols if col not in num_but_cat]
    return cat_cols, num_cols, cat_but_car

In [None]:
def data_proprocessing_new(dataframe):

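    # Assign the result back instead of using a chained in-place fillna,
    # which newer pandas versions may not apply to the original frame.
    dataframe["TotalCharges"] = dataframe["TotalCharges"].fillna(dataframe["TotalCharges"].median())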

    # feature engineering
    dataframe.loc[(dataframe["tenure"] >= 0) & (dataframe["tenure"] <= 12), "NEW_TENURE_YEAR"] = "0-1 Year"
    dataframe.loc[(dataframe["tenure"] > 12) & (dataframe["tenure"] <= 24), "NEW_TENURE_YEAR"] = "1-2 Year"
    dataframe.loc[(dataframe["tenure"] > 24) & (dataframe["tenure"] <= 36), "NEW_TENURE_YEAR"] = "2-3 Year"
    dataframe.loc[(dataframe["tenure"] > 36) & (dataframe["tenure"] <= 48), "NEW_TENURE_YEAR"] = "3-4 Year"
    dataframe.loc[(dataframe["tenure"] > 48) & (dataframe["tenure"] <= 60), "NEW_TENURE_YEAR"] = "4-5 Year"
    dataframe.loc[(dataframe["tenure"] > 60) & (dataframe["tenure"] <= 72), "NEW_TENURE_YEAR"] = "5-6 Year"

    dataframe["NEW_Engaged"] = dataframe["Contract"].apply(lambda x: 1 if x in ["One year", "Two year"] else 0)

    dataframe["NEW_noProt"] = dataframe.apply(lambda x: 1 if (x["OnlineBackup"] != "Yes") or (x["DeviceProtection"] != "Yes") or (
                x["TechSupport"] != "Yes") else 0, axis=1)

    dataframe["NEW_Young_Not_Engaged"] = dataframe.apply(lambda x: 1 if (x["NEW_Engaged"] == 0) and (x["SeniorCitizen"] == 0) else 0,
                                          axis=1)

    dataframe['NEW_TotalServices'] = (dataframe[['PhoneService', 'InternetService', 'OnlineSecurity',
                                  'OnlineBackup', 'DeviceProtection', 'TechSupport',
                                  'StreamingTV', 'StreamingMovies']] == 'Yes').sum(axis=1)

    dataframe["NEW_FLAG_ANY_STREAMING"] = dataframe.apply(
        lambda x: 1 if (x["StreamingTV"] == "Yes") or (x["StreamingMovies"] == "Yes") else 0, axis=1)

    dataframe["NEW_FLAG_AutoPayment"] = dataframe["PaymentMethod"].apply(
        lambda x: 1 if x in ["Bank transfer (automatic)", "Credit card (automatic)"] else 0)

    dataframe["NEW_AVG_Charges"] = dataframe["TotalCharges"] / (dataframe["tenure"] + 1)

    dataframe["NEW_Increase"] = dataframe["NEW_AVG_Charges"] / dataframe["MonthlyCharges"]

    dataframe["NEW_AVG_Service_Fee"] = dataframe["MonthlyCharges"] / (dataframe['NEW_TotalServices'] + 1)

    cat_cols, num_cols, cat_but_car = grab_col_names(dataframe)

    cat_cols.remove("Churn")

    dataframe = pd.get_dummies(dataframe, columns=cat_cols, drop_first=True, dtype=int)

    scaler = MinMaxScaler()

    dataframe[num_cols] = scaler.fit_transform(dataframe[num_cols])


    dataframe.columns = [col.replace(' ', '_').upper() for col in dataframe.columns]

    y = dataframe["CHURN"]
    X = dataframe.drop(["CHURN", "CUSTOMERID"], axis=1)

    dump(scaler, 'scaler.joblib')
    dump(X.columns, 'original_col_names.joblib')

    return X, y

In [None]:
original_df = pd.read_csv("/content/telco_customer_churn.csv")

original_X, y = data_proprocessing_new(original_df)

In [None]:
original_X.head()

Unnamed: 0,TENURE,MONTHLYCHARGES,TOTALCHARGES,NEW_AVG_CHARGES,NEW_INCREASE,NEW_AVG_SERVICE_FEE,GENDER_MALE,PARTNER_YES,DEPENDENTS_YES,PHONESERVICE_YES,...,NEW_YOUNG_NOT_ENGAGED_1,NEW_TOTALSERVICES_1,NEW_TOTALSERVICES_2,NEW_TOTALSERVICES_3,NEW_TOTALSERVICES_4,NEW_TOTALSERVICES_5,NEW_TOTALSERVICES_6,NEW_TOTALSERVICES_7,NEW_FLAG_ANY_STREAMING_1,NEW_FLAG_AUTOPAYMENT_1
0,0.013889,0.115423,0.001275,0.004136,0.000412,0.207096,0,1,0,0,...,1,1,0,0,0,0,0,0,0,0
1,0.472222,0.385075,0.215867,0.032272,0.006769,0.184406,1,0,0,1,...,0,0,0,1,0,0,0,0,0,0
2,0.027778,0.354229,0.01031,0.019352,0.002817,0.158828,1,0,0,1,...,1,0,0,1,0,0,0,0,0,0
3,0.625,0.239303,0.210241,0.022209,0.006742,0.063531,1,0,0,0,...,0,0,0,1,0,0,0,0,0,1
4,0.027778,0.521891,0.01533,0.029797,0.003463,0.881188,0,0,0,1,...,1,1,0,0,0,0,0,0,0,0


In [None]:
scaler = load('scaler.joblib')

original_col_names = load('/content/original_col_names.joblib')

loaded_final_tuned_model = load_model("/content/final_tuned_all_data_model.keras", compile=False)


In [None]:
len(original_col_names)

50

In [None]:
def data_proprocess_prediction_new(dataframe, col_names, scaler):

    # feature engineering
    dataframe.loc[(dataframe["tenure"] >= 0) & (dataframe["tenure"] <= 12), "NEW_TENURE_YEAR"] = "0-1 Year"
    dataframe.loc[(dataframe["tenure"] > 12) & (dataframe["tenure"] <= 24), "NEW_TENURE_YEAR"] = "1-2 Year"
    dataframe.loc[(dataframe["tenure"] > 24) & (dataframe["tenure"] <= 36), "NEW_TENURE_YEAR"] = "2-3 Year"
    dataframe.loc[(dataframe["tenure"] > 36) & (dataframe["tenure"] <= 48), "NEW_TENURE_YEAR"] = "3-4 Year"
    dataframe.loc[(dataframe["tenure"] > 48) & (dataframe["tenure"] <= 60), "NEW_TENURE_YEAR"] = "4-5 Year"
    dataframe.loc[(dataframe["tenure"] > 60) & (dataframe["tenure"] <= 72), "NEW_TENURE_YEAR"] = "5-6 Year"

    dataframe["NEW_Engaged"] = dataframe["Contract"].apply(lambda x: 1 if x in ["One year", "Two year"] else 0)

    dataframe["NEW_noProt"] = dataframe.apply(lambda x: 1 if (x["OnlineBackup"] != "Yes") or (x["DeviceProtection"] != "Yes") or (
                x["TechSupport"] != "Yes") else 0, axis=1)

    dataframe["NEW_Young_Not_Engaged"] = dataframe.apply(lambda x: 1 if (x["NEW_Engaged"] == 0) and (x["SeniorCitizen"] == 0) else 0,
                                          axis=1)

    dataframe['NEW_TotalServices'] = (dataframe[['PhoneService', 'InternetService', 'OnlineSecurity',
                                  'OnlineBackup', 'DeviceProtection', 'TechSupport',
                                  'StreamingTV', 'StreamingMovies']] == 'Yes').sum(axis=1)

    dataframe["NEW_FLAG_ANY_STREAMING"] = dataframe.apply(
        lambda x: 1 if (x["StreamingTV"] == "Yes") or (x["StreamingMovies"] == "Yes") else 0, axis=1)

    dataframe["NEW_FLAG_AutoPayment"] = dataframe["PaymentMethod"].apply(
        lambda x: 1 if x in ["Bank transfer (automatic)", "Credit card (automatic)"] else 0)

    dataframe["NEW_AVG_Charges"] = dataframe["TotalCharges"] / (dataframe["tenure"] + 1)

    dataframe["NEW_Increase"] = dataframe["NEW_AVG_Charges"] / dataframe["MonthlyCharges"]

    dataframe["NEW_AVG_Service_Fee"] = dataframe["MonthlyCharges"] / (dataframe['NEW_TotalServices'] + 1)

    cat_cols, num_cols, cat_but_car = grab_col_names(dataframe, cat_th=5)

    cat_cols.remove("customerID")

    dataframe = pd.get_dummies(dataframe, columns=cat_cols, drop_first=True, dtype=int)

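    # NOTE: fit_transform refits the loaded scaler on the new customers, so the
    # training-time min/max are not reused; scaler.transform would keep them.
    dataframe[num_cols] = scaler.fit_transform(dataframe[num_cols])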

    dataframe.columns = [col.replace(' ', '_').upper() for col in dataframe.columns]

    X = dataframe.drop(["CUSTOMERID"], axis=1)

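    # Compare the number of engineered features with the number of training
    # features (scaler.n_features_in_ only counts the scaled numeric columns).
    if len(col_names) != X.shape[1]:
        print("sizes are different")
        # Missing dummy columns are appended at the end, so the resulting
        # column order can differ from col_names.
        for col in col_names:
            if col not in X.columns:
                X[col] = 0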

    return X

In [None]:
new_customers_df = pd.read_csv("/content/new_customers.csv")

In [None]:
new_customers_processed = data_proprocess_prediction_new(new_customers_df, original_col_names, scaler)

sizes are different


In [None]:
new_customers_processed.head()

Unnamed: 0,TENURE,MONTHLYCHARGES,TOTALCHARGES,NEW_AVG_CHARGES,NEW_INCREASE,NEW_AVG_SERVICE_FEE,GENDER_MALE,PARTNER_YES,DEPENDENTS_YES,PHONESERVICE_YES,...,STREAMINGTV_NO_INTERNET_SERVICE,STREAMINGMOVIES_NO_INTERNET_SERVICE,CONTRACT_TWO_YEAR,NEW_TENURE_YEAR_4-5_YEAR,SENIORCITIZEN_1,NEW_NOPROT_1,NEW_TOTALSERVICES_1,NEW_TOTALSERVICES_2,NEW_TOTALSERVICES_6,NEW_TOTALSERVICES_7
0,0.0,0.001332,0.0,0.0,0.0,0.17558,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0.540984,0.362425,0.537766,0.433472,0.891878,0.14783,1,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2,0.016393,0.321119,0.022642,0.234433,0.337384,0.116549,1,0,0,1,...,0,0,0,0,0,0,0,0,0,0
3,0.721311,0.167222,0.523669,0.278448,0.888021,0.0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0.016393,0.545636,0.035222,0.395345,0.428056,1.0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0


In [None]:
new_customers_processed.shape

(10, 50)

In [None]:
original_X.shape

(7043, 50)

In [None]:
loaded_final_tuned_model.predict(new_customers_processed)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 133ms/step


array([[0.38229764],
       [0.4074176 ],
       [0.3631357 ],
       [0.36696312],
       [0.40483233],
       [0.39115614],
       [0.32096267],
       [0.37224138],
       [0.38878006],
       [0.3204678 ]], dtype=float32)

In [None]:
# Presumably the actual churn labels of these 10 customers, for comparison:
# 0, 0, 1, 0, 1, 1, 0, 0, 1, 0