## 5.3.2 Neural Networks
In this section, we implement an Estimation model that estimates the likelihood that a given customer will churn, based on the customer's demographic and transaction profiles.

To demonstrate Estimation modeling while reusing the same churn dataset, we need to transform the categorical target attribute values ('yes' or 'no') into numeric values (1 or 0). This encoding is not entirely precise, but the values 1 and 0 can serve as a rough churn estimate.
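As a minimal sketch of this transformation, the snippet below encodes a small, made-up `Churn` column. The toy DataFrame and the `ChurnNumeric` column name are assumptions for illustration only, and `Series.map()` is just one of several ways to do the conversion.

```python
import pandas as pd

# Hypothetical toy data standing in for the 'Churn' column of the real dataset
toy = pd.DataFrame({'Churn': ['yes', 'no', 'no', 'yes']})

# Map the two categories to 1.0 / 0.0; any other value would become NaN
toy['ChurnNumeric'] = toy['Churn'].map({'yes': 1.0, 'no': 0.0})

print(toy)
print(toy['ChurnNumeric'].dtype)
```

The new column holds floating-point values, which is what a regressor expects as its target.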

The following Python code shows an example implementation of the data modeling phase that solves the Estimation problem using the Multi-layer Perceptron Regressor (the `MLPRegressor()` class) from scikit-learn's neural network module. The comments embedded in the code explain the rationale of the programming logic.

In [1]:
# import necessary libraries
import warnings
warnings.filterwarnings('ignore')
import pandas as pd
from sklearn.model_selection import train_test_split 
from sklearn import metrics 
from sklearn.neural_network import MLPRegressor 

# access dataset from source
df = pd.read_csv('data/ChurnFinal.csv')

# convert data types to numeric/float for Estimation modeling
df.loc[df['Churn'] == 'yes', 'Churn'] = 1
df.loc[df['Churn'] == 'no', 'Churn'] = 0
df['Churn'] = pd.to_numeric(df['Churn'], errors='coerce').astype('float')

# specify inputs and label
df_inputs = pd.get_dummies(df[['Gender', 'Age', 'PostalCode', 'Cash', 'CreditCard', 
            'Cheque', 'SinceLastTrx', 'SqrtTotal', 'SqrtMax', 'SqrtMin']])
df_label = df['Churn']

# split the dataset into training and testing sets
# (random_state fixes the random seed so that the split is reproducible)
X_train,X_test,Y_train,Y_test = train_test_split(df_inputs, df_label, 
                test_size=0.2, random_state=1)

# feature scaling
from sklearn.preprocessing import StandardScaler  
scaler = StandardScaler()  
scaler.fit(X_train)                       
X_train = scaler.transform(X_train)  
X_test = scaler.transform(X_test)      

# create a MLPRegressor object 
mlp = MLPRegressor(
    hidden_layer_sizes=(100,),  activation='relu', solver='adam', alpha=1e-05, 
    batch_size='auto', learning_rate='constant', learning_rate_init=0.0001, 
    power_t=0.5, max_iter=1000, shuffle=True, random_state=0, tol=0.0001, 
    verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True,
    early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999, 
    epsilon=1e-08)

# train the model using train data set
mlp.fit(X_train, Y_train)

#apply the created model using test data set 
y_predict = mlp.predict(X_test)   

# assess the model performance using mean squared error
print('Mean Squared Error(MSE): ', 
            round(metrics.mean_squared_error(Y_test, y_predict),3))

Mean Squared Error(MSE):  0.207


Running all of the code above prints the model performance on the console: Mean Squared Error(MSE):  0.207. MSE is the average of the squared differences between the estimated churn values and the actual values (1 or 0) in the test set, so a lower MSE means the estimates are closer to the actual outcomes.
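To make the MSE figure easier to interpret, the sketch below computes MSE by hand on a few made-up values and compares it with a naive baseline that always predicts the mean of the labels. The arrays `y_true` and `y_pred` are hypothetical and do not come from the churn dataset.

```python
import numpy as np
from sklearn import metrics

# Hypothetical actual labels (1 = churn, 0 = no churn) and model estimates
y_true = np.array([1.0, 0.0, 0.0, 1.0, 0.0])
y_pred = np.array([0.8, 0.3, 0.1, 0.4, 0.6])

# MSE by hand: the mean of the squared differences
mse_manual = np.mean((y_true - y_pred) ** 2)

# The same quantity computed by scikit-learn
mse_sklearn = metrics.mean_squared_error(y_true, y_pred)

# Naive baseline: always predict the mean of the labels
baseline = np.full_like(y_true, y_true.mean())
mse_baseline = metrics.mean_squared_error(y_true, baseline)

print('Manual MSE:  ', round(mse_manual, 3))
print('sklearn MSE: ', round(mse_sklearn, 3))
print('Baseline MSE:', round(mse_baseline, 3))
```

For a 0/1 target, the baseline MSE equals p(1 - p), where p is the share of churners. Comparing a model's MSE with this baseline gives a rough sense of how much the model improves on a constant guess.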


NOTE: Besides MSE, we will examine more measures of model performance in Week 6. 

For a detailed explanation of the `MLPRegressor()` parameters, refer to the official scikit-learn documentation: https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html#sklearn.neural_network.MLPRegressor
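Many of the parameters passed to `MLPRegressor()` in the code above simply restate library defaults. As a hedged sketch, the snippet below uses `get_params()` to list which settings differ from the defaults. The result depends on the installed scikit-learn version, so treat the printed dictionary as indicative. The names `explicit`, `default` and `changed` are introduced here for illustration only.

```python
from sklearn.neural_network import MLPRegressor

# The fully spelled-out configuration used in this section
explicit = MLPRegressor(
    hidden_layer_sizes=(100,), activation='relu', solver='adam', alpha=1e-05,
    batch_size='auto', learning_rate='constant', learning_rate_init=0.0001,
    power_t=0.5, max_iter=1000, shuffle=True, random_state=0, tol=0.0001,
    verbose=False, warm_start=False, momentum=0.9, nesterovs_momentum=True,
    early_stopping=False, validation_fraction=0.1, beta_1=0.9, beta_2=0.999,
    epsilon=1e-08)

# An estimator built with the library defaults only
default = MLPRegressor()

# Keep only the parameters whose values differ from the defaults
default_params = default.get_params()
changed = {name: value for name, value in explicit.get_params().items()
           if default_params[name] != value}

print(changed)
```

With recent scikit-learn releases, this should list only `alpha`, `learning_rate_init`, `max_iter` and `random_state`, so a shorter call that sets just those four arguments would configure an equivalent model.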