# Industry Accelerators - Customer Churn Prediction

### Introduction

The Customer Churn Prediction accelerator includes a structured glossary of more than 100 business terms and a set of sample data science assets. The glossary provides the information architecture that you need to understand why customers leave. Your data scientists can use the sample notebooks, predictive models, and dashboards to accelerate data preparation, machine learning modeling, and data reporting. With these assets you can estimate the likelihood that a customer churns, or that their funds under management drop by a specified threshold or more in a month, and analyze the business metrics that influence churn based on temporal data.

<p>
    <img src="../misc/images/acceleratorWorkflow.png" alt="Service details" style="height: 500px;" align="center" />
    <br style="clear: both;" />
</p>

## Inventory of Artifacts provided

### Knowledge Catalog

Described in the Knowledge Center: https://www.ibm.com/support/knowledgecenter/en/SSQNUZ

### Sample Datasets

The sample input datasets are:

* **'customer.csv'** : Customer Data, Demographic data, Temporal data. 
* **'account.csv'** : Account type and Account Information Data, Investment Information, Temporal data. 
* **'customer summary'** : Detailed Customer Transaction Data, Business Metrics, Investment and Income Stats.

The goal is to generate a single dataset that serves as input for model training and scoring. To build it, the three datasets above first need to be merged on the following data fields:
* **customer.customer_id, customer.effective_date, customer_summary.customer_id, customer_summary.end_date, account.primary_customer_id, account.open_date and account.close_date.**
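
As a rough illustration of how these fields could line up, the sketch below joins the three inputs in plain Python. The join conditions (the latest customer record effective on or before the summary `end_date`, and accounts counted as open on `end_date`) are assumptions made for illustration only, as are the sample rows and the `build_history` helper. The SQL shipped with the accelerator remains the authoritative definition of the merge.

```python
from datetime import date

# Hypothetical sample rows; only the join field names come from this README.
customers = [
    {"customer_id": 1, "effective_date": date(2018, 1, 1)},
    {"customer_id": 2, "effective_date": date(2018, 3, 1)},
]
summaries = [
    {"customer_id": 1, "end_date": date(2018, 1, 31)},
    {"customer_id": 1, "end_date": date(2018, 2, 28)},
    {"customer_id": 2, "end_date": date(2018, 3, 31)},
]
accounts = [
    {"primary_customer_id": 1, "open_date": date(2018, 1, 10), "close_date": None},
    {"primary_customer_id": 1, "open_date": date(2018, 2, 15), "close_date": None},
    {"primary_customer_id": 2, "open_date": date(2018, 3, 5), "close_date": date(2018, 3, 20)},
]


def build_history(customers, summaries, accounts):
    """Build one row per customer per summary period (assumed join logic)."""
    history = []
    for s in summaries:
        # Assumption: use the latest customer record effective on or before end_date.
        candidates = [
            c for c in customers
            if c["customer_id"] == s["customer_id"]
            and c["effective_date"] <= s["end_date"]
        ]
        if not candidates:
            continue
        cust = max(candidates, key=lambda c: c["effective_date"])
        # Assumption: an account is open if it was opened on or before end_date
        # and has no close_date on or before end_date.
        open_accounts = [
            a for a in accounts
            if a["primary_customer_id"] == s["customer_id"]
            and a["open_date"] <= s["end_date"]
            and (a["close_date"] is None or a["close_date"] > s["end_date"])
        ]
        history.append({**cust, **s, "num_open_accounts": len(open_accounts)})
    return history


history = build_history(customers, summaries, accounts)
```

Each row in `history` carries the customer fields, the summary period, and a count of accounts open at the end of that period.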

The merged dataset is ***'customer_history.csv'***. This CSV is typically the starting point for data preparation. Given the list of transactions that each customer experiences, the script transforms this long-form dataset into a wide format with one record per customer, which can be used for modeling.
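
The long-to-wide step can be pictured with a minimal sketch like the one below. It is not the accelerator's implementation: the `month` and `funds_under_management` column names, the chosen aggregates, and the `to_wide` helper are all hypothetical, picked only to show how many rows per customer collapse into one.

```python
from collections import defaultdict

# Hypothetical long-form rows: one row per customer per month.
# Column names other than customer_id are illustrative, not the real schema.
long_rows = [
    {"customer_id": 1, "month": "2018-01", "funds_under_management": 1000.0},
    {"customer_id": 1, "month": "2018-02", "funds_under_management": 1200.0},
    {"customer_id": 2, "month": "2018-01", "funds_under_management": 500.0},
]


def to_wide(rows):
    """Collapse long-form rows into one summary record per customer."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["customer_id"]].append(row)

    wide = []
    for customer_id, group in sorted(grouped.items()):
        group.sort(key=lambda r: r["month"])  # YYYY-MM strings sort chronologically
        values = [r["funds_under_management"] for r in group]
        wide.append({
            "customer_id": customer_id,
            "months_observed": len(group),
            "fum_latest": values[-1],
            "fum_mean": sum(values) / len(values),
        })
    return wide


wide = to_wide(long_rows)
```

The result has exactly one record per `customer_id`, which is the shape a typical classifier expects as training input.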

To generate 'customer_history.csv', use the SQL query under **Scripts > sql > CUSTOMER_HISTORY_VIEW.sql**.

### Notebooks

* **1-model-train** : Loads the data, prepares and cleans it for model training, runs correlation analysis, builds ML models, performs exploratory data analysis and data visualization, selects the best-performing ML model, and saves it to ICP4D.
* **2-model-score** : Operationalizes the models: tests the scoring pipeline as a web service, releases the project, and deploys the scoring pipeline as a web service with a model scoring REST API endpoint.


### Scripts

The following scripts are called from the notebooks mentioned above:

* **churn_prep.py** : Called from the 1-model-train notebook. The script performs the data preparation and generates the dataset that is used for modeling. It takes a wide-form dataset with customer details, customer summaries over time, and aggregate account statistics, filters it to include only the relevant columns, cleans the data, and produces the modeling dataset. The script also stamps each customer with the target variable: whether they have churned or not. A customer has churned if their status becomes ‘Inactive’ or if their funds under management drop by a specified percentage or more in a month.

* **Churn_Scoring_Pipeline.py** : Called from the 2-model-score notebook. Loads the models, executes the model scoring, generates the customer churn predictions, extracts the highest-impact features, and collates the data to be used for the dashboard.
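
The churn rule described for churn_prep.py can be sketched as a small function. This is an illustrative reading of the rule, not the script's actual code: the `has_churned` name, its arguments, and the 25% default threshold are assumptions, and the real script may compute the monthly drop differently.

```python
def has_churned(status, fum_by_month, drop_threshold=0.25):
    """Return True if the customer is Inactive or FUM fell by >= threshold in a month.

    status: customer status string, for example "Active" or "Inactive".
    fum_by_month: funds-under-management values in chronological order.
    drop_threshold: fractional month-over-month drop; 0.25 is an illustrative default.
    """
    if status == "Inactive":
        return True
    # Compare each month with the one before it.
    for previous, current in zip(fum_by_month, fum_by_month[1:]):
        if previous > 0 and (previous - current) / previous >= drop_threshold:
            return True
    return False


# Hypothetical customers for illustration.
examples = {
    "steady": has_churned("Active", [1000.0, 980.0, 990.0]),
    "big_drop": has_churned("Active", [1000.0, 700.0, 710.0]),
    "inactive": has_churned("Inactive", [1000.0, 1000.0]),
}
```

Here `steady` is not flagged because its largest monthly drop is 2%, `big_drop` is flagged by the 30% fall in its second month, and `inactive` is flagged by status alone.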

### RStudio

* **Dashboard View** : Shows top action clients, monthly customer churn, and customer churn risk level. Provides a search option to look up client activity by Customer ID.
* **Client View** : Focuses on an individual client. Shows the top business metrics and account details, provides an option to run the model scoring web service, predicts customer churn, and visualizes the influential factors and data fields.


### Sequence of steps to run

* Click notebooks, open 1-model-train, and execute it step by step.
* Click notebooks, open 2-model-score, and execute it step by step.
* Click RStudio and, under the **Shiny** subgroup, click CustomerChurnDashboard. Run the Shiny app by clicking the Run App button (from the app.R file).
  **OR**
  If you deployed the app from the asset tab, launch the dashboard by clicking the Shiny dashboard under the deployments tab in the Project Release.

**This project contains Sample Materials, provided under license. <br>
Licensed Materials - Property of IBM. <br>
© Copyright IBM Corp. 2019. All Rights Reserved. <br>
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.<br>**