# Quickstart - Customer Churn Full Suite Model Documentation

This interactive notebook will guide you through documenting a model using the ValidMind Developer framework. We will use sample datasets provided by the library and train a simple classification model.

For this simple demonstration, we will use the following bank customer churn dataset from Kaggle: https://www.kaggle.com/code/kmalit/bank-customer-churn-prediction/data.

We will train a sample model and demonstrate the following documentation functionalities:

- Initializing the ValidMind Developer Framework
- Using a sample dataset provided by the library to train a simple classification model
- Running a test suite to quickly generate documentation about the data and model

## Before Starting (Important)

Click File > Save a copy in Drive to make your own copy in Google Drive so that you can modify the notebook.

Alternatively, you can download the notebook source and work with it in your own developer environment.



## Install ValidMind Developer Framework


In [None]:
!pip install validmind

*Note: Colab may generate the following warning after running the first cell*:

```
WARNING [...]
You must restart the runtime in order to use newly installed versions
```

*If you see this, please click on **"Restart runtime"** and continue with the next cell.*
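
After restarting, it can be useful to confirm that the package is visible to the new runtime. The sketch below uses only the Python standard library; `installed_version` is a hypothetical helper name chosen for this example and is not part of ValidMind.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None
print(installed_version("validmind"))
```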


## Initializing the Python environment


In [None]:
import pandas as pd
import xgboost as xgb

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

%matplotlib inline

## Initializing the ValidMind Client Library

Log in to the ValidMind platform with your registered email address, and navigate to the Documentation Projects page.

### Creating a new Documentation Project 

***(Note: if a documentation project has already been created, you can skip this section and head directly to "Finding the project API key and secret")***

Clicking on "Create a new project" allows you to register a new documentation project for our demo model.

Select "Customer Churn model" from the Model drop-down, and "Initial Validation" as Type. Finally, click on "Create Project".

### Finding the project API key and secret 

In the "Client Integration" page of the newly created project, you will find the initialization code that allows the client library to associate documentation and tests with the appropriate project. The initialization code configures the following arguments: 

* api_host: Location of the ValidMind API.
* api_key: Account API key.
* api_secret: Account Secret key.
* project: The project identifier. The `project` argument is mandatory since it allows the library to associate all data collected with a specific account project.


Copy the code snippet and paste it into the cell below. Running the cell initializes the ValidMind Developer Framework:

In [None]:
# Replace the code below with the code snippet from your project

import validmind as vm

vm.init(
  api_host = "https://api.prod.validmind.ai/api/v1/tracking",
  api_key = "d84fda1911a2cd3711b32d296e33f848",
  api_secret = "e8f5ae23f6afc61368cdd64b3ef546af0b31cd7b78d7eedcd8a312f9c44b9dc2",
  project = "clhowg73e001s1pk10uouvsde"
)
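
If you would rather not paste keys into a notebook that may be shared, one option is to read the four `vm.init` arguments from environment variables. The following is a minimal sketch under that assumption: the helper name `load_init_kwargs` and the `VM_*` variable names are hypothetical and are not defined by ValidMind.

```python
import os

# Hypothetical variable names; pick whatever fits your environment.
ENV_NAMES = {
    "api_host": "VM_API_HOST",
    "api_key": "VM_API_KEY",
    "api_secret": "VM_API_SECRET",
    "project": "VM_API_PROJECT",
}


def load_init_kwargs(env=None):
    """Build the keyword arguments for vm.init() from environment variables."""
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way: both count as missing
    missing = [var for var in ENV_NAMES.values() if not env.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in ENV_NAMES.items()}


# Usage, once the variables are set:
#   import validmind as vm
#   vm.init(**load_init_kwargs())
```

Failing early on a missing variable keeps an incomplete configuration from reaching `vm.init`.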
  
  
  
  
  
  

## Load the Demo Dataset

For the purpose of this demonstration, we will use a sample dataset provided by the ValidMind library. 

In [None]:
# Import the sample dataset from the library
from validmind.datasets.classification import customer_churn as demo_dataset
# You can try a different dataset with:
# from validmind.datasets.classification import taiwan_credit as demo_dataset

df = demo_dataset.load_data()

### Initialize a dataset object for ValidMind

Before running the test suite, we must first initialize a ValidMind dataset object using the `init_dataset` function from the `vm` module. This function takes the following arguments: `dataset`, the dataset that we want to analyze; `target_column`, which identifies the target variable; and `class_labels`, which identifies the labels used for classification model training.

In [None]:
vm_dataset = vm.init_dataset(
    dataset=df,
    target_column=demo_dataset.target_column,
    class_labels=demo_dataset.class_labels
)

## Run the Full Data and Model Validation Test Suite

We will need to preprocess the dataset and produce the training, test and validation splits first.

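### Preprocess the Raw Dataset

For demonstration purposes, we simplify the preprocessing by calling `demo_dataset.preprocess`, which returns the training, validation, and test splits. The cell below then separates features from the target column and trains an XGBoost classifier, using the validation split for early stopping: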

In [None]:
train_df, validation_df, test_df = demo_dataset.preprocess(df)

x_train = train_df.drop(demo_dataset.target_column, axis=1)
y_train = train_df[demo_dataset.target_column]
x_val = validation_df.drop(demo_dataset.target_column, axis=1)
y_val = validation_df[demo_dataset.target_column]

model = xgb.XGBClassifier(early_stopping_rounds=10)
model.set_params(
    eval_metric=["error", "logloss", "auc"],
)
model.fit(
    x_train,
    y_train,
    eval_set=[(x_val, y_val)],
    verbose=False,
)
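
`demo_dataset.preprocess` hides how the three splits are produced. As a rough illustration only, and not a description of the library's internals, a shuffled three-way split can be sketched in plain Python. The function name `three_way_split`, the 60/20/20 proportions, and the seed are all assumptions made for this example.

```python
import random


def three_way_split(rows, val_fraction=0.2, test_fraction=0.2, seed=42):
    """Shuffle rows and cut them into train, validation, and test lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, validation, test


train, validation, test = three_way_split(range(100))
print(len(train), len(validation), len(test))  # 60 20 20
```

The validation split plays the role of the `eval_set` passed to `model.fit` above, while the test split stays untouched until evaluation. With DataFrames, calling scikit-learn's `train_test_split` twice is a common way to get a similar result.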

We can now initialize the training and test datasets as dataset objects using `vm.init_dataset()`:

In [None]:
vm_train_ds = vm.init_dataset(
    dataset=train_df,
    type="generic",
    target_column=demo_dataset.target_column
)

vm_test_ds = vm.init_dataset(
    dataset=test_df,
    type="generic",
    target_column=demo_dataset.target_column
)


We also initialize a model object using `vm.init_model()`:

In [None]:

vm_model = vm.init_model(
    model,
    train_ds=vm_train_ds,
    test_ds=vm_test_ds,
)

### Run the Full Suite

We are now ready to run the test suite for binary classifiers with tabular datasets. The `vm.run_test_suite` function runs test plans on the dataset and model objects and documents the results in the ValidMind UI.

In [None]:
full_suite = vm.run_test_suite(
    "binary_classifier_full_suite",
    dataset=vm_dataset,
    model=vm_model
)

In [None]:
vm.test_plans.list_tests()

In [None]:
vm.test_plans.describe_plan("tabular_data_quality")


You can access and review the resulting documentation in the ValidMind UI, in the "Model Development" section of the model documentation. 