Here, the CustomerChurnRate dataset by Gang Liu is used to classify whether a customer is going to churn or not. Exploratory data analysis (EDA)
is also performed on the dataset.
While doing this we'll:
- Deal with outliers using the IQR and z-score methods
- Perform feature selection using backward elimination
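
As a rough illustration of the two outlier methods named above, here is a minimal sketch using NumPy. The function names, the sample values, and the cut-offs (1.5 times the IQR, and an absolute z-score of 3) are assumptions chosen for illustration; they are common conventions, not necessarily what the notebook uses.

```python
import numpy as np


def iqr_outlier_mask(values, k=1.5):
    """Return a boolean mask that is True where a value is an IQR outlier.

    A value counts as an outlier if it lies more than k * IQR below the
    first quartile or above the third quartile (k=1.5 is an assumed default).
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return (values < lower) | (values > upper)


def zscore_outlier_mask(values, threshold=3.0):
    """Return a boolean mask that is True where |z-score| exceeds threshold.

    Uses the population standard deviation (NumPy's default, ddof=0).
    threshold=3.0 is an assumed default.
    """
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        # All values are identical, so nothing can be an outlier.
        return np.zeros(values.shape, dtype=bool)
    z = (values - values.mean()) / std
    return np.abs(z) > threshold


# Hypothetical numeric column with one obviously extreme value.
monthly_charges = np.array(
    [20.0, 22.0, 21.0, 23.0, 19.0, 24.0, 22.0, 21.0, 20.0, 23.0, 22.0, 21.0, 250.0]
)

iqr_mask = iqr_outlier_mask(monthly_charges)
z_mask = zscore_outlier_mask(monthly_charges)

# Keep only the rows that the IQR rule does not flag.
cleaned = monthly_charges[~iqr_mask]
```

The IQR rule depends only on quartiles, so a single extreme value barely moves its bounds. The z-score rule depends on the mean and standard deviation, which the outliers themselves inflate, so on small samples it can miss values that the IQR rule catches.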
The notebook is available on Kaggle, so you can work in the same environment in which it was created, i.e. with the same package versions.
Plots produced in the notebook:
- Correlation matrix
- Count plot (how unbalanced the dataset is)
- Cross-validation scores for different models
- Learning curve
- Confusion matrix, without normalization