Summary
For any credit card provider, scrutiny of applications is the most important task. A wrongly classified record is a financial loss to the bank. In this project we use the application and credit data available on Kaggle to build a model that predicts whether an applicant is good or bad. To identify good and bad customers, vintage analysis is performed on the available credit data. Several models are fitted: Logistic Regression, Random Forest, XGBoost, CatBoost, and KNN.
The term 'vintage' refers to the month or quarter in which the account was opened (the loan was granted). In simple words, vintage analysis measures the performance of a portfolio at different periods of time after the loan (or credit card) was granted. Performance can be measured as the cumulative charge-off rate, the proportion of customers 30/60/90 days past due (DPD), the utilization ratio, the average balance, etc.
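The idea can be illustrated with a minimal sketch in plain Python. It is not the project's actual code: the toy records, the `(account_id, month, days_past_due)` layout, the 60 DPD threshold for "bad", and the helper names `vintage_table` and `label_accounts` are all assumptions made for illustration. Each account is grouped by the month it was opened (its vintage), and for each vintage the sketch reports the cumulative share of accounts that have reached the bad threshold by each month on book.

```python
from collections import defaultdict

# Toy records: (account_id, month, days_past_due). All values are made up.
records = [
    ("A", 0, 0), ("A", 1, 60), ("A", 2, 90),
    ("B", 0, 0), ("B", 1, 0), ("B", 2, 0),
    ("C", 1, 0), ("C", 2, 0), ("C", 3, 90),
    ("D", 1, 0), ("D", 2, 0), ("D", 3, 0),
]

BAD_DPD = 60  # assumed threshold: 60+ days past due counts as "bad"


def vintage_table(records, bad_dpd=BAD_DPD):
    """Return {vintage: [cumulative bad rate at month-on-book 0, 1, 2, ...]}."""
    by_account = defaultdict(list)
    for acc, month, dpd in records:
        by_account[acc].append((month, dpd))

    first_bad = defaultdict(list)  # vintage -> first bad month-on-book per account
    max_mob = 0
    for rows in by_account.values():
        rows.sort()
        opened = rows[0][0]  # the vintage is the first observed month
        bad_mob = None
        for month, dpd in rows:
            mob = month - opened  # months on book
            max_mob = max(max_mob, mob)
            if dpd >= bad_dpd and bad_mob is None:
                bad_mob = mob
        first_bad[opened].append(bad_mob)

    table = {}
    for vintage, mobs in first_bad.items():
        n = len(mobs)
        table[vintage] = [
            sum(1 for b in mobs if b is not None and b <= m) / n
            for m in range(max_mob + 1)
        ]
    return table


def label_accounts(records, bad_dpd=BAD_DPD):
    """Label an account 1 (bad) if it ever reaches the threshold, else 0 (good)."""
    labels = {}
    for acc, _, dpd in records:
        labels[acc] = max(labels.get(acc, 0), int(dpd >= bad_dpd))
    return labels


if __name__ == "__main__":
    for vintage, rates in sorted(vintage_table(records).items()):
        print(vintage, rates)
    print(label_accounts(records))
```

In this toy data, the accounts opened in month 0 go bad earlier than those opened in month 1, which is the kind of difference between vintages that the analysis is meant to surface. The per-account labels can then serve as the good/bad target for a classifier.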
Data Distribution post Vintage Analysis
Application approval (target) distribution for important features
ROC Curve
Confusion Matrices
Summary
- Every wrongly classified customer is a financial loss to the bank.
- The recall scores of Random Forest (0.45) and CatBoost (0.44) differ only slightly.
- CatBoost runs faster, so we select it as the final model.
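For reference, recall is the share of actual positives that the model catches: TP / (TP + FN). The sketch below computes it from toy labels. It assumes the bad applicant is the positive class (label 1), which the summary does not state; the data and the helper names `confusion_counts` and `recall` are illustrative and do not reproduce the project's results.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t == positive and p != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN); returns 0.0 when there are no actual positives."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred, positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    # Toy labels: 1 = bad applicant (assumed positive class), 0 = good applicant.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
    print(confusion_counts(y_true, y_pred))
    print(recall(y_true, y_pred))
```

Under that assumption, a low recall means many bad applicants are approved, which is the costly error described in the first bullet.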