Loan default prediction is one of the most critical problems faced by financial institutions, because it has a significant effect on their profitability. In recent years, the volume of non-performing loans has increased sharply, which jeopardizes the growth of these institutions. To maintain a healthy portfolio, banks therefore put stringent monitoring and evaluation measures in place to ensure timely repayment of loans by borrowers. Despite these measures, a major proportion of loans become delinquent. Delinquency occurs when a borrower misses a payment on their loan. Given information such as mortgage details, borrower details, and payment details, our objective is to predict the delinquency status of each loan for the next month, given its delinquency status (in number of months) for the previous 12 months.
The attached PowerPoint presentation explains in detail the approach taken to solve this use case.
The banking terms used here are also explained in the presentation.
This project gave me the most opportunity so far to experiment with binary classification, and it has given me the confidence to tackle other classification problems. I experimented with:
- different feature processing:
  - Normalization vs. Standardization
  - Dropping columns and checking the effect on the model
  - One-Hot-Encode vs. LabelEncode
- different methods of sampling the imbalanced dataset:
  - RandomUnderSampler
  - RandomOverSampler
  - SMOTE Oversampling
  - ADASYN Oversampling
- different train and validation split methods:
  - Hold-Out Method
  - StratifiedKFold
- and different models:
  - Logistic Regression
  - Decision Trees
  - Random Forest
  - XGBoost
  - AdaBoost
  - LightGBM
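
To make a few of the ideas in the list above concrete, here is a minimal sketch of normalization versus standardization, random oversampling, and a stratified k-fold split. It is not the project's actual code. It uses only the Python standard library, so `normalize`, `standardize`, `random_oversample`, and `stratified_kfold` are hand-written illustrative helpers rather than the library classes named in the list, and the toy `labels` and `balances` data are made up for the example.

```python
import random
import statistics
from collections import defaultdict


def normalize(values):
    """Min-max normalization: rescale values to the [0, 1] range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def standardize(values):
    """Standardization: shift to zero mean and scale to unit variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std if std else 0.0 for v in values]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def stratified_kfold(labels, n_splits=5, seed=0):
    """Yield (train_idx, val_idx) pairs with similar class proportions in every fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(n_splits)]
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # Deal the shuffled indices of each class round-robin across the folds.
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % n_splits].append(idx)
    for k in range(n_splits):
        val_idx = sorted(folds[k])
        train_idx = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train_idx, val_idx


# Hypothetical toy data: 90 non-delinquent (0) and 10 delinquent (1) loans.
labels = [0] * 90 + [1] * 10
balances = [float(1000 + 50 * i) for i in range(100)]  # made-up feature values

scaled_01 = normalize(balances)   # values now lie in [0, 1]
scaled_z = standardize(balances)  # values now have mean 0 and unit variance

for train_idx, val_idx in stratified_kfold(labels, n_splits=5):
    # Resample only the training part so the validation fold keeps
    # the original class imbalance.
    train_rows = [[balances[i]] for i in train_idx]
    train_labels = [labels[i] for i in train_idx]
    bal_rows, bal_labels = random_oversample(train_rows, train_labels)
    # A classifier would be fitted on (bal_rows, bal_labels) and scored on val_idx here.
```

The resampling step sits inside the fold loop on purpose: oversampling before the split would let copies of the same loan land in both the training and validation sets, which tends to inflate validation scores.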