
Leangonplu/Default_Prediction_for_Loan_Data


Default Prediction for Loan


Business problem:

Banks face significant losses when customers default on their loans, which in turn hurts the country's economic growth. To address this problem, a data scientist is needed to analyze loan data in depth, identify the factors that influence defaults, and continuously monitor the status of the loans. The objective of this project is to develop a risk mitigation plan that gives the institution ongoing control over the status of the credits it has granted, so that banks can contact borrowers early and minimize losses. To achieve this goal, we will use several data science techniques: logistic regression, decision tree, random forest, and XGBoost. The project is treated as a binary classification problem, in which we aim to predict whether or not a client will default on their loan. The results of this analysis will be used to inform decision making and mitigate bank and investor losses, while promoting economic growth.

The objective is to identify the models with the best ACCURACY, that is, the metric that measures the overall proportion of correct predictions: the number of cases classified correctly (true positives and true negatives) divided by the total number of cases.
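As a minimal sketch of the accuracy definition above, the following computes the share of correct predictions from two label lists. The function name `accuracy` and the example labels are hypothetical and chosen for illustration only; they are not taken from the project's data or code.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    # A prediction is correct when it is a true positive or a true negative.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels: 1 = default, 0 = no default.
example_true = [0, 0, 1, 1, 0, 1, 0, 0]
example_pred = [0, 0, 1, 0, 0, 1, 1, 0]

# 6 of the 8 predictions agree with the true labels.
print(accuracy(example_true, example_pred))  # 0.75
```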

Finally, as we will observe throughout this project, accuracy is a general performance metric that evaluates the overall correctness of the model. To maximize true positives and minimize false negatives, it is necessary to consider other metrics as well, such as sensitivity and specificity. Choosing the right metric depends on the goal and context of the classification problem.
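To illustrate why accuracy alone can be misleading, here is a small sketch that computes sensitivity and specificity from confusion-matrix counts. The helper names and the imbalanced dataset (95 non-defaults, 5 defaults) are assumptions made for this example and do not come from the project's data.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1  # a real default the model missed
        elif p == positive:
            fp += 1  # a good loan flagged as a default
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp, fn):
    """True positive rate: share of actual defaults that were detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn, fp):
    """True negative rate: share of non-defaults correctly identified."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Hypothetical imbalanced data: 95 non-defaults (0) and 5 defaults (1).
imb_true = [0] * 95 + [1] * 5
# A trivial model that always predicts "no default".
imb_pred = [0] * 100

tp, fp, tn, fn = confusion_counts(imb_true, imb_pred)
acc = (tp + tn) / len(imb_true)

print(acc)                  # 0.95
print(sensitivity(tp, fn))  # 0.0
print(specificity(tn, fp))  # 1.0
```

In this hypothetical case the model reaches 95% accuracy while detecting none of the defaults, which is the situation sensitivity is meant to expose.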