Loan default prediction is widely used by banks and private lenders around the world to judge whether an applicant is likely to repay a debt, and therefore whether they should be given a loan. Machine learning algorithms are increasingly used for this task because they can produce accurate estimates and help identify the factors that contribute most to those estimates. In this project, I study, analyze and visualize the factors, and the relationships between them, that help determine a loan's interest rate and whether a borrower is a potential loan defaulter.
The major questions I’m trying to answer with this project are:
- Does the loan amount vary with the purpose for which the loan was taken, across two different loan terms?
- Is there a geographical pattern in loan amounts across the United States? If so, which state has the highest number of loan defaulters?
- How does the loan amount vary with the annual income of the borrower?
- Is there a relationship between the loan amount, the purpose of the loan and the type of application for the loan?
- How does the grade of the loan influence its interest rate?
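
Questions like these usually come down to grouped summaries of the loan table. The snippet below is a minimal sketch of that idea using pandas, not the code from this project: the column names (`loan_amnt`, `term`, `purpose`, `grade`, `int_rate`) are assumptions about how the dataset might be laid out, and the six rows are made-up values used only so the example runs on its own.

```python
import pandas as pd

# Toy, made-up rows for illustration only; NOT data or results from this project.
# Column names are assumptions and may differ from the real dataset.
loans = pd.DataFrame({
    "loan_amnt": [10000, 20000, 5000, 15000, 8000, 12000],
    "term": ["36 months", "60 months", "36 months",
             "60 months", "36 months", "36 months"],
    "purpose": ["car", "car", "credit_card",
                "credit_card", "car", "credit_card"],
    "grade": ["A", "C", "A", "B", "B", "C"],
    "int_rate": [6.0, 14.0, 7.0, 10.0, 11.0, 15.0],
})

# Mean loan amount per purpose, with one column per loan term
amount_by_purpose_term = loans.pivot_table(
    index="purpose", columns="term", values="loan_amnt", aggfunc="mean"
)

# Mean interest rate per loan grade
rate_by_grade = loans.groupby("grade")["int_rate"].mean()

print(amount_by_purpose_term)
print(rate_by_grade)
```

The same pattern (group by one or two categorical columns, then aggregate a numeric column) would extend to the other questions, for example grouping by a state column to count defaulters, assuming such a column exists in the data.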
The pre-processing and machine learning code is written in Python and can be found in 'Loan-or-Not/Loan-or-Not.py' or 'Loan-or-Not/Loan-or-Not.ipynb'.