Muhammadhidayatullahaspar/Credit-Risk-Classification

Project Background

The Credit Profiling System project aims to help financial institutions, such as banks and financing companies, assess the credit risk of their customers. This matters because lending is one of the main sources of income for these institutions, so they need to analyze the credit risk of potential customers before granting credit.

To perform credit risk analysis, financial institutions must evaluate the financial capability and eligibility of potential customers. To do this, they need accurate and complete data on customers' credit history, income, employment, and assets. This data can be very large and complex, especially if the financial institution has many customers.

Big data can therefore help financial institutions assess the credit risk of their customers. However, such data often contains errors and gaps (dirty data) that must be cleaned before use. Further processing, such as feature engineering and feature scaling, is also needed to improve data quality and model accuracy.
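As an illustration of the feature scaling step mentioned above, here is a minimal sketch of two common scaling methods, using only the Python standard library. The README does not say which scaler the project uses, so both functions and the `annual_income` values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: nothing to spread out
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale a numeric column to zero mean and unit variance (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical income column, not taken from the project's dataset.
annual_income = [30000, 45000, 60000, 120000]
scaled_income = min_max_scale(annual_income)
z_income = standardize(annual_income)
```

Min-max scaling keeps the original ordering and bounds every value between 0 and 1, while standardization is often preferred when a feature has outliers that would otherwise compress the rest of the range.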

After the data is cleaned and processed, the next step is modeling with machine learning techniques. At this stage, the label (target) variable must be defined and the data preprocessed so that it is ready for the machine learning models. The model is then tested and evaluated, with exploratory data analysis and visual analysis used to assess its performance.

Once the model is built, it can be deployed with Flask on a localhost server. This lets financial institutions use it in their day-to-day work and speeds up the credit risk analysis process.
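A Flask deployment of this kind could look roughly like the sketch below. It is not the project's actual application: the `/predict` route, the JSON field `dti`, and the `predict_bad_probability` function are hypothetical stand-ins, and the scoring rule is a placeholder for the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_bad_probability(features):
    """Hypothetical stand-in for the trained model.

    A real deployment would load the fitted classifier and call it here;
    this placeholder simply maps a higher DTI to a higher risk score.
    """
    dti = float(features.get("dti", 0.0))
    return min(max(dti / 40.0, 0.0), 1.0)


@app.route("/predict", methods=["POST"])
def predict():
    # Read the applicant's features from the JSON request body.
    features = request.get_json(silent=True)
    if not isinstance(features, dict):
        features = {}
    prob = predict_bad_probability(features)
    label = "bad credit" if prob >= 0.5 else "good credit"
    return jsonify({"probability_bad": prob, "result": label})


# To serve on localhost, call: app.run(host="127.0.0.1", port=5000)
```

Keeping the scoring logic in its own function makes it easy to swap the placeholder for the real model without touching the route.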

Deployment

credit risk profiling

If the result is good credit, the following display is produced:

good result

If the result is bad credit, the following display is produced:

bad result

Conclusion

The analysis for classifying good and bad credit shows that most borrowers have good credit grades, the most common home ownership status is 'MORTGAGE', the most common loan purpose is debt consolidation, and most borrowers have a low debt-to-income ratio. The data is also imbalanced, which can distort the analysis, so SMOTE oversampling was applied to balance it. The resulting model has high accuracy on both the training and testing data and validates well, with kc 0.95, ks 0.93, and a cross-validation score of 0.92. The model can therefore be used to predict good or bad credit with a high level of accuracy.
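The idea behind SMOTE oversampling can be sketched in a few lines: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function below is a simplified illustration written for this README, not the implementation used in the project, and the `bad_loans` points are hypothetical.

```python
import math
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to base.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class rows: (scaled DTI, scaled utilisation).
bad_loans = [(0.30, 0.80), (0.35, 0.90), (0.32, 0.70), (0.40, 0.85)]
new_points = smote_sketch(bad_loans, n_new=6)
```

In practice a library implementation, such as the `SMOTE` class in imbalanced-learn, would normally be used instead. Oversampling is usually applied to the training split only, so that the test data keeps the original class balance.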

Some other things that can be done for this project include:


  • Loan amount distribution: High loan amounts, which may carry higher credit risk, should be analyzed further by looking at the debt-to-income ratio and the monthly payments borrowers have to make.

  • Credit grade distribution: This analysis can help identify the different borrower risk profiles and determine an appropriate credit strategy for each one.

  • Distribution of home ownership: Further analysis can be done to understand how home ownership affects credit risk. This can help determine appropriate credit strategies for different borrower risk profiles.

  • Loan purpose distribution: Loans with the purpose 'other', 'major purchase', 'small business', 'car', 'medical', 'moving', 'vacation', 'wedding', 'house', 'educational', or 'renewable energy' should be monitored, because each has far fewer loans than the top three loan purposes.

  • DTI distribution: Further analysis is needed to understand how the debt-to-income (DTI) ratio affects credit risk and whether specific DTI values carry a higher credit risk.

  • Bad flag distribution: SMOTE oversampling can help balance the data and improve model performance on the minority class.

  • Model evaluation: Regular model testing is needed to ensure consistent model performance and alignment with analysis objectives. Additionally, further analysis can be done regarding factors that affect model performance, such as selecting appropriate features or using more complex algorithms to improve model accuracy.
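One way to start the DTI analysis suggested in the list above is to compare the share of bad loans across DTI buckets. The sketch below assumes each loan is a `(dti, bad_flag)` pair with `bad_flag` equal to 1 for bad credit. The bucket edges and sample rows are hypothetical and not taken from the project's dataset.

```python
from collections import defaultdict


def bad_rate_by_dti_bucket(loans, edges=(10, 20, 30)):
    """Return {bucket_label: share of bad loans} for (dti, bad_flag) pairs.

    A label such as "10-20" covers 10 <= dti < 20; the last bucket is
    open-ended and collects every loan at or above the final edge.
    """
    counts = defaultdict(lambda: [0, 0])  # label -> [bad count, total count]
    for dti, bad_flag in loans:
        label = f">= {edges[-1]}"
        lower = 0
        for edge in edges:
            if dti < edge:
                label = f"{lower}-{edge}"
                break
            lower = edge
        counts[label][0] += bad_flag
        counts[label][1] += 1
    return {label: bad / total for label, (bad, total) in counts.items()}


# Hypothetical sample rows: (DTI in percent, bad_flag).
loans = [(5, 0), (8, 0), (12, 0), (18, 1), (25, 1), (27, 0), (33, 1), (35, 1)]
rates = bad_rate_by_dti_bucket(loans)
```

A bucket whose bad rate is clearly above the overall rate would be a candidate threshold for closer review.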

About

This end-to-end project develops a classification model to predict loan credit risk. Debt consolidation was the most common loan purpose, and good credit was prevalent among borrowers. Imbalanced data was addressed with SMOTE oversampling. The model achieved high accuracy and validation scores, and the project closes with suggestions on variable importance and model evaluation.
