This project is an end-to-end machine learning solution designed to help financial institutions minimize credit risk. By analyzing historical loan application data, the system identifies patterns associated with defaults, allowing lenders to make data-driven decisions on whether to approve or reject a loan.
The dataset includes the following features for each applicant:
- Personal: Gender, Marital Status, Dependents, Education.
- Financial: Applicant Income, Co-applicant Income, Loan Amount, Loan Term.
- Credit: Credit History (0 or 1), Property Area (Urban/Semiurban/Rural).
- Target: `Status` (Y = Approved/Repaid, N = Default/Rejected).
- Data Preprocessing: Handling missing values and encoding categorical text into numerical format.
- Exploratory Data Analysis (EDA): Visualizing the relationship between credit history and loan approval.
- Feature Selection: Dropping non-predictive columns like `Loan_ID`.
- Model Training:
  - Logistic Regression: Baseline statistical model.
  - Random Forest: Advanced ensemble model for higher accuracy.
- Evaluation: Comparing models using accuracy scores and confusion matrices.
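
The preprocessing, EDA, and feature-selection steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the tiny DataFrame is synthetic, the column names (`Loan_Amount`, `Credit_History`, `Property_Area`, and so on) are assumed spellings of the features listed earlier, and mode/median imputation with one-hot encoding is just one reasonable choice.

```python
import numpy as np
import pandas as pd

# Synthetic sample for illustration only; names and values are assumptions,
# not rows from the project's dataset.
df = pd.DataFrame({
    "Loan_ID": ["LP001", "LP002", "LP003", "LP004", "LP005", "LP006"],
    "Gender": ["Male", "Female", None, "Male", "Female", "Male"],
    "Property_Area": ["Urban", "Rural", "Semiurban", "Urban", "Rural", "Urban"],
    "Loan_Amount": [120.0, np.nan, 66.0, 150.0, 95.0, 200.0],
    "Credit_History": [1.0, 0.0, 1.0, np.nan, 0.0, 1.0],
    "Status": ["Y", "N", "Y", "Y", "N", "N"],
})


def preprocess(frame: pd.DataFrame) -> pd.DataFrame:
    """Drop the ID column, impute missing values, and encode text as numbers."""
    # Feature selection: the loan identifier carries no predictive signal.
    out = frame.drop(columns=["Loan_ID"])

    # Categorical and binary columns: fill gaps with the most frequent value.
    for col in ["Gender", "Property_Area", "Credit_History"]:
        out[col] = out[col].fillna(out[col].mode()[0])

    # Continuous column: fill gaps with the median, which is robust to outliers.
    out["Loan_Amount"] = out["Loan_Amount"].fillna(out["Loan_Amount"].median())

    # Encode the target as 1/0 and one-hot encode the text features.
    out["Status"] = out["Status"].map({"Y": 1, "N": 0})
    return pd.get_dummies(out, columns=["Gender", "Property_Area"], drop_first=True)


clean = preprocess(df)

# EDA: share of each outcome within each credit-history group.
approval_by_credit = pd.crosstab(
    clean["Credit_History"], clean["Status"], normalize="index"
)
print(approval_by_credit)
```

The crosstab gives the numbers behind a credit-history bar chart; passing the same columns to a plotting call such as `seaborn.countplot` would produce the visual version.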
- Language: Python
- Libraries:
  - `pandas`, `numpy`: Data wrangling
  - `matplotlib`, `seaborn`: Visualization
  - `scikit-learn`: Machine Learning
| Model | Accuracy | Suitability |
|---|---|---|
| Logistic Regression | ~80% | High interpretability |
| Random Forest | ~85% | Better at catching complex patterns |
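
A comparison like the one in the table could be produced along these lines. This is a hedged sketch on synthetic data: the feature names, the 15% label noise, the 75/25 split, and the hyperparameters are all assumptions made for illustration, so the scores it prints are not comparable to the figures reported above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic features (assumed names); a real run would use the cleaned dataset.
rng = np.random.default_rng(42)
n = 400
X = pd.DataFrame({
    "Credit_History": rng.integers(0, 2, size=n),
    "Applicant_Income": rng.normal(5000, 1500, size=n),
    "Loan_Amount": rng.normal(140, 40, size=n),
})

# Synthetic target: approval follows credit history, with 15% of labels flipped.
flipped = rng.random(n) < 0.15
y = np.where(flipped, 1 - X["Credit_History"], X["Credit_History"])

# Hold out 25% of rows, keeping the class balance the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

models = {
    # Scaling helps logistic regression converge on features of different magnitude.
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        # Rows are true classes, columns are predicted classes, ordered [0, 1].
        "confusion_matrix": confusion_matrix(y_test, pred, labels=[0, 1]),
    }
    print(f"{name}: accuracy={results[name]['accuracy']:.3f}")
    print(results[name]["confusion_matrix"])
```

Reading the confusion matrix alongside accuracy matters for credit risk: the top-right cell counts applicants with a negative outcome that the model would have approved, which is typically the costlier error for a lender.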