In the ever-evolving landscape of financial services, lending institutions face the critical challenge of mitigating the risks associated with loan defaults. Accurately predicting which clients are likely to default on their loans is essential for maintaining financial stability and supporting the growth of both the institutions and the broader economy. By precisely identifying high-risk borrowers, banks and other lenders can make better lending decisions and reduce the frequency of non-performing loans. Using statistical and machine learning methods, this project seeks to build a reliable predictive model for loan default.
My primary objectives for this project are threefold:
- Develop a Classification Model: My goal is to produce a trustworthy and accurate classification model that can forecast a client's likelihood of loan default. By applying a range of machine learning methods to financial data, I aim to create a model that financial institutions can rely on to make well-informed lending decisions.
- Feature Significance Recommendations: My objective goes beyond merely forecasting loan defaults; I also want to give financial institutions useful information about the factors that most strongly affect a client's probability of default. By focusing on the most predictive variables when evaluating loan applications, institutions can use these insights to improve their loan application and approval procedures.
- Deployment to API and Web Application: To ensure the model's practical utility, I will integrate the final classification model into a user-friendly web application built with Streamlit and expose it as an API. This implementation will enable real-time prediction and smooth integration into financial institutions' existing operations.
My goals with this initiative are to lower loan risks, improve the decision-making capabilities of financial institutions, and strengthen the stability and effectiveness of the financial industry as a whole.
The dataset comprises over 200 observations and includes 16 input features and a target variable. The columns are:
Age: The age of the loan applicant. Age can provide insights into the applicant's financial maturity and stability.
Income: The annual income of the applicant. Higher income generally indicates a greater ability to repay loans.
LoanAmount: The amount of the loan requested by the applicant. Larger loan amounts might pose a higher risk of default.
CreditScore: A numerical representation of the applicant's creditworthiness. Higher credit scores are typically associated with lower default risks.
MonthsEmployed: The number of months the applicant has been employed at their current job. Longer employment durations often suggest job and income stability.
NumCreditLines: The number of open credit lines the applicant has. This can indicate the applicant's experience with managing credit.
InterestRate: The interest rate applied to the loan. Higher interest rates can increase the repayment burden, potentially leading to defaults.
LoanTerm: The duration over which the loan is to be repaid. Longer loan terms can result in lower monthly payments but may increase the total interest paid.
DTIRatio: The Debt-to-Income (DTI) ratio, representing the proportion of the applicant's income that goes towards debt payments. Higher DTI ratios can signal higher default risks.
Education: The highest level of education attained by the applicant. Education level can correlate with income potential and financial literacy.
EmploymentType: The type of employment (e.g., full-time, part-time, self-employed). Different employment types can influence income stability.
MaritalStatus: The marital status of the applicant. Married applicants might have dual incomes, affecting their repayment capability.
HasMortgage: This indicates whether the applicant currently has a mortgage. Existing large debts might impact the ability to repay additional loans.
HasDependents: A binary indicator of whether the applicant has dependents. Having dependents can increase living expenses and financial strain.
LoanPurpose: The stated purpose of the loan (e.g., business, home purchase, automobile, education). Different purposes might carry different levels of risk.
HasCoSigner: This indicates whether the loan applicant has a co-signer. Having a co-signer can reduce the risk of default.
Default: This column is a binary indicator of whether the applicant defaulted on the loan (1 for default, 0 for no default). This is the target variable for prediction.
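To make the column roles concrete, the data dictionary above can be written down as a small schema. The sketch below is illustrative only: the grouping into numeric, categorical, and binary columns is my reading of the descriptions, and the helper name `split_features_target` is hypothetical. How the binary columns are actually encoded in the raw file (for example Yes/No versus 1/0) is not stated here, so the sketch does not assume either.

```python
# Column groups inferred from the data dictionary (an assumption, not a
# confirmed schema of the raw file).
NUMERIC_FEATURES = [
    "Age", "Income", "LoanAmount", "CreditScore", "MonthsEmployed",
    "NumCreditLines", "InterestRate", "LoanTerm", "DTIRatio",
]
CATEGORICAL_FEATURES = ["Education", "EmploymentType", "MaritalStatus", "LoanPurpose"]
BINARY_FEATURES = ["HasMortgage", "HasDependents", "HasCoSigner"]

INPUT_FEATURES = NUMERIC_FEATURES + CATEGORICAL_FEATURES + BINARY_FEATURES
TARGET = "Default"


def split_features_target(record: dict) -> tuple[dict, int]:
    """Split one loan record into (input features, target label).

    Hypothetical helper: raises KeyError if any expected column is missing.
    """
    missing = [name for name in INPUT_FEATURES + [TARGET] if name not in record]
    if missing:
        raise KeyError(f"missing columns: {missing}")
    features = {name: record[name] for name in INPUT_FEATURES}
    return features, int(record[TARGET])
```

Calling `split_features_target` on a dictionary holding one row of the dataset returns the 16 input features and the 0/1 `Default` label separately, which is the shape most classifiers expect.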
I used the following tools and technologies for this project:
● Python
● Flask API
● Postman
● Git
● Render
● Heroku
● Streamlit
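As a rough illustration of how Flask from the list above could serve predictions, here is a minimal sketch. It is not the project's actual service: the `/predict` route, the response keys, and the 0.5 decision threshold are assumptions, and `predict_default_probability` is a hypothetical placeholder standing in for the trained classifier.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# The 16 input columns described in the dataset section.
REQUIRED_FIELDS = [
    "Age", "Income", "LoanAmount", "CreditScore", "MonthsEmployed",
    "NumCreditLines", "InterestRate", "LoanTerm", "DTIRatio",
    "Education", "EmploymentType", "MaritalStatus", "HasMortgage",
    "HasDependents", "LoanPurpose", "HasCoSigner",
]


def predict_default_probability(features: dict) -> float:
    """Hypothetical placeholder for the trained model.

    A real service would load the fitted classifier once at startup and
    return its predicted probability of default for these features.
    """
    return 0.5  # constant stand-in, not a real prediction


@app.route("/predict", methods=["POST"])  # route name is an assumption
def predict():
    payload = request.get_json(silent=True) or {}
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        # Reject incomplete applications with a clear error message.
        return jsonify({"error": "missing fields", "fields": missing}), 400
    probability = predict_default_probability(payload)
    return jsonify({
        "default_probability": probability,
        "default": int(probability >= 0.5),  # assumed decision threshold
    })
```

A client such as Postman would then send a POST request with a JSON body containing the 16 fields and receive the predicted label and probability in the response.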