# COGS 118A - Final Project

# Credit Risk Assessment Using Machine Learning Techniques and an Ensemble Approach

## Group members

- Jason Liang
- Tasnia Jamal
- Thanh Derek Nguyen
- Wilson Chen

# Abstract 

Creditability assessments help financial institutions decide whether to grant credit. An individual's credit standing typically depends on many factors, and predicting it accurately requires nuanced computational methods, especially in a turbulent market where lenders must seriously consider whether a person is worthy of a loan. The goal of this research project is to predict an individual's creditability from variables such as duration of credit, account balance, occupation, and other personal financial statistics. Our workflow consists of cleaning and normalizing the data, using it to build supervised machine learning models such as linear regression, logistic regression, decision trees, and K-nearest neighbors, and then using the resulting performance and error metrics to conclude which model is best for classifying a person's credit standing.

# Background

In finance, credit scoring is a way of analyzing statistical data that financial organizations and banks use to assess a person's creditworthiness. The score plays a significant role in determining whether a person qualifies for a loan from the bank. The process of credit scoring is normally automated using machine learning techniques and involves evaluating various factors related to a person's or entity's credit history and financial behavior to determine the likelihood that they will repay debts and fulfill financial obligations <a name="abdou"></a>[<sup>[1]</sup>](#abdounote).

The primary purpose of credit scoring is to help lenders make informed decisions about extending credit to potential borrowers, allowing them to evaluate the risk associated with lending money to an individual or business. This allows them to determine the terms and conditions of the pending credit offer, including interest rates and credit limits. In common credit scoring models, the following are some of the main criteria used when calculating a score:

1.) Payment history: This factor assesses the borrower's track record of making payments on time, including any late or missed payments.

2.) Credit utilization: This measures the amount of credit a borrower is using in relation to their available credit limits. Higher credit utilization may negatively impact the overall credit score.

3.) Length of credit history: The length of time the borrower has been using credit is taken into account. A longer credit history is typically viewed positively.

4.) Credit mix: This factor considers the borrower's mix of credit accounts, such as credit cards, mortgages, and loans. Having a diverse credit mix may be seen as positive.

5.) New credit applications: Recent applications for credit or loans may have an impact on the credit score, as multiple applications within a short period can be viewed as a sign of financial instability <a name="xiao"></a>[<sup>[2]</sup>](#xiaonote).

The model then assigns a numerical value to each factor and generates a score (typically via machine learning models) that ranges from a minimum to a maximum value. The most commonly used scoring system is the FICO score, developed by the Fair Isaac Corporation <a name="fico"></a>[<sup>[3]</sup>](#ficonote). It ranges from 300 to 850, with higher scores indicating lower credit risk. With the recent failure of Silicon Valley Bank and the decline of major tech giants in what appears to be a recession, it is more important than ever to develop accurate computational models that assist financial institutions in the decision-making process at the time of a customer's financial request. Today, the credit market demands new tools and technologies that can contribute to this classification.

In our research, we apply machine learning techniques to credit analysis, specifically to predict a person's creditability and, in doing so, to evaluate which variables are most relevant in distinguishing good payers from bad ones. The main question is how to measure the risk of granting credit to individuals with greater accuracy <a name="pinco"></a>[<sup>[4]</sup>](#pinconote). Secondary questions that will help us answer it include which variables define an individual's ability to pay, what makes an individual take credit even when they know they lack the resources to repay it, and how an individual's behavior as a consumer impacts their creditability.

# Problem Statement

The problem is to create a machine learning model that predicts a person's creditability from a set of variables. To build such a model, we need to take into account a range of relevant parameters such as income, age, occupation, and more. The problem is solvable because the critical variables that play a role in an individual's credit standing can be quantified. Credit standing is scored as “Bad” or “Good,” depending on the values of those critical variables. The problem is measurable because credit scoring is a long-established practice: after comparing the predicted credit standing to the true credit standing, we can evaluate the accuracy and efficiency of the machine learning model that we plan to implement.

# Data

This credit dataset is from 1994 and was found in the UCI Machine Learning Repository <a name="dataset"></a>[<sup>[5]</sup>](#datasetnote). It contains 1000 observations with 21 different variables. Note that this dataset consists of data scored under the German credit system, which differs from the FICO credit scoring described in the previous section; however, both procedures share many of the financial statistics used to evaluate a person's creditworthiness.

An observation consists of:
 - Creditability: It has values of 0 or 1 where 0 indicates a person is not credit-worthy and 1 indicates a person is credit-worthy
 - Account Balance: Balance of the current account in German currency (Deutsche Mark)
 - Duration of Credit: Credit duration ranges from <= 6 months to > 54 months
 - Payment Status of Previous Credit: Status of previous credit payments
 - Purpose: Purpose of credit which includes cars, furniture, household appliances, education, business, etc.
 - Credit Amount: Amount of credit in German currency (Deutsche Mark)
 - Value Savings/Stocks: Value of savings or stocks in German currency (Deutsche Mark)
 - Length of current employment: How long a person has been employed ranging from <= 1 year to >= 7 years
 - Installment: Installment in percent of available income
 - Sex & Marital Status: Sex combined with marital status
 - Guarantors: Further debtors or guarantors
 - Duration in Current address: How long a person has been living at their current address
 - Most valuable available asset: Most valuable available asset categories include house, land, life insurance, car, or no asset
 - Age: Age ranges from 0 to >= 65 years old
 - Concurrent Credits: Further running credits at other institutions
 - Type of apartment: Categories include rented flat, owner-occupied flat, or free apartment
 - Number of Credits at this Bank: Number of previous credits at this bank (including the running one)
 - Occupation: Categories include unemployed, unskilled worker, skilled worker, executive
 - Number of dependents: Number of people entitled to maintenance
 - Telephone: It has values of 1 or 2 where 1 indicates a person does not have a telephone and 2 indicates a person has a telephone
 - Foreign Worker: It has values of 1 or 2 where 1 indicates a person is a foreign worker and 2 indicates a person is not a foreign worker

All of the variables listed above are treated as critical variables because each contributes to the scoring of creditability. All of them are categorical variables encoded as numeric codes.
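Because the categorical attributes are stored as integer codes, some models may read an ordering into codes that are purely nominal (Purpose, for example). One common remedy is one-hot encoding. The following is a minimal sketch in plain Python, not the encoding used in this project; the variable name `purpose` and the example codes are illustrative assumptions rather than values taken from the dataset.

```python
def one_hot(values):
    """Map a list of category codes to one-hot rows.

    Returns (categories, rows), where categories is the sorted list of
    distinct codes and each row has a single 1 at the position of its code.
    """
    categories = sorted(set(values))
    index = {code: i for i, code in enumerate(categories)}
    rows = []
    for code in values:
        row = [0] * len(categories)
        row[index[code]] = 1
        rows.append(row)
    return categories, rows


# Hypothetical integer codes for a nominal attribute such as Purpose.
purpose = [0, 3, 2, 0, 9]
categories, encoded = one_hot(purpose)
```

Ordinal attributes, such as the binned duration of credit, can reasonably keep their integer codes, since their order carries meaning.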


# Proposed Solution

Prior literature and experiments have demonstrated that the most successful models for credit score classification are logistic regression, neural networks, decision trees, random forests, and support vector machines due to the complexity of many credit card datasets. 
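To make one of these candidates concrete, here is a minimal sketch of logistic regression trained by batch gradient descent, written from scratch in plain Python. It is meant only to illustrate the idea; the toy data, learning rate, and epoch count are made-up assumptions, and in practice a library implementation would normally be used instead.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample: p(y=1 | x) - y.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, threshold=0.5):
    """Return 1 (credit-worthy) when the predicted probability reaches the threshold."""
    return [
        int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= threshold)
        for xi in X
    ]


# Toy, made-up data: two scaled features per applicant, label 1 = credit-worthy.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.8, 0.9], [0.9, 0.7], [0.7, 0.8]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
preds = predict(X, w, b)
```

The learned weights indicate how strongly each feature pushes a prediction toward "credit-worthy", which is one reason logistic regression is often favored when explainability matters.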

# Evaluation Metrics

In classification, we are concerned with building robust models, which we judge by how well they classify unseen data. Depending on the selected model, we will use accuracy, precision, recall, and F1-score as performance metrics and measures such as the false positive rate as error metrics. We prioritize accuracy and F1-score.
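These metrics all derive from the four confusion-matrix counts (true positives, true negatives, false positives, false negatives). The sketch below computes them in plain Python under the assumption that label 1 (credit-worthy) is the positive class; the function name and the example labels are illustrative, not results from this project.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, F1 and false positive rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Report 0.0 when a metric is undefined (empty denominator).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "false_positive_rate": ratio(fp, fp + tn),
    }


# Made-up labels for illustration only.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

In a credit setting, a false positive under this convention is an applicant predicted credit-worthy who is not, which is why the false positive rate is tracked alongside accuracy and F1-score.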

# Results

### Subsection 1

The dataset consists of a mixture of categorical and numeric variables. Pre-processing began with a characterization of the dataset, in which lower-significance items were removed and numerical values were categorized. The most relevant attributes for credit risk evaluation were selected using Forward Stepwise Regression in the WEKA tool<a name="weka"></a>[<sup>[6]</sup>](#wekanote), which also displayed the comparative gain of every variable. Thus, we were able to easily analyze the effect of each variable on the model's categorization of the data without having to implement multiple models and algorithms on the entire dataset, saving us time.
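Categorizing a numeric value amounts to mapping it to the index of the bin it falls in. The following is a minimal sketch using the standard-library `bisect` module. Only the outer bounds (<= 6 months and > 54 months) come from the data description; the intermediate bin edges and the helper name `to_category` are assumptions made for illustration.

```python
from bisect import bisect_left


def to_category(value, upper_edges):
    """Return the index of the first bin whose upper edge is >= value.

    Values above the last edge fall into a final open-ended bin.
    """
    return bisect_left(upper_edges, value)


# Hypothetical upper edges in months: <=6, <=12, ..., <=54, then >54.
duration_edges = [6, 12, 18, 24, 30, 36, 42, 48, 54]
durations = [6, 7, 24, 54, 60]
codes = [to_category(d, duration_edges) for d in durations]
```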



### Subsection 2

Based on our analysis, we see immediate patterns such as a linear relationship between. This prompts us to do a baseline model 


### Subsection 3

However, our problem requires us to explore all avenues of machine learning models in order to see which one performs best on classifying the data and to uncover any false assumptions or information we may have overlooked during EDA. 


### Subsection 4

Based on these metrics 

### Subsection 5

As an extension to this model to see if we can further improve upon our 


# Discussion

### Interpreting the result


### Limitations


### Ethics & Privacy

Since our project focuses on credit classification, several ethical and privacy concerns must be considered. First, we must consider data privacy and security, since the personal financial data used to classify credit standing is confidential and highly sensitive. If we were given personally identifiable information (which can expose an individual when combined, e.g., birthday, SSN, and address), we would have to drop these variables or encrypt them so they are not visible during the EDA process. Such variables are usually provided to a third-party financial institution that performs background checks to ensure accurate credit scoring, and we will not be using them in our application. The data we collected does not contain this information; instead, it contains less sensitive information such as occupation and account balance. Even in combination, such data can rarely be used to identify an individual customer. Regardless, we would ask for consent if we were to collect this data in real life. Beyond that, the dataset we use does not expose customers' personal information to us or to potential adversaries.

Another concern is the bias that may emerge from an imbalance in our data collection or from biased decision-making during the data labeling process. We plan to address this concern by building models that are fair and unbiased, while actively working to identify and mitigate any unfair impacts. For instance, there are significantly more foreign workers than non-foreign workers in our data set, so we may either exclude this variable or build a training and test split that represents both groups fairly. Lastly, we might face concerns regarding transparency, so we plan to make sure that the models used for classification are transparent and explainable. Some machine learning algorithms that we use in our work, such as deep learning models, are often considered black-box models, making it challenging to interpret their decision-making process. Credit risk assessment requires transparency and explainability, as financial institutions need to understand the reasons behind credit risk predictions to comply with regulations and provide justifications for their decisions.
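One way to keep an imbalanced group represented in both the training and test sets is a stratified split, which samples the same fraction from each group. The sketch below is one possible approach in plain Python, not necessarily the procedure used in this project; the function name, the `foreign_worker` field, and the 90/10 group sizes are made-up assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, test_fraction=0.2, seed=0):
    """Split rows so each group defined by key(row) keeps roughly the
    same share in the training and test sets."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)

    rng = random.Random(seed)
    train, test = [], []
    for members in groups.values():
        members = members[:]  # copy so the caller's ordering is untouched
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Made-up rows: 90 with foreign_worker=1 and 10 with foreign_worker=2.
rows = [{"id": i, "foreign_worker": 1 if i < 90 else 2} for i in range(100)]
train, test = stratified_split(rows, key=lambda r: r["foreign_worker"])
```

A stratified split keeps the minority group present in the test set so its error rate can be measured, but it does not by itself make a model fair.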

### Conclusion

Modern global markets are full of risks and 

# Footnotes
<a name="abdounote"></a>1.[^](#abdou): Abdou, HAH and Pointon, J. (2011). Credit scoring, statistical techniques and evaluation criteria: A review of the literature, Intelligent Systems in Accounting, Finance and Management. 18 (2-3), pp. 59-88.<br>
<a name="xiaonote"></a>2.[^](#xiao): X.-L. Li and Y. Zhong. (2012). "An overview of personal credit scoring: Techniques and future work," International Journal of Intelligence Science, vol. 2, no. 4, pp. 181–189.<br>
<a name="ficonote"></a>3.[^](#fico): https://www.fico.com/en/products/fico-score.<br>
<a name="pinconote"></a>4.[^](#pinco): M. Pincovsky, A. Falcão, W. N. Nunes, A. Paula Furtado and R. C. L. V. Cunha, "Machine Learning applied to credit analysis: a Systematic Literature Review," 2021 16th Iberian Conference on Information Systems and Technologies (CISTI), Chaves, Portugal, 2021, pp. 1-5, doi: 10.23919/CISTI52073.2021.9476350.<br>
<a name="datasetnote"></a>5.[^](#dataset): https://online.stat.psu.edu/stat857/node/222/<br>
<a name="wekanote"></a>6.[^](#weka): https://www.cs.waikato.ac.nz/ml/weka/<br>

# Extra Credit

We strongly believe we deserve extra credit on this project.