Team of three: Alice, Yaksh, Joe
Language, Tools, Technologies, Libraries: Python, Jupyter Notebook, Pandas, NumPy, Matplotlib
The study uses payment data collected in October 2005 from a major bank (a cash and credit card issuer) in Taiwan; the subjects are the bank's credit card holders. Of the 25,000 observations, 5,529 (22.12%) are cardholders with a default payment.
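As a quick sanity check, the default rate quoted above can be reproduced from the two counts. This is only a minimal sketch; the variable names are illustrative and not taken from the project code.

```python
# Counts quoted in the dataset description.
total_observations = 25_000
default_observations = 5_529

# Share of cardholders with a default payment.
default_rate = default_observations / total_observations
print(f"Default rate: {default_rate:.2%}")  # Default rate: 22.12%
```

The low share of defaulters means the classes are imbalanced, which is worth keeping in mind when reading raw accuracy figures.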
- Investigate and understand customers' default payment behavior in Taiwan.
- Compare the predictive accuracy of data mining methods in estimating the probability of default among these customers.
- The data was randomly divided into two groups: one for training the models and the other for validating them.
- Total: 30,000 records
- Testing: 9,000 records (30%)
- Training: 21,000 records (70%)
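The random split described above could be sketched as follows, using NumPy to shuffle row positions. The seed value and variable names are assumptions made for reproducibility of the example; the original notebook may have split the data differently.

```python
import numpy as np

N_TOTAL = 30_000  # total records
N_TEST = 9_000    # records held out for testing

# The seed is an arbitrary assumption so the example is repeatable.
rng = np.random.default_rng(seed=42)

# Shuffle the row positions 0..N_TOTAL-1, then cut once.
shuffled = rng.permutation(N_TOTAL)
test_idx = shuffled[:N_TEST]
train_idx = shuffled[N_TEST:]

print(len(train_idx), len(test_idx))  # 21000 9000
```

With the data loaded into a Pandas DataFrame (here hypothetically named `df`), the two groups would be `df.iloc[train_idx]` and `df.iloc[test_idx]`.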
- Decision Tree
- Support Vector Machine (SVM)
  - SVM Linear
  - SVM Radial Basis Function (RBF)
  - SVM Sigmoid
  - SVM Polynomial (degree 4)
- Artificial Neural Network (ANN)
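The classifiers listed above can be set up side by side as in the sketch below. It assumes scikit-learn, which is not named in the tool list, and it runs on synthetic stand-in data rather than the Taiwan dataset. The feature scaling, hidden layer size, and all other hyperparameters except the polynomial degree are illustrative assumptions, not the project's settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (NOT the credit card dataset), with roughly
# 22% positives to mimic the class imbalance described earlier.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.78, 0.22], random_state=0
)
# 70/30 random split, matching the proportions used in the project.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)


def scaled(model):
    """Standardize features first; SVMs and ANNs are scale sensitive."""
    return make_pipeline(StandardScaler(), model)


models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM Linear": scaled(SVC(kernel="linear")),
    "SVM RBF": scaled(SVC(kernel="rbf")),
    "SVM Sigmoid": scaled(SVC(kernel="sigmoid")),
    "SVM Polynomial (degree 4)": scaled(SVC(kernel="poly", degree=4)),
    # Hidden layer size and iteration cap are assumptions.
    "ANN": scaled(
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
    ),
}

# Fit each model and record its accuracy on the held-out data.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)

for name, acc in accuracy.items():
    print(f"{name}: {acc:.3f}")
```

The numbers printed by this sketch describe the synthetic data only and say nothing about how the models rank on the real dataset.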
- SVM RBF looks to be the best classifier throughout.
- There are a few points where ANN and SVM Linear may slightly outperform SVM RBF.