Download the PowerPoint to see the analysis breakdown.
This project uses the German Credit dataset to build statistical models of loan default. The work covers exploratory analysis, variable selection, and statistical modeling. To predict which applicants in the dataset default, I compared logistic regression, KNN, LDA, and QDA models on error rate, sensitivity, specificity, and AUC. LDA was the best model for this project: it had the lowest error rate (and therefore the highest accuracy) and the best overall balance once AUC is taken into account.
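For readers unfamiliar with the four comparison metrics, here is a minimal base R sketch of how they can be computed from true labels and predicted probabilities. The toy vectors, the helper name `classification_metrics`, and the 0.5 cutoff are illustrative assumptions, not values or code from this project.

```r
# Toy inputs, invented for illustration (not results from this project).
actual <- c(1, 0, 1, 1, 0, 0, 1, 0)                  # 1 = default, 0 = no default
prob   <- c(0.9, 0.2, 0.6, 0.4, 0.3, 0.7, 0.8, 0.1)  # predicted P(default)

# Hypothetical helper: confusion-matrix metrics plus AUC.
classification_metrics <- function(actual, prob, cutoff = 0.5) {
  predicted <- as.integer(prob > cutoff)

  tp <- sum(predicted == 1 & actual == 1)  # defaults correctly flagged
  tn <- sum(predicted == 0 & actual == 0)  # non-defaults correctly passed
  fp <- sum(predicted == 1 & actual == 0)  # non-defaults wrongly flagged
  fn <- sum(predicted == 0 & actual == 1)  # defaults that were missed

  # AUC via the rank-sum (Mann-Whitney) identity; rank() averages ties.
  n_pos <- sum(actual == 1)
  n_neg <- sum(actual == 0)
  r     <- rank(prob)
  auc   <- (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

  c(error_rate  = (fp + fn) / length(actual),
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    auc         = auc)
}

metrics <- classification_metrics(actual, prob)
print(round(metrics, 3))
```

Error rate, sensitivity, and specificity depend on the chosen cutoff, while AUC summarizes ranking quality across all cutoffs. That difference is why a model can lead on one metric without leading on the others.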
Statistical Learning
- Requirements
- R Markdown
- Load the germancredit.csv data
- Exploratory analysis of the data
- Building a reasonably “good” logistic regression model
- Analyzing regression coefficients and error rate
- Fit KNN model and find error rate, sensitivity, specificity, and AUC
- Fit LDA model and find error rate, sensitivity, specificity, and AUC
- Fit QDA model and find error rate, sensitivity, specificity, and AUC
- Comparing every classifier
- Cross-validation: LOOCV and k-fold analysis
- Ridge and Lasso analysis
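The model-fitting steps above might look roughly like the following sketch. It is not the project's actual code: the data are simulated, the column names `duration`, `amount`, and `Default` are hypothetical stand-ins for whatever germancredit.csv contains, and the 70/30 split, the 0.5 cutoff, and `k = 5` are arbitrary assumptions. It uses only `glm()` from base R plus the `MASS` and `class` packages that ship with standard R installations.

```r
library(MASS)   # lda(), qda()
library(class)  # knn()

set.seed(1)

# Simulated stand-in for the credit data (hypothetical columns).
n <- 400
credit <- data.frame(duration = rnorm(n, 20, 10),
                     amount   = rnorm(n, 3000, 1500))
lin <- -1 + 0.08 * (credit$duration - 20) + 0.0004 * (credit$amount - 3000)
credit$Default <- rbinom(n, 1, plogis(lin))

# Assumed 70/30 train/test split.
train_idx <- sample(n, size = 280)
train <- credit[train_idx, ]
test  <- credit[-train_idx, ]

# Logistic regression, classifying at an assumed 0.5 cutoff.
glm_fit  <- glm(Default ~ duration + amount, data = train, family = binomial)
glm_prob <- predict(glm_fit, newdata = test, type = "response")
glm_pred <- as.integer(glm_prob > 0.5)

# Linear and quadratic discriminant analysis.
lda_fit  <- lda(Default ~ duration + amount, data = train)
lda_pred <- predict(lda_fit, newdata = test)$class
qda_fit  <- qda(Default ~ duration + amount, data = train)
qda_pred <- predict(qda_fit, newdata = test)$class

# KNN on predictors standardized with the training means and SDs.
train_x  <- scale(train[, c("duration", "amount")])
test_x   <- scale(test[, c("duration", "amount")],
                  center = attr(train_x, "scaled:center"),
                  scale  = attr(train_x, "scaled:scale"))
knn_pred <- knn(train = train_x, test = test_x,
                cl = factor(train$Default), k = 5)

# Test-set misclassification rate for each classifier.
err <- function(pred) mean(as.character(pred) != as.character(test$Default))
error_rates <- c(logistic = err(glm_pred), lda = err(lda_pred),
                 qda = err(qda_pred), knn = err(knn_pred))
print(error_rates)
```

A single train/test split gives a noisy estimate of error, which is presumably why the outline follows the comparison with LOOCV and k-fold cross-validation.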