# ShichenXie/scorecard

Scorecard Development in R (评分卡, "scorecard")
Latest commit 9be09c0 Apr 18, 2019

# scorecard

The goal of the scorecard package is to make the development of traditional credit risk scorecard models easier and more efficient by providing functions for the common tasks summarized below. The package can also be used in the development of machine learning models for binary classification.

• data preparation (split_df, one_hot)
• variable selection (var_filter, iv, vif)
• weight of evidence (woe) binning (woebin, woebin_plot, woebin_adj, woebin_ply)
• performance evaluation (perf_eva, perf_psi)
• scorecard scaling (scorecard, scorecard_ply)
• scorecard report (gains_table, report)
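
To make the weight of evidence (WoE) and information value (IV) items above concrete, here is a minimal base-R sketch of how the two quantities are commonly computed for a variable that is already binned. The toy data and the variable names (`bin`, `y`) are hypothetical, and the sign convention used here (log of the share of bads over the share of goods) is an assumption; the package's own functions (`woebin`, `iv`) also handle the binning itself and may differ in detail.

```r
# Toy data (hypothetical): one already-binned predictor and a binary
# target where 1 = bad and 0 = good.
bin <- c("low", "low", "low", "low", "mid", "mid", "mid", "high", "high", "high")
y   <- c(0, 0, 0, 1, 0, 0, 1, 0, 1, 1)

# Count goods and bads in each bin.
tab  <- table(bin, y)
good <- tab[, "0"]
bad  <- tab[, "1"]

# Share of all goods (and of all bads) that falls into each bin.
good_dist <- good / sum(good)
bad_dist  <- bad / sum(bad)

# WoE per bin (assumed convention: log(bad share / good share)).
woe <- log(bad_dist / good_dist)

# IV of the variable: sum over bins of (bad share - good share) * WoE.
iv <- sum((bad_dist - good_dist) * woe)

print(round(woe, 4))
print(round(iv, 4))
```

Bins with no goods or no bads give an infinite WoE in this sketch, so a real implementation needs to merge or smooth such bins first.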

## Installation

• Install the release version of scorecard from CRAN with:
install.packages("scorecard")
• Install the latest version of scorecard from GitHub with:
# install.packages("devtools")
devtools::install_github("shichenxie/scorecard")

## Example

This is a basic example which shows you how to develop a common credit risk scorecard:

# Traditional Credit Scoring Using Logistic Regression
library(scorecard)

# data preparing ------
data("germancredit")
# filter variable via missing rate, iv, identical value rate
dt_f = var_filter(germancredit, y="creditability")
# breaking dt into train and test
dt_list = split_df(dt_f, y="creditability", ratio = 0.6, seed = 30)
label_list = lapply(dt_list, function(x) x$creditability)

# woe binning ------
bins = woebin(dt_f, y="creditability")
# woebin_plot(bins)

## or specify breaks manually
breaks_adj = list(
  age.in.years=c(26, 35, 40),
  other.debtors.or.guarantors=c("none", "co-applicant%,%guarantor"))
bins_adj = woebin(dt_f, y="creditability", breaks_list=breaks_adj)

# converting train and test into woe values
dt_woe_list = lapply(dt_list, function(x) woebin_ply(x, bins_adj))

# glm ------
m1 = glm(creditability ~ ., family = binomial(), data = dt_woe_list$train)
# vif(m1, merge_coef = TRUE) # summary(m1)
# Select a formula-based model by AIC (or by LASSO for large dataset)
m_step = step(m1, direction="both", trace = FALSE)
m2 = eval(m_step$call)
# vif(m2, merge_coef = TRUE) # summary(m2)

# # Adjusting for oversampling (support.sas.com/kb/22/601.html)
# library(data.table)
# p1=0.03 # bad probability in population
# r1=0.3 # bad probability in sample dataset
# dt_woe = copy(dt_woe_list$train)[, weight := ifelse(creditability==1, p1/r1, (1-p1)/(1-r1) )][]
# fmla = as.formula(paste("creditability ~", paste(names(coef(m2))[-1], collapse="+")))
# m3 = glm(fmla, family = binomial(), data = dt_woe, weights = weight)

# performance ks & roc ------
## predicted probability
pred_list = lapply(dt_woe_list, function(x) predict(m2, x, type='response'))
## performance
perf = perf_eva(pred = pred_list, label = label_list)

# score ------
## scorecard