scorecard_ply can not convert some variables. #47

Closed
ilkinhuseyn opened this issue Mar 11, 2020 · 5 comments

@ilkinhuseyn

Hi Shichen

I have the following code:
my_card <- scorecard(bins, model, points0 = 600, odds0 = 1/38, pdo = 20)

points_train <- scorecard_ply(train.data, my_card, only_total_score = FALSE, print_step = 0) %>% as.data.frame()

The code above runs without errors, but when I check the new transformed data frame (points_train), I see that some of the variables have NA values (such as GENDER and MARITAL STATUS). The my_card list looks fine for these variables, but once I convert the training data to scores (points_train), these variables are NA. Do you know why this might happen?
I have been trying to solve this for a few days but still can't figure it out. Many thanks!

@ShichenXie
Owner

Could you give me some sample data to reproduce your issue?

If you can use R, please use the R package scorecard, which is much more mature.

@ilkinhuseyn
Author

Hi
Thanks for your response. I have attached the sample data. I am using R, and if you run the following code you can see the issue.

m = glm( formula, binnedwoe_sample, family = 'binomial')
my_card <- scorecard(bins_var, m, points0 = 600, odds0 = 1/38, pdo = 20)
points_train <- scorecard_ply(binnedwoe_sample, my_card, only_total_score = FALSE, print_step = 0) %>% as.data.frame()

When you look at points_train, you will see that some variables (v1-v7) have NA values. I still can't figure out why this is the case.
Many thanks

binnedwoe_sample.txt

@ShichenXie
Owner

Your sample data and code can't be reproduced, since 'binnedwoe_sample' is missing. Please provide the data in original values, not in WOE values.

@ilkinhuseyn
Author

ilkinhuseyn commented Mar 14, 2020

I have attached the original data sample. If you run the commands below you can see the issue. Many thanks!

bins_var = woebin(train_data, y="GOODBAD",positive = "BAD|1")
train_binned <- woebin_ply(train_data, bins_var)
names(train_binned) = gsub(pattern = "_woe", replacement = "", x = names(train_binned))
m3 = glm( GOODBAD~., train_binned, family = 'binomial')
my_card <- scorecard(bins_var, m3, points0 = 600, odds0 = 1/38, pdo = 20)
points_train <- scorecard_ply(train_binned, my_card, only_total_score = FALSE, print_step = 0) %>% as.data.frame()

train_data.txt

@ShichenXie
Owner

You should pass the original data to the scorecard_ply function, not the WOE data frame.

points_train <- scorecard_ply(train_data, my_card, only_total_score = FALSE, print_step = 0)

Since you are using the R version of the package, please open the issue in the R package repo next time.
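
For reference, the full workflow can be sketched as below. This is a minimal sketch, not the poster's actual run: it assumes the germancredit example data shipped with the R scorecard package in place of the attached train_data.txt, and the object names (dt_sel, dt_woe, points_orig) are illustrative only. The WOE data frame is used only to fit the model; scorecard_ply receives the original values. One plausible reading of the NA values, which is not confirmed in this thread, is that the card's categorical bins list the original category labels, so WOE numbers passed in their place match no bin.

```r
library(scorecard)

# Example data from the scorecard package (original, un-binned values);
# this stands in for train_data and is an assumption of this sketch.
data("germancredit")

# Keep informative variables, then bin the ORIGINAL values
dt_sel <- var_filter(germancredit, y = "creditability")
bins   <- woebin(dt_sel, y = "creditability")

# WOE-transformed copy, used only for fitting the model
dt_woe <- woebin_ply(dt_sel, bins)
m <- glm(creditability ~ ., family = binomial(), data = dt_woe)

# Build the card with the same scaling parameters as in the thread
card <- scorecard(bins, m, points0 = 600, odds0 = 1/38, pdo = 20)

# Score the ORIGINAL data frame, not dt_woe
points_orig <- scorecard_ply(germancredit, card,
                             only_total_score = FALSE, print_step = 0)
```

A quick check such as colSums(is.na(points_orig)) shows per column whether any variable failed to map to a bin.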
