as.character(as.item(..., missing.values = 0, ...)) returns NA instead of a string #52

Closed
bixiou opened this issue Feb 8, 2021 · 5 comments

bixiou commented Feb 8, 2021

Since an update of memisc (somewhere between 0.99.22 and 0.99.28), the following code raises an error:

foo <- as.item(c(0,1,1,-1), missing.values= 0, labels = structure(c(-1,0,1), names=c('Yes', 'PNR', 'No')))
bar <- as.factor(as.character(foo))
bar <- relevel(bar, 'PNR')

Before the update (I had 0.99.22), as.character(foo) returned c('PNR', 'No', 'No', 'Yes'), whereas it now returns c(NA, 'No', 'No', 'Yes'), so relevel() fails because 'PNR' is no longer a factor level.
This behavior is surprising, because the point of memisc is to distinguish declared missing values from NA.
Could you revert to the previous behavior?

bixiou (Author) commented Feb 15, 2021

PS: I corrected a typo in my post. With the typo, the code ran without error; the corrected code now reproduces the error.

melff (Owner) commented Feb 18, 2021

This is not a bug, but a feature. Values equal to 0 are translated into NA by as.character() because 0 is marked as a missing value. Such an automatic translation into NA is the whole point of the "item" class, because otherwise there is no way in R to distinguish between "non-NA" missing values and NA.

If you want to retain "PNR" as a valid factor level, you have to declare 0 as valid, e.g. by

missing.values(foo) <- NULL
bar <- as.factor(foo)
bar <- relevel(bar, 'PNR')

or

valid.values(foo) <- -1:1
bar <- as.factor(foo)
bar <- relevel(bar, 'PNR')

or

bar <- as.factor(include.missings(foo))
bar <- relevel(bar, 'PNR')

If as.character() in earlier versions of memisc did not take the missingness status of certain values into account, then that was a bug.
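
The distinction described above can be checked directly. The following is a minimal sketch, assuming a recent memisc version and its exported is.missing() function; the variable names flagged, chr, foo_valid and chr_valid are made up for illustration and do not appear in the original report.

```r
library(memisc)

# Same item as in the original report: 0 is declared missing and labelled 'PNR'.
foo <- as.item(c(0, 1, 1, -1),
               missing.values = 0,
               labels = c(Yes = -1, PNR = 0, No = 1))

# The code 0 is still stored in the item; it is only flagged as missing.
flagged <- is.missing(foo)

# Declared missing values are translated into NA on conversion to character.
chr <- as.character(foo)

# Dropping the missing-value declaration on a copy makes 0 an ordinary valid code.
foo_valid <- foo
missing.values(foo_valid) <- NULL
chr_valid <- as.character(foo_valid)

print(flagged)
print(chr)
print(chr_valid)
```

Working on a copy (foo_valid) keeps the declaration on foo intact, so the same variable can still be treated as missing in descriptive statistics.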

melff closed this as completed Feb 18, 2021

bixiou (Author) commented Feb 18, 2021

Well, this is a pity, because this "bug" was really convenient: it let me run regressions without dropping data ('PNR' was treated as a response category, which it is) while still treating 'PNR' as missing elsewhere in the analysis (i.e. giving it a special status in descriptive statistics, graphs, ...).
So I'll keep using memisc 0.99.22, which does exactly what I need.

melff (Owner) commented Feb 18, 2021

I think

bar <- as.factor(as.character(include.missings(foo)))

or

bar <- as.factor(as.character(foo, include.missings = TRUE))

will do what you want it to do with newer versions of memisc.

bixiou (Author) commented Feb 18, 2021

In this particular case, yes, because I convert the variable into a factor anyway. But I noticed that the update causes issues in other parts of my code, namely regressions: with the new memisc version, I would need to convert each item variable to a factor in my regressions. It is easier (and gives more compact code) to keep the old version.
