read_dta: import value labels for labeled negative values #367

Closed
kwenzig opened this issue Apr 13, 2018 · 7 comments
Labels
bug an unexpected problem or unintended behavior readstat

Comments

kwenzig commented Apr 13, 2018

Example:

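library(haven)

tmp <- tempdir()
# Labelled vector with a label on a negative value (-1)
x1 <- labelled(
  sample(-1:5),
  c(Good = -1, Bad = 5)
)
df <- tibble::tibble(x1, z = 1:7)
as_factor(df)   # -1 is shown as "Good"
# Round-trip through a Stata 14 file
path <- file.path(tmp, "df.dta")
write_dta(df, path, version = 14)
df1 <- read_dta(path)
as_factor(df1)  # -1 is no longer shown as "Good"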

Here the result of as_factor(df) contains a row Good 7, while the corresponding row of as_factor(df1) is -1 7.00: after the round trip, the label Good is no longer applied to the value -1.

The produced Stata file is correct:

. label list x1
x1:
          -1 Good
           5 Bad

cschwem2er commented

We're currently having trouble saving negative value labels in sjlabelled, which relies on haven's functions for reading and writing. We're not 100% sure whether this is the same problem (see strengejacke/sjlabelled#7). Maybe @hadley has an idea.


kwenzig commented May 2, 2018

In my example, constructing the data set in R and saving it with write_dta works: the Stata file contains the negative value with the correct label. The problem is reading this file back with read_dta.


hadley added the bug label Jun 20, 2018

hadley commented Jun 20, 2018

library(haven)

x1 <- labelled(-1:3,  c(Good = -1, Bad = 3))

tmp <- tempfile()
write_dta(data.frame(x1), tmp, version = 14)
x2 <- read_dta(tmp)$x1

x1
#> <Labelled integer>
#> [1] -1  0  1  2  3
#> 
#> Labels:
#>  value label
#>     -1  Good
#>      3   Bad
x2
#> <Labelled double>
#> [1] -1  0  1  2  3
#> 
#> Labels:
#>  value label
#>  NA(z)  Good
#>      3   Bad

Created on 2018-06-20 by the reprex package (v0.2.0).

hadley commented Jun 20, 2018

@evanmiller I think this is a readstat issue — readstat_value_is_tagged_missing() is returning TRUE for the value passed to the readstat_set_value_label_handler().

evanmiller added a commit to WizardMac/ReadStat that referenced this issue Jun 20, 2018
A signed/unsigned comparison caused negative values to be tagged
as missing. Fixes tidyverse/haven#367

evanmiller commented

Fix: WizardMac/ReadStat@42c5212

Test: WizardMac/ReadStat@68d7a59

Looks like there was a problem with signed/unsigned comparisons. The fix involved ripping out code, which I am always happy about.

hadley closed this as completed in edec5a0 Jun 20, 2018