Importing a variable with missing values but no labels #219

Closed
larmarange opened this issue Sep 20, 2016 · 6 comments

Comments

@larmarange
Contributor

commented Sep 20, 2016

In SPSS, a variable can have defined missing values but no value labels. In that case, read_spss() produces a vector with a structure like:

> x <- structure(c(1.2, 2.4), class = c("labelled_spss", "labelled"), na_values = 9)

Printing works, but summary() fails:

> x
<Labelled SPSS double>
[1] 1.2 2.4
Missing values: 9
> summary(x)
Error: `x` and `labels` must be same type

In fact, as currently written, labelled_spss() doesn't allow a vector with no value labels:

> labelled_spss(c(1.2, 2.4), NULL, na_values = 9)
Error: `x` and `labels` must be same type 
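
A minimal base R sketch of why a check like this could trip. It assumes, based only on the wording of the error message, that the data type is compared with the type of the `labels` attribute; haven's actual internals are not shown in this thread. `same_type()` is a hypothetical stand-in, not haven's code.

```r
# Same structure as in the report; haven does not need to be loaded.
x <- structure(c(1.2, 2.4), class = c("labelled_spss", "labelled"), na_values = 9)

labels <- attr(x, "labels")        # no value labels were stored, so this is NULL
data_type <- typeof(unclass(x))    # type of the underlying data
label_type <- typeof(labels)       # type of the (absent) labels

# Hypothetical stand-in for the type check (an assumption, not haven's code).
same_type <- function(x, labels) typeof(x) == typeof(labels)

is.null(labels)
same_type(unclass(x), labels)      # NULL labels do not match the data type
same_type(unclass(x), double())    # an empty double vector does match
```

Under that assumption, an empty vector of the same type as the data would pass the comparison where a missing `labels` attribute does not.
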
@hadley

Member

commented Jan 25, 2017

I think there are two problems here:

# First, you should be able to create a labelled spss vector with no labels
# (That's a bit inelegant but it's expedient)
labelled_spss(c(1.2, 2.4), na_values = 9)
labelled_spss(c(1.2, 2.4), double(), na_values = 9)

# Also need to make sure the C++ code returns an object with that structure

# Second, you should be able to subset that object
# (That's the root cause of the summary failure)
x <- structure(c(1.2, 2.4), class = c("labelled_spss", "labelled"), na_values = 9)
x[1]

Would you mind providing an SPSS file that contains that variable so I can test?

@hadley

Member

commented Feb 15, 2018

@larmarange do you still have any interest in this issue?

@huftis

Contributor

commented Apr 24, 2018

I’m not the original submitter, but here’s an example SPSS file as requested, @hadley:
missing-no-label.zip

Example code to trigger the error message:

library(haven)
d <- read_sav("https://github.com/tidyverse/haven/files/1942576/missing-no-label.zip", user_na = TRUE)
summary(d$x)
#> Error: `x` and `labels` must be same type
@dusadrian

commented Oct 15, 2018

I was recently thinking about a similar example, and concluded it simply does not make any sense to have a declared missing value with no label. This would be similar to a general NA in R.

How about using the value itself as a label? Something like:

x <- structure(c(1.2, 2.4), class = c("labelled_spss", "labelled"),
               labels = c("9" = 9), na_values = 9)

Something like that could be done at import time.
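
A rough base R sketch of that idea, applied after import rather than inside the reader. `label_missing_with_value()` is a hypothetical helper, not part of haven, and it only handles discrete `na_values`; a missing range is left untouched.

```r
# Hypothetical helper (not part of haven): give every declared missing value
# that has no value label a label equal to the value itself.
label_missing_with_value <- function(x) {
  na_values <- attr(x, "na_values")
  labels <- attr(x, "labels")
  unlabelled <- na_values[!na_values %in% labels]
  if (length(unlabelled) == 0) {
    return(x)  # nothing to add
  }
  attr(x, "labels") <- c(labels, setNames(unlabelled, as.character(unlabelled)))
  x
}

x <- structure(c(1.2, 2.4), class = c("labelled_spss", "labelled"), na_values = 9)
y <- label_missing_with_value(x)
attr(y, "labels")
```

The result has the same structure as the hand-written example above, with the data, class, and `na_values` unchanged.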

@hadley

Member

commented Jan 24, 2019

Reprex with built-in dataset:

library(haven)
sav <- system.file("files", "testdata.sav", package = "foreign")
x <- read_spss(file = sav, user_na = TRUE)

x$numeric_long_label
#> <Labelled SPSS double>
#> [1] 1.00000 2.00000 3.33333 4.00000      NA
#> Missing range:  [1, 2]
x$numeric_long_label[1:3]
#> Error: `x` and `labels` must be same type

Created on 2019-01-23 by the reprex package (v0.2.1.9000)

@lock

commented Jul 23, 2019

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

lock bot locked and limited conversation to collaborators on Jul 23, 2019
