`write_sav` issues with labelled class vectors #193

Closed
sjPlot opened this Issue Jun 27, 2016 · 3 comments

sjPlot commented Jun 27, 2016

I have found some issues when saving labelled data to SPSS. I used the latest dev-build from 25th June. First, here is the reproducible example for the issues:

library(haven)
x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
              c("Agreement" = 1, "äquivalent" = 2, "Disagreement" = 4, "First" = tagged_na("c"),
                "Refused" = tagged_na("a"), "Not home" = tagged_na("z")))

attr(x, "label") <- "Äquivalenzeinkommen"
x
# test.sav
haven::write_sav(data.frame(test = x), "test.sav")
# test2.sav
haven::write_sav(data.frame(test = haven::as_factor(x)), "test2.sav")

Now to what I have found:

  • Calling data.frame() on a labelled vector produces a column name that ends in dots. Dots are not legal at the end of a variable name in SPSS, so any SPSS command that uses this variable fails with an error.
> data.frame(x)
   x..i..
1       1
2       2
3       3
4      NA

spss_2

  • It seems that the "tags" of tagged NA values are saved, but recognized as invalid value labels. See screenshot and also the screenshot above:

spss_1

  • After changing the variable name, the value labels are still not correctly assigned. See the screenshot above, where 1 is "Agreement", 2 is "äquivalent" and 4 is "Disagreement". Running the frequency command in SPSS gives the following:

spss_4

  • After editing some of the value labels, assignment still is not correct:

spss_5

To summarise, labelled vectors are not properly saved to SPSS, but factors are. The second SPSS file, where I used haven::as_factor(x), seems to work, except that as_factor does not convert the tagged NA values to NA (so they are counted as regular values).

  • Correct value labels

spss_try2_2

  • Correct frequencies

spss_try2_1

  1. I would suggest converting any labelled vector to a factor with as_factor before writing the data.
  2. Furthermore, it would be nice to have an option for converting (tagged/labelled) NA values to NA. They must be converted to regular NA (i.e. the "tag" must be removed from the NA value), otherwise the value labels are broken (see the 1st/2nd screenshot at the top).
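The conversion described in point 2 could look roughly like the sketch below. It assumes that replacing every missing value with a plain NA and dropping the value labels that point at tagged NAs is enough; drop_tagged_na is a hypothetical helper name, not a haven function, and the result has not been checked in SPSS.

```r
library(haven)

# Same example vector as in the reproducible example above.
x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
              c("Agreement" = 1, "äquivalent" = 2, "Disagreement" = 4,
                "First" = tagged_na("c"), "Refused" = tagged_na("a"),
                "Not home" = tagged_na("z")))
attr(x, "label") <- "Äquivalenzeinkommen"

# Hypothetical helper (NOT part of haven): turn tagged NAs into regular NA
# and drop the value labels that were attached to tagged NAs.
drop_tagged_na <- function(x) {
  labels <- attr(x, "labels")
  var_label <- attr(x, "label")

  # Work on the bare double vector, without class or label attributes.
  values <- unclass(x)
  attributes(values) <- NULL

  # Overwriting with NA_real_ discards the tag stored in the NA.
  values[is.na(values)] <- NA_real_

  # Keep only the labels whose value is not missing.
  out <- labelled(values, labels[!is.na(labels)])
  attr(out, "label") <- var_label
  out
}

y <- drop_tagged_na(x)
```

y could then be written in place of x, e.g. with haven::write_sav(data.frame(test = y), ...).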

Test1 and 2.zip

hadley added a commit that referenced this issue Aug 9, 2016

Member

hadley commented Aug 9, 2016

It would be helpful if you could file multiple issues for different problems. I've fixed the first one.

I'm not currently planning to add full write support for tagged missing values.

@hadley hadley closed this Aug 9, 2016


sjPlot commented Aug 11, 2016

No, I was not thinking about full write support for tagged NAs, but about converting them to regular NA before writing, because tagged NAs produce invalid values in the output file.

Member

hadley commented Aug 11, 2016

In that case, please file a minimal bug report along those lines.

@lock lock bot locked and limited conversation to collaborators Jun 26, 2018
