write_sav() truncates long labels #157

sjpierce opened this issue Apr 23, 2016 · 4 comments

sjpierce commented Apr 23, 2016

Below is a reproducible example in which haven::write_sav() exports an SPSS file containing incorrect value labels. When labels are added to a character variable (making it a labelled vector in R so that it has value labels in the exported SPSS file) and at least one of the actual values in that variable is 9 or more characters long, the labels for that variable are not exported correctly. Labels may be fine for other variables in the same file, and they are exported correctly if all values of the variable in question have 8 or fewer characters.

When I use SPSS to inspect the value labels in test_export.sav, I see a value that looks like "12345678lz h□" where I should see "12345678E". Similarly, I see "1234ABCD{z h□" where I should see "1234ABCD". I've extended the code to show what is actually read back in via read_sav().

# write_sav() does not properly export value labels in the SPSS file for 
# labelled character variables when at least one value of that variable 
# contains 9 or more characters. 
library(haven)

# Create example data. v1 should export correctly because all values are 
# <= 8 characters, v2 will not because first value is 9 characters long.  
dat <- data.frame(v1 = c("12345678",  "ABCDEFGH", "1234ABCD"),
                  v2 = c("12345678E", "ABCDEFGH", "1234ABCD"),
                  l1 = c("Text1",  "Text2", "Text3"),
                  l2 = c("Text4",  "Text5", "Text6"),
                  stringsAsFactors = FALSE)

# Create a named vector from a vector of values plus a vector of labels.
named <- function(x, labels) {names(x) <- labels; return(x)}

# Turn v1 & v2 into labelled variables so they'll have value labels 
# assigned in the exported SPSS file. Length & contents of the strings 
# in the label vectors seem not to matter. 
dat$v1 <- labelled(dat$v1, named(dat$v1, dat$l1))
dat$v2 <- labelled(dat$v2, named(dat$v2, dat$l2))

# The SPSS file created below will have correct value labels only for v1. 
# You have to open the file in SPSS to see that. 
write_sav(dat, path = "test_export.sav")

# Read the file back in to show that what was stored in test_export.sav
# does not match properties of the data frame written out. 
dat2 <- read_sav(path = "test_export.sav")

# Correct set of labels we tried to write out. 
attr(dat$v2, "labels")

# Incorrect set of labels recovered from the exported file. 
attr(dat2$v2, "labels")
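The manual comparison at the end of the example could also be automated. What follows is a minimal sketch in base R, assuming only that value labels live in the "labels" attribute, as the attr() calls above show. The helper name labels_match is hypothetical and not part of haven.

```r
# Hypothetical helper (not part of haven): report, per shared column, whether
# the "labels" attribute is identical in two data frames (or lists of vectors),
# e.g. before and after a write/read round trip.
labels_match <- function(before, after) {
  cols <- intersect(names(before), names(after))
  vapply(cols, function(col) {
    identical(attr(before[[col]], "labels"), attr(after[[col]], "labels"))
  }, logical(1))
}
```

With the objects from the example, this would be called as labels_match(dat, dat2). Any column reported as FALSE has labels that changed in the round trip. The comparison is strict, so a difference in names, order, or type also counts as a mismatch.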

hadley changed the title from "write_sav exports labeled character vector with incorrect value labels" to "write_sav() truncates long labels" on May 30, 2016

hadley commented May 30, 2016

Minimal reprex:

x <- labelled(c("1", "2"), c("2" = "1234567890", "1" = "1"))
tmp <- tempfile()
write_sav(tibble::data_frame(x), tmp)

@evanmiller this seems like a readstat bug

tklebel commented May 30, 2016

SPSS seems to default to creating new variables with a maximum "width" of 8, therefore restricting every value in that variable to eight digits/characters. Maybe that causes a problem along the way.

Just wanted to acknowledge this as a bug in ReadStat. Should have a fix available soon. (Values longer than 8 bytes and labels longer than 255 bytes require a separate "long value labels" record in the SAV file.)
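Until a fix lands, the affected variables could be spotted before exporting. This is a rough sketch in base R, assuming the labelled values sit in the "labels" attribute. The helper name long_label_values is hypothetical, and the 8-byte limit is taken from the comment above rather than checked against the SAV format.

```r
# Hypothetical helper (not part of haven or ReadStat): for each column, return
# the labelled character values whose byte length exceeds max_bytes.
long_label_values <- function(df, max_bytes = 8L) {
  out <- lapply(df, function(x) {
    vals <- attr(x, "labels")
    # Columns without character value labels are not affected.
    if (!is.character(vals)) return(character(0))
    unname(vals[nchar(vals, type = "bytes") > max_bytes])
  })
  # Keep only columns with at least one over-long value.
  out[lengths(out) > 0]
}
```

Calling long_label_values(dat) on the data frame from the original report would list the columns whose labelled values are longer than 8 bytes. Measuring bytes rather than characters matters for non-ASCII strings, where one character can take several bytes.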

Should be fixed in WizardMac/ReadStat@c0d19dc and WizardMac/ReadStat@3203ffe

hadley closed this as completed in 101b072 on May 31, 2016
lock bot locked and limited conversation to collaborators on Jun 27, 2018