write_sav needs to coerce labels to UTF-8 #87

Closed
larmarange opened this issue Jul 7, 2015 · 4 comments

@larmarange
Contributor

An example:

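library(haven)
library(dplyr)  # provides data_frame()

# Value labels are quoted so the non-ASCII names parse in any locale
v1 <- labelled(c(1, 1, 2, 3), c("éè" = 1, "à" = 2, "ï" = 3))
v2 <- c("éè", "éè", "à", "ï")
v3 <- c("ee", "ee", "a", "i")
dt <- data_frame(v1, v2, v3)
# Variable labels
attr(dt$v1, "label") <- "a làbèl wïth acèènts"
attr(dt$v2, "label") <- "a label with no accent"
write_sav(dt, "example.sav")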

When opening the resulting file with SPSS, it appears that:

  • v2 and v3 are correct
  • the value labels of v1 are not correct (i.e. they are displayed as ??, ??, ?, ? by SPSS)
  • variable labels are correct
@sjPlot

sjPlot commented Jul 7, 2015

Do you have the same problem with the write_spss function from the sjmisc package? Try converting the label attributes to factor levels (as_factor or to_label); this might work.

@larmarange
Contributor Author

I just tried with write_spss. Exactly the same bug with accents in value labels.

@hadley
Member

hadley commented Jul 7, 2015

In the short term, you can work around it by setting the Encoding() of every character vector to UTF-8. In the long term, haven should do that for you automatically.
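
The short-term workaround might be sketched as follows. This is only an illustration under assumptions: `utf8_labels()` is a hypothetical helper, not part of haven, and it uses base R's `enc2utf8()` (which actually re-encodes the strings) rather than assigning `Encoding()` directly (which only changes how the existing bytes are marked). It assumes value labels are stored as the names of the "labels" attribute and the variable label in the "label" attribute, as in the example above.

```r
# Hypothetical helper (not part of haven): re-encode the value labels and
# the variable label of one vector to UTF-8.
utf8_labels <- function(x) {
  # exact = TRUE prevents "label" from partially matching "labels"
  value_labels <- attr(x, "labels", exact = TRUE)
  if (!is.null(names(value_labels))) {
    # Value labels are the *names* of the "labels" attribute
    names(value_labels) <- enc2utf8(names(value_labels))
    attr(x, "labels") <- value_labels
  }

  var_label <- attr(x, "label", exact = TRUE)
  if (is.character(var_label)) {
    attr(x, "label") <- enc2utf8(var_label)
  }

  x
}

# Small base-R check with a Latin-1 encoded value label
latin1_label <- iconv("\u00e9\u00e8", "UTF-8", "latin1")
x <- structure(c(1, 1, 2), labels = setNames(c(1, 2), c(latin1_label, "a")))
fixed <- utf8_labels(x)
Encoding(names(attr(fixed, "labels")))
```

For a whole data frame, this could presumably be applied column by column with `dt[] <- lapply(dt, utf8_labels)` before calling `write_sav(dt, "example.sav")`; whether that is enough for SPSS to display the labels correctly is not confirmed in this thread.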

@hadley changed the title from "Encoding pb with value labels in write_sav" to "write_sav needs to coerce labels to UTF-8" on May 30, 2016
@hadley
Member

hadley commented May 30, 2016

Minimal reprex:

# c("éè", "à", "ï")
labels_utf8 <- c("\u00e9\u00e8", "\u00e0", "\u00ef")
labels_latin1 <- iconv(labels_utf8, "utf-8", "latin1")

v_utf8 <- labelled(3:1, setNames(1:3, labels_utf8))
v_latin1 <- labelled(3:1, setNames(1:3, labels_latin1))

roundtrip_var(v_utf8) #ok
roundtrip_var(v_latin1) # not ok
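
To see why the two cases in the reprex can behave differently, it may help to compare the label vectors at the byte level. The base-R sketch below reuses `labels_utf8` and `labels_latin1` from the reprex; `fixed` is an illustrative name introduced here. A writer that copies label bytes without converting them would put Latin-1 bytes into a file expected to hold UTF-8, which would be consistent with the "?" characters reported above, though that reading is an assumption and not something the thread confirms.

```r
# c("éè", "à", "ï")
labels_utf8 <- c("\u00e9\u00e8", "\u00e0", "\u00ef")
labels_latin1 <- iconv(labels_utf8, "UTF-8", "latin1")

# R compares the two vectors as equal strings...
all(labels_utf8 == labels_latin1)

# ...but they carry different encoding marks and different bytes.
Encoding(labels_utf8)
Encoding(labels_latin1)
charToRaw(labels_utf8[2])    # "à" in UTF-8: bytes c3 a0
charToRaw(labels_latin1[2])  # "à" in Latin-1: byte e0

# enc2utf8() re-encodes the Latin-1 strings to the same bytes as the
# UTF-8 originals.
fixed <- enc2utf8(labels_latin1)
identical(charToRaw(fixed[2]), charToRaw(labels_utf8[2]))
```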

@hadley closed this as completed in 092059d on May 30, 2016
@lock (bot) locked and limited conversation to collaborators on Jun 27, 2018