write_sav fails when character variables contain special characters #258
Comments
A simpler example with write_sas:
|
This works for me:
df <- tibble::tibble(x = c("Normalre", "Blåresep"))
write_sas(df, tempfile())
write_sav(df, tempfile())
But that's possibly because I'm on a Mac. Could you please see if that works for you? Otherwise, a reprex in that style would be much appreciated. |
Thank you for your suggestion. Unfortunately, it doesn't work:
> library(haven)
> df <- tibble::tibble(x = c("Normalre", "Blåresep"))
> write_sas(df, tempfile())
Error in write_sas_(data, normalizePath(path, mustWork = FALSE)) :
Writing failure: A provided string value was longer than the available storage size of the specified column.
By the way:
df <- tibble::tibble(y=c("a","b"),x = c("Normalre", "Blårese")) # Works fine
df <- tibble::tibble(y=c("a","b"),x = c("Normalr", "Blårese")) # Yields error message
df <- tibble::tibble(y=c("a","b"),x = c("Normalresept", "Blåresept")) # Works fine
df <- tibble::tibble(y=c("a","b"),x = c("Normalresep", "Blåresept")) # Works fine
df <- tibble::tibble(y=c("a","b"),x = c("Normalrese", "Blåresept")) # Works fine
df <- tibble::tibble(y=c("a","b"),x = c("Normalres", "Blåresept")) # Yields error message |
GitHub issues use markdown and they're much easier to read if you learn a little bit about it. The most important thing is to put your R code inside a block that starts with |
Can you please also work on making your reprex minimal, like mine? I have lots of examples that work - I just need to have one example that I can easily copy and paste into R that demonstrates the problem. |
Eyeballing the issue, it looks like whatever is calculating the max string length is counting characters when it should be counting bytes.
On Jan 26, 2017, at 08:31, Hadley Wickham wrote:
Can you please also work on making your reprex minimal, like mine? I have lots of examples that work - I just need to have one example that I can easily copy and paste into R that demonstrates the problem.
|
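The character-versus-byte hypothesis above can be checked in base R. This is only an illustrative sketch, not haven's actual code: it compares the two lengths for the strings from the reprex, with the `å` written as a `\u00e5` escape so the string is UTF-8 regardless of locale.

```r
# Strings from the reprex; "\u00e5" is the letter å, stored as UTF-8.
x <- c("Normalre", "Bl\u00e5resep")

n_chars <- nchar(x, type = "chars")  # number of characters
n_bytes <- nchar(x, type = "bytes")  # number of bytes in the stored encoding

n_chars  # 8 8
n_bytes  # 8 9
```

If the column width were taken from the character count (8) while the value written needs 9 bytes, that would be consistent with the "longer than the available storage size" error.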
I can confirm that this bug is still present in the latest version of haven:
library(haven)
df <- tibble::tibble(x = "Blaresep", y = "Blåresep")
write_sav(df[,1], tempfile()) # This works
write_sav(df[,2], tempfile()) # This doesn’t
# Error in write_sas_(data, normalizePath(path, mustWork = FALSE)) :
# Writing failure: A provided string value was longer than the available storage size of the specified column.
|
Note that even when
I think an easy way to fix this (or a workaround for people having this problem) is to re-encode all strings to UTF-8 before saving them ( |
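A minimal sketch of the re-encoding workaround described above, using only base R. The helper name `reencode_utf8` is hypothetical (it is not part of haven), and whether this actually avoids the error depends on the haven version and on how the strings were encoded to begin with.

```r
# Hypothetical helper: re-encode every character column of a data frame to UTF-8.
reencode_utf8 <- function(df) {
  is_chr <- vapply(df, is.character, logical(1))
  df[is_chr] <- lapply(df[is_chr], enc2utf8)
  df
}

df <- data.frame(
  id = 1:2,
  x = c("Normalre", "Bl\u00e5resep"),
  stringsAsFactors = FALSE
)
df_utf8 <- reencode_utf8(df)

# Then write as usual, for example:
# haven::write_sav(df_utf8, tempfile())
```

Non-character columns are left untouched, so the helper can be applied to a whole data frame before writing.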
@evanmiller any thoughts on whether this is a haven or a readstat problem? It's possible I've missed a conversion to utf-8 somewhere |
@evanmiller I'm either missing something obvious, I have misunderstood the readstat API, or there's a readstat bug.
I don't see a way to set the encoding for the output, so I'm assuming utf-8, but maybe that assumption is wrong? |
I believe the bug is here: https://github.com/tidyverse/haven/blob/master/src/DfWriter.cpp#L266
It should be measuring UTF-8 length on this line. |
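To illustrate what "measuring UTF-8 length" means for a whole column, here is a rough R analogue. It is a sketch under assumptions, not the code in DfWriter.cpp: the helper name `max_utf8_bytes` is made up for this example, and it simply returns the largest byte length among the non-missing values after conversion to UTF-8.

```r
# Hypothetical helper: storage width (in bytes) needed for a character vector
# once it is encoded as UTF-8. Missing values are dropped first, because
# nchar(NA, type = "bytes") returns 2 rather than NA.
max_utf8_bytes <- function(x) {
  x <- enc2utf8(x[!is.na(x)])
  if (length(x) == 0) return(0L)
  max(nchar(x, type = "bytes"))
}

width <- max_utf8_bytes(c("Normalre", "Bl\u00e5resep", NA))
width  # 9
```

A width computed this way is at least as large as the character count, so a value containing multi-byte characters such as `å` still fits.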
Oh, I'm an idiot, thanks! |
This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/ |
The above code yields the error message below when x[2,2] is truncated to its first 8, 16, 22-24, 30-32, 38-40, 54-56, 60-64, 68-72, or 76-78 characters. Otherwise it works fine.
Error in write_sav_(data, normalizePath(path, mustWork = FALSE)) :
Writing failure: A provided string value was longer than the available storage size of the specified column.