write_csv: scientific notation cannot be disabled #671
I don't think we will be changing this behavior in the near future (if ever). A workaround you can use is to format the columns before writing. See
format_numeric <- function(x, ...) {
  numeric_cols <- vapply(x, is.numeric, logical(1))
  x[numeric_cols] <- lapply(x[numeric_cols], format, ...)
  x
}
library("readr")
df <- data.frame(a = -0.0004029971, b = 0.0412975501857025)
format_csv(format_numeric(df))
#> [1] "a,b\n-0.0004029971,0.04129755\n"
Thanks! One general question though: why default to a notation that, at least to me, seems less compatible with other tools? (Even more so when the file format, CSV, is arguably one of the most interchangeable and compatible formats.) I guess this is all a matter of perspective; I would just like to understand your design choice. One addition and one question with respect to the format_numeric function: I guess one ought to add the
format_numeric_jh <- function(x, ...) {
  numeric_cols <- vapply(x, is.numeric, logical(1))
  x[numeric_cols] <- lapply(x[numeric_cols], format, ...)
  x
}
format_numeric_dpd <- function(x, scientific = FALSE, ...) {
  numeric_cols <- vapply(x, is.numeric, logical(1))
  x[numeric_cols] <- lapply(x[numeric_cols], format, scientific = scientific, ...)
  x
}
df <- data.frame(a = -0.00004029971, b = 0.0412975501857025)
geoid_df <- data.frame(GEOID = seq(from = 60150001022000, to = 60150001022005, 1))
print(df, digits = 18)
#> a b
#> 1 -4.0299709999999997e-05 0.041297550185702497
library("readr")
format_csv(format_numeric_jh(df))
#> [1] "a,b\n-4.029971e-05,0.04129755\n"
format_csv(format_numeric_jh(geoid_df))
#> [1] "GEOID\n6.015e+13\n6.015e+13\n6.015e+13\n6.015e+13\n6.015e+13\n6.015e+13\n"
# ehm, no
format_csv(format_numeric_dpd(df))
#> [1] "a,b\n-0.00004029971,0.04129755\n"
format_csv(format_numeric_dpd(geoid_df))
#> [1] "GEOID\n60150001022000\n60150001022001\n60150001022002\n60150001022003\n60150001022004\n60150001022005\n"
But how can I reliably preserve precision without hard-coding it with
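One possible answer to the precision question, sketched with base R only: format with `digits = 15` and `scientific = FALSE`. Any decimal number with up to 15 significant digits survives a round trip through a double, so values like the ones above come back unchanged. The helper name `format_numeric_exact` and the choice of 15 digits are assumptions made for illustration, not part of readr, and values that need 16 or 17 significant digits would still be rounded.

```r
# Hypothetical helper (not part of readr): turn numeric columns into
# fixed-notation strings with up to 15 significant digits.
format_numeric_exact <- function(x, digits = 15, ...) {
  numeric_cols <- vapply(x, is.numeric, logical(1))
  x[numeric_cols] <- lapply(
    x[numeric_cols], format,
    digits = digits, scientific = FALSE, trim = TRUE, ...
  )
  x
}

df <- data.frame(a = -0.00004029971, b = 0.0412975501857025)
geoid_df <- data.frame(
  GEOID = seq(from = 60150001022000, to = 60150001022005, by = 1)
)

out_df <- format_numeric_exact(df)
out_geoid <- format_numeric_exact(geoid_df)

# Round-trip check: parsing the strings gives back the original doubles.
stopifnot(as.numeric(out_df$a) == df$a, as.numeric(out_df$b) == df$b)
stopifnot(as.numeric(out_geoid$GEOID) == geoid_df$GEOID)
```

The resulting data frame holds character columns, so it can presumably be passed to `format_csv()` or `write_csv()` in the same way as the output of `format_numeric_dpd()` above.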
While I am not the one to talk about design decisions, maybe I can help explain the integer limitations (this message may help to understand the problem or work around it with the
In R you store integers as a 32-bit integer number (
Using doubles,
Another alternative if you need to work with integer numbers below
Usually doubles (called
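The limits mentioned in this comment can be checked directly in base R. The snippet below is only an illustrative sketch: the variable names are made up, and it uses the GEOID value from earlier in the thread to show why such identifiers end up stored as doubles rather than 32-bit integers.

```r
# Largest value R's 32-bit integer type can hold (2^31 - 1).
int_max <- .Machine$integer.max

# A numeric literal this large is stored as a double, not an integer.
geoid <- 60150001022000
is_double <- is.double(geoid)

# Coercing to integer fails because the value exceeds the integer range.
as_int <- suppressWarnings(as.integer(geoid))

# Doubles represent every whole number exactly up to 2^53.
exact_limit <- 2^53
fits_exactly <- geoid < exact_limit

# Just past that limit, neighbouring whole numbers collapse together.
collides <- (exact_limit + 1) == exact_limit

# Fixed notation keeps every digit of the identifier.
geoid_chr <- format(geoid, scientific = FALSE)
```

Since the GEOID values are well below 2^53, they are stored exactly as doubles; the scientific notation is purely a formatting matter, not a loss of information in the stored value.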
Thanks @zeehio for the explanation. I was not aware of that (in that detail at least).
write_csv() turns (some) longer numbers into scientific notation and there does not seem to be a way to disable it. This has been mentioned before in #229 and apparently was fixed then, so this might be a regression? The problem is I cannot use scientific notation (well) with the tools that import the csv.
Also compare this to this (which should be equivalent IMHO):
I.e. GEOID\n60150001022e3 instead of GEOID\n60150001022000