write_excel_csv inserts 'U+FEFF', Byte Order Mark (BOM) when append = TRUE, write_csv doesn't. I believe it would be preferable if neither did.
From the docs:
All columns are encoded as UTF-8. write_excel_csv() and write_excel_csv2() also include a UTF-8 Byte order mark which indicates to Excel the csv is UTF-8 encoded.
However, write_excel_csv doesn't include the BOM if there is no "append"ing.
I discovered this problem because a vlookup in Excel wasn't matching the values in the first row of each appended section.
I've included session info, but can confirm that this behavior is the same on Windows and Mac.
Note: reprex creates 2 csv files that are necessary to demonstrate problem.
library(reprex)
library(tidyverse)
simple_tbl <- tibble(
Region = c("West", "East", "North", "South"),
Sales = c(50, 100, 73, 28)
)
simple_write <- function(rgn, sales, out_file, use_excel_csv) {
out <- tibble(
region_summary = paste0("Sales for the ", rgn, " region: ", sales)
)
append_flag <- file.exists( out_file )
if (use_excel_csv) {
write_excel_csv(
out,
out_file,
append = append_flag,
col_names = !append_flag
)
} else {
write_csv(
out,
out_file,
append = append_flag,
col_names = !append_flag
)
}
}
# sales_excel.csv will have U+FEFF (Byte Order Mark)
# at start of each "append"
simple_tbl %>%
select(Region, Sales) %>%
pwalk(~simple_write(..1, ..2, "sales_excel.csv", TRUE))
# sales_no_excel.csv will not have any U+FEFF
simple_tbl %>%
select(Region, Sales) %>%
pwalk(~simple_write(..1, ..2, "sales_no_excel.csv", FALSE))
Created on 2020-02-07 by the reprex package (v0.3.0)
Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 3.6.1 (2019-07-05)
#> os macOS Catalina 10.15.3
#> system x86_64, darwin15.6.0
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz America/New_York
#> date 2020-02-07
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
#> backports 1.1.5 2019-10-02 [1] CRAN (R 3.6.0)
#> broom 0.5.4 2020-01-27 [1] CRAN (R 3.6.1)
#> callr 3.4.1 2020-01-24 [1] CRAN (R 3.6.0)
#> cellranger 1.1.0 2016-07-27 [1] CRAN (R 3.6.0)
#> cli 2.0.1.9000 2020-02-07 [1] Github (r-lib/cli@e9f041e)
#> colorspace 1.4-1 2019-03-18 [1] CRAN (R 3.6.0)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.0)
#> DBI 1.1.0 2019-12-15 [1] CRAN (R 3.6.0)
#> dbplyr 1.4.2 2019-06-17 [1] CRAN (R 3.6.0)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.0)
#> devtools 2.2.1 2019-09-24 [1] CRAN (R 3.6.0)
#> digest 0.6.23 2019-11-23 [1] CRAN (R 3.6.0)
#> dplyr * 0.8.4 2020-01-31 [1] CRAN (R 3.6.0)
#> ellipsis 0.3.0 2019-09-20 [1] CRAN (R 3.6.0)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.0)
#> fansi 0.4.1 2020-01-08 [1] CRAN (R 3.6.0)
#> forcats * 0.4.0 2019-02-17 [1] CRAN (R 3.6.0)
#> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.0)
#> generics 0.0.2 2018-11-29 [1] CRAN (R 3.6.0)
#> ggplot2 * 3.2.1 2019-08-10 [1] CRAN (R 3.6.0)
#> glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.0)
#> gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.0)
#> haven 2.2.0 2019-11-08 [1] CRAN (R 3.6.0)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.0)
#> hms 0.5.3 2020-01-08 [1] CRAN (R 3.6.0)
#> htmltools 0.4.0 2019-10-04 [1] CRAN (R 3.6.0)
#> httr 1.4.1 2019-08-05 [1] CRAN (R 3.6.0)
#> jsonlite 1.6.1 2020-02-02 [1] CRAN (R 3.6.0)
#> knitr 1.28 2020-02-06 [1] CRAN (R 3.6.1)
#> lattice 0.20-38 2018-11-04 [1] CRAN (R 3.6.1)
#> lazyeval 0.2.2 2019-03-15 [1] CRAN (R 3.6.0)
#> lifecycle 0.1.0 2019-08-01 [1] CRAN (R 3.6.0)
#> lubridate 1.7.4 2018-04-11 [1] CRAN (R 3.6.0)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.0)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.0)
#> modelr 0.1.5 2019-08-08 [1] CRAN (R 3.6.0)
#> munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.0)
#> nlme 3.1-140 2019-05-12 [1] CRAN (R 3.6.1)
#> pillar 1.4.3 2019-12-20 [1] CRAN (R 3.6.0)
#> pkgbuild 1.0.6 2019-10-09 [1] CRAN (R 3.6.0)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.0)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.0)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.0)
#> processx 3.4.1 2019-07-18 [1] CRAN (R 3.6.0)
#> ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.0)
#> purrr * 0.3.3 2019-10-18 [1] CRAN (R 3.6.0)
#> R6 2.4.1 2019-11-12 [1] CRAN (R 3.6.0)
#> Rcpp 1.0.3 2019-11-08 [1] CRAN (R 3.6.0)
#> readr * 1.3.1 2018-12-21 [1] CRAN (R 3.6.0)
#> readxl 1.3.1 2019-03-13 [1] CRAN (R 3.6.0)
#> remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.0)
#> reprex * 0.3.0 2019-05-16 [1] CRAN (R 3.6.0)
#> rlang 0.4.4 2020-01-28 [1] CRAN (R 3.6.1)
#> rmarkdown 2.1.1 2020-02-07 [1] Github (cpsievert/rmarkdown@d587b7b)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.0)
#> rvest 0.3.5 2019-11-08 [1] CRAN (R 3.6.0)
#> scales 1.1.0 2019-11-18 [1] CRAN (R 3.6.0)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
#> stringi 1.4.5 2020-01-11 [1] CRAN (R 3.6.0)
#> stringr * 1.4.0 2019-02-10 [1] CRAN (R 3.6.0)
#> testthat 2.3.1 2019-12-01 [1] CRAN (R 3.6.0)
#> tibble * 2.1.3 2019-06-06 [1] CRAN (R 3.6.0)
#> tidyr * 1.0.2 2020-01-24 [1] CRAN (R 3.6.0)
#> tidyselect 1.0.0 2020-01-27 [1] CRAN (R 3.6.1)
#> tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 3.6.0)
#> usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.0)
#> vctrs 0.2.99.9005 2020-02-07 [1] Github (r-lib/vctrs@0fc9acd)
#> withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.0)
#> xfun 0.12 2020-01-13 [1] CRAN (R 3.6.0)
#> xml2 1.2.2 2019-08-09 [1] CRAN (R 3.6.0)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.0)
#>
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
write_excel_csvinserts 'U+FEFF', Byte Order Mark (BOM) when append = TRUE,write_csvdoesn't. I believe it would be preferable if neither did.From the docs:
However,
write_excel_csvdoesn't include the BOM if there is no "append"ing.I discovered this problem because a vlookup in Excel wasn't matching the values in the first row of each appended section.
I've included session info, but can confirm that this behavior is the same on Windows and Mac.
Note: reprex creates 2 csv files that are necessary to demonstrate problem.
Created on 2020-02-07 by the reprex package (v0.3.0)
Session info