[wb_add_data] string_nums option. closes #503 #516

Merged
merged 4 commits into from
Jan 20, 2023


JanMarvin
Owner

@JanMarvin JanMarvin commented Jan 17, 2023

This PR adds a string_nums option to openxlsx2 that controls whether strings that can be expressed as numerics are written as styled numerics rather than as character strings (e.g. "1" as 1). It does not provide on-the-fly conversion from strings to numerics (which ... would be possible as well: simply avoid writing the style to the cell).

This needs some testing (accuracy and speed), so for now I have hidden it behind an option. The option has three states: 0 (the current default), 1 (write string as number with cell style) and 2 (write string as number). The conversion step will likely affect writing time because of the call to is_double().

library(openxlsx2)

dat <- data.frame(x = "2023", y = 2023)
wb <- wb_workbook()

options("openxlsx2.string_nums" = 1)

wb <- wb %>% 
  wb_add_worksheet() %>% 
  wb_add_data(x = dat)

options("openxlsx2.string_nums" = 0)

wb <-  wb %>% 
  wb_add_worksheet() %>% 
  wb_add_data(x = dat)
  

# new
str(wb_to_df(wb, 1))
#> 'data.frame':    1 obs. of  2 variables:
#>  $ x: num 2023
#>  $ y: num 2023
#>  - attr(*, "tt")='data.frame':   1 obs. of  2 variables:
#>   ..$ x: chr "n"
#>   ..$ y: chr "n"
#>  - attr(*, "types")= Named num [1:2] 1 1
#>   ..- attr(*, "names")= chr [1:2] "A" "B"

# old
str(wb_to_df(wb, 2))
#> 'data.frame':    1 obs. of  2 variables:
#>  $ x: chr "2023"
#>  $ y: num 2023
#>  - attr(*, "tt")='data.frame':   1 obs. of  2 variables:
#>   ..$ x: chr "s"
#>   ..$ y: chr "n"
#>  - attr(*, "types")= Named num [1:2] 0 1
#>   ..- attr(*, "names")= chr [1:2] "A" "B"
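
To make the conversion step more concrete, here is a minimal base R sketch of deciding whether the entries of a character vector can be expressed as numerics. The helper `looks_numeric()` is hypothetical and is not the package's `is_double()`; it only illustrates the kind of per-cell check that can add to writing time.

```r
# Hypothetical helper, not part of openxlsx2: TRUE where a string can be
# parsed as a number by base R's as.numeric().
looks_numeric <- function(x) {
  stopifnot(is.character(x))
  # as.numeric() yields NA for strings it cannot parse; silence the warning
  parsed <- suppressWarnings(as.numeric(x))
  !is.na(parsed)
}

x <- c("2023", "1.5", "abc", "")
ok <- looks_numeric(x)

# Parse only the entries that passed the check
nums <- as.numeric(x[ok])
```

Under this sketch, only the first two entries of `x` would be candidates for being written as numbers; how openxlsx2 treats edge cases may differ.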

Update

With the latest push, I see the following timings on my Arch Linux desktop. It is quite fast, though I'm not sure about the effects of setting the style applyNumberFormat = "1", quotePrefix = "1", numFmtId = "49" on every character cell as soon as any cell requires it. That seems like overkill, but setting it only on the cells that need it slows the example down terribly.

n <- 10000
k <- 10

mm <- matrix(rnorm(n = n * k), nrow = n, ncol = k)
mc <- matrix(as.character(mm), nrow = n, ncol = k)


library(openxlsx2)

wb <- wb_workbook()

t1 <- Sys.time()
options("openxlsx2.string_nums" = 0)
wb$add_worksheet()$add_data(x = mc)
t2 <- Sys.time()
options("openxlsx2.string_nums" = 1)
wb$add_worksheet()$add_data(x = mc)
t3 <- Sys.time()
options("openxlsx2.string_nums" = 2)
wb$add_worksheet()$add_data(x = mc)
t4 <- Sys.time()

t2 - t1 # initial matrix as character
#> Time difference of 0.6670108 secs
t3 - t2 # matrix as numeric with style
#> Time difference of 0.963567 secs
t4 - t3 # matrix as numeric
#> Time difference of 0.3893292 secs
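
The benchmark changes the global option three times and leaves it at 2 afterwards. While the feature is hidden behind an option, one way to avoid leaking the setting into later code is a small scoped helper. This is a sketch in base R; `with_string_nums()` is a hypothetical name and not part of openxlsx2.

```r
# Hypothetical helper: evaluate `code` with the option set to `value`,
# then restore whatever value was in place before the call.
with_string_nums <- function(value, code) {
  old <- options("openxlsx2.string_nums" = value)
  on.exit(options(old), add = TRUE)
  # `code` is evaluated lazily, i.e. only here, after the option is set
  force(code)
}

options("openxlsx2.string_nums" = 0)
inside <- with_string_nums(2, getOption("openxlsx2.string_nums"))
after <- getOption("openxlsx2.string_nums")
```

With openxlsx2 loaded, a call such as `with_string_nums(1, wb$add_worksheet()$add_data(x = mc))` would limit the setting to that single write.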

JanMarvin and others added 2 commits January 17, 2023 02:03
…d as numerics (e.g. "1") should be written as styled numerics and not as character strings.
@JanMarvin JanMarvin merged commit 05a631f into main Jan 20, 2023
@JanMarvin JanMarvin deleted the char_nums branch January 20, 2023 16:15
@JanMarvin
Owner Author

Hmpf. The Mac build is failing due to the sprintf warnings triggered by clang. They are already addressed in the Rcpp repo; fortunately they are merely annoying and should be fixed once the next Rcpp release hits CRAN.
