Slow rendering of large tables #1924

@magnusdv

Description

Proposal

I noticed that certain tables take a very long time to print, with most of that time spent in the small function rownum_translation() inside gt:::build_data(). Here is a reprex:

library(gt)

ncol <- 100
nrow <- 2000
df <- matrix(sample(1:10, ncol * nrow, replace = TRUE), ncol = ncol) |>
  as.data.frame()

g <- gt(df) |> 
  tab_style(
    style = cell_text(color = "red"),
    locations = lapply(1:ncol, \(i) cells_body(rows = df[[i]] > 5, columns = i))
  )

# Profiling (truncated)
Rprof()
dat <- gt:::build_data(g, "html")
Rprof(NULL)
summaryRprof()

#> $by.total
#>                                 total.time total.pct self.time self.pct
#> "gt:::build_data"                    48.82     99.96      0.00     0.00
#> "resolve_footnotes_styles"           48.06     98.40      0.00     0.00
#> "rownum_translation"                 46.86     95.95      0.12     0.25
#> "which"                              40.94     83.82      1.14     2.33
#> "as.numeric"                         38.90     79.65     38.90    79.65
#> (...)
#> 
#> $sampling.time
#> [1] 48.84

Indeed, the current rownum_translation() seems unnecessarily inefficient: it uses a for loop to grow a vector and recomputes as.numeric(rownames(body)) on every iteration:

rownum_translation <- function(body, rownum_start) {
  rownum_final <- c()
  for (rownum_s in rownum_start) {
    rownum_final <-
      c(
        rownum_final,
        which(as.numeric(rownames(body)) == rownum_s)
      )
  }
  rownum_final
}

I believe the entire function body can be replaced with match(rownum_start, as.numeric(rownames(body))).
On my computer this reduces the runtime of the example from 48 seconds to less than 2 seconds.
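
To make the comparison concrete, here is a minimal sketch that checks the two approaches against each other on a toy data frame. The function names rownum_translation_loop() and rownum_translation_match() and the toy body are hypothetical and used only for illustration. It assumes row names are unique and numeric-like. One difference to be aware of: match() returns NA for values that are not found, whereas the loop silently drops them, so a real patch might need to handle that case.

```r
# Loop-based version, as quoted in this issue (renamed for the comparison).
rownum_translation_loop <- function(body, rownum_start) {
  rownum_final <- c()
  for (rownum_s in rownum_start) {
    rownum_final <- c(
      rownum_final,
      which(as.numeric(rownames(body)) == rownum_s)
    )
  }
  rownum_final
}

# Proposed vectorised version (hypothetical name): converts the row names
# once and looks up all positions in a single call.
rownum_translation_match <- function(body, rownum_start) {
  match(rownum_start, as.numeric(rownames(body)))
}

# Toy body whose row names are "3", "1", "5", "2", "4".
body <- data.frame(x = 1:5)[c(3, 1, 5, 2, 4), , drop = FALSE]

# When every requested row exists, both versions agree.
res_loop  <- rownum_translation_loop(body, c(2, 5, 3))
res_match <- rownum_translation_match(body, c(2, 5, 3))
identical(res_loop, res_match)

# When a requested row is missing, the results differ:
# the loop drops it, match() yields NA in that position.
miss_loop  <- rownum_translation_loop(body, c(2, 9))
miss_match <- rownum_translation_match(body, c(2, 9))

# One possible way to recover the loop's behaviour is to drop the NAs.
miss_match_clean <- miss_match[!is.na(miss_match)]
```

If missing row numbers cannot occur at this point in build_data(), the plain match() call should be sufficient; I have not verified that.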

I'd be happy to open a PR for this change if you want.

Created on 2024-11-21 with reprex v2.1.1
