using arrow for larger tables #49

pachadotdev · 2022-01-18T01:42:42Z

I made some changes to the API that cut the run time of ots_create_tidy_data() roughly in half (median of about 2.0 s versus 3.9 s in the benchmark below) by reading Parquet instead of JSON from the API:

library(microbenchmark)
library(tradestatistics) # provides ots_create_tidy_data()

microbenchmark(
  ots_create_tidy_data(table = "yrpc-parquet", reporters = "can"),
  ots_create_tidy_data(table = "yrpc", reporters = "can"),
  times = 10L
)

Unit: seconds
                                                            expr      min       lq     mean   median       uq      max neval
 ots_create_tidy_data(table = "yrpc-parquet", reporters = "can") 1.390202 1.894805 2.006423 1.989092 2.309964 2.427866    10
         ots_create_tidy_data(table = "yrpc", reporters = "can") 3.548397 3.750299 3.891013 3.889045 3.950003 4.364702    10

I already sent the PR to allow Parquet serialization in the plumber package, and api.tradestatistics.io is now running the development version of plumber.
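For readers who want to see the difference between the two formats on the client side, here is a minimal sketch, not the package's actual code. It assumes the arrow and jsonlite packages are installed, and the data frame `d` and its column names are hypothetical stand-ins for a table returned by the API. It serializes the same table to Parquet bytes and to JSON text, then parses both back, which is roughly what a client does with each kind of response body.

```r
library(arrow)
library(jsonlite)

# Hypothetical stand-in for a table returned by the API (column names are assumptions)
d <- data.frame(
  year = rep(2018L, 1000),
  reporter_iso = rep("can", 1000),
  trade_value_usd = seq_len(1000) * 1.5,
  stringsAsFactors = FALSE
)

# Parquet route: write the table, then load the file as raw bytes,
# similar to what a server would send in a binary response body
tmp <- tempfile(fileext = ".parquet")
write_parquet(d, tmp)
parquet_bytes <- readBin(tmp, what = "raw", n = file.size(tmp))

# read_parquet() accepts a raw vector, so the client can parse the bytes directly
d_parquet <- as.data.frame(read_parquet(parquet_bytes))

# JSON route: serialize to text and parse it back
json_txt <- toJSON(d, dataframe = "rows")
d_json <- fromJSON(json_txt)

# Both routes should recover the same rows
nrow(d_parquet)
nrow(d_json)
```

Parquet is a binary, columnar format that stores column types alongside the data, so the client skips the text parsing and type guessing that JSON requires. That is one plausible reason for the timings above, although the benchmark itself does not separate download time from parsing time.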

This was referenced Jan 18, 2022

Update feather serializer; Add parquet serializer rstudio/plumber#849

Merged

read parquet data #50

Merged

pachadotdev closed this as completed Feb 2, 2022
