Feature request: Add data.table backend for timetk #102

Open
vidarsumo opened this issue Dec 14, 2021 · 6 comments

Comments

@vidarsumo

Hi Matt.

Functions like tk_augment_lags(), tk_augment_fourier() and tk_augment_slidify() can be slow when you have thousands of time series. For example, on a data set with 6,000 time series it can take up to half an hour to create lags and Fourier terms for all of them.

Would it be possible to add a data.table backend to timetk, or would that require a new package, like dtplyr for dplyr?

@mdancho84
Contributor

This is a great idea. I'd need to investigate what it would take, but it could be done with the dtplyr backend, or by converting to a data.table internally and speeding up the functions that way.
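To illustrate the "convert to a data.table internally" idea, here is a minimal sketch of grouped lag creation with data.table. It is not timetk's implementation: the helper `add_lags_dt()`, its arguments, the `_lag` column naming and the toy data are all hypothetical, and it assumes the rows are already sorted by date within each series.

```r
library(data.table)

# Hypothetical helper (not part of timetk): add lag columns per series.
# Assumes rows are already ordered by date within each group.
add_lags_dt <- function(data, value_col, group_col, lags = 1:2) {
  dt <- as.data.table(data)  # works on a copy, the caller's object is untouched
  new_cols <- paste0(value_col, "_lag", lags)
  # shift() returns one vector per lag when `n` has length > 1,
  # and := assigns them to the new columns group by group
  dt[, (new_cols) := shift(.SD[[1]], n = lags, type = "lag"),
     by = group_col, .SDcols = value_col]
  dt[]
}

# Toy data: two short series, "a" and "b"
toy <- data.frame(
  id    = rep(c("a", "b"), each = 4),
  date  = rep(as.Date("2021-01-01") + 0:3, times = 2),
  value = c(1, 2, 3, 4, 10, 20, 30, 40)
)

lagged <- add_lags_dt(toy, value_col = "value", group_col = "id", lags = 1:2)
print(lagged)
```

Because the lags are computed inside each group, the first rows of every series are filled with `NA` rather than borrowing values from the previous series.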

@spsanderson

I think dtplyr would be the better option, though tidytable might be easier to implement.

@saurabhkumartsc

Hi Matt,

Same request from my side. It would be great if we could use dtplyr with the timetk package.

@AlbertoAlmuinha

Hi @vidarsumo,

I have just opened a PR that modifies the tk_augment_lags and tk_augment_leads functions to use a data.table / tidytable backend. They should now be much faster.

When you have time, could you test this on the process that was so slow for you and tell us whether you notice a significant improvement? No rush.

Regards,

@vidarsumo
Author

Hi @AlbertoAlmuinha, this is great news! I’ll run some tests and let you know :)

@andremanesco

I think it would be good to add parallel processing control to timetk functions. future_frame() takes a long time to run, depending on how many time series the data has.
