Slow functions #7

Open
2 of 4 tasks
NickCH-K opened this issue Jul 20, 2019 · 4 comments
Comments

@NickCH-K
Owner

NickCH-K commented Jul 20, 2019

The following pmdplyr functions are too slow, to the point where even their examples take more than 5 seconds to run, and would ideally be faster:

  • tlag when .d=0 (slow because it currently performs a join)
  • panel_fill
  • panel_locf
  • mutate_cascade

In many cases the slowness comes from the need to work even when .i and .t do not uniquely identify observations. One possible speedup is to check at call time whether they do uniquely identify observations, and divert to a quick method if so. tlag already has an option for this that the user can select. Ideally, though, all cases will be sped up.
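
A rough sketch of the check-then-divert idea, written in plain Python rather than R so that it stands alone. This is not pmdplyr's implementation: the function names (`ids_are_unique`, `lag_fast`, `lag_general`, `lag`), the row-as-dict layout, and the mean used to collapse duplicates are all assumptions made for illustration.

```python
from collections import defaultdict


def ids_are_unique(rows, i="i", t="t"):
    """Return True if every (i, t) pair appears at most once."""
    seen = set()
    for row in rows:
        key = (row[i], row[t])
        if key in seen:
            return False
        seen.add(key)
    return True


def lag_fast(rows, var, i="i", t="t", n=1):
    """Quick path: one dict lookup per row. Only valid when (i, t) is unique."""
    lookup = {(r[i], r[t]): r[var] for r in rows}
    return [lookup.get((r[i], r[t] - n)) for r in rows]


def lag_general(rows, var, i="i", t="t", n=1):
    """Slow path: group duplicate (i, t) rows and collapse them (here, by mean)."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r[i], r[t])].append(r[var])
    out = []
    for r in rows:
        key = (r[i], r[t] - n)
        # Membership test does not create a key in a defaultdict.
        if key in groups:
            vals = groups[key]
            out.append(sum(vals) / len(vals))
        else:
            out.append(None)
    return out


def lag(rows, var, i="i", t="t", n=1):
    """Check uniqueness once at call time, then divert to the quick method."""
    if ids_are_unique(rows, i, t):
        return lag_fast(rows, var, i, t, n)
    return lag_general(rows, var, i, t, n)
```

The uniqueness check is a single pass over the data, so it costs little compared with the general path, which has to group and collapse duplicates before it can look anything up.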

@NickCH-K
Owner Author

tlag (both .d=0 and otherwise) considerably sped up in 108e3c6

@NickCH-K
Owner Author

NickCH-K commented Aug 5, 2019

panel_fill improved in 58bb82b but could stand to be improved more

@NickCH-K
Owner Author

NickCH-K commented Aug 7, 2019

panel_locf improved massively in d255fad and now at a good place.

@NickCH-K
Owner Author

mutate_cascade improved modestly in 1aaae84. Not sure how much faster it can get without a major restructuring.

@NickCH-K NickCH-K added this to the 0.4.0 Checklist milestone Aug 13, 2019