We could do much better than this in terms of performance with a C implementation. vec_group_rle() does a lot more work than is necessary here because it uses a dictionary to keep track of what it has already seen.
Also consider vec_rle(), which would differ from vec_group_rle() because the latter uses a dictionary to track the first time we saw an individual value. vec_rle() would be much simpler and would just track changes in x, like vec_runs(). It would return a two-column data frame with val and len, very much like rle().
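To make the "just track changes in x" idea concrete, here is a minimal sketch in C, restricted to a plain int array. The function name `int_rle()` and its output buffers are hypothetical and are not part of vctrs; a real vec_rle() would need to handle arbitrary vector types and return a data frame.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only (not a vctrs function).
 * Run-length encodes `x` by comparing each element with its predecessor,
 * so no dictionary of previously seen values is needed.
 * `val` and `len` must each have room for `n` entries.
 * Returns the number of runs written. */
size_t int_rle(const int *x, size_t n, int *val, size_t *len) {
  assert(x != NULL || n == 0);

  if (n == 0) {
    return 0;
  }

  size_t run = 0;
  val[0] = x[0];
  len[0] = 1;

  for (size_t i = 1; i < n; ++i) {
    if (x[i] == x[i - 1]) {
      /* Same value as the previous element: extend the current run */
      ++len[run];
    } else {
      /* Value changed: start a new run */
      ++run;
      val[run] = x[i];
      len[run] = 1;
    }
  }

  return run + 1;
}
```

With x = {1, 1, 2, 2, 2, 1} this produces three runs, (val 1, len 2), (val 2, len 3), (val 1, len 1). The trailing 1 starts a new run even though that value appeared earlier, which is where this differs from the dictionary-based approach.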
Also, why doesn't vec_group_rle() return a data frame rather than a rcrd? It seems like that would've been simpler for such a low level function.
For vec_runs() and vec_rle() to work efficiently, we need to extract the equality comparison caching utilities from the dictionary code (i.e. d->equal and d->vec_p). That will allow us to compare values of x for equality very efficiently.
Inspired by the adjacency grouping idea in tidyverse/dplyr#5184
I've added this to the vec-prefixes google sheet
Created on 2020-05-05 by the reprex package (v0.3.0)