reduce side vectorization #103

Closed
piccolbo opened this Issue · 1 comment

@piccolbo
Owner

We have not defined what vectorization could mean on the reduce side, or whether this feature makes sense at all. If the number of reduce groups is relatively small, then applying vectorized primitives to each group's list of values is all the vectorization we need. If the groups are small and numerous, this may not be enough. Presenting the user with a list of keys and a list of lists of values may only complicate matters without any speed gain, since such nested structures can only be processed with functions in the apply family. The structured case, however, could present an opportunity to vectorize the reduce phase. One could imagine a data frame where the first column is the key and there are as many rows as the sum of the sizes of all the groups that we want to process at once, as in

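kk   # list of keys
vvv  # list of lists of values, each inner list associated with one key, positionally
# Repeat each key once per value in its group and flatten the values into one column
# (assumes scalar keys and scalar values).
data.frame(keys = rep(unlist(kk), times = sapply(vvv, length)), values = unlist(do.call(c, vvv)))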
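
To make the idea concrete, here is a minimal sketch on toy data. The inputs, the choice of a per-key sum as the reduce operation, and the use of base R's `rowsum` are all assumptions for illustration, not what rmr does or would do. It only shows that once the groups are flattened into one data frame, a single vectorized call can replace one reduce call per key.

```r
# Hypothetical toy input: three keys, each with a small group of numeric values.
kk <- list("a", "b", "c")
vvv <- list(list(1, 2), list(3), list(4, 5, 6))

# Flatten into one data frame: each key is repeated once per value in its group.
df <- data.frame(
  keys = rep(unlist(kk), times = sapply(vvv, length)),
  values = unlist(do.call(c, vvv)),
  stringsAsFactors = FALSE
)

# One vectorized pass computes the per-key sum for all groups at once,
# instead of calling a reduce function once per key.
sums <- rowsum(df$values, df$keys)
```

Here `sums` is a one-column matrix with one row per key, so `sums["a", 1]` holds the total for key `"a"`.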
@piccolbo
Owner

oops this was added in 2.0

@piccolbo piccolbo closed this