WIP: Add keyby iterator #1
I can squash before merging.
I think we should also have a typing rule stating that `data.type` for `keyby` iterators must be `stream`, since they make little sense on an array, for example, where, as you pointed out, `groupmerger` can already achieve this effect.
Furthermore, if we want to be able to use this, we might want to introduce some way to generate hashes efficiently. Right now we can essentially cast numbers to u64 and add them up. We could add some syntactic sugar, i.e. a `hash` operator that generates casts and sums, as well as magic numbers for structs for example. Then the `keyby`-function would essentially look like `|e| hash({e.$1, e.$2})`, for example, instead of `|e| 89 + ((u64)e.$1) + ((u64)e.$2)` or whatever.
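To make the cast-and-sum idea concrete, here is a rough Python sketch of what a `hash` operator might expand to. The names `to_u64` and `hash_fields` are hypothetical and only for illustration, and the default magic number 89 is simply the one from the example above; none of this is the actual implementation.

```python
U64_MASK = (1 << 64) - 1  # keep every intermediate value within 64 bits


def to_u64(x):
    """Emulate a cast to u64 by truncating to the low 64 bits (assumed semantics)."""
    return int(x) & U64_MASK


def hash_fields(fields, magic=89):
    """Hypothetical expansion of hash({f1, f2, ...}): magic + sum of u64 casts.

    Addition wraps around at 2**64, as u64 arithmetic would.
    """
    total = magic
    for f in fields:
        total = (total + to_u64(f)) & U64_MASK
    return total


# Mirrors |e| 89 + ((u64)e.$1) + ((u64)e.$2) for the struct {3, 4}.
key = hash_fields((3, 4))
```

One thing to keep in mind with a purely additive scheme is that it ignores field order, so `{3, 4}` and `{4, 3}` get the same key. That may be fine for partitioning, but it is worth deciding on explicitly.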
Ok, good points. I'll add some more commits.
I think it is best if I continue with |
This PR adds support for a `keyby` iterator. The key function, for example `|e| u64(e)`, takes an element and returns a corresponding key, which is used for partitioning elements. With that key function, `[1,2,3,3]` would be partitioned into `[1]`, `[2]` and `[3,3]`. The body of the for loop then iterates over each key-value pair, where `k` is the key and `v` is the value.
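The intended partitioning semantics can be sketched in plain Python as follows. This is only a model of the behaviour described above, not the code in this PR; the helper name `keyby` and the use of a dict to hold the partitions are assumptions made for illustration.

```python
def keyby(elements, key_fn):
    """Partition elements by key_fn, keeping first-seen key order.

    Returns a dict mapping each key to the list of elements that produced it.
    """
    groups = {}
    for e in elements:
        groups.setdefault(key_fn(e), []).append(e)
    return groups


# Key function modelled on |e| u64(e): the element itself is the key.
partitions = keyby([1, 2, 3, 3], lambda e: int(e))

# The loop body then sees one (k, v) pair per partition.
for k, v in partitions.items():
    print(k, v)
```

Each iteration of the loop corresponds to one execution of the for-loop body with `k` bound to the key and `v` bound to that key's partition.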