WIP: Add keyby iterator #1

Merged
merged 1 commit into from
Nov 8, 2018

segeljakt (Member) commented Oct 27, 2018

This PR adds support for a keyby iterator which can be used as:

let s = streamappender[i32];
...
let s = result(s);
for(keyby(s, |e| u64(e)), appender[i32], |b,k,v| merge(b,v));

|e| u64(e) is the key function: it takes an element and returns the corresponding key, which is used to partition the elements. For example, [1,2,3,3] would in this case be partitioned into [1], [2] and [3,3]. The body of the for loop then iterates over each key-value pair, where b is the builder, k is the key and v is the value.
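To make the partitioning concrete, here is a minimal Python sketch of the semantics described above. It is not the implementation in this PR: the names keyby and for_keyby are hypothetical, a finite list stands in for the stream, a plain list stands in for appender[i32], and it assumes the loop body is called once per element with that element's key.

```python
def keyby(stream, key_fn):
    """Partition elements by key, keeping first-seen key order (assumed behaviour)."""
    groups = {}
    for e in stream:
        groups.setdefault(key_fn(e), []).append(e)
    return groups


def for_keyby(stream, key_fn, builder, body):
    """Call body(builder, key, value) for every element, one partition at a time."""
    for k, values in keyby(stream, key_fn).items():
        for v in values:
            builder = body(builder, k, v)
    return builder


# The identity function stands in for the |e| u64(e) cast.
partitions = keyby([1, 2, 3, 3], lambda e: e)

# A list stands in for appender[i32]; b + [v] stands in for merge(b, v).
result = for_keyby([1, 2, 3, 3], lambda e: e, [], lambda b, k, v: b + [v])
```

With the example input, partitions maps key 1 to [1], key 2 to [2] and key 3 to [3, 3], matching the partitioning in the description.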

@segeljakt segeljakt changed the title from "WIP: Add keyby iterator" to "WIP: Add bykey iterator" Oct 27, 2018
@segeljakt segeljakt changed the title from "WIP: Add bykey iterator" to "WIP: Add keyby iterator" Nov 6, 2018
@segeljakt segeljakt force-pushed the klasseg/keyby branch 2 times, most recently from b6ea06b to c7a6e2c, November 6, 2018 15:31
segeljakt (Member, Author) commented Nov 6, 2018

I can squash before merging

Bathtor (Contributor) left a comment

I think we should also have a typing rule stating that data.type for keyby iterators must be stream, since they make little sense on an array, for example, where, as you pointed out, groupmerger can already achieve this effect.
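A rough sketch of what such a typing rule could look like, written in Python purely for illustration. The Iter record, the check_iter function and the "keyby", "stream" and "vec" labels are all hypothetical and do not correspond to the project's actual type checker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Iter:
    kind: str       # hypothetical iterator kind label, e.g. "keyby" or "scalar"
    data_type: str  # hypothetical data type label, e.g. "stream" or "vec"


def check_iter(it):
    """Reject keyby iterators whose data is not a stream; return the data type otherwise."""
    if it.kind == "keyby" and it.data_type != "stream":
        raise TypeError(
            "keyby iterator requires data of type stream, got " + it.data_type
        )
    return it.data_type
```

Under this rule, Iter("keyby", "stream") passes, while Iter("keyby", "vec") is rejected, which would steer array use cases towards groupmerger.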

Furthermore, if we want to be able to use this, we might want to introduce some way to generate hashes efficiently. Right now we can essentially cast numbers to u64 and add them up. We could add some syntactic sugar, e.g. a hash operator that generates the casts and sums, as well as magic numbers for structs. The keyby function would then look like |e| hash({e.$1, e.$2}) instead of |e| 89 + ((u64)e.$1) + ((u64)e.$2) or whatever.
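As an illustration of the desugaring suggested here, the following Python sketch computes what a hash over struct fields would expand to: a magic number plus the sum of the fields cast to u64. The helper names u64 and hash_fields are hypothetical, the magic number 89 is taken from the example in the comment, and wrapping arithmetic modulo 2^64 is an assumption about how u64 addition would behave.

```python
U64_MASK = (1 << 64) - 1


def u64(x):
    """Emulate a cast to u64 by truncating to an integer and wrapping to 64 bits."""
    return int(x) & U64_MASK


def hash_fields(fields, magic=89):
    """Expand to magic + u64(f1) + u64(f2) + ..., wrapping on overflow."""
    total = magic
    for f in fields:
        total = (total + u64(f)) & U64_MASK
    return total


key = hash_fields((1, 2))  # corresponds to 89 + ((u64)e.$1) + ((u64)e.$2)
```

One property worth noting: because the fields are only summed, the result does not depend on field order, so (1, 2) and (2, 1) produce the same key. A dedicated hash operator could pick a better mixing scheme without changing the keyby function's surface syntax.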

segeljakt (Member, Author) commented

Ok, good points. I'll add some more commits

@segeljakt segeljakt mentioned this pull request Nov 7, 2018
segeljakt (Member, Author) commented

I think it is best if I continue with hash in a separate PR #3.

@segeljakt segeljakt requested review from Bathtor and removed request for Bathtor November 7, 2018 17:20
@segeljakt segeljakt removed the request for review from Bathtor November 7, 2018 18:18
@Bathtor Bathtor merged commit 22add72 into master Nov 8, 2018
@segeljakt segeljakt deleted the klasseg/keyby branch February 21, 2020 15:58
segeljakt added a commit that referenced this pull request Feb 28, 2022
2 participants