Some more useful helper / wrapper functions #91
Merged
Just a couple of small things I came across while working with the library.
Firstly, `dataset.foreachPartition { }` has an overload resolution ambiguity, so I added `forEachPartition { }` to the API.

Secondly, some functions (like `reduceGroups()`) return a `Dataset<Pair<K, V>>`, or users create a key/value-like dataset themselves, and then it can be useful to take just the keys or just the values (I know I found it useful). So I added `takeKeys()` and `takeValues()` for `Dataset<Pair>`, `Dataset<Tuple2>`, and `Dataset<Arity2>`. It's a small thing, but it might improve readability.

Lastly, just like getting columns using property references, it might also be useful to sort datasets using those, so I added the ability to do that as well.
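To make the three ideas concrete, here is a rough sketch of how helpers like these could be written on top of Spark's Java-facing `Dataset` API. This is not the code from this PR: the `*Sketch` names, the explicit `keyEncoder` parameter, and the assumption that column names match Kotlin property names are all illustrative choices of mine.

```kotlin
import org.apache.spark.api.java.function.ForeachPartitionFunction
import org.apache.spark.api.java.function.MapFunction
import org.apache.spark.sql.Dataset
import org.apache.spark.sql.Encoder
import kotlin.reflect.KProperty1

// Hypothetical helper: wrapping the lambda in an explicit ForeachPartitionFunction
// selects the Java overload of foreachPartition, so the call is no longer ambiguous
// with the Scala Function1 overload.
fun <T> Dataset<T>.forEachPartitionSketch(func: (Iterator<T>) -> Unit) {
    foreachPartition(ForeachPartitionFunction<T> { partition -> func(partition) })
}

// Hypothetical helper: keep only the first component of each pair.
// The encoder is passed explicitly here to keep the sketch self-contained.
fun <K, V> Dataset<Pair<K, V>>.takeKeysSketch(keyEncoder: Encoder<K>): Dataset<K> =
    map(MapFunction<Pair<K, V>, K> { it.first }, keyEncoder)

// Hypothetical helper: sort by property references.
// Assumes each column is named after the corresponding Kotlin property.
fun <T> Dataset<T>.sortBySketch(
    first: KProperty1<T, *>,
    vararg rest: KProperty1<T, *>
): Dataset<T> = sort(first.name, *rest.map { it.name }.toTypedArray())
```

With a hypothetical `data class Person(val name: String, val age: Int)`, a call could then look like `people.sortBySketch(Person::age, Person::name)`, which reads much like selecting columns by property reference.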
@asm0dey let me know if these are helpful :)