Add partitionToMap function #1968
Codecov Report
@@            Coverage Diff             @@
##           master    #1968      +/-   ##
==========================================
- Coverage   69.33%   67.62%   -1.72%
==========================================
  Files         214      214
  Lines        6659     6662       +3
  Branches      365      460      +95
==========================================
- Hits         4617     4505     -112
- Misses       2042     2157     +115
Thanks for this! Some minor comments:
 * @return partitioned SCollections in a `Map`
 * @group collection
 */
def partitionToMap[U: Coder](partitionKeys: Set[U], f: T => U): Map[U, SCollection[T]] = {
Rename to `partition`?
That is what I wanted to do originally, but the compiler didn't like it.
It caused errors: for example, in the partition-by-predicate method, the compiler couldn't tell whether to use my new method or the original one. Can that be avoided without breaking backwards compatibility?
Yeah, it will require type ascription. It's OK to add it in this version, since it already introduces some breaking changes. TBH, `partition` should have been curried.
On second thought, it might be better to keep a function signature similar to what you had:
def partitionByKey[U: Coder](partitionKeys: Set[U])(f: T => U): Map[U, SCollection[T]]
* Add partitionToMap function
* fix indentation
* addressing review comments
* Addressing review comments, part 2
Being forced to work with integers when partitioning can be annoying. This helper method removes that annoyance by letting the user supply a set of possible outcomes and a function that maps each element into that set.
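
To illustrate the idea in the description, here is a minimal sketch that uses plain Scala `Seq`s as a stand-in for `SCollection`. It is not the code from this PR: `PartitionByKeySketch` and `partitionByIndex` are hypothetical names, and `partitionByIndex` only mimics an integer-based partition. The sketch assumes `f` always returns a member of `partitionKeys`. A value outside the set would throw a `NoSuchElementException` here.

```scala
object PartitionByKeySketch {
  // Hypothetical stand-in for an integer-based partition:
  // bucket i holds every element for which f returns i.
  def partitionByIndex[T](xs: Seq[T], n: Int)(f: T => Int): Seq[Seq[T]] =
    (0 until n).map(i => xs.filter(x => f(x) == i))

  // Key-based partition built on top of the integer-based one.
  // Each key is assigned a bucket index, so callers never handle integers.
  def partitionByKey[T, U](xs: Seq[T], partitionKeys: Set[U])(f: T => U): Map[U, Seq[T]] = {
    val keys = partitionKeys.toIndexedSeq      // fix an order for the keys
    val indexOf = keys.zipWithIndex.toMap      // key -> bucket index
    // Assumption: f(x) is always in partitionKeys; otherwise indexOf throws.
    val buckets = partitionByIndex(xs, keys.size)(x => indexOf(f(x)))
    keys.zip(buckets).toMap                    // key -> its bucket
  }

  def main(args: Array[String]): Unit = {
    val parts = partitionByKey(Seq(1, 2, 3, 4, 5, 6), Set("even", "odd")) { n =>
      if (n % 2 == 0) "even" else "odd"
    }
    println(parts("even"))
    println(parts("odd"))
  }
}
```

The curried second parameter list mirrors the `partitionByKey` signature suggested in the review, and it lets the partitioning function be passed as a trailing block at the call site.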