Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MonoidAggregator combinators #659

Open
wants to merge 3 commits into
base: develop
Choose a base branch
from

Conversation

johnynek
Copy link
Collaborator

this some code we have been using at Stripe written by @non and myself.

We should support probably non-monoid (that is Semigroup based Aggregator) as well, although we try to use MonoidAggregator primarily at Stripe.

cc @erik-stripe

@johnynek
Copy link
Collaborator Author

it looks like internally Parallel.oneOf and ApplicativeAggregators are the only ones we actually use, which maybe makes sense. Maybe we can only do those two for semigroups and maybe even remove the others from the PR in the interest of keeping things simpler...

@codecov-io
Copy link

Codecov Report

Merging #659 into develop will increase coverage by 0.11%.
The diff coverage is 95.76%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #659      +/-   ##
===========================================
+ Coverage    89.44%   89.55%   +0.11%     
===========================================
  Files          113      118       +5     
  Lines         8945     9078     +133     
  Branches       490      491       +1     
===========================================
+ Hits          8001     8130     +129     
- Misses         944      948       +4
Impacted Files Coverage Δ
...tter/algebird/generic/ApplicativeAggregators.scala 100% <100%> (ø)
...twitter/algebird/generic/CombinedAggregators.scala 100% <100%> (ø)
.../twitter/algebird/generic/CinchedAggregators.scala 90% <90%> (ø)
...twitter/algebird/generic/ParallelAggregators.scala 96.66% <96.66%> (ø)
...m/twitter/algebird/generic/JoinedAggregators.scala 96.66% <96.66%> (ø)
... and 9 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e45e654...69f72ab. Read the comment docs.

* val b: MonoidAggregator[Animal, ShapeStats, Result] = ...
*
* val m1: Aggregator[Animal, ColorStats :: ShapeStats :: HNil, Result] =
* CombinedAggregators(a :: b :: HNil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about all the HList stuff but this seems like a combinator on ApplicativeAggregators. I would think that if I have:

val a: MonoidAggregator[Animal, ColorStats, Result] = ...
val b: MonoidAggregator[Animal, ShapeStats, Result]= ..

I should be able to do something like this

ApplicativeAggregators(a::b::HNil).andThenPresent{ case (a,b) => implicitly[Semigroup[Result]].combine(a,b)  }

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, it's the same concept, it is just implemented directly, which is actually in a way easier due to all the shapeless work.

* We can also say:
*
* val m3: MonoidAggregator[Animal, DogStats :: CatStats :: HNil, Result] =
* JoinedAggregators(a :: b :: HNil).composePrepare(Generic[Animal].to)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above, why cant be implement this as a combinator using ApplicativeAggregators?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you show me how? Suggest a change?

Copy link
Contributor

@MansurAshraf MansurAshraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnynek are these compatible with algebird-spark? I wonder if we will run into Serialization issues when HList based aggregators cross execution boundaries

@johnynek
Copy link
Collaborator Author

Good point! We also have kryo serializers for HLists (easy to do, but a good idea to add).

We don’t customize our kryo serializers internally as far as I know (much less spark than scalding) but we did set up for scalding.

@MansurAshraf
Copy link
Contributor

@johnynek can we get this merged? Would love to use it at my work

@CLAassistant
Copy link

CLAassistant commented Nov 16, 2019

CLA assistant check
All committers have signed the CLA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants