Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EventuallyAggregator and variants #407

Merged
merged 2 commits into from
Apr 13, 2015

Conversation

avibryant
Copy link
Contributor

An Aggregator that wraps the EventuallySemigroup (and one for EventuallyMonoid), as well as a SetSizeAggregator that starts with a Set and ends with a HLL.

@@ -451,3 +451,15 @@ case class HyperLogLogAggregator(val hllMonoid: HyperLogLogMonoid) extends Monoi
def prepare(value: Array[Byte]) = monoid.create(value)
def present(hll: HLL) = hll
}

case class SetSizeAggregator[A](hllBits: Int, maxSetSize: Int = 10)(implicit toBytes: A => Array[Byte])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it might make sense to make the default maxSetSize to be something that depends on hllBits. At the extreme you could serialize every item coming in and increment a byte counter (although the in-memory may be much less or more, but still), then when it exceeds 2^hllBits, switch over. I would think something like maxSetSize: Int = math.max(10, 1 << (hllBits - 5)) might be a decent default.

@johnynek
Copy link
Collaborator

This looks great, but you're pushing our test coverage down here. Can we get at least a simple test to exercise this a bit?

Also, can you add some comments to the type so the scaladoc is not bare?

@johnynek
Copy link
Collaborator

ping for at least a test or two that exercises this.

@johnynek
Copy link
Collaborator

ping again for a test.

@avibryant
Copy link
Contributor Author

Sorry, been busy - I'll try to get one in, but there's always the next release.

@johnynek
Copy link
Collaborator

@DanielleSucher thanks for the tests!

johnynek added a commit that referenced this pull request Apr 13, 2015
@johnynek johnynek merged commit 062f560 into twitter:develop Apr 13, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants