stats: implement Count-Min Sketch #4970

Merged
merged 8 commits into from Nov 6, 2017

7 participants
@lamxTyler
Member

lamxTyler commented Nov 1, 2017

Count-Min Sketch is used to estimate point queries. It consists of d counter arrays, each of size w. Every item is mapped to one position in each array.
PTAL @coocood @hanfei1991 @winoros
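
For readers without access to the design document, the structure described above can be sketched roughly as follows. This is not the code from this PR: the type and function names are illustrative, and the hashing uses the standard library's FNV as a stand-in, since the hash function used in the PR is not shown in this conversation. Only the row index formula `(h1 + h2*uint64(i)) % uint64(c.width)` is taken from the diff under review.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
)

// CMSketch is an illustrative Count-Min Sketch: depth counter arrays of size width.
type CMSketch struct {
	depth int
	width int
	count uint64 // total number of inserted items
	table [][]uint32
}

// NewCMSketch allocates a zeroed depth x width counter table.
func NewCMSketch(depth, width int) *CMSketch {
	table := make([][]uint32, depth)
	for i := range table {
		table[i] = make([]uint32, width)
	}
	return &CMSketch{depth: depth, width: width, table: table}
}

// hashPair derives two 64-bit hashes from the input.
// Assumption: FNV-1a and FNV-1 stand in for the PR's real hash function.
func hashPair(data []byte) (uint64, uint64) {
	a := fnv.New64a()
	a.Write(data)
	b := fnv.New64()
	b.Write(data)
	return a.Sum64(), b.Sum64()
}

// Insert increments one counter in every row.
func (c *CMSketch) Insert(data []byte) {
	c.count++
	h1, h2 := hashPair(data)
	for i := range c.table {
		j := (h1 + h2*uint64(i)) % uint64(c.width)
		c.table[i][j]++
	}
}

// Query returns the smallest counter the item maps to. Collisions can only
// inflate a counter, so the result never underestimates the true frequency.
func (c *CMSketch) Query(data []byte) uint32 {
	h1, h2 := hashPair(data)
	lowest := uint32(math.MaxUint32)
	for i := range c.table {
		j := (h1 + h2*uint64(i)) % uint64(c.width)
		if c.table[i][j] < lowest {
			lowest = c.table[i][j]
		}
	}
	return lowest
}

func main() {
	s := NewCMSketch(5, 2048)
	for i := 0; i < 5; i++ {
		s.Insert([]byte("a"))
	}
	s.Insert([]byte("b"))
	fmt.Println("estimate for a:", s.Query([]byte("a")))
	fmt.Println("estimate for b:", s.Query([]byte("b")))
}
```

The plain minimum shown here is the textbook estimator; the diff discussed later in this thread additionally subtracts a noise term from each counter.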

@shenli

Member

shenli commented Nov 1, 2017

Any reference for the Count-Min Sketch algorithm?

@hanfei1991

Member

hanfei1991 commented Nov 1, 2017

@shenli A design document has been shared with the TiDB team by email.

		min = c.table[i][j]
	}
	noise := (c.count - uint64(c.table[i][j])) / (uint64(c.width) - 1)
	if uint64(c.table[i][j]) < noise {

@jackysp

jackysp Nov 2, 2017

Member

Could this happen?

@lamxTyler

lamxTyler Nov 2, 2017

Member

Yes, it occurs quite often.

@jackysp

jackysp Nov 2, 2017

Member

ok
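
To make the case asked about above concrete: the counter can fall below the noise term whenever the sketch holds many items in total but the queried item is rare. The sketch below reuses the noise formula from the diff; the helper name `subtractNoise` and the choice to clamp the result to zero are assumptions for illustration, since the body of the `if` is not visible in this conversation. Clamping is one plausible reason for the check, because subtracting a larger unsigned value would otherwise wrap around.

```go
package main

import "fmt"

// subtractNoise is a hypothetical helper. It estimates how much of a counter
// is due to other items (the total count spread evenly over the remaining
// width-1 positions) and removes that share. It assumes width > 1 and
// count >= counter.
func subtractNoise(counter uint32, count uint64, width int) uint32 {
	noise := (count - uint64(counter)) / (uint64(width) - 1)
	if uint64(counter) < noise {
		// Assumed behavior: clamp to zero instead of letting the
		// unsigned subtraction wrap around.
		return 0
	}
	return counter - uint32(noise)
}

func main() {
	// A rare item in a heavily loaded sketch: the counter is below the noise.
	fmt.Println(subtractNoise(3, 1000000, 1000))
	// A frequent item: the counter is well above the noise.
	fmt.Println(subtractNoise(5000, 1000000, 1000))
}
```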

statistics/cmsketch.go
vals := make([]uint32, c.depth)
min := uint32(math.MaxUint32)
for i := range c.table {
	j := (h1 + h2*uint64(i)) % uint64(c.width)

@zz-jason

zz-jason Nov 2, 2017

Member

How about extracting (h1 + h2*uint64(i)) % uint64(c.width) into a function?

@lamxTyler

lamxTyler Nov 6, 2017

Member

Is it too simple to be a function?
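
For reference, the extraction under discussion might look like the sketch below. The name `position` and its signature are hypothetical and do not come from this PR; whether such a one-line helper is worth having is the judgment call raised in this thread.

```go
package main

import "fmt"

// position is a hypothetical helper that returns the column for a given row,
// using the double-hashing expression from the diff. The multiplication and
// addition wrap around on uint64 overflow, which is harmless for indexing.
func position(h1, h2 uint64, row, width int) uint64 {
	return (h1 + h2*uint64(row)) % uint64(width)
}

func main() {
	// With h1 = 10, h2 = 3 and width = 7, each row gets its own column.
	for row := 0; row < 3; row++ {
		fmt.Println(row, position(10, 3, row, 7))
	}
}
```

A helper like this would keep the insert and query paths from drifting apart if the indexing scheme ever changes, at the cost of one more name to read past.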

zhexuany and others added some commits Nov 3, 2017

@zz-jason

LGTM

@zz-jason zz-jason added status/LGT2 and removed status/LGT1 labels Nov 6, 2017

@zz-jason zz-jason added this to the 1.1 milestone Nov 6, 2017

@lamxTyler lamxTyler merged commit 363957b into pingcap:master Nov 6, 2017

4 checks passed

ci/circleci Your tests passed on CircleCI!
continuous-integration/travis-ci/pr The Travis CI build passed
jenkins-ci-tidb/build Jenkins job succeeded.
license/cla Contributor License Agreement is signed.

@lamxTyler lamxTyler deleted the lamxTyler:cms branch Nov 6, 2017