min_occ make wrong result #14

Closed
amutu opened this issue Mar 8, 2014 · 2 comments

amutu commented Mar 8, 2014

If I set min_occ to more than 1, the results are mostly 0.

postgres=# select cs_hash_dup_count(uin,stacktopsignature,1) from myt_span(1,50);
NOTICE: IMCS command: hash_dup_count

cs_hash_dup_count
("int8:{1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,3,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1}","bpchar64:{32c62675f464d72c
eaf530957dfa3d38,fe6e4bdfb42bd353641951599f2d17a3,38882fc51c5904920b6363c611ed3e24,e4e5eaf5a5bdb2df8c73aea43696474a,a91e8a8df6c9
461c89de55fd7c9809e8,cda6dbe8985286a1971f4a3d9777c824,f2dbd4d6a989ba4f539a661a4212458f,74391d674d27e2f252b4d6061ae2290f,38c9469e
7dc79502bd7ec7d279b8efb6,79725250e4a7de4455fbc35bbc039022,29d33060ebe62c3a7abef6b7b4341bbe,2cd943b0d519b34bfb37bbd1b8392a22,cd3e
406cd70290b26a0316c116cfdc17,d52d443cea54213c1b5f11f7b0b74d12,eb697ae8e36d6fdc98a9c3d0b65e77a5,ffd9dbb06a5a8956c71596140938d5d3,
d12f73d3fbd7db2429d3d9e7309a73ae,6e0443a38c7580540e896d8960390713,2c584ff712b31711376090d3d31411ee,31eae0cfda948b3c3852c5cbf2e63
659,cda0502c6cdd597131b41c1b28af9fef,2191a6ae264f1058fb01a0084a73b509,879be69f7d19b3f158b8d056778684c9,7dff25db9e295ed0a61610930
9a0b15d,bd9c2154f0c4482a06f21e14adef0cd1,dc8028c36369b682d695c360f19ccf17,22e9f6e84bee9fab1176687f1104a902,ebbd8e48d04ac8e0a75ea
be4ffae3d8c,...}")
(1 row)

postgres=# select cs_hash_dup_count(uin,stacktopsignature,2) from myt_span(1,50);
NOTICE: IMCS command: hash_dup_count

cs_hash_dup_count
("int8:{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}","bpchar64:{32c62675f464d72c
eaf530957dfa3d38,fe6e4bdfb42bd353641951599f2d17a3,38882fc51c5904920b6363c611ed3e24,e4e5eaf5a5bdb2df8c73aea43696474a,a91e8a8df6c9
461c89de55fd7c9809e8,cda6dbe8985286a1971f4a3d9777c824,f2dbd4d6a989ba4f539a661a4212458f,74391d674d27e2f252b4d6061ae2290f,38c9469e
7dc79502bd7ec7d279b8efb6,79725250e4a7de4455fbc35bbc039022,29d33060ebe62c3a7abef6b7b4341bbe,2cd943b0d519b34bfb37bbd1b8392a22,cd3e
406cd70290b26a0316c116cfdc17,d52d443cea54213c1b5f11f7b0b74d12,eb697ae8e36d6fdc98a9c3d0b65e77a5,ffd9dbb06a5a8956c71596140938d5d3,
d12f73d3fbd7db2429d3d9e7309a73ae,6e0443a38c7580540e896d8960390713,2c584ff712b31711376090d3d31411ee,31eae0cfda948b3c3852c5cbf2e63
659,cda0502c6cdd597131b41c1b28af9fef,2191a6ae264f1058fb01a0084a73b509,879be69f7d19b3f158b8d056778684c9,7dff25db9e295ed0a61610930
9a0b15d,bd9c2154f0c4482a06f21e14adef0cd1,dc8028c36369b682d695c360f19ccf17,22e9f6e84bee9fab1176687f1104a902,ebbd8e48d04ac8e0a75ea
be4ffae3d8c,...}")
(1 row)

knizhnik (Owner) commented Mar 8, 2014

If you do not limit the number of elements to 50, some of the values will be non-zero.
Sorry, but I cannot reproduce your query exactly: the table definition you sent me has no "uin" column, only ts and stacktopsignature.
If I execute the query
select cs_hash_dup_count(ts,stacktopsignature,2) from samp2_span(1,50);
then all results are indeed zero. But the same query on the whole table
select cs_hash_dup_count(ts,stacktopsignature,2) from samp2_get();
returns some non-zero values:
("int8:{0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...}"

Why do you think this result is not correct?
The function cs_hash_dup_count(ts,stacktopsignature,2) calculates, for each stacktopsignature group, the number of timestamps that occur more than once in that group.
I checked the samp2 table with standard SQL queries, and there are not many such cases.
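
To make the semantics described above concrete, here is a minimal sketch in Python rather than IMCS. It assumes that min_occ means "the value occurs at least min_occ times within its group"; the function name `hash_dup_count` and the sample data are hypothetical and only mirror the description in this comment, not the actual implementation.

```python
from collections import Counter, defaultdict

def hash_dup_count(values, groups, min_occ=1):
    """Hypothetical model: for each group, count the distinct values
    that occur at least min_occ times inside that group."""
    per_group = defaultdict(Counter)
    for value, group in zip(values, groups):
        per_group[group][value] += 1
    # Every group stays in the result; only its count changes with min_occ.
    return {
        group: sum(1 for n in counter.values() if n >= min_occ)
        for group, counter in per_group.items()
    }

# Made-up sample: timestamps (values) and signatures (groups).
ts = [10, 10, 11, 12, 13, 13, 14]
sig = ["a", "a", "a", "b", "c", "c", "c"]

counts_1 = hash_dup_count(ts, sig, 1)  # distinct timestamps per signature
counts_2 = hash_dup_count(ts, sig, 2)  # only timestamps seen at least twice
```

Under this reading, a group whose values are all unique still appears in the output with a count of 0, which would explain the long runs of zeros when only the first 50 elements are scanned.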

amutu (Author) commented Mar 9, 2014

I see. After some testing, I think I misunderstood the docs: I confused min_occ with the SQL HAVING clause. Thanks!
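
For anyone who hits the same confusion, here is a rough sketch of the difference, written in Python as a model rather than as IMCS or SQL. It assumes min_occ filters values inside each group, while a HAVING-style filter drops whole groups whose distinct count is below a threshold. The function names and sample rows are hypothetical.

```python
from collections import Counter, defaultdict

def _value_counts(rows):
    """Count how often each value occurs within each group."""
    per_group = defaultdict(Counter)
    for group, value in rows:
        per_group[group][value] += 1
    return per_group

def min_occ_counts(rows, min_occ):
    # min_occ reading: keep every group, count only values seen >= min_occ times.
    return {
        g: sum(1 for n in c.values() if n >= min_occ)
        for g, c in _value_counts(rows).items()
    }

def having_counts(rows, threshold):
    # HAVING-style reading: count all distinct values, then drop groups
    # whose distinct count is below the threshold.
    return {
        g: len(c)
        for g, c in _value_counts(rows).items()
        if len(c) >= threshold
    }

# Made-up (group, value) rows.
rows = [("a", 1), ("a", 1), ("a", 2), ("b", 3), ("b", 4), ("c", 5)]

by_min_occ = min_occ_counts(rows, 2)
by_having = having_counts(rows, 2)
```

With these rows, the min_occ reading keeps all three groups and reports mostly zeros, while the HAVING-style reading returns only groups "a" and "b" with their full distinct counts.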
