Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cs_hash_count out put groups returns part of group expression #15

Closed
amutu opened this issue Mar 10, 2014 · 5 comments
Closed

cs_hash_count out put groups returns part of group expression #15

amutu opened this issue Mar 10, 2014 · 5 comments

Comments

@amutu
Copy link

amutu commented Mar 10, 2014

I use || to cat the group cloumn,the count result is ok,but the group name only show the first part of the || expression,not the result of || expression:

postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count( stacktopsignature || stacktopsignature)) as s(cnt bigint,grp char(128)) order by 1 desc limit 100;
NOTICE: IMCS command: cat
NOTICE: IMCS command: hash_count
NOTICE: IMCS command: project
cnt | grp
--------+----------------------------------
135432 | 6e0443a38c7580540e896d8960390713
109237 | 38c9d70c65d795be0b7eeae3bce97ed2
96682 | 590a0b3935a87697e2047578e870c8f1
96096 | ffd9dbb06a5a8956c71596140938d5d3
89936 | 350a50b27e41c31f35889af56fa3fd95
78961 | fe6e4bdfb42bd353641951599f2d17a3
61733 | 56f2a3e4f20aa16cd1291a4f7fe3c680
53818 | 29d33060ebe62c3a7abef6b7b4341bbe
40864 | bbdcfe475e02a727a7895991b718b70e
40108 | 4a660c2b458219b0cfa502d33b57d8e2
36667 | 31eae0cfda948b3c3852c5cbf2e63659
33902 | 2d3ee26177471eb7d2c47933603ebc34
31763 | 11159bf592c89407f9a5f3ae06ebce17
30802 | 33aa0cf09ba060e99ae128870a6970ab
30576 | eb697ae8e36d6fdc98a9c3d0b65e77a5
26851 | b1baa3580703800e1ef4875fb3ab02d3
26365 | 7f5b6a22f77e15cd79effad35ac0ddaf
26173 | 44e40035e11455f301cf8d032b497d80
25917 | 0acc499cb73079f4a9a5e5a97d756ef9
25872 | cbcf2d51a93b03eb670af8facb9b8653
24406 | e126738795c18a4826fb1e30af8e9cb4
23912 | 86ed551847b91f3118c34f5fbdd03555
23652 | c99ef930cd7bfc0393e527b017ee36d0
22949 | a31bffa06742b09fe52a926e86dfdf01
Time: 649.848 ms
postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count( stacktopsignature || stacktopsignature)) as s(cnt bigint,grp char(128)) order by 1 desc limit 10;
NOTICE: IMCS command: cat
NOTICE: IMCS command: hash_count
NOTICE: IMCS command: project
cnt | grp
--------+----------------------------------
135432 | 6e0443a38c7580540e896d8960390713
109237 | 38c9d70c65d795be0b7eeae3bce97ed2
96682 | 590a0b3935a87697e2047578e870c8f1
96096 | ffd9dbb06a5a8956c71596140938d5d3
89936 | 350a50b27e41c31f35889af56fa3fd95
78961 | fe6e4bdfb42bd353641951599f2d17a3
61733 | 56f2a3e4f20aa16cd1291a4f7fe3c680
53818 | 29d33060ebe62c3a7abef6b7b4341bbe
40864 | bbdcfe475e02a727a7895991b718b70e
40108 | 4a660c2b458219b0cfa502d33b57d8e2
(10 rows)

Time: 655.848 ms
postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count( stacktopsignature)) as s(cnt bigint,grp char(128)) order by 1 desc limit 10;
NOTICE: IMCS command: hash_count
NOTICE: IMCS command: project
cnt | grp
--------+----------------------------------
135432 | 6e0443a38c7580540e896d8960390713
109237 | 38c9d70c65d795be0b7eeae3bce97ed2
96682 | 590a0b3935a87697e2047578e870c8f1
96096 | ffd9dbb06a5a8956c71596140938d5d3
89936 | 350a50b27e41c31f35889af56fa3fd95
78961 | fe6e4bdfb42bd353641951599f2d17a3
61733 | 56f2a3e4f20aa16cd1291a4f7fe3c680
53818 | 29d33060ebe62c3a7abef6b7b4341bbe
40864 | bbdcfe475e02a727a7895991b718b70e
40108 | 4a660c2b458219b0cfa502d33b57d8e2
(10 rows)

Time: 528.755 ms
postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count( stacktopsignature || ts)) as s(cnt bigint,grp char(128)) order by 1 desc limit 10;
NOTICE: IMCS command: cat
NOTICE: IMCS command: hash_count
NOTICE: IMCS command: project
cnt | grp
-----+----------------------------------
336 | 6c66f50d102790decf1d1dbd50727f88
129 | 73f007b86066dff20ea59d54d4c0bf57
112 | 7f8b51a29c7a54251b7a50e603e7ba9f
112 | 7f8b51a29c7a54251b7a50e603e7ba9f
109 | 7f8b51a29c7a54251b7a50e603e7ba9f
109 | df3f306cac45db554d4b65853a83591c
108 | df3f306cac45db554d4b65853a83591c
107 | 7f8b51a29c7a54251b7a50e603e7ba9f
106 | 7f8b51a29c7a54251b7a50e603e7ba9f
104 | df3f306cac45db554d4b65853a83591c
(10 rows)

Time: 6123.984 ms
postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count( ts || stacktopsignature)) as s(cnt bigint,grp char(128)) order by 1 desc limit 10;
NOTICE: IMCS command: cat
NOTICE: IMCS command: hash_count
NOTICE: IMCS command: project
cnt | grp
-----+----------------
336 | \x1F)z\x0C\x01
129 |
112 |
112 | @\x0B\x17\x01
109 |
109 | &ʤ\x12\x01
108 | @\x0B\x17\x01
107 | \x16>\x11\x01
106 |
104 | &ʤ\x12\x01
(10 rows)

Time: 5777.381 ms

@knizhnik
Copy link
Owner

Sorry, IMCS provides two concatenation operators: || (cs_concat) and ||| (cs_cat).
First one concatenates sequences themselves, i.e. {1,2,3} || {4,5,6} = {1,2,3,4,5,6}
And cs_can concatenates elements of sequences, i.e. {'a','b','c'} ||| {'1','2','3'} = {'a1','b2','c3'}
As far as I understand you really need to use cs_cat

@amutu
Copy link
Author

amutu commented Mar 10, 2014

I have read the doc again,and do some test,and make sure it is may be a bug:

postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count( stacktopsignature)) as s(cnt bigint,grp char(128)) order by 1 desc limit 10;
NOTICE: IMCS command: hash_count
NOTICE: IMCS command: project
cnt | grp
--------+----------------------------------
135432 | 6e0443a38c7580540e896d8960390713
109237 | 38c9d70c65d795be0b7eeae3bce97ed2
96682 | 590a0b3935a87697e2047578e870c8f1
96096 | ffd9dbb06a5a8956c71596140938d5d3
89936 | 350a50b27e41c31f35889af56fa3fd95
78961 | fe6e4bdfb42bd353641951599f2d17a3
61733 | 56f2a3e4f20aa16cd1291a4f7fe3c680
53818 | 29d33060ebe62c3a7abef6b7b4341bbe
40864 | bbdcfe475e02a727a7895991b718b70e
40108 | 4a660c2b458219b0cfa502d33b57d8e2
(10 rows)

postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count( stacktopsignature ||| stacktopsignature)) as s(cnt bigint,grp char(128)) order by 1 desc limit 10;
NOTICE: IMCS command: concat
NOTICE: IMCS command: hash_count
NOTICE: IMCS command: project
cnt | grp
--------+----------------------------------
270864 | 6e0443a38c7580540e896d8960390713
218474 | 38c9d70c65d795be0b7eeae3bce97ed2
193364 | 590a0b3935a87697e2047578e870c8f1
192192 | ffd9dbb06a5a8956c71596140938d5d3
179872 | 350a50b27e41c31f35889af56fa3fd95
157922 | fe6e4bdfb42bd353641951599f2d17a3
123466 | 56f2a3e4f20aa16cd1291a4f7fe3c680
107636 | 29d33060ebe62c3a7abef6b7b4341bbe
81728 | bbdcfe475e02a727a7895991b718b70e
80216 | 4a660c2b458219b0cfa502d33b57d8e2
(10 rows)

postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count( cs_cat(stacktopsignature , stacktopsignature))) as s(cnt bigint,grp char(128)) order by 1 desc limit 10;
NOTICE: IMCS command: cat
NOTICE: IMCS command: hash_count
NOTICE: IMCS command: project
cnt | grp
--------+----------------------------------
135432 | 6e0443a38c7580540e896d8960390713
109237 | 38c9d70c65d795be0b7eeae3bce97ed2
96682 | 590a0b3935a87697e2047578e870c8f1
96096 | ffd9dbb06a5a8956c71596140938d5d3
89936 | 350a50b27e41c31f35889af56fa3fd95
78961 | fe6e4bdfb42bd353641951599f2d17a3
61733 | 56f2a3e4f20aa16cd1291a4f7fe3c680
53818 | 29d33060ebe62c3a7abef6b7b4341bbe
40864 | bbdcfe475e02a727a7895991b718b70e
40108 | 4a660c2b458219b0cfa502d33b57d8e2
(10 rows)

@knizhnik
Copy link
Owner

I am very sorry for confusing you.
Actually operator || corresponds to cs_cat, and ||| - to cs_concat
So you should really use ||
But why you do you see only half of concatenated string value?
Just because string is padded by '\0' character and been printed as ASCI string you do not see second part of concatenated string.
In IMCS there is yet another way of concatenation of types: ooperator +. In this case strings are treated as text and concatenated as strcat C function does:

postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count( stacktopsignature + stacktopsignature)) as s(cnt bigint,grp char(128)) order by 1 desc limit 10;
cnt | grp
--------+------------------------------------------------------------------
135432 | 6e0443a38c7580540e896d89603907136e0443a38c7580540e896d8960390713
109237 | 38c9d70c65d795be0b7eeae3bce97ed238c9d70c65d795be0b7eeae3bce97ed2
96682 | 590a0b3935a87697e2047578e870c8f1590a0b3935a87697e2047578e870c8f1
96096 | ffd9dbb06a5a8956c71596140938d5d3ffd9dbb06a5a8956c71596140938d5d3
89936 | 350a50b27e41c31f35889af56fa3fd95350a50b27e41c31f35889af56fa3fd95
78961 | fe6e4bdfb42bd353641951599f2d17a3fe6e4bdfb42bd353641951599f2d17a3
61733 | 56f2a3e4f20aa16cd1291a4f7fe3c68056f2a3e4f20aa16cd1291a4f7fe3c680
53818 | 29d33060ebe62c3a7abef6b7b4341bbe29d33060ebe62c3a7abef6b7b4341bbe
40864 | bbdcfe475e02a727a7895991b718b70ebbdcfe475e02a727a7895991b718b70e
40108 | 4a660c2b458219b0cfa502d33b57d8e24a660c2b458219b0cfa502d33b57d8e2
(10 rows)

But it will not work for (stacktopsignature + ts)
I will fix it now.

@knizhnik
Copy link
Owner

Fixed:
postgres=# select s.cnt,s.grp from samp3_get(),cs_project(cs_hash_count(stacktopsignature + cs_cast(ts, 'bpchar', 20))) as s(cnt bigint,grp char(128)) order by 1 desc limit 10;
cnt | grp
-----+-----------------------------------------------------
336 | 6c66f50d102790decf1d1dbd50727f882014-02-23 07:21:50
129 | 73f007b86066dff20ea59d54d4c0bf572014-02-23 00:08:56
112 | 7f8b51a29c7a54251b7a50e603e7ba9f2014-02-23 18:48:00
112 | 7f8b51a29c7a54251b7a50e603e7ba9f2014-02-23 19:58:25
109 | df3f306cac45db554d4b65853a83591c2014-02-23 18:48:00
109 | 7f8b51a29c7a54251b7a50e603e7ba9f2014-02-23 14:43:15
108 | df3f306cac45db554d4b65853a83591c2014-02-23 19:58:25
107 | 7f8b51a29c7a54251b7a50e603e7ba9f2014-02-23 13:03:06
106 | 7f8b51a29c7a54251b7a50e603e7ba9f2014-02-23 12:58:16
104 | df3f306cac45db554d4b65853a83591c2014-02-23 14:43:15
(10 rows)

@knizhnik
Copy link
Owner

Actually using cs_cast is not needed. cs_cat (or operator ||) correctly concatenate any types without casting them to string. You can check that the query above produces the same result as
result above:

postgres=# select agg_val,cs_cut(group_by, 'c64T8') from samp3_get(),cs_project_agg(cs_hash_count(stacktopsignature || ts)) order by 1 desc limit 10;
agg_val | cs_cut
---------+----------------------------------------------------------
336 | (6c66f50d102790decf1d1dbd50727f88,"2014-02-23 07:21:50")
129 | (73f007b86066dff20ea59d54d4c0bf57,"2014-02-23 00:08:56")
112 | (7f8b51a29c7a54251b7a50e603e7ba9f,"2014-02-23 18:48:00")
112 | (7f8b51a29c7a54251b7a50e603e7ba9f,"2014-02-23 19:58:25")
109 | (7f8b51a29c7a54251b7a50e603e7ba9f,"2014-02-23 14:43:15")
109 | (df3f306cac45db554d4b65853a83591c,"2014-02-23 18:48:00")
108 | (df3f306cac45db554d4b65853a83591c,"2014-02-23 19:58:25")
107 | (7f8b51a29c7a54251b7a50e603e7ba9f,"2014-02-23 13:03:06")
106 | (7f8b51a29c7a54251b7a50e603e7ba9f,"2014-02-23 12:58:16")
104 | (df3f306cac45db554d4b65853a83591c,"2014-02-23 14:43:15")
(10 rows)

Pleas notice, that I have used cs_project_agg instead of cs_project.
cs_project_agg is provided especially for use with aggregates. And it transforms concatenates aggregation key to bytea instead of characters, as cs_project does.

And cs_cut is opposite to cs_cat: it is splitting concatenated value into parts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants