Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix another cs_hash_dup_count #12

Closed
wants to merge 5 commits into from
Closed

fix another cs_hash_dup_count #12

wants to merge 5 commits into from

Conversation

amutu
Copy link

@amutu amutu commented Mar 8, 2014

imcs_dup_hash_initialize should be fixed also as last commit

@amutu
Copy link
Author

amutu commented Mar 8, 2014

thanks for your fix of the imcs_dup_hash_count,but when I try a larger data set,eg more than 5 million rows,the segment fault happens again.After debug,I find this time,the semgment fault happened at the the imcs_dup_hash_initialize,similar with the last.So I fixed it and pull the commit.

this is gdb info:

0x00007f4472db49bd in imcs_dup_hash_initialize (iterator=0x7f282c0008f0) at func.c:5833
5833 elem->count += 1;
(gdb) bt
#0 0x00007f4472db49bd in imcs_dup_hash_initialize (iterator=0x7f282c0008f0) at func.c:5833
#1 0x00007f4472cda11e in imcs_parallel_job (worker_id=16, n_workers=24, arg=0x228fc58) at imcs.c:1725
#2 0x00007f4472dc45ca in imcs_thread_pool_worker (pool=0x224c650) at threadpool.c:33
#3 0x00007f4472aa87f1 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f4478f62ccd in clone () from /lib64/libc.so.6
(gdb) print elem
$1 = (imcs_hash_elem_t *) 0x0
(gdb) l
5828 agg_elem->count = 0;
5829 agg_elem->collision = agg_hash->table[agg_index];
5830 agg_hash->table[agg_index] = agg_elem;
5831 }
5832 if (++agg_elem->count == ctx->min_occurrences) {
5833 elem->count += 1;
5834 }
5835 }
5836 }
5837 agg_hash->table_used = agg_distinct_count;
(gdb) quit
A debugging session is active.

@knizhnik
Copy link
Owner

knizhnik commented Mar 8, 2014

Thank you very much for pointing this issues.
Shame on me: I really do not not notice the same problem in mco_seq_dup_hash_initialize.
I also merge most of your other fixes.
But I do not think that it is good idea to implicitly replace min_occ <= 0 with 1. I prefer to report error in this case.

@knizhnik knizhnik closed this Mar 8, 2014
@amutu
Copy link
Author

amutu commented Mar 9, 2014

thanks,I think min_occ <= 0 report error is OK. I indeed implement the error at first,but after I test the behaviour of substring(s string,start int) function of the PostgreSQL,I find substring do not report error and just set start to 1 when start <= 0:
postgres=# select substring('abcdef',1);

substring

abcdef
(1 row)

postgres=# select substring('abcdef',2);

substring

bcdef
(1 row)

postgres=# select substring('abcdef',3);

substring

cdef
(1 row)

postgres=# select substring('abcdef',7);

substring

(1 row)

postgres=# select substring('abcdef',0);

substring

abcdef
(1 row)

postgres=# select substring('abcdef',-1);

substring

abcdef
(1 row)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants