cs_hash_dup_count crash backend #28

Closed
amutu opened this issue Apr 9, 2014 · 8 comments

Comments

@amutu

amutu commented Apr 9, 2014

select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,cs_filter(class1=cs_const('737a7612853859405ba846b99f35f4c3'),class2))) order by 1 desc limit 5;

glibc-2.12-1.49.tl1.x86_64
(gdb) c
Continuing.
[New Thread 0x7f7217ef8700 (LWP 18782)]
[New Thread 0x7f72176f7700 (LWP 18783)]
[New Thread 0x7f7216ef6700 (LWP 18784)]
[New Thread 0x7f72166f5700 (LWP 18785)]
[New Thread 0x7f7215ef4700 (LWP 18786)]
[New Thread 0x7f72156f3700 (LWP 18787)]
[New Thread 0x7f7214ef2700 (LWP 18788)]
[New Thread 0x7f72146f1700 (LWP 18789)]
[New Thread 0x7f7213ef0700 (LWP 18790)]
[New Thread 0x7f72136ef700 (LWP 18791)]
[New Thread 0x7f7212eee700 (LWP 18792)]
[New Thread 0x7f72126ed700 (LWP 18793)]
[New Thread 0x7f7211eec700 (LWP 18794)]
[New Thread 0x7f72116eb700 (LWP 18795)]
[New Thread 0x7f7210eea700 (LWP 18796)]
[New Thread 0x7f72106e9700 (LWP 18797)]

Program received signal SIGSEGV, Segmentation fault.
0x00007f7e25c9b28b in imcs_thread_pool_wait (pool=0xb61940) at threadpool.c:45
45 pool->sync->unlock(pool->sync);
(gdb) print pool->sync
$1 = (imcs_mutex_t *) 0x0
(gdb) bt
#0 0x00007f7e25c9b28b in imcs_thread_pool_wait (pool=0xb61940) at threadpool.c:45
#1 0x0000000700004201 in ?? ()
#2 0x0000000000000000 in ?? ()

(gdb) \l
Undefined command: "". Try "help".
(gdb) l
40 {
41 pool->sync->lock(pool->sync);
42 pool->work_id = 0;
43 pool->start->signal(pool->start, pool->n_workers);
44 pool->finish->wait(pool->finish, pool->sync, pool->n_workers, IMCS_TM_INFINITE);
45 pool->sync->unlock(pool->sync);
46
47 }
48 int counters[4] = {0,0,0,0};
49 static void imcs_thread_pool_execute(struct imcs_thread_pool_t* self, imcs_job_t job, void* arg)
(gdb)

If I use cs_hash_count instead, the query runs fine.

@amutu
Author

amutu commented Apr 9, 2014

Some more info.

With the cs not initialized, the query emits a different error:
crash=> select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,cs_filter(class1=cs_const('737a7612853859405ba846b99f35f4c3'),class2))) order by 1 desc limit 5;
NOTICE: IMCS command: const
CONTEXT: PL/pgSQL function cs_const(text) line 1 at RETURN
ERROR: function returning set of rows cannot return null value

With a small data set loaded, the query runs OK:
crash=> select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,cs_filter(class1=cs_const('737a7612853859405ba846b99f35f4c3'),class2))) order by 1 desc limit 5;
NOTICE: IMCS command: const
CONTEXT: PL/pgSQL function cs_const(text) line 1 at RETURN
NOTICE: IMCS command: eq
NOTICE: IMCS command: filter
NOTICE: IMCS command: hash_dup_count
NOTICE: IMCS command: project_agg
agg_val | group_by
---------+----------
(0 rows)

With a bigger data set, it emits a different error message and then crashes the backend:
crash=> select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,cs_filter(class1=cs_const('737a7612853859405ba846b99f35f4c3'),class2))) order by 1 desc limit 5;
NOTICE: IMCS command: const
CONTEXT: PL/pgSQL function cs_const(text) line 1 at RETURN
NOTICE: IMCS command: eq
NOTICE: IMCS command: filter
NOTICE: IMCS command: hash_dup_count
NOTICE: IMCS command: project_agg
ERROR: group by sequence doesn't match values sequence
crash=> select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,cs_filter(class1=cs_const('737a7612853859405ba846b99f35f4c3'),class2))) order by 1 desc limit 5;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.

@knizhnik
Owner

knizhnik commented Apr 9, 2014

Looks like in both cases memory corruption takes place.
Can you provide me with full data needed to reproduce the problem?

@amutu
Author

amutu commented Apr 9, 2014

Some more cases:
crash=> select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,class2)) order by 1 desc limit 5;
NOTICE: IMCS command: hash_dup_count
NOTICE: IMCS command: project_agg
agg_val | group_by
---------+--------------------------------------------------------------------
37 | \x6439613632326233396138316163373737316238343338373634333131353961
32 | \x6536623933633334343963373530663538383335663133616663643865316530
22 | \x3732646365313766623262646530373266303465383763343962313537313665
18 | \x3130376131386265353733316331613163306338353533616466666564636462
10 | \x6530623266613538386134343637353563666466316535303163656631306532
(5 rows)
crash=> select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,cs_filter(cs_const(1)=cs_const(1),class2))) order by 1 desc limit 5;
NOTICE: IMCS command: const
CONTEXT: PL/pgSQL function cs_const(double precision,cs_elem_type) line 1 at RETURN
NOTICE: IMCS command: const
CONTEXT: PL/pgSQL function cs_const(double precision,cs_elem_type) line 1 at RETURN
NOTICE: IMCS command: eq
NOTICE: IMCS command: filter
NOTICE: IMCS command: hash_dup_count
NOTICE: IMCS command: project_agg
agg_val | group_by
---------+--------------------------------------------------------------------
37 | \x6439613632326233396138316163373737316238343338373634333131353961
32 | \x6536623933633334343963373530663538383335663133616663643865316530
22 | \x3732646365313766623262646530373266303465383763343962313537313665
18 | \x3130376131386265353733316331613163306338353533616466666564636462
10 | \x6530623266613538386134343637353563666466316535303163656631306532
(5 rows)

crash=> select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,cs_filter(class1=cs_const('737a7612853859405ba846b99f35f4c3'),class2))) order by 1 desc limit 5;
NOTICE: IMCS command: const
CONTEXT: PL/pgSQL function cs_const(text) line 1 at RETURN
NOTICE: IMCS command: eq
NOTICE: IMCS command: filter
NOTICE: IMCS command: hash_dup_count
NOTICE: IMCS command: project_agg
ERROR: group by sequence doesn't match values sequence
crash=> select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,cs_filter(cs_const(1)=cs_const(1),class2))) order by 1 desc limit 5;
ERROR: group by sequence doesn't match values sequence
crash=> select agg_val,group_by from crashlog_get('{604176597}'::int[]),cs_project_agg(cs_hash_dup_count(uin,class2)) order by 1 desc limit 5;
NOTICE: IMCS command: const
CONTEXT: PL/pgSQL function cs_const(double precision,cs_elem_type) line 1 at RETURN
NOTICE: IMCS command: const
CONTEXT: PL/pgSQL function cs_const(double precision,cs_elem_type) line 1 at RETURN
NOTICE: IMCS command: eq
NOTICE: IMCS command: filter
NOTICE: IMCS command: hash_dup_count
NOTICE: IMCS command: project_agg
The connection to the server was lost. Attempting reset: Failed.
!>

I will dump the data and send it to you.

@knizhnik
Owner

knizhnik commented Apr 9, 2014

Actually, the query you are using is not correct: the "values" and "group-by" sequences passed to the hash_agg operator are inconsistent, because only one of them is filtered. You should apply the same filter condition to both:
cs_hash_dup_count(cs_filter(class1=cs_const('737a7612853859405ba846b99f35f4c3'),uin),
cs_filter(class1=cs_const('737a7612853859405ba846b99f35f4c3'),class2))
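
To see why the filter has to be applied to both arguments, here is a minimal sketch in plain C. It is not IMCS code: `filter_ints` and `sequences_match` are hypothetical helpers invented for illustration, and integers stand in for the column values. The assumption is only that a grouped aggregate pairs the i-th value with the i-th group key, so both sequences must come from the same set of rows.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for cs_filter: keeps src[i] wherever mask[i] is
 * nonzero, writes the survivors to dst, and returns how many were kept. */
size_t filter_ints(const int *mask, const int *src, size_t n, int *dst)
{
    assert(mask != NULL && src != NULL && dst != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (mask[i]) {
            dst[kept++] = src[i];
        }
    }
    return kept;
}

/* A grouped aggregate pairs values[i] with group_by[i], so the two
 * sequences must have the same length (and come from the same rows). */
int sequences_match(size_t n_values, size_t n_group_by)
{
    return n_values == n_group_by;
}
```

With a mask that keeps 2 of 4 rows, filtering only the group-by column leaves 4 values against 2 group keys, so `sequences_match(4, 2)` returns 0. Filtering both columns with the same mask leaves 2 and 2, and the pairs line up again.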

@amutu
Author

amutu commented Apr 9, 2014

Oh, I get it. Thank you for your explanation.
So was the subsequent crash caused by this error? Do I still need to provide the data?

@knizhnik
Owner

knizhnik commented Apr 9, 2014

I am not sure the crash is really caused by this error.
But in any case, such an error should not cause a crash.
So I would appreciate it if you could help me reproduce the problem.

@amutu
Author

amutu commented Apr 9, 2014

I have sent the data to you by email.

@knizhnik
Owner

The problem was caused by non-reentrant (thread-unsafe) error reporting in PostgreSQL (because PostgreSQL itself does not use threads).
I implemented thread-safe error reporting in IMCS (revision 57).
Now this query correctly returns an error without memory corruption.
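
For readers unfamiliar with the pattern, the following is a rough sketch of one common way to make error reporting safe in a thread pool. It is not the code from revision 57: `error_slot_t`, `slot_report`, `worker_main` and `run_workers` are hypothetical names, and the design is an assumption. Worker threads never raise the error themselves. They record it in a mutex-protected slot, and the calling thread raises it only after every worker has been joined.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define N_WORKERS 4

/* Hypothetical shared slot that workers write errors into. */
typedef struct {
    pthread_mutex_t lock;
    int has_error;
    char message[128];
} error_slot_t;

typedef struct {
    error_slot_t *slot;
    int should_fail;   /* test hook: makes this worker report an error */
} worker_arg_t;

/* Callable from any thread: keeps only the first reported error. */
void slot_report(error_slot_t *slot, const char *msg)
{
    assert(slot != NULL && msg != NULL);
    pthread_mutex_lock(&slot->lock);
    if (!slot->has_error) {
        slot->has_error = 1;
        strncpy(slot->message, msg, sizeof slot->message - 1);
        slot->message[sizeof slot->message - 1] = '\0';
    }
    pthread_mutex_unlock(&slot->lock);
}

void *worker_main(void *p)
{
    worker_arg_t *arg = p;
    if (arg->should_fail) {
        /* Record the failure instead of raising it from this thread. */
        slot_report(arg->slot, "group by sequence doesn't match values sequence");
    }
    return NULL;
}

/* Starts the workers, joins them all, then hands any recorded error to the
 * caller. Returns 1 and fills out if an error was recorded, otherwise 0. */
int run_workers(const int *fail_flags, char *out, size_t out_size)
{
    error_slot_t slot;
    pthread_t threads[N_WORKERS];
    worker_arg_t args[N_WORKERS];
    int started = 0;

    pthread_mutex_init(&slot.lock, NULL);
    slot.has_error = 0;
    slot.message[0] = '\0';

    for (int i = 0; i < N_WORKERS; i++) {
        args[i].slot = &slot;
        args[i].should_fail = fail_flags[i];
        if (pthread_create(&threads[i], NULL, worker_main, &args[i]) != 0) {
            slot_report(&slot, "could not start worker thread");
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }

    int failed = slot.has_error;
    if (failed) {
        /* Only here, on the calling thread, would a backend raise the error. */
        snprintf(out, out_size, "%s", slot.message);
    }
    pthread_mutex_destroy(&slot.lock);
    return failed;
}
```

Calling `run_workers` with all flags zero returns 0. If any flag is set, it returns 1 and copies the message into the caller's buffer. In a PostgreSQL extension, the calling thread would be the backend's main thread, which is the only place where the regular error-reporting path is safe to use.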
