Some commands hang on FreeBSD #25

Open
amutu opened this issue Apr 4, 2014 · 3 comments

amutu commented Apr 4, 2014

I tried to run the test files in the imcs/sql/ directory on FreeBSD and found that some tests hang, as follows:

psql postgres -f sql/create.sql
OK

[~/postgresql/contrib/imcs/sql]$ psql postgres -f ./sql/grandagg.sql

quote_count

       5

(1 row)

quote_first

2013-11-01
(1 row)

quote_last

2013-11-06
(1 row)
---hang here


[~/postgresql/contrib/imcs/sql]$ psql postgres -f ./hashagg.sql
^CCancel request sent
^CCancel request sent
---hang here


[~/postgresql/contrib/imcs/sql]$ psql postgres -f ./operators.sql

?column?

float4:{20.7,40.4,60.7,80.7,100.7}
(1 row)

              ?column?                   

float4:{-0.3,0,0.299999,0.299999,-0.299999}
(1 row)

             ?column?                  

float4:{107.1,408.04,921.1,1628.1,2535.1}
(1 row)

               ?column?                   

float4:{0.971429,1,1.00993,1.00746,0.994059}
(1 row)

            ?column?                

float4:{10.2,0,0.299999,0.299999,50.2}
(1 row)

                                               ?column?                                                   

float8:{38931552097.3912,2.33398999511658e+26,6.6966296391943e+44,4.16574958399165e+64,7.68312765681365e+85}
(1 row)
-----hang here


[~/postgresql/contrib/imcs/sql]$ psql postgres -f ./scalarop.sql
^CCancel request sent
^CCancel request sent
^CCancel request sent
----hang here

[~/postgresql/contrib/imcs/sql]$ psql postgres -f sort.sql
---hang here

[~/postgresql/contrib/imcs/sql]$ psql postgres -f spec.sql
---hang here

[~/postgresql/contrib/imcs/sql]$ psql postgres -f transform.sql

cs_cast

float4:{100,200,300,400,500}
(1 row)

          cs_iif               

float4:{10.2,20.2,30.5,40.5,50.2}
(1 row)

    cs_if        

char:{1,10,20,30,2}
(1 row)

  cs_thin       

float4:{20.2,40.2}
(1 row)

    cs_limit         

float4:{20.2,30.2,40.2}
(1 row)

  cs_head       

float4:{10.5,20.2}
(1 row)

  cs_tail       

float4:{40.2,50.5}
(1 row)

      cs_filter           

date:{2013-11-04,2013-11-05}
(1 row)

        quote_project             

(IBM,2013-11-01,10.2,11,10,10.5,100)
(IBM,2013-11-04,30.5,31,30,30.2,300)
(IBM,2013-11-05,40.5,41,40,40.2,400)
(3 rows)
---hang here

[~/postgresql/contrib/imcs/sql]$ psql postgres -f ./drop.sql
--hang here

top shows the processes in the umtxn state:
74761 jovz 1 20 0 9293M 1768K umtxn 3 0:05 0.00% postgres: jovz postgres [local] SELECT (postgres)
25048 jovz 1 20 0 9293M 87540K umtxn 0 0:04 0.00% postgres: jovz postgres [local] SELECT (postgres)

procstat -kk 74761 shows:
PID TID COMM TDNAME KSTACK
74761 100212 postgres - mi_switch+0xde sleepq_catch_signals+0xb2 sleepq_wait_sig+0xf _sleep+0x265 umtxq_sleep+0x129 do_lock_umutex+0x64f __umtx_op_wait_umutex+0x78 amd64_syscall+0x357 Xfast_syscall+0xfb

gdb shows:
(gdb) bt
#0 0x0000000801ef689a in __error () from /lib/libthr.so.3
#1 0x0000000801ef139d in pthread_mutex_destroy () from /lib/libthr.so.3
#2 0x0000000801ef0f6a in pthread_mutex_lock () from /lib/libthr.so.3
#3 0x000000080102b777 in flockfile () from /lib/libc.so.7
#4 0x00000008010117f2 in fileno () from /lib/libc.so.7
#5 0x0000000000704674 in write_pipe_chunks ()
#6 0x0000000000701634 in EmitErrorReport ()
#7 0x00000000007003d7 in errfinish ()
#8 0x0000000000639e23 in quickdie ()
#9 <signal handler called>
#10 0x0000000801ef689a in __error () from /lib/libthr.so.3
#11 0x0000000801ef139d in pthread_mutex_destroy () from /lib/libthr.so.3
#12 0x0000000801ef0f6a in pthread_mutex_lock () from /lib/libthr.so.3
#13 0x0000000800f97460 in sbrk () from /lib/libc.so.7
#14 0x0000000800f826ce in syscall () from /lib/libc.so.7
#15 0x0000000800fa10da in calloc () from /lib/libc.so.7
#16 0x0000000801ef1475 in pthread_mutex_destroy () from /lib/libthr.so.3
#17 0x0000000801ef0f6a in pthread_mutex_lock () from /lib/libthr.so.3
#18 0x0000000800f97460 in sbrk () from /lib/libc.so.7
#19 0x0000000800f826ce in syscall () from /lib/libc.so.7
#20 0x0000000800fa10da in calloc () from /lib/libc.so.7
#21 0x0000000801ef2b8f in pthread_kill () from /lib/libthr.so.3
#22 0x0000000801eeadad in pthread_create () from /lib/libthr.so.3
#23 0x0000000801cbaa4e in imcs_rank_float_next (iterator=0x801cbeda0) at func.c:3633
#24 0x0000000801cbec1b in imcs_unique_double_next (iterator=0x801aaf840) at func.c:3783
#25 0x0000000801c1f463 in _PG_init () at imcs.c:1105
#26 0x0000000801c29853 in cs_if (fcinfo=0xffffffff) at imcs.c:2853
#27 0x00000000005768c1 in ExecMakeFunctionResultNoSets ()
#28 0x0000000000576021 in ExecProject ()
#29 0x0000000000576dc1 in ExecScan ()
#30 0x0000000000570a78 in ExecProcNode ()
#31 0x000000000056ea10 in standard_ExecutorRun ()
#32 0x000000000063e4aa in PortalRunSelect ()
#33 0x000000000063e0d9 in PortalRun ()
#34 0x000000000063cc05 in PostgresMain ()

---Type <return> to continue, or q <return> to quit---
#35 0x00000000005f8295 in PostmasterMain ()
#36 0x000000000059c54a in main ()

postgres=# select * from pg_stat_activity ;
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start | state_change | waiting | state | query
-------+----------+-------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+-------------------------------+---------+--------+---------------------------------------------------------------------------------------------
12344 | postgres | 36519 | 10 | jovz | psql | | | -1 | 2014-04-04 10:12:53.999918+08 | | 2014-04-04 10:13:09.936195+08 | 2014-04-04 10:13:09.938922+08 | f | idle | select * from pg_stat_activity ;
12344 | postgres | 59229 | 10 | jovz | psql | | | -1 | 2014-04-04 13:01:10.174897+08 | 2014-04-04 13:01:10.297773+08 | 2014-04-04 13:01:10.297773+08 | 2014-04-04 13:01:10.297789+08 | f | active | select cs_hash_max(Close,Day % 2) from Quote_get('IBM');
12344 | postgres | 71307 | 10 | jovz | psql | | | -1 | 2014-04-04 13:04:32.682695+08 | 2014-04-04 13:04:32.768294+08 | 2014-04-04 13:04:32.768294+08 | 2014-04-04 13:04:32.768308+08 | f | active | select cs_sum(Volume) from Quote_get('IBM');
12344 | postgres | 80664 | 10 | jovz | psql | | | -1 | 2014-04-04 13:09:05.342826+08 | 2014-04-04 13:09:05.426278+08 | 2014-04-04 13:09:05.426278+08 | 2014-04-04 13:09:05.426293+08 | f | active | select Close+Volume from Quote_get('IBM');
12344 | postgres | 81058 | 10 | jovz | psql | | | -1 | 2014-04-04 13:10:01.280513+08 | 2014-04-04 13:10:01.305852+08 | 2014-04-04 13:10:01.305852+08 | 2014-04-04 13:10:01.305868+08 | f | active | select cs_wsum(Volume,Close) from Quote_get('IBM');
12344 | postgres | 82575 | 10 | jovz | psql | | | -1 | 2014-04-04 13:11:12.077115+08 | 2014-04-04 13:11:12.122891+08 | 2014-04-04 13:11:12.122891+08 | 2014-04-04 13:11:12.122913+08 | f | active | select cs_top_max(Close, 3) from Quote_get('IBM');
12344 | postgres | 83218 | 10 | jovz | psql | | | -1 | 2014-04-04 13:12:01.299246+08 | 2014-04-04 13:12:01.359916+08 | 2014-04-04 13:12:01.359916+08 | 2014-04-04 13:12:01.359932+08 | f | active | select cs_histogram(Close, 0, 100, 10) from Quote_get('IBM');
12344 | postgres | 85894 | 10 | jovz | psql | | | -1 | 2014-04-04 13:13:01.726521+08 | 2014-04-04 13:13:01.864072+08 | 2014-04-04 13:13:01.864072+08 | 2014-04-04 13:13:01.864089+08 | f | active | select Quote_project(q.
, cs_filter_first_pos(High > Low*1.01, 3)) from Quote_get('IBM') q;
12344 | postgres | 87416 | 10 | jovz | psql | | | -1 | 2014-04-04 13:14:08.096213+08 | 2014-04-04 13:14:08.182947+08 | 2014-04-04 13:14:08.182947+08 | 2014-04-04 13:14:08.183325+08 | f | active | select Quote_delete('IBM', date('02-Nov-2013'));
12344 | postgres | 13427 | 10 | jovz | psql | | | -1 | 2014-04-04 13:21:51.076577+08 | 2014-04-04 13:21:57.879408+08 | 2014-04-04 13:21:57.879408+08 | 2014-04-04 13:21:57.880425+08 | f | active | select * from pg_stat_activity ;
(10 rows)

uname -a
FreeBSD 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64

Owner

knizhnik commented Apr 4, 2014

Sorry, the stack above seems to be incorrect. imcs_unique_double_next should not call imcs_rank_float_next, which in turn should not call pthread_create. So either the stack is corrupted (maybe because of a stack overflow?), or gdb is not able to unwind the stack correctly...

I do not have access to a FreeBSD system at the moment. Can you give me ssh access to an account on this system, so that I can try to reproduce and investigate the problem myself?

Author

amutu commented Apr 13, 2014

The problem can be solved by adding the -pthread flag to CFLAGS when compiling PostgreSQL:
[jovz@ ~/postgresql/contrib/imcs/sql]$ uname -a
FreeBSD 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
[jovz@ ~/postgresql/contrib/imcs/sql]$ psql postgres -f ./hashagg.sql

cs_hash_max

("float4:{50.5,40.2}","int4:{0,1}")
(1 row)

         cs_hash_min             

("float4:{20.2,10.5}","int4:{0,1}")
(1 row)

                     cs_hash_sum                         

("float8:{100.900001525879,50.7000007629395}","int4:{0,1}")
(1 row)
.....

The inspiration came from this thread:
http://code.google.com/p/plv8js/issues/detail?id=34
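
A minimal sketch of the rebuild described above, assuming PostgreSQL is built from a source tree. The paths in PG_SRC and PG_PREFIX, the helper names run and rebuild_with_pthread, the -O2 flag, and passing -pthread through LDFLAGS as well are all assumptions for illustration; the comment above only says that -pthread was added to the C flags.

```shell
#!/bin/sh
# Sketch only: rebuild PostgreSQL so that the server binary itself is built with -pthread.
# PG_SRC and PG_PREFIX are hypothetical locations; adjust them to your own tree.
PG_SRC="${PG_SRC:-$HOME/postgresql}"
PG_PREFIX="${PG_PREFIX:-$HOME/pgsql}"

# Print each command; execute it only when DRY_RUN=0 (the default is a dry run).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

rebuild_with_pthread() {
    run cd "$PG_SRC" || return 1
    # Assumption: -O2 is kept because overriding CFLAGS replaces configure's default flags.
    run ./configure --prefix="$PG_PREFIX" CFLAGS="-O2 -pthread" LDFLAGS="-pthread" || return 1
    # PostgreSQL needs GNU make, which is installed as gmake on FreeBSD.
    run gmake || return 1
    run gmake install
}
```

Calling rebuild_with_pthread prints the commands without running them; DRY_RUN=0 rebuild_with_pthread actually runs the build. The imcs extension would then need to be rebuilt and reinstalled against the new installation.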

@knizhnik
Owner

Oh, yes.
Actually, I have already reproduced the problem with my lockbench test. It works when built as an application with -pthread, but doesn't work if it is built as a shared library and loaded from an application built without -pthread. Unfortunately, I do not know of any solution other than rebuilding PostgreSQL with -pthread. Sorry that I did not inform you about my investigation, and thank you for your help. I also ran this test on OS X, and there is no such problem there.
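
One rough way to see whether a given binary was linked with thread support is to look for the thread library in its ldd output. This is only a sketch: the function names links_thread_lib and check_binary are made up for illustration, and it assumes that a binary built with -pthread on FreeBSD lists libthr (the library that appears in the backtrace above) as a dependency.

```shell
#!/bin/sh
# Reads `ldd` output on stdin; succeeds if a thread library is listed.
# libthr is FreeBSD's thread library; libpthread is matched for other systems.
links_thread_lib() {
    grep -Eq 'lib(thr|pthread)\.so'
}

# Reports whether the binary given as $1 lists a thread library as a dependency.
check_binary() {
    if [ ! -e "$1" ]; then
        echo "$1: no such file"
        return 2
    fi
    if ldd "$1" 2>/dev/null | links_thread_lib; then
        echo "$1: linked against a thread library"
    else
        echo "$1: NOT linked against a thread library"
    fi
}
```

For example, check_binary "$(command -v postgres)" inspects the server binary found on PATH, and the same call on the imcs shared library shows whether the extension pulls in libthr on its own.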
