segfault in connsoonestjob at conn.c:166 #160

kr · 2012-12-14T08:28:11Z

Reported by @aqibsm in #119:

I am facing the same issue on CentOS 6.3 with beanstalkd 1.8.

Dec 13 13:02:35 vm7-d2 kernel: beanstalkd[8701] general protection ip:40200a sp:7fff6c65fa10 error:0 in beanstalkd[400000+11000]
Dec 13 14:59:17 vm7-d2 kernel: beanstalkd[9993] general protection ip:40200a sp:7fff6c6f2fc0 error:0 in beanstalkd[400000+11000]

beanstalkd 1.4 was running fine without any issues, so I am reverting to 1.4.

kr · 2012-12-14T08:29:41Z

Can you reproduce the error? If so, would you mind compiling
beanstalkd with

make clean
make CFLAGS='-O0 -g'

and producing a stack trace?
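
For readers following along, here is a minimal sketch of one way to turn a core file into a stack trace once the binary has been rebuilt with the flags above. It assumes gdb is installed; `print_backtrace` is a hypothetical helper name and the paths in the usage comment are placeholders, not something taken from this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of beanstalkd): print a backtrace from a core file.
# Assumes the binary was built with debug info, e.g. make CFLAGS='-O0 -g'.
print_backtrace() {
    binary=$1
    core=$2
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "usage: print_backtrace BINARY CORE" >&2
        return 2
    fi
    # -batch makes gdb exit after running the given commands;
    # "bt full" prints every frame together with its local variables.
    gdb -batch -ex 'bt full' "$binary" "$core"
}

# Example call (placeholder paths, adjust to your system):
#   print_backtrace /usr/local/bin/beanstalkd /path/to/core
```

The output of `bt full` is what is most useful to paste into the issue, since it shows the local variables in each frame as well as the call chain.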

aqibsm · 2013-01-24T05:27:02Z

Hi Kr,

I could not do any further testing because of my workload.

Could you please explain how to capture the general protection error? I am starting beanstalkd from an init script.

Thanks.

aqibsm · 2013-01-29T04:44:46Z

Hi Kr,

I managed to get a core dump.

GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /usr/local/bin/beanstalkd...done.
[New Thread 17286]
Missing separate debuginfo for
Try: yum --disablerepo='*' --enablerepo='*-debug*' install /usr/lib/debug/.build-id/8e/312e8752e924c26341440ec3a032bc0e20cba3
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
Core was generated by `/usr/local/bin/beanstalkd -l 0.0.0.0 -p 11300 -u root -b /var/lib/beanstalkd/bi'.
Program terminated with signal 11, Segmentation fault.
#0  0x000000000040200a in connsoonestjob (c=0x14df3090) at conn.c:166
166         if (j->r.deadline_at <= (soonest ? : j)->r.deadline_at) soonest = j;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.6.x86_64

JensRantil · 2016-04-03T16:58:28Z

@kr Do you have any idea what is causing this?

thorro · 2017-11-17T22:49:40Z

We're getting the same error on v1.10:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00000000004021c0 in connsoonestjob (c=0x64d0f40) at conn.c:166
166         if (j->r.deadline_at <= (soonest ? : j)->r.deadline_at) soonest = j;

Has anyone found the cause of this? @kr?

jhammer · 2019-04-09T04:28:30Z

We have just hit this, as well:

Apr 8 17:46:37 db01 kernel: [56880694.274429] traps: beanstalkd[1048] general protection ip:40241c sp:7ffd78a06a20 error:0 in beanstalkd[400000+e000]

ysmolski · 2019-06-28T10:56:34Z

@jhammer, what version do you run? Can you produce the stack trace?

jhammer · 2019-07-03T19:08:09Z

@ysmolsky version 1.9. Sorry, I do not have the stack trace. It seems pretty rare in our case. We have only hit the issue once after literally years of uptime.

thorro · 2019-07-04T05:31:26Z

There are a few stack traces lying around here. We hit it once in a while. Lately not much, since we've moved the majority of our workload to another queue manager. The number of crashes seems to correlate with workload.

ysmolski · 2019-07-04T06:00:32Z

@thorro could you paste those stack traces here? Did you move to another queue manager because beanstalkd failed to handle the load?

thorro · 2019-07-04T06:10:10Z

@ysmolsky we moved mainly because of the crashes. Link to bt:

#328 (comment)

Also, because the binlog files sometimes got out of hand. When there were a lot of jobs, the binlog size grew, which is normal, but when the jobs were processed and deleted, the binlog stayed the same size and did not shrink. But that's another issue.

ysmolski · 2019-08-17T07:14:06Z

Without a core dump it's next to impossible to fix this kind of error. If anyone can share a core dump (it will contain everything that is in beanstalkd's memory), I would be glad to look at it and try to fix this.
How to get a core dump for a segfault on Linux
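
As a rough sketch of the first step, the snippet below checks whether core files are currently allowed and shows one way to raise the limit before starting a daemon. It assumes a Linux system with a POSIX shell; `core_limit_status` and `run_with_core_dumps` are hypothetical helper names, and the right place for the `ulimit` line depends on how your init script launches beanstalkd.

```shell
#!/bin/sh
# Report whether the current shell (and anything it starts) may write core files.
core_limit_status() {
    limit=$(ulimit -c)
    if [ "$limit" = "0" ]; then
        echo "disabled"
    else
        echo "enabled ($limit)"
    fi
}

# Raise the soft core-file limit in a subshell, then run the given command there.
# In an init script, the equivalent is a `ulimit -c unlimited` line before the daemon starts.
run_with_core_dumps() {
    (
        ulimit -c unlimited 2>/dev/null ||
            echo "could not raise the core limit (hard limit too low?)" >&2
        exec "$@"
    )
}

core_limit_status

# On Linux, this file controls where the kernel writes core files and how it names them.
if [ -r /proc/sys/kernel/core_pattern ]; then
    cat /proc/sys/kernel/core_pattern
fi

# Example call (placeholder arguments, adjust to your setup):
#   run_with_core_dumps /usr/local/bin/beanstalkd -l 0.0.0.0 -p 11300
```

Keep in mind that a core file contains the whole process memory, including job bodies, so check it for sensitive data before sharing it.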

kr mentioned this issue Dec 14, 2012

heap invariant compromised in connsched #119

Closed

JensRantil added the needs-label label Aug 26, 2018

ysmolski added the Help Wanted and NeedsInvestigation labels and removed the needs-label label Jun 26, 2019

ysmolski changed the title from "general protection error" to "segfault in connsoonestjob at conn.c:166" Jul 1, 2019

ysmolski mentioned this issue Jul 1, 2019

conn.c: segfault in connclose #328

Closed

ysmolski added the WaitingForInfo label and removed the Help Wanted and NeedsInvestigation labels Aug 17, 2019

ysmolski closed this as completed May 28, 2020
