Segfault since latest update (3.13.0). #6950

Closed
r3oath opened this Issue Mar 31, 2016 · 56 comments

@r3oath
r3oath commented Mar 31, 2016

Since updating to the latest version of HHVM on my production servers I'm continually getting a segmentation fault reported in the logs. I don't know yet what's causing it, but I have a huge amount of traffic going to these servers so it's not pleasant. I have a PHP7-FPM fallback in place, so luckily there's no downtime for the users at the moment.

BootTimer: mapping self...
BootTimer: mapping self block done, took 25ms wall, 24ms cpu
BootTimer: pagein_self done, took 25ms wall, 25ms cpu
BootTimer: loading static content...
Core dumped: Segmentation fault
[ 1816.997092] traps: hhvm[2427] general protection ip:7fb5f1bde4e4 sp:7ffc8863a870 error:0 in libc-2.19.so[7fb5f1ad5000+1bb000]

HHVM Version

HipHop VM 3.13.0 (rel)
Compiler: tags/HHVM-3.13.0-0-g5b0e52b83f1dc0aa0c1dfe3a5687995c9693f5a6
Repo schema: 5fa42b79ea61b0af8d3fe1af48e0308a834bd224

OS Version

Linux AcmeCorp 3.13.0-74-generic #118-Ubuntu SMP Thu Dec 17 22:52:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

@vlcty
vlcty commented Mar 31, 2016

I'm experiencing the same issue but don't get any log output. Up to 3 sites are rendered and then the daemon dies.

hhvm --version
HipHop VM 3.13.0 (rel)
Compiler: tags/HHVM-3.13.0-0-g5b0e52b83f1dc0aa0c1dfe3a5687995c9693f5a6
Repo schema: 5fa42b79ea61b0af8d3fe1af48e0308a834bd224

Running on a Debian 8.3 server.

Linux mineralwasser 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u3 (2016-01-17) x86_64 GNU/Linux

@vlcty
vlcty commented Mar 31, 2016

Maybe useful: when the daemon is started via systemctl, it dies after a few requests. As soon as I run it with

hhvm -m server -c /etc/hhvm/server.ini

everything works fine.

@simpsonjulian

Same issue on Ubuntu LTS: hhvm started via init scripts works for our healthcheck scripts on EBS and promptly fails.

@r3oath
r3oath commented Mar 31, 2016

Yup, mine is being started and monitored with monit. Literally every time monit spawns it, it dies straight afterwards and keeps going in an endless loop, so I've killed it across all our servers.

@roxyxty
roxyxty commented Mar 31, 2016

Hi,
I have the same problem. It crashes after approx. 10 requests.

@simpsonjulian

We did the same thing. My wrapper script now ends in /usr/bin/hhvm -m server -c /etc/hhvm/server.ini --user www-data -vPidFile=/var/run/hhvm/pid and it looks better locally. Won't deploy until colleagues are at work, but can update this.

@roxyxty
roxyxty commented Mar 31, 2016

When starting with --mode server, as vlcty and simpsonjulian suggested, it works without crashing. Tnx.

@igorclark

Hi there, we have the same issue here. It seems to happen after a few requests, as reported above, though unfortunately nothing so helpful as the same number of requests each time.

Mar 31 13:17:51 myhost kernel: [14694.491410] traps: hhvm[12187] general protection ip:7f921744e4e4 sp:7fff83645730 error:0 in libc-2.19.so[7f9217345000+1bb000]
ubuntu@myhost:~$ hhvm --version
HipHop VM 3.13.0 (rel)
Compiler: tags/HHVM-3.13.0-0-g5b0e52b83f1dc0aa0c1dfe3a5687995c9693f5a6
Repo schema: 5fa42b79ea61b0af8d3fe1af48e0308a834bd224
ubuntu@myhost:~$ uname -a
Linux myhost 3.13.0-29-generic #53-Ubuntu SMP Wed Jun 4 21:00:20 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

No stack trace gets written, no cores to be found, so pretty hard to track down. Ran strace attached to the hhvm process and the only thing of any interest was error 4 in libc.

Sorry not to be able to help with more detail; just reporting that we too can reliably reproduce the problem. We rolled back to a previous disk image with 3.12.1 and the problem went away.
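
For anyone comparing kernel log lines: the traps line gives the faulting instruction pointer (ip) and the libc mapping as [base+size], so the offset inside libc is ip minus base. Below is a minimal sketch of that arithmetic, assuming bash. The helper name libc_offset is hypothetical and not part of HHVM or any tool mentioned in this thread.

```shell
# Hypothetical helper: read one kernel "traps:" line on stdin and print
# the offset of the faulting instruction inside the mapped library.
libc_offset() {
  local line ip base
  line=$(cat)
  # Instruction pointer: the hex value after " ip:".
  ip=$(printf '%s\n' "$line" | sed -n 's/.* ip:\([0-9a-f]*\) .*/\1/p')
  # Mapping base: the hex value before "+" in the trailing [base+size].
  base=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\]$/\1/p')
  printf '0x%x\n' $(( 0x$ip - 0x$base ))
}
```

Fed the traps line above, or the one from the original report, this prints 0x1094e4 in both cases, which suggests both crashes hit the same spot in libc-2.19.so. With debug symbols for that exact libc build, a tool such as addr2line could probably map the offset to a function, but that has not been tried here.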

@Orvid
Contributor
Orvid commented Mar 31, 2016

Looking into the issue now, but a backtrace from a -dbg version of the package would be helpful.

@Orvid Orvid added the crash label Mar 31, 2016
@pkirk
pkirk commented Mar 31, 2016

Same boat here on Debian 7.9
Linux www.example.com 3.2.0-4-amd64 #1 SMP Debian 3.2.73-2+deb7u3 x86_64 GNU/Linux
Switched to wheezy-lts-3.12 in the meantime.
Thanks.

@vlcty
vlcty commented Mar 31, 2016

@Orvid
I've installed hhvm-dbg from the repo. How do I produce the backtrace you requested?

@vlcty
vlcty commented Mar 31, 2016

I think I found what you need under /tmp

[Link removed. Was not the right thing]

@Orvid
Contributor
Orvid commented Mar 31, 2016

Attach to the instance of HHVM with GDB and then run bt to get a back trace.

@fredemmott
Contributor

thread apply all bt - otherwise you only get a backtrace for one thread, which is usually not useful - even if it is the thread that crashed
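
The two suggestions above (attach with GDB, then dump every thread) can be combined into one non-interactive call. This is a minimal sketch, assuming bash and a gdb that accepts -p, -batch and -ex. The function name dump_backtraces and the GDB override variable are hypothetical conveniences, not anything HHVM ships.

```shell
# Hypothetical helper: attach to a running process and print a backtrace
# for every thread, then detach and exit.
# $1: PID of the running hhvm process.
# GDB may be set to the path of a different gdb binary (assumption).
dump_backtraces() {
  "${GDB:-gdb}" -p "$1" -batch -ex 'thread apply all bt'
}
```

Usage would be along the lines of `dump_backtraces 12345 > hhvm-bt.txt`, run as root or as the user that owns the process, with the -dbg package installed so the frames have symbols.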

@vlcty
vlcty commented Mar 31, 2016

Well, I don't even get gdb running ... It segfaults there. hhvm-dbg is installed.

root@mineralwasser:# ps aux | grep hhvm
veloc1ty 22753 1.3 1.5 753600 247176 ? Ssl 18:57 0:01 /usr/bin/hhvm --config /etc/hhvm/php.ini --config /etc/hhvm/server.ini --user veloc1ty --mode daemon -vPidFile=/var/run/hhvm/pid
root 22846 0.0 0.0 12748 2052 pts/5 S+ 18:58 0:00 grep hhvm
root@mineralwasser:# gdb /usr/bin/hhvm 22753
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/hhvm...Segmentation fault

@Orvid
Contributor
Orvid commented Mar 31, 2016

I've got it reproducing locally, digging into the fix now.

@fredemmott
Contributor

Sorry, HHVM is known to trigger bugs in old versions of GDB; 7.10 has been fairly heavily used without issues, and 7.11 recently got released - building a new version of gdb usually only takes a few minutes and fixes this kind of issue.

@vlcty
vlcty commented Mar 31, 2016

Oh okay. Well, @Orvid said he was able to reproduce it. He has far more knowledge of that stuff than I do :-)

@pepijnblom

This is plaguing at least one of our servers as well, downgraded to 3.12.1 and the problem has stopped.

@Orvid
Contributor
Orvid commented Mar 31, 2016

Alright, so apparently I was failing to start HHVM correctly to begin with, so I haven't actually been able to reproduce this (I'm continuing to try) :(

@PDowney
PDowney commented Mar 31, 2016

Ouch, this update just crashed like 10 of my sites. All instances of HHVM silently fail after calling a site.

@Orvid Orvid added Diff Ready and removed needs more info labels Apr 1, 2016
@Orvid Orvid self-assigned this Apr 1, 2016
@Orvid
Contributor
Orvid commented Apr 1, 2016

Alright, so, it looks like the issue was with one of the cherry-picks, which was causing a segfault if you tried to run in daemon mode with logging disabled. As a temporary fix, enabling logging should get it running again. I've got a diff up internally to fix this (D3124742).

@r3oath
r3oath commented Apr 1, 2016

Is this definitely the issue? I was getting the segfaults while in daemon mode with logging enabled, which is why I was seeing them in the logs.

@Orvid
Contributor
Orvid commented Apr 1, 2016

There are multiple ways that that same issue can be triggered, so I believe it should fix your issue, even if the temporary fix doesn't apply in your specific case.

@ccnie
ccnie commented Apr 1, 2016

@Orvid I have uploaded a full backtrace with hhvm-dbg, please check it in issue #6941

@igorclark

I forgot to mention that setting hhvm.log.level = Verbose would stop the daemon from running at all. Immediate crash on startup. Leaving the directive out entirely meant it would start up, and only crash after an unspecified number of requests. Which does seem to suggest that writing to the log might have been what was triggering the problem for us.

@roxyxty
roxyxty commented Apr 1, 2016

@igorclark Setting hhvm.log.level to anything less than Warning (Info or Verbose) will prevent the daemon from starting at all.
Also, the only reliable workaround at the moment seems to be setting hhvm to run in server mode (--mode server). I have put that in the init script so I don't have to worry about it on hhvm restart.
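
As a rough illustration of the init-script change described above, the following sketch rewrites a daemon-mode command line into its server-mode equivalent. It assumes bash and a sed that supports -E. The function name to_server_mode is hypothetical, and it only handles the space-separated "--mode daemon" and "-m daemon" forms that appear in this thread.

```shell
# Hypothetical helper: read a command line on stdin and swap
# "--mode daemon" / "-m daemon" for the server-mode equivalent.
# Everything else on the line is passed through unchanged.
to_server_mode() {
  sed -E 's/(^|[[:space:]])(--mode|-m)[[:space:]]+daemon([[:space:]]|$)/\1\2 server\3/'
}
```

Piping the daemon command line from the ps output earlier in the thread through to_server_mode yields the same line with `--mode server`. As far as I understand, server mode stays in the foreground, so whatever supervises the process (init script, monit, systemd) would need to handle backgrounding itself.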

@ccnie
ccnie commented Apr 1, 2016

@igorclark @roxyxty Setting hhvm.log.level = Error has kept the server working fine so far.

@roxyxty
roxyxty commented Apr 1, 2016

@ccnie log level Error is above Warning so yes that works:)
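
To check which case a given server falls into, it can help to read the level straight out of the config. This is a minimal sketch, assuming bash and awk; both function names are hypothetical. The Info/Verbose rule only encodes what people have reported in this thread for 3.13.0 in daemon mode, and it is not a statement about how HHVM itself orders log levels.

```shell
# Hypothetical helper: print the last hhvm.log.level value set in an
# ini file, or "unset" if the directive is absent.
hhvm_log_level() {
  awk -F= '
    /^[ \t]*hhvm\.log\.level[ \t]*=/ {
      v = $2
      gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim surrounding whitespace
      level = v
    }
    END { print (level == "" ? "unset" : level) }
  ' "$1"
}

# Hypothetical helper: succeed if the level is one that was reported
# above to crash 3.13.0 immediately on daemon startup.
reported_startup_crash() {
  case "$1" in
    Info|Verbose) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `reported_startup_crash "$(hhvm_log_level /etc/hhvm/server.ini)" && echo "expect an immediate crash"`. The config path is the one used in the commands earlier in the thread and may differ on your system.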

@pixelive
pixelive commented Apr 1, 2016

Changing my PHP.ini to use hhvm.log.level = Error doesn't work for me on Ubuntu. I'm starting it with service hhvm start.

Ubuntu 14.04.4 LTS

Any idea?

@Agiley
Agiley commented Apr 1, 2016

Tried setting hhvm.log.level = Error as well. Initially it seemed to work better than running with hhvm.log.level = Warning, but after putting some load on hhvm it eventually crashed.

/usr/bin/hhvm -m server seemed to work a bit better, but felt a bit like a hack.

Got tired of trying to fix this so just installed the previous stable version (3.12.1) instead and now everything works again.

Here's how to install 3.12.1 on Ubuntu 14.04:

sudo apt-get remove hhvm && sudo apt-get autoremove
cd /tmp
wget http://dl.hhvm.com/ubuntu/pool/main/h/hhvm/hhvm_3.12.1~trusty_amd64.deb
sudo dpkg -i hhvm_3.12.1~trusty_amd64.deb
sudo apt-get -f install

/usr/bin/hhvm --version

HipHop VM 3.12.1 (rel)
Compiler: tags/HHVM-3.12.1-0-gf516f1bb9046218f89885a220354c19dda6d8f4d
Repo schema: f2e5f39b2ad4a08bcbd90b5d8bcb580f40fba6c8
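
If you need the same steps for several machines, the download URL can be built from the version and codename. This is a small sketch, assuming bash; hhvm_deb_url is a hypothetical name. Only the 3.12.1 / trusty combination is confirmed by the steps above, so any other combination is an assumption about how the package pool is laid out.

```shell
# Hypothetical helper: build the .deb URL used in the steps above.
# $1: HHVM version (e.g. 3.12.1), $2: Ubuntu codename (e.g. trusty).
# The path layout is copied from the wget line above; it is only
# known to be valid for 3.12.1 on trusty.
hhvm_deb_url() {
  printf 'http://dl.hhvm.com/ubuntu/pool/main/h/hhvm/hhvm_%s~%s_amd64.deb\n' "$1" "$2"
}
```

Usage would be something like `wget "$(hhvm_deb_url 3.12.1 trusty)"`. After installing, `sudo apt-mark hold hhvm` should keep a routine apt-get upgrade from pulling 3.13.0 back in until a fixed release is out.
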
@pkirk
pkirk commented Apr 1, 2016

@Orvid isn't this issue the same as #6941 ?

@lesterchan

Yep, setting hhvm.log.level = Error doesn't help on Ubuntu 14.04 either.

Going to fall back to PHP 7.0.5 and wait for HHVM 3.13.1

@Orvid
Contributor
Orvid commented Apr 1, 2016

There appear to be two issues. I've got a fix for one of them, but will need to dig to fix the other.

@Orvid
Contributor
Orvid commented Apr 1, 2016

I do have a fix for the issue where it crashes immediately on startup when run in daemon mode, however I'm having difficulty reproducing the issue where it crashes after serving some requests. For those getting the second issue, what does your configuration file look like?

@roxyxty
roxyxty commented Apr 1, 2016

@Orvid From what I tested (and I'm still using 3.13.0, only in server mode, which works fine), it only crashes immediately in daemon mode if I run it with logging level Info or Verbose. If I run it with default settings, it crashes after 10-15 different requests or so.

@igorclark

Hi there, yep, exactly the same as @roxyxty - Info or Verbose means immediate crash on startup, whereas with no hhvm.log.level set, 10-15 requests sounds about right, and then it crashes.

@webeau
webeau commented Apr 4, 2016

Hate to pile on here, but this is a major issue considering that it affects the release version and not just nightly builds. Hopefully this issue is fixed quickly for both, but I don't think many can wait until the next scheduled release for a fix. I ended up switching off hhvm completely for now, as I imagine other users were forced to do as well. Not every release user would think or know to look at the GitHub issues for this information.

@fredemmott
Contributor

@webeau :

@webeau
webeau commented Apr 4, 2016

Thanks for the link. I tried the apt-get install hhvm=3.12.0 and that didn't work (also apt-get -t=3.12.0 install hhvm) so this link helps a great deal.

@fredemmott
Contributor

I'll file a documentation issue - we should cover downgrading there. apt-get install hhvm=3.12.1~trusty should work (replacing 'trusty' as appropriate for your distribution).

@simpsonjulian

@fredemmott a doc fix would be great. When I realised that this version was broken, I ran aptitude versions hhvm on Ubuntu LTS to see what other options there were, but the 3.12 releases didn't seem to be there.

@fredemmott
Contributor

hhvm/user-documentation#299 - note that the LTS apt source also needs to be used, in addition to that apt command.
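
For the apt route, the version string follows the VERSION~CODENAME pattern shown in the command above. Here is a minimal sketch, assuming bash; hhvm_pin is a hypothetical name, and the pattern is only confirmed in this thread for 3.12.1~trusty.

```shell
# Hypothetical helper: build the package=version argument for apt-get.
# $1: HHVM version (e.g. 3.12.1), $2: distribution codename (e.g. trusty).
hhvm_pin() {
  printf 'hhvm=%s~%s\n' "$1" "$2"
}
```

On a machine with lsb_release available, `sudo apt-get install "$(hhvm_pin 3.12.1 "$(lsb_release -sc)")"` would fill in the codename automatically. This only works if the configured apt source actually offers that version; `apt-cache policy hhvm` lists what apt can see.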

@abvdveen
abvdveen commented Apr 5, 2016

Some additional feedback: I've had the same issue since updating from 3.12 to 3.13 last Friday. I'm running hhvm as a service in daemon mode, with the fastcgi script by Dominic Luechinger. The log level itself doesn't seem to be the cause; rather, it's hhvm attempting to write to the error log. That happens on a PHP error, of course, when the level is Error, for instance, and it occurs sooner with a level such as Verbose. As soon as hhvm tries to write to the log file, it crashes.

However, when I change the mode to server in the service startup script, logging functions as before and hhvm doesn't crash. So I'm using that setup until a fix for 3.13 is ready.

@Orvid
Contributor
Orvid commented Apr 6, 2016

Alright, so I do have the issue reproduced locally, and do have fixes for the actual crashes, however it currently refuses to write to the log file.

As getting the last few issues worked out is taking (much) longer than I'd like, I'm going to revert the commit that is causing the issue, and release it as 3.13.1. I'm in the process of building a version with that commit reverted to make sure there aren't any issues caused by reverting it, and once that's done I'll start building the updated packages.

@r3wt
r3wt commented Apr 6, 2016

every time i choose to upgrade...

@Orvid
Contributor
Orvid commented Apr 6, 2016

Reverting that commit was enough to get things working again, and I kicked off the builds a while ago to build the packages.

The 3.13.1 builds for 14.04, 15.04, 15.10 and Debian 8 are almost finished, and once the Debian 8 build is done, I'll kick off the build for Debian 7. The -dbg packages will come after that.

@LZL0
LZL0 commented Apr 6, 2016

nice job thanks!

@Orvid
Contributor
Orvid commented Apr 7, 2016

The packages for HHVM 3.13.1 for Ubuntu 14.04, 15.04, 15.10, Debian 7 and Debian 8 are all out.

Ubuntu 14.04 and 15.10 both have the -dbg package also built, with the 15.04 and Debian 8 debug packages in progress.

@r3wt
r3wt commented Apr 7, 2016

@Orvid thanks for your hard work. i really appreciate it; however, i'm staying on 3.12.1 just to be safe. for those of us running hhvm under a custom user, it's a huge pain in the ass to upgrade, as /var/run/hhvm must be chown'd each time the package is upgraded, which can be a nightmare in prod. for future reference it might be helpful to include some directions for people who need to do an emergency rollback. here's a gist i apparently wrote last time this happened haha.

https://gist.github.com/r3wt/83629275bd3a153915ef
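
The chown step mentioned above is easy to script so it can run right after each package upgrade. This is a minimal sketch, assuming bash; fix_hhvm_rundir is a hypothetical name and is not taken from the linked gist. The default directory is the /var/run/hhvm path that appears in the PidFile settings earlier in this thread.

```shell
# Hypothetical helper: make sure the run directory exists and is owned
# by the user hhvm runs as.
# $1: user name or numeric uid; $2: run directory (optional).
fix_hhvm_rundir() {
  local dir=${2:-/var/run/hhvm}
  mkdir -p "$dir" && chown "$1" "$dir"
}
```

Run as root, `fix_hhvm_rundir www-data` would reset ownership of the default directory; substitute whichever user your hhvm runs under.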

@lesterchan

Thanks for the fix, i can verify that 3.13.1 works fine.

@leewillis77

Thanks @Orvid - 3.13.1 installed and doesn't crash immediately :)

@roxyxty
roxyxty commented Apr 7, 2016

I can also confirm that 3.13.1 works well so far.
Thanks.

@RalphCorderoy

Thanks @Orvid. Is there a commit ID? I had a look and it didn't seem obvious, so I was probably looking in the wrong place. Also, are any regression tests planned?

@SiebelsTim
Contributor

I think this is 864a9a4

@Orvid
Contributor
Orvid commented Apr 7, 2016

Yep, 864a9a4 is the revert commit. I just kicked off the build for the last of the debug packages, so I'm going to close this.

@Orvid Orvid closed this Apr 7, 2016