
high hhvm CPU usage when idle after high usage #2069

Closed
gasparfm opened this issue Mar 15, 2014 · 11 comments

Comments

@gasparfm

Hi,
I'm using hhvm 2.4.2 on an Ubuntu Server 12.04 machine with Apache 2.2 in FastCGI mode, and I'm using a WordPress blog for all my tests. Everything works out of the box, but then from another server I run:

$ ab -n 50000 -c 400 http://my/server

The test goes fine, but after the test the hhvm process keeps eating 100% CPU until I restart the server. How can I find out what hhvm is doing? I've tested php-fpm and it doesn't show the same behavior.

Thanks

@ptarjan
Contributor

ptarjan commented Mar 18, 2014

What is the server doing? Can you run strace -p <pid> when it is stuck?

@gasparfm
Author

When I do strace -p [hhvm-pid] it just shows:
root@cgitest:/etc/init.d# strace -p 6401
Process 6401 attached - interrupt to quit
futex(0x7f813c226854, FUTEX_WAIT_PRIVATE, 1, NULL

I tested some more:

root@cgitest:/etc/init.d# strace -ifTv -p 6401
It repeats something like this over and over:
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 825117}, NULL) = 0 <0.000096>
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 828357}, NULL) = 0 <0.003144>
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 832355}, NULL) = 0 <0.003877>
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 832553}, NULL) = 0 <0.000091>
[pid 6749] [ 7fffb71b3c43] <... gettimeofday resumed> {1395550735, 832644}, NULL) = 0 <0.189691>
[pid 6831] [ 7fffb71b3c43] gettimeofday( <unfinished ...>
[pid 6632] [ 7fffb71b3c43] <... gettimeofday resumed> {1395550735, 832667}, NULL) = 0 <0.015994>
[pid 6831] [ 7fffb71b3c43] <... gettimeofday resumed> {1395550735, 833006}, NULL) = 0 <0.000311>
[pid 6632] [ 7fffb71b3c43] gettimeofday( <unfinished ...>
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 833235}, NULL) = 0 <0.000101>
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 836466}, NULL) = 0 <0.000102>
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 837046}, NULL) = 0 <0.000473>
[pid 6585] [ 7fffb71b3c43] gettimeofday( <unfinished ...>
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 837248}, NULL) = 0 <0.000092>
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 837445}, NULL) = 0 <0.000090>
[pid 6567] [ 7fffb71b3c43] gettimeofday( <unfinished ...>
[pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 844353}, NULL) = 0 <0.004000>
[pid 6698] [ 7fffb71b3c43] gettimeofday( <unfinished ...>
[pid 6831] [ 7fffb71b3c43] gettimeofday( <unfinished ...>
[pid 6632] [ 7fffb71b3c43] <... gettimeofday resumed> {1395550735, 845862}, NULL) = 0 <0.012884>
[pid 6570] [ 7f8141cb305d] <... write resumed> ) = 97 <0.021152>
[pid 6831] [ 7fffb71b3c43] <... gettimeofday resumed> {1395550735, 848362}, NULL) = 0 <0.002476>
[pid 6570] [ 7f8141cb305d] write(16, "db6666a e8 PHP::/var/www/"..., 97 <unfinished ...>
[pid 6831] [ 7fffb71b3c43] gettimeofday( <unfinished ...>
[pid 6570] [ 7f8141cb305d] <... write resumed> ) = 97 <0.000145>
[pid 6831] [ 7fffb71b3c43] <... gettimeofday resumed> {1395550735, 848668}, NULL) = 0 <0.000128>
[pid 6570] [ 7fffb71b3c43] gettimeofday( <unfinished ...>
[pid 6831] [ 7fffb71b3c43] gettimeofday( <unfinished ...>
....
Sometimes it does this only for a while, but sometimes it keeps eating CPU all night long. I've been testing a nightly build (downloaded last Monday) and it happens much less frequently.
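For what it's worth, the repeating pattern in a capture like the one above can be quantified from a saved strace log. Here's a quick sketch (a hypothetical helper, not part of HHVM or strace) that tallies which syscalls dominate:

```python
import re
from collections import Counter

# Matches the syscall name in `strace -f` lines of the form:
#   [pid 6831] [ 7fffb71b3c43] gettimeofday({1395550735, 825117}, NULL) = 0 <0.000096>
# "<... resumed>" continuation lines are intentionally skipped so that
# unfinished/resumed pairs are only counted once.
SYSCALL_RE = re.compile(r'\]\s+(\w+)\(')

def count_syscalls(log_lines):
    """Tally syscall names per log, e.g. from `strace -f -p <pid> -o log` output."""
    counts = Counter()
    for line in log_lines:
        m = SYSCALL_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts
```

Run over a saved log (`strace -f -p <pid> -o hhvm.log`, then `count_syscalls(open("hhvm.log"))`), a capture like the one above would show gettimeofday dominating, which is consistent with some thread polling the clock in a tight loop.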

@scannell
Contributor

scannell commented Apr 2, 2014

Glad to hear it has improved at least. Without a reliable repro (or more investigation to pinpoint the issue) we're probably not going to make much progress on this.

@ptarjan
Contributor

ptarjan commented May 12, 2014

Sadly we don't have enough info for this to be actionable. Please re-open if the issue still persists and there is something we can do to help.

@ptarjan ptarjan closed this as completed May 12, 2014
@oliwarner

I've just installed hhvm from the repo (3.0.1~precise) and had similar behaviour. I hit it with a few pages (nothing nearly as heavy as the ab test) and it got stuck cycling at 100%. Other requests still make it through; it's just chewing on one core indefinitely.

Mine is sitting behind nginx but still using fastcgi.

@chielsen

I have the exact same thing as the original poster with 3.5.0 (rel). I see no other errors. Any advice on how to debug further?

@jwatzman
Contributor

A backtrace from the stuck thread could be useful.

@chielsen

How do I do that?

@jwatzman
Contributor

Run sudo gdb $(which hhvm) $(pidof hhvm), then thread apply all bt at the gdb prompt.
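If attaching interactively is awkward (e.g. on a production box), the same backtrace can be captured non-interactively with gdb's batch mode. A small sketch that builds the command line (the helper name and the hhvm path are illustrative, not part of any tool):

```python
def gdb_backtrace_cmd(hhvm_path, pid):
    """Build a non-interactive gdb invocation that dumps a backtrace of every
    thread, equivalent to running `thread apply all bt` at the gdb prompt.
    Run it with sudo and redirect stdout to a file to keep the dump."""
    return [
        "gdb", "-batch",                   # exit after running the -ex commands
        "-ex", "set pagination off",       # don't stop for "--Type <RET>--" prompts
        "-ex", "thread apply all bt",      # backtrace every thread
        hhvm_path, str(pid),               # binary to symbolize, pid to attach to
    ]
```

For example, `subprocess.run(["sudo"] + gdb_backtrace_cmd("/usr/bin/hhvm", 6401), stdout=open("bt.txt", "w"))` would leave the full backtrace in bt.txt.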

@fredemmott
Contributor

Deleted @ksaltik's comment for spamming it all over the place.

@steelbrain
Contributor

Thanks @fredemmott much appreciated!


8 participants