PassengerAgent dead with "corrupted double-linked list" #1587

Closed
dmitry opened this issue Aug 26, 2015 · 20 comments

@dmitry dmitry commented Aug 26, 2015

Today we got an outage on one of our servers with the following message in our passenger.log:

*** glibc detected *** PassengerAgent server: corrupted double-linked list: 0x00007f928407a420 ***

I haven't researched what it means, but it looks like it's related to PassengerAgent, not to nginx directly.

After that nginx always responded with 502, and passenger-status wasn't able to complete at all.

Another thing I found in our passenger.log on the same server a few hours before that: https://gist.github.com/dmitry/623378b4fd177c8c03d6

Contributor

@OnixGH OnixGH commented Aug 26, 2015

Can you give us the entire output? This looks like a backtrace (after a crash?) of a thread that's just waiting (no error). The most interesting thread is usually the one that contains the keyword "raise", but the entire output is most helpful.

Also the logs surrounding (especially before) the error are necessary to analyze the problem.
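
As a rough aid for the advice above about finding the thread that contains the keyword "raise", here is a minimal Python sketch that lists each matching line of a crash log with a little surrounding context. The function name `lines_mentioning` and the plain substring match are assumptions made for illustration; the thread does not describe the crash log layout, so this does not try to split the output into threads.

```python
def lines_mentioning(text, keyword="raise", context=2):
    """Return (line_number, surrounding_lines) for each line containing keyword.

    Line numbers are 1-based. `context` is how many lines to keep on
    each side of a match. This is a plain substring search, not a
    backtrace parser.
    """
    lines = text.splitlines()
    hits = []
    for index, line in enumerate(lines):
        if keyword in line:
            start = max(0, index - context)
            end = min(len(lines), index + context + 1)
            hits.append((index + 1, lines[start:end]))
    return hits
```

Calling `lines_mentioning(open(path).read())` on a saved crash log gives a quick list of places to look first; the entire output is still what the maintainers ask for.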

Author

@dmitry dmitry commented Aug 26, 2015

I can't find anything related to this crash in the logs. The log I provided is from a few hours earlier and might be related, but it didn't crash the whole server anyway. If you can help me find out where to get the PassengerAgent logs, I will send you the crash logs if I find them.

Just a few minutes ago I saw another crash, but PassengerAgent recovered and continues to work, so it didn't lead to a hard crash like before. I will send you the log directly by email, as it contains some internal credentials.

Contributor

@OnixGH OnixGH commented Aug 26, 2015

Thanks for the logs! Did you recently upgrade Passenger, and from what version?

We have a similar report from a 5.0.6 -> 5.0.15 upgrade, and it would help a lot to narrow this down.

Author

@dmitry dmitry commented Aug 26, 2015

We have been running 5.0.9 for a few months. No issues like this happened during that time; I only started receiving reports last week.

Just a thought: could this be related to Union Station?
We are going to upgrade from 5.0.9 to 5.0.16 tomorrow; let's see how it behaves. Hopefully there will be no such issues in the future.

PS. Interestingly, none of the monitoring tools (Pingdom, New Relic health check) reported anything strange; only customer support started to receive more issues. But that's more of an internal problem for us :)

Author

@dmitry dmitry commented Aug 27, 2015

After upgrading to 5.0.16, Passenger is still shutting down on both of our application servers:

*** glibc detected *** Passenger core: double free or corruption (!prev): 0x00007fd20404da50 ***

Author

@dmitry dmitry commented Sep 2, 2015

Any progress on resolving this? It happens many times a day, with different types of messages.

The latest one from passenger.log:

*** glibc detected *** Passenger core: free(): invalid next size (normal): 0x00007f17a00180f0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7db26)[0x7f17c7f8db26]
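
Since the crashes show up with different glibc messages, it can help to tally them by type. Below is a minimal Python sketch that parses `*** glibc detected ***` lines of the shape quoted in this thread and counts them per process and message. The regular expression is inferred only from the three lines shown here, and the names `parse_glibc_line` and `summarize` are hypothetical; other glibc versions may format these lines differently.

```python
import re
from collections import Counter

# Pattern inferred from the three messages quoted in this thread:
# *** glibc detected *** <process>: <message>: <address> ***
GLIBC_LINE = re.compile(
    r"^\*\*\* glibc detected \*\*\* (?P<process>.+?): (?P<message>.+): "
    r"(?P<address>0x[0-9a-fA-F]+) \*\*\*$"
)


def parse_glibc_line(line):
    """Return a dict with process, message and address, or None if no match."""
    match = GLIBC_LINE.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count glibc error lines, keyed by (process, message)."""
    counts = Counter()
    for line in lines:
        parsed = parse_glibc_line(line)
        if parsed:
            counts[(parsed["process"], parsed["message"])] += 1
    return counts
```

For example, `summarize(open("passenger.log"))` would give a per-message count, assuming the log file is readable as text from the current directory.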

Member

@FooBarWidget FooBarWidget commented Sep 3, 2015

OnixGH hasn't had any luck so far. Testing with Valgrind didn't yield anything. He is currently on vacation, so I will take over this issue in the meantime. I'm going to give it a try too.

Member

@FooBarWidget FooBarWidget commented Sep 3, 2015

@dmitry Would you be able to provide more detailed crash logs? You already provided one by email, but if you have more, that would be preferable.

If Passenger crashes, you should see a message like this inside passenger.log:

Crash log dumped to /var/tmp/passenger-crash-log.XXXXX

Actually, if you look in /var/tmp you should see a bunch of historical crash logs. Could you send 3 of the most recent logs to hongli@phusion.nl?
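
To pick out the most recent crash logs as requested above, here is a minimal Python sketch. It assumes the files in /var/tmp follow the `passenger-crash-log.XXXXX` naming shown in the message and that file modification time is a good enough proxy for recency; the function name `recent_crash_logs` is hypothetical.

```python
from pathlib import Path


def recent_crash_logs(directory="/var/tmp", count=3):
    """Return up to `count` passenger-crash-log.* files, newest first."""
    logs = [
        path
        for path in Path(directory).glob("passenger-crash-log.*")
        if path.is_file()
    ]
    # Sort by modification time so the most recently written log comes first.
    logs.sort(key=lambda path: path.stat().st_mtime, reverse=True)
    return logs[:count]


if __name__ == "__main__":
    for path in recent_crash_logs():
        print(path)
```

Check the files for internal credentials before sending them anywhere, as noted earlier in the thread.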

Author

@dmitry dmitry commented Sep 3, 2015

@FooBarWidget thanks for the information. Just sent you archives with the log files.

Member

@FooBarWidget FooBarWidget commented Sep 3, 2015

I think I have fixed this. Could you test the GH-1587 branch?

Author

@dmitry dmitry commented Sep 3, 2015

I've passed a ticket to our SysOps. Most likely I will let you know tomorrow evening whether it fixed the issue.

Thanks!

Author

@dmitry dmitry commented Sep 3, 2015

The branch code has been running for the last few hours without any issues in the logs. I will let you know tomorrow if it is still clean.

Author

@dmitry dmitry commented Sep 4, 2015

Hongli, it looks fixed now; at least we haven't noticed any more issues since yesterday's update. Today we had downtime because the branch isn't the Enterprise branch (some features were disabled) and I hadn't noticed, even though SysOps told me yesterday :(

We are now back to 5.0.16 but waiting for the 5.0.17 release with this fix.

Member

@FooBarWidget FooBarWidget commented Sep 4, 2015

Thanks for the confirmation. We'll release OSS and Enterprise 5.0.17 as soon as possible.

Member

@FooBarWidget FooBarWidget commented Sep 4, 2015

It should be noted: if you installed Passenger Enterprise through the source tarball or the gem, then you can apply the patch at commit b8d08b on top of the Enterprise source directory.

Author

@dmitry dmitry commented Sep 4, 2015

Thanks for the information.

I think we can live with the current version for the next few days; I'm a little bit scared to upgrade on a Friday.

Author

@dmitry dmitry commented Sep 7, 2015

@FooBarWidget do you have an idea when the next version of Passenger will be released? (We had 3 outages over the weekend.)

Member

@FooBarWidget FooBarWidget commented Sep 8, 2015

A release was planned for today.

Member

@FooBarWidget FooBarWidget commented Sep 8, 2015

5.0.17 has been released. The release announcement hasn't been published yet, but you can already download the upgrade. The Debian and RPM packages are being built and will follow shortly.

Author

@dmitry dmitry commented Sep 8, 2015

@FooBarWidget thank you, as always, for the quick feedback! 👍
