Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

session_create_id causes SIGSEGV #1158

Open
iluuu1994 opened this issue Jul 5, 2019 · 32 comments

Comments

@iluuu1994
Copy link

commented Jul 5, 2019

Describe the bug

Since we have updated to SimpleSaml 1.17 we get daily SIGSEGV of PHP-FPM due to the new call to session_create_id():

$sessionId = session_create_id();

/var/log/kern.log

Jul  5 16:00:03 servername kernel: [265630.815422] traps: php-fpm7.3[21032] general protection ip:5580a3494efd sp:7fff8cff8498 error:0
Jul  5 16:00:03 servername kernel: [265630.815428]  in php-fpm7.3[5580a3231000+3ee000]

/var/log/php7.3-fpm.log

[05-Jul-2019 16:24:44] WARNING: [pool www] child 21558 exited on signal 11 (SIGSEGV) after 0.454303 seconds from start
[05-Jul-2019 16:24:44] NOTICE: [pool www] child 21561 started

This is the PHP version we use:

PHP 7.3.6-1+0~20190531112735.39+stretch~1.gbp6131b7 (cli) (built: May 31 2019 11:27:35) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.3.6, Copyright (c) 1998-2018 Zend Technologies
    with Zend OPcache v7.3.6-1+0~20190531112735.39+stretch~1.gbp6131b7, Copyright (c) 1999-2018, by Zend Technologies

To Reproduce

Unfortunately, we could not recognize any patterns on how to reproduce the issue. It seems to happen randomly.

Obviously this is not your issue to solve but I'm posting this here for two reason:

  • Have you encountered this issue before?
  • Could SimpleSaml add an option to use bin2hex(openssl_random_pseudo_bytes(16)) instead?

Thank you!

@iluuu1994 iluuu1994 changed the title session_create_id causes segfault session_create_id causes SIGSEGV Jul 5, 2019
@tvdijen

This comment has been minimized.

Copy link
Member

commented Jul 5, 2019

Ha, that was the same behaviour I was seeing but I didn't look into it any further yet due to time restrictions! It does indeed happen randomly.. I've been running 1.17 on 7.2 as well and it doesn't segfault there.. It started for me when I upgraded to 7.3..

@iluuu1994 have you reported this at PHP already?

@thijskh this was the unexpected crashes I have mentioned on the OC-mailinglist when we we're discussing PHP73..

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Jul 5, 2019

I'm honestly relieved to hear that we're not the only ones with this issue. Do you think this is critical enough to switch to the old session id generation until this issue is fixed?

have you reported this at PHP already?

Not yet. I was trying to get a core dump to get some debugging information but didn't succeed yet. I'll try tomorrow and I'll create a bug report then.

Thank you @tvdijen for your response!

@tvdijen

This comment has been minimized.

Copy link
Member

commented Jul 5, 2019

I'd like to have a look at the core dump and see how they respond to the bug-report before we start jumping hoops, but it's definitely an option to disable it until a solution is in place.

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Jul 5, 2019

@tvdijen Fair enough! I can send a PR for the option tomorrow.

@tvdijen

This comment has been minimized.

Copy link
Member

commented Jul 5, 2019

Thanks for the offer, but I think I prefer a && version_compare(phpversion(), '7.3', '<') added to this line.. If we add this to 1.17 it would buy us some time to get to a more permanent solution. No need for a PR for such a one-liner ;)

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Jul 5, 2019

Great! Thanks for your help!

mrtronje added a commit to mrtronje/simplesamlphp that referenced this issue Jul 6, 2019
@tvdijen

This comment has been minimized.

Copy link
Member

commented Jul 9, 2019

Can either of you confirm that the patch by @mrtronje gets rid of the SIGSEGV's?

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Jul 9, 2019

@tvdijen Yes, we have been using it in production for a few days and have no more SIGSEVs.

@tvdijen

This comment has been minimized.

Copy link
Member

commented Jul 9, 2019

@iluuu1994 Did you manage to get some debug logging and to report a bug @php?

@jaimeperez

This comment has been minimized.

Copy link
Member

commented Jul 9, 2019

I have mixed feelings about this. On the one hand, it's really bad that we're getting SIGSEGVs with PHP 7.3 by simply using the software. On the other hand, that's clearly a bug in PHP that should be solved. The issue I see with the one-liner is that it would downgrade session ID generation for everyone using PHP 7.3, no matter if the bug is fixed or not. So if PHP 7.3.8 is released tomorrow fixing this issue, people running SSP 1.17.3 would still use non-collision-resistant session IDs unless we release SSP 1.17.4 reverting the one-liner and they install it.

I've been looking for a bug report, but didn't see anything. We should definitely report this to PHP and get it fixed there. Whether we should apply the one-liner as a temporary workaround, I honestly don't know...

@thijskh do you have any comments on this?

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Jul 9, 2019

@tvdijen Unfortunately not. We haven't had a lot of time lately to look into it. I'll have to do it in my free time.

@jaimeperez

would still use non-collision-resistant session IDs unless we release SSP 1.17.4 reverting the one-liner and they install it.

Another simple version check will do it once it's fixed. A slightly less optimal session id generation is preferable a crashing application.

@thijskh

This comment has been minimized.

Copy link
Member

commented Jul 9, 2019

I agree with @jaimeperez that I'd be uneasy to disable this for all future PHP versions. We'd definitely need some way to come back to this issue later to see if it's still relevant. Maybe we can apply the workaround only in 1.17.3 and keep this issue open for 1.18. And indeed obviously file an issue with PHP itself.

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Jul 9, 2019

Why not just:

if (
    function_exists('session_create_id')
    && (
        version_compare(phpversion(), '7.3', '<')
        || version_compare(phpversion(), '7.3.future_patched_version', '>=')
    )
) {

After a reasonable amount of time this could be removed.

@thijskh

This comment has been minimized.

Copy link
Member

commented Jul 9, 2019

That is useful in the future but not something we can do now since we don't know the value of future_patched_version yet.

@jaimeperez

This comment has been minimized.

Copy link
Member

commented Jul 9, 2019

I agree that having PHP crash is much, much worse than anything else.

What about applying this one-liner fix, releasing 1.17.3, then reverting the fix? We keep the issue open, and before releasing next version, we check if there is a fix in PHP already. If not, apply the same patch. If it is resolved, then apply your suggested check to avoid only affected versions, and then commit that to master as well.

Does that sound reasonable?

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Jul 9, 2019

Sounds good 🙂

@thijskh

This comment has been minimized.

Copy link
Member

commented Jul 9, 2019

Let's go for it

@thijskh

This comment has been minimized.

Copy link
Member

commented Jul 9, 2019

I have applied the temporary workaround on the 1.17 branch now.

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Jul 9, 2019

Thanks guys!

@tymees

This comment has been minimized.

Copy link

commented Jul 10, 2019

Hey all,

We've encountered this SIGSEGV on our servers too, running on PHP 7.3 as well. However, we only encountered this with SSP 1.17.2, downgrading to 1.17.1 fixed this issue for us.

I don't see any changes that directly relate to the session generation when looking at the commit comparisons between these versions. It could be that I'm overlooking something, I don't really know this codebase of course.

Does anyone know why this downgrade solved it for us? (Just to be sure that there isn't a different issue at play here and I can upgrade safely)

@tvdijen

This comment has been minimized.

Copy link
Member

commented Jul 10, 2019

The only significant change between those versions is that we've started using Webmozart-assertions on the saml2-library.. What happens if you run v1.17.2 and downgrade simplesaml/saml2 to v3.3.8?

@jaimeperez

This comment has been minimized.

Copy link
Member

commented Jul 10, 2019

We have 1.17.3 in place now with the fix for this 😄

@tymees

This comment has been minimized.

Copy link

commented Jul 10, 2019

The only significant change between those versions is that we've started using Webmozart-assertions on the saml2-library.. What happens if you run v1.17.2 and downgrade simplesaml/saml2 to v3.3.8?

This didn't fix the errors sadly. However, 1.17.3 does seem to fix it for now.

That still doesn't explain why 1.17.1 seems to work for us, but I don't think it's worth it to keep digging for an answer.

@txigreman

This comment has been minimized.

Copy link

commented Jul 29, 2019

Hi guys,

The issue is not related exclusively to PHP 7.3, I'm getting the SIGSEGV error en 7.2. I changed the comparision of the workaround from 7.3 to 7.2 and now its working. It was previously failing with both PHP 7.2.19 and 7.2.20.

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Jul 29, 2019

We never tried this on 7.2. We were stuck on PHP 7.0 and thus also on an older version of SimpleSaml that didn't use session_create_id. Thanks for letting us know.

If anybody can get a stack trace please post it in the comments of the bug report.

@jaimeperez jaimeperez added this to the 1.18 milestone Sep 16, 2019
@jaimeperez

This comment has been minimized.

Copy link
Member

commented Sep 16, 2019

Hi guys,

I'm considering introducing a regression deliberately. We are on the verge of release 1.18 (at least the first release candidate), and I would like to get rid of this temporary fix if possible. I've been trying to reproduce this locally in the command line with the two versions of PHP that are reportedly causing crashes (7.2 and 7.3). Essentially, I'm creating millions of session IDs with session_create_id() in a loop. So far, I've been unable to reproduce it.

@iluuu1994, @txigreman, @tymees would you be willing to give 1.18-rc1 a try and tell us if you still get those crashes or not?

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Sep 16, 2019

Essentially, I'm creating millions of session IDs with session_create_id() in a loop

I've tried that back then too but couldn't reproduce it this way. I highly doubt the issue is gone since there was no activity in the PHP bug report...

@tvdijen

This comment has been minimized.

Copy link
Member

commented Sep 16, 2019

I just deployed current master without the fix on a fully updated PHP 7.3 install.. Let's see what happens..

@txigreman

This comment has been minimized.

Copy link

commented Sep 16, 2019

Sorry, but I can not test it because it only happens in the production server :/
For me the way to replicate the error was changing the value of the SimpleSAML cookie. I hope it works for you!

@tymees

This comment has been minimized.

Copy link

commented Sep 18, 2019

I've deployed the latest master on our acceptation server, so far no issues...

I'll do some testing next week when I come back to work

@tvdijen

This comment has been minimized.

Copy link
Member

commented Sep 24, 2019

I found out how to create coredumps and it's actually quite easy:
https://ma.ttias.be/generate-php-core-dumps-segfaults-php-fpm/

I have enabled it on my 7.2 + 7.3 machine to add to the chance to catch one.. It would help if you guys could all do this, so we can get this fixed by PHP ASAP..

@iluuu1994

This comment has been minimized.

Copy link
Author

commented Sep 24, 2019

@tvdijen Yeah those are the same instructions we've followed but for some reason the core dump file was never generated. Hopefully you'll have more luck.

jaimeperez added a commit that referenced this issue Sep 24, 2019
The issue seems to still be in place.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.