Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

session_create_id causes SIGSEGV #1158

Closed
iluuu1994 opened this issue Jul 5, 2019 · 37 comments
Closed

session_create_id causes SIGSEGV #1158

iluuu1994 opened this issue Jul 5, 2019 · 37 comments
Labels
Milestone

Comments

@iluuu1994
Copy link

iluuu1994 commented Jul 5, 2019

Describe the bug

Since we have updated to SimpleSaml 1.17 we get daily SIGSEGV of PHP-FPM due to the new call to session_create_id():

$sessionId = session_create_id();

/var/log/kern.log

Jul  5 16:00:03 servername kernel: [265630.815422] traps: php-fpm7.3[21032] general protection ip:5580a3494efd sp:7fff8cff8498 error:0
Jul  5 16:00:03 servername kernel: [265630.815428]  in php-fpm7.3[5580a3231000+3ee000]

/var/log/php7.3-fpm.log

[05-Jul-2019 16:24:44] WARNING: [pool www] child 21558 exited on signal 11 (SIGSEGV) after 0.454303 seconds from start
[05-Jul-2019 16:24:44] NOTICE: [pool www] child 21561 started

This is the PHP version we use:

PHP 7.3.6-1+0~20190531112735.39+stretch~1.gbp6131b7 (cli) (built: May 31 2019 11:27:35) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.3.6, Copyright (c) 1998-2018 Zend Technologies
    with Zend OPcache v7.3.6-1+0~20190531112735.39+stretch~1.gbp6131b7, Copyright (c) 1999-2018, by Zend Technologies

To Reproduce

Unfortunately, we could not recognize any patterns on how to reproduce the issue. It seems to happen randomly.

Obviously this is not your issue to solve but I'm posting this here for two reason:

  • Have you encountered this issue before?
  • Could SimpleSaml add an option to use bin2hex(openssl_random_pseudo_bytes(16)) instead?

Thank you!

@iluuu1994 iluuu1994 changed the title session_create_id causes segfault session_create_id causes SIGSEGV Jul 5, 2019
@tvdijen
Copy link
Member

tvdijen commented Jul 5, 2019

Ha, that was the same behaviour I was seeing but I didn't look into it any further yet due to time restrictions! It does indeed happen randomly.. I've been running 1.17 on 7.2 as well and it doesn't segfault there.. It started for me when I upgraded to 7.3..

@iluuu1994 have you reported this at PHP already?

@thijskh this was the unexpected crashes I have mentioned on the OC-mailinglist when we we're discussing PHP73..

@iluuu1994
Copy link
Author

iluuu1994 commented Jul 5, 2019

I'm honestly relieved to hear that we're not the only ones with this issue. Do you think this is critical enough to switch to the old session id generation until this issue is fixed?

have you reported this at PHP already?

Not yet. I was trying to get a core dump to get some debugging information but didn't succeed yet. I'll try tomorrow and I'll create a bug report then.

Thank you @tvdijen for your response!

@tvdijen
Copy link
Member

tvdijen commented Jul 5, 2019

I'd like to have a look at the core dump and see how they respond to the bug-report before we start jumping hoops, but it's definitely an option to disable it until a solution is in place.

@iluuu1994
Copy link
Author

@tvdijen Fair enough! I can send a PR for the option tomorrow.

@tvdijen
Copy link
Member

tvdijen commented Jul 5, 2019

Thanks for the offer, but I think I prefer a && version_compare(phpversion(), '7.3', '<') added to this line.. If we add this to 1.17 it would buy us some time to get to a more permanent solution. No need for a PR for such a one-liner ;)

@iluuu1994
Copy link
Author

Great! Thanks for your help!

@tvdijen
Copy link
Member

tvdijen commented Jul 9, 2019

Can either of you confirm that the patch by @mrtronje gets rid of the SIGSEGV's?

@iluuu1994
Copy link
Author

iluuu1994 commented Jul 9, 2019

@tvdijen Yes, we have been using it in production for a few days and have no more SIGSEVs.

@tvdijen
Copy link
Member

tvdijen commented Jul 9, 2019

@iluuu1994 Did you manage to get some debug logging and to report a bug @php?

@jaimeperez
Copy link
Member

I have mixed feelings about this. On the one hand, it's really bad that we're getting SIGSEGVs with PHP 7.3 by simply using the software. On the other hand, that's clearly a bug in PHP that should be solved. The issue I see with the one-liner is that it would downgrade session ID generation for everyone using PHP 7.3, no matter if the bug is fixed or not. So if PHP 7.3.8 is released tomorrow fixing this issue, people running SSP 1.17.3 would still use non-collision-resistant session IDs unless we release SSP 1.17.4 reverting the one-liner and they install it.

I've been looking for a bug report, but didn't see anything. We should definitely report this to PHP and get it fixed there. Whether we should apply the one-liner as a temporary workaround, I honestly don't know...

@thijskh do you have any comments on this?

@iluuu1994
Copy link
Author

@tvdijen Unfortunately not. We haven't had a lot of time lately to look into it. I'll have to do it in my free time.

@jaimeperez

would still use non-collision-resistant session IDs unless we release SSP 1.17.4 reverting the one-liner and they install it.

Another simple version check will do it once it's fixed. A slightly less optimal session id generation is preferable a crashing application.

@thijskh
Copy link
Member

thijskh commented Jul 9, 2019

I agree with @jaimeperez that I'd be uneasy to disable this for all future PHP versions. We'd definitely need some way to come back to this issue later to see if it's still relevant. Maybe we can apply the workaround only in 1.17.3 and keep this issue open for 1.18. And indeed obviously file an issue with PHP itself.

@iluuu1994
Copy link
Author

Why not just:

if (
    function_exists('session_create_id')
    && (
        version_compare(phpversion(), '7.3', '<')
        || version_compare(phpversion(), '7.3.future_patched_version', '>=')
    )
) {

After a reasonable amount of time this could be removed.

@thijskh
Copy link
Member

thijskh commented Jul 9, 2019

That is useful in the future but not something we can do now since we don't know the value of future_patched_version yet.

@jaimeperez
Copy link
Member

I agree that having PHP crash is much, much worse than anything else.

What about applying this one-liner fix, releasing 1.17.3, then reverting the fix? We keep the issue open, and before releasing next version, we check if there is a fix in PHP already. If not, apply the same patch. If it is resolved, then apply your suggested check to avoid only affected versions, and then commit that to master as well.

Does that sound reasonable?

@iluuu1994
Copy link
Author

Sounds good 🙂

@thijskh
Copy link
Member

thijskh commented Jul 9, 2019

Let's go for it

@thijskh
Copy link
Member

thijskh commented Jul 9, 2019

I have applied the temporary workaround on the 1.17 branch now.

@iluuu1994
Copy link
Author

Thanks guys!

@tymees
Copy link

tymees commented Jul 10, 2019

Hey all,

We've encountered this SIGSEGV on our servers too, running on PHP 7.3 as well. However, we only encountered this with SSP 1.17.2, downgrading to 1.17.1 fixed this issue for us.

I don't see any changes that directly relate to the session generation when looking at the commit comparisons between these versions. It could be that I'm overlooking something, I don't really know this codebase of course.

Does anyone know why this downgrade solved it for us? (Just to be sure that there isn't a different issue at play here and I can upgrade safely)

@tvdijen
Copy link
Member

tvdijen commented Jul 10, 2019

The only significant change between those versions is that we've started using Webmozart-assertions on the saml2-library.. What happens if you run v1.17.2 and downgrade simplesaml/saml2 to v3.3.8?

@jaimeperez
Copy link
Member

We have 1.17.3 in place now with the fix for this 😄

@tymees
Copy link

tymees commented Jul 10, 2019

The only significant change between those versions is that we've started using Webmozart-assertions on the saml2-library.. What happens if you run v1.17.2 and downgrade simplesaml/saml2 to v3.3.8?

This didn't fix the errors sadly. However, 1.17.3 does seem to fix it for now.

That still doesn't explain why 1.17.1 seems to work for us, but I don't think it's worth it to keep digging for an answer.

@txigreman
Copy link

Hi guys,

The issue is not related exclusively to PHP 7.3, I'm getting the SIGSEGV error en 7.2. I changed the comparision of the workaround from 7.3 to 7.2 and now its working. It was previously failing with both PHP 7.2.19 and 7.2.20.

@iluuu1994
Copy link
Author

We never tried this on 7.2. We were stuck on PHP 7.0 and thus also on an older version of SimpleSaml that didn't use session_create_id. Thanks for letting us know.

If anybody can get a stack trace please post it in the comments of the bug report.

@jaimeperez jaimeperez added this to the 1.18 milestone Sep 16, 2019
@jaimeperez
Copy link
Member

Hi guys,

I'm considering introducing a regression deliberately. We are on the verge of release 1.18 (at least the first release candidate), and I would like to get rid of this temporary fix if possible. I've been trying to reproduce this locally in the command line with the two versions of PHP that are reportedly causing crashes (7.2 and 7.3). Essentially, I'm creating millions of session IDs with session_create_id() in a loop. So far, I've been unable to reproduce it.

@iluuu1994, @txigreman, @tymees would you be willing to give 1.18-rc1 a try and tell us if you still get those crashes or not?

@iluuu1994
Copy link
Author

Essentially, I'm creating millions of session IDs with session_create_id() in a loop

I've tried that back then too but couldn't reproduce it this way. I highly doubt the issue is gone since there was no activity in the PHP bug report...

@tvdijen
Copy link
Member

tvdijen commented Sep 16, 2019

I just deployed current master without the fix on a fully updated PHP 7.3 install.. Let's see what happens..

@txigreman
Copy link

Sorry, but I can not test it because it only happens in the production server :/
For me the way to replicate the error was changing the value of the SimpleSAML cookie. I hope it works for you!

@tymees
Copy link

tymees commented Sep 18, 2019

I've deployed the latest master on our acceptation server, so far no issues...

I'll do some testing next week when I come back to work

@tvdijen
Copy link
Member

tvdijen commented Sep 24, 2019

I found out how to create coredumps and it's actually quite easy:
https://ma.ttias.be/generate-php-core-dumps-segfaults-php-fpm/

I have enabled it on my 7.2 + 7.3 machine to add to the chance to catch one.. It would help if you guys could all do this, so we can get this fixed by PHP ASAP..

@iluuu1994
Copy link
Author

@tvdijen Yeah those are the same instructions we've followed but for some reason the core dump file was never generated. Hopefully you'll have more luck.

jaimeperez added a commit that referenced this issue Sep 24, 2019
The issue seems to still be in place.
tvdijen pushed a commit that referenced this issue Nov 2, 2019
The issue seems to still be in place.
@tvdijen tvdijen modified the milestones: 1.18, 1.19 Nov 28, 2019
@tvdijen
Copy link
Member

tvdijen commented Dec 24, 2019

I haven't been able to generate core dumps with the provided instruction either.. @tymees have you had any problems with more recent versions of SSP?

@tymees
Copy link

tymees commented Dec 25, 2019

Not that I know of no, error logs look clean too!

@thijskh
Copy link
Member

thijskh commented Feb 25, 2020

Given the feedback above I think it’s ok to close this now.

@thijskh thijskh closed this as completed Feb 25, 2020
@jaimeperez
Copy link
Member

1.18 has been using the temporary patch we created as well, so the fact that we haven't had any more people observing segfaults doesn't mean the issue is gone, unfortunately 😞

@jaimeperez jaimeperez reopened this Feb 25, 2020
@jaimeperez
Copy link
Member

My bad, Tim just corrected me that all the feedback we have now (confirming the issue is gone) was based on master, which didn't have the workaround in place, so I'm closing it back. Sorry for the noise!

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

6 participants