Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(profiling): potential allocation profiling crashes with certain opcodes #2352

Merged
merged 16 commits into from
Jan 10, 2024

Conversation

morrisonlevi
Copy link
Collaborator

@morrisonlevi morrisonlevi commented Nov 10, 2023

Description

TODO: update the whole description for state of the current patch.

On PHP 7.4 and on PHP 8.0 before 8.0.26, the engine doesn't SAVE_OPLINE(); for the ZEND_GENERATOR_CREATE opcode. If the allocation profiler triggers on an allocation in this state, then it will sigsegv because the opline is non-null but invalid.

Note that the SAVE_OPLINE(); is missing on PHP < 7.4 as well, but on those versions i_init_func_execute_data initializes the opline. Coupled with the new test, we are confident that we don't need to install an opcode handler for those versions.

Who is affected by the crash?

  • If you are running PHP 8.0.0 up to and including 8.0.25, then you are affected if you use a generator and the allocation profiler is enabled, and it's been enabled by default since dd-trace-php 0.89.0 (todo: double-check).
  • On PHP 7, if you not running the tracer (only the profiler), then you are affected. This configuration is very rare, generally only Datadog engineers run this configuration.

Everything else is unaffected. To be clear:

  • On PHP 7 you are fine if you also run the tracer (highly likely).
  • On PHP 8, if you are on 8.0.26 or higher, then you are unaffected.
  • On MacOS you are fine

In summary, no known customers are affected. Nearly everyone on PHP 7 runs the tracer, and nearly everyone is on a patched version of PHP 8.0, or on PHP 8.1 or newer. If you are on an affected version of PHP 8.0, as a quick mitigation you can disable the allocation profile by setting the ini datadog.profiling.allocation_enabled=0. You can do this with config mode of the setup script:

# Do this after `php datadog-setup.php --enable-profiling ...`
php datadog-setup.php config set --php-bin all \
    -ddatadog.profiling.allocation_enabled=0

Readiness checklist

  • Changelog has been added to the release document.
  • Tests added for this feature/bug.

Reviewer checklist

  • Appropriate labels assigned.
  • Milestone is set.

@morrisonlevi morrisonlevi added 🐛 bug Something isn't working cat:app-crash profiling Relates to the Continuous Profiler labels Nov 10, 2023
@morrisonlevi morrisonlevi added this to the 0.94.0 milestone Nov 10, 2023
@morrisonlevi morrisonlevi marked this pull request as ready for review November 10, 2023 12:17
@morrisonlevi morrisonlevi requested a review from a team as a code owner November 10, 2023 12:17
@pr-commenter
Copy link

pr-commenter bot commented Nov 10, 2023

Benchmarks

Benchmark execution time: 2023-12-29 17:05:07

Comparing candidate commit c004213 in PR branch levi-and-florian/fix-generator-create-crash with baseline commit 9f4cb76 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 18 metrics, 3 unstable metrics.


/* Issue is fixed in 8.0.12:
* https://github.com/php/php-src/commit/ec54ffad1e3b15fedfd07f7d29d97ec3e8d1c45a
* The tracer didn't save us here. Fortunately, throwing destru
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: fix this comment lol

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I remember. It is not only about fixing the comment, but try and see if the fix is enough, as there is a zend_array_dup() call few lines before the SAVE_OPLINE() happens. I'll check if I can trigger this, I highly assume it will crash as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we need to adjust the version range too. If the bad opline is still on PHP 8.0.12+, then we need to install the opcode handler on more versions too (maybe up to latest 😞).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jup, we need to install up to latest, can you check bf3fd02

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I add another commit that cleans things up and adds a pointless test for ZEND_BIND_STATIC with a todo.

@realFlowControl realFlowControl force-pushed the levi-and-florian/fix-generator-create-crash branch from 8ec66a9 to ad46787 Compare November 14, 2023 15:29
@morrisonlevi morrisonlevi marked this pull request as draft November 15, 2023 23:11
@morrisonlevi
Copy link
Collaborator Author

I moved this back to draft. I need to figure out:

  • Does the JIT do anything with these specific opcodes?
  • If not, then bypass the handler check by installing these opcodes after they are checked.
  • If so, rework the phpt files or use skipifs or something to accommodate the optional JIT warning about getting disabled if there are user opcode handlers.

@morrisonlevi
Copy link
Collaborator Author

ZEND_GENERATOR_CREATE is definitely referred to in the JIT a few times, but as far as I can tell, it's about determining exit/entry points, and nothing really about replacing the opcode itself. However, I am not the best at reading this JIT code.

ZEND_BIND_STATIC is never referred to in the JIT. I think we can safely install this opcode handler without affecting the JIT.

@morrisonlevi morrisonlevi modified the milestones: 0.94.0, future Nov 17, 2023
@morrisonlevi morrisonlevi changed the title fix(profiling): potential allocation profiling crash with generators fix(profiling): potential allocation profiling crashes with certain opcodes Nov 17, 2023
@morrisonlevi morrisonlevi force-pushed the levi-and-florian/fix-generator-create-crash branch 2 times, most recently from 3c8cceb to 60dccae Compare November 17, 2023 23:53
@realFlowControl realFlowControl force-pushed the levi-and-florian/fix-generator-create-crash branch from ea2e364 to 1b6cbc3 Compare December 22, 2023 08:40
@github-actions github-actions bot removed this from the future milestone Dec 22, 2023
@realFlowControl
Copy link
Collaborator

realFlowControl commented Dec 22, 2023

@morrisonlevi I added ZEND_FUNC_GET_ARGS and finalised the PHP versions, now that the PRs have been merged and released.
There are other opcodes that we think should also crash, but we where unable to reproduce this to date. I would think that after successful review, we can merge this an continue in a new PR to add the other opcodes once we have a reproducer and a fix contributed to upstream

@realFlowControl realFlowControl marked this pull request as ready for review December 22, 2023 13:15
@morrisonlevi
Copy link
Collaborator Author

@realFlowControl Thank you! It looks good to me. I merged in master and added your upstream phpt with a long-form description. Can you do a final code review?

@morrisonlevi morrisonlevi force-pushed the levi-and-florian/fix-generator-create-crash branch from 16a35aa to c004213 Compare December 29, 2023 16:57
Copy link
Collaborator

@bwoebi bwoebi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

I also like set_run_time_php_version_id being moved to C code, looks much more readable there.

@realFlowControl realFlowControl self-assigned this Jan 10, 2024
@realFlowControl realFlowControl merged commit e752ef9 into master Jan 10, 2024
551 of 554 checks passed
@realFlowControl realFlowControl deleted the levi-and-florian/fix-generator-create-crash branch January 10, 2024 11:52
@realFlowControl realFlowControl added this to the 0.97.0 milestone Jan 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐛 bug Something isn't working cat:app-crash profiling Relates to the Continuous Profiler
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants