Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make some parts of _zend_mm_heap read-only at runtime #14570

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jvoisin
Copy link
Contributor

@jvoisin jvoisin commented Jun 14, 2024

PHP's heap implementation is the one that virtually everybody uses: it's fast, it's there by default, it works, …
The only major ever I've found of custom heap implementation is phpdbg but it looks dispensable at best. Some other debuggers and profilers might use it, and that's alright, but I don't think that this feature should be enabled by default.

Disabling ZEND_MM_CUSTOM will allow to save a couple of bytes (yay), but the main goal is to close a low-hanging exploitation vector: as presented at OffensiveCon 2024, having trivially callable writeable function pointers at the top of the heap makes it straightforward to turn a limited write into an arbitrary code execution.

As presented at OffensiveCon 2024, having trivially
callable writeable function pointers at the top of the heap makes it straightforward to turn a limited write into an arbitrary code execution.

Disabling ZEND_MM_HEAP by default isn't doable, as it's used by a couple of profilers, so we're making some parts of _zend_mm_heap read-only at runtime instead: this will prevent the custom heap functions pointers from being hijacked.

cc @arnaud-lb

@jvoisin jvoisin requested a review from dstogov as a code owner June 14, 2024 16:28
@jvoisin jvoisin mentioned this pull request Jun 14, 2024
6 tasks
@iluuu1994
Copy link
Member

I've talked to @bwoebi about this before and some profilers like DataDog depend on this.

@jvoisin jvoisin marked this pull request as draft June 14, 2024 17:01
@arnaud-lb
Copy link
Member

Unfortunately this will break some useful functionality:

  • https://github.com/arnaud-lb/php-memory-profiler uses the custom callbacks to hook into memory allocations, and I would expect most memory profilers to do that. Having ZEND_MM_CUSTOM enabled by default allows profilers to work on standard php builds. Proposing ZendMM Observer API #11758 proposed an alternative API for profilers, but it also relies on function pointers in the zend_mm_heap struct. One other way to hook into the allocator would be to override the _emalloc/_efree functions and their variants, but this seems much more involved, less easy to maintain, and I don't know if this is practicable. Maybe @bwoebi @realFlowControl have some insights on this?
  • This will also break USE_ZEND_ALLOC=0, which is sometimes useful in non-debug builds for debugging/analysis

Instead of disabling the functionality we could mangle the function pointers with a secret. I believe that the glibc does this to protect atexit() callbacks, but I don't know how effective this is. This would be defeated by learning the secret, but we could argue that if they have a primitive to leak the secret, they could find the address of an other function pointer as well.

Alternatively, we could store the custom callbacks in a separate readonly memory region. In zend_mm_init(), allocate two pages for the zend_mm_heap + the function pointers. mprotect() the second page. Unprotect when necessary, e.g. in zend_mm_set_custom_handlers(). As an additional benefit this page would double as a guard page.

We might want to protect zend_mm_heap.storage as well, as it also points to a struct of function pointers.

@jvoisin
Copy link
Contributor Author

jvoisin commented Jun 14, 2024

I thought about having some parts of zend_mm_heap made read-only, like done in other allocators like hardened_malloc, but it sounded like complexity and maybe performance impact.

But I agree, having function pointers as well as other read-only fields like shadow_key, limit, pid, … would be interesting.

@bwoebi
Copy link
Member

bwoebi commented Jun 14, 2024

I don't think disabling ZEND_MM_CUSTOM is a good thing for all the reasons Arnaud outlined. Overriding the _emalloc/_efree needs platform dependent work and is not easy.

The proposal to make the parts of the head readonly sounds more reasonable. After all, these need to only be initializaed once per GINIT, so that sounds pretty okay I think.

@arnaud-lb
Copy link
Member

I thought about having some parts of zend_mm_heap made read-only, like done in other allocators like hardened_malloc, but it sounded like complexity and maybe performance impact.

But I agree, having function pointers as well as other read-only fields like shadow_key, limit, pid, … would be interesting.

I think if we only store custom_heap and storage in the readonly page, there will be no overhead as we never update these fields in the common case, and this will keep the change simple: We need to protect the page in zend_mm_init(), which is called once per process/thread, and then we never need to unprotect it unless zend_mm_set_custom_handlers() is used.

We can leave the use_custom_heap boolean in zend_mm_heap as changing it doesn't help exploitation, and I think moving it farther from other fields may have performance implications.

@realFlowControl
Copy link
Contributor

[...]
We can leave the use_custom_heap boolean in zend_mm_heap as changing it doesn't help exploitation, and I think moving it farther from other fields may have performance implications.

The Datadog profiler relies on the use_custom_heap boolean to be the first element in the zend_mm_heap, in order to bypass side effects when installing a custom handler to observe, but forwarding the calls back to the same heap. I started some work to fix these side effects and/or give APIs that allow extension developer to work around this. As of now, we rely on this field being where it is and being writeable.

@jvoisin
Copy link
Contributor Author

jvoisin commented Jun 16, 2024

I changed the code to go the read-only way :)

@jvoisin jvoisin force-pushed the disable_custom_heap_default branch from d91c592 to 120967f Compare June 16, 2024 21:01
@jvoisin jvoisin changed the title Disable ZEND_MM_CUSTOM by default. Make some parts of _zend_mm_heap read-only at runtime Jun 16, 2024
Zend/zend_alloc.c Outdated Show resolved Hide resolved
Zend/zend_alloc.c Outdated Show resolved Hide resolved
@iluuu1994
Copy link
Member

We have various other function pointers not living in .rodata. Examples are zend_execute_ex and object->handlers (with another level of indirection). zend_execute_ex could be addressed in a similar manner, but I don't see how object->handlers can. Does it make sense to fix some of those but leave others? Are some particularly problematic, compared to others?

@arnaud-lb
Copy link
Member

@iluuu1994 These pointers in zend_mm_heap are a particularly interesting for attackers because they can learn their address trivially. We can protect them very effectively and without overhead, so it seems beneficial to do so.

An other dangerous one, similar to object handlers, is the array dtor function, as attackers have plenty of ways to place arrays on the heap. It would be great if we could protect those too. I'm not sure how to, either. Mangling/xor would increase difficulty a bit, but I expect a performance regression especially for objects.

@jvoisin jvoisin force-pushed the disable_custom_heap_default branch from 96cbe36 to d25588c Compare June 17, 2024 21:45
@dstogov
Copy link
Member

dstogov commented Jun 18, 2024

I suppose, this can't be compatible with USE_ZEND_ALLOC_HUGE_PAGES.

@jvoisin
Copy link
Contributor Author

jvoisin commented Jun 18, 2024

I suppose, this can't be compatible with USE_ZEND_ALLOC_HUGE_PAGES.

Why not? It'll simply use a bit more memory.

@jvoisin jvoisin force-pushed the disable_custom_heap_default branch from d25588c to 25a3663 Compare June 18, 2024 12:45
Zend/zend_alloc.c Outdated Show resolved Hide resolved
Zend/zend_alloc.c Show resolved Hide resolved
Zend/zend_alloc.c Outdated Show resolved Hide resolved
Zend/zend_alloc.c Outdated Show resolved Hide resolved
@jvoisin jvoisin force-pushed the disable_custom_heap_default branch 5 times, most recently from c1dd1ee to 44bc6fd Compare June 21, 2024 15:13
Zend/zend_alloc.c Outdated Show resolved Hide resolved
@jvoisin jvoisin force-pushed the disable_custom_heap_default branch 3 times, most recently from 3aca506 to 9044d3e Compare July 2, 2024 08:28
@jvoisin jvoisin force-pushed the disable_custom_heap_default branch from 9044d3e to 7bd296d Compare July 3, 2024 11:09
@jvoisin jvoisin marked this pull request as ready for review July 3, 2024 11:09
Copy link
Member

@arnaud-lb arnaud-lb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM apart from a few minor comments

Zend/zend_alloc.c Outdated Show resolved Hide resolved
Zend/zend_alloc.c Outdated Show resolved Hide resolved
Zend/zend_alloc.c Show resolved Hide resolved
Zend/zend_alloc.c Outdated Show resolved Hide resolved
As [presented at
OffensiveCon 2024](https://youtu.be/dqKFHjcK9hM?t=1622), having trivially
callable writeable function pointers at the top of the heap makes it
straightforward to turn a limited write into an arbitrary code execution.

Disabling ZEND_MM_HEAP by default isn't doable, as it's used by a couple of
profilers, so we're making some parts of `_zend_mm_heap` read-only at runtime
instead: this will prevent the custom heap functions pointers from being
hijacked, as well as the custom storage ones.

We don't put the shadow_key there, since it has a performance impact, and an
attacker able to precisely overwrite it is likely already able to read it
anyway.
@jvoisin jvoisin force-pushed the disable_custom_heap_default branch from 7bd296d to 7d35e89 Compare July 3, 2024 12:22
Copy link
Member

@arnaud-lb arnaud-lb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@dstogov do you want to take an other look?

@dstogov
Copy link
Member

dstogov commented Jul 3, 2024

zend_alloc.c was especially optimized for performance. It's especially stored heap as a part of the first chunk, to improve data locality and CPU cache usage, for an ability to fit into a into a single huge page and to reduce TLB pressure.

This patch changes some key design ideas. At least it should be benchmarked with callgrind (with cache simulation) and on real hardware.

From a quick review, I don't see anything terrible. (I'm going to be on vacation next two weeks).

@jvoisin
Copy link
Contributor Author

jvoisin commented Jul 3, 2024

This patch changes some key design ideas. At least it should be benchmarked with callgrind (with cache simulation) and on real hardware.

Any recommendation on how to properly benchmark php?

@arnaud-lb
Copy link
Member

For valgrind, check how https://github.com/php/php-src/blob/master/benchmark/benchmark.php does it. The symfony demo and wordpress benchmarks are probably the most relevant in this case.

@jvoisin
Copy link
Contributor Author

jvoisin commented Jul 8, 2024

jvoisin@chernabog 14:58 (gmp_bench *) ~/dev/php-src time php ./benchmark/benchmark.php false sapi/cgi/php-cgi
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.bench' '--' 'sapi/cgi/php-cgi' '-T1' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=disable' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/Zend/bench.php'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.bench.jit' '--' 'sapi/cgi/php-cgi' '-T1' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=tracing' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/Zend/bench.php'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:clear'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:warmup'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.symfony-demo' '--' 'sapi/cgi/php-cgi' '-T50,50' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=disable' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/public/index.php'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:clear'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:warmup'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.symfony-demo.jit' '--' 'sapi/cgi/php-cgi' '-T50,50' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=tracing' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/public/index.php'
{
    "Zend\/bench.php": {
        "instructions": "3602853467"
    },
    "Zend\/bench.php JIT": {
        "instructions": "3602853424"
    },
    "Symfony Demo 2.2.3": {
        "instructions": "1049217139"
    },
    "Symfony Demo 2.2.3 JIT": {
        "instructions": "1049216240"
    }
}

real	9m35.845s
user	9m28.012s
sys	0m4.266s
jvoisin@chernabog 15:07 (gmp_bench *) ~/dev/php-src 
jvoisin@chernabog 15:09 (master *) ~/dev/php-src time php ./benchmark/benchmark.php false sapi/cgi/php-cgi
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.bench' '--' 'sapi/cgi/php-cgi' '-T1' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=disable' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/Zend/bench.php'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.bench.jit' '--' 'sapi/cgi/php-cgi' '-T1' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=tracing' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/Zend/bench.php'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:clear'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:warmup'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.symfony-demo' '--' 'sapi/cgi/php-cgi' '-T50,50' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=disable' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/public/index.php'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:clear'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:warmup'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.symfony-demo.jit' '--' 'sapi/cgi/php-cgi' '-T50,50' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=tracing' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/public/index.php'
{
    "Zend\/bench.php": {
        "instructions": "3603167523"
    },
    "Zend\/bench.php JIT": {
        "instructions": "3603167510"
    },
    "Symfony Demo 2.2.3": {
        "instructions": "1051339479"
    },
    "Symfony Demo 2.2.3 JIT": {
        "instructions": "1051339825"
    }
}

real	9m38.145s
user	9m30.166s
sys	0m4.276s
jvoisin@chernabog 15:20 (master *) ~/dev/php-src 

quick'n'dirty testing doesn't show anything incredible performance-wise

@iluuu1994
Copy link
Member

iluuu1994 commented Jul 8, 2024

As the person who has created the Valgrind benchmark: It's shown not to be super reliable in the past. Some changes were more or much less profitable/expensive than Valgrind suggested. I would suggest running a benchmark with php-cgi -T{warmup},{repeat} -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php instead. You might also want to disable turbo boost or similar mechanisms. I also usually use taskset and exclude a CPU core from the scheduler.

@jvoisin
Copy link
Contributor Author

jvoisin commented Jul 8, 2024

As the person who has created the Valgrind benchmark: It's shown not to be super reliable in the past. Some changes were more or much less profitable/expensive than Valgrind suggested. I would suggest running a benchmark with php-cgi -T{warmup},{repeat} -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php instead. You might also want to disable turbo boost or similar mechanisms. I also usually use taskset and exclude a CPU core from the scheduler.

Can this be documented somewhere? Ideally in a README.md file in the benchmark/ folder :/

@jvoisin
Copy link
Contributor Author

jvoisin commented Jul 8, 2024

More (ghetto) benchmarks, impact in the noise levels:

jvoisin@chernabog 15:56 (master *) ~/dev/php-src time sapi/cgi/php-cgi -T100,3000 -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php 2>/dev/null >/dev/null

real	2m52.627s
user	2m37.837s
sys	0m13.672s
jvoisin@chernabog 15:59 (master *) ~/dev/php-src 
jvoisin@chernabog 16:02 (disable_custom_heap_default *) ~/dev/php-src time sapi/cgi/php-cgi -T100,3000 -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php 2>/dev/null >/dev/null

real	2m53.594s
user	2m38.454s
sys	0m13.856s
jvoisin@chernabog 16:05 (disable_custom_heap_default *) ~/dev/php-src 

@iluuu1994
Copy link
Member

iluuu1994 commented Jul 8, 2024

@jvoisin Thanks! For the warmup to be effective, you should look at the time printed by php-cgi itself (it goes to stderr). But it doesn't look like there will be a big change either way, which is good!

@jvoisin
Copy link
Contributor Author

jvoisin commented Jul 8, 2024

With a bigger warmup time:

jvoisin@chernabog 16:05 (disable_custom_heap_default *) ~/dev/php-src time sapi/cgi/php-cgi -T1000,3000 -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php 2>/dev/null >/dev/null

real	3m44.438s
user	3m24.555s
sys	0m18.260s
jvoisin@chernabog 16:15 (disable_custom_heap_default *) ~/dev/php-src 
jvoisin@chernabog 16:21 (master) ~/dev/php-src time sapi/cgi/php-cgi -T1000,3000 -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php 2>/dev/null >/dev/null

real	3m43.036s
user	3m23.905s
sys	0m17.655s
jvoisin@chernabog 16:24 (master) ~/dev/php-src 

@jvoisin jvoisin requested a review from bwoebi July 8, 2024 14:27
@iluuu1994
Copy link
Member

iluuu1994 commented Jul 8, 2024

@jvoisin I think you misunderstood. You should not use time, but look at the stderr of php-cgi. Anyway, it's unlikely to make a difference.

@dstogov
Copy link
Member

dstogov commented Jul 23, 2024

the last benchmark shows 0.6% real time slowdown, including initialization and warm-up. Without them the slowdown should be probably even more significant.

@bwoebi
Copy link
Member

bwoebi commented Jul 24, 2024

@dstogov These slowdowns come to a point where I wonder whether we should make these #if ZEND_ALLOC_HARDENED and such.

And possibly not enable by default? :-/

@dstogov
Copy link
Member

dstogov commented Jul 24, 2024

@dstogov These slowdowns come to a point where I wonder whether we should make these #if ZEND_ALLOC_HARDENED and such.

And possibly not enable by default? :-/

I would agree. We already introduced ZEND_MM_HEAP_PROTECTION, but it's currently enabled by default.

@iluuu1994
Copy link
Member

IMO, opt-in security is useless.

@bwoebi
Copy link
Member

bwoebi commented Jul 24, 2024

@iluuu1994 I do agree, but all we're doing here is some hardening after an exploit was already found, nothing insurmountable for attackers. It's not actually fixing gaping holes in PHPs security or fully preventing exploit classes.
And the price to pay for that is high in terms of performance. So it probably should be actually disabled by default and users/distributions which provide hardened build. Like Gentoo would be probably a good target for these, who also compile with mitigations like stack-clash-protection by default.

@jvoisin
Copy link
Contributor Author

jvoisin commented Jul 29, 2024

@dstogov can you please share how you did your benchmarks so I can them them here as well and try to reduce the performance impact?

@dstogov
Copy link
Member

dstogov commented Jul 29, 2024

@dstogov can you please share how you did your benchmarks so I can them them here as well and try to reduce the performance impact?

I didn't bench this. I used your numbers. 3m44.438s / 3m43.036s = 224.438 / 223.036 = 1.00628 ~= 0.6% slowdown

It's better to use execution time provided by php-cgi. The same command, just see "Execution Time: ..." at stderr tail. It excludes startup/shutdown and warm up time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants