fix: heap corruption when returning Binary of 0/1 packed bytes#730
Merged
ptondereau merged 1 commit intoextphprs:masterfrom Apr 22, 2026
Merged
Conversation
Returning a Binary from a #[php_function] crashed PHP with "zend_mm_heap corrupted" when the packed byte length was 0 or 1. This was a regression introduced in 0.15.8 by PR extphprs#701, which short-circuits ext_php_rs_zend_string_init to return PHP's interned permanent statics (zend_empty_string, zend_one_char_string[c]) for length 0 and 1. Zval::set_zend_string was updated in that PR to detect these statics via GC_IMMUTABLE and flag the zval as InternedStringEx, but Zval::set_binary was missed and always flagged the zval as refcounted. On drop PHP tried to free a process-global static, leading to heap corruption. Zval::set_binary now wraps the raw pointer from Pack::pack_into in ZBox<ZendStr> and delegates to set_zend_string, so every path that assigns a zend_string to a zval shares one flag-selection site, matching PHP's own ZVAL_STR macro in Zend/zend_types.h. Also tightened the NULL guard in wrapper.c so the fast path is safe when only one of the two weak symbols resolves at link time, and fixed an adjacent correctness issue where Zval::binary::<T>() and Zval::binary_slice::<T>() silently truncated trailing bytes when the byte length was not a multiple of size_of::<T>(). Both functions now return None in that case rather than dropping data. Fixes extphprs#729
Coverage Report for CI Build 24792404673Coverage increased (+0.3%) to 66.241%Details
Uncovered Changes
Coverage Regressions1 previously-covered line in 1 file lost coverage.
Coverage Stats
💛 - Coveralls |
|
| Branch | fix/729-binary-interned-heap-corruption |
| Testbed | PHP 8.4.20 (cli) (built: Apr 11 2026 00:52:25) (NTS) |
⚠️ WARNING: Truncated view!The full continuous benchmarking report exceeds the maximum length allowed on this platform.
🐰 View full continuous benchmarking report in Bencher
⚠️ WARNING: No Threshold found!Without a Threshold, no Alerts will ever be generated.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Fixes #729.
What was broken
Returning a
Binaryfrom a#[php_function]crashed PHP withzend_mm_heap corruptedwhenever the packed byte length was 0 or 1. A minimal repro from the issue:The regression shipped in 0.15.8.
Why it crashed (short version for readers new to PHP internals)
PHP keeps a small table of pre-allocated, immutable strings that live for the entire process lifetime: one empty string (
zend_empty_string) and one single-byte string per byte value (zend_one_char_string[0..=255]). They are meant to be shared and must never be freed by extension code.When a
zvalholds a string, PHP records whether that string is:IS_STRING_EX, meaning "refcounted, the engine owns the memory and must release it on drop", orIS_INTERNED_STRING_EX, meaning "immutable global, do not touch on drop".The previous optimization PR (#701) taught our
ext_php_rs_zend_string_initC shim to return these interned globals directly for 0 and 1 byte inputs, and taughtZval::set_zend_stringto detect them via theGC_IMMUTABLEflag and pickIS_INTERNED_STRING_EX. That part is fine.The bug is that
Zval::set_binary, which is what theBinary<T>return path uses, was not updated. It always marked thezvalas refcounted. When PHP later tried to release thezvalit tried toefreea process-global static, which is heap corruption.The fix
Zval::set_binarynow wraps the raw pointer returned byPack::pack_intointo the safeZBox<ZendStr>wrapper and delegates toZval::set_zend_string. That function already readsGC_IMMUTABLEand picks the correct flag, mirroring PHP's ownZVAL_STRmacro inZend/zend_types.h.After this change there is a single place in the crate that decides the refcounted vs interned flag for a string
zval, which removes the class of bug "someone added a new string path and forgot to handle interned statics".Extra cleanups bundled in
Hardened NULL guard in the C shim.
src/wrapper.cuses#pragma weakto referencezend_empty_stringandzend_one_char_stringso the fast path falls back tozend_string_initif the symbols do not resolve at link time. The previous check only verifiedzend_one_char_string. If only one of the two weak symbols failed to resolve, alen == 0call would dereference a NULLzend_empty_string. The check now covers both symbols.Silent truncation fixed in
Zval::binaryandZval::binary_slice. These methods previously divided the byte length bysize_of::<T>()and silently dropped trailing bytes. For example, callingbinary::<u32>()on a 5 byte string returnedSome(vec![one_u32])and threw away the last byte. Both functions now returnNonewhen the byte length is not a clean multiple of the element size.Binary::from_zvalalready propagatesNoneas a conversion failure, so callers that rely on PHP'sFromZvalboundary will see the same "could not convert" error surface they already handle for other failure modes.Notes on behavior change
The change to
Zval::binary/Zval::binary_sliceis a minor behavioral change: callers that previously received truncated data now receiveNone. Any downstream code that was relying on the truncation was almost certainly masking a correctness issue. No other public API changes.Checklist