Skip to content

Add DEEPCLONE_HYDRATE_PRESERVE_REFS flag (v0.4.0)#12

Merged
nicolas-grekas merged 2 commits intomainfrom
preserve-ref-flag
Apr 15, 2026
Merged

Add DEEPCLONE_HYDRATE_PRESERVE_REFS flag (v0.4.0)#12
nicolas-grekas merged 2 commits intomainfrom
preserve-ref-flag

Conversation

@nicolas-grekas
Copy link
Copy Markdown
Member

@nicolas-grekas nicolas-grekas commented Apr 15, 2026

Summary

Two related changes for v0.4.0 on deepclone_hydrate():

  1. Make ref-preservation opt-in via a new DEEPCLONE_HYDRATE_PRESERVE_REFS flag — by default, PHP & references in $vars are dropped on write.
  2. Align scoped-mode property-name validation with unserialize() — drop the pre-v0.4.0 ValueError on integer keys / NUL-in-middle / mangled-shape keys in favor of the engine's native semantics.

Motivation comes from the polyfill side: both the unconditional ref-probe and the per-prop name check dominated cost for typical DTO hydration. With both relaxed, the polyfill now matches or beats raw Reflection for flat-class hydration. Keeping ext/polyfill semantics aligned means shipping the same changes on both sides.

1. PRESERVE_REFS flag

Call Behavior
deepclone_hydrate($obj, $vars) references in $vars are dereferenced — property slots hold plain values
deepclone_hydrate($obj, $vars, DEEPCLONE_HYDRATE_PRESERVE_REFS) ref links are preserved (pre-0.4.0 default)

Composes with MANGLED_VARS, CALL_HOOKS, and NO_LAZY_INIT.

BC break

Callers that intentionally share a value slot between two properties (or between a property and a caller-side variable) must add DEEPCLONE_HYDRATE_PRESERVE_REFS. symfony/var-exporter's Hydrator::hydrate() and Instantiator::instantiate() pass the flag internally to preserve the VarExporter contract.

2. unserialize()-permissive property-name handling in scoped mode

In scoped mode, the pre-v0.4.0 hot-path validation rejected integer keys, NUL-in-middle names, and mangled-shape keys with a ValueError. That was stricter than what unserialize() does on the same shapes in an O:… payload. The ext now:

  • Integer keys — coerce to string on dynamic property access (PHP engine rule). Iteration switched to ZEND_HASH_FOREACH_KEY_VAL with zend_long_to_str for the synthesised name.
  • NUL-in-middle names — stored as raw dynamic properties, same as unserialize().
  • NUL-prefix names — surface the engine's native Error: Cannot access property starting with "\0" from zend_std_write_property / zend_hash_update.

DEEPCLONE_HYDRATE_MANGLED_VARS mode still parses and validates mangled keys (that's the entire point of the flag).

Implementation

  • deepclone.c:
    • ZVAL_DEREF(prop_val) at the top of the inner write loop when PRESERVE_REFS is absent (zero overhead when the zval isn't a ref — one type-tag check per prop).
    • Dropped the memchr NUL check on the stdClass and dynamic-fallback write paths.
    • Switched the inner loop from ZEND_HASH_FOREACH_STR_KEY_VAL to ZEND_HASH_FOREACH_KEY_VAL, synthesising the property name for integer keys.
  • deepclone.stub.php / regenerated deepclone_arginfo.h: expose DEEPCLONE_HYDRATE_PRESERVE_REFS = 1 << 3.
  • tests/deepclone_hydrate.phpt: ref-preservation tests exercise both the default-drop and PRESERVE_REFS paths; property-name tests updated to reflect unserialize()-matching behavior.
  • README.md: flag table + rationale paragraph.
  • CHANGELOG.md: 0.4.0 entry (BC break + Added + Changed).
  • php_deepclone.h: bump to 0.4.0 (minor SemVer since default semantics change).

Test plan

  • 30/30 .phpt tests green locally (PHP 8.4, NTS)
  • CI green on the matrix (Linux NTS/ZTS/i386/debug, macOS, Windows)

See the polyfill-side changes (same flags, same semantics) in the companion PR on symfony/polyfill.

By default, deepclone_hydrate() now drops PHP & references from $vars on
write (ZVAL_DEREF before the property assignment). Pass the new flag to
keep the ref link, matching the pre-v0.4.0 default behavior.

Motivation: in the polyfill, the ref-preserving path required a per-call
probe of the input array (ReflectionReference::fromArrayElement over every
key) that cost ~220ns on a 6-prop hydration — enough to push the polyfill
below raw Reflection in most scenarios. Making ref preservation opt-in
brings the polyfill to parity with Reflection for typical DTO hydration,
and keeps the ext-side semantics aligned with the polyfill.

BC break: callers that intentionally share a value slot between two
properties (or between a property and a caller-side variable) must now
pass DEEPCLONE_HYDRATE_PRESERVE_REFS. Symfony's Hydrator/Instantiator pass
the flag by default to preserve the VarExporter contract.

- deepclone.c: add flag + ZVAL_DEREF at the top of the inner write loop
- deepclone.stub.php / deepclone_arginfo.h: expose the PHP-level constant
- tests/deepclone_hydrate.phpt: update existing ref tests, add new coverage
  for both default-drop and PRESERVE_REFS paths
- README.md: document the new flag and why it's off by default
- CHANGELOG.md: 0.4.0 entry (BC break + Added)
- php_deepclone.h: PHP_DEEPCLONE_VERSION 0.3.1 → 0.4.0
nicolas-grekas added a commit to symfony/polyfill that referenced this pull request Apr 15, 2026
Mirrors symfony/php-ext-deepclone#12 on the polyfill side.

PRESERVE_REFS flag (BC break)
-----------------------------
By default, deepclone_hydrate() now drops PHP & references from $vars on
write; pass the new DEEPCLONE_HYDRATE_PRESERVE_REFS flag to keep ref links.
Callers that intentionally share a value slot between properties need to
add the flag. symfony/var-exporter's Hydrator/Instantiator pass it by
default to preserve the pre-existing contract.

Perf
----
The ref-preservation path used to run unconditionally and required a
per-call probe (ReflectionReference::fromArrayElement over every key)
plus a by-ref double-write of every property. On a 6-prop hydration that
added ~580 ns — enough to push the polyfill below raw Reflection in every
realistic DTO scenario. With the flag off by default, the polyfill now
matches or beats Reflection for flat-class hydration.

- **Ref-probe gated on flag** — when PRESERVE_REFS is absent, skip the
  probe entirely and use the lean single-write path.
- **Folded NUL/string validation into the hydrator closure** — removed the
  dedicated upfront 'foreach ($properties ...) { str_contains NUL }' pass;
  one branch per prop inside the write loop instead.
- **use($notByRef) closure capture** replacing '(array) $this' — avoids the
  per-call stdClass→array cast; $notByRef is now an array from the start.
- **1-scope inline fast path** in deepclone_hydrate — skips the outer
  'foreach ($scoped_vars as ...)' + cross-scope validation for the most
  common shape (vars keyed solely by the object's own class).
- **Three closure variants each split by $hasRefs** — lean no-refs loop
  sits next to the ref-preserving double-write loop, branch-predicted once
  per call.

Results on PHP 8.4 / opcache, 100k iters, 6-prop DTO hydration:

| case                                   | before | after | vs Reflection |
|----------------------------------------|-------:|------:|--------------:|
| 6-prop mixed (pub + priv + prot)       |  1,483 |   804 |         1.08x |
| 8 typed props                          |  1,766 | 1,059 |         0.80x |
| stdClass, 5 props                      |  1,182 |   742 |             - |
| 3-level inheritance                    |  1,932 | 1,285 |         3.92x |
| readonly via ctor promotion            |  1,519 | 1,283 |         2.63x |

Scenarios 3 and 4 still lag Reflection because of the outer multi-scope
iteration and the per-readonly-prop ReflectionProperty invocation inside
the hydrator.
In scoped mode, the pre-v0.4.0 hot-path validation rejected any property
name that wasn't a string, contained a NUL byte, or looked like a mangled
key, with a ValueError. That was stricter than what `unserialize()` does
when it encounters the same shapes in an `O:…` payload:

- Integer keys: `unserialize()` coerces to string on dynamic property
  access (PHP's engine rule). The ext now iterates via ZEND_HASH_FOREACH_KEY_VAL
  and synthesises the string via `zend_long_to_str`.
- NUL-in-middle names: `unserialize()` stores them as-is on the dynamic
  property table. The ext drops the upfront `memchr` check on the stdClass
  write path and the dynamic-fallback write path, letting the engine store
  the raw name.
- NUL-prefix names: the engine native `Error` ("Cannot access property
  starting with \0") already surfaces from `zend_std_write_property` /
  `zend_hash_update`. No need for a pre-check.

DEEPCLONE_HYDRATE_MANGLED_VARS mode is unchanged — it still parses and
validates mangled keys as before (that's the entire point of the flag).

Besides the semantic alignment, this saves a hot-path validation per
written property — ~18 ns per prop in the polyfill, cheap in the ext but
the polyfill side is the real win.
nicolas-grekas added a commit to symfony/polyfill that referenced this pull request Apr 15, 2026
Mirrors symfony/php-ext-deepclone#12 follow-up commit.

In scoped mode, the pre-v0.4.0 hot-path validation rejected non-string
keys, NUL-containing names, and mangled-shape keys with a ValueError.
That was stricter than what unserialize() does on the same shapes in an
O:… payload — unserialize() coerces integer keys, stores NUL-in-middle
names as raw dynamic properties, and lets the engine's native Error
surface for NUL-prefix names.

- Removed the per-prop `!\is_string($name) || str_contains($name, "\0")`
  check from all eight hydrator closures (stdClass + 3 variants × 2
  ref-modes).
- Dropped the now-unused `DeepClone::throwInvalidPropName()` helper.
- Updated and renamed the corresponding tests.

Besides semantic alignment with unserialize(), this saves ~18 ns per
property in the hot path — on a 5-prop stdClass hydration that's ~130 ns
(~18% of the call), matching what the ext-side commit also avoids.
@nicolas-grekas nicolas-grekas merged commit 9072a51 into main Apr 15, 2026
20 checks passed
@nicolas-grekas nicolas-grekas deleted the preserve-ref-flag branch April 15, 2026 16:58
nicolas-grekas added a commit to symfony/polyfill that referenced this pull request Apr 15, 2026
Mirrors symfony/php-ext-deepclone#12 follow-up commit.

In scoped mode, the pre-v0.4.0 hot-path validation rejected non-string
keys, NUL-containing names, and mangled-shape keys with a ValueError.
That was stricter than what unserialize() does on the same shapes in an
O:… payload — unserialize() coerces integer keys, stores NUL-in-middle
names as raw dynamic properties, and lets the engine's native Error
surface for NUL-prefix names.

- Removed the per-prop `!\is_string($name) || str_contains($name, "\0")`
  check from all eight hydrator closures (stdClass + 3 variants × 2
  ref-modes).
- Dropped the now-unused `DeepClone::throwInvalidPropName()` helper.
- Updated and renamed the corresponding tests.

Besides semantic alignment with unserialize(), this saves ~18 ns per
property in the hot path — on a 5-prop stdClass hydration that's ~130 ns
(~18% of the call), matching what the ext-side commit also avoids.
nicolas-grekas added a commit to symfony/polyfill that referenced this pull request Apr 15, 2026
Mirrors symfony/php-ext-deepclone#12 follow-up commit.

In scoped mode, the pre-v0.4.0 hot-path validation rejected non-string
keys, NUL-containing names, and mangled-shape keys with a ValueError.
That was stricter than what unserialize() does on the same shapes in an
O:… payload — unserialize() coerces integer keys, stores NUL-in-middle
names as raw dynamic properties, and lets the engine's native Error
surface for NUL-prefix names.

- Removed the per-prop `!\is_string($name) || str_contains($name, "\0")`
  check from all eight hydrator closures (stdClass + 3 variants × 2
  ref-modes).
- Dropped the now-unused `DeepClone::throwInvalidPropName()` helper.
- Updated and renamed the corresponding tests.

Besides semantic alignment with unserialize(), this saves ~18 ns per
property in the hot path — on a 5-prop stdClass hydration that's ~130 ns
(~18% of the call), matching what the ext-side commit also avoids.
nicolas-grekas added a commit to symfony/polyfill that referenced this pull request Apr 15, 2026
…ot-path perf (nicolas-grekas)

This PR was squashed before being merged into the 1.x branch.

Discussion
----------

[DeepClone] Add DEEPCLONE_HYDRATE_PRESERVE_REFS flag + hot-path perf

Mirrors [symfony/php-ext-deepclone#12](symfony/php-ext-deepclone#12) on the polyfill side.

## 1. PRESERVE_REFS flag (BC break)

By default, `deepclone_hydrate()` now drops PHP `&` references from `$vars` on write; pass the new `DEEPCLONE_HYDRATE_PRESERVE_REFS` flag to keep ref links. Callers that intentionally share a value slot between properties need to add the flag.

`symfony/var-exporter`'s `Hydrator` / `Instantiator` pass the flag by default to preserve the pre-existing contract.

## 2. unserialize()-permissive property-name handling in scoped mode

In scoped mode, the pre-v0.4.0 hot-path validation rejected integer keys, NUL-in-middle names, and mangled-shape keys with a `ValueError`. That was stricter than what `unserialize()` does on the same shapes in an `O:…` payload. The polyfill now matches `unserialize()`:

- **Integer keys** — coerce to string on dynamic property access (PHP engine rule).
- **NUL-in-middle names** — stored as raw dynamic properties.
- **NUL-prefix names** — surface the engine's native `Error: Cannot access property starting with "\0"`.

`DEEPCLONE_HYDRATE_MANGLED_VARS` mode still parses and validates mangled keys (that's the entire point of the flag).

## Perf

The ref-preservation path used to run unconditionally and required a per-call probe (`ReflectionReference::fromArrayElement` over every key) plus a by-ref double-write of every property. The per-prop `str_contains(\$name, \"\\0\")` / `is_string(\$name)` checks were another ~18 ns per property on top. With both relaxed, the polyfill now **matches or beats raw Reflection** for flat-class hydration.

### Individual wins

- **Ref-probe gated on the flag** — default path skips it entirely, lean single-write inner loop.
- **Property-name validation matches `unserialize()`** — dropped the per-prop `is_string + str_contains NUL` check in favor of the engine's native semantics.
- **`use ($notByRef)` closure capture** replacing `(array) $this` — avoids the per-call stdClass→array cast.
- **1-scope inline fast path** in `deepclone_hydrate` — skips the outer `foreach ($scoped_vars as ...)` + cross-scope validation for the most common shape.
- **Three closure variants each split by `$hasRefs`** — lean no-refs loop sits next to the ref-preserving double-write loop, branch-predicted once per call.

### Results

PHP 8.4 / opcache, 100k iters, realistic DTO hydration:

| case                                   | before | after | vs Reflection |
|----------------------------------------|-------:|------:|--------------:|
| 6-prop mixed (pub + priv + prot)       |  1,483 |   812 |         1.36× |
| 8 typed props                          |  1,766 | 1,077 |         1.32× |
| stdClass, 5 props                      |  1,182 |   609 |             — |
| 3-level inheritance                    |  1,932 | 1,128 |         3.80× |
| readonly via ctor promotion            |  1,519 | 1,144 |         2.35× |

Scenarios 3 (3-level inheritance) and 4 (readonly) still lag Reflection — residual outer multi-scope iteration and per-readonly-prop `ReflectionProperty` invocation inside the hydrator. Flat-class hydration (scenarios 1, 2, 5) is at or below Reflection.

## Test plan

- [x] 374/374 DeepClone tests green locally
- [ ] CI green across the PHP matrix

Commits
-------

7e0c66d [DeepClone] Match unserialize() permissiveness on scoped-mode prop names
a1c9ed9 [DeepClone] Add DEEPCLONE_HYDRATE_PRESERVE_REFS flag + hot-path perf
symfony-splitter pushed a commit to symfony/polyfill-deepclone that referenced this pull request Apr 15, 2026
Mirrors symfony/php-ext-deepclone#12 on the polyfill side.

PRESERVE_REFS flag (BC break)
-----------------------------
By default, deepclone_hydrate() now drops PHP & references from $vars on
write; pass the new DEEPCLONE_HYDRATE_PRESERVE_REFS flag to keep ref links.
Callers that intentionally share a value slot between properties need to
add the flag. symfony/var-exporter's Hydrator/Instantiator pass it by
default to preserve the pre-existing contract.

Perf
----
The ref-preservation path used to run unconditionally and required a
per-call probe (ReflectionReference::fromArrayElement over every key)
plus a by-ref double-write of every property. On a 6-prop hydration that
added ~580 ns — enough to push the polyfill below raw Reflection in every
realistic DTO scenario. With the flag off by default, the polyfill now
matches or beats Reflection for flat-class hydration.

- **Ref-probe gated on flag** — when PRESERVE_REFS is absent, skip the
  probe entirely and use the lean single-write path.
- **Folded NUL/string validation into the hydrator closure** — removed the
  dedicated upfront 'foreach ($properties ...) { str_contains NUL }' pass;
  one branch per prop inside the write loop instead.
- **use($notByRef) closure capture** replacing '(array) $this' — avoids the
  per-call stdClass→array cast; $notByRef is now an array from the start.
- **1-scope inline fast path** in deepclone_hydrate — skips the outer
  'foreach ($scoped_vars as ...)' + cross-scope validation for the most
  common shape (vars keyed solely by the object's own class).
- **Three closure variants each split by $hasRefs** — lean no-refs loop
  sits next to the ref-preserving double-write loop, branch-predicted once
  per call.

Results on PHP 8.4 / opcache, 100k iters, 6-prop DTO hydration:

| case                                   | before | after | vs Reflection |
|----------------------------------------|-------:|------:|--------------:|
| 6-prop mixed (pub + priv + prot)       |  1,483 |   804 |         1.08x |
| 8 typed props                          |  1,766 | 1,059 |         0.80x |
| stdClass, 5 props                      |  1,182 |   742 |             - |
| 3-level inheritance                    |  1,932 | 1,285 |         3.92x |
| readonly via ctor promotion            |  1,519 | 1,283 |         2.63x |

Scenarios 3 and 4 still lag Reflection because of the outer multi-scope
iteration and the per-readonly-prop ReflectionProperty invocation inside
the hydrator.
symfony-splitter pushed a commit to symfony/polyfill-deepclone that referenced this pull request Apr 15, 2026
Mirrors symfony/php-ext-deepclone#12 follow-up commit.

In scoped mode, the pre-v0.4.0 hot-path validation rejected non-string
keys, NUL-containing names, and mangled-shape keys with a ValueError.
That was stricter than what unserialize() does on the same shapes in an
O:… payload — unserialize() coerces integer keys, stores NUL-in-middle
names as raw dynamic properties, and lets the engine's native Error
surface for NUL-prefix names.

- Removed the per-prop `!\is_string($name) || str_contains($name, "\0")`
  check from all eight hydrator closures (stdClass + 3 variants × 2
  ref-modes).
- Dropped the now-unused `DeepClone::throwInvalidPropName()` helper.
- Updated and renamed the corresponding tests.

Besides semantic alignment with unserialize(), this saves ~18 ns per
property in the hot path — on a 5-prop stdClass hydration that's ~130 ns
(~18% of the call), matching what the ext-side commit also avoids.
nicolas-grekas added a commit to symfony/symfony that referenced this pull request Apr 15, 2026
This PR was merged into the 8.1 branch.

Discussion
----------

[VarExporter] Bump to ext-deepclone v0.4.0

| Q             | A
| ------------- | ---
| Branch?       | 8.1
| Bug fix?      | no
| New feature?  | no
| Deprecations? | no
| Issues        | -
| License       | MIT

I'm on fire 😆

See symfony/php-ext-deepclone#12 / symfony/polyfill#574

Commits
-------

e1800d3 [VarExporter] Bump to ext-deepclone v0.4.0
nicolas-grekas added a commit that referenced this pull request Apr 15, 2026
…ynamic writes

Review feedback from @arnaudlb (PR #12):

1. Simplify dc_is_backed_declared_property():
   !(pi->flags & ZEND_ACC_VIRTUAL) already implies pi->offset ==
   ZEND_VIRTUAL_PROPERTY_OFFSET, and IS_HOOKED_PROPERTY_OFFSET() is only
   meaningful on offsets returned by zend_get_property_offset(), not on
   the raw pi->offset. Collapse to a single bitmask test.

2. Narrow the backed-enum scalar cast:
   Add `(ZEND_TYPE_PURE_MASK(pi->type) & ~MAY_BE_NULL) == 0` so the cast
   only fires for types of the form `Enum` or `?Enum`. Unions like
   `Enum|string|int` already accept the scalar literally — casting it to
   the enum case would be surprising.

3. Lazy-object safety in dc_write_backed_property():
   Check !zend_lazy_object_initialized(obj) at the top and, when the flag
   is absent, route through zend_update_property_ex() which triggers the
   lazy realization hook. Direct OBJ_PROP slot writes on a lazy ghost
   left it in a half-initialized state. DEEPCLONE_HYDRATE_NO_LAZY_INIT
   keeps its existing opt-out fast path.

4. Dynamic-property fallback via zend_update_property_ex():
   Swap zend_std_write_property() for zend_update_property_ex() so any
   overridden write_property handler on internal classes or extensions
   is respected. Picks up the appropriate scope via scope_ce ?: obj_ce.

A2 (null → unset on non-nullable typed) and A3 (scalar → backed-enum
cast) stay property-type-only, as discussed: the coercions are driven
by the target type at hydration time, which is stable within a single
execution.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant