perf: overhaul foreach iterator — valid flag, heap key buffer, string reuse#54
perf: overhaul foreach iterator — valid flag, heap key buffer, string reuse#54
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly optimizes the performance of Judy array iterators, particularly for string keys. By introducing a cached validity flag, reusing a heap-allocated buffer for key storage, and implementing Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces several performance optimizations for foreach iteration. A valid flag is added to the iterator to avoid redundant lookups, a stack-allocated buffer is replaced with a reused heap allocation to prevent large stack frames, and zend_string instances are reused to reduce allocation overhead. These changes are logical, well-implemented, and accompanied by a new stress test. The code appears correct and robust.
… reuse Addresses the #1 performance bottleneck: foreach is 25-50x slower than PHP arrays at 500K elements. Five compounding root causes are fixed: 1. Replace full Judy lookup in valid() with a cached boolean flag. The 65-line function (8 type-dispatched J1T/JLG/JSLG/JHSG lookups) is replaced by a single flag check. The flag is set in rewind() and move_forward(), cleared in judy_iterator_data_dtor(). 2. Replace per-step 64KB stack buffers with a single heap allocation. move_forward() and rewind() each allocated uint8_t key[65536] on the stack for string-keyed types — 128KB of stack touched per iteration step. Now uses a single key_scratch buffer allocated once in judy_get_iterator() and freed in judy_iterator_dtor(). 3. Reuse zend_string allocation across iteration steps. Instead of zval_ptr_dtor + ZVAL_STRING (free + strlen + alloc + copy) per step, reuse the existing zend_string buffer when the new key fits (refcount==1, not interned, new_len <= old_len). Falls back to fresh allocation when reuse isn't possible. 4. Use ZVAL_STRINGL instead of ZVAL_STRING to avoid redundant strlen calls where the length is already known. 5. Use pre-computed strlen for JHSG calls in hash-type iteration, avoiding a second strlen on the same buffer. Adds a 200-key varying-length string key stress test to verify iteration correctness across rewind, full traversal, and empty arrays.
61cb6fe to
b0b33f9
Compare
|
The rebase onto updated |
Test Results
Benchmark Summary (Judy vs PHP Array)Ratio = Judy / Array. Bold = Judy wins (≤0.95x). Plain = Array is faster/smaller. Time (Write / Read) — Linux
Memory — Linux
Time (Write / Read) — Windows
Memory — Windows
Raw benchmark dataWrite Time — Linux
Read Time — Linux
Memory — Linux
Write Time — Windows
Read Time — Windows
Memory — Windows
Batch & Set Operations BenchmarksBenchmarks for Batch Operations — PHP 8.5 (Linux)Set Operations — PHP 8.5 (Linux)All-Types Comparison BenchmarkSide-by-side comparison of all six Judy types and native PHP array (50K elements, 3 iterations, Linux only). All-Types Benchmark — PHP 8.5 (Linux) |
Summary
validboolean field tojudy_iterator— set byrewind/move_forward, read byvalid()— eliminating a redundant Judy lookup per iteration stepuint8_t key[PHP_JUDY_MAX_LENGTH]string buffer with a single heap allocation (key_scratch) reused across iterations, avoiding repeated stack-frame zeroingzend_stringfor the current key acrossmove_forwardcalls instead of destroying and re-allocating on every stepTest plan
foreach_*.phpttests)tests/foreach_string_key_stress_001.phpt— string key iterator under moderate load