
feat: batch operations (toArray/fromArray/putAll/getAll) and atomic increment #44

Merged
orieg merged 4 commits into main from feat/batch-and-increment on Feb 28, 2026

@orieg (Owner) commented Feb 28, 2026

Summary

Implements issue #43 features 1 and 2 (batch operations and atomic increment). Features 3 (JudyHS) and 4 (INT_TO_PACKED) are deferred due to high complexity (105+ type-switch sites each).

New Methods

Batch operations:

  • toArray(): array — convert Judy array to native PHP array (reuses judy_build_data_array)
  • static fromArray(int $type, array $data): Judy — static factory from PHP array
  • putAll(array $data): void — bulk-insert into existing Judy array
  • getAll(array $keys): array — retrieve multiple values at once (missing keys return null)

Atomic increment:

  • increment(mixed $key, int $amount = 1): int — efficient counter update for INT_TO_INT (single-traversal via JLI) and STRING_TO_INT (two traversals: JSLG for counter tracking + JSLI). Throws exception on unsupported types.
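The single-traversal idea for INT_TO_INT can be sketched in plain C. This is a minimal illustration, not the code in php_judy.c: `toy_map`, `toy_slot`, and `toy_increment` are hypothetical names, and the small open-addressing table only stands in for a Judy array. The point it shows is the insert-or-find contract that JLI provides (one lookup returns a pointer to the value slot, zero-initialized if the key is new), after which the increment is an in-place add through that pointer.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAPACITY 64

/* Hypothetical stand-in for a Judy array: a tiny linear-probing table.
   The real extension uses libJudy; this only mimics the JLI contract. */
typedef struct {
    unsigned long keys[TOY_CAPACITY];
    long values[TOY_CAPACITY];
    int used[TOY_CAPACITY];
    size_t count;
} toy_map;

/* Insert-or-find: return a pointer to the value slot for key, creating a
   zeroed slot when the key is absent. Returns NULL when the table is full. */
static long *toy_slot(toy_map *m, unsigned long key)
{
    assert(m != NULL);
    size_t start = (size_t)(key % TOY_CAPACITY);
    for (size_t probe = 0; probe < TOY_CAPACITY; probe++) {
        size_t i = (start + probe) % TOY_CAPACITY;
        if (!m->used[i]) {
            m->used[i] = 1;
            m->keys[i] = key;
            m->values[i] = 0;
            m->count++;
            return &m->values[i];
        }
        if (m->keys[i] == key) {
            return &m->values[i];
        }
    }
    return NULL;
}

/* One lookup, then an in-place add through the returned pointer.
   Stores the new value in *out; returns 0 on success, -1 if the table is full. */
static int toy_increment(toy_map *m, unsigned long key, long amount, long *out)
{
    long *slot = toy_slot(m, key);
    if (slot == NULL) {
        return -1;
    }
    *slot += amount;
    *out = *slot;
    return 0;
}
```

A PHP-level `$j[$k] = $j[$k] + 1` performs a read followed by a separate write, so it walks the structure twice; the pointer-returning lookup avoids the second walk.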

Scope

  • All changes confined to php_judy.c — no modifications to judy_handlers.c, judy_iterator.c, or judy_arrayaccess.c
  • Reuses existing helpers: judy_build_data_array, judy_create_result, judy_object_write_dimension_helper
  • Shared judy_populate_from_array() helper extracted for fromArray/putAll to reduce duplication
  • 14 new test files (99 total, all passing)
  • Updated package.xml, README.md, and BENCHMARK.md

Benchmark Results (100K elements)

Category     Method                        INT_TO_INT     STRING_TO_INT
Bulk Add     fromArray() vs individual     1.3x faster    ~1.0x
Bulk Get     getAll() vs individual        1.9x faster    1.1x faster
Conversion   toArray() vs manual foreach   2.8x faster    3.1x faster
Increment    increment() vs manual         1.6x faster    1.3x faster

Key insights:

  • toArray() provides the biggest speedup (2.8-3.1x) by using native C iteration, bypassing PHP Iterator overhead
  • getAll() is 1.9x faster for integer keys by avoiding per-element ArrayAccess overhead
  • increment() achieves true single-traversal for INT_TO_INT via JLI; STRING_TO_INT needs two traversals (JSLG, then JSLI) for counter tracking, but is still 1.3x faster than a PHP-level read-modify-write
  • fromArray() provides meaningful speedup for integer keys; string key performance is dominated by trie traversal

Full benchmark details in BENCHMARK.md Tables 4-7. CI now runs batch and set operations benchmarks automatically.

Code Review Fixes (Gemini)

  • Security: Added string key length validation (ZSTR_LEN >= PHP_JUDY_MAX_LENGTH) in increment() before JSLG/JSLI calls to prevent buffer overflow
  • Duplication: Extracted judy_populate_from_array() static helper shared by fromArray() and putAll()
  • Duplication: Restructured getAll() to group by is_integer_keyed / is_string_keyed first, reducing redundant key retrieval code
  • Documentation: Corrected "single-traversal" claim — now accurately describes INT_TO_INT as single-traversal (JLI) and STRING_TO_INT as two-traversal (JSLG+JSLI)
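The key length check described in the Security item can be sketched as follows. This is an illustration under assumptions, not the extension's code: `TOY_MAX_KEY_LENGTH` is a hypothetical stand-in for PHP_JUDY_MAX_LENGTH, `toy_copy_key` is a made-up helper, and the sketch assumes the limit is the size of a fixed buffer that must also hold the terminating NUL (JudySL keys are NUL-terminated strings). Under that assumption the comparison has to be `>=`, because a key of exactly the maximum length would leave no room for the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit; stands in for PHP_JUDY_MAX_LENGTH. */
#define TOY_MAX_KEY_LENGTH 16

/* Copy key into a fixed-size buffer as a NUL-terminated string.
   Rejects key_len >= TOY_MAX_KEY_LENGTH because one byte is reserved
   for the terminator. Returns 0 on success, -1 if the key is too long. */
static int toy_copy_key(char dst[TOY_MAX_KEY_LENGTH], const char *key, size_t key_len)
{
    assert(dst != NULL && key != NULL);
    if (key_len >= TOY_MAX_KEY_LENGTH) {
        return -1;
    }
    memcpy(dst, key, key_len);
    dst[key_len] = '\0';
    return 0;
}
```

Validating before the copy means the rejected key never touches the buffer, which is why the check sits ahead of the JSLG/JSLI calls.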

Test plan

  • Docker build and test: 99/99 tests pass (PHP 8.3)
  • CI matrix (PHP 8.1-8.5, Linux + Windows)
  • pecl package package.xml produces valid .tgz
  • Batch operations benchmark runs in CI with results in PR summary

…ement

Implements issue #43 features 1 and 2:

Batch operations:
- toArray(): convert Judy array to native PHP array
- fromArray(int $type, array $data): static factory from PHP array
- putAll(array $data): bulk-insert into existing Judy array
- getAll(array $keys): retrieve multiple values at once

Atomic increment:
- increment(mixed $key, int $amount = 1): single-traversal counter
  update for INT_TO_INT and STRING_TO_INT types

All methods reuse existing helpers (judy_build_data_array,
judy_create_result, judy_object_write_dimension_helper) and
require no changes to judy_handlers.c, judy_iterator.c, or
judy_arrayaccess.c.

14 new tests (99 total, all passing).
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the php-judy extension by introducing a suite of batch processing methods and an atomic increment function. These additions aim to improve performance by reducing the overhead of individual operations, allowing for more efficient data manipulation and counter management within Judy arrays. The changes provide developers with more powerful tools for interacting with Judy arrays, streamlining common tasks like data conversion, bulk inserts, and concurrent updates.

Highlights

  • Batch Operations: Introduced new methods for efficient bulk operations: toArray() to convert a Judy array to a PHP array, Judy::fromArray() as a static factory to create a Judy array from a PHP array, putAll() for bulk insertion, and getAll() to retrieve multiple values simultaneously.
  • Atomic Increment: Added an increment() method for INT_TO_INT and STRING_TO_INT Judy array types, enabling single-traversal counter updates, which is more efficient than typical PHP-level increment operations.
  • Documentation and Testing: Updated the README.md with usage examples for the new batch and increment methods, and added 14 new test files to ensure comprehensive coverage and correctness of these new functionalities.


Changelog
  • README.md
    • Added documentation and code examples for new batch operations (toArray, fromArray, putAll, getAll).
    • Added documentation and code examples for the new atomic increment method.
    • Updated the 'Future Plans' section to reflect the completion of batch API and atomic increment features.
  • package.xml
    • Added 14 new test files to the package manifest, covering the new batch operations and atomic increment functionality.
  • php_judy.c
    • Implemented the Judy::toArray() method to convert a Judy array to a native PHP array.
    • Implemented the static factory method Judy::fromArray() to create a Judy array from a PHP array.
    • Implemented the Judy::putAll() method for bulk insertion of key-value pairs from a PHP array.
    • Implemented the Judy::getAll() method to retrieve multiple values based on an array of keys.
    • Implemented the Judy::increment() method for atomic, single-traversal increments on INT_TO_INT and STRING_TO_INT types.
    • Declared the new toArray, fromArray, putAll, getAll, and increment methods in the judy_class_methods array.
    • Defined argument information (arginfo) for all newly added methods.
  • tests/from_array_error_001.phpt
    • Added a test case to verify that Judy::fromArray() throws an exception for invalid Judy types.
  • tests/from_array_roundtrip_001.phpt
    • Added a test case to ensure Judy::fromArray() correctly converts PHP arrays to Judy arrays and back via toArray() for all five Judy types.
  • tests/get_all_001.phpt
    • Added a test case to validate Judy::getAll() functionality across all five Judy types, including scenarios with missing keys.
  • tests/increment_int_to_int_001.phpt
    • Added a test case for Judy::increment() with INT_TO_INT type, covering basic increments, custom amounts, and new key creation.
  • tests/increment_string_to_int_001.phpt
    • Added a test case for Judy::increment() with STRING_TO_INT type, including verification of counter tracking.
  • tests/increment_type_error_001.phpt
    • Added a test case to confirm Judy::increment() throws exceptions when used with unsupported Judy array types.
  • tests/put_all_001.phpt
    • Added a test case to verify Judy::putAll() correctly bulk inserts data for all five Judy types.
  • tests/put_all_overwrite_001.phpt
    • Added a test case to ensure Judy::putAll() correctly overwrites existing keys during bulk insertion.
  • tests/to_array_bitset_001.phpt
    • Added a test case to confirm Judy::toArray() correctly converts a BITSET Judy array to a flat PHP array.
  • tests/to_array_empty_001.phpt
    • Added a test case to verify Judy::toArray() returns an empty array when called on an empty Judy array for all types.
  • tests/to_array_int_to_int_001.phpt
    • Added a test case to confirm Judy::toArray() correctly converts an INT_TO_INT Judy array to a PHP associative array.
  • tests/to_array_int_to_mixed_001.phpt
    • Added a test case to confirm Judy::toArray() correctly converts an INT_TO_MIXED Judy array to a PHP associative array.
  • tests/to_array_string_to_int_001.phpt
    • Added a test case to confirm Judy::toArray() correctly converts a STRING_TO_INT Judy array to a PHP associative array.
  • tests/to_array_string_to_mixed_001.phpt
    • Added a test case to confirm Judy::toArray() correctly converts a STRING_TO_MIXED Judy array to a PHP associative array.
Activity
  • No human activity has been recorded on this pull request yet.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


github-actions bot commented Feb 28, 2026

Test Results

PHP   Platform   Arch   TS    Tests   Pass   Fail   Skip   Duration
8.1   Linux      x64    -     99      99     0      0      0.8s
8.2   Linux      x64    -     99      99     0      0      1.1s
8.3   Linux      x64    -     99      99     0      0      1.0s
8.4   Linux      x64    -     99      99     0      0      1.0s
8.5   Linux      x64    -     99      99     0      0      1.1s
8.1   Windows    x64    nts   99      99     0      0      5.8s
8.2   Windows    x64    nts   99      99     0      0      5.8s
8.3   Windows    x64    nts   99      99     0      0      5.8s
8.4   Windows    x64    nts   99      99     0      0      5.9s
8.5   Windows    x64    nts   99      99     0      0      6.4s
Total                         990     990    0      0

Benchmark Summary (Judy vs PHP Array)

Ratio = Judy / Array, so lower is better for Judy. Judy wins at ≤0.95x (shown in bold in the rendered comment); any higher ratio means the PHP array is faster or smaller.
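As a worked example of how the ratios in the summary tables relate to the raw figures, the sketch below recomputes two cells from the PHP 8.1 Linux "Sparse Int 100K" rows. The helper names `judy_ratio` and `judy_wins` are illustrative only and are not part of the benchmark scripts; the 0.95 threshold is the one stated above.

```c
#include <assert.h>

/* Ratio as defined in the summary: Judy measurement divided by the
   PHP array measurement (time in seconds or memory in mb). */
static double judy_ratio(double judy, double array)
{
    assert(array > 0.0);
    return judy / array;
}

/* Judy counts as the winner when the ratio is at most 0.95. */
static int judy_wins(double ratio)
{
    return ratio <= 0.95;
}
```

With the raw write times 0.0121s (Judy) and 0.0039s (PHP array) the ratio is about 3.10, reported as 3.1x, so the array wins on write time. With the raw memory figures 1.86 mb and 7 mb the ratio is about 0.27, so Judy wins on memory.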

Time (Write / Read) — Linux

Scenario          PHP 8.1       PHP 8.2       PHP 8.3       PHP 8.4       PHP 8.5
Sparse Int 100K   3.1x / 2.7x   3.1x / 2.6x   3.0x / 2.6x   2.8x / 2.7x   2.9x / 2.4x
Sparse Int 500K   3.5x / 2.8x   3.5x / 2.8x   3.2x / 2.8x   3.3x / 2.9x   3.3x / 2.8x
Sparse Int 1M     3.7x / 2.4x   3.3x / 1.9x   3.7x / 2.3x   3.5x / 2.6x   3.6x / 2.0x
Sparse Int 10M    2.1x / 2.4x   2.0x / 2.2x   2.0x / 2.3x   1.9x / 2.4x   1.9x / 2.4x
String 100K       3.5x / 3.2x   3.2x / 3.1x   3.6x / 3.2x   2.8x / 3.3x   3.3x / 3.2x
String 500K       2.1x / 2.3x   2.1x / 2.1x   2.4x / 2.4x   2.4x / 2.2x   2.2x / 2.3x
String 1M         2.5x / 2.4x   2.6x / 1.9x   2.4x / 2.3x   2.6x / 2.2x   2.4x / 2.6x
String 10M        2.7x / 2.3x   2.6x / 2.3x   2.7x / 2.2x   2.7x / 2.4x   2.7x / 2.3x

Memory — Linux

Scenario          PHP 8.1   PHP 8.2   PHP 8.3   PHP 8.4   PHP 8.5
Sparse Int 100K   0.27x     0.26x     0.26x     0.26x     0.26x
Sparse Int 500K   0.46x     0.46x     0.46x     0.46x     0.46x
Sparse Int 1M     0.46x     0.46x     0.46x     0.46x     0.46x
Sparse Int 10M    0.29x     0.29x     0.29x     0.29x     0.29x
String 100K       0.61x     0.61x     0.61x     0.61x     0.61x
String 500K       0.76x     0.76x     0.76x     0.76x     0.76x
String 1M         0.76x     0.76x     0.76x     0.76x     0.76x
String 10M        0.48x     0.48x     0.48x     0.48x     0.48x

Time (Write / Read) — Windows

Scenario          PHP 8.1       PHP 8.2       PHP 8.3       PHP 8.4       PHP 8.5
Sparse Int 100K   5.5x / 2.9x   5.0x / 2.8x   5.2x / 2.7x   5.3x / 3.0x   4.5x / 3.2x
Sparse Int 500K   5.0x / 2.8x   4.4x / 2.4x   4.9x / 2.6x   4.6x / 2.4x   3.5x / 2.0x
Sparse Int 1M     4.3x / 2.7x   4.2x / 2.8x   4.0x / 2.6x   4.0x / 2.6x   3.2x / 2.0x
Sparse Int 10M    2.4x / 2.1x   2.4x / 2.1x   2.3x / 2.3x   2.1x / 2.2x   3.1x / 2.5x
String 100K       5.0x / 4.4x   3.6x / 3.9x   4.0x / 4.9x   3.6x / 3.7x   2.9x / 2.3x
String 500K       3.9x / 2.8x   3.5x / 2.7x   3.6x / 2.3x   3.4x / 2.3x   3.6x / 2.2x
String 1M         3.7x / 2.4x   3.3x / 2.3x   3.6x / 2.4x   2.9x / 1.8x   3.5x / 2.1x
String 10M        3.5x / 2.4x   3.4x / 2.3x   3.5x / 2.4x   3.5x / 2.5x   3.6x / 2.5x

Memory — Windows

Scenario          PHP 8.1   PHP 8.2   PHP 8.3   PHP 8.4   PHP 8.5
Sparse Int 100K   0.23x     0.23x     0.23x     0.23x     0.23x
Sparse Int 500K   0.46x     0.46x     0.46x     0.46x     0.46x
Sparse Int 1M     0.46x     0.46x     0.46x     0.46x     0.46x
Sparse Int 10M    0.29x     0.29x     0.29x     0.29x     0.29x
String 100K       0.51x     0.51x     0.51x     0.51x     0.51x
String 500K       0.76x     0.76x     0.76x     0.76x     0.76x
String 1M         0.76x     0.76x     0.76x     0.76x     0.76x
String 10M        0.48x     0.48x     0.48x     0.48x     0.48x
Raw benchmark data

Write Time — Linux

Scenario Subject PHP 8.1 PHP 8.2 PHP 8.3 PHP 8.4 PHP 8.5
Sparse Int 100K Judy 0.0121s 0.0122s 0.0119s 0.0119s 0.0119s
Sparse Int 100K PHP Array 0.0039s 0.0040s 0.0040s 0.0042s 0.0041s
Sparse Int 500K Judy 0.0657s 0.0662s 0.0619s 0.0663s 0.0648s
Sparse Int 500K PHP Array 0.0190s 0.0191s 0.0195s 0.0203s 0.0197s
Sparse Int 1M Judy 0.1601s 0.1677s 0.1604s 0.1688s 0.1667s
Sparse Int 1M PHP Array 0.0438s 0.0505s 0.0429s 0.0476s 0.0458s
Sparse Int 10M Judy 2.5544s 2.7309s 2.6654s 2.8781s 2.7866s
Sparse Int 10M PHP Array 1.2451s 1.3479s 1.3465s 1.5245s 1.4504s
String 100K Judy 0.0207s 0.0212s 0.0206s 0.0221s 0.0209s
String 100K PHP Array 0.0059s 0.0067s 0.0058s 0.0080s 0.0063s
String 500K Judy 0.1481s 0.1585s 0.1412s 0.1623s 0.1487s
String 500K PHP Array 0.0706s 0.0738s 0.0592s 0.0676s 0.0686s
String 1M Judy 0.3385s 0.4213s 0.3366s 0.3899s 0.3555s
String 1M PHP Array 0.1368s 0.1597s 0.1391s 0.1494s 0.1452s
String 10M Judy 4.9804s 5.3941s 5.0140s 5.3799s 5.2985s
String 10M PHP Array 1.8593s 2.0418s 1.8598s 2.0164s 1.9532s

Read Time — Linux

Scenario Subject PHP 8.1 PHP 8.2 PHP 8.3 PHP 8.4 PHP 8.5
Sparse Int 100K Judy 0.0095s 0.0098s 0.0092s 0.0093s 0.0097s
Sparse Int 100K PHP Array 0.0035s 0.0037s 0.0035s 0.0035s 0.0041s
Sparse Int 500K Judy 0.0556s 0.0579s 0.0553s 0.0568s 0.0559s
Sparse Int 500K PHP Array 0.0201s 0.0210s 0.0200s 0.0199s 0.0203s
Sparse Int 1M Judy 0.1267s 0.1301s 0.1205s 0.1495s 0.1305s
Sparse Int 1M PHP Array 0.0533s 0.0684s 0.0515s 0.0578s 0.0650s
Sparse Int 10M Judy 2.9984s 3.1232s 2.9873s 3.3414s 3.1920s
Sparse Int 10M PHP Array 1.2575s 1.4064s 1.3031s 1.3797s 1.3419s
String 100K Judy 0.0161s 0.0162s 0.0160s 0.0164s 0.0162s
String 100K PHP Array 0.0050s 0.0052s 0.0050s 0.0050s 0.0051s
String 500K Judy 0.1393s 0.1622s 0.1392s 0.1656s 0.1488s
String 500K PHP Array 0.0597s 0.0773s 0.0578s 0.0764s 0.0653s
String 1M Judy 0.3405s 0.4118s 0.3381s 0.3781s 0.3835s
String 1M PHP Array 0.1424s 0.2214s 0.1475s 0.1696s 0.1496s
String 10M Judy 5.3116s 5.8191s 5.1996s 5.8483s 5.6752s
String 10M PHP Array 2.3262s 2.4871s 2.3457s 2.4845s 2.4231s

Memory — Linux

Scenario Subject PHP 8.1 PHP 8.2 PHP 8.3 PHP 8.4 PHP 8.5
Sparse Int 100K Judy 1.86 mb 1.84 mb 1.83 mb 1.84 mb 1.84 mb
Sparse Int 100K PHP Array 7 mb 7 mb 7 mb 7 mb 7 mb
Sparse Int 500K Judy 9.19 mb 9.19 mb 9.21 mb 9.18 mb 9.2 mb
Sparse Int 500K PHP Array 20 mb 20 mb 20 mb 20 mb 20 mb
Sparse Int 1M Judy 18.38 mb 18.36 mb 18.37 mb 18.35 mb 18.37 mb
Sparse Int 1M PHP Array 40 mb 40 mb 40 mb 40 mb 40 mb
Sparse Int 10M Judy 183.69 mb 183.52 mb 183.56 mb 183.57 mb 183.5 mb
Sparse Int 10M PHP Array 640 mb 640 mb 640 mb 640 mb 640 mb
String 100K Judy 3.05 mb 3.05 mb 3.05 mb 3.05 mb 3.05 mb
String 100K PHP Array 5 mb 5 mb 5 mb 5 mb 5 mb
String 500K Judy 15.26 mb 15.26 mb 15.26 mb 15.26 mb 15.26 mb
String 500K PHP Array 20 mb 20 mb 20 mb 20 mb 20 mb
String 1M Judy 30.52 mb 30.52 mb 30.52 mb 30.52 mb 30.52 mb
String 1M PHP Array 40 mb 40 mb 40 mb 40 mb 40 mb
String 10M Judy 305.18 mb 305.18 mb 305.18 mb 305.18 mb 305.18 mb
String 10M PHP Array 640 mb 640 mb 640 mb 640 mb 640 mb

Write Time — Windows

Scenario Subject PHP 8.1 PHP 8.2 PHP 8.3 PHP 8.4 PHP 8.5
Sparse Int 100K Judy 0.0286s 0.0289s 0.0293s 0.0285s 0.0300s
Sparse Int 100K PHP Array 0.0052s 0.0058s 0.0056s 0.0054s 0.0067s
Sparse Int 500K Judy 0.1474s 0.1488s 0.1537s 0.1485s 0.1752s
Sparse Int 500K PHP Array 0.0296s 0.0341s 0.0316s 0.0321s 0.0495s
Sparse Int 1M Judy 0.3049s 0.3068s 0.3224s 0.3054s 0.4012s
Sparse Int 1M PHP Array 0.0713s 0.0731s 0.0807s 0.0761s 0.1247s
Sparse Int 10M Judy 4.6745s 4.8642s 5.0700s 4.9523s 6.3053s
Sparse Int 10M PHP Array 1.9382s 2.0243s 2.1818s 2.3111s 2.0048s
String 100K Judy 0.0441s 0.0476s 0.0499s 0.0472s 0.0540s
String 100K PHP Array 0.0089s 0.0131s 0.0126s 0.0132s 0.0186s
String 500K Judy 0.2946s 0.2956s 0.3201s 0.2970s 0.3696s
String 500K PHP Array 0.0765s 0.0845s 0.0885s 0.0883s 0.1028s
String 1M Judy 0.6284s 0.6388s 0.6925s 0.6301s 0.8137s
String 1M PHP Array 0.1716s 0.1913s 0.1948s 0.2175s 0.2307s
String 10M Judy 8.6639s 8.6659s 9.4515s 9.3537s 11.0716s
String 10M PHP Array 2.4623s 2.5596s 2.6631s 2.6401s 3.0951s

Read Time — Windows

Scenario Subject PHP 8.1 PHP 8.2 PHP 8.3 PHP 8.4 PHP 8.5
Sparse Int 100K Judy 0.0153s 0.0157s 0.0155s 0.0154s 0.0181s
Sparse Int 100K PHP Array 0.0053s 0.0056s 0.0057s 0.0052s 0.0056s
Sparse Int 500K Judy 0.0869s 0.0927s 0.1001s 0.0835s 0.1488s
Sparse Int 500K PHP Array 0.0310s 0.0386s 0.0384s 0.0348s 0.0726s
Sparse Int 1M Judy 0.2089s 0.2180s 0.2467s 0.2058s 0.3549s
Sparse Int 1M PHP Array 0.0785s 0.0786s 0.0951s 0.0784s 0.1772s
Sparse Int 10M Judy 3.6430s 3.8049s 4.1558s 4.1144s 5.2681s
Sparse Int 10M PHP Array 1.7618s 1.7928s 1.8276s 1.8850s 2.1108s
String 100K Judy 0.0273s 0.0285s 0.0356s 0.0272s 0.0410s
String 100K PHP Array 0.0062s 0.0074s 0.0072s 0.0073s 0.0177s
String 500K Judy 0.2087s 0.2152s 0.2360s 0.1950s 0.2981s
String 500K PHP Array 0.0752s 0.0807s 0.1007s 0.0864s 0.1372s
String 1M Judy 0.4722s 0.5006s 0.5779s 0.4855s 0.6486s
String 1M PHP Array 0.1973s 0.2144s 0.2418s 0.2636s 0.3117s
String 10M Judy 7.1831s 7.0521s 7.5821s 7.7598s 9.4430s
String 10M PHP Array 3.0508s 3.0798s 3.1487s 3.1285s 3.7893s

Memory — Windows

Scenario Subject PHP 8.1 PHP 8.2 PHP 8.3 PHP 8.4 PHP 8.5
Sparse Int 100K Judy 1.85 mb 1.84 mb 1.85 mb 1.85 mb 1.84 mb
Sparse Int 100K PHP Array 8 mb 8 mb 8 mb 8 mb 8 mb
Sparse Int 500K Judy 9.2 mb 9.18 mb 9.18 mb 9.18 mb 9.19 mb
Sparse Int 500K PHP Array 20 mb 20 mb 20 mb 20 mb 20 mb
Sparse Int 1M Judy 18.35 mb 18.36 mb 18.37 mb 18.34 mb 18.36 mb
Sparse Int 1M PHP Array 40 mb 40 mb 40 mb 40 mb 40 mb
Sparse Int 10M Judy 183.58 mb 183.61 mb 183.62 mb 183.6 mb 183.53 mb
Sparse Int 10M PHP Array 640 mb 640 mb 640 mb 640 mb 640 mb
String 100K Judy 3.05 mb 3.05 mb 3.05 mb 3.05 mb 3.05 mb
String 100K PHP Array 6 mb 6 mb 6 mb 6 mb 6 mb
String 500K Judy 15.26 mb 15.26 mb 15.26 mb 15.26 mb 15.26 mb
String 500K PHP Array 20 mb 20 mb 20 mb 20 mb 20 mb
String 1M Judy 30.52 mb 30.52 mb 30.52 mb 30.52 mb 30.52 mb
String 1M PHP Array 40 mb 40 mb 40 mb 40 mb 40 mb
String 10M Judy 305.18 mb 305.18 mb 305.18 mb 305.18 mb 305.18 mb
String 10M PHP Array 640 mb 640 mb 640 mb 640 mb 640 mb

Batch & Set Operations Benchmarks

Benchmarks for putAll(), fromArray(), getAll(), toArray(), increment(), and BITSET set operations (union, intersect, diff, xor).

Batch Operations — PHP 8.5 (Linux)
=============================================================
  Judy Batch Operations & Increment Benchmark
=============================================================
  PHP 8.5.3 | Judy ext 2.3.0
  Iterations: 5 (median of each)
=============================================================

===========================================================
  INT_TO_INT — 10,000 elements
===========================================================

  [1. Bulk Add: populate 10000 elements]
  PHP array (foreach assign)                            0.227 ms
  Judy individual $j[$k] = $v                           0.507 ms
  Judy putAll()                                         0.383 ms
  Judy::fromArray()                                     0.392 ms
  putAll() vs individual Judy                        1.32x
  fromArray() vs individual Judy                     1.29x
  Judy putAll() vs PHP array                         1.69x
  Judy fromArray() vs PHP array                      1.73x

  [2. Bulk Get: fetch 1100 keys (incl. 100 missing)]
  PHP array ($a[$k] ?? null)                            0.030 ms
  Judy individual $j[$k]                                0.063 ms
  Judy getAll()                                         0.042 ms
  getAll() vs individual Judy                        1.50x
  Judy getAll() vs PHP array                         1.43x

  [3. Conversion: Judy to PHP array]
  Judy toArray()                                        0.433 ms
  Judy manual foreach loop                              2.346 ms
  toArray() vs manual foreach                        5.42x

  [4. Increment: 10000 ops on 1000 unique keys]
  PHP array $a[$k]++                                    0.220 ms
  Judy $j[$k] = $j[$k] + 1                              0.682 ms
  Judy increment()                                      0.465 ms
  increment() vs manual Judy                         1.47x
  Judy increment() vs PHP array                      2.11x

===========================================================
  STRING_TO_INT — 10,000 elements
===========================================================

  [1. Bulk Add: populate 10000 elements]
  PHP array (foreach assign)                            0.294 ms
  Judy individual $j[$k] = $v                           1.912 ms
  Judy putAll()                                         1.779 ms
  Judy::fromArray()                                     1.788 ms
  putAll() vs individual Judy                        1.08x
  fromArray() vs individual Judy                     1.07x
  Judy putAll() vs PHP array                         6.05x
  Judy fromArray() vs PHP array                      6.08x

  [2. Bulk Get: fetch 1100 keys (incl. 100 missing)]
  PHP array ($a[$k] ?? null)                            0.039 ms
  Judy individual $j[$k]                                0.101 ms
  Judy getAll()                                         0.095 ms
  getAll() vs individual Judy                        1.06x
  Judy getAll() vs PHP array                         2.42x

  [3. Conversion: Judy to PHP array]
  Judy toArray()                                        1.098 ms
  Judy manual foreach loop                              4.050 ms
  toArray() vs manual foreach                        3.69x

  [4. Increment: 10000 ops on 1000 unique keys]
  PHP array $a[$k]++                                    0.324 ms
  Judy $j[$k] = $j[$k] + 1                              1.793 ms
  Judy increment()                                      1.295 ms
  increment() vs manual Judy                         1.38x
  Judy increment() vs PHP array                      4.00x

===========================================================
  INT_TO_INT — 100,000 elements
===========================================================

  [1. Bulk Add: populate 100000 elements]
  PHP array (foreach assign)                            2.769 ms
  Judy individual $j[$k] = $v                           5.513 ms
  Judy putAll()                                         4.255 ms
  Judy::fromArray()                                     4.291 ms
  putAll() vs individual Judy                        1.30x
  fromArray() vs individual Judy                     1.28x
  Judy putAll() vs PHP array                         1.54x
  Judy fromArray() vs PHP array                      1.55x

  [2. Bulk Get: fetch 10100 keys (incl. 100 missing)]
  PHP array ($a[$k] ?? null)                            0.272 ms
  Judy individual $j[$k]                                0.443 ms
  Judy getAll()                                         0.253 ms
  getAll() vs individual Judy                        1.75x
  Judy getAll() vs PHP array                         0.93x

  [3. Conversion: Judy to PHP array]
  Judy toArray()                                        5.113 ms
  Judy manual foreach loop                             24.145 ms
  toArray() vs manual foreach                        4.72x

  [4. Increment: 100000 ops on 1000 unique keys]
  PHP array $a[$k]++                                    2.129 ms
  Judy $j[$k] = $j[$k] + 1                              6.698 ms
  Judy increment()                                      4.554 ms
  increment() vs manual Judy                         1.47x
  Judy increment() vs PHP array                      2.14x

===========================================================
  STRING_TO_INT — 100,000 elements
===========================================================

  [1. Bulk Add: populate 100000 elements]
  PHP array (foreach assign)                            3.490 ms
  Judy individual $j[$k] = $v                          20.498 ms
  Judy putAll()                                        19.210 ms
  Judy::fromArray()                                    19.185 ms
  putAll() vs individual Judy                        1.07x
  fromArray() vs individual Judy                     1.07x
  Judy putAll() vs PHP array                         5.50x
  Judy fromArray() vs PHP array                      5.50x

  [2. Bulk Get: fetch 10100 keys (incl. 100 missing)]
  PHP array ($a[$k] ?? null)                            0.358 ms
  Judy individual $j[$k]                                1.001 ms
  Judy getAll()                                         0.944 ms
  getAll() vs individual Judy                        1.06x
  Judy getAll() vs PHP array                         2.64x

  [3. Conversion: Judy to PHP array]
  Judy toArray()                                       11.742 ms
  Judy manual foreach loop                             41.637 ms
  toArray() vs manual foreach                        3.55x

  [4. Increment: 100000 ops on 1000 unique keys]
  PHP array $a[$k]++                                    3.123 ms
  Judy $j[$k] = $j[$k] + 1                             17.840 ms
  Judy increment()                                     12.544 ms
  increment() vs manual Judy                         1.42x
  Judy increment() vs PHP array                      4.02x

===========================================================
  INT_TO_INT — 500,000 elements
===========================================================

  [1. Bulk Add: populate 500000 elements]
  PHP array (foreach assign)                           12.652 ms
  Judy individual $j[$k] = $v                          27.811 ms
  Judy putAll()                                        21.808 ms
  Judy::fromArray()                                    22.108 ms
  putAll() vs individual Judy                        1.28x
  fromArray() vs individual Judy                     1.26x
  Judy putAll() vs PHP array                         1.72x
  Judy fromArray() vs PHP array                      1.75x

  [2. Bulk Get: fetch 50100 keys (incl. 100 missing)]
  PHP array ($a[$k] ?? null)                            1.570 ms
  Judy individual $j[$k]                                2.428 ms
  Judy getAll()                                         1.496 ms
  getAll() vs individual Judy                        1.62x
  Judy getAll() vs PHP array                         0.95x

  [3. Conversion: Judy to PHP array]
  Judy toArray()                                       24.826 ms
  Judy manual foreach loop                            119.767 ms
  toArray() vs manual foreach                        4.82x

  [4. Increment: 500000 ops on 1000 unique keys]
  PHP array $a[$k]++                                   10.601 ms
  Judy $j[$k] = $j[$k] + 1                             33.813 ms
  Judy increment()                                     23.072 ms
  increment() vs manual Judy                         1.47x
  Judy increment() vs PHP array                      2.18x

===========================================================
  STRING_TO_INT — 500,000 elements
===========================================================

  [1. Bulk Add: populate 500000 elements]
  PHP array (foreach assign)                           16.701 ms
  Judy individual $j[$k] = $v                         103.379 ms
  Judy putAll()                                        96.282 ms
  Judy::fromArray()                                    96.743 ms
  putAll() vs individual Judy                        1.07x
  fromArray() vs individual Judy                     1.07x
  Judy putAll() vs PHP array                         5.77x
  Judy fromArray() vs PHP array                      5.79x

  [2. Bulk Get: fetch 50100 keys (incl. 100 missing)]
  PHP array ($a[$k] ?? null)                            1.926 ms
  Judy individual $j[$k]                                5.294 ms
  Judy getAll()                                         5.014 ms
  getAll() vs individual Judy                        1.06x
  Judy getAll() vs PHP array                         2.60x

  [3. Conversion: Judy to PHP array]
  Judy toArray()                                       59.734 ms
  Judy manual foreach loop                            210.771 ms
  toArray() vs manual foreach                        3.53x

  [4. Increment: 500000 ops on 1000 unique keys]
  PHP array $a[$k]++                                   15.531 ms
  Judy $j[$k] = $j[$k] + 1                             86.640 ms
  Judy increment()                                     60.992 ms
  increment() vs manual Judy                         1.42x
  Judy increment() vs PHP array                      3.93x

=============================================================
  Benchmark complete.
=============================================================
Set Operations — PHP 8.5 (Linux)
=============================================================
  Judy BITSET Set Operations Benchmark
=============================================================
  PHP 8.5.3 | Judy ext 2.3.0
  Overlap: 50% between sets A and B
=============================================================

--- Size: 1,000 indices per set (500 unique + 500 shared) ---

  [Union]
  Judy::union()                                    0.090 ms (median of 5)
  array_replace() keys                             0.014 ms (median of 5)
  Speedup: 0.2x

  [Intersect]
  Judy::intersect()                                0.042 ms (median of 5)
  array_intersect_key()                            0.008 ms (median of 5)
  Speedup: 0.2x

  [Diff]
  Judy::diff()                                     0.041 ms (median of 5)
  array_diff_key()                                 0.012 ms (median of 5)
  Speedup: 0.3x

  [XOR (symmetric difference)]
  Judy::xor()                                      0.108 ms (median of 5)
  array_diff_key() x2 + array_replace()            0.016 ms (median of 5)
  Speedup: 0.1x

  [Memory: union result]
  Judy memoryUsage()                            8.08 kb
  PHP array memory delta                        36.05 kb
  Ratio: PHP uses 4.5x more memory

--- Size: 10,000 indices per set (5000 unique + 5000 shared) ---

  [Union]
  Judy::union()                                    0.588 ms (median of 5)
  array_replace() keys                             0.077 ms (median of 5)
  Speedup: 0.1x

  [Intersect]
  Judy::intersect()                                0.271 ms (median of 5)
  array_intersect_key()                            0.092 ms (median of 5)
  Speedup: 0.3x

  [Diff]
  Judy::diff()                                     0.269 ms (median of 5)
  array_diff_key()                                 0.056 ms (median of 5)
  Speedup: 0.2x

  [XOR (symmetric difference)]
  Judy::xor()                                      0.533 ms (median of 5)
  array_diff_key() x2 + array_replace()            0.164 ms (median of 5)
  Speedup: 0.3x

  [Memory: union result]
  Judy memoryUsage()                            8.08 kb
  PHP array memory delta                        260.05 kb
  Ratio: PHP uses 32.2x more memory

--- Size: 100,000 indices per set (50000 unique + 50000 shared) ---

  [Union]
  Judy::union()                                    7.127 ms (median of 5)
  array_replace() keys                             0.996 ms (median of 5)
  Speedup: 0.1x

  [Intersect]
  Judy::intersect()                                3.352 ms (median of 5)
  array_intersect_key()                            0.980 ms (median of 5)
  Speedup: 0.3x

  [Diff]
  Judy::diff()                                     3.177 ms (median of 5)
  array_diff_key()                                 0.560 ms (median of 5)
  Speedup: 0.2x

  [XOR (symmetric difference)]
  Judy::xor()                                      6.634 ms (median of 5)
  array_diff_key() x2 + array_replace()            2.130 ms (median of 5)
  Speedup: 0.3x

  [Memory: union result]
  Judy memoryUsage()                            20.08 kb
  PHP array memory delta                        4 mb
  Ratio: PHP uses 204.2x more memory

--- Size: 500,000 indices per set (250000 unique + 250000 shared) ---

  [Union]
  Judy::union()                                   37.394 ms (median of 5)
  array_replace() keys                             4.683 ms (median of 5)
  Speedup: 0.1x

  [Intersect]
  Judy::intersect()                               16.727 ms (median of 5)
  array_intersect_key()                            5.567 ms (median of 5)
  Speedup: 0.3x

  [Diff]
  Judy::diff()                                    16.671 ms (median of 5)
  array_diff_key()                                 3.124 ms (median of 5)
  Speedup: 0.2x

  [XOR (symmetric difference)]
  Judy::xor()                                     33.654 ms (median of 5)
  array_diff_key() x2 + array_replace()           11.500 ms (median of 5)
  Speedup: 0.3x

  [Memory: union result]
  Judy memoryUsage()                            56.08 kb
  PHP array memory delta                        16 mb
  Ratio: PHP uses 292.2x more memory

=============================================================
  Benchmark complete.
=============================================================

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces valuable batch operations (toArray, fromArray, putAll, getAll) and an atomic increment method, enhancing performance and usability. However, a critical security vulnerability was identified in the Judy::increment() method due to a missing length check for string keys. This can lead to stack buffer overflows in other methods, potentially enabling arbitrary code execution or denial of service. Additionally, there are suggestions to improve maintainability by reducing code duplication and to correct documentation discrepancies regarding the increment method's performance.
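The length check the review asks for could look roughly like the sketch below. It assumes string keys end up in a fixed-size stack buffer; `SKETCH_MAX_KEY_LEN`, `key_length_ok`, and `copy_key_checked` are hypothetical names for illustration and are not taken from php_judy.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limit for this sketch; the extension's real constant may differ. */
#define SKETCH_MAX_KEY_LEN 16

/* Returns 1 when a key of key_len bytes plus its NUL terminator fits
 * in a buffer of SKETCH_MAX_KEY_LEN bytes, 0 otherwise. */
static int key_length_ok(size_t key_len)
{
    return key_len < SKETCH_MAX_KEY_LEN;
}

/* Copies the key into a fixed-size buffer only after validating its length.
 * Returns 0 on success, -1 when the key is rejected. */
static int copy_key_checked(const char *key, size_t key_len,
                            char out[SKETCH_MAX_KEY_LEN])
{
    if (!key_length_ok(key_len)) {
        return -1; /* a caller would raise an exception here instead of inserting */
    }
    memcpy(out, key, key_len);
    out[key_len] = '\0';
    return 0;
}
```

The point of rejecting the key before any insert is that an oversized key never reaches the array, so later methods that copy keys into a same-sized buffer cannot encounter one.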

Add examples/judy-bench-batch-operations.php comparing:
- Bulk add: putAll()/fromArray() vs individual assignment vs PHP array
- Bulk get: getAll() vs individual access vs PHP array
- Conversion: toArray() vs manual foreach
- Increment: increment() vs manual read-modify-write vs PHP array

Key findings (100K elements, PHP 8.3):
- getAll() is 1.9x faster than individual Judy lookups (INT_TO_INT)
- toArray() is 2.8-3.1x faster than manual foreach
- increment() is 1.3-1.6x faster than manual $j[$k]+1
- fromArray() is 1.3x faster than individual inserts (INT_TO_INT)

Add Tables 4-7 to BENCHMARK.md with results.

Restructure judy-bench-batch-operations.php so each type section
(INT_TO_INT, STRING_TO_INT) runs all 4 benchmark categories together:
1. Bulk Add (PHP array vs individual Judy vs putAll vs fromArray)
2. Bulk Get (PHP array vs individual Judy vs getAll)
3. Conversion (toArray vs manual foreach)
4. Increment (PHP array vs manual Judy vs increment)

Every method now has a consistent 3-way comparison in context.

- Security: add string key length validation in increment() before
  JSLG/JSLI calls to prevent potential buffer overflow
- Refactor: extract judy_populate_from_array() shared helper for
  fromArray() and putAll() to eliminate code duplication
- Refactor: restructure getAll() to group by is_integer_keyed vs
  is_string_keyed, reducing redundant key retrieval code
- Docs: correct "single-traversal" claim for increment() — INT_TO_INT
  uses single traversal (JLI), STRING_TO_INT uses two (JSLG+JSLI)
- CI: run batch ops and set ops benchmarks, include results in PR report
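
The traversal counts in the Docs item can be illustrated with a toy map. This is only a sketch: `toy_map`, `toy_get`, and `toy_insert` are hypothetical stand-ins for a Judy array and its get and insert-or-get operations, not the Judy API, and the `traversals` field exists purely to make the difference visible.

```c
#include <stddef.h>

#define TOY_CAPACITY 8

/* Toy fixed-size map standing in for a Judy array (not the Judy API). */
struct toy_map {
    long keys[TOY_CAPACITY];
    long values[TOY_CAPACITY];
    size_t used;
    unsigned traversals; /* number of searches performed so far */
};

/* Lookup only: returns a pointer to the value, or NULL when the key is absent. */
static long *toy_get(struct toy_map *m, long key)
{
    m->traversals++;
    for (size_t i = 0; i < m->used; i++) {
        if (m->keys[i] == key) {
            return &m->values[i];
        }
    }
    return NULL;
}

/* Insert-or-get: returns a pointer to the value slot, creating it with 0
 * when the key is absent. Returns NULL when the map is full. */
static long *toy_insert(struct toy_map *m, long key)
{
    m->traversals++;
    for (size_t i = 0; i < m->used; i++) {
        if (m->keys[i] == key) {
            return &m->values[i];
        }
    }
    if (m->used == TOY_CAPACITY) {
        return NULL;
    }
    m->keys[m->used] = key;
    m->values[m->used] = 0;
    return &m->values[m->used++];
}

/* One traversal: increment in place through the slot pointer.
 * Returns 0 on success, -1 when the map is full. */
static int toy_increment(struct toy_map *m, long key, long amount, long *result)
{
    long *slot = toy_insert(m, key);
    if (slot == NULL) {
        return -1;
    }
    *slot += amount;
    *result = *slot;
    return 0;
}

/* Two traversals: a lookup first to learn whether the key is new
 * (the kind of information a counter needs), then insert-or-get. */
static int toy_increment_tracked(struct toy_map *m, long key, long amount,
                                 long *result, int *was_new)
{
    *was_new = (toy_get(m, key) == NULL);
    return toy_increment(m, key, amount, result);
}
```

In this sketch `toy_increment` corresponds to the single-traversal pattern described for INT_TO_INT, and `toy_increment_tracked` to the lookup-then-insert pattern described for STRING_TO_INT.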
@orieg orieg merged commit 3bb4639 into main Feb 28, 2026
13 checks passed
@orieg orieg deleted the feat/batch-and-increment branch February 28, 2026 03:24