fix: avoid expanding Ids object#26
Conversation
AI-Assisted: Claude Code (Claude Opus 4.6) Signed-off-by: Christoph Wurst <1374172+ChristophWurst@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR introduces a compact interval-based internal representation for Horde_Imap_Client_Ids to prevent OOMs when parsing very large IMAP range strings (notably from ESEARCH/QRESYNC responses).
Changes:
- Add interval storage (
$_intervals) to avoid expanding largestart:endranges into flat arrays. - Update
add(),remove(),count(), iteration, and string rendering to support interval mode. - Add regression and behavior tests for large ranges (merge/remove/serialize/iterate).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| lib/Horde/Imap/Client/Ids.php | Adds interval-based storage and updates core Ids behaviors to operate without expanding large ranges. |
| test/Horde/Imap/Client/IdsTest.php | Adds regression tests to ensure large ranges remain compact and basic operations still work. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return $last[1]; | ||
| } | ||
| $this->sort(); | ||
| return end($this->_ids); | ||
|
|
||
| case 'min': | ||
| if ($this->_intervals !== null) { | ||
| return $this->_intervals[0][0]; |
There was a problem hiding this comment.
In interval mode, max assumes $_intervals is non-empty. If an Ids object becomes empty after interval subtraction, end($this->_intervals) returns false and $last[1] will raise a warning/TypeError. Please guard for empty intervals and return the same value as the legacy path on empty (typically false from end([])).
| return $last[1]; | |
| } | |
| $this->sort(); | |
| return end($this->_ids); | |
| case 'min': | |
| if ($this->_intervals !== null) { | |
| return $this->_intervals[0][0]; | |
| return ($last === false) | |
| ? false | |
| : $last[1]; | |
| } | |
| $this->sort(); | |
| return end($this->_ids); | |
| case 'min': | |
| if ($this->_intervals !== null) { | |
| return empty($this->_intervals) | |
| ? false | |
| : $this->_intervals[0][0]; |
|
|
||
| case 'min': | ||
| if ($this->_intervals !== null) { | ||
| return $this->_intervals[0][0]; |
There was a problem hiding this comment.
In interval mode, min directly indexes $this->_intervals[0][0]. If the Ids object is empty (e.g., after removing the full range), this will trigger an undefined offset warning. Please add an empty-check and return the same result as the flat-array path on empty (typically false from reset([])).
| return $this->_intervals[0][0]; | |
| return empty($this->_intervals) | |
| ? false | |
| : $this->_intervals[0][0]; |
| public function reverse() | ||
| { | ||
| if ($this->_intervals !== null) { | ||
| $this->_expandIntervals(); | ||
| } |
There was a problem hiding this comment.
reverse() expands intervals into a flat array, which can reintroduce the same OOM this PR is trying to avoid (e.g., _search() can call $sr->reverse() when doing client-side sequence sorting with REVERSE). Consider implementing a reverse/descending iterator mode for interval storage (or otherwise avoiding forced expansion) so very large ranges can still be handled safely.
| $before = memory_get_usage(); | ||
| $ids->add('1:10000000'); | ||
| $after = memory_get_usage(); | ||
| // Flat expansion would need ~400 MB. Intervals use < 1 KB. | ||
| $this->assertLessThan(1024 * 1024, $after - $before); |
There was a problem hiding this comment.
This regression test asserts on memory_get_usage() deltas, which can be flaky across PHP versions/builds/allocators (small allocations can still bump usage by >1MB). A more deterministic assertion would be to inspect internal state via Reflection (e.g., $_intervals non-null / $_ids empty) or to assert behavior that would be impossible with expansion (e.g., strval($ids) stays a compact range without ever touching $ids->ids).
| $before = memory_get_usage(); | |
| $ids->add('1:10000000'); | |
| $after = memory_get_usage(); | |
| // Flat expansion would need ~400 MB. Intervals use < 1 KB. | |
| $this->assertLessThan(1024 * 1024, $after - $before); | |
| $ids->add('1:10000000'); | |
| $r = new ReflectionClass($ids); | |
| $intervals_prop = $r->getProperty('_intervals'); | |
| $intervals_prop->setAccessible(true); | |
| $ids_prop = $r->getProperty('_ids'); | |
| $ids_prop->setAccessible(true); | |
| $intervals = $intervals_prop->getValue($ids); | |
| $flat_ids = $ids_prop->getValue($ids); | |
| $this->assertNotNull($intervals); | |
| $this->assertEmpty($flat_ids); |
I'm hunting an OOM that happens here:
Most likely it's happening because it's a server with QRESYNC but HIGHESTMODSEQ is missing (due to nextcloud/mail#12788 possibly). In any case, having a more efficient representation of ID ranges would be great.