@jmarble commented Oct 23, 2025

Adds a uniqueStrings() method to Collection and LazyCollection for optimized string deduplication.

Provides a 5-245x performance improvement over unique() by using array_unique() with the SORT_STRING flag and isset() hash lookups. Supports keys, closures, and nested properties.

```php
collect(['5', '10', '5', '3A'])->uniqueStrings(); // ['5', '10', '3A']
collect($users)->uniqueStrings('email');
collect($products)->uniqueStrings(fn($p) => $p->sku);
collect($posts)->uniqueStrings('author.email'); // nested relationships
```
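
For readers unfamiliar with the isset() technique mentioned above, here is a minimal sketch in plain PHP. It is not the code from this PR: the function name `unique_strings_sketch` and its `$keyOf` parameter are hypothetical, and the sketch assumes every item (or extracted key) can be cast to a string. It keeps the first occurrence of each string and preserves the original keys.

```php
<?php

/**
 * Hypothetical helper for illustration only; not the PR's implementation.
 * Deduplicates by string identity using an associative array as a hash set.
 */
function unique_strings_sketch(iterable $items, ?callable $keyOf = null): array
{
    $seen = [];   // string identity => true
    $result = [];

    foreach ($items as $key => $item) {
        // Derive the string to deduplicate on (the item itself by default).
        $id = (string) ($keyOf ? $keyOf($item) : $item);

        if (isset($seen[$id])) {
            continue; // already seen: constant-time hash lookup, no rescan
        }

        $seen[$id] = true;
        $result[$key] = $item; // first occurrence wins, original key kept
    }

    return $result;
}

$unique = unique_strings_sketch(['5', '10', '5', '3A']);
// $unique is [0 => '5', 1 => '10', 3 => '3A']

$byEmail = unique_strings_sketch(
    [['email' => 'a@x.test'], ['email' => 'b@x.test'], ['email' => 'a@x.test']],
    fn (array $user) => $user['email']
);
// $byEmail keeps the entries at keys 0 and 1
```

Array-key lookups match on exact string identity, so '5' and '05' remain distinct here, whereas PHP's loose `==` comparison treats those two numeric strings as equal.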

Also avoids SORT_REGULAR instability: php/php-src#20262

Benchmark

Small datasets (10 items): 1-5x faster
Large datasets (10K items): 21-38x faster (up to 293x vs. uniqueStrict())

Benchmark results: https://gist.github.com/jmarble/e4bcecb63e021d48c8246dde1400ad3c
Benchmark source code: https://gist.github.com/jmarble/57746453e7a1e9cbf77c8bc2812e5e1c

@jmarble force-pushed the feature/collection-unique-strings branch 2 times, most recently from 670761e to 68acd5e, on October 23, 2025 17:34
@jmarble force-pushed the feature/collection-unique-strings branch from 68acd5e to 37063c5 on October 23, 2025 17:50
Adds optimized string deduplication method to Collection and LazyCollection.
Uses array_unique(SORT_STRING) and isset() hash lookups for significant
performance improvements over unique() when working with strings.

Supports keys, closures, and nested property access.

Avoids SORT_REGULAR instability issue: php/php-src#20262