array_unique() with SORT_REGULAR returns duplicate values #20262

@jmarble

Description

The following code:

<?php
$units = ['5', '10', '5', '3A', '5', '5'];
$unique = array_unique($units, SORT_REGULAR);
print_r($unique);

Resulted in this output:

Array
(
    [0] => 5
    [1] => 10
    [3] => 3A
    [4] => 5
)

But I expected this output instead:

Array
(
    [0] => 5
    [1] => 10
    [3] => 3A
)

Demonstrations:


PRs in progress:


Root Cause

From analyzing PHP source code (ext/standard/array.c, Zend/zend_operators.c):

The algorithm:

  1. Sort array using zendi_smart_strcmp() which calls is_numeric_string_ex()
  2. Walk through sorted array comparing only adjacent elements
  3. Delete duplicates from original array
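
The three steps can be modelled in userland. This is only a sketch: `sortAndDedupe()` is a hypothetical helper written for this report, not the engine's code, and it assumes `usort()` with `<=>` orders the values the same way the internal sort does.

```php
<?php
// Hypothetical userland model of the three steps; not the engine's actual code.
function sortAndDedupe(array $input, callable $cmp): array
{
    // Step 1: sort [key, value] pairs by value with the supplied comparator.
    $pairs = [];
    foreach ($input as $key => $value) {
        $pairs[] = [$key, $value];
    }
    usort($pairs, fn(array $a, array $b): int => $cmp($a[1], $b[1]));

    // Step 2: walk the sorted pairs, comparing each one only to the last kept pair.
    $result = $input;
    $lastKept = null;
    foreach ($pairs as $pair) {
        if ($lastKept !== null && $cmp($lastKept[1], $pair[1]) === 0) {
            // Step 3: delete the duplicate from the copy of the original array.
            unset($result[$pair[0]]);
            continue;
        }
        $lastKept = $pair;
    }
    return $result;
}

$units = ['5', '10', '5', '3A', '5', '5'];
$regular = sortAndDedupe($units, fn($a, $b): int => $a <=> $b); // SORT_REGULAR-like
$string  = sortAndDedupe($units, 'strcmp');                     // SORT_STRING-like
print_r($regular);
print_r($string);
```

With the `strcmp` comparator the model keeps exactly three values. With `<=>` it can keep a fourth, depending on the order the sort settles on, because equal values are only removed when they end up next to each other.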

The bug:

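is_numeric_string_ex() decides whether each operand is a numeric string, and zendi_smart_strcmp() compares numerically only when both operands are numeric. Otherwise it falls back to a byte-wise string comparison:

  • "5" vs "10" → both numeric, compared as numbers: "5" < "10"
  • "10" vs "3A" → "3A" is not numeric, compared as strings: "10" < "3A"
  • "3A" vs "5" → "3A" is not numeric, compared as strings: "3A" < "5"
  • Result: "5" < "10" < "3A" < "5", a cycle, so the ordering is not transitive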
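
The pairwise comparisons can be checked from userland. A minimal sketch, assuming `<=>` applies the same rules to two strings as SORT_REGULAR does (both appear to go through the engine's generic comparison): "5" and "10" are numeric strings and compare as numbers, while "3A" is not a numeric string, so any comparison involving it is done byte-wise.

```php
<?php
// Each pair is compared left to right; a negative result means "left sorts first".
$pairs = [['5', '10'], ['10', '3A'], ['3A', '5']];
$results = [];
foreach ($pairs as [$a, $b]) {
    $results[] = $a <=> $b;
    printf("%-4s <=> %-4s = %d\n", "\"$a\"", "\"$b\"", $a <=> $b);
}
```

All three comparisons should come out negative, which means "5" < "10" < "3A" < "5". No sort can place these values in an order that satisfies every comparison, so equal values are not guaranteed to end up adjacent.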

However, the sort leaves the array in this order (consistent with the output above):

Sorted: ["5", "5", "10", "3A", "5", "5"]

The "3A" ends up after "10" but before the last two "5" values, separating the duplicate "5" values.

The deduplication walks through comparing adjacent elements:

lastkept = position_0;  // "5"
position_1 "5" == "5" → delete
position_2 "10" != "5" → keep, lastkept = position_2
position_3 "3A" != "10" → keep, lastkept = position_3
position_4 "5" != "3A" → keep ← Bug! Never compared to position_0
position_5 "5" == "5" → delete

The flaw: The algorithm only compares with lastkept (last unique value), not with all previous values. Position 4's "5" is never compared back to position 0's "5".

Source files:

  • ext/standard/array.c - PHP_FUNCTION(array_unique)
  • Zend/zend_operators.c - zendi_smart_strcmp(), is_numeric_string_ex()

Comparison with SORT_STRING

<?php
$units = ['5', '10', '5', '3A', '5', '5'];
echo count(array_unique($units, SORT_REGULAR)) . "\n"; // 4 ✗ Wrong
echo count(array_unique($units, SORT_STRING)) . "\n";  // 3 ✓ Correct

SORT_STRING uses lexical comparison without numeric extraction, so duplicates stay grouped.
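
A short check of that grouping, using only `sort()` and `array_unique()` with SORT_STRING on the array from this report:

```php
<?php
$units = ['5', '10', '5', '3A', '5', '5'];

// Byte-wise ordering is consistent, so all "5" values end up adjacent.
$sorted = $units;
sort($sorted, SORT_STRING);
print_r($sorted); // 10, 3A, 5, 5, 5, 5

// array_unique keeps the first occurrence of each value with its original key.
$unique = array_unique($units, SORT_STRING);
print_r($unique); // keys 0, 1, 3
```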


Workaround

For simple arrays of scalar values, you can use array_unique with the default SORT_STRING flag.

<?php
$unique = array_unique($array, SORT_STRING);

For arrays of arrays or objects, deduplicate with a loop instead:

$uniqueAddr = [];
foreach ($addresses as $addr) {
    if (! in_array($addr, $uniqueAddr)) {
        $uniqueAddr[] = $addr;
    }
}
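
The same loop can be wrapped in a reusable function that also preserves the original keys, as array_unique does. This is a sketch: `uniqueByLooseEquality()` is a hypothetical name, and it compares every item against all kept items, so it is O(n²) and only suitable for small arrays.

```php
<?php
// Hypothetical helper: keeps the first occurrence of each value under its original key.
function uniqueByLooseEquality(array $items): array
{
    $unique = [];
    foreach ($items as $key => $item) {
        // in_array() checks against every kept value, not just the last one.
        if (!in_array($item, $unique)) {
            $unique[$key] = $item;
        }
    }
    return $unique;
}

$units = ['5', '10', '5', '3A', '5', '5'];
$uniqueUnits = uniqueByLooseEquality($units);
print_r($uniqueUnits);

$rows = [['a' => 1], ['a' => 1], ['a' => 2]];
$uniqueRows = uniqueByLooseEquality($rows);
print_r($uniqueRows);
```

Note that in_array() uses loose comparison by default, so numeric strings such as '5' and '5.0' count as equal. Pass true as its third argument if strict comparison is wanted.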

PHP Version

PHP 8.4.13 (cli) (built: Sep 26 2025 00:45:36) (NTS clang 15.0.0)
Copyright (c) The PHP Group
Built by Laravel Herd
Zend Engine v4.4.13, Copyright (c) Zend Technologies
    with Zend OPcache v8.4.13, Copyright (c), by Zend Technologies
