Make sorting stable #5236
i = ht->nNumUsed;
/* Store original order of elements in extra space to allow stable sorting. */
for (i = 0; i < ht->nNumUsed; i++) {
    Z_EXTRA(ht->arData[i].val) = i;
Nice. :)
ec7000b to a19dabc
But call with reversed operands to make sure it continues working.
@nikic Huge thumbs up for addressing this! Did you also consider using the increasingly popular Timsort algorithm? It is stable by default with the same complexity. Now also used by Java and C#.
@mvorisek Switching to Timsort is a more intrusive change. It's a different algorithm (hybrid merge rather than hybrid quick) with different performance characteristics (and more importantly, memory usage characteristics). If you want to evaluate Timsort usage in PHP, I'm definitely interested in results.
I would love to help, but my C skills are limited. I have now done more research on this:
Usage: also the default in JS/V8 and Python
Benchmarks:
Memory complexity is
if (!ARRAYG(compare_deprecation_thrown)) {
    php_error_docref(NULL, E_DEPRECATED,
        "Returning bool from comparison function is deprecated, "
        "return an integer less than, equal to, or greater than zero");
an integer less than, equal to, or greater than zero
Isn't that a little redundant (as it basically includes all integers)? Alternatively, we could say "-1, 0 or 1".
Yes, this is intentionally redundant, to clarify the meaning of the value. I used "-1, 0 or 1" here initially, but as @hikari-no-yume pointed out on the mailing list, there is no actual requirement to return one of those values, and some functions that are reasonable to use in this context, such as strcmp(), don't return -1, 0, or 1 specifically.
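To illustrate the point about strcmp(), here is a minimal C sketch (not code from this PR) of a comparison function that forwards strcmp()'s result unchanged. The names `compare_strings` and `sign_of` are made up for this example. The C standard only promises that strcmp() returns a negative, zero, or positive value, and a sort routine only needs the sign.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reduce any comparison result to -1, 0 or 1.
 * Only the sign of a comparison result carries meaning. */
static int sign_of(int result)
{
    return (result > 0) - (result < 0);
}

/* qsort-compatible comparator for an array of C strings.
 * It returns whatever strcmp() returns, which may be any negative
 * or positive integer, not just -1 or 1. */
static int compare_strings(const void *a, const void *b)
{
    const char *const *x = a;
    const char *const *y = b;
    return strcmp(*x, *y);
}
```

Passing `compare_strings` to `qsort()` sorts correctly regardless of the magnitude strcmp() happens to return on a given platform, which is why the message describes the sign rather than three fixed values.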
Ah I see. 👍
There could be a better way to word it, but there is a risk of the message being too long…
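For context on why a boolean return value needs special handling: with a stable sort, a result of 0 means "equal, keep the original order", so a callback that returns false for "not greater" would make every smaller element look like a tie. The sketch below is an assumption about the general idea behind calling with reversed operands, not the code in this PR. `bool_compare_fn`, `is_greater`, and `three_way_from_bool` are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical legacy callback type: answers "is a greater than b?". */
typedef bool (*bool_compare_fn)(int a, int b);

static bool is_greater(int a, int b)
{
    return a > b;
}

/* Recover a three-way result from a bool-returning callback.
 *   cmp(a, b) is true                     -> a sorts after b   ( 1)
 *   cmp(a, b) is false, cmp(b, a) is true -> a sorts before b  (-1)
 *   false in both directions              -> treated as equal  ( 0) */
static int three_way_from_bool(bool_compare_fn cmp, int a, int b)
{
    if (cmp(a, b)) {
        return 1;
    }
    /* false alone is ambiguous, so ask again with the operands swapped. */
    return cmp(b, a) ? -1 : 0;
}
```

The second call costs an extra comparison, but only on the path where the callback returned false, and it keeps such callbacks producing the same order as before.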
RFC: https://wiki.php.net/rfc/stable_sorting
This makes the array sorting functions stable (mostly relevant for usort). This does not actually change the core sorting algorithm to be stable (it is still a quick sort). Instead, we store the position of the original elements and use that as a fallback comparison criterion. Due to the way hashtable sorting works, we can do this very cheaply, and without additional memory overhead.
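The fallback-criterion idea can be sketched outside of PHP with plain `qsort()`, which is not guaranteed to be stable. This is an illustration under assumptions, not the PR's implementation: the `item` struct and both function names are hypothetical, and the sketch spends an explicit `order` field per element, whereas the PR reuses spare space that already exists in each hashtable value (via `Z_EXTRA`), which is why it needs no additional memory.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type for this sketch. */
typedef struct {
    int key;      /* value the caller wants to sort by */
    char tag;     /* payload, used to observe the resulting order */
    size_t order; /* original position, filled in before sorting */
} item;

/* Compare by key first; on a tie, fall back to the original position.
 * Because every position is unique, no two elements compare equal,
 * so the underlying algorithm cannot reorder ties. */
static int item_compare_stable(const void *a, const void *b)
{
    const item *x = a;
    const item *y = b;
    if (x->key != y->key) {
        return (x->key > y->key) - (x->key < y->key);
    }
    return (x->order > y->order) - (x->order < y->order);
}

/* Record the original order, then sort with the tie-breaking comparator. */
static void stable_sort_items(item *items, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        items[i].order = i;
    }
    qsort(items, n, sizeof(item), item_compare_stable);
}
```

Sorting keys 2, 1, 2, 1 tagged a, b, c, d this way yields b, d, a, c: elements with equal keys keep their input order even though the sort underneath makes no stability promise.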