BUG: Interpolate: use stable sort #12566
Merged
Reference issue
Closes gh-12373 gh-9886
What does this implement/fix?
Calls to numpy.sort and argsort in interpolate.py now pass kind="mergesort" instead of relying on the default kind. This avoids a bug where interpolation produces incorrect (or at least very surprising) results when the user-supplied data contains duplicate x values and assume_sorted=True is not passed to the interp1d or interp2d call (so scipy performs the sort itself). The default numpy sort is unstable: even if the user data was already sorted, it may still reorder equal x values some of the time, and thus pair the interpolation target y values in an order other than the one provided. The result is also hard to predict: whether any specific point is swapped depends on the exact data passed in (the whole x array, not just the points being interpolated between).
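As a rough illustration of the property this change relies on (a sketch only, not the code in interpolate.py; the array contents and variable names are made up for the example), a stable argsort of already-sorted data with duplicate x values returns the identity permutation, so the y values keep the order the user supplied:

```python
import numpy as np

# Already-sorted x with many duplicates (each value repeated 4 times),
# paired with y values that record the original positions.
x = np.repeat(np.arange(50.0), 4)
y = np.arange(x.size, dtype=float)

# A stable sort never reorders equal keys, so on sorted input the
# permutation is the identity and y is left untouched.
ind = np.argsort(x, kind="mergesort")
x_sorted = x[ind]
y_sorted = y[ind]

print(np.array_equal(ind, np.arange(x.size)))
print(np.array_equal(y_sorted, y))
```

With the default kind there is no such guarantee for the duplicate entries, which is the behavior described above.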
Implementation: numpy has had a stable sort with np.sort(x, kind="mergesort") since at least version 1.3.x. In 1.15.0, kind="stable" was introduced as a synonym for mergesort; additionally, mergesort may use a different stable implementation under the hood, depending on the data type (but is still guaranteed stable).
For forward compatibility, it may be better to check the numpy version here and call kind="stable" if the version is >= 1.15.0. However, kind="mergesort" provides identical behavior in all current and old versions of numpy, and avoids an explicit version check.
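For reference, the version-check alternative mentioned above could look roughly like the following. This is a sketch of the option this PR does not take; stable_sort_kind is a hypothetical helper name, and the simple version parsing assumes np.__version__ starts with numeric major and minor components:

```python
import numpy as np


def stable_sort_kind():
    """Hypothetical helper: return the name of a stable sort kind for this numpy."""
    # Parse only the leading "major.minor" part of the version string.
    major, minor = (int(part) for part in np.__version__.split(".")[:2])
    # kind="stable" is accepted from numpy 1.15.0 onward; older versions
    # only know the stable sort as "mergesort".
    return "stable" if (major, minor) >= (1, 15) else "mergesort"


kind = stable_sort_kind()
x = np.repeat(np.arange(50.0), 4)
order = np.argsort(x, kind=kind)
```

Either branch yields a stable sort, which is why the PR can use kind="mergesort" unconditionally and skip the check.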
See also:
https://numpy.org/doc/stable/release/1.15.0-notes.html?highlight=numpy%20ufunc%20types#sort-functions-accept-kind-stable
Additional information