Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SortedList/SortedKeyList update/add insertion order inconsistency #158

Closed
bamartin125 opened this issue Oct 20, 2020 · 3 comments · Fixed by #159
Closed

SortedList/SortedKeyList update/add insertion order inconsistency #158

bamartin125 opened this issue Oct 20, 2020 · 3 comments · Fixed by #159

Comments

@bamartin125
Copy link
Contributor

Internally of the update() methods of SortedList and SortedKeyList, there is a fork in the logic of whether or not to add() each element of the iterator passed to update() or to extend the iterator passed to update with the elements currently stored in the SortedList/SortedKeyList object by a condition on a ratio of the count of element in the external iterator and the elements contained in the SL/SKL object.

Depending on whether or not external elements are added by add() or the SL/SKL object is reloaded from the result of extend()ing the external iterable, the order that the incoming elements end up in the resulting object can be different. This is only the case when the key to sort by is equivalent between objects being compared.

Example:

from sortedcontainers import SortedList
sl = SortedList(key=lambda x: x[0])
sl.add(('k', 0))
sl.add(('k', 1))
sl.add(('k', 2))
sl.update([('k', -1)])
sl.update([('k', 5)])
sl.add(('k', 6))
print(sl)
SortedKeyList([('k', 5), ('k', -1), ('k', 0), ('k', 1), ('k', 2), ('k', 6)], key=<function <lambda> at 0x7f9b90f6e0d0>)

For the SL/SKL and iterator size ratios applicable for the update() method to use the extension logic fork, the order of elements in how they show up in the final list is inconsistent.

I have a suggestion for a fix for this by essentially prepending the existing elements to the incoming iterator rather than appending (by extending) them. I plan to submit a pull request soon...

@bamartin125
Copy link
Contributor Author

Also, just to be clear, no-one should be dependent on any order that has been unspecified (because the object was never told to sort beyond the key given), but the behavior is inconsistent depending on internal logic, so it seems like a change isn't out of the question.

@grantjenks
Copy link
Owner

I recall seeing this behavior and I wasn’t sure if it mattered. To be clear, there’s no invariant that’s violated internally. The add() logic uses bisect_left just like the update() logic but the update() logic preserves the order of the elements it was given originally.

If you want it to be consistent then a simple method override that always calls add() seems like a workaround.

@bamartin125
Copy link
Contributor Author

I recall seeing this behavior and I wasn’t sure if it mattered. To be clear, there’s no invariant that’s violated internally. The add() logic uses bisect_left just like the update() logic but the update() logic preserves the order of the elements it was given originally.

This issue matters somewhat in my application when trying to track down other things outside of the sortedcontainers module. Being able to make assumptions on the insertion order (at least it being consistent between calls) can allow me to speed up my development as I'm making other pattern checks.

If you want it to be consistent then a simple method override that always calls add() seems like a workaround.

I agree that a workaround like you suggest would technically work, but the performance would not be a good as a single update() call.

My tests through benchmarking have shown that performance is comparable when making the change I suggest since the existing array is just being modified in a different place (beginning vs end).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants