-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
Undefined behavior for deque.insert() when len(d) == maxlen #70382
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The behavior of deque.insert() in a bounded deque is a bit odd: >>> from collections import deque
>>> d = deque(range(20), maxlen=10)
>>> d
deque([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], maxlen=10)
>>> d.insert(3, 'New')
>>> d
deque([19, 10, 11, 'New', 13, 14, 15, 16, 17, 18], maxlen=10)
# ^ ^ ^--- The 12 got replaced
# | \--- The elements before the insertion point got rotated
# \--- The final element got rotated to first position With append/appendleft/extend/extendleft, the intended and useful behavior for maxlen is to pop from the opposite end. But for deque.insert(), the best and most useful behavior invariants are less obvious. Ideally, the effect of d.insert(0, item) would be the same as d.appendleft(item), and the effect of d.insert(len(d), item) would be the same as d.append(item). Also, d.insert(i, newitem) should have the post-condition: assert d[i] == newitem if i < len(d) else d[-1] == newitem Having decided where to put the newitem, the question turns to deciding which element should be popped-off to maintain the size invariant. I'm thinking that it should always be the rightmost element that gets popped, so that items before the inserted element never get moved. This matches what insert() does for unbounded deques, making it the least surprising choice. |
Mark, Serhiy and Timmy, do you have any thoughts on what is the right behavior? One option is to always pop the rightmost element to make room, but that results in a weird asymmetry between d.insert(len(d), item) and what d.append(item) would do. I can't seem to come with a coherent set of invariants that doesn't have a surprising discontinuity. Another option is to "refuse the temptation to guess" at what the user intends for the popped-off element to be and to raise an exception. But that isn't very user friendly either. |
Another take: Do a regular insertion with no special cases, followed by a pop from the right: >>> for i in range(len(d) + 1):
s = list(d)
s.insert(i, None)
_ = s.pop()
print(i, s)
0 [None, 'a', 'b']
1 ['a', None, 'b']
2 ['a', 'b', None]
3 ['a', 'b', 'c'] Nice parts:
Not nice part:
Another alternative is to raise an exception for the one case where index==maxlen, with the theory being that d[index] won't be a valid index so you can't insert anything there. |
The plain-insert-followed-by-a-pop-on-the-right is giving reasonable results with nice properties: d.insert(0, i) 0 --> deque([0], maxlen=4) d.insert(len(d), i) |
New changeset ce3d47eaeb21 by Raymond Hettinger in branch '3.5': |
New behavior looks less surprising to me. Old behavior is even more weird for negative indices: >>> from collections import deque
>>> d = deque(range(20), maxlen=10)
>>> d
deque([10, 11, 12, 13, 14, 15, 16, 17, 18, 19], maxlen=10)
>>> d.insert(-3, 'New')
>>> d
deque([11, 12, 13, 14, 15, 16, 'New', 18, 19, 10], maxlen=10) Note that new element not just replaced the old one in the middle of the deque, but all context was rotated one position left. Patched code behave less surprising. >>> d.insert(-3, 'New')
>>> d
deque([10, 11, 12, 13, 14, 15, 'New', 16, 17, 18], maxlen=10) |
But the postcondition d[i] == newitem is broken if i is negative. >>> d = deque('ABC', maxlen=3)
>>> d.insert(-1, None)
>>> d
deque(['A', None, 'B'], maxlen=3) I would expected ['A', 'B', None]. |
Hum, it seems consistent with list behaviour, no? >>> x=list("abc")
>>> x.insert(-1, "X")
>>> x
['a', 'b', 'X', 'c'] -1 is the same than len(x)-1 (2 in this example). deque(['A', None, 'B'], maxlen=3) seems correct to me. |
-1 is the position before the last element ('C'). After popping-off extra element the result can be ['A', 'B', None] or ['B', None, 'C']. |
Oh ok :-) Now I'm confused, I don't know what is the expected behaviour :-) |
The expected behavior should be "insert normally as if the deque were unbounded and then pop-off the rightmost element to restore the maxlen invariant". The applied fix does that for non-negative indices but gives the wrong result for negative indicies: >>> from collections import deque
>>> d = deque('abcde', maxlen=5)
>>> d.insert(-1, 'f')
>>> list(d)
['a', 'b', 'c', 'f', 'd']
>>> s = list('abcde')
>>> s.insert(-1, 'f'); del s[-1]
>>> s
['a', 'b', 'c', 'd', 'f'] I think the behavior can be made explainable and also be useful for common cases, but there is likely no getting around odd looking results with negative index insertions into bounded deques already at their maximum size. |
I'd raise an exception when trying to insert into a bounded deque that's already full. There's simply no way to guess what was _intended_; it's dead easy for the user to implement what they _do_ intend (first make room by deleting the specific item they no longer want); and I can't think of a use case compelling enough to justify whatever arbitrary non-exceptional behavior may be implemented instead. WRT the behavior you settled on, sure, it's explainable. That doesn't imply it's useful, though ;-) I'd rather have an exception. It's plain bizarre that after d.insert(i, x) one can't even rely on
succeeding. Implementing behavior that allows that invariant to fail is _really_ user-unfriendly ;-) In contrast, what .append() and .appendleft() do for a full bounded deque are compelling (and don't violate the weak invariant above). |
What if implement the bahavior described by Raymond for index -len(d) <= i < len(d), but raise an exception if the index is out of this range? The result of deque('abc', maxlen=3).insert(i, 'X') would be: -4: error |
My opinion doesn't change: I'd rather see an exception. I see no use case for inserting "into the middle" of a full bounded queue. If I had one, it would remain trivial to force the specific behavior I intended. |
My only aversion to raising an exception is that it goes against the original intent of maxlen as giving an automatic-pop-to-make-room behavior. Rather that introduce a discontinuity, I like the "smoothness" of letting d.insert(len(d), obj) follow the insert-normally-then-pop-from-the-right rule as opposed to having a special case. |
I agree with Tim that arbitrarily deciding that insert should behave more like appendleft is surprising. This really feels like guessing at what the user wants, silently doing something that really isn't predictable. Any of the following seem nicer (not exactly arguing for any but #1):
|
full_deque2.diff LGTM. But I have added minor suggestions on Rietveld. |
Since slicing is going to be added to deques, it is worth thinking about what the behavior should be there as well: d = deque('abcde', maxlen=7)
d[3:4] = 'efghijklm' Side note: repetition behaves like extend() using maxlen to decide how much to truncate from the left: >>> from collections import deque
>>> d = deque('abcde', maxlen=8)
>>> d *= 2
>>> d
deque(['c', 'd', 'e', 'a', 'b', 'c', 'd', 'e'], maxlen=8) Ideally, the deque API should be coherent and simple taken as a whole so that the behaviors are consistent and predictable as possible even if the general rules lead to some oddities in some uncommon cases. One such general rule could be: "When maxlen is defined, operations proceed as if the deque were unbounded and then truncation is applied to the left" (this is like the rule in decimal where inputs to operations are treated as precise and context rounding is applied after the operation). The general rule would fit a mental models of "inserted new data is prioritized over old data", that "maxlen is all about automatically evicting data to make room for the newer data", and that "methods without 'left' in their name correspond to newest data on the right and next to be evicted on the left". |
*All* reasons look reasonable. May be discuss this on Python-Dev or ask BDFL? From my point it looks that correct implementation of the insertion is too complex and this feature should go only in development release (if any). |
New changeset 7dabb94bdf63 by Raymond Hettinger in branch '3.5': |
FYI Raymond, your revision 0731f097157b didn’t actually merge the 3.5 branch (It only had one parent). Also, the default branch never got any NEWS entry in 58266f5101cc, so maybe it needs to be added? |
Maybe it's also worth to document the change in Python 3.5 and 3.6 documentation with a ".. versionchanged::"? It can be suprising to get a different behaviour between two bugfix versions (Python 3.5.1 and 3.5.2). I'm not saying that we must not fix bugs, just that it helps users to document them :-) It will also avoid bug reports about this behaviour change. Example of user report: issue bpo-26260, "it works on Python 2 but no more on Python 3" ;-) |
Reopened. Otherwise Martin's comments can be lost in spam.
I don't think this is needed. This isn't new feature, this is just a bugfix, old behavior obviously was incorrect. We don't add the versionchanged directive for every bugfix. |
Ok no problem, it was just a suggestion. In asyncio, we try to document behaviour change but asyncio is different. |
[Martin]
No need for a news entry in default. We haven't released 3.6 yet, so the bugfix from 3.5 is just a carry forward of the corrected behavior. That isn't news ;-) [Martin]
I'm not sure I see what was missed here (I am not a mercurial jedi). My 3.5 checkout shows the change and so does the 3.6. What needs to be done at this point ? [Serhiy]
I concur with Serhiy. |
NEWS entries for 3.5.2 are not included in the NEWS file for 3.6. Usually we add bugfix NEWS entries in all branches that take the fix.
It is too late to do anything except add a NEWS entry. Branches were merged by Martin when his applied next patch. |
I thought the 3.6.0 NEWS file generally listed bugs fixed since (at least) the 3.5.0 release. But the NEWS in default has no mention of bpo-26194. Serhiy is right that there is nothing more to do for the merging problem. I was just pointing out that something in your “merge” commit must have not worked properly. Normally a merge looks like revision 58266f5101cc (there are two parents listed). But in this case I pulled from default expecting to get all the 3.5 changes as well, but never got your latest 3.5 change until I pulled it explicitly. |
I can't say that I agree with putting an additional entry the 3.6 NEWS. In the past, I've not made news entries in unreleased versions regarding bug fixes in prior versions. That seems pointless to me and it would tend to clutter a document whose size already gets in the way of its usability. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: