While trying to do some shenanigans, I built a utility to simplify yield gen.multi(..., quiet_exceptions=...) for a bunch of code. One of my goals was to be able to abort a yield on the first-available error when yielding on multiple futures, rather than waiting for all futures to resolve. This can lead to a fairly significant response-time improvement when multiple errors occur and one of them is due to a lengthy timeout.
So I made a small helper with WaitIterator, and started getting errors like this in tests:
      File ".../lib/helpers.py", line 275, in parallel
        result = yield iterator.next()
      File ".../env/local/lib/python2.7/site-packages/tornado/gen.py", line 428, in next
        self._return_result(self._finished.popleft())
      File ".../env/local/lib/python2.7/site-packages/tornado/gen.py", line 445, in _return_result
        self.current_index = self._unfinished.pop(done)
    KeyError: <Future at 0x7f2fe2eedd10 state=finished returned MyEntity>
After a bit of hunting, I narrowed it down to this (simplified pieces of WaitIterator):
    class WaitIterator(object):
        def __init__(self, *args, **kwargs):
            if args and kwargs:
                raise ValueError(
                    "You must provide args or kwargs, not both")

            if kwargs:
                self._unfinished = dict((f, k) for (k, f) in kwargs.items())
                futures = list(kwargs.values())
            else:
                # note that this is a dictionary keyed off items in args
                self._unfinished = dict((f, i) for (i, f) in enumerate(args))
                # while this is a list
                futures = args

            # and this is also a list
            self._finished = collections.deque()
            self.current_index = self.current_future = None
            self._running_future = None

            for future in futures:
                future.add_done_callback(self._done_callback)

        def next(self):
            self._running_future = TracebackFuture()

            if self._finished:
                # pops off a future from the list
                self._return_result(self._finished.popleft())

            return self._running_future

        def _return_result(self, done):
            chain_future(done, self._running_future)

            self.current_future = done
            # and this removes the *single* future-key that matches
            self.current_index = self._unfinished.pop(done)
This crash can be demonstrated with code like this:
In a nutshell, we have some parallel calls that we've mocked to return the same Future. This results in a single future being in the list multiple times, which gets deduplicated in the _unfinished dictionary, so the second duplicate that's finished errors with a KeyError.
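The deduplication described above can be illustrated without Tornado at all. The following is only a sketch under stated assumptions: it uses the standard library's concurrent.futures.Future as a stand-in for a Tornado future, and it models the queue of completed futures with a plain list rather than running WaitIterator itself.

```python
from concurrent.futures import Future

# Assumption: a stdlib Future stands in for the Tornado future that a
# mock would hand back to several "parallel" calls.
shared = Future()
shared.set_result("MyEntity")
args = (shared, shared)

# Same construction as the simplified WaitIterator.__init__ above.
unfinished = dict((f, i) for (i, f) in enumerate(args))
assert len(unfinished) == 1  # two positions collapsed into a single key

# A done-callback is registered once per position, so the same future
# is queued as finished twice. A plain list models that queue here.
finished = list(args)

# The first pop succeeds; the later index overwrote the earlier one.
first_index = unfinished.pop(finished[0])

# The second pop has no key left to remove.
try:
    unfinished.pop(finished[1])
    second_pop_failed = False
except KeyError:
    second_pop_failed = True
```

The second pop is the step that corresponds to the KeyError in the traceback above.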
This isn't actually breaking anything currently, but it strikes me as a potential landmine, and would've broken some experiments I've been planning. The workaround for users like me is to dedup manually / wrap everything in a new Future / etc, which I can do, but this was at least surprising and took some time to hunt down.
IMO this needs one of two things. Both seem fine to me:
1. Don't convert to a dictionary like this, keep both as lists. _unfinished.pop(_unfinished.index(done)) in _return_result wouldn't have this problem.
   I personally like this. Parallel yields are likely to be relatively small quantities, there's a decent chance that it'll perform better in most cases (at least, in most languages - small list scanning and indexing often out-performs hashing). It also lets WaitIterator return whatever was passed in, regardless of what it was given, which is what I expected.
2. Document it. This is a pretty low-level tool, it shouldn't under any circumstances be surprising people who haven't read the source in detail. At the very least this isn't expected behavior from reading the docs, since it allows passing in a list and not only sets/dicts.
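A rough sketch of the list-based option, for illustration only. It is a variation on the one-liner suggested above, not Tornado code: with a bare list of futures, pop(index(done)) would return the future itself rather than its original position, so this version assumes the list stores (key, future) pairs. The helper name pop_first_match is hypothetical, and concurrent.futures.Future stands in for a Tornado future.

```python
from concurrent.futures import Future


def pop_first_match(unfinished, done):
    """Remove the first (key, future) pair whose future is `done`; return its key.

    Hypothetical helper: a linear scan over a list of pairs, so a
    duplicated future simply matches one entry per call.
    """
    for position, (key, future) in enumerate(unfinished):
        if future is done:
            del unfinished[position]
            return key
    raise KeyError(done)


shared, other = Future(), Future()

# Positional arguments become (index, future) pairs; duplicates are kept.
unfinished = list(enumerate((shared, other, shared)))

# Each completion removes exactly one entry, in insertion order.
keys = [pop_first_match(unfinished, f) for f in (shared, shared, other)]
```

Each lookup here is a linear scan, which is cheap for a handful of futures but grows with the number being waited on.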
I can probably get a pull request up if it'd help, but I haven't yet looked into contributing here, and it seems like it'd be a pretty small change either way. And it's a bit esoteric, so I figured it needed some discussion to fit it in best with existing code :) Let me know what you think!
I see WaitIterator as being usable for fairly large quantities, so I wouldn't want to change it to a form that would perform poorly for larger sets. I would also like to support reusing futures (one of the reasons Tornado's Futures don't support cancellation in the same way as asyncio Futures is that I want to support caching and other abstractions like what I think you're contemplating here), so I don't want to just document the limitation.
I think the best solution is to make _unfinished a multimap, allowing multiple keys for the same future.
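One way the multimap idea could look, as a minimal sketch rather than the actual Tornado change: each future maps to a queue of the keys it was passed under, and every completion consumes one key. The class name KeyMultimap and its methods are hypothetical, and concurrent.futures.Future is assumed as a stand-in for a Tornado future.

```python
import collections
from concurrent.futures import Future


class KeyMultimap(object):
    """Hypothetical future -> keys multimap for tracking unfinished futures."""

    def __init__(self, *args, **kwargs):
        if args and kwargs:
            raise ValueError("You must provide args or kwargs, not both")
        items = kwargs.items() if kwargs else enumerate(args)
        self._unfinished = {}
        for key, future in items:
            # A repeated future accumulates keys instead of overwriting.
            self._unfinished.setdefault(future, collections.deque()).append(key)

    def pop(self, done):
        """Return the oldest remaining key for `done`, dropping empty entries."""
        keys = self._unfinished[done]
        key = keys.popleft()
        if not keys:
            del self._unfinished[done]
        return key

    def done(self):
        return not self._unfinished


shared, other = Future(), Future()
tracker = KeyMultimap(shared, other, shared)

first = tracker.pop(shared)   # key of the first occurrence
second = tracker.pop(shared)  # key of the duplicate, no KeyError
third = tracker.pop(other)
```

Lookups stay hash-based, so the cost per completion does not depend on how many futures are being waited on, while a reused future still yields one key per position it was passed in.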