New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved logging, and gathering children in parellel #75
Conversation
async_service/asyncio.py
Outdated
if self.daemon: | ||
raise DaemonTaskExit(f"Daemon task {self} exited") | ||
|
||
while self.children: | ||
await tuple(self.children)[0].wait_done() | ||
await asyncio.gather(*( | ||
child.wait_done() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is an improvement.
self.children
retains all of the active child tasks... As they each finish they are removed from the list. In this part of the code what we care about is not blocking the event loop while we wait for all the child tasks to complete. My intuition says that the asyncio.gather
approach is going to be heavier while not actually giving us any change in behavior since it will have to orchestrate the machinery of awaiting all of the tasks rather than a single one.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it would require performance testing. If there are 100 children, and they happen to complete in order, then this loop would re-activate 100 times, instead of only reactivating once. Not sure how that compares to whatever overhead gather()
might add.
Without performance testing, I think the gather()
approach is easier to read. It's obvious what it's doing: wait for everything to complete. When looking at the current code, I have follow-up questions, like why is the current code looking at only the first one and then looping again? (It requires understanding that the children disappear from the list on their own, which I didn't know/remember).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 I'll make the light suggestion that we could keep it as it was and augment with a comment to explain why it is implemented that way.
I'm only -0 on the asyncio.gather
approach because of some gut-based confidence that it will be less performant in the vast majority of cases since it will create a new task for every child, run it using ensure_future
which means more event loop context switching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's the implementation of asyncio.gather
btw if you're curious
@@ -334,6 +330,5 @@ async def _run_and_manage_task(self, task: TaskAPI) -> None: | |||
else: | |||
if task.parent is None: | |||
self._root_tasks.remove(task) | |||
self.logger.debug("%s: task %s exited cleanly.", self, task) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you think about placing these behind a library level DEBUG flag of some sort that can be enabled/disabled with an environment variable. These messages, while extremely noisy, are also really useful for debugging "what happened".
39a5e4a
to
9c96cbe
Compare
08a6898
to
e04128c
Compare
e04128c
to
d796bd1
Compare
@@ -321,7 +328,7 @@ async def _run_and_manage_task(self, task: TaskAPI) -> None: | |||
self.logger.debug("%s: task %s raised CancelledError.", self, task) | |||
raise | |||
except Exception as err: | |||
self.logger.debug( | |||
self.logger.error( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure about logging this at error level since the exception will get re-raised upon service exit via a MultiError
. I'm interested in your reasoning for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I don't remember the exact scenario, but there was an error that was getting swallowed into the debug logs. I only saw it when I flipped on this logging. It's quite rare, and it is way more painful to have a real failure swallowed than it would be to have a duplicate error printed, IMO.
* new issue and pr templates
What was wrong?
asyncio reports slow coros as originating in async-service, making it difficult to debugHow was it fixed?
name the task before running itWas causing a difficult bugTODO:
Cute Animal Picture