Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved logging, and gathering children in parellel #75

Merged
merged 4 commits into from Sep 14, 2020

Conversation

carver
Copy link
Contributor

@carver carver commented Sep 4, 2020

What was wrong?

  • Logs are very noisy
  • Uncaught exception log was emitted as debug, easy to miss
  • Unexpected way to gather children
  • asyncio reports slow coros as originating in async-service, making it difficult to debug

How was it fixed?

  • Remove debug logs that are happening a ton (there is no DEBUG2 here)
  • Upgrade log level to error
  • Break up line in two, comment it
  • name the task before running it Was causing a difficult bug

TODO:

  • make it friendly to py <3.8

Cute Animal Picture

put a cute animal picture link inside the parentheses

if self.daemon:
raise DaemonTaskExit(f"Daemon task {self} exited")

while self.children:
await tuple(self.children)[0].wait_done()
await asyncio.gather(*(
child.wait_done()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is an improvement.

self.children retains all of the active child tasks... As they each finish they are removed from the list. In this part of the code what we care about is not blocking the event loop while we wait for all the child tasks to complete. My intuition says that the asyncio.gather approach is going to be heavier while not actually giving us any change in behavior since it will have to orchestrate the machinery of awaiting all of the tasks rather than a single one.

Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it would require performance testing. If there are 100 children, and they happen to complete in order, then this loop would re-activate 100 times, instead of only reactivating once. Not sure how that compares to whatever overhead gather() might add.

Without performance testing, I think the gather() approach is easier to read. It's obvious what it's doing: wait for everything to complete. When looking at the current code, I have follow-up questions, like why is the current code looking at only the first one and then looping again? (It requires understanding that the children disappear from the list on their own, which I didn't know/remember).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I'll make the light suggestion that we could keep it as it was and augment with a comment to explain why it is implemented that way.

I'm only -0 on the asyncio.gather approach because of some gut-based confidence that it will be less performant in the vast majority of cases since it will create a new task for every child, run it using ensure_future which means more event loop context switching.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -334,6 +330,5 @@ async def _run_and_manage_task(self, task: TaskAPI) -> None:
else:
if task.parent is None:
self._root_tasks.remove(task)
self.logger.debug("%s: task %s exited cleanly.", self, task)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about placing these behind a library level DEBUG flag of some sort that can be enabled/disabled with an environment variable. These messages, while extremely noisy, are also really useful for debugging "what happened".

@carver carver force-pushed the quieter-logs branch 4 times, most recently from 39a5e4a to 9c96cbe Compare September 5, 2020 00:07
@carver carver marked this pull request as ready for review September 11, 2020 22:38
@carver carver changed the title Improved logging, named tasks, and gathering children in parellel Improved logging, and gathering children in parellel Sep 11, 2020
@@ -321,7 +328,7 @@ async def _run_and_manage_task(self, task: TaskAPI) -> None:
self.logger.debug("%s: task %s raised CancelledError.", self, task)
raise
except Exception as err:
self.logger.debug(
self.logger.error(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure about logging this at error level since the exception will get re-raised upon service exit via a MultiError. I'm interested in your reasoning for this.

Copy link
Contributor Author

@carver carver Sep 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I don't remember the exact scenario, but there was an error that was getting swallowed into the debug logs. I only saw it when I flipped on this logging. It's quite rare, and it is way more painful to have a real failure swallowed than it would be to have a duplicate error printed, IMO.

@carver carver merged commit e2380e9 into ethereum:master Sep 14, 2020
@carver carver deleted the quieter-logs branch September 14, 2020 22:28
pacrob added a commit to pacrob/async-service that referenced this pull request Mar 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants