Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make LoggingThread recover on all errors #106

Closed

Conversation

rohanpm
Copy link
Member

@rohanpm rohanpm commented Jan 4, 2019

Previously, LoggingThread would recover from any XML-RPC fault,
but would stop when any other type of exception was encountered.
That is a problem, as it means the worker will permanently give
up sending messages to the hub when all kinds of temporary
issues occur (e.g. a temporary network disruption between worker
and hub). The task underneath may continue running for hours,
with all log messages being discarded.

Given the nature of this thread, it makes more sense to attempt
recovering from all kinds of errors, as we should try hard not
to lose log messages from a task.

Fixes #60

Previously, LoggingThread would recover from any XML-RPC fault,
but would stop when any other type of exception was encountered.
That is a problem, as it means the worker will permanently give
up sending messages to the hub when all kinds of temporary
issues occur (e.g. a temporary network disruption between worker
and hub). The task underneath may continue running for hours,
with all log messages being discarded.

Given the nature of this thread, it makes more sense to attempt
recovering from *all* kinds of errors, as we should try hard not
to lose log messages from a task.

Fixes release-engineering#60
@coveralls
Copy link

Pull Request Test Coverage Report for Build 122

  • 0 of 1 (0.0%) changed or added relevant line in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.008%) to 52.7%

Changes Missing Coverage Covered Lines Changed/Added Lines %
kobo/worker/logger.py 0 1 0.0%
Totals Coverage Status
Change from base Build 121: -0.008%
Covered Lines: 3123
Relevant Lines: 5926

💛 - Coveralls

Copy link
Member

@ralphbean ralphbean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

My only worry is that the previous behavior may actually matter in some scenarios where self._running is never correctly set to False - but an exception to exit the loop is unexpectedly needed. Any reason to think that might be the case?

@alexandrevicenzi
Copy link
Contributor

@rohanpm could you add a test case for this change? otherwise LGTM.

@rohanpm
Copy link
Member Author

rohanpm commented Jan 22, 2019

Yep, coveralls complained that coverage has decreased, I did plan to address that before moving ahead with this. It might give some more insight into Ralph's concern as well.

@PatrikKopriva
Copy link

Hi @rohanpm, do you plan to continue with this pull request?

@rohanpm
Copy link
Member Author

rohanpm commented Jul 25, 2019

Yes, I think so. Not sure when.

@rohanpm
Copy link
Member Author

rohanpm commented Mar 8, 2020

This still makes sense, but it doesn't look like I'm interested in pushing this forward now. I'll close the PR for cleaner dashboards. Will reopen if I come back to it.

@rohanpm rohanpm closed this Mar 8, 2020
crungehottman added a commit to crungehottman/kobo that referenced this pull request Feb 6, 2024
Previously, LoggingThread would recover from any XML-RPC fault, but
would stop when any other type of exception was encountered.
That is a problem, as it means the worker will permanently give
up sending messages to the hub when all kinds of temporary
issues occur (e.g. a temporary network disruption between worker
and hub). The task underneath may continue running for hours,
with all log messages being discarded.

Given the nature of this thread, it makes more sense to attempt
recovering from all kinds of errors, as we should try hard not
to lose log messages from a task.

Fixes release-engineering#60

(This commit is a reimplementation of
release-engineering#106)
lzaoral pushed a commit to lzaoral/kobo that referenced this pull request Mar 12, 2024
Previously, LoggingThread would recover from any XML-RPC fault, but
would stop when any other type of exception was encountered.
That is a problem, as it means the worker will permanently give
up sending messages to the hub when all kinds of temporary
issues occur (e.g. a temporary network disruption between worker
and hub). The task underneath may continue running for hours,
with all log messages being discarded.

Given the nature of this thread, it makes more sense to attempt
recovering from all kinds of errors, as we should try hard not
to lose log messages from a task.

Fixes release-engineering#60

(This commit is a reimplementation of
release-engineering#106)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

LoggingThread should recover from temporary outage on hub
5 participants