Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make logging of RPC retries configurable #2486

Merged
merged 2 commits into from
Aug 13, 2018

Conversation

ulzha
Copy link
Contributor

@ulzha ulzha commented Aug 10, 2018

Motivation and Context

Logging of intermittent RPC exceptions and retries as they happen makes logs spammy and obscures other errors, so we want a way to disable it. If the exceptions persist, the final RPCError does log the last exception anyway.

Especially when long running (hanging, even) tasks are considered, with periods of silence in the logs, any intermittent "Failed connecting to remote scheduler" logging from KeepAliveThread looks very flashy and misleads troubleshooting often, in our experience.

The health of communication channels (presence of intermittent failures) probably should be easy to monitor through dedicated tooling, as opposed to logging it interspersed with task logs.

Have you tested this? If so, how?

Ought to be covered with unit tests.

This simplifies away unnecessary propagation, cleans up argument lists.

As far as I can tell, log_exceptions is True in all execution paths. It seems that 6dfe9af removed the last occasion of it being (spuriously) set to False.
Copy link
Contributor

@Tarrasch Tarrasch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

@@ -116,31 +116,34 @@ def __init__(self, url='http://localhost:8082/', connect_timeout=None):

self._rpc_retry_attempts = config.getint('core', 'rpc-retry-attempts', 3)
self._rpc_retry_wait = config.getint('core', 'rpc-retry-wait', 30)
self._rpc_log_retries = config.getboolean('core', 'rpc-log-retries', True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this part of the code still doesn't use the config class syntax? Oh well maybe better done in a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, better be a separated PR.

try:
self.get_work(fetch_results)
self.fail("get_work should have thrown RPCError")
except luigi.rpc.RPCError as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't you use self.assertRaises?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done to assert about e.sub_exception.

@ulzha ulzha changed the title (WIP) Make logging of RPC retries configurable Make logging of RPC retries configurable Aug 13, 2018
@ulzha ulzha merged commit c696f40 into spotify:master Aug 13, 2018
@ulzha ulzha deleted the ulzha/rpc-log-retries branch August 13, 2018 11:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants