Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix confusing error if compression not available #3742

Conversation

necaris
Copy link
Contributor

@necaris necaris commented Apr 26, 2020

Currently, if a compression method isn't available on the client, it raises a KeyError with the name of the method used (e.g. lz4). This updates the decompress function to raise InvalidCompressionMethodError when that happens, which should be easier to interpret.

It also adds an obtrusive "Danger!" warning when connecting to a scheduler if the client doesn't support the scheduler's default compression method.

I would appreciate some pointers as to how to make this configurable to throw an error (see note in the diff) which I think might be a good better-safe-than-sorry behavior. I'm also not sure how best to test this -- thanks in advance for the help!

Copy link
Member

@mrocklin mrocklin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this @necaris . Some rambling comments below.

distributed/protocol/compression.py Outdated Show resolved Hide resolved
distributed/protocol/compression.py Outdated Show resolved Hide resolved
DANGER! The scheduler's default compression method is not available
in this client! You may not be able to receive large result sets!
"""

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of calling out specific library versions that may be particularly concerning.

If we do this though then we might want to be more comprehensive. The compression mismatch can cause a serious issue if any two components (clients, schedulers, or workers) disagree with each other, not just the client/scheduler relationship. In an ideal world two peers would do a little handshake in order to determine which compression to use. We're not there yet today though.

Short term though, if we want to put this somewhere I wonder if we want it to be in versions.py::error_message. I haven't actually dug into this yet, so please don't rush and change a bunch of things. This is mostly a question.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be totally fine putting it in versions::error_message for the moment, and rely on the Invalid compression method message for the user to recognize the error even if they aren't paying attention to the warning. I'd like to note that while I understand the warning is shown by default, it's really easy to skim over, especially when most of the mismatches are likely to be minor version issues that probably won't cause problems. So I'd ideally like to have some way to fail fast on connecting, if an error is likely.

I'll hold off on making changes for now, though -- thank you for taking a look!

Currently, if a compression method isn't available on the client, it
raises a `KeyError` with the name of the method used (e.g. `lz4`). This
updates the decompress function to raise `InvalidCompressionMethodError`
when that happens, which should be easier to interpret.
...if the client doesn't support the scheduler's default compression
method.
...in favor of `ValueError`
@necaris necaris force-pushed the fix/confusing-keyerror-if-compression-not-available branch from 95a4d4e to 32b6cad Compare April 29, 2020 23:42
@necaris
Copy link
Contributor Author

necaris commented Jul 3, 2020

Closing in favor of #3936 which addresses the more general case of mismatched libraries / Python versions.

@necaris necaris closed this Jul 3, 2020
@necaris necaris deleted the fix/confusing-keyerror-if-compression-not-available branch July 3, 2020 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants