New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop setting CAS upload deadline #6433

Open
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
2 participants
@illicitonion
Copy link
Contributor

illicitonion commented Aug 31, 2018

It turns out this deadline applies to the whole stream, not per chunk,
so is completely useless.

We can re-add something more clever if we find a need.

@illicitonion illicitonion requested review from stuhood , dgassaway and blorente Aug 31, 2018

@@ -276,8 +273,7 @@ def register_bootstrap_options(cls, register):
default=DEFAULT_EXECUTION_OPTIONS.remote_store_chunk_bytes,
help='Size in bytes of chunks transferred to/from the remote file store.')
register('--remote-store-chunk-upload-timeout-seconds', type=int, advanced=True,
default=DEFAULT_EXECUTION_OPTIONS.remote_store_chunk_upload_timeout_seconds,
help='Timeout (in seconds) for uploads of individual chunks to the remote file store.')
default=0, help='Deprecated and ignored.')

This comment has been minimized.

@stuhood

stuhood Aug 31, 2018

Member

Would be good to include a target removal date here, so that it's more obvious when it can actually be removed.

This comment has been minimized.

@illicitonion

illicitonion Sep 3, 2018

Contributor

Kind of tempted to just remove the flag, as I'd be surprised if anyone was actually setting it... Any strong objection? :)

This comment has been minimized.

@stuhood

stuhood Sep 4, 2018

Member

Nope.

illicitonion added some commits Aug 31, 2018

Stop setting CAS upload deadline
It turns out this deadline applies to the whole stream, not per chunk,
so is completely useless.

We can re-add something more clever if we find a need.

@illicitonion illicitonion force-pushed the twitter:dwagnerhall/fsutil/deadline branch from 7452217 to f616d7b Sep 5, 2018

@illicitonion

This comment has been minimized.

Copy link
Contributor

illicitonion commented Sep 7, 2018

This is currently blocked because grpcio doesn't have an obvious way of erroring out if a remote host can't be DNS resolved.

We can try to hack around this by manually probing or something, or racing all operations with timeouts, or possibly switching to Tower will help us out. I'll investigate some time soon-ish.

@illicitonion

This comment has been minimized.

Copy link
Contributor

illicitonion commented Sep 7, 2018

Looks like this may be related to grpc/grpc#14170 or similar, but the grpcio version of grpc is pretty old (~1 year), so this is hard to cherry-pick in.

I'm going to punt on this until we switch to Tower, which has an explicit TCP connection future which may fail in a useful way. In the mean time, we can bump the default to be large enough that we'll never hit this error, at the expense of never failing if someone passes a bad DNS address

illicitonion added a commit to twitter/pants that referenced this pull request Sep 7, 2018

Extend fs_util deadline to 30 minutes
This deadline is really only in place because otherwise DNS failures
leave this hanging forever.

Make fs_util have a very long deadline (because it's not configurable,
like it is inside pants) until we switch to Tower (where we can more
carefully control specific components of timeouts).

See pantsbuild#6433 for more context.

illicitonion added a commit to twitter/pants that referenced this pull request Sep 10, 2018

Extend fs_util deadline to 30 minutes
This deadline is really only in place because otherwise DNS failures
leave this hanging forever.

Make fs_util have a very long deadline (because it's not configurable,
like it is inside pants) until we switch to Tower (where we can more
carefully control specific components of timeouts).

See pantsbuild#6433 for more context.

illicitonion added a commit that referenced this pull request Sep 10, 2018

Extend fs_util deadline to 30 minutes (#6471)
This deadline is really only in place because otherwise DNS failures
leave this hanging forever.

Make fs_util have a very long deadline (because it's not configurable,
like it is inside pants) until we switch to Tower (where we can more
carefully control specific components of timeouts).

See #6433 for more context.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment