Add grpc headers and deadline to unary/unary calls #3433

Merged
wild-endeavor merged 1 commit into master from client-tweaks-amend-grpc-settings on May 18, 2026

wild-endeavor commented May 18, 2026

Why are the changes needed?

FlyteRemote.wait() polls execution state by repeatedly calling sync_execution(). When one of the underlying unary gRPC calls blocks on a stale or unhealthy connection, the wait loop can appear to hang because the loop timeout cannot be evaluated until the RPC returns.
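
The failure mode can be reproduced with a minimal sketch of a polling loop. This is not the flytekit implementation; `wait_with_timeout` and `slow_poll` are hypothetical names, and the sleep stands in for a unary RPC stuck on a stale connection.

```python
import time


def wait_with_timeout(poll, timeout, interval=0.01):
    """Poll until `poll()` returns True or `timeout` seconds have passed.

    The deadline is only checked between calls to `poll`, so a call that
    blocks keeps the loop from noticing that the timeout has expired.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if poll():
            return True
        time.sleep(interval)
    return False


def slow_poll():
    # Hypothetical stand-in for an RPC blocked on an unhealthy connection.
    time.sleep(0.3)
    return False


start = time.monotonic()
finished = wait_with_timeout(slow_poll, timeout=0.1)
elapsed = time.monotonic() - start
# `elapsed` is governed by the blocked call, not by the 0.1 s timeout.
```

A per-RPC deadline bounds how long each `poll` call can block, which is what lets the outer timeout be evaluated on schedule.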

The client also did not set gRPC keepalive options explicitly, so idle TCP connection drops through load balancers or NAT could leave the next RPC waiting on a connection that should have been detected as unhealthy.
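
As a rough illustration, keepalive is configured through gRPC channel arguments. The option names below are standard gRPC channel arguments, but the values are illustrative assumptions (the defaults chosen in this PR are not stated here), and `merge_channel_options` and `open_channel` are hypothetical helpers rather than flytekit functions.

```python
# Illustrative values only; not the defaults chosen in this PR.
KEEPALIVE_OPTIONS = [
    ("grpc.keepalive_time_ms", 30_000),          # ping the peer after 30 s without activity
    ("grpc.keepalive_timeout_ms", 10_000),       # consider the connection dead if no ack within 10 s
    ("grpc.keepalive_permit_without_calls", 1),  # send pings even when no RPC is in flight
    ("grpc.http2.max_pings_without_data", 0),    # do not cap pings sent without data frames
]


def merge_channel_options(user_options=None):
    """Return the keepalive defaults, with user-supplied options taking precedence."""
    merged = dict(KEEPALIVE_OPTIONS)
    merged.update(dict(user_options or []))
    return list(merged.items())


def open_channel(target, user_options=None):
    """Open an insecure channel with the merged options (requires grpcio)."""
    import grpc

    return grpc.insecure_channel(target, options=merge_channel_options(user_options))
```

Letting caller-supplied options win is one possible design; it keeps the defaults from overriding settings tuned for a specific deployment.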

What changes were proposed in this pull request?

  • Add default gRPC keepalive options to RawSynchronousFlyteClient so the client can detect dead idle connections and reconnect.
  • Add a scoped deadline interceptor that applies only to unary-unary gRPC calls.
  • Apply a default 60-second per-RPC deadline while FlyteRemote.sync_execution() is running.
  • Keep streaming RPCs unaffected by the deadline interceptor.
  • Map gRPC DEADLINE_EXCEEDED to FlyteTimeout and avoid retrying deadline-expired calls.
  • Add unit coverage for the keepalive options, deadline interceptor behavior, channel wrapping, and sync_execution deadline scoping.

How was this patch tested?

Targeted unit tests:

/Users/ytong/envs/flytekit/bin/python -m pytest \
  tests/flytekit/unit/clients/test_raw.py \
  tests/flytekit/unit/clients/test_auth_helper.py::test_wrap_exceptions_channel \
  tests/flytekit/unit/clients/test_deadline_interceptor.py \
  tests/flytekit/unit/remote/test_remote.py::test_remote_fetch_execution \
  tests/flytekit/unit/remote/test_remote.py::test_sync_execution_sets_default_rpc_deadline \
  tests/flytekit/unit/remote/test_remote.py::test_sync_execution_accepts_rpc_deadline_override \
  tests/flytekit/unit/remote/test_remote.py::test_sync_execution_does_not_override_active_rpc_deadline \
  -q

Result: 14 passed.

A broader local run of tests/flytekit/unit/clients/test_auth_helper.py hit unrelated macOS keyring failures in the existing PKCE/deviceflow tests, so only the changed auth-helper test was included in the targeted verification above.

Setup process

No special setup beyond the existing flytekit development environment.

Screenshots

Not applicable.

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

None.

Docs link

Not applicable.

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
wild-endeavor changed the title from "Add grpc headers and deadline to grpc calls" to "Add grpc headers and deadline to unary/unary calls" on May 18, 2026
wild-endeavor merged commit 3157ce5 into master on May 18, 2026
56 checks passed