Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ray_client]: Monitor client stream errors #13386

Merged
merged 5 commits into from
Jan 26, 2021

Conversation

barakmich
Copy link
Contributor

@barakmich barakmich commented Jan 13, 2021

Why are these changes needed?

Fixes the blocking bug for clients that have the server disconnect on them. Adds better connection monitoring in general.

Related issue number

Parallel to #13376
Closes #13353. (reopen or new bug if more issues come up)

Checks

  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Copy link
Contributor

@ericl ericl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a test for this?

@ericl ericl added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jan 13, 2021
@barakmich
Copy link
Contributor Author

Added test and rebased off of #13478 (so it's now dependant) as they share a bit of startup code.

@barakmich barakmich removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jan 22, 2021
@barakmich
Copy link
Contributor Author

Rebased again so we're ready to go when the dependent PR gets in

@ericl
Copy link
Contributor

ericl commented Jan 24, 2021

Should be rebased now

@ericl ericl added the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jan 24, 2021
Change-Id: I002e413fce18de221a1c6e961487d1c9aa00be59
Change-Id: I6296a2c8ad23f1296c218048fccbcce4222a94ef
Change-Id: I5837b7ccb5050d3caa2f70f95b18c96176d73c5a
Change-Id: Id2a3a49c4e6b61ca78356214408c7c873bcc1ee0
@barakmich barakmich removed the @author-action-required The PR author is responsible for the next step. Remove tag to send back to the reviewer. label Jan 25, 2021
Change-Id: Ie1fad3233482cea45471c527187263c47338e522
@barakmich
Copy link
Contributor Author

@ericl PTAL; I think the test failure is unrelated

@ericl ericl merged commit 0c46d09 into ray-project:master Jan 26, 2021
fishbone pushed a commit to fishbone/ray that referenced this pull request Feb 16, 2021
fishbone added a commit to fishbone/ray that referenced this pull request Feb 16, 2021
@pcmoritz pcmoritz mentioned this pull request Aug 30, 2021
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[core] Data client operations block forever if server fails.
2 participants