change network retry delay strategy #2354

Merged
merged 2 commits into from Aug 29, 2019

Conversation

guolinke
Collaborator

to fix #2348

Review thread on src/network/linkers_socket.cpp (outdated, resolved)
@StrikerRUS
Collaborator

Can't we simply expose connect_fail_retry_cnt and connect_fail_delay_time to config and let users decide how long they want to wait according to their network configuration?

@guolinke
Collaborator Author

@StrikerRUS I think an exponential backoff strategy is better for network connection retries.
As for making them user-configurable: I actually wanted to expose them at first, but then decided it is not a good idea, because we don't want users to have to tune them.

And an exponential backoff strategy should work for most cases:
in a stable network environment, it should connect successfully in a short time.
in an unstable network environment, it still leaves enough time to keep retrying.
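
For illustration, the exponential backoff idea described above might be sketched as follows. This is not the code merged in this PR: the helper names `BackoffDelaysMs` and `ConnectWithBackoff`, the `try_connect` callback, and the example values (initial delay, growth factor, retry count) are all assumptions made for the sketch.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper (not LightGBM's actual code): builds the waits, in
// milliseconds, to use between connection attempts. Each wait is the
// previous one multiplied by `factor`.
std::vector<int> BackoffDelaysMs(int retry_cnt, int initial_delay_ms, double factor) {
  assert(retry_cnt >= 0 && initial_delay_ms >= 0 && factor >= 1.0);
  std::vector<int> delays;
  double delay = initial_delay_ms;
  for (int i = 0; i < retry_cnt; ++i) {
    delays.push_back(static_cast<int>(delay));
    delay *= factor;  // grow the wait after every failed attempt
  }
  return delays;
}

// Hypothetical helper: tries once immediately, then sleeps for each delay
// in turn and tries again. Returns true as soon as an attempt succeeds,
// false if every attempt fails.
bool ConnectWithBackoff(const std::function<bool()>& try_connect,
                        const std::vector<int>& delays_ms) {
  if (try_connect()) return true;
  for (int delay_ms : delays_ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(delay_ms));
    if (try_connect()) return true;
  }
  return false;
}
```

With a short initial delay, a stable network pays almost nothing: the first or second attempt succeeds after a few milliseconds. On an unstable network the waits grow geometrically, so a modest retry count still covers a long total window without any user tuning.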

@StrikerRUS
Collaborator

@guolinke Ah, got it! Thanks a lot for your detailed explanation!

@jameslamb jameslamb self-requested a review August 29, 2019 04:38
Collaborator

@jameslamb jameslamb left a comment


Thanks for the changes and thorough explanation. Looks good!

@guolinke guolinke merged commit 0551f77 into master Aug 29, 2019
@StrikerRUS StrikerRUS deleted the network-conf branch August 29, 2019 09:14
@lock lock bot locked as resolved and limited conversation to collaborators Mar 10, 2020
Development

Successfully merging this pull request may close these issues.

Add Smarter Backoffs for MPI ring connection
3 participants