Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry on load failure #271

Merged
merged 4 commits into from
Apr 26, 2023
Merged

Retry on load failure #271

merged 4 commits into from
Apr 26, 2023

Conversation

pankajroark
Copy link
Collaborator

It's common to load weights over the network during model.load, and it's easy to forget to add retries. Add retries by default, and make it configurable through environment variable.

Copy link
Collaborator

@bolasim bolasim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be best to use a retry library instead of coding up our own, but it's your call

@@ -0,0 +1,20 @@
from typing import Callable
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why write this from scratch as opposed to using tenacity?
https://tenacity.readthedocs.io/en/latest/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a backoff?

@pankajroark
Copy link
Collaborator Author

I think it would be best to use a retry library instead of coding up our own, but it's your call

Trying to keep the dependencies low in the model wrapper. Adding a library seems overkill for just one function. We can add it if we need advanced cases like exponential backoff, current use case is pretty basic.

@pankajroark pankajroark merged commit 278b4cd into main Apr 26, 2023
3 checks passed
@pankajroark pankajroark deleted the pg/retry_load branch April 26, 2023 02:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants