Document use of retry_on_error for dedicated inference endpoints #554

Merged


jinnovation
Contributor

Shortly after #549, the inference endpoint backend
was updated to block by default while the model loads.
This PR adds documentation explaining how to
disable that blocking so that users can, if they
prefer, handle the resulting 500 errors themselves.
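
For readers wondering what "handling the 500 errors themselves" could look like, here is a minimal sketch of a caller-side retry loop. Everything in it is hypothetical and not part of the library: the names `shouldRetry`, `backoffDelayMs` and `callWithRetry` are made up for illustration, and it assumes the failed call throws an error carrying a numeric `status` field, which may not match what the client actually throws.

```typescript
// Hypothetical helpers for illustration; none of these are part of @huggingface/inference.
// Assumption: a failed call throws an error exposing a numeric `status` field.
interface HttpLikeError {
  status?: number;
}

// Retry only on HTTP 500, the status the PR description says the endpoint returns.
function shouldRetry(err: unknown): boolean {
  const status = (err as HttpLikeError | null)?.status;
  return status === 500;
}

// Exponential backoff: 1s, 2s, 4s, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Runs `call`, retrying on HTTP 500 until it succeeds or maxAttempts is reached.
// `sleep` is injectable so the loop can be exercised without real waiting.
async function callWithRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      // Give up on non-retryable errors or once the attempts are exhausted.
      if (!shouldRetry(err) || attempt + 1 >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

The idea would be to wrap the endpoint call in `callWithRetry` once the default blocking is turned off via the `retry_on_error` option named in the PR title, so the application decides how long to wait and when to give up.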

@jinnovation
Contributor Author

CC: @coyotte508

@coyotte508
Member

Thank you!

Just a note, the backend hasn't been updated yet but should be in the coming days.

(And waiting for endpoints to be scaled actually needs #555, my bad)

@jinnovation
Contributor Author

> (And waiting for endpoints to be scaled actually needs #555, my bad)

Makes sense, thanks for the heads up 👍

> Just a note, the backend hasn't been updated yet but should be in the coming days.

Will defer to you folks on when is best to merge this PR in.

@coyotte508 coyotte508 merged commit b757f81 into huggingface:main Mar 14, 2024
2 of 3 checks passed
@coyotte508
Member

Backend was updated, and version 2.6.6 on the lib should work fine with inference endpoints :)

@jinnovation
Contributor Author

Amazing (: will give it a try soon-ish and report back.

@jinnovation jinnovation deleted the jinnovation/docs-retry-on-error branch March 14, 2024 17:48