Closed
Description
When processing a commit request at v1/connection/commit/, the blobber attempts to get the write pool from the sharders. A consensus of 50% is required.
If this consensus is not reached (for example, because one or more sharders return a 503), the blobber returns the following response:
HTTP 400
{"code":"write_pre_redeem","error":"write_pre_redeem: can't request write pools from sharders: requesting write pools stat: invalid_response: Sharder responses were invalid. Hash mismatch"}
There are a few issues with this that could hopefully be resolved to make a more resilient API:
- HTTP 503 is a retryable status code. Rather than immediately excluding a sharder that returns a 503, the blobber should inspect the response code and, if it is a 503, wait a few seconds and retry the request. This would resolve the issue the majority of the time.
- The blobber returns the response "Sharder responses were invalid. Hash mismatch", which is misleading because the real issue is connectivity. This message will confuse developers, who will assume there is a data integrity issue. In fact the message is a generic catch-all for "there was an issue with the sharder response". Could the error messages be made more specific, e.g. by including the sharder response?
- The response code from the blobber to the client is 400 Bad Request. This suggests a problem with the client's request that needs to be fixed on their side. In reality this is an environmental issue with the network, and the request can be retried as-is.
Clients often use HTTP response status codes for error control flow, and a 4xx is unlikely to be retried. Therefore, in keeping with REST conventions, the status code from the blobber should be 5xx (our fault, retry) rather than 4xx (your fault, don't retry).
NoSkillGuy