Retrying a failing request 100 times at 1-second intervals is a harmful pattern when connecting to external services. At the scale that AO SUs operate, this behavior can cause substantial heartburn for downstream services. It would be great if an exponential backoff pattern could be used here. Thanks!
I would also consider adding some jitter here :) Also, I believe the ultimate solution is to move the dataItems delivery to a separate/background job (#877).
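To make this concrete, here is a minimal sketch of a retry loop with exponential backoff and full jitter, assuming a synchronous context and the `rand` crate. `try_upload`, the constants, and the logging are illustrative placeholders, not the SU's actual code; in the real async uploader, `tokio::time::sleep` would replace `thread::sleep`.

```rust
use std::thread;
use std::time::Duration;

use rand::Rng;

const MAX_RETRIES: u32 = 10;
const BASE_DELAY_MS: u64 = 100;
const MAX_DELAY_MS: u64 = 30_000;

/// Hypothetical stand-in for the SU's actual upload call.
fn try_upload(_item: &[u8]) -> Result<(), String> {
    // ... perform the HTTP request to the bundler ...
    Err("upload failed".to_string())
}

/// Retry with exponential backoff and full jitter: the delay doubles on
/// each attempt (capped at MAX_DELAY_MS), and we sleep for a uniformly
/// random duration in [0, delay] so retries from many SUs don't
/// synchronize into waves against the downstream service.
fn upload_with_backoff(item: &[u8]) -> Result<(), String> {
    let mut rng = rand::thread_rng();
    for attempt in 0..MAX_RETRIES {
        match try_upload(item) {
            Ok(()) => return Ok(()),
            Err(e) => {
                // 100ms, 200ms, 400ms, ... capped at 30s.
                let delay = (BASE_DELAY_MS << attempt).min(MAX_DELAY_MS);
                let jittered = rng.gen_range(0..=delay);
                eprintln!("attempt {} failed ({e}), sleeping {jittered}ms", attempt + 1);
                thread::sleep(Duration::from_millis(jittered));
            }
        }
    }
    Err("exhausted retries".to_string())
}

fn main() {
    match upload_with_backoff(b"example data item") {
        Ok(()) => println!("uploaded"),
        Err(e) => println!("giving up, queueing for background retry: {e}"),
    }
}
```

With full jitter, the worst-case sleep still grows exponentially, but the expected load on the downstream service is spread evenly across each window instead of arriving in synchronized bursts.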
In uploader.rs, add exponential backoff to the loop that retries the upload. If an upload still fails after some specified number of retries, say 10, add the item to a persistent queue of items to be retried later, and have a background process pulling from this queue and uploading.
uploader.rs will need access to store.rs, so store.rs will become a dependency of uploader.rs, and new methods will need to be added to store.rs to persist the list of items that need to be retried. Perhaps they can be stored in RocksDB under a different key prefix; see the sketch below.
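A minimal sketch of what those store.rs additions could look like, assuming a recent version of the `rocksdb` crate (whose iterator yields `Result` items). The prefix name, the function names, and keying retries by data-item id are all assumptions for illustration, not the SU's actual schema:

```rust
use rocksdb::{Direction, IteratorMode, DB};

/// Key prefix separating the retry queue from the SU's other records.
/// The prefix name is an assumption; any unused namespace would do.
const RETRY_PREFIX: &[u8] = b"upload_retry:";

/// Enqueue a data item for later upload, keyed by its id under the
/// retry prefix so it cannot collide with existing keys.
fn enqueue_retry(db: &DB, item_id: &str, bundle: &[u8]) -> Result<(), rocksdb::Error> {
    let key = [RETRY_PREFIX, item_id.as_bytes()].concat();
    db.put(key, bundle)
}

/// Remove an item once the background worker has uploaded it.
fn dequeue_retry(db: &DB, item_id: &str) -> Result<(), rocksdb::Error> {
    let key = [RETRY_PREFIX, item_id.as_bytes()].concat();
    db.delete(key)
}

/// Scan all pending retries via a prefix-bounded iteration. The
/// background job would call this periodically, attempt each upload,
/// and dequeue only on success.
fn pending_retries(db: &DB) -> Vec<(String, Vec<u8>)> {
    db.iterator(IteratorMode::From(RETRY_PREFIX, Direction::Forward))
        .map_while(Result::ok)
        .take_while(|(k, _)| k.starts_with(RETRY_PREFIX))
        .map(|(k, v)| {
            let id = String::from_utf8_lossy(&k[RETRY_PREFIX.len()..]).into_owned();
            (id, v.to_vec())
        })
        .collect()
}

fn main() -> Result<(), rocksdb::Error> {
    let db = DB::open_default("/tmp/su_retry_demo")?;
    enqueue_retry(&db, "item-1", b"bundle bytes")?;
    for (id, _bundle) in pending_retries(&db) {
        // A real worker would re-attempt the upload here and only
        // dequeue on success.
        dequeue_retry(&db, &id)?;
    }
    Ok(())
}
```

Using a dedicated prefix keeps the retry queue in the same RocksDB instance as the existing records while guaranteeing the keys cannot collide, and a prefix-bounded scan is all the background worker needs to drain the queue.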
The code in question:
https://github.com/permaweb/ao/blob/main/servers/su/src/domain/clients/uploader.rs#L75-L90