Issue:
Calls to invoke_model fail with botocore.errorfactory.ThrottlingException when the rate limit is hit. AWS recommends retrying throttled API calls with exponential backoff.
Current Function:
def _call_model(self, body: str) -> str:
    return self.predictor.invoke_model(
        modelId=self._model_name,
        body=body,
        accept="application/json",
        contentType="application/json",
    )
Question:
Would a PR that adds exponential backoff to this call, using the backoff library, be welcome? I'm keen to contribute a fix that makes these requests more reliable.
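
To make the proposal concrete, here is a minimal sketch of the retry behaviour, assuming nothing about the final design. It uses a hand-rolled, standard-library decorator instead of the backoff library so that it runs without extra dependencies. The names `retry_with_backoff` and `FlakyPredictor`, the parameter defaults, and the local `ThrottlingException` class are all hypothetical stand-ins; in the real code the exception would come from botocore.

```python
import random
import time
from functools import wraps


class ThrottlingException(Exception):
    """Stand-in for botocore's ThrottlingException (assumption, so the sketch runs without boto3)."""


def retry_with_backoff(exceptions, max_tries=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Retry the wrapped callable, doubling the delay cap after each failed attempt."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_tries):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == max_tries - 1:
                        raise  # out of attempts: surface the original error
                    # Full jitter: wait a random time up to the capped exponential delay.
                    cap = min(max_delay, base_delay * 2 ** attempt)
                    sleep(random.uniform(0, cap))
        return wrapper
    return decorator


class FlakyPredictor:
    """Hypothetical client that is throttled twice before succeeding."""

    def __init__(self):
        self.calls = 0

    def invoke_model(self, **kwargs):
        self.calls += 1
        if self.calls < 3:
            raise ThrottlingException("Rate exceeded")
        return {"body": "ok"}


if __name__ == "__main__":
    predictor = FlakyPredictor()
    # Wrap the bound method; a short base_delay keeps the demo quick.
    invoke = retry_with_backoff(ThrottlingException, base_delay=0.1)(predictor.invoke_model)
    print(invoke(modelId="example-model", body="{}"), "after", predictor.calls, "calls")
```

With the backoff library, the same idea would presumably reduce to decorating `_call_model` with `backoff.on_exception(backoff.expo, ...)` and a `max_tries` limit; which exception class to catch and how many attempts to allow are open questions for the PR.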