FYI, I opened this issue for LiteLLM:
The Replicate handler handles the interim "processing" state gracefully, but it does not handle the "starting" state and throws an exception instead.
The current check in completion() and async_completion() is:
    if (
        response.status_code == 200
        and response.json().get("status") == "processing"
    ):
        continue
A more accurate implementation would be:
    if response.status_code == 200 and response.json().get("status") not in [
        "succeeded",
        "failed",
        "canceled",
    ]:
        continue
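To show the effect of that check in isolation, here is a minimal, self-contained sketch of a polling loop. It is not LiteLLM's actual handler code: poll_prediction, fetch, max_polls, and delay_s are hypothetical names made up for illustration, and the responses are canned dicts rather than real HTTP calls. The only thing taken from the fix above is the idea of continuing on any status that is not terminal.

```python
import time
from typing import Callable, Dict

# Terminal prediction statuses; anything else ("starting", "processing")
# means the prediction is still in flight.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def poll_prediction(
    fetch: Callable[[], Dict],  # hypothetical: returns the parsed JSON body
    max_polls: int = 10,
    delay_s: float = 0.0,
) -> Dict:
    """Poll until the prediction reaches a terminal status."""
    for _ in range(max_polls):
        body = fetch()
        if body.get("status") not in TERMINAL_STATUSES:
            # Covers "starting" as well as "processing".
            time.sleep(delay_s)
            continue
        return body
    raise TimeoutError("prediction did not reach a terminal status")


# Canned responses standing in for successive polls of the prediction URL.
responses = iter(
    [
        {"status": "starting"},
        {"status": "processing"},
        {"status": "succeeded", "output": "hi"},
    ]
)
result = poll_prediction(lambda: next(responses))
```

With the original equality check against "processing", the first canned response ("starting") would fall through to the result-handling path, which is where the exception comes from.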
I implemented these changes in a fork and verified that they work in our IBM Granite notebooks (e.g., this notebook).
Cheers,
Jonathan.