Network calls to Ollama and OpenAI APIs have no retry logic — transient failures (timeouts, 429s, 503s) immediately fail the entire operation.
Suggested implementation:
- Add configurable retry count (default 3) and exponential backoff
- Retry on network errors, 429 (rate limit), and 5xx responses
- Log each retry attempt at warn level
- Apply to both
embed() and embedBatch() methods