fix: configurable AI request timeout with retry on timeout errors#36
Conversation
Agent-Logs-Url: https://github.com/Blueion76/OctoGen/sessions/bd8c3c72-3777-4dfc-bc9d-2d3e23d02686 Co-authored-by: Blueion76 <128919662+Blueion76@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adds a configurable AI API request timeout and extends the AI generation retry logic to also retry on timeout-like failures, improving resilience for slow models / large outputs.
Changes:
- Introduces
AI_REQUEST_TIMEOUT(default 300s, min 30s) in config/models/docs and passes it intoAIRecommendationEngine. - Removes the hardcoded OpenAI request timeout and uses the configured timeout instead.
- Updates retry logic to treat timeout errors as retryable with exponential backoff (similar to rate limits).
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| README.md | Documents AI_REQUEST_TIMEOUT in the env var quick-reference table. |
| ENV_VARS.md | Adds full documentation for AI_REQUEST_TIMEOUT and updates the variable summary. |
| .env.example | Adds AI_REQUEST_TIMEOUT example entry. |
| octogen/models/config_models.py | Adds AIConfig.request_timeout with validation (min 30s). |
| octogen/config.py | Loads AI_REQUEST_TIMEOUT into the env-derived config dict. |
| octogen/main.py | Passes request_timeout into AIRecommendationEngine (but currently doesn’t load the env var in this entrypoint). |
| octogen/ai/engine.py | Uses self.request_timeout for OpenAI calls and retries on timeout-like errors. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| @@ -107,6 +108,7 @@ def __init__(self, dry_run: bool = False): | |||
| base_url=self.config.get("ai", {}).get("base_url"), | |||
| max_context_songs=max_context, | |||
| max_output_tokens=max_output, | |||
| request_timeout=request_timeout, | |||
| ) | |||
There was a problem hiding this comment.
request_timeout is read from self.config['ai'], but _load_config_from_env() in this module never populates an ai.request_timeout key (and AI_REQUEST_TIMEOUT isn’t read anywhere in main.py). As a result, the new env var won’t actually be configurable via the main entrypoint and will always fall back to 300.
Wire AI_REQUEST_TIMEOUT into the ai config dict (e.g., via _get_env_int) and enforce the documented minimum (30s) in _validate_env_config() or when parsing the env var.
| # Check if it's a timeout error — cover both exception type names and | ||
| # message-based detection for provider-specific exceptions (e.g. httpx, | ||
| # openai, google-genai) that don't all inherit from TimeoutError. | ||
| is_timeout = ( | ||
| isinstance(e, (TimeoutError, ConnectionError)) | ||
| or 'Timeout' in error_type | ||
| or any(phrase in error_msg for phrase in [ | ||
| 'timed out', 'timeout', 'read timeout', | ||
| 'connect timeout', 'connection timeout', | ||
| ]) |
There was a problem hiding this comment.
is_timeout treats all ConnectionError exceptions as timeouts. ConnectionError also covers non-timeout failures (e.g., DNS resolution errors, connection refused, misconfigured AI_BASE_URL), so this can introduce long retry delays where an immediate failure would be more actionable.
Consider narrowing the type check to timeout-specific exceptions only (and/or gating connection errors on a timeout-related message/errno), or splitting “timeout” vs “connection” handling so only genuine timeouts get the exponential backoff retries.
|
@copilot apply changes based on the comments in this thread |
…ut detection Agent-Logs-Url: https://github.com/Blueion76/OctoGen/sessions/346e335b-f71c-4e5d-87fd-9a4572dd1ddc Co-authored-by: Blueion76 <128919662+Blueion76@users.noreply.github.com>
Both issues addressed in ea02085:
|
AI API calls used a hardcoded 120s timeout with no env var to override it, and timeout errors caused immediate failure with no retry — leaving users with large
AI_MAX_OUTPUT_TOKENSor slow connections stuck in a dead end.Changes
AI_REQUEST_TIMEOUTenv var (default300s, min30s) — wired throughconfig.py→AIConfig→AIRecommendationEngine.__init__timeout=120in_generate_with_openai— now usesself.request_timeout_generate_with_retry— timeout errors now trigger exponential backoff retries (same as rate limits), detected viaisinstance(e, (TimeoutError, ConnectionError))+ type-name/message matching for broad provider coverage (httpx, openai SDK, google-genai)ENV_VARS.md(new section + summary table),README.mdquick-reference table,.env.example# Increase if gpt-4o-mini or other slow models time out on large responses AI_REQUEST_TIMEOUT=600