[MISC] Remove max_time from SDK1 retry decorator #1800
chandrasekharan-zipstack merged 3 commits into main from
Conversation
Actionable comments posted: 1
♻️ Duplicate comments (1)
backend/sample.env (1)
87-89: Same missing explanatory comment line as workers/sample.env; the fix diff above applies here identically.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/sample.env` around lines 87 - 89, The environment variable PROMPT_SERVICE_MAX_TIME is missing the explanatory comment present in the counterpart workers/sample.env; update the backend/sample.env entry for PROMPT_SERVICE_MAX_TIME to include the same explanatory comment describing that this is the retry configuration for prompt service calls (SDK1) and that it should be set high enough to allow at least one retry after long-running LLM calls, matching the comment used alongside PROMPT_SERVICE_MAX_TIME in workers/sample.env so both files stay consistent.
🧹 Nitpick comments (1)
workers/sample.env (1)
237-239: Minor: missing explanatory comment line (inconsistent with tools/structure/sample.env). Both workers/sample.env and backend/sample.env (lines 87–89) are missing the third comment that explains why this value must be so large, the same line present in tools/structure/sample.env:

```diff
 # Retry configuration for prompt service calls (SDK1)
 # Set high enough to allow at least 1 retry after long-running LLM calls
+# since max_time is a total wall-clock budget including the initial attempt
 PROMPT_SERVICE_MAX_TIME=1800
```

This is the most operationally useful comment; without it, future tuners lose the key insight that max_time covers the initial attempt and any retries combined.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@workers/sample.env` around lines 237 - 239, Add the missing explanatory comment above PROMPT_SERVICE_MAX_TIME that explains this max_time covers the total duration for the initial attempt plus any retries (hence must be large to allow at least one retry after long-running LLM calls); update both workers/sample.env and backend/sample.env to mirror the comment used in tools/structure/sample.env so future tuners understand why PROMPT_SERVICE_MAX_TIME is large.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to Reviews > Disable Cache setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (3)
backend/sample.env
tools/structure/sample.env
workers/sample.env
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@workers/sample.env`:
- Around line 237-239: Add the PROMPT_SERVICE_MAX_TIME setting to the other
SDK1-dependent sample.env files by inserting the same comment and variable
declaration as shown (a comment "Retry configuration for prompt service calls
(SDK1)" and the line PROMPT_SERVICE_MAX_TIME=1800) into each file that
configures SDK1 (the classifier, text_extractor, platform-service and
prompt-service sample.envs); place it near the existing SDK1-related
timeout/retry settings so it’s visible alongside other prompt service config.
---
Duplicate comments:
In `@backend/sample.env`:
- Around line 87-89: The environment variable PROMPT_SERVICE_MAX_TIME is missing
the explanatory comment present in the counterpart workers/sample.env; update
the backend/sample.env entry for PROMPT_SERVICE_MAX_TIME to include the same
explanatory comment describing that this is the retry configuration for prompt
service calls (SDK1) and that it should be set high enough to allow at least one
retry after long-running LLM calls, matching the comment used alongside
PROMPT_SERVICE_MAX_TIME in workers/sample.env so both files stay consistent.
---
Nitpick comments:
In `@workers/sample.env`:
- Around line 237-239: Add the missing explanatory comment above
PROMPT_SERVICE_MAX_TIME that explains this max_time covers the total duration
for the initial attempt plus any retries (hence must be large to allow at least
one retry after long-running LLM calls); update both workers/sample.env and
backend/sample.env to mirror the comment used in tools/structure/sample.env so
future tuners understand why PROMPT_SERVICE_MAX_TIME is large.
🧹 Nitpick comments (1)
unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py (1)
237-238: Consider guarding against a non-numeric value being forwarded. The implementation correctly follows the established pattern for optional env vars, and the truthiness guard handles None/"" properly. One optional improvement: since PROMPT_SERVICE_MAX_TIME must be a positive integer for SDK1 to accept it, a lightweight validation (e.g. str.isdigit()) before forwarding would surface misconfiguration at the workflow-execution layer rather than silently inside the tool container.

```diff
-        if self.prompt_service_max_time:
-            platform_vars[ToolRV.PROMPT_SERVICE_MAX_TIME] = self.prompt_service_max_time
+        if self.prompt_service_max_time:
+            if not self.prompt_service_max_time.isdigit():
+                logger.warning(
+                    f"Invalid {ToolRV.PROMPT_SERVICE_MAX_TIME}="
+                    f"'{self.prompt_service_max_time}'; expected a positive integer. Skipping."
+                )
+            else:
+                platform_vars[ToolRV.PROMPT_SERVICE_MAX_TIME] = self.prompt_service_max_time
```

That said, this is fully optional; the existing pattern for ADAPTER_LLMW_* omits this check too, so deferring to a follow-up is reasonable.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py` around lines 237 - 238, The code forwards self.prompt_service_max_time to platform_vars[ToolRV.PROMPT_SERVICE_MAX_TIME] without validating it's a positive integer; add a lightweight guard in the same location (in unstract.workflow_execution.tools_utils near where self.prompt_service_max_time is checked) that verifies the value is numeric and positive (e.g., str(self.prompt_service_max_time).isdigit() and int(...) > 0) before setting platform_vars[ToolRV.PROMPT_SERVICE_MAX_TIME]; if validation fails, skip setting the env var and optionally log a warning/error referencing prompt_service_max_time so misconfiguration is surfaced early.
📒 Files selected for processing (2)
unstract/workflow-execution/src/unstract/workflow_execution/constants.py
unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/workflow-execution/src/unstract/workflow_execution/tools_utils.py`:
- Around line 237-238: The code forwards self.prompt_service_max_time to
platform_vars[ToolRV.PROMPT_SERVICE_MAX_TIME] without validating it's a positive
integer; add a lightweight guard in the same location (in
unstract.workflow_execution.tools_utils near where self.prompt_service_max_time
is checked) that verifies the value is numeric and positive (e.g.,
str(self.prompt_service_max_time).isdigit() and int(...) > 0) before setting
platform_vars[ToolRV.PROMPT_SERVICE_MAX_TIME]; if validation fails, skip setting
the env var and optionally log a warning/error referencing
prompt_service_max_time so misconfiguration is surfaced early.
max_time acted as a wall-clock budget including the initial attempt, causing retries to never fire for long-running LLM calls (60-600s+). It also had a bug where it was passed as max_delay to calculate_delay, allowing a single backoff delay up to max_time seconds. max_retries is the correct mechanism to limit retries. Per-retry delay is now capped at 60s as a safety net. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
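The failure mode described in this commit message can be sketched in a few lines. This is an illustrative toy, not SDK1's actual code; should_retry is a hypothetical helper standing in for the removed wall-clock check:

```python
import time

# Toy model of the removed behavior: max_time was a total wall-clock
# budget that included the initial attempt, so a long first call could
# exhaust the budget before any retry was allowed to fire.
def should_retry(start_time: float, max_time: float) -> bool:
    return (time.monotonic() - start_time) < max_time

# Pretend the first LLM call alone took 120s against a 60s budget:
start = time.monotonic() - 120.0
print(should_retry(start, 60.0))  # False: the budget is already spent, no retry fires
```

With calls routinely taking 60-600s+, any realistic max_time was consumed by the first attempt, which is why max_retries alone is the saner limit.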
Force-pushed from 92b07b9 to 5df51c4
🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py (1)
155-156: Hardcoded magic number 60.0: extract to a named constant. The 60.0 max-delay cap is an undocumented magic number. Consider extracting it to a module-level constant (e.g., MAX_RETRY_DELAY_SECONDS = 60.0) for clarity and easier future adjustment.

♻️ Suggested refactor

```diff
+MAX_RETRY_DELAY_SECONDS = 60.0
+
+
 def retry_with_exponential_backoff(  # noqa: C901
     ...
-            # Calculate delay for next retry (capped at 60s)
-            delay = calculate_delay(attempt, base_delay, multiplier, 60.0, jitter)
+            # Calculate delay for next retry
+            delay = calculate_delay(
+                attempt, base_delay, multiplier, MAX_RETRY_DELAY_SECONDS, jitter
+            )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py` around lines 155 - 156, The hardcoded max delay 60.0 used when calling calculate_delay should be extracted to a module-level named constant (e.g., MAX_RETRY_DELAY_SECONDS = 60.0) and used in the call instead of the literal; update the call in the retry logic that computes delay (where calculate_delay(attempt, base_delay, multiplier, 60.0, jitter) is invoked) to use MAX_RETRY_DELAY_SECONDS, and add a short doc comment on the constant explaining it caps exponential backoff delays.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py`:
- Around line 155-156: The hardcoded max delay 60.0 used when calling
calculate_delay should be extracted to a module-level named constant (e.g.,
MAX_RETRY_DELAY_SECONDS = 60.0) and used in the call instead of the literal;
update the call in the retry logic that computes delay (where
calculate_delay(attempt, base_delay, multiplier, 60.0, jitter) is invoked) to
use MAX_RETRY_DELAY_SECONDS, and add a short doc comment on the constant
explaining it caps exponential backoff delays.
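The capped-backoff computation under discussion can be sketched as follows. The signature of calculate_delay is taken from the call shown in the review; the body is an assumed reference implementation, not SDK1's actual code:

```python
import random

MAX_RETRY_DELAY_SECONDS = 60.0  # caps each individual backoff delay

def calculate_delay(
    attempt: int,
    base_delay: float,
    multiplier: float,
    max_delay: float,
    jitter: bool,
) -> float:
    """Exponential backoff: base_delay * multiplier**attempt, capped at max_delay."""
    delay = min(base_delay * (multiplier ** attempt), max_delay)
    if jitter:
        # Randomize within [delay/2, delay] to avoid thundering herds.
        delay *= random.uniform(0.5, 1.0)
    return delay

# With base_delay=1s and multiplier=2, delays grow 1, 2, 4, ... until the cap:
print(calculate_delay(0, 1.0, 2.0, MAX_RETRY_DELAY_SECONDS, False))   # 1.0
print(calculate_delay(10, 1.0, 2.0, MAX_RETRY_DELAY_SECONDS, False))  # 60.0
```

This also shows why the old bug mattered: passing max_time (e.g. 1800) as max_delay would let a single backoff sleep grow toward 1800s instead of 60s.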
📒 Files selected for processing (6)
unstract/sdk1/README.md
unstract/sdk1/src/unstract/sdk1/platform.py
unstract/sdk1/src/unstract/sdk1/prompt.py
unstract/sdk1/src/unstract/sdk1/utils/retry_utils.py
unstract/sdk1/tests/conftest.py
unstract/sdk1/tests/utils/test_retry_utils.py
💤 Files with no reviewable changes (5)
- unstract/sdk1/src/unstract/sdk1/prompt.py
- unstract/sdk1/tests/utils/test_retry_utils.py
- unstract/sdk1/README.md
- unstract/sdk1/src/unstract/sdk1/platform.py
- unstract/sdk1/tests/conftest.py
Closing in favor of new PR from renamed branch UN-3154-remove-retry-max-time
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.gitignore:
- Line 692: The unignore for .claude/skills/worktree won't match because parent
directories are still ignored by the .claude/* rule; update the .gitignore so
you first unignore the parent directories (e.g., add exceptions for .claude/ and
.claude/skills/) before the existing !.claude/skills/worktree entry and ensure
the unignore lines appear after the broader .claude/* ignore rule so nested
files under .claude/skills/worktree are actually tracked.
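The fix described above can be sketched as a .gitignore fragment; the exact rule names are assumptions, since the actual file isn't shown. Git cannot re-include a file if a parent directory of that file is excluded, so each parent must be unignored before its children:

```gitignore
# Ignore everything under .claude/ ...
.claude/*
# ...but re-include the skills directory itself,
!.claude/skills/
# ignore its contents,
.claude/skills/*
# and finally re-include the worktree subtree.
!.claude/skills/worktree/
```

Ordering matters: the unignore lines must appear after the broader `.claude/*` rule they carve exceptions out of.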
📒 Files selected for processing (1)
.gitignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from ae5565b to fc73fdb
Test Results

Summary
Runner Tests - Full Report
SDK1 Tests - Full Report
What
- Removes the max_time parameter from the SDK1 retry decorator entirely

Why
- max_time acted as a wall-clock budget including the initial attempt, causing retries to never fire for long-running LLM calls (60-600s+)
- It also had a bug where it was passed as max_delay to calculate_delay, allowing a single backoff delay up to max_time seconds
- max_retries is the correct mechanism to limit retries; per-retry delay is now capped at 60s as a safety net

How
- Removed all max_time-related parameters, environment variables, and logic from the SDK1 retry decorator
- Retries are now limited by max_retries with a 60s delay cap

Can this PR break any existing features. If yes, please list possible items. If no, please explain why. (PS: Admins do not merge the PR without this section filled)
- max_time was effectively broken for long-running LLM calls (it prevented retries from firing). Removing it makes retries actually work as intended.

Database Migrations
Env Config
- Removed MAX_TIME env var from SDK1

Relevant Docs
Related Issues or PRs
Dependencies Versions
Notes on Testing
- Existing SDK1 tests updated (max_time-specific tests removed)
- No references to the MAX_TIME env var remain in SDK1

Screenshots
N/A
Checklist
I have read and understood the Contribution Guidelines.