[WIP] [FEAT] V1 - Time based trials by harini-venkataraman · Pull Request #13 · Zipstack/unstract

harini-venkataraman · 2024-02-28T13:07:03Z

What

...

Why

...

How

...

Relevant Docs

Related Issues or PRs

Dependencies Versions / Env Variables

Notes on Testing

...

Screenshots

...

Checklist

I have read and understood the Contribution Guidelines.

… wrappers Address review comments on PR #1886: - #9 (typing): call_with_retry / acall_with_retry / iter_with_retry previously returned `object`, erasing caller type info. Add PEP 695 generics so the return type flows from the wrapped callable: acall_with_retry now takes Callable[[], Awaitable[T]] and iter_with_retry takes Callable[[], Iterable[T]] -> Generator[T, ...]. - #11 / #13 (DRY): `_pop_retry_params` in embedding.py and `_disable_litellm_retry` in llm.py were identical logic. Lift to shared `pop_litellm_retry_kwargs` helper in retry_utils.py and delete both methods. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Unified retry for LLM and embedding providers litellm's retry only works for SDK-based providers (OpenAI/Azure). httpx-based providers (Anthropic, Vertex, Bedrock, Mistral) and ALL embedding calls silently ignore max_retries. This adds self-managed retry with exponential backoff at the SDK layer, disabling litellm's own retry entirely for consistency. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [REFACTOR] DRY retry logic into reusable call_with_retry utilities Move retry loops out of LLM/Embedding classes into generic call_with_retry, acall_with_retry, and iter_with_retry functions in retry_utils.py. Both classes now call these directly instead of maintaining their own retry helper methods. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FIX] Consolidate retry logic, expose max_retries for all adapters - Extract _get_retry_delay() shared helper to eliminate duplicated retry decision logic across call_with_retry, acall_with_retry, iter_with_retry, and retry_with_exponential_backoff - Add num_retries=0 to embedding._pop_retry_params() to fully disable litellm's internal retry for embedding calls - Expose max_retries in UI JSON schemas for embedding adapters (OpenAI, Azure, VertexAI, Ollama) and Ollama LLM — previously the field existed in Pydantic models but wasn't shown to users, silently defaulting to 0 retries - Add debug logging to LLM and Embedding retry parameter extraction - Clarify docstrings distinguishing is_retryable_litellm_error() from is_retryable_error() (different exception hierarchies) - Remove stale noqa: C901 from simplified retry_with_exponential_backoff Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FIX] Set max_retries default to 3 for all embedding and Ollama LLM adapters Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FIX] Address greptile review: fix shadowed ConnectionError, use MRO check - Fix `requests.ConnectionError` shadowing Python's builtin `ConnectionError` in `is_retryable_litellm_error()` — rename import to `RequestsConnectionError` and use `builtins.ConnectionError` / `builtins.TimeoutError` explicitly - Use `__mro__`-based class name check instead of `type(error).__name__` to also catch subclasses of retryable error types - P1 (num_retries not zeroed) was already fixed in prior commit Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FIX] Address CodeRabbit review: add APITimeoutError, validate max_retries - Add APITimeoutError to _RETRYABLE_ERROR_NAMES for explicit OpenAI SDK timeout coverage - Add _validate_max_retries() guard to call_with_retry, acall_with_retry, iter_with_retry to fail fast on negative values instead of silently returning None Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * UN-3344 [FIX] Reduce cognitive complexity and remove useless except clause Address SonarCloud findings on PR #1886: - S3776: Flatten retry_with_exponential_backoff.wrapper by moving the success logging + return out of the try block and using `continue` in the retry path, so the except branch only handles the give-up case. - S2737: Drop the `except Exception: raise` clause — it was a no-op that added complexity without changing behavior (non-matching exceptions propagate naturally). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3344 [FIX] Extract retry loop to top-level helper to drop cognitive complexity Sonar still flagged retry_with_exponential_backoff at complexity 16 after the previous flatten. Nested def decorator / def wrapper counted against the outer function's score. Move the retry body to a module-level _invoke_with_retries helper so the decorator factory just delegates, bringing the outer function well under the 15 threshold. Behavior is unchanged — all paths (success, retry, give-up, non-retryable propagate) are preserved and covered by the existing SDK1 tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3344 [FIX] Honor Retry-After, close stream gen on retry, share give-up log Address review comments on PR #1886: - #10 (resource leak): close the generator returned by fn() before retrying in iter_with_retry — otherwise streaming providers leak an in-flight HTTP socket until GC. - #12 (behavioral regression): when we zero out SDK/wrapper retries we also lose the OpenAI SDK's native Retry-After handling on 429/503. _get_retry_delay now checks error.response.headers["retry-after"] and uses that value ahead of exponential backoff. HTTP-date form is not parsed; those fall back to backoff. - #8 (observability gap): move the "Giving up ... after N attempt(s)" log into _get_retry_delay so all four retry helpers (call_with_retry, acall_with_retry, iter_with_retry, decorator) share the same exhaustion signal. Previously only the decorator path logged it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3344 [REFACTOR] Share retry-kwargs helper and add TypeVar to retry wrappers Address review comments on PR #1886: - #9 (typing): call_with_retry / acall_with_retry / iter_with_retry previously returned `object`, erasing caller type info. Add PEP 695 generics so the return type flows from the wrapped callable: acall_with_retry now takes Callable[[], Awaitable[T]] and iter_with_retry takes Callable[[], Iterable[T]] -> Generator[T, ...]. - #11 / #13 (DRY): `_pop_retry_params` in embedding.py and `_disable_litellm_retry` in llm.py were identical logic. Lift to shared `pop_litellm_retry_kwargs` helper in retry_utils.py and delete both methods. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: ali <117142933+muhammad-ali-e@users.noreply.github.com>

* UN-3450 [FEAT] Add golden-path smoke test for callback worker Adds workers/tests/test_callback_sanity.py mirroring the existing test_executor_sanity.py pattern. Covers: - Worker enums and registry (WorkerType.CALLBACK, QueueName.CALLBACK + CALLBACK_API, TaskName.PROCESS_BATCH_CALLBACK, health port 8083, WorkerRegistry queue config + task routing). - Celery task wiring (process_batch_callback, process_batch_callback_api, healthcheck — registration, retry config, max_retries, autoretry_for). - Full dispatch -> task -> return round-trip via Celery eager mode, using the simple healthcheck task (process_batch_callback itself needs a configured InternalAPIClient + downstream HTTP, which is heavy mocking territory unsuitable for a smoke test — covered later in #1.2 characterisation suite). Coverage gains on callback module: - callback/__init__.py: 0% -> 100% - callback/worker.py: 0% -> 52% - callback/tasks.py: 0% -> 10% - Module total: 0% -> 12% 14 tests, run in ~17s. Regression net for spine PRs #6 (CallbackStatus enum migration) and #13 (chord -> Barrier lift in callback chain). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3450 [FEAT] Address Greptile review on callback smoke test Two P2 findings from Greptile, both fixed: 1. test_healthcheck_result_is_json_serializable: drop the `default=str` fallback in json.dumps. With it, any non-JSON-serializable value (UUID, datetime, custom object) gets silently coerced to a string — exactly the failure mode the test claims to catch. Without it, the assertion faithfully tests "round-trips cleanly via JSON" the way Celery's serializer would. 2. Add registration check for `process_batch_callback_django_compat` (the backward-compat task name `workflow_manager.workflow_v2. file_execution_tasks.process_batch_callback`). Both this task and `process_batch_callback` delegate to `_process_batch_callback_core`, so refactors that touch the core (e.g. the upcoming CallbackStatus enum migration) affect this path too. Three-line addition. 15 tests now (was 14), runtime 11s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

harini-venkataraman closed this Mar 7, 2024

harini-venkataraman force-pushed the feat/trials branch from a9b9db3 to 57e454c Compare March 7, 2024 07:56

muhammad-ali-e mentioned this pull request Apr 28, 2026

UN-3435 [FIX] Bind request_id and trace context to worker logs #1932

Merged

11 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] [FEAT] V1 - Time based trials #13

[WIP] [FEAT] V1 - Time based trials #13
harini-venkataraman wants to merge 0 commit intomainfrom
feat/trials

harini-venkataraman commented Feb 28, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

harini-venkataraman commented Feb 28, 2024

What

Why

How

Relevant Docs

Related Issues or PRs

Dependencies Versions / Env Variables

Notes on Testing

Screenshots

Checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant