UN-3056 [FEAT] Filter notifications by run outcome by kirtimanmishrazipstack · Pull Request #1936 · Zipstack/unstract

kirtimanmishrazipstack · 2026-04-29T06:42:27Z

What

Add notify_on_failures boolean to Notification (default False). When True, fire only on terminal failures (ERROR / STOPPED for any pipeline type, or any file errored in a COMPLETED run). Surface as a "Notify on failures only" checkbox in the notification create/edit form.
Introduce the BATCHED dispatch pipeline: every notification now buffers into the new NotificationBuffer table; a periodic worker flush groups events by (org, webhook_url, auth_sig, platform) and ships them as one clubbed message per group per NOTIFICATION_CLUB_INTERVAL window. Interval is configurable per-org from Platform Settings (1–120 minutes, default 5).
Remove the previous synchronous-dispatch (IMMEDIATE) code path: deleted the entire notification_v2/provider/ subtree, NotificationHelper, split_by_delivery_mode, and the worker's send_notification_to_worker + get_webhook_headers. Single dispatch path.

Why

Notifications previously fired on every workflow completion, so users wanting only failure alerts (the UN-3056 ask) got noise on successful runs. The new flag lets each subscription opt out of success notifications.
High-volume deployments collapsed signal-to-noise by getting one webhook per run. BATCHED clubbing replaces that with one combined message per window, identical envelope across API / ETL / Task pipelines.

How

On terminal failure (ERROR/STOPPED, or any file errored in a COMPLETED run) all active rows fire. On terminal success the queryset excludes notify_on_failures=True rows. Defaulting to False keeps existing rows on the previous "every completion" behavior.

The failure-status set is consolidated into a single notification_v2.enums.FAILURE_STATUSES (frozen set of {ERROR, STOPPED}) used by all three dispatch sites (api_v2/notification.py, pipeline_v2/notification.py, worker callback).

Dispatch flow: PipelineNotification.send / APINotification.send and the worker callback path all funnel into notification_v2.helper.enqueue which writes a NotificationBuffer row. The workers/log_consumer/scheduler.sh sidecar periodically calls the internal /v1/webhook/buffer/process/ endpoint; the backend groups PENDING rows whose MIN(flush_after) ≤ NOW() by (org, webhook_url, auth_sig, platform), renders one combined message via unstract.core.notification_clubbed_renderer ({summary, events} envelope; Slack mrkdwn body for platform=SLACK), and queues a single send_webhook_notification Celery task per group with max_retries = max() across the rows. Concurrency-safe via SELECT … FOR UPDATE OF o SKIP LOCKED.

Can this PR break any existing features. If yes, please list possible items. If no, please explain why.

Two caveats for operators:

Dispatch behavior change (migration 0003_add_notification_buffer.py). This PR introduces the NotificationBuffer table; every notification now clubs dispatches at the per-org NOTIFICATION_CLUB_INTERVAL cadence (default 5 min) instead of firing synchronously per run. The previous synchronous-dispatch code paths have been removed — there is no opt-out to per-event dispatch. The delivery_mode column is added with default BATCHED and stays on the model for backward DB compat (existing rows are auto-backfilled to BATCHED); no code reads IMMEDIATE and the field is no longer surfaced through the internal notifications API or settable through the public NotificationSerializer. The column + enum value will be dropped in a follow-up schema migration once deploys are coordinated.
Notification webhook envelope reshape (already shipped via prior merges on this branch). API webhook receivers and Slack rendering wrap each event in a canonical {summary, events} shape. Any prior consumer that parsed the old flat per-run payload needs to read body["events"][0] instead of body directly. See the in-tree plan file (UNS-611 v2.5/v2.6) for the contract.

The notify_on_failures filter itself defaults to False and changes no existing caller signatures.

Database Migrations

backend/notification_v2/migrations/0002_notification_notify_on_failures.py — adds the boolean column.
backend/notification_v2/migrations/0003_add_notification_buffer.py — adds delivery_mode (default BATCHED) and creates NotificationBuffer. See the operator note in "Can this PR break any existing features" above.
the idx_notif_buffer_pending partial index in 0003 includes platform so SLACK and API rows on the same (org, url, auth) split into separate dispatches at flush time.

Env Config

NOTIFICATION_CLUB_INTERVAL (seconds; default 300 = 5 min). Per-org override available in Platform Settings (1–120 minutes).

Relevant Docs

Related Issues or PRs

UN-3056

Dependencies Versions

None

Notes on Testing

Manual:

In the notification modal for an API deployment and an ETL pipeline, check "Notify on failures only", trigger a successful run (no webhook should fire) and a failed run (webhook should fire).
All notifications are batched; confirm the message arrives within NOTIFICATION_CLUB_INTERVAL (default 5 min, or the per-org override set in Platform Settings).

Screenshots

Checklist

I have read and understood the Contribution Guidelines.

…down rendering (#1927) * [FIX] Make tool-run logs visible in workflow execution UI Two stacked gaps were keeping tool-level log lines (Processing prompt, Running LLM completion, lookup calls, etc.) out of the workflow execution logs UI and the execution_log DB table for API / workflow runs: 1. Empty log_events_id. structure_tool_task seeded LOG_EVENTS_ID in StateStore but never threaded it into pipeline_ctx / agentic_ctx. ExecutorToolShim.stream_log gated publishing on self.log_events_id, so every tool-level log was dropped before it ever reached the broker. 2. Wrong payload shape. Even with the channel threaded, stream_log used LogPublisher.log_progress(...) whose payload omits execution_id / organization_id / file_execution_id. get_validated_log_data (log_utils.py) requires those IDs and LogType == LOG to persist to execution_log, so tool-level messages were silently filtered at the Redis->DB drain step — orchestration logs persisted, tool logs did not. Fixes: - ExecutionContext gains execution_id + file_execution_id, populated in structure_tool_task for both the legacy pipeline and agentic contexts. - LegacyExecutor caches the three IDs on self during execute() and passes them into every ExecutorToolShim construction (~7 callsites). - ExecutorToolShim.stream_log now dual-emits: PROGRESS (unchanged, drives the IDE prompt-card live progress pane) plus LOG carrying the workflow IDs (feeds the workflow execution logs UI and persists to execution_log via the existing drain). LOG emission is gated on execution_id + organization_id being present, so bare IDE test runs without a workflow still behave as before. Rendering polish - The LogModal and pipeline LogsModal now pipe log text through the existing CustomMarkdown renderer, so backticked identifiers render as inline-code pills and embedded newlines break lines. This lets multi-line structured events (e.g. the lookup pre-call trio) surface as a single row with readable inner formatting. - Prompt-key mentions inside legacy_executor tool logs are wrapped in backticks for consistency with the rest of the log surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Wrap prompt_name in backticks in remaining stream_log calls Completes the consistency pass on tool-run log formatting: the table- and line-item-extraction success and error paths still emitted prompt names without backticks, so the markdown-rendered logs UI showed them as bare text instead of inline-code pills. Matches the pattern already applied to the other 9 stream_log calls in this file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Validate URL schemes in CustomMarkdown link renderer Workflow logs rendered via CustomMarkdown can contain tool-generated or user-derived content, so an untrusted \`[text](url)\` sequence could inject a \`javascript:\` or \`data:\` scheme and get clickable through antd \`Typography.Link\`. Allow-list the safe external schemes (http, https, mailto, tel) before rendering as a link; everything else falls back to plain text while still honouring the existing internal-path branch used for in-app navigation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Thread workflow IDs into remaining shim/context callsites Addresses CodeRabbit review gaps so the log-plumbing fix is consistent across every pre-dispatch and plugin-dispatch path: - `table_ctx` / `line_item_ctx` in `legacy_executor.py` now carry `log_events_id`, `execution_id`, `file_execution_id` from context so downstream table/line-item plugins that build their own `ExecutorToolShim` pass the `execution_id + organization_id` gate and emit workflow LOG payloads. - `structure_tool_task.py` threads the same IDs into the bare pre-dispatch shim, so `X2Text.process()` calls during agentic extraction reach the workflow logs UI. - `LogsModal.jsx` stores the raw log string in row data and lets the column renderer wrap it in `CustomMarkdown` — the previous map stored a `<CustomMarkdown />` element that was then passed back into `CustomMarkdown.text`, producing `[object Object]` for multi-row lookups. - Dropped `getattr(context, ...)` on `execution_id` / `file_execution_id` now that they are dataclass fields — matches the direct access used for `organization_id`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REFACTOR] Trim overly specific comments in log-plumbing changes Pass through the new comments added across this PR and either remove or tighten the ones that restate what the code already shows. Keep only the WHY lines that protect future readers from missing a non-obvious constraint (XSS guard in CustomMarkdown, dual PROGRESS/LOG emission in the shim, pre-dispatch shim needing workflow IDs so X2Text logs are not silently dropped). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REFACTOR] Extract isSafeExternalUrl into shared helpers module Moves the URL scheme allow-list check out of CustomMarkdown into helpers/urlSafety.js so any future component that renders links from user- or tool-derived content can reuse the same guard instead of re-implementing it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Tighten URL guard, split publish try/excepts, and extract shim builder Addresses the must-fix and worth-doing comments from the PR review: Security - CustomMarkdown: treat protocol-relative URLs (`//host/...`) as external, not internal, so they can no longer skip the scheme guard via the `startsWith("/")` branch. - `isSafeExternalUrl`: drop the `window.location.origin` base so bare strings ("javascript", "../foo") fail to parse instead of silently resolving to `https://<origin>/...` and passing the scheme check. Silent failure + comment accuracy - ExecutorToolShim.stream_log: split the PROGRESS and LOG publish paths into separate try/except blocks so a LogDataDTO validation failure on the LOG payload is no longer mis-attributed to "progress publish failed". Corrected the inline comments — the DB drop is driven by LogPublisher's `payload.type == 'LOG'` check, and only `execution_id` + `organization_id` are strictly required. Refactor - New `LegacyExecutor._build_shim()` helper — all seven ExecutorToolShim callsites now share one construction path so the workflow-ID plumbing can't drift out of sync across sites again. - Thread `execution_id` / `file_execution_id` into the seven self-dispatched sub-`ExecutionContext`s alongside `log_events_id`, matching the table/line-item sites and keeping the context consistent for any downstream consumer that reads the IDs from the context rather than from the executor instance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Address remaining type-design and silent-failure comments - ExecutionContext: drop the BE-coupled inline comment, document the new IDs in the Attributes block, and enforce the invariant that execution_id implies organization_id via __post_init__. - ExecutorToolShim: typed the three new IDs as str | None instead of str = "" so the signature matches the Optional semantics already enforced by the runtime guards. - LegacyExecutor: move per-request state to __init__ so _log_component is no longer a class-level mutable default shared across instances; stop silently coercing None IDs to ""; add a one-shot warning when a tool-sourced run lands without workflow IDs so the silent-no-persist case is visible in GKE logs. - structure_tool_task: emit the same warning when LOG_EVENTS_ID is absent from StateStore. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Surface first publish failure per shim at WARN Both PROGRESS and LOG publish paths previously swallowed every broker failure at DEBUG, so a misconfigured or down Redis broker meant every tool-level log silently vanished with no operator-visible signal. Track a per-shim _progress_publish_failed / _log_publish_failed flag and log the first failure at WARNING (with traceback), then downgrade subsequent failures on the same shim back to DEBUG. Preserves the non-fatal semantics of the publish path while making broker outages visible in GKE logs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Auto-bump modified_at on QuerySet.update() and bulk_update() Django's auto_now=True only fires on Model.save(); QuerySet.update() and bulk_update() bypass save(), so BaseModel.modified_at silently stayed at the creation time for every bulk-path write. Audit trail drifted. Introduce BaseModelQuerySet that injects modified_at=timezone.now() into both paths, and expose it via BaseModelManager. Migrate all custom managers on BaseModel subclasses to compose BaseModelManager so their querysets inherit the overrides. Drop the ad-hoc modified_at=now() kwarg in FileHistoryHelper now that the queryset handles it. * [FIX] Materialize objs in BaseModelQuerySet.bulk_update to support generators Addresses PR review: if callers pass a non-rewindable iterable (generator, queryset iterator), the modified_at stamping loop would exhaust it before super().bulk_update() saw it, silently updating zero rows. list(objs) up front keeps generator callers working. Also drop the mock-based unit test — it needed django.setup() at module import which isn't viable without pytest-django, and proper DB-backed coverage is tracked separately. * [FIX] Auto-inject modified_at into BaseModel.save(update_fields=...) Django only runs auto_now for fields listed in update_fields, so every save(update_fields=["foo"]) on a BaseModel subclass silently drops the modified_at bump — same family of bug as QuerySet.update/bulk_update. Override BaseModel.save() to add modified_at to update_fields whenever the caller supplies a restricted list without it. Also drop two dead manual-assignment lines (execution.modified_at = timezone.now() before save()) that were redundant with auto_now on a full save(). * [FIX] Auto-bump modified_at on upsert bulk_create and drop workarounds QuerySet.bulk_create(update_conflicts=True, update_fields=[...]) runs an UPDATE on conflict with only the listed fields — same auto_now-bypass as save(update_fields=...) and QuerySet.update(). Patch BaseModelQuerySet's bulk_create to inject modified_at into update_fields on upsert. With that in place, the explicit "modified_at" entries in dashboard_metrics upsert callers are redundant. Drop them. * [REFACTOR] Tighten BaseModel auto-bump helpers and edge cases - Extract `_with_modified_at` helper; single source of truth for the "inject modified_at into a partial field list" rule across `bulk_update`, `bulk_create` and `BaseModel.save`. - Preserve Django's documented `save(update_fields=[])` no-op (signals-only save, no column writes) instead of rewriting it to `["modified_at"]`. Apply the same guard to `bulk_create(update_conflicts=True, update_fields=[])`. - Match Django's positional `save()` signature (`force_insert`, `force_update`, `using`, `update_fields`) so callers passing flags positionally still hit the auto-bump override. - Skip the per-obj `modified_at` stamp + `objs` materialization in `bulk_update` when the caller already listed `modified_at` — lets the opt-in path stay O(1) before the `super()` delegation. - Docstring corrections: "previous save() timestamp" (not just creation time); manager-level convention note; precise `auto_now` semantics (attribute still updates in-memory, just isn't persisted without `update_fields` inclusion).

@staticmethod

…wered table extraction (#1914) * Execution backend - revamp * async flow * Streaming progress to FE * Removing multi hop in Prompt studio ide and structure tool * UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item * Added executors for agentic prompt studio * Added executors for agentic prompt studio * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * Removed redundant envs * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Removed redundant envs * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Pluggable apps and plugins to fit the new async prompt execution architecture * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * adding worker for callbacks * fix: write output files in agentic extraction pipeline Agentic extraction returned early without writing INFILE (JSON) or METADATA.json, causing destination connectors to read the original PDF and fail with "Expected tool output type: TXT, got: application/pdf". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests (#1850) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update docs * UN-3266 fix: remove dead code with undefined names in fetch_response Remove unreachable code block after the async callback return in fetch_response that still referenced output_count_before and response from the old synchronous implementation, causing ruff F821 errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * Un 3266 fix security hotspot tmp paths (#1851) * UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in tests Replace hardcoded /tmp/ paths (SonarCloud S5443 security hotspots) with pytest's tmp_path fixture or module-level tempfile.mkdtemp() constants in all affected test files to avoid world-writable directory vulnerabilities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve ruff linting failures across multiple files - B026: pass url positionally in worker_celery.py to avoid star-arg after keyword - N803: rename MockAsyncResult to mock_async_result in test_tasks.py - E501/I001: fix long line and import sort in llm_whisperer helper - ANN401: replace Any with object|None in dispatcher.py; add noqa in test helpers - F841: remove unused workflow_id and result assignments Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849 - S2259: guard against None after _discover_plugins() in loader.py to satisfy static analysis on the dict[str,type]|None field type - S1244: replace float equality checks with pytest.approx() in test_answer_prompt.py and test_phase2h.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud code smells in PR #1849 - S5799: Merge all implicit string concatenations in log messages (legacy_executor.py, tasks.py, dispatcher.py, orchestrator.py, registry.py, variable_replacement.py, structure_tool_task.py) - S1192: Extract duplicate literal to _NO_CELERY_APP_MSG constant in dispatcher.py - S1871: Merge identical elif/else branches in tasks.py and test_sanity_phase6j.py - S1186: Add comment to empty stub method in test_sanity_phase6a.py - S1481: Remove unused local variables in test_sanity_phase6d/e/f/g/h/j and test_phase5d.py - S117: Rename PascalCase local variables to snake_case in test_sanity_phase3/5/6i.py - S5655: Broaden tool type annotation to StreamMixin in IndexingUtils.generate_index_key and PlatformHelper.get_adapter_config - docker:S7031: Merge consecutive RUN instructions in worker-unified.Dockerfile - javascript:S1128: Remove unused pollForCompletion import in usePromptRun.js Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: wrap long log message in dispatcher.py to fix E501 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud S117 naming violations Rename PascalCase local variables to snake_case to comply with S117: - legacy_executor.py: rename tuple-unpacked _get_prompt_deps() results (AnswerPromptService→answer_prompt_svc, RetrievalService→retrieval_svc, VariableReplacementService→variable_replacement_svc, LLM→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls) and update all downstream usages including _apply_type_conversion and _handle_summarize - test_phase1_log_streaming.py: rename Mock* local variables to mock_* snake_case equivalents - test_sanity_phase3.py: rename MockDispatcher→mock_dispatcher_cls and MockShim→mock_shim_cls across all 10 test methods - test_sanity_phase5.py: rename MockShim→mock_shim, MockX2Text→mock_x2text in 6 test methods; MockDispatcher→mock_dispatcher_cls in dispatch test; fix LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls in _mock_prompt_deps helper Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849 - test_sanity_phase2/4.py, test_answer_prompt.py: rename PascalCase local variables in _mock_prompt_deps/_mock_deps to snake_case (RetrievalService→retrieval_svc, VariableReplacementService→ variable_replacement_svc, Index→index_cls, LLM_cls→llm_cls, EmbeddingCompat→embedding_compat_cls, VectorDB→vector_db_cls, AnswerPromptService→answer_prompt_svc_cls) — fixes S117 - test_sanity_phase3.py: remove unused local variable "result" — fixes S1481 - structure_tool_task.py: remove redundant json.JSONDecodeError from except clause (subclass of ValueError) — fixes S5713 - shared/workflow/execution/service.py: replace generic Exception with RuntimeError for structure tool failure — fixes S112 - run-worker-docker.sh: define EXECUTOR_WORKER_TYPE constant and replace 10 literal "executor" occurrences — fixes S1192 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve SonarCloud cognitive complexity and code smell violations - Reduce cognitive complexity in answer_prompt.py: - Extract _build_grammar_notes, _run_webhook_postprocess helpers - _is_safe_public_url: extracted _resolve_host_addresses helper - handle_json: early-return pattern eliminates nesting - construct_prompt: delegates grammar loop to _build_grammar_notes - Reduce cognitive complexity in legacy_executor.py: - Extract _execute_single_prompt, _run_table_extraction helpers - Extract _run_challenge_if_enabled, _run_evaluation_if_enabled - Extract _inject_table_settings, _finalize_pipeline_result - Extract _convert_number_answer, _convert_scalar_answer - Extract _sanitize_dict_values helper - _handle_answer_prompt CC reduced from 50 to ~7 - Reduce CC in structure_tool_task.py: guard-clause refactor - Reduce CC in backend: dto.py, deployment_helper.py, api_deployment_views.py, prompt_studio_helper.py - Fix S117: rename PascalCase local vars in test_answer_prompt.py - Fix S1192: extract EXECUTOR_WORKER_TYPE constant in run-worker.sh - Fix S1172: remove unused params from structure_tool_task.py - Fix S5713: remove redundant JSONDecodeError in json_repair_helper.py - Fix S112/S5727 in test_execution.py Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: remove unused RetrievalStrategy import from _handle_answer_prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: rename UsageHelper params to lowercase (N803) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * UN-3266 fix: resolve remaining SonarCloud issues from check run 66691002192 - Add @staticmethod to _sanitize_null_values (fixes S2325 missing self) - Reduce _execute_single_prompt params from 25 to 11 (S107) by grouping services as deps tuple and extracting exec params from context.executor_params - Add NOSONAR suppression for raise exc in test helper (S112) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * UN-3266 fix: remove unused locals in _handle_answer_prompt (F841) execution_id, file_hash, log_events_id, custom_data are now extracted inside _execute_single_prompt from context.executor_params. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: resolve Biome linting errors in frontend source files Auto-fixed 48 lint errors across 56 files: import ordering, block statements, unused variable prefixing, and formatting issues. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: replace dynamic import of SharePermission with static import in Workflows Resolves vite build warning about SharePermission.jsx being both dynamically and statically imported across the codebase. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: resolve SonarCloud warnings in frontend components - Remove unnecessary try-catch around PostHog event calls - Flip negated condition in PromptOutput.handleTable for clarity Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Address PR #1849 review comments: fix null guards, dead code, and test drift - Remove redundant inline `import uuid as _uuid` in views.py (use module-level uuid) - URL-encode DB_USER in worker_celery.py result backend connection string - Remove misleading task_queues=[Queue("executor")] from dispatch-only Celery app - Remove dead `if not tool:` guards after objects.get() (already raises DoesNotExist) - Move profile_manager/default_profile null checks before first dereference - Reorder ProfileManager.objects.get before mark_document_indexed in tasks.py - Handle ProfileManager.DoesNotExist as warning, not hard failure - Wrap PostHog analytics in try/catch so failures don't block prompt execution - Handle pending-indexing 200 response in usePromptRun.js (clear RUNNING status) - Reset formData when metadata is missing in ConfigureDs.jsx - Fix test_should_skip_extraction tests: function now takes 1 arg (outputs only) - Fix agentic routing tests: mock X2Text.process, remove stale platform_helper kwarg Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix missing llm_usage_reason for summarize LLM usage tracking Add PSKeys.LLM_USAGE_REASON to usage_kwargs in _handle_summarize() so summarization costs appear under summarize_llm in API response metadata. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor - Route _handle_structure_pipeline to _handle_single_pass_extraction when is_single_pass=True (was always calling _handle_answer_prompt) - Delegate _handle_single_pass_extraction to cloud plugin via ExecutorRegistry, falling back to _handle_answer_prompt if plugin not installed Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing API depployment response mismatches * Add complete_vision() method to SDK1 LLM for multimodal completions Adds a new complete_vision() method alongside existing complete() that accepts pre-built multimodal messages (text + image_url) in OpenAI-style format. LiteLLM auto-translates for Anthropic/Bedrock/Vertex providers. This enables the agentic table extractor plugin to send page images alongside text prompts for VLM-based table detection and extraction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Gate Run button by agentic table readiness checklist - PromptCardItems loads AgenticTableChecklist plugin and owns the isAgenticTableReady state, rendering the checklist above the prompt text area and delegating the settings gear visibility to the plugin. - Header and PromptOutput disable their Run buttons when isAgenticTableReady is false (default true for non-agentic types). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * [FIX] Use correct primary key field in prompt count subquery (#1905) ToolStudioPrompt uses prompt_id as its primary key, not id. Count("id") causes FieldError on the list endpoint (500). Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * [FIX] Add agentic_table as valid enforce_type choice The cloud build adds "agentic_table" to the prompt enforce_type dropdown, but the OSS ToolStudioPrompt model rejected it as an invalid choice. Add AGENTIC_TABLE to EnforceType and ship a matching migration so the value can be persisted. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Wire agentic_table enforce_type to executor dispatch The single-prompt run flow had no branch for prompts with enforce_type=agentic_table, so clicking Run silently fell through to the legacy prompt-service path and never invoked the agentic_table executor. Adds an AGENTIC_TABLE constant to TSPKeys, includes it in the OperationNotSupported guard, and dispatches to PayloadModifier.execute_agentic_table when the plugin is available so the result still flows through _handle_response. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Add agentic_table queue to executor worker defaults The ExecutionDispatcher derives the queue name from the executor name (celery_executor_{name}), so dispatches to the agentic_table executor land on celery_executor_agentic_table. The local docker-compose default only listed celery_executor_legacy and celery_executor_agentic, so no worker consumed the new queue and dispatch hung for the full 1-hour result timeout. Adds the missing queue to the docker-compose default. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Dispatch agentic_table prompts to executor on IDE Run The IDE Run button was building a legacy answer_prompt payload for agentic_table prompts, so the agentic table executor was never invoked. Branch fetch_response on enforce_type so agentic_table prompts are built via the cloud payload_modifier plugin and dispatched directly to celery_executor_agentic_table. Add the enforce_type to the OSS dropdown choices and the JSON-dump set in OutputManagerHelper so the persisted output is parseable by the FE table renderer. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * UN-3266 [FIX] Reshape agentic_table executor output in IDE callback The agentic_table executor returns {"output": {"tables": [...], "page_count": ..., "headers": [...], ...}}, but OutputManagerHelper.handle_prompt_output_update reads outputs[prompt.prompt_key] when persisting prompt output. Without a reshape the table list never lands under the prompt key and the FE sees an empty result. When cb_kwargs carries is_agentic_table=True and prompt_key (set by the cloud build_agentic_table_payload), reshape outputs to {prompt_key: tables} before calling update_prompt_output. The executor itself also shapes its envelope, so this is a defensive double-keying that keeps the legacy answer_prompt path untouched. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Fixing timeout issues * API deployment fixes for Agentic table extractor * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixing syntax issues * Fix agentic_table executor reading INFILE after JSON overwrite Read from SOURCE instead of INFILE when dispatching to the agentic_table executor. INFILE gets overwritten with JSON output by the regular pipeline, causing PDFium parse errors when the agentic_table executor tries to process it as a PDF. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Signed-off-by: harini-venkataraman <115449948+harini-venkataraman@users.noreply.github.com> Co-authored-by: Ghost Jake <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Chandrasekharan M <chandrasekharan@zipstack.com>

) * list bucket * greptile review

…fy-on-API-deployment-failures

coderabbitai · 2026-04-29T06:42:35Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8ae7d465-8a99-4091-9933-cd7876819cf3

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

Walkthrough

Implements clubbed (buffered) notification delivery with org-configurable batch intervals, adds delivery modes and failure-only routing, persists per-execution file counts, introduces NotificationBuffer model/endpoints and enqueue/dispatch helpers, normalizes clubbed webhook envelopes and Slack rendering, and wires worker/frontend to use the new flow.

Changes

Core models, enums and buffering

Layer / File(s)	Summary
Model + migration changes `backend/notification_v2/models.py`, `backend/notification_v2/migrations/0002_notification_notify_on_failures.py`, `backend/notification_v2/migrations/0003_add_notification_buffer.py`	Adds `notify_on_failures` and `delivery_mode` to `Notification`; adds `NotificationBuffer` model (webhook grouping keys, payload, flush_after, status, dispatched_at, organization FK) and partial index for pending rows; migrations added.
Enums and constants `backend/notification_v2/enums.py`	Adds `DeliveryMode` and `BufferStatus` enums and `FAILURE_STATUSES` (ERROR, STOPPED) constant used for routing decisions.
Settings and config key `backend/backend/settings/base.py`, `backend/configuration/enums.py`	New Django settings `NOTIFICATION_CLUB_INTERVAL` and `NOTIFICATION_BUFFER_RETENTION_DAYS`; adds `ConfigKey.NOTIFICATION_CLUB_INTERVAL` with min/max validation (60–7200s).

Buffer enqueue, dispatch and helper utilities

Layer / File(s)	Summary
Buffer enqueue & dispatch helpers `backend/notification_v2/helper.py`	Removes legacy NotificationHelper; adds `dispatch_with_delivery_mode()` and `enqueue()` which compute auth sigs, webhook hash, resolve org interval, normalize timestamps, persist `NotificationBuffer` rows, and emit structured metric logs.
Internal API endpoints for buffer `backend/notification_v2/internal_api_views.py`, `backend/notification_v2/internal_urls.py`	Adds `enqueue_notification_buffer` (POST) to accept per-execution events and create buffer rows (with validation and notify_on_failures suppression for INPROGRESS), and `process_notification_buffer` (POST) to group pending rows, claim with FOR UPDATE SKIP LOCKED, render clubbed payloads, mark DISPATCHED, schedule Celery send, and GC terminal rows.
Celery task for dead-lettering `backend/notification_v2/tasks.py`	Adds `mark_buffer_dead_letter` task to mark buffer rows DEAD_LETTER when broker/dispatch exhausts retries.

Clubbed envelope rendering and provider changes

Layer / File(s)	Summary
Canonical envelope & Slack renderer `unstract/core/src/unstract/core/notification_clubbed_renderer.py`, `backend/notification_v2/clubbed_renderer.py`	Adds `build_envelope(payloads)` (caps batches, projects events, computes totals/succeeded/failed) and `render_slack_text(envelope)`; re-exports and dispatches platform-specific render in `clubbed_renderer`.
Webhook providers updated `backend/notification_v2/provider/webhook/webhook.py`, `backend/notification_v2/provider/webhook/api_webhook.py`, `backend/notification_v2/provider/webhook/slack_webhook.py`, `workers/notification/providers/*`	Providers now wrap payloads into canonical envelope; Slack rendering delegates to shared renderer; type annotations added to send/get_headers/format_payload methods; old Block Kit builders removed.

Notification dispatch flow and payload enrichment

Layer / File(s)	Summary
Pipeline/API notification senders `backend/pipeline_v2/notification.py`, `backend/api_v2/notification.py`	Load optional execution, compute `total_files`/`successful_files`/`failed_files`, determine failure if status ∈ `FAILURE_STATUSES` or `failed_files>0`, filter out `notify_on_failures` on non-failure, build `PipelineStatusPayload` with counts, and call `dispatch_with_delivery_mode(...)`.
Payload DTO changes `backend/pipeline_v2/dto.py`, `unstract/core/src/unstract/core/data_models.py`	`PipelineStatusPayload` and `NotificationPayload.from_execution_status` accept and include file-count aggregates under `additional_data`.

Execution file-count denormalization and status updates

Layer / File(s)	Summary
Execution model and migration `backend/workflow_manager/workflow_v2/models/execution.py`, `backend/workflow_manager/workflow_v2/migrations/0020_workflowexecution_file_counts.py`	Adds nullable `successful_files` and `failed_files` fields to `WorkflowExecution` and uses them when computing last-run statuses.
Status update serializer and view `backend/workflow_manager/internal_serializers.py`, `backend/workflow_manager/internal_views.py`	Adds optional `successful_files`/`failed_files` fields with cross-field validation; `update_status` conditionally persists provided file-count fields with targeted `save(update_fields=...)`.
Worker clients forwarding counts `workers/shared/api/internal_client.py`, `workers/shared/clients/execution_client.py`, `workers/callback/tasks.py`	Internal API client and execution client accept/forward `successful_files`/`failed_files`; callback task includes these counts when calling update endpoint.

Worker routing and scheduling changes

Layer / File(s)	Summary
Worker enqueue routing `workers/shared/patterns/notification/helper.py`	Refactors to POST WEBHOOK events to backend enqueue endpoint (`v1/webhook/buffer/enqueue/`), adds `_enqueue_to_buffer` and `_route_notification`, passes `execution_id` when listing notifications, and extends NotificationPayload with file counts.
Buffer processing CLI & scheduler `workers/log_consumer/process_notification_buffer.py`, `workers/log_consumer/scheduler.sh`	Adds CLI to call internal `buffer/process/`; scheduler runs two independent tasks (log history + buffer flush) with separate commands and non-fatal failures.
Scheduler wiring `workers/scheduler/tasks.py`	Passes `execution_id` through when triggering notifications.

Frontend: settings and notification UI

Layer / File(s)	Summary
Platform settings UI `frontend/src/components/settings/platform/PlatformSettings.jsx`	Adds org-scoped notification batch-interval UI: GET/patch `club_interval_seconds` (seconds) endpoint, display in minutes (1–120), validation and save flow.
Create notification modal `frontend/src/components/pipelines-or-deployments/notification-modal/CreateNotification.jsx`	Adds `notify_on_failures` checkbox in the notification creation form and default form state.

Sequence Diagrams

sequenceDiagram
    participant Producer as Notifier Source
    participant Backend as Backend API
    participant BufferAPI as Buffer Enqueue API
    participant DB as NotificationBuffer
    participant Scheduler as Log Consumer / Scheduler
    participant Celery as Celery Worker
    participant Webhook as External Webhook

    Producer->>Backend: Execution completes / trigger notification
    Backend->>Backend: Load execution, compute file counts
    Backend->>Backend: Filter notifications (notify_on_failures logic)
    Backend->>BufferAPI: POST enqueue_notification_buffer (payload + counts)
    BufferAPI->>DB: Persist NotificationBuffer row (PENDING, flush_after)
    Scheduler->>DB: process_notification_buffer (groups due flushes)
    DB->>DB: Claim rows FOR UPDATE SKIP LOCKED
    DB->>DB: build_envelope() / render_clubbed_message()
    DB->>DB: Mark rows DISPATCHED
    DB->>Celery: Enqueue single clubbed send task (with buffer ids)
    Celery->>Webhook: POST clubbed payload
    Webhook-->>Celery: 200 OK
    Celery->>DB: Success -> leave DISPATCHED / failure -> revert to PENDING or mark DEAD_LETTER

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 73.03% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title 'UN-3056 [FEAT] Filter notifications by run outcome' clearly and specifically summarizes the main change—adding outcome-based notification filtering.
Description check	✅ Passed	The PR description is comprehensive, covering all required template sections including What, Why, How, breaking changes, migrations, env config, testing notes, and the contribution guidelines checklist.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch UN-3056-Notify-on-API-deployment-failures

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…fy-on-API-deployment-failures

@jaseemjaskp

* UN-3439 [FIX] Accept wildcard subdomain origins in SocketIO and Django CORS (#1938) * UN-3439 [FIX] Accept wildcard subdomain origins in SocketIO and Django CORS Production socket connections were failing for `*.env.us-central.unstract.com` because python-socketio does exact-string comparison on `cors_allowed_origins`, so a literal `*` pattern silently rejected every real subdomain. - Add `CORS_ALLOWED_ORIGIN_REGEXES` derived from `WEB_APP_ORIGIN_URL_WITH_WILD_CARD`. - Wire SocketIO via `_RegexOrigin` whose `__eq__` does the regex match — single list entry covers all wildcard subdomains, no library subclass needed. - Normalize `WEB_APP_ORIGIN_URL` through `urlparse` so trailing slashes / paths in env are stripped (also fixes the `…com//oauth-status/` double-slash). - Add startup guard for malformed env values. Resolves item #1 of UN-3439. Items #2/#3 (decoupling indexing from Socket.io, fallback) are owned separately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3439 [FIX] Address PR review: canonical origin, fullmatch, unhashable RegexOrigin, tests Addresses five review comments on #1938: 1. coderabbitai (Major) — RFC 6454 canonicalization. Browsers serialize `Origin` headers with a lowercase host and no explicit default ports; `parsed_url.netloc` preserved both, so `https://APP.EXAMPLE.COM:443` would silently fail to match the browser's `https://app.example.com`. Switch to `parsed_url.hostname` + drop default ports, and reject non-http(s) schemes at startup. 2. greptile (P2) — `re.fullmatch` instead of `re.match`. With `re.match` plus `$`, a candidate ending in `\n` matches because `$` is allowed before an optional trailing newline. `fullmatch` removes the ambiguity. 3. self — `_RegexOrigin.__hash__` violated `a == b ⇒ hash(a) == hash(b)` (one fixed pattern hash vs. many matching strings). Today this is masked because python-socketio uses linear `__eq__` on a list, but if the allow-list is ever wrapped in a set, every legitimate subdomain would silently be rejected — exactly the failure mode UN-3439 closes. Make instances unhashable so the contract can't be broken. 4. self — No regression tests. Add `backend/utils/tests/test_cors_origin.py` (33 cases) covering: regex match/no-match, lookalike spoofing, scheme mismatch, trailing-newline rejection, non-string equality protocol, unhashability, ReDoS bounds, URL normalization (case, default ports, trailing slash, paths, queries), startup-guard rejections (empty, no-scheme, non-browser-scheme, no-host), and end-to-end via the same `RegexOrigin` path SocketIO uses. 5. self — Over-clever wildcard-to-regex builder. The `split('*').join(re.escape, ...)` construction generalised to N wildcards but the input has exactly one; replace with a direct rf-string that's self-evident on review. Refactor for testability: extract `RegexOrigin` and `normalize_web_app_origin` into `backend/utils/cors_origin.py` (Django-free, importable from settings and tests). Settings now delegates to one helper call; `log_events.py` imports `RegexOrigin`. No behavioural change beyond what each comment fixes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3439 [FIX] Address SonarCloud quality gate The Sonar quality gate failed with C reliability + 5 security hotspots, all on the new test file: - S905 (Bug, Major) — `{ro}` flagged as no-side-effect statement (Sonar doesn't see the implicit `__hash__` call). Drove the C reliability rating. Fix: use `len({ro})` so the side effect is via an explicit function call; test still asserts the same `TypeError`. - S5727 (Code Smell, Critical) — `assert ro != None` is tautological and doesn't exercise `__eq__`. Switch to `(ro == None) is False` which directly tests that `NotImplemented` falls back to identity-equality. - S5332 × 5 (Hotspots) — `http://` and `ftp://` literals in test data. These are intentional inputs proving the rejection logic. Annotate with `# NOSONAR` and an explanatory comment so the hotspots can be marked reviewed. No production code changed; tests still 33/33 passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3439 [FIX] Remove last S5727 code smell — test __eq__ via dunder Sonar S5727 correctly inferred that ``ro == None`` is statically always False (NotImplemented falls back to identity), making the assertion look tautological. The intent is to lock the protocol contract: ``__eq__`` must return the ``NotImplemented`` sentinel for non-strings. Test that directly via ``ro.__eq__(None) is NotImplemented`` instead of going through ``==``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3439 [FIX] Address remaining CodeRabbit nits — port validation, ReDoS bound Two minor follow-ups from the second CodeRabbit pass: - `parsed.port` is a property that raises ValueError on malformed/out-of-range inputs (e.g. `:abc`, `:99999`). That bypassed our normalized config-error message and surfaced as a stack trace. Wrap the access and re-raise with the same actionable text. Adds two test cases (`https://example.com:abc`, `https://example.com:99999`) to lock the new behaviour. - The 50ms ReDoS timing bound is too tight for noisy CI runners. Loosen to 500ms — still orders of magnitude below what catastrophic backtracking would produce. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ReverseMerge: V0.161.4 hotfix (#1943) * Change csp to report only * [HOTFIX] Bool-parse ENABLE_HIGHLIGHT_API_DEPLOYMENT env var (v0.161.4) (#1939) [HOTFIX] Bool-parse ENABLE_HIGHLIGHT_API_DEPLOYMENT env var (#1937) [FIX] Bool-parse ENABLE_HIGHLIGHT_API_DEPLOYMENT env var os.environ.get returns the raw string when the variable is set, so ENABLE_HIGHLIGHT_API_DEPLOYMENT="False" was truthy in Python (any non-empty string is truthy). Wrap in CommonUtils.str_to_bool so "False" / "false" / "0" actually evaluate to False. The setting is consumed by the cloud configuration plugin's spec default (ConfigSpec.default in plugins/configuration/cloud_config.py) on cloud and on-prem builds. With this fix, an admin who explicitly sets the env var to a falsy string sees highlight data stripped as expected. Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Deepak K <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3448 [FIX] Remove vestigial `uv pip install` line in uv-lock-automation workflow (#1941) * UN-3448 [FIX] Add --system flag to uv pip install in uv-lock-automation workflow Modern uv requires uv pip install to run inside a virtual environment OR with the explicit --system flag. The workflow currently has neither, so it errors out: error: No virtual environment found for Python 3.12.9; run `uv venv` to create an environment, or pass `--system` to install into a non-virtual environment This breaks every PR that touches a pyproject.toml (the workflow's paths filter triggers on those). Last successful run was 2026-04-01, before a behaviour change in uv or astral-sh/setup-uv@v7. The --system flag is exactly what the error message suggests and is correct here — we install pip into the runner's system Python; the downstream uv-lock.sh script creates its own venvs as needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * UN-3448 [FIX] Remove vestigial `uv pip install` line per review Per @jaseemjaskp's review: the pre-step `uv pip install ... pip` does nothing useful for this workflow. The downstream uv-lock.sh script uses uv sync at line 74, which manages its own venvs internally and never invokes pip directly: $ grep -rn 'pip' docker/scripts/uv-lock-gen/ docker/scripts/uv-lock-gen/uv-lock.sh:2:set -o pipefail Only match is pipefail (shell option), no real pip references. Removing the line entirely is cleaner than papering over with --system. The line was likely copy-pasted from a sibling workflow that legitimately needed pip in the system Python. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ReverseMerge: V0.163.2 hotfix (#1946) * [HOTFIX] Use importlib.util.find_spec for pluggable worker discovery (#1918) * [FIX] Use importlib.util.find_spec for pluggable worker discovery _verify_pluggable_worker_exists() previously checked for the literal file `pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin has been compiled to a .so (Nuitka, Cython, or any C extension) — the module is perfectly importable but the pre-check rejects it because only the .py extension is considered. Replace the filesystem check with importlib.util.find_spec(), which is Python's standard way to ask "is this module resolvable by the import system?". It honors every registered finder — source .py, compiled .so, bytecode .pyc, namespace packages, zipimports — so the function now matches what its docstring claims: verifying the module can be loaded, not that a specific file extension is present. Behavior is preserved for existing deployments: - Images with no `pluggable_worker/<name>/` subpackage → find_spec raises ModuleNotFoundError (ImportError subclass) → returns False. - Images with source .py → find_spec resolves the .py → returns True. - Images with compiled .so → find_spec resolves the .so → returns True. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Handle ValueError from find_spec in pluggable worker verification Greptile-flagged edge case: importlib.util.find_spec() can raise ValueError (not just ImportError) when sys.modules has a partially initialised module entry with __spec__ = None from a prior failed import. Broaden the except to catch both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FIX] Resolve api-deployment worker directory from enum import path worker.py:452 did worker_type.value.replace("-", "_") to derive the on-disk dir name. All WorkerType enum values already use underscores, so the replace was a no-op; for API_DEPLOYMENT whose dir is "api-deployment" (hyphen), it resolved to "api_deployment" and the os.path.exists() check failed. Boot then logged a spurious "❌ Worker directory not found: /app/api_deployment" at ERROR level. The task registration path (builder + celery autodiscover via to_import_path) is unaffected, so this was purely log noise — but noise at ERROR level that masks real failures in log scans. Fix: derive the directory from the authoritative to_import_path() which already handles the hyphen case (api_deployment -> api-deployment). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [HOTFIX] Add IAM Role / Instance Profile auth mode to AWS Bedrock adapter (#1944) * [FEAT] Allow Bedrock to fall through to boto3's default credential chain Match the S3/MinIO connector pattern: when AWS access keys are left blank on the Bedrock LLM and embedding adapter forms, drop them from the kwargs dict so boto3's default credential chain handles authentication. This unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on hosts that already have ambient AWS credentials (e.g. EKS workers with IRSA, EC2 with an instance profile). - llm1/static/bedrock.json: clarify access-key descriptions to mention IRSA and instance profile (already non-required at v0.163.2 base). - embedding1/static/bedrock.json: drop aws_access_key_id and aws_secret_access_key from top-level required; same description fix; expose aws_profile_name for parity with the LLM form. - base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters now strip empty access-key values from the validated kwargs before returning, so empty strings don't override boto3's default chain. AWSBedrockEmbeddingParameters fields gain explicit None defaults and an aws_profile_name field. Backward-compatible: existing adapters with access keys filled in continue to work unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [FEAT] Add Authentication Type selector to Bedrock adapter form Add an explicit `auth_type` selector with two options, making the auth choice clear to users: - "Access Keys" (default): existing flow, keys required - "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on boto3's default credential chain (IRSA on EKS, task role on ECS, instance profile on EC2). Description on the selector explicitly notes this option is only for AWS-hosted Unstract deployments. The form-only auth_type field is stripped before LiteLLM validation in both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters. validate(). Empty access keys continue to be stripped so boto3 falls through to the default chain even when the access_keys arm is selected without values (matches the S3/MinIO connector pattern). Backward-compatible: legacy adapters without auth_type behave as "Access Keys" mode (the default), and existing keys are forwarded unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [REVIEW] Address Bedrock auth_type review feedback Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on PR #1944. Behaviour fixes: - Stale-key leak in IAM Role mode: switching an existing adapter from Access Keys to IAM Role would carry truthy stored access keys through the strip-empty-only loop, so boto3 silently authenticated with the old long-lived credentials instead of falling through to the host's IRSA / instance-profile identity. Both LLM and embedding paths were affected. - Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or a malformed payload from a non-UI client passed through the dict comprehension untouched, with no enum guard. - Cross-field validation gap: explicit Access Keys mode with blank or whitespace-only values silently fell through to the default credential chain instead of surfacing the misconfiguration. Implementation: - Add a module-level _resolve_bedrock_aws_credentials helper used by both AWSBedrockLLMParameters.validate() and AWSBedrock EmbeddingParameters.validate(), so the auth-type contract is expressed once. - Validates auth_type against an allowlist (None | "access_keys" | "iam_role"); raises ValueError on anything else. - iam_role: unconditionally drops aws_access_key_id and aws_secret_access_key. - access_keys (explicit): requires non-blank values; raises ValueError if either is empty or whitespace-only. - Legacy (auth_type absent): retains the lenient strip behaviour so pre-PR adapter configurations continue to deserialise unchanged. - Restore aws_region_name as required (no `= None` default) on AWSBedrockEmbeddingParameters; only credentials may legitimately be absent. - Drop the orphan aws_profile_name field from embedding1/static/bedrock.json: it was added for parity with the LLM form but lives outside the auth_type oneOf and contradicts the selector's "no further input" semantics. The LLM form already had aws_profile_name pre-PR and is left alone for backwards compatibility. Tests: - New tests/test_bedrock_adapter.py covers 15 cases across LLM and embedding adapters: legacy-no-auth-type, explicit access_keys with valid/blank/whitespace keys, iam_role with stale/no keys, unknown auth_type rejection, cross-field validation, and preservation of unrelated params (model_id, aws_profile_name, region, thinking). Skipped (P2 nice-to-have): - Comment-scope clarification, MinIO reference rewording, validate-mutates-caller'\''s-dict, and the LLM form description nit about aws_profile_name visibility. These don'\''t change behaviour and can be addressed in a follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> --------- Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com> * batch notification --------- Co-authored-by: ali <117142933+muhammad-ali-e@users.noreply.github.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Co-authored-by: Ritwik G <100672805+ritwik-g@users.noreply.github.com> Co-authored-by: Deepak K <89829542+Deepak-Kesavan@users.noreply.github.com> Co-authored-by: vishnuszipstack <117254672+vishnuszipstack@users.noreply.github.com> Co-authored-by: Praveen Kumar <praveen@zipstack.com> Co-authored-by: Chandrasekharan M <117059509+chandrasekharan-zipstack@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Athul <89829560+athul-rs@users.noreply.github.com>

…m:Zipstack/unstract into UN-3056-Notify-on-API-deployment-failures

…fy-on-API-deployment-failures

* batch notification * notification slack

coderabbitai

Actionable comments posted: 12

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

backend/workflow_manager/internal_serializers.py (1)

176-184: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate the file aggregates against total_files.

These fields are validated independently right now, so impossible payloads like total_files=1, successful_files=2, failed_files=2 will pass and then be persisted. That will skew the new outcome-based notification logic.

Suggested fix

 class WorkflowExecutionStatusUpdateSerializer(serializers.Serializer):
     """Serializer for updating workflow execution status."""
@@
     failed_files = serializers.IntegerField(required=False, min_value=0)
     attempts = serializers.IntegerField(required=False, min_value=0)
     execution_time = serializers.FloatField(required=False, min_value=0)
+
+    def validate(self, attrs):
+        total_files = attrs.get("total_files")
+        successful_files = attrs.get("successful_files")
+        failed_files = attrs.get("failed_files")
+
+        if (successful_files is not None or failed_files is not None) and total_files is None:
+            raise serializers.ValidationError(
+                {"total_files": "total_files is required when file aggregates are provided."}
+            )
+
+        if total_files is not None:
+            if successful_files is not None and successful_files > total_files:
+                raise serializers.ValidationError(
+                    {"successful_files": "successful_files cannot exceed total_files."}
+                )
+            if failed_files is not None and failed_files > total_files:
+                raise serializers.ValidationError(
+                    {"failed_files": "failed_files cannot exceed total_files."}
+                )
+            if (
+                successful_files is not None
+                and failed_files is not None
+                and successful_files + failed_files > total_files
+            ):
+                raise serializers.ValidationError(
+                    "successful_files + failed_files cannot exceed total_files."
+                )
+
+        return attrs

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/workflow_manager/internal_serializers.py` around lines 176 - 184, The
serializer currently validates total_files, successful_files and failed_files
independently; add a validate(self, data) method in the same serializer (where
status, error_message, total_files, successful_files, failed_files, attempts,
execution_time are defined) that, when total_files is provided, enforces that
successful_files and failed_files are each <= total_files (if present) and that
(successful_files + failed_files) <= total_files; also handle the case where
only one of successful_files/failed_files is present by ensuring it does not
exceed total_files, and raise serializers.ValidationError with a clear message
on violation so invalid aggregates like total_files=1, successful_files=2 are
rejected before persisting.

🧹 Nitpick comments (2)

backend/notification_v2/views.py (1)

56-68: ⚡ Quick win

Use tuple for permission_classes class attribute.

Class attributes that are collections should be immutable (tuples) rather than mutable (lists) to avoid potential issues and follow best practices.

♻️ Proposed fix

-    permission_classes = [IsAuthenticated, IsOrganizationAdmin]
+    permission_classes = (IsAuthenticated, IsOrganizationAdmin)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/notification_v2/views.py` around lines 56 - 68, The class attribute
permission_classes on NotificationSettingsView is currently a list; change it to
an immutable tuple to follow best practices by replacing the mutable list
[IsAuthenticated, IsOrganizationAdmin] with a tuple (IsAuthenticated,
IsOrganizationAdmin) so permission_classes is not modifiable at runtime and
matches other DRF class-attribute patterns.

backend/notification_v2/tasks.py (1)

46-50: ⚡ Quick win

Combine the implicitly concatenated strings.

The two string literals on line 47 are implicitly concatenated. While valid Python, this can be error-prone and less readable.

♻️ Proposed fix

     logger.warning(
-        "metric=notification_batch_dispatched_total result=dead_letter rows=%d " "exc=%r",
+        "metric=notification_batch_dispatched_total result=dead_letter rows=%d exc=%r",
         updated,
         exc,
     )

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/notification_v2/tasks.py` around lines 46 - 50, The logger.warning
call currently uses two implicitly concatenated string literals; replace them
with a single combined format string in the logger.warning invocation so the
message is explicit and readable (keep the format placeholders and the same
arguments: updated and exc), e.g., a single string like
"metric=notification_batch_dispatched_total result=dead_letter rows=%d exc=%r"
passed to logger.warning with updated and exc.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/notification_v2/clubbed_renderer.py`:
- Line 3: Update the docstring in clubbed_renderer.py to replace the ambiguous
multiplication symbol "×" with a plain ASCII "x" so it satisfies Ruff rule
RUF002; locate the module-level or function/class docstring that contains the
sentence "The same envelope shape feeds every channel × mode cell so receivers
never" and change "×" to "x" (i.e., "channel x mode") to avoid the lint failure.

In `@backend/notification_v2/helper.py`:
- Around line 34-41: The current auth_sig is built by joining fields with "|"
which is ambiguous because authorization_key/header may contain "|" — change the
construction in helper.py where the hash is computed (the block that builds raw
from notification.authorization_type/authorization_key/authorization_header and
returns hashlib.sha256(...).hexdigest()) to encode the three parts as an
unambiguous structured value before hashing (e.g., create a fixed-order
list/tuple of the three parts with fallback to _AUTH_SIG_NONE and then
JSON-serialize it with stable separators, or use a length-prefixed
concatenation) and hash that encoded representation so different tuples can
never collide due to delimiter characters.

In `@backend/notification_v2/internal_api_views.py`:
- Around line 423-425: The batching flush currently groups rows by
(organization_id, webhook_url, auth_sig) but then uses rows[0].platform when
calling render_clubbed_message in _dispatch_group, causing mixed-platform
batches; update the flush query that persists/groups batches to include platform
in its grouping key (and add the corresponding DB index change) so grouping is
done by (organization_id, webhook_url, auth_sig, platform), and ensure
_dispatch_group (and any other consumer like the code around lines ~496-500)
reads platform from the grouped key rather than assuming rows[0].platform.

In `@backend/notification_v2/migrations/0002_notification_notify_on_failures.py`:
- Around line 10-21: The add-field migration currently creates a BooleanField
notify_on_failures which only distinguishes ALL vs FAILURES_ONLY; change this to
a tri-state field (e.g. models.CharField with choices or a small IntegerField)
on the Notification model in this migration so it can represent ALL,
FAILURES_ONLY, and SUCCESS_ONLY (use explicit choices like ("all","ALL"),
("failures","FAILURES_ONLY"), ("success","SUCCESS_ONLY")), set a sensible
default (e.g. "all"), and update the db_comment to document the three modes;
ensure the migration operation uses the new field type and name
notify_on_failures so downstream code can read the string/enum value rather than
a boolean.

In `@backend/notification_v2/migrations/0003_add_notification_buffer.py`:
- Around line 14-28: The migration adds a non-null CharField delivery_mode to
the notification model with default="BATCHED", which will change existing rows
to BATCHED on deploy; instead, modify the migration to preserve existing
behavior by performing a two-step change: 1) add delivery_mode as nullable (or
without a DB-level default) and include a RunPython data migration that sets
delivery_mode="IMMEDIATE" for existing Notification rows that should remain
immediate, and 2) then add a subsequent migration to set default="BATCHED" and
make the field non-nullable for future records; reference the migration
module/migration class in 0003_add_notification_buffer.py and the model name
"notification" and field name "delivery_mode" when implementing the nullable
field + RunPython backfill, or alternatively write a single migration that uses
RunPython before altering the field to set existing rows to "IMMEDIATE".

In `@backend/notification_v2/models.py`:
- Around line 58-65: The boolean field notify_on_failures on the model cannot
represent the three states required (ALL / FAILURES_ONLY / SUCCESS_ONLY); change
this to a tri-state enum field (e.g., a CharField or IntegerField with explicit
choices like NOTIFY_ALL, NOTIFY_FAILURES_ONLY, NOTIFY_SUCCESS_ONLY) and rename
the DB column/field to something clearer if helpful (e.g., notify_mode or
notify_condition) so intent is explicit; update the corresponding serializer(s)
and any filters/UI code that read/write notify_on_failures to accept and
validate the new enum values and migrate existing boolean data to the new enum
values in a migration.

In `@backend/notification_v2/provider/webhook/api_webhook.py`:
- Line 15: format_payload() is unconditionally wrapping self.payload which
causes double-enveloping on already-enveloped payloads or repeated send() calls;
change format_payload (and any callers like the constructor assignment
self.payload = self.format_payload() and the send() path at the 24-30 block) to
first detect whether the payload is already in the expected envelope shape
(e.g., check for the envelope root key/structure) and return it unchanged if so,
otherwise wrap it; ensure the envelope-detection logic is deterministic and
idempotent so multiple calls to format_payload/send() do not alter an
already-correctly-enveloped payload.

In `@backend/pipeline_v2/notification.py`:
- Around line 14-15: Remove ExecutionStatus.STOPPED from the set treated as
failures: update the _FAILURE_STATUSES definition to exclude
ExecutionStatus.STOPPED and similarly remove/adjust any other checks that
include ExecutionStatus.STOPPED (the second occurrence around the block handling
audience selection at lines 56-60) so that STOPPED executions are no longer
routed to failure-only subscriptions; ensure only true failure statuses (e.g.,
ExecutionStatus.ERROR) remain in _FAILURE_STATUSES and that any conditional
logic using that set (in notification audience selection) treats STOPPED as
non-failure/catch-all.

In `@backend/workflow_manager/workflow_v2/models/execution.py`:
- Around line 440-441: The current mapping sets successful = e.successful_files
or 0 and failed = e.failed_files or 0 which coerces NULL/None to 0 and can hide
unknown historical counts; change these assignments to preserve NULL/None (e.g.,
successful = e.successful_files if e.successful_files is not None else None, and
likewise for failed) and update any downstream status logic that treats 0 as "no
failures" to explicitly handle None as "unknown" (so PARTIAL_SUCCESS isn't
lost). Ensure references to successful and failed in the execution status
computation explicitly check for None versus integer values.

In
`@frontend/src/components/pipelines-or-deployments/notification-modal/CreateNotification.jsx`:
- Line 15: The component currently sends a boolean notify_on_failures which
can't represent SUCCESS_ONLY—replace this with a notify_on enum carrying one of
"ALL", "FAILURES_ONLY", or "SUCCESS_ONLY": update the component state/props
default (in CreateNotification.jsx) to hold notify_on instead of
notify_on_failures, map the form inputs (checkboxes/radio/select) to produce the
correct enum value, and change all payload constructions and update/create API
calls (the places around the existing notify_on_failures usage and the other
block referenced later in the file) to include notify_on with the proper enum
string; also adjust any validation/serialization logic that reads
notify_on_failures to use notify_on.

In `@frontend/src/components/settings/platform/PlatformSettings.jsx`:
- Around line 61-77: The effect in useEffect that fetches org-scoped batch
interval should guard on sessionDetails?.orgId and include proper deps and
cancellation: at the top of the effect return early if !sessionDetails?.orgId,
add sessionDetails?.orgId and axiosPrivate to the dependency array, and
implement request cancellation (e.g., AbortController or axios cancel token) so
in-flight responses don't call setBatchIntervalMinutes after unmount or when
orgId changes; update references to axiosPrivate and setBatchIntervalMinutes
accordingly.

In `@workers/shared/patterns/notification/helper.py`:
- Around line 77-84: The except block in _enqueue_to_buffer() is swallowing
enqueue failures (logging then returning False) which causes
_route_notification() to treat BATCHED delivery as successful; instead,
propagate the failure so a retrying caller can act (or implement local
retry/backoff). Replace the logger.error+return False with logger.exception(...)
to include stack context and then re-raise the exception (or raise a specific
EnqueueError) so _route_notification() sees the failure; make the same change
for the other similar block referenced (lines ~104-106) to avoid silent drops.

---

Outside diff comments:
In `@backend/workflow_manager/internal_serializers.py`:
- Around line 176-184: The serializer currently validates total_files,
successful_files and failed_files independently; add a validate(self, data)
method in the same serializer (where status, error_message, total_files,
successful_files, failed_files, attempts, execution_time are defined) that, when
total_files is provided, enforces that successful_files and failed_files are
each <= total_files (if present) and that (successful_files + failed_files) <=
total_files; also handle the case where only one of
successful_files/failed_files is present by ensuring it does not exceed
total_files, and raise serializers.ValidationError with a clear message on
violation so invalid aggregates like total_files=1, successful_files=2 are
rejected before persisting.

---

Nitpick comments:
In `@backend/notification_v2/tasks.py`:
- Around line 46-50: The logger.warning call currently uses two implicitly
concatenated string literals; replace them with a single combined format string
in the logger.warning invocation so the message is explicit and readable (keep
the format placeholders and the same arguments: updated and exc), e.g., a single
string like "metric=notification_batch_dispatched_total result=dead_letter
rows=%d exc=%r" passed to logger.warning with updated and exc.

In `@backend/notification_v2/views.py`:
- Around line 56-68: The class attribute permission_classes on
NotificationSettingsView is currently a list; change it to an immutable tuple to
follow best practices by replacing the mutable list [IsAuthenticated,
IsOrganizationAdmin] with a tuple (IsAuthenticated, IsOrganizationAdmin) so
permission_classes is not modifiable at runtime and matches other DRF
class-attribute patterns.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d8dfdd8e-ab77-48bb-a35e-1d2d9ef6fd9a

📥 Commits

Reviewing files that changed from the base of the PR and between 9678d78 and 33d77e9.

📒 Files selected for processing (38)

backend/api_v2/notification.py
backend/backend/settings/base.py
backend/configuration/enums.py
backend/notification_v2/clubbed_renderer.py
backend/notification_v2/enums.py
backend/notification_v2/helper.py
backend/notification_v2/internal_api_views.py
backend/notification_v2/internal_serializers.py
backend/notification_v2/internal_urls.py
backend/notification_v2/migrations/0002_notification_notify_on_failures.py
backend/notification_v2/migrations/0003_add_notification_buffer.py
backend/notification_v2/models.py
backend/notification_v2/provider/webhook/api_webhook.py
backend/notification_v2/provider/webhook/slack_webhook.py
backend/notification_v2/provider/webhook/webhook.py
backend/notification_v2/serializers.py
backend/notification_v2/tasks.py
backend/notification_v2/urls.py
backend/notification_v2/views.py
backend/pipeline_v2/dto.py
backend/pipeline_v2/notification.py
backend/workflow_manager/internal_serializers.py
backend/workflow_manager/internal_views.py
backend/workflow_manager/workflow_v2/migrations/0020_workflowexecution_file_counts.py
backend/workflow_manager/workflow_v2/models/execution.py
frontend/src/components/pipelines-or-deployments/notification-modal/CreateNotification.jsx
frontend/src/components/settings/platform/PlatformSettings.jsx
unstract/core/src/unstract/core/data_models.py
workers/callback/tasks.py
workers/log_consumer/process_notification_buffer.py
workers/log_consumer/scheduler.sh
workers/notification/providers/_clubbed_format.py
workers/notification/providers/api_webhook.py
workers/notification/providers/slack_webhook.py
workers/scheduler/tasks.py
workers/shared/api/internal_client.py
workers/shared/clients/execution_client.py
workers/shared/patterns/notification/helper.py

greptile-apps · 2026-05-13T08:02:15Z

Greptile Summary

This PR introduces two related features: a notify_on_failures flag that lets each notification subscription opt out of success-run webhooks, and a BATCHED dispatch system that buffers per-run events into NotificationBuffer and flushes them as a single clubbed message per (org, webhook_url, auth_sig, platform) group every NOTIFICATION_CLUB_INTERVAL. The previous synchronous-dispatch code path is fully removed.

notify_on_failures filter — added to Notification model (migration 0002), serializer, and UI checkbox; correctly applied at the GET-notification and backend-dispatch layers across all three dispatch sites (ETL pipeline, API deployment, worker callback).
BATCHED dispatch pipeline — new NotificationBuffer table (migration 0003), enqueue() helper, process_notification_buffer flush endpoint, and mark_buffer_dead_letter Celery task; the flush job uses SELECT … FOR UPDATE SKIP LOCKED for concurrency safety, on_commit to prevent broker-vs-DB duplicate-send races, and a GC sweep for DISPATCHED/DEAD_LETTER rows.
NotificationSettingsView exposes a GET/PATCH endpoint for the per-org NOTIFICATION_CLUB_INTERVAL override; the scheduler.sh sidecar fires the buffer-flush script alongside the existing log-history job.

Confidence Score: 3/5

Safe to merge for the notify_on_failures feature itself, but the BATCHED dispatch pipeline has a resource leak: PENDING buffer rows for deactivated notifications are never cleaned up and will accumulate indefinitely in notification_buffer.

The GC function only deletes DISPATCHED and DEAD_LETTER rows. When a notification is deactivated its PENDING rows are excluded from every flush dispatch but also excluded from every GC sweep. High-volume deployments that frequently toggle notifications will see the notification_buffer table and its partial covering index grow without bound, with no automated recovery path.

backend/notification_v2/internal_api_views.py — specifically the _gc_terminal_rows function which is missing a cleanup clause for PENDING rows belonging to inactive notifications.

Important Files Changed

Filename	Overview
backend/notification_v2/internal_api_views.py	Central piece of new functionality — buffer enqueue, group-level flush, and GC. PENDING rows for deactivated notifications are never GC'd, creating unbounded table growth.
backend/notification_v2/helper.py	Old NotificationHelper replaced by dispatch_with_delivery_mode + enqueue; auth_sig and flush_after computed at write time. Clean redesign.
backend/notification_v2/models.py	Adds notify_on_failures bool, delivery_mode, and NotificationBuffer model with partial covering index. Well-structured.
backend/notification_v2/tasks.py	New mark_buffer_dead_letter task used as Celery link_error callback; parameter names (request, exc, traceback) suggest on_failure semantics but link_error only passes the failed task UUID as first arg. buffer_row_ids from kwargs works correctly regardless.
unstract/core/src/unstract/core/notification_clubbed_renderer.py	Shared renderer producing the canonical {summary, events} envelope with Slack mrkdwn output.
workers/shared/patterns/notification/helper.py	Old direct-dispatch (send_notification_to_worker) removed in favour of _route_notification to _enqueue_to_buffer POST. execution_id forwarding added to GET calls so the backend can apply notify_on_failures filter.

Sequence Diagram

sequenceDiagram
    participant W as Worker
    participant BE as Backend API
    participant DB as NotificationBuffer
    participant Broker as Celery Broker

    W->>BE: "GET /notifications/?execution_id=X"
    BE->>DB: Filter Notification (is_active, notify_on_failures)
    BE-->>W: "[{notification_id, platform, ...}]"

    loop each notification
        W->>BE: POST /buffer/enqueue/
        BE->>DB: "INSERT NotificationBuffer(status=PENDING)"
        BE-->>W: "{buffer_row_id}"
    end

    Note over BE,DB: Scheduler fires every interval

    W->>BE: POST /buffer/process/
    BE->>DB: "GROUP BY (org,url,auth_sig,platform) WHERE flush_after<=now"
    loop each ready group
        BE->>DB: SELECT FOR UPDATE SKIP LOCKED
        BE->>DB: "UPDATE status=DISPATCHED"
        BE->>Broker: on_commit to send_webhook_notification
    end
    BE->>DB: GC old DISPATCHED/DEAD_LETTER rows
    BE-->>W: "{dispatched_groups, dispatched_rows}"

    alt retries exhausted
        Broker->>BE: mark_buffer_dead_letter(buffer_row_ids)
        BE->>DB: "UPDATE status=DEAD_LETTER"
    end

Prompt To Fix All With AI

Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
backend/notification_v2/internal_api_views.py:382-387
**PENDING rows for deactivated notifications accumulate forever**

When a notification is deactivated (`is_active=False`), the flush job's `notification__is_active=True` filter prevents those PENDING rows from ever being dispatched — but `_gc_terminal_rows()` only deletes DISPATCHED and DEAD_LETTER rows. The orphaned PENDING rows grow without bound, bloating `notification_buffer` and the `idx_notif_buffer_pending` partial index, and adding noise to every flush-tick GROUP BY scan. A GC pass for old PENDING rows belonging to inactive notifications closes the gap.

```suggestion
    cutoff = timezone.now() - timedelta(days=settings.NOTIFICATION_BUFFER_RETENTION_DAYS)
    deleted_count, _ = NotificationBuffer.objects.filter(
        status__in=[BufferStatus.DISPATCHED.value, BufferStatus.DEAD_LETTER.value],
        created_at__lt=cutoff,
    ).delete()
    # Also GC PENDING rows whose source notification has been deactivated.
    # These are never dispatched (notification__is_active=True filter in
    # _dispatch_group) and the terminal-row filter above never reaches them,
    # so without this clause they accumulate indefinitely.
    inactive_deleted, _ = NotificationBuffer.objects.filter(
        status=BufferStatus.PENDING.value,
        notification__is_active=False,
        created_at__lt=cutoff,
    ).delete()
    return int(deleted_count) + int(inactive_deleted)
```

### Issue 2 of 2
backend/notification_v2/tasks.py:24-35
**Misleading parameter names for a `link_error` callback**

Celery's `link_error` passes the **UUID string of the failed task** as the sole positional argument — not `(request, exc, traceback)` as an `on_failure` handler would. So `request` receives the failed task's ID, while `exc` and `traceback` are always `None`. The function works correctly because `buffer_row_ids` is bound via `kwargs` and is the only field the logic touches, but the signature creates a false impression that callers can recover the exception object from `exc`, which would silently be `None`.

_{Reviews (13): Last reviewed commit: "greptile review" | Re-trigger Greptile}

kirtimanmishrazipstack · 2026-05-13T09:04:22Z

Review triage summary

Thanks to @coderabbitai and @greptile-apps for the thorough pass. Status of each item:

Fixed in this branch

CR clubbed_renderer.py × → x (RUF002)
CR auth_sig JSON-encoded before SHA-256 (no delimiter-collision)
CR + Greptile P1: platform added to flush grouping and folded into the original idx_notif_buffer_pending partial index in migration 0003 (in lockstep with Meta.indexes)
CR + Greptile P2: PlatformSettings.jsx useEffect guards on orgId + adds it to deps
CR: _enqueue_to_buffer now logger.exception/raises; caller catches per-row so sibling notifications still iterate
Greptile P1 duplicate-delivery race: dispatch order in _dispatch_group reversed — UPDATE first inside transaction.atomic(), broker send_task runs via transaction.on_commit(...); broker failure post-commit reverts rows to PENDING for next-tick retry
Greptile P2: _FAILURE_STATUSES consolidated into a single notification_v2.enums.FAILURE_STATUSES
CR outside-diff: WorkflowExecutionStatusUpdateSerializer.validate(...) added for cross-field aggregate checks
CR nitpick (tasks.py): implicit string concat merged

Stale / wont fix (replies posted on each thread with reasoning)\n- Tri-state `notify_on` (CR migration 0002 / model / CreateNotification.jsx) — PR scope cut to boolean `notify_on_failures`; PR description updated to match\n- BATCHED-as-default migration backfill (CR + Greptile P1) — deliberate v2.7 design choice; operator note added to PR description\n- `STOPPED` as failure (CR) — `db_comment` in migration 0002 is the spec\n- `APIWebhook` double-wrap (CR) — fresh instance per dispatch in current dispatcher loop; not reachable today\n- Historical `NULL`-aggregate `PARTIAL_SUCCESS` in `last_run_statuses` tooltip (CR) — display-only path outside UN-3056 scope; deferred\n\n### Nitpick — not adopting\n- `permission_classes` as tuple (CR `views.py:56-68`): project convention is `list` across every DRF view (verified across 10+ files including `pipeline_v2/execution_view.py`, `dashboard_metrics/views.py`, `platform_api/views.py`). Keeping consistent with house style.

coderabbitai · 2026-05-13T09:04:43Z

Tip

For best results, initiate chat on the files or code changes.

@kirtimanmishrazipstack Thanks for the thorough triage — really appreciate the clear breakdown.

All the fixes look solid:

The transaction.on_commit(...) broker dispatch ordering is the right approach for the duplicate-delivery race.
_FAILURE_STATUSES consolidation into notification_v2.enums is a clean improvement.
The WorkflowExecutionStatusUpdateSerializer.validate(...) cross-field check is a good catch to have in place.

The reasoning on the won't-fix items is sound — the db_comment as spec for STOPPED, the deliberate BATCHED-as-default design decision, and the deferred display-only PARTIAL_SUCCESS path are all reasonable calls for this PR's scope.

Noted on the permission_classes convention — keeping it as a list is the right call for project consistency.

📝 Learnings were identified and not saved (knowledge base disabled). Enable

greptile-apps · 2026-05-13T09:05:07Z

Thanks for the thorough triage write-up. All of my P1 and P2 items are confirmed addressed:

Duplicate-delivery race → UPDATE inside transaction.atomic() + on_commit broker dispatch with PENDING revert on post-commit broker failure. Correct pattern.
platform flush grouping → migration 0004 rebuilds the partial index in lockstep.
FAILURE_STATUSES consolidation → single source of truth in notification_v2.enums.
_enqueue_to_buffer error handling → logger.exception/raise with per-row catch in the caller so siblings still iterate.
useEffect orgId guard → dependency array fixed.

The "wont fix" reasoning all holds up:

Boolean notify_on_failures scope cut is reasonable and the PR description is updated to match.
BATCHED-as-default is a deliberate v2.7 design choice with operator notes documented.
STOPPED as failure is spec'd in the migration db_comment.
APIWebhook double-wrap is not reachable with the current dispatcher loop structure.
permission_classes as list — consistent with the project's house style across DRF views.

LGTM from my side. The implementation is solid and the edge cases I flagged are all properly handled.

Tip: You can customize Greptile's behavior for this repo with .greptile/rules.md and .greptile/config.json.

coderabbitai · 2026-05-13T09:09:18Z

Tip

For best results, initiate chat on the files or code changes.

@kirtimanmishrazipstack Thanks for the detailed triage — really well organized. Everything looks good:

Fixes confirmed:

transaction.on_commit(...) for broker dispatch is exactly the right pattern for the duplicate-delivery race.
_FAILURE_STATUSES consolidation into notification_v2.enums is a clean improvement for future consumers.
The WorkflowExecutionStatusUpdateSerializer.validate(...) cross-field check is a solid guard to have.
Per-row exception handling in _enqueue_to_buffer preserving sibling iteration is the right resilience model.

Won't-fix reasoning is sound:

Boolean notify_on_failures is a cleaner scope for this PR; the tri-state can come in a follow-up if needed.
BATCHED-as-default as a deliberate v2.7 design choice with an operator note in the PR description is the right way to handle that.
db_comment as the spec for STOPPED-as-failure is clear and self-documenting.
The PARTIAL_SUCCESS display path being deferred is a reasonable call — it's isolated and not on the critical notification path.

permission_classes convention noted — keeping list for consistency with the rest of the project is the right call.

📝 Learnings were identified and not saved (knowledge base disabled). Enable

Keep execution_id in _ENQUEUE_REQUIRED_FIELDS as the canonical required set; carve out the INPROGRESS exemption at the validator instead of dropping it broadly. Non-INPROGRESS callers (COMPLETED / ERROR / STOPPED / PARTIAL_SUCCESS) once again get a loud 400 if they omit execution_id, addressing Greptile's silent-failure concern on e653494. Extends the comment above the tuple to also flag the consumer-side gap: INPROGRESS buffer rows ship with execution_id=null, so API receivers cannot correlate them with execution logs until the producer-reorder follow-up (UN-3056) lands. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

kirtimanmishrazipstack · 2026-05-13T12:25:27Z

@greptileai thanks for the scope-pushback — fully agree on the silent-failure surface. Resolved in 724d280c6:

Point 1 — scope of the relaxation. execution_id is back in _ENQUEUE_REQUIRED_FIELDS; the exemption now fires only when body.status == Pipeline.PipelineStatus.INPROGRESS. Non-INPROGRESS callers (COMPLETED / ERROR / STOPPED / PARTIAL_SUCCESS) get a loud 400 if they omit execution_id. Kept the canonical tuple as the single source of truth and put the exemption local to the validator, rather than splitting required-set into a tuple + REQUIRED_FIELDS_EXCEPT_INPROGRESS set — equivalent semantics, fewer places to drift over time. Comment above the tuple names the exact producer (workers/scheduler/tasks.py, UN-2850 / #1562) and the long-term fix layer.

Point 2 — consumer-side correlation gap. Confirmed: INPROGRESS buffer rows do store execution_id=null, and API webhook receivers can't tie them back to execution logs. Renderer's _MISSING = "—" only papers over the display; the data gap is real. Documented in the same comment block (calls out the receiver impact + the UN-3056 follow-up that closes it). Not fixable here without a producer reorder — that lives in UN-3056.

Point 3 — producer location. Confirmed: _send_pipeline_status_notification fires with execution_id=None at tasks.py:~316, before _execute_scheduled_workflow returns an execution_id. Agreed it's a small, contained diff once UN-3056 picks it up.

Point 4 — tracking. UN-3056 is the open follow-up. The producer reorder + the consumer-side correlation gap close together there.

greptile-apps · 2026-05-13T12:27:11Z

Thanks for the thorough write-up on all four points. The reasoning is solid. One thing worth flagging: commit 724d280c6 doesn't appear in the branch's commit history — the current HEAD is e6534949 ("greptile comments resolve"), and the current internal_api_views.py still has execution_id outside _ENQUEUE_REQUIRED_FIELDS:

# `execution_id` is intentionally optional: the scheduler INPROGRESS path
# (workers/scheduler/tasks.py, UN-2850 / #1562) fires before WorkflowExecution
# is created. Renderer falls back to `—` for missing values.
_ENQUEUE_REQUIRED_FIELDS = (
    "notification_id",
    "pipeline_id",
    "pipeline_name",
    "status",
    "platform",
)

If the INPROGRESS-exemption approach is landing in a follow-up push, that's fine — but the described behavior (hard 400 for non-INPROGRESS callers missing execution_id) isn't in the branch yet. Once pushed, the pattern you described — canonical tuple + local INPROGRESS carve-out in the validator — is exactly right, and the scoped comment tying it to the producer + issue number is appreciated.

On points 2–4: the acknowledgement that execution_id=null on INPROGRESS rows is a real data gap (not just a display gap), and that the producer reorder belongs in UN-3056, is the correct framing. The comment block explaining the receiver impact closes the "silent failure" concern I had.

Effectively LGTM once the INPROGRESS-exemption commit lands and is visible on the branch.

kirtimanmishrazipstack · 2026-05-13T12:37:32Z

@greptileai you were right that 724d280c6 wasn't on the branch when you wrote that — the push lagged the reply. Now landed: branch HEAD is 4104fafb2 which carries both 724d280c6 (execution_id exemption scoped to INPROGRESS) and 4104fafb2 (comment tightened).

Current internal_api_views.py:289-301:

# `execution_id` is required except for INPROGRESS, which fires from the
# scheduler (workers/scheduler/tasks.py, UN-2850) before WorkflowExecution
# exists. INPROGRESS rows therefore store execution_id=null — receivers
# cannot correlate with execution logs until the producer-reorder lands
# (UN-3056).
_ENQUEUE_REQUIRED_FIELDS = (
    "notification_id",
    "pipeline_id",
    "pipeline_name",
    "status",
    "platform",
    "execution_id",
)

execution_id is back inside the tuple; the carve-out lives in the validator and only fires when body.status == Pipeline.PipelineStatus.INPROGRESS. Re-scan when you get a chance — the LGTM condition should now be met.

_dispatch_group's lock query did not check notification.is_active, so PENDING NotificationBuffer rows tied to a deactivated source notification still dispatched on the next flush tick (up to one NOTIFICATION_CLUB_INTERVAL of stale traffic). IMMEDIATE deactivation is instant because the GET notifications endpoint filters by is_active=True; this restores the same expectation for BATCHED. Also adds select_related("notification") so the later rows[0].notification read is part of the same query rather than a per-group round-trip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/notification_v2/helper.py`:
- Around line 71-91: The current build_webhook_headers(Notification) recreates
headers from the live Notification and can mix enqueue-time auth_sig buckets
with updated credentials; change the dispatch to use the snapshot stored on the
buffer row instead of reading Notification at flush time: stop calling
build_webhook_headers(notification) directly in
internal_api_views._dispatch_group and instead read auth_type, auth_key,
auth_header (or auth_sig) that were saved on the buffer row when enqueuing and
pass those into a new helper (or overload build_webhook_headers to accept raw
auth fields) so headers are deterministic per row; alternatively implement logic
in the edit path to re-key/rebucket pending buffer rows when Notification auth
is changed so queued rows never use current Notification state at flush time.
- Around line 149-150: When computing flush_after, honor the per-event
delivery_mode instead of always using
get_org_club_interval_seconds(organization): if delivery_mode equals IMMEDIATE
(or the equivalent enum/constant used in your codebase) set flush_after to
timezone.now() (or skip batching) so the event is delivered immediately;
otherwise compute interval_seconds = get_org_club_interval_seconds(organization)
and set flush_after = timezone.now() + timedelta(seconds=interval_seconds).
Update the code paths that set flush_after (and any callers that assume
batching) to handle the IMMEDIATE branch accordingly.

In `@workers/shared/patterns/notification/helper.py`:
- Line 61: Update the exception handler line that currently reads "except
Exception:  # noqa: BLE001 — propagate any failure, don't classify" so the noqa
suppression is standalone and the rationale is a separate comment line;
specifically, change the handler to have the inline "# noqa: BLE001" only on the
"except Exception:" line (no extra text after the noqa) and add a following
comment line like "# propagate any failure, don't classify" above the handler
body; locate the "except Exception" block in helper.py (the except Exception
handler) and make this two-line comment split to satisfy Sonar.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dac68d3e-c63e-4a54-9c37-26332e88cf26

📥 Commits

Reviewing files that changed from the base of the PR and between 77b7956 and 780d51c.

📒 Files selected for processing (15)

backend/notification_v2/clubbed_renderer.py
backend/notification_v2/enums.py
backend/notification_v2/helper.py
backend/notification_v2/internal_api_views.py
backend/notification_v2/provider/__init__.py
backend/notification_v2/provider/notification_provider.py
backend/notification_v2/provider/registry.py
backend/notification_v2/provider/webhook/__init__.py
backend/notification_v2/provider/webhook/api_webhook.py
backend/notification_v2/provider/webhook/slack_webhook.py
backend/notification_v2/provider/webhook/webhook.py
unstract/core/src/unstract/core/notification_clubbed_renderer.py
workers/notification/providers/api_webhook.py
workers/notification/providers/slack_webhook.py
workers/shared/patterns/notification/helper.py

💤 Files with no reviewable changes (5)

backend/notification_v2/provider/registry.py
backend/notification_v2/provider/notification_provider.py
backend/notification_v2/provider/webhook/webhook.py
backend/notification_v2/provider/webhook/api_webhook.py
backend/notification_v2/provider/webhook/slack_webhook.py

🚧 Files skipped from review as they are similar to previous changes (6)

backend/notification_v2/clubbed_renderer.py
backend/notification_v2/enums.py
unstract/core/src/unstract/core/notification_clubbed_renderer.py
workers/notification/providers/api_webhook.py
backend/notification_v2/internal_api_views.py
workers/notification/providers/slack_webhook.py

coderabbitai

🧹 Nitpick comments (1)

workers/shared/patterns/notification/helper.py (1)
77-105: 💤 Low value

Import NotificationType from unstract.core.data_models and replace the hardcoded "WEBHOOK" string.

The file already imports from unstract.core.data_models (line 11); simply add NotificationType to that import list and use NotificationType.WEBHOOK.value at line 90. This prevents silent breakage if the backend ever changes the literal value and keeps the worker and backend aligned.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@workers/shared/patterns/notification/helper.py` around lines 77 - 105, Update
the _route_notification function to use the enum instead of a hardcoded string:
add NotificationType to the existing import from unstract.core.data_models and
replace the comparison notification.get("notification_type") != "WEBHOOK" with
notification.get("notification_type") != NotificationType.WEBHOOK.value; this
keeps the worker aligned with backend literal values and avoids brittle string
checks while preserving the existing behavior and exception handling in
_route_notification and its call to _enqueue_to_buffer.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@workers/shared/patterns/notification/helper.py`:
- Around line 77-105: Update the _route_notification function to use the enum
instead of a hardcoded string: add NotificationType to the existing import from
unstract.core.data_models and replace the comparison
notification.get("notification_type") != "WEBHOOK" with
notification.get("notification_type") != NotificationType.WEBHOOK.value; this
keeps the worker aligned with backend literal values and avoids brittle string
checks while preserving the existing behavior and exception handling in
_route_notification and its call to _enqueue_to_buffer.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 73ef426c-df79-43aa-98b0-9ead57a43fc5

📥 Commits

Reviewing files that changed from the base of the PR and between ba04d58 and b1fd243.

📒 Files selected for processing (2)

backend/notification_v2/internal_api_views.py
workers/shared/patterns/notification/helper.py

🚧 Files skipped from review as they are similar to previous changes (1)

backend/notification_v2/internal_api_views.py

github-actions · 2026-05-13T15:16:47Z

Frontend Lint Report (Biome)

✅ All checks passed! No linting or formatting issues found.

github-actions · 2026-05-13T15:17:46Z

Test Results

Summary

✅ Runner Tests: 11 passed, 0 failed (11 total)
✅ SDK1 Tests: 325 passed, 0 failed (325 total)

Runner Tests - Full Report

filepath	function	$$\textcolor{#23d18b}{\tt{passed}}$$	SUBTOTAL
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_logs}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_cleanup}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_cleanup\_skip}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_client\_init}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_get\_image\_exists}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_get\_image}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_get\_container\_run\_config}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_get\_container\_run\_config\_without\_mount}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_run\_container}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_get\_image\_for\_sidecar}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{runner/src/unstract/runner/clients/test\_docker.py}}$$	$$\textcolor{#23d18b}{\tt{test\_sidecar\_container}}$$	$$\textcolor{#23d18b}{\tt{1}}$$	$$\textcolor{#23d18b}{\tt{1}}$$
$$\textcolor{#23d18b}{\tt{TOTAL}}$$		$$\textcolor{#23d18b}{\tt{11}}$$	$$\textcolor{#23d18b}{\tt{11}}$$

SDK1 Tests - Full Report

…m:Zipstack/unstract into UN-3056-Notify-on-API-deployment-failures

sonarqubecloud · 2026-05-14T11:55:42Z

Quality Gate passed

Issues
5 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

kirtimanmishrazipstack and others added 6 commits April 24, 2026 16:25

api deployment notification init

60d4816

UN-3358 [FIX] Drop cross-region S3 buckets from connector listing (#1931

5d591dc

) * list bucket * greptile review

Merge branch 'main' of github.com:Zipstack/unstract into UN-3056-Noti…

6ad5fa9

…fy-on-API-deployment-failures

kirtimanmishrazipstack changed the title ~~Un 3056 notify on api deployment failures~~ UN-3056 [FEAT] Filter notifications by run outcome Apr 29, 2026

kirtimanmishrazipstack and others added 13 commits May 5, 2026 20:49

payload metadata in api deployment

019f33c

slack webhook payload

9cd8eb1

Merge branch 'main' of github.com:Zipstack/unstract into UN-3056-Noti…

968fdbd

…fy-on-API-deployment-failures

Merge branch 'UN-3056-Notify-on-API-deployment-failures' of github.co…

52e8abf

…m:Zipstack/unstract into UN-3056-Notify-on-API-deployment-failures

Merge branch 'main' of github.com:Zipstack/unstract into UN-3056-Noti…

7d455c1

…fy-on-API-deployment-failures

Merge branch 'main' of github.com:Zipstack/unstract into UN-3056-Noti…

0a5b5bb

…fy-on-API-deployment-failures

Merge branch 'main' into UN-3056-Notify-on-API-deployment-failures

d98f580

Uns 611 clubbed notification dispatch (#1959)

b8bf719

* batch notification * notification slack

notification API

37930d2

delivery mode batch by default

f8052db

UI change

8798737

Merge branch 'main' into UN-3056-Notify-on-API-deployment-failures

33d77e9

kirtimanmishrazipstack marked this pull request as ready for review May 13, 2026 07:55

coderabbitai Bot reviewed May 13, 2026

View reviewed changes

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

Comment thread backend/notification_v2/migrations/0003_add_notification_buffer.py

Comment thread backend/notification_v2/internal_api_views.py Outdated

Comment thread backend/api_v2/notification.py Outdated

Comment thread frontend/src/components/settings/platform/PlatformSettings.jsx Outdated

PR reviews

50917e8

greptile comments resolve

4104faf

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

Comment thread backend/notification_v2/internal_api_views.py

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

Comment thread backend/notification_v2/internal_api_views.py

kirtimanmishrazipstack and others added 2 commits May 13, 2026 18:15

greptile comments resolve

77b7956

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

Comment thread backend/pipeline_v2/dto.py

remove immediate mode

780d51c

kirtimanmishrazipstack marked this pull request as draft May 13, 2026 14:02

coderabbitai Bot reviewed May 13, 2026

View reviewed changes

Comment thread backend/notification_v2/helper.py

Comment thread backend/notification_v2/helper.py

Comment thread workers/shared/patterns/notification/helper.py Outdated

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

Comment thread backend/notification_v2/internal_api_views.py

kirtimanmishrazipstack added 2 commits May 13, 2026 19:40

add legacy code

743e9a9

add legacy code

ba04d58

kirtimanmishrazipstack marked this pull request as ready for review May 13, 2026 14:16

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

Comment thread backend/notification_v2/helper.py

greptile review

b1fd243

coderabbitai Bot reviewed May 13, 2026

View reviewed changes

kirtimanmishrazipstack marked this pull request as draft May 13, 2026 15:16

kirtimanmishrazipstack marked this pull request as ready for review May 13, 2026 15:16

kirtimanmishrazipstack marked this pull request as draft May 13, 2026 15:16

greptile-apps Bot reviewed May 13, 2026

View reviewed changes

Comment thread backend/notification_v2/internal_api_views.py Outdated

kirtimanmishrazipstack and others added 3 commits May 14, 2026 17:10

Merge branch 'main' into UN-3056-Notify-on-API-deployment-failures

51910ab

greptile review

fb47c4f

Merge branch 'UN-3056-Notify-on-API-deployment-failures' of github.co…

99d3653

…m:Zipstack/unstract into UN-3056-Notify-on-API-deployment-failures

Conversation

kirtimanmishrazipstack commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

How

Can this PR break any existing features. If yes, please list possible items. If no, please explain why.

Database Migrations

Env Config

Relevant Docs

Related Issues or PRs

Dependencies Versions

Notes on Testing

Screenshots

Checklist

Uh oh!

coderabbitai Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagrams

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

greptile-apps Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Sequence Diagram

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

kirtimanmishrazipstack commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review triage summary

Fixed in this branch

Uh oh!

coderabbitai Bot commented May 13, 2026

Uh oh!

greptile-apps Bot commented May 13, 2026

Uh oh!

coderabbitai Bot commented May 13, 2026

Uh oh!

kirtimanmishrazipstack commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

greptile-apps Bot commented May 13, 2026

Uh oh!

Uh oh!

kirtimanmishrazipstack commented May 13, 2026

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

kirtimanmishrazipstack commented Apr 29, 2026 •

edited

Loading

coderabbitai Bot commented Apr 29, 2026 •

edited

Loading

greptile-apps Bot commented May 13, 2026 •

edited

Loading

kirtimanmishrazipstack commented May 13, 2026 •

edited

Loading

kirtimanmishrazipstack commented May 13, 2026 •

edited

Loading