Webhook: Add Fingerprinting to Webhook Payloads#765
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a WEBHOOK_SECRET provider and threads project/organization-scoped webhook secrets into callback dispatch. Callbacks are optionally HMAC-SHA256-signed (signature + timestamp headers) and sent as pre-serialized compact JSON bytes to ensure the signed body matches the transmitted payload. (≤50 words) Changes
Sequence Diagram(s)sequenceDiagram
participant Service as Service Layer
participant DB as Provider DB
participant Signer as HMAC Signer
participant Webhook as External Webhook
Service->>DB: get_webhook_secret(project_id, organization_id)
DB-->>Service: secret or None
Service->>Service: build callback payload (dict)
Service->>Signer: compact-serialize(payload) -> raw_body bytes
alt secret present
Service->>Signer: sign_webhook_payload(secret, raw_body)
Signer-->>Service: (signature, timestamp)
Service->>Webhook: POST callback_url with data=raw_body and headers: Content-Type, X-Webhook-Signature, X-Webhook-Timestamp
else no secret
Service->>Webhook: POST callback_url with data=raw_body and header: Content-Type
end
Webhook-->>Service: HTTP response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
backend/app/services/collections/delete_collection.py (2)
241-247:⚠️ Potential issue | 🟠 MajorSign successful delete-collection callbacks too.
The success path still calls
send_callback(...)withoutwebhook_secret, so successful deletion callbacks remain unsigned while failure callbacks are signed.🔐 Proposed fix
if deletion_request.callback_url and collection_job: success_payload = build_success_payload( collection_job=collection_job, collection_id=collection_id, ) - send_callback(deletion_request.callback_url, success_payload) + with Session(engine) as session: + creds = get_provider_credential( + session=session, + org_id=organization_id, + project_id=project_id, + provider="webhook_secret", + ) + webhook_secret = ( + creds.get("webhook_secret") if isinstance(creds, dict) else None + ) + send_callback( + deletion_request.callback_url, + success_payload, + webhook_secret=webhook_secret, + )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/collections/delete_collection.py` around lines 241 - 247, The success callback path currently calls send_callback(deletion_request.callback_url, success_payload) without the webhook secret, leaving successful delete-collection callbacks unsigned; update the call to include the webhook secret used for signing (pass the same webhook_secret used for failure callbacks, e.g. deletion_request.webhook_secret or the variable used elsewhere) so change the send_callback invocation in the block that builds success_payload (references: deletion_request, build_success_payload, collection_job, collection_id, send_callback) to pass the webhook_secret as the signing argument.
248-282:⚠️ Potential issue | 🔴 CriticalAdd
organization_idparameter and fix orphaned exception block.The code has a syntax error: the
exceptat line 274 is not attached to anytrystatement. Additionally, the first exception handler (line 248–272) has two issues:
- Missing
organization_idargument in the_mark_job_failed_and_callback()call (line 251)- The code attempts to use
success_payloadat line 270, but it's only defined inside a conditional block at lines 242–245 and may not exist when an exception occursRemove the webhook callback code (lines 258–272) and the orphaned
exceptblock (lines 274–282), then addorganization_idto the first_mark_job_failed_and_callbackcall andraiseto propagate the exception.Proposed fix
except Exception as err: span.record_exception(err) span.set_status(trace.Status(trace.StatusCode.ERROR, str(err))) _mark_job_failed_and_callback( + organization_id=organization_id, project_id=project_id, collection_id=collection_id, job_id=job_uuid, err=err, callback_url=deletion_request.callback_url, ) - with Session(engine) as session: - creds = get_provider_credential( - session=session, - org_id=organization_id, - project_id=project_id, - provider="webhook_secret", - ) - webhook_secret = ( - creds.get("webhook_secret") if isinstance(creds, dict) else None - ) - send_callback( - deletion_request.callback_url, - success_payload, - webhook_secret=webhook_secret, - ) - - - except Exception as err: - _mark_job_failed_and_callback( - organization_id=organization_id, - project_id=project_id, - collection_id=collection_id, - job_id=job_uuid, - err=err, - callback_url=deletion_request.callback_url, - ) + raise🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/collections/delete_collection.py` around lines 248 - 282, In the exception handler inside delete_collection.py update the first _mark_job_failed_and_callback call to include organization_id, remove the downstream webhook callback code that references send_callback and success_payload (delete the block that fetches webhook_secret and calls send_callback), delete the orphaned second except block, and re-raise the caught exception after recording it (i.e., keep span.record_exception/ set_status, call _mark_job_failed_and_callback with organization_id, then raise the exception) so the error is propagated.backend/app/services/llm/jobs.py (3)
184-208:⚠️ Potential issue | 🟠 MajorSend the failure webhook only once, with the signature.
Line 197 sends the signed callback, then Lines 202-208 send the same failure callback again without
webhook_secret. This can double-trigger client-side webhook handlers and make the second delivery fail signature validation.Suggested fix
if callback_url: webhook_secret = None if organization_id is not None and project_id is not None: with Session(engine) as session: creds = get_provider_credential( session=session, org_id=organization_id, project_id=project_id, provider="webhook_secret", ) webhook_secret = ( creds.get("webhook_secret") if isinstance(creds, dict) else None ) - send_callback( - callback_url=callback_url, - data=callback_response.model_dump(), - webhook_secret=webhook_secret, - ) with tracer.start_as_current_span("llm.send_callback") as cb_span: cb_span.set_attribute("callback.url", callback_url) cb_span.set_attribute("callback.status", "failure") send_callback( callback_url=callback_url, data=callback_response.model_dump(), + webhook_secret=webhook_secret, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/jobs.py` around lines 184 - 208, The code is sending the failure webhook twice—once with webhook_secret and again without—causing duplicate deliveries and signature failures; modify the block around send_callback, webhook_secret, get_provider_credential and the tracer span so you only call send_callback once for the failure path and include the webhook_secret (if available) in that single call, remove the redundant send_callback invocation, and keep the tracer span (llm.send_callback) setting "callback.url" and "callback.status" around that single send to preserve tracing.
963-979:⚠️ Potential issue | 🔴 CriticalFix duplicate and improperly indented code in the exception handler.
The code at lines 963-978 contains duplicate
callback_responseandhandle_job_errorcalls with inconsistent indentation. The first version (lines 963-973) is not indented to the except block, and the second (lines 974-978) is over-indented. Additionally, the second version omits theorganization_idandproject_idparameters, losing critical webhook context.Keep only the properly indented version with all required parameters inside the
exceptblock:Fix
- callback_response = APIResponse.failure_response( - error="Unexpected error occurred", - metadata=request.request_metadata, - ) - return handle_job_error( - job_uuid, - callback_url_str, - callback_response, - organization_id=organization_id, - project_id=project_id, - ) - callback_response = APIResponse.failure_response( - error="Unexpected error occurred", - metadata=request.request_metadata, - ) - return handle_job_error(job_uuid, callback_url_str, callback_response) + callback_response = APIResponse.failure_response( + error="Unexpected error occurred", + metadata=request.request_metadata, + ) + return handle_job_error( + job_uuid, + callback_url_str, + callback_response, + organization_id=organization_id, + project_id=project_id, + ) finally: # Ensure task spans are pushed promptly so Sentry dashboards update faster. flush_telemetry()🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/jobs.py` around lines 963 - 979, Remove the duplicated and mis-indented exception handling code: inside the except block create a single callback_response using APIResponse.failure_response(error="Unexpected error occurred", metadata=request.request_metadata) and call handle_job_error(job_uuid, callback_url_str, callback_response, organization_id=organization_id, project_id=project_id); delete the stray duplicate callback_response/handle_job_error calls outside or with wrong indentation so only the properly indented version that includes organization_id and project_id remains.
758-846:⚠️ Potential issue | 🔴 CriticalFix critical control flow errors before merging.
The code has three structural issues:
- Line 771:
callback_responseis referenced before assignment (called oncallback_response.model_dump()before theif result.success:block creates it).- Line 789: De-indentation creates a syntax error—the statement falls outside the
tryblock with no matchingexcept/finally, causing Python to raiseexpected 'except' or 'finally' block.- Lines 817–825: Dead code—the success DB update and return statement are unreachable, placed after the
exceptblock's return.These must be fixed by restructuring control flow: move webhook credential retrieval into the success branch (after
callback_responseis assigned), properly close thetry/except/finallyblocks, and relocate the DB update into the success branch before the return.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/services/llm/jobs.py` around lines 758 - 846, The code currently calls callback_response.model_dump() and sends a webhook before callback_response is created, has mismatched try/except/finally indentation, and places the JobCrud.update success path after an except making it unreachable; fix by moving the webhook credential lookup (get_provider_credential) and any send_callback calls into the success branch after callback_response is assigned (i.e., inside the if result.success block), ensure the try block is properly closed with its corresponding except and finally (flush_telemetry) so no statements fall outside, and relocate the database success update (JobCrud(session).update with JobUpdate(status=JobStatus.SUCCESS)) and the success return (callback_response.model_dump()) into the success branch before any returns so they execute when result.success is true; verify all send_callback usage calls callback_response that has been created and that exception handlers return via handle_job_error as intended.
🧹 Nitpick comments (1)
backend/app/tests/core/test_callback_ssrf.py (1)
330-338: Add positive coverage for signed callbacks.This verifies the unsigned path, but the new feature also needs a test that passes
webhook_secretand assertsX-Webhook-Signature/X-Webhook-Timestampmatch the exact compact body bytes sent indata.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/core/test_callback_ssrf.py` around lines 330 - 338, Add a positive test for the signed callback path: call the same code that produces call_kwargs but pass a webhook_secret, compute the compact body bytes as json.dumps(test_data, separators=(",", ":")).encode(), generate the expected timestamp and HMAC signature using the webhook_secret over the exact bytes (including whatever signing format your code uses, e.g., sig = hmac.new(webhook_secret.encode(), compact_bytes, sha256).hexdigest()), then assert call_kwargs["data"] equals the compact_bytes and that call_kwargs["headers"] contains "X-Webhook-Timestamp" equal to the timestamp you generated and "X-Webhook-Signature" equal to the expected signature; use the existing variables call_kwargs and test_data to locate where to add this check (mirror the unsigned assertions but include webhook_secret and the signature/timestamp assertions).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/core/providers.py`:
- Around line 46-48: The tests in test_credentials.py hardcode the expected
provider set and length (currently asserting {"openai","langfuse"} and len==2),
which breaks now that Provider.WEBHOOK_SECRET is added; update those assertions
to include "webhook_secret" (matching Provider.WEBHOOK_SECRET) or, better,
compute expected providers dynamically from the Provider/ProviderConfig registry
so the test no longer hardcodes the set/length (e.g., derive expectation from
the Provider enum or the providers mapping used in providers.py).
In `@backend/app/services/collections/create_collection.py`:
- Around line 317-332: The code is sending the same success callback twice;
remove the duplicate unsigned send by deleting the new send_callback block that
uses creation_request.callback_url, and instead move the provider credential
lookup (get_provider_credential) and webhook_secret extraction into the existing
success branch where success_payload is already sent (the earlier send_callback
call around success handling). Ensure the existing send_callback call uses the
retrieved webhook_secret when calling send_callback, referencing
creation_request.callback_url, get_provider_credential, webhook_secret, and
success_payload so only a single signed callback is sent.
- Around line 305-332: The except path in execute_job currently builds a success
payload using build_success_payload and logs/sends a "Collection created"
callback even on failure; update the except block (where collection may be
unbound) to either re-raise the exception to be handled by the outer failure
handler OR build and send an explicit failure payload (e.g.,
build_failure_payload or a dict containing collection_job.id, error details from
the caught exception, and status="failed") instead of using
build_success_payload; also replace the success log with an error/exception log
and ensure send_callback is invoked with the failure payload and webhook_secret
(using get_provider_credential) only when creation_request.callback_url is set.
- Around line 334-373: The second "except Exception as err:" is unmatched due to
misplaced code between the try and the handler; fix by consolidating exception
handling in execute_job: move the logging, provider cleanup
(provider.delete(result)), _mark_job_failed call, and callback logic
(build_failure_payload, get_provider_credential/Session, and send_callback) into
a single unified except block for the try that encloses the collection creation,
or alternatively remove the duplicate except and wrap the intervening lines in
the original try; ensure the unique symbols referenced—_mark_job_failed,
build_failure_payload, get_provider_credential, Session/engine, provider.delete,
and send_callback—are executed from that single exception handler so cleanup,
job failure marking, and callback notification always run on errors.
In `@backend/app/utils.py`:
- Around line 479-487: The code logs sensitive values (webhook_secret and
derived signature/timestamp) in the function that builds the webhook payload;
remove those direct logs and replace them with a non-sensitive, prefixed
message. Specifically, in the block where raw_body,
sign_webhook_payload(webhook_secret, raw_body), headers, and logger are used,
stop logging webhook_secret and the signature/timestamp; instead log a single
informational message prefixed with the function name in square brackets (e.g.,
"[function_name] webhook signing performed") or a boolean/DEBUG that signing was
attempted, and if you must show any secret-like value use
mask_string(sensitive_value) per guidelines. Ensure
headers["X-Webhook-Signature"] and headers["X-Webhook-Timestamp"] are still set
from sign_webhook_payload but do not emit their values to logs.
---
Outside diff comments:
In `@backend/app/services/collections/delete_collection.py`:
- Around line 241-247: The success callback path currently calls
send_callback(deletion_request.callback_url, success_payload) without the
webhook secret, leaving successful delete-collection callbacks unsigned; update
the call to include the webhook secret used for signing (pass the same
webhook_secret used for failure callbacks, e.g. deletion_request.webhook_secret
or the variable used elsewhere) so change the send_callback invocation in the
block that builds success_payload (references: deletion_request,
build_success_payload, collection_job, collection_id, send_callback) to pass the
webhook_secret as the signing argument.
- Around line 248-282: In the exception handler inside delete_collection.py
update the first _mark_job_failed_and_callback call to include organization_id,
remove the downstream webhook callback code that references send_callback and
success_payload (delete the block that fetches webhook_secret and calls
send_callback), delete the orphaned second except block, and re-raise the caught
exception after recording it (i.e., keep span.record_exception/ set_status, call
_mark_job_failed_and_callback with organization_id, then raise the exception) so
the error is propagated.
In `@backend/app/services/llm/jobs.py`:
- Around line 184-208: The code is sending the failure webhook twice—once with
webhook_secret and again without—causing duplicate deliveries and signature
failures; modify the block around send_callback, webhook_secret,
get_provider_credential and the tracer span so you only call send_callback once
for the failure path and include the webhook_secret (if available) in that
single call, remove the redundant send_callback invocation, and keep the tracer
span (llm.send_callback) setting "callback.url" and "callback.status" around
that single send to preserve tracing.
- Around line 963-979: Remove the duplicated and mis-indented exception handling
code: inside the except block create a single callback_response using
APIResponse.failure_response(error="Unexpected error occurred",
metadata=request.request_metadata) and call handle_job_error(job_uuid,
callback_url_str, callback_response, organization_id=organization_id,
project_id=project_id); delete the stray duplicate
callback_response/handle_job_error calls outside or with wrong indentation so
only the properly indented version that includes organization_id and project_id
remains.
- Around line 758-846: The code currently calls callback_response.model_dump()
and sends a webhook before callback_response is created, has mismatched
try/except/finally indentation, and places the JobCrud.update success path after
an except making it unreachable; fix by moving the webhook credential lookup
(get_provider_credential) and any send_callback calls into the success branch
after callback_response is assigned (i.e., inside the if result.success block),
ensure the try block is properly closed with its corresponding except and
finally (flush_telemetry) so no statements fall outside, and relocate the
database success update (JobCrud(session).update with
JobUpdate(status=JobStatus.SUCCESS)) and the success return
(callback_response.model_dump()) into the success branch before any returns so
they execute when result.success is true; verify all send_callback usage calls
callback_response that has been created and that exception handlers return via
handle_job_error as intended.
---
Nitpick comments:
In `@backend/app/tests/core/test_callback_ssrf.py`:
- Around line 330-338: Add a positive test for the signed callback path: call
the same code that produces call_kwargs but pass a webhook_secret, compute the
compact body bytes as json.dumps(test_data, separators=(",", ":")).encode(),
generate the expected timestamp and HMAC signature using the webhook_secret over
the exact bytes (including whatever signing format your code uses, e.g., sig =
hmac.new(webhook_secret.encode(), compact_bytes, sha256).hexdigest()), then
assert call_kwargs["data"] equals the compact_bytes and that
call_kwargs["headers"] contains "X-Webhook-Timestamp" equal to the timestamp you
generated and "X-Webhook-Signature" equal to the expected signature; use the
existing variables call_kwargs and test_data to locate where to add this check
(mirror the unsigned assertions but include webhook_secret and the
signature/timestamp assertions).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6c07cc7d-204d-4b8f-9bd1-13b07f588937
📒 Files selected for processing (10)
backend/app/core/providers.pybackend/app/services/collections/create_collection.pybackend/app/services/collections/delete_collection.pybackend/app/services/doctransform/job.pybackend/app/services/llm/chain/executor.pybackend/app/services/llm/jobs.pybackend/app/services/response/callbacks.pybackend/app/services/response/jobs.pybackend/app/tests/core/test_callback_ssrf.pybackend/app/utils.py
| Provider.WEBHOOK_SECRET: ProviderConfig( | ||
| required_fields=["webhook_secret"], sensitive_fields=["webhook_secret"] | ||
| ), |
There was a problem hiding this comment.
Update credential tests that hardcode the provider set.
Adding Provider.WEBHOOK_SECRET here expands the configured provider list; backend/app/tests/crud/test_credentials.py still asserts exactly {"openai", "langfuse"} and len(...) == 2, so those tests should be updated to include webhook_secret or derive expectations from the provider config.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/core/providers.py` around lines 46 - 48, The tests in
test_credentials.py hardcode the expected provider set and length (currently
asserting {"openai","langfuse"} and len==2), which breaks now that
Provider.WEBHOOK_SECRET is added; update those assertions to include
"webhook_secret" (matching Provider.WEBHOOK_SECRET) or, better, compute expected
providers dynamically from the Provider/ProviderConfig registry so the test no
longer hardcodes the set/length (e.g., derive expectation from the Provider enum
or the providers mapping used in providers.py).
| Provider.ELEVENLABS: ProviderConfig( | ||
| required_fields=["api_key"], sensitive_fields=["api_key"] | ||
| ), | ||
| Provider.WEBHOOK_SECRET: ProviderConfig( |
There was a problem hiding this comment.
update the credentials endpoint doc ..
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/services/collections/delete_collection.py`:
- Around line 106-115: The webhook secret lookup in _get_webhook_secret can
raise (e.g., from get_provider_credential or DB Session) and that exception can
bubble up and wrongly mark a deletion job as failed; make the lookup best-effort
by catching all exceptions inside _get_webhook_secret and returning None on any
error so callers will send unsigned webhooks rather than failing the job. Update
_get_webhook_secret to wrap the Session/get_provider_credential call in a
try/except (catch Exception), log the error if helpful, and return None when an
exception occurs.
In `@backend/app/services/llm/jobs.py`:
- Around line 176-189: The _get_webhook_secret lookup can raise on DB/decryption
errors and should "fail open" so optional signing doesn't block callbacks or
job-state updates; update _get_webhook_secret to catch any exceptions around the
Session/get_provider_credential call (and any decryption) and return None on
error (optionally log.debug or processLogger.debug the exception), leaving the
rest of the code (e.g., callers like handle_job_error) to proceed without a
secret.
In `@backend/app/utils.py`:
- Around line 479-486: The serialization and signing logic in send_callback
(json.dumps -> raw_body and sign_webhook_payload) can raise exceptions before
the request try/except, so wrap the json.dumps call and the webhook signing
(sign_webhook_payload, assignment of raw_body, headers, signature, timestamp_ms)
inside the same failure handling used for the request: catch exceptions (e.g.,
TypeError/ValueError or a broad Exception), log the error via logger with
context, and return False so malformed/non-serializable callback data stays
inside send_callback’s documented failure contract.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1a1ebed5-faef-42a9-8c9d-81fcae4498dc
📒 Files selected for processing (4)
backend/app/services/collections/create_collection.pybackend/app/services/collections/delete_collection.pybackend/app/services/llm/jobs.pybackend/app/utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/services/collections/create_collection.py
| return None | ||
|
|
||
|
|
||
| def _get_webhook_secret(project_id: int, organization_id: int) -> str | None: |
There was a problem hiding this comment.
this function could be put in a common place since it will be used across multiple services in the services folder
| signature, timestamp_ms = sign_webhook_payload(webhook_secret, raw_body) | ||
| headers["X-Webhook-Signature"] = signature | ||
| headers["X-Webhook-Timestamp"] = str(timestamp_ms) | ||
| logger.info("[send_callback] Callback signed with webhook secret") |
There was a problem hiding this comment.
two logs comes from this function - 2026-04-23 12:22:49,822 - [kaapi-celery] - [cdeff6a8f76d43fa98808762094d0bba] - INFO - app.utils - [send_callback] Callback signed with webhook secret
2026-04-23 12:22:50,782 - [kaapi-celery] - [cdeff6a8f76d43fa98808762094d0bba] - INFO - app.utils - [send_callback] Callback sent successfully to {callback_url}, maybe this can be merged into one log
There was a problem hiding this comment.
Thanks for pointing out. But top one was for debugging purpose and is not required only. So removing that.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (2)
backend/app/utils.py (2)
446-463:⚠️ Potential issue | 🟠 MajorMake optional webhook-secret lookup fail open.
get_webhook_secret()can raise on DB/decryption/credential lookup failures, and every caller treats signing as optional. ReturningNoneon lookup failure keeps callbacks and job-state transitions from failing because signing credentials are unavailable. This is the same unresolved risk as the earlier per-service helper comment, now centralized here.🛡️ Proposed fix
def get_webhook_secret( project_id: int | None, organization_id: int | None ) -> str | None: """Look up the configured webhook signing secret for this project, or None.""" if project_id is None or organization_id is None: return None # Imported lazily: app.core.db pulls in app.crud, which imports app.utils, # so a top-level import here would deadlock module initialization. from app.core.db import engine - with Session(engine) as session: - creds = get_provider_credential( - session=session, - org_id=organization_id, - project_id=project_id, - provider="webhook_secret", - ) - return creds.get("webhook_secret") if isinstance(creds, dict) else None + try: + with Session(engine) as session: + creds = get_provider_credential( + session=session, + org_id=organization_id, + project_id=project_id, + provider="webhook_secret", + ) + return creds.get("webhook_secret") if isinstance(creds, dict) else None + except Exception: + logger.warning( + "[get_webhook_secret] Failed to load webhook secret; sending unsigned callback", + exc_info=True, + ) + return None🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/utils.py` around lines 446 - 463, get_webhook_secret currently can raise on DB/decryption/credential lookup failures; wrap the DB call and credential retrieval (the Session(engine) context and call to get_provider_credential) in a try/except that catches broad operational errors (e.g., DB, decryption, or lookup exceptions) and returns None on any failure so callers treat signing as optional; keep the lazy import of engine and return creds.get("webhook_secret") if creds is a dict on success, otherwise return None.
498-529:⚠️ Potential issue | 🟠 MajorCatch serialization/signing failures inside
send_callback.The
trynow includesjson.dumps(), but the handler still only catchesrequests.RequestException. A non-JSON-safe payload still raisesTypeErrorbefore the callback can returnFalse, so callers like failure handlers may skip job-state updates. This duplicates the earlier serialization failure concern, which is still present in the current code.🛡️ Proposed fix
try: raw_body = json.dumps(data, separators=(",", ":")).encode() headers = {"Content-Type": "application/json"} if webhook_secret: signature, timestamp_ms = sign_webhook_payload(webhook_secret, raw_body) headers["X-Webhook-Signature"] = signature headers["X-Webhook-Timestamp"] = str(timestamp_ms) logger.info("[send_callback] Callback signed with webhook secret") with requests.Session() as session: session.trust_env = False # Ignores environment proxies and other implicit settings for SSRF safety response = session.post( callback_url, data=raw_body, headers=headers, timeout=( settings.CALLBACK_CONNECT_TIMEOUT, settings.CALLBACK_READ_TIMEOUT, ), allow_redirects=False, ) response.raise_for_status() logger.info(f"[send_callback] Callback sent successfully to {callback_url}") return True + except (TypeError, ValueError) as e: + logger.error( + f"[send_callback] Failed to serialize/sign callback payload: {str(e)}", + exc_info=True, + ) + return False except requests.RequestException as e: logger.error(f"[send_callback] Callback failed: {str(e)}", exc_info=True) return FalseVerification:
#!/bin/bash # Description: Demonstrate that stdlib json.dumps raises TypeError for common payload values. python - <<'PY' import json from uuid import uuid4 try: json.dumps({"id": uuid4()}) except TypeError as exc: print(type(exc).__name__, str(exc)) PY🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/utils.py` around lines 498 - 529, send_callback currently only catches requests.RequestException, but json.dumps (and sign_webhook_payload) can raise TypeError/ValueError during serialization or signing and will escape the handler; wrap the serialization and signing steps (the json.dumps(...) call and the sign_webhook_payload(...) call) in the same try/except or broaden the existing except to also catch TypeError and ValueError (or Exception) so those failures are logged via logger.error (with exc_info=True) and the function returns False; locate the logic in send_callback around json.dumps and the sign_webhook_payload invocation to apply the changed exception handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/api/docs/credentials/create.md`:
- Around line 45-58: The examples for webhook credentials are invalid and
inconsistent: fix the multi-provider JSON by closing the missing quote on the
"webhook_secret" key and nesting the provider object under "credential" so it
matches the provider-object shape (i.e., "credential": { "webhook_secret": {
"webhook_secret": "..." } }), and update the standalone "For registering Webhook
Secret" example to use the same object shape (credential -> webhook_secret ->
webhook_secret) so both examples are valid JSON and consistent; key symbols to
locate are the "credential" object and the "webhook_secret" key in the shown
examples.
---
Duplicate comments:
In `@backend/app/utils.py`:
- Around line 446-463: get_webhook_secret currently can raise on
DB/decryption/credential lookup failures; wrap the DB call and credential
retrieval (the Session(engine) context and call to get_provider_credential) in a
try/except that catches broad operational errors (e.g., DB, decryption, or
lookup exceptions) and returns None on any failure so callers treat signing as
optional; keep the lazy import of engine and return creds.get("webhook_secret")
if creds is a dict on success, otherwise return None.
- Around line 498-529: send_callback currently only catches
requests.RequestException, but json.dumps (and sign_webhook_payload) can raise
TypeError/ValueError during serialization or signing and will escape the
handler; wrap the serialization and signing steps (the json.dumps(...) call and
the sign_webhook_payload(...) call) in the same try/except or broaden the
existing except to also catch TypeError and ValueError (or Exception) so those
failures are logged via logger.error (with exc_info=True) and the function
returns False; locate the logic in send_callback around json.dumps and the
sign_webhook_payload invocation to apply the changed exception handling.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ceb16562-719e-4fda-a403-bfe7098f3838
📒 Files selected for processing (10)
backend/app/api/docs/credentials/create.mdbackend/app/api/docs/credentials/update.mdbackend/app/api/docs/credentials/update_by_org_project.mdbackend/app/services/collections/create_collection.pybackend/app/services/collections/delete_collection.pybackend/app/services/doctransform/job.pybackend/app/services/llm/chain/executor.pybackend/app/services/llm/jobs.pybackend/app/services/response/callbacks.pybackend/app/utils.py
✅ Files skipped from review due to trivial changes (2)
- backend/app/api/docs/credentials/update_by_org_project.md
- backend/app/api/docs/credentials/update.md
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/services/collections/delete_collection.py
| "webhook_secret": { | ||
| "webhook_secret: "webhook_secret" | ||
| }, | ||
| } | ||
| } | ||
| ``` | ||
| #### For registering Webhook Secret | ||
| ```json | ||
| { | ||
| "credential":{ | ||
| "webhook_secret":"your-webhook-secret" | ||
| } | ||
|
|
||
| } |
There was a problem hiding this comment.
Fix the webhook credential examples.
The multi-provider JSON is invalid ("webhook_secret: ... is missing the closing key quote), and the standalone example uses a different shape than the provider-object format. This will mislead API users.
📝 Proposed fix
"langfuse": {
"public_key": "pk-lf-....",
"secret_key": "sk-lf-...",
"host": "https://cloud.langfuse.com"
},
"webhook_secret": {
- "webhook_secret: "webhook_secret"
- },
+ "webhook_secret": "your-webhook-secret"
+ }
}
}For registering Webhook Secret
{
- "credential":{
- "webhook_secret":"your-webhook-secret"
+ "credential": {
+ "webhook_secret": {
+ "webhook_secret": "your-webhook-secret"
+ }
}
-
}</details>
<details>
<summary>🤖 Prompt for AI Agents</summary>
Verify each finding against the current code and only fix it if needed.
In @backend/app/api/docs/credentials/create.md around lines 45 - 58, The
examples for webhook credentials are invalid and inconsistent: fix the
multi-provider JSON by closing the missing quote on the "webhook_secret" key and
nesting the provider object under "credential" so it matches the provider-object
shape (i.e., "credential": { "webhook_secret": { "webhook_secret": "..." } }),
and update the standalone "For registering Webhook Secret" example to use the
same object shape (credential -> webhook_secret -> webhook_secret) so both
examples are valid JSON and consistent; key symbols to locate are the
"credential" object and the "webhook_secret" key in the shown examples.
</details>
<!-- fingerprinting:phantom:medusa:ibis -->
<!-- This is an auto-generated comment by CodeRabbit -->
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (4)
backend/app/tests/core/test_callback_ssrf.py (1)
337-346: Add a positivesend_callbacksigning-path test.This covers the unsigned path, but not that
send_callback(..., webhook_secret=...)actually emitsX-Webhook-Signature/X-Webhook-Timestampfor the exact transmitted body.Suggested test coverage
assert "X-Webhook-Signature" not in call_kwargs["headers"] assert "X-Webhook-Timestamp" not in call_kwargs["headers"] + + `@patch`("app.utils.validate_callback_url") + `@patch`("requests.Session") + def test_callback_adds_signature_headers_when_secret_provided( + self, mock_session_class: Any, mock_validate: Any + ) -> None: + mock_session = MagicMock() + mock_response = MagicMock() + mock_response.raise_for_status.return_value = None + mock_response.iter_content.return_value = [b"test"] + mock_session.post.return_value = mock_response + mock_session_class.return_value.__enter__.return_value = mock_session + + test_data = {"status": "completed", "result": 42} + send_callback( + "https://api.example.com/callback", + test_data, + webhook_secret="secret", + ) + + call_kwargs = mock_session.post.call_args[1] + raw_body = json.dumps(test_data, separators=(",", ":")).encode() + timestamp = int(call_kwargs["headers"]["X-Webhook-Timestamp"]) + expected_signature = hmac.new( + b"secret", + f"{timestamp}.".encode() + raw_body, + hashlib.sha256, + ).hexdigest() + + assert call_kwargs["data"] == raw_body + assert call_kwargs["headers"]["X-Webhook-Signature"] == expected_signature🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/core/test_callback_ssrf.py` around lines 337 - 346, Add a positive test that calls send_callback with a webhook_secret and asserts the signing path: after invoking send_callback (use the same mock_session.post and test_data as the unsigned test), grab call_kwargs = mock_session.post.call_args[1], assert that "X-Webhook-Timestamp" and "X-Webhook-Signature" are present in call_kwargs["headers"], verify the timestamp is parseable/consistent, and recompute the expected HMAC over the exact transmitted body (call_kwargs["data"] — which should equal json.dumps(test_data, separators=(",", ":")).encode()) using the same webhook_secret and the timestamp format your implementation uses, then assert the computed signature equals call_kwargs["headers"]["X-Webhook-Signature"] to ensure send_callback produced the correct signature for the exact bytes sent.backend/app/tests/services/doctransformer/test_job/test_callbacks.py (3)
107-117: Consider extracting the repeated patch stack into a fixture/context manager.Every test in this file re-declares the same set of
patch(...)entries (Session,send_callback,get_webhook_secret, optionalconvert_document,TRANSFORMERS) and themock_session.__enter__/__exit__wiring. A small helper (e.g.,@contextmanager def _patched_job_env(db, *, webhook_secret=None, convert_side_effect=None, storage=None)or a pytest fixture) would significantly cut duplication and make future patching changes (like the newget_webhook_secret) a one-line update across the suite.Purely a DRY improvement — no behavioral impact.
Also applies to: 151-158, 188-199, 236-250, 284-295, 326-340, 386-393, 429-440, 483-497, 536-550
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/doctransformer/test_job/test_callbacks.py` around lines 107 - 117, Create a small reusable contextmanager or pytest fixture to centralize the repeated patch stack used across tests: replace the repeated with (...) patch blocks that mock Session, send_callback, get_webhook_secret, TRANSFORMERS (and optional convert_document/storage) and the manual mock_session.__enter__/__exit__ wiring by implementing e.g. a contextmanager _patched_job_env(db, *, webhook_secret=None, convert_side_effect=None, storage=None) that yields the mocked objects; update tests to call this helper (or use the fixture) and wire return_value/side_effects (for get_webhook_secret and convert_document) and TRANSFORMERS={"test": MockTestTransformer} there so future changes require a single edit.
13-13: Prefer built-intupleover deprecatedtyping.Tuple.Ruff flags this as UP035. On Python 3.11+ (per coding guidelines), use the built-in generic directly and drop the
typingimport.♻️ Proposed change
-from typing import TupleThen update the annotations throughout the file:
- self, db: Session, test_document: Tuple[Document, Project] + self, db: Session, test_document: tuple[Document, Project]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/doctransformer/test_job/test_callbacks.py` at line 13, The import from typing of Tuple is deprecated (UP035); remove "from typing import Tuple" and replace any use of the typing alias Tuple[...] with the built-in generic "tuple[...]" in this module (e.g., update function/type annotations referenced in test_callbacks such as any occurrences of Tuple[int, str] or Tuple[...]) so the file uses the built-in tuple generics on Python 3.11+.
47-48: Replace deprecateddatetime.utcnow()with timezone-aware alternative.In Python 3.12+,
datetime.utcnow()is deprecated in favor of timezone-aware construction. Update todatetime.now(timezone.utc)to align with modern standards and avoid deprecation warnings.Proposed change
-from datetime import datetime +from datetime import datetime, timezone @@ - inserted_at=datetime.utcnow(), - updated_at=datetime.utcnow(), + inserted_at=datetime.now(timezone.utc), + updated_at=datetime.now(timezone.utc),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/app/tests/services/doctransformer/test_job/test_callbacks.py` around lines 47 - 48, Replace deprecated datetime.utcnow() usages for the inserted_at and updated_at fields with timezone-aware calls: use datetime.now(timezone.utc) and ensure timezone is imported (e.g., from datetime import datetime, timezone) so both inserted_at and updated_at are timezone-aware.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/app/tests/core/test_callback_ssrf.py`:
- Around line 349-380: Add explicit return type annotations "-> None" to each
test method inside the TestSignWebhookPayload class
(test_returns_hex_signature_and_timestamp,
test_deterministic_with_fixed_timestamp,
test_different_secrets_produce_different_signatures,
test_different_timestamps_produce_different_signatures,
test_signature_matches_manual_hmac) so they conform to the project typing rule;
update each method signature to include the return type while leaving the body
and calls to sign_webhook_payload intact.
In `@backend/app/tests/services/doctransformer/test_job/test_callbacks.py`:
- Around line 382-384: The nested helper function capture should be annotated:
add type hints to its parameters and return type (e.g., def capture(path:
os.PathLike[str] | str, **kw: Any) -> None), and ensure Any and os are imported
or referenced; also apply the identical annotation to the second capture helper
mentioned (the one at the other occurrence that calls real_rmtree and appends
str(path) to removed).
- Around line 315-358: The test name implies ordering but currently only checks
final DB state; modify the test_failure_marks_job_failed_before_callback to
assert the FAILED update occurs before send_callback is invoked by attaching a
side_effect to the patched send_callback that refreshes the job
(db.refresh(job)) and asserts job.status == TransformationStatus.FAILED (or
capture a timestamp/order flag there), then run execute_job.__wrapped__ as
before; alternatively, if you prefer not to test ordering, rename the test to
reflect it only verifies the final job state. Ensure references to
send_callback, execute_job, job, and TransformationStatus.FAILED are used so the
side_effect can observe the job status at callback time.
---
Nitpick comments:
In `@backend/app/tests/core/test_callback_ssrf.py`:
- Around line 337-346: Add a positive test that calls send_callback with a
webhook_secret and asserts the signing path: after invoking send_callback (use
the same mock_session.post and test_data as the unsigned test), grab call_kwargs
= mock_session.post.call_args[1], assert that "X-Webhook-Timestamp" and
"X-Webhook-Signature" are present in call_kwargs["headers"], verify the
timestamp is parseable/consistent, and recompute the expected HMAC over the
exact transmitted body (call_kwargs["data"] — which should equal
json.dumps(test_data, separators=(",", ":")).encode()) using the same
webhook_secret and the timestamp format your implementation uses, then assert
the computed signature equals call_kwargs["headers"]["X-Webhook-Signature"] to
ensure send_callback produced the correct signature for the exact bytes sent.
In `@backend/app/tests/services/doctransformer/test_job/test_callbacks.py`:
- Around line 107-117: Create a small reusable contextmanager or pytest fixture
to centralize the repeated patch stack used across tests: replace the repeated
with (...) patch blocks that mock Session, send_callback, get_webhook_secret,
TRANSFORMERS (and optional convert_document/storage) and the manual
mock_session.__enter__/__exit__ wiring by implementing e.g. a contextmanager
_patched_job_env(db, *, webhook_secret=None, convert_side_effect=None,
storage=None) that yields the mocked objects; update tests to call this helper
(or use the fixture) and wire return_value/side_effects (for get_webhook_secret
and convert_document) and TRANSFORMERS={"test": MockTestTransformer} there so
future changes require a single edit.
- Line 13: The import from typing of Tuple is deprecated (UP035); remove "from
typing import Tuple" and replace any use of the typing alias Tuple[...] with the
built-in generic "tuple[...]" in this module (e.g., update function/type
annotations referenced in test_callbacks such as any occurrences of Tuple[int,
str] or Tuple[...]) so the file uses the built-in tuple generics on Python
3.11+.
- Around line 47-48: Replace deprecated datetime.utcnow() usages for the
inserted_at and updated_at fields with timezone-aware calls: use
datetime.now(timezone.utc) and ensure timezone is imported (e.g., from datetime
import datetime, timezone) so both inserted_at and updated_at are
timezone-aware.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f0cf79a3-5cba-4c00-89c5-84b3efb7e98c
📒 Files selected for processing (3)
backend/app/tests/core/test_callback_ssrf.pybackend/app/tests/services/doctransformer/test_job/test_callbacks.pybackend/app/utils.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/utils.py
| class TestSignWebhookPayload: | ||
| def test_returns_hex_signature_and_timestamp(self): | ||
| sig, ts = sign_webhook_payload("secret", b"body") | ||
| assert isinstance(sig, str) and len(sig) == 64 # sha256 hex | ||
| assert isinstance(ts, int) and ts > 0 | ||
|
|
||
| def test_deterministic_with_fixed_timestamp(self): | ||
| body = b'{"key":"value"}' | ||
| sig1, _ = sign_webhook_payload("secret", body, timestamp_ms=1000) | ||
| sig2, _ = sign_webhook_payload("secret", body, timestamp_ms=1000) | ||
| assert sig1 == sig2 | ||
|
|
||
| def test_different_secrets_produce_different_signatures(self): | ||
| body = b"payload" | ||
| sig1, _ = sign_webhook_payload("secret-a", body, timestamp_ms=1000) | ||
| sig2, _ = sign_webhook_payload("secret-b", body, timestamp_ms=1000) | ||
| assert sig1 != sig2 | ||
|
|
||
| def test_different_timestamps_produce_different_signatures(self): | ||
| body = b"payload" | ||
| sig1, _ = sign_webhook_payload("secret", body, timestamp_ms=1000) | ||
| sig2, _ = sign_webhook_payload("secret", body, timestamp_ms=2000) | ||
| assert sig1 != sig2 | ||
|
|
||
| def test_signature_matches_manual_hmac(self): | ||
| secret, body, ts = "mysecret", b"hello", 999 | ||
| expected = hmac.new( | ||
| secret.encode(), f"{ts}.".encode() + body, hashlib.sha256 | ||
| ).hexdigest() | ||
| sig, returned_ts = sign_webhook_payload(secret, body, timestamp_ms=ts) | ||
| assert sig == expected | ||
| assert returned_ts == ts |
There was a problem hiding this comment.
Add return type annotations to the new tests.
The new test methods should declare -> None to match the project typing rule and the surrounding test style. As per coding guidelines, **/*.py: “Always add type hints to all function parameters and return values in Python code”.
Suggested typing fix
class TestSignWebhookPayload:
- def test_returns_hex_signature_and_timestamp(self):
+ def test_returns_hex_signature_and_timestamp(self) -> None:
sig, ts = sign_webhook_payload("secret", b"body")
assert isinstance(sig, str) and len(sig) == 64 # sha256 hex
assert isinstance(ts, int) and ts > 0
- def test_deterministic_with_fixed_timestamp(self):
+ def test_deterministic_with_fixed_timestamp(self) -> None:
body = b'{"key":"value"}'
sig1, _ = sign_webhook_payload("secret", body, timestamp_ms=1000)
sig2, _ = sign_webhook_payload("secret", body, timestamp_ms=1000)
assert sig1 == sig2
- def test_different_secrets_produce_different_signatures(self):
+ def test_different_secrets_produce_different_signatures(self) -> None:
body = b"payload"
sig1, _ = sign_webhook_payload("secret-a", body, timestamp_ms=1000)
sig2, _ = sign_webhook_payload("secret-b", body, timestamp_ms=1000)
assert sig1 != sig2
- def test_different_timestamps_produce_different_signatures(self):
+ def test_different_timestamps_produce_different_signatures(self) -> None:
body = b"payload"
sig1, _ = sign_webhook_payload("secret", body, timestamp_ms=1000)
sig2, _ = sign_webhook_payload("secret", body, timestamp_ms=2000)
assert sig1 != sig2
- def test_signature_matches_manual_hmac(self):
+ def test_signature_matches_manual_hmac(self) -> None:
secret, body, ts = "mysecret", b"hello", 999🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/tests/core/test_callback_ssrf.py` around lines 349 - 380, Add
explicit return type annotations "-> None" to each test method inside the
TestSignWebhookPayload class (test_returns_hex_signature_and_timestamp,
test_deterministic_with_fixed_timestamp,
test_different_secrets_produce_different_signatures,
test_different_timestamps_produce_different_signatures,
test_signature_matches_manual_hmac) so they conform to the project typing rule;
update each method signature to include the return type while leaving the body
and calls to sign_webhook_payload intact.
| def test_failure_marks_job_failed_before_callback( | ||
| self, db: Session, test_document: Tuple[Document, Project] | ||
| ) -> None: | ||
| document, project = test_document | ||
| aws = self.setup_aws_s3() | ||
| self.create_s3_document_content(aws, document) | ||
|
|
||
| job = DocTransformationJobCrud(db, project_id=project.id).create( | ||
| DocTransformJobCreate(source_document_id=document.id) | ||
| ) | ||
|
|
||
| with ( | ||
| patch("app.services.doctransform.job.Session") as mock_session, | ||
| patch("app.services.doctransform.job.send_callback"), | ||
| patch( | ||
| "app.services.doctransform.job.get_webhook_secret", return_value=None | ||
| ), | ||
| patch( | ||
| "app.services.doctransform.job.convert_document", | ||
| side_effect=RuntimeError("crash"), | ||
| ), | ||
| patch( | ||
| "app.services.doctransform.registry.TRANSFORMERS", | ||
| {"test": MockTestTransformer}, | ||
| ), | ||
| ): | ||
| mock_session.return_value.__enter__.return_value = db | ||
| mock_session.return_value.__exit__.return_value = None | ||
|
|
||
| with pytest.raises(RuntimeError): | ||
| execute_job.__wrapped__( | ||
| project_id=project.id, | ||
| job_id=str(job.id), | ||
| source_document_id=str(document.id), | ||
| transformer_name="test", | ||
| target_format="markdown", | ||
| task_id=str(uuid4()), | ||
| callback_url="https://example.com/webhook", | ||
| task_instance=None, | ||
| ) | ||
|
|
||
| db.refresh(job) | ||
| assert job.status == TransformationStatus.FAILED | ||
| assert "crash" in job.error_message |
There was a problem hiding this comment.
Test name promises ordering, assertion doesn’t verify it.
test_failure_marks_job_failed_before_callback only asserts the final DB state; it never verifies that the FAILED update happened before send_callback was invoked. If a future refactor reordered those operations, this test would still pass silently. Consider recording observation order via a side effect on the send_callback mock and asserting the job was already FAILED at that moment, or rename the test to reflect what it actually checks.
♻️ Example: verify ordering
- patch("app.services.doctransform.job.send_callback"),
+ patch("app.services.doctransform.job.send_callback") as mock_send,
@@
+ status_at_callback: list[TransformationStatus] = []
+
+ def record_status(*args, **kwargs):
+ db.refresh(job)
+ status_at_callback.append(job.status)
+
+ mock_send.side_effect = record_status
@@
db.refresh(job)
assert job.status == TransformationStatus.FAILED
assert "crash" in job.error_message
+ assert status_at_callback == [TransformationStatus.FAILED]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/tests/services/doctransformer/test_job/test_callbacks.py` around
lines 315 - 358, The test name implies ordering but currently only checks final
DB state; modify the test_failure_marks_job_failed_before_callback to assert the
FAILED update occurs before send_callback is invoked by attaching a side_effect
to the patched send_callback that refreshes the job (db.refresh(job)) and
asserts job.status == TransformationStatus.FAILED (or capture a timestamp/order
flag there), then run execute_job.__wrapped__ as before; alternatively, if you
prefer not to test ordering, rename the test to reflect it only verifies the
final job state. Ensure references to send_callback, execute_job, job, and
TransformationStatus.FAILED are used so the side_effect can observe the job
status at callback time.
| def capture(path, **kw): | ||
| removed.append(str(path)) | ||
| real_rmtree(path, **kw) |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
Add type hints to the capture helpers.
As per coding guidelines ("Always add type hints to all function parameters and return values in Python code"), these nested helpers should be annotated.
♻️ Proposed change
- def capture(path, **kw):
+ def capture(path: str, **kw: object) -> None:
removed.append(str(path))
real_rmtree(path, **kw)Apply identically at Line 425.
Also applies to: 425-427
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@backend/app/tests/services/doctransformer/test_job/test_callbacks.py` around
lines 382 - 384, The nested helper function capture should be annotated: add
type hints to its parameters and return type (e.g., def capture(path:
os.PathLike[str] | str, **kw: Any) -> None), and ensure Any and os are imported
or referenced; also apply the identical annotation to the second capture helper
mentioned (the one at the other occurrence that calls real_rmtree and appends
str(path) to removed).
| return signature, timestamp_ms | ||
|
|
||
|
|
||
| def get_webhook_secret( |
There was a problem hiding this comment.
nitpick: would have been better to keep this the way get openai client or get langfuse client is for consistency.
Summary
Target issue is #766
Explain the motivation for making this change. What existing problem does the pull request solve?
Checklist
Before submitting a pull request, please ensure that you mark these task.
fastapi run --reload app/main.pyordocker compose upin the repository root and test.Notes
Please add here if any other information is required for the reviewer.
Summary by CodeRabbit