Fix stuck DSRs #8211

Merged
JadeCara merged 12 commits into main from ENG-3834/skip-orphaned-async-tasks
May 22, 2026

Conversation

@JadeCara JadeCara commented May 15, 2026

Ticket ENG-3834

Description Of Changes

Two related fixes for stuck DSRs caused by deleted or misconfigured integrations.

1. Orphaned async callback tasks

Async callback tasks in awaiting_processing status were permanently stuck when their ConnectionConfig was deleted or disabled. Three safety mechanisms all failed simultaneously:

  1. The watchdog iterator only checked in_processing/pending tasks — awaiting_processing tasks were invisible
  2. _has_async_tasks_awaiting_external_completion had no status filter, so any privacy request that had ever had an async task stayed invisible to the watchdog, even after that async task completed
  3. GraphTask.__init__ crashed on deleted connections instead of skipping gracefully

The fix uses the existing skip machinery (CollectionDisabled → @retry → log_skipped), with no new endpoints, no migrations, and no UI changes.
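The two watchdog visibility changes can be illustrated with a minimal sketch. It is not the Fides implementation: the `Task` class, the status sets, and both function names are hypothetical stand-ins, and the literal status strings are assumptions.

```python
from dataclasses import dataclass

# Hypothetical status groupings; the real constants live in the Fides codebase.
IN_PROGRESS_STATUSES = {"in_processing", "pending", "awaiting_processing"}
EXITED_STATUSES = {"complete", "error", "skipped"}


@dataclass
class Task:
    """Stand-in for a request task row (not the real ORM model)."""
    task_id: str
    status: str
    is_async: bool = False


def tasks_visible_to_watchdog(tasks):
    """Return the tasks the watchdog should inspect, including awaiting_processing."""
    return [t for t in tasks if t.status in IN_PROGRESS_STATUSES]


def has_async_tasks_awaiting_completion(tasks):
    """True only if some async task has not yet reached an exited status."""
    return any(t.is_async and t.status not in EXITED_STATUSES for t in tasks)
```

With the status filter in place, a privacy request whose only async task has already completed no longer counts as "awaiting external completion", so the watchdog can look at its remaining tasks.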

2. Dangling erase_after references crash erasure task creation

When a collection's erase_after references a collection belonging to a deleted integration, the unvalidated reference creates a phantom node in the erasure networkx graph via implicit networkx.add_edge node creation. This phantom node then causes a KeyError at dataset_graph.nodes[node] in base_task_data, which kills the erasure task creation loop partway through. The result: partial erasure tasks in the DB, no TERMINATE task, and a privacy request that completes without actually performing any erasure — a silent compliance failure.

The fix adds upfront validation in build_erasure_networkx_digraph that checks all erase_after references point to collections that exist in the traversal before building edges. Raises TraversalError with a clear message before any tasks are persisted.
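A rough sketch of the validate-before-build idea, using plain sets and dicts instead of networkx so it runs on its own. The `TraversalError` class here is a local stand-in, `validate_erase_after` and `build_edges` are hypothetical names, and the edge direction (referenced collection first) is an assumption.

```python
class TraversalError(Exception):
    """Local stand-in for the Fides TraversalError."""


def validate_erase_after(known_nodes, erase_after):
    """Raise before any edge is built if a reference is dangling.

    known_nodes: set of collection addresses present in the traversal.
    erase_after: mapping of collection address -> collections it must run after.
    """
    for node_name, refs in erase_after.items():
        for ref in refs:
            if ref not in known_nodes:
                raise TraversalError(
                    f"Collection {node_name} has an erase_after reference to "
                    f"{ref} which does not exist in the dataset graph."
                )


def build_edges(known_nodes, erase_after):
    """Validate first, then return (upstream, downstream) edge pairs."""
    validate_erase_after(known_nodes, erase_after)
    return [(ref, node) for node, refs in erase_after.items() for ref in refs]
```

Because the check runs before any edge exists, a dangling reference can never become a node that later lookups fail on.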

Code Changes

  • graph_task.py: GraphTask.__init__ now tolerates missing connectors (self.connector = None); skip_if_disabled() checks for None connector first and raises CollectionDisabled, which the existing @retry decorator handles identically to a disabled connection
  • request_service.py: Watchdog iterator now includes awaiting_processing tasks; added status filter to _has_async_tasks_awaiting_external_completion so exited async tasks don't blind the watchdog; new _task_is_orphaned() function checks if a task's connection is deleted or disabled; watchdog per-task loop uses this to distinguish orphaned tasks from legitimately waiting ones
  • create_request_tasks.py: build_erasure_networkx_digraph now validates all erase_after references against the set of known nodes (traversal nodes + end nodes + artificial nodes) before building the graph. Raises TraversalError with a message identifying the offending collection and dangling reference
  • test_requeue_interrupted_tasks.py: 6 new tests covering deleted connection, disabled connection, valid connection (not requeued), and 3 exited-status-blinding cases
  • test_erase_after_dangling_ref.py: 3 new tests — graph builder rejects dangling refs, task creation raises before persisting partial state, valid erase_after continues to work
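The graph_task.py change could look roughly like the sketch below. Everything in it is a simplified stand-in: the exception classes are redefined locally, `retry` is a bare decorator rather than the real one, `GraphTaskSketch` is a hypothetical class, and the connector is modelled as a plain dict.

```python
import functools


class ConnectorNotFoundException(Exception):
    """Stand-in for the exception raised when a ConnectionConfig is gone."""


class CollectionDisabled(Exception):
    """Stand-in for the exception the skip path listens for."""


def retry(func):
    """Simplified decorator: run the skip check first and report a skip instead of failing."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        try:
            self.skip_if_disabled()
        except CollectionDisabled:
            return "skipped"
        return func(self, *args, **kwargs)
    return wrapper


class GraphTaskSketch:
    def __init__(self, connector_factory):
        try:
            self.connector = connector_factory()
        except ConnectorNotFoundException:
            # Deleted connection: defer to skip_if_disabled instead of crashing here.
            self.connector = None

    def skip_if_disabled(self):
        # A missing connector is treated the same way as a disabled one.
        if self.connector is None or self.connector.get("disabled"):
            raise CollectionDisabled("connection is deleted or disabled")

    @retry
    def access_request(self):
        return "executed"
```

The point of the design is that a deleted connection and a disabled connection take the same exit, so no new skip path is needed.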

Steps to Confirm

Prerequisites

  • fidesplus docker stack running (docker compose up)
  • A mock HTTP server on the host that returns 200 {} for all requests (used as the SaaS endpoint target):
    python3 -c "
    from http.server import HTTPServer, BaseHTTPRequestHandler
    class H(BaseHTTPRequestHandler):
        def do_GET(self): self.send_response(200); self.end_headers(); self.wfile.write(b'{}')
        def do_POST(self): self.do_GET()
        def do_DELETE(self): self.do_GET()
        def log_message(self, *a): pass
    HTTPServer(('0.0.0.0', 9999), H).serve_forever()
    " &
  • Verify the container can reach it: docker exec fidesplus-slim curl -s http://host.docker.internal:9999/test
  • Obtain an auth token (use OAuth client credentials) and set: TOKEN=<your_token>

Register connector templates (one-time setup)

Two custom SaaS connector templates are needed: one with async callback, one without.

Async callback template (async_callback_test):

TMPDIR=$(mktemp -d)
cat > "$TMPDIR/config.yml" << 'YAML'
saas_config:
  fides_key: <instance_fides_key>
  name: Async Callback Test
  type: async_callback_test
  description: Test connector with async callback on read
  version: 0.0.1
  connector_params:
    - name: domain
    - name: api_token
      sensitive: true
  client_config:
    protocol: http
    host: <domain>
    authentication:
      strategy: bearer
      configuration:
        token: <api_token>
  test_request:
    method: GET
    path: /test
  endpoints:
    - name: user
      requests:
        read:
          method: GET
          path: /api/access-package
          param_values:
            - name: email
              identity: email
          async_config:
            strategy: callback
YAML
cat > "$TMPDIR/dataset.yml" << 'YAML'
dataset:
  - fides_key: <instance_fides_key>
    name: Async Callback Test Dataset
    collections:
      - name: user
        fields:
          - name: id
            data_categories: [user.unique_id]
            fides_meta:
              data_type: string
          - name: name
            data_categories: [user.name]
            fides_meta:
              data_type: string
YAML
cd "$TMPDIR" && zip template.zip config.yml dataset.yml
curl -s -X POST "http://localhost:8080/api/v1/connector_template/register" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/octet-stream" \
  --data-binary "@template.zip"

Simple (non-async) template (simple_test):

TMPDIR=$(mktemp -d)
cat > "$TMPDIR/config.yml" << 'YAML'
saas_config:
  fides_key: <instance_fides_key>
  name: Simple Test
  type: simple_test
  description: Simple test connector (no async)
  version: 0.0.1
  connector_params:
    - name: domain
    - name: api_token
      sensitive: true
  client_config:
    protocol: http
    host: <domain>
    authentication:
      strategy: bearer
      configuration:
        token: <api_token>
  test_request:
    method: GET
    path: /test
  endpoints:
    - name: user
      requests:
        read:
          method: GET
          path: /api/user
          param_values:
            - name: email
              identity: email
        delete:
          method: DELETE
          path: /api/user/<email>
          param_values:
            - name: email
              identity: email
YAML
cat > "$TMPDIR/dataset.yml" << 'YAML'
dataset:
  - fides_key: <instance_fides_key>
    name: Simple Test Dataset
    collections:
      - name: user
        fields:
          - name: id
            data_categories: [user.unique_id]
            fides_meta:
              data_type: string
          - name: email
            data_categories: [user.contact.email]
            fides_meta:
              data_type: string
              identity: email
YAML
cd "$TMPDIR" && zip template.zip config.yml dataset.yml
curl -s -X POST "http://localhost:8080/api/v1/connector_template/register" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/octet-stream" \
  --data-binary "@template.zip"

Scenario 1: Orphaned async callback tasks

Setup: Create a SaaS connection using the async_callback_test template, pointing at the mock server.

# Create connection
curl -s -X PATCH "http://localhost:8080/api/v1/connection" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '[{"name": "Async Callback Test", "key": "async_callback_test", "connection_type": "saas", "access": "read", "saas_connector_type": "async_callback_test"}]'

# Set SaaS config (async callback on the read endpoint)
curl -s -X PATCH "http://localhost:8080/api/v1/connection/async_callback_test/saas_config" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '{
    "fides_key": "async_callback_test",
    "name": "Async Callback Test",
    "type": "async_callback_test",
    "description": "Test async callback connector",
    "version": "0.0.1",
    "connector_params": [{"name": "domain"}, {"name": "api_token", "sensitive": true}],
    "client_config": {"protocol": "http", "host": "<domain>", "authentication": {"strategy": "bearer", "configuration": {"token": "<api_token>"}}},
    "test_request": {"method": "GET", "path": "/test"},
    "endpoints": [{"name": "user", "requests": {"read": {"method": "GET", "path": "/api/access-package", "param_values": [{"name": "email", "identity": "email"}], "async_config": {"strategy": "callback"}}}}]
  }'

# Set secrets (mock server on host)
curl -s -X PUT "http://localhost:8080/api/v1/connection/async_callback_test/secret" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '{"domain": "host.docker.internal:9999", "api_token": "test-token"}'

# Set dataset (no identity in fides_meta — identities go in SaaS config param_values)
curl -s -X PATCH "http://localhost:8080/api/v1/connection/async_callback_test/dataset" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '[{"fides_key": "async_callback_test", "name": "Async Callback Test Dataset", "collections": [{"name": "user", "fields": [{"name": "id", "data_categories": ["user.unique_id"], "fides_meta": {"data_type": "string"}}, {"name": "name", "data_categories": ["user.name"], "fides_meta": {"data_type": "string"}}]}]}]'

Steps:

  1. Submit an access request via Privacy Center (localhost:3001) and approve it in the Admin UI (localhost:3000).
  2. Poll execution logs until async_callback_test:user shows awaiting_processing:
    curl -s "http://localhost:8080/api/v1/privacy-request/<PR_ID>/log?page=1&size=100" \
      -H "Authorization: Bearer $TOKEN" | python3 -c "
    import json,sys
    for item in json.loads(sys.stdin.read()).get('items', []):
        if 'async_callback' in str(item.get('dataset_name','')):
            print(f'{item[\"dataset_name\"]}:{item.get(\"collection_name\")} status={item[\"status\"]}')"
  3. Delete the connection:
    curl -s -X DELETE "http://localhost:8080/api/v1/connection/async_callback_test" \
      -H "Authorization: Bearer $TOKEN"
  4. Wait for the next watchdog cycle (runs every 5 minutes). Monitor the API server logs:
    docker logs -f fidesplus-slim 2>&1 | grep -E "orphan|awaiting_processing.*deleted|requeue"
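For repeated polling in step 2, the inline filter can be pulled into a small helper. This is only a convenience sketch: `matching_log_lines` and `reached_status` are hypothetical names, and the payload shape (an `items` list with `dataset_name`, `collection_name`, and `status`) is taken from the one-liner above.

```python
import json


def matching_log_lines(payload, needle="async_callback"):
    """Return 'dataset:collection status=...' strings for matching log items."""
    lines = []
    for item in json.loads(payload).get("items", []):
        if needle in str(item.get("dataset_name", "")):
            lines.append(
                f"{item['dataset_name']}:{item.get('collection_name')} "
                f"status={item['status']}"
            )
    return lines


def reached_status(payload, status="awaiting_processing"):
    """True once any matching log item reports the given status."""
    return any(
        line.endswith(f"status={status}") for line in matching_log_lines(payload)
    )
```

Feed it the raw JSON body from the log endpoint and delete the connection once `reached_status` returns True.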

Expected result (API server logs):

Request task <id> (privacy request <pr_id>) is awaiting_processing but connection is deleted or disabled — requeueing
Requeuing privacy request <pr_id> (attempt N/3)

Expected result (worker logs):

docker logs fidesplus-fidesplus-worker-generic-1 2>&1 | grep "async_callback_test"
CollectionDisabled - Skipping collection async_callback_test:user for privacy_request: <pr_id>
Skipping node async_callback_test:user
Access task async_callback_test:user is skipped.

Scenario 2: Dangling erase_after references

Setup: Create two SaaS connections — A (async) and B (simple) — where B's erasure depends on A. Then delete A.

# --- Integration A (async, will be deleted) ---
curl -s -X PATCH "http://localhost:8080/api/v1/connection" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '[{"name": "Integration A", "key": "integration_a", "connection_type": "saas", "access": "write", "saas_connector_type": "async_callback_test"}]'

curl -s -X PATCH "http://localhost:8080/api/v1/connection/integration_a/saas_config" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '{
    "fides_key": "integration_a", "name": "Integration A", "type": "async_callback_test",
    "description": "Async integration (will be deleted)", "version": "0.0.1",
    "connector_params": [{"name": "domain"}, {"name": "api_token", "sensitive": true}],
    "client_config": {"protocol": "http", "host": "<domain>", "authentication": {"strategy": "bearer", "configuration": {"token": "<api_token>"}}},
    "test_request": {"method": "GET", "path": "/test"},
    "endpoints": [{"name": "user", "requests": {
      "read": {"method": "GET", "path": "/api/user", "param_values": [{"name": "email", "identity": "email"}], "async_config": {"strategy": "callback"}},
      "delete": {"method": "DELETE", "path": "/api/user/<email>", "param_values": [{"name": "email", "identity": "email"}]}
    }}]
  }'

curl -s -X PUT "http://localhost:8080/api/v1/connection/integration_a/secret" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '{"domain": "host.docker.internal:9999", "api_token": "test-token"}'

curl -s -X PATCH "http://localhost:8080/api/v1/connection/integration_a/dataset" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '[{"fides_key": "integration_a", "name": "Integration A", "collections": [{"name": "user", "fields": [{"name": "id", "data_categories": ["user.unique_id"], "fides_meta": {"data_type": "string"}}, {"name": "email", "data_categories": ["user.contact.email"], "fides_meta": {"data_type": "string"}}]}]}]'

# --- Integration B (simple, erase_after -> A) ---
curl -s -X PATCH "http://localhost:8080/api/v1/connection" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '[{"name": "Integration B", "key": "integration_b", "connection_type": "saas", "access": "write", "saas_connector_type": "simple_test"}]'

# NOTE: erase_after is on the SaaS config endpoint, not the dataset
curl -s -X PATCH "http://localhost:8080/api/v1/connection/integration_b/saas_config" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '{
    "fides_key": "integration_b", "name": "Integration B", "type": "simple_test",
    "description": "Simple integration with erase_after dependency on A", "version": "0.0.1",
    "connector_params": [{"name": "domain"}, {"name": "api_token", "sensitive": true}],
    "client_config": {"protocol": "http", "host": "<domain>", "authentication": {"strategy": "bearer", "configuration": {"token": "<api_token>"}}},
    "test_request": {"method": "GET", "path": "/test"},
    "endpoints": [{"name": "user",
      "erase_after": ["integration_a.user"],
      "requests": {
        "read": {"method": "GET", "path": "/api/user", "param_values": [{"name": "email", "identity": "email"}]},
        "delete": {"method": "DELETE", "path": "/api/user/<email>", "param_values": [{"name": "email", "identity": "email"}]}
      }
    }]
  }'

curl -s -X PUT "http://localhost:8080/api/v1/connection/integration_b/secret" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '{"domain": "host.docker.internal:9999", "api_token": "test-token"}'

curl -s -X PATCH "http://localhost:8080/api/v1/connection/integration_b/dataset" \
  -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" \
  -d '[{"fides_key": "integration_b", "name": "Integration B", "collections": [{"name": "user", "fields": [{"name": "id", "data_categories": ["user.unique_id"], "fides_meta": {"data_type": "string"}}, {"name": "email", "data_categories": ["user.contact.email"], "fides_meta": {"data_type": "string"}}]}]}]'

# --- Delete integration A to create the dangling reference ---
curl -s -X DELETE "http://localhost:8080/api/v1/connection/integration_a" \
  -H "Authorization: Bearer $TOKEN"

Steps:

  1. Submit an erasure request via Privacy Center and approve it in the Admin UI.
  2. The request should fail during erasure task creation.

Expected result (worker logs):

docker logs fidesplus-fidesplus-worker-generic-1 2>&1 | grep -i "erase_after\|TraversalError"
TraversalError encountered for privacy request. Error: Collection integration_b:user has an erase_after
reference to integration_a:user which does not exist in the dataset graph. This may indicate a deleted
integration that is still referenced.
[Two screenshots, 2026-05-21, omitted]

Key details:

  • The request errors with a clear TraversalError, not a cryptic KeyError
  • No partial erasure tasks are persisted — the error is raised before any tasks are created
  • Integration B's erase_after is set on the SaaS config endpoint, not the dataset fides_meta
  • Both connections need access: "write" to participate in erasure
  • SaaS datasets must NOT have identity in fides_meta — identities go in SaaS config param_values

Cleanup

curl -s -X DELETE "http://localhost:8080/api/v1/connection/async_callback_test" -H "Authorization: Bearer $TOKEN"
curl -s -X DELETE "http://localhost:8080/api/v1/connection/integration_a" -H "Authorization: Bearer $TOKEN"
curl -s -X DELETE "http://localhost:8080/api/v1/connection/integration_b" -H "Authorization: Bearer $TOKEN"
curl -s -X DELETE "http://localhost:8080/api/v1/connector_template/async_callback_test" -H "Authorization: Bearer $TOKEN"
curl -s -X DELETE "http://localhost:8080/api/v1/connector_template/simple_test" -H "Authorization: Bearer $TOKEN"
# Kill the mock server
kill %1

Pre-Merge Checklist

  • Issue requirements met
  • All CI pipelines succeeded
  • CHANGELOG.md updated
    • Add a db-migration label to the entry if your change includes a DB migration
    • Add a high-risk label to the entry if your change includes a high-risk change
    • Updates unreleased work already in Changelog, no new entry necessary
  • UX feedback:
    • No UX review needed
  • Followup issues:
    • Followup issues created
    • No followup issues
  • Database migrations:
    • No migrations
  • Documentation:
    • No documentation updates required

…or disabled

Async callback tasks in awaiting_processing status were permanently stuck when
their ConnectionConfig was deleted — the watchdog couldn't see them, the status
poller couldn't help, and reprocessing reused the same stuck tasks.

Three changes:
- GraphTask.__init__ tolerates missing connectors (sets self.connector = None),
  and skip_if_disabled() checks for None first, raising CollectionDisabled
  through the existing @retry skip path
- Watchdog iterator now includes awaiting_processing tasks, with a per-task
  connection check (_task_is_orphaned) to distinguish orphaned tasks from
  those legitimately waiting for callbacks
- Status filter added to _has_async_tasks_awaiting_external_completion so
  completed/errored async tasks no longer permanently blind the watchdog

Also fixes a previously-undetected bug: disabled connections with callback
tasks had the same blind spot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

vercel Bot commented May 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments
Project               Deployment  Actions  Updated (UTC)
fides-plus-nightly    Ignored     Preview  May 22, 2026 4:41pm
fides-privacy-center  Ignored              May 22, 2026 4:41pm

@JadeCara JadeCara force-pushed the ENG-3834/skip-orphaned-async-tasks branch from 4b86b1c to 030a54f Compare May 15, 2026 20:32
Comment thread changelog/8211-fix-stuck-dsrs-orphaned-async-tasks.yaml Outdated

codecov Bot commented May 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.09%. Comparing base (279734e) to head (c9ea5f5).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8211      +/-   ##
==========================================
- Coverage   85.16%   85.09%   -0.08%     
==========================================
  Files         670      670              
  Lines       43498    43592      +94     
  Branches     5093     5120      +27     
==========================================
+ Hits        37046    37094      +48     
- Misses       5346     5393      +47     
+ Partials     1106     1105       -1     

Add upfront validation in build_erasure_networkx_digraph that checks all
erase_after references point to collections that exist in the traversal
nodes, end nodes, or artificial nodes. A dangling reference (e.g. from a
deleted integration) previously created a phantom node via implicit
networkx.add_edge, causing a KeyError at dataset_graph.nodes[node] in
base_task_data that killed erasure task creation partway through and left
the privacy request in an unrecoverable state.

Now raises TraversalError with a clear message identifying the offending
collection and the dangling reference before any tasks are persisted.
@JadeCara JadeCara changed the title from "ENG-3834: Fix stuck DSRs when async task ConnectionConfig is deleted or disabled" to "Fix stuck DSRs" May 15, 2026
@JadeCara

/code-review

@claude claude Bot left a comment

Code Review — PR #8211: Fix stuck DSRs with orphaned async tasks

This PR addresses a real and impactful bug class (permanently stuck DSRs) with five distinct, well-scoped changes:

  1. _get_request_task_ids_in_progress — adds awaiting_processing to the in-progress statuses so the watchdog can see async tasks.
  2. _has_async_tasks_awaiting_external_completion — adds .notin_(EXITED_EXECUTION_LOG_STATUSES) so completed/errored/skipped async tasks no longer permanently blind the watchdog.
  3. _task_is_orphaned — new helper that detects deleted or disabled ConnectionConfig for an async task, allowing the watchdog to requeue orphaned tasks instead of waiting forever.
  4. GraphTask.__init__ — gracefully handles ConnectorNotFoundException (hard-deleted config) by setting self.connector = None and letting skip_if_disabled() raise CollectionDisabled through the existing @retry path.
  5. build_erasure_networkx_digraph — validates erase_after references upfront before building the graph.

The logic is clear and the test coverage is thorough. A few things worth looking at:

Issues

  • generate_dry_run_query is not guarded against None connector (see inline comment at graph_task.py:290). The production task-execution path is safe because @retry calls skip_if_disabled() first, but generate_dry_run_query has no such guard and would raise AttributeError if called on a task with a deleted connection.

  • erase_after validation is a breaking change for stale configs (see create_request_tasks.py:195). Previously, dangling erase_after references would silently corrupt the task graph with a KeyError mid-execution. Now, they fail fast with a clear TraversalError. This is strictly better from a data-integrity standpoint, but customers with active datasets referencing deleted integrations will start seeing new errors on erasure. Worth a note in the changelog or docs.

  • _task_is_orphaned returns False when connection key is missing (see request_service.py:563). The conservative default is appropriate to avoid false positives, but a comment explaining the intent would help, and the log level may be too low if this represents unexpected state.
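The conservative default discussed in the last bullet can be sketched as follows. This is an illustration under assumptions: `task_is_orphaned` is a hypothetical free function, `connections` is a dict standing in for a database lookup, and the warning log level is a guess rather than what the PR uses.

```python
import logging

logger = logging.getLogger(__name__)


def task_is_orphaned(traversal_details, connections):
    """Return True only when the task's connection is known to be deleted or disabled.

    traversal_details: dict that may hold 'dataset_connection_key'.
    connections: mapping of connection key -> {'disabled': bool} (stand-in for a DB lookup).
    """
    key = (traversal_details or {}).get("dataset_connection_key")
    if not key:
        # Conservative default: without a key we cannot prove the task is orphaned.
        logger.warning("Request task has no dataset_connection_key in traversal_details")
        return False
    connection = connections.get(key)
    return connection is None or bool(connection.get("disabled"))
```

Returning False for a missing key avoids requeueing a task on incomplete evidence, at the cost of leaving that task for some other mechanism to surface.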

Minor

  • Test docstring wording is slightly misleading (see test_requeue_interrupted_tasks.py:618) — describes the pre-fix bug rather than what the test verifies.

Overall this is solid work. The watchdog interaction changes are well-thought-out, and the two-branch logic (orphaned check for awaiting_processing, requeue for everything else) is a clean solution. The main thing to address before merge is the generate_dry_run_query null-safety gap.
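One possible shape for that null-safety guard is sketched here. It does not claim to match the actual fix: `DryRunTaskSketch` and `FakeConnector` are hypothetical classes, the `dry_run_query` connector method is invented for the example, and returning None for a missing connector is an assumption.

```python
from typing import Optional


class FakeConnector:
    """Hypothetical connector exposing a single dry-run method."""

    def dry_run_query(self) -> str:
        return "SELECT id FROM user WHERE email = ?"


class DryRunTaskSketch:
    """Stand-in for a graph task whose connector may be None after a deletion."""

    def __init__(self, connector: Optional[FakeConnector] = None):
        self.connector = connector

    def generate_dry_run_query(self) -> Optional[str]:
        if self.connector is None:
            # No connector to ask; return nothing rather than raise AttributeError.
            return None
        return self.connector.dry_run_query()
```

Callers would then need to treat None as "no query available for this collection".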

Comment thread src/fides/api/task/create_request_tasks.py
Comment thread src/fides/api/service/privacy_request/request_service.py
Comment thread tests/fides/task/test_requeue_interrupted_tasks.py
@JadeCara JadeCara marked this pull request as ready for review May 21, 2026 21:09
@JadeCara JadeCara requested a review from a team as a code owner May 21, 2026 21:09
@JadeCara JadeCara requested review from adamsachs and removed request for a team May 21, 2026 21:09
@JadeCara JadeCara requested review from eastandwestwind and removed request for adamsachs May 21, 2026 21:34

@eastandwestwind eastandwestwind left a comment

Small nits / Qs but otherwise good to go!

Comment on lines +548 to +553
# Orphaned task tests (ENG-3834)
#
# These tests document the behavior of the watchdog when async tasks have
# deleted or disabled connections. Tests marked xfail demonstrate the
# current bug — they will pass once the fix is applied.
# ---------------------------------------------------------------------------

Nice test-driven approach, thanks for adding these!

Comment on lines +192 to +194
f"Collection {node_name} has an erase_after reference to "
f"{ref} which does not exist in the dataset graph. This may "
f"indicate a deleted integration that is still referenced."

Could we make this msg more actionable, e.g. "Erasure cannot proceed: collection 'active_api:users' has an 'Erase After' dependency on 'deleted_api:users', which no longer exists in the system. Update the 'Erase After' setting on this collection in the dataset configuration to remove the stale reference."

Done — updated the error message to be more specific and actionable, naming the collection and stale reference with guidance on how to fix it.

Comment on lines +560 to +562
f"Request task {request_task_id} has no dataset_connection_key "
f"in traversal_details — possible data integrity issue"
)

Does this mean the task is actually stuck? Is it worth surfacing an Execution error so it's evident in Admin-UI?

Yeah - and I might have gone overboard - it actually currently has 2
[Two screenshots, 2026-05-21, omitted]

Make the TraversalError message more actionable by naming the specific
collection and stale reference, with guidance on how to fix it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add None guard in generate_dry_run_query to prevent AttributeError
  when ConnectionConfig is deleted during a dry run
- Add comment documenting Redis-only manual-webhook input fragility
  when requeuing requires_input/pending_external PRs
- Add TestDeletedConnectionConfig tests covering connector=None init,
  saas_version fallback, dry_run_query guard, and skip_if_disabled
- Add test for missing connection_key not being treated as orphaned

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@JadeCara JadeCara added this pull request to the merge queue May 22, 2026
Merged via the queue into main with commit d31d98c May 22, 2026
68 of 69 checks passed
@JadeCara JadeCara deleted the ENG-3834/skip-orphaned-async-tasks branch May 22, 2026 17:35
JadeCara added a commit that referenced this pull request May 27, 2026
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>