
fix: add visibility to silent failures in NDD adapter#136

Merged
memadi-nv merged 5 commits into main from memadi/feature/add-visibility-to-silent-failures-in-nddadapter
Apr 28, 2026

Conversation

@memadi-nv
Contributor

@memadi-nv memadi-nv commented Apr 24, 2026

Summary

Surface previously silent failure modes in NddAdapter via structured logs, and introduce a typed error wrapper so callers don't depend on backend exception types.

Changes

  • src/anonymizer/engine/ndd/adapter.py
    • Wrap create() / load_dataset() / preview() in a try/except that logs a WARNING (row count, model aliases, exception) + DEBUG (workflow name, columns) and re-raises as AnonymizerWorkflowError with __cause__ preserved.
    • Add WARNING + DEBUG logs for the two silent short-circuits in _detect_missing_records (input missing tracking column, output missing tracking column, with a column-diff breadcrumb).
  • src/anonymizer/interface/errors.py
    • New AnonymizerWorkflowError for wrapping workflow-step failures.
  • tests/engine/test_ndd_adapter.py
    • Five new tests covering the three exception-wrap paths and the two detection short-circuit warnings, plus an assertion that log messages don't leak backend product names.
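
The wrap-log-and-re-raise pattern described for adapter.py can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `run_step` helper, its parameters, the logger name, and the log wording are all assumptions, and `AnonymizerWorkflowError` is redefined locally as a stand-in so the example is self-contained.

```python
import logging

logger = logging.getLogger("anonymizer.sketch")  # hypothetical logger name


class AnonymizerWorkflowError(Exception):
    """Local stand-in for the typed wrapper added in interface/errors.py."""


def run_step(step, *, workflow_name, record_count, model_aliases, columns):
    """Call `step()` and translate any failure into AnonymizerWorkflowError.

    `run_step` and its parameters are hypothetical; they only mirror the
    fields the PR description says are logged.
    """
    try:
        return step()
    except Exception as exc:
        # User-facing WARNING: row count, model aliases, exception type and message.
        logger.warning(
            "Workflow step failed for %d record(s) (models: %s): %s: %s",
            record_count,
            ", ".join(model_aliases),
            type(exc).__name__,
            exc,
        )
        # DEBUG breadcrumb: workflow name and column list.
        logger.debug("Failure context: workflow=%s columns=%s", workflow_name, columns)
        # `from exc` stores the original exception on __cause__.
        raise AnonymizerWorkflowError(f"Workflow step failed: {exc}") from exc
```

Callers can then catch `AnonymizerWorkflowError` alone and inspect `err.__cause__` only when they need backend detail.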

Type of Change

  • Bug fix

Testing

  • make test passes locally
  • make check passes locally

Related Issues

The plan for this issue was drafted in PR #128; some parts have changed since then.
Closes #110

…circuits in adapter logs

- Wrap `preview` and `create`+`load_dataset` in try/except that logs a
  user-facing WARNING (row count, model aliases, `type(exc).__name__`,
  message, actionable hint) plus a DEBUG breadcrumb (`workflow_name`,
  column list), then re-raises unchanged.
- Warn in `_detect_missing_records` on both short-circuits with distinct
  framing: input-missing as a correctness-facing silent-drop / invariant
  violation, output-missing as an observability improvement with a
  column-diff breadcrumb.
- Tests cover each new WARNING/DEBUG path and assert the new log
  messages contain no backend references.

Made-with: Cursor
@memadi-nv memadi-nv requested a review from a team as a code owner April 24, 2026 21:59
@greptile-apps
Contributor

greptile-apps Bot commented Apr 24, 2026

Greptile Summary

This PR surfaces previously silent failures in NddAdapter by wrapping create(), load_dataset(), and preview() in a structured try/except that emits WARNING + DEBUG logs and re-raises as the new AnonymizerWorkflowError (preserving __cause__), and adds equivalent WARNING + DEBUG logs for the two silent short-circuits in _detect_missing_records. Five new tests cover all changed code paths and assert that log messages don't reference backend product names.

Confidence Score: 5/5

Safe to merge — all logic is correct, tests cover all new paths, and the only finding is a speculative P2 concern about str(exc) embedding backend names.

No P0 or P1 issues found. The sole inline comment is P2 (potential backend name leakage through str(exc)), which is speculative and doesn't affect correctness. Tests are well-structured and assertions align precisely with the actual log format strings.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/anonymizer/engine/ndd/adapter.py | Adds try/except around create/load_dataset/preview with structured WARNING+DEBUG logs and re-raises as AnonymizerWorkflowError; also adds WARNING+DEBUG logs for the two silent short-circuits in _detect_missing_records. Logic is correct; minor concern that str(exc) can leak backend product names into the warning log and wrapped error message. |
| src/anonymizer/interface/errors.py | Adds AnonymizerWorkflowError, a typed wrapper that preserves __cause__ for callers who need backend detail without depending on backend exception types. Clean and correctly placed in the error hierarchy. |
| tests/engine/test_ndd_adapter.py | Five new tests covering all three exception-wrap paths and both detection short-circuit warnings. The _unique_records deduplication via id(record) correctly handles the double-capture caused by the _caplog_for_anonymizer autouse fixture in conftest.py. All log-message assertions align with the actual format strings in adapter.py. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[run_workflow called] --> B[_attach_record_ids]
    B --> C[build config + compute record_count]
    C --> D{preview_num_records?}
    D -- None --> E[data_designer.create + load_dataset]
    D -- set --> F[data_designer.preview]
    E --> G{Exception?}
    F --> G
    G -- yes --> H[log WARNING + DEBUG]
    H --> I[raise AnonymizerWorkflowError from exc]
    G -- no --> J[_detect_missing_records]
    J --> K{RECORD_ID_COLUMN in input_df?}
    K -- no --> L[log WARNING + DEBUG\nreturn empty list]
    K -- yes --> M{RECORD_ID_COLUMN in output_df?}
    M -- no --> N[log WARNING + DEBUG\nreturn all as FailedRecord]
    M -- yes --> O[diff input_ids vs output_ids\nreturn FailedRecord list]
    O --> P[WorkflowRunResult]
    N --> P
    L --> P

Reviews (5): Last reviewed commit: "update test"
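
The detection branches in the flowchart above can be sketched in plain Python. This is only an approximation under stated assumptions: the real adapter works on DataFrames (`input_df`, `output_df`), whereas this sketch uses lists of dicts with uniform keys so it runs without third-party packages; the value of `RECORD_ID_COLUMN`, the fields of `FailedRecord`, and the log wording are hypothetical.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("anonymizer.sketch")  # hypothetical logger name

RECORD_ID_COLUMN = "_record_id"  # assumed value; the real constant is not shown in the PR


@dataclass(frozen=True)
class FailedRecord:
    """Hypothetical shape; the real FailedRecord fields are not shown in the PR."""
    record_id: str
    reason: str


def detect_missing_records(input_rows, output_rows):
    """Return a FailedRecord for every input row that has no matching output row."""
    input_columns = {key for row in input_rows for key in row}
    output_columns = {key for row in output_rows for key in row}

    # Short-circuit 1: without the tracking column on the input, nothing can be diffed.
    if RECORD_ID_COLUMN not in input_columns:
        logger.warning("Input is missing the tracking column; dropped records cannot be detected.")
        logger.debug("Input columns: %s", sorted(input_columns))
        return []

    # Short-circuit 2: without it on the output, every input record counts as failed.
    if RECORD_ID_COLUMN not in output_columns:
        logger.warning(
            "Output is missing the tracking column; marking all %d input record(s) as failed.",
            len(input_rows),
        )
        logger.debug(
            "Columns only in input: %s; only in output: %s",
            sorted(input_columns - output_columns),
            sorted(output_columns - input_columns),
        )
        return [
            FailedRecord(row[RECORD_ID_COLUMN], "tracking column missing from output")
            for row in input_rows
        ]

    # Normal path: diff input IDs against output IDs.
    output_ids = {row[RECORD_ID_COLUMN] for row in output_rows}
    return [
        FailedRecord(row[RECORD_ID_COLUMN], "missing from output")
        for row in input_rows
        if row[RECORD_ID_COLUMN] not in output_ids
    ]
```

Logging on both short-circuits is the point of the change: the return values alone do not tell a caller why the list is empty or why every record is reported as failed.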

Comment thread src/anonymizer/engine/ndd/adapter.py Outdated
@memadi-nv memadi-nv changed the title from "fix(ndd): surface silent preview/create failures and detection short-circuits in adapter logs" to "fix: add visibility to silent failures in ndd adapter" Apr 24, 2026
@memadi-nv memadi-nv changed the title from "fix: add visibility to silent failures in ndd adapter" to "fix: add visibility to silent failures in NDD adapter" Apr 24, 2026
Signed-off-by: memadi <memadi@nvidia.com>
Comment thread src/anonymizer/engine/ndd/adapter.py Outdated
Signed-off-by: memadi <memadi@nvidia.com>
Comment thread src/anonymizer/engine/ndd/adapter.py
Collaborator

@lipikaramaswamy lipikaramaswamy left a comment

Looks good to approve. Were you able to check using a small model or config with very low timeout to simulate a data designer failure?

Signed-off-by: memadi <memadi@nvidia.com>
Comment thread tests/engine/test_ndd_adapter.py Outdated
@memadi-nv
Contributor Author

> Were you able to check using a small model or config with very low timeout to simulate a data designer failure?

Instead of driving an actual low-timeout model, the tests added in tests/engine/test_ndd_adapter.py mock DataDesigner and inject side_effect exceptions at each of the three points where the backend can fail:

  1. preview failure: test_preview_exception_wraps_in_workflow_error_and_logs
    • mock_dd.preview.side_effect = DataDesignerRuntimeError("endpoint unreachable")
  2. create failure: test_create_exception_wraps_in_workflow_error_and_logs
    • mock_dd.create.side_effect = DataDesignerRuntimeError("quota exceeded")
  3. load_dataset failure (post-create): test_load_dataset_exception_wraps_in_workflow_error_and_logs
    • mock_create_results.load_dataset.side_effect = DataDesignerRuntimeError("corrupt parquet")

Each test asserts that:

  • The failure is wrapped in AnonymizerWorkflowError with the original DataDesigner exception preserved as __cause__.
  • The warning log includes the record count, model alias, and error message.
  • The debug log includes the workflow context (columns, workflow name).
  • Neither log leaks backend identifiers (Data Designer, DataDesigner, etc.).

I assume that's sufficient, but let me know if it's not the case.
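
For readers unfamiliar with the mocking approach described above, here is a rough, self-contained sketch of one such check. It is not the test code from the PR: `run_preview`, `BackendRuntimeError`, `ListHandler`, and `exercise_preview_failure` are hypothetical names, and it captures logs with a plain `logging.Handler` rather than pytest's `caplog` so it runs without pytest.

```python
import logging
from unittest.mock import MagicMock

logger = logging.getLogger("anonymizer.sketch.adapter")  # hypothetical logger name


class AnonymizerWorkflowError(Exception):
    """Local stand-in for the typed wrapper."""


class BackendRuntimeError(Exception):
    """Hypothetical stand-in for the backend's own exception type."""


def run_preview(backend, num_records):
    """Minimal stand-in for the adapter's preview path (assumed shape)."""
    try:
        return backend.preview(num_records=num_records)
    except Exception as exc:
        logger.warning("Preview failed for %d record(s): %s", num_records, exc)
        raise AnonymizerWorkflowError("Preview failed") from exc


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be asserted on."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def exercise_preview_failure():
    """Inject a backend failure via side_effect and return (cause, log messages)."""
    mock_backend = MagicMock()
    mock_backend.preview.side_effect = BackendRuntimeError("endpoint unreachable")

    handler = ListHandler()
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG)
    try:
        try:
            run_preview(mock_backend, num_records=2)
        except AnonymizerWorkflowError as err:
            cause = err.__cause__
        else:
            raise AssertionError("expected AnonymizerWorkflowError")
    finally:
        logger.removeHandler(handler)

    return cause, [record.getMessage() for record in handler.records]
```

Setting `side_effect` to an exception instance makes the mocked call raise it, which exercises the error path deterministically without a real model or timeout.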

Signed-off-by: memadi <memadi@nvidia.com>
@memadi-nv memadi-nv merged commit 3500018 into main Apr 28, 2026
19 of 39 checks passed
@memadi-nv memadi-nv deleted the memadi/feature/add-visibility-to-silent-failures-in-nddadapter branch April 28, 2026 16:46

Development

Successfully merging this pull request may close these issues.

bug: Data Designer failures silently drop records — add visibility and retry

2 participants