test: pin custom generator error boundary #684
Greptile Summary
This PR adds a scheduler-level integration test that pins the error-boundary policy for decorated custom-generator user failures. Specifically, it verifies that a
| Filename | Overview |
|---|---|
| packages/data-designer-engine/tests/engine/dataset_builders/test_async_scheduler.py | Adds one new integration test; assertions correctly trace the decorated custom-generator failure path through the scheduler. No logic issues found. |
Sequence Diagram
sequenceDiagram
participant Test
participant AsyncTaskScheduler
participant CustomColumnGenerator
participant custom_py as custom.py (_generate)
Test->>AsyncTaskScheduler: scheduler.run()
AsyncTaskScheduler->>CustomColumnGenerator: agenerate(row_dict)
Note over CustomColumnGenerator: sync fn → asyncio.to_thread(self.generate, data)
CustomColumnGenerator->>custom_py: "_generate(data, is_dataframe=False)"
custom_py->>custom_py: _invoke_generator_function(data)
custom_py-->>custom_py: raises KeyError("missing user field")
custom_py->>custom_py: except Exception → log WARNING "This record will be skipped"
custom_py-->>CustomColumnGenerator: raises CustomColumnGenerationError
CustomColumnGenerator-->>AsyncTaskScheduler: raises CustomColumnGenerationError
Note over AsyncTaskScheduler: _is_expected_non_retryable(exc) → True (DataDesignerError)
AsyncTaskScheduler->>AsyncTaskScheduler: "_drop_row(row_group=0, row_index=0)"
AsyncTaskScheduler-->>Test: run() returns normally (no fatal abort)
Reviews (1): Last reviewed commit: "test: pin custom generator error boundar..."
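The boundary traced in the sequence diagram can be illustrated with a small standalone sketch. This is not the project's code: the two exception classes are simplified stand-ins for the names in the diagram, and `user_generator`, `generate`, `run`, and `run_sync` are hypothetical helpers invented for illustration. It assumes only what the diagram states: user failures are wrapped in a `DataDesignerError` subclass and cause a row drop, while raw exceptions stay fatal.

```python
import asyncio
import logging

logger = logging.getLogger("custom_generator_sketch")


class DataDesignerError(Exception):
    """Stand-in for the library's base error type (simplified for this sketch)."""


class CustomColumnGenerationError(DataDesignerError):
    """Stand-in for the wrapper raised when user generator code fails."""


def user_generator(row: dict) -> dict:
    # Hypothetical user function: raises KeyError when "name" is missing.
    return {**row, "greeting": f"hello {row['name']}"}


def generate(row: dict) -> dict:
    # Wrap any user-code failure so the caller sees a DataDesignerError.
    try:
        return user_generator(row)
    except Exception as exc:
        logger.warning("Generator failed: %r. This record will be skipped.", exc)
        raise CustomColumnGenerationError(str(exc)) from exc


def is_expected_non_retryable(exc: BaseException) -> bool:
    # Assumption: wrapped library errors are expected; raw exceptions are bugs.
    return isinstance(exc, DataDesignerError)


async def run(rows: list, fn=generate) -> tuple:
    kept, dropped = [], []
    for index, row in enumerate(rows):
        try:
            # Sync generators run in a worker thread, as in the diagram.
            kept.append(await asyncio.to_thread(fn, row))
        except Exception as exc:
            if not is_expected_non_retryable(exc):
                raise  # internal bug: stays fatal
            dropped.append(index)  # expected user failure: drop the row
    return kept, dropped


def run_sync(rows: list, fn=generate) -> tuple:
    # Convenience wrapper so the sketch can be called from synchronous code.
    return asyncio.run(run(rows, fn))
```

Calling `run_sync` with the default `generate` drops the failing row and returns normally, whereas passing the unwrapped `user_generator` lets the raw `KeyError` propagate, which mirrors the distinction the new test is meant to pin.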
Summary
KeyError
Why
The original review flagged the error-boundary policy around KeyError/TypeError exceptions. The current implementation already keeps raw ColumnGenerator internal-bug exceptions fatal and wraps decorated custom-generator user failures as CustomColumnGenerationError. This test locks that distinction in at the scheduler boundary.
Validation
.venv/bin/pytest packages/data-designer-engine/tests/engine/dataset_builders/test_async_scheduler.py -k "internal_bug_failure or custom_generator_key_error"
.venv/bin/ruff check --fix .
.venv/bin/ruff format .