Skip to content

feat: post-materialization result file validation#74

Merged
EiffL merged 3 commits into
mainfrom
claude/issue-45-20260422-0905
Apr 26, 2026
Merged

feat: post-materialization result file validation#74
EiffL merged 3 commits into
mainfrom
claude/issue-45-20260422-0905

Conversation

@aboucaud
Copy link
Copy Markdown
Member

@aboucaud aboucaud commented Apr 22, 2026

Adds post-materialization validation for ASTRA output files. When a recipe exits 0 but writes suspicious output (all-NaN table, invalid/all-null metric JSON, empty figure), Dagster now emits context warnings rather than silently reporting success.

Closes #45

Generated with Claude Code

Signed-off-by: Alexandre Boucaud 3065310+aboucaud@users.noreply.github.com
Signed-off-by: Francois Lanusse EiffL@users.noreply.github.com

When a recipe exits 0 but writes suspicious output (all-NaN table,
invalid/all-null metric JSON, empty figure), the runner now emits
Dagster context warnings rather than silently reporting success.

Adds lightcone.engine.validation with type-specific validators:
- metric: checks JSON is parseable and not entirely null/NaN
- table: warns if every numeric column is all-NaN
- figure: warns if no image file exists or it is 0 bytes

Validation is wired into _build_single_asset and never raises —
failures surface as context.log.warning so materialization status
is unaffected while the user is clearly informed.

Closes #45

Co-authored-by: Alexandre Boucaud <aboucaud@users.noreply.github.com>
@aboucaud
Copy link
Copy Markdown
Member Author

@claude please review this PR

@claude
Copy link
Copy Markdown

claude Bot commented Apr 22, 2026

Claude finished @aboucaud's task in 2m 15s —— View job


PR Review: feat: post-materialization result file validation

  • Read changed files
  • Analyze validation logic
  • Check test coverage
  • Review integration with assets.py
  • Provide review feedback

Overall this is a clean, well-structured addition. The "never raise, always warn" contract is the right design for a post-run validation utility. Test coverage is solid. A few issues worth addressing:

src/lightcone/engine/validation.py

Bug: silent skip when output path is a file, not a directory (line 27)

if not output_dir.is_dir():
    return []  # ← no warning when a file exists at the expected dir path

If a recipe writes a single file at results/{universe}/{output} instead of a directory, this returns [] silently. That's suspicious output worth flagging. Consider:

if not output_dir.is_dir():
    return [f"Output '{output_id}': expected a directory at {output_dir}, found a file"]

Fix this →


Dead code in _is_null_scalar (line 178)

if isinstance(v, float) and math.isnan(v):
    return True

Python's json.loads raises ValueError on bare NaN by default — this branch can never trigger from standard JSON parsing. The None check above it is correct (JSON null → Python None). The NaN path is unreachable in practice. It's harmless but misleading; remove it or add a comment if you ever intend to use json.loads with parse_constant.


Minor: double iterdir() in figure validation

validate_output already calls output_dir.iterdir() at line 30 to check for empty. Then _validate_figure calls output_dir.iterdir() again at line 138. For consistency with the other validators, _validate_figure could just trust that the empty-dir guard already ran, but the current code makes a redundant filesystem call. Not a correctness issue, just worth noting.


Minor: CSV loaded fully into memory (line 95)

rows = list(reader)

This materializes the full table. For large scientific CSVs this could be expensive. Streaming column-by-column would be safer for the general case, but given this is a validation heuristic (not a query), it's probably acceptable.


src/lightcone/engine/assets.py

Missing coverage for the project_path is None branch

if project_path is not None:
    ...
    for warning in validate_output(out_dir, output_type, output_id):
        context.log.warning(warning)

Validation is silently skipped when project_path is None. This is the documented fallback path but it means the feature does nothing in that scenario. The existing test_assets.py tests don't set project_path, so validation is never exercised in the existing asset-level tests. A test that passes project_path and checks that warnings appear in the Dagster context would give end-to-end coverage for the new parameter.


output_type not passed at one potential call site

There is only one call site for _build_single_asset (inside build_asset_definitions), and it correctly passes output_type=output_def.get("type"). But _build_single_asset defaults output_type=None, which silently disables content validation. If a second call site is ever added (e.g., in build_definitions), this could be missed. Not a current bug, but worth a comment or a note in a follow-up.


tests/test_validation.py

Coverage is good. A few missing edge cases:

  • No test for the is_dir() == False path (file at expected dir path) — once the silent skip is fixed, add a test.
  • No test for _all_scalars_null with a JSON array of nulls (e.g., [null, null]), though this is covered indirectly by _collect_scalars.
  • test_empty_directory_warns uses output_type="metric" but the warning is emitted before the type dispatch — the test passes, but it would be clearer to test with output_type=None to show the empty-dir check is type-agnostic.

Summary

Severity Issue
Bug Silent pass when output path is a file, not a directory
Nit Dead math.isnan branch in _is_null_scalar
Gap No asset-level test exercising validate_output via context.log.warning
Minor Double iterdir() in figure path

The core logic and "never raise" contract are solid. Fix the silent-skip bug and the PR is good to merge.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 22, 2026

✅ Eval Results

Metric Value
Score 1.00
Build complete
Cost $1.64
Turns 65
Duration 480s
lightcone-cli 0.1.dev1+gb3a02c36e (b3a02c36)
Results Download

Graders

✅ spec_valid (1.00)
✅ all_materialized (1.00)

Full output
b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:22 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:23 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:23 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:24 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:24 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:24 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:25 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:25 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:25 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:25 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:26 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:26 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:26 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:27 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:27 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:16:27 httpx HTTP Request: POST https://proxy.app.daytona.io/toolbox/508544ca-1e6f-47e0-b0b2-7985a4906b80/files/bulk-upload "HTTP/1.1 200 OK"
20:24:30 lightcone.eval.sandbox Deleted sandbox for trial build-snae-0
  snae trial 0: score=1.00 complete

lightcone-cli: 0.1.dev1+gb3a02c36e (HEAD b3a02c36)

  Eval Results: Scores  
┏━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Task ┃     Score     ┃
┡━━━━━━╇━━━━━━━━━━━━━━━┩
│ snae │ 1.00 +/- 0.00 │
│      │ pass@k: 100%  │
└──────┴───────────────┘

   Eval Results: Cost &   
         Duration         
┏━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Task ┃ Cost / Duration ┃
┡━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ snae │      $1.64      │
│      │      480s       │
└──────┴─────────────────┘

Total: 1 trials, $1.64, 480s

Results saved to: eval-results/build-b3a02c36/results.json

@aboucaud aboucaud requested a review from EiffL April 22, 2026 09:28
Copy link
Copy Markdown
Member

@EiffL EiffL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 24, 2026

Developer Certificate of Origin

All contributors have signed the DCO. This PR is ready to merge.

Status

@EiffL
Copy link
Copy Markdown
Member

EiffL commented Apr 24, 2026

I have read the Developer Certificate of Origin and I hereby sign the DCO for this PR

1 similar comment
@aboucaud
Copy link
Copy Markdown
Member Author

I have read the Developer Certificate of Origin and I hereby sign the DCO for this PR

@EiffL EiffL merged commit a5f0d6d into main Apr 26, 2026
6 checks passed
@aboucaud aboucaud deleted the claude/issue-45-20260422-0905 branch May 7, 2026 15:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

prism run reports materialization ok when script exits 0 but writes all-NaN results

2 participants