Skip to content

validate(): extract resume-file parsing into a named helper#1209

Merged
hallerite merged 2 commits intomainfrom
validate-resume-readability
Apr 20, 2026
Merged

validate(): extract resume-file parsing into a named helper#1209
hallerite merged 2 commits intomainfrom
validate-resume-readability

Conversation

@hallerite
Copy link
Copy Markdown
Member

@hallerite hallerite commented Apr 20, 2026

Summary

Addresses @snimu's readability feedback on the resume block in TaskSet.validate() — the previous code was a three-level nested for/try/if chain inline, which obscured the simple intent ("parse prior JSONL, filter to in-range indices").

Changes

verifiers/envs/experimental/composable/task.py:

  • Pulls the resume-file reader into a local helper _read_prior_rows(path) -> (rows, indices) with a brief docstring explaining the in-range filter rationale.

  • Top-level resume flow is now flat and reads top-to-bottom:

    if resume and out_path_p.exists():
        prior_rows, completed_indices = _read_prior_rows(out_path_p)
        if prior_rows:
            logger.info("Resuming: %d rows already in %s, will skip", ...)
    else:
        out_path_p.write_text("")

No behavior change

Helper preserves existing semantics:

  • Only rows whose index is in [0, total) are kept — resuming with a smaller n than a prior run still doesn't leak out-of-range rows into the result list or summary stats.
  • Blank / non-JSON lines and rows without an integer index are silently skipped (same as before).
  • Returns (rows, indices) so the main loop's todo filter keeps working unchanged.

Test plan

  • ruff check / ruff format --check clean
  • CI

🤖 Generated with Claude Code


Note

Low Risk
Low risk refactor confined to TaskSet.validate() resume handling; intended behavior is preserved but could affect edge cases around malformed JSONL or index filtering if incorrect.

Overview
Refactors TaskSet.validate() resume logic by extracting prior JSONL parsing into a local _read_prior_rows() helper with a docstring, reducing nested control flow.

Resume behavior remains the same: it reads existing out_path JSONL, filters to in-range integer index values (< total), skips malformed/blank lines, and returns both prior_rows and completed_indices for subsequent validation and logging.

Reviewed by Cursor Bugbot for commit 8c14c2e. Bugbot is set up for automated code reviews on this repo. Configure here.

Addresses @snimu's readability feedback: the previous resume block
was a three-level nested ``for``/``try``/``if`` chain inline, which
obscured the simple intent ("parse prior JSONL, filter to in-range
indices"). Pulling it out into ``_read_prior_rows(path)`` leaves the
top-level flow legible:

    if resume and out_path_p.exists():
        prior_rows, completed_indices = _read_prior_rows(out_path_p)
        if prior_rows:
            logger.info(...)
    else:
        out_path_p.write_text("")

The helper preserves the existing semantics:

* only rows whose ``index`` is in ``[0, total)`` are kept (so resuming
  with a smaller ``n`` doesn't leak out-of-range rows into the returned
  result list or the summary stats);
* blank / non-JSON lines and rows without an integer ``index`` are
  silently skipped;
* it returns ``(rows, indices)`` — the indices set is what the main
  loop uses for the ``todo`` filter.

No behavior change; just a refactor + brief docstring explaining the
in-range filter rationale.
prior_rows.append(row)
completed_indices.add(row["index"])
prior_rows, completed_indices = _read_prior_rows(out_path_p)
if prior_rows:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move the logging into the helper as well?

Addresses @snimu's follow-up: the log belongs next to the work that
produced the row count, so the caller no longer has to reach back
for len(prior_rows).
@hallerite
Copy link
Copy Markdown
Member Author

Done in 8c14c2e — moved the logger.info(...) inside the helper, so the caller no longer reaches back for len(prior_rows). Docstring also mentions the log side-effect.

@hallerite hallerite merged commit 45b14de into main Apr 20, 2026
6 checks passed
@hallerite hallerite deleted the validate-resume-readability branch April 20, 2026 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants