Skip to content

v0.11.0

@tlamadon tlamadon tagged this 04 Jun 19:13
Inspired by GitHub Actions' ``$GITHUB_STEP_SUMMARY``: tasks can write
structured artifacts to well-known paths and scripthut surfaces them
in the run-detail UI. The use case the user described: scientific
runs that emit plots and markdown tables which deserve a real UI
panel rather than scraped log output.

Two display surfaces, each driven by its own env var so the task
author can decide what's per-task detail vs. what belongs in the
run-wide TL;DR:

- ``$SCRIPTHUT_OUTPUT_DIR`` — the task's private output directory.
  Anything written here surfaces in a new "Outputs" subtab under the
  per-task Details panel. Markdown is rendered (sanitized) inline,
  images are embedded via a new file-serving endpoint, other files
  appear as download links.
- ``$SCRIPTHUT_TASK_SUMMARY`` — convenience path
  (``$SCRIPTHUT_OUTPUT_DIR/task-summary.md``); the markdown shown
  prominently at the top of the per-task panel.
- ``$SCRIPTHUT_RUN_SUMMARY`` — separate file the task writes for the
  run-level Summary panel. Each task's fragment becomes one section,
  concatenated in submission order with task-name headings.

User authors plain bash, no special tooling:

  matplotlib_or_whatever > "$SCRIPTHUT_OUTPUT_DIR/loss.png"
  { echo "| epoch | loss |"; echo "|---|---|"; cat metrics.csv; } \
    > "$SCRIPTHUT_TASK_SUMMARY"
  echo "- **train-0**: 200 epochs → final loss 0.041" \
    > "$SCRIPTHUT_RUN_SUMMARY"

Tasks that write nothing pay nothing — no UI panel, no errors.

v1 covers Slurm + PBS (the SSH backends). Batch + EC2 are explicit
follow-ups requiring S3 / archive-pattern plumbing; their generated
scripts simply lack the new exports so their tasks behave as before.
Streaming updates while a task runs is also a v2; v1 collects on the
SETTLING → COMPLETED transition (matching the new state machine from
v0.10.0).

Implementation:
- ``TaskDefinition.get_output_dir`` / ``get_run_summary_path`` define
  the on-backend layout (``<log_dir>/outputs/<run_id>/<task_id>/`` for
  the per-task dir; ``<log_dir>/outputs/<run_id>/_run/<task_id>.md``
  for the run-summary fragment — colocated so the aggregator can list
  fragments with one ``ls``).
- ``generate_script_body`` grows two kwargs and emits three exports +
  ``mkdir -p`` in the right slot (after env_vars so user ``set:``
  rules can't accidentally clobber the contract paths, before the
  ``cd`` so the user's command can write immediately). Backends that
  don't pass the new kwargs (Batch, EC2) produce unchanged scripts.
- Slurm + PBS ``generate_script`` pass the two new paths through.
- ``RunManager._handle_task_outputs`` mirrors ``_handle_generates_source``:
  one SSH round-trip (``find -printf '%P\\t%s\\n' | head -201`` plus
  ``[ -f $run_summary ] && echo HAS_SUMMARY >&2``), parse, classify
  by suffix, drop oversize files, persist. Errors are non-fatal —
  outputs are decorative, not correctness-critical.
- ``_after_item_completed`` is the new fan-out point for both
  ``_handle_generates_source`` and ``_handle_task_outputs``; the
  three branches in ``main.poll_backend`` that transition items to
  COMPLETED now go through this seam instead of calling
  ``generates_source`` directly. New post-completion hooks land here.
- New ``RunItem.outputs: list[TaskOutput]`` and
  ``has_run_summary: bool`` fields, persisted via to_dict/from_dict.
  Defaults make pre-0.11.0 ``run.json`` files load unchanged.
- ``get_task_detail`` gains a ``detail_type=outputs`` arm and every
  response carries ``has_outputs`` so the UI knows whether to reveal
  the Outputs subtab without a separate probe.
- New ``/runs/{id}/tasks/{tid}/outputs/file/{rel_path:path}`` endpoint
  streams files with the path-traversal guard the plan called out
  (posixpath.normpath + containment check, 404 on escape attempts —
  same observable as "not found" to avoid telling scanners "this
  exists but you can't have it"). 10 MB cap, base64 over SSH to
  survive binary content, ``Cache-Control: private, max-age=60``.
- New ``/runs/{id}/summary`` endpoint aggregates each item's
  ``has_run_summary`` contribution in submission order. Image
  references inside ``$SCRIPTHUT_RUN_SUMMARY`` resolve against the
  contributing task's output dir.
- New ``_render_markdown_for_outputs`` helper: ``markdown`` with
  ``tables`` / ``fenced_code`` / ``nl2br`` / ``sane_lists`` plus
  ``bleach`` sanitization. Relative ``<img src>`` is rewritten to
  the file endpoint so authors write ``![](plot.png)`` naturally;
  absolute URLs and root-relative paths pass through unchanged.
- UI: new ``dtab-outputs`` subtab (hidden until ``has_outputs``),
  a structured renderer for the outputs list (rendered markdown,
  inline images, download links), and a top-level Run Summary card
  populated client-side from ``/runs/{id}/summary`` with a 30s
  refresh — SSE plumbing would have meant SSH reads on every cycle,
  which is overkill for a TL;DR panel.

Dependencies: ``markdown>=3.5`` and ``bleach>=6.1``. Both are well-
trodden, no native deps. ``bleach`` is in maintenance mode but
stable; ``nh3`` is the modern alternative if we ever care to swap
(noted in the plan, not a v1 blocker).

Tests (``tests/test_task_outputs.py``, 29 new):
- Path helpers return the documented layout (no ``_run`` collision).
- Wrapper emits exports in the right order; backwards-compatible
  when kwargs absent.
- ``_handle_task_outputs`` empty-dir / classification by suffix /
  drops-oversize / has_run_summary-from-stderr / cap-at-max-files /
  no-SSH silent-noop.
- ``_after_item_completed`` calls both hooks; skips
  ``generates_source`` when unset.
- Markdown: basic render, tables, relative-img rewrite, absolute-URL
  pass-through, ``<script>`` stripped, ``onerror`` stripped.
- Persistence: ``TaskOutput`` round-trip; ``RunItem.outputs`` +
  ``has_run_summary`` round-trip; old ``run.json`` files load with
  the new fields defaulted.
- Path-traversal guard: normal-file / subdir / ``../../etc/passwd`` /
  ``subdir/../../etc/passwd`` / absolute-path / bare ``..``.

580/580 in the broad sweep + 156 in the backend-specific suites pass.

Bumping to v0.11.0 — additive feature, no API break. Existing
endpoints unchanged, ``RunItem.outputs`` defaults to ``[]``.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Assets 2
Loading