v0.4.0

@ankona released this 23 Mar 15:53
e5d1a55

This is a bug-fix release focused on stabilizing the orchestration of blueprints.

Breaking Changes

  • Rename env var group name constants to omit leading underscore
  • Change the interpretation of Status.is_running to exclude Unsubmitted
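
For callers that relied on the previous behaviour of Status.is_running, the effect of the change can be pictured with a minimal sketch. The enum below is hypothetical: its name, its members other than Unsubmitted, and the helper function are assumptions made for illustration, not the library's actual definitions. It only shows that an unsubmitted step no longer counts as running and must be tested for explicitly.

```python
from enum import Enum, auto


class SketchStatus(Enum):
    """Hypothetical stand-in for the library's Status type (members assumed)."""

    UNSUBMITTED = auto()
    PENDING = auto()
    RUNNING = auto()
    DONE = auto()

    @property
    def is_running(self) -> bool:
        # Assumption: only submitted, unfinished states count as running.
        # UNSUBMITTED is deliberately left out, mirroring the breaking change.
        return self in {SketchStatus.PENDING, SketchStatus.RUNNING}


def needs_attention(status: SketchStatus) -> bool:
    """Hypothetical caller-side check.

    Code that used is_running to also cover unsubmitted work now has to
    test for that state explicitly.
    """
    return status.is_running or status is SketchStatus.UNSUBMITTED
```

Under these assumptions, `SketchStatus.UNSUBMITTED.is_running` is false while `needs_attention(SketchStatus.UNSUBMITTED)` is still true.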

New features

  • Enable status retrieval for multiple SLURM jobs in a single sacct call
  • Have the orchestrator write a sentinel file to disk to dynamically alter dependencies
  • Add trace log level to reduce log frequency during Workplan execution
  • Persist WorkplanRun record for every invocation of the orchestrator for run history and lookup
  • Add cstar workplan monitor --run-id <run-id> for reattaching to a running Workplan
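
As an illustration of the multi-job status retrieval listed above, the following is a minimal sketch of how several SLURM jobs might be queried with one sacct call. The function names are hypothetical and this is not the project's implementation; only the sacct options shown (--jobs, --format, --noheader, --parsable2, --allocations) are standard SLURM flags.

```python
import subprocess


def build_sacct_command(job_ids: list[str]) -> list[str]:
    """Build one sacct invocation that covers every job id."""
    return [
        "sacct",
        "--jobs", ",".join(job_ids),  # comma-separated list of job ids
        "--format", "JobID,State",    # only the columns needed here
        "--noheader",                 # omit the column titles
        "--parsable2",                # '|' delimited, no trailing delimiter
        "--allocations",              # one row per job, no job steps
    ]


def parse_sacct_output(text: str) -> dict[str, str]:
    """Map job id to raw state string from '|' delimited sacct output."""
    states: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        job_id, _, state = line.partition("|")
        states[job_id.strip()] = state.strip()
    return states


def query_states(job_ids: list[str]) -> dict[str, str]:
    """Run sacct once for all jobs (requires a SLURM installation)."""
    result = subprocess.run(
        build_sacct_command(job_ids),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sacct_output(result.stdout)
```

Issuing a single call for all job ids avoids spawning one sacct process per job, which is the cost a per-job polling loop would pay.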

Security Fixes

  • N/A

Bug Fixes

  • Fix failure to override output directories for orchestrated blueprints
  • Fix defect where steps defined before dependencies were not mapped correctly after time-splitting
  • Fix unhandled exceptions in cstar [blueprint|workplan] check with invalid paths
  • Fix posix-path conversion bug when passing blueprint URLs to cstar blueprint run
  • Fix case-sensitivity bug when configuring split frequency via CSTAR_ORCH_TRX_FREQ env variable
  • Fix a JSON serialization bug with pydantic models using field aliases
  • Fix incorrect working directories resulting from LiveStep.from_step when a parent is specified
  • Fix failure to terminate health check thread
  • Fix issue when parsing SLURM status of the form CANCELLED by <username>
  • Fix incorrect env var discovery if env_ appears repeatedly in a variable name
  • Fix unhandled exception caused by attempting to execute typer app
  • Fix cstar workplan status ... triggering a re-build of the entire workplan
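
The SLURM status fix above concerns states that carry a suffix, such as CANCELLED by <username>. One way to handle them, shown here as a minimal sketch and not as the project's actual parser, is to keep only the leading keyword of the raw state. The function name is hypothetical.

```python
def normalize_slurm_state(raw: str) -> str:
    """Reduce a raw state such as 'CANCELLED by alice' to its leading keyword."""
    parts = raw.strip().split(maxsplit=1)
    # An empty string has no keyword; return it unchanged rather than fail.
    return parts[0].upper() if parts else ""
```

With this approach, `normalize_slurm_state("CANCELLED by alice")` yields the same keyword as a plain `CANCELLED`, so downstream comparisons need only one case.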

Improvements

  • Improve error handling when CachedRemoteRepositoryRetriever.refresh fails
  • Validate inputs prior to execution with cstar [blueprint|workplan] run
  • Improve module load times
  • Extract nested/duplicate YAML representer functions for re-use
  • Replace use of incorrect mp.Queue where queue.Queue is appropriate
  • Replace blocking time.sleep usage in health check thread
  • Remove slow status retrieval loop for previously submitted jobs
  • Ensure env var display consistency in cstar env show output
  • Enable "path-less" CLI commands, such as cstar workplan status --run-id <id>
  • Display detailed status information for tasks in a Workplan
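
The health check items above (replacing blocking time.sleep and fixing the failure to terminate the thread) follow a common pattern, sketched below under stated assumptions. The HealthCheck class, its attributes, and the tick counter are hypothetical placeholders, not the project's code; the sketch only shows how waiting on a threading.Event lets a periodic thread exit promptly when asked to stop.

```python
import threading


class HealthCheck:
    """Hypothetical periodic checker that can be stopped promptly."""

    def __init__(self, interval: float) -> None:
        self.interval = interval
        self.ticks = 0  # counts completed checks (placeholder work)
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self) -> None:
        # Event.wait returns True as soon as stop() sets the event, so the
        # loop exits without sitting out the rest of the interval. A plain
        # time.sleep(interval) could not be interrupted this way.
        while not self._stop.wait(self.interval):
            self.ticks += 1  # placeholder for the real health check

    def start(self) -> None:
        self._thread.start()

    def stop(self, timeout: float = 1.0) -> bool:
        """Signal the thread to exit; return True if it ended in time."""
        self._stop.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()
```

Because the wait doubles as the stop signal, the thread neither spins on the CPU between checks nor delays shutdown by up to a full interval.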

Miscellaneous

  • N/A

Full commit history

For more details, please refer to the commit history.

What's Changed

  • Fix environment.yml laptop by @smaticka in #429
  • Add env vars to docs/CLI; unsort by @ScottEilerman in #435
  • CSD-574: Mitigate remote repo cache failure by @ankona in #438
  • Fix failure to override output directories in orchestrated blueprints by @ankona in #434
  • Workplan Tutorial by @ScottEilerman in #436
  • Guide/docs polish: Typos, spacing, formatting by @ScottEilerman in #441
  • fix blueprint link by @ScottEilerman in #442
  • Fix uncaught exceptions when validating via CLI by @ankona in #443
  • Improve env item tests failing when executed on HPC by @ankona in #444
  • Fix posix-path conversion bug for remote blueprints (CSD-535) by @ankona in #440
  • Add type-checking blocks by @ankona in #446
  • Dynamically load roms_tools package by @ankona in #447
  • Perform lazy loading of networkx package by @ankona in #448
  • Use lazy import with matplotlib by @ankona in #449
  • Fix case-sensitivity bug when reading CSTAR_ORCH_TRX_FREQ env var by @ankona in #452
  • Serialization tweaks by @ankona in #451
  • Fix incorrect reset-file paths generated by time splitter by @ankona in #453
  • Revert use of joined outputs during time-splitting/restarts by @ankona in #455
  • Remove dupe/hardcoded job config defaults by @ankona in #457
  • Fix pegging CPU with health check thread by @ankona in #456
  • Enable multi-job status retrieval from SLURM by @ankona in #458
  • Update PR template to enable automatic generation of release notes by @ankona in #454
  • Orchestrator re-entrance bug fixes by @ankona in #459
  • Add trace log level by @ankona in #460
  • Fix broken feature flag discovery by @ankona in #462
  • Enable Workplan run-tracking by @ankona in #461

Full Changelog: v0.3.0...v0.4.0