v0.4.0
This is a bug-fix release focused on stabilizing the orchestration of blueprints.
Breaking Changes
- Rename env var group name constants to omit leading underscore
- Modified the interpretation of `Status.is_running` to exclude `Unsubmitted`
New features
- Enable status retrieval for multiple SLURM jobs in a single `sacct` call
- The orchestrator writes a sentinel file to disk to dynamically alter dependencies
- Add trace log level to reduce log frequency during `Workplan` execution
- Persist `WorkplanRun` record for every invocation of the orchestrator for run history and lookup
- Add `cstar workplan monitor --run-id <run-id>` for reattaching to a running `Workplan`
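As a rough illustration of the single-call status retrieval, the sketch below builds one `sacct` invocation for several job IDs and parses the pipe-delimited result. It is not the project's implementation: the helper names (`build_sacct_command`, `parse_sacct_output`, `query_statuses`) and the chosen output columns are assumptions made for this example. Only the `sacct` options themselves (`--jobs`, `--format`, `--allocations`, `--parsable2`, `--noheader`) are standard SLURM flags.

```python
import subprocess


def build_sacct_command(job_ids):
    """Build one sacct invocation that covers every job ID (hypothetical helper)."""
    joined = ",".join(str(job_id) for job_id in job_ids)
    return [
        "sacct",
        f"--jobs={joined}",        # comma-separated list of job IDs
        "--format=JobID,State",    # only the columns needed here
        "--allocations",           # one row per job, omit individual job steps
        "--parsable2",             # pipe-delimited output, no trailing delimiter
        "--noheader",              # omit the column header row
    ]


def parse_sacct_output(text):
    """Map job ID -> raw state string from pipe-delimited sacct output."""
    statuses = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        job_id, _, state = line.partition("|")
        statuses[job_id.strip()] = state.strip()
    return statuses


def query_statuses(job_ids):
    """Run sacct once for all jobs; requires a SLURM installation."""
    result = subprocess.run(
        build_sacct_command(job_ids),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_sacct_output(result.stdout)
```

Issuing one query for all jobs avoids spawning a subprocess per job, which is presumably the motivation for the change.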
Security Fixes
- N/A
Bug Fixes
- Fix failure to override output directories for orchestrated blueprints
- Fix defect where steps defined before dependencies were not mapped correctly after time-splitting
- Fix unhandled exceptions in `cstar [blueprint|workplan] check` with invalid paths
- Fix posix-path conversion bug when passing blueprint URLs to `cstar blueprint run`
- Fix case-sensitivity bug when configuring split frequency via `CSTAR_ORCH_TRX_FREQ` env variable
- Fix a `JSON` serialization bug with pydantic models using field aliases
- Fix incorrect working directories resulting from `LiveStep.from_step` when a parent is specified
- Fix failure to terminate health check thread
- Fix issue when parsing SLURM status of the form `CANCELLED by <username>`
- Fix incorrect env var discovery if `env_` appears repeatedly in variable name
- Fix unhandled exception caused by attempting to execute `typer` app
- Fix `cstar workplan status ...` triggering a re-build of the entire workplan
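To make the `CANCELLED by <username>` item concrete: SLURM can append a suffix to the state column, so comparing the raw string against a fixed set of state names fails. One possible way to handle this is to keep only the leading token. The function name `normalize_slurm_state` is hypothetical and the code is a sketch of the idea, not the fix that was merged.

```python
def normalize_slurm_state(raw):
    """Return the leading state token in upper case.

    Hypothetical helper: 'CANCELLED by alice' becomes 'CANCELLED',
    and an empty or whitespace-only string becomes ''.
    """
    tokens = raw.strip().split()
    return tokens[0].upper() if tokens else ""
```

A caller would then compare the normalized value against known state names instead of the raw column text.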
Improvements
- Improve error handling when `CachedRemoteRepositoryRetriever.refresh` fails
- Validate inputs prior to execution with `cstar [blueprint|workplan] run`
- Improve module load times
- Extract nested/duplicate `YAML` representer functions for re-use
- Replace use of incorrect `mp.Queue` where `queue.Queue` is appropriate
- Replace blocking `time.sleep` usage in health check thread
- Removed slow status retrieval loop for previously submitted jobs
- Ensure env var display consistency in `cstar env show` output
- Enable "path-less" CLI commands, such as `cstar workplan status --run-id <id>`.
- Display detailed status information for tasks in a `Workplan`
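The health check items (replacing blocking `time.sleep` and ensuring the thread terminates) follow a common pattern that can be sketched with `threading.Event`. The `HealthCheck` class below is a hypothetical stand-in and does not reflect the project's actual class or method names. It only shows why waiting on an event lets a background thread stop promptly.

```python
import threading


class HealthCheck:
    """Run a callback periodically until stop() is called (illustrative only)."""

    def __init__(self, check, interval=5.0):
        self._check = check
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self, timeout=None):
        # Setting the event wakes the worker immediately, even mid-wait.
        self._stop.set()
        self._thread.join(timeout)

    def _run(self):
        # Event.wait returns False on timeout and True once the event is set,
        # so the loop exits as soon as stop() is requested.
        while not self._stop.wait(self._interval):
            self._check()
```

With `time.sleep(interval)`, a stop request is only noticed after the full interval elapses. With `Event.wait(interval)`, the wait is cut short and the thread can be joined without delay.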
Miscellaneous
- N/A
Full commit history
For more details, please refer to the commit history.
What's Changed
- Fix environment.yml laptop by @smaticka in #429
- Add env vars to docs/CLI; unsort by @ScottEilerman in #435
- CSD-574: Mitigate remote repo cache failure by @ankona in #438
- Fix failure to override output directories in orchestrated blueprints by @ankona in #434
- Workplan Tutorial by @ScottEilerman in #436
- Guide/docs polish: Typos, spacing, formatting by @ScottEilerman in #441
- fix blueprint link by @ScottEilerman in #442
- Fix uncaught exceptions when validating via CLI by @ankona in #443
- Improve env item tests failing when executed on HPC by @ankona in #444
- Fix posix-path conversion bug for remote blueprints (CSD-535) by @ankona in #440
- Add type-checking blocks by @ankona in #446
- Dynamically load roms_tools package by @ankona in #447
- Perform lazy loading of networkx package by @ankona in #448
- Use lazy import with matplotlib by @ankona in #449
- Fix case-sensitivity bug when reading `CSTAR_ORCH_TRX_FREQ` env var by @ankona in #452
- Serialization tweaks by @ankona in #451
- Fix incorrect reset-file paths generated by time splitter by @ankona in #453
- Revert use of joined outputs during time-splitting/restarts by @ankona in #455
- Remove dupe/hardcoded job config defaults by @ankona in #457
- Fix pegging CPU with health check thread by @ankona in #456
- Enable multi-job status retrieval from SLURM by @ankona in #458
- Update PR template to enable automatic generation of release notes by @ankona in #454
- Orchestrator re-entrance bug fixes by @ankona in #459
- Add trace log level by @ankona in #460
- Fix broken feature flag discovery by @ankona in #462
- Enable Workplan run-tracking by @ankona in #461
Full Changelog: v0.3.0...v0.4.0