Skip to content

Version 1.10.0

Choose a tag to compare

@bgunnar5 bgunnar5 released this 13 Apr 15:24
· 10 commits to main since this release
3acb30d

[1.10.0]

Fixed

  • Pip wheel wasn't including .sh files for merlin examples
  • The learn.py script in the openfoam_wf* examples will now create the missing Energy v Lidspeed plot
  • Fixed the flags associated with the stop-workers command (--spec, --queues, --workers)
  • Fixed the --step flag for the run-workers command
  • Fixed most of the pylint errors that we're showing up when you ran make check-style
    • Some errors have been disabled rather than fixed. These include:
      • Any pylint errors in merlin_template.py since it's deprecated now
      • A "duplicate code" instance between a function in expansion.py and a method in study.py
        • The function is explicitly not creating a MerlinStudy object so the code must be duplicate here
      • Invalid-name (C0103): These errors typically relate to the names of short variables (i.e. naming files something like f or errors e)
      • Unused-argument (W0613): These have been disabled for celery-related functions since celery does use these arguments behind the scenes
      • Broad-exception (W0718): Pylint wants a more specific exception but sometimes it's ok to have a broad exception
      • Import-outside-toplevel (C0415): Sometimes it's necessary for us to import inside a function. Where this is the case, these errors are disabled
      • Too-many-statements (R0915): This is disabled for the setup_argparse function in main.py since it's necessary to be big. It's disabled in tasks.py and celeryadapter.py too until we can get around to refactoring some code there
      • No-else-return (R1705): These are disabled in router.py until we refactor the file
      • Consider-using-with (R1732): Pylint wants us to consider using with for calls to subprocess.run or subprocess.Popen but it's not necessary
      • Too-many-arguments (R0913): These are disabled for functions that I believe need to have several arguments
        • Note: these could be fixed by using *args and **kwargs but it makes the code harder to follow so I'm opting to not do that
      • Too-many-local-variables (R0914): These are disabled for functions that have a lot of variables
        • It may be a good idea at some point to go through these and try to find ways to shorten the number of variables used or split the functions up
      • Too-many-branches (R0912): These are disabled for certain functions that require a good amount of branching
        • Might be able to fix this in the future if we split functions up more
      • Too-few-public-methods (R0903): These are disabled for classes we may add to in the future or "wrapper" classes
      • Attribute-defined-outside-init (W0201): These errors are only disabled in specification.py as they occur in class methods so init() won't be called
  • Fixed an issue where the walltime value in the batch block was being converted to an integer instead of remaining in HH:MM:SS format

Added

  • Now loads np.arrays of dtype='object', allowing mix-type sample npy
  • Added a singularity container openfoam_wf example
  • Added flux native worker launch support
  • Added PBS flux launch support
  • Added check_for_flux, check_for_slurm, check_for_lsf, and check_for_pbs utility functions
  • Tests for the stop-workers command
  • A function in run_tests.py to check that an integration test definition is formatted correctly
  • A new dev_workflow example multiple_workers.yaml that's used for testing the stop-workers command
  • Ability to start 2 subprocesses for a single test
  • Added the --distributed and --display-tests flags to run_tests.py
    • --distributed: only run distributed tests
    • --display-tests: displays a table of all existing tests and the id associated with each test
  • Added the --disable-logs flag to the run-workers command
  • Merlin will now assign default_worker to any step not associated with a worker
  • Added get_step_worker_map() as a method in specification.py
  • Added tabulate_info() function in display.py to help with table formatting
  • Added get_flux_alloc function for new flux version >= 0.48.x interface change
  • New flags to the query-workers command
    • --queues: query workers based on the queues they're associated with
    • --workers: query workers based on a regex of the names you're looking for
    • --spec: query workers based on the workers defined in a spec file

Changed

  • Changed celery_regex to celery_slurm_regex in test_definitions.py
  • Reformatted how integration tests are defined and part of how they run
    • Test values are now dictionaries rather than tuples
    • Stopped using subprocess.Popen() and subprocess.communicate() to run tests and now instead use subprocess.run() for simplicity and to keep things up-to-date with the latest subprocess release (run() will call Popen() and communicate() under the hood so we don't have to handle that anymore)
  • Rewrote the README in the integration tests folder to explain the new integration test format
  • Reformatted start_celery_workers() in celeryadapter.py file. This involved:
    • Modifying verify_args() to return the arguments it verifies/updates
    • Changing launch_celery_worker() to launch the subprocess (no longer builds the celery command)
    • Creating get_celery_cmd() to do what launch_celery_worker() used to do and build the celery command to run
    • Creating _get_steps_to_start(), _create_kwargs(), and _get_workers_to_start() as helper functions to simplify logic in start_celery_workers()
  • Modified the merlinspec.json file:
    • the minimum gpus per task is now 0 instead of 1
    • variables defined in the env block of a spec file can now be arrays
  • Refactored batch.py:
    • Merged 4 functions (check_for_slurm, check_for_lsf, check_for_flux, and check_for_pbs) into 1 function named check_for_scheduler
      • Modified get_batch_type to accommodate this change
    • Added a function parse_batch_block to handle all the logic of reading in the batch block and storing it in one dict
    • Added a function get_flux_launch to help decrease the amount of logic taking place in batch_worker_launch
    • Modified batch_worker_launch to use the new parse_batch_block function
    • Added a function construct_scheduler_legend to build a dict that keeps as much information as we need about each scheduler stored in one place
    • Cleaned up the construct_worker_launch_command function to utilize the newly added functions and decrease the amount of repeated code
  • Changed get_flux_cmd for new flux version >=0.48.x interface
  • The query-workers command now prints a table as its' output
    • Each row of the Workers column has the name of an active worker
    • Each row of the Queues column has a list of queues associated with the active worker

@koning @lucpeterson @ryannova @bgunnar5