Updates for WCOSS2 TDS (Acorn) #810

Closed
DusanJovic-NOAA opened this issue Sep 16, 2021 · 3 comments

Comments

@DusanJovic-NOAA
Collaborator

Minor updates to the build system are required on Acorn, primarily in the modulefiles. The regression test is also updated.

See PR #809

@DusanJovic-NOAA
Collaborator Author

The regression test on Acorn skips the following tests:

  1. All tests using WW3. WW3 does not yet support the wcoss2 platform.
  2. All tests using parallel netcdf output.
  3. cpld_decomp and control_fhzero, which do not reproduce the baselines.
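
The skip rules above could be expressed as a small filter over the test list. This is only a sketch: the `RegressionTest` record, its `uses_ww3` and `parallel_netcdf` flags, and the helper names are hypothetical and do not reflect the actual regression test configuration format. Only the test names cpld_decomp and control_fhzero come from this issue.

```python
# Hypothetical sketch: the record layout and field names are assumptions,
# not the actual regression test configuration format.
from dataclasses import dataclass

# Tests reported in this issue as not reproducing the baselines on Acorn.
NON_REPRODUCING = {"cpld_decomp", "control_fhzero"}


@dataclass(frozen=True)
class RegressionTest:
    name: str
    uses_ww3: bool = False         # hypothetical flag
    parallel_netcdf: bool = False  # hypothetical flag


def skip_reason(test):
    """Return why a test is skipped on Acorn, or None if it should run."""
    if test.uses_ww3:
        return "WW3 does not support wcoss2 yet"
    if test.parallel_netcdf:
        return "uses parallel netcdf output"
    if test.name in NON_REPRODUCING:
        return "does not reproduce the baseline"
    return None


def select_tests(tests):
    """Keep only the tests that have no skip reason, preserving order."""
    return [t for t in tests if skip_reason(t) is None]
```

Returning a reason string rather than a boolean keeps the skip list self-documenting when it is printed in a test summary.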

@DusanJovic-NOAA
Collaborator Author

This is the error from one of the tests using parallel netcdf:

 in fcst,init total time:    5.45701398799974     
 in fcst run phase 2, na=           0
ADIOI_CRAY_WRITECONTIG(243): filename='dynf000.nc'  error='Bad address'  errno=14  PE=00060  W_rec=00001  off=0000000000  len=0000000676  See MPICH_MPIIO_ABORT_ON_RW_ERROR.
MPICH ERROR [Rank 61] [job id a8b8f7eb-c772-4102-8ad9-f8b068e8bd99] [Wed Sep  1 22:51:03 2021] [nid001003] - MPICH ERROR [Rank 62] [job id a8b8f7eb-c772-4102-8ad9-f8b068e8bd99] [Wed Sep  1 22:51:03 2021] [nid001003] - Abort(137929743) (rank 62 in comm 0): Fatal error in MPIR_CRAY_Bcast_Tree: Other MPI error, error stack:
MPIR_CRAY_Bcast_Tree(183): message sizes do not match across processes in the collective routine: Received -32766 but expected 1

aborting job:
Fatal error in MPIR_CRAY_Bcast_Tree: Other MPI error, error stack:
MPIR_CRAY_Bcast_Tree(183): message sizes do not match across processes in the collective routine: Received -32766 but expected 1
MPICH ERROR [Rank 64] [job id a8b8f7eb-c772-4102-8ad9-f8b068e8bd99] [Wed Sep  1 22:51:03 2021] [nid001003] - Abort(137929743) (rank 64 in comm 0): Fatal error in MPIR_CRAY_Bcast_Tree: Other MPI error, error stack:
MPIR_CRAY_Bcast_Tree(183): message sizes do not match across processes in the collective routine: Received -32766 but expected 1
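
When triaging logs like this one, the key=value fields on the ADIOI_CRAY_WRITECONTIG line can be pulled out mechanically. The sketch below is an illustration only and is not part of the model or of MPICH. The function name `parse_adio_error` is hypothetical, and the field layout is inferred solely from the single line quoted above. On Linux, errno 14 corresponds to EFAULT, which matches the 'Bad address' text in the log.

```python
import errno
import re

# Matches key=value pairs where the value is either single-quoted or a bare token.
FIELD_RE = re.compile(r"(\w+)=(?:'([^']*)'|(\S+))")


def parse_adio_error(line):
    """Extract key=value fields from an ADIOI_CRAY_WRITECONTIG error line.

    Returns a dict of strings, or None if the line is not such an error.
    The layout is assumed from one observed log line (hypothetical helper).
    """
    if not line.startswith("ADIOI_CRAY_WRITECONTIG"):
        return None
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        # Exactly one of the two value groups is non-empty for a match.
        fields[key] = quoted or bare
    return fields


log_line = (
    "ADIOI_CRAY_WRITECONTIG(243): filename='dynf000.nc'  error='Bad address'  "
    "errno=14  PE=00060  W_rec=00001  off=0000000000  len=0000000676  "
    "See MPICH_MPIIO_ABORT_ON_RW_ERROR."
)

fields = parse_adio_error(log_line)
# Map the numeric errno to its symbolic name (None if unknown on this platform).
errno_name = errno.errorcode.get(int(fields["errno"]))
print(fields["filename"], fields["PE"], errno_name)
```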

@junwang-noaa
Collaborator

Ben created a new issue, #1116, to port ufs-weather-model to wcoss2 (cactus/dogwood). Let's close this one.

epic-cicd-jenkins pushed a commit that referenced this issue Apr 17, 2023
* remove wcoss_dell_p3

* remove block for tide and gyre
epic-cicd-jenkins pushed a commit that referenced this issue Apr 17, 2023
* Construct var_defns components from dictionary.

* Bring back config_defaults.yaml

* Add support for sourcing yaml file into shell script.

* Remove newline for printing config, json config fix.

* Make QUILTING a sub-dictionary in predef_grids

* Reorganize config_defaults.yaml by task and feature.

* Bug fix with QUILTING=true.

* Structure a dictionary based on a template dictionary.

* Convert all WE2E config files to yaml.

* Take care of problematic chars when converting to shell string.

* Process only selected keys of config.

* Add symlinked yaml config files.

* Actually use yaml config files for WE2E tests.

* Delete all shell WE2E configs.

* Don't check for single quotes in test description.

* Make WE2E work with yaml configs.

* Make yaml default config format.

* Bug fix in run_WE2E script.

* Add utility to check validity of yaml config file.

* Add config utility interface in ush directory.

* Remove unused check_expt_config_vars script.

* Add description to default config.

* Reorganize source_config.

* Add XML as one of the config formats.

* Update custom_ESGgrid config.

* Bug fix due to update.

* Change ensemble seed.

* Change POST_OUTPUT group due to merge.

* Make xml and ini configs work.

* Maintain config structure down to var_defns.

* Add function to load structured shell config, put description under metadata

* Flatten dicts before importing env now that shell config is structured.

* Support python regex for selecting dict keys.

* Add capability of sourcing task specific portion of config file.

* Access var_defns via env variable.

* Make names of tasks consistent with ex- and j- job script names.

* Append pid to temp file.

* Prettify user config, don't use " in xml texts.

* Compare timestamp of csv vs all files instead of directory.

* Fixes for some pylint suggestions.

* Convert new configs to yaml.

* Format python files with black (no functional change).

* More readable yaml/json formats by using more data types.
Only datetime type is now in quotes.

* More readable yaml config files for WE2E and default configs.

* Make config_defaults itself more readable.

* Correct pyyaml list indentation issue.

* Fix indentation in all config files.

* Use unquoted WTIME in config_defaults

* Cosmetic changes.

* Fix due to merge.

* Make __init__.py clearer.

* Fixes due to merge.

* Minor edits of comments.

* Remove wcoss_dell_p3 from workflow (#810)

* remove wcoss_dell_p3

* remove block for tide and gyre

* Replace deprecated NCAR python environment with conda on Cheyenne (#812)

* Fix issue on get_extrn_lbcs when FCST_LEN_HRS>=40 with netcdf (#814)

* activate b file on hpss for >40h

* add a new we2e test for fcst_len_hrs>40

* reduce fcst time for we2e

* Convert new test case to yaml.

* Fix formatting due to merge.

* Convert new test case to yaml.

* Fix unittest.

* Merge develop

* Remove exception logic from __init__.py

* Minor change to cmd concat.

* Make grid gen methods return dictionary, simplifies code a lot.

* Add a comment why we are suppressing yaml import exception.

* Minor change to beautify unittest output.

* Add status badge for functional tests.

* Reorder tasks in config_default and we2e test cases to match order in FV3LAM.xml

* Keep single quotes and newlines in we2e test description.

* Revert back to not rounding to 10 digits

Co-authored-by: Chan-Hoo.Jeon-NOAA <60152248+chan-hoo@users.noreply.github.com>
Co-authored-by: Michael Kavulich <kavulich@ucar.edu>