Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ClimaAtmos new config file interface #395

Merged
merged 1 commit into from
Sep 21, 2023

Conversation

valeriabarra
Copy link
Member

@valeriabarra valeriabarra commented Aug 17, 2023

Purpose

This PR updates ClimaAtmos to v0.16.0 and its new interface for configuration files (rather than CLI options), and keeps the existing functionality for the Coupler to still use CLI options.

This new interface reduces the maintenance burden and code duplication that we used to have (we used to manually copy/paste ClimaAtmos' default CLI options). Now, if any default in ClimaAtmos will change in the future, the Coupler won't need to overwrite them. They will be reflected directly in the Coupler.

Closes #388

To-do

  • Update all buildkite regular and longrun pipeline jobs
  • Complete tests in interactive mode
  • Test & Debug

Content

  • Updated to latest ClimaAtmos release (v0.16.0)
  • Got rid of ClimaAtmos parsed_args table (keeping the only one entry that exists now, --config_file)
  • Renamed "AMIP - modular, Float32 test" -> "AMIP - modular, Float64 test", since we were using Float64 in that test
  • Cleaned up flame.jl and flame_diff.jl scripts (they were referring to an old job_id target_amip_n32_shortrun that we don't use anymore)

Remap and MPI trouble shooting (issues that came up after cluster upgrade)

  • mpi hdf5 circ dependency circular dependencies throwing wrong library errors
    • need to use the JuliaProject.toml and specify JULIA_LOAD_PATH (soon to be resolved with the TempestRemap new release, but for now we need to load packages in a specific order)
  • bus errors, race conditions when writing new regrid files
    • tests exhibit different behaviour with different modules (solution above)
    • apply barriers were needed
  • overloaded ApplyOfflineMap remapping
    • apply the correct const comms_ctx = ClimaComms.context(ClimaComms.CPUSingleThreaded())
  • coupler AMIP race conditions
    • ensure regrid directory contains the run-specific run_name
  • tempest remap file error
    • TR has a character limit, so the regrid directory path shouldn't be too long.

Review checklist

I have:

In the Content, I have included

  • relevant unit tests, and integration tests,
  • appropriate docstrings on all functions, structs, and modules, and included relevant documentation.

-->


  • I have read and checked the items on the review checklist.

@valeriabarra valeriabarra added enhancement New feature or request 🍃 leaf Issue coupled to a PR Dependencies 🔗 labels Aug 17, 2023
@valeriabarra valeriabarra self-assigned this Aug 17, 2023
@valeriabarra valeriabarra marked this pull request as draft August 17, 2023 18:35
@valeriabarra valeriabarra force-pushed the valeria/test-atmos-toml-config-files branch 3 times, most recently from 85f9435 to 7588afb Compare August 21, 2023 19:58
@valeriabarra valeriabarra reopened this Aug 21, 2023
@valeriabarra valeriabarra force-pushed the valeria/test-atmos-toml-config-files branch 5 times, most recently from ed70bd0 to bf44bc6 Compare August 23, 2023 02:59
@valeriabarra
Copy link
Member Author

bors try

bors bot added a commit that referenced this pull request Aug 23, 2023
@bors
Copy link
Contributor

bors bot commented Aug 23, 2023

try

Build failed:

@valeriabarra valeriabarra force-pushed the valeria/test-atmos-toml-config-files branch 2 times, most recently from c1601d6 to 90075f9 Compare August 24, 2023 17:29
@valeriabarra valeriabarra marked this pull request as ready for review August 24, 2023 17:33
@valeriabarra valeriabarra force-pushed the valeria/test-atmos-toml-config-files branch 4 times, most recently from 17e1ef6 to 95b5959 Compare August 28, 2023 19:07
@valeriabarra valeriabarra force-pushed the valeria/test-atmos-toml-config-files branch 2 times, most recently from ec99c75 to 5504640 Compare August 29, 2023 01:18
@valeriabarra
Copy link
Member Author

CI passes (I just rebased onto main and squashed all commits to simplify the review process) and this PR is finally ready for review! Thanks!

Copy link
Member

@nefrathenrici nefrathenrici left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about some of the perf changes, but the atmos interface changes all look good to me! Thanks for dealing with the new config flow

Copy link
Collaborator

@LenkaNovak LenkaNovak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job on this, @valeriabarra ! I have a few clarification comments and suggestions to re-enable interactive runs.

@@ -9,6 +9,9 @@ env:
BUILDKITE_BRANCH: "${BUILDKITE_BRANCH}"
JULIA_MAX_NUM_PRECOMPILE_FILES: 100
GKSwstype: 100
CONFIG_PATH: "config/model_configs"
PERF_CONFIG_PATH: "config/perf_configs"
MPI_CONFIG_PATH: "config/mpi_configs"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be clearer to call this mpi_unit_test_configs since some of the integration test configs also use MPI.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had followed the same name convention they have in Atmos. Do we care? Or we don't mind having them different?
For other cases (e.g., ClimaLSM, I had seen we tend to follow the same name conventions)

perf/flame_diff.jl Show resolved Hide resolved
experiments/AMIP/modular/coupler_driver_modular.jl Outdated Show resolved Hide resolved
experiments/AMIP/modular/coupler_driver_modular.jl Outdated Show resolved Hide resolved
@valeriabarra
Copy link
Member Author

Good job on this, @valeriabarra ! I have a few clarification comments and suggestions to re-enable interactive runs.

Interesting, thanks! I had tested it interactively (in fact, it was one of the checkboxes in the PR description). Let me take a closer look. Perhaps that was some time before I made some latest changes. Thanks for noticing

@valeriabarra
Copy link
Member Author

bors try

bors bot added a commit that referenced this pull request Aug 30, 2023
@bors
Copy link
Contributor

bors bot commented Aug 30, 2023

try

Build failed:

@valeriabarra
Copy link
Member Author

Hi all, thanks for the review. I believed I replied/addressed all comments and CI passes. Thanks!

Copy link
Collaborator

@LenkaNovak LenkaNovak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @valeriabarra , thanks for the revisions! Look good, but I'm noticing that the flame graphs are not being generated on Buildkite anymore. I'm also getting some errors with TemperstRemap when running interactively. I should have some time to debug/merge this next week. Does that work for you?

@valeriabarra
Copy link
Member Author

Hi @valeriabarra , thanks for the revisions! Look good, but I'm noticing that the flame graphs are not being generated on Buildkite anymore. I'm also getting some errors with TemperstRemap when running interactively. I should have some time to debug/merge this next week. Does that work for you?

Oh, so sorry for not noticing that. They were fixed before the revision. I must have missed something. Of course, if you can, please. Otherwise let me know if I can help. Thanks a lot!

@LenkaNovak LenkaNovak force-pushed the valeria/test-atmos-toml-config-files branch 9 times, most recently from b943ea3 to d072248 Compare September 20, 2023 21:15
CI: Renamed AMIP - modular, Float32 test -> AMIP - modular, Float64 test

Apply reviewer's comments

Use different merge order in interactive/non-interactive mode

flame test modification

trigger perf configs using model configs

Update to ClimaAtmos v0.16.0: Use config file interface

CI: Renamed AMIP - modular, Float32 test -> AMIP - modular, Float64 test

flame test modification

trigger perf configs using model configs

rm perf LSM dep

Update to ClimaAtmos v0.16.0: Use config file interface

CI: Renamed AMIP - modular, Float32 test -> AMIP - modular, Float64 test

Apply reviewer's comments

Use different merge order in interactive/non-interactive mode

flame test modification

trigger perf configs using model configs

rm ClimaLSM deps of sub folders

fix rebase

Buildkite name fix

spec run_name

hdf5 mpi fix try

fix perf runs

revert MPI load

try MPI specs

revert

revert

mpi write debug - barrier try

try CC env vars

try CC env vars

srun

CA16 BK setup

flame fix

j1.9

Manifest

v1.9 local pass

add JuliaProject.toml

revert Projects

runs serially, try JuliaProject, J1.8.5

srun for mpi jobs

add srun for amip

rm depot path

rm CLIMACOMMS_CONTEXT

test JULIA_DEPOT_PATH

race in mpi tests

race in mpi tests

race in mpi tests

race in mpi tests

race in mpi tests

show pid

race in mpi tests

show pid

device

race in mpi tests

race in AMIP

cleanup

cleanup

add notes to local mpi tests

add notes to local mpi tests

GKSwstype: 100
@LenkaNovak LenkaNovak force-pushed the valeria/test-atmos-toml-config-files branch from d072248 to 45f86f6 Compare September 20, 2023 22:19
@LenkaNovak
Copy link
Collaborator

bors r+

@bors
Copy link
Contributor

bors bot commented Sep 21, 2023

Build succeeded!

The publicly hosted instance of bors-ng is deprecated and will go away soon.

If you want to self-host your own instance, instructions are here.
For more help, visit the forum.

If you want to switch to GitHub's built-in merge queue, visit their help page.

@bors bors bot merged commit 5f678b9 into main Sep 21, 2023
10 checks passed
@bors bors bot deleted the valeria/test-atmos-toml-config-files branch September 21, 2023 02:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Dependencies 🔗 enhancement New feature or request 🍃 leaf Issue coupled to a PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow both reading config files & CLI options
3 participants