
Simple cli wrapper #18

Merged
hdpriest-ui merged 8 commits into ccmmf:main from hdpriest-ui:simple-cli-wrapper
Apr 9, 2026

Conversation

@hdpriest-ui
Collaborator

relating to https://github.com/orgs/ccmmf/discussions/182

  1. magic-ensemble is instantiated with verbs for the various steps. Internally, each verb binds directly to methods in the 2a_grass directory; these bindings would be updated when we reorganize the repository.
  2. User-facing configuration, holding the parameters we expect users to change, is instantiated at example_user_config.yaml. It allows users to specify external data resources, control use of Apptainer, etc.
  3. Non-user-facing configuration for the workflow is contained in workflow_manifest.yaml, which specifies the pipeline steps and their I/O. Users shouldn't touch this, but maintainers would.
  4. Added a tool for patching the XML file with different dispatch settings as needed by PEcAn.
  5. Had to make minor changes to some of the actual 2a_grass scripts. @infotroph: definitely check here to make sure I didn't break anything. This will have to stay in sync with the content of the data artifacts that we are handing over, and is a potential headache. Stabilizing both sides of this is a good idea at our collective earliest convenience.

This would be the example that would be adapted to fit downscaling and other workflows as we go.

Note: currently the ccmmf compute nodes are not responsive, so this can only be seen in local execution mode. (login node is fine; Rob is working on the compute nodes)
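To give a concrete feel for the user-facing side of the split described above, a user config might look roughly like the following. Only use_apptainer, pecan_dispatch, external_paths, and run_dir are named in this PR; all other key names and values here are illustrative assumptions, not the actual contents of example_user_config.yaml.

```yaml
# Hypothetical sketch of example_user_config.yaml.
# Only use_apptainer, pecan_dispatch, external_paths, and run_dir
# are named in this PR; everything else is illustrative.
run_dir: runs/grass_demo

# Run prepare steps inside a container instead of natively
use_apptainer: false

# How ensemble members are submitted/executed; must be one of the
# dispatch modes defined in workflow_manifest.yaml
pecan_dispatch: local-gnu-parallel

# User-provided external files, staged into the run directory
# before the prepare steps run; each key must have a matching
# manifest.paths entry
external_paths:
    template_xml: /path/to/template.xml
```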

…isplay run directory during dry-run and modify 00_fetch_s3_and_prepare_run_dir.sh to resolve and use absolute run directory for artifact downloads and extractions.
staging to workflow CLI

- magic-ensemble: --config is now required; supports use_apptainer (run prepare steps inside a container) and pecan_dispatch (select how ensemble members are submitted/executed)
- workflow_manifest.yaml: defines available dispatch modes (local-gnu-parallel, slurm-dispatch) with appropriate host XML for native and apptainer execution; S3 resources consolidated
- Prep scripts: accept CLI flags instead of env vars; stage user-provided external files (e.g. template.xml) into the run directory before prepare steps run
- tools/patch_xml.py: utility to patch elements in PEcAn XML config files in-place
- 01_ERA5_nc_to_clim.R: ERA5 met inputs now looked up by grid cell center rather than site id
- example_user_config.yaml: documents new user-facing options (use_apptainer, pecan_dispatch, external_paths)
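The thread doesn't show the internals of tools/patch_xml.py; its stated job is only to patch elements in PEcAn XML config files in-place. A minimal sketch of that operation, assuming a hypothetical patch_element helper (the real tool's interface is not shown here):

```python
# Hypothetical sketch of the in-place patching that tools/patch_xml.py
# performs; function name and behavior details are assumptions based
# only on the tool's stated purpose in this PR.
import xml.etree.ElementTree as ET

def patch_element(xml_path: str, tag: str, new_text: str) -> None:
    """Replace the text of the first <tag> element found, writing back in place."""
    tree = ET.parse(xml_path)
    node = tree.getroot().find(f".//{tag}")
    if node is None:
        raise SystemExit(f"patch_xml: no <{tag}> element in {xml_path}")
    node.text = new_text
    tree.write(xml_path)
```

For example, this is the shape of operation needed to swap the SIPNET `<binary>` path between native and container run contexts, as discussed below.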

Relates to: https://github.com/orgs/ccmmf/discussions/182
Contributor

@dlebauer dlebauer left a comment


This looks like a good step toward standardizing the workflow interfaces. The pattern of separating user-facing config from internal workflow details seems like the right direction.

A few things would help clarify the intent and direction. Not to bog down this PR: this works well as an MVP proof of concept, but anything not addressed here should be captured in one or more follow-up issues.

  1. In the past we've discussed separating data prep from the rest of the analysis workflows. Is that still a viable option, or is there a rationale for keeping data prep combined with ensembles?
  2. It's unclear why this lives under 2a_grass - what is the path from here to using this in the workflows that are our core deliverables?
  3. A README that explains the approach would make it easier to understand and adapt. It should cover the overall design; the roles of the CLI, config files, and execution graph; and the boundaries between config files, manifest, and template.xml. What general patterns and specific components will be reused when adapting this to other workflows? This can wait until the next iteration, but I wanted to make sure it is on the map.
  4. What is the plan for testing individual components and overall integration?

If this works now and is ready to implement, I'm good with the general pattern. My main question is how robust and extensible this will be. After implementing both the targets and custom workflows, do you have any insights on what to look for, and how we would know if this gets to a level of complexity where we would consider a more standard workflow solution?

Comment thread .github/workflows/apptainer-sipnet-carb.yml
Comment thread 2a_grass/workflow_manifest.yaml Outdated
Comment thread magic-ensemble Outdated
Comment thread magic-ensemble
Comment thread 2a_grass/00_stage_external_inputs.sh Outdated
@hdpriest-ui
Collaborator Author

I'm going to answer some of these while working on some other stuff, as I can:

re:

2. It's unclear why this lives under 2a_grass - what is the path from here to using this in the workflows that are our core deliverables?

2a_grass was the most mature ensembles workflow I had; I definitely feel that we should attempt to extract the pieces of the various stages and put them together under the "ensembles" concept. I didn't want to make heavy edits to the workflow itself at the same time I was putting together the CLI.

- Stage external inputs to manifest-defined destinations rather than source basename; enforce that each external_paths key has a matching manifest.paths entry and error if not
- Make get_val() fall through to defaults for missing config keys instead of erroring; add explicit post-resolution required check for run_dir only
- Remove spurious check_aws calls from prepare and run-ensembles commands
- Reorganize workflow_manifest.yaml: move steps block to top, normalize to 4-space indentation throughout
- Add magic-ensemble-DEVELOPERS.md (architecture, internals, dispatch) and magic-ensemble-README.md
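The get_val() change describes a common config-lookup pattern: missing keys fall through to a default instead of erroring, with a single explicit required check after resolution. The CLI itself is bash, so the following is only an analogy in Python; the function and key names mirror the commit message, not real code:

```python
# Analogy for the get_val() behavior described in the commit message:
# missing keys fall through to a default instead of erroring; only
# run_dir is checked as required after resolution.
def get_val(config: dict, key: str, default=None):
    """Return config[key] if present, else fall through to the default."""
    return config.get(key, default)

def resolve_config(config: dict) -> dict:
    resolved = {
        "run_dir": get_val(config, "run_dir"),
        "use_apptainer": get_val(config, "use_apptainer", False),
        "pecan_dispatch": get_val(config, "pecan_dispatch", "local-gnu-parallel"),
    }
    # Explicit post-resolution required check for run_dir only
    if not resolved["run_dir"]:
        raise SystemExit("config error: run_dir is required")
    return resolved
```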
@hdpriest-ui
Collaborator Author

addressed repo organization in proposal at https://github.com/orgs/ccmmf/discussions/217

Contributor

@dlebauer dlebauer left a comment


Looks good to me!

Thanks for adding the READMEs. They are a great start. I think that we can iteratively improve these as we go and as we get feedback.

Will CLI tools be separated into different repositories? It seems we would want to avoid maintaining multiple places that document the overall architecture.

Comment thread 2a_grass/template.xml
```xml
<revision>git</revision>
<delete.raw>TRUE</delete.raw>
<binary>sipnet.git</binary>
<binary>/usr/local/bin/sipnet.git</binary>
```
Contributor


Is this change intended to require either (a) that the binary is located at /usr/local/bin, and/or (b) to assume that this is running within a container?

Collaborator Author


This is a good catch. I believe this will induce a bug in some situations, and we will require the use of the XML patch tool to ensure functionality in different run contexts.

I will add a change to this PR.

Contributor


Can you say more about why this would need a patch tool? Certainly we need to be able to set the binary path to match the run context, but why not set it in xml_build.R?

Replaces the host-only `patch_dispatch()` function with a generic
`patch_xml_block()` that accepts an XML tag name and yq paths for both
plain and Apptainer variants. Uses this to patch both the `<host>` block
(dispatch config) and the new `<model>` block (SIPNET binary path) in a
single prepare pass.

Adds `sipnet_model.model_xml` and `sipnet_model.model_xml_apptainer` to
the workflow manifest, selecting the Apptainer variant (absolute binary
path inside the container) when `use_apptainer=true`. Updates developer
docs to reflect the new calling convention and extensibility pattern.
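The manifest addition might look roughly like this. Only the key names sipnet_model.model_xml and sipnet_model.model_xml_apptainer come from the commit message; the XML payloads are illustrative, based on the binary paths discussed in this thread:

```yaml
# Hypothetical manifest fragment; key names from the commit message,
# XML content illustrative. The Apptainer variant uses the absolute
# binary path inside the container.
sipnet_model:
    model_xml: |
        <model>
            <binary>sipnet.git</binary>
        </model>
    model_xml_apptainer: |
        <model>
            <binary>/usr/local/bin/sipnet.git</binary>
        </model>
```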
Comment on lines +87 to +110
```bash
# Show path for user: relative to INVOCATION_CWD if under it, else absolute
report_path() {
    local abs_path="$1"
    if [[ -n "$INVOCATION_CWD" && "$abs_path" == "$INVOCATION_CWD"/* ]]; then
        echo "${abs_path#"$INVOCATION_CWD"/}"
    else
        echo "$abs_path"
    fi
}

if [[ ! -f "$MANIFEST" ]]; then
    echo "00_fetch_s3_and_prepare_run_dir: Manifest not found: $MANIFEST" >&2
    exit 1
fi

if ! command -v yq &>/dev/null; then
    echo "00_fetch_s3_and_prepare_run_dir: yq is required to read the manifest." >&2
    exit 1
fi

cd "$REPO_ROOT"

# Resolve a path relative to run_dir (RUN_DIR may be absolute or relative to REPO_ROOT).
resolve_run_path() {
```
Contributor


Is there an inconsistency between "absolute or REPO_ROOT" and "absolute or INVOCATION_CWD" here? The latter sounds more like how the PEcAn functions expect to work, but maybe in this CLI they'll wind up being the same?

Comment on lines +140 to +148
```bash
# LandTrendr TIFs: bucket + key from s3.median_tif and s3.stdv_tif
median_key_prefix=$(yq eval '.s3.median_tif.key_prefix' "$MANIFEST")
median_filename=$(yq eval '.s3.median_tif.filename' "$MANIFEST")
stdv_key_prefix=$(yq eval '.s3.stdv_tif.key_prefix' "$MANIFEST")
stdv_filename=$(yq eval '.s3.stdv_tif.filename' "$MANIFEST")
median_s3_key=$(s3_key "$median_key_prefix" "$median_filename")
stdv_s3_key=$(s3_key "$stdv_key_prefix" "$stdv_filename")
median_s3_uri="s3://${s3_bucket}/${median_s3_key}"
stdv_s3_uri="s3://${s3_bucket}/${stdv_s3_key}"
```
Contributor


Heads up that this may eventually need to support multiple years for the validation workflow. Not certain yet, though

```r
((lat + 0.25) %/% 0.5) * 0.5, "N_",
((abs(lon) + 0.25) %/% 0.5) * 0.5, "W"
)
)
```
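The expression in this R fragment snaps a coordinate to the nearest 0.5° grid-cell center, which is how the ERA5 met lookup by grid cell center works. The same arithmetic, transcribed to Python for illustration (the cell-name format is an assumption based on the visible fragment):

```python
# Snap ERA5 lookup coordinates to the nearest 0.5-degree grid-cell
# center, mirroring the R expression in 01_ERA5_nc_to_clim.R; the
# name format below is assumed from the visible "N_" / "W" suffixes.
def grid_center(coord: float) -> float:
    return ((coord + 0.25) // 0.5) * 0.5

def era5_cell_name(lat: float, lon: float) -> str:
    # California longitudes are negative, hence abs() and the "W" suffix
    return f"{grid_center(lat)}N_{grid_center(abs(lon))}W"
```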
Contributor


Historical note since this is the only R file touched here: These are changes backported from the phase 3 version of this script. 👍

Comment on lines +5 to +6
```yaml
# resources. Fixed paths, S3 resources, and step I/O are defined in
# workflow_manifest.yaml and are not overridden here.
```
Contributor


What are the upper and lower bounds on number of distinct config files a user will need to look at to understand the full workflow at this point?

Comment on lines +18 to +20
```yaml
n_ens: 20
n_met: 10
ic_ensemble_size: 100
```
Contributor


n_ens and ic_ensemble_size can probably be combined here

```
<prerun>cp data/events.in @RUNDIR@</prerun>
</model>

# Apptainer (not in user config)
```
Contributor


Is the "not in user config" part a convenience or a requirement? eg would editing this to point to a different Docker org / tag be a valid way of testing with alternate releases?

Comment thread magic-ensemble
Contributor


Should the filename be magic-ensemble.sh?

Comment thread magic-ensemble
```bash
set -euo pipefail

# --- Repo root, manifest, and invocation CWD ---
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
```
Contributor


TIL about BASH_SOURCE!

Comment thread magic-ensemble
```bash
landtrendr_raw_files="${landtrendr_raw_files}${run_dir}/${segment}"
done < <(yq eval '.paths.landtrendr_raw_files' "$MANIFEST" | tr ',' '\n')

# --- Pre-execution: AWS S3 tools check ---
```
Contributor


Probably a bigger question than this workflow, but: Once CARB has all the files we've delivered them onto their own servers, it should conceptually be possible for the whole workflow to run without any S3 access. How many changes would be needed at the CLI level to achieve that?


## Architecture Overview

The CLI is built on a three-layer configuration model:
Contributor


This answers my "how many configs" question above. Thanks!
