# Configuring Rocoto workflows

The `uw rocoto` tool has three modes:

* `realize` -- creates a Rocoto XML from a UW YAML
* `validate` -- uses the UW framework to validate a Rocoto XML without requiring a Rocoto installation
* `iterate` -- repeatedly runs a Rocoto XML workflow until a specified task in a specified cycle is complete

For examples of how to use these CLI tools, go to the __[Rocoto UW docs](https://uwtools.readthedocs.io/en/main/sections/user_guide/cli/tools/rocoto.html)__.

# Building a UW YAML for Rocoto

We'll start by building a UW YAML config describing the Rocoto workflow that will run the cycling experiment's tasks. Use the __[UW YAML Rocoto Workflows](https://uwtools.readthedocs.io/en/main/sections/user_guide/yaml/rocoto.html)__ docs to get going.

In [1]:
import os
from pathlib import Path

# IN BASH: export configs=/path/to/uwtools_training/configs
os.environ["configs"] = str(Path(".").resolve().parent / "configs")

## The workflow section

In [2]:
!head -n 20 $configs/rocoto_workflow.yaml 

user:
  account: zrtrr
workflow:
  attrs:
    realtime: false
    scheduler: slurm
  cycledef:
    - attrs:
        group: cold_start_cycle
      spec: 202508271200 202508271200 01:00:00
    - attrs:
        group: prod
      spec: 202508271300 202508271300 01:00:00
  entities:
    ACCOUNT: "{{ user.account }}"
    LOGDIR: "{{ user.expt_dir}}/log"
  log:
    value: "{{ user.expt_dir }}/workflow.log"
  tasks:
    task_prepare_ics:
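
Each `cycledef` spec above is a `start stop step` triple. As a rough illustration of which cycles such a spec describes, here is a minimal Python sketch. It assumes the step fields are read right to left as seconds, minutes, hours, and days; `parse_step` and `cycles` are hypothetical helpers for this tutorial, not part of `uwtools` or Rocoto.

```python
from datetime import datetime, timedelta


def parse_step(step: str) -> timedelta:
    """Parse a colon-separated step, reading fields right to left as S, M, H, D (an assumption)."""
    fields = [int(field) for field in step.split(":")]
    seconds, minutes, hours, days = (fields[::-1] + [0, 0, 0, 0])[:4]
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def cycles(spec: str) -> list[datetime]:
    """List the cycles described by a 'start stop step' cycledef spec."""
    start, stop, step = spec.split()
    fmt = "%Y%m%d%H%M"
    current = datetime.strptime(start, fmt)
    last = datetime.strptime(stop, fmt)
    delta = parse_step(step)
    if delta <= timedelta(0):
        raise ValueError("step must be positive")
    result = []
    while current <= last:
        result.append(current)
        current += delta
    return result


# The cold_start_cycle group above starts and stops at the same time: one cycle.
print(cycles("202508271200 202508271200 01:00:00"))
```

Under that assumption, both groups in this workflow describe exactly one cycle each, since their start and stop times are identical.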


## Adding a task

Docs: __[Defining Tasks](https://uwtools.readthedocs.io/en/main/sections/user_guide/yaml/rocoto.html#defining-tasks)__

* The UW YAML key should be `task_` followed by an arbitrary name. This name will be shown in `rocotostat` output.
* Values can be string references to XML entities, Jinja2 expressions, or hard-coded values.
* Don't use the `--batch` flag when running UW drivers: Rocoto handles submitting jobs to the batch system.
* Use Rocoto-supplied information for date-time flexibility.
* A `jobname` variable, reflecting the arbitrary name following `task_`, will be supplied by `uwtools` for use in Jinja2 expressions.
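
To make the date-time bullet concrete: Rocoto fills in `@`-flags inside `<cyclestr>` elements for each cycle. The sketch below mimics that substitution for a small subset of flags (`@Y`, `@m`, `@d`, `@H`, `@M`, `@S`) that line up with `strftime` codes. The function `expand_cyclestr` is a hypothetical helper for illustration only; Rocoto performs the real expansion itself.

```python
import re
from datetime import datetime

# Assumed subset of cyclestr flags that share their meaning with strftime codes.
SUPPORTED_FLAGS = "YmdHMS"


def expand_cyclestr(template: str, cycle: datetime) -> str:
    """Replace each supported @-flag with the matching field of the cycle."""
    return re.sub(
        f"@([{SUPPORTED_FLAGS}])",
        lambda match: cycle.strftime("%" + match.group(1)),
        template,
    )


cycle = datetime(2025, 8, 27, 12)
print(expand_cyclestr("uw chgres_cube run --cycle @Y-@m-@dT@H --leadtime 0", cycle))
# uw chgres_cube run --cycle 2025-08-27T12 --leadtime 0
print(expand_cyclestr("prepare_ics_@Y@m@d@H.log", cycle))
# prepare_ics_2025082712.log
```

Because the flags are resolved per cycle, the same task definition works for every cycle in its `cycledef` group without hard-coding any dates.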


# Check the progress

Run the `uw rocoto realize` tool to validate the YAML and attempt to create an XML.

In [3]:
!uw rocoto realize -c $configs/rocoto_workflow.yaml -o workflow.xml

Traceback (most recent call last):
  File "/scratch3/BMC/wrfruc/cholt/conda/envs/uwtools-training/bin/uw", line 9, in <module>
    sys.exit(main())
             ~~~~^^
  File "/scratch3/BMC/wrfruc/cholt/conda/envs/uwtools-training/lib/python3.13/site-packages/uwtools/cli.py", line 100, in main
    sys.exit(0 if modes[args[STR.mode]](args) else 1)
                  ~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/scratch3/BMC/wrfruc/cholt/conda/envs/uwtools-training/lib/python3.13/site-packages/uwtools/cli.py", line 596, in _dispatch_rocoto
    return actions[args[STR.action]](args)
           ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
  File "/scratch3/BMC/wrfruc/cholt/conda/envs/uwtools-training/lib/python3.13/site-packages/uwtools/cli.py", line 620, in _dispatch_rocoto_realize
    r

## OOPS!
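The `rocoto_workflow.yaml` file doesn't define every value it references (for example, it uses `user.expt_dir` but only sets `user.account`), so it cannot be realized on its own. Let's compose a more complete YAML on the fly.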

In [4]:
%%bash
realize_args=(
    -i $configs/prepare_bcs.yaml
    -u $configs/rocoto_one_task.yaml
)
    
uw config realize "${realize_args[@]}" | uw rocoto realize -o workflow.xml
cat workflow.xml

[2025-09-23T15:39:52]     INFO Schema validation succeeded for Rocoto config
[2025-09-23T15:39:52]     INFO Schema validation succeeded for Rocoto XML


<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE workflow [
  <!ENTITY ACCOUNT "zrtrr">
  <!ENTITY LOGDIR "/scratch3/BMC/wrfruc/cholt/uw_training_prep/expt7/log">
]>
<workflow realtime="False" scheduler="slurm">
  <cycledef group="cold_start_cycle">202508271200 202508271200 01:00:00</cycledef>
  <cycledef group="prod">202508271300 202508271300 01:00:00</cycledef>
  <log>/scratch3/BMC/wrfruc/cholt/uw_training_prep/expt7/workflow.log</log>
  <task name="prepare_ics" cycledefs="cold_start_cycle">
    <account>&ACCOUNT;</account>
    <nodes>1:ppn=20</nodes>
    <walltime>00:30:00</walltime>
    <command>
      <cyclestr>uw chgres_cube run --cycle @Y-@m-@dT@H --leadtime 0 -c /scratch3/BMC/wrfruc/uw_training/uwtools_training/configs/prepare_bcs.yaml --key-path make_ics</cyclestr>
    </command>
    <jobname>prepare_ics</jobname>
    <join>
      <cyclestr>&LOGDIR;/prepare_ics_@Y@m@d@H.log</cyclestr>
    </join>
  </task>
</workflow>


## Adding a metatask

Docs: __[Defining Metatasks](https://uwtools.readthedocs.io/en/main/sections/user_guide/yaml/rocoto.html#defining-metatasks)__

* Metatask blocks require a `var:` block and one or more `task_` or `metatask_` blocks.
* The `var:` block must contain at least one key/value pair, which defines the values to iterate over.
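
As a sketch of what the iteration loop means: Rocoto steps through the values of all metatask variables in lockstep and substitutes `#NAME#` placeholders in the nested tasks. The Python below imitates that expansion. `expand_metatask`, the task name template, and the command template are hypothetical and chosen only for illustration; Rocoto does the actual expansion.

```python
def expand_metatask(
    name_template: str, command_template: str, var: dict[str, str]
) -> list[dict[str, str]]:
    """Expand #NAME# placeholders once per position in the space-separated var values."""
    names = list(var)
    columns = [var[name].split() for name in names]
    if len({len(column) for column in columns}) != 1:
        raise ValueError("var needs at least one entry, and all entries need equal length")
    tasks = []
    for values in zip(*columns):  # lockstep iteration, not a Cartesian product
        name, command = name_template, command_template
        for var_name, value in zip(names, values):
            name = name.replace(f"#{var_name}#", value)
            command = command.replace(f"#{var_name}#", value)
        tasks.append({"name": name, "command": command})
    return tasks


# Hypothetical templates; only the LEADTIME variable name comes from the workflow shown here.
for task in expand_metatask(
    "prepare_lbcs_#LEADTIME#",
    "uw chgres_cube run --leadtime #LEADTIME#",
    {"LEADTIME": "0 1 2"},
):
    print(task["name"], "->", task["command"])
```

With several variables, the first task gets the first value of each variable, the second task gets the second value of each, and so on.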

In [5]:
%%bash
realize_args=(
    -i $configs/prepare_bcs.yaml
    -u $configs/rocoto_with_metatask.yaml   
)

uw config realize "${realize_args[@]}" | uw rocoto realize -o workflow.xml

[2025-09-23T15:39:56]     INFO Schema validation succeeded for Rocoto config
[2025-09-23T15:39:56]     INFO Schema validation succeeded for Rocoto XML


In [6]:
!cat workflow.xml

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE workflow [
  <!ENTITY ACCOUNT "zrtrr">
  <!ENTITY LOGDIR "/scratch3/BMC/wrfruc/cholt/uw_training_prep/expt7/log">
]>
<workflow realtime="False" scheduler="slurm">
  <cycledef group="cold_start_cycle">202508271200 202508271200 01:00:00</cycledef>
  <cycledef group="prod">202508271300 202508271300 01:00:00</cycledef>
  <log>/scratch3/BMC/wrfruc/cholt/uw_training_prep/expt7/workflow.log</log>
  <task name="prepare_ics" cycledefs="cold_start_cycle">
    <account>&ACCOUNT;</account>
    <nodes>1:ppn=20</nodes>
    <walltime>00:30:00</walltime>
    <command>
      <cyclestr>uw chgres_cube run --cycle @Y-@m-@dT@H --leadtime 0 -c /scratch3/BMC/wrfruc/uw_training/uwtools_training/configs/prepare_bcs.yaml --key-path make_ics</cyclestr>
    </command>
    <jobname>prepare_ics</jobname>
    <join>
      <cyclestr>&LOGDIR;/prepare_ics_@Y@m@d@H.log</cyclestr>
    </join>
  </task>
  <metatask name="prepare_lbcs">
    <var name="LEADTIME">0 1 2 3 4 5 <

## Adding dependencies to a task

Docs: __[The dependency: key](https://uwtools.readthedocs.io/en/main/sections/user_guide/yaml/rocoto.html#the-dependency-key)__

* UW Rocoto YAML follows the same structure as Rocoto XML.
* YAML keys must be unique at the same level, so every *dependency* key accepts an optional suffix after an underscore, e.g., `_arbitrary`, where `arbitrary` can be any unique string identifier. Rocoto does not use this identifier, but Jinja2 expressions that reference the key must include it. Suffixes are optional when a key is already unique, though they can still make dependencies clearer and more human-readable.
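
The uniqueness rule can be illustrated with a plain Python mapping, which behaves like a YAML mapping here: two sibling keys with the same name cannot coexist. In this sketch, `xml_tag` is a hypothetical helper showing how a suffix can be dropped to recover the XML tag, and `prepare_obs` is a made-up task name.

```python
def xml_tag(key: str) -> str:
    """Return the part of a dependency key before the first underscore."""
    return key.split("_", 1)[0]


# Without suffixes, the second sibling silently replaces the first.
clobbered = {
    "taskdep": {"attrs": {"task": "prepare_ics"}},
    "taskdep": {"attrs": {"task": "prepare_obs"}},  # hypothetical task name
}
print(len(clobbered))  # 1

# With suffixes, both siblings survive and still map to the same XML tag.
dependency = {
    "and": {
        "taskdep_ics": {"attrs": {"task": "prepare_ics"}},
        "taskdep_obs": {"attrs": {"task": "prepare_obs"}},
    }
}
print([xml_tag(key) for key in dependency["and"]])  # ['taskdep', 'taskdep']
```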

In [7]:
%%bash 
uw config realize -i $configs/fv3_config.yaml -u $configs/gsi_config.yaml | \
    uw config realize -u $configs/prepare_bcs.yaml | \
    uw config realize -u $configs/rocoto_workflow.yaml | \
    uw rocoto realize -o workflow.xml

[2025-09-23T15:40:07]     INFO Schema validation succeeded for Rocoto config
[2025-09-23T15:40:08]     INFO Schema validation succeeded for Rocoto XML


In [8]:
!cat workflow.xml

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE workflow [
  <!ENTITY ACCOUNT "zrtrr">
  <!ENTITY LOGDIR "/scratch3/BMC/wrfruc/cholt/uw_training_prep/expt7/log">
]>
<workflow realtime="False" scheduler="slurm">
  <cycledef group="cold_start_cycle">202508271200 202508271200 01:00:00</cycledef>
  <cycledef group="prod">202508271300 202508271300 01:00:00</cycledef>
  <log>/scratch3/BMC/wrfruc/cholt/uw_training_prep/expt7/workflow.log</log>
  <task name="prepare_ics" cycledefs="cold_start_cycle">
    <account>&ACCOUNT;</account>
    <nodes>1:ppn=20</nodes>
    <walltime>00:30:00</walltime>
    <command>
      <cyclestr>uw chgres_cube run --cycle @Y-@m-@dT@H --leadtime 0 -c /scratch3/BMC/wrfruc/uw_training/uwtools_training/configs/prepare_bcs.yaml --key-path make_ics</cyclestr>
    </command>
    <jobname>prepare_ics</jobname>
    <join>
      <cyclestr>&LOGDIR;/prepare_ics_@Y@m@d@H.log</cyclestr>
    </join>
  </task>
  <metatask name="prepare_lbcs">
    <var name="LEADTIME">0 1 2 3 4 5 <

## Use the `compose` tool

* `compose` is a helper tool that reduces the need for repeated piping.
* Configs increase in priority from left to right: the last one listed takes precedence.
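
The precedence rule can be pictured as a left-to-right deep merge of nested mappings. The following is a minimal sketch of that idea, not the `uwtools` implementation; `merge` and `compose` are hypothetical helpers and the paths are placeholders.

```python
from copy import deepcopy
from functools import reduce


def merge(base: dict, update: dict) -> dict:
    """Return a copy of base updated with update, recursing into nested mappings."""
    merged = deepcopy(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def compose(*configs: dict) -> dict:
    """Merge configs left to right, so later configs win on conflicting keys."""
    return reduce(merge, configs, {})


first = {"user": {"account": "zrtrr", "expt_dir": "/path/to/expt1"}}
second = {"user": {"expt_dir": "/path/to/expt2"}, "platform": {"scheduler": "slurm"}}
print(compose(first, second))
```

Keys that appear only in an earlier config survive, while a key that appears in both takes its value from the later config.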

In [9]:
%%bash
file_list=(
    $configs/fv3_config.yaml
    $configs/gsi_config.yaml
    $configs/prepare_bcs.yaml
    $configs/rocoto_workflow.yaml
)
uw config compose "${file_list[@]}" -o experiment.yaml

In [10]:
!cat experiment.yaml

timevars:
  pyyyymmddhh: '{{ (cycle - user.cycle_freq).strftime("%Y%m%d%H") }}'
  pyyyymmdd: '{{ (cycle - user.cycle_int).strftime("%Y%m%d") }}'
  yyyymmddhh: '{{ cycle.strftime("%Y%m%d%H") }}'
  yyyymmdd: '{{ cycle.strftime("%Y%m%d") }}'
  hh: '{{ cycle.strftime("%H") }}'
  yyyy: '{{ cycle.strftime("%Y") }}'
  mm: '{{ cycle.strftime("%m") }}'
  dd: '{{ cycle.strftime("%d") }}'
  fff: '{{ "%03d" % (leadtime.total_seconds() / 3600) }}'
user:
  uw_training: /scratch3/BMC/wrfruc/uw_training/uwtools_training
  expt_dir: /scratch3/BMC/wrfruc/cholt/uw_training_prep/expt7
  rrfs_workflow: /scratch3/BMC/wrfruc/cholt/rrfs_work/rrfs-workflow
  physics_suite: FV3_HRRR_gf
  cycle_int: !timedelta '1'
  account: zrtrr
  cycle_freq: !timedelta '1'
platform:
  scheduler: slurm
  fix: /scratch3/BMC/wrfruc/cholt/rrfs_work/FIX_RRFS
  fix_am: '{{ platform.fix }}/am'
  fix_lam: '{{ platform.fix }}/lam/RRFS_CONUS_13km_Lake_fracSV'
  input_data: /scratch3/BMC/wrfruc/cholt/rrfs_work/input_data
  gsi_fixdir: /

### `compose` and `realize`

In [11]:
%%bash
file_list=(
    $configs/fv3_config.yaml
    $configs/gsi_config.yaml
    $configs/prepare_bcs.yaml
    $configs/rocoto_workflow.yaml
)
uw config compose "${file_list[@]}" -o experiment.yaml --realize

In [12]:
!cat experiment.yaml

timevars:
  pyyyymmddhh: '{{ (cycle - user.cycle_freq).strftime("%Y%m%d%H") }}'
  pyyyymmdd: '{{ (cycle - user.cycle_int).strftime("%Y%m%d") }}'
  yyyymmddhh: '{{ cycle.strftime("%Y%m%d%H") }}'
  yyyymmdd: '{{ cycle.strftime("%Y%m%d") }}'
  hh: '{{ cycle.strftime("%H") }}'
  yyyy: '{{ cycle.strftime("%Y") }}'
  mm: '{{ cycle.strftime("%m") }}'
  dd: '{{ cycle.strftime("%d") }}'
  fff: '{{ "%03d" % (leadtime.total_seconds() / 3600) }}'
user:
  uw_training: /scratch3/BMC/wrfruc/uw_training/uwtools_training
  expt_dir: /scratch3/BMC/wrfruc/cholt/uw_training_prep/expt7
  rrfs_workflow: /scratch3/BMC/wrfruc/cholt/rrfs_work/rrfs-workflow
  physics_suite: FV3_HRRR_gf
  cycle_int: !timedelta '1:00:00'
  account: zrtrr
  cycle_freq: !timedelta '1:00:00'
platform:
  scheduler: slurm
  fix: /scratch3/BMC/wrfruc/cholt/rrfs_work/FIX_RRFS
  fix_am: /scratch3/BMC/wrfruc/cholt/rrfs_work/FIX_RRFS/am
  fix_lam: /scratch3/BMC/wrfruc/cholt/rrfs_work/FIX_RRFS/lam/RRFS_CONUS_13km_Lake_fracSV
  input_data: /

## Visually inspect the Rocoto XML

* Check that no unrendered Jinja2 expressions slipped past the validator.
* Did any loops end up off by one?
* Is anything else amiss?
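
The first check can be partly automated. The sketch below scans text for leftover Jinja2 delimiters (`{{ ... }}` and `{% ... %}`) and reports the line numbers where they occur. `find_unrendered` is a hypothetical helper, and a clean result does not replace reading the XML yourself.

```python
import re

# Matches Jinja2 expressions and statements that fit on a single line.
JINJA_PATTERN = re.compile(r"\{\{.*?\}\}|\{%.*?%\}")


def find_unrendered(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched text) for each leftover Jinja2 construct."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in JINJA_PATTERN.finditer(line):
            hits.append((number, match.group(0)))
    return hits


sample = "<log>{{ user.expt_dir }}/workflow.log</log>\n<walltime>00:30:00</walltime>"
print(find_unrendered(sample))
# [(1, '{{ user.expt_dir }}')]
```

To check the generated file, pass in its contents, e.g. `find_unrendered(Path("workflow.xml").read_text())` after importing `Path` from `pathlib`.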


## Run the Rocoto Workflow

Docs: __[uw rocoto iterate](https://uwtools.readthedocs.io/en/main/sections/user_guide/cli/tools/rocoto.html#iterate)__

The `uw rocoto iterate` tool iteratively runs `rocotorun` and `rocotostat` in subprocesses at a default interval of 10 s to step through the workflow until a particular cycle and task are complete.
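
The run-check-sleep pattern described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the `uwtools` implementation: `iterate`, its parameters, and the `run` and `is_complete` callables are hypothetical stand-ins for invoking `rocotorun` and inspecting `rocotostat` output.

```python
import time
from typing import Callable


def iterate(
    run: Callable[[], None],
    is_complete: Callable[[], bool],
    interval: float = 10.0,
    max_iterations: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Advance the workflow until the target is complete or the iteration budget runs out."""
    for _ in range(max_iterations):
        run()  # stand-in for one rocotorun invocation
        if is_complete():  # stand-in for checking rocotostat output
            return True
        sleep(interval)
    return False


# Simulated workflow whose target task completes on the third status check.
states = iter([False, False, True])
calls = []
done = iterate(
    run=lambda: calls.append("rocotorun"),
    is_complete=lambda: next(states),
    interval=0,
    sleep=lambda seconds: None,
)
print(done, len(calls))
# True 3
```

Passing `run`, `is_complete`, and `sleep` as parameters keeps the loop testable without a Rocoto installation or a batch system.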

In [None]:
%%bash
module load rocoto
uw rocoto iterate -w workflow.xml -d workflow.db --cycle 2025-08-27T13 --task cycled_forecast