# Restack hourly 20 km WRF outputs

Now that the WRF outputs are available on the scratch filesystem for persistence and fast access, execute the restacking script on all variables of interest.

This is the main lift of the pipeline. It applies to a single WRF group (again, "group" means a specific model / scenario combination) and to whichever variables and years are specified. "Restacking" means extracting the data for each variable from the hourly WRF files, each of which holds all variables for a single hour, and combining them into new files grouped by variable and year. The script then assigns useful metadata and restructures the files for greater usability. (Note: this was previously a separate step, but storing essentially duplicate intermediate data was inefficient.)
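To make the regrouping concrete, here is a toy sketch of the restacking idea using plain Python objects instead of netCDF files. It is not the pipeline's restacking script: the `restack` function, the record layout, and the variable names `T2` and `PCPT` are illustrative assumptions only.

```python
from collections import defaultdict
from datetime import datetime


def restack(hourly_records):
    """Regroup per-hour records into {(varname, year): [(timestamp, value), ...]}."""
    restacked = defaultdict(list)
    for timestamp, variables in hourly_records:
        for varname, value in variables.items():
            restacked[(varname, timestamp.year)].append((timestamp, value))
    # sort each series chronologically so the result does not depend on input order
    for series in restacked.values():
        series.sort(key=lambda pair: pair[0])
    return dict(restacked)


# toy stand-ins for hourly WRF files: one record per hour, all variables together
hourly_records = [
    (datetime(1980, 1, 1, 1), {"T2": 249.5, "PCPT": 0.1}),
    (datetime(1979, 12, 31, 23), {"T2": 250.1, "PCPT": 0.0}),
    (datetime(1980, 1, 1, 0), {"T2": 249.8, "PCPT": 0.2}),
]
restacked = restack(hourly_records)
```

Each key of `restacked` corresponds to one output file in the real pipeline (one variable, one year), while each input record corresponds to one hourly WRF file.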

As mentioned above, this pipeline is currently configured to run the restacking for all potential combinations of variables / years for each group.

Set up the environment:

In [1]:
import os
# codebase
from config import *
import luts
import slurm

### Run the restacking with slurm

Make the slurm scripts for restacking the data: one script per variable, each covering all years for the group.

In [16]:
varnames = luts.varnames
years = luts.groups[group]["years"]
year_str = f"{years[0]}-{years[-1]}"

# the sbatch header does not depend on the variable, so build it once
sbatch_head = slurm.make_sbatch_head(
    slurm_email, partition, conda_init_script
)

sbatch_fps = []
for varname in varnames:
    # path of the .slurm script to write
    sbatch_fp = slurm_dir.joinpath(f"restack_{group}_{year_str}_{varname}.slurm")
    # path for slurm stdout (%j is replaced by the job ID)
    sbatch_out_fp = slurm_dir.joinpath(f"restack_{group}_{year_str}_{varname}_%j.out")

    args = {
        "sbatch_fp": sbatch_fp,
        "sbatch_out_fp": sbatch_out_fp,
        "restack_script": restack_script,
        "luts_fp": luts_fp,
        "geogrid_fp": geogrid_fp,
        "anc_dir": anc_dir,
        "restack_dir": hourly_dir,
        "group": group,
        "fn_str": luts.groups[group]["fn_str"],
        "years": years,
        "varname": varname,
        "ncpus": ncpus,
        "sbatch_head": sbatch_head,
    }

    slurm.write_sbatch_restack(**args)
    sbatch_fps.append(sbatch_fp)
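The body of `slurm.make_sbatch_head` is not shown in this notebook. As a rough idea of what such a header could contain, here is a minimal sketch that uses only standard `#SBATCH` directives. The function name `example_sbatch_head`, the chosen directives, and the example values are assumptions, not the codebase's actual implementation.

```python
def example_sbatch_head(email, partition, conda_init_script):
    """Build a hypothetical sbatch header; the real slurm.make_sbatch_head may differ."""
    lines = [
        "#!/bin/sh",
        "#SBATCH --nodes=1",
        "#SBATCH --mail-type=FAIL",
        f"#SBATCH --mail-user={email}",
        f"#SBATCH --partition={partition}",
        # initialize conda in the batch shell so the job can activate an environment
        f"source {conda_init_script}",
    ]
    return "\n".join(lines) + "\n"


# placeholder values for illustration only
head = example_sbatch_head("user@example.com", "main", "/path/to/conda_init.sh")
```

Building the header once and passing it to the script writer keeps the per-variable scripts identical apart from their variable-specific arguments.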

Optionally, remove any existing slurm output files left over from previous runs:

In [17]:
for varname in varnames:
    # collect the matches first, then delete them
    for fp in list(slurm_dir.glob(f"restack_{group}_{year_str}_{varname}_*.out")):
        fp.unlink()

Submit the `.slurm` scripts with `sbatch`:

In [18]:
job_ids = [slurm.submit_sbatch(fp) for fp in sbatch_fps]
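The implementation of `slurm.submit_sbatch` is not shown here. A minimal sketch of how such a helper might work, assuming it calls `sbatch` and reads the job ID from the usual `Submitted batch job <id>` message, is given below. The names `parse_job_id` and `submit_sbatch_sketch` are hypothetical.

```python
import re
import subprocess


def parse_job_id(sbatch_stdout):
    """Extract the job ID from sbatch's 'Submitted batch job <id>' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise ValueError(f"Could not find a job ID in sbatch output: {sbatch_stdout!r}")
    return int(match.group(1))


def submit_sbatch_sketch(sbatch_fp):
    """Submit a script with sbatch and return its job ID (requires a Slurm cluster)."""
    result = subprocess.run(
        ["sbatch", str(sbatch_fp)], capture_output=True, text=True, check=True
    )
    return parse_job_id(result.stdout)
```

Keeping the parsing separate from the `subprocess` call means the parsing can be checked on a machine without Slurm installed.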

This completes this step of the pipeline. Once all of the slurm jobs have finished, proceed to resampling the restacked files to a daily resolution.
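Before moving on, it can help to confirm that every variable / year combination produced an output file. The sketch below is one way to do that. The function `find_missing_outputs` is hypothetical, and the filename template is supplied by the caller because the actual naming scheme of the restacked files is not shown in this notebook.

```python
from itertools import product
from pathlib import Path


def find_missing_outputs(restack_dir, varnames, years, fn_template):
    """Return the (varname, year) pairs that have no file in restack_dir.

    fn_template is an assumed naming pattern with {varname} and {year}
    placeholders; adjust it to match the real restacked filenames.
    """
    missing = []
    for varname, year in product(varnames, years):
        fp = Path(restack_dir) / fn_template.format(varname=varname, year=year)
        if not fp.exists():
            missing.append((varname, year))
    return missing
```

For example, `find_missing_outputs(hourly_dir, varnames, years, "{varname}_{year}.nc")` returns an empty list when nothing is missing, where the template `"{varname}_{year}.nc"` is a placeholder and not the pipeline's real pattern.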