# 20 km WRF re-stacking pipeline

This pipeline restructures the raw 20 km WRF outputs that cover Alaska and the surrounding regions (created by P. Bieniek) into more user-friendly files that can be easily imported into popular GIS software. This WRF dataset consists of hourly outputs for one reanalysis (ERA-Interim) and two GCMs (GFDL-CM3 and NCAR-CCSM4). This pipeline is designed to be executed entirely from this notebook.

This is a rather complicated SNAP data pipeline. It works on a large amount of data (~300 GB for a single model / scenario / year, so that's over 90 TB for $2 * 95 + 2 * 35 + 35$ model / scenario / year combinations), creates a large number of final data files (>10k), and makes use of slurm, a specific directory structure / file management scheme, and asynchronous execution (i.e. re-running certain steps, running steps for only certain variables, etc). The "Setup" step provides info on executing it.

# 0 - Setup

This step provides instructions for setting up and running the pipeline. 

First off, a snapshot of the structure of the target base data:

In [13]:
ls /archive/DYNDOWN/DIONE/pbieniek/ccsm/hist/hourly | head -5

1970/
1971/
1972/
1973/
1974/


In [14]:
ls /archive/DYNDOWN/DIONE/pbieniek/ccsm/hist/hourly/ | tail -6

2003/
2004/
2005/
nohup.out
orgdata.sh*

In [18]:
ls /archive/DYNDOWN/DIONE/pbieniek/ccsm/hist/hourly/1979 | head -5

dailylog.out*
WRFDS_d01.1979-01-01_00.nc*
WRFDS_d01.1979-01-01_01.nc*
WRFDS_d01.1979-01-01_02.nc*
WRFDS_d01.1979-01-01_03.nc*
ls: write error


This structure applies for all outputs, and exists for the following model / scenario / year combinations:

* `era/`
    * `hist/`: 1979-2015
* `gfdl/`
    * `hist/`: 1970-2006
    * `rcp85/`: 2006-2100
* `ccsm/`
    * `hist/`: 1970-2005
    * `rcp85/`: 2005-2100
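
As a rough cross-check of the data volume, the year ranges above can be tallied directly. This is a minimal sketch that assumes the ranges are inclusive and that each model / scenario / year takes about 300 GB; the `year_ranges` dictionary is a hypothetical stand-in written out for illustration, not the lookup table used by the pipeline.

```python
# Hypothetical stand-in for the year ranges listed above (assumed inclusive).
year_ranges = {
    "era/hist": (1979, 2015),
    "gfdl/hist": (1970, 2006),
    "gfdl/rcp85": (2006, 2100),
    "ccsm/hist": (1970, 2005),
    "ccsm/rcp85": (2005, 2100),
}
GB_PER_YEAR = 300  # approximate size of one model / scenario / year

# number of years in each model / scenario combination
n_years = {name: end - start + 1 for name, (start, end) in year_ranges.items()}
total_years = sum(n_years.values())
total_tb = total_years * GB_PER_YEAR / 1000  # decimal terabytes

for name, n in n_years.items():
    print(f"{name}: {n} years")
print(f"total: {total_years} years, ~{total_tb:.1f} TB")
```

Under those assumptions the total comes out at a little over 90 TB, in line with the estimate given in the introduction.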

## 0.1 - Pipeline execution

### Processing

The default configuration for this pipeline is to process all available data - all year / variable / model / scenario combinations possible. However, at the finest level of control, this pipeline can re-stack a single year's worth of data for a single variable / model / scenario combination.

As seen above, the input data are grouped by model and scenario names and are consistently structured - hourly WRF model outputs grouped by yearly folders. Thus, processing is done at the model / scenario "group" level - more on that below.

Given the large file size / count, this pipeline is best run asynchronously, with memory management tasks, regular printouts of progress, and reporting of which files are where for each group.

### System

This pipeline is being developed on the Chinook cluster:

In [2]:
!uname -a

Linux chinook00.rcs.alaska.edu 2.6.32-754.35.1.el6.61015g0000.x86_64 #1 SMP Mon Dec 21 12:41:07 EST 2020 x86_64 x86_64 x86_64 GNU/Linux


This pipeline makes use of slurm and multiple cores / compute nodes for processing in reasonable time.

In [5]:
!sinfo -V

slurm 19.05.7


### Execution

This notebook should be executed sequentially to process the entire dataset. To process only a subset of the target dataset, e.g. to fix an issue or re-process some failed runs, all code cells in this Setup section from Section 0.2 onward must be executed first.

## 0.2 - Environment

Instead of relying on environment variables, this pipeline utilizes user-supplied parameters specified in the cells of this notebook by simply assigning values to variables prior to executing any processing code cells.

### 0.2.1 - global parameters

The following variables are used throughout the pipeline and are set in the code cell below:

* `base_dir` - Full path to the directory that will contain all ancillary and intermediate files that will be kept, such as scripts for slurm / `sbatch`
* `output_dir` - Full path to the directory that will contain the final output data (will be the same as `base_dir` here but specified separately for consistency with other SNAP pipelines)
* `scratch_dir` - Full path to the scratch directory that raw WRF outputs will be copied to prior to processing them
    * This pipeline works with WRF outputs that are on a mounted file system, so they can be copied over to scratch space and removed when done to improve IO and avoid the need to keep them in the `base_dir`.
* `slurm_email` - String containing email address to use for failed slurm notifications
* `conda_init_script` - Path to a script that initializes the shells on the compute nodes so that `conda activate` can be used. This is currently specific to Chinook. The script contains the block that `conda init` typically adds to `~/.bashrc`:

In [19]:
cat ~/init_conda.sh

#!/bin/bash

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/kmredilla/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/home/kmredilla/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/home/kmredilla/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/home/kmredilla/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<< 



Supply the values for these parameters:

In [1]:
# User-parameters
base_dir = "/import/SNAP/wrf_data/project_data/wrf_data"
output_dir = "/import/SNAP/wrf_data/project_data/wrf_data"
scratch_dir = "/center1/DYNDOWN/kmredilla/wrf_data"
slurm_email = "kmredilla@alaska.edu"
conda_init_script = "/home/kmredilla/init_conda.sh"

### 0.2.2 - job parameters

The following arguments are required for a single job of re-stacking data for a particular variable (or variables), model, scenario, and year (or years):

* `varname`: Name of the variable. This is the lower case version of the variable name in the WRF outputs.
* `wrf_dir`: This is the directory containing the WRF files. This codebase is designed for use with hourly output, so this needs to be the `hourly/` directory if there are multiple options (e.g. `daily/`, `monthly/`, etc.).
* `group`: Encoded value specifying the WRF group being worked on, which is just a combination of the model and scenario (or just the model, in the case of ERA-Interim). One of [`era_interim`, `gfdl_hist`, `ccsm_hist`, `gfdl_rcp85`, `ccsm_rcp85`].
* `years`: a list of years to work on specified as integers, such as `[1979, 1980]`, or omit to work on all years available for a given WRF group.

The WRF outputs of interest from different runs of model/scenario may be in separate places, but there is consistency in file structure across all groups - all `hourly` directories have annual subgroups consisting of the WRF outputs to be restacked.
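
The job parameters listed above could be bundled and checked before any work is submitted. The sketch below is only an illustration: `make_job` and `VALID_GROUPS` are hypothetical names that do not exist in the codebase, and the checks simply restate the constraints described in the list.

```python
from pathlib import Path

# Group names as listed above
VALID_GROUPS = ["era_interim", "gfdl_hist", "ccsm_hist", "gfdl_rcp85", "ccsm_rcp85"]


def make_job(varname, wrf_dir, group, years=None):
    """Bundle job parameters into a dict; years=None means all available years."""
    if group not in VALID_GROUPS:
        raise ValueError(f"group must be one of {VALID_GROUPS}, got {group!r}")
    if years is not None and not all(isinstance(year, int) for year in years):
        raise TypeError("years must be a list of integers")
    return {
        # variable names are the lower case version of the WRF names
        "varname": varname.lower(),
        "wrf_dir": Path(wrf_dir),
        "group": group,
        "years": years,
    }


job = make_job(
    "T2", "/archive/DYNDOWN/DIONE/pbieniek/ccsm/hist/hourly", "ccsm_hist", years=[1979, 1980]
)
print(job["varname"], job["group"], job["years"])
```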

## 0.3 - Global imports and filepaths

Set up all file paths used by the pipeline in the cell below, and import all packages used in multiple sections.

In [2]:
import os
import time
from multiprocessing import Pool
from pathlib import Path
import xarray as xr
from tqdm.notebook import tqdm
# codebase
import luts
import restack_20km as main


base_dir = Path("/import/SNAP/wrf_data/project_data/wrf_data")
anc_dir = base_dir.joinpath("ancillary")
anc_dir.mkdir(exist_ok=True)
# monthly WRF file to serve as template
template_fp = anc_dir.joinpath("monthly_PCPT-gfdlh.nc")
# WRF geogrid file for correctly projecting data and rotating wind data
geogrid_fp = anc_dir.joinpath("geo_em.d01.nc")
# final output directory for data
output_dir = Path("/import/SNAP/wrf_data/project_data/wrf_data/restacked")
output_dir.mkdir(exist_ok=True)
# scratch space where data will be copied for performant reading / writing
scratch_dir = Path("/center1/DYNDOWN/kmredilla/wrf_data")
# where raw wrf outputs will be copied on scratch
raw_scratch_dir = scratch_dir.joinpath("raw")
raw_scratch_dir.mkdir(exist_ok=True)
# where initially restacked data will be stored on scratch_space
restack_scratch_dir = scratch_dir.joinpath("restacked")
restack_scratch_dir.mkdir(exist_ok=True)

slurm_dir = base_dir.joinpath("slurm")
slurm_dir.mkdir(exist_ok=True)
slurm_email = "kmredilla@alaska.edu"

# this env var is always defined if notebook started with anaconda-project run
project_dir = Path(os.getenv("PROJECT_DIR"))
ap_env = project_dir.joinpath("envs/default")
# cp_script = project_dir.joinpath("restack_20km/mp_cp.py") not used on Chinook, $ARCHIVE not accessible from compute nodes
restack_script = project_dir.joinpath("restack_20km/restack.py")
forecast_times_script = project_dir.joinpath("restack_20km/forecast_times.py")
luts_fp = project_dir.joinpath("restack_20km/luts.py")

# 1 - Re-stack data and improve the file structure

This is the main lift of the pipeline and it applies to a single WRF group (again, "group" meaning a specific model / scenario combination) for any variables and years specified. It re-stacks the WRF outputs, which means extracting the data for all variables in a single hourly WRF file and combining them into new files grouped by variable and year. It then assigns useful metadata and restructures the files to achieve greater usability (note - this was previously a separate step, but the storage of essentially duplicate intermediate data was not efficient).

As mentioned above, this pipeline is currently configured to run for all potential combinations of variables / years for each group. This section will demonstrate execution of all the processing steps required to re-stack one single WRF group, NCAR-CCSM4 historical, and then will proceed to string them all together for processing the remaining WRF groups.
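
To make the re-stacking idea concrete, here is a toy illustration using plain Python containers instead of netCDF files and `xarray`. Each dictionary stands in for one hourly WRF file that holds every variable for a single time step; the values and the `restack` function are made up for illustration and are not the pipeline's implementation.

```python
from collections import defaultdict

# Toy stand-ins for hourly WRF files: every variable for one time step each.
hourly_files = [
    {"time": "1979-01-01_01", "T2": [[250.5, 251.5]], "PCPT": [[0.0, 0.2]]},
    {"time": "1979-01-01_00", "T2": [[250.0, 251.0]], "PCPT": [[0.0, 0.1]]},
    {"time": "1979-01-01_02", "T2": [[251.0, 252.0]], "PCPT": [[0.1, 0.0]]},
]


def restack(files):
    """Regroup per-time-step records into one time-ordered stack per variable."""
    stacks = defaultdict(lambda: {"time": [], "data": []})
    for record in sorted(files, key=lambda r: r["time"]):
        for name, grid in record.items():
            if name == "time":
                continue
            # output variables use the lower case version of the WRF name
            stacks[name.lower()]["time"].append(record["time"])
            stacks[name.lower()]["data"].append(grid)
    return dict(stacks)


restacked = restack(hourly_files)
print(sorted(restacked))
print(restacked["t2"]["time"])
```

In the real pipeline the per-variable stacks are written out as one file per variable and year.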

## 1.1 - Copy WRF data to scratch space

If not present on the filesystem (as is the case at the time of developing the current code) then the WRF data need to be copied over. This is done from tape storage if working on Chinook.

This step will copy the annual subdirectory(ies) containing the WRF outputs for all specified years to scratch space for efficient reading. Given the location of the source data on $ARCHIVE, which requires files to be brought back online from tape to read them, this step is very time consuming.

There are two steps required here:
1. "stage" the files that are on tape storage - i.e., read them from tape to a temporary spot
2. copy the files to scratch space for persistence

It takes a very long time to `batch_stage` an entire group directory, which makes for a poor experience when trying to execute this pipeline. Given that this transfer needs to be initiated from a login node*, it is useful to have commands that:
1) determine which files (years) are missing from the scratch space
2) check which of those files (years) are currently "offline" (i.e., in tape storage only and not available for immediate copy)
3) copy the files (years) that are online 

\* $ARCHIVE is only visible to the login nodes, so this task cannot be split into subtasks for the compute nodes.
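
The first of those commands, finding which years are missing from scratch space, can be sketched with `pathlib` alone. `find_missing_years` is a hypothetical helper, not part of the codebase, and it assumes a year counts as present only when every `.nc` file name in the source year directory also exists in the matching scratch directory. The offline check is left out because it depends on site-specific tooling.

```python
import tempfile
from pathlib import Path


def nc_names(directory):
    """Return the set of .nc file names in a directory (empty if it does not exist)."""
    directory = Path(directory)
    if not directory.is_dir():
        return set()
    return {fp.name for fp in directory.glob("*.nc")}


def find_missing_years(src_dir, scratch_group_dir, years):
    """Return the years whose scratch directory lacks any source .nc file."""
    missing = []
    for year in years:
        src = nc_names(Path(src_dir, str(year)))
        dst = nc_names(Path(scratch_group_dir, str(year)))
        if not src <= dst:
            missing.append(year)
    return missing


# Small demonstration with throwaway directories
with tempfile.TemporaryDirectory() as tmp:
    src_root, scratch_root = Path(tmp, "src"), Path(tmp, "scratch")
    for year in (1970, 1971):
        Path(src_root, str(year)).mkdir(parents=True)
        Path(src_root, str(year), f"WRFDS_d01.{year}-01-01_00.nc").touch()
    # only 1970 has been copied to "scratch"
    Path(scratch_root, "1970").mkdir(parents=True)
    Path(scratch_root, "1970", "WRFDS_d01.1970-01-01_00.nc").touch()
    missing_years = find_missing_years(src_root, scratch_root, [1970, 1971])

print(missing_years)
```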

### 1.1.1 - Specify the WRF group and years to be processed

Set the parameters for processing. The `wrf_dir` specifies the path to the WRF group.

In [3]:
# job parameters
wrf_dir = Path("/archive/DYNDOWN/DIONE/pbieniek/ccsm/hist/hourly")
group = "ccsm_hist"
# years = [2004, 2005]
# to specify all years:
years = luts.groups[group]["years"]

### 1.1.2 - Stage the files to be processed

Staging all files for a particular WRF group can be done with a single command on the above `wrf_dir`, e.g.:

```
batch_stage -r /archive/DYNDOWN/DIONE/pbieniek/ccsm/hist/hourly
```

However, the cell below will execute the staging for all supplied years. Once this command finishes, then proceed to copying the files. This can take a while.

In [6]:
from subprocess import check_output


stdout = []
for year in years:
    stage_dir = wrf_dir.joinpath(str(year))
    out = check_output(["batch_stage", "-r", stage_dir])
    stdout.append(out)

**Note** - This step should always be done if the data are on `$ARCHIVE` on Chinook, as copying staged files is much more efficient than copying unstaged ones: using 20 cores, a year takes more than 3 hours to copy without staging and about 45 minutes when staged.

Check that all requested files actually staged:

In [5]:
%time unstaged_fps = main.check_staged(wrf_dir, years)

Requested years: [1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983
 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997
 1998 1999 2000 2001 2002 2003 2004 2005]
All files are staged
CPU times: user 86 ms, sys: 102 ms, total: 188 ms
Wall time: 2.42 s


[]

Ensure yearly subdirectories are present before starting the copying:

In [5]:
main.make_yearly_scratch_dirs(group, years, raw_scratch_dir)

### 1.1.3 - Copy staged files to `scratch_dir`

Iterate over years and copy the files in parallel with `multiprocessing.Pool`:

In [12]:
ncpus = 20
clobber = "all"


group_dir = raw_scratch_dir.joinpath(group)
for year in tqdm(years, total=len(years), desc=f"Copying files for {len(years)} years"):
    src_dir = wrf_dir.joinpath(str(year))
    dst_dir = group_dir.joinpath(str(year))
    # set third arg to False for no-clobber
    args = [(fp, dst_dir.joinpath(fp.name), clobber) for fp in src_dir.glob("*.nc")]
    
    with Pool(ncpus) as pool:
        out = [out for out in tqdm(pool.imap(main.sys_copy, args), total=len(args), desc=f"Year: {year}")]

Copying files for 20 years:   0%|          | 0/20 [00:00<?, ?it/s]
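
`main.sys_copy` is defined in the codebase and is not shown in this notebook. A worker compatible with the `(src, dst, clobber)` tuples built above might look like the following sketch. The name `copy_one` and the handling of `clobber` (any truthy value overwrites, `False` skips existing files) are assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_one(args):
    """Copy one file; args is a (src, dst, clobber) tuple, as passed via Pool.imap."""
    src, dst, clobber = args
    dst = Path(dst)
    if dst.exists() and not clobber:
        # no-clobber: leave the existing file untouched
        return dst, "skipped"
    shutil.copyfile(src, dst)
    return dst, "copied"


# Small demonstration with throwaway files
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "a.nc")
    src.write_bytes(b"data")
    dst = Path(tmp, "b.nc")
    first = copy_one((src, dst, False))[1]
    second = copy_one((src, dst, False))[1]
    third = copy_one((src, dst, "all"))[1]

print(first, second, third)
```

Because the worker takes a single tuple argument and is defined at module level, it can be handed to `pool.imap` in the same way as `main.sys_copy`.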


##### Check progress of copying to scratch

Run the cell below to check the progress of the copy for the current arguments (this can take a while):

In [4]:
# modify this to show what entire years are present in scratch_dir

wrf_fps, existing_scratch_fps = main.check_raw_scratch(wrf_dir, group, years, raw_scratch_dir)

All years present on scratch space


This cell below can be used as a quick check to identify any files that didn't copy properly:

In [7]:
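flag_fps = []

for year in years:
    # joinpath needs strings, so convert the integer year
    year_scratch_dir = raw_scratch_dir.joinpath(group, str(year))
    # assumed to live in the restack_20km module like the other helpers
    flag_fps.extend(main.check_scratch_file_sizes(year_scratch_dir, ncpus=20))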

Then, re-copy any files flagged by that check:

In [None]:
main.recopy_raw_scratch_files(flag_fps, wrf_dir)
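
The file size check used above lives in the codebase and is not shown here. One way such a check could work is sketched below, under the assumption that all hourly files in a year directory normally have the same size, so any file with a different size is suspect. `flag_odd_sized_files` is a hypothetical name and this is not necessarily how the real check works.

```python
import tempfile
from collections import Counter
from pathlib import Path


def flag_odd_sized_files(year_dir):
    """Return .nc files whose size differs from the most common size in year_dir."""
    sizes = {fp: fp.stat().st_size for fp in sorted(Path(year_dir).glob("*.nc"))}
    if not sizes:
        return []
    expected_size, _ = Counter(sizes.values()).most_common(1)[0]
    return [fp for fp, size in sizes.items() if size != expected_size]


# Small demonstration: three complete files and one truncated file
with tempfile.TemporaryDirectory() as tmp:
    payloads = [b"x" * 100, b"x" * 100, b"x" * 100, b"x" * 40]
    for hour, payload in enumerate(payloads):
        Path(tmp, f"WRFDS_d01.1970-01-01_{hour:02d}.nc").write_bytes(payload)
    flagged_names = [fp.name for fp in flag_odd_sized_files(tmp)]

print(flagged_names)
```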

**Note** - if there is a large number of missing files, it might be more efficient to use the initial section above for copying in batch.

## 1.2 - Restack the data

Now that the WRF outputs are available on the scratch filesystem for faster access, execute the restacking script on all variables of interest.

### 1.2.1 - Make forecast time table

Tables of forecast time values and filenames are used for interpolating the "accumulation" variables, such as snow and precipitation. This should be done after an entire year's worth of data has successfully copied from `$ARCHIVE` to scratch space, because this table will be referenced for the filepaths and timestamp information to re-stack. In addition, this step uses the compute nodes, which cannot see `$ARCHIVE`.

**Note** - it is currently unknown why there is an "accumulation fix" needed at all. There could be some info on this lurking somewhere.
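
The filename / timestamp part of such a table can be illustrated with the standard library. This sketch only parses the timestamp embedded in names like `WRFDS_d01.1979-01-01_03.nc`; the actual table also holds forecast time values, which are not derived here. `parse_wrf_time` is a hypothetical helper.

```python
from datetime import datetime


def parse_wrf_time(filename):
    """Parse the timestamp from a name like WRFDS_d01.1979-01-01_03.nc."""
    stamp = filename.split(".")[1]  # e.g. "1979-01-01_03"
    return datetime.strptime(stamp, "%Y-%m-%d_%H")


filenames = [
    "WRFDS_d01.1979-01-01_02.nc",
    "WRFDS_d01.1979-01-01_00.nc",
    "WRFDS_d01.1979-01-01_01.nc",
]

# rows of (timestamp, filename), ordered by time
table = sorted((parse_wrf_time(fn), fn) for fn in filenames)
for timestamp, fn in table:
    print(timestamp.isoformat(), fn)
```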

Create the slurm script for getting the forecast times:

In [31]:
ncpus = input("enter number of CPUs to use:")
partition = input("Enter name of compute partition to use:")

enter number of CPUs to use: 24
Enter name of compute partition to use: t1small


In [32]:
# since this is only done once for a group with all files, only need to specify group (no year(s))
sbatch_fp = slurm_dir.joinpath(f"get_forecast_times_{group}.slurm")
sbatch_out_fp = slurm_dir.joinpath(f"get_forecast_times_{group}_%j.out")
wrf_scratch_dir = raw_scratch_dir.joinpath(group)
sbatch_head = main.make_sbatch_head(slurm_email, partition, conda_init_script, ap_env)
main.write_sbatch_forecast_times(sbatch_fp, sbatch_out_fp, wrf_scratch_dir, anc_dir, forecast_times_script, ncpus, sbatch_head)

Forecast times slurm commands written to /center1/DYNDOWN/kmredilla/wrf_data/slurm/get_forecast_times_ccsm_hist.slurm
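
For readers unfamiliar with slurm scripts, the header of a script like this is a list of `#SBATCH` directives. The sketch below builds one from standard `sbatch` options. It is not the output of `main.make_sbatch_head`, whose arguments and contents differ (for example, it also handles conda initialization); `make_sbatch_head_sketch` and the example email address are hypothetical.

```python
def make_sbatch_head_sketch(slurm_email, partition, ncpus, out_fp):
    """Build a minimal sbatch header from standard #SBATCH directives."""
    lines = [
        "#!/bin/sh",
        "#SBATCH --nodes=1",
        f"#SBATCH --cpus-per-task={ncpus}",
        # only send email when the job fails
        "#SBATCH --mail-type=FAIL",
        f"#SBATCH --mail-user={slurm_email}",
        f"#SBATCH --partition={partition}",
        # %j is replaced by the job ID in the output filename
        f"#SBATCH --output={out_fp}",
    ]
    return "\n".join(lines) + "\n"


head = make_sbatch_head_sketch(
    "user@example.com", "t1small", 24, "get_forecast_times_ccsm_hist_%j.out"
)
print(head)
```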


Submit the script:

In [34]:
# takes > 30 minutes to run on Chinook
job_id = main.submit_sbatch(sbatch_fp)
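
A submission helper of this kind usually wraps a call to `sbatch` and reads the job ID from its standard output, which has the form `Submitted batch job <id>`. The following is a sketch under that assumption; `submit_sbatch_sketch` and `parse_job_id` are hypothetical names and `main.submit_sbatch` may be implemented differently.

```python
import re
import subprocess


def parse_job_id(sbatch_stdout):
    """Extract the job ID from sbatch's 'Submitted batch job <id>' message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_stdout)
    if match is None:
        raise ValueError(f"Unexpected sbatch output: {sbatch_stdout!r}")
    return int(match.group(1))


def submit_sbatch_sketch(sbatch_fp):
    """Submit a script with sbatch and return the job ID (requires slurm)."""
    out = subprocess.check_output(["sbatch", str(sbatch_fp)], text=True)
    return parse_job_id(out)


# The parsing step can be tried without a slurm installation
example_id = parse_job_id("Submitted batch job 123456\n")
print(example_id)
```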

### 1.2.2 - Ensure ancillary WRF files present

The re-stacking will rely on a monthly WRF output file as a template for consistent metadata, and a WRF geogrid data file for determining correct spatial projection information, and for correctly rotating wind data. Make sure that both are present in the `anc_dir` directory:

In [2]:
import shutil

# get rid of
# if not template_fp.exists():
#     shutil.copy("/archive/DYNDOWN/DIONE/pbieniek/gfdl/hist/monthly/monthly_PCPT-gfdlh.nc", template_fp)
    
if not geogrid_fp.exists():
    # the original location of this file is not known, in case it is ever deleted
    #  from this source location it might still be available on Poseidon at 
    #  /workspace/Shared/Tech_Projects/wrf_data/project_data/ancillary_wrf_constants/geo_em.d01.nc
    shutil.copy("/import/SNAP/wrf_data/project_data/ancillary_wrf_constants/geo_em.d01.nc", geogrid_fp)
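
As background on the wind rotation mentioned above: WRF wind components are relative to the model grid, and the commonly used conversion to earth-relative components relies on the `COSALPHA` and `SINALPHA` fields of the geogrid file. The sketch below shows that standard rotation for scalar inputs (it works elementwise on arrays as well). It assumes those two fields are what the pipeline reads from `geo_em.d01.nc`, which is not confirmed in this notebook, and `rotate_wind_to_earth` is a hypothetical name.

```python
import math


def rotate_wind_to_earth(u, v, cosalpha, sinalpha):
    """Rotate grid-relative wind components to earth-relative components."""
    u_earth = u * cosalpha - v * sinalpha
    v_earth = v * cosalpha + u * sinalpha
    return u_earth, v_earth


# Example: a 10 m/s grid-relative "westerly" where the grid is rotated by 30 degrees
alpha = math.radians(30)
u_e, v_e = rotate_wind_to_earth(10.0, 0.0, math.cos(alpha), math.sin(alpha))
print(round(u_e, 3), round(v_e, 3))
```

The rotation changes direction only, so the wind speed is the same before and after.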

### 1.2.3 - Run the restacking with slurm

Make the slurm scripts for re-stacking data for a particular variable and year.

In [4]:
varnames = input("Enter name of WRF variable to re-stack (leave blank for all):") 
if varnames == "":
    varnames = luts.varnames
else:
    varnames = [varnames]
ncpus = input("Enter number of CPUs to use:")
partition = input("Enter name of compute partition to use:")
years = input("Enter year to work on (leave blank for all):")
if years == "":
    years = luts.groups[group]["years"]
else:
    # input() returns a string; years are specified as integers elsewhere
    years = [int(years)]

Enter name of WRF variable to re-stack (leave blank for all): 
Enter number of CPUs to use: 24
Enter name of compute partition to use: t1small
Enter year to work on (leave blank for all): 


In [5]:
sbatch_fps = []
year_str = f"{years[0]}-{years[-1]}"
for varname in varnames:
    # write to .slurm script
    sbatch_fp = slurm_dir.joinpath(f"restack_{group}_{year_str}_{varname}.slurm")
    # filepath for slurm stdout
    sbatch_out_fp = slurm_dir.joinpath(f"restack_{group}_{year_str}_{varname}_%j.out")
    sbatch_head = main.make_sbatch_head(
        slurm_email, partition, conda_init_script, ap_env
    )

    args = {
        "sbatch_fp": sbatch_fp,
        "sbatch_out_fp": sbatch_out_fp,
        "restack_script": restack_script,
        "luts_fp": luts_fp,
        "geogrid_fp": geogrid_fp,
        "anc_dir": anc_dir,
        "restacked_dir": restack_scratch_dir,
        "group": group,
        "fn_str": luts.groups[group]["fn_str"],
        "years": years,
        "varname": varname,
        "ncpus": ncpus,
        "sbatch_head": sbatch_head,
    }

    main.write_sbatch_restack(**args)
    sbatch_fps.append(sbatch_fp)

Submit the `.slurm` scripts with `sbatch`:

In [14]:
job_ids = [main.submit_sbatch(fp) for fp in sbatch_fps]

# 2 - Quality Check

## 2.1 - Ensure that all files open and have consistent header info

Check this using both `xarray` and GDAL bindings.

Dimensions are the same



In [73]:
# just a temporary sanity check of the new data before I proceed with submitting the rest of the jobs
ds.pcpt

In [75]:
ds

In [6]:
import numpy as np


def sanity_check(restack_fp):
    """Checks the values of restacked data for a random time slice"""
    varname = restack_fp.parent.name
    with xr.open_dataset(restack_fp) as ds:
        idx = np.random.randint(ds.time.values.shape[0])
        check_time = ds.time.values[idx]
        check_arr = ds[varname].sel(time=check_time).values
        
    fix_fp = base_dir.joinpath(f"hourly_fix/{varname}/{restack_fp.name}")
    with xr.open_dataset(fix_fp) as ds:
        fix_arr = ds[varname].sel(time=check_time).values
    
    year = check_time.astype("datetime64[Y]")
    wrf_time_str = str(check_time.astype("datetime64[h]")).replace("T", "_")
    raw_fp = list(raw_scratch_dir.joinpath(f"{group}/{year}").glob(f"*{wrf_time_str}*"))[0]
    with xr.open_dataset(raw_fp) as ds:
        raw_arr = ds[varname.upper()].values
        
    check = np.all(fix_arr == check_arr)
    check = check & np.all(np.flipud(raw_arr) == check_arr)
    print(f"Time: {wrf_time_str}; match: {check}")
    
    return

In [9]:
fp = restack_scratch_dir.joinpath("t2/t2_hourly_wrf_NCAR-CCSM4_historical_1970.nc")

sanity_check(fp)

Time: 1970-10-26_09; match: True


In [37]:
str(ds.time.values[0].astype("datetime64[h]")).replace("T", "_")

'1970-01-02T00'

In [58]:
with xr.open_dataset(restack_scratch_dir.joinpath("pcpt/pcpt_hourly_wrf_NCAR-CCSM4_historical_1970.nc")) as ds:
    check_time = ds.time.values[10]
    check_arr = ds["pcpt"].sel(time=check_time).values
    check_arr = ds["pcpt"].sel(time=ds.time.values[7:12]).values
    

In [76]:
fix_fp = base_dir.joinpath(f"hourly_fix/pcpt/pcpt_hourly_wrf_NCAR-CCSM4_historical_1970.nc")
with xr.open_dataset(fix_fp) as ds:
    fix_arr = ds["pcpt"].sel(time=check_time).values
    

In [79]:
fix_fp

PosixPath('/import/SNAP/wrf_data/project_data/wrf_data/hourly_fix/pcpt/pcpt_hourly_wrf_NCAR-CCSM4_historical_1970.nc')

In [85]:
fp = "/import/SNAP/wrf_data/project_data/wrf_data/hourly_fix/q2/q2_hourly_wrf_NCAR-CCSM4_historical_1970.nc"
with xr.open_dataset(fp) as ds:
    pass

In [84]:
ls -l /import/SNAP/wrf_data/project_data/wrf_data/hourly_fix/q2/q2_hourly_wrf_NCAR-CCSM4_historical_1970.nc

-rw-rw---- 1 12997 dyndown 1477179050 Oct 21 19:20 /import/SNAP/wrf_data/project_data/wrf_data/hourly_fix/q2/q2_hourly_wrf_NCAR-CCSM4_historical_1970.nc


In [86]:
ds

In [25]:
np.all(check_arr == fix_arr)

True

In [55]:
year = check_time.astype("datetime64[Y]")
wrf_time_str = str(check_time.astype("datetime64[h]")).replace("T", "_")
#wrf_time_str = "1970-01-02_11"
raw_arr = []
wrf_times = []
for day in ["02", "03"]:
    for i in range(24):
        wrf_time_str = f"1970-01-{day}_{str(i).zfill(2)}"
        wrf_times.append(wrf_time_str)
        raw_fp = list(raw_scratch_dir.joinpath(f"{group}/{year}").glob(f"*{wrf_time_str}*"))[0]
        with xr.open_dataset(raw_fp) as ds:
            raw_arr.append(ds["PCPT"].values[-3, -3])


# 3 - Manage files

## 3.1 - Move off scratch space

## 3.2 - Copy to AWS

In [28]:
ls /center1/DYNDOWN/kmredilla/wrf_data/restacked/pcpt -l

total 728350
-rw------- 1 kmredilla dyndown 745440375 Apr 15 17:39 pcpt_hourly_wrf_NCAR-CCSM4_historical_1970.nc


In [39]:
ls /center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/*1970-02-01*

/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_00.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_01.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_02.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_03.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_04.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_05.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_06.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_07.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_08.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_09.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_10.nc
/center1/DYNDOWN/kmredilla/wrf_data/raw/ccsm_hist/1970/WRFDS_d01.1970-02-01_11.nc
/center1/DYNDOWN