# Objective
The purpose of this notebook is to create a unified dataset of landfast ice extent for the Alaskan coastline from the Mahoney / InteRFACE data collection, as described in the exploratory data analysis (EDA) notebooks and their summary documentation. The notebook is based on the SNAP Data Pipeline concept and template.

## Pipeline Steps

The flow here is as follows:

    0. Setup - define directories, initial conditions, etc. Execute the setup code cell before any other step.
    1. Fetch the data if it is not currently available locally.
    2. Create a DateTime index that encompasses the entire range of the data. The atomic units of data have variable ranges or "durations" on the order of 20 days, so the index must handle these gracefully, including the turnover between calendar years.
    3. Map the existing data collection to this index.
    4. For each item in the existing data collection, constrain the valid array values and establish a common and well-known spatial reference.
    5. Create a merged raster from Chukchi and Beaufort data (or lack thereof) for each date in the time index with appropriate data types and NoData values.
    
Creating the final GeoTIFFs (step 5) requires understanding the four possible cases of data availability for each date in our DateTime index.

    A: No data is available at all.
    B: A single raster exists from only one region.
    C: Multiple rasters exist from only one region.
    D: One or more rasters exist from both regions.

We can handle these cases like so:

 - Case A: Skip these dates for now, there is no data available. These dates can be brought back into a datacube at a later time if needed.

 - Case B: Take the single raster and merge it with the mask for whichever region has no data.

 - Case C: Take all the rasters, stack them into a 3D array, and take the per-pixel maximum along the stacking (time) axis. Then merge the result with the mask for whichever region has no data.

 - Case D: Same as Case C, but do the stacking and filtering for both regions, and then merge those two outputs together.
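
The four cases can be told apart from two numbers per date: how many rasters match from each region. The following is a minimal sketch of that decision, not the notebook's own implementation; the function name `classify_merge_case` is hypothetical.

```python
def classify_merge_case(beaufort_count, chukchi_count):
    """Return 'A', 'B', 'C', or 'D' for one date given per-region raster counts."""
    total = beaufort_count + chukchi_count
    if total == 0:
        return "A"  # no data at all
    if total == 1:
        return "B"  # a single raster from one region
    if beaufort_count == 0 or chukchi_count == 0:
        return "C"  # several rasters, but only one region
    return "D"  # at least one raster from each region


# (beaufort_count, chukchi_count) pairs chosen for illustration only
examples = [(0, 0), (1, 0), (0, 3), (1, 1), (3, 3)]
labels = [classify_merge_case(b, c) for b, c in examples]
print(labels)
```

Note that a date with exactly one raster from each region falls under Case D, because both regions contribute data.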


## 0 - Setup

The following environment variables should be set before running this notebook:
- `$SRC_DIR`: the base directory for storing project data that should be backed up.
- `$DST_DIR`: the output directory where final products are placed.
- `$SCRATCH_DIR`: scratch directory for storing project data which does not need to be backed up.

I like to set these variables from within the notebook for clarity, but it could be done from your shell also.
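
As a rough sketch of the pattern, a small helper can read a directory from an environment variable, fall back to a default when the variable is unset, and create the directory. The helper name `dir_from_env` and the variable `SNAP_DEMO_SCRATCH_DIR` are hypothetical and exist only for this illustration; the setup cell in this notebook assigns the variables directly.

```python
import os
import tempfile
from pathlib import Path


def dir_from_env(var_name, default):
    """Return a Path taken from an environment variable, creating the directory."""
    # setdefault keeps a value exported from the shell and otherwise stores the default
    path = Path(os.environ.setdefault(var_name, str(default)))
    path.mkdir(parents=True, exist_ok=True)
    return path


# demonstrate with a throwaway temporary location
base = Path(tempfile.mkdtemp())
scratch_demo = dir_from_env("SNAP_DEMO_SCRATCH_DIR", base / "scratch")
print(scratch_demo)
```

With this approach a value set in the shell takes precedence, and the in-notebook default applies only when nothing was exported.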

In [1]:
# place all imports here
import config # this sets the GDAL and PROJ envs before importing geospatial libs
import os
import shutil
import re
import datetime
import pandas as pd
import rasterio as rio
import numpy as np
import pickle
import slurm
from pathlib import Path
from rasterio.crs import CRS
from rasterio.merge import merge
# local helper library, like clippy
import helpers as snappy
# constants
NCORES = 1
SLURM_MAIL = "cparr4@alaska.edu"
AC_CONDA_ENV = "/home/UA/cparr4/miniconda3/envs/conda-project"
COPY_SOURCE = False
project_dir = os.getcwd()


# set the environment variables and create directories and Path objects
os.environ["SRC_DIR"] = "/workspace/Data/Base_Data/Cryosphere/landfast_seaice"
src_p = Path(os.environ["SRC_DIR"])

os.environ["DST_DIR"] = "/atlas_scratch/cparr4/landfast_seaice/mahoney_preprocess/outputs"
dst_p = Path(os.environ["DST_DIR"])
dst_p.mkdir(parents=True, exist_ok=True)

os.environ["SCRATCH_DIR"] = "/atlas_scratch/cparr4/landfast_seaice/mahoney_preprocess/scratch"
scratch_p = Path(os.environ["SCRATCH_DIR"])
scratch_p.mkdir(parents=True, exist_ok=True)

# create scratch directory for copied and renamed source GeoTIFFs
copy_dir = scratch_p.joinpath("src_copy")
copy_dir.mkdir(exist_ok=True, parents=True)

# create scratch directory for slurm scripts
slurm_dir = scratch_p.joinpath("slurm")
slurm_dir.mkdir(exist_ok=True, parents=True)

# create scratch directory for GeoTIFFs with corrected array values
arrfix_dir = scratch_p.joinpath("arrfix")
arrfix_dir.mkdir(exist_ok=True, parents=True)

# create scratch directory for regional mask or "dummy" GeoTIFFs
mask_dir = scratch_p.joinpath("mask")
mask_dir.mkdir(exist_ok=True, parents=True)

# create scratch directory for max-filtered GeoTIFFs to handle Cases C and D
max_dir = scratch_p.joinpath("max")
max_dir.mkdir(exist_ok=True, parents=True)

snappy.t_out = open("/dev/stdout", "w")
snappy.jprint("Executing pipeline Step 0 (setup)\n")
snappy.jprint(f"SRC_DIR set to {src_p}")
snappy.jprint(f"DST_DIR set to {dst_p}")
snappy.jprint(f"SCRATCH_DIR set to {scratch_p}\n")
snappy.jprint("Pipeline Step 0 (setup) complete\n")
snappy.jprint("Current snapshot of the working directories:")
_ = os.system("cd $SCRATCH_DIR/.. && tree -d")

Executing pipeline Step 0 (setup)

SRC_DIR set to /workspace/Data/Base_Data/Cryosphere/landfast_seaice
DST_DIR set to /atlas_scratch/cparr4/landfast_seaice/mahoney_preprocess/outputs
SCRATCH_DIR set to /atlas_scratch/cparr4/landfast_seaice/mahoney_preprocess/scratch

Pipeline Step 0 (setup) complete

Current snapshot of the working directories:
.
├── outputs
└── scratch
    ├── arrfix
    ├── mask
    ├── max
    ├── slurm
    └── src_copy

7 directories


## 1 - Fetch

We don't have to download any data - these are on our file system. We'll use the fetch stage to point at the data and create the list of filenames we'll use. We can also copy the source data to our scratch directory and do a file name clean up.
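
The file name cleanup amounts to prefixing each file name with its lowercase region, taken from the grandparent directory. A minimal sketch of that renaming follows, using the example source path from this notebook's output; the function name `regional_name` is hypothetical.

```python
from pathlib import PurePosixPath


def regional_name(fp):
    """Prefix a file name with the lowercase name of its grandparent directory."""
    fp = PurePosixPath(fp)
    region = fp.parent.parent.name.lower()
    return f"{region}_{fp.name}"


example = (
    "/workspace/Data/Base_Data/Cryosphere/landfast_seaice/"
    "Beaufort/2000-01/r2000_286-311_slie.tif"
)
print(regional_name(example))  # beaufort_r2000_286-311_slie.tif
```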

In [2]:
snappy.jprint("Executing pipeline Step 1 (fetch)\n")

slie_fps = list(src_p.rglob("*_slie.tif"))
n_src_fps = len(slie_fps)
# prefix each file name with its lowercase region (the grandparent directory name)
new_fps = [copy_dir / f"{x.parent.parent.name.lower()}_{x.name}" for x in slie_fps]

if COPY_SOURCE:
    for src, dst in zip(slie_fps, new_fps):
        shutil.copy(src, dst)

snappy.jprint(f"Source data consists of {n_src_fps} Geotiffs, for example:")
snappy.jprint(slie_fps[0])

n_copied_fps = len(list(copy_dir.rglob("*_slie.tif")))

# when COPY_SOURCE is False this relies on copies made by an earlier run
assert n_src_fps == n_copied_fps

snappy.jprint("Pipeline Step 1 complete\n")

Executing pipeline Step 1 (fetch)

Source data consists of 566 Geotiffs, for example:
/workspace/Data/Base_Data/Cryosphere/landfast_seaice/Beaufort/2000-01/r2000_286-311_slie.tif
Pipeline Step 1 complete



## 2 - Create Time Index

We'll write a few functions to parse file names and convert day-of-year (DOY) style dates to a more readable and DateTime friendly YYYY-MM-DD format. The file names are of the style `chukchi_r2007_326-350_slie.tif`. We'll construct a lookup where we can see all the date information associated with each file. We need to be aware that negative durations are lurking where we jump to the next calendar year.

Once those durations are fixed and all the date information is in place we can establish the start and end dates of the entire collection and create a DateTime index that encompasses all the data.
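
To make the rollover problem concrete, here is a minimal sketch that parses one file name and bumps the end year whenever the end DOY is smaller than the start DOY. It uses a single regular expression rather than the separate helpers defined in this notebook, and the function name `parse_slie_name` is hypothetical.

```python
import datetime
import re


def parse_slie_name(name):
    """Return (start, end) datetimes for a name like 'beaufort_r2000_349-003_slie.tif'."""
    match = re.search(r"r(\d{4})_(\d{3})[-_](\d{3})", name)
    if match is None:
        raise ValueError(f"unexpected file name: {name}")
    year, doy_start, doy_end = (int(g) for g in match.groups())
    start = datetime.datetime(year, 1, 1) + datetime.timedelta(days=doy_start - 1)
    # a smaller end DOY means the range rolled over into the next calendar year
    end_year = year + 1 if doy_end < doy_start else year
    end = datetime.datetime(end_year, 1, 1) + datetime.timedelta(days=doy_end - 1)
    return start, end


start_demo, end_demo = parse_slie_name("beaufort_r2000_349-003_slie.tif")
print(start_demo.date(), end_demo.date(), (end_demo - start_demo).days)
# 2000-12-14 2001-01-03 20
```

Comparing the two DOY values directly avoids computing a negative duration first, but it rests on the assumption that no single file spans more than one year.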

In [5]:
snappy.jprint("Executing pipeline Step 2 (time index)\n")

def create_regional_fp(fp):
    """Construct a new filename that contains the region info."""
    new_fp = dst_p / ''.join(fp.parent.parent.name.lower() + "_" + fp.name)
    return new_fp
    

def get_doy(fp):
    """Fetch the DOY range from a file name."""
    try:
        doy = re.match(r'.*([0-3][0-9][0-9]-[0-3][0-9][0-9])', fp).group(1)
    except AttributeError:
        # no hyphenated range found, so fall back to an underscore separator
        doy = re.match(r'.*([0-3][0-9][0-9]_[0-3][0-9][0-9])', fp).group(1)
        doy = doy.replace("_", "-")
    return doy


def get_re_year(fp):
    """Fetch a single year (YYYY) from a file name."""
    year = re.match(r'.*([1-3][0-9]{3})', fp).group(1)
    return int(year)


def split_doy(doy_range):
    """Split a DOY range string into an integer start and end."""
    doy_start, doy_end = doy_range.split("-")
    return int(doy_start), int(doy_end)    


def doy_date_to_YYYYMMDD(year, days):
    """Convert a DOY + year date to a datetime object."""
    dt = datetime.datetime(year, 1, 1) + datetime.timedelta(days - 1)
    return dt


di = {}
for fp in slie_fps:
    
    k = create_regional_fp(fp).name
    di[k] = {}
    di[k]["start_year"] = get_re_year(k)
    di[k]["doy_range"] = get_doy(k)
    di[k]["doy_start"], di[k]["doy_end"] = split_doy(di[k]["doy_range"])
    di[k]["dt_start"] = doy_date_to_YYYYMMDD(di[k]["start_year"], di[k]["doy_start"])
    di[k]["dt_end"] = doy_date_to_YYYYMMDD(di[k]["start_year"], di[k]["doy_end"])
    di[k]["dt_duration"] = di[k]["dt_end"] - di[k]["dt_start"]

snappy.jprint("Example of a lookup with an incorrect duration:\n")
snappy.jprint(di['beaufort_r2000_349-003_slie.tif'])

def set_end_year(di):
    """Adds an end year for each file and handles rollover cases."""
    for k in di.keys():
        if di[k]["dt_duration"].days < 0:
            di[k]["end_year"] = di[k]["start_year"] + 1
            di[k]["dt_end"] = doy_date_to_YYYYMMDD(di[k]["end_year"], di[k]["doy_end"])
            di[k]["dt_duration"] = di[k]["dt_end"] - di[k]["dt_start"]

        else:
            di[k]["end_year"] = di[k]["start_year"]


set_end_year(di)
snappy.jprint("\n")
snappy.jprint("Example of a lookup with a corrected duration:\n")
snappy.jprint(di['beaufort_r2000_349-003_slie.tif'])


def get_first_dt(di):
    """Get the earliest chronological start date from the lookup."""
    start_dts = []
    for k in di.keys():
        start_dts.append(di[k]["dt_start"])
    start_dt = sorted(start_dts)[0]
    return start_dt


def get_last_dt(di):
    """Get the latest chronological end date from the lookup."""
    end_dts = []
    for k in di.keys():
        end_dts.append(di[k]["dt_end"])
    last_dt = sorted(end_dts)[-1]
    return last_dt

end = get_last_dt(di)
start = get_first_dt(di)
dt_range = pd.date_range(start=start, end=end)
snappy.jprint(f"\nEarliest chronological date is {start}\n")
snappy.jprint(f"Latest chronological date is {end}\n")
snappy.jprint("Pipeline Step 2 complete\n")

Executing pipeline Step 2 (time index)

Example of a lookup with an incorrect duration:

{'start_year': 2000, 'doy_range': '349-003', 'doy_start': 349, 'doy_end': 3, 'dt_start': datetime.datetime(2000, 12, 14, 0, 0), 'dt_end': datetime.datetime(2000, 1, 3, 0, 0), 'dt_duration': datetime.timedelta(days=-346)}


Example of a lookup with a corrected duration:

{'start_year': 2000, 'doy_range': '349-003', 'doy_start': 349, 'doy_end': 3, 'dt_start': datetime.datetime(2000, 12, 14, 0, 0), 'dt_end': datetime.datetime(2001, 1, 3, 0, 0), 'dt_duration': datetime.timedelta(days=20), 'end_year': 2001}

Earliest chronological date is 1996-10-17 00:00:00

Latest chronological date is 2008-07-14 00:00:00

Pipeline Step 2 complete



## 3 - Map Existing Data to the Time Index

Because each file spans many days, a single calendar day can be represented by more than one raster. We need to map dates to the representative rasters by checking if the date is in the time range represented by the file. We'll want to know how many files match from the Beaufort region, how many from the Chukchi region, and how many match total. We also expect that some calendar days will have no matches at all (e.g., August, which is not in the seasonal ice cycle). We'll create a new lookup keyed by each date in the DateTime Index that stores the above information and the matching file names.

Some individual days are represented by as many as six rasters across both regions! We'll stash some pickle files to indicate which dates correspond to which cases described in the introduction.
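
The mapping itself is an interval-containment check: a file matches a date when the date falls between the file's start and end dates, inclusive. Below is a minimal sketch on a tiny made-up lookup; the file names, dates, and the function name `match_date` are all hypothetical.

```python
import datetime

# hypothetical miniature lookup: file name -> (start, end) dates
coverage = {
    "beaufort_a.tif": (datetime.date(2006, 3, 1), datetime.date(2006, 3, 25)),
    "beaufort_b.tif": (datetime.date(2006, 3, 20), datetime.date(2006, 4, 10)),
    "chukchi_a.tif": (datetime.date(2006, 3, 15), datetime.date(2006, 4, 5)),
}


def match_date(day, coverage):
    """List the files whose date range contains `day`, with per-region counts."""
    matches = sorted(k for k, (start, end) in coverage.items() if start <= day <= end)
    return {
        "matching data": matches,
        "beaufort_count": sum(k.startswith("beaufort") for k in matches),
        "chukchi_count": sum(k.startswith("chukchi") for k in matches),
    }


in_season = match_date(datetime.date(2006, 3, 23), coverage)
off_season = match_date(datetime.date(2006, 8, 1), coverage)
print(in_season)
print(off_season)
```

In this toy lookup the March date is covered by all three files, while the August date has no matches, which corresponds to Case A.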

In [6]:
def time_in_range(start, end, x):
    """Return true if x is in the range [start, end]"""
    if start <= end:
        return start <= x <= end
    else:
        return start <= x or x <= end

dt_di = {}

for dt in dt_range:
    dt_di[dt] = {}
    dt_di[dt]["matching data"] = []
    beaufort_count = 0
    chukchi_count = 0
    
    for k in di.keys():
        if time_in_range(di[k]["dt_start"], di[k]["dt_end"], dt):
            dt_di[dt]["matching data"].append(k)
            
            if "beaufort" in k.lower():
                beaufort_count += 1
            if "chukchi" in k.lower():
                chukchi_count += 1
            
    if len(dt_di[dt]["matching data"]) == 0:
        dt_di[dt]["matching data"].append("no data")
    
    dt_di[dt]["beaufort_count"] = beaufort_count
    dt_di[dt]["chukchi_count"] = chukchi_count
    dt_di[dt]["match_count"] = chukchi_count + beaufort_count
    
df = pd.DataFrame.from_dict(dt_di).T

snappy.jprint("\nSample of mapping data to a time stamp:\n")
snappy.jprint(df.sort_values("match_count", ascending=False).iloc[1])

conditions = [
    (df["match_count"] == 0),
    (df["match_count"] == 1),
    ((df["match_count"] > 1) & (df["chukchi_count"] * df["beaufort_count"] == 0)),
    ((df["match_count"] > 1) & (df["chukchi_count"] * df["beaufort_count"] != 0))]
choices = [1, 2, 3, 4] # these map to A, B, C, and D
df["merge_case"] = np.select(conditions, choices)

snappy.jprint("\nSummary of merge cases - most have data from both regions which is good.\n")
snappy.jprint(df["merge_case"].value_counts())
snappy.jprint("Writing pickles of each merge case to use in processing functions called by slurm.")

merge_case_b = df[df.merge_case == 2]
dt_arr_di_case_b = merge_case_b.T.to_dict()
with open(max_dir / "single_raster_single_region.pickle", "wb") as handle:
    pickle.dump(dt_arr_di_case_b, handle, protocol=pickle.HIGHEST_PROTOCOL)

merge_case_c = df[df.merge_case == 3]
dt_arr_di_case_c = merge_case_c.T.to_dict()
with open(max_dir / "many_raster_single_region.pickle", "wb") as handle:
    pickle.dump(dt_arr_di_case_c, handle, protocol=pickle.HIGHEST_PROTOCOL)

merge_case_d = df[df.merge_case == 4]
dt_arr_di_case_d = merge_case_d.T.to_dict()
with open(max_dir / "many_raster_both_regions.pickle", "wb") as handle:
    pickle.dump(dt_arr_di_case_d, handle, protocol=pickle.HIGHEST_PROTOCOL)

snappy.jprint("\nPipeline Step 3 complete\n")


Sample of mapping data to a time stamp:

matching data     [beaufort_r2006_058-084_slie.tif, beaufort_r20...
beaufort_count                                                    3
chukchi_count                                                     3
match_count                                                       6
Name: 2006-03-23 00:00:00, dtype: object

Summary of merge cases - most have data from both regions which is good.

4    2510
3     829
1     797
2     153
Name: merge_case, dtype: int64
Writing pickles of each merge case to use in processing functions called by slurm.

Pipeline Step 3 complete



## 4 - Constrain and Fix the Array Values, Spatial References, and Generate Masks

The next step is to constrain our array values to a known range. We know from our EDA work the value schema in the existing collection is as follows:

 - 0: No landfast sea ice is present for this pixel. This means either water, or sea-ice that is NOT landfast.
 - 255: landfast sea ice is present for this pixel
 - 128: A landmask.
 - 63, 64, other oddball values: NoData

These values can be reduced to a binary set where `0` indicates the absence of landfast sea ice and `1` the presence of it.

We'll define a function (see `fix_raster_values.py`) that forces this array value mapping and writes a new GeoTIFF.
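
As an illustration of the value mapping only, the sketch below maps `255` to `1` and every other value to `0`. The contents of `fix_raster_values.py` are not shown in this notebook, so the function name `to_binary_landfast` and the choice to collapse the landmask and NoData values to `0` are assumptions.

```python
import numpy as np


def to_binary_landfast(arr):
    """Map 255 (landfast ice) to 1 and every other value to 0."""
    return np.where(arr == 255, 1, 0).astype(np.uint8)


# small array containing each value class from the schema above
raw = np.array([[0, 255, 128], [63, 64, 255]], dtype=np.uint8)
fixed = to_binary_landfast(raw)
print(fixed)
```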

At this stage we can also create cohesive spatial references and raster creation profiles. Ultimately we want to create a merged raster as well, so we'll create three new rasters and raster profiles:
1. An empty mask (all zeros) for the Chukchi Region
2. An empty mask (all zeros) for the Beaufort Region
3. An empty mask (all zeros) for the combined Chukchi and Beaufort regions

Once we have established cohesive raster profiles, we can generate the function arguments that will be used to consolidate the array values. The fixed output GeoTIFFs will just have an "arrfix" prefix and be stashed in the `scratch/arrfix` directory.

In [7]:
snappy.jprint("Executing pipeline Step 4 (fix the array values and spatial references and generate masks)\n")

# create cohesive raster profiles
beauf_sample = new_fps[0]
chuk_sample = new_fps[-1]

# open both samples once and merge while the datasets are still open
with rio.open(beauf_sample) as beauf_src, rio.open(chuk_sample) as chuk_src:
    beauf_profile = beauf_src.profile
    chuk_profile = chuk_src.profile
    arr_merge, aff_merge = merge([beauf_src, chuk_src])

beauf_profile["crs"] = CRS.from_epsg(3338)
beauf_profile.update(compress="lzw")

chuk_profile["crs"] = CRS.from_epsg(3338)
chuk_profile.update(compress="lzw")

new_profile = chuk_profile.copy()
new_profile["height"], new_profile["width"] = arr_merge[0].shape
new_profile["transform"] = aff_merge
new_profile["crs"] = CRS.from_epsg(3338)
new_profile.update(compress="lzw")

with open(mask_dir / "beaufort_raster_profile.pickle", "wb") as handle:
    pickle.dump(beauf_profile, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open(mask_dir / "chukchi_raster_profile.pickle", "wb") as handle:
    pickle.dump(chuk_profile, handle, protocol=pickle.HIGHEST_PROTOCOL)
with open(mask_dir / "both_region_profile.pickle", "wb") as handle:
    pickle.dump(new_profile, handle, protocol=pickle.HIGHEST_PROTOCOL)

# generate the masks and write to disk
with rio.open(mask_dir / "both_region_mask.tif", 'w', **new_profile) as dst:
    dst.write(np.zeros(arr_merge[0].shape), 1)

with rio.open(mask_dir / "beaufort_mask.tif", 'w', **beauf_profile) as dst:
    dst.write(np.zeros((beauf_profile["height"], beauf_profile["width"])), 1)

with rio.open(mask_dir / "chukchi_mask.tif", 'w', **chuk_profile) as dst:
    dst.write(np.zeros((chuk_profile["height"], chuk_profile["width"])), 1)
        
# we want the mask paths for later
both_mask_fp = mask_dir / "both_region_mask.tif"
chuk_mask_fp = mask_dir / "chukchi_mask.tif"
beauf_mask_fp = mask_dir / "beaufort_mask.tif"

# nice to indicate when data are written to disk
mask_fps = list(mask_dir.glob('*'))
snappy.jprint(f"{len(mask_fps)} files written to {mask_dir}")

snappy.jprint("\nGenerating Slurm scripts to correct the raster values and write new GeoTIFFs.")

# write slurm scripts to fix the raster values of an individual file
project_dir = os.getcwd()

chuk_fps = list(copy_dir.glob("chukchi*"))
chuk_pkl = mask_dir / "chukchi_raster_profile.pickle" 
beauf_fps = list(copy_dir.glob("beaufort*"))
beauf_pkl = mask_dir / "beaufort_raster_profile.pickle" 


chuk_slurm_fps = [
    slurm.write_sbatch(
        NCORES, SLURM_MAIL, slurm_dir, AC_CONDA_ENV, fp, chuk_pkl, project_dir, arrfix_dir
    ) 
    for fp in chuk_fps
]

snappy.jprint(f"\n{len(chuk_slurm_fps)} slurm scripts for the Chukchi written to {slurm_dir}")
snappy.jprint("\nExecuting Chukchi slurm scripts with sbatch...")

# Call the slurm scripts with the `sbatch` command
_ = [os.system(f"sbatch {fp}") for fp in chuk_slurm_fps]

beaufort_slurm_fps = [
    slurm.write_sbatch(
        NCORES, SLURM_MAIL, slurm_dir, AC_CONDA_ENV, fp, beauf_pkl, project_dir, arrfix_dir
    )
    for fp in beauf_fps
]

# report the Beaufort count (not the Chukchi count) once the scripts exist
snappy.jprint(f"\n{len(beaufort_slurm_fps)} slurm scripts for the Beaufort written to {slurm_dir}")
snappy.jprint("\nExecuting Beaufort slurm scripts with sbatch...\n")

_ = [os.system(f"sbatch {fp}") for fp in beaufort_slurm_fps]

snappy.jprint("Pipeline Step 4 complete\n")

Executing pipeline Step 4 (fix the array values and spatial references and generate masks)

6 files written to /atlas_scratch/cparr4/landfast_seaice/mahoney_preprocess/scratch/mask

Generating Slurm scripts to correct the raster values and write new GeoTIFFs.

316 slurm scripts for the Chukchi written to /atlas_scratch/cparr4/landfast_seaice/mahoney_preprocess/scratch/slurm

Executing Chukchi slurm scripts with sbatch...
Submitted batch job 3893967
Submitted batch job 3893968
Submitted batch job 3893969
Submitted batch job 3893970
Submitted batch job 3893971
Submitted batch job 3893972
Submitted batch job 3893973
Submitted batch job 3893974
Submitted batch job 3893975
Submitted batch job 3893976
Submitted batch job 3893977
Submitted batch job 3893978
Submitted batch job 3893979
Submitted batch job 3893980
Submitted batch job 3893981
Submitted batch job 3893982
Submitted batch job 3893983
Submitted batch job 3893984
Submitted batch job 3893985
Submitted batch job 3893986

In [8]:
n_fixed_fps = len([x for x in arrfix_dir.rglob("*.tif")])
assert(n_copied_fps == n_fixed_fps)

## 5 - Create Rasters That Represent All Available Data for a Given Date

This is where we handle cases B, C, and D that we described earlier. The code that executes the logic of these cases (also described earlier) lives in the Python scripts that are submitted to Slurm.
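
For Cases C and D, the stacking and maximum filtering can be sketched with NumPy as follows. This is a toy example on 2x2 arrays under the assumption that the inputs are already binary; it is not the code from the Slurm-submitted scripts, which is not shown here.

```python
import numpy as np

# three overlapping binary rasters for the same region and date (made-up values)
a = np.array([[0, 1], [0, 0]], dtype=np.uint8)
b = np.array([[0, 0], [1, 0]], dtype=np.uint8)
c = np.array([[0, 1], [0, 0]], dtype=np.uint8)

stack = np.stack([a, b, c])  # shape (3, 2, 2): (raster, row, column)
max_filtered = stack.max(axis=0)  # per-pixel maximum across the rasters
print(max_filtered)
```

With binary inputs, the per-pixel maximum marks a pixel as landfast ice if any of the overlapping rasters does.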

In [9]:
snappy.jprint("Executing pipeline Step 5 (Create merged rasters)\n")

case_b_pkl = Path(max_dir / "single_raster_single_region.pickle")
case_b_slurm_dts = [
    slurm.write_case_b(
        NCORES, SLURM_MAIL, slurm_dir, AC_CONDA_ENV, case_b_pkl, k, project_dir, mask_dir, arrfix_dir, dst_p
    ) 
    for k in dt_arr_di_case_b.keys()
]
_ = [os.system(f"sbatch {x}") for x in case_b_slurm_dts]

case_c_pkl = Path(max_dir / "many_raster_single_region.pickle")
case_c_slurm_dts = [
    slurm.write_case_c(
        NCORES, SLURM_MAIL, slurm_dir, AC_CONDA_ENV, case_c_pkl, k, project_dir, mask_dir, max_dir, arrfix_dir, dst_p
    ) 
    for k in dt_arr_di_case_c.keys()
]
_ = [os.system(f"sbatch {x}") for x in case_c_slurm_dts]

case_d_pkl = Path(max_dir / "many_raster_both_regions.pickle")
case_d_slurm_dts = [
    slurm.write_case_d(
        NCORES, SLURM_MAIL, slurm_dir, AC_CONDA_ENV, case_d_pkl, k, project_dir, mask_dir, max_dir, arrfix_dir, dst_p
    ) 
    for k in dt_arr_di_case_d.keys()
]
_ = [os.system(f"sbatch {x}") for x in case_d_slurm_dts]

snappy.jprint("Pipeline Step 5 complete\n")

Executing pipeline Step 5 (Create merged rasters)

Submitted batch job 3894533
Submitted batch job 3894534
Submitted batch job 3894535
Submitted batch job 3894536
Submitted batch job 3894537
Submitted batch job 3894538
Submitted batch job 3894539
Submitted batch job 3894540
Submitted batch job 3894541
Submitted batch job 3894542
Submitted batch job 3894543
Submitted batch job 3894544
Submitted batch job 3894545
Submitted batch job 3894546
Submitted batch job 3894547
Submitted batch job 3894548
Submitted batch job 3894549
Submitted batch job 3894550
Submitted batch job 3894551
Submitted batch job 3894552
Submitted batch job 3894553
Submitted batch job 3894554
Submitted batch job 3894555
Submitted batch job 3894556
Submitted batch job 3894557
Submitted batch job 3894558
Submitted batch job 3894559
Submitted batch job 3894560
Submitted batch job 3894561
Submitted batch job 3894562
Submitted batch job 3894563
Submitted batch job 3894564
Submitted batch job 3894565

In [11]:
n_expected_outputs = len(dt_arr_di_case_d.keys()) + len(dt_arr_di_case_c.keys()) + len(dt_arr_di_case_b.keys())
n_output_fps = len([x for x in dst_p.rglob("*.tif")])

assert(n_expected_outputs == n_output_fps)

# Pipeline Complete!