# Prepare ALFRESCO vegetation and flammability datasets for hosting

This notebook collects the spatial information from the ALFRESCO vegetation and flammability datasets needed for a standard metadata / GeoNetwork entry, and creates `.zip` archives of the datasets for distribution.

In [5]:
import os
import subprocess
from pathlib import Path
import rasterio as rio
from rasterio.crs import CRS
from rasterio.warp import transform_bounds
import numpy as np

data_dir = Path("/workspace/Shared/Tech_Projects/Alaska_IEM/project_data/NCR_ALFRESCO_datasets/")
out_dir = Path("/workspace/Shared/Tech_Projects/Alaska_IEM/final_products/NCR_ALFRESCO_zips")

## Spatial info: extent in WGS84, resolution, dimensions

Get the spatial extent of each file in WGS84, and use this as an opportunity to ensure that all files share the same extent.

Use a [function](https://github.com/ua-snap/snap-geo/blob/e65e2d9aee0a1a0ea8c3432b3d01807476316206/antimeridian_raster_bbox.ipynb) to get the WGS84 extent when the west side crosses the antimeridian. It is adapted here to also return the resolution and the spatial dimension sizes.

In [6]:
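def get_wgs84_extent(gtiff_fp):
    """Return WGS84 bounds (W, S, E, N), pixel sizes, and raster dimensions of a GeoTIFF.

    Bounds are computed in a WGS84 CRS whose prime meridian is moved to 180
    so that an extent crossing the antimeridian stays contiguous.
    """
    with rio.open(gtiff_fp) as src:
        src_crs = src.crs
        src_bounds = src.bounds
        # affine transform: element 0 is pixel width, element 4 is (negative) pixel height
        x_res = src.transform[0]
        y_res = -src.transform[4]
        width, height = src.width, src.height
    # WGS84 definition with the prime meridian shifted from Greenwich to 180 degrees
    dst_crs = CRS.from_wkt(
        CRS.from_epsg(4326).to_wkt().replace('PRIMEM["Greenwich",0', 'PRIMEM["Greenwich",180')
    )
    bounds = transform_bounds(src_crs, dst_crs, *src_bounds)
    # undo the 180-degree shift; a west bound across the antimeridian comes out below -180
    new_bounds = np.round((bounds[0] - 180, bounds[1], bounds[2] - 180, bounds[3]), 4)

    return new_bounds, x_res, y_res, width, height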

Iterate over all files in the dataset and extract the spatial info, appending to lists for verification of uniformity:

In [25]:
%%time
alf_names = ["alfresco_vegetation_type_percentage", "alfresco_vegetation_mode_statistic", "alfresco_relative_flammability_30yr"]
all_bounds = []
x_sizes, y_sizes = [], []
widths, heights = [], []
for alf_name in alf_names:
    fps = list(data_dir.joinpath(alf_name).glob("*.tif"))
    out = [get_wgs84_extent(fp) for fp in fps]
    all_bounds.extend([o[0] for o in out])
    x_sizes.extend([o[1] for o in out])
    y_sizes.extend([o[2] for o in out])
    widths.extend([o[3] for o in out])
    heights.extend([o[4] for o in out])

CPU times: user 5.32 s, sys: 276 ms, total: 5.59 s
Wall time: 6.42 s


If the cell below runs without error, then all files have the same extent:

In [26]:
assert np.all([all_bounds[0] == bnds for bnds in all_bounds])

Print those bounds for inclusion in the metadata file:

In [27]:
print("WSEN bounds:", all_bounds[0])

WSEN bounds: [-197.3058   50.3484 -107.1439   72.932 ]
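
The west bound printed above is below -180 because the extent crosses the antimeridian and the function keeps it contiguous with the east bound. If a metadata form expects longitudes in the conventional [-180, 180) range, the values could be wrapped as in the minimal sketch below. `wrap_longitude` is a hypothetical helper introduced here for illustration; it is not part of this notebook or of the linked function.

```python
def wrap_longitude(lon):
    """Wrap a longitude in degrees into the half-open interval [-180, 180)."""
    return ((lon + 180.0) % 360.0) - 180.0


# WSEN bounds as printed above
west, south, east, north = -197.3058, 50.3484, -107.1439, 72.932

wrapped_west = round(wrap_longitude(west), 4)
wrapped_east = round(wrap_longitude(east), 4)

# After wrapping, west > east signals that the box crosses the antimeridian
crosses_antimeridian = wrapped_west > wrapped_east
print(wrapped_west, wrapped_east, crosses_antimeridian)
```

Wrapping places the west edge in the eastern hemisphere (roughly 162.69°E), so any consumer of the wrapped values has to treat `west > east` as a box that spans the antimeridian rather than as an error.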


Likewise, confirm that all files have the same X and Y pixel sizes (resolution), and print those values:

In [28]:
assert np.all([x_sizes[0] == res for res in x_sizes])
assert np.all([y_sizes[0] == res for res in y_sizes])
print("X resolution:", x_sizes[0])
print("Y resolution:", y_sizes[0])

X resolution: 1000.0
Y resolution: 1000.0


Confirm that all files have the same shape (width and height in pixels), and print those dimensions:

In [29]:
assert np.all([widths[0] == w for w in widths])
assert np.all([heights[0] == h for h in heights])
print("X dimension size:", widths[0])
print("Y dimension size:", heights[0])

X dimension size: 3650
Y dimension size: 2100
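
The bare `assert` statements above only report that something differs, not which file. As a rough sketch, a small helper could name the first mismatching item. `assert_uniform` is a hypothetical function, not part of the original notebook, and the values in the usage lines are illustrative stand-ins for `all_bounds`, `x_sizes`, and so on.

```python
def _as_key(value):
    """Turn array-like values into tuples so they compare as a whole."""
    try:
        return tuple(value)
    except TypeError:  # plain scalars such as int or float
        return value


def assert_uniform(values, label):
    """Return the common value, or raise AssertionError naming the first mismatch."""
    first = _as_key(values[0])
    for i, value in enumerate(values[1:], start=1):
        current = _as_key(value)
        if current != first:
            raise AssertionError(f"{label}: item {i} is {current}, expected {first}")
    return first


# Illustrative values only
print("X resolution:", assert_uniform([1000.0, 1000.0, 1000.0], "X resolution"))
print("Shape:", assert_uniform([(3650, 2100), (3650, 2100)], "shape"))
```

Keeping the file paths in a list parallel to the values would let the reported index be mapped back to the offending file.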


## Zip files for distribution

Here we will zip the files for distribution, creating one archive per ALFRESCO dataset (each name in `alf_names`).

In [31]:
for alf_name in alf_names:
    # pass arguments as a list so paths are not re-parsed by a shell
    command = ["bash", "./zipit.sh", str(data_dir), alf_name, str(out_dir)]
    output = subprocess.check_output(command)
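
The contents of `zipit.sh` are not shown in this notebook. As a hypothetical pure-Python stand-in, assuming the script simply archives each dataset folder into `<out_dir>/<alf_name>.zip`, the standard library's `shutil.make_archive` could be used. `zip_dataset` is an illustrative name, and the archive layout it produces is an assumption that may differ from what the script does.

```python
import shutil
from pathlib import Path


def zip_dataset(data_dir, alf_name, out_dir):
    """Archive data_dir/alf_name into out_dir/alf_name.zip and return the zip path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    archive = shutil.make_archive(
        base_name=str(out_dir / alf_name),  # ".zip" is appended automatically
        format="zip",
        root_dir=str(data_dir),  # paths inside the archive are relative to this folder
        base_dir=alf_name,  # only this subfolder is archived
    )
    return Path(archive)


# Usage (assumes the notebook's data_dir, out_dir and alf_names):
# for alf_name in alf_names:
#     zip_dataset(data_dir, alf_name, out_dir)
```

With `base_dir` set, the files unpack into a folder named after the dataset rather than into the current directory.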

Did we zip them all?

In [32]:
# did we zip 'em all?
zips = list(out_dir.glob("*.zip"))
zips

[PosixPath('/workspace/Shared/Tech_Projects/Alaska_IEM/final_products/NCR_ALFRESCO_zips/alfresco_vegetation_type_percentage.zip'),
 PosixPath('/workspace/Shared/Tech_Projects/Alaska_IEM/final_products/NCR_ALFRESCO_zips/alfresco_vegetation_mode_statistic.zip'),
 PosixPath('/workspace/Shared/Tech_Projects/Alaska_IEM/final_products/NCR_ALFRESCO_zips/alfresco_relative_flammability_30yr.zip')]
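
Rather than checking the listing by eye, the archive names could be compared against `alf_names` programmatically. The sketch below assumes each archive is named `<alf_name>.zip`, as in the output above. `missing_zips` is a hypothetical helper, not part of the original notebook.

```python
from pathlib import Path


def missing_zips(out_dir, expected_names):
    """Return the expected dataset names that have no matching .zip in out_dir."""
    found = {p.stem for p in Path(out_dir).glob("*.zip")}
    return sorted(set(expected_names) - found)


# Usage (assumes the notebook's out_dir and alf_names):
# assert not missing_zips(out_dir, alf_names), "some datasets were not zipped"
```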

Looks like it. Make the destination directories on Poseidon and copy these files there (`/workspace/CKAN` is not available on the compute node):

```
cp /workspace/Shared/Tech_Projects/Alaska_IEM/final_products/NCR_ALFRESCO_zips/alfresco_vegetation*.zip /workspace/CKAN/CKAN_Data/IEM/Outputs/ALF/Gen_1a/alfresco_relative_spatial_outputs/vegetation_type
cp /workspace/Shared/Tech_Projects/Alaska_IEM/final_products/NCR_ALFRESCO_zips/alfresco_relative_flammability_30yr.zip /workspace/CKAN/CKAN_Data/IEM/Outputs/ALF/Gen_1a/alfresco_relative_spatial_outputs/relative_flammability/AR5_CMIP5/
```
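
The commands above assume the destination directories already exist. For a scripted version of the same step, a minimal sketch could create the directory before copying. `copy_zips` is a hypothetical helper, and it would have to run on a machine where the destination path is mounted.

```python
import shutil
from pathlib import Path


def copy_zips(zip_paths, dest_dir):
    """Create dest_dir if needed, copy each zip into it, and return the new paths."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves file timestamps along with the contents
    return [Path(shutil.copy2(src, dest_dir)) for src in zip_paths]
```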