# GeoTIFFs to reV HDF5 Files

In the previous tutorial, we demonstrated how to use `reVX`'s `Geotiff` handler to manage geotiff files.

In this tutorial, we will go over getting tiff files into a [reV](https://github.com/NREL/reV)-ready format using the `LayeredH5` handler.

Let's start with a few common imports:

In [1]:
import urllib.request
from pathlib import Path
from multiprocessing.pool import ThreadPool

from reVX.handlers.layered_h5 import LayeredH5
from reVX.handlers.geotiff import Geotiff

from sl_utils import download_tiff_file

In [2]:
# uncomment below if you want to see the contents of `download_tiff_file`
# %load sl_utils

## Downloading the data

Before we dive into the code, we first have to download a few sample TIFF files from
[Siting Lab](https://data.openei.org/submissions/6119) to use as examples of adding data to a layered HDF5 file.
In particular, we will be using data from {cite:t}`oedi_6121`.

If you have already downloaded the data, you can skip this step (just make sure the path variables below are set correctly).
We'll start by defining the local file path destinations:

In [4]:
AIRPORT_HELIPORT_SETBACKS = "airport_heliport_setbacks.tif"
NEXRAD_GREEN_LOS = "NEXRAD_green_los.tif"
SETBACKS_PIPELINE_REFERENCE = "setbacks_pipeline_reference.tif"
SETBACKS_STRUCTURE_115HH_170RD = "setbacks_structure_115hh_170rd.tif"
SETBACKS_STRUCTURE_REFERENCE = "setbacks_structure_reference.tif"

Let's also define the URL for each of these files:

In [5]:
FILE_URLS = {
    AIRPORT_HELIPORT_SETBACKS: "https://data.openei.org/files/6120/airport_heliport_setbacks.tif",
    NEXRAD_GREEN_LOS: "https://data.openei.org/files/6121/nexrad_4km.tif",
    SETBACKS_PIPELINE_REFERENCE: "https://data.openei.org/files/6125/setbacks_pipeline_115hh_170rd_extrapolated.tif",
    SETBACKS_STRUCTURE_115HH_170RD: "https://data.openei.org/files/6132/setbacks_structure_115hh_170rd_extrapolated.tif",
    SETBACKS_STRUCTURE_REFERENCE: "https://data.openei.org/files/6132/setbacks_structure_115hh_170rd.tif",
}

Next, we can use a Siting Lab utility function to download the data. This function uses `urllib` (which is part of the Python standard library) under the hood.

<div class="alert alert-block alert-warning">
<b>Note:</b> The source TIFF files are large (90 m resolution for all of CONUS), so we specify <code class="python">crop=True</code> to crop the data immediately after downloading it, which makes it easier to work with. If you have a machine with sufficiently large memory (32GB+), or you are downloading the file in order to use it for analysis purposes, you should set <code class="python">crop=False</code>.
</div>

In [6]:
def download(local_filepath):
    url = FILE_URLS[local_filepath]
    download_tiff_file(url, local_filepath, crop=True)


with ThreadPool(len(FILE_URLS)) as p:
    p.map(download, FILE_URLS)

'airport_heliport_setbacks.tif' already exists!
Downloaded 'NEXRAD_green_los.tif'!
Downloaded 'setbacks_structure_reference.tif'!
Downloaded 'setbacks_pipeline_reference.tif'!
Downloaded 'setbacks_structure_115hh_170rd.tif'!
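
For readers curious about the pattern behind the output above, here is a minimal sketch of a "skip if the file already exists" download helper built on `urllib`. This is not the actual `download_tiff_file` implementation: `download_if_missing`, its `fetch` parameter, and `fake_fetch` are hypothetical names introduced for illustration, the URLs are placeholders, and cropping is left out entirely. The stand-in fetcher lets the sketch run without network access.

```python
import tempfile
import urllib.request
from multiprocessing.pool import ThreadPool
from pathlib import Path


def download_if_missing(url, local_filepath, fetch=urllib.request.urlretrieve):
    """Hypothetical helper: fetch `url` unless `local_filepath` already exists.

    `fetch` is injectable so the logic can be exercised without a network.
    Returns True if a download happened and False if it was skipped.
    """
    local_filepath = Path(local_filepath)
    if local_filepath.exists():
        print(f"{local_filepath.name!r} already exists!")
        return False
    fetch(url, local_filepath)
    print(f"Downloaded {local_filepath.name!r}!")
    return True


def fake_fetch(url, dest):
    """Stand-in for a real download: writes a few placeholder bytes."""
    Path(dest).write_bytes(b"placeholder")


def run(job):
    """Unpack a (url, destination) pair for use with `ThreadPool.map`."""
    url, dest = job
    return download_if_missing(url, dest, fetch=fake_fetch)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Placeholder URLs (never contacted, because `fake_fetch` is used)
    jobs = [
        (f"https://example.invalid/{name}", Path(tmp_dir) / name)
        for name in ("a.tif", "b.tif")
    ]
    with ThreadPool(len(jobs)) as pool:
        first_pass = pool.map(run, jobs)   # files are missing: both are fetched
        second_pass = pool.map(run, jobs)  # files now exist: both are skipped
```

Dropping the `fetch=fake_fetch` argument would fall back to `urllib.request.urlretrieve` and perform a real download.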


## Creating the layered HDF5 from TIFF

First, we will initialize the `LayeredH5` object.

If the HDF5 file does not exist yet, we create it with the `.create_new()` method.

When creating a new HDF5 file, a template filepath must be specified. The template file defines the properties of the HDF5 file, including:
1. The profile information
2. The coordinate reference system and projection
3. The geographic extent and spatial resolution

All other files that are subsequently added to the HDF5 file will be transformed/adjusted to fit the properties of the template file before being written to the file.
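
To make "fitting the template" more concrete, the sketch below checks whether a raster profile already describes the same grid as a template profile. It is an illustration only and not how `LayeredH5` decides what to adjust: `matches_template`, the `atol` tolerance, and all profile values are assumptions made up for this example, and the CRS is compared as a plain string for simplicity.

```python
import math


def matches_template(profile, template, atol=0.01):
    """Return True if `profile` describes the same grid as `template`.

    Illustrative check only: CRS values are compared as plain strings,
    which is stricter than a real CRS comparison would be.
    """
    same_shape = (profile["height"], profile["width"]) == (
        template["height"],
        template["width"],
    )
    same_crs = str(profile["crs"]) == str(template["crs"])
    same_transform = len(profile["transform"]) == len(
        template["transform"]
    ) and all(
        math.isclose(value, expected, rel_tol=0.0, abs_tol=atol)
        for value, expected in zip(profile["transform"], template["transform"])
    )
    return same_shape and same_crs and same_transform


# Made-up profiles for illustration
template = {
    "crs": "EPSG:5070",
    "width": 2000,
    "height": 2000,
    "transform": (90.0, 0.0, 1829980.0, 0.0, -90.0, 2297068.0),
}
aligned = dict(template)
# Same resolution, but the origin is shifted by half a pixel in x
shifted = dict(template, transform=(90.0, 0.0, 1829935.0, 0.0, -90.0, 2297068.0))
# Same area covered at half the resolution
coarser = dict(
    template,
    width=1000,
    height=1000,
    transform=(180.0, 0.0, 1829980.0, 0.0, -180.0, 2297068.0),
)

print(matches_template(aligned, template))
print(matches_template(shifted, template))
print(matches_template(coarser, template))
```

Only the `aligned` profile matches; the other two describe grids that would need to be adjusted before their pixels line up with the template.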

In [7]:
H5_PATH = "example.h5"

# Initialize layered h5 object
h5 = LayeredH5(H5_PATH, template_file=NEXRAD_GREEN_LOS)

# If file doesn't exist, create new h5
h5.create_new()

Inspecting the h5 file (using the `layers` property), we see that it initially contains two layers: the latitude and longitude arrays. These hold the coordinates of each grid cell defined by the template file pixels.

In [8]:
# Use the `layers` property to see the layers in the H5 file
h5.layers

['latitude', 'longitude']

Metadata about the h5 file can be retrieved using the `.profile` and `.shape` attributes.

In [9]:
print(f"H5 profile: {h5.profile}")
print(f"shape: {h5.shape}")

H5 profile: {'driver': 'GTiff', 'dtype': 'uint8', 'nodata': 255.0, 'width': 2000, 'height': 2000, 'count': 1, 'crs': '+init=epsg:5070', 'transform': (90.0, 0.0, 1829980.2632930684, 0.0, -90.0, 2297068.2309463923), 'blockxsize': 256, 'blockysize': 256, 'tiled': True, 'compress': 'lzma', 'interleave': 'band'}
shape: (2000, 2000)
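
The `transform` entry is easier to read once it is turned into a geographic extent. The sketch below assumes the six values are affine coefficients in the order `(a, b, c, d, e, f)`, so that `x = a * col + b * row + c` and `y = d * col + e * row + f`. `grid_extent` is a hypothetical helper written for this example, not part of `reVX`.

```python
def grid_extent(transform, width, height):
    """Return (x_min, y_min, x_max, y_max) of a raster grid.

    Assumes `transform` holds the affine coefficients (a, b, c, d, e, f),
    where x = a * col + b * row + c and y = d * col + e * row + f.
    """
    a, b, c, d, e, f = transform
    # Evaluate the transform at the four outer corners of the grid
    corners = [(col, row) for col in (0, width) for row in (0, height)]
    xs = [a * col + b * row + c for col, row in corners]
    ys = [d * col + e * row + f for col, row in corners]
    return min(xs), min(ys), max(xs), max(ys)


# Values copied from the profile printed above
transform = (90.0, 0.0, 1829980.2632930684, 0.0, -90.0, 2297068.2309463923)
x_min, y_min, x_max, y_max = grid_extent(transform, width=2000, height=2000)

print(f"x range: {x_min:.2f} to {x_max:.2f}")
print(f"y range: {y_min:.2f} to {y_max:.2f}")
print(f"size: {x_max - x_min:.0f} x {y_max - y_min:.0f}")
```

Under that assumption, the cropped template covers 2000 pixels of 90 m each in both directions, i.e. a 180 km by 180 km square whose upper-left corner sits at the `c` and `f` coefficients of the transform.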


Once the H5 file is created (or if it already exists), we can write numpy arrays and tiff files into it using the `.write_layer_to_h5()` and `.write_geotiff_to_h5()` methods, respectively.

In [10]:
# Adding a numpy array

# Let's read a geotiff file into a numpy array using the Geotiff handler
with Geotiff(NEXRAD_GREEN_LOS) as geo:
    h5.write_layer_to_h5(
        values=geo.values,
        layer_name="nexrad_green_los",
        profile=geo.profile,
        description="NEXRAD Line of sight"
    )

In [11]:
# adding a geotiff file directly
h5.write_geotiff_to_h5(
    geotiff=AIRPORT_HELIPORT_SETBACKS,
    layer_name="airport_heliport_setbacks",
    description="Setbacks from airports and heliports",
    replace=False
)



Now we can check which layers are currently in the H5 file.

In [12]:
# Checking current layers in the h5
h5.layers

['airport_heliport_setbacks', 'latitude', 'longitude', 'nexrad_green_los']

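We can also add multiple geotiffs to the h5 at once using the `.layers_to_h5()` method.

This accepts either a list of geotiff filepaths or a dictionary mapping layer names to geotiff filepaths. When a list is passed, as below, each layer is named after its file (without the extension). You can also pass a dictionary mapping layer names to descriptions for the `description` argument.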

In [13]:
file_list = [
    SETBACKS_PIPELINE_REFERENCE,
    SETBACKS_STRUCTURE_115HH_170RD,
    SETBACKS_STRUCTURE_REFERENCE,
]

print(f"Adding {len(file_list)} file(s) to the h5...")
for fn in file_list:
    print(fn.split(".")[0])

h5.layers_to_h5(
    layers=file_list,
    replace=False
)

Adding 3 file(s) to the h5...
setbacks_pipeline_reference
setbacks_structure_115hh_170rd
setbacks_structure_reference


In [14]:
# Checking current layers in the h5
h5.layers

['airport_heliport_setbacks',
 'latitude',
 'longitude',
 'nexrad_green_los',
 'setbacks_pipeline_reference',
 'setbacks_structure_115hh_170rd',
 'setbacks_structure_reference']
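
If you would rather control the layer names yourself, the dictionary form mentioned above can be built from the same file list. The sketch below derives each name from the file stem, which reproduces the names shown in the output; the description text is a made-up placeholder, and the final call is left commented out because it needs the `h5` object from the earlier cells.

```python
from pathlib import Path

file_list = [
    "setbacks_pipeline_reference.tif",
    "setbacks_structure_115hh_170rd.tif",
    "setbacks_structure_reference.tif",
]

# Map each layer name (the file name without its extension) to its filepath
layers = {Path(filepath).stem: filepath for filepath in file_list}

# Placeholder descriptions, keyed by the same layer names
descriptions = {
    name: f"Layer loaded from {filepath}" for name, filepath in layers.items()
}

for name, filepath in layers.items():
    print(f"{name} <- {filepath}")

# The mapping can then be passed in place of the list:
# h5.layers_to_h5(layers=layers, replace=False)
```

Editing the keys of `layers` is then enough to store a file under a different layer name.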

Layers in the h5 file can also be extracted as geotiffs: use `.layer_to_geotiff()` for a single layer and `.extract_layers()` for multiple layers.

All the layers in the h5 can be extracted using `.extract_all_layers()` by passing an output directory as an argument.

In [15]:
# Extracting a single layer
layer = "airport_heliport_setbacks"
outpath = "airport_heliport_setbacks_h5_extract.tif"
h5.layer_to_geotiff(layer=layer,
                    geotiff=outpath)


# Extracting multiple layers
layers = {
    "nexrad_green_los": "nexrad_green_los_h5_extract.tif",
    "setbacks_pipeline_reference": "setbacks_pipeline_reference_h5_extract.tif"
}
h5.extract_layers(layers)
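
When many layers need to be extracted, writing the layer-to-file mapping by hand gets tedious. A small sketch of generating it is shown below; `extraction_targets` and the `outputs` directory are hypothetical, and the `_h5_extract.tif` suffix simply mirrors the file names used above.

```python
from pathlib import Path


def extraction_targets(layer_names, out_dir, suffix="_h5_extract.tif"):
    """Map each layer name to an output geotiff path inside `out_dir`."""
    out_dir = Path(out_dir)
    return {
        name: (out_dir / f"{name}{suffix}").as_posix() for name in layer_names
    }


targets = extraction_targets(
    ["nexrad_green_los", "setbacks_pipeline_reference"], out_dir="outputs"
)
for layer_name, geotiff_path in targets.items():
    print(f"{layer_name} -> {geotiff_path}")

# The mapping has the same shape as the `layers` dictionary above:
# h5.extract_layers(targets)
```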

### Using the command line to add and extract layers from h5 files

Alternatively, the command line can be used to add and extract layers from the h5 file.

**1. Adding tiffs to the h5** 

First, we need to construct a JSON config file that maps layer names to geotiff filepaths.
This JSON configuration file can optionally contain layer descriptions.


`layers.json`
```json
{
    "layers":
        {
            "nexrad_green_los": "../data/nexrad_green_los.tif",
            "ops_water": "../data/ops_water.tif",
            "setbacks_transmission_reference": "../data/setbacks_transmission_reference.tif"
        }
}
```

Then we run `$ reVX exclusions layers-to-h5 -h5 "../data/example.h5" --layers layers.json` on the command line.
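
The config file does not have to be written by hand. As a rough sketch, it can be generated with Python's built-in `json` module. `write_layers_config` is a hypothetical helper, and the optional layer descriptions are left out because their key is not shown in this tutorial. The example writes to a temporary directory so that running it leaves no files behind.

```python
import json
import tempfile
from pathlib import Path


def write_layers_config(layers, config_path):
    """Write a {"layers": {...}} JSON config and return the config dict."""
    config = {"layers": dict(layers)}
    Path(config_path).write_text(json.dumps(config, indent=4))
    return config


layers = {
    "nexrad_green_los": "../data/nexrad_green_los.tif",
    "ops_water": "../data/ops_water.tif",
    "setbacks_transmission_reference": "../data/setbacks_transmission_reference.tif",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "layers.json"
    config = write_layers_config(layers, config_path)
    # Read the file back to confirm it contains what we expect
    round_trip = json.loads(config_path.read_text())

print(json.dumps(round_trip, indent=4))
```

In practice you would pass a path such as `layers.json` next to your data instead of a temporary one.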

**2. Extracting layers from h5**

To extract layers from the h5 file, we pass the list of layers to extract and an output directory as arguments.

Example: `$ reVX exclusions layers-from-h5 -h5 "../data/example.h5" -l nexrad_green_los ops_water setbacks_transmission_reference -o "./outputs"`
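
If the list of layers lives in Python, the command string can be assembled rather than typed out. The sketch below only builds and prints the command, using nothing beyond the flags shown in the example above; it does not run `reVX`. `layers_from_h5_command` is a hypothetical helper.

```python
import shlex


def layers_from_h5_command(h5_path, layer_names, out_dir):
    """Build the extraction command as a single shell-safe string."""
    args = [
        "reVX", "exclusions", "layers-from-h5",
        "-h5", h5_path,
        "-l", *layer_names,
        "-o", out_dir,
    ]
    # shlex.join quotes any argument that needs it (e.g. paths with spaces)
    return shlex.join(args)


command = layers_from_h5_command(
    "../data/example.h5",
    ["nexrad_green_los", "ops_water", "setbacks_transmission_reference"],
    "./outputs",
)
print(command)
```

The resulting string can be pasted into a terminal or, once split back into arguments with `shlex.split`, handed to `subprocess.run`.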