This notebook contains the code that prepares the partial data example. We read the hazard files from the full_data_example, clip them to a small area, and write out the clipped files for use as inputs in the partial_data_example.

# Configure

In [1]:
%load_ext autoreload
%autoreload 2

In [46]:
import os
from pathlib import Path
import sys
import glob
from os.path import join
os.environ['USE_PYGEOS'] = '0'
import yaml
from yaml.loader import SafeLoader
import geopandas as gpd
import pandas as pd
import numpy as np
import rasterio 
import rasterio.mask
from shapely.geometry import Polygon
import zipfile

from unsafe.files import *


# Get clipped depths and write out

We use the config file from the full_data_example to read the hazard data and to write out the clipped hazard data, which is later zipped.

In [8]:
ABS_DIR = Path().absolute().parents[0]

CONFIG_FILEP = join(ABS_DIR, 'config', 'config.yaml')
with open(CONFIG_FILEP) as f:
    CONFIG = yaml.load(f, Loader=SafeLoader)

FR = join(ABS_DIR, "data", "raw")
FE = join(FR, "external")
HAZ_DIR_R = join(FE, "haz")
UNZIP_DIR = join(FR, "unzipped")
HAZ_DIR_UZ = join(UNZIP_DIR, "external", "haz")

# Get hazard model variables
# Get Return Period list
RET_PERS = CONFIG['RPs']
HAZ_FILEN = CONFIG['haz_filename']
# Get CRS for depth grids
HAZ_CRS = CONFIG['haz_crs']
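
The path setup above assumes the notebook runs from a subdirectory one level below the project root, so that `parents[0]` resolves to the root. As a minimal sketch of that resolution (the directory name `notebooks` and the `/home/user` prefix are hypothetical, chosen only for illustration):

```python
from pathlib import PurePosixPath

# Hypothetical working directory of the notebook (an assumption for this sketch).
notebook_dir = PurePosixPath("/home/user/philadelphia_frd/notebooks")

# parents[0] is the immediate parent, which is the project root here.
project_root = notebook_dir.parents[0]

# The same joins as in the cell above, expressed with pathlib.
config_path = project_root / "config" / "config.yaml"
haz_dir_raw = project_root / "data" / "raw" / "external" / "haz"
haz_dir_unzipped = project_root / "data" / "raw" / "unzipped" / "external" / "haz"

print(project_root)
print(config_path)
print(haz_dir_unzipped)
```

If the notebook is launched from a different directory, `ABS_DIR` will point elsewhere and the config file will not be found.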

We create a bounding box for the clip. This extent was identified in the full_data_example: it contains one of the highest-risk properties according to the Philadelphia flood risk database, along with several other houses exposed to flooding.

In [23]:
minx = -75.10
maxx = -75.065
miny = 40
maxy = 40.02

# Polygon representation of the above
clip_geo = gpd.GeoDataFrame(geometry=[Polygon([[minx, miny], [minx, maxy],
                                              [maxx, maxy], [maxx, miny]])],
                            crs="EPSG:4326")
# Reproject to hazard coordinate ref system
clip_geo_r = clip_geo.to_crs(HAZ_CRS)
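
To get a rough sense of how small this clip area is, the sketch below estimates the width and height of the bounding box with the haversine formula. It assumes a spherical Earth with a mean radius of 6371 km, so the numbers are approximate and are not a substitute for measuring in the projected hazard CRS.

```python
import math

# Bounding box from the cell above (degrees, EPSG:4326).
minx, maxx = -75.10, -75.065
miny, maxy = 40.0, 40.02

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Width along the southern edge, height along the western edge.
width_km = haversine_km(minx, miny, maxx, miny)
height_km = haversine_km(minx, miny, minx, maxy)
print(f"approx. width: {width_km:.1f} km, height: {height_km:.1f} km")
```

The box spans only a few kilometres in each direction, which keeps the clipped rasters small.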

In [45]:
# Loop through the hazard files, clip each one to clip_geo_r,
# then save the clipped files in raw/external/haz/dg_clipped.
# Zipping this directory and deleting the uncompressed copy
# happens afterwards, outside this loop.
haz_clip_dir = join(HAZ_DIR_R, 'dg_clipped')
haz_clip_filen = "Depth_{RP}pct_clip.tif"
for rp in RET_PERS:
    dg = read_dg(rp, HAZ_DIR_UZ, HAZ_FILEN)
    # Following https://rasterio.readthedocs.io/en/stable/topics/masking-by-shapefile.html
    out_image, out_transform = rasterio.mask.mask(dg, clip_geo_r['geometry'], crop=True)
    out_meta = dg.meta
    out_meta.update({"driver": "GTiff",
                    "height": out_image.shape[1],
                    "width": out_image.shape[2],
                    "transform": out_transform})
    # We want to save in raw/external/haz/dg_clipped
    # str() is a no-op for string return periods and guards against numeric ones
    rp_clip_filep = join(haz_clip_dir, haz_clip_filen.replace('{RP}', str(rp)))

    # We will write out each clipped raster
    prepare_saving(rp_clip_filep)
    with rasterio.open(rp_clip_filep, "w", **out_meta) as dest:
        dest.write(out_image)
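
The output file names in the loop above come from substituting each return period into the `Depth_{RP}pct_clip.tif` template. The sketch below shows that expansion on its own. The return-period labels are hypothetical (the real ones come from `CONFIG['RPs']`), and creating the directory with `os.makedirs` is an assumption about what a helper such as `prepare_saving` takes care of, not a description of its implementation.

```python
import os
import tempfile
from os.path import join

haz_clip_filen = "Depth_{RP}pct_clip.tif"

# Hypothetical return-period labels; the real list is read from the config.
example_rps = ["01", "10"]

with tempfile.TemporaryDirectory() as tmp_root:
    haz_clip_dir = join(tmp_root, "dg_clipped")
    # Make sure the output directory exists before any file is written
    # (assumed here to be the job of a helper like prepare_saving).
    os.makedirs(haz_clip_dir, exist_ok=True)

    # Expand the template once per return period.
    out_paths = [join(haz_clip_dir, haz_clip_filen.replace("{RP}", str(rp)))
                 for rp in example_rps]
    for p in out_paths:
        print(os.path.basename(p))
```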

To finish preparing the example data, we zipped the directory from an Ubuntu command line and then deleted the uncompressed directory to avoid keeping redundant files. It is easiest to keep the partial data example in its own working directory. So, although the code above writes the data within the philadelphia_frd/ directory, we do the zipping in the phil_frd_sub/ directory and then delete the uncompressed directory in philadelphia_frd/.
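
The zipping step was done from the command line, not in this notebook. For readers who prefer to stay in Python, a roughly equivalent sketch using only the standard library is shown below. The helper name `zip_and_remove` is hypothetical, and the demonstration runs on a throwaway directory with a placeholder file, not on real hazard data.

```python
import os
import shutil
import tempfile
import zipfile


def zip_and_remove(src_dir, archive_base):
    """Zip src_dir into archive_base + '.zip', then delete src_dir."""
    parent, name = os.path.split(os.path.abspath(src_dir))
    # root_dir/base_dir keep the top-level folder name inside the archive.
    archive = shutil.make_archive(archive_base, "zip",
                                  root_dir=parent, base_dir=name)
    # Remove the uncompressed copy only after the archive has been written.
    shutil.rmtree(src_dir)
    return archive


# Demonstration on a temporary directory with a placeholder file.
work = tempfile.mkdtemp()
clipped = os.path.join(work, "dg_clipped")
os.makedirs(clipped)
with open(os.path.join(clipped, "Depth_01pct_clip.tif"), "wb") as f:
    f.write(b"placeholder")

archive_path = zip_and_remove(clipped, os.path.join(work, "dg_clipped"))

# List the archive contents to confirm the file was captured.
with zipfile.ZipFile(archive_path) as zf:
    names = zf.namelist()
print(os.path.basename(archive_path), names)
```

Writing the archive next to the source directory, not inside it, avoids the archive trying to include itself.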