# Prepare HydroFabric Map Layers

**Authors**: 
- Irene Garousi-Nejad <igarousi@cuahsi.org>
- Tony Castronova <acastronova@cuahsi.org>

**Last Updated**: 

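**Description**: Builds one dissolved boundary polygon per VPU from the `divides` layer of the NextGen HydroFabric v1.2 GeoPackages on S3, then saves the combined boundaries to a shapefile.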

**Software Requirements**:

> Conda: 22.9.0  \
> Python: 3.9.16  \
> wget: 3.2  \
> pandas: 2.0.0  \
> geopandas: 0.12.2  \
> fiona: 1.9.1

---

In [5]:
import fsspec
import pandas
import geopandas

In [47]:
%%time

geoms = []
# Anonymous (unsigned) access to the public bucket; fsspec's 's3' protocol is provided by the s3fs package
s3 = fsspec.filesystem('s3', anon=True)
for vpu in ['01', '02','03N','03S','03W', '04','05','06','07','08','09','10L','10U','11','12','13','14','15','16','17','18']:
    print(f'Processing VPU: {vpu}', end='...', flush=True)
    with s3.open(f's3://nextgen-hydrofabric/v1.2/nextgen_{vpu}.gpkg') as f:
        # Read the catchment divides from the file-like object into a GeoDataFrame
        gdf_divide = geopandas.read_file(f, layer='divides')
        # A small buffer is necessary to ensure that the geometries are valid before dissolving
        gdf_divide['geometry'] = gdf_divide.buffer(0.01)
        # Merge all divides in this VPU into a single boundary feature
        geom = gdf_divide.dissolve()

        # Tag the boundary with its VPU, drop the per-divide attributes, and save the result in a list
        geom['VPU'] = vpu
        geom.drop(columns=['id','areasqkm','type','toid'], inplace=True)
        geoms.append(geom)
    print('done')

Processing VPU: 01...done
Processing VPU: 02...done
Processing VPU: 03N...done
Processing VPU: 03S...done
Processing VPU: 03W...done
Processing VPU: 04...done
Processing VPU: 05...done
Processing VPU: 06...done
Processing VPU: 07...done
Processing VPU: 08...done
Processing VPU: 09...done
Processing VPU: 10L...done
Processing VPU: 10U...done
Processing VPU: 11...done
Processing VPU: 12...done
Processing VPU: 13...done
Processing VPU: 14...done
Processing VPU: 15...done
Processing VPU: 16...done
Processing VPU: 17...done
Processing VPU: 18...done
CPU times: user 13min 54s, sys: 56.1 s, total: 14min 50s
Wall time: 22min 35s
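
Because the full run takes over 20 minutes, it can help to try the buffer-and-dissolve step on toy data first. The following is a minimal sketch, not the HydroFabric workflow itself: the two unit squares, the `cat-1`/`cat-2` ids, and the `toy` label are made up for illustration, and the coordinates are unitless.

```python
import geopandas
from shapely.geometry import box

# Two hypothetical unit squares sharing an edge stand in for the divides of one VPU.
toy = geopandas.GeoDataFrame(
    {"id": ["cat-1", "cat-2"], "areasqkm": [1.0, 1.0]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
)

# Same small positive buffer as in the cell above, applied before merging.
toy["geometry"] = toy.buffer(0.01)

# dissolve() with no `by` argument merges every row into a single feature.
outline = toy.dissolve()
outline["VPU"] = "toy"
outline = outline.drop(columns=["id", "areasqkm"])

print(len(outline), outline.geometry.iloc[0].geom_type)
```

With no `by` column, `dissolve()` returns a single row, so each VPU contributes exactly one boundary feature to the `geoms` list.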


In [55]:
gdf = geopandas.GeoDataFrame(pandas.concat(geoms, ignore_index=True))

In [58]:
gdf.to_file('vpu_boundaries.shp', driver='ESRI Shapefile')