In [1]:
%matplotlib notebook

import os
import pandas as pd
import numpy as np
import json
import shutil

import geopandas as gpd
from ipyleaflet import Map, GeoData, basemaps, WidgetControl
from ipywidgets import HTML, SelectMultiple

import Genmod_Utilities as gmu
import flopy as fp
import RTD_util6 as rtd_ut

The purpose of this notebook is to create a shapefile for use in General Model notebooks. The shapefile can be created using any method. This notebook provides one convenient way to do it.

The General Models work at any scale, but HUC8s have been shown to provide an appropriate average of general conditions. In this notebook, HUC8s are selected from a 250K-scale HUC index shapefile. If the shapefile does not exist in the download directory used by this notebook, it will be downloaded. The user supplies a name or partial name of a HUC8 basin. In case there are multiple occurrences of that name or partial name, the user can select which one(s) are processed. If there are multiple segments of the desired HUC8, multiple basins can be selected. They will be combined into one shapefile.

This notebook is interactive. It can be executed one cell at a time by pressing shift-enter in each cell. It can also be run all at once from the menu bar, but there will be an error after the map cell until a basin or basins are selected. After the selection is made, the rest of the cells can be run.

Supply the name of a HUC8. Partial names are acceptable, e.g., 'Hous' for the Housatonic River. Matching is case-sensitive.

In [2]:
HUC8_name = 'Saco'

Execute the following cells, up to the drop-down menu, without any interaction. Then select the desired HUC8 from the drop-down menu.

In [3]:
file_dict = dict()
file_dict['HUC8_name'] = HUC8_name

down_load_dir = 'downloads'
model_gis = 'gis'
mfpth6 = '../Executables/mf6.1.1/bin/mf6.exe'

file_dict['download_dir'] = down_load_dir
file_dict['gis_dir'] = model_gis
file_dict['modflow_path'] = mfpth6

In [4]:
if not os.path.exists(down_load_dir):
    os.mkdir(down_load_dir)

if not os.path.exists(model_gis):
    os.mkdir(model_gis)
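The existence check followed by `os.mkdir` can also be written as a single call. The sketch below is one possible alternative, not the notebook's own code; it works in a temporary sandbox directory (an assumption made only so the example does not touch the notebook folder).

```python
import os
import tempfile

# Hypothetical sandbox so the example leaves the notebook directory alone.
base = tempfile.mkdtemp()

for name in ('downloads', 'gis'):
    path = os.path.join(base, name)
    # exist_ok=True turns the call into a no-op if the directory is already there.
    os.makedirs(path, exist_ok=True)
    os.makedirs(path, exist_ok=True)  # calling it again does not raise

created = sorted(os.listdir(base))
print(created)
```

Unlike `os.mkdir`, `os.makedirs` also creates missing parent directories, which matters only if the directory names are later changed to nested paths.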

If the 250K HUC index shape file does not exist in the user directory, download it. 

In [5]:
url = 'https://water.usgs.gov/GIS/dsdl/huc250k_shp.zip'
huc_shapefile_name = url.split('/')[-1].split('.')[0]
huc_shapefile_name = os.path.join(down_load_dir, huc_shapefile_name)

if not os.path.exists(huc_shapefile_name):
    gmu.download_and_extract(url, destination=down_load_dir)

huc_map = gpd.read_file(huc_shapefile_name)
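The local shapefile name above is derived from the last component of the download URL. As a small illustration of that step, the following sketch does the same thing with the standard library; `local_name` is a hypothetical helper, not part of `Genmod_Utilities`.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

def local_name(url):
    """Return the file name at the end of a URL without its final extension."""
    return PurePosixPath(urlparse(url).path).stem

url = 'https://water.usgs.gov/GIS/dsdl/huc250k_shp.zip'

name = local_name(url)
# The notebook's own expression gives the same result for this URL.
same = url.split('/')[-1].split('.')[0]
print(name, same)
```

The two expressions differ only for file names that contain more than one dot: `split('.')[0]` keeps the text before the first dot, while `stem` drops only the last extension.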

Check to see if the requested name exists in the HUC index map. Count how many basins match that name. 

In [6]:
selection = huc_map[huc_map.HUC_NAME.str.contains(HUC8_name)].HUC_NAME

prelim_gdf = huc_map.set_index('HUC_NAME', drop=False)
prelim_gdf = prelim_gdf.loc[selection, :]
prelim_gdf.to_crs(crs='geog', inplace=True)
prelim_gdf['ID'] = (np.arange(prelim_gdf.shape[0]) + 1)
prelim_gdf['menu'] = prelim_gdf.ID.astype(str) +  ' - ' + prelim_gdf.HUC_NAME

number_of_shapes = len(selection)

print('There are {} basin shapes that contain the HUC8_name you provided.'.format(number_of_shapes))

There are 1 basin shapes that contain the HUC8_name you provided.
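`Series.str.contains` treats its argument as a regular expression by default and is case-sensitive. The sketch below mimics that behavior with `re.search` on a short, made-up list of basin names (the real names come from the `HUC_NAME` column of the index shapefile); `matching` is a hypothetical helper.

```python
import re

# Hypothetical basin names, for illustration only.
names = ['Saco', 'Housatonic', 'Upper Hudson', 'Hudson-Hoosic', 'Presumpscot']

def matching(names, pattern, ignore_case=False):
    """Return the names in which the pattern occurs anywhere."""
    flags = re.IGNORECASE if ignore_case else 0
    return [n for n in names if re.search(pattern, n, flags)]

one_match = matching(names, 'Hous')             # a partial name with one hit
two_matches = matching(names, 'Huds')           # a partial name with two hits
no_match = matching(names, 'saco')              # wrong case finds nothing
relaxed = matching(names, 'saco', ignore_case=True)

print(one_match, two_matches, no_match, relaxed)
```

A partial name that matches several basins is the case the drop-down menu further down is meant to resolve.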


The following map can be used to identify basins. Hover the cursor over each outlined basin to see its identity. There is a "sweet spot" in each basin where the cursor has to be to get the identity, and you may have to move the cursor around a bit to find it. The basin will turn red if it has been identified. Note that this step does not select the basin; it only helps identify which one(s) you want. Basins are selected from the drop-down menu below.

In [13]:
a = prelim_gdf.bounds[['miny', 'minx']].min()
b = prelim_gdf.bounds[['maxy', 'maxx']].max()
bounds = [a.to_list(), b.to_list()]

m = Map(basemap=basemaps.OpenTopoMap, dragging=True, scroll_wheel_zoom=True)
m.fit_bounds(bounds)

sel_data = GeoData(geo_dataframe = prelim_gdf,
                   style={'color': 'black', 'fillColor': 'none'},
                   hover_style={'fillColor': 'red' , 'fillOpacity': 0.5})
m.add_layer(sel_data)

html = HTML('''Hover over a watershed''')
html.layout.margin = '0px 20px 20px 20px'
control = WidgetControl(widget=html, position='topright')
m.add_control(control)

def update_html(feature, **kwargs):
    html.value = '''
        <h3>{} {}</h3>
        <h4>HUC8: {} </h4> 
        <h4>Area: {:,.0f} km<sup>2</sup></h4>
        '''.format(feature['properties']['ID'],
                   feature['id'],
                   feature['properties']['HUC_CODE'],
                   feature['properties']['AREA'] / 1E+06)

sel_data.on_hover(update_html)

m

Map(center=[0.0, 0.0], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_out_t…
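The map cell builds its extent as a south-west corner followed by a north-east corner, each given as latitude then longitude, which is why `miny` comes before `minx`. A minimal sketch of that calculation, using made-up bounding boxes in place of `prelim_gdf.bounds`:

```python
def leaflet_bounds(boxes):
    """Combine (minx, miny, maxx, maxy) boxes into [[south, west], [north, east]]."""
    minx = min(b[0] for b in boxes)
    miny = min(b[1] for b in boxes)
    maxx = max(b[2] for b in boxes)
    maxy = max(b[3] for b in boxes)
    return [[miny, minx], [maxy, maxx]]

# Hypothetical boxes in longitude/latitude degrees.
boxes = [(-71.5, 43.4, -70.3, 44.3),
         (-71.0, 43.9, -70.1, 44.6)]

bounds = leaflet_bounds(boxes)
print(bounds)
```

The coordinates must already be geographic (longitude and latitude) for the extent to line up with the basemap.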

Select basins from the following drop-down menu. Select multiple basins using Ctrl + click (or Shift + click). After making the selection, move the cursor to the next cell manually. Executing this cell after the selection is made will remove the selection.

In [8]:
widget = SelectMultiple(
    options=prelim_gdf.menu,
    rows=int(number_of_shapes),
    description='Matching HUCs',
    disabled=False,
    style={'description_width': 'initial'}
)

widget

SelectMultiple(description='Matching HUCs', options=('1 - Saco',), rows=1, style=DescriptionStyle(description_…

Execute the following code cells to download the data and save it.

In [14]:
huc_list = widget.value
final = prelim_gdf.loc[prelim_gdf['menu'].isin(huc_list)]
huc4 = final.HUC_CODE.str[0:4].unique()
if len(huc4) > 1:
    print('The basins you selected are not all in the same HUC4 basin, and they should be.')
    # Without this, finish would be undefined (or stale) in the cells below.
    finish = False
else:
    try:
        huc4 = huc4[0]
        file_dict['huc4'] = huc4
        print('The selected HUC4 is {}'.format(huc4))
        finish = True
    except IndexError:
        print('Go back to the drop-down menu and select one or more basins')
        finish = False

The selected HUC4 is 0106
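The HUC4 is simply the first four digits of each selected HUC8 code, and the cell above requires all selected basins to share it. The same check can be sketched as a small function; `common_huc4` is a hypothetical helper and the codes below are example inputs rather than values read from the index file.

```python
def common_huc4(huc8_codes):
    """Return the HUC4 prefix shared by all codes, or None if nothing is selected."""
    prefixes = sorted({code[:4] for code in huc8_codes})
    if not prefixes:
        return None
    if len(prefixes) > 1:
        raise ValueError('selected basins span several HUC4s: {}'.format(prefixes))
    return prefixes[0]

single = common_huc4(['01060002'])
empty = common_huc4([])

try:
    common_huc4(['01060002', '01070001'])
    mixed_rejected = False
except ValueError:
    mixed_rejected = True

print(single, empty, mixed_rejected)
```

Keeping the codes as strings preserves the leading zero, which would be lost if they were converted to integers.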


The next cell downloads some of the files needed for general models. The files can be quite large and downloading may take several minutes. Please be patient.

In [15]:
if finish:
    nhd_src = 'https://prd-tnm.s3.amazonaws.com/StagedProducts/Hydrography/NHDPlusHR/Beta/GDB'

    vector_url = nhd_src + '/NHDPLUS_H_{}_HU4_GDB.zip'.format(huc4)
    vector_name = vector_url.split('/')[-1].split('.')[0] + '.gdb'
    file_dict['vector_name'] = os.path.join(down_load_dir, vector_name)

    raster_url = nhd_src + '/NHDPLUS_H_{}_HU4_RASTER.7z'.format(huc4)
    raster_name = 'HRNHDPlusRasters{}'.format(huc4)
    file_dict['raster_name'] = os.path.join(down_load_dir, raster_name)

    if not os.path.exists(file_dict['vector_name']):
        print('Downloading NHD high resolution vector data from \n{}'.format(vector_url))
        gmu.download_and_extract(vector_url, destination=down_load_dir)
    else:
        print('NHD high resolution vector data already exists')

    if not os.path.exists(file_dict['raster_name']):
        print('Downloading NHD high resolution raster data from \n{}'.format(raster_url))
        gmu.download_and_extract(raster_url, destination=down_load_dir)
    else:
        print('NHD high resolution raster data already exists')
    
else:
    print('Go back to the drop-down menu and select one or more basins')

NHD high resolution vector data already exists
NHD high resolution raster data already exists
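The download cell builds two URLs and two local names from the HUC4 code. The sketch below repeats only that string handling so the resulting names can be inspected without downloading anything; `nhd_targets` is a hypothetical helper, and the URL pattern is copied from the cell above.

```python
import os

NHD_SRC = 'https://prd-tnm.s3.amazonaws.com/StagedProducts/Hydrography/NHDPlusHR/Beta/GDB'

def nhd_targets(huc4, download_dir='downloads'):
    """Return the URLs and expected local paths for one HUC4."""
    vector_url = '{}/NHDPLUS_H_{}_HU4_GDB.zip'.format(NHD_SRC, huc4)
    raster_url = '{}/NHDPLUS_H_{}_HU4_RASTER.7z'.format(NHD_SRC, huc4)
    # The geodatabase folder is named after the zip file, with a .gdb extension.
    vector_name = vector_url.split('/')[-1].split('.')[0] + '.gdb'
    raster_name = 'HRNHDPlusRasters{}'.format(huc4)
    return {
        'vector_url': vector_url,
        'raster_url': raster_url,
        'vector_path': os.path.join(download_dir, vector_name),
        'raster_path': os.path.join(download_dir, raster_name),
    }

targets = nhd_targets('0106')
for key, value in targets.items():
    print(key, value)
```

The existence checks in the notebook test these local paths, so a download is skipped only when the extracted folder names match them exactly.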


In [16]:
if finish:
    # Read the High Resolution NHDPlus file that was just downloaded.
    WBDHU8 = gpd.read_file(os.path.join(down_load_dir, vector_name), layer='WBDHU8')

    # Pick the selected basins from the High Resolution NHDPlus files. These might be the
    # same basin shapes as in the index map, but there is no guarantee they will be the
    # same. The High Resolution maps will be updated more frequently.
    basins = WBDHU8.loc[WBDHU8['HUC8'].isin(final.HUC_CODE)]

    # If more than one basin was selected, dissolve them into one polygon.
    geom = gpd.GeoSeries(basins.geometry.unary_union)

    # Make a data frame from the dissolved polygons and add the ibound code needed
    # to make a general model.
    domain = gpd.GeoDataFrame(geometry=geom, crs=basins.crs)
    domain['ibound'] = 1

    # Write the domain shapefile to disk.
    domain_name = os.path.join(model_gis, 'domain_' + '_'.join(basins.Name.tolist()))
    file_dict['domain_name'] = domain_name
    domain.to_file(domain_name)

else:
    print('Go back to the drop-down menu and select one or more basins')

Download and unpack the National Groundwater Model for this HUC4. The rest of the cells should run without user interaction.

In [17]:
ngwm_url = 'https://water.usgs.gov/GIS/dsdl/gwmodels/zell2020_wrr/models.{}.zip'.format(huc4[:2])
print('Downloading the National Groundwater Model input data from \n{}'.format(ngwm_url))
file_list = gmu.download_and_extract(ngwm_url, id=huc4, destination=down_load_dir)

fl = [item.split('/')[0] for item in file_list ]
model_ws = max(fl, key=fl.count)
model_ws = os.path.join(down_load_dir, model_ws)

file_dict['ngwm_dir'] = model_ws

Downloading the National Groundwater Model input data from 
https://water.usgs.gov/GIS/dsdl/gwmodels/zell2020_wrr/models.01.zip
Downloading started
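The model folder is taken to be the top-level directory that occurs most often in the list of extracted files. A minimal sketch of that idea with `collections.Counter`, using a made-up file list in place of what `gmu.download_and_extract` returns:

```python
from collections import Counter

# Hypothetical archive listing; real names depend on the downloaded model.
file_list = [
    'model_a/mfsim.nam',
    'model_a/flow.dis',
    'model_a/flow.npf',
    'readme.txt',
]

top_level = [item.split('/')[0] for item in file_list]

# most_common(1) returns the single most frequent entry and its count.
model_dir, count = Counter(top_level).most_common(1)[0]

# The notebook's expression picks the same folder.
same = max(top_level, key=top_level.count)
print(model_dir, count, same)
```

This assumes archive members use forward slashes as separators, as the notebook code does.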


In [18]:
print('Reading model information')


sim_ws = os.path.join(model_ws, 'mfsim.nam')

print("   working model is {}".format(model_ws))

ml = fp.mf6.MFSimulation.load(sim_name=sim_ws, version='mf6', exe_name=mfpth6, 
                              sim_ws=model_ws, strict=True, verbosity_level=0, 
                              load_only=['ic', 'npf', 'dis', 'rch', 'drn'], 
                              verify_data=False)

model = ml.get_model()
rtd = rtd_ut.RTD_util(ml, 'flow', 'rt')

ic = model.get_package('ic')
npf = model.get_package('npf')
dis = model.get_package('dis')
rch = model.get_package('rch')

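# 'sto' is not in load_only above, so get_package is expected to return None
# here rather than raise; the except branch is kept as a safeguard.
try:
    sto = model.get_package('sto')
except Exception:
    sto = None
    print('no storage package')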

drn = model.get_package('drn')

nlay, nrow, ncol = dis.nlay.array, dis.nrow.array, dis.ncol.array
nper = ml.get_package('tdis').nper.array

delr = np.unique(dis.delr.array)
assert delr.shape[0] == 1, 'I cannot make a raster from variable grid spacing'
delr = delr[0]

delc = np.unique(dis.delc.array)
assert delc.shape[0] == 1, 'I cannot make a raster from variable grid spacing'
delc = delc[0]

assert delc == delr, 'I cannot make a raster from variable grid spacing'

print('   ... done')

Reading model information
   working model is downloads\0104_0106_0107_0109_MF6_SS_Unconfined_250
   ... done
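The three assertions in the cell above amount to one requirement: every row and column spacing must be the same value, because a simple raster has square cells of one size. The sketch below expresses that as a single check; `uniform_cell_size` is a hypothetical helper and the spacings are example values.

```python
def uniform_cell_size(delr, delc):
    """Return the single cell size shared by all rows and columns."""
    sizes = set(delr) | set(delc)
    if len(sizes) != 1:
        raise ValueError('variable grid spacing cannot be written as a raster')
    return sizes.pop()

# Hypothetical spacings: 4 columns and 3 rows, all the same size.
cell = uniform_cell_size([250.0] * 4, [250.0] * 3)

try:
    uniform_cell_size([250.0, 500.0], [250.0])
    variable_rejected = False
except ValueError:
    variable_rejected = True

print(cell, variable_rejected)
```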


Make a dictionary of model layer properties

In [19]:
prop_dict = dict()

prop_dict['strt'] = ic.strt.array

prop_dict['top'] = dis.top.array
prop_dict['botm'] = dis.botm.array
prop_dict['idomain'] = dis.idomain.array
prop_dict['thick'] = dis.top.array - dis.botm.array

prop_dict['recharge'] = rch.recharge.array

prop_dict['k'] = npf.k.array
prop_dict['k22'] = npf.k22.array
prop_dict['k22overk'] = npf.k22overk.array
prop_dict['k33'] = npf.k33.array
prop_dict['k33overk'] = npf.k33overk.array

try:
    prop_dict['ss'] = sto.ss.array
    prop_dict['sy'] = sto.sy.array
except (AttributeError, NameError):
    # sto is None (or undefined) when no storage package was loaded.
    print('no storage information')

no storage information


Loop through the model output data and store each stress period (and, for layered output, each layer) as a separate entry in the model properties dictionary.

In [20]:
outputs = model.simulation_data.mfdata.output_keys()

for output in outputs:
    sim_name, source, flow_component = output
    output_data = model.simulation_data.mfdata[output]
    if flow_component not in ['FLOW-JA-FACE', 'RCH']:
        if output_data.ndim == 2:
            ntime = output_data.shape[0]
            for per in range(ntime):
                output_df = pd.DataFrame.from_records(output_data[per])
                output_df.set_index('node', inplace=True)
                output_df = output_df.reindex(index=np.arange(nlay * nrow * ncol))
                label = '{}_per_{}'.format(flow_component, per)
                prop_dict[label] = output_df['q'].values.reshape(nrow, ncol)
        elif output_data.ndim == 4:
            ntime = output_data.shape[0]
            nnlay = output_data.shape[1]
            for per in range(ntime):
                for l in range(nnlay):
                    label = '{}_per_{}_layer_{}'.format(flow_component, per, l)
                    prop_dict[label] = output_data[per, l, ...]
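In the two-dimensional case, each record holds a node number and a flow `q` only for cells where the flow occurs, so the records have to be spread over the full grid with empty cells left as NaN. A minimal sketch of that step without pandas, assuming zero-based node numbers counted row by row in a single layer (an assumption for illustration; check the numbering used by the actual output):

```python
import math

def records_to_grid(records, nrow, ncol):
    """Place (node, q) pairs on an nrow x ncol grid; untouched cells stay NaN."""
    grid = [[math.nan] * ncol for _ in range(nrow)]
    for node, q in records:
        row, col = divmod(node, ncol)  # row-major node numbering
        grid[row][col] = q
    return grid

# Hypothetical records for a 2 x 3 grid.
records = [(0, -1.5), (5, -2.0)]
grid = records_to_grid(records, nrow=2, ncol=3)

for row in grid:
    print(row)
```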

Get the model geographic reference file, parse it into a dictionary, and create a blank model raster image in geographic coordinates.

In [21]:
with open(os.path.join(model_ws, 'usgs.model.reference'), 'r') as f:
    x = f.readlines()
ref_dict = dict([item.strip().split(' ', 1) for item in x])

kwargs = {'theta': np.float32(ref_dict['rotation']),
         'origin': [np.float32(ref_dict['xul']), np.float32(ref_dict['yul'])],
         'LX': delr,
         'LY': delc,
         'nrow': nrow,
         'ncol': ncol, 
         'output_raster_proj': ref_dict['proj4']}

mf_grid = gmu.SourceProcessing()
mf_grid.create_raster(**kwargs)
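The reference file is parsed by splitting each line once at the first space, so a value that itself contains spaces (such as a PROJ string) stays in one piece. The sketch below shows this on a made-up file body; the keys match those used above, but every value is hypothetical.

```python
# Hypothetical contents of a usgs.model.reference file.
text = """xul 1900000.0
yul 2500000.0
rotation 0
proj4 +proj=aea +lat_1=29.5 +lat_2=45.5 +datum=NAD83 +units=m
"""

# maxsplit=1 keeps everything after the first space as the value.
# Blank lines are skipped because they cannot be split into a key and a value.
ref = dict(line.strip().split(' ', 1)
           for line in text.splitlines() if line.strip())

xul = float(ref['xul'])
rotation = float(ref['rotation'])
print(ref['proj4'])
print(xul, rotation)
```

All values are read as strings, so numeric entries need an explicit conversion before they are used as coordinates.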

In [22]:
idomain = prop_dict['idomain']

for key, value in prop_dict.items():
    value = np.float32(value)
    try:
        ndim = value.ndim

        if ndim == 2:
            value[idomain[0, ...] != 1] = np.nan
            mf_grid.old_array = value
            dst = '{}.tif'.format(key)
            dst_pth = os.path.join(model_ws, dst)
            mf_grid.write_raster(dst_pth)

        elif ndim == 3:
            value[idomain != 1] = np.nan
            nlayers = value.shape[-3]
            for l in range(nlayers):
                mf_grid.old_array = value[l, ...]
                dst = '{}_lay_{}.tif'.format(key, l)
                dst_pth = os.path.join(model_ws, dst)
                mf_grid.write_raster(dst_pth)

        elif ndim == 4:
            nlayers = value.shape[-3]
            ntime = value.shape[-4]          
            for sper in range(ntime):
                value[sper, idomain != 1] = np.nan
                for l in range(nlayers):
                    mf_grid.old_array = value[sper, l, ...]
                    dst = '{}_lay_{}_sper_{}.tif'.format(key, l, sper)
                    dst_pth = os.path.join(model_ws, dst)
                    mf_grid.write_raster(dst_pth)
                    
        else:
            print('Unknown number of dimensions for ' + key)

    except AttributeError:
        print('no GeoTiff created for', key, ', probably because there was no model input for it')

Unknown number of dimensions for k22
Unknown number of dimensions for k22overk
Unknown number of dimensions for k33overk


In [23]:
# Write the metadata dictionary to disk.    
with open('GenMod_metadata.txt', 'w') as file:
    file.write(json.dumps(file_dict))
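The metadata file is plain JSON, so a later notebook can read it back with `json.load`. A minimal round-trip sketch follows; it writes to a temporary directory and uses a shortened, example dictionary rather than the full `file_dict`.

```python
import json
import os
import tempfile

# Example metadata with a few of the keys stored above.
file_dict = {
    'HUC8_name': 'Saco',
    'download_dir': 'downloads',
    'gis_dir': 'gis',
    'huc4': '0106',
}

# Hypothetical location so the example does not overwrite a real metadata file.
path = os.path.join(tempfile.mkdtemp(), 'GenMod_metadata.txt')

with open(path, 'w') as f:
    json.dump(file_dict, f)

with open(path) as f:
    restored = json.load(f)

print(restored['huc4'])
```

Because `huc4` is stored as a string, its leading zero survives the round trip.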