# MEP Preprocessing

In this notebook I manipulate the watershed boundary layers used for the Massachusetts Estuaries Project (MEP), within the [MEP study area](https://www.mass.gov/guides/the-massachusetts-estuaries-project-and-reports).

1. Regroup subwatershed layers that were split by travel time.
2. Calculate the 5th percentile of elevation within each subwatershed (Lid_Sub_ZS).
3. Classify subwatersheds by elevation percentile (ele5pct_poly).
4. Intersect elevation classes with subwatersheds (subs_le5pct).
5. Intersect elevation-classified subwatersheds with tax parcel data (subs_le5_tax).

Tax parcel data can then be used to generate land cover classifications within the uplands and terminal zones (seepage faces) of each subwatershed.

NOTE: Whenever you set up a new ArcGIS Pro project that uses Python for batch processing, make sure to uncheck `options > geoprocessing > 'add output datasets to open map'`. This saves RAM and prevents crashes when you are looping through many files.

## Next Steps

Consider computing more lidar-based metrics, such as the topographic wetness index or the topographic openness index.

Consider adding slope.

Consider summarizing land use for older years.

## Data 

Publication:
Carlson, C.S., Masterson, J.P., Walter, D.A., and Barbaro, J.R., 2017, Development of simulated groundwater-contributing areas to selected streams, ponds, coastal water bodies, and production wells in the Plymouth-Carver region and Cape Cod, Massachusetts: U.S. Geological Survey Data Series 1074, 17 p. https://doi.org/10.3133/ds1074

Dataset: 
Carlson, C.S., Masterson, J.P., Walter, D.A., and Barbaro, J.R., 2017, Simulated groundwater-contributing areas to selected streams, ponds, coastal water bodies, and production wells, Plymouth-Carver region and Cape Cod, Massachusetts: U.S. Geological Survey data release, https://doi.org/10.5066/F7V69H2Z.


In [35]:
# this code block sets up the environment from Jupyter notebooks
setup_notebook = "C:/Users/Adrian.Wiegman/Documents/GitHub/Wiegman_USDA_ARS/MEP/_Setup.ipynb"
# magic command to run the setup notebook
%run $setup_notebook

***
loading python modules...

  `module_list` contains names of all loaded modules

...module loading complete

***
loading user defined functions...

type `fn_`+TAB to for autocomplete suggestions

 the object `def_list` contains user defined function names:
   fn_get_info
   fn_arcgis_table_to_df
   fn_arcgis_table_to_np_to_pd_df
   fn_try_mkdir
   fn_hello
   fn_recursive_glob_search
   fn_regex_search_replace
   fn_regex_search_0
   fn_arcpy_table_to_excel
   fn_agg_sum_df_on_group
   fn_add_prefix_suffix_to_selected_cols
   fn_calc_pct_cover_within_groups

 use ??{insert fn name} to inspect
 for example running `??fn_get_info` returns:
Signature: fn_get_info(name='fn_get_info')
Source:   
def fn_get_info(name='fn_get_info'):
    '''
    returns the source information about a 

In [3]:
# test the regex helper functions (raw strings keep the backslash in \w intact)
fn_regex_search_0('Mystic Lake GT10 E', r'\w+10')
fn_regex_search_replace('MysticLakeGT10E', r'\wT10', '')
#fn_regex_search_replace('Mystic Lake  E','  ',' ')

'MysticLakeE'

In [4]:
# make a working copy 
copyfile = r"C:\Workspace\Geodata\MEP\outputs\MEP_Subwatersheds_All_Copy.shp"
original = r"C:\Workspace\Geodata\MEP\outputs\MEP_Subwatersheds_All.shp"
#original = r"C:\Workspace\Geodata\Massachusetts\MEP\CC_MV_Subwatersheds\Subwatersheds.shp"
arcpy.management.Copy(original, copyfile, "ShapeFile", None)

In [5]:
# dissolve the MEP subwatersheds data
outfile = "MEP_Subwatersheds_Dissolve"
arcpy.management.Dissolve(copyfile, outfile, "FID", None, "MULTI_PART", "DISSOLVE_LINES")

Add a travel-time field (`Travel_Tim`) to the subwatershed layer.

In [6]:
# add a travel-time field extracted from the subwatershed description
fn_string = """def fn_regex_search_0(string, pattern, noneVal="NA"):
    '''
    returns the first match of a regular expression pattern search on a string
    '''
    import re
    x = re.search(pattern, string)
    if x is None:
        return noneVal
    return x[0]
"""
# raw strings keep the backslash in \w from being read as an escape sequence
arcpy.management.CalculateField(copyfile,
                                "Travel_Tim",
                                r"fn_regex_search_0(!SUBWATER_D!, r'\wT10', 'NA')",
                                "PYTHON3",
                                fn_string, "TEXT", "NO_ENFORCE_DOMAINS")
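
To see what the `Travel_Tim` expression does outside of ArcGIS, the same helper can be run on plain strings. This is only a minimal sketch: the function body mirrors the code block in the cell above, `'Mystic Lake GT10 E'` comes from the test cell, and the other sample descriptions are made up for illustration.

```python
import re

def fn_regex_search_0(string, pattern, noneVal="NA"):
    """Return the first regex match in `string`, or `noneVal` if there is none."""
    match = re.search(pattern, string)
    return noneVal if match is None else match[0]

# sample descriptions; the last two are hypothetical, not taken from the MEP layer
samples = ["Mystic Lake GT10 E", "Mystic Lake LT10", "Abner Pond"]

# one word character followed by 'T10' captures tokens such as GT10 or LT10
travel_times = [fn_regex_search_0(s, r"\wT10") for s in samples]
print(travel_times)
```

Descriptions without a travel-time token fall back to the `noneVal` placeholder, so the field is never left empty.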

Make a new subwatershed name field that excludes travel time.

In [7]:
# make a new subwatershed name field that excludes travel time
fn_string = """def fn_regex_search_replace(string, pattern, replacement):
    '''
    returns the string with a pattern substituted by a replacement
    '''
    import re
    return re.sub(pattern, replacement, string)
"""
newField = "SUBW_NAME"
# raw strings keep the backslash in \w from being read as an escape sequence
arcpy.management.CalculateField(copyfile,
                                newField,
                                r"""fn_regex_search_replace(!SUBWATER_N!, r"\wT10.*", "")""",
                                "PYTHON3",
                                fn_string,
                                "TEXT",
                                "NO_ENFORCE_DOMAINS")

In [8]:
# dissolve subwatersheds by subwatershed name.
arcpy.management.Dissolve(copyfile,
                          "MEP_SUBW_NAME", 
                          "SUBW_NAME", None, "MULTI_PART", "DISSOLVE_LINES")
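
The regrouping step can be pictured without ArcGIS: strip the travel-time token from each name, then merge the rows that share the stripped name. The sketch below assumes made-up names and areas. Summing areas in a dictionary only stands in for the geometric dissolve and is not a description of how `arcpy.management.Dissolve` works internally.

```python
import re
from collections import defaultdict

def strip_travel_time(name):
    """Drop the travel-time token (e.g. GT10, LT10) and everything after it."""
    return re.sub(r"\wT10.*", "", name)

# hypothetical (name, area) rows; values are illustrative only
rows = [("MysticLakeGT10E", 3.0), ("MysticLakeLT10", 2.0), ("AbnerPond", 1.5)]

# rows that share a stripped name are merged, as in a dissolve on SUBW_NAME
area_by_name = defaultdict(float)
for name, area in rows:
    area_by_name[strip_travel_time(name)] += area

print(dict(area_by_name))
```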

In [None]:
# extract the statewide lidar dataset with mask of subwatersheds
raster = r"C:\Workspace\Geodata\Massachusetts\LiDAR_DEM\LiDAR_DEM.gdb\LiDAR_DEM_INT_16bit\Band_1"
mask = "MEP_Subwatersheds_Dissolve"

lidar_extr = arcpy.sa.ExtractByMask(raster,mask)

In [7]:
# this code worked
out_raster = arcpy.sa.ExtractByMask(
    in_raster=r"C:\Workspace\Geodata\Massachusetts\LiDAR_DEM\LiDAR_DEM.gdb\LiDAR_DEM_INT_16bit\Band_1",
    in_mask_data="MEP_Subwatersheds_Dissolve",
    extraction_area="INSIDE",
    analysis_extent='-7924126.67244911 5048924.87130968 -7783827.32899976 5173789.77853474 PROJCS["WGS_1984_Web_Mercator_Auxiliary_Sphere",GEOGCS["GCS_WGS_1984",DATUM["D_WGS_1984",SPHEROID["WGS_1984",6378137.0,298.257223563]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]],PROJECTION["Mercator_Auxiliary_Sphere"],PARAMETER["False_Easting",0.0],PARAMETER["False_Northing",0.0],PARAMETER["Central_Meridian",0.0],PARAMETER["Standard_Parallel_1",0.0],PARAMETER["Auxiliary_Sphere_Type",0.0],UNIT["Meter",1.0]]'
)
out_raster.save(r"C:\Workspace\Geodata\MEP\Default.gdb\Extract_LiDA1")

ExecuteError: ERROR 160333: The table was not found.
Failed to execute (ExtractByMask).


In [12]:
outname = "Extract_LiDA1"
lidar_extr.save(outname)

Use zonal statistics to calculate the 5th percentile of elevation in each subwatershed.

In [9]:
copyfile = r"C:\Workspace\Geodata\MEP\outputs\MEP_Subwatersheds_All_Copy.shp"
poly = copyfile
zonefield = "SUBW_NAME"
pct = 5  # 5th percentile
outname = "Lid_Sub_ZS"
# `raster` is the statewide lidar DEM path defined in the extraction cell above
Lid_Sub_ZS = arcpy.ia.ZonalStatistics(poly,
                                      zonefield,
                                      raster,
                                      "PERCENTILE",
                                      "DATA",
                                      "CURRENT_SLICE",
                                      pct,
                                      "AUTO_DETECT")
Lid_Sub_ZS.save(outname)

In [9]:
a = "Extract_LiDA1"
b = "Lid_Sub_ZS"
print(a,b)
# flag cells whose elevation is at or below the subwatershed's 5th percentile
lidar_le5pct = arcpy.ia.LessThanEqual(a, b)
lidar_le5pct.save("lidar_bog_le5pct")

Extract_LiDA1 Lid_Sub_ZS
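
Taken together, the zonal-statistics and `LessThanEqual` steps flag every cell whose elevation is at or below the 5th percentile of its own subwatershed. The sketch below reproduces that logic on plain lists. The zone names and elevations are invented, and the nearest-rank percentile is an assumption made for simplicity; ArcGIS may interpolate differently depending on the raster type.

```python
import math

def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (an assumed method, chosen for simplicity)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def flag_le_percentile(elev_by_zone, pct=5):
    """Return 1 for cells at or below their zone's percentile, else 0."""
    flags = {}
    for zone, elevations in elev_by_zone.items():
        threshold = nearest_rank_percentile(elevations, pct)
        flags[zone] = [1 if e <= threshold else 0 for e in elevations]
    return flags

# hypothetical zones and elevations, for illustration only
elev_by_zone = {"A": list(range(1, 21)), "B": [10, 10, 12, 15]}
flags = flag_le_percentile(elev_by_zone)
print(flags["B"])
```

Because the threshold is computed per zone, a cell at a given elevation can be flagged in one subwatershed and not in another.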


In [10]:
# convert raster of lidar_le5pct to polygon
outfile = "lidar_bog_le5pct_poly"
poly = arcpy.conversion.RasterToPolygon("lidar_bog_le5pct", outfile, "SIMPLIFY", "VALUE", "SINGLE_OUTER_PART", None)

In [13]:
myfun = """def fn(x):
    y = "GT5%"
    if x == 1: y = "LE5%"
    return(y)"""
# classify gridcode into a new text field (1 = at or below the 5th percentile)
arcpy.management.CalculateField(outfile, 
                                "ele5pct", 
                                "fn(!gridcode!)", 
                                "PYTHON3", 
                                myfun, "TEXT", "NO_ENFORCE_DOMAINS")
#arcpy.management.AlterField(outfile, 'gridcode', 'ElevLE5pct', 'Elev <= 5% percentile')

In [14]:
# dissolve new polygon layer by elevation class 
arcpy.management.Dissolve("ele5pct_poly",
                          "ele5pct_poly_diss", 
                          "ele5pct", None, "MULTI_PART", "DISSOLVE_LINES")

In [19]:
# compute the identity (intersection) of elevation poly and watershed poly
infeat = "ele5pct_poly_diss"
identfeat = copyfile
outname = "subs_tt_le5pct"
arcpy.analysis.Identity(infeat, identfeat, 
                        outname, "ALL", None, "NO_RELATIONSHIPS")

In [20]:
# compute the identity (intersection) of elevation poly and watershed poly
# without travel times
infeat = "ele5pct_poly_diss"
identfeat = "MEP_SUBW_NAME"
outname = "subs_le5pct"
arcpy.analysis.Identity(infeat, identfeat, 
                        outname, "ALL", None, "NO_RELATIONSHIPS")
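
Identity keeps every part of the input features and attaches the identity layer's attributes only where the two overlap. The sketch below is a rough picture of that behaviour, representing each feature as a set of grid cells instead of a polygon. The feature names and cells are hypothetical, and the identity features are assumed not to overlap one another.

```python
def identity_overlay(in_feats, ident_feats):
    """Split input features by identity features.

    Both arguments map a feature name to a set of (row, col) cells.
    Returns a list of (input_name, identity_name_or_None, cells) pieces.
    """
    pieces = []
    for in_name, in_cells in in_feats.items():
        remaining = set(in_cells)
        for id_name, id_cells in ident_feats.items():
            overlap = in_cells & id_cells
            if overlap:
                # overlapping part carries attributes from both layers
                pieces.append((in_name, id_name, overlap))
                remaining -= overlap
        if remaining:
            # non-overlapping part is kept with input attributes only
            pieces.append((in_name, None, remaining))
    return pieces

# hypothetical elevation classes and one hypothetical subwatershed
in_feats = {"LE5%": {(0, 0), (0, 1)}, "GT5%": {(1, 0), (1, 1)}}
ident_feats = {"PondA": {(0, 1), (1, 1)}}
pieces = identity_overlay(in_feats, ident_feats)
for piece in pieces:
    print(piece)
```

Pieces with no identity match keep an empty identity attribute, which is the set-based analogue of output rows with a blank subwatershed name.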

In [21]:
# compute the identity (intersection) of the elevation-classified subwatersheds
# (without travel times) and the tax parcels
infeat = "subs_le5pct"
identfeat = "MEP_TaxPar"
outname = "subs_le5_tax"
arcpy.analysis.Identity(infeat, identfeat, 
                        outname, "ALL", None, "NO_RELATIONSHIPS")

In [22]:
# compute the identity (intersection) of the elevation-classified subwatersheds
# (with travel times) and the tax parcels
infeat = "subs_tt_le5pct"
identfeat = "MEP_TaxPar"
outname = "subs_tt_le5_tax"
arcpy.analysis.Identity(infeat, identfeat, 
                        outname, "ALL", None, "NO_RELATIONSHIPS")

In [30]:
# intersect sheds (without travel times) with soil
arcpy.analysis.Identity("subs_le5pct", 
                        r"C:\Workspace\Geodata\Massachusetts\Soils_MassGIS.gdb\SOILS_MUPOLYGON_TOP20", 
                        r"C:\Workspace\Geodata\MEP\Default.gdb\subs_le5pct_soil20", "ALL", None, "NO_RELATIONSHIPS")

In [31]:
# intersect sheds (with travel times) with soil
arcpy.analysis.Identity("subs_tt_le5pct", 
                        r"C:\Workspace\Geodata\Massachusetts\Soils_MassGIS.gdb\SOILS_MUPOLYGON_TOP20", 
                        r"C:\Workspace\Geodata\MEP\Default.gdb\subs_tt_le5_soil20", "ALL", None, "NO_RELATIONSHIPS")

In [24]:
# compute the identity (intersection) of the subwatersheds (without travel times) with the LCLU layer.
infeat = "subs_le5pct"
identfeat = r"C:\Workspace\Geodata\Massachusetts\lclu_gdb\MA_LCLU2016.gdb\LANDCOVER_LANDUSE_POLY"
outname = 'subs_le5_lclu16'
arcpy.analysis.Identity(infeat, identfeat, outname, "ALL", None, "NO_RELATIONSHIPS")

In [32]:
# compute the identity (intersection) of the subwatersheds (with travel times) with the LCLU layer.
infeat = "subs_tt_le5pct"
identfeat = r"C:\Workspace\Geodata\Massachusetts\lclu_gdb\MA_LCLU2016.gdb\LANDCOVER_LANDUSE_POLY"
outname = 'subs_tt_le5_lclu16'
arcpy.analysis.Identity(infeat, identfeat, outname, "ALL", None, "NO_RELATIONSHIPS")

In [25]:
# NEED TO UPDATE CRANBERRY LAYER WITH MORE RECENT LAND USE LAND COVER DATA
# path to the original cranberry layer (the working copy is made at the end of this cell)
original = r"C:\Workspace\Geodata\Massachusetts\WMAbogsDRAFT2013\WMAbogsDRAFT2013.shp"

# make a new column identifying all polygons as cranberry
arcpy.management.CalculateField(
    in_table=original,
    field="CRANBERRY",
    expression="1",
    expression_type="PYTHON3",
    code_block="",
    field_type="TEXT",
    enforce_domains="NO_ENFORCE_DOMAINS"
)

# make a new column identifying actively farmed cranberry
arcpy.management.CalculateField(
    in_table=original,
    field="ACTIVE",
    expression="fn(!CropStatus!)",
    expression_type="PYTHON3",
    code_block="""def fn(x):
    if x=='active':
        return '1'
    else:
        return '0'""",
    field_type="TEXT",
    enforce_domains="NO_ENFORCE_DOMAINS"
)

copyfile = "Cranberry_Copy"
arcpy.management.Copy(original, os.path.join(odr,copyfile), "ShapeFile", None)

In [26]:
# intersect cranberry layer with elevation subs. 
infeat = "subs_le5pct"
identfeat = "Cranberry_Copy"
outname = "subs_le5_cran"
arcpy.analysis.Identity(infeat,identfeat,outname,join_attributes="ALL",cluster_tolerance=None,relationship="NO_RELATIONSHIPS")

In [130]:
# intersect cranberry layer with elevation subs (with travel times).
infeat = "subs_tt_le5pct"
identfeat = "Cranberry_Copy"
outname = "subs_tt_le5_cran"
arcpy.analysis.Identity(infeat,identfeat,outname,join_attributes="ALL",cluster_tolerance=None,relationship="NO_RELATIONSHIPS")

In [None]:
# 2023-03-30 RESUME HERE! 
# export feature table data 
print(def_list)

# the functions below provide options to export feature tables 
??fn_arcpy_table_to_excel
#fn_arcpy_table_to_excel(inFeaturePath,outTablePath=odr,outTableName="SUBS_TaxParAssess.xlsx")

??fn_arcgis_table_to_df #this one works better

??fn_arcgis_table_to_np_to_pd_df # this one doesn't work very well

In [66]:
infeat = 'subs_tt_le5pct'
field_names = [f.name for f in arcpy.ListFields(infeat)]
print(field_names)

# convert feature table to pandas data frame
df = fn_arcgis_table_to_df(in_fc=infeat)

# remove unwanted columns (names that are not in the table are ignored)
selected_fields = ['OBJECTID','SUBW_NAME','LOC_ID','ele5pct','Travel_Tim',"EMBAY_NAME","SUBWATER_N",'CropStatus', 'FID_Cranberry_Copy','WMA_NO','BOG_NAME','COMMENT','OWNER_FIRS', 'OWNER_LAST','CRANBERRY', 'ACTIVE', 'Shape_Length', 'Shape_Area']
df_select = df.filter(selected_fields, axis=1)  # axis=1 filters columns; axis=0 filters rows

# save pickle
#df.to_pickle(os.path.join(odr,'df_'+infeat+'.pkl'))
df_select.to_pickle(os.path.join(odr,'df_'+infeat+'_select.pkl'))

# save csv
#df.to_csv(os.path.join(odr,'df_'+infeat+'.csv'))
df_select.to_csv(os.path.join(odr,'df_'+infeat+'_select.csv'))

['OBJECTID', 'Shape', 'FID_ele5pct_poly_diss', 'ele5pct', 'FID_MEP_Subwatersheds_All_Copy', 'sf_area', 'SubWat_Num', 'New_ID', 'ft_perim', 'SubW_Names', 'SHED_ID', 'Terminal_N', 'F_', 'Area_Ha', 'MAD_HBR_ID', 'Sub_Name', 'AREA', 'PERIMETER', 'NAME', 'Subwatersh', 'sqft_perim', 'sqft_area', 'WS_Num', 'p', 'a', 'OBJECTID_1', 'ISLAND', 'Shape_Leng', 'Subw_New', 'mt_perim', 'sm_area', 'Id', 'MapID', 'EelRiv_WS', 'PlyHar_WS', 'SUB_WS', 'REV_SUBW', 'SUBW_NAME', 'Subw', 'SubID', 'ACRES', 'FID_nb_wat', 'WATERSHED2', 'WATERSHED3', 'FID_acushn', 'fst_SHED_I', 'OBJECTID_12', 'SUBWATER_I', 'SUBWATER_N', 'SUBWATER_D', 'EMBAY_ID', 'EMBAY_NAME', 'EMBAY_DISP', 'X_Centroid', 'Y_Centroid', 'Acreage', 'GeoString', 'S_N', 'Travel_Tim', 'Shape_Length', 'Shape_Area']
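
Because the selection lists are long and reused across layers, it can help to check them against the layer's actual field names before filtering. The sketch below uses a hypothetical helper, `check_selected_fields` (not one of the `fn_` functions from the setup notebook), that drops duplicates and reports names that do not match. The field list is a shortened, illustrative one, and matching is case-sensitive.

```python
def check_selected_fields(field_names, selected_fields):
    """Return (present, missing) names, de-duplicated and in selection order."""
    available = set(field_names)
    present, missing = [], []
    # dict.fromkeys removes duplicates while preserving order
    for name in dict.fromkeys(selected_fields):
        if name in available:
            present.append(name)
        else:
            missing.append(name)
    return present, missing

# shortened, illustrative field list and selection
field_names = ["OBJECTID", "SUBW_NAME", "ele5pct", "OWNER", "COMMENT"]
selected = ["OBJECTID", "SUBW_NAME", "COMMENT", "Owner", "COMMENT"]

present, missing = check_selected_fields(field_names, selected)
print("present:", present)
print("missing:", missing)
```

In the notebook, `field_names` would come from `arcpy.ListFields`, as in the cell above, and `present` could be passed to `df.filter`.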


In [131]:
infeat = "subs_tt_le5_cran"
field_names = [f.name for f in arcpy.ListFields(infeat)]
print(field_names)

# convert feature table to pandas data frame
df = fn_arcgis_table_to_df(in_fc=infeat)

# remove unwanted columns (names that are not in the table are ignored)
selected_fields = ['OBJECTID','SUBW_NAME','LOC_ID','ele5pct','Travel_Tim',"EMBAY_NAME","SUBWATER_N",'CropStatus','WMA_NO','BOG_NAME','COMMENT','OWNER','OWNER_FIRS', 'OWNER_LAST','CRANBERRY', 'ACTIVE', 'Shape_Length', 'Shape_Area']
df_select = df.filter(selected_fields, axis=1)  # axis=1 filters columns; axis=0 filters rows

# save pickle
#df.to_pickle(os.path.join(odr,'df_'+infeat+'.pkl'))
df_select.to_pickle(os.path.join(odr,'df_'+infeat+'_select.pkl'))

# save csv
#df.to_csv(os.path.join(odr,'df_'+infeat+'.csv'))
df_select.to_csv(os.path.join(odr,'df_'+infeat+'_select.csv'))

['OBJECTID', 'Shape', 'FID_subs_tt_le5pct', 'FID_ele5pct_poly_diss', 'ele5pct', 'FID_MEP_Subwatersheds_All_Copy', 'sf_area', 'SubWat_Num', 'New_ID', 'ft_perim', 'SubW_Names', 'SHED_ID', 'Terminal_N', 'F_', 'Area_Ha', 'MAD_HBR_ID', 'Sub_Name', 'AREA', 'PERIMETER', 'NAME', 'Subwatersh', 'sqft_perim', 'sqft_area', 'WS_Num', 'p', 'a', 'OBJECTID_1', 'ISLAND', 'Shape_Leng', 'Subw_New', 'mt_perim', 'sm_area', 'Id', 'MapID', 'EelRiv_WS', 'PlyHar_WS', 'SUB_WS', 'REV_SUBW', 'SUBW_NAME', 'Subw', 'SubID', 'ACRES', 'FID_nb_wat', 'WATERSHED2', 'WATERSHED3', 'FID_acushn', 'fst_SHED_I', 'OBJECTID_12', 'SUBWATER_I', 'SUBWATER_N', 'SUBWATER_D', 'EMBAY_ID', 'EMBAY_NAME', 'EMBAY_DISP', 'X_Centroid', 'Y_Centroid', 'Acreage', 'GeoString', 'S_N', 'Travel_Tim', 'FID_Cranberry_Copy', 'ID_1', 'WMA_NO', 'OWNER', 'ADDRESS', 'TOWN', 'REGION', 'BOG_NAME', 'REGAREA', 'CERTAREA', 'CREDITAREA', 'PERMITAREA', 'TOTAREA', 'STAFF', 'PROGRAM', 'DATE_ENTER', 'COMMENT', 'BIRTHREG', 'AREA_1', 'PERIMETER_1', 'PERMIT_NUM', 'OWN

In [60]:
infeat = 'subs_tt_le5_tax'
field_names = [f.name for f in arcpy.ListFields(infeat)]
print(field_names)

# convert feature table to pandas data frame
df = fn_arcgis_table_to_df(in_fc=infeat)

# remove unwanted columns
selected_fields = ['OBJECTID','SUBW_NAME','LOC_ID','ele5pct','Travel_Tim',"EMBAY_NAME","SUBWATER_N","Shape_Length","Shape_Area"]
df_select = df.filter(selected_fields, axis=1)  # axis=1 filters columns; axis=0 filters rows

# save pickle
#df.to_pickle(os.path.join(odr,'df_'+infeat+'.pkl'))
df_select.to_pickle(os.path.join(odr,'df_'+infeat+'_select.pkl'))

# save csv
#df.to_csv(os.path.join(odr,'df_'+infeat+'.csv'))
df_select.to_csv(os.path.join(odr,'df_'+infeat+'_select.csv'))

['OBJECTID', 'Shape', 'FID_subs_tt_le5pct', 'FID_ele5pct_poly_diss', 'ele5pct', 'FID_MEP_Subwatersheds_All_Copy', 'sf_area', 'SubWat_Num', 'New_ID', 'ft_perim', 'SubW_Names', 'SHED_ID', 'Terminal_N', 'F_', 'Area_Ha', 'MAD_HBR_ID', 'Sub_Name', 'AREA', 'PERIMETER', 'NAME', 'Subwatersh', 'sqft_perim', 'sqft_area', 'WS_Num', 'p', 'a', 'OBJECTID_1', 'ISLAND', 'Shape_Leng', 'Subw_New', 'mt_perim', 'sm_area', 'Id', 'MapID', 'EelRiv_WS', 'PlyHar_WS', 'SUB_WS', 'REV_SUBW', 'SUBW_NAME', 'Subw', 'SubID', 'ACRES', 'FID_nb_wat', 'WATERSHED2', 'WATERSHED3', 'FID_acushn', 'fst_SHED_I', 'OBJECTID_12', 'SUBWATER_I', 'SUBWATER_N', 'SUBWATER_D', 'EMBAY_ID', 'EMBAY_NAME', 'EMBAY_DISP', 'X_Centroid', 'Y_Centroid', 'Acreage', 'GeoString', 'S_N', 'Travel_Tim', 'FID_MEP_TaxPar', 'MAP_PAR_ID', 'LOC_ID', 'POLY_TYPE', 'MAP_NO', 'SOURCE', 'PLAN_ID', 'LAST_EDIT', 'BND_CHK', 'NO_MATCH', 'TOWN_ID', 'MERGE_SRC', 'Shape_Length', 'Shape_Area']


In [57]:
infeat = 'subs_tt_le5_soil20'
field_names = [f.name for f in arcpy.ListFields(infeat)]
print(field_names)

# convert feature table to pandas data frame
df = fn_arcgis_table_to_df(in_fc=infeat)

# remove unwanted columns
selected_fields = ['OBJECTID','SUBW_NAME','LOC_ID','ele5pct','Travel_Tim',"EMBAY_NAME","SUBWATER_N","Shape_Length","Shape_Area", # watershed attributes
                  "COMPNAME","SLOPE","SLOPE_1","FRMLNDCLS",'HYDROLGRP','HYDRCRATNG','DRAINCLASS','DEP2WATTBL',
                   'ROADS', 'SEPTANKAF', 'FLOODING', 'PONDING', 'CORCONCRET',
                  'PHWATER', 'CLAY', 'KSAT', 'OM', 'SAND', 'NLEACHING'] # soil attributes
df_select = df.filter(selected_fields, axis=1)  # axis=1 filters columns; axis=0 filters rows

# save pickle
#df.to_pickle(os.path.join(odr,'df_'+infeat+'.pkl'))
df_select.to_pickle(os.path.join(odr,'df_'+infeat+'_select.pkl'))

# save csv
#df.to_csv(os.path.join(odr,'df_'+infeat+'.csv'))
df_select.to_csv(os.path.join(odr,'df_'+infeat+'_select.csv'))

['OBJECTID', 'Shape', 'FID_subs_tt_le5pct', 'FID_ele5pct_poly_diss', 'ele5pct', 'FID_MEP_Subwatersheds_All_Copy', 'sf_area', 'SubWat_Num', 'New_ID', 'ft_perim', 'SubW_Names', 'SHED_ID', 'Terminal_N', 'F_', 'Area_Ha', 'MAD_HBR_ID', 'Sub_Name', 'AREA', 'PERIMETER', 'NAME', 'Subwatersh', 'sqft_perim', 'sqft_area', 'WS_Num', 'p', 'a', 'OBJECTID_1', 'ISLAND', 'Shape_Leng', 'Subw_New', 'mt_perim', 'sm_area', 'Id', 'MapID', 'EelRiv_WS', 'PlyHar_WS', 'SUB_WS', 'REV_SUBW', 'SUBW_NAME', 'Subw', 'SubID', 'ACRES', 'FID_nb_wat', 'WATERSHED2', 'WATERSHED3', 'FID_acushn', 'fst_SHED_I', 'OBJECTID_12', 'SUBWATER_I', 'SUBWATER_N', 'SUBWATER_D', 'EMBAY_ID', 'EMBAY_NAME', 'EMBAY_DISP', 'X_Centroid', 'Y_Centroid', 'Acreage', 'GeoString', 'S_N', 'Travel_Tim', 'FID_SOILS_MUPOLYGON_TOP20', 'AREASYMBOL', 'SPATIALVER', 'MUSYM', 'MUKEY', 'SS_AREA', 'MUSYM_AREA', 'SLOPE', 'AREANAME', 'MUNAME', 'COMPNAME', 'MUKIND', 'FRMLNDCLS', 'HYDRCRATNG', 'DRAINCLASS', 'MINSURFTEXT', 'TFACTOR', 'AWS100', 'AWS25', 'DEP2WATTBL',

In [55]:
infeat = 'subs_tt_le5_lclu16'
field_names = [f.name for f in arcpy.ListFields(infeat)]
print(field_names)

# convert feature table to pandas data frame
df = fn_arcgis_table_to_df(in_fc=infeat)

# remove unwanted columns
selected_fields = ['OBJECTID','SUBW_NAME','LOC_ID','ele5pct','Travel_Tim',"EMBAY_NAME","SUBWATER_N","Shape_Length","Shape_Area", # watershed attributes
                  "USE_CODE","USEGENCODE","COVERCODE","COVERNAME",'USEGENNAME'] # land use attributes
df_select = df.filter(selected_fields, axis=1)  # axis=1 filters columns; axis=0 filters rows

# save pickle
#df.to_pickle(os.path.join(odr,'df_'+infeat+'.pkl'))
df_select.to_pickle(os.path.join(odr,'df_'+infeat+'_select.pkl'))

# save csv
#df.to_csv(os.path.join(odr,'df_'+infeat+'.csv'))
df_select.to_csv(os.path.join(odr,'df_'+infeat+'_select.csv'))

['OBJECTID', 'Shape', 'FID_subs_tt_le5pct', 'FID_ele5pct_poly_diss', 'ele5pct', 'FID_MEP_Subwatersheds_All_Copy', 'sf_area', 'SubWat_Num', 'New_ID', 'ft_perim', 'SubW_Names', 'SHED_ID', 'Terminal_N', 'F_', 'Area_Ha', 'MAD_HBR_ID', 'Sub_Name', 'AREA', 'PERIMETER', 'NAME', 'Subwatersh', 'sqft_perim', 'sqft_area', 'WS_Num', 'p', 'a', 'OBJECTID_1', 'ISLAND', 'Shape_Leng', 'Subw_New', 'mt_perim', 'sm_area', 'Id', 'MapID', 'EelRiv_WS', 'PlyHar_WS', 'SUB_WS', 'REV_SUBW', 'SUBW_NAME', 'Subw', 'SubID', 'ACRES', 'FID_nb_wat', 'WATERSHED2', 'WATERSHED3', 'FID_acushn', 'fst_SHED_I', 'OBJECTID_12', 'SUBWATER_I', 'SUBWATER_N', 'SUBWATER_D', 'EMBAY_ID', 'EMBAY_NAME', 'EMBAY_DISP', 'X_Centroid', 'Y_Centroid', 'Acreage', 'GeoString', 'S_N', 'Travel_Tim', 'FID_LANDCOVER_LANDUSE_POLY', 'COVERNAME', 'COVERCODE', 'USEGENNAME', 'USEGENCODE', 'USE_CODE', 'POLY_TYPE', 'FY', 'TOWN_ID', 'TILENAME', 'Shape_Length', 'Shape_Area']


In [43]:
infeat = 'subs_le5pct'
field_names = [f.name for f in arcpy.ListFields(infeat)]
print(field_names)

# convert feature table to pandas data frame
df = fn_arcgis_table_to_df(in_fc=infeat)

# remove unwanted columns
selected_fields = ['OBJECTID','SUBW_NAME','ele5pct','Travel_Tim',"EMBAY_NAME","SUBWATER_N","Shape_Length","Shape_Area"]
df_select = df.filter(selected_fields, axis=1)  # axis=1 filters columns; axis=0 filters rows

# save pickle
df.to_pickle(os.path.join(odr,'df_'+infeat+'.pkl'))
df_select.to_pickle(os.path.join(odr,'df_'+infeat+'_select.pkl'))

# save csv
df.to_csv(os.path.join(odr,'df_'+infeat+'.csv'))
df_select.to_csv(os.path.join(odr,'df_'+infeat+'_select.csv'))

['OBJECTID', 'Shape', 'FID_ele5pct_poly_diss', 'ele5pct', 'FID_MEP_SUBW_NAME', 'SUBW_NAME', 'Shape_Length', 'Shape_Area']


In [44]:
# preview the selected columns before joining with tax parcel assessor data
df_select.head()

| OBJECTID | SUBW_NAME | ele5pct | Shape_Length | Shape_Area |
|---|---|---|---|---|
| 1 | | GT5% | 259724.234855 | 12666.37 |
| 2 | | LE5% | 49691.267685 | 3250.521 |
| 3 | | GT5% | 508668.060522 | 127282400.0 |
| 4 | 4Ponds | GT5% | 14768.315875 | 1404234.0 |
| 5 | AbnerPond | GT5% | 3851.691608 | 335584.3 |
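
One way a table like this might be summarized is the share of each subwatershed's area that falls in each elevation class. The sketch below works on plain tuples with invented names and areas, not values from the table above. It is not the implementation of `fn_calc_pct_cover_within_groups`, which is defined in the setup notebook and not shown here.

```python
from collections import defaultdict

def pct_area_within_groups(rows):
    """rows: iterable of (group, cls, area).

    Returns {(group, cls): percent of the group's total area}.
    """
    class_area = defaultdict(float)
    group_area = defaultdict(float)
    for group, cls, area in rows:
        class_area[(group, cls)] += area
        group_area[group] += area
    return {key: 100 * area / group_area[key[0]]
            for key, area in class_area.items()}

# hypothetical (subwatershed, elevation class, area) rows
rows = [("PondA", "GT5%", 75.0), ("PondA", "LE5%", 25.0), ("PondB", "GT5%", 40.0)]
pct = pct_area_within_groups(rows)
print(pct)
```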


Make a new feature class of subwatershed IDs excluding travel time.

Make a new subwatershed layer that combines subwatersheds that were split by travel time.

In [None]:
# Appendix

In [None]:
## Unused code snippets