# StreamStats & VA DEQ Data Cleaning
This Jupyter notebook loads the output of the StreamStats Batch Processor, which was run on the stations in the VADEQ dataset. The StreamStats and VADEQ datasets are then joined and cleaned.

Emma Reilly Oare

In [181]:
# Standard Libraries
import os  # File handling and directory management
import warnings  # Suppress warnings

# Data Handling
import pandas as pd  # Data manipulation and analysis
import numpy as np  # Numerical operations

# Visualization
import matplotlib.pyplot as plt  # Basic plotting
import seaborn as sns  # Statistical visualization

# Suppress warnings
warnings.filterwarnings("ignore")

In [182]:
# Show all columns
pd.set_option('display.max_columns', None) 

## Step 1. Import dataset
The loaded datasets come from multiple sources. **First**, the embeddedness measurements are pulled from the Virginia DEQ datasets provided to Dr. Jonathan Czuba. Some of the data came from a June 2023 dataset (PHAB) and some from a December 2023 dataset (Probabilistic Monitoring). **Second**, we are bringing in ecoregion categories for geographical context. **Third**, the stations from the two DEQ datasets were run through the StreamStats batch processor to quickly calculate different characteristic attributes. **These three StreamStats datasets** (global watersheds, flow statistics, and characteristics) come directly from StreamStats.

A description of the StreamStats dataset can be found here: https://streamstats.usgs.gov/information-portal/

In [183]:
# Define base directory
BASE_DIR = os.path.abspath(os.path.join(os.getcwd(), "../../.."))

In [184]:
# Read in VA DEQ embeddedness Excel File
vadeq_path = os.path.join(BASE_DIR, "data", "processed", "virginia", "uncleaned_vadeq_emd_joined.xlsx")
vadeq_emd = pd.read_excel(vadeq_path).drop(columns = ['Unnamed: 0'])

In [185]:
# Read in ecoregion information
eco_path = os.path.join(BASE_DIR, "data", "raw", "virginia", "vadeq_ecoregion.xlsx")
vadeq_eco = pd.read_excel(eco_path)

In [186]:
# Read in StreamStats Excel files with initial trimming
# Initial trimming was completed in Excel (mostly metadata trimming)
wshds_path = os.path.join(BASE_DIR, "data", "processed", "virginia", "StreamStats", "globalwatersheds_TRIMMED1.xlsx")
flw_path = os.path.join(BASE_DIR, "data", "processed", "virginia", "StreamStats", "flowstats_TRIMMED1.xlsx")
chr_path = os.path.join(BASE_DIR, "data", "processed", "virginia", "StreamStats", "characteristics_TRIMMED1.xlsx")
strmsts_wshds = pd.read_excel(wshds_path)
strmsts_flw = pd.read_excel(flw_path)
strmsts_chr = pd.read_excel(chr_path)

# Display the largest dataset
strmsts_flw

Unnamed: 0,Name,RegionName,StatName,Value,Units
0,2-BGU005.67,Piedmont_nonMesozoic_2011_5144,50-percent AEP flood,885.00,cubic feet per second
1,2-BGU005.67,Piedmont_nonMesozoic_2011_5144,2-percent AEP flood,4750.00,cubic feet per second
2,2-BGU005.67,Piedmont_nonMesozoic_2011_5143,1 Day 1.25 Year Low Flow,2.46,cubic feet per second
3,2-BGU005.67,Piedmont_nonMesozoic_2011_5143,4 Day 1.25 Year Low Flow,4.18,cubic feet per second
4,2-BGU005.67,Piedmont_nonMesozoic_2011_5143,7 Day 1.11 Year Low Flow,6.79,cubic feet per second
...,...,...,...,...,...
150567,6CSFH084.73,Valley_and_Ridge_2011_5144,4-percent AEP flood,12300.00,cubic feet per second
150568,6CSFH084.73,Valley_and_Ridge_2011_5144,10-percent AEP flood,8990.00,cubic feet per second
150569,6CSFH084.73,Valley_and_Ridge_2011_5144,20-percent AEP flood,6750.00,cubic feet per second
150570,6CSFH084.73,Valley_and_Ridge_2011_5144,42.9-percent AEP flood,4400.00,cubic feet per second


In [187]:
# Read in Reference site information
ref_path = os.path.join(BASE_DIR, "data", "processed", "virginia", "VADEQ_ref_sites_join.xlsx")
vadeq_ref = pd.read_excel(ref_path).drop_duplicates(subset = ['StationID', 'Date']).drop(columns = ['Unnamed: 0'])
vadeq_ref

Unnamed: 0,StationID,Date,ReachLength,Slope_x,RP100,BR_PCT,HP_PCT,RC_PCT,BL_PCT,CB_PCT,GC_PCT,GF_PCT,SA_PCT_x,FN_PCT_x,WD_PCT,OT_PCT,BL_CB_GR_PCT,SA_FN_PCT_x,TotSubstrate_PCT,LSUB_DMM_x,VLW_msq,Xdepth,Xwid,XBKF_W,BKF_depth_in_meters,BKFW_BKFD,incised_depth,Xembed,LRBS2,Year,Slope [m/m],u* [m/s],RefStress
0,1AACO006.10,2006-11-21,440,0.220,34.752500,7.692308,6.730769,0.000000,9.615385,14.423077,35.576923,6.730769,16.346154,2.884615,0.000000,0.0,66.346154,19.230769,100.0,1.582869,0.001897,66.12,14.690476,18.636364,1.123018,16.594890,2.547564,62.545455,0.517308,2006,0.00220,0.155682,
1,1AACO004.84,2008-06-25,320,0.521,25.757012,0.000000,0.952381,0.000000,5.714286,27.619048,26.666667,14.285714,10.476190,7.619048,6.666667,0.0,74.285714,18.095238,100.0,1.165237,0.004859,56.22,9.580952,15.372727,0.989473,15.536282,2.275836,51.636364,-0.245939,2008,0.00521,0.224882,
2,1AACO006.10,2008-06-26,520,0.173,35.429907,2.857143,7.619048,0.952381,4.761905,21.904762,22.857143,12.380952,20.952381,2.857143,2.857143,0.0,61.904762,23.809524,100.0,1.327436,0.000897,63.57,13.819048,17.800000,1.112973,15.993204,2.990245,56.818182,0.374672,2008,0.00173,0.137436,
3,1AACO009.14,2008-06-26,560,0.223,22.451871,2.857143,4.761905,0.000000,9.523810,18.095238,28.571429,10.476190,17.142857,1.904762,6.666667,0.0,66.666667,19.047619,100.0,1.437585,0.005595,54.89,12.790476,15.681818,0.857991,18.277371,2.221627,63.454545,0.459602,2008,0.00223,0.137002,
4,1AAUA017.60,2005-09-22,160,0.400,19.910507,0.000000,0.000000,0.000000,7.619048,34.285714,20.000000,5.714286,28.571429,3.809524,0.000000,0.0,67.619048,32.380952,100.0,1.081350,0.008752,31.73,7.790476,10.759091,0.753664,14.275720,2.717300,65.454545,-0.086419,2005,0.00400,0.171970,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1120,2BXRK001.64,2022-10-26,150,1.287,9.486085,0.952381,5.714286,0.952381,0.000000,13.333333,10.476190,12.380952,53.333333,2.857143,0.000000,0.0,36.190476,56.190476,100.0,0.480012,0.001413,18.97,2.066667,2.736364,0.589700,4.640264,1.562427,67.090909,-1.157233,2022,0.01287,0.272860,Ref
1121,2-JKS070.06,2022-09-12,800,0.450,25.065800,9.523810,0.000000,0.000000,5.714286,30.476190,38.095238,2.857143,3.809524,9.523810,0.000000,0.0,77.142857,13.333333,100.0,1.545757,0.002030,68.57,20.238095,29.218182,1.540245,18.969822,2.949336,27.636364,-0.052227,2022,0.00450,0.260757,
1122,4ASNA007.82,2022-09-20,320,0.430,15.910317,0.000000,0.000000,0.000000,0.000000,0.000000,3.809524,7.619048,80.000000,6.666667,1.904762,0.0,11.428571,86.666667,100.0,-0.402036,0.007645,34.39,7.576190,13.836364,1.057536,13.083582,2.298445,95.909091,-1.817226,2022,0.00430,0.211211,
1123,8-MIC001.47,2022-10-25,150,0.300,10.014865,0.000000,9.708738,0.000000,0.000000,0.000000,0.000000,0.000000,56.310680,32.038835,1.941748,0.0,0.000000,88.349515,100.0,-0.582559,0.005560,26.58,2.919048,4.390909,0.556709,7.887260,1.247618,89.090909,-1.544161,2022,0.00300,0.128000,
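The reference-site table is de-duplicated on `StationID` and `Date`, so each site visit appears only once. As a minimal sketch on toy data (all values below are invented for illustration), `drop_duplicates` keeps the first row of each key pair by default:

```python
import pandas as pd

# Toy stand-in for the reference-site table; values are invented for illustration.
toy_ref = pd.DataFrame({
    "StationID": ["A001", "A001", "A001", "B002"],
    "Date": ["2008-06-26", "2008-06-26", "2009-07-01", "2008-06-26"],
    "Xembed": [56.8, 60.0, 51.6, 63.5],
})

# Keep one row per (StationID, Date) pair; the first occurrence is retained by default.
toy_dedup = toy_ref.drop_duplicates(subset=["StationID", "Date"])

print(len(toy_ref), "->", len(toy_dedup))
```

If repeated visits could carry different measurements, it may be worth inspecting the duplicates before dropping them, since the choice of which row survives depends on row order.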


## Step 2. Trim Datasets
In this model, we are specifically testing whether data derived only from the StreamStats processor can predict embeddedness for a site *and* whether they can provide insight into residuals from the baseline curve. Therefore, we retain only the data in the VADEQ set that can a) provide the target value, b) identify the specific site and measurement, and c) help us double-check the StreamStats output.

In [188]:
# Drop any columns that contain null values from the VADEQ table
vadeq_trim = vadeq_emd.dropna(axis = "columns")

# Rename columns with suffixes
vadeq_trim = vadeq_trim.rename(columns = {'Slope_x':'Slope',
                                         'SA_PCT_x':'SA_PCT',
                                         'FN_PCT_x':'FN_PCT',
                                         'SA_FN_PCT_x':'SA_FN_PCT',
                                         'LSUB_DMM_x':'LSUB_DMM'})
vadeq_ref = vadeq_ref.rename(columns = {'Slope_x':'Slope',
                                         'SA_PCT_x':'SA_PCT',
                                         'FN_PCT_x':'FN_PCT',
                                         'SA_FN_PCT_x':'SA_FN_PCT',
                                         'LSUB_DMM_x':'LSUB_DMM'})

# Display trimmed dataset
vadeq_trim

Unnamed: 0,StationID,Date,ReachLength,Slope,RP100,BR_PCT,HP_PCT,RC_PCT,BL_PCT,CB_PCT,GC_PCT,GF_PCT,SA_PCT,FN_PCT,WD_PCT,OT_PCT,BL_CB_GR_PCT,SA_FN_PCT,TotSubstrate_PCT,LSUB_DMM,VLW_msq,Xdepth,Xwid,XBKF_W,BKF_depth_in_meters,BKFW_BKFD,incised_depth,Xembed,LRBS2,Year
0,1AACO006.10,2006-11-21,440,0.220,34.752500,7.692308,6.730769,0.000000,9.615385,14.423077,35.576923,6.730769,16.346154,2.884615,0.000000,0.0,66.346154,19.230769,100.0,1.582869,0.001897,66.12,14.690476,18.636364,1.123018,16.594890,2.547564,62.545455,0.517308,2006
1,1AACO004.84,2008-06-25,320,0.521,25.757012,0.000000,0.952381,0.000000,5.714286,27.619048,26.666667,14.285714,10.476190,7.619048,6.666667,0.0,74.285714,18.095238,100.0,1.165237,0.004859,56.22,9.580952,15.372727,0.989473,15.536282,2.275836,51.636364,-0.245939,2008
2,1AACO006.10,2008-06-26,520,0.173,35.429907,2.857143,7.619048,0.952381,4.761905,21.904762,22.857143,12.380952,20.952381,2.857143,2.857143,0.0,61.904762,23.809524,100.0,1.327436,0.000897,63.57,13.819048,17.800000,1.112973,15.993204,2.990245,56.818182,0.374672,2008
3,1AACO009.14,2008-06-26,560,0.223,22.451871,2.857143,4.761905,0.000000,9.523810,18.095238,28.571429,10.476190,17.142857,1.904762,6.666667,0.0,66.666667,19.047619,100.0,1.437585,0.005595,54.89,12.790476,15.681818,0.857991,18.277371,2.221627,63.454545,0.459602,2008
4,1AAUA017.60,2005-09-22,160,0.400,19.910507,0.000000,0.000000,0.000000,7.619048,34.285714,20.000000,5.714286,28.571429,3.809524,0.000000,0.0,67.619048,32.380952,100.0,1.081350,0.008752,31.73,7.790476,10.759091,0.753664,14.275720,2.717300,65.454545,-0.086419,2005
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1120,2BXRK001.64,2022-10-26,150,1.287,9.486085,0.952381,5.714286,0.952381,0.000000,13.333333,10.476190,12.380952,53.333333,2.857143,0.000000,0.0,36.190476,56.190476,100.0,0.480012,0.001413,18.97,2.066667,2.736364,0.589700,4.640264,1.562427,67.090909,-1.157233,2022
1121,2-JKS070.06,2022-09-12,800,0.450,25.065800,9.523810,0.000000,0.000000,5.714286,30.476190,38.095238,2.857143,3.809524,9.523810,0.000000,0.0,77.142857,13.333333,100.0,1.545757,0.002030,68.57,20.238095,29.218182,1.540245,18.969822,2.949336,27.636364,-0.052227,2022
1122,4ASNA007.82,2022-09-20,320,0.430,15.910317,0.000000,0.000000,0.000000,0.000000,0.000000,3.809524,7.619048,80.000000,6.666667,1.904762,0.0,11.428571,86.666667,100.0,-0.402036,0.007645,34.39,7.576190,13.836364,1.057536,13.083582,2.298445,95.909091,-1.817226,2022
1123,8-MIC001.47,2022-10-25,150,0.300,10.014865,0.000000,9.708738,0.000000,0.000000,0.000000,0.000000,0.000000,56.310680,32.038835,1.941748,0.0,0.000000,88.349515,100.0,-0.582559,0.005560,26.58,2.919048,4.390909,0.556709,7.887260,1.247618,89.090909,-1.544161,2022
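The trimming cell above uses `dropna(axis = "columns")`, which removes every column that contains at least one null value and keeps all rows. The sketch below uses invented values, and `DrainageArea` is a hypothetical column name chosen only for illustration. It also shows one way to list the affected columns before dropping them:

```python
import numpy as np
import pandas as pd

# Toy data; values are invented and "DrainageArea" is a hypothetical column.
toy = pd.DataFrame({
    "StationID": ["A001", "B002", "C003"],
    "Xembed": [62.5, 51.6, 56.8],
    "DrainageArea": [40.5, np.nan, 36.3],  # one missing value
})

# Columns that contain at least one null (these are the ones dropna will remove).
cols_with_nulls = toy.columns[toy.isna().any()].tolist()

# Drop every column with any null; all rows are kept.
toy_trim = toy.dropna(axis="columns")

print(cols_with_nulls)
print(toy_trim.columns.tolist())
```

Because a single missing value is enough to remove a column, checking `cols_with_nulls` first can help confirm that nothing needed later is discarded.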


In [189]:
# Drop any columns containing nulls from the StreamStats watershed table (and rename the key for joining)
wshds_trim = strmsts_wshds.dropna(axis = "columns").rename(columns = {'Name':'StationID'})

# Display df
wshds_trim

Unnamed: 0,StationID,RELIEF,STATSCLY30,LC01IMP,STATSOM2_6,PDIGMET,LC06FORSHB,LC11GRASS,PKREGNO,BRMETA,STATSCLY40,PRECIP,STATSPERM,STATSCLY50,STATSCLY60,LC01DEV,LC06GRASS,LC06WATER,LC11IMP,STATSWATCP,I24H2Y,STATOM19_8,LC06CRPHAY,LC11WATER,LC06IMP,LC06DEV,LC11CRPHAY,LFREGNO,LC11DEV,LC01FORSHB,VRPLSLC,VRCARB,LC06WETLND,STATSGODEP,MINBELEV,LC11FORSHB,LC11WETLND,DRNAREA,LC01BARE,ELEV,LC11BARE,CPSED,LC01CRPHAY,STATSCLAY10,STATSCLY20,LC06BARE,LC01WATER,LC01HERB,STATSOM0_5,ELEVMAX,LC01WETLND,Shape_Length,Shape_Area
0,2-BGU005.67,242.0,0.00,0.27,0.00,100.00,67.53,6.44,1551,0.00,100.00,45.370,1.434,0.00,0.00,3.16,5.37,0.15,0.29,0.098,3.307,0,20.12,0.15,0.29,3.19,19.89,1545,3.27,70.51,0.00,0.00,3.45,62.80,280.85,66.54,3.54,18.40,0.10,416.25,0.17,0.00,20.14,0.00,0.00,0.19,0.12,2.46,100.00,522.63,3.51,47460,47622600
1,2-SNN001.19,195.0,0.00,0.58,0.00,100.00,65.18,7.36,1551,0.00,86.99,45.244,1.525,13.01,0.00,4.34,0.58,1.06,0.66,0.000,3.304,0,27.31,1.12,0.55,4.30,26.89,1545,4.17,63.83,0.00,0.00,1.58,65.08,407.11,58.84,1.62,1.96,0.14,520.17,0.00,0.00,27.78,0.00,0.00,0.00,0.32,2.00,100.00,601.83,1.59,12180,5080300
2,2BXBD000.12,195.0,3.09,0.51,2.96,99.45,40.57,1.39,1551,0.00,66.48,43.216,1.341,30.43,0.00,5.03,1.07,0.42,0.96,0.112,3.288,0,52.89,0.46,0.55,5.05,51.81,1545,6.19,40.65,0.00,0.00,0.00,66.44,137.11,39.99,0.17,1.81,0.00,242.66,0.00,0.00,52.97,0.00,0.00,0.00,0.46,0.89,97.04,332.23,0.00,13520,4684500
3,2-PWT001.23,260.0,0.00,13.93,0.00,45.90,25.81,0.19,1551,0.00,97.45,44.378,1.854,0.00,0.00,67.72,0.28,0.49,16.40,0.116,3.344,0,1.47,0.44,14.66,69.37,0.61,1545,73.80,27.16,0.00,0.00,2.57,65.99,113.19,22.60,2.32,10.60,0.00,263.68,0.00,54.10,1.56,0.00,2.55,0.00,0.49,0.45,100.00,373.67,2.63,35340,27437300
4,2-WTK001.50,224.0,0.00,0.26,0.00,80.50,76.66,6.83,1551,0.00,94.10,45.016,1.582,4.67,0.00,3.05,3.18,0.25,0.27,0.109,3.321,0,12.71,0.25,0.26,3.04,12.56,1545,3.06,71.91,0.00,0.00,3.97,64.50,158.64,73.12,3.96,25.60,0.04,265.79,0.23,0.00,14.45,1.23,0.00,0.18,0.17,6.33,100.00,382.37,4.05,53920,66302000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1130,9-RDC013.79,1980.0,0.00,1.57,28.95,0.00,52.13,0.94,1554,0.00,14.65,40.307,2.595,0.00,47.81,7.08,0.32,0.04,1.83,0.114,2.538,0,39.98,0.04,1.71,7.32,39.80,1547,7.53,51.87,56.77,43.23,0.12,52.98,1980.38,51.50,0.12,244.00,0.04,2522.11,0.07,0.00,40.73,0.00,37.54,0.08,0.04,0.19,71.05,3964.73,0.05,186400,631519100
1131,8-NAR006.15,1070.0,45.72,0.61,4.43,81.50,66.37,2.37,1550,4.84,35.66,44.224,1.757,13.48,0.00,4.68,1.81,4.53,0.70,0.114,3.294,0,19.65,4.52,0.62,4.75,19.05,1544,6.39,66.33,0.00,0.00,2.67,58.57,40.52,64.19,3.25,461.00,0.16,341.34,0.24,7.26,20.20,1.68,3.46,0.22,4.53,1.36,95.57,1112.72,2.73,296320,1193433800
1132,8-MTA012.09,219.0,0.07,0.34,0.00,93.57,75.82,2.73,1551,0.00,99.93,43.597,1.534,0.00,0.00,4.14,0.53,0.22,0.45,0.099,3.222,0,13.02,0.27,0.35,4.15,12.64,1545,5.19,73.59,0.00,0.00,6.13,63.57,183.41,70.88,8.02,43.50,0.05,288.55,0.28,6.43,13.57,0.00,0.00,0.13,0.23,2.24,100.00,402.71,6.18,64580,112758900
1133,8-HER012.99,85.0,0.00,0.47,0.00,0.00,67.26,2.72,1550,0.00,100.00,44.194,1.000,0.00,0.00,5.91,1.42,0.33,0.59,0.000,3.258,0,19.55,0.45,0.46,5.88,19.90,1544,6.62,67.76,0.00,0.00,5.56,78.00,126.15,62.77,6.99,5.54,0.00,189.37,0.57,100.00,19.93,0.00,0.00,0.00,0.33,0.38,100.00,211.20,5.68,24660,14355300


In [190]:
# Pivot flow table from long to wide format (one row per station)
# Note: pivot_table averages duplicate Name/StatName entries by default
flw_pvt = strmsts_flw.pivot_table(index = "Name",
                               columns = "StatName",
                               values = "Value").reset_index()
# Drop any columns containing nulls (and rename the key for joining)
flw_trim = flw_pvt.dropna(axis = "columns").rename(columns = {'Name':'StationID'})

# Print new df to check pivoting visually
flw_trim

StatName,StationID,1 Day 1.11 Year Low Flow,1 Day 1.25 Year Low Flow,1 Day 1.43 Year Low Flow,1 Day 1.67 Year Low Flow,1 Day 2 Year Low Flow,10-percent AEP flood,2-percent AEP flood,20-percent AEP flood,30 Day 1.11 Year Low Flow,30 Day 1.25 Year Low Flow,30 Day 1.43 Year Low Flow,4 Day 1.11 Year Low Flow,4 Day 1.25 Year Low Flow,4-percent AEP flood,7 Day 1.11 Year Low Flow,7 Day 1.25 Year Low Flow,7 Day 1.43 Year Low Flow,7 Day 1.67 Year Low Flow,Bieger_D_channel_cross_sectional_area,Bieger_D_channel_depth,Bieger_D_channel_width,Bieger_USA_channel_cross_sectional_area,Bieger_USA_channel_depth,Bieger_USA_channel_width,Urban 0.2-percent AEP flood,Urban 0.5-percent AEP flood,Urban 1-percent AEP flood,Urban 10-percent AEP flood,Urban 2-percent AEP flood,Urban 20-Percent AEP flood,Urban 4-percent AEP flood,Urban 42.9-percent AEP flood,Urban 50-percent AEP flood,Urban 66.7-percent AEP flood,Urban 80-percent AEP flood,Urban 90-percent AEP flood,Urban 95-percent AEP flood,Urban 99-percent AEP flood,Urban 99.5-percent AEP flood
0,1AACO004.84,6.723333,4.653333,3.446667,2.620000,1.960000,2853.333333,5380.000000,2009.333333,14.666667,10.523333,7.973333,9.220000,6.893333,4193.333333,11.246667,8.433333,6.696667,5.3900,233.00,3.240,70.60,126.00,2.650,45.60,31866.666667,17166.666667,12633.333333,5350.000000,9886.666667,3886.666667,8006.666667,2400.000000,2140.000000,1543.333333,1206.666667,891.000000,782.000000,536.666667,487.666667
1,1AACO006.10,0.012467,0.007940,0.005353,0.003673,0.002477,71.533333,166.366667,45.000000,0.039167,0.027933,0.020600,0.033533,0.024733,118.666667,0.041400,0.029733,0.022600,0.0172,2.66,0.523,5.05,4.07,0.685,4.86,426.333333,351.333333,293.666667,136.333333,219.666667,106.866667,167.333333,84.400000,78.800000,78.666667,65.633333,63.500000,54.800000,64.933333,64.200000
2,1AACO009.14,6.106667,4.216667,3.123333,2.366667,1.766667,2736.666667,5190.000000,1925.666667,13.266667,9.520000,7.210000,8.476667,6.346667,4036.666667,10.343333,7.773333,6.183333,4.9800,216.00,3.140,67.50,119.00,2.590,43.80,29866.666667,16133.333333,11900.000000,5053.333333,9313.333333,3676.666667,7540.000000,2276.666667,2033.333333,1476.666667,1153.333333,855.666667,751.000000,520.000000,474.000000
3,1AAUA017.60,5.760000,4.010000,3.000000,2.300000,1.740000,3200.000000,6040.000000,2250.000000,11.300000,8.210000,6.260000,8.200000,6.330000,4710.000000,10.200000,7.940000,6.460000,5.3100,188.00,2.970,62.10,107.00,2.480,40.90,11500.000000,8680.000000,6746.666667,2803.333333,5493.333333,1996.666667,4196.666667,1273.333333,1133.333333,892.333333,734.333333,561.666667,446.000000,319.666667,284.333333
4,1ABAR038.57,25.000000,20.900000,18.300000,16.100000,14.400000,8460.000000,14200.000000,6340.000000,34.800000,28.000000,24.000000,26.300000,21.900000,11600.000000,26.900000,22.300000,19.400000,17.3000,487.00,4.380,109.00,222.00,3.310,65.80,27633.333333,22600.000000,17466.666667,7520.000000,14100.000000,5410.000000,10900.000000,3433.333333,3050.000000,2390.000000,1920.000000,1460.000000,1150.000000,810.000000,714.000000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1130,9-XES000.94,0.112000,0.085500,0.069900,0.058100,0.048500,287.000000,587.000000,190.000000,0.170000,0.123000,0.098500,0.115000,0.088100,442.000000,0.119000,0.090300,0.073600,0.0614,13.50,1.020,13.20,14.20,1.120,11.00,1336.666667,922.000000,738.666667,304.000000,587.666667,221.333333,442.666667,151.333333,137.333333,117.666667,100.366667,82.800000,68.700000,59.400000,55.400000
1131,9-XFH000.92,0.046300,0.029700,0.020600,0.014500,0.010100,209.000000,478.000000,129.000000,0.077200,0.050000,0.035300,0.047300,0.030800,343.000000,0.050900,0.032800,0.022900,0.0162,6.86,0.770,8.82,8.43,0.912,7.81,380.000000,240.666667,194.666667,69.433333,166.000000,47.200000,118.333333,30.233333,26.900000,22.200000,20.633333,15.766667,12.366667,9.433333,8.563333
1132,9-XFM000.17,0.117000,0.089500,0.073100,0.060900,0.050800,295.000000,603.000000,196.000000,0.178000,0.129000,0.103000,0.120000,0.092100,454.000000,0.124000,0.094500,0.077000,0.0643,13.90,1.030,13.40,14.50,1.130,11.10,3350.000000,2443.333333,1940.000000,946.666667,1413.333333,755.333333,1136.666667,570.333333,531.333333,483.666667,377.666667,348.666667,314.666667,334.000000,326.333333
1133,9-XFZ000.08,0.210000,0.141000,0.101000,0.073800,0.053200,507.000000,1110.000000,321.000000,0.346000,0.233000,0.170000,0.218000,0.148000,811.000000,0.235000,0.158000,0.114000,0.0838,18.00,1.140,15.60,17.60,1.220,12.60,980.333333,661.000000,530.333333,197.000000,446.666667,135.333333,324.000000,86.633333,76.966667,62.900000,56.633333,43.233333,33.866667,25.333333,22.833333
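One detail worth keeping in mind is that `pivot_table` aggregates with the mean by default. When a station has the same `StatName` reported under more than one `RegionName`, the pivoted value is the average of those entries. This may explain the repeating decimals (such as 6.723333) in the table above, although that is an inference from the output and not something the source states. A minimal sketch with invented numbers and hypothetical region names:

```python
import pandas as pd

# Long-format toy table mimicking the flow statistics layout; values are invented.
toy_flw = pd.DataFrame({
    "Name": ["A001", "A001", "A001", "B002"],
    "RegionName": ["Region_1", "Region_2", "Region_1", "Region_1"],
    "StatName": ["2-percent AEP flood", "2-percent AEP flood",
                 "10-percent AEP flood", "2-percent AEP flood"],
    "Value": [100.0, 200.0, 50.0, 80.0],
})

# Default aggfunc is the mean, so A001's two 2-percent values are averaged to 150.
toy_pvt = toy_flw.pivot_table(index="Name", columns="StatName",
                              values="Value").reset_index()

# B002 has no 10-percent value, so that column contains a NaN and is dropped.
toy_trim = toy_pvt.dropna(axis="columns").rename(columns={"Name": "StationID"})

print(toy_trim)
```

If averaging across regions is not intended, passing an explicit `aggfunc` or filtering on `RegionName` before pivoting would make the choice visible in the code.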


Now we will pivot and trim the characteristics table. This table has watershed areas, which we will be able to compare to some of the drainage areas that are available in the ProbMon dataset. **We will not use this dataset for model building directly.**

In [191]:
# Pivot characteristics table based on Region and Statistic
chr_pvt = strmsts_chr.pivot_table(index='Name', 
                                  columns = ['RegionName',
                                             'StatName'], 
                                  values = 'Value').reset_index()

# Drop any columns containing nulls (and rename the key for joining)
chr_trim = chr_pvt.dropna(axis = "columns").rename(columns = {'Name':'StationID'})

# Display df
chr_trim

RegionName,StationID,Peak_Urban01_2014_5090,Peak_Urban01_2014_5090,Peak_Urban06_2014_5090,Peak_Urban06_2014_5090,Peak_Urban11_2014_5090,Peak_Urban11_2014_5090,USA_Bieger_2015
StatName,Unnamed: 1_level_1,Area that drains to a point on a stream,Percentage of land-use from NLCD 2001 classes 21-24,Area that drains to a point on a stream,Percentage of land-use from NLCD 2006 classes 21-24,Area that drains to a point on a stream,Percentage of developed (urban) land from NLCD 2011 classes 21-24,Area that drains to a point on a stream
0,1AACO004.84,40.5000,70.30,40.5000,71.00,40.5000,73.50,40.5000
1,1AACO006.10,0.0702,57.00,0.0702,56.90,0.0702,83.80,0.0702
2,1AACO009.14,36.3000,71.40,36.3000,71.80,36.3000,73.50,36.3000
3,1AAUA017.60,29.7000,12.30,29.7000,13.20,29.7000,15.70,29.7000
4,1ABAR038.57,115.0000,8.71,115.0000,8.70,115.0000,8.80,115.0000
...,...,...,...,...,...,...,...,...
1130,9-XES000.94,0.7100,36.20,0.7100,42.60,0.7100,46.10,0.7100
1131,9-XFH000.92,0.2700,5.66,0.2700,5.12,0.2700,4.24,0.2700
1132,9-XFM000.17,0.7400,97.10,0.7400,98.50,0.7400,98.70,0.7400
1133,9-XFZ000.08,1.0600,3.31,1.0600,3.58,1.0600,6.11,1.0600
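Because the characteristics table is pivoted on two column keys, the result carries two-level (MultiIndex) column labels, which is why the display above shows both a `RegionName` and a `StatName` header row. If single-level names are easier to work with when comparing drainage areas, the two levels can be joined into one string. This is a sketch on invented data and is not part of the original workflow; the `|` separator, the region names, and the statistic label are all assumptions:

```python
import pandas as pd

# Toy long-format characteristics table; names and values are invented.
toy_chr = pd.DataFrame({
    "Name": ["A001", "A001", "B002", "B002"],
    "RegionName": ["Region_1", "Region_2", "Region_1", "Region_2"],
    "StatName": ["Drainage area"] * 4,
    "Value": [40.5, 40.5, 29.7, 29.7],
})

# Pivoting on two column keys produces MultiIndex column labels.
toy_pvt = toy_chr.pivot_table(index="Name", columns=["RegionName", "StatName"],
                              values="Value").reset_index()

# Join the header levels into one string, skipping empty parts
# (after reset_index, the "Name" column has an empty second level).
toy_pvt.columns = ["|".join(part for part in col if part) for col in toy_pvt.columns]
toy_flat = toy_pvt.rename(columns={"Name": "StationID"})

print(toy_flat.columns.tolist())
```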


## Step 3. Join Tables

We will join the five key tables (VADEQ embeddedness, global watershed, reference sites, ecoregion info, and flow) into one dataset for initial EDA.

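In [192]:
# Columns shared by the VADEQ and watershed tables (used as the join key)
common_columns = list(set(vadeq_trim.columns) & set(wshds_trim.columns))
common_columns

['StationID']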

In [193]:
# Use a left join for the VADEQ set and watershed characteristics
common_columns = list(set(vadeq_trim.columns) & set(wshds_trim.columns))
join1 = vadeq_trim.merge(wshds_trim, 
                        on = common_columns, 
                        how = "left")

# Print first join
join1

Unnamed: 0,StationID,Date,ReachLength,Slope,RP100,BR_PCT,HP_PCT,RC_PCT,BL_PCT,CB_PCT,GC_PCT,GF_PCT,SA_PCT,FN_PCT,WD_PCT,OT_PCT,BL_CB_GR_PCT,SA_FN_PCT,TotSubstrate_PCT,LSUB_DMM,VLW_msq,Xdepth,Xwid,XBKF_W,BKF_depth_in_meters,BKFW_BKFD,incised_depth,Xembed,LRBS2,Year,RELIEF,STATSCLY30,LC01IMP,STATSOM2_6,PDIGMET,LC06FORSHB,LC11GRASS,PKREGNO,BRMETA,STATSCLY40,PRECIP,STATSPERM,STATSCLY50,STATSCLY60,LC01DEV,LC06GRASS,LC06WATER,LC11IMP,STATSWATCP,I24H2Y,STATOM19_8,LC06CRPHAY,LC11WATER,LC06IMP,LC06DEV,LC11CRPHAY,LFREGNO,LC11DEV,LC01FORSHB,VRPLSLC,VRCARB,LC06WETLND,STATSGODEP,MINBELEV,LC11FORSHB,LC11WETLND,DRNAREA,LC01BARE,ELEV,LC11BARE,CPSED,LC01CRPHAY,STATSCLAY10,STATSCLY20,LC06BARE,LC01WATER,LC01HERB,STATSOM0_5,ELEVMAX,LC01WETLND,Shape_Length,Shape_Area
0,1AACO006.10,2006-11-21,440,0.220,34.752500,7.692308,6.730769,0.000000,9.615385,14.423077,35.576923,6.730769,16.346154,2.884615,0.000000,0.0,66.346154,19.230769,100.0,1.582869,0.001897,66.12,14.690476,18.636364,1.123018,16.594890,2.547564,62.545455,0.517308,2006,171.0,100.00,14.94,0.00,33.39,37.62,0.00,1551.0,0.0,0.00,43.078,2.895,0.00,0.00,57.04,0.00,0.00,19.30,0.000,3.140,0.0,0.00,0.00,14.20,56.93,0.00,1545.0,83.80,42.96,0.00,0.00,0.00,70.48,54.96,16.23,0.00,0.0702,0.00,148.98,0.00,66.61,0.00,0.0,0.00,5.45,0.00,0.00,100.00,225.47,0.00,2380.0,181800.0
1,1AACO004.84,2008-06-25,320,0.521,25.757012,0.000000,0.952381,0.000000,5.714286,27.619048,26.666667,14.285714,10.476190,7.619048,6.666667,0.0,74.285714,18.095238,100.0,1.165237,0.004859,56.22,9.580952,15.372727,0.989473,15.536282,2.275836,51.636364,-0.245939,2008,451.0,48.13,21.97,0.00,87.78,25.10,0.01,1550.0,0.0,0.02,43.888,2.395,0.00,0.00,70.34,0.01,0.25,23.10,0.133,3.167,0.0,0.28,0.26,22.19,71.05,0.11,1544.0,73.50,25.82,0.00,0.00,3.27,66.65,39.50,22.49,3.50,40.5000,0.03,308.44,0.17,11.98,0.28,0.0,51.85,0.05,0.25,0.01,100.00,490.18,3.28,85900.0,105013700.0
2,1AACO006.10,2008-06-26,520,0.173,35.429907,2.857143,7.619048,0.952381,4.761905,21.904762,22.857143,12.380952,20.952381,2.857143,2.857143,0.0,61.904762,23.809524,100.0,1.327436,0.000897,63.57,13.819048,17.800000,1.112973,15.993204,2.990245,56.818182,0.374672,2008,171.0,100.00,14.94,0.00,33.39,37.62,0.00,1551.0,0.0,0.00,43.078,2.895,0.00,0.00,57.04,0.00,0.00,19.30,0.000,3.140,0.0,0.00,0.00,14.20,56.93,0.00,1545.0,83.80,42.96,0.00,0.00,0.00,70.48,54.96,16.23,0.00,0.0702,0.00,148.98,0.00,66.61,0.00,0.0,0.00,5.45,0.00,0.00,100.00,225.47,0.00,2380.0,181800.0
3,1AACO009.14,2008-06-26,560,0.223,22.451871,2.857143,4.761905,0.000000,9.523810,18.095238,28.571429,10.476190,17.142857,1.904762,6.666667,0.0,66.666667,19.047619,100.0,1.437585,0.005595,54.89,12.790476,15.681818,0.857991,18.277371,2.221627,63.454545,0.459602,2008,357.0,45.34,21.16,0.00,95.20,24.49,0.00,1551.0,0.0,0.02,43.981,2.271,0.00,0.00,71.37,0.00,0.28,22.00,0.135,3.169,0.0,0.26,0.28,21.37,71.85,0.09,1545.0,73.50,24.98,0.00,0.00,3.12,66.39,132.81,22.65,3.52,36.3000,0.01,324.56,0.00,4.53,0.25,0.0,54.64,0.01,0.28,0.00,100.00,490.18,3.11,73540.0,93979700.0
4,1AAUA017.60,2005-09-22,160,0.400,19.910507,0.000000,0.000000,0.000000,7.619048,34.285714,20.000000,5.714286,28.571429,3.809524,0.000000,0.0,67.619048,32.380952,100.0,1.081350,0.008752,31.73,7.790476,10.759091,0.753664,14.275720,2.717300,65.454545,-0.086419,2005,278.0,40.11,1.53,0.11,99.38,71.52,2.12,1551.0,0.0,59.60,43.317,1.506,0.00,0.00,12.29,1.89,0.25,1.91,0.112,3.055,0.0,8.59,0.25,1.75,13.23,7.74,1545.0,15.70,72.78,0.00,0.00,4.47,56.77,165.53,67.67,6.40,29.7000,0.05,317.16,0.10,0.38,8.68,0.0,0.30,0.04,0.25,1.42,99.89,443.72,4.53,57900.0,76884600.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1120,2BXRK001.64,2022-10-26,150,1.287,9.486085,0.952381,5.714286,0.952381,0.000000,13.333333,10.476190,12.380952,53.333333,2.857143,0.000000,0.0,36.190476,56.190476,100.0,0.480012,0.001413,18.97,2.066667,2.736364,0.589700,4.640264,1.562427,67.090909,-1.157233,2022,575.0,100.00,0.17,0.00,0.00,95.66,0.00,1553.0,100.0,0.00,49.307,2.000,0.00,0.00,4.15,0.00,0.00,0.19,0.000,3.723,0.0,0.00,0.00,0.20,4.34,0.00,1546.0,4.66,95.85,0.00,0.00,0.00,68.00,574.73,95.34,0.00,0.7600,0.00,796.84,0.00,0.00,0.00,0.0,0.00,0.00,0.00,0.00,100.00,1149.30,0.00,8240.0,1967300.0
1121,2-JKS070.06,2022-09-12,800,0.450,25.065800,9.523810,0.000000,0.000000,5.714286,30.476190,38.095238,2.857143,3.809524,9.523810,0.000000,0.0,77.142857,13.333333,100.0,1.545757,0.002030,68.57,20.238095,29.218182,1.540245,18.969822,2.949336,27.636364,-0.052227,2022,2550.0,0.00,0.09,8.97,0.00,77.61,0.22,1554.0,0.0,23.43,43.163,5.089,0.00,7.39,4.29,0.04,0.02,0.11,0.097,2.959,0.0,17.99,0.02,0.09,4.28,17.98,1547.0,4.29,77.42,88.28,11.72,0.02,44.64,1825.15,77.43,0.02,104.0000,0.02,2821.90,0.04,0.00,18.21,0.0,69.18,0.03,0.02,0.00,91.03,4377.44,0.02,125400.0,270367300.0
1122,4ASNA007.82,2022-09-20,320,0.430,15.910317,0.000000,0.000000,0.000000,0.000000,0.000000,3.809524,7.619048,80.000000,6.666667,1.904762,0.0,11.428571,86.666667,100.0,-0.402036,0.007645,34.39,7.576190,13.836364,1.057536,13.083582,2.298445,95.909091,-1.817226,2022,691.0,0.00,0.64,0.00,94.61,62.42,8.79,1551.0,0.0,9.51,45.609,1.114,90.49,0.00,5.25,7.14,0.31,0.67,0.101,3.319,0.0,23.28,0.32,0.65,5.27,23.17,1545.0,5.26,62.84,0.00,0.00,1.42,71.04,393.33,60.89,1.49,79.9000,0.19,604.47,0.09,0.00,23.51,0.0,0.00,0.18,0.32,6.46,100.00,1084.73,1.43,98300.0,206883000.0
1123,8-MIC001.47,2022-10-25,150,0.300,10.014865,0.000000,9.708738,0.000000,0.000000,0.000000,0.000000,0.000000,56.310680,32.038835,1.941748,0.0,0.000000,88.349515,100.0,-0.582559,0.005560,26.58,2.919048,4.390909,0.556709,7.887260,1.247618,89.090909,-1.544161,2022,197.0,0.00,0.22,0.00,54.61,42.67,0.34,1551.0,0.0,0.00,43.671,1.000,100.00,0.00,2.44,0.60,1.94,0.26,0.000,3.290,0.0,48.80,2.38,0.24,2.47,49.07,1545.0,2.68,42.49,0.00,0.00,3.53,72.00,204.11,41.15,4.19,5.6500,0.00,298.12,0.18,45.39,49.36,0.0,0.00,0.00,1.94,0.25,100.00,401.50,3.51,22580.0,14620900.0
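A left join keeps every VADEQ visit even when a station has no StreamStats match, which leaves nulls in the joined columns for that row. The sketch below, on invented data, uses two optional `merge` arguments that can help check a join like this one: `validate="many_to_one"` raises an error if the right-hand table has duplicate keys, and `indicator=True` adds a `_merge` column that flags unmatched rows.

```python
import pandas as pd

# Toy visit table (several visits per station) and toy watershed table
# (one row per station); all values are invented for illustration.
toy_visits = pd.DataFrame({
    "StationID": ["A001", "A001", "B002", "C003"],
    "Year": [2006, 2008, 2008, 2022],
    "Xembed": [62.5, 56.8, 51.6, 89.1],
})
toy_wshds = pd.DataFrame({
    "StationID": ["A001", "B002"],
    "DRNAREA": [0.07, 40.5],
})

# Left join: every visit is kept; validate checks that StationID is unique on the right.
toy_join = toy_visits.merge(toy_wshds, on="StationID", how="left",
                            validate="many_to_one", indicator=True)

# Stations that found no match in the watershed table.
unmatched = toy_join.loc[toy_join["_merge"] == "left_only", "StationID"].tolist()

print(len(toy_join), unmatched)
```

With a many-to-one left join, the row count of the result should equal the row count of the left table, which is a quick sanity check after each join step.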


In [194]:
# Join join1 to ecoregion information
common_columns = list(set(join1.columns) & set(vadeq_eco.columns))
join2 = join1.merge(vadeq_eco,
                    on = common_columns,
                    how = "left")
# Display join2
join2

Unnamed: 0,StationID,Date,ReachLength,Slope,RP100,BR_PCT,HP_PCT,RC_PCT,BL_PCT,CB_PCT,GC_PCT,GF_PCT,SA_PCT,FN_PCT,WD_PCT,OT_PCT,BL_CB_GR_PCT,SA_FN_PCT,TotSubstrate_PCT,LSUB_DMM,VLW_msq,Xdepth,Xwid,XBKF_W,BKF_depth_in_meters,BKFW_BKFD,incised_depth,Xembed,LRBS2,Year,RELIEF,STATSCLY30,LC01IMP,STATSOM2_6,PDIGMET,LC06FORSHB,LC11GRASS,PKREGNO,BRMETA,STATSCLY40,PRECIP,STATSPERM,STATSCLY50,STATSCLY60,LC01DEV,LC06GRASS,LC06WATER,LC11IMP,STATSWATCP,I24H2Y,STATOM19_8,LC06CRPHAY,LC11WATER,LC06IMP,LC06DEV,LC11CRPHAY,LFREGNO,LC11DEV,LC01FORSHB,VRPLSLC,VRCARB,LC06WETLND,STATSGODEP,MINBELEV,LC11FORSHB,LC11WETLND,DRNAREA,LC01BARE,ELEV,LC11BARE,CPSED,LC01CRPHAY,STATSCLAY10,STATSCLY20,LC06BARE,LC01WATER,LC01HERB,STATSOM0_5,ELEVMAX,LC01WETLND,Shape_Length,Shape_Area,FID,LATITUDE,LONGITUDE,OBJECTID,Join_Count,TARGET_FID,US_L3CODE,US_L3NAME,NA_L3CODE,NA_L3NAME,NA_L2CODE,NA_L2NAME,NA_L1CODE,NA_L1NAME,STATE_NAME,EPA_REGION,L3_KEY,L2_KEY,L1_KEY,Shape_Leng
0,1AACO006.10,2006-11-21,440,0.220,34.752500,7.692308,6.730769,0.000000,9.615385,14.423077,35.576923,6.730769,16.346154,2.884615,0.000000,0.0,66.346154,19.230769,100.0,1.582869,0.001897,66.12,14.690476,18.636364,1.123018,16.594890,2.547564,62.545455,0.517308,2006,171.0,100.00,14.94,0.00,33.39,37.62,0.00,1551.0,0.0,0.00,43.078,2.895,0.00,0.00,57.04,0.00,0.00,19.30,0.000,3.140,0.0,0.00,0.00,14.20,56.93,0.00,1545.0,83.80,42.96,0.00,0.00,0.00,70.48,54.96,16.23,0.00,0.0702,0.00,148.98,0.00,66.61,0.00,0.0,0.00,5.45,0.00,0.00,100.00,225.47,0.00,2380.0,181800.0,0.0,38.728611,-77.203333,1.0,1.0,0.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06
1,1AACO004.84,2008-06-25,320,0.521,25.757012,0.000000,0.952381,0.000000,5.714286,27.619048,26.666667,14.285714,10.476190,7.619048,6.666667,0.0,74.285714,18.095238,100.0,1.165237,0.004859,56.22,9.580952,15.372727,0.989473,15.536282,2.275836,51.636364,-0.245939,2008,451.0,48.13,21.97,0.00,87.78,25.10,0.01,1550.0,0.0,0.02,43.888,2.395,0.00,0.00,70.34,0.01,0.25,23.10,0.133,3.167,0.0,0.28,0.26,22.19,71.05,0.11,1544.0,73.50,25.82,0.00,0.00,3.27,66.65,39.50,22.49,3.50,40.5000,0.03,308.44,0.17,11.98,0.28,0.0,51.85,0.05,0.25,0.01,100.00,490.18,3.28,85900.0,105013700.0,1.0,38.720500,-77.190722,2.0,1.0,1.0,65.0,Southeastern Plains,8.3.5,Southeastern Plains,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,65 Southeastern Plains,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.388810e+06
2,1AACO006.10,2008-06-26,520,0.173,35.429907,2.857143,7.619048,0.952381,4.761905,21.904762,22.857143,12.380952,20.952381,2.857143,2.857143,0.0,61.904762,23.809524,100.0,1.327436,0.000897,63.57,13.819048,17.800000,1.112973,15.993204,2.990245,56.818182,0.374672,2008,171.0,100.00,14.94,0.00,33.39,37.62,0.00,1551.0,0.0,0.00,43.078,2.895,0.00,0.00,57.04,0.00,0.00,19.30,0.000,3.140,0.0,0.00,0.00,14.20,56.93,0.00,1545.0,83.80,42.96,0.00,0.00,0.00,70.48,54.96,16.23,0.00,0.0702,0.00,148.98,0.00,66.61,0.00,0.0,0.00,5.45,0.00,0.00,100.00,225.47,0.00,2380.0,181800.0,0.0,38.728611,-77.203333,1.0,1.0,0.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06
3,1AACO009.14,2008-06-26,560,0.223,22.451871,2.857143,4.761905,0.000000,9.523810,18.095238,28.571429,10.476190,17.142857,1.904762,6.666667,0.0,66.666667,19.047619,100.0,1.437585,0.005595,54.89,12.790476,15.681818,0.857991,18.277371,2.221627,63.454545,0.459602,2008,357.0,45.34,21.16,0.00,95.20,24.49,0.00,1551.0,0.0,0.02,43.981,2.271,0.00,0.00,71.37,0.00,0.28,22.00,0.135,3.169,0.0,0.26,0.28,21.37,71.85,0.09,1545.0,73.50,24.98,0.00,0.00,3.12,66.39,132.81,22.65,3.52,36.3000,0.01,324.56,0.00,4.53,0.25,0.0,54.64,0.01,0.28,0.00,100.00,490.18,3.11,73540.0,93979700.0,2.0,38.761944,-77.207222,3.0,1.0,2.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06
4,1AAUA017.60,2005-09-22,160,0.400,19.910507,0.000000,0.000000,0.000000,7.619048,34.285714,20.000000,5.714286,28.571429,3.809524,0.000000,0.0,67.619048,32.380952,100.0,1.081350,0.008752,31.73,7.790476,10.759091,0.753664,14.275720,2.717300,65.454545,-0.086419,2005,278.0,40.11,1.53,0.11,99.38,71.52,2.12,1551.0,0.0,59.60,43.317,1.506,0.00,0.00,12.29,1.89,0.25,1.91,0.112,3.055,0.0,8.59,0.25,1.75,13.23,7.74,1545.0,15.70,72.78,0.00,0.00,4.47,56.77,165.53,67.67,6.40,29.7000,0.05,317.16,0.10,0.38,8.68,0.0,0.30,0.04,0.25,1.42,99.89,443.72,4.53,57900.0,76884600.0,259.0,38.490361,-77.466389,260.0,1.0,259.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1120,2BXRK001.64,2022-10-26,150,1.287,9.486085,0.952381,5.714286,0.952381,0.000000,13.333333,10.476190,12.380952,53.333333,2.857143,0.000000,0.0,36.190476,56.190476,100.0,0.480012,0.001413,18.97,2.066667,2.736364,0.589700,4.640264,1.562427,67.090909,-1.157233,2022,575.0,100.00,0.17,0.00,0.00,95.66,0.00,1553.0,100.0,0.00,49.307,2.000,0.00,0.00,4.15,0.00,0.00,0.19,0.000,3.723,0.0,0.00,0.00,0.20,4.34,0.00,1546.0,4.66,95.85,0.00,0.00,0.00,68.00,574.73,95.34,0.00,0.7600,0.00,796.84,0.00,0.00,0.00,0.0,0.00,0.00,0.00,0.00,100.00,1149.30,0.00,8240.0,1967300.0,308.0,37.811383,-78.729900,309.0,1.0,308.0,64.0,Northern Piedmont,8.3.1,Northern Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,64 Northern Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,8.206757e+05
1121,2-JKS070.06,2022-09-12,800,0.450,25.065800,9.523810,0.000000,0.000000,5.714286,30.476190,38.095238,2.857143,3.809524,9.523810,0.000000,0.0,77.142857,13.333333,100.0,1.545757,0.002030,68.57,20.238095,29.218182,1.540245,18.969822,2.949336,27.636364,-0.052227,2022,2550.0,0.00,0.09,8.97,0.00,77.61,0.22,1554.0,0.0,23.43,43.163,5.089,0.00,7.39,4.29,0.04,0.02,0.11,0.097,2.959,0.0,17.99,0.02,0.09,4.28,17.98,1547.0,4.29,77.42,88.28,11.72,0.02,44.64,1825.15,77.43,0.02,104.0000,0.02,2821.90,0.04,0.00,18.21,0.0,69.18,0.03,0.02,0.00,91.03,4377.44,0.02,125400.0,270367300.0,903.0,38.123444,-79.779725,904.0,1.0,903.0,67.0,Ridge and Valley,8.4.1,Ridge and Valley,8.4,OZARK/OUACHITA-APPALACHIAN FORESTS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,67 Ridge and Valley,8.4 OZARK/OUACHITA-APPALACHIAN FORESTS,8 EASTERN TEMPERATE FORESTS,1.764867e+06
1122,4ASNA007.82,2022-09-20,320,0.430,15.910317,0.000000,0.000000,0.000000,0.000000,0.000000,3.809524,7.619048,80.000000,6.666667,1.904762,0.0,11.428571,86.666667,100.0,-0.402036,0.007645,34.39,7.576190,13.836364,1.057536,13.083582,2.298445,95.909091,-1.817226,2022,691.0,0.00,0.64,0.00,94.61,62.42,8.79,1551.0,0.0,9.51,45.609,1.114,90.49,0.00,5.25,7.14,0.31,0.67,0.101,3.319,0.0,23.28,0.32,0.65,5.27,23.17,1545.0,5.26,62.84,0.00,0.00,1.42,71.04,393.33,60.89,1.49,79.9000,0.19,604.47,0.09,0.00,23.51,0.0,0.00,0.18,0.32,6.46,100.00,1084.73,1.43,98300.0,206883000.0,904.0,36.792139,-79.112444,905.0,1.0,904.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06
1123,8-MIC001.47,2022-10-25,150,0.300,10.014865,0.000000,9.708738,0.000000,0.000000,0.000000,0.000000,0.000000,56.310680,32.038835,1.941748,0.0,0.000000,88.349515,100.0,-0.582559,0.005560,26.58,2.919048,4.390909,0.556709,7.887260,1.247618,89.090909,-1.544161,2022,197.0,0.00,0.22,0.00,54.61,42.67,0.34,1551.0,0.0,0.00,43.671,1.000,100.00,0.00,2.44,0.60,1.94,0.26,0.000,3.290,0.0,48.80,2.38,0.24,2.47,49.07,1545.0,2.68,42.49,0.00,0.00,3.53,72.00,204.11,41.15,4.19,5.6500,0.00,298.12,0.18,45.39,49.36,0.0,0.00,0.00,1.94,0.25,100.00,401.50,3.51,22580.0,14620900.0,905.0,37.753450,-77.720931,906.0,1.0,905.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06


In [195]:
# Join join2 to reference information
common_columns = list(set(join2.columns) & set(vadeq_ref.columns))
join3 = join2.merge(vadeq_ref,
                    on = common_columns,
                    how = "left")
# Display join3
join3

Unnamed: 0,StationID,Date,ReachLength,Slope,RP100,BR_PCT,HP_PCT,RC_PCT,BL_PCT,CB_PCT,GC_PCT,GF_PCT,SA_PCT,FN_PCT,WD_PCT,OT_PCT,BL_CB_GR_PCT,SA_FN_PCT,TotSubstrate_PCT,LSUB_DMM,VLW_msq,Xdepth,Xwid,XBKF_W,BKF_depth_in_meters,BKFW_BKFD,incised_depth,Xembed,LRBS2,Year,RELIEF,STATSCLY30,LC01IMP,STATSOM2_6,PDIGMET,LC06FORSHB,LC11GRASS,PKREGNO,BRMETA,STATSCLY40,PRECIP,STATSPERM,STATSCLY50,STATSCLY60,LC01DEV,LC06GRASS,LC06WATER,LC11IMP,STATSWATCP,I24H2Y,STATOM19_8,LC06CRPHAY,LC11WATER,LC06IMP,LC06DEV,LC11CRPHAY,LFREGNO,LC11DEV,LC01FORSHB,VRPLSLC,VRCARB,LC06WETLND,STATSGODEP,MINBELEV,LC11FORSHB,LC11WETLND,DRNAREA,LC01BARE,ELEV,LC11BARE,CPSED,LC01CRPHAY,STATSCLAY10,STATSCLY20,LC06BARE,LC01WATER,LC01HERB,STATSOM0_5,ELEVMAX,LC01WETLND,Shape_Length,Shape_Area,FID,LATITUDE,LONGITUDE,OBJECTID,Join_Count,TARGET_FID,US_L3CODE,US_L3NAME,NA_L3CODE,NA_L3NAME,NA_L2CODE,NA_L2NAME,NA_L1CODE,NA_L1NAME,STATE_NAME,EPA_REGION,L3_KEY,L2_KEY,L1_KEY,Shape_Leng,Slope [m/m],u* [m/s],RefStress
0,1AACO006.10,2006-11-21,440,0.220,34.752500,7.692308,6.730769,0.000000,9.615385,14.423077,35.576923,6.730769,16.346154,2.884615,0.000000,0.0,66.346154,19.230769,100.0,1.582869,0.001897,66.12,14.690476,18.636364,1.123018,16.594890,2.547564,62.545455,0.517308,2006,171.0,100.00,14.94,0.00,33.39,37.62,0.00,1551.0,0.0,0.00,43.078,2.895,0.00,0.00,57.04,0.00,0.00,19.30,0.000,3.140,0.0,0.00,0.00,14.20,56.93,0.00,1545.0,83.80,42.96,0.00,0.00,0.00,70.48,54.96,16.23,0.00,0.0702,0.00,148.98,0.00,66.61,0.00,0.0,0.00,5.45,0.00,0.00,100.00,225.47,0.00,2380.0,181800.0,0.0,38.728611,-77.203333,1.0,1.0,0.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00220,0.155682,
1,1AACO004.84,2008-06-25,320,0.521,25.757012,0.000000,0.952381,0.000000,5.714286,27.619048,26.666667,14.285714,10.476190,7.619048,6.666667,0.0,74.285714,18.095238,100.0,1.165237,0.004859,56.22,9.580952,15.372727,0.989473,15.536282,2.275836,51.636364,-0.245939,2008,451.0,48.13,21.97,0.00,87.78,25.10,0.01,1550.0,0.0,0.02,43.888,2.395,0.00,0.00,70.34,0.01,0.25,23.10,0.133,3.167,0.0,0.28,0.26,22.19,71.05,0.11,1544.0,73.50,25.82,0.00,0.00,3.27,66.65,39.50,22.49,3.50,40.5000,0.03,308.44,0.17,11.98,0.28,0.0,51.85,0.05,0.25,0.01,100.00,490.18,3.28,85900.0,105013700.0,1.0,38.720500,-77.190722,2.0,1.0,1.0,65.0,Southeastern Plains,8.3.5,Southeastern Plains,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,65 Southeastern Plains,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.388810e+06,0.00521,0.224882,
2,1AACO006.10,2008-06-26,520,0.173,35.429907,2.857143,7.619048,0.952381,4.761905,21.904762,22.857143,12.380952,20.952381,2.857143,2.857143,0.0,61.904762,23.809524,100.0,1.327436,0.000897,63.57,13.819048,17.800000,1.112973,15.993204,2.990245,56.818182,0.374672,2008,171.0,100.00,14.94,0.00,33.39,37.62,0.00,1551.0,0.0,0.00,43.078,2.895,0.00,0.00,57.04,0.00,0.00,19.30,0.000,3.140,0.0,0.00,0.00,14.20,56.93,0.00,1545.0,83.80,42.96,0.00,0.00,0.00,70.48,54.96,16.23,0.00,0.0702,0.00,148.98,0.00,66.61,0.00,0.0,0.00,5.45,0.00,0.00,100.00,225.47,0.00,2380.0,181800.0,0.0,38.728611,-77.203333,1.0,1.0,0.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00173,0.137436,
3,1AACO009.14,2008-06-26,560,0.223,22.451871,2.857143,4.761905,0.000000,9.523810,18.095238,28.571429,10.476190,17.142857,1.904762,6.666667,0.0,66.666667,19.047619,100.0,1.437585,0.005595,54.89,12.790476,15.681818,0.857991,18.277371,2.221627,63.454545,0.459602,2008,357.0,45.34,21.16,0.00,95.20,24.49,0.00,1551.0,0.0,0.02,43.981,2.271,0.00,0.00,71.37,0.00,0.28,22.00,0.135,3.169,0.0,0.26,0.28,21.37,71.85,0.09,1545.0,73.50,24.98,0.00,0.00,3.12,66.39,132.81,22.65,3.52,36.3000,0.01,324.56,0.00,4.53,0.25,0.0,54.64,0.01,0.28,0.00,100.00,490.18,3.11,73540.0,93979700.0,2.0,38.761944,-77.207222,3.0,1.0,2.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00223,0.137002,
4,1AAUA017.60,2005-09-22,160,0.400,19.910507,0.000000,0.000000,0.000000,7.619048,34.285714,20.000000,5.714286,28.571429,3.809524,0.000000,0.0,67.619048,32.380952,100.0,1.081350,0.008752,31.73,7.790476,10.759091,0.753664,14.275720,2.717300,65.454545,-0.086419,2005,278.0,40.11,1.53,0.11,99.38,71.52,2.12,1551.0,0.0,59.60,43.317,1.506,0.00,0.00,12.29,1.89,0.25,1.91,0.112,3.055,0.0,8.59,0.25,1.75,13.23,7.74,1545.0,15.70,72.78,0.00,0.00,4.47,56.77,165.53,67.67,6.40,29.7000,0.05,317.16,0.10,0.38,8.68,0.0,0.30,0.04,0.25,1.42,99.89,443.72,4.53,57900.0,76884600.0,259.0,38.490361,-77.466389,260.0,1.0,259.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00400,0.171970,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1120,2BXRK001.64,2022-10-26,150,1.287,9.486085,0.952381,5.714286,0.952381,0.000000,13.333333,10.476190,12.380952,53.333333,2.857143,0.000000,0.0,36.190476,56.190476,100.0,0.480012,0.001413,18.97,2.066667,2.736364,0.589700,4.640264,1.562427,67.090909,-1.157233,2022,575.0,100.00,0.17,0.00,0.00,95.66,0.00,1553.0,100.0,0.00,49.307,2.000,0.00,0.00,4.15,0.00,0.00,0.19,0.000,3.723,0.0,0.00,0.00,0.20,4.34,0.00,1546.0,4.66,95.85,0.00,0.00,0.00,68.00,574.73,95.34,0.00,0.7600,0.00,796.84,0.00,0.00,0.00,0.0,0.00,0.00,0.00,0.00,100.00,1149.30,0.00,8240.0,1967300.0,308.0,37.811383,-78.729900,309.0,1.0,308.0,64.0,Northern Piedmont,8.3.1,Northern Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,64 Northern Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,8.206757e+05,0.01287,0.272860,Ref
1121,2-JKS070.06,2022-09-12,800,0.450,25.065800,9.523810,0.000000,0.000000,5.714286,30.476190,38.095238,2.857143,3.809524,9.523810,0.000000,0.0,77.142857,13.333333,100.0,1.545757,0.002030,68.57,20.238095,29.218182,1.540245,18.969822,2.949336,27.636364,-0.052227,2022,2550.0,0.00,0.09,8.97,0.00,77.61,0.22,1554.0,0.0,23.43,43.163,5.089,0.00,7.39,4.29,0.04,0.02,0.11,0.097,2.959,0.0,17.99,0.02,0.09,4.28,17.98,1547.0,4.29,77.42,88.28,11.72,0.02,44.64,1825.15,77.43,0.02,104.0000,0.02,2821.90,0.04,0.00,18.21,0.0,69.18,0.03,0.02,0.00,91.03,4377.44,0.02,125400.0,270367300.0,903.0,38.123444,-79.779725,904.0,1.0,903.0,67.0,Ridge and Valley,8.4.1,Ridge and Valley,8.4,OZARK/OUACHITA-APPALACHIAN FORESTS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,67 Ridge and Valley,8.4 OZARK/OUACHITA-APPALACHIAN FORESTS,8 EASTERN TEMPERATE FORESTS,1.764867e+06,0.00450,0.260757,
1122,4ASNA007.82,2022-09-20,320,0.430,15.910317,0.000000,0.000000,0.000000,0.000000,0.000000,3.809524,7.619048,80.000000,6.666667,1.904762,0.0,11.428571,86.666667,100.0,-0.402036,0.007645,34.39,7.576190,13.836364,1.057536,13.083582,2.298445,95.909091,-1.817226,2022,691.0,0.00,0.64,0.00,94.61,62.42,8.79,1551.0,0.0,9.51,45.609,1.114,90.49,0.00,5.25,7.14,0.31,0.67,0.101,3.319,0.0,23.28,0.32,0.65,5.27,23.17,1545.0,5.26,62.84,0.00,0.00,1.42,71.04,393.33,60.89,1.49,79.9000,0.19,604.47,0.09,0.00,23.51,0.0,0.00,0.18,0.32,6.46,100.00,1084.73,1.43,98300.0,206883000.0,904.0,36.792139,-79.112444,905.0,1.0,904.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00430,0.211211,
1123,8-MIC001.47,2022-10-25,150,0.300,10.014865,0.000000,9.708738,0.000000,0.000000,0.000000,0.000000,0.000000,56.310680,32.038835,1.941748,0.0,0.000000,88.349515,100.0,-0.582559,0.005560,26.58,2.919048,4.390909,0.556709,7.887260,1.247618,89.090909,-1.544161,2022,197.0,0.00,0.22,0.00,54.61,42.67,0.34,1551.0,0.0,0.00,43.671,1.000,100.00,0.00,2.44,0.60,1.94,0.26,0.000,3.290,0.0,48.80,2.38,0.24,2.47,49.07,1545.0,2.68,42.49,0.00,0.00,3.53,72.00,204.11,41.15,4.19,5.6500,0.00,298.12,0.18,45.39,49.36,0.0,0.00,0.00,1.94,0.25,100.00,401.50,3.51,22580.0,14620900.0,905.0,37.753450,-77.720931,906.0,1.0,905.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00300,0.128000,


In [196]:
# Join join3 to the StreamStats flow statistics (flw_trim)
common_columns = list(set(join3.columns) & set(flw_trim.columns))
vadeq_strmsts = join3.merge(flw_trim,
                    on = common_columns,
                    how = "left")
# Display the final joined table
vadeq_strmsts

Unnamed: 0,StationID,Date,ReachLength,Slope,RP100,BR_PCT,HP_PCT,RC_PCT,BL_PCT,CB_PCT,GC_PCT,GF_PCT,SA_PCT,FN_PCT,WD_PCT,OT_PCT,BL_CB_GR_PCT,SA_FN_PCT,TotSubstrate_PCT,LSUB_DMM,VLW_msq,Xdepth,Xwid,XBKF_W,BKF_depth_in_meters,BKFW_BKFD,incised_depth,Xembed,LRBS2,Year,RELIEF,STATSCLY30,LC01IMP,STATSOM2_6,PDIGMET,LC06FORSHB,LC11GRASS,PKREGNO,BRMETA,STATSCLY40,PRECIP,STATSPERM,STATSCLY50,STATSCLY60,LC01DEV,LC06GRASS,LC06WATER,LC11IMP,STATSWATCP,I24H2Y,STATOM19_8,LC06CRPHAY,LC11WATER,LC06IMP,LC06DEV,LC11CRPHAY,LFREGNO,LC11DEV,LC01FORSHB,VRPLSLC,VRCARB,LC06WETLND,STATSGODEP,MINBELEV,LC11FORSHB,LC11WETLND,DRNAREA,LC01BARE,ELEV,LC11BARE,CPSED,LC01CRPHAY,STATSCLAY10,STATSCLY20,LC06BARE,LC01WATER,LC01HERB,STATSOM0_5,ELEVMAX,LC01WETLND,Shape_Length,Shape_Area,FID,LATITUDE,LONGITUDE,OBJECTID,Join_Count,TARGET_FID,US_L3CODE,US_L3NAME,NA_L3CODE,NA_L3NAME,NA_L2CODE,NA_L2NAME,NA_L1CODE,NA_L1NAME,STATE_NAME,EPA_REGION,L3_KEY,L2_KEY,L1_KEY,Shape_Leng,Slope [m/m],u* [m/s],RefStress,1 Day 1.11 Year Low Flow,1 Day 1.25 Year Low Flow,1 Day 1.43 Year Low Flow,1 Day 1.67 Year Low Flow,1 Day 2 Year Low Flow,10-percent AEP flood,2-percent AEP flood,20-percent AEP flood,30 Day 1.11 Year Low Flow,30 Day 1.25 Year Low Flow,30 Day 1.43 Year Low Flow,4 Day 1.11 Year Low Flow,4 Day 1.25 Year Low Flow,4-percent AEP flood,7 Day 1.11 Year Low Flow,7 Day 1.25 Year Low Flow,7 Day 1.43 Year Low Flow,7 Day 1.67 Year Low Flow,Bieger_D_channel_cross_sectional_area,Bieger_D_channel_depth,Bieger_D_channel_width,Bieger_USA_channel_cross_sectional_area,Bieger_USA_channel_depth,Bieger_USA_channel_width,Urban 0.2-percent AEP flood,Urban 0.5-percent AEP flood,Urban 1-percent AEP flood,Urban 10-percent AEP flood,Urban 2-percent AEP flood,Urban 20-Percent AEP flood,Urban 4-percent AEP flood,Urban 42.9-percent AEP flood,Urban 50-percent AEP flood,Urban 66.7-percent AEP flood,Urban 80-percent AEP flood,Urban 90-percent AEP flood,Urban 95-percent AEP flood,Urban 99-percent AEP flood,Urban 99.5-percent AEP flood
0,1AACO006.10,2006-11-21,440,0.220,34.752500,7.692308,6.730769,0.000000,9.615385,14.423077,35.576923,6.730769,16.346154,2.884615,0.000000,0.0,66.346154,19.230769,100.0,1.582869,0.001897,66.12,14.690476,18.636364,1.123018,16.594890,2.547564,62.545455,0.517308,2006,171.0,100.00,14.94,0.00,33.39,37.62,0.00,1551.0,0.0,0.00,43.078,2.895,0.00,0.00,57.04,0.00,0.00,19.30,0.000,3.140,0.0,0.00,0.00,14.20,56.93,0.00,1545.0,83.80,42.96,0.00,0.00,0.00,70.48,54.96,16.23,0.00,0.0702,0.00,148.98,0.00,66.61,0.00,0.0,0.00,5.45,0.00,0.00,100.00,225.47,0.00,2380.0,181800.0,0.0,38.728611,-77.203333,1.0,1.0,0.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00220,0.155682,,0.012467,0.007940,0.005353,0.003673,0.002477,71.533333,166.366667,45.000000,0.039167,0.027933,0.020600,0.033533,0.024733,118.666667,0.041400,0.029733,0.022600,0.017200,2.66,0.523,5.05,4.07,0.685,4.86,426.333333,351.333333,293.666667,136.333333,219.666667,106.866667,167.333333,84.400000,78.800000,78.666667,65.633333,63.500000,54.8,64.933333,64.200000
1,1AACO004.84,2008-06-25,320,0.521,25.757012,0.000000,0.952381,0.000000,5.714286,27.619048,26.666667,14.285714,10.476190,7.619048,6.666667,0.0,74.285714,18.095238,100.0,1.165237,0.004859,56.22,9.580952,15.372727,0.989473,15.536282,2.275836,51.636364,-0.245939,2008,451.0,48.13,21.97,0.00,87.78,25.10,0.01,1550.0,0.0,0.02,43.888,2.395,0.00,0.00,70.34,0.01,0.25,23.10,0.133,3.167,0.0,0.28,0.26,22.19,71.05,0.11,1544.0,73.50,25.82,0.00,0.00,3.27,66.65,39.50,22.49,3.50,40.5000,0.03,308.44,0.17,11.98,0.28,0.0,51.85,0.05,0.25,0.01,100.00,490.18,3.28,85900.0,105013700.0,1.0,38.720500,-77.190722,2.0,1.0,1.0,65.0,Southeastern Plains,8.3.5,Southeastern Plains,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,65 Southeastern Plains,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.388810e+06,0.00521,0.224882,,6.723333,4.653333,3.446667,2.620000,1.960000,2853.333333,5380.000000,2009.333333,14.666667,10.523333,7.973333,9.220000,6.893333,4193.333333,11.246667,8.433333,6.696667,5.390000,233.00,3.240,70.60,126.00,2.650,45.60,31866.666667,17166.666667,12633.333333,5350.000000,9886.666667,3886.666667,8006.666667,2400.000000,2140.000000,1543.333333,1206.666667,891.000000,782.0,536.666667,487.666667
2,1AACO006.10,2008-06-26,520,0.173,35.429907,2.857143,7.619048,0.952381,4.761905,21.904762,22.857143,12.380952,20.952381,2.857143,2.857143,0.0,61.904762,23.809524,100.0,1.327436,0.000897,63.57,13.819048,17.800000,1.112973,15.993204,2.990245,56.818182,0.374672,2008,171.0,100.00,14.94,0.00,33.39,37.62,0.00,1551.0,0.0,0.00,43.078,2.895,0.00,0.00,57.04,0.00,0.00,19.30,0.000,3.140,0.0,0.00,0.00,14.20,56.93,0.00,1545.0,83.80,42.96,0.00,0.00,0.00,70.48,54.96,16.23,0.00,0.0702,0.00,148.98,0.00,66.61,0.00,0.0,0.00,5.45,0.00,0.00,100.00,225.47,0.00,2380.0,181800.0,0.0,38.728611,-77.203333,1.0,1.0,0.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00173,0.137436,,0.012467,0.007940,0.005353,0.003673,0.002477,71.533333,166.366667,45.000000,0.039167,0.027933,0.020600,0.033533,0.024733,118.666667,0.041400,0.029733,0.022600,0.017200,2.66,0.523,5.05,4.07,0.685,4.86,426.333333,351.333333,293.666667,136.333333,219.666667,106.866667,167.333333,84.400000,78.800000,78.666667,65.633333,63.500000,54.8,64.933333,64.200000
3,1AACO009.14,2008-06-26,560,0.223,22.451871,2.857143,4.761905,0.000000,9.523810,18.095238,28.571429,10.476190,17.142857,1.904762,6.666667,0.0,66.666667,19.047619,100.0,1.437585,0.005595,54.89,12.790476,15.681818,0.857991,18.277371,2.221627,63.454545,0.459602,2008,357.0,45.34,21.16,0.00,95.20,24.49,0.00,1551.0,0.0,0.02,43.981,2.271,0.00,0.00,71.37,0.00,0.28,22.00,0.135,3.169,0.0,0.26,0.28,21.37,71.85,0.09,1545.0,73.50,24.98,0.00,0.00,3.12,66.39,132.81,22.65,3.52,36.3000,0.01,324.56,0.00,4.53,0.25,0.0,54.64,0.01,0.28,0.00,100.00,490.18,3.11,73540.0,93979700.0,2.0,38.761944,-77.207222,3.0,1.0,2.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00223,0.137002,,6.106667,4.216667,3.123333,2.366667,1.766667,2736.666667,5190.000000,1925.666667,13.266667,9.520000,7.210000,8.476667,6.346667,4036.666667,10.343333,7.773333,6.183333,4.980000,216.00,3.140,67.50,119.00,2.590,43.80,29866.666667,16133.333333,11900.000000,5053.333333,9313.333333,3676.666667,7540.000000,2276.666667,2033.333333,1476.666667,1153.333333,855.666667,751.0,520.000000,474.000000
4,1AAUA017.60,2005-09-22,160,0.400,19.910507,0.000000,0.000000,0.000000,7.619048,34.285714,20.000000,5.714286,28.571429,3.809524,0.000000,0.0,67.619048,32.380952,100.0,1.081350,0.008752,31.73,7.790476,10.759091,0.753664,14.275720,2.717300,65.454545,-0.086419,2005,278.0,40.11,1.53,0.11,99.38,71.52,2.12,1551.0,0.0,59.60,43.317,1.506,0.00,0.00,12.29,1.89,0.25,1.91,0.112,3.055,0.0,8.59,0.25,1.75,13.23,7.74,1545.0,15.70,72.78,0.00,0.00,4.47,56.77,165.53,67.67,6.40,29.7000,0.05,317.16,0.10,0.38,8.68,0.0,0.30,0.04,0.25,1.42,99.89,443.72,4.53,57900.0,76884600.0,259.0,38.490361,-77.466389,260.0,1.0,259.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00400,0.171970,,5.760000,4.010000,3.000000,2.300000,1.740000,3200.000000,6040.000000,2250.000000,11.300000,8.210000,6.260000,8.200000,6.330000,4710.000000,10.200000,7.940000,6.460000,5.310000,188.00,2.970,62.10,107.00,2.480,40.90,11500.000000,8680.000000,6746.666667,2803.333333,5493.333333,1996.666667,4196.666667,1273.333333,1133.333333,892.333333,734.333333,561.666667,446.0,319.666667,284.333333
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1120,2BXRK001.64,2022-10-26,150,1.287,9.486085,0.952381,5.714286,0.952381,0.000000,13.333333,10.476190,12.380952,53.333333,2.857143,0.000000,0.0,36.190476,56.190476,100.0,0.480012,0.001413,18.97,2.066667,2.736364,0.589700,4.640264,1.562427,67.090909,-1.157233,2022,575.0,100.00,0.17,0.00,0.00,95.66,0.00,1553.0,100.0,0.00,49.307,2.000,0.00,0.00,4.15,0.00,0.00,0.19,0.000,3.723,0.0,0.00,0.00,0.20,4.34,0.00,1546.0,4.66,95.85,0.00,0.00,0.00,68.00,574.73,95.34,0.00,0.7600,0.00,796.84,0.00,0.00,0.00,0.0,0.00,0.00,0.00,0.00,100.00,1149.30,0.00,8240.0,1967300.0,308.0,37.811383,-78.729900,309.0,1.0,308.0,64.0,Northern Piedmont,8.3.1,Northern Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,64 Northern Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,8.206757e+05,0.01287,0.272860,Ref,0.146000,0.096300,0.068600,0.049700,0.035500,409.000000,905.000000,257.000000,0.240000,0.160000,0.116000,0.151000,0.101000,658.000000,0.162000,0.108000,0.077400,0.056200,14.20,1.040,13.60,14.70,1.140,11.20,776.333333,515.666667,414.666667,152.333333,350.333333,104.333333,253.000000,66.866667,59.333333,48.666667,44.133333,33.700000,26.4,19.833333,17.900000
1121,2-JKS070.06,2022-09-12,800,0.450,25.065800,9.523810,0.000000,0.000000,5.714286,30.476190,38.095238,2.857143,3.809524,9.523810,0.000000,0.0,77.142857,13.333333,100.0,1.545757,0.002030,68.57,20.238095,29.218182,1.540245,18.969822,2.949336,27.636364,-0.052227,2022,2550.0,0.00,0.09,8.97,0.00,77.61,0.22,1554.0,0.0,23.43,43.163,5.089,0.00,7.39,4.29,0.04,0.02,0.11,0.097,2.959,0.0,17.99,0.02,0.09,4.28,17.98,1547.0,4.29,77.42,88.28,11.72,0.02,44.64,1825.15,77.43,0.02,104.0000,0.02,2821.90,0.04,0.00,18.21,0.0,69.18,0.03,0.02,0.00,91.03,4377.44,0.02,125400.0,270367300.0,903.0,38.123444,-79.779725,904.0,1.0,903.0,67.0,Ridge and Valley,8.4.1,Ridge and Valley,8.4,OZARK/OUACHITA-APPALACHIAN FORESTS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,67 Ridge and Valley,8.4 OZARK/OUACHITA-APPALACHIAN FORESTS,8 EASTERN TEMPERATE FORESTS,1.764867e+06,0.00450,0.260757,,22.500000,18.800000,16.400000,14.400000,12.900000,7910.000000,13300.000000,5910.000000,31.300000,25.100000,21.500000,23.600000,19.600000,10900.000000,24.200000,20.000000,17.400000,15.500000,454.00,4.250,104.00,210.00,3.240,63.50,24300.000000,20500.000000,15900.000000,6840.000000,12900.000000,4920.000000,9910.000000,3140.000000,2790.000000,2210.000000,1780.000000,1360.000000,1070.0,756.000000,666.000000
1122,4ASNA007.82,2022-09-20,320,0.430,15.910317,0.000000,0.000000,0.000000,0.000000,0.000000,3.809524,7.619048,80.000000,6.666667,1.904762,0.0,11.428571,86.666667,100.0,-0.402036,0.007645,34.39,7.576190,13.836364,1.057536,13.083582,2.298445,95.909091,-1.817226,2022,691.0,0.00,0.64,0.00,94.61,62.42,8.79,1551.0,0.0,9.51,45.609,1.114,90.49,0.00,5.25,7.14,0.31,0.67,0.101,3.319,0.0,23.28,0.32,0.65,5.27,23.17,1545.0,5.26,62.84,0.00,0.00,1.42,71.04,393.33,60.89,1.49,79.9000,0.19,604.47,0.09,0.00,23.51,0.0,0.00,0.18,0.32,6.46,100.00,1084.73,1.43,98300.0,206883000.0,904.0,36.792139,-79.112444,905.0,1.0,904.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00430,0.211211,,13.800000,9.770000,7.470000,5.840000,4.500000,5970.000000,11536.666667,4166.666667,26.466667,19.600000,15.266667,16.633333,12.636667,8903.333333,20.433333,15.606667,12.786667,10.633333,377.00,3.940,93.60,182.00,3.060,57.90,20400.000000,16900.000000,13100.000000,5600.000000,10700.000000,4020.000000,8180.000000,2560.000000,2280.000000,1800.000000,1460.000000,1120.000000,877.0,622.000000,549.000000
1123,8-MIC001.47,2022-10-25,150,0.300,10.014865,0.000000,9.708738,0.000000,0.000000,0.000000,0.000000,0.000000,56.310680,32.038835,1.941748,0.0,0.000000,88.349515,100.0,-0.582559,0.005560,26.58,2.919048,4.390909,0.556709,7.887260,1.247618,89.090909,-1.544161,2022,197.0,0.00,0.22,0.00,54.61,42.67,0.34,1551.0,0.0,0.00,43.671,1.000,100.00,0.00,2.44,0.60,1.94,0.26,0.000,3.290,0.0,48.80,2.38,0.24,2.47,49.07,1545.0,2.68,42.49,0.00,0.00,3.53,72.00,204.11,41.15,4.19,5.6500,0.00,298.12,0.18,45.39,49.36,0.0,0.00,0.00,1.94,0.25,100.00,401.50,3.51,22580.0,14620900.0,905.0,37.753450,-77.720931,906.0,1.0,905.0,45.0,Piedmont,8.3.4,Piedmont,8.3,SOUTHEASTERN USA PLAINS,8.0,EASTERN TEMPERATE FORESTS,Virginia,3.0,45 Piedmont,8.3 SOUTHEASTERN USA PLAINS,8 EASTERN TEMPERATE FORESTS,1.347537e+06,0.00300,0.128000,,1.110000,0.736000,0.531000,0.392000,0.284000,1320.000000,2630.000000,900.000000,2.370000,1.640000,1.200000,1.970000,1.500000,2010.000000,2.470000,1.880000,1.500000,1.220000,58.30,1.840,31.20,43.50,1.740,22.80,3073.333333,2263.333333,1793.333333,701.666667,1493.333333,489.666667,1103.333333,312.333333,277.333333,224.333333,194.333333,148.333333,116.0,84.700000,75.600000
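The joins above merge on every column the two tables share. A minimal sketch of a more defensive variant is shown below. It assumes `StationID` is the intended key, which the notebook does not state explicitly, and uses small made-up tables rather than the real data. `validate` checks that the lookup table has one row per station, and `indicator` flags rows that found no match.

```python
import pandas as pd

# Toy stand-ins for the notebook's tables (values are illustrative only)
left = pd.DataFrame({
    "StationID": ["A", "A", "B", "C"],
    "Date": ["2006-11-21", "2008-06-26", "2008-06-25", "2022-10-25"],
    "Xembed": [62.5, 56.8, 51.6, 89.1],
})
right = pd.DataFrame({
    "StationID": ["A", "B"],
    "DRNAREA": [0.0702, 40.5],
})

# Assumption: StationID is the join key. validate raises MergeError if the
# right table has duplicate keys; indicator adds a "_merge" column.
merged = left.merge(
    right,
    on="StationID",
    how="left",
    validate="many_to_one",
    indicator=True,
)

# Stations present in the left table that found no match on the right
unmatched = merged.loc[merged["_merge"] == "left_only", "StationID"].unique().tolist()
print(unmatched)
```

Listing the unmatched stations right after each join may make it easier to trace the null rows that show up during cleaning.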


## Step 4. Cleaning Joined Table

### Extract Month from Date Column

In [197]:
# Convert the 'Date' column to datetime objects
vadeq_strmsts['Date'] = pd.to_datetime(vadeq_strmsts['Date'])

# Extract the month from the 'Date' column
vadeq_strmsts['Month'] = vadeq_strmsts['Date'].dt.month
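If some date strings might be malformed, one option is to coerce them to `NaT` and count them instead of letting the conversion raise. The sketch below uses a made-up three-row table; the bad value is hypothetical and is not taken from the VA DEQ data.

```python
import pandas as pd

# Illustrative data: two valid dates and one deliberately invalid entry
df = pd.DataFrame({"Date": ["2006-11-21", "2008-06-25", "not a date"]})

# errors="coerce" turns unparseable values into NaT instead of raising
df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
df["Month"] = df["Date"].dt.month

# Number of rows whose date could not be parsed
n_unparsed = int(df["Date"].isna().sum())
print(n_unparsed)
```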

### Check for nulls

In [198]:
# Count null values in each column, sorted with the largest count first
vadeq_strmsts.isna().sum().sort_values(ascending = False)

RefStress        523
STATSCLAY10        8
CPSED              8
Shape_Area         8
Shape_Length       8
                ... 
BKFW_BKFD          0
incised_depth      0
Xembed             0
LRBS2              0
Month              0
Length: 145, dtype: int64

Ignoring RefStress, no column has more than 8 null values, so we assume that a few stations did not join correctly. For now, we drop these rows.
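Before dropping, it can help to record which stations failed to join. The following is a sketch on made-up data: the column names mirror the notebook, but the values and the choice of `streamstats_cols` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative table: stations B and D have no StreamStats attributes
df = pd.DataFrame({
    "StationID": ["A", "B", "C", "D"],
    "STATSCLAY10": [0.28, np.nan, 0.0, np.nan],
    "CPSED": [11.98, np.nan, 4.53, np.nan],
    "RefStress": [np.nan, "Ref", np.nan, np.nan],
})

# Assumption: these columns stand in for the StreamStats-derived attributes
streamstats_cols = ["STATSCLAY10", "CPSED"]

# A row counts as a failed join when all StreamStats columns are null
missing_mask = df[streamstats_cols].isna().all(axis=1)
failed_stations = df.loc[missing_mask, "StationID"].tolist()

# Keep the remaining rows as an independent copy
cleaned = df.loc[~missing_mask].copy()
print(failed_stations)
```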

In [199]:
# Drop rows with null StreamStats values (RefStress nulls are kept)
vadeq_strmsts = vadeq_strmsts.dropna(axis = 'rows',
                                     subset = ['STATSCLAY10']) # Only identifying nulls in the StreamStats calculations

In [204]:
# Display nulls again
vadeq_strmsts.isna().sum().sort_values(ascending = False)

RefStress       516
u* [m/s]          2
Slope [m/m]       2
StationID         0
L3_KEY            0
               ... 
I24H2Y            0
STATOM19_8        0
LC06CRPHAY        0
LC11WATER         0
DRNAREA_SQKM      0
Length: 146, dtype: int64

Two rows are still missing a slope, and therefore u*. That is acceptable for this purpose: our current model focuses on remotely sensed data, and u* can be much less accurate when computed from remotely sensed channel width. We can ignore these gaps for now.
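The rows displayed in the joined tables appear consistent with the shear velocity relation $u_* = \sqrt{g\,h\,S}$, taking $h$ as `BKF_depth_in_meters` and $S$ as `Slope` divided by 100. This is inferred from the printed values and is not confirmed by the notebook, so the sketch below is only a consistency check on three displayed rows, with `G` and `u_star_check` as illustrative names.

```python
import numpy as np
import pandas as pd

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)

# Three rows copied from the displayed output above
rows = pd.DataFrame({
    "Slope": [0.220, 0.521, 0.300],
    "BKF_depth_in_meters": [1.123018, 0.989473, 0.556709],
    "u* [m/s]": [0.155682, 0.224882, 0.128000],
})

# Assumption: Slope is stored in percent, so divide by 100 to get m/m
rows["Slope [m/m]"] = rows["Slope"] / 100

# Shear velocity from bankfull depth and slope
rows["u_star_check"] = np.sqrt(G * rows["BKF_depth_in_meters"] * rows["Slope [m/m]"])
print(rows[["u* [m/s]", "u_star_check"]])
```

Under this relation, a missing slope produces a missing u*, which matches the two paired nulls in the output above.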

### Unit Conversion

In [201]:
# Convert drainage area from square miles (as provided) to square kilometers
# Create the square-kilometer column
vadeq_strmsts['DRNAREA_SQKM'] = vadeq_strmsts['DRNAREA']*2.58999

# Display both columns to check that the conversion worked
vadeq_strmsts[['DRNAREA',
               'DRNAREA_SQKM']]

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  vadeq_strmsts['DRNAREA_SQKM'] = vadeq_strmsts['DRNAREA']*2.58999


Unnamed: 0,DRNAREA,DRNAREA_SQKM
0,0.0702,0.181817
1,40.5000,104.894595
2,0.0702,0.181817
3,36.3000,94.016637
4,29.7000,76.922703
...,...,...
1120,0.7600,1.968392
1121,104.0000,269.358960
1122,79.9000,206.940201
1123,5.6500,14.633444
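The warning printed with this cell is typical when a column is assigned on a DataFrame that pandas treats as a possible slice of another, here most likely the result of the earlier `dropna`. One common way to avoid it is to take an explicit `.copy()` after filtering. The sketch below uses made-up values and also derives the conversion factor from the exact mile-to-kilometer definition; `SQMI_TO_SQKM`, `raw`, and `subset` are illustrative names.

```python
import pandas as pd

# 1 mile = 1.609344 km exactly, so 1 sq mi is about 2.589988 sq km
SQMI_TO_SQKM = 1.609344 ** 2

# Illustrative drainage areas in square miles, with one missing value
raw = pd.DataFrame({"DRNAREA": [0.0702, 40.5, None, 104.0]})

# .copy() makes the filtered table independent of the original,
# so the later column assignment is unambiguous
subset = raw.dropna(subset=["DRNAREA"]).copy()
subset["DRNAREA_SQKM"] = subset["DRNAREA"] * SQMI_TO_SQKM
print(subset)
```

The rounded factor 2.58999 used in the notebook differs from this value only in the sixth decimal place.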


## Step 5. Export dataset

In [202]:
# Export to Excel
final_path = os.path.join(BASE_DIR, "data", "processed", "virginia", "vadeq_strmstats_mrgd.xlsx")
vadeq_strmsts.to_excel(final_path)
final_path

'C:\\Users\\reill\\OneDrive - Virginia Tech\\Documents\\Virginia Tech\\Czuba Research\\Embeddedness\\GitHub\\Embeddedness_ML_XAI_Models\\data\\processed\\virginia\\vadeq_strmstats_mrgd.xlsx'
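Writing the DataFrame with its index is the usual source of an `Unnamed: 0` column when the file is read back, like the one dropped at the start of this notebook. The sketch below demonstrates the effect with an in-memory CSV so that it runs without an Excel engine. `DataFrame.to_excel` accepts the same `index` parameter, so passing `index=False` there should behave the same way.

```python
import io

import pandas as pd

# Illustrative table (values are made up)
df = pd.DataFrame({"StationID": ["A", "B"], "DRNAREA_SQKM": [0.18, 104.89]})

# Default export keeps the index as an unnamed first column
buffer_with_index = io.StringIO()
df.to_csv(buffer_with_index)
buffer_with_index.seek(0)
with_index = pd.read_csv(buffer_with_index)

# index=False leaves the index out of the file
buffer_without_index = io.StringIO()
df.to_csv(buffer_without_index, index=False)
buffer_without_index.seek(0)
without_index = pd.read_csv(buffer_without_index)

print(list(with_index.columns))
print(list(without_index.columns))
```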