# National Water Information collection for Cimarron watershed 

The porpose of this example is tring to download the water information of Cimarron watershed from NWIS website and display the detail information of a selected site (The site of Cimarron River near Ripley, OK).Below figure shows the map location of this site. 
<img src='site.png' width = '600'>  
PS:  
  To run this code in your computer, You should download the 'PyHSPF' package from Dr,Lampert's published repositories on Gighub web.

Below code defines the key Python libraries 

In [35]:
import os,datetime,pickle

The following code import the tool of NWIS extractor that existing under the pyhspf package/preprocessing folder.

In [36]:
from pyhspf.preprocessing import NWISExtractor

The 'NWIS' and 'directory' code indicated the location folders for download zip files and input/output working directory. 'HUC8' stands for the 8-digit Hydrological Unit Code(HUC) for Cimarron Watershed (11050003). 

In [37]:
NWIS = 'NWIS-download'
directory = 'data-curves'       
HUC8      = '11050003'  

The following code indicated all the data path for Cimarron Watershed water information.

In [38]:
gagepath  = '{}/gagedata'.format(directory)

The 'start' 'end' code function set the time period of downloaded data from NWIS website. 

In [39]:
start = datetime.datetime(1980, 1, 1)      # start date for timeseries
end   = datetime.datetime(2020, 1, 1)      # end date for timeseries

the following code give an instance of location of the metadata file to the NWIS extractor process. If the indicated 'NWIS' folder was not existed, the process will create a new 'NWIS' folder. 

In [40]:
nwisextractor = NWISExtractor(NWIS)

The following code call the 'extract_HUC8' process under the 'NWISExtractor' class and give it the indicated key values. The process will check and extract gage metadata from donloaded 'NWIS' zip file by using indicated'HUC8' number and save the data into 'data-curves'(directory) folder(if it didn't exist, the process will create a new folder called 'data-curves'(directory).

In [41]:
nwisextractor.extract_HUC8(HUC8, directory)

NWIS directory NWIS-download exists

NWIS source metadata file NWIS-download/USGS_Streamgages-NHD_Locations_Shape.zip is present

gage metadata NWIS-download/USGS_Streamgages-NHD_Locations is present

gage station file data-curves/gagestations exists



Below code is a if loop that used for check the data files under the created 'data-curves' folder. If the condition didn't conform, the loop will call the 'dwonload_all' process under the 'NWISExtractor' class and plot all the downloaded metadata. 

In [42]:
if not os.path.isdir(gagepath):
    nwisextractor.download_all(start, end, output = gagepath)

The following code is trying to download the daily flow and water quality data for one gage by the given USGS NWIS Site ID number. The specific site number could be found on the following URL.
URL:https://waterdata.usgs.gov/nwis/inventory?search_station_nm=cimarron&search_station_nm_match_type=beginning&state_cd=ok&format=station_list&group_key=NONE&list_of_search_criteria=state_cd%2Csearch_station_nm  
The code will call the 'download_gagedata' process under the 'NWISExtractor' class and check the existing of sepcified metadata.

In [43]:
picked_gageid    = '07161450'
picked_gagedata  = 'Cimarron River near Ripley'
nwisextractor.download_gagedata(picked_gageid, start, end, output = picked_gagedata)

gage data for 07161450 exist



Below code is trying to print the fundamental labels and the data values of selected site. 

In [44]:
print('The water flow information for the Cimarron River near Ripley, OK')
print('')
p = '{}/{}'.format(gagepath,picked_gageid) #identify p as the picked_gageid file 
                                           #under 'data-curves' folder.
with open(p, 'rb') as f:      #open p file and ensure the file is 'clean up' 
                              #after finish the process
        
    station = pickle.load(f)      #to load pickled data from p file

# the following are attributes of the station directly from the database

print('Gage ID:                     ', station.gageid)
print('Name:                        ', station.name)
print('State:                       ', station.state)
print('First day of measurement:    ', station.day1)
print('Last day of measurement:     ', station.dayn)
print('Drainage area (square miles):', station.drain)
print('Average flow (cfs):          ', station.ave)
print('NWIS url:                    ', station.web)

The water flow information for the Cimarron River near Ripley, OK

Gage ID:                      07161450
Name:                         Cimarron River near Ripley, OK
State:                        OK
First day of measurement:     19871001
Last day of measurement:      20040930
Drainage area (square miles): 17979.0
Average flow (cfs):           2220.978
NWIS url:                     http://waterdata.usgs.gov/nwis/nwisman/?site_no=07161450


The following code is trying to extract the water flow information of the sepcific setting time period on the selected site. The flowrate of start date and end date are displayed. A mean flowerate of this picked period is also showing. 

In [45]:
s = datetime.datetime(1993, 1, 1)  # Start date 
e = datetime.datetime(1994, 1, 1)   # End date

# get the time series of daily flow data from the start to end date
# if it's available:

try: 
    #call the 'make_timeseries'process under the 'gagestation' class 
    ts = station.make_timeseries(start = s, end = e)

# if the start and end dates are not supplied, the function return the 
# data for the whole period

    startt = s.year, s.month, s.day, ts[0]
    print('Flow on {:04d}-{:02d}-{:02d} (cfs):     {}'.format(*startt))
    endt = e.year, e.month, e.day, ts[0]
    print('Flow on {:04d}-{:02d}-{:02d} (cfs):     {}'.format(*endt))
    
    # calculate the average flow across the dates specified
    ave = sum(ts) / (e - s).days
    print('Mean flow across dates (cfs): {:.1f}'.format(ave))

except: pass
print('')
# if the flow values are missing, the function will fill the value for 
# the data with a "None"

Flow on 1993-01-01 (cfs):     1740.0
Flow on 1994-01-01 (cfs):     1740.0
Mean flow across dates (cfs): 4406.4



The following code is trying to measure the total suspended soild for the selected site.

In [46]:
# the code for total suspended solids is 00530; the following shows how
# to get the TSS data for a gage station
try:
    TSS = station.waterquality['00530']
    print('Number of suspended solids measurements:', len(TSS))
    print('TSS concentration on {}: {} mg/L'.format(*TSS[0]))
except: 
    print('no TSS data available for this station')    
print('')

no TSS data available for this station



This is the result graphs for the selected site:  
<img src='Cimarron River near Ripley.png' width = '800'>