<a href="https://colab.research.google.com/github/FleaBusyBeeBergs/dtsa5741/blob/main/Assignment%203%20_%20Stearmflow%20Data%20via%20NWIS.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# NOAA API
https://www.ncdc.noaa.gov/cdo-web/webservices/v2#gettingStarted

## NOAA datasets
https://www.ncdc.noaa.gov/cdo-web/webservices/v2#datasets

## NOAA data
https://www.ncdc.noaa.gov/cdo-web/webservices/v2#data

# USGS real-time data
https://www.usgs.gov/products/data/real-time-data

# Data retrieval with Python repo
https://github.com/DOI-USGS/dataretrieval-python

# USGS parameter code definition
https://help.waterdata.usgs.gov/codes-and-parameters/parameters

In [None]:
!pip install dataretrieval -q

In [None]:
import dataretrieval.nwis as nwis
import pandas as pd

# NWIS services available:
* site info = 'site'
* instantaneous values = 'iv'
* daily values = 'dv'
* statistics = 'stat'
* discharge peaks = 'peaks'
* discharge measurements = 'measurements'
* water quality samples = 'qwdata'

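Example query (pass `service` by keyword, since the second positional argument of `get_record` is `start`):
df = nwis.get_record(sites = site, service = service, start = 'yyyy-mm-dd', end = 'yyyy-mm-dd')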
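A minimal sketch of how the arguments for such a query might be assembled and checked before any request is sent. The helper `build_query` and the `NWIS_SERVICES` dictionary are hypothetical names introduced here for illustration; they are not part of `dataretrieval`. The block makes no network call.

```python
from datetime import date

# Service codes copied from the list above.
NWIS_SERVICES = {
    "site": "site info",
    "iv": "instantaneous values",
    "dv": "daily values",
    "stat": "statistics",
    "peaks": "discharge peaks",
    "measurements": "discharge measurements",
    "qwdata": "water quality samples",
}


def build_query(site, service, start=None, end=None):
    """Hypothetical helper: validate inputs and return kwargs for nwis.get_record."""
    if service not in NWIS_SERVICES:
        raise ValueError(f"unknown service {service!r}")
    query = {"sites": site, "service": service}
    parsed = {}
    for key, value in (("start", start), ("end", end)):
        if value is not None:
            # Raises ValueError if the string is not a valid ISO date.
            parsed[key] = date.fromisoformat(value)
            query[key] = value
    if "start" in parsed and "end" in parsed and parsed["start"] > parsed["end"]:
        raise ValueError("start must not be after end")
    return query


query = build_query("07106500", "dv", start="2024-03-01", end="2024-03-31")
print(query)
# With network access, the result could be unpacked into the real call:
# df = nwis.get_record(**query)
```

Passing everything by keyword avoids depending on the positional order of `get_record`'s parameters.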

In [None]:
# Site info: drought well near Pueblo, southeastern Colorado
site_a = '382323104200701'
pueblo = nwis.get_record(sites = site_a, service = 'site')

pueblo

Unnamed: 0,agency_cd,site_no,station_nm,site_tp_cd,lat_va,long_va,dec_lat_va,dec_long_va,coord_meth_cd,coord_acy_cd,...,reliability_cd,gw_file_cd,nat_aqfr_cd,aqfr_cd,aqfr_type_cd,well_depth_va,hole_depth_va,depth_src_cd,project_no,geometry
0,USGS,382323104200701,"SC01906221AAA DROUGHT WELL NEAR PUEBLO, CO",GW,382322.82,1042008.94,38.389672,-104.335817,M,S,...,C,YYNYNYNN,N100ALLUVL,112TERC,U,90.0,90,D,858200240,POINT (-104.33582 38.38967)


In [None]:
pueblo_dv = nwis.get_record(sites = site_a,
                            service = 'dv',
                            start = '2024-03-01',
                            end = '2024-03-31')

datetime,site_no,72019_Mean,72019_Mean_cd
2024-03-01 00:00:00+00:00,382323104200701,21.86,A
2024-03-02 00:00:00+00:00,382323104200701,21.86,A
2024-03-03 00:00:00+00:00,382323104200701,21.86,A


In [None]:
pueblo_dv.reset_index(inplace = True)
pueblo_dv.head()

Unnamed: 0,datetime,site_no,72019_Mean,72019_Mean_cd
0,2024-03-01 00:00:00+00:00,382323104200701,21.86,A
1,2024-03-02 00:00:00+00:00,382323104200701,21.86,A
2,2024-03-03 00:00:00+00:00,382323104200701,21.86,A
3,2024-03-04 00:00:00+00:00,382323104200701,21.86,A
4,2024-03-05 00:00:00+00:00,382323104200701,21.86,A


In [None]:
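# Keep only the timestamp and the daily mean for parameter 72019
# (depth to water level, feet below land surface), then rename it to 'gwl'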
pueblo_dv = pueblo_dv[['datetime', '72019_Mean']]
pueblo_dv.columns = ['datetime', 'gwl']
pueblo_dv.head()

Unnamed: 0,datetime,gwl
0,2024-03-01 00:00:00+00:00,21.86
1,2024-03-02 00:00:00+00:00,21.86
2,2024-03-03 00:00:00+00:00,21.86
3,2024-03-04 00:00:00+00:00,21.86
4,2024-03-05 00:00:00+00:00,21.86
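The reset, slice, and rename steps above can also be written as one small function that leaves the original frame untouched. This is a sketch on a synthetic frame whose values are copied from the output above, so no request is made; `tidy_daily_values` is a hypothetical helper name, not part of `dataretrieval` or pandas.

```python
import pandas as pd

# Synthetic stand-in for the frame returned by the 'dv' service:
# a UTC datetime index plus the site number, value, and qualifier columns.
idx = pd.date_range("2024-03-01", periods=3, freq="D", tz="UTC", name="datetime")
raw = pd.DataFrame(
    {
        "site_no": ["382323104200701"] * 3,
        "72019_Mean": [21.86, 21.86, 21.86],
        "72019_Mean_cd": ["A", "A", "A"],
    },
    index=idx,
)


def tidy_daily_values(df, column, new_name):
    """Return a new two-column frame: 'datetime' plus one renamed value column."""
    out = df.reset_index()[["datetime", column]]
    return out.rename(columns={column: new_name})


gwl = tidy_daily_values(raw, "72019_Mean", "gwl")
print(gwl)
```

Because `reset_index()` and `rename()` both return new objects here, `raw` keeps its datetime index and original column names, which makes the cell safe to re-run.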


### Assignment 3: Access Streamflow Data via NWIS

In this lab, you will use the `dataretrieval.nwis` module to access streamflow data. According to its GitHub repository, `dataretrieval` was created to simplify the process of loading hydrologic data. It is designed to retrieve the major data types of U.S. Geological Survey (USGS) hydrology data that are available on the Web, as well as data from the Water Quality Portal (WQP), which currently houses water quality data from the Environmental Protection Agency (EPA), U.S. Department of Agriculture (USDA), and USGS. Direct USGS data is obtained from a service called the National Water Information System (NWIS).

For this lab, you will need to install `dataretrieval`. For more information, review the [GitHub repository](https://github.com/DOI-USGS/dataretrieval-python).

For this lab, you need to retrieve daily streamflow data from March 1, 2024 through March 31, 2024 and calculate the mean of the streamflow data. Use the starter code to calculate your mean value.
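A monthly mean is only representative if every day in the range has a value, so it can help to know how many daily records to expect. The sketch below computes that number with the standard library; `expected_daily_count` is a hypothetical helper name introduced for illustration.

```python
from datetime import date


def expected_daily_count(start, end):
    """Number of calendar days from start to end, both inclusive."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days + 1


n_days = expected_daily_count("2024-03-01", "2024-03-31")
print(n_days)  # 31
```

Once the data is retrieved, this number can be compared with the non-missing count of the streamflow column, for example `data['00060_Mean'].count()`.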



In [34]:
# Define site number and date range
site_no = '07106500'
start_date = '2024-03-01'
end_date = '2024-03-31'
serv_type = 'dv'

# Retrieve the data. Note you will need to add the get_record() parameters.
data = nwis.get_record(sites = site_no,
                       service = serv_type,
                       start = start_date,
                       end = end_date)

print(data.describe())
# The data is already in a DataFrame format
# The parameter code for streamflow is '00060' and for the daily value we will be using '00060_Mean' column.
# Find the mean of the streamflow data for March 2024 i.e. find the mean of the '00060_Mean' column

mean_streamflow = data['00060_Mean'].mean()

print(mean_streamflow)

       00010_Maximum  00010_Minimum  00010_Mean  00060_Mean  00095_Maximum  \
count      31.000000      31.000000   31.000000   31.000000      31.000000   
mean       12.867742       4.429032    8.345161  212.903226    1152.096774   
std         2.908652       1.852241    2.071206   90.091566      64.855920   
min         6.700000       1.000000    4.300000  116.000000     985.000000   
25%        11.250000       2.800000    6.950000  136.000000    1120.000000   
50%        13.400000       4.200000    8.200000  200.000000    1150.000000   
75%        14.900000       6.000000    9.800000  243.500000    1175.000000   
max        17.100000       7.500000   11.700000  453.000000    1280.000000   

       00095_Minimum   00095_Mean   80154_Mean   80155_Mean  
count      31.000000    31.000000    31.000000    31.000000  
mean     1090.129032  1119.290323   638.709677   583.451613  
std        79.821360    65.402953   771.871068  1069.765701  
min       851.000000   969.000000   100.000000   

Now that you have calculated the mean, use the following quiz to check your answer.