## <span style="color:green">Introduction</span>
This program downloads annual peak flow data from the USGS Surface Data Portal for a user-specified station and calculates the flow corresponding to different return periods.

This code is written for Python 3.

Revision No: 05

Last Revised : 2020-06-05

## <span style="color:green">Import the packages/modules required for this exercise</span>

<p>We need the following packages: urllib.parse, urllib.request, os, math, scipy.stats, numpy (np), and gamma and invgamma from scipy.stats. The parentheses contain the commonly used short form of a library.</p>

In [None]:
## Import the required Modules/Packages for obtaining the data from portal
import urllib.parse
import urllib.request
import os

## Import the required Modules/Packages for calculating return period flow using Gamma Inverse Function
import math
import scipy.stats
import numpy as np
#import csv
from scipy.stats import gamma
from scipy.stats import invgamma

## <span style="color:green">Definition of Function for retrieval of Peak Flow Data</span> 

<p style='text-align: justify;'>Let us define the first function in Python, which collects the data from the USGS web link using the urllib package. This function is invoked later in the code.</p>
<ul>
<li>Step 1: <span style="color:red">Build the URL using the station code.</span></li>
<li>Step 2: <span style="color:red">Access the data using the URL and gather the data (date, flow data, station name).</span></li>
<li>Step 3: <span style="color:red">Decode the data and extract only the required data.</span></li>
<li>Step 4: <span style="color:red">Return the flow data and station name.</span></li>
</ul>
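
Step 1 can be tried on its own before running the full function. The sketch below is only an illustration: the helper name `build_peak_url` and the example station number are assumptions made for this example, while the URL parts are the ones used in this notebook.

```python
import urllib.parse

def build_peak_url(station_number):
    """Build the USGS peak-flow URL for one station (hypothetical helper)."""
    base = 'https://nwis.waterdata.usgs.gov/nwis/peak?'
    # urlencode turns the dict into a 'key=value' query string
    query = urllib.parse.urlencode({'site_no': station_number})
    return base + query + '&agency_cd=USGS&format=rdb'

# '01646500' is used here only as an example station number
example_link = build_peak_url('01646500')
print(example_link)
```

Using `urlencode` rather than plain string concatenation means any special characters in the input are escaped before they reach the URL.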

In [None]:
## Define a function for obtaining the peak flow data from USGS Surface Data Portal
def GetAnnualPeakFlowData_f(station_number,FolderName):
    """
    Input: Station Number, Folder Name
    Output: Peak Flow Values, Station Name
    """
    ## Building URLs
    var1 = {'site_no': station_number}
    part1 = 'https://nwis.waterdata.usgs.gov/nwis/peak?'
    part2 = '&agency_cd=USGS&format=rdb'
    link = (part1 + urllib.parse.urlencode(var1) + part2)
    print("The USGS Link is: \t")
    print (link)
    
    ## Opening the link & retrieving data
    response = urllib.request.urlopen(link)
    html = response.read()
    
    ## Assigning the location & Storing the Original Data
    
    #DataStore=FolderName + station_number + ".txt"
    with open(FolderName+'Data_' + station_number + '_raw'  + '.txt', 'wb') as f1:
        f1.write(html)
    ## No explicit close is needed: the with-block closes the file
    
    ## Converts html from bytes class to str class
    html = html.decode()
    ## Splits the string into a list of lines (handles both \n and \r\n line endings)
    html2 = html.splitlines()
    
    ## To get the station name
    station_name = ""  ## Fallback if no matching header line is found
    for line in html2:
        ## Check if the first seven characters (slice 0:7) are "#  USGS"
        if line[0:7]=="#  USGS":
            station_name=line[3:]
            break
    
    ## Define the output string, starting with the header row
    reqd_data = 'Year,Discharge'+'\n'
    #print(type(reqd_data))
    reqd_flow_list=[]
    reqd_flow_list.append(["Year","Discharge"])
    
    ## Skips the header block; the offset 74 assumes a fixed-length rdb header
    for line in html2[74:]:
        ## Splits each line to col by tab separator
        cols = line.split('\t')
        if len(cols) == 1:
            continue
        ## Joins only date and peakflow
        ## cols[2] corresponds to Date of peak streamflow (format YYYY-MM-DD)
        ## cols[4] corresponds to Annual peak streamflow value in cfs
        newline = ','.join([cols[2],cols[4]])
        reqd_data += newline + '\n'
        reqd_flow_list.append(cols[4])

    
    ## Converts reqd_data from str class to bytes class
    reqd_data = reqd_data.encode() 
    ## Saves the date and peakflow into a new file
    with open(FolderName+'Data_' + station_number + '_reqd'  + '.txt', 'wb') as f2:
        f2.write(reqd_data)
    ## No explicit close is needed: the with-block closes the file
    print ('\n')
    print("Raw data and processed data are stored in the Results folder.")
    #print(reqd_data)
    
    ## Returns the peak flow data as list for calculation of return period
    return (reqd_flow_list,station_name)
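
The function above relies on a fixed line offset and fixed column positions. As a hedged alternative, the sketch below locates the columns by name instead. It assumes the rdb layout is: comment lines starting with `#`, one header row, one column-format row, then tab-separated data rows. The column names `peak_dt` and `peak_va`, the helper name `parse_peak_rdb`, and the sample text are assumptions for illustration, not real USGS data.

```python
def parse_peak_rdb(text, date_col='peak_dt', flow_col='peak_va'):
    """Return (date, flow) string pairs from tab-separated rdb text."""
    # Drop comment lines (starting with '#') and blank lines
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith('#')]
    header = lines[0].split('\t')
    i_date, i_flow = header.index(date_col), header.index(flow_col)
    rows = []
    # lines[1] is assumed to be the column-format row, so data starts at lines[2]
    for ln in lines[2:]:
        cols = ln.split('\t')
        if len(cols) > max(i_date, i_flow):
            rows.append((cols[i_date], cols[i_flow]))
    return rows

# Tiny made-up sample in the assumed layout (not real USGS data)
sample = (
    "# comment line\n"
    "#  USGS 00000000 EXAMPLE RIVER\n"
    "agency_cd\tsite_no\tpeak_dt\tpeak_tm\tpeak_va\n"
    "5s\t15s\t10d\t6s\t8s\n"
    "USGS\t00000000\t2001-04-12\t\t1500\n"
    "USGS\t00000000\t2002-03-30\t\t2300\n"
)
pairs = parse_peak_rdb(sample)
print(pairs)
```

Looking columns up by name keeps the parser working if the number of comment lines in the header changes.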

## <span style="color:green">MAIN CODE</span> 
Now the user enters the station number of the desired USGS station. The code then calls the function defined above and stores the data in the Results folder.

In [None]:
## Main Code

station_number=input("Enter USGS Station Number of the Required Station (USGS Station Number/site_no) \t")
print('\t')
FolderName="./Results/"

## Make folder to save the results (no error if it already exists)
os.makedirs(FolderName, exist_ok=True)

peakflow_list_wb,station_name=GetAnnualPeakFlowData_f(station_number,FolderName)
print("\nThe station name is:", station_name,"\n")

## <span style="color:green">Years for Analysis</span> 

Now the user has to enter four years: the start and end of the data period, and the start and end of the analysis period. Enter these carefully; otherwise you will get the error message "Error in length of data and check whether it is continuous".

In [None]:
## Enter the four years for carrying out the analysis
## Input data & analysis years

data_start_year=int(input("Enter the starting year of DATA PERIOD (excluding initial break period):"))
print('\t')
data_end_year=int(input("Enter the ending year of DATA PERIOD:"))
print('\t')
analysis_start_year=int(input("Enter the starting year of ANALYSIS PERIOD:"))
print('\t')
analysis_end_year=int(input("Enter the ending year of ANALYSIS PERIOD:"))
print('\t')
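
The check behind that error message is not shown in this notebook. One plausible reading, sketched below with a hypothetical helper `check_years`, is that the number of records must match a continuous data period and the analysis period must lie inside it. This is an assumption about the intended check, not the notebook's own code.

```python
def check_years(data_start, data_end, analysis_start, analysis_end, n_records):
    """Return True when the four years and the record count are consistent."""
    # A continuous annual series has exactly one record per year
    expected = data_end - data_start + 1
    if n_records != expected:
        return False
    # The analysis period must lie inside the data period
    return data_start <= analysis_start <= analysis_end <= data_end

print(check_years(1950, 2019, 1980, 2019, 70))  # True: 70 records for 70 years
print(check_years(1950, 2019, 1980, 2019, 65))  # False: 5 records missing
```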

## <span style="color:green">Calculation of Return Period</span> 
Next, we have to write the code that calculates the return period flow using the moving average method.

In [None]:
<WRITE YOUR CODE HERE>
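
As a possible building block for this exercise, the sketch below shows only the gamma quantile step: fit a two-parameter gamma distribution by the method of moments and read off the flow with annual exceedance probability 1/T. It does not implement the moving average procedure. The helper name `gamma_return_flow`, the choice of the method of moments, and the synthetic flow values are all assumptions for illustration.

```python
import numpy as np
from scipy.stats import gamma

def gamma_return_flow(flows, return_period):
    """Flow with return period T from a moment-fitted gamma distribution."""
    flows = np.asarray(flows, dtype=float)
    mean = flows.mean()
    var = flows.var(ddof=1)          # sample variance
    shape = mean**2 / var            # method-of-moments shape
    scale = var / mean               # method-of-moments scale
    # Non-exceedance probability for return period T is 1 - 1/T
    return gamma.ppf(1.0 - 1.0 / return_period, a=shape, scale=scale)

# Synthetic annual peak flows in cfs (illustrative values, not USGS data)
synthetic_flows = [1200, 1850, 950, 2400, 1600, 3100, 1400, 2050, 1750, 2800]
for T in (2, 10, 50, 100):
    print(T, round(float(gamma_return_flow(synthetic_flows, T)), 1))
```

`gamma.ppf` is the inverse of the gamma cumulative distribution function, which is the role the "Gamma Inverse Function" plays in the import comments above.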
