# Quick Start

## Pre-requisites
Delta2K and its supporting software must be installed (see the [installation instructions](Installation.pdf)).

## Introduction

This notebook provides a simplified Python interface for applying the Delta Method to a user-specified data set of dissolved oxygen and temperature time series. The notebook is composed of code cells and markdown (text) cells. Each code cell has a [ ]: symbol next to it, which shows a number indicating the execution order of the cell once the code has been executed. To execute a cell, click into or next to the cell and press 'Shift + Enter'.

## Step 1. Load the software into the current computing environment

First, execute the `import` command below to load the Delta2K software into the notebook environment. To execute, click into or next to the code cell and press 'shift' and 'enter' at the same time.

In [None]:
import delta2k as d2k

*Note that an asterisk appears to the left of the cell while it is executing. Once the code has executed, a number will appear indicating the order of execution in the current state of the notebook.*

## Step 2. Connect to or request data from a valid source

Next, the data source is specified. Currently, there are two types of data supported:
-  NWIS data from the USGS National Water Information System
-  Local data from a user-supplied file

### Step 2A. Request and analyze NWIS data

The cell below provides an example of analyzing data from the NWIS site at the Cuyahoga River in Independence, Ohio. Click beside the cell and press 'shift+enter' to run. Alternatively, you can simply edit the variable name 'cuyahoga' and the inputs to retrieve and analyze data from a different site.

In [None]:
cuyahoga = d2k.source('NWIS', # source type, options 'NWIS' or 'local' (for user-generated data files, see 2B)
                       site='04208000', # USGS site id
                       period=None, # Look back a given number of days: e.g., period = 'P10D' = past 10 days
                       startDate='2019-09-18',
                       endDate='2019-09-20',
                       elevation=582.66) # Elevation of site in NAVD88 feet

The warnings generated by the creation of a solar geometry point can be ignored in the present build of the software.
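The `period` argument in the request above takes a look-back string such as `'P10D'` (past 10 days), while `startDate` and `endDate` take `YYYY-MM-DD` strings. The sketch below shows one way to build those strings with the standard library. The helper names `days_to_period` and `date_window` are hypothetical and are not part of Delta2K; only the string formats come from the example above.

```python
from datetime import date, timedelta


def days_to_period(days):
    """Format a look-back window in days as a duration string, e.g. 10 -> 'P10D'."""
    if not isinstance(days, int) or days < 1:
        raise ValueError("days must be a positive integer")
    return f"P{days}D"


def date_window(end, days):
    """Return (startDate, endDate) as YYYY-MM-DD strings spanning `days` days up to `end`."""
    start = end - timedelta(days=days)
    return start.isoformat(), end.isoformat()


# Hypothetical helpers: they only build strings in the formats used above
period = days_to_period(10)
start_date, end_date = date_window(date(2019, 9, 20), 2)
print(period)
print(start_date, end_date)
```

The example request above sets `period=None` and passes explicit dates instead, so a typical call would presumably supply one form or the other.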

#### Summarize the results

The result from the Approximate Delta Method (ADM) analysis is stored in an attribute of the new data object called `admResult`. This attribute is a dataframe (a time-indexed table) of the intermediate calculations leading to the estimated stream metabolism parameters. A dataframe can be summarized with the `.describe()` method, as shown below:

In [None]:
cuyahoga.admResult.describe()

#### Generate shareable interactive plots of the data and the results

The data and analysis can be reviewed in an interactive graphics browser by executing the command below. A multi-tab panel is generated in a new tab on your web browser. Return to this JupyterLab tab when you are done reviewing the graphics.

In [None]:
cuyahoga.plotADM()

#### Perform the nonlinear regression optimization

The data and all associated site information are now available in an object with the name provided in the code cell above. The object method `fit2DO` can be called to perform a three-parameter nonlinear regression to fit the parameters to the observed oxygen data. The following options can be passed:
-  user_est = None (or omit entirely) will use the ADM result estimates as initial guesses for the optimizer
-  user_est = [ka, Pmax, R] (entered as numbers in brackets separated by commas) can be used with a single day of data to override the ADM initial guess
-  print_results = True (default is False) will print the optimization results to the screen for each day
-  method: the optimization algorithm to use. The default is a bounds-constrained Powell method. All options use an x-tolerance of 1e-6 for acceptable convergence. Bounds are fixed: all parameters must be >= 0, and only reaeration is also bounded from above (ka <= 100). Other enabled options include 'Nelder-Mead', 'SLSQP', and 'L-BFGS-B' (see the scipy.optimize.minimize docs for more detail).
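
The constraints in the list above can be restated as a small pre-check on a user-supplied initial guess. This is only an illustrative sketch: `check_fit_options` is a hypothetical helper, not a Delta2K function, and the software may validate its inputs differently. The method names and bounds are taken from the list above.

```python
# Method names and bounds as listed above; the helper itself is hypothetical
ALLOWED_METHODS = {"Powell", "Nelder-Mead", "SLSQP", "L-BFGS-B"}
KA_MAX = 100.0  # upper bound on reaeration (ka)


def check_fit_options(user_est=None, method="Powell"):
    """Return the initial guess as floats, or None when the ADM estimates should be used."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"method must be one of {sorted(ALLOWED_METHODS)}")
    if user_est is None:
        return None  # fall back to the ADM result estimates
    if len(user_est) != 3:
        raise ValueError("user_est must be [ka, Pmax, R]")
    ka, pmax, r = (float(value) for value in user_est)
    if min(ka, pmax, r) < 0:
        raise ValueError("all parameters must be >= 0")
    if ka > KA_MAX:
        raise ValueError("ka must be <= 100")
    return [ka, pmax, r]


print(check_fit_options([5, 10, 3], method="Nelder-Mead"))
```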

Note: expect the code execution below to take about 2 seconds per day of data provided. If `print_results=True` is passed, progress will be visible, but the printed output adds to the length of the analysis notebook.

In [None]:
cuyahoga.fit2DO(user_est=None,
                print_results=True,
                method='Powell')

#### Generate shareable interactive plots of data and fitted results

A new interactive and shareable file can be generated, now with an extra tab showing the fitted model performance. Again, return to the JupyterLab tab in your browser when finished reviewing the plots to continue with the tutorial.

In [None]:
cuyahoga.plotFitted()

#### Convenience attributes

The data associated with your oxygen analysis is now accessible through the object name provided at the start of this example. You can access important metadata as follows: 

Access source metadata:

In [None]:
cuyahoga.siteinfo

In [None]:
cuyahoga.geography

In [None]:
cuyahoga.timezone

Access source data:

In [None]:
cuyahoga.data

Access ADM result:

In [None]:
cuyahoga.admResult

Create a CSV of the data or results:

You can also generate shareable CSV files of your raw data and the subsequent analyses:

In [None]:
# data
datafile = 'cuyahogaData.csv'
cuyahoga.data.to_csv(datafile)

# results
resultfile = 'cuyahogaResults.csv'
cuyahoga.admResult.to_csv(resultfile)

Access the detailed `fitResult` dictionary for a given day:

In [None]:
cuyahoga.admResult.fitResult

In [None]:
cuyahoga.admResult.fitResult[0]

In [None]:
cuyahoga.admResult.fitResult[0]['fit_ka']
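
The indexing pattern above (`fitResult[0]['fit_ka']`) suggests one dictionary per day, each holding a `fit_ka` entry. Under that assumption, a fitted value can be collected across days with a list comprehension. The values below are made up for illustration and are not results from the Cuyahoga example; only the `fit_ka` key appears in this tutorial.

```python
from statistics import mean

# Stand-in for the per-day fit dictionaries; values are invented for illustration
fit_results = [
    {"fit_ka": 4.2},
    {"fit_ka": 3.8},
    {"fit_ka": 4.6},
]

# Pull the same key out of every day's dictionary
ka_by_day = [day["fit_ka"] for day in fit_results]
print(ka_by_day)
print(mean(ka_by_day))
```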

### Step 2B. Load a user-specified file

A GUI is in development, but for now you can enter the file details in the cell below and execute the subsequent cells to create a Delta2K model from any file.

In [None]:
# Enter filename
fileName = 'tests/YRCuster2012.txt'

# Enter separator/delimiter between observation (column) features
sep = '\t' # acceptable options = ',' for csv or '\t' for tab delimited 

# Enter column names in left-to-right order of occurrence
columns = ['dateTime', 'wtemp_c', 'do_mass']

# Enter data units
water_temp_units = 'C' # or 'F'
diss_ox_units = 'ppm' # or 'pct'
ph_units = None # or 'SU' if pH present

# Enter descriptive site information
siteID = None # a numerical identifier, or None
siteName = 'Yellowstone at Custer'

# Enter geographic site information
datum = 'NAD83(2011)'
latitude = 46.143
longitude = -107.551
navd88_ft = None
navd88_m = 829.519

# Enter timezone site information
## See wikipedia for names @ ...
timeZone = "US/Mountain"
timeFormat = '%m/%d/%y %H:%M' # see [this page on C standard time formats] for additional info...
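
To check that `sep`, `columns`, and `timeFormat` match a file before loading it, a few lines can be parsed by hand with the standard library. The sketch below is not how Delta2K reads files internally. The sample rows are invented, and it assumes the file has no header row, which may not hold for your data.

```python
import csv
import io
from datetime import datetime

# Same settings as in the cell above
sep = '\t'
columns = ['dateTime', 'wtemp_c', 'do_mass']
timeFormat = '%m/%d/%y %H:%M'

# Invented sample rows standing in for the first lines of a data file
sample = "07/01/12 00:00\t21.4\t8.15\n07/01/12 00:15\t21.3\t8.11\n"

reader = csv.DictReader(io.StringIO(sample), fieldnames=columns, delimiter=sep)
rows = []
for record in reader:
    rows.append({
        'dateTime': datetime.strptime(record['dateTime'], timeFormat),
        'wtemp_c': float(record['wtemp_c']),
        'do_mass': float(record['do_mass']),
    })

print(rows[0]['dateTime'])
```

If `timeFormat` does not match the timestamps, `datetime.strptime` raises a `ValueError`, which makes a mismatch easy to spot before running the full analysis.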

Run the cell below without editing.

In [None]:
units = {
    'wtemp_unit': water_temp_units,
    'do_unit': diss_ox_units,
    'ph_unit': ph_units,
}

siteInfo = {
    'siteID': siteID,
    'siteVariables': columns[1:],
    'siteName': siteName
}

geog = {
    'datum': datum,
    'latitude': latitude,
    'longitude': longitude,
    'navd88_ft': navd88_ft,
    'navd88_m': navd88_m,
    'geoidHt_m': None,
    'grs80_m': None
}

time = {
    'tz': timeZone,
    'format': timeFormat
}
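
The `geog` dictionary carries the elevation in both feet and meters, with one of the two left as `None`. Whether Delta2K fills in the missing value itself is not stated here. If you want both on hand, a conversion can be sketched as below. `fill_elevation` is a hypothetical helper, and it assumes the international foot (0.3048 m); a survey-foot datum would differ slightly.

```python
M_PER_FT = 0.3048  # international foot, an assumption of this sketch


def fill_elevation(navd88_ft=None, navd88_m=None):
    """Return (feet, meters), computing whichever value was left as None."""
    if navd88_ft is None and navd88_m is None:
        raise ValueError("provide the elevation in feet or in meters")
    if navd88_m is None:
        navd88_m = navd88_ft * M_PER_FT
    if navd88_ft is None:
        navd88_ft = navd88_m / M_PER_FT
    return navd88_ft, navd88_m


# Elevation of the Custer site as entered above (meters only)
feet, meters = fill_elevation(navd88_m=829.519)
print(round(feet, 2), meters)
```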

Run the cell below without editing:

In [None]:
# create local source instance
custer = d2k.source('local',
                   path = fileName,
                   sep = sep,
                   columns = columns,
                   units = units,
                   site = siteInfo,
                   geography = geog,
                   timezone = time)

Run the cell below. Note that the fitting method is called with all of its options left at their defaults. To see which options you can change, click into the code cell, put your cursor inside the parentheses, and press 'shift + tab' to view the reference documentation for the available parameters.

In [None]:
custer.fit2DO()
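
Besides 'shift + tab', Python's built-in `help()` and the `inspect` module can list a callable's parameters and defaults. The sketch below uses a stand-in function rather than the real method, so that it runs without Delta2K installed. Its parameter names and defaults follow the options described in Step 2A; the actual signature may include more.

```python
import inspect


def fit2DO(user_est=None, print_results=False, method='Powell'):
    """Stand-in with the options described in Step 2A; not the Delta2K implementation."""


# List each parameter with its default value
signature = inspect.signature(fit2DO)
for name, param in signature.parameters.items():
    print(f"{name} (default: {param.default!r})")
```

With the real object, `help(custer.fit2DO)` would print the docstring in the same way.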

In [None]:
custer.plotFitted()

Please do not hesitate to email or call with questions:
<br>
Greg Coyle<br>
gregory.coyle@tufts.edu<br>
706.621.9526