# Import statements

In [None]:
import plotly as py
import numpy as np
import pandas as pd
from datetime import datetime
from scipy import special

from rodtox import *

#set up offline mode for plotly
py.offline.init_notebook_mode(connected=False)

# Data import
## Define the paths to the files we need. Those files are:
* csvDO: .csv file containing Rodtox DO data.
* csvTemp: .csv file containing Rodtox temperature data.
* sourcepath: The path to the directory containing the original .csv files (relative to the notebook's directory)
* destinationpath: The path to the directory where the final data may be written.
* existing: Name of an already existing file that you want to analyze alongside the new data (existing = None by default, so you can leave out the keyword if you don't need it).

In [None]:
csvDO = 'DO_log22AoutB.csv'
csvTemp = 'Temp_log22AoutB.csv'
sourcepath='To_import'
destinationpath='Imported'
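
Before importing, it can help to confirm that both files are actually present in `sourcepath`. The helper below is only a sketch: `missing_inputs` is a hypothetical name, not part of `rodtox`, and it uses nothing but the standard library.

```python
import os

def missing_inputs(sourcepath, filenames):
    """Return the names in `filenames` that are not files inside `sourcepath`."""
    return [name for name in filenames
            if not os.path.isfile(os.path.join(sourcepath, name))]

# Hypothetical usage with the variables defined above:
# missing = missing_inputs(sourcepath, [csvDO, csvTemp])
# if missing:
#     print('Missing input files:', missing)
```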

In [None]:
Data, Start, End = ImportRdtxCsv(csvDO,csvTemp, sourcepath, destinationpath)  # ex: existing='DO_log22AoutA.csv'

## Plot the imported data
Just to make sure it's what we wanted. 

In [None]:
y1_series = ['DO']
y2_series = ['Temp']

y1_labels = ['Raw DO']
y2_labels = ['Temperature']

y1_units = ['mg/l']
y2_units = ['°C']

marks = ['lines','lines']

figure = Plotit(Data, Start, End, y1_series, y2_series, y1_labels, y2_labels, y1_units, y2_units, marks)
py.offline.iplot(figure)

The plot is interactive, so you can zoom in on the section you want and choose the start time and end time of your analysis.

In [None]:
Starttime = '10 August 2017 00:00:00'
Endtime = '10 August 2017 12:00:00'
analysis_data = Data[Starttime:Endtime]
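
The slice above relies on the data frame having a sorted `DatetimeIndex`, which lets pandas accept date strings as slice bounds. Here is a small self-contained sketch of that behaviour on synthetic data (the `demo` frame is a made-up stand-in for `Data`):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for `Data`: one value per minute over two days.
index = pd.date_range('2017-08-09 00:00:00', '2017-08-10 23:59:00', freq='min')
demo = pd.DataFrame({'DO': np.linspace(8.0, 6.0, len(index))}, index=index)

# Label-based slicing on a DatetimeIndex keeps both endpoints.
window = demo.loc['10 August 2017 00:00:00':'10 August 2017 12:00:00']
print(window.index[0], window.index[-1], len(window))
```

Unlike positional slicing, the end label is part of the result, so the row at 12:00:00 is kept.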

## Smoothen the data using the Deriv function.
Inputs:

* The data frame to analyze
* Tau: The delay constant of the DO probe, in seconds. For the RODTOX's probe it was 32 seconds (at least in 2017).
* Window size: The size of the rolling window for the rolling average.

In [None]:
analysis_data = Deriv(analysis_data, 32, 21)
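
The internals of `Deriv` are not shown in this notebook. As a rough illustration of what a probe delay constant and a rolling window are typically used for, the sketch below assumes a first-order lag model, DO_actual = DO_measured + tau * dDO/dt, followed by a centered rolling mean. The function name `lag_correct_and_smooth` and the model itself are assumptions, not necessarily what `rodtox` does.

```python
import numpy as np
import pandas as pd

def lag_correct_and_smooth(do, tau, window):
    """Hypothetical sketch: first-order lag correction, then a centered rolling mean.

    do     : pandas Series of measured DO (mg/l) on a DatetimeIndex
    tau    : probe delay constant in seconds
    window : number of samples in the rolling window
    """
    values = do.to_numpy(dtype=float)
    # Elapsed time in seconds since the first sample.
    t = np.asarray((do.index - do.index[0]).total_seconds(), dtype=float)
    # Numerical derivative dDO/dt (mg/l per second).
    ddo_dt = np.gradient(values, t)
    # Assumed first-order lag model: DO_actual = DO_measured + tau * dDO/dt.
    corrected = pd.Series(values + tau * ddo_dt, index=do.index)
    return corrected.rolling(window, center=True, min_periods=1).mean()
```

An odd window size such as 21 keeps the centered window symmetric around each sample.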

Again, let's plot the data to make sure everything's going according to plan.

In [None]:
y1_series = ['DO','DO_A']
y2_series = ['Temp']

y1_labels = ['Raw DO','DO smooth']
y2_labels = ['Temperature']

y1_units = ['mg/l','mg/l']
y2_units = ['°C']

marks = ['lines','lines']

figure = Plotit(analysis_data, Starttime, Endtime, y1_series, y2_series, y1_labels, y2_labels, y1_units, y2_units, marks)
py.offline.iplot(figure)

Now we're ready for some analysis!

## calc_kla
calc_kla locates the important points of each peak in the time series and calculates the KLa of each peak.

Inputs: 
* df: DataFrame containing the data to analyze (must have gone through the Analyze_Respirograms1 function beforehand)
* Start: A string defining the start of the data series to analyze
* End: A string defining the end of the data series to analyze
* Sample_Volume: The volume (in liters) of each wastewater sample added to the RODTOX by the measurement pump during the investigated time series.
* Cal_Volume: The volume (in liters) of each calibration solution sample added to the RODTOX by the calibration pump during the investigated time series.

Outputs:
* A figure showing the important points of each peak.
* DF: The original DataFrame with added rows containing the function's results.
* stBODresults: A DataFrame containing the stBOD of each wastewater sample peak in the data series.
* A figure showing the value of KLa found for each peak, as well as the fitted reaeration curve of each peak (dotted lines)

There might be a big scary red block saying "A value is trying to be set on a copy of a slice from a DataFrame". This is pandas' SettingWithCopyWarning, a warning rather than an error. Don't worry about it, it's going to work anyway ;)

In [None]:
analysis_data, Kla = calc_kla(analysis_data, Starttime, Endtime, 0.250, 0.012)
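
To give an idea of how a KLa value can be obtained from a reaeration curve, here is a minimal sketch. It assumes the common model in which the DO returns exponentially to an equilibrium value, DO(t) = DO_eq - (DO_eq - DO_0) * exp(-KLa * t), and fits the slope of the log-deficit. The function `estimate_kla` is hypothetical and the fitting method used inside `calc_kla` may differ.

```python
import numpy as np

def estimate_kla(t, do, do_eq):
    """Estimate KLa (1/s) from the reaeration part of one peak.

    t     : times in seconds since the bottom of the peak
    do    : DO concentrations (mg/l) measured at those times
    do_eq : equilibrium DO (mg/l) that the curve returns to
    """
    t = np.asarray(t, dtype=float)
    deficit = do_eq - np.asarray(do, dtype=float)
    keep = deficit > 0  # the logarithm needs a positive deficit
    # ln(do_eq - DO) = ln(do_eq - DO_0) - KLa * t, so the slope is -KLa.
    slope, _ = np.polyfit(t[keep], np.log(deficit[keep]), 1)
    return -slope

# Synthetic check: a noise-free curve generated with KLa = 0.005 1/s.
t_demo = np.arange(0.0, 600.0, 10.0)
do_demo = 8.0 - 2.0 * np.exp(-0.005 * t_demo)
print(estimate_kla(t_demo, do_demo, do_eq=8.0))
```

On real, noisy data the points closest to equilibrium have a tiny deficit and dominate the error of the logarithm, so it is usually worth restricting the fit to the early part of the reaeration.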

## calc_stbod
calc_stbod calculates the short-term BOD of each respirogram.

Inputs: 
* df: DataFrame containing the data to analyze (must have gone through the Analyze_Respirograms1 function beforehand)
* Filtered: DataFrame containing rows that each describe a separate respirogram.
* Start: A string defining the start of the data series to analyze
* End: A string defining the end of the data series to analyze
* Sample_Volume: The volume (in liters) of each wastewater sample added to the RODTOX by the measurement pump during the investigated time series.
* Cal_Volume: The volume (in liters) of each calibration solution sample added to the RODTOX by the calibration pump during the investigated time series.
* dec_time: This parameter helps to locate decantation peaks within the time series. It defines the maximum time delta (in seconds) allowed between a respirogram's 'Start' and its 'Bottom' DO concentration. When the time between those two events is larger than dec_time, the respirogram is tagged as a decantation peak.

Outputs:
* DF: The original DataFrame with added rows containing the function's results.
* Filtered: A DataFrame containing rows describing each peak in the data series.

In [None]:
analysis_data, stbod = calc_stbod(analysis_data, Kla, Starttime, Endtime, 0.250, 0.012, 800)
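
The dec_time rule described above can be illustrated with a few lines of pandas. This is only a sketch: the column names 'Start', 'Bottom' and 'Decantation' and the function `tag_decantation` are hypothetical, and the actual layout of the table returned by `calc_kla` may be different.

```python
import pandas as pd

def tag_decantation(peaks, dec_time):
    """Return a copy of `peaks` with a boolean 'Decantation' column.

    peaks    : DataFrame with datetime columns 'Start' and 'Bottom' (hypothetical names)
    dec_time : threshold in seconds
    """
    tagged = peaks.copy()  # work on a copy so the caller's frame is untouched
    seconds_to_bottom = (tagged['Bottom'] - tagged['Start']).dt.total_seconds()
    tagged['Decantation'] = seconds_to_bottom > dec_time
    return tagged

peaks = pd.DataFrame({
    'Start':  pd.to_datetime(['2017-08-10 01:00:00', '2017-08-10 03:00:00']),
    'Bottom': pd.to_datetime(['2017-08-10 01:05:00', '2017-08-10 03:20:00']),
})
print(tag_decantation(peaks, 800)['Decantation'].tolist())  # [False, True]
```

The first peak reaches its bottom after 300 seconds and is kept as a normal respirogram. The second takes 1200 seconds, which exceeds the 800 second threshold.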

Let's look at the obtained stbod data:

In [None]:
stbod

Let's save the final stbod data to a .csv file. So that it's easy to retrieve, let's add the beginning and end dates of the analysis to the filename and store it in a folder named "stBOD results".

In [None]:
import os

results_dir = 'stBOD results'
# Create the results folder if it does not exist yet; to_csv does not create it.
os.makedirs(results_dir, exist_ok=True)

beginning = datetime.strptime(Starttime, '%d %B %Y %H:%M:%S').strftime('%Y%m%d')
end = datetime.strptime(Endtime, '%d %B %Y %H:%M:%S').strftime('%Y%m%d')

filename = 'stbod_{}_{}.csv'.format(beginning, end)
stbod.to_csv(os.path.join(results_dir, filename), sep=';')
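
To load a saved file again later, the separator has to match the one used when writing. Here is a small round-trip sketch on a made-up table (the `demo` frame and its 'stBOD' column are placeholders, not real results):

```python
import os
import tempfile
import pandas as pd

# Stand-in for the real results table.
demo = pd.DataFrame({'stBOD': [12.5, 14.0]}, index=pd.Index([0, 1], name='peak'))

path = os.path.join(tempfile.mkdtemp(), 'stbod_demo.csv')
demo.to_csv(path, sep=';')

# Read the file back with the same separator; the first column holds the index.
restored = pd.read_csv(path, sep=';', index_col=0)
print(restored)
```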