# SAID 
### Back-end demo

## Load the constituent data
In the developed model, the constituent data is suspended-sediment concentration (SSC).
This cell loads SSC observations from a tab-delimited text file, drops the sample method code, and displays the Pandas DataFrame containing the SSC observations.

In [None]:
%matplotlib inline
import os

from surrogatemodel import ConstituentData

data_directory = r'.\SpoonRiverAcoustics'

# load constituent data (SSC)
ssc_filename = r'spoonSSC.txt'
ssc_file_path = os.path.join(data_directory, ssc_filename)
ssc_data = ConstituentData.read_tab_delimited_data(ssc_file_path)

# drop the sample method code
ssc_data = ssc_data.drop_variables(['SampleMethod'])

# show constituent dataset
ssc_data.get_data()
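
For readers without the SAID package at hand, the load-and-drop step above can be sketched with the standard library alone. This is not how `ConstituentData` is implemented. The sample text and its column names other than `SampleMethod` and `SSC` are hypothetical, and the real `spoonSSC.txt` layout may differ.

```python
import csv
import io

# Hypothetical sample contents; the real spoonSSC.txt layout may differ.
SAMPLE_TEXT = (
    "DateTime\tSSC\tSampleMethod\n"
    "2015-03-01 10:15\t120.0\t10\n"
    "2015-03-02 14:30\t85.5\t20\n"
)


def read_tab_delimited(text, drop=()):
    """Parse tab-delimited text into row dicts, omitting the columns in `drop`."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {name: value for name, value in row.items() if name not in drop}
        for row in reader
    ]


# Load the observations and drop the sample method code.
rows = read_tab_delimited(SAMPLE_TEXT, drop=("SampleMethod",))
for row in rows:
    print(row)
```

Values are left as strings here; a real loader would also parse the timestamps and convert the concentrations to numbers.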

## Load the surrogate data
The surrogate variable will ultimately be mean sediment-corrected backscatter (MeanSCB), which is calculated from raw acoustic backscatter data. The following cell loads the raw backscatter from two sets of Argonaut files and displays the raw data.

In [None]:
from acoustic import RawBackscatterData

# load Argonaut data
acoustic_filenames = ['SPOON001', 'SPOON002']
advm_data = RawBackscatterData.read_argonaut_data(data_directory, acoustic_filenames[0])
for filename in acoustic_filenames[1:]:
    tmp_advm_data = RawBackscatterData.read_argonaut_data(data_directory, filename)
    advm_data = advm_data.add_data(tmp_advm_data, keep_curr_obs=True)

# show the contents of the raw data
advm_data.get_data()
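
The merge performed by `add_data` can be pictured with a small, dependency-free sketch. It assumes that `keep_curr_obs=True` means an observation already in the data set wins when both sets contain the same timestamp; that reading is inferred from the parameter name, not confirmed by the library. The function `add_observations` and the sample values are hypothetical.

```python
from datetime import datetime


def add_observations(current, new, keep_current=True):
    """Merge two {timestamp: value} mappings and return them sorted by time.

    When a timestamp appears in both mappings, `keep_current` decides
    whether the value from `current` or the value from `new` is kept.
    """
    first, second = (new, current) if keep_current else (current, new)
    merged = {**first, **second}  # the later mapping wins on duplicate keys
    return dict(sorted(merged.items()))


# Made-up backscatter values with one overlapping timestamp (10:15).
set_one = {datetime(2015, 3, 1, 10, 0): 70.0, datetime(2015, 3, 1, 10, 15): 72.0}
set_two = {datetime(2015, 3, 1, 10, 15): 99.0, datetime(2015, 3, 1, 10, 30): 74.0}

combined = add_observations(set_one, set_two, keep_current=True)
for timestamp, value in combined.items():
    print(timestamp, value)
```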

## View the ADVM configuration parameters
In addition to backscatter data, the RawBackscatterData.read_argonaut_data() method also loads configuration parameters from the Argonaut data set. The next cell shows the configuration parameters that have been loaded from the data set.

In [None]:
# show configuration parameters
configuration_parameters = advm_data.get_configuration_parameters()
configuration_parameters

## Initialize ADVM backscatter data processing parameters
To process the backscatter data, the processing class needs user-defined processing parameters. The next cell creates and displays default processing parameters.

In [None]:
# create processing parameters and show default parameter values
from acoustic import ADVMProcParam
processing_parameters = ADVMProcParam(configuration_parameters['Number of Cells'])
processing_parameters

## Create an instance of the backscatter processing class
The ADVMBackscatterDataProcessor class handles the processing of backscatter data in order to calculate the acoustic parameters MeanSCB and sediment attenuation coefficient (SAC). The following cell initializes a processor instance using the default processing parameters and displays the results.

In [None]:
# create processor and show results using default processing parameters
from acoustic import ADVMBackscatterDataProcessor
abs_processor = ADVMBackscatterDataProcessor(advm_data, processing_parameters)
abs_processor.get_acoustic_parameters().get_data()
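
As a rough illustration of what the two acoustic parameters represent, the sketch below follows one commonly published single-frequency formulation: SAC is taken as minus one half of the slope of water-corrected backscatter (WCB) against range along the beam, and MeanSCB is the mean of WCB after adding back the two-way sediment attenuation. Whether `ADVMBackscatterDataProcessor` performs exactly these steps is an assumption, and the function names and profile values are made up.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def sediment_acoustic_parameters(cell_ranges, wcb):
    """Return (SAC, MeanSCB) for one water-corrected backscatter profile.

    cell_ranges: distance along the beam to each cell, in meters
    wcb: water-corrected backscatter in each cell, in dB
    """
    slope, _ = fit_line(cell_ranges, wcb)
    sac = -0.5 * slope  # dB/m; the factor of 2 accounts for the two-way path
    scb = [w + 2.0 * sac * r for r, w in zip(cell_ranges, wcb)]
    return sac, sum(scb) / len(scb)


# Synthetic profile that loses 1 dB per meter of range.
ranges_m = [1.0, 2.0, 3.0, 4.0, 5.0]
wcb_db = [79.0, 78.0, 77.0, 76.0, 75.0]

sac, mean_scb = sediment_acoustic_parameters(ranges_m, wcb_db)
print(sac, mean_scb)
```

For this synthetic profile the sediment correction exactly removes the decay with range, so every cell has the same corrected backscatter and MeanSCB equals that common value.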

## Create a rating model
In the next cell, a rating model is initialized with the following information.

* Constituent variable: SSC
* Surrogate variable: MeanSCB
* Match method: Mean, centered around constituent observation time
* Mean time window width: 30 minutes

A scatter plot with the model fit line and confidence intervals is also shown.

In [None]:
# create a surrogate model using SSC as the constituent and MeanSCB as the surrogate
from acoustic import BackscatterRatingModel
rating_model = BackscatterRatingModel(ssc_data, abs_processor, 
                                      constituent_variable='SSC', 
                                      surrogate_variables=['MeanSCB'], 
                                      match_method='mean', 
                                      match_time=30)
rating_model.plot()
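
The "mean, centered around constituent observation time" match method can be sketched as follows: for each sample time, average every surrogate observation that falls within a 30-minute window centered on that time. This is a minimal sketch, not the library's code. The helper `match_mean` is hypothetical, and treating the window endpoints as inclusive is an assumption.

```python
from datetime import datetime, timedelta


def match_mean(sample_time, surrogate_series, window_minutes=30):
    """Average the surrogate values inside a window centered on sample_time.

    surrogate_series: iterable of (timestamp, value) pairs
    Returns None when no surrogate observation falls inside the window.
    """
    half_window = timedelta(minutes=window_minutes) / 2
    values = [
        value
        for timestamp, value in surrogate_series
        if abs(timestamp - sample_time) <= half_window  # endpoints included
    ]
    return sum(values) / len(values) if values else None


# Made-up MeanSCB observations at 15-minute spacing.
surrogate = [
    (datetime(2015, 3, 1, 10, 0), 70.0),
    (datetime(2015, 3, 1, 10, 15), 72.0),
    (datetime(2015, 3, 1, 10, 30), 74.0),
    (datetime(2015, 3, 1, 10, 45), 90.0),
]

matched = match_mean(datetime(2015, 3, 1, 10, 15), surrogate)
unmatched = match_mean(datetime(2015, 3, 1, 12, 0), surrogate)
print(matched, unmatched)
```

Averaging over a window smooths short-term noise in the surrogate, at the cost of blurring rapid changes that occur during the window.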

## View backscatter profile plots

In [None]:
fig = rating_model.plot_backscatter_profiles()
fig.set_size_inches(15, 10)

## Change processing parameters
The default processing parameters do not produce satisfactory results: the linear model above is a poor fit, and the backscatter profiles indicate a problem with the data.

The next cell changes the parameters, recalculates the acoustic parameters, and shows the recalculated values.

In [None]:
# adjust processing parameters and recalculate acoustic parameters
processing_parameters.update({'Backscatter Values': 'Amp',
                              'Beam': 2,
                              'WCB Profile Adjustment': True,
                              'Near Field Correction': True})
abs_processor.calculate_acoustic_parameters(processing_parameters)
abs_processor.get_acoustic_parameters().get_data()

## Recreate a rating model

In [None]:
rating_model = BackscatterRatingModel(ssc_data, abs_processor, 
                                      constituent_variable='SSC', 
                                      surrogate_variables=['MeanSCB'], 
                                      match_method='mean', 
                                      match_time=30)
rating_model.plot()

## Transform the constituent variable
The plot above shows a non-linear trend. This can be corrected by log-transforming the constituent variable, which brings the model in line with the physically based, recommended form of the single-frequency sediment acoustic linear model, shown below.

$$\log_{10}{SSC}=\beta_{0}+\beta_{1}\overline{SCB}$$

The following cell transforms the constituent variable and shows the plot of the linear model.

In [None]:
# log10 transform SSC
rating_model.set_constituent_transform('log10')
rating_model.plot()
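
To make the transformed model concrete, the sketch below fits $\beta_{0}$ and $\beta_{1}$ by ordinary least squares on $\log_{10}{SSC}$ and back-transforms a prediction to concentration units. It is a minimal illustration with synthetic data, not the fitting code used by the rating model. The function names are hypothetical, and the plain back-transform ignores retransformation bias.

```python
import math


def fit_log10_rating(mean_scb, ssc):
    """Least-squares fit of log10(SSC) = b0 + b1 * MeanSCB; returns (b0, b1)."""
    y = [math.log10(c) for c in ssc]
    n = len(mean_scb)
    x_mean = sum(mean_scb) / n
    y_mean = sum(y) / n
    sxx = sum((x - x_mean) ** 2 for x in mean_scb)
    sxy = sum((x - x_mean) * (yi - y_mean) for x, yi in zip(mean_scb, y))
    b1 = sxy / sxx
    return y_mean - b1 * x_mean, b1


def predict_ssc(b0, b1, scb):
    """Back-transform the fitted line to concentration units."""
    return 10 ** (b0 + b1 * scb)


# Synthetic data generated exactly from b0 = -1.0 and b1 = 0.04.
scb_values = [60.0, 70.0, 80.0, 90.0]
ssc_values = [10 ** (-1.0 + 0.04 * s) for s in scb_values]

b0, b1 = fit_log10_rating(scb_values, ssc_values)
print(b0, b1, predict_ssc(b0, b1, 75.0))
```

Because the synthetic data are noise-free, the fit recovers the generating coefficients; real observations scatter around the line.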

## Remove outliers
The next cell removes observations that have been determined to be outliers and shows a scatter plot of the linear regression model.

In [None]:
# remove outliers
model_index = rating_model.get_model_dataset().index
rating_model.exclude_observations(model_index[[6, 19, 37, 52]])
rating_model.plot()

## Plot backscatter profiles

In [None]:
fig = rating_model.plot_backscatter_profiles()
fig.set_size_inches(15, 10)

## Show diagnostic plots
So far, the model is looking pretty good. The next cell creates a single figure with multiple sets of axes and draws some standard diagnostic plots for assessing the quality of the regression. The plots, from left to right and top to bottom, are:

* Model scatter plot (transformed constituent variable)
* Variable scatter plot (non-transformed constituent variable)
* Model predicted versus observed plot (transformed constituent variable)
* Variable predicted versus observed plot (non-transformed constituent variable)
* Residual versus fitted plot
* Residual probability plot
* Standardized serial correlation coefficient plot
* Residuals plotted against time

In [None]:
import matplotlib.pyplot as plt
fig, ((ax1, ax2), (ax3, ax4), (ax5, ax6), (ax7, ax8)) = plt.subplots(nrows=4, ncols=2)
rating_model.plot(ax=ax1)
rating_model.plot('variable_scatter', ax=ax2)
rating_model.plot('model_pred_vs_obs', ax=ax3)
rating_model.plot('pred_vs_obs', ax=ax4)
rating_model.plot('resid_vs_fitted', ax=ax5)
rating_model.plot('resid_probability', ax=ax6)
rating_model.plot('serial_correlation', ax=ax7)
rating_model.plot('resid_vs_time', ax=ax8)
fig.set_size_inches(15, 20)
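
One quantity behind the serial correlation diagnostic is the lag-1 correlation of the residuals when they are ordered by time. The sketch below computes it with the usual sample autocorrelation formula. How the library standardizes the coefficient for its plot is not shown here, so this is an illustration of the idea only, with made-up residuals.

```python
def lag1_serial_correlation(residuals):
    """Lag-1 sample autocorrelation of residuals ordered by observation time."""
    n = len(residuals)
    mean = sum(residuals) / n
    deviations = [r - mean for r in residuals]
    # Products of each deviation with the one that follows it in time.
    numerator = sum(a * b for a, b in zip(deviations[:-1], deviations[1:]))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator


# Made-up residuals that alternate in sign (strong negative correlation).
alternating = [1.0, -1.0, 1.0, -1.0]
r1 = lag1_serial_correlation(alternating)
print(r1)
```

A value near zero suggests successive residuals are unrelated, while values far from zero hint that the model is missing time-dependent structure.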

## Show a quantile plot
The next cell shows a quantile plot of the surrogate observations from the entire loaded time series, with the quantiles of the model observations indicated.

In [None]:
rating_model.plot('quantile')

## Generate model archive report
In the next cell, a report containing standard ordinary least squares regression statistics is generated and displayed. The report itself can be saved to a CSV file.

In [None]:
rating_model.get_model_report()
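
A few of the statistics that ordinary least squares reports typically contain can be computed directly from observed and fitted values. Which statistics `get_model_report()` actually lists is not shown here, so the selection below is an assumption; the function name and the sample numbers are made up.

```python
import math


def regression_summary(observed, fitted, n_params=2):
    """Basic goodness-of-fit statistics for a fitted regression.

    n_params counts the estimated coefficients (intercept plus slope = 2).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))  # error sum of squares
    sst = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r_squared = 1 - sse / sst
    return {
        "n": n,
        "r_squared": r_squared,
        "adj_r_squared": 1 - (1 - r_squared) * (n - 1) / (n - n_params),
        "residual_std_error": math.sqrt(sse / (n - n_params)),
    }


# Made-up observed and fitted values of the transformed response.
summary = regression_summary([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.1, 3.9])
for name, value in summary.items():
    print(name, value)
```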

## Show a predicted time series
The cell below shows a time series of the predicted constituent, computed from the loaded surrogate time series. The locations of included and excluded model observations are indicated. Missing observations would also be indicated if the model data set contained them. The 90% prediction interval is plotted as well.

In [None]:
fig = plt.figure()
fig.set_size_inches(20, 10)
ax = fig.add_subplot(111)
rating_model.plot('time series', ax=ax)
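
A 90% prediction interval for a simple linear regression can be sketched with the textbook formula: fitted value plus or minus t times s times sqrt(1 + 1/n + (x0 - mean(x))^2 / Sxx). The sketch works on the log10 scale and back-transforms the limits. It is not the library's implementation: the data are made up, the function name is hypothetical, and the Student's t critical value is passed in by hand to keep the example dependency-free.

```python
import math


def prediction_interval(x0, x, y, t_crit):
    """Return (lower, fit, upper) for a new observation at x0 from a linear fit."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    b1 = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    b0 = y_mean - b1 * x_mean
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    half_width = t_crit * s * math.sqrt(1 + 1 / n + (x0 - x_mean) ** 2 / sxx)
    fit = b0 + b1 * x0
    return fit - half_width, fit, fit + half_width


# Made-up MeanSCB (dB) and log10(SSC) values, for illustration only.
mean_scb = [60.0, 62.0, 65.0, 68.0, 70.0, 73.0, 75.0, 78.0, 80.0, 85.0]
log_ssc = [1.42, 1.45, 1.63, 1.70, 1.83, 1.90, 2.02, 2.10, 2.22, 2.38]
T_CRIT = 1.860  # two-sided 90% Student's t value for 10 - 2 = 8 degrees of freedom

lower, fit, upper = prediction_interval(72.0, mean_scb, log_ssc, T_CRIT)
# Back-transform the log10 limits to concentration units.
print(10 ** lower, 10 ** fit, 10 ** upper)
```

The interval is symmetric on the log10 scale but asymmetric after back-transformation, and it widens as the surrogate value moves away from the mean of the calibration data.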