## Import necessary libraries and configure Jupyter notebook

Note: Depending on your operating system, you may see the warning `OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.`. This warning does **not** affect the output of this notebook and can be ignored.

In [None]:
from EchoPro import EchoPro
import numpy as np
import matplotlib.pyplot as plt
# grab the SemiVariogram class so we can use its models
from EchoPro.semivariogram import SemiVariogram as SV
%matplotlib widget

# Load and Process Data 

In this section we use the configuration files and initialization parameters to load all files needed for the biomass density calculation. From the loaded files we then compute the normalized biomass density of the raw data.

The following variables representing the data are constructed:
* `params` -- a dictionary of all parameters from the configuration files
* `strata_df` -- a minimal DataFrame of the data contained in `filename_strata`
* `strata_ds` -- an Xarray Dataset containing the `strata_df` data and computed quantities
* `geo_strata_df` -- a minimal DataFrame of the data contained in `stratification_filename`
* `length_df` -- a minimal DataFrame of the data contained in `filename_length_US/CAN` and computed quantities for the provided `species_code_ID`
* `specimen_df` -- a minimal DataFrame of the data contained in `filename_specimen_US/CAN` for the provided `species_code_ID`
* `nasc_df` -- a minimal DataFrame of the data contained in the appropriate NASC file, e.g. `filename_processed_data_no_age1` or `filename_processed_data_all_ages`
* `final_biomass_table` -- a DataFrame containing a subset of data from `nasc_df` and the calculated normalized biomass density

All of these variables can be accessed through `epro_2019`, e.g. `epro_2019.strata_df`.

Note: Once `epro_2019` has been created, all computational routines can be accessed using this object.

Note: The run below prints the following reminders, which can be ignored: `A check of the initialization file needs to be done!`, `A check of the survey year file needs to be done!`, and `We are using our own biomass density calculation!`.

In [None]:
%%time
epro_2019 = EchoPro(init_file_path='./config_files/initialization_config.yml',
                    survey_year_file_path='./config_files/survey_year_2019_config.yml',
                    source=3, 
                    exclude_age1=True)

### Display the final biomass table

In [None]:
epro_2019.final_biomass_table.head()

# Jolly-Hampton CV Analysis

Here we compute the mean Jolly-Hampton coefficient of variation (CV) for data that has not been Kriged.

Note: The algorithm used to compute this value is stochastic, so different runs can produce slightly different values.

In [None]:
%%time
# np.NINF was removed in NumPy 2.0; -np.inf works in all NumPy versions
lat_INPFC = [-np.inf, 36, 40.5, 43.000, 45.7667, 48.5, 55.0000]  # INPFC
CV_JH_mean = epro_2019.run_cv_analysis(lat_INPFC, kriged_data=False)
print(f"CV_JH_mean = {CV_JH_mean}")
# The output should be approximately CV_JH_mean = 0.1337
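
The list `lat_INPFC` reads as a set of latitude boundaries for the INPFC strata, with negative infinity as an open lower bound. As a minimal sketch, assuming the boundaries act as half-open bin edges (how `run_cv_analysis` uses them internally is not shown in this notebook), `np.digitize` assigns each latitude to a stratum. The sample latitudes are made up for illustration.

```python
import numpy as np

# Same boundaries as in the cell above, written with -np.inf
lat_edges = np.array([-np.inf, 36, 40.5, 43.0, 45.7667, 48.5, 55.0])

# Hypothetical transect latitudes, for illustration only
sample_lats = np.array([35.2, 37.9, 44.1, 50.0])

# np.digitize returns i such that lat_edges[i-1] <= lat < lat_edges[i]
stratum = np.digitize(sample_lats, lat_edges)
print(stratum)
```

Latitudes at or above the last boundary (55.0) would fall into an extra bin with index `len(lat_edges)`.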

# Obtain Kriging Mesh Data

Here we obtain the mesh and associated data needed to compute the semi-variogram and perform Kriging.

Running the cell below produces the following variables, accessible via `krig_mesh`:
* `mesh_gdf` -- a GeoPandas GeoDataFrame obtained from data in `filename_grid_cell`
* `smoothed_contour_gdf` -- a GeoPandas GeoDataFrame obtained from data in `filename_smoothed_contour`

Additionally, this initialization creates routines that can plot and transform the mesh data.

In [None]:
# obtain kriging mesh class
krig_mesh = epro_2019.get_kriging_mesh()

## Plot the mesh, transects, and smoothed contour

* Transect points are represented by a changing color gradient (these can be seen by zooming in)
* The full mesh points are red 
* The smoothed contour points are blue

In [None]:
# Plots the transect points on the folium map
fmap = krig_mesh.plot_points(epro_2019.final_biomass_table.reset_index(), 
                             lon_name="Longitude", lat_name="Latitude", 
                             cmap_column='Transect', color='hex')

# Plot full mesh points 
fmap = krig_mesh.plot_points(krig_mesh.mesh_gdf, 
                             lon_name='Longitude of centroid', lat_name='Latitude of centroid', 
                             fmap=fmap, color='red')

# Plot smoothed contour points 
fmap = krig_mesh.plot_points(krig_mesh.smoothed_contour_gdf, 
                             lon_name="Longitude", lat_name="Latitude", fmap=fmap, color='blue')

# display the folium map
fmap

## Transforming a set of points

The semi-variogram and Kriging calculations require transformed longitude/latitude points. Below we demonstrate a convenience routine, accessible via `krig_mesh`, that applies this transformation to the transect points.

In [None]:
# apply transformations to transect points 
trans_df = krig_mesh.apply_longitude_transformation(epro_2019.final_biomass_table)
D_x = trans_df['Longitude'].max() - trans_df['Longitude'].min()
D_y = trans_df['Latitude'].max() - trans_df['Latitude'].min()
x_transect, y_transect = krig_mesh.apply_distance_transformation(trans_df, D_x, D_y)

In [None]:
# plot the transformed points 
plt.plot(x_transect, y_transect, 'r*', markersize=1.25)
plt.show()
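
The cells above pass the longitude and latitude extents `D_x` and `D_y` into the distance transformation, which suggests the coordinates are scaled by the survey extent. The exact formulas inside `apply_longitude_transformation` and `apply_distance_transformation` are not shown in this notebook, so the sketch below is only an illustration of extent scaling. The helper `scale_by_extent` and the sample coordinates are hypothetical.

```python
import numpy as np

def scale_by_extent(lon, lat, d_x, d_y):
    """Hypothetical helper: center the coordinates and divide by the extents."""
    lon = np.asarray(lon, dtype=float)
    lat = np.asarray(lat, dtype=float)
    x = (lon - lon.mean()) / d_x
    y = (lat - lat.mean()) / d_y
    return x, y

# Made-up coordinates, for illustration only
lon = np.array([-125.0, -124.5, -124.0])
lat = np.array([40.0, 42.0, 44.0])

d_x = lon.max() - lon.min()
d_y = lat.max() - lat.min()
x, y = scale_by_extent(lon, lat, d_x, d_y)
print(x, y)
```

Scaling a second set of points (such as the mesh) with the same `d_x` and `d_y` keeps both sets in a common coordinate system, which is presumably why the notebook reuses `D_x` and `D_y` later.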

# Compute Semi-Variogram and fit a model

Below we demonstrate how to compute the normalized semi-variogram for the transect points using the normalized biomass density. We then show how to fit a model to the normalized semi-variogram data. 

In [None]:
# setup bins for semi-variogram calculation
nlag = 30 
lag_res = 0.002
center_bins = lag_res*np.arange(nlag)

In [None]:
# initialize semi-variogram class using the transect points
semi_vario = epro_2019.get_semi_variogram(x_transect, y_transect, 
                                          epro_2019.final_biomass_table['normalized_biomass_density'].values.flatten())

## Compute the semi-variogram

In [None]:
%%time
# run the semi-variogram calculation 
semi_vario.calculate_semi_variogram(center_bins)

# display the semi-variogram values
semi_vario.gamma_standardized
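
For intuition, the classical (Matheron) semi-variogram estimator averages half the squared difference between all pairs of values whose separation falls in a given lag bin. The sketch below is an assumption-laden illustration, not EchoPro's implementation: it does not apply the standardization implied by `gamma_standardized`, and the function name `empirical_semivariogram` is hypothetical. It assumes bin centers of the form `lag_res*np.arange(nlag)`, as in the cell that sets up `center_bins`.

```python
import numpy as np

def empirical_semivariogram(x, y, z, center_bins, lag_res):
    """Classical (Matheron) estimator, binned to the nearest bin center."""
    x, y, z = (np.asarray(a, dtype=float) for a in (x, y, z))
    i, j = np.triu_indices(len(z), k=1)           # all unique pairs
    dist = np.hypot(x[i] - x[j], y[i] - y[j])     # pair separations
    sq_diff = (z[i] - z[j]) ** 2
    # Index of the nearest bin center for every pair
    bin_idx = np.rint(dist / lag_res).astype(int)
    gamma = np.full(len(center_bins), np.nan)     # NaN marks empty bins
    for b in range(len(center_bins)):
        in_bin = bin_idx == b
        if in_bin.any():
            gamma[b] = 0.5 * sq_diff[in_bin].mean()
    return gamma

# Toy data: four points on a line with a linear trend
xs = np.arange(4.0)
ys = np.zeros(4)
zs = xs.copy()
gamma = empirical_semivariogram(xs, ys, zs, center_bins=1.0 * np.arange(4), lag_res=1.0)
print(gamma)
```

Pairs farther apart than the last bin are ignored. The pairwise approach needs memory proportional to the square of the number of points, so it suits small examples only.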

## Fit a model to the semi-variogram

To run Kriging, we need to fit a model to the normalized semi-variogram values. We provide a widget to display this model and allow one to actively change parameters within the model. 

Note: When you run the least-squares fit, all model parameters are updated and the fitted model is plotted in red. The apply model button plots the model for the values currently entered in the boxes. Each time you change these values, deselect and reselect the apply model button to display the updated model.

In [None]:
semi_vario.view_semi_variogram()

# Perform Kriging

Below we perform Ordinary Kriging using the constructed transformed mesh points, the semi-variogram model, and the normalized biomass density.   

## Setup preliminary variables necessary for Kriging

In [None]:
# apply transformation to the mesh points
trans_dfm = krig_mesh.apply_longitude_transformation(krig_mesh.mesh_gdf, 
                                                     gdf_lon_name='Longitude of centroid', 
                                                     gdf_lat_name='Latitude of centroid')

# Note we are using D_x and D_y computed in a previous cell
x_mesh, y_mesh = krig_mesh.apply_distance_transformation(trans_dfm, D_x, D_y,
                                                         gdf_lon_name='Longitude of centroid',
                                                         gdf_lat_name='Latitude of centroid')

In [None]:
# initialize kriging routine
krig = epro_2019.get_kriging()

In [None]:
# Initialize Kriging parameters
k_max = 10
k_min = 3
R = 0.0226287
ratio = 0.001

# parameters for semi-variogram model
s_v_params = {'nugget': 0.0, 'sill': 0.95279, 'ls': 0.0075429,
              'exp_pow': 1.5, 'ls_hole_eff': 0.0}

# grab appropriate semi-variogram model
s_v_model = SV.generalized_exp_bessel
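
To see what these parameters imply, here is a minimal sketch of a generalized exponential semi-variogram model. It assumes that with `ls_hole_eff = 0` the Bessel (hole-effect) factor equals 1, since the Bessel function J0 satisfies J0(0) = 1, so only the exponential part remains. The formula is a common textbook form and an assumption here; it is not copied from `SV.generalized_exp_bessel`, and the function name `generalized_exp` is hypothetical.

```python
import numpy as np

def generalized_exp(h, nugget, sill, ls, exp_pow):
    """Assumed form: nugget + (sill - nugget) * (1 - exp(-(h / ls)**exp_pow))."""
    h = np.asarray(h, dtype=float)
    return nugget + (sill - nugget) * (1.0 - np.exp(-(h / ls) ** exp_pow))

# Same lag centers and parameter values as used in this notebook
lags = 0.002 * np.arange(30)
model_vals = generalized_exp(lags, nugget=0.0, sill=0.95279,
                             ls=0.0075429, exp_pow=1.5)
print(model_vals[:5])
```

Under this form the model starts at the nugget for zero lag and levels off at the sill once the lag is several length scales (`ls`) long.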

## Perform Ordinary Kriging

Below we perform Ordinary Kriging on the normalized biomass density using the established parameters. This routine returns:

* `ep_arr` -- Kriging variance for each mesh coordinate
* `eps_arr` -- Kriging sample variance for each mesh coordinate
* `vp_arr` -- Kriged value for each mesh coordinate

In [None]:
%%time
ep_arr, eps_arr, vp_arr = krig.run_kriging(x_mesh, x_transect, 
                                           y_mesh, y_transect, 
                                           epro_2019.final_biomass_table['normalized_biomass_density'].values.flatten(), 
                                           k_max, k_min, R, ratio, 
                                           s_v_params, s_v_model)
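
For readers unfamiliar with Ordinary Kriging, the sketch below solves the standard Ordinary Kriging system for a single target point using its `k_max` nearest samples. It is a textbook illustration, not the code behind `run_kriging`: it ignores `k_min`, the search radius `R`, and `ratio`, and the variogram function, its parameter values, and the data are all hypothetical.

```python
import numpy as np

def variogram(h, sill=1.0, ls=0.2, exp_pow=1.5, nugget=0.0):
    """Assumed generalized exponential model (illustrative parameters)."""
    return nugget + (sill - nugget) * (1.0 - np.exp(-(h / ls) ** exp_pow))

def ordinary_kriging_point(x0, y0, x, y, z, k_max=10):
    """Kriged value, Kriging variance, and weights at (x0, y0)."""
    d0 = np.hypot(x - x0, y - y0)
    idx = np.argsort(d0)[:k_max]                  # k_max nearest samples
    xs, ys, zs = x[idx], y[idx], z[idx]
    k = len(idx)
    # Pairwise distances between the selected samples
    dist = np.hypot(xs[:, None] - xs[None, :], ys[:, None] - ys[None, :])
    # Kriging matrix: variogram block bordered by the unbiasedness constraint
    a = np.ones((k + 1, k + 1))
    a[:k, :k] = variogram(dist)
    a[k, k] = 0.0
    b = np.ones(k + 1)
    b[:k] = variogram(d0[idx])
    sol = np.linalg.solve(a, b)
    w, mu = sol[:k], sol[k]                       # weights, Lagrange multiplier
    est = w @ zs
    var = w @ b[:k] + mu
    return est, var, w

# Synthetic data, for illustration only
rng = np.random.default_rng(0)
x = rng.random(20)
y = rng.random(20)
z = np.sin(3 * x) + y

est, var, w = ordinary_kriging_point(x[0], y[0], x, y, z)
print(est, z[0], var)
```

With a zero nugget, Kriging at a sample location reproduces the sample value with zero variance, and the weights always sum to one.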

## Compute the total Kriged biomass estimate 

This should produce a total Kriged biomass estimate of 1725.0331199094 kmt.

In [None]:
Area = krig_mesh.mesh_gdf['Cell portion'].values*epro_2019.params['kriging_A0']

krig_biomass_vals = (vp_arr*Area)*1e-6 # in kmt
tot_krig_biomass = np.nansum(krig_biomass_vals)

print(f"Total Kriged Biomass Estimate {tot_krig_biomass} (kmt) \n")

print(np.isclose(tot_krig_biomass, 1725.0331199094))
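
The calculation above multiplies each Kriged density by its cell area and sums the result while skipping NaN values. The toy sketch below repeats that arithmetic on made-up numbers. It assumes the density is in kg per unit area, which the `1e-6` factor suggests (1 kmt = 1e6 kg); the value of `a0` is hypothetical and merely stands in for `kriging_A0`.

```python
import numpy as np

# Made-up values, for illustration only
density = np.array([2.0e6, np.nan, 5.0e5])    # assumed kg per unit area
cell_fraction = np.array([1.0, 1.0, 0.5])     # plays the role of 'Cell portion'
a0 = 6.25                                     # hypothetical full-cell area

area = cell_fraction * a0
biomass_kmt = density * area * 1e-6           # kg -> kmt
total_kmt = np.nansum(biomass_kmt)            # NaN cells contribute nothing

n_missing = int(np.isnan(biomass_kmt).sum())
print(f"total = {total_kmt} kmt, cells skipped = {n_missing}")
```

Because `np.nansum` drops NaN cells silently, counting them as done here is a cheap check that the total is not missing a large part of the mesh.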

## Plot Kriged Biomass estimate in kmt

Red points represent a higher biomass estimate.

In [None]:
k_mesh_x = krig_mesh.mesh_gdf['Latitude of centroid'].values.flatten()
k_mesh_y = krig_mesh.mesh_gdf['Longitude of centroid'].values.flatten()

krig.plot_kriging_results(k_mesh_x, k_mesh_y, krig_biomass_vals)