# EchoPro Semi-variogram Workflow

## Import libraries and configure the Jupyter notebook

In [1]:
# libraries used in the Notebook
import matplotlib.pyplot as plt
import numpy as np 

# Python version of EchoPro
import EchoPro

# Allows us to grab the SemiVariogram class so we can use its models
from EchoPro.computation import SemiVariogram as SV

# Allows us to easily use matplotlib widgets in our Notebook
%matplotlib widget

## Set up EchoPro for a specific survey year

### Initialize EchoPro object using configuration files

* `initialization_config.yml` -- parameters independent of survey year
* `survey_year_2019_config.yml` -- parameters specific to survey year
* `source` -- the region of data to use, e.g. US, CAN, or US & CAN
* `exclude_age1` -- whether age-1 hake should be excluded from the analysis (`True` excludes them)

In [2]:
%%time
survey_2019 = EchoPro.Survey(init_file_path='../config_files/initialization_config.yml',
                             survey_year_file_path='../config_files/survey_year_2019_config.yml',
                             source=3, 
                             exclude_age1=True)

A full check of the initialization file contents needs to be done!
A check of the survey year file contents needs to be done!
CPU times: user 7.64 ms, sys: 0 ns, total: 7.64 ms
Wall time: 7.58 ms


### Load and process input data
* The loaded data are stored in `survey_2019`

In [3]:
%%time 
survey_2019.load_survey_data() 

CPU times: user 1.37 s, sys: 0 ns, total: 1.37 s
Wall time: 1.37 s


### Compute the areal biomass density
* The areal biomass density is stored in `survey_2019.bio_calc.transect_results_gdf` as `biomass_density_adult`

In [4]:
%%time
survey_2019.compute_transect_results()

CPU times: user 1.72 s, sys: 5.65 ms, total: 1.73 s
Wall time: 1.73 s


## Obtain Kriging Mesh Data

### Access Kriging mesh object
* Reads mesh data files specified by `survey_2019` 

In [5]:
krig_mesh = survey_2019.get_kriging_mesh()

### Apply coordinate transformations to transect data 
* Longitude transformation
* Lat/Lon to distance

#### Transect points

In [6]:
krig_mesh.apply_coordinate_transformation(coord_type='transect')

## Compute the biomass density semi-variogram and fit a model

* Compute the normalized semi-variogram using the areal biomass density
* Fit a model to the semi-variogram values

### Compute the semi-variogram

#### Initialize semi-variogram calculation
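* Uses the transformed transect points held by `krig_mesh`
* `params` sets the parameters specific to the semi-variogram algorithm: `nlag` (number of lags) and `lag_res` (lag resolution)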

In [7]:
semi_vario = survey_2019.get_semi_variogram(
    krig_mesh,
    params=dict(nlag=30, lag_res=0.002)
)

#### Compute the normalized semi-variogram

In [8]:
%%time
semi_vario.calculate_semi_variogram()

CPU times: user 3.7 s, sys: 1.95 s, total: 5.66 s
Wall time: 3.34 s
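
For readers unfamiliar with the calculation, the idea behind a normalized semi-variogram can be sketched with NumPy alone. This is a minimal illustration of the classical (Matheron) estimator divided by the sample variance, not EchoPro's implementation: the function `empirical_semivariogram`, the synthetic points, and the choice of normalization and binning are all assumptions, and only the parameter names `nlag` and `lag_res` are taken from the cell above.

```python
import numpy as np


def empirical_semivariogram(x, y, values, nlag, lag_res):
    """Classical (Matheron) estimator, normalized by the sample variance.

    Hypothetical helper for illustration; EchoPro's own binning and
    normalization may differ.
    """
    # Pairwise separation distances between all points.
    dist = np.hypot(x[:, None] - x[None, :], y[:, None] - y[None, :])
    # Squared differences of the values for every pair.
    sq_diff = (values[:, None] - values[None, :]) ** 2

    # Use each unordered pair once (upper triangle, no diagonal).
    iu = np.triu_indices(len(values), k=1)
    dist, sq_diff = dist[iu], sq_diff[iu]

    # Assign each pair to a lag bin of width lag_res; drop pairs beyond nlag bins.
    bin_idx = np.floor(dist / lag_res).astype(int)
    in_range = bin_idx < nlag

    counts = np.bincount(bin_idx[in_range], minlength=nlag)
    sums = np.bincount(bin_idx[in_range], weights=sq_diff[in_range], minlength=nlag)

    # gamma(h) = sum of squared differences / (2 * number of pairs) per bin.
    gamma = np.full(nlag, np.nan)
    filled = counts > 0
    gamma[filled] = sums[filled] / (2.0 * counts[filled])
    return gamma / values.var(), counts


# Synthetic, spatially uncorrelated data (assumed, for demonstration only).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 0.05, 200)
y = rng.uniform(0.0, 0.05, 200)
values = rng.normal(size=200)

gamma_norm, pair_counts = empirical_semivariogram(x, y, values, nlag=30, lag_res=0.002)
print(gamma_norm.shape)  # (30,)
```

Because the synthetic values have no spatial structure, the normalized values in well-populated bins should scatter around 1, whereas real survey data typically rise from a small value at short lags toward a plateau.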


In [9]:
semi_vario.gamma_normalized

array([0.55956903, 0.2787168 , 0.50171389, 0.62853021, 0.81700132,
       0.84808277, 0.85626558, 0.88805863, 0.87678714, 0.89533099,
       0.92552751, 0.92969092, 0.92622266, 0.93359387, 0.94453168,
       0.94983155, 0.92341107, 0.91565978, 0.93721164, 0.96212835,
       0.94361429, 0.93675359, 0.96621853, 0.97465786, 0.977012  ,
       0.96146394, 0.98101596, 0.98020563, 0.98974628, 0.95737046])

### Fit a model to the semi-variogram

* A widget to easily fit the model

**Note: The least-squares fit below applies default bounds to all float parameters. Every parameter except `Length scale hole effect` and `Nugget` is bounded by `(0, infinity)`. `Length scale hole effect` and `Nugget` are bounded by `(0, 1e-13)`, which keeps their fitted values near zero.**
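
The effect of such bounds can be sketched with `scipy.optimize.curve_fit`. This is an illustration only: the stand-in model `exp_model`, the synthetic data, the lag spacing, and the starting values are assumptions, and the fitting routine behind the widget may be set up differently.

```python
import numpy as np
from scipy.optimize import curve_fit


def exp_model(lag, sill, ls, exp_pow, nugget):
    """Stand-in exponential-power model (hypothetical, not EchoPro's model)."""
    return nugget + (sill - nugget) * (1.0 - np.exp(-((lag / ls) ** exp_pow)))


# Synthetic, noise-free semi-variogram generated from known parameters.
lag = np.arange(1, 31) * 0.002
gamma = exp_model(lag, 0.95, 0.008, 1.5, 0.0)

# Parameter order: sill, ls, exp_pow, nugget.
p0 = [1.0, 0.01, 1.0, 0.0]
lower = [0.0, 0.0, 0.0, 0.0]
upper = [np.inf, np.inf, np.inf, 1e-13]  # the tight upper bound pins the nugget near zero

popt, _ = curve_fit(exp_model, lag, gamma, p0=p0, bounds=(lower, upper))
sill_fit, ls_fit, exp_pow_fit, nugget_fit = popt
print(sill_fit, ls_fit, exp_pow_fit, nugget_fit)
```

Bounding a parameter by `(0, 1e-13)` is a way to hold it at effectively zero while keeping it in the parameter vector, so the same fitting code works whether or not that parameter is allowed to vary.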

In [10]:
semi_vario.get_widget()

GridspecLayout(children=(Dropdown(description='Semi-variogram model', index=1, layout=Layout(grid_area='widget…

#### Get semi-variogram model parameters
- Returns the selected model function and its current parameter values

**Note: After fitting the model with the widget above, rerun the cell below to obtain the updated model parameters.**

In [11]:
semi_vario.get_params_for_kriging()

{'s_v_model': <function EchoPro.computation.semivariogram.SemiVariogram.generalized_exp_bessel(lag_vec: numpy.ndarray, sill: float, ls: float, exp_pow: float, ls_hole_eff: float, nugget: float) -> numpy.ndarray>,
 's_v_params': {'sill': 1.0,
  'ls': 1.0,
  'exp_pow': 1.0,
  'ls_hole_eff': 1.0,
  'nugget': 0.0}}
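
A dictionary with this layout can be evaluated by unpacking `s_v_params` into the callable stored under `s_v_model`. The sketch below is self-contained and therefore uses a placeholder: `stand_in_model` only mimics the signature shown above and is not EchoPro's `generalized_exp_bessel`, and the lag vector is an assumed spacing.

```python
import numpy as np


def stand_in_model(lag_vec, sill, ls, exp_pow, ls_hole_eff, nugget):
    """Hypothetical placeholder with the same signature as the model above.

    It ignores ls_hole_eff and is NOT EchoPro's generalized_exp_bessel.
    """
    return nugget + (sill - nugget) * (1.0 - np.exp(-((lag_vec / ls) ** exp_pow)))


# Same layout as the dictionary returned by get_params_for_kriging().
krig_params = {
    "s_v_model": stand_in_model,
    "s_v_params": {
        "sill": 1.0,
        "ls": 1.0,
        "exp_pow": 1.0,
        "ls_hole_eff": 1.0,
        "nugget": 0.0,
    },
}

lag_vec = np.arange(30) * 0.002  # assumed lag spacing, for illustration
model_values = krig_params["s_v_model"](lag_vec, **krig_params["s_v_params"])
print(model_values.shape)  # (30,)
```

With the real object, the same call pattern would apply to the dictionary returned by `semi_vario.get_params_for_kriging()`.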

### Compare Python normalized gamma against MATLAB output
- The results agree to within a relative tolerance of `1e-4`

In [12]:
# output produced by Matlab EchoPro
matlab_gamma = np.array([0.55957167, 0.27871057, 0.5017043 , 0.6285182 , 0.81699518,
       0.84807799, 0.85626063, 0.88805541, 0.87678417, 0.89532938,
       0.925526  , 0.92969017, 0.92621991, 0.93359212, 0.94452939,
       0.94982919, 0.92340754, 0.91565693, 0.93720898, 0.96212605,
       0.94361212, 0.93675056, 0.96621628, 0.97465567, 0.97700998,
       0.96146162, 0.98101481, 0.98020469, 0.98974528, 0.95736852])

np.allclose(semi_vario.gamma_normalized, matlab_gamma, rtol=1e-4)

True
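
To see what `rtol=1e-4` means in practice, the sketch below applies the same check to the first three values of each array above. `np.allclose(a, b, rtol=..., atol=...)` tests `|a - b| <= atol + rtol * |b|` element-wise, with `atol` defaulting to `1e-8`, so it measures relative agreement rather than a fixed number of decimal places. The variable names are illustrative.

```python
import numpy as np

# First three values of the Python and MATLAB results shown above.
python_gamma = np.array([0.55956903, 0.2787168, 0.50171389])
matlab_gamma = np.array([0.55957167, 0.27871057, 0.5017043])

abs_diff = np.abs(python_gamma - matlab_gamma)
rel_diff = abs_diff / np.abs(matlab_gamma)
print(abs_diff.max(), rel_diff.max())

# The check used in the notebook passes, while a much tighter one does not.
loose = np.allclose(python_gamma, matlab_gamma, rtol=1e-4)
tight = np.allclose(python_gamma, matlab_gamma, rtol=1e-6)
print(loose)  # True
print(tight)  # False
```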