### Time consumption of semblance calculation

The main problem with semblance calculation is that it is a very time-consuming operation. This notebook shows a few approaches to semblance calculation aimed at reducing its running time.

In [2]:
import sys

import numpy as np
import matplotlib.pyplot as plt

sys.path.append('..')

from seismicpro.batchflow import ImagesBatch, Dataset, V, B, Pipeline, D
from seismicpro import FieldIndex, CustomIndex, SeismicDataset, seismic_plot, SeismicBatch

### Semblance calculation

For semblance calculation we use the following formula:
$$S = \frac{\sum_{i=k-N/2}^{k+N/2}\left(\sum_{j=1}^{M} f_{ij}\right)^2}{M \sum_{i=k-N/2}^{k+N/2}\sum_{j=1}^{M} f_{ij}^2} \text{ , where }$$

* k - central time sample of the window
* N - window size
* M - number of traces
* f_{ij} - amplitude at time sample i of trace j
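To make the formula concrete, here is a minimal NumPy sketch that evaluates it for a single window. It is not the library's implementation: the function name `semblance_at`, the `(n_samples, n_traces)` array layout, and the clipping of the window at the array edges are all assumptions made for illustration.

```python
import numpy as np

def semblance_at(gather, k, window):
    """Semblance of `gather` in a window of `window` samples centred on sample k.

    `gather` is assumed to be a 2-D array of shape (n_samples, n_traces);
    this layout is an assumption of the sketch, not taken from seismicpro.
    """
    n_samples, n_traces = gather.shape
    half = window // 2
    # Clip the window to the valid sample range (an illustrative choice).
    lo = max(k - half, 0)
    hi = min(k + half + 1, n_samples)
    win = gather[lo:hi]
    # Numerator: energy of the stacked trace, summed over the window.
    numerator = np.sum(np.sum(win, axis=1) ** 2)
    # Denominator: M times the total energy of all traces in the window.
    denominator = n_traces * np.sum(win ** 2)
    return numerator / denominator if denominator > 0 else 0.0

# Identical traces are perfectly coherent, so semblance is close to 1.
coherent = np.tile(np.sin(np.linspace(0, 3, 100))[:, None], (1, 8))
print(semblance_at(coherent, 50, 11))

# Uncorrelated noise gives a much lower value.
noise = np.random.default_rng(0).standard_normal((100, 8))
print(semblance_at(noise, 50, 11))
```

The value always lies between 0 and 1: it reaches 1 when all traces are identical within the window and drops toward 0 when the traces cancel each other in the stack.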

This section contains a few approaches to semblance calculation, compared by running time.

The first approach uses numba with three nested loops: the function ```_calc_semb_hard``` in the utils, which one can find [here](../seismicpro/src/semblance_utils.py).
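The body of ```_calc_semb_hard``` is not shown in this notebook, so the following is only a sketch of what a three-loop evaluation of the formula above might look like. The function name `semblance_loops`, the `(n_samples, n_traces)` layout, and the choice of loops (time sample, window sample, trace) are assumptions. It also ignores whatever ```calculate_semblance``` does with its other arguments. The numba decorator (for example `numba.njit`) is left out so that the sketch runs without numba installed.

```python
import numpy as np

def semblance_loops(gather, window):
    """Semblance for every time sample of `gather`, using explicit loops.

    Hypothetical illustration: `gather` is assumed to have shape
    (n_samples, n_traces); windows are clipped at the array edges.
    """
    n_samples, n_traces = gather.shape
    half = window // 2
    result = np.zeros(n_samples)
    for k in range(n_samples):                       # loop 1: window centre
        numerator = 0.0
        denominator = 0.0
        start = max(k - half, 0)
        stop = min(k + half + 1, n_samples)
        for i in range(start, stop):                 # loop 2: samples in window
            stacked = 0.0
            for j in range(n_traces):                # loop 3: traces
                value = gather[i, j]
                stacked += value
                denominator += value * value
            numerator += stacked * stacked
        if denominator > 0.0:
            result[k] = numerator / (n_traces * denominator)
    return result

rng = np.random.default_rng(1)
gather = rng.standard_normal((40, 6))
semblance = semblance_loops(gather, window=5)
print(semblance.shape)
```

Plain scalar loops of this kind are slow in the Python interpreter but are the sort of code that numba's just-in-time compilation is designed to speed up, which is consistent with the timings reported in this notebook.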

In [13]:
%%time
batch = (SeismicDataset(ix_raw, batch_class=SeismicBatch).next_batch(1)
         .load(fmt='segy', components='raw', tslice=slice(1500))
         .sort_traces(src='raw', dst='raw', sort_by='offset')
         .calculate_semblance('raw', 'semblance_hard', [1200, 6000], 30, window=51, method='hard')
)

CPU times: user 3.36 s, sys: 35.1 ms, total: 3.39 s
Wall time: 3.38 s


The second approach uses one numba loop and matrix operations: the function ```_calc_semb_hard_numba_mx```.

In [12]:
%%time
batch = (SeismicDataset(ix_raw, batch_class=SeismicBatch).next_batch(1)
         .load(fmt='segy', components='raw', tslice=slice(1500))
         .sort_traces(src='raw', dst='raw', sort_by='offset')
         .calculate_semblance('raw', 'semblance_numba_matrix', [1200, 6000], 30, window=51, method='numba_matrix')
)

CPU times: user 6.78 s, sys: 129 ms, total: 6.91 s
Wall time: 6.89 s


The third approach uses one loop with pure numpy matrix operations: the function ```_calc_semb_hard_matrix```.

In [11]:
%%time
batch = (SeismicDataset(ix_raw, batch_class=SeismicBatch).next_batch(1)
         .load(fmt='segy', components='raw', tslice=slice(1500))
         .sort_traces(src='raw', dst='raw', sort_by='offset')
         .calculate_semblance('raw', 'semblance_matrix', [1200, 6000], 30, window=51, method='matrix')
)

CPU times: user 11.4 s, sys: 845 ms, total: 12.2 s
Wall time: 12.2 s


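To conclude, the fastest method in these runs is numba with three nested loops (3.38 s wall time), compared with 6.89 s for one numba loop with matrix operations and 12.2 s for one loop with pure numpy matrix operations.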