# State of the seismic interpretation

This notebook sums up our current progress on several seismic-related tasks:

* [Horizon detection](#detection)
* [Horizon extension](#extension)
* [Interlayers segmentation](#segmentation)

<a id='detection'></a>

# Horizon detection

This task is closely connected with the day-to-day work of seismic experts: the very first thing they need to do with a cube is pick seismic horizons. A seismic horizon is a boundary between two layers of rock across which rock properties change, particularly seismic velocity and density. Such changes are visible in seismic images (even to an untrained eye) and can be detected automatically. We have a [notebook](./Horizons_detection.ipynb) that demonstrates the entire process of solving the task: from indexing the dataset and the data-feeding procedure to training a neural network and computing the metrics. Some of the most important parts:

* **Dataset:** we have 4 seismic cubes, each with 4-10 hand-labeled horizons. The smallest one is held out for testing purposes

* **Data-feeding procedure:** since we usually can't fit an entire cube into GPU memory (only the smallest ones fit), we cut cubes into crops and use them as training data. We also apply augmentations such as scaling, cutout, additive noise and elastic transform

* **Neural network architecture:** we use an `EncoderDecoder`: sophisticated building blocks (such as `Inception` ones) are applied while reducing spatial resolution in order to capture the inner structure of the crop, and the initial resolution is then restored to produce fine-grained labels

* **Training procedure:** we use the `Adam` optimizer for 1000 epochs to minimize the `Dice` loss (equivalently, to maximize the `Dice` coefficient); 80% of the ilines in each of the three training cubes are used at train time

* **Validation:** we first validate the model on the remaining 20% of the ilines in the three training cubes; we then check its performance on the held-out cube in order to evaluate model quality fairly
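
The data-feeding and training steps listed above can be illustrated with a minimal NumPy sketch. It is not the code from the linked notebook: the function names, the random (rather than contiguous) iline split, the crop shape and the synthetic cube are all assumptions made for illustration. The loss is written as one minus the Dice coefficient, so that minimizing it maximizes the overlap between prediction and mask.

```python
import numpy as np


def split_ilines(n_ilines, train_share=0.8, seed=0):
    """Randomly split iline indices into train and validation parts (assumed scheme)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_ilines)
    n_train = int(round(train_share * n_ilines))
    return np.sort(order[:n_train]), np.sort(order[n_train:])


def sample_crop(cube, mask, ilines, crop_shape, rng):
    """Cut one random crop and the matching mask, starting at one of `ilines`."""
    di, dx, dh = crop_shape
    # Keep only starting ilines that leave room for the whole crop.
    starts = ilines[ilines <= cube.shape[0] - di]
    i = rng.choice(starts)
    x = rng.integers(0, cube.shape[1] - dx + 1)
    h = rng.integers(0, cube.shape[2] - dh + 1)
    window = (slice(i, i + di), slice(x, x + dx), slice(h, h + dh))
    return cube[window], mask[window]


def dice_loss(pred, target, eps=1e-7):
    """1 - Dice coefficient: 0 for a perfect match, close to 1 for no overlap."""
    intersection = np.sum(pred * target)
    dice = (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)
    return 1.0 - dice


# Synthetic stand-in for a seismic cube: (ilines, xlines, samples).
rng = np.random.default_rng(42)
cube = rng.normal(size=(50, 64, 128)).astype(np.float32)
mask = np.zeros_like(cube)
mask[:, :, 60] = 1.0  # a flat synthetic horizon

train_ilines, val_ilines = split_ilines(cube.shape[0])
# A crop that is one iline thick never leaks into validation ilines.
crop, crop_mask = sample_crop(cube, mask, train_ilines, (1, 32, 64), rng)
```

With crops that span several ilines, a random split like this one would let training crops overlap validation ilines, so a contiguous split may be the safer choice in that case.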


### Evaluation

We use several metrics: the area of the detected horizon (relative to the hand-labeled one), the mean difference between the two horizons, and the area of the detected horizon that is closer than 5 ms to the ground truth. To make the study more thorough, we also trained multiple models on different datasets.
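
To make these metrics concrete, here is a hedged sketch of how they could be computed. It assumes each horizon is stored as a 2D map of times in ms over (ilines, xlines), with `NaN` where the horizon is absent; the function name and this layout are illustrative, not the library's API. The window metric is normalized by the detected (overlapping) area, which is consistent with the tables reporting window values higher than the area values.

```python
import numpy as np


def horizon_metrics(predicted, labeled, window_ms=5.0):
    """Compare two horizons given as 2D maps of times in ms (NaN = no pick)."""
    has_label = ~np.isnan(labeled)
    overlap = has_label & ~np.isnan(predicted)
    n_overlap = overlap.sum()
    if n_overlap == 0:
        return {"area": 0.0, "mean_error": np.nan, "area_in_window": 0.0}

    error = np.abs(predicted[overlap] - labeled[overlap])
    return {
        # Share of hand-labeled traces that the model also picked, %.
        "area": 100.0 * n_overlap / has_label.sum(),
        # Mean absolute time difference on the overlap, ms.
        "mean_error": float(error.mean()),
        # Share of the overlap that is closer than `window_ms` to the label, %.
        "area_in_window": 100.0 * float((error < window_ms).mean()),
    }


# Toy example: the prediction covers half of the labeled area,
# is 2 ms off almost everywhere and 10 ms off in a single trace.
labeled = np.full((10, 10), 1000.0)
predicted = labeled + 2.0
predicted[:, 5:] = np.nan
predicted[0, 0] = 1010.0

metrics = horizon_metrics(predicted, labeled)
```

On this toy input the sketch gives an area of 50%, a mean error of 2.16 ms and 98% of the detected area within the 5 ms window.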

First of all, we created an individual model for each of the three cubes. Moreover, we used only every 200th iline, totalling no more than 15 slides. Because seismic cubes change slowly along their spatial dimensions, we expect the model to easily recover the horizons on the rest of the cube. For convenience, we created `Python` [scripts](./../scripts) that essentially repeat the described pipeline but are easier to run at scale. Each model was tested on the same cube it was trained on and scored the following results:

| Train/test cube |   Area, % | Mean error, ms | Area in 5ms window, % |
| :------ | ----: | ----: | ----: |
| CUBE_1 | 74, 74, 71  | 1.5, 2.1, 4.4 | 96, 94, 85 |
| CUBE_3 | 92, 92, 92  | 1.8, 1.9, 2.5 | 99.8, 99.5, 91.6 | 
| CUBE_VU_ONGMK |  73, 83, 84 | 3, 3.6, 4.3 | 89, 86, 84 |

Results in the table are consistent with our observations: the horizons that are found are tracked accurately, and the main room for improvement lies in enlarging the covered area. The only thing that catches the eye is the suspiciously low area of detected horizons on `CUBE_1`. Returning to the dataset [description](./../datasets/Horizonts_modelling.ipynb), we can easily identify the root of the problem: for some reason, hand-labeled horizons are present in the zero traces, and that skews our results.

Models trained on only one cube hardly work on the others: they just can't adjust to the different values in the traces. Thus, we need to train a model on multiple cubes in order for it to generalize to the others.

Having 3 cubes in total, we trained a model on each pair of them and used it as a predictor on the remaining one. The results (with the same metrics as before) are shown in the table:

| NOT in the training |   Area, % | Mean error, ms | Area in 5ms window, % |
| :------ | ----: | ----: | ----: |
| CUBE_1 | 5  | 0.85 | 94 |
| CUBE_3 | 45  | 1 | 97 | 
| CUBE_VU_ONGMK |  25 | 50 | 0 |

Again, the results are not surprising: where the structure of the test cube resembles that of the training ones, the model can follow the horizon quite well, and the easiest cube is labeled best. It is the covered area that is the problem: different types of internal noise do not allow the model to generalize well enough over long distances. The hardest cube (`CUBE_VU_ONGMK`) stays pretty much unlabeled due to its exceptional difficulty.
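
The hold-one-cube-out protocol used for this table can be written down in a few lines. This is only a sketch of the bookkeeping: the helper name is hypothetical, and the actual training and prediction steps (done by the notebook and scripts) are left out.

```python
def leave_one_cube_out(cube_names):
    """Yield (train_cubes, test_cube) pairs where each cube is held out once."""
    for held_out in cube_names:
        train = [name for name in cube_names if name != held_out]
        yield train, held_out


cubes = ["CUBE_1", "CUBE_3", "CUBE_VU_ONGMK"]
folds = list(leave_one_cube_out(cubes))
for train, test in folds:
    # A model would be trained on `train` and evaluated on `test` here.
    print(f"train on {train}, predict on {test}")
```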

### Criticism

The task at hand is ill-defined: it is very unclear which horizons on a slide we want to get, how many of them, and by which picking rules. Current labels are quite inconsistent. They have different phases: some sit on amplitude maxima, others on minima. They also separate different objects: most follow the brightest line on the slide, some lie between crucial seismic facies, and a few track fissures. Most of them were made by automatic autocorrelation and are neither really interesting nor hard.
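
The phase inconsistency mentioned above can be checked numerically. The sketch below assumes a cube stored as a 3D amplitude array of shape (ilines, xlines, samples) and a horizon stored as a 2D array of integer sample indices with `-1` where it is absent; both the layout and the function name are assumptions for illustration. It reports the share of picks that sit on a local amplitude maximum and on a local minimum along the trace.

```python
import numpy as np


def horizon_phase_shares(cube, depths):
    """Share of picks on a local amplitude maximum / minimum along the trace.

    cube:   3D array of amplitudes, shape (ilines, xlines, samples)
    depths: 2D integer array of sample indices, -1 where the horizon is absent
    """
    n_samples = cube.shape[2]
    # Neighbours above and below are needed, so skip picks on the cube border.
    valid = (depths >= 1) & (depths <= n_samples - 2)
    safe = np.where(valid, depths, 1)[..., None]

    at = np.take_along_axis(cube, safe, axis=2)[..., 0]
    above = np.take_along_axis(cube, safe - 1, axis=2)[..., 0]
    below = np.take_along_axis(cube, safe + 1, axis=2)[..., 0]

    on_max = valid & (at > above) & (at > below)
    on_min = valid & (at < above) & (at < below)
    n_valid = max(int(valid.sum()), 1)
    return on_max.sum() / n_valid, on_min.sum() / n_valid


# Toy cube with a peak at sample 10 and a trough at sample 15 in every trace.
cube = np.zeros((4, 4, 20))
cube[:, :, 10] = 1.0
cube[:, :, 15] = -1.0

# A horizon that follows the peak on half of the traces and the trough on the rest.
depths = np.full((4, 4), 10)
depths[2:, :] = 15
depths[0, 0] = -1  # one missing pick

max_share, min_share = horizon_phase_shares(cube, depths)
```

A consistently picked horizon would have one of the two shares close to 1; comparable values of both, as in this toy example, indicate mixed phase.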

Despite that, this model serves as a great springboard for the others: [horizon extension](./Horizons_extension.ipynb) and [facies segmentation](./Segmenting_interlayers.ipynb). Clear visual interpretation makes it possible to compare different neural network architectures and test all of the methods of our library.

### Suggestions for improvements

It is easy to see that the bottleneck in this task is data: more diverse seismic cubes with more horizons would both greatly improve the quality of the detected horizons and allow the model to generalize better to completely unseen data. Nevertheless, even in its current state this type of model can be used to generate more data for other tasks, namely [horizon extension](./Horizons_extension.ipynb).

<a id='extension'></a>

# Horizon extension

<a id='segmentation'></a>

# Interlayers segmentation