# Faults annotation

This notebook shows the annotation format and structure of faults.

In [None]:
# some imports
import sys
import warnings
warnings.filterwarnings("ignore")

from copy import copy
import glob

import numpy as np
import torch.nn as nn
from tqdm.notebook import tqdm_notebook

sys.path.append('../..')

from seismiqb import *
from seismiqb.src.controllers.torch_models import ExtensionModel

from seismiqb.batchflow import FilesIndex, Pipeline
from seismiqb.batchflow import D, B, V, P, R, L

## Initial annotation

Faults can be stored in different formats (see the `Fault` class documentation).
In our case, each csv-like file corresponds to one fault.

In [None]:
CUBE_FOLDER = '/data/seismic_data/seismic_interpretation/CUBE_16_PSDM'

In [None]:
fault = glob.glob(CUBE_FOLDER + '/INPUTS/FAULTS/RAW/*')[0]

Columns are `['INLINE', 'iline', 'xline', 'cdp_x', 'cdp_y', 'height', 'name', 'number']`

In [None]:
! head "{fault}"
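
To make the column layout concrete, here is a minimal sketch of reading such a file with plain Python. It is not the `Fault` loader. It assumes that fields are whitespace-separated, that there is no header row, and that the `number` column identifies the stick a point belongs to. The helper names `read_fault_sticks` and `group_by_stick` are hypothetical, and the sample rows are synthetic.

```python
import io

# Column names as listed above.
FAULT_COLUMNS = ['INLINE', 'iline', 'xline', 'cdp_x', 'cdp_y', 'height', 'name', 'number']


def read_fault_sticks(stream):
    """Parse rows into dicts keyed by FAULT_COLUMNS.

    Assumption: whitespace-separated fields, no header row.
    """
    rows = []
    for line in stream:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(FAULT_COLUMNS):
            raise ValueError(f"expected {len(FAULT_COLUMNS)} fields, got {len(fields)}")
        rows.append(dict(zip(FAULT_COLUMNS, fields)))
    return rows


def group_by_stick(rows):
    """Group (iline, xline, height) points by the `number` column.

    Assumption: `number` is the index of the stick within the fault.
    """
    sticks = {}
    for row in rows:
        point = (int(float(row['iline'])), int(float(row['xline'])), float(row['height']))
        sticks.setdefault(int(row['number']), []).append(point)
    return sticks


# Synthetic rows, made up for illustration only (not taken from the cube).
sample = io.StringIO(
    "INLINE- 100 200 1000.0 2000.0 950.0 fault_a 1\n"
    "INLINE- 100 210 1000.0 2010.0 1100.0 fault_a 1\n"
    "INLINE- 120 205 1020.0 2005.0 960.0 fault_a 2\n"
)
sticks = group_by_stick(read_fault_sticks(sample))
print(sticks)
```

Grouping by stick first is convenient because each stick is later treated as one polyline.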

First, we check that all files have a known structure; any files that do not must be fixed before loading.

In [None]:
Fault.check_format(CUBE_FOLDER + '/INPUTS/FAULTS/RAW/*', verbose=True)

At the loading stage, we interpolate each fault as a surface. 

In [None]:
%%time

cube_path = glob.glob(CUBE_FOLDER + '/amp*.hdf5')[0]

dataset = SeismicCubeset(FilesIndex(path=cube_path, no_ext=True))

dataset.load(label_dir='/INPUTS/FAULTS/RAW/*', labels_class=Fault, width=3)
dataset.modify_sampler(dst='train_sampler', finish=True)
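
To give an intuition for what interpolation means here, the sketch below densifies a single stick to one point per integer height sample with `np.interp`. This is only an illustration under simplifying assumptions, not the algorithm used by `Fault`: it handles one stick at a time, does not fill the surface between neighbouring sticks, and expects strictly distinct heights within a stick. The function name `densify_stick` is hypothetical.

```python
import numpy as np


def densify_stick(stick):
    """Linearly interpolate a stick to one point per integer height sample.

    stick: sequence of (iline, xline, height) points.
    Returns an integer array of shape (n_heights, 3).
    """
    stick = np.asarray(stick, dtype=float)
    stick = stick[np.argsort(stick[:, 2])]  # np.interp needs increasing heights
    heights = np.arange(np.ceil(stick[0, 2]), np.floor(stick[-1, 2]) + 1)
    ilines = np.interp(heights, stick[:, 2], stick[:, 0])
    xlines = np.interp(heights, stick[:, 2], stick[:, 1])
    return np.stack([ilines, xlines, heights], axis=1).round().astype(int)


# Synthetic two-point stick lying in inline 100.
points = densify_stick([(100, 200, 950), (100, 210, 960)])
print(points.shape)
```

Repeating even this simple step for every stick of every fault adds up, and the full surface interpolation is heavier still.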

Stick interpolation is a time-consuming procedure, so we dump the resulting points as `.npy` files

In [None]:
dataset.dump_labels('/INPUTS/FAULTS/NPY')

... and make loading faster!

In [None]:
%%time

dataset = SeismicCubeset(FilesIndex(path=cube_path, no_ext=True))

dataset.load(label_dir='/INPUTS/FAULTS/NPY/*', labels_class=Fault)
dataset.modify_sampler(dst='train_sampler', finish=True)
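
The caching idea itself can be sketched with plain NumPy: save the interpolated points once, then load the array directly on later runs instead of recomputing it. This is a generic round trip, not what `dump_labels` does internally. The file name and the point values are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Synthetic (iline, xline, height) points standing in for an interpolated fault.
points = np.array([[100, 200, 950],
                   [100, 201, 951],
                   [100, 202, 952]], dtype=np.int32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'fault_a.npy')  # hypothetical file name
    np.save(path, points)     # binary dump, keeps dtype and shape
    restored = np.load(path)  # read back without any text parsing

print(np.array_equal(points, restored))
```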

## Map of faults

Now let's see the map of faults.

In [None]:
dataset.show_points()

And a slice from the cube:

In [None]:
i = dataset.labels[0][0].points[0, 0]
zoom_slice = (slice(None), slice(900, 1500))
dataset.show_slide(i, zoom_slice=zoom_slice, figsize=(20, 10), mode='separate')

In [None]:
dataset.show_slide(i, zoom_slice=zoom_slice, figsize=(20, 10))