# **Tutorial 5**

### **Debugging Dask Data**

In this tutorial, we show how to capture data from a pipeline and debug it. This is especially useful when you need to inspect how Dask generates its data graphs or how Dask chunks flow through the pipeline.

As in the previous tutorials, we reuse the same example, but instead of plotting the image, we debug the pipeline to inspect the data. This time, let's split the *F3 Block* into smaller parts to avoid large figures of the data graph.

In [None]:
from dasf_seismic.datasets import F3
from dasf_seismic.attributes.complex_trace import InstantaneousBandwidth
from dasf.transforms import ExtractData

# Load the F3 dataset, chunked in blocks of 200 along the inline axis
dataset = F3(chunks={"iline": 200})

# Operator that extracts the data from the dataset
extracted_data = ExtractData()

# Seismic attribute computed on the extracted data
ib = InstantaneousBandwidth()
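
To get a feel for what `chunks={"iline": 200}` means, the sketch below computes the chunk sizes along a single axis in plain Python. It assumes the usual Dask behaviour of cutting an axis into equal pieces with a smaller remainder at the end. The helper `chunk_sizes` and the inline count of 650 are hypothetical and used only for illustration; they are not part of DASF or of the F3 dataset description.

```python
def chunk_sizes(axis_length, chunk_size):
    """Return the sizes of consecutive chunks along one axis.

    Hypothetical helper: full chunks of `chunk_size`, plus one smaller
    trailing chunk if the axis length is not an exact multiple.
    """
    if axis_length < 0 or chunk_size <= 0:
        raise ValueError("axis_length must be >= 0 and chunk_size > 0")
    full, remainder = divmod(axis_length, chunk_size)
    sizes = [chunk_size] * full
    if remainder:
        sizes.append(remainder)
    return tuple(sizes)


# Illustrative inline count only; the real F3 dimensions may differ.
n_ilines = 650
sizes = chunk_sizes(n_ilines, 200)
print(sizes)       # sizes of each chunk along the inline axis
print(len(sizes))  # number of chunks along the inline axis
```

Each entry in the result corresponds to one block that Dask would schedule independently along that axis.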

A pipeline can be created without attaching an executor, but because we want to debug Dask, we need to create an executor instance.

In [None]:
from dasf.pipeline.executors import DaskPipelineExecutor

dask = DaskPipelineExecutor(use_gpu=True)

To debug data, we have two debug operators: `Debug` and `VisualizeDaskData`. Let's create both to see what they display.

In [None]:
from dasf.debug import Debug, VisualizeDaskData

debug = Debug()
visualize_dask = VisualizeDaskData("InstantaneousBandwidth.png")

We can now create our pipeline, appending the two new debug operators at the end.

In [None]:
from dasf.pipeline import Pipeline

pipeline = Pipeline("F3 Block plot dynamically", executor=dask)

pipeline.add(extracted_data, X=dataset) \
        .add(ib, X=extracted_data) \
        .add(debug.display, X=ib) \
        .add(visualize_dask.display, X=debug.display) \
        .visualize()

Time to run it and see what happens.

In [None]:
%time pipeline.run()

As we can see, the output gives an overview of how the F3 block is divided and the size of each chunk.

The picture is not rendered here because it is too large to fit in the notebook output, but you can open the generated image file to see the Dask data graph.
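
When the graph image is too large to read comfortably, the chunk layout can also be inspected numerically. The sketch below uses plain `dask.array` rather than DASF, with a hypothetical array of zeros standing in for the seismic volume; the shape and dtype are assumptions chosen only for illustration.

```python
import math

import dask.array as da

# Hypothetical stand-in for a seismic volume (inline, crossline, sample),
# chunked in blocks of 200 along the first (inline) axis.
volume = da.zeros((650, 300, 100), chunks=(200, 300, 100), dtype="float32")

# Chunk sizes along every axis, as a tuple of tuples.
print(volume.chunks)

# Number of blocks per axis and in total.
print(volume.numblocks)
print(volume.npartitions)

# Approximate memory footprint of the largest chunk, in bytes.
largest_chunk_bytes = math.prod(volume.chunksize) * volume.dtype.itemsize
print(largest_chunk_bytes)
```

These attributes are lazy metadata, so reading them does not trigger any computation. For small arrays, `volume.visualize()` can also render the task graph, provided Graphviz is installed.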