# Processing EK60 Data to Extract Target Strength


## Step 1: Fetch Configuration Files

We begin by importing the required libraries and specifying the paths to the dataset and pipeline configuration files. The dataset configuration describes where the raw files are read from and where outputs are written; the pipeline configuration lists the processing stages to run.

In [1]:
from pathlib import Path
from echoflow import echoflow_start, glob_url

dataset_config = Path("./datastore.yaml").resolve()
pipeline_config = Path("./pipeline.yaml").resolve()
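
Because both paths are resolved relative to the current working directory, a missing file only shows up later when processing starts. The following is a minimal sketch of an early check; `missing_configs` is a hypothetical helper introduced here for illustration and is not part of echoflow.

```python
from pathlib import Path


def missing_configs(*paths):
    """Return the paths that do not point to an existing file (hypothetical helper)."""
    return [p for p in map(Path, paths) if not p.is_file()]


# Check the two configuration files used in this notebook.
missing = missing_configs("./datastore.yaml", "./pipeline.yaml")
if missing:
    print("Missing configuration files:", [str(p) for p in missing])
```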

## Step 1.1: Configuration Files

Familiarize yourself with the configuration options by exploring the documentation for:

- Pipeline Configuration: Learn about configuration settings for the pipeline by referring to [Pipeline Configuration](./pipelineconfiguration.md).

- Datastore Configuration: Understand the various configuration options related to data storage by reading [Datastore Configuration](./datastoreconfiguration.md).

These documents provide detailed information on the configurations used during the setup process.
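
The dataset configuration echoed by echoflow in Step 4 stores the input location as a templated `urlpath` with `{{ ship_name }}`-style placeholders, plus a `parameters` mapping that fills them. The sketch below shows how such a template could resolve to the concrete S3 pattern used in Step 2. `render_urlpath` is a hypothetical helper based on a plain regular-expression substitution; echoflow's own rendering may work differently.

```python
import re

# Values copied from the dataset configuration printed in Step 4.
urlpath = (
    "s3://ncei-wcsd-archive/data/raw/"
    "{{ ship_name }}/{{ survey_name }}/{{ sonar_model }}/*.raw"
)
parameters = {
    "ship_name": "Bell_M._Shimada",
    "survey_name": "SH1707",
    "sonar_model": "EK60",
}


def render_urlpath(template, params):
    """Replace each {{ name }} placeholder with params[name] (hypothetical helper)."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: params[m.group(1)], template)


resolved = render_urlpath(urlpath, parameters)
print(resolved)
# s3://ncei-wcsd-archive/data/raw/Bell_M._Shimada/SH1707/EK60/*.raw
```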

## Step 2: Getting Data
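Next, we'll use the `glob_url` function to retrieve a list of URLs matching a specific pattern. In this case, we're targeting raw EK60 data files (`*.raw`) from the SH1707 survey. The second argument, `{'anon': True}`, holds the storage options and requests anonymous access to the public bucket.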

In [2]:
all_files = glob_url("s3://ncei-wcsd-archive/data/raw/Bell_M._Shimada/SH1707/EK60/*.raw", {'anon':True})

severe performance issues, see also https://github.com/dask/dask/issues/10276

To fix, you should specify a lower version bound on s3fs, or
update the current installation.
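
To make the effect of the `*.raw` pattern concrete, here is a small offline sketch using `fnmatch` from the standard library. It does not contact S3 and is not how `glob_url` is implemented; apart from the one file name that appears in the Step 4 log, the object keys are made up for illustration.

```python
from fnmatch import fnmatchcase

pattern = "ncei-wcsd-archive/data/raw/Bell_M._Shimada/SH1707/EK60/*.raw"

# Hypothetical object keys, for illustration only.
prefix = "ncei-wcsd-archive/data/raw/Bell_M._Shimada/SH1707/EK60/"
keys = [
    prefix + "Summer2017-D20170615-T190214.raw",
    prefix + "Summer2017-D20170615-T190214.bot",
    prefix + "notes.txt",
]

# Keep only the keys that match the wildcard pattern (case-sensitive).
matched = [key for key in keys if fnmatchcase(key, pattern)]
print(matched)
```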



## Step 3: Preparing Files
We'll now extract the file names from the URLs and write the first 20 of them to a transect listing, `EK60_SH1707_Shimada.txt`. This is the file that the dataset configuration refers to under its `transect` entry.

In [3]:
# Keep only the last path component of each URL, without the ".raw" extension.
files = [url.split("/")[-1].rsplit(".raw", 1)[0] for url in all_files]

# Write the first 20 file names, one per line, to the transect listing.
with open("EK60_SH1707_Shimada.txt", "w") as transect:
    for name in files[:20]:
        transect.write(name + ".raw\n")
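
The file names themselves carry a date and a time, and the dataset configuration used in Step 4 includes a `raw_regex` pattern with named groups `date` and `time` for them. The sketch below applies that pattern to one file name from this survey. `parse_timestamp` is a hypothetical helper, and reading the two groups as `YYYYMMDD` and `HHMMSS` is an assumption based on the file names seen in this notebook.

```python
import re
from datetime import datetime

# Pattern copied from the `raw_regex` entry of the dataset configuration.
RAW_REGEX = re.compile(r"(.*)-?D(?P<date>\w{1,8})-T(?P<time>\w{1,6})")


def parse_timestamp(file_name):
    """Return the datetime encoded in a raw file name, or None if it does not match."""
    match = RAW_REGEX.match(file_name)
    if match is None:
        return None
    # Assumed layout: date is YYYYMMDD and time is HHMMSS.
    return datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")


print(parse_timestamp("Summer2017-D20170615-T190214.raw"))
# 2017-06-15 19:02:14
```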

## Step 4: Processing with echoflow
Now, we're ready to kick off the data processing using echoflow. We'll provide the dataset and pipeline configurations, along with additional options.

In [4]:
options = {"storage_options_override": False}
data = echoflow_start(dataset_config=dataset_config, pipeline_config=pipeline_config, options=options)

{'name': 'Bell_M._Shimada-SH1707-EK60', 'sonar_model': 'EK60', 'raw_regex': '(.*)-?D(?P<date>\\w{1,8})-T(?P<time>\\w{1,6})', 'args': {'urlpath': 's3://ncei-wcsd-archive/data/raw/{{ ship_name }}/{{ survey_name }}/{{ sonar_model }}/*.raw', 'parameters': {'ship_name': 'Bell_M._Shimada', 'survey_name': 'SH1707', 'sonar_model': 'EK60'}, 'storage_options': {'anon': True}, 'transect': {'file': './EK60_SH1707_Shimada.txt'}, 'default_transect_num': 2017, 'json_export': True}, 'output': {'urlpath': './echoflow-output', 'retention': True, 'overwrite': True}}
{'active_recipe': 'target_strength', 'use_local_dask': True, 'n_workers': 5, 'pipeline': [{'recipe_name': 'target_strength', 'stages': [{'name': 'echoflow_open_raw', 'module': 'echoflow.stages.subflows.open_raw', 'options': {'save_raw_file': True, 'use_raw_offline': True, 'use_offline': True}}, {'name': 'echoflow_compute_TS', 'module': 'echoflow.stages.subflows.compute_TS', 'options': {'use_offline': True}}]}]}


<Client: 'tcp://127.0.0.1:50418' processes=5 threads=20, memory=15.63 GiB>
--------------------------------------------------

Executing stage :  name='echoflow_open_raw' module='echoflow.stages.subflows.open_raw' external_params=None options={'save_raw_file': True, 'use_raw_offline': True, 'use_offline': True} prefect_config=None


{'out_path': 'C:\\Users\\soham\\Desktop\\Soham\\Projects\\echoflow\\jupyterbook\\local\\echoflow-output\\echoflow_open_raw\\2017\\Summer2017-D20170615-T190214.zarr', 'transect': 2017, 'file_name': 'Summer2017-D20170615-T190214.raw', 'error': False}
{'out_path': 'C:\\Users\\soham\\Desktop\\Soham\\Projects\\echoflow\\jupyterbook\\local\\echoflow-output\\echoflow_open_raw\\2017\\Summer2017-D20170615-T190843.zarr', 'transect': 2017, 'file_name': 'Summer2017-D20170615-T190843.raw', 'error': False}
{'out_path': 'C:\\Users\\soham\\Desktop\\Soham\\Projects\\echoflow\\jupyterbook\\local\\echoflow-output\\echoflow_open_raw\\2017\\Summer2017-D20170615-T212409.zarr', 'transect': 2017, 'file_name': 'Summer2017-D20170615-T212409.raw', 'error': False}
{'out_path': 'C:\\Users\\soham\\Desktop\\Soham\\Projects\\echoflow\\jupyterbook\\local\\echoflow-output\\echoflow_open_raw\\2017\\Summer2017-D20170615-T212933.zarr', 'transect': 2017, 'file_name': 'Summer2017-D20170615-T212933.raw', 'error': False}
{'ou

<Client: 'tcp://127.0.0.1:50418' processes=5 threads=20, memory=15.63 GiB>
--------------------------------------------------

Executing stage :  name='echoflow_compute_TS' module='echoflow.stages.subflows.compute_TS' external_params=None options={'use_offline': True} prefect_config=None


{'out_path': 'C:\\Users\\soham\\Desktop\\Soham\\Projects\\echoflow\\jupyterbook\\local\\echoflow-output\\echoflow_compute_TS\\2017\\Summer2017-D20170615-T190214_TS.zarr', 'transect': '2017', 'file_name': 'Summer2017-D20170615-T190214_TS.zarr', 'error': False}
{'out_path': 'C:\\Users\\soham\\Desktop\\Soham\\Projects\\echoflow\\jupyterbook\\local\\echoflow-output\\echoflow_compute_TS\\2017\\Summer2017-D20170615-T190843_TS.zarr', 'transect': '2017', 'file_name': 'Summer2017-D20170615-T190843_TS.zarr', 'error': False}
{'out_path': 'C:\\Users\\soham\\Desktop\\Soham\\Projects\\echoflow\\jupyterbook\\local\\echoflow-output\\echoflow_compute_TS\\2017\\Summer2017-D20170615-T212409_TS.zarr', 'transect': '2017', 'file_name': 'Summer2017-D20170615-T212409_TS.zarr', 'error': False}
{'out_path': 'C:\\Users\\soham\\Desktop\\Soham\\Projects\\echoflow\\jupyterbook\\local\\echoflow-output\\echoflow_compute_TS\\2017\\Summer2017-D20170615-T212933_TS.zarr', 'transect': '2017', 'file_name': 'Summer2017-D201
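
Each record printed by a stage has the keys `out_path`, `transect`, `file_name`, and `error`. As a minimal sketch of how such records could be checked after a run, the example below separates successful outputs from failed ones. The records are shortened copies of the log lines above with placeholder relative paths, the second `error` flag is set to `True` purely for illustration, and `split_by_error` is a hypothetical helper rather than an echoflow function.

```python
# Illustrative records modeled on the stage log; paths are shortened placeholders.
stage_outputs = [
    {
        "out_path": "echoflow-output/echoflow_compute_TS/2017/Summer2017-D20170615-T190214_TS.zarr",
        "transect": "2017",
        "file_name": "Summer2017-D20170615-T190214_TS.zarr",
        "error": False,
    },
    {
        "out_path": "echoflow-output/echoflow_compute_TS/2017/Summer2017-D20170615-T190843_TS.zarr",
        "transect": "2017",
        "file_name": "Summer2017-D20170615-T190843_TS.zarr",
        "error": True,  # set for illustration; the real log reports False
    },
]


def split_by_error(records):
    """Return (output paths of successful records, file names of failed records)."""
    succeeded = [r["out_path"] for r in records if not r["error"]]
    failed = [r["file_name"] for r in records if r["error"]]
    return succeeded, failed


succeeded, failed = split_by_error(stage_outputs)
print(f"{len(succeeded)} succeeded, {len(failed)} failed: {failed}")
```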

## Step 5: Results
Finally, let's take a look at the first entry from the processed data.

In [5]:
data[0][0]

The output lists Dask-backed arrays, each built from 2 graph layers, with the following layouts:

| Shape | Chunk shape | Size | Chunk size | Chunks | Data type |
|---|---|---|---|---|---|
| (3, 19, 3888) | (2, 10, 1944) | 1.69 MiB | 303.75 kiB | 8 | float64 numpy.ndarray |
| (3,) | (3,) | 24 B | 24 B | 1 | float64 numpy.ndarray |
| (19, 3) | (19, 3) | 456 B | 456 B | 1 | float64 numpy.ndarray |
| (3, 19) | (3, 19) | 456 B | 456 B | 1 | float64 numpy.ndarray |
| (1,) | (1,) | 608 B | 608 B | 1 | not shown |
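
The figures reported for the (3, 19, 3888) array with (2, 10, 1944) chunks can be reproduced by hand, which is a useful sanity check when choosing chunk sizes. The sketch below uses only the standard library and assumes 8 bytes per `float64` value; it recomputes the numbers rather than reading them from the result object.

```python
import math

shape = (3, 19, 3888)
chunks = (2, 10, 1944)
itemsize = 8  # bytes per float64 value

# Number of chunks along each axis, rounding up for partial chunks.
chunks_per_axis = [math.ceil(n / c) for n, c in zip(shape, chunks)]
n_chunks = math.prod(chunks_per_axis)

# Total array size in MiB and size of one full chunk in KiB.
array_mib = math.prod(shape) * itemsize / 2**20
chunk_kib = math.prod(chunks) * itemsize / 2**10

print(n_chunks, round(array_mib, 2), chunk_kib)
# 8 1.69 303.75
```

The 19-element axis is split into chunks of 10 and 9, which is why rounding up is needed to arrive at 8 chunks in total.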


**Congratulations!** You've successfully processed EK60 data using echoflow. This notebook provides a simplified overview, and you can explore the capabilities of echoflow for more advanced processing tasks.

Feel free to modify the parameters, paths, and configurations as needed to adapt to your data and requirements.