(function:open-converted)=
# Working with converted files

## Open a converted netCDF or Zarr dataset

Converted netCDF files can be opened with the [`open_converted`](echopype.open_converted) function, which returns a lazy-loaded [`EchoData` object](data-format:echodata-object) (only metadata are read when the file is opened):

```python
import echopype as ep
file_path = "./converted_files/file.nc"      # path to a converted nc file
ed = ep.open_converted(file_path)            # create an EchoData object
```

A Zarr dataset is opened the same way, by passing its path. To open a dataset from cloud storage, use the same `storage_options` parameter as with [`open_raw`](convert.html#aws-s3-access). For example:

```python
s3_path = "s3://s3bucketname/directory_path/dataset.zarr"     # S3 dataset path
ed = ep.open_converted(s3_path, storage_options={"anon": True})
```

## Combine EchoData objects

Converted data stored in multiple files from the same instrument deployment can be combined into a single [`EchoData` object](data-format:echodata-object) using [`combine_echodata`](echopype.combine_echodata). As of `echopype` version `0.6.3`, a large number of files can be combined in parallel (using [Dask](https://www.dask.org/)) while keeping memory usage stable. To accomplish this, each `EchoData` object is appended directly to a Zarr store, which becomes the final combined `EchoData` object. The path of this Zarr store is set by the keyword argument `zarr_path`. `combine_echodata` also accepts the keyword argument `client`, an initialized Dask distributed client to which parallel tasks are submitted. If `zarr_path` or `client` is not provided, default values are used (see the `Notes` section in [`combine_echodata`](echopype.combine_echodata)).

To use `combine_echodata`, the following criteria must be met: 
- Each `EchoData` object must have the same `sonar_model`
- Each `EchoData` object must come from a distinct file path
- The first time value of each `EchoData` group must be less than the first time value of the subsequent corresponding `EchoData` group, with respect to the order in the list of `EchoData` objects being combined
- The same `EchoData` groups in the list of `EchoData` objects must have the same number of channels and the same name for each of these channels
- The following attribute criteria must be satisfied amongst the combined `EchoData` groups:
  - the names of each attribute must be the same
  - the values of each attribute must be identical (except for `date_created` and `conversion_time`, which only need to have the same type)
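
Some of these criteria can be checked before calling `combine_echodata`. The following is a minimal sketch, not part of `echopype`: `FileSummary` and `check_combinable` are hypothetical names, and the summary fields are assumed to have been extracted by the user from each `EchoData` object beforehand. It covers only the sonar model, file path, and time-ordering criteria, and it checks a single first timestamp per file rather than one per group.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FileSummary:
    """Hypothetical per-file summary; not an echopype class."""
    sonar_model: str
    source_path: str
    first_time: datetime  # first time value of one chosen group


def check_combinable(summaries):
    """Return a list of problems found (empty if none), in list order."""
    problems = []

    # All files must share the same sonar model
    if len({s.sonar_model for s in summaries}) > 1:
        problems.append("sonar_model differs between files")

    # Every file must come from a distinct path
    paths = [s.source_path for s in summaries]
    if len(set(paths)) != len(paths):
        problems.append("source file paths are not distinct")

    # First time values must strictly increase along the list
    for prev, nxt in zip(summaries, summaries[1:]):
        if not prev.first_time < nxt.first_time:
            problems.append(
                f"first time of {nxt.source_path} is not later "
                f"than that of {prev.source_path}"
            )
    return problems
```

An empty return value means none of the three checked criteria is violated; the channel and attribute criteria still need to hold.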

## Combining converted files

The first step in combining converted files is to establish a Dask client with a scheduler. If one is working on a local machine, this can be done as follows:
```python
from dask.distributed import Client
client = Client()  # create client with local scheduler
```
To run on distributed hardware, we highly recommend reviewing the Dask documentation for [deploying Dask clusters](https://docs.dask.org/en/latest/deploying.html).

Next, we assemble a list of `EchoData` objects from converted files (netCDF or Zarr):
```python
ed_list = []
for converted_file in ["convertedfile1.zarr", "convertedfile2.zarr"]:
    ed_list.append(ep.open_converted(converted_file))
```
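
Because the order of the list matters (see the time criterion above), it can help to build the list of paths in a deterministic order. The following is a minimal sketch, assuming the converted files sit in a single directory and that their names sort chronologically; that is an assumption about the naming scheme, not something `echopype` enforces. `list_converted` is a hypothetical helper, not an `echopype` function.

```python
from pathlib import Path


def list_converted(directory, suffixes=(".nc", ".zarr")):
    """Return paths of converted files in `directory`, sorted by name.

    Zarr stores are directories, so entries are matched by suffix only.
    """
    return sorted(
        str(p) for p in Path(directory).iterdir() if p.suffix in suffixes
    )
```

Each returned path can then be passed to `ep.open_converted`, as in the loop above.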

Finally, we apply `combine_echodata` to this list to combine all the data into a single `EchoData` object. Here, we store the combined result in the Zarr path `path_to/combined_echodatas.zarr` and use the client established above:
```python
combined_ed = ep.combine_echodata(ed_list, 
                                  zarr_path='path_to/combined_echodatas.zarr', 
                                  client=client)
```
Once executed, `combine_echodata` returns a lazy-loaded `EchoData` object (obtained from `zarr_path`) with all data from the input `EchoData` objects combined.

:::{Note}
In previous versions, `combine_echodata` corrected reversed timestamps and stored the uncorrected timestamps in the `Provenance` group. The current implementation of `combine_echodata` preserves time coordinates that contain reversed timestamps, and this correction is no longer performed.
:::