# Pelagic Dataset

In [36]:
%load_ext autoreload
%autoreload 2
from paidiverpy.pipeline import Pipeline



In [37]:
pipeline = Pipeline(config_file_path="config/config_pelagic.yml")



## Show the pipeline

To check the pipeline steps, evaluate `pipeline` in a cell. You can also click on a step to see more information about it.

In [38]:
pipeline

## Metadata

The metadata is stored in a CSV file.

In [39]:
pipeline.get_metadata()

| Unnamed: 0 | id | image-filename | image-altitude-meters | flag |
|---|---|---|---|---|
| 0 | 426.0 | 20230310041826_045_0001.bmp | 10.0 | 0 |
| 1 | 449.0 | 20230310041826_594_0002.bmp | 20.0 | 0 |
| 2 | 576.0 | 20230310041851_893_0001.bmp | 10.0 | 0 |
| 3 | 414.0 | 20230310041825_794_0000.bmp | 100.0 | 0 |
| 4 | 66.0 | 20230310041755_849_0000.bmp | 1.0 | 0 |
| 5 | 293.0 | 20230310041822_845_0000.bmp | 3.0 | 0 |
| 6 | 1038.0 | 20230310043208_264_0000.bmp | 5.0 | 0 |
| 7 | 980.0 | 20230310042929_619_0000.bmp | 6.0 | 0 |
| 8 | 1063.0 | 20230310043332_861_0002.bmp | 7.0 | 0 |
| 9 | 428.0 | 20230310041826_094_0001.bmp | 8.0 | 0 |
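
Since the metadata is plain CSV, it can also be inspected outside the pipeline. The following is a minimal sketch, not part of paidiverpy: it uses only the Python standard library, and the helper `images_below_altitude` and the 10 m threshold are hypothetical names and values chosen for illustration. The sample rows are a subset of the preview above, with the unnamed index column left out.

```python
import csv
import io

# A few rows copied from the metadata preview; the real file has more rows.
SAMPLE_CSV = """id,image-filename,image-altitude-meters,flag
426.0,20230310041826_045_0001.bmp,10.0,0
449.0,20230310041826_594_0002.bmp,20.0,0
414.0,20230310041825_794_0000.bmp,100.0,0
66.0,20230310041755_849_0000.bmp,1.0,0
"""


def images_below_altitude(csv_text, max_altitude_m):
    """Return the filenames whose altitude is at or below max_altitude_m.

    Hypothetical helper for illustration; it is not a paidiverpy function.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["image-filename"]
        for row in reader
        if float(row["image-altitude-meters"]) <= max_altitude_m
    ]


# Keep only images taken at 10 m or lower (threshold chosen arbitrarily).
low_altitude = images_below_altitude(SAMPLE_CSV, 10.0)
print(low_altitude)
```

To apply the same filter to the real file, open it with `open(path, newline="")` and pass the handle to `csv.DictReader` directly.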


## Configuration file

In [40]:
pipeline.config

{
    "general": {
        "name": "raw",
        "step_name": "open",
        "sample_data": null,
        "is_remote": false,
        "input_path": "images",
        "metadata_path": "metadata/metadata_pelagic.csv",
        "metadata_type": "CSV_FILE",
        "image_type": "BMP",
        "append_data_to_metadata": false,
        "output_is_remote": false,
        "output_path": "output",
        "n_jobs": 1,
        "client": null,
        "track_changes": true,
        "rename": null,
        "sampling": null,
        "convert": null
    },
    "steps": [
        {
            "name": "contrast",
            "step_name": "colour",
            "mode": "contrast",
            "test": false,
            "params": {
                "method": "clahe",
                "kernel_size": null,
                "clip_limit": 0.01,
                "gamma_value": 2,
                "raise_error": false
            }
        },
        {
            "name": "illumination_correction",
            "
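
The printed configuration is cut off above, but its visible part shows the overall shape: a `general` block plus a list of `steps`, each with a `name`, a `step_name`, a `mode` and a `params` mapping. As a rough sketch of how that structure can be walked, the snippet below parses a fragment copied from the output with the standard `json` module. It assumes the configuration is available as plain JSON text or a dictionary, which this notebook does not confirm, and `summarise_steps` is a hypothetical helper rather than a paidiverpy function.

```python
import json

# Fragment copied from the configuration shown above. Only the first step is
# included because the printed output is truncated after it.
CONFIG_JSON = """
{
    "general": {
        "name": "raw",
        "step_name": "open",
        "image_type": "BMP",
        "n_jobs": 1
    },
    "steps": [
        {
            "name": "contrast",
            "step_name": "colour",
            "mode": "contrast",
            "test": false,
            "params": {
                "method": "clahe",
                "kernel_size": null,
                "clip_limit": 0.01,
                "gamma_value": 2,
                "raise_error": false
            }
        }
    ]
}
"""


def summarise_steps(config):
    """Return one (name, step_name, mode) tuple per configured step.

    Hypothetical helper for illustration only.
    """
    return [
        (step["name"], step["step_name"], step.get("mode"))
        for step in config["steps"]
    ]


config = json.loads(CONFIG_JSON)
summary = summarise_steps(config)
for name, step_name, mode in summary:
    print(f"{name}: layer={step_name}, mode={mode}")
```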

## Run the pipeline

To run the pipeline, call `pipeline.run()`. Calling `pipeline.save_images()` afterwards writes the processed images to the configured output path.

In [41]:
# Run the pipeline
pipeline = Pipeline(config_file_path="config/config_pelagic.yml")
pipeline.run()
pipeline.save_images()

☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:03 | Processing images using 1 cores
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:03 | Running step 0: raw - OpenLayer

Open Images: 100%|███████████████████████████████████████████████████████████████████| 58/58 [00:00<00:00, 1781.93it/s]

☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:03 | Step 0 completed
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:03 | Running step 1: contrast - ColourLayer

Processing images: 100%|██████████████████████████████████████████████████████████████| 58/58 [00:00<00:00, 132.61it/s]

☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:04 | Step 1 completed
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:04 | Running step 2: illumination_correction - ColourLayer

Processing images: 100%|███████████████████████████████████████████████████████████████| 58/58 [00:03<00:00, 19.26it/s]

☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:07 | Step 2 completed
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:07 | Running step 3: gaussian_blur - ColourLayer

Processing images: 100%|█████████████████████████████████████████████████████████████| 58/58 [00:00<00:00, 1803.48it/s]

☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:08 | Step 3 completed
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:08 | Running step 4: sharpen - ColourLayer

Processing images: 100%|██████████████████████████████████████████████████████████████| 58/58 [00:00<00:00, 537.84it/s]

☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:08 | Step 4 completed
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:08 | Running step 5: deblur - ColourLayer

Processing images: 100%|███████████████████████████████████████████████████████████████| 58/58 [00:01<00:00, 46.64it/s]

☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:09 | Step 5 completed
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:10 | Running step 6: invert - CustomLayer

Processing images: 100%|████████████████████████████████████████████████████████████| 58/58 [00:00<00:00, 35503.45it/s]

☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:10 | Step 6 completed
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:10 | Saving images from step: last
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:10 | Saving images
☁ paidiverpy ☁  |       INFO | 2025-02-27 19:12:10 | Images are saved to: output


In [42]:
# View the output images
pipeline.images

# Now it's your turn!

Using the same dataset, try out these tasks:  

1. **Adjust hyperparameters** – Modify the hyperparameters of each step and observe their impact.  

2. **Reorder the steps** – Change the sequence of processing steps and see how it affects the results.  

3. **🔹 Advanced: Create a custom algorithm**  
   Develop your own processing step to extend the pipeline’s functionality. Learn how to do this in the [custom algorithms guide](https://paidiverpy.readthedocs.io/en/latest/guide/custom_algorithms/index.html).  

4. **🔹 Advanced: Run the pipeline from the command line**  
   Once you're satisfied with your results, execute the pipeline directly from the terminal:  

   ```bash
   cd ~/paidiver/pelagic  
   paidiverpy -c "config/config_pelagic.yml"
   ```  
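
Tasks 1 and 2 above both come down to editing the `steps` list of the configuration. The sketch below shows that idea on a plain Python list, without touching the YAML file or calling paidiverpy. The step names come from the run log in this notebook, only `contrast` has its parameters shown here so the other entries carry names only, and `with_param`, `move_step` and the new `clip_limit` of 0.03 are hypothetical choices for illustration.

```python
import copy

# Step names taken from the run log above. Parameters are known only for
# "contrast"; the remaining entries are deliberately left without params.
steps = [
    {"name": "contrast", "params": {"method": "clahe", "clip_limit": 0.01}},
    {"name": "illumination_correction"},
    {"name": "gaussian_blur"},
    {"name": "sharpen"},
    {"name": "deblur"},
    {"name": "invert"},
]


def with_param(steps, step_name, key, value):
    """Return a copy of steps with one parameter changed on the named step."""
    new_steps = copy.deepcopy(steps)
    for step in new_steps:
        if step["name"] == step_name:
            step.setdefault("params", {})[key] = value
            return new_steps
    raise KeyError(f"no step named {step_name!r}")


def move_step(steps, step_name, new_index):
    """Return a copy of steps with the named step moved to new_index."""
    new_steps = copy.deepcopy(steps)
    names = [step["name"] for step in new_steps]
    step = new_steps.pop(names.index(step_name))
    new_steps.insert(new_index, step)
    return new_steps


# Task 1: try a different CLAHE clip limit (0.03 is an arbitrary example value).
tuned = with_param(steps, "contrast", "clip_limit", 0.03)

# Task 2: run "sharpen" first instead of fourth.
reordered = move_step(tuned, "sharpen", 0)
print([step["name"] for step in reordered])
```

Both helpers return new lists and leave the original untouched, which makes it easy to compare variants side by side. To see the effect on the images, the same changes have to be made in `config/config_pelagic.yml` before the pipeline is run again.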

In [43]:
# Import the package
from paidiverpy.pipeline import Pipeline

In [20]:
# inspect the pipeline

In [21]:
# inspect the metadata

In [22]:
# run the pipeline

In [24]:
# see the output images

In [None]:
# export the images