# PREAMBLE

#### This notebook follows the same steps as `data_uploader`, but it can process several datasets at a time

Before you can upload your datasets:
- adapt `path_osmose_home`, which points to the OSmOSE working directory ;
- choose a name for each dataset (it should not contain any special characters, including `-`) ;
- create the folder `/home/datawork-osmose/dataset/{dataset_name}` (or `/home/datawork-osmose/dataset/{campaign_name}/{dataset_name}` in case of a recording campaign) ;
- place your audio data in this folder, either as individual files or within multiple sub-folders ;
- if you have any csv files (either a `timestamp.csv` or a `*gps*.csv` file), place them in this folder as well.

**Important notes:**
- about timestamps : all timestamps in your original data (from your audio filenames or from your csv files) MUST follow the same timestamp template, which should be given in `date_template` ;
- about the `*gps*.csv` file : this file provides the GPS track (i.e. latitude and longitude coordinates) of a moving hydrophone. Its filename must contain the term _gps_ ;
- about auxiliary csv files : they must contain headers with the following standardized names : `timestamp`, `depth`, `lat`, `lon`.
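
As an illustration of the timestamp rule, the sketch below parses a timestamp from an audio filename using the `date_template` value set later in this notebook. It assumes the template follows standard `strptime` directives and that the whole filename stem is the timestamp; the filename and the helper `parse_filename_timestamp` are hypothetical, and OSmOSE may extract timestamps differently.

```python
from datetime import datetime
from pathlib import Path

# Template used in this notebook; the filename below is a made-up example.
date_template = "%Y%m%d_%H%M%S"


def parse_filename_timestamp(filename: str, template: str) -> datetime:
    """Parse the stem of an audio filename with a strptime-style template.

    Raises ValueError if the stem does not match the template exactly.
    """
    return datetime.strptime(Path(filename).stem, template)


ts = parse_filename_timestamp("20220525_101500.wav", date_template)
print(ts.isoformat())  # 2022-05-25T10:15:00
```

A filename that does not match the template raises a `ValueError`, which is a quick way to spot files that break the "same template everywhere" rule before building the dataset.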

In [None]:
# FILL IN RED PARTS !
from pathlib import Path
from OSmOSE import Dataset
from OSmOSE.utils.core_utils import display_folder_storage_info, list_not_built_dataset
from os import umask

umask(0o002)

path_osmose_home = "/home/datawork-osmose/"
# path_osmose_home is a str, so convert it to a Path before joining with "/"
path_osmose_dataset = Path(path_osmose_home) / "dataset"

display_folder_storage_info(path_osmose_home)

In [None]:
# FILL IN RED PART !
list_not_built_dataset(path_osmose_dataset / "dataset/name")

If a dataset is part of a recording campaign, please provide the campaign name in the matching entry of `list_campaign_name` ; in that case the dataset should be present in `/home/datawork-osmose/dataset/{campaign_name}/{dataset_name}`. Otherwise keep the default value `campaign_name = ""` for that entry.

In [None]:
# FILL IN RED PARTS !
list_dataset_name = [
    "C5D1_ST7181",
    "C5D1_ST7194",
    "C5D2_ST7189",
    "C5D2_ST7190",
    "C5D3_ST7189",
    "C5D3_ST7190",
    "C5D4_ST7181",
    "C5D4_ST7194",
    "C5D5_ST7181",
    "C5D5_ST7194",
    "C5D6_ST7189",
    "C5D6_ST7190",
    "C5D7_ST7181",
    "C5D7_ST7194",
    "C5D8_ST7189",
    "C5D8_ST7190",
    "C5D9_ST7181",
    "C5D9_ST7194",
]

list_campaign_name = ["APOCADO3"] * len(list_dataset_name)

For fixed GPS coordinates, give the position of each dataset in decimal degrees as (latitude, longitude) in `list_gps` below (e.g. `gps = (49, -2)`). If you have a mobile hydrophone, enter the name of the csv file containing the GPS coordinates instead ; this filename should contain the term _gps_.
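
The two accepted forms of a `gps` value can be checked before the upload. The following is a minimal sketch, not part of OSmOSE: the helper `describe_gps` and the example filename are hypothetical, and the range check simply assumes decimal degrees.

```python
from typing import Tuple, Union

GpsInput = Union[Tuple[float, float], str]


def describe_gps(gps: GpsInput) -> str:
    """Return "mobile" for a GPS track filename, "fixed" for (lat, lon)."""
    if isinstance(gps, str):
        # A mobile hydrophone is described by a csv file whose name contains "gps".
        if "gps" not in gps or not gps.endswith(".csv"):
            raise ValueError(f"expected a csv filename containing 'gps', got {gps!r}")
        return "mobile"
    lat, lon = gps
    # Decimal degrees: latitude in [-90, 90], longitude in [-180, 180].
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {gps!r}")
    return "fixed"


print(describe_gps((49, -2)))          # fixed
print(describe_gps("track_gps.csv"))   # mobile
```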

In [None]:
# FILL IN GREEN PARTS !
list_gps = [
    (47.89755, -4.69856666666667),
    (47.89755, -4.69856666666667),
    (47.8917666666667, -4.72161666666667),
    (47.8917666666667, -4.72161666666667),
    (48.0853666666667, -4.83871666666667),
    (48.0853666666667, -4.83871666666667),
    (48.0900833333333, -4.82485),
    (48.0900833333333, -4.82485),
    (47.9945333333333, -4.82413333333333),
    (47.9945333333333, -4.82413333333333),
    (48.0863333333333, -4.8401),
    (48.0863333333333, -4.8401),
    (48.021, -4.93675),
    (48.021, -4.93675),
    (47.9811, -4.84206666666667),
    (47.9811, -4.84206666666667),
    (48.0211, -4.919),
    (48.0211, -4.919),
]

In [None]:
# FILL IN GREEN PARTS !
list_depth = [
    0,
    0,
    49,
    49,
    57,
    57,
    0,
    0,
    56,
    56,
    57,
    57,
    0,
    0,
    25,
    25,
    0,
    0,
]

By default, the `timezone` of your data is assumed to be UTC+00:00. If that is not the case, set this parameter accordingly ; its format MUST be `"+02:00"` for UTC+02:00, for example.
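
To make the expected format concrete, here is a small sketch that validates an offset string such as `"+02:00"` and converts it to a standard-library `timezone`. The helper `parse_utc_offset` is hypothetical and only illustrates the format; it is not how OSmOSE itself handles the parameter.

```python
import re
from datetime import timedelta, timezone

# Sign, two-digit hours, colon, two-digit minutes, e.g. "+02:00" or "-03:30".
_OFFSET_PATTERN = re.compile(r"([+-])(\d{2}):(\d{2})")


def parse_utc_offset(offset: str) -> timezone:
    """Convert a "+HH:MM" / "-HH:MM" string into a datetime.timezone."""
    match = _OFFSET_PATTERN.fullmatch(offset)
    if match is None:
        raise ValueError(f"timezone must look like '+02:00', got {offset!r}")
    sign, hours, minutes = match.groups()
    delta = timedelta(hours=int(hours), minutes=int(minutes))
    return timezone(-delta if sign == "-" else delta)


tz = parse_utc_offset("+02:00")
print(tz.utcoffset(None))  # 2:00:00
```

Strings such as `"2:00"` or `"UTC+2"` are rejected with a `ValueError`, so a typo in one entry of the list is caught early.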

In [None]:
# FILL IN RED PARTS !
list_timezone = [
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
    "+01:00",
]

The variable `force_upload` lets you upload your datasets to the platform even if anomalies are detected. `date_template` is the timestamp template that all timestamps in your original data must follow.

In [None]:
# FILL IN RED and GREEN PARTS !
force_upload = True
date_template = "%Y%m%d_%H%M%S"

In [None]:
# FILL IN RED and GREEN PARTS !
for dataset_name, campaign_name, gps, depth, timezone in zip(
    list_dataset_name, list_campaign_name, list_gps, list_depth, list_timezone
):
    print(dataset_name)

    dataset = Dataset(
        Path(path_osmose_dataset, campaign_name, dataset_name),
        gps_coordinates=gps,
        depth=depth,
        owner_group="gosmose",
        local=False,
        timezone=timezone,
    )

    dataset.build(force_upload=force_upload, date_template=date_template)
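
Because `zip` stops at the shortest of its arguments, a list that is one entry short would silently skip the last dataset(s) in the loop above. The sketch below shows one way to check the list lengths beforehand; `check_same_length` is a hypothetical helper and the example lists are made up for illustration.

```python
def check_same_length(**named_lists: list) -> int:
    """Return the common length of all lists, or raise if they differ."""
    lengths = {name: len(values) for name, values in named_lists.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"per-dataset lists differ in length: {lengths}")
    # All lengths are equal here; return 0 when no list was given.
    return next(iter(lengths.values()), 0)


# Made-up example values with two datasets.
example_names = ["dataset_a", "dataset_b"]
example_depths = [0, 49]
example_timezones = ["+01:00", "+01:00"]

n_datasets = check_same_length(
    names=example_names, depths=example_depths, timezones=example_timezones
)
print(n_datasets)  # 2
```

In this notebook, the same call could be made with `list_dataset_name`, `list_campaign_name`, `list_gps`, `list_depth` and `list_timezone` just before the loop.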