# Tutorial 1: Database structure

This tutorial explains how data is structured for a successful data processing pipeline with our tools: auv_nav, camera_calibration and correct_images. At the end of this tutorial, you will:
 1. Recognise the structure of a database
 2. Find any dive or mission, as well as their raw or processed data
 3. Be able to maintain the structure for future development
 
## Structure of the database

A mission database is formed by a root folder (_data_ in this example) and three main subfolders: raw, configuration and processed.
In each of these three subfolders, the same subfolder structure is replicated to isolate raw data from the generated (processed) data, and the configuration that led to it. 

    data/
    ├── raw/
    ├── configuration/
    └── processed/

Then, the missions or dives are organised in years, cruises, platforms and missions with the following rules:
 1. Always use lowercase names
 2. Use cruise labels (e.g. use "dy109" instead of "Expedition to Darwin Mounds with RSS Discovery").
 3. Mission names always contain the date and hour of deployment (ideally, in water), followed by a short version of the platform name (e.g. autosub6000 shortened to as6), a short version of its payload (if any) and, optionally, the purpose of the mission. Keep names short and concise.
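
The rules above can be checked mechanically. The following is a minimal sketch, not part of auv_nav: the function name `is_valid_mission_name` and the regular expression are assumptions made for illustration. It only checks for lowercase names and a valid leading date and time.

```python
import re
from datetime import datetime

# Hypothetical pattern: YYYYMMDD, hhmmss, then one or more lowercase tokens
# (letters and digits) separated by underscores.
MISSION_NAME = re.compile(r"(\d{8})_(\d{6})((?:_[a-z0-9]+)+)")


def is_valid_mission_name(name):
    """Return True if `name` looks like YYYYMMDD_hhmmss_<lowercase tokens>."""
    match = MISSION_NAME.fullmatch(name)
    if match is None:
        return False
    try:
        # Reject impossible dates or times, such as the 31st of February.
        datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")
    except ValueError:
        return False
    return True


print(is_valid_mission_name("20190910_080940_as6_sx4_mapping"))  # True
print(is_valid_mission_name("20190910_080940_AS6_SX4_Mapping"))  # False
```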
 
In short, we follow this structure: 

    data/{raw,configuration,processed}/year/campaign/platform/YYYYMMDD_hhmmss_deployment

**Year** should be the year in 4 digits (e.g. 2018).

**Campaign** should be the full name of the campaign as on official documentation (e.g. ssk17-01, fk180731).

**Platform** should be the full name of the platform and imaging system used, in lowercase letters and with underscores instead of spaces (e.g. tuna_sand, tuna_sand2, ae2000f).

**Deployment** should be the date and time (usually the creation of the relevant image folder on the imaging unit) as yyyymmdd_hhmmss, followed by a platform identifier and an imaging system identifier separated by underscores. The platform identifiers are:
 - tuna sand - ts
 - tuna sand 2 - ts2
 - ae2000f - ae2000f
 - autosub6000 - as6
 - hybis - hybis
 
and the imaging systems or payloads:
 - unagi - un
 - unagi_6k - un6k
 - seaxerocks3 - sx3
 - seaxerocks3 LED - sx3led
 - seaxerocks4 - sx4
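
As an illustration of how these identifiers combine into a deployment folder name, here is a small sketch. The helper `deployment_name` and the dictionary keys (for example `seaxerocks3_led`) are hypothetical and are not part of our tools. Only the short identifiers come from the lists above.

```python
from datetime import datetime

# Short identifiers from the lists above. The dictionary keys are an
# assumption made for this sketch.
PLATFORM_IDS = {
    "tuna_sand": "ts",
    "tuna_sand2": "ts2",
    "ae2000f": "ae2000f",
    "autosub6000": "as6",
    "hybis": "hybis",
}
PAYLOAD_IDS = {
    "unagi": "un",
    "unagi_6k": "un6k",
    "seaxerocks3": "sx3",
    "seaxerocks3_led": "sx3led",
    "seaxerocks4": "sx4",
}


def deployment_name(start, platform, payload=None, purpose=None):
    """Build YYYYMMDD_hhmmss_<platform>[_<payload>][_<purpose>]."""
    parts = [start.strftime("%Y%m%d_%H%M%S"), PLATFORM_IDS[platform]]
    if payload is not None:
        parts.append(PAYLOAD_IDS[payload])
    if purpose:
        # Keep the purpose lowercase and free of spaces.
        parts.append(purpose.lower().replace(" ", "_"))
    return "_".join(parts)


name = deployment_name(
    datetime(2019, 9, 10, 8, 9, 40), "autosub6000", "seaxerocks4", "mapping"
)
print(name)  # 20190910_080940_as6_sx4_mapping
```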

See the example below for an entire cruise:

    raw/
    └── 2019/
        └── dy109/
            ├── hybis/
            │   ├── 20190911_191936_hybis_d51/
            │   ├── 20190916_164920_hybis_d54/
            │   ├── 20190917_162914_hybis_d55/
            │   ├── 20190918_161853_hybis_d56/
            │   ├── 20190919_162523_hybis_d57/
            │   ├── 20190920_180800_hybis_d58/
            │   ├── 20190922_165906_hybis_d59/
            │   ├── 20190923_171728_hybis_d60/
            │   └── 20190924_193240_hybis_d61/
            └── autosub/
                ├── 20190910_071106_as6_sx4_laser_calibration/
                ├── 20190910_080940_as6_sx4_mapping/
                ├── 20190910_092354_as6_sx4_mapping/
                ├── 20190913_090214_as6_sx4_laser_calibration/
                ├── 20190913_101337_as6_sx4_mapping/
                ├── 20190916_090456_as6_sx4_mapping/
                ├── 20190918_105330_as6_sx4_mapping/
                └── 20190922_140621_as6_sx4_mapping/
                



This structure is then duplicated by the software in the configuration and processed database folders.
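
To make the mirroring concrete, the sketch below maps a raw mission path to its counterpart in configuration or processed. The function `mirror_path` is hypothetical: the tools create these folders themselves, and this only illustrates the naming relationship. It assumes that `raw` appears only once in the path.

```python
from pathlib import PurePosixPath


def mirror_path(raw_path, target):
    """Swap the 'raw' component of a database path for `target`."""
    if target not in ("configuration", "processed"):
        raise ValueError("target must be 'configuration' or 'processed'")
    parts = list(PurePosixPath(raw_path).parts)
    if "raw" not in parts:
        raise ValueError("path does not contain a 'raw' folder")
    # Replace the first 'raw' component and keep everything else unchanged.
    parts[parts.index("raw")] = target
    return str(PurePosixPath(*parts))


raw = "/data/dives/raw/2019/dy109/autosub6000/20190916_090456_as6_sx4_mapping"
print(mirror_path(raw, "processed"))
# /data/dives/processed/2019/dy109/autosub6000/20190916_090456_as6_sx4_mapping
```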

### Example 1: Let's explore an actual database!

In https://console.cloud.google.com/storage/browser/soi-uos-data/ you can find an actual dataset. 
Download an example dataset from https://console.cloud.google.com/storage/browser/soi-uos-data/raw/year/cruise/platform/ (61.5 GiB) to test our tools.

#### Download an example dataset

Using
 - gsutil (install it from here: https://cloud.google.com/storage/docs/gsutil_install) 
 - and rsync (commonly available in linux distros)

you will download an example dataset to process it with further tools. 

<div class="alert alert-warning">
Please make sure to change the <font color="blue">database_path</font> to a sensible path on your computer. The dataset is <font color="red">61.5 GiB</font>. If you leave it at /tmp, it will get deleted when you reboot the computer.
</div>

You do not have to worry about running this cell multiple times: rsync does nothing if you have already got the data.

In [3]:
database_path = '/data/dives'

The next command will take time (~45 minutes). 
When finished, the \[*\] indicator on the left will become a number.

In [None]:
%%bash -s "$database_path" 
echo "Downloading data to " $1
mkdir -p $1/raw/year/cruise/platform
gsutil -m rsync -r gs://soi-uos-data/raw/year/cruise/platform $1/raw/year/cruise/platform

### Let's explore the dataset

The following commands explore your filesystem and show a folder tree:

In [4]:
from auv_nav.tools.displayable_path import DisplayablePath
DisplayablePath.show_tree(database_path, max_depth=2)

dives/
├── calibration/
├── configuration/
├── processed/
├── raw/
├── raw_test/
├── Test_ae2000/
└── Test_tunasand2/


The previous command should have just shown your database root folder and raw, as that is what we just downloaded. After processing the datasets with auv_nav, the folders configuration and processed will be created.

Let's inspect raw a bit deeper:

In [5]:
DisplayablePath.show_tree(database_path + '/raw', max_depth=5)

raw/
├── 2017/
│   └── SSK17-01/
│       └── ts_un_006/
│           ├── image/
│           ├── mission.yaml
│           ├── nav/
│           └── vehicle.yaml
├── 2018/
│   ├── fk180731/
│   │   ├── ae2000f/
│   │   │   ├── 20180802_172527_ae2000f_sx3/
│   │   │   ├── 20180803_065749_ae2000f_sx3
│   │   │   └── 20180809_083837_ae2000f_sx3
│   │   └── tuna_sand/
│   │       ├── 20180811_153727_ts_un6k/
│   │       └── calibration_images/
│   ├── fk180731_supplementary/
│   │   ├── ae2000f/
│   │   │   └── SeaXerocksData20180802_174821_laserCal/
│   │   └── cameraCal/
│   └── koyo18-01/
│       └── ae2000f/
│           └── 20181121_061956_ae2000f_sx3/
├── 2019/
│   ├── autosub_test/
│   │   └── ALR6000/
│   │       └── 180629_alr6000_test/
│   └── dy109/
│       └── autosub6000/
│           └── 20190916_090456_as6_sx4_mapping/
└── year/
    └── cruise/
        ├── .DS_Store
        ├── autosub6000/
        │   ├── .DS_Store
        │   └── 180629_autosub6000_test/
        └── platform/
  

If everything is fine, you should see two missions at the end of the tree:
    
    ├── YYYYMMDD_hhmmss_platform_sensor_calib/
    └── YYYYMMDD_hhmmss_platform_sensor_data/
 
These are the two mission folders we will be working on in the following tutorials.

## Structure of a deployment mission

Now you are familiar with the structure of the database. Let's take a closer look at the mission folder `YYYYMMDD_hhmmss_platform_sensor_data`.

In [6]:
DisplayablePath.show_tree(database_path + '/raw/year/cruise/platform/YYYYMMDD_hhmmss_platform_sensor_data', max_depth=2)

YYYYMMDD_hhmmss_platform_sensor_data/
├── .DS_Store
├── image/
├── mission.yaml
├── nav/
├── payload/
└── vehicle.yaml


The folder contents are:
 - Image folder: holds the raw images of the dive
 - Nav folder: holds the logfiles of the navigation sensors
 - Payload folder: holds any other recorded information that is neither images nor navigation, for instance sidescan sonar, multibeam or magnetometer data.
 - `mission.yaml`: stores the relative path of the navigation logfiles and their relationship with what measurement they performed.
 - `vehicle.yaml`: stores the geometric transformations of all sensors and payloads relative to the platform or vehicle used.
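
A quick way to check that a mission folder follows this layout is sketched below. The helper `missing_entries` is hypothetical, and treating the payload folder as optional is an assumption of this sketch.

```python
import tempfile
from pathlib import Path

# Entries listed above. The payload folder is treated as optional here,
# which is an assumption.
EXPECTED_DIRS = ("image", "nav")
EXPECTED_FILES = ("mission.yaml", "vehicle.yaml")


def missing_entries(mission_dir):
    """Return the expected folders and files that `mission_dir` lacks."""
    root = Path(mission_dir)
    missing = [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


# Demonstration on a throwaway folder that lacks vehicle.yaml.
with tempfile.TemporaryDirectory() as tmp:
    mission = Path(tmp) / "YYYYMMDD_hhmmss_platform_sensor_data"
    (mission / "image").mkdir(parents=True)
    (mission / "nav").mkdir()
    (mission / "mission.yaml").touch()
    print(missing_entries(mission))  # ['vehicle.yaml']
```

On a real database, call `missing_entries` with the path to a mission folder under `raw`.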
 
### mission.yaml 

The file `mission.yaml` is structured in nine sections: two for initial conditions and one for each of the seven measurements:
 1. Version: defaults to version 1
 2. Origin: initial latitude and longitude of the mission, date and CRS.
 3. Velocity: path and format of the velocity measurements. Time offset and uncertainty model.
 4. Orientation: path and format of the orientation measurements. Time offset and uncertainty model.
 5. Depth: path and format of the depth measurements. Time offset and uncertainty model.
 6. Pressure: path and format of the pressure measurements. Time offset and uncertainty model.
 7. Altitude: path and format of the altitude measurements. Time offset and uncertainty model.
 8. USBL: path and format of the USBL measurements. Time offset and uncertainty model.
 9. Image: path and format of the images captured.
 
Optionally, there can be payload sections, but for the moment that is not implemented.
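
As a rough illustration, the sketch below lists the top-level sections found in a `mission.yaml` text and reports which of the sections named above are absent. It is not how auv_nav reads the file: it is a line scan rather than a YAML parser, the function names are hypothetical, and it does not assume that every section is mandatory. A real YAML library would be more robust.

```python
import re

# Section names from the list above.
SECTIONS = (
    "version", "origin", "velocity", "orientation", "depth",
    "pressure", "altitude", "usbl", "image",
)

# A top-level key starts at column 0 and is followed by a colon.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*)\s*:")


def top_level_keys(yaml_text):
    """Return the top-level keys of `yaml_text` in order of appearance."""
    keys = []
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            keys.append(match.group(1))
    return keys


def absent_sections(yaml_text):
    """Return the section names from the list that the text lacks."""
    present = set(top_level_keys(yaml_text))
    return [name for name in SECTIONS if name not in present]


sample = """\
#YAML 1.0
version: 1

origin:
  latitude: 44.571
  longitude: -125.149

velocity:
  format: ae2000
"""
print(top_level_keys(sample))   # ['version', 'origin', 'velocity']
print(absent_sections(sample))
# ['orientation', 'depth', 'pressure', 'altitude', 'usbl', 'image']
```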

Run the cell below to see the contents of an actual file.

In [19]:
mission_yaml = database_path + '/raw/year/cruise/platform/YYYYMMDD_hhmmss_platform_sensor_data/mission.yaml'
!cat $mission_yaml

#YAML 1.0
version: 1

origin:
  latitude: 44.571 #44.673
  longitude: -125.149 #-125.120
  coordinate_reference_system: wgs84
  date: 2018/08/05

velocity:
  format: ae2000
  filepath: nav/ae_log/
  filename: pos180805123456.csv
  timezone: -7
  timeoffset: 0.0
  std_factor: 0.001
  std_offset: 0.2


orientation:
  format: ae2000
  filepath: nav/ae_log/
  filename: pos180805123456.csv
  timezone: -7
  timeoffset: 0.0
  std_factor: 0.0
  std_offset: 0.003


depth:
  format: ae2000
  filepath: nav/ae_log/
  filename: pos180805123456.csv
  timezone: -7
  timeoffset: 0.0
  std_factor: 0.0001
  std_offset: 0.0  

altitude:
  format: ae2000
  filepath: nav/ae_log/
  filename: pos180805123456.csv
  timezone: -7
  timeoffset: 0.0
  std_factor: 0.01
  std_offset: 0.0


usbl:
  format: gaps
  filepath: nav/gaps/
  timezone: utc
  timeoffset: 0.0
  id: 3 # Tried: 1, 2, 3, 4
  std_factor: 0.

### vehicle.yaml 

In [20]:
vehicle_yaml = database_path + '/raw/year/cruise/platform/YYYYMMDD_hhmmss_platform_sensor_data/vehicle.yaml'
!cat $vehicle_yaml

#YAML 1.0
origin: #centre of robot
  surge_m: 0
  sway_m: 0
  heave_m: 0
  roll_deg: 0
  pitch_deg: 0
  yaw_deg: 0

# distance with reference to origin/centre of robot
usbl:
  surge_m: 0.
  sway_m: 0
  heave_m: -0.289 # uncertain
  roll_deg: 0
  pitch_deg: 0
  yaw_deg: 0

ins:
  surge_m: -0.09 # uncertain
  sway_m: 0
  heave_m: 0
  roll_deg: 0
  pitch_deg: 0
  yaw_deg: 0.0

dvl:
  surge_m: -0.780625
  sway_m: 0
  heave_m: 0.204
  roll_deg: 0
  pitch_deg: 0
  yaw_deg: 0.0

depth:
  surge_m: 0.
  sway_m: 0
  heave_m: 0
  roll_deg: 0
  pitch_deg: 0
  yaw_deg: 0

camera1: #Left Camera
  surge_m: 0.262875
  sway_m: 0.
  heave_m: 0.5
  roll_deg: 0
  pitch_deg: 0
  yaw_deg: 0

camera2: #Right Camera
  surge_m: 0.012875
  sway_m: 0. # uncertain
  heave_m: 0.5 # uncertain
  roll_deg: 0
  pitch_deg: 0
  yaw_deg: 0


camera3: # Laser Camera (LM165)
  surge_m: 0.150375 # 0.147875

In [None]:
%%bash --no-raise-error
auv_nav