![header](https://i.imgur.com/I4ake6d.jpg)

# IN SITU GLOBAL SEAS TRAINING

<div style="text-align: right"><i> 13-04-Part-four-out-of-five </i></div>

***
# GLO `NRT` product/dataset: managing files (saildrones)

***
**General Note 1**: Execute each cell through the <button class="btn btn-default btn-xs"><i class="icon-play fa fa-play"></i></button> button from the top MENU (or keyboard shortcut `Shift` + `Enter`).<br>
<br>
**General Note 2**: If, for any reason, the kernel is not working anymore, in the top MENU, click on the <button class="btn btn-default btn-xs"><i class="fa fa-repeat icon-repeat"></i></button> button. Then, in the top MENU, click on "Cell" and select "Run All Above Selected Cell".<br>
***

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc">
    <ul class="toc-item">
        <li><span><a href="#1.-Introduction" data-toc-modified-id="1.-Introduction">1. Introduction</a></span></li>
        <li>
            <span><a href="#2.-Setup" data-toc-modified-id="2.-Setup">2. Setup</a></span>
            <ul>
                <li><span><a href="#2.1.-Python-packages" data-toc-modified-id="2.1.-Python-packages">2.1. Python packages</a></span></li>
                <li><span><a href="#2.2.-Auxiliary-functions" data-toc-modified-id="2.2.-Auxiliary-functions">2.2. Auxiliary functions</a></span></li>
            </ul>
        </li>
        <li><span><a href="#3.-Saildrones-(SD)-data" data-toc-modified-id="3.-Saildrones-(SD)-data">3. Saildrones (SD) data</a></span>
            <ul>
                <li><span><a href="#3.1.-Reading-the-file" data-toc-modified-id="3.1.-Reading-the-file">3.1. Reading the file</a></span></li>
                <li><span><a href="#3.2.-Data-visualization" data-toc-modified-id="3.2.-Data-visualization">3.2. Data visualization</a></span>
                    <ul>
                        <li><span><a href="#3.2.1.-Trajectory-animation" data-toc-modified-id="3.2.1.-Trajectory-animation">3.2.1. Trajectory animation</a></span></li>
                        <li><span><a href="#3.2.2.-Along-track-variable-evolution" data-toc-modified-id="3.2.2.-Along-track-variable-evolution">3.2.2. Along track variable evolution</a></span></li>
                        <li><span><a href="#3.2.3.-Overall-variable-evolution" data-toc-modified-id="3.2.3.-Overall-variable-evolution">3.2.3. Overall variable evolution</a></span></li>
                    </ul>
                </li>     
            </ul>
        </li>
        <li><span><a href="#4.-Wrap-up" data-toc-modified-id="4.-Wrap-up">4. Wrap-up</a></span></li>
    </ul>
</div>

## 1. Introduction
[Go back to the "Table of Contents"](#Table-of-Contents)

According to [13-01-NearRealTtime-product-collections-overview.ipynb](13-01-NearRealTtime-product-collections-overview.ipynb), Saildrones are one of the available data source types. Please use the notebook [13-02-NearRealTtime-product-subsetting-download](13-02-NearRealTtime-product-subsetting-download.ipynb) to download some Saildrone files ('SD' data type) and let's check their data. <br> If you want to skip the download step, you can use the netCDF files available in `/data/nc_files/SD/` instead.

## 2. Setup
[Go back to the "Table of Contents"](#Table-of-Contents)

### 2.1. Python packages

For the notebook to run properly, we first need to load the following packages, available from the Jupyter Notebook ecosystem. Please `run the next cell`:

In [None]:
import warnings
warnings.filterwarnings("ignore")

import os
import pandas as pd
import datetime
import numpy as np
import xarray
import folium
from folium import plugins
from IPython.display import YouTubeVideo
import branca
%matplotlib inline

<div class="alert alert-block alert-warning">
<b>WARNING</b>
    
***  
If any of them raises an error, it means you need to install the module first. To do so, please:
1. Open a new cell in the notebook
2. Run <i>`!conda install packageName --yes`</i> or <i>`!conda install -c conda-forge packageName --yes`</i> or <i>`!pip install packageName`</i>
3. Import again!
<br><br>
Example: <i>how to solve an import error for the json2html module</i>

![region.png](img/errorImporting.gif)

### 2.2. Auxiliary functions

Please `run the next cell` to load into memory a function we will use later on to subset the original file if it turns out to be too large to fit into memory:

In [None]:
def get_subset(start, end, ds):
    """Subset a dataset (ds) between the start and end dates (e.g. '2020-01-30')."""
    # TIME values as strings, so that a date can be matched as a substring
    time_strings = ds['TIME'].astype(str).values
    # Position of the first sample matching `start` and of the last one matching `end`
    i_start = int(np.flatnonzero(np.char.find(time_strings, start) >= 0)[0])
    i_end = int(np.flatnonzero(np.char.find(time_strings, end) >= 0)[-1])
    # Apply the same window to every along-track dimension
    window = slice(i_start, i_end)
    return ds.isel(TIME=window, LATITUDE=window, LONGITUDE=window, POSITION=window)
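
The date lookup done by `get_subset` can be illustrated without any netCDF file. The following is a minimal plain-Python sketch, assuming the timestamps are available as ISO-like strings; `find_date_window` and the sample timestamps are hypothetical and are not part of xarray or of this notebook.

```python
def find_date_window(time_strings, start, end):
    """Return (index of first string containing `start`,
    index of last string containing `end`)."""
    start_hits = [i for i, t in enumerate(time_strings) if start in t]
    end_hits = [i for i, t in enumerate(time_strings) if end in t]
    if not start_hits or not end_hits:
        raise ValueError('start or end date not found in the time axis')
    return start_hits[0], end_hits[-1]


# Hypothetical timestamps, one every 12 hours
sample_times = [
    '2020-01-29T12:00:00', '2020-01-30T00:00:00', '2020-01-30T12:00:00',
    '2020-01-31T00:00:00', '2020-01-31T12:00:00', '2020-02-01T00:00:00',
]
i_start, i_end = find_date_window(sample_times, '2020-01-30', '2020-01-31')

# As with a Python slice, the sample at i_end itself is left out
window = sample_times[i_start:i_end]
```

Matching on a date substring means every sample of the start day is a candidate, and the first one wins; likewise the last sample of the end day marks the upper bound.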

## 3. Saildrones (SD) data
[Go back to the "Table of Contents"](#Table-of-Contents)

Saildrones are sailing devices that measure a number of metocean variables (e.g. salinity, temperature, wind...) as they move. <br>These are brand new platforms and you can get to know them better by watching the following Hello-World video: `run the next cell`

In [None]:
YouTubeVideo('iFTToTsJuY0', width="100%", height=500)

Let's look at the data of one of the Saildrones available in the GLO seas.<br>`Run the next cell` to see the Saildrone files already available in the /data folder:

In [None]:
dir_SD = os.path.join(os.getcwd(),'data','nc_files','SD') 
os.listdir(dir_SD)

### 3.1. Reading the file

`Set one` of the `file names` listed above and `run the next cells`:

In [None]:
file = 'GL_TS_SD_1801573.nc'
path = os.path.join(dir_SD, file)

In [None]:
ds = xarray.open_dataset(path)
ds.close()
ds

The output above is an overview of the content of the file: variables, dimensions, coordinates, global attributes...
<br>Let's list now the available variables: `run the next cell`

In [None]:
for var in ds.variables:
    # Fall back to 'n/a' if a variable has no long_name attribute
    print(var + ': ' + ds[var].attrs.get('long_name', 'n/a'))

Let's check the average sampling rate, as it must be kept in mind when plotting to avoid memory crashes:

In [None]:
start = datetime.datetime.strptime(ds.attrs['time_coverage_start'], '%Y-%m-%dT%H:%M:%SZ')
end = datetime.datetime.strptime(ds.attrs['time_coverage_end'], '%Y-%m-%dT%H:%M:%SZ')

In [None]:
aprox_sampling_rate_in_minutes = ((end-start).total_seconds()/60)/len(ds['TIME'])
'one measure every '+str(aprox_sampling_rate_in_minutes)+' minutes from/to %s/%s'%(start,end)
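
The estimate above is simply the covered time span divided by the number of samples. A minimal sketch of the same arithmetic, using made-up coverage dates and a made-up sample count rather than values from the file (`approx_sampling_rate_minutes` is a hypothetical helper):

```python
from datetime import datetime


def approx_sampling_rate_minutes(coverage_start, coverage_end, n_samples):
    """Average number of minutes between two consecutive samples."""
    fmt = '%Y-%m-%dT%H:%M:%SZ'  # same format as the time_coverage_* attributes
    start = datetime.strptime(coverage_start, fmt)
    end = datetime.strptime(coverage_end, fmt)
    return (end - start).total_seconds() / 60 / n_samples


# Hypothetical example: one day of data holding 288 samples
rate = approx_sampling_rate_minutes(
    '2020-01-01T00:00:00Z', '2020-01-02T00:00:00Z', 288
)
```

This is only an average: it says nothing about gaps or bursts in the record.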

Given the above information, it is better to continue with just a subset of the file: `run the next cells`

In [None]:
start = '2020-01-30'
end = '2020-02-04'

In [None]:
subset = get_subset(start, end, ds)

### 3.2. Data visualization

#### 3.2.1. Trajectory animation

As stated before, the saildrone is a mobile platform.
<br>Let's check the overall trajectory by joining the sampling points.

In the In Situ TAC netCDF files, every variable is paired with another one of the same name plus the '_QC' suffix. This 'twin' variable contains a quality flag for each value in the paired variable. `Run the next cell` to check the flag values convention:

In [None]:
pd.DataFrame(data=subset['TEMP_QC'].attrs['flag_values'],
             index=subset['TEMP_QC'].attrs['flag_meanings'].split(' '), columns=['quality flag'])
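
The table above pairs each entry of `flag_values` with the word at the same position in `flag_meanings`. A minimal sketch of that pairing as a plain dictionary follows; the attribute values below are a shortened, illustrative sample (the real ones must be read from the file), and `describe_flag` is a hypothetical helper.

```python
# Illustrative attribute values only; read the real ones from the '_QC' variable
flag_values = [0, 1, 2, 4]
flag_meanings = 'no_qc_performed good_data probably_good_data bad_data'

# Pair each flag value with the meaning found at the same position
flag_lookup = dict(zip(flag_values, flag_meanings.split(' ')))


def describe_flag(flag):
    """Return the meaning of a quality flag, or 'unknown_flag' if not listed."""
    return flag_lookup.get(flag, 'unknown_flag')
```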

Users are advised to use only the data flagged as 1, the so-called 'good data'. Let's then check the available flags for the coordinates (time and position) to see whether we need to discard any values that are not good: `run the next cells`

In [None]:
subset['POSITION_QC'].plot(aspect=2, size=5)

From the above, we see no flag values other than 1, so we are ready to go! Be aware, nevertheless, that if other flag values are present, the data must be filtered. See how next: `run the next cell`

In [None]:
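# LATITUDE, LONGITUDE and POSITION_QC sit on different dimensions, so compare the raw
# arrays position by position instead of letting xarray broadcast them into a 2D grid
position_ok = subset['POSITION_QC'].values == 1
lats = np.where(position_ok, subset['LATITUDE'].values, np.nan).tolist()
lons = np.where(position_ok, subset['LONGITUDE'].values, np.nan).tolist()
# TIME and TIME_QC share the TIME dimension, so .where() can be used directly
times = subset['TIME'].where(subset['TIME_QC'] == 1).values.tolist()
strtimes = subset['TIME'].where(subset['TIME_QC'] == 1).values[:]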
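
The selection above keeps a value only where its quality flag equals 1 and blanks it otherwise. A minimal plain-Python sketch of that idea, with made-up latitudes and flags (`keep_good` is a hypothetical helper, not an xarray function):

```python
def keep_good(values, flags, good_flag=1):
    """Replace every value whose flag differs from `good_flag` with NaN."""
    if len(values) != len(flags):
        raise ValueError('values and flags must have the same length')
    return [v if f == good_flag else float('nan') for v, f in zip(values, flags)]


# Hypothetical latitudes with one sample flagged as bad (4)
lat_sample = [43.10, 43.12, 43.15, 43.19]
qc_sample = [1, 4, 1, 1]
masked = keep_good(lat_sample, qc_sample)
```

Keeping the list length unchanged (NaN instead of dropping) preserves the alignment between times, latitudes and longitudes.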

Let's now create a GeoJSON feature representing the vessel:

In [None]:
saildrone = {
    'type': 'Feature',
    'geometry': {
        'type': 'LineString',
        'coordinates': []
    },
    'properties': {
        'times': [],
    }
}

Let's populate it:

In [None]:
for time, strtime, lat, lon in zip(times, strtimes, lats, lons):
    base = [time,lat,lon]
    if(any(x is None for x in base)):
        continue
    if(any(np.isnan(x) for x in base)):
        continue
    saildrone['properties']['times'].append(str(strtime)[:22])
    saildrone['geometry']['coordinates'].append([lon, lat])
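
The two steps above (an empty feature, then a loop that skips incomplete samples) can be combined into a single function. The following is a sketch under the assumption that times are already strings; `build_track_feature` and the sample values are hypothetical.

```python
import json
import math


def build_track_feature(times, lats, lons):
    """Build a GeoJSON LineString feature, skipping incomplete samples."""
    feature = {
        'type': 'Feature',
        'geometry': {'type': 'LineString', 'coordinates': []},
        'properties': {'times': []},
    }
    for time, lat, lon in zip(times, lats, lons):
        # Skip samples with a missing time or a missing/NaN position
        if time is None or lat is None or lon is None:
            continue
        if math.isnan(lat) or math.isnan(lon):
            continue
        feature['properties']['times'].append(time)
        # GeoJSON positions are ordered longitude first, then latitude
        feature['geometry']['coordinates'].append([lon, lat])
    return feature


# Hypothetical samples: the second has no latitude, the third has no time
sample = build_track_feature(
    ['2020-01-30T00:00:00', '2020-01-30T00:05:00', None],
    [43.10, float('nan'), 43.15],
    [-9.20, -9.21, -9.25],
)
geojson_text = json.dumps(sample)  # the feature is plain JSON-serializable data
```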

In [None]:
mean_lat, mean_lon = np.nanmean(lats), np.nanmean(lons)
m = folium.Map(location=[mean_lat, mean_lon], zoom_start=6)
marker = plugins.TimestampedGeoJson({
    'type': 'FeatureCollection',
    'features': [saildrone],
}, add_last_point=True, loop=False).add_to(m)
m

<div class="alert alert-block alert-warning">
<b>WARNING</b>
    
***  
If no map is displayed when running the cell above, please try a different browser (e.g. Chrome).

#### 3.2.2. Along track variable evolution

Let's focus on one of the variables and visualize its data: `set one and run the next cells`

In [None]:
param = 'TEMP'
subset[param][:,1]

In [None]:
subset[param+'_QC'][:,1].plot()

Let's keep only the good data (flag 1):

In [None]:
var = subset[param][:,1].where(subset[param+'_QC'][:,1] == 1).values.tolist()

Let's set a colormap:

In [None]:
linear_cmap = branca.colormap.LinearColormap(['green', 'yellow', 'red'],vmin=np.nanmin(var), vmax=np.nanmax(var))
linear_cmap
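
To make the idea of a linear colormap concrete, here is a small plain-Python sketch that maps a value to a colour by interpolating between evenly spaced RGB stops. It only illustrates the principle and is not branca's implementation; `interpolate_color` and the RGB tuples are assumptions made for this example.

```python
def interpolate_color(value, vmin, vmax, colors):
    """Linearly interpolate an (r, g, b) colour for `value` in [vmin, vmax].

    `colors` is a list of at least two RGB tuples, evenly spaced along the range.
    """
    if len(colors) < 2:
        raise ValueError('at least two colours are needed')
    if vmax == vmin:
        return colors[0]
    # Relative position of the value in the range, clamped to [0, 1]
    position = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    scaled = position * (len(colors) - 1)
    lower = min(int(scaled), len(colors) - 2)  # index of the colour stop below
    fraction = scaled - lower                  # distance towards the next stop
    return tuple(
        round(a + (b - a) * fraction)
        for a, b in zip(colors[lower], colors[lower + 1])
    )


# Illustrative RGB stops for green, yellow and red
GREEN, YELLOW, RED = (0, 128, 0), (255, 255, 0), (255, 0, 0)
coldest = interpolate_color(10.0, 10.0, 20.0, [GREEN, YELLOW, RED])
middle = interpolate_color(15.0, 10.0, 20.0, [GREEN, YELLOW, RED])
warmest = interpolate_color(20.0, 10.0, 20.0, [GREEN, YELLOW, RED])
```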

Let's plot the parameter values along the trajectory:

In [None]:
m = folium.Map(location=[mean_lat, mean_lon], zoom_start=8)
for k in range(0,len(times)-1,3):
    try:
        color = linear_cmap(var[k])
        folium.CircleMarker([lats[k],lons[k]], radius=1,color=color).add_to(m)
    except Exception as e:
        pass
m.fit_bounds(m.get_bounds())
colormap = branca.colormap.LinearColormap(['green', 'yellow', 'red']).scale(int(np.nanmin(var)), int(np.nanmax(var))).to_step(6)
colormap.caption = param+' variation along the saildrone track'
m.add_child(colormap)
m
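
The plotting loop above draws only every third sample and silently skips any sample that cannot be coloured. A plain-Python sketch of the same thinning with the skip condition made explicit is shown below; `thin_track` and the sample values are hypothetical.

```python
import math


def thin_track(lats, lons, values, step=3):
    """Keep every `step`-th sample that has a valid position and value."""
    points = []
    n = min(len(lats), len(lons), len(values))
    for k in range(0, n, step):
        sample = (lats[k], lons[k], values[k])
        # Skip samples with a missing or NaN coordinate or value
        if any(x is None or math.isnan(x) for x in sample):
            continue
        points.append(sample)
    return points


# Hypothetical track of seven samples; the fourth has no valid value
track_lats = [43.0, 43.1, 43.2, 43.3, 43.4, 43.5, 43.6]
track_lons = [-9.0, -9.1, -9.2, -9.3, -9.4, -9.5, -9.6]
track_vals = [12.0, 12.1, 12.2, float('nan'), 12.4, 12.5, 12.6]
points = thin_track(track_lats, track_lons, track_vals)
```

An explicit check has the advantage over a bare `try`/`except` that unrelated errors are not hidden.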

<div class="alert alert-block alert-warning">
<b>WARNING</b>
    
***  
If no map is displayed, please try a different browser (e.g. Chrome).

Do you want to use a bathymetry basemap instead? `Run the next cell`

In [None]:
m = folium.Map(
    location=[mean_lat, mean_lon], 
    zoom_start=8,
    tiles='https://tiles.emodnet-bathymetry.eu/2020/baselayer/web_mercator/{z}/{x}/{y}.png',
    attr="EMODnet bathy"
)
for k in range(0,len(times)-1,3):
    try:
        color = linear_cmap(var[k])
        folium.CircleMarker([lats[k],lons[k]], radius=1,color=color).add_to(m)
    except Exception as e:
        pass
m.fit_bounds(m.get_bounds())
colormap = branca.colormap.LinearColormap(['green', 'yellow', 'red']).scale(int(np.nanmin(var)), int(np.nanmax(var))).to_step(6)
colormap.caption = param+' variation along the saildrone track'
m.add_child(colormap)
m

#### 3.2.3. Overall variable evolution

Let's plot the overall variability of the parameter over time (as the vessel moves):

In [None]:
subset[param][:,1].plot(aspect=3, size=5, marker='o', color='k')

***

## 4. Wrap-up
[Go back to the "Table of Contents"](#Table-of-Contents)

By now you should know how to deal with trajectory-like time series data from saildrones. <br> `If you don't, please ask us! Now is the moment!`