![header](https://i.imgur.com/I4ake6d.jpg)

# IN SITU BLACK SEA TRAINING

<div style="text-align: right"><i> 13-05-Part-three-out-of-five </i></div>

# BS `NRT` product/dataset: managing files (Tide Gauges)

***

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc">
    <ul class="toc-item">
        <li><span><a href="#Introduction" data-toc-modified-id="Introduction">Introduction</a></span></li>
        <li>
            <span><a href="#Setup" data-toc-modified-id="Setup">Setup</a></span>
            <ul>
                <li><span><a href="#Python-packages" data-toc-modified-id="Python-packages">Python packages</a></span></li>
            </ul>
        </li>
        <li><span><a href="#Tide-Gauges-(TG)-data" data-toc-modified-id="Tide-Gauges-(TG)-data">Tide Gauges (TG) data</a></span>
            <ul>
                <li><span><a href="#Reading-file" data-toc-modified-id="Reading-file">Reading file</a></span></li>
                <li><span><a href="#Subsetting-Operations" data-toc-modified-id="Subsetting-Operations">Subsetting Operations</a></span></li>
                <li><span><a href="#Sampling-Operations" data-toc-modified-id="Sampling-Operations">Sampling Operations</a></span>
                    <ul>
                        <li><span><a href="#Selecting-Good-data-(QC-flags)" data-toc-modified-id="Selecting-Good-data-(QC-flags)">Selecting Good data (QC flags)</a></span></li>
                        <li><span><a href="#Upsampling/Downsampling" data-toc-modified-id="Upsampling/Downsampling">Upsampling/Downsampling</a></span></li>
                    </ul>
                </li>
        <li><span><a href="#Exporting-data-to-excel" data-toc-modified-id="Exporting-data-to-excel">Exporting data to excel</a></span></li>
            </ul>
        </li>
        <li><span><a href="#Wrap-up" data-toc-modified-id="Wrap-up">Wrap-up</a></span></li>
        <li><span><a href="#Feedback-survey" data-toc-modified-id="Feedback-survey">Feedback survey</a></span></li>
    </ul>
</div>

***

## Introduction

According to [13-01-NearRealTtime-product-collections-overview.ipynb](13-01-NearRealTtime-product-collections-overview.ipynb), Tide Gauges are one of the available data source types. Please use the notebook [13-02-NearRealTtime-product-subsetting-download](13-02-NearRealTtime-product-subsetting-download.ipynb) to download some Tide Gauge files ('TG' data type) and let's check their data. If you want to skip the download, you can use the netCDF files available <a href="data" target="_blank">here</a> instead.  

## Setup

### Python packages

For the notebook to run properly, we first need to load the following packages, available from the Jupyter Notebook ecosystem. Please run the `next cell`:

In [None]:
import warnings
warnings.filterwarnings("ignore")

import IPython
import datetime
import pandas as pd
import os
import xarray
import matplotlib.pyplot as plt
import folium
%matplotlib inline

<div class="alert alert-block alert-warning">
<b>WARNING</b>
    
***  
If any of them raises an error, you need to install the module first. To do so:
1. Open a new cell in the notebook
2. Run <i>`!conda install packageName --yes`</i> or <i>`!conda install -c conda-forge packageName --yes`</i> or <i>`!pip install packageName`</i>
3. Import again!
<br><br>
Example: <i>how-to-solve import error for json2html module </i>

![region.png](img/errorImporting.gif)

## Tide Gauges (TG) data

Tide Gauges are fixed platforms measuring Sea Level over time plus, potentially, some other oceanographic variables.<br>
Let's see the data of one of the available Tide Gauges in the BS. 

### Reading file

`Run the next cell` to see the tide gauge files already available in the /data folder:

In [None]:
IPython.display.IFrame('data/files/TG', width='100%', height=350)

`Set one` of the `file names` listed above and `run the next cells`:

In [None]:
file = 'BS_TS_TG_Varna.nc'
path = os.path.join(os.getcwd(), 'data','files','TG', file)

In [None]:
ds = xarray.open_dataset(path)
ds.close()
ds

The above is an overview of the content of the file: variables, dimensions, coordinates, global attributes...
<br>For example, we already know the platform's last position. Let's draw it on a map: `run the next cell`

In [None]:
# Read the last known position once and convert it to float,
# since global attributes may be stored as strings
lat = float(ds.attrs['last_latitude_observation'])
lon = float(ds.attrs['last_longitude_observation'])
m = folium.Map(location=[lat, lon], zoom_start=6)
tooltip = ds.attrs['platform_code']
folium.Marker([lat, lon], tooltip=tooltip).add_to(m)
m

<div class="alert alert-block alert-warning">
<b>WARNING</b>
    
***  
If you do not see any map after running the cell above, please try a different browser (for example Chrome).

<br>Let's now list the available variables: `run the next cell`

In [None]:
for var in ds.variables:
    # .get avoids a KeyError for any variable without a 'long_name' attribute
    print(var + ': ' + ds[var].attrs.get('long_name', 'n/a'))

Let's focus on one of the parameters: `run the next cell` to see its attributes:

In [None]:
param = 'SLEV'
ds[param]

Let's have a look at the whole time series of the parameter: `run the next cell`

In [None]:
ds[param].plot(aspect=2, size=10, color='k', marker='o')

### Subsetting Operations

Let's select a specific time range:

In [None]:
start = '2019-01-01'
end = '2019-12-30'

In [None]:
subset = ds[param].sel(TIME=slice(start, end))
subset.plot(aspect=2, size=10, color='k', marker='o')
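
A note on the boundaries of such a selection: label-based slices on a time index include both endpoints, and a date-only end label covers that whole day. The sketch below illustrates this with pandas on a made-up hourly series (the series and its values are assumptions for illustration, not data from the file). xarray's `.sel(TIME=slice(...))` is expected to behave the same way, since it relies on a pandas index for label-based selection.

```python
import numpy as np
import pandas as pd

# Made-up hourly series covering 29-31 December 2019 (illustrative only)
time = pd.date_range('2019-12-29 00:00', periods=72, freq=pd.Timedelta(hours=1))
series = pd.Series(np.arange(len(time)), index=time)

start = '2019-12-29'
end = '2019-12-30'
subset = series.loc[start:end]  # both endpoints are included

print(len(subset))       # number of hourly values selected
print(subset.index[-1])  # last timestamp kept by the slice
```

If the last day should be excluded, set the end label to the previous day or to an explicit timestamp.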

### Sampling Operations

We will upsample and downsample the above subset of the series, not the original one, to better see the differences.

#### Selecting Good data (QC flags)

Is there any bad data in the above time series? `Run the next cells` to check the quality flags assigned to the parameter:

In [None]:
subset_QC = ds[param+'_QC'].sel(TIME=slice(start, end))
subset_QC.plot(aspect=2, size=5)

In [None]:
set(subset_QC[:,0].values.tolist())

All In Situ TAC variables are linked to another variable with the same name plus '_QC'. This 'twin' variable contains a quality flag for each value in the paired variable. <br>
Let's check all possible 'QC' values: `run the next cell`

In [None]:
pd.DataFrame(data=ds[param+'_QC'].attrs['flag_values'],
             index=ds[param+'_QC'].attrs['flag_meanings'].split(' '), 
             columns=['quality flag'])

From the above list, users are recommended to use only the data flagged as 1. Therefore, when working with any of the variables, we will first perform a data cleaning to keep only the 'good data'.

If we wanted to do so (only needed if the plot above showed values different from 1), the operation would be:

In [None]:
subset_good = subset.where(subset_QC == 1)

In [None]:
subset.plot(aspect=2, size=10, color='r', marker='o')  # original series in red
subset_good.plot(color='k', marker='o')  # cleaned series in black
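
To make the effect of `where` concrete, here is a minimal sketch on made-up values. It uses pandas `Series.where` as a stand-in for the xarray call, and both the numbers and the flags are illustrative assumptions. Values whose flag is not 1 become NaN rather than being dropped, so the time axis keeps its length.

```python
import numpy as np
import pandas as pd

time = pd.date_range('2019-01-01', periods=5, freq=pd.Timedelta(minutes=1))
values = pd.Series([0.10, 0.12, 9.99, 0.11, 0.13], index=time)  # made-up sea levels
flags = pd.Series([1, 1, 4, 1, 2], index=time)                  # made-up QC flags

# Keep values flagged as 1 (good data); everything else is replaced by NaN
good = values.where(flags == 1)

n_masked = int(good.isna().sum())
print(good)
print(n_masked, 'of', len(values), 'values masked')
```

Because the masked positions are kept as gaps, a line plot of the cleaned series shows breaks where data was rejected.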

#### Upsampling/Downsampling

<ul> <li>Downsampling </li></ul>
Let's now see a downsampling example, that is, retrieving fewer observations in a given period by aggregating them somehow, <i>e.g. taking the mean of the original observations</i>. Let's get, by averaging, a monthly resolution sampling: `run the next cell`

In [None]:
(subset_good.resample(TIME='1M').mean()).plot(aspect=2, size=10, color='k', marker='o')
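
The same idea on a small made-up series, sketched with pandas so that it runs without the data file (the values are illustrative assumptions). Six-hourly values are averaged into one value per day. NaN values, such as those left by the QC masking, are skipped by `mean()` by default.

```python
import numpy as np
import pandas as pd

# Two days of made-up six-hourly values, with one gap (NaN) on the second day
time = pd.date_range('2019-01-01', periods=8, freq=pd.Timedelta(hours=6))
series = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0, np.nan, 20.0, 30.0], index=time)

# Downsample: one mean value per calendar day
daily = series.resample('1D').mean()
print(daily)
```

The second day is averaged over its three valid values only, so a period with many gaps rests on fewer observations than its neighbours.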

<ul><li>Upsampling</li></ul>

Let's now see an upsampling example, that is, retrieving more observations in a given period by inferring new ones somehow, <i>e.g. interpolating the original observations</i>. Let's get, by interpolating, a 0.5-minute resolution instead:

In [None]:
(subset_good.resample(TIME='0.5Min').interpolate('linear')).plot(aspect=2, size=10, color='k', marker='o')
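
A small pandas sketch of the same operation on made-up values (illustrative assumptions, not data from the file): two observations one minute apart are resampled to a 30-second grid, and the new point in between is filled by linear interpolation.

```python
import pandas as pd

# Two made-up observations, one minute apart
time = pd.to_datetime(['2019-01-01 00:00:00', '2019-01-01 00:01:00'])
series = pd.Series([1.0, 2.0], index=time)

# Upsample to a 30-second grid and fill the new point linearly
upsampled = series.resample('30s').interpolate('linear')
print(upsampled)
```

Keep in mind that interpolated values are estimates and not measurements.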

### Exporting data to excel

Let's export the cleaned subset of the series to Excel.

1) `run the next cell` to create a dataframe:

In [None]:
dataframe = subset_good[:,0].to_dataframe()
dataframe

2) `run the next cell` to reset the time axis to readable dates

In [None]:
df_with_readable_time = dataframe.set_index(dataframe.index.astype(str).str[:19])
df_with_readable_time
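
What the string slicing above does: the first 19 characters of a timestamp's text form are `YYYY-MM-DD HH:MM:SS`, so anything finer than seconds is cut off. Below is a minimal sketch on made-up timestamps (illustrative assumptions), compared against `strftime`, which states the target format explicitly.

```python
import pandas as pd

# Made-up timestamps with sub-second precision
index = pd.DatetimeIndex(['2019-01-01 00:00:00.123456789',
                          '2019-01-01 00:01:00.000000000'])

sliced = index.astype(str).str[:19]               # keep 'YYYY-MM-DD HH:MM:SS'
formatted = index.strftime('%Y-%m-%d %H:%M:%S')   # same result, explicit format

print(list(sliced))
print(list(formatted))
```

Either way the index ends up holding plain strings, which is convenient for a spreadsheet but means it can no longer be used for time-based selection.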

3) Export the dataframe to Excel: `run the next cells`!

In [None]:
out_put_dir = os.getcwd() #by default: current working directory. Set a different path if you want

In [None]:
file_name = ds.attrs['platform_code']+'_time_serie.xlsx'
file_name

In [None]:
# Export the dataframe with the readable time index built in step 2
df_with_readable_time.to_excel(os.path.join(out_put_dir, file_name))

<div class="alert alert-block alert-info" style="margin-left: 2em">
<b>TIP</b>
    
***  
Check your output directory for the exported file and inspect its content!

---



## Wrap-up

By now you should know how to deal with time series data from fixed platforms (tide gauges, moorings, river flows...).<br> `If you don't, please ask us! Now is the moment!`

***

## Feedback survey

<div class="alert alert-block alert-success">
    <b>CONGRATULATIONS</b><br>

***
**IF IT'S 202025, PLEASE READ THE LINES BELOW CAREFULLY (ACTION REQUIRED FROM YOUR SIDE)**
***    
This training course is over but we'd love to hear from you about how we could improve it (topics, tools, storytelling, format, speed etc). 

We have prepared a little questionnaire to gather all your inputs, available here (just click on the hyperlink or execute the very last cell and click on `Answer`):
- https://tiny.cc/training-blk-insitu

We do thank you in advance for your kind collaboration :)

Greetings <3

In [None]:
from IPython.display import IFrame
IFrame('https://tiny.cc/training-blk-insitu', width=900, height=500)