![header](https://i.imgur.com/I4ake6d.jpg)

# IN SITU GLOBAL SEAS TRAINING

<div style="text-align: right"><i> 13-03-Part-three-out-of-five </i></div>

***
# GLO `NRT` product/dataset: managing files (tideGauges)

***
**General Note 1**: Execute each cell through the <button class="btn btn-default btn-xs"><i class="icon-play fa fa-play"></i></button> button from the top MENU (or keyboard shortcut `Shift` + `Enter`).<br>
<br>
**General Note 2**: If, for any reason, the kernel is not working anymore, in the top MENU, click on the <button class="btn btn-default btn-xs"><i class="fa fa-repeat icon-repeat"></i></button> button. Then, in the top MENU, click on "Cell" and select "Run All Above Selected Cell".<br>
***

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc">
    <ul class="toc-item">
        <li><span><a href="#1.-Introduction" data-toc-modified-id="1.-Introduction">1. Introduction</a></span></li>
        <li>
            <span><a href="#2.-Setup" data-toc-modified-id="2.-Setup">2. Setup</a></span>
            <ul>
                <li><span><a href="#2.1.-Python-packages" data-toc-modified-id="2.1.-Python-packages">2.1. Python packages</a></span></li>
            </ul>
        </li>
        <li><span><a href="#3.-Tide-Gauges-(TG)-data" data-toc-modified-id="3.-Tide-Gauges-(TG)-data">3. Tide Gauges (TG) data</a></span>
            <ul>
                <li><span><a href="#3.1.-Reading-file" data-toc-modified-id="3.1.-Reading-file">3.1. Reading file</a></span></li>
                <li><span><a href="#3.2.-Subsetting-Operations" data-toc-modified-id="3.2.-Subsetting-Operations">3.2. Subsetting Operations</a></span></li>
                <li><span><a href="#3.3.-Sampling-Operations" data-toc-modified-id="3.3.-Sampling-Operations">3.3. Sampling Operations</a></span>
                    <ul>
                        <li><span><a href="#3.3.1.-Selecting-Good-data-(QC-flags)" data-toc-modified-id="3.3.1.-Selecting-Good-data-(QC-flags)">3.3.1. Selecting Good data (QC flags)</a></span></li>
                        <li><span><a href="#3.3.2.-Upsampling/Downsampling" data-toc-modified-id="3.3.2.-Upsampling/Downsampling">3.3.2. Upsampling/Downsampling</a></span></li>
                    </ul>
                </li>
        <li><span><a href="#3.4.-Exporting-data-to-csv" data-toc-modified-id="3.4.-Exporting-data-to-csv">3.4. Exporting data to csv</a></span></li>
            </ul>
        </li>
        <li><span><a href="#4.-Wrap-up" data-toc-modified-id="4.-Wrap-up">4. Wrap-up</a></span></li>
    </ul>
</div>

***

## 1. Introduction
[Go back to the "Table of Contents"](#Table-of-Contents)

According to the [13-01-NearRealTtime-product-collections-overview.ipynb](13-01-NearRealTtime-product-collections-overview.ipynb) notebook, Tide Gauges are one of the available data source types. Please use the notebook [13-02-NearRealTtime-product-subsetting-download](13-02-NearRealTtime-product-subsetting-download.ipynb) to download some Tide Gauge files ('TG' data type), and let's check their data. If you want to skip the download, you can use the netCDF files available in the `/data/nc_files/TG` folder instead.

## 2. Setup
[Go back to the "Table of Contents"](#Table-of-Contents)

### 2.1. Python packages

For the notebook to run properly, we first need to load the following packages, available from the Jupyter Notebook ecosystem. Please `run the next cell`:

In [None]:
import warnings
warnings.filterwarnings("ignore")

import IPython
import datetime
import pandas as pd
import os
import xarray
import matplotlib.pyplot as plt
import folium
from IPython.display import YouTubeVideo
%matplotlib inline

<div class="alert alert-block alert-warning">
<b>WARNING</b>
    
***  
If any of the imports raises an error, you need to install the module first. To do so:
1. Open a new cell in the notebook
2. Run <i>`!conda install packageName --yes`</i> or <i>`!conda install -c conda-forge packageName --yes`</i> or <i>`!pip install packageName`</i>
3. Import again!
<br><br>
Example: <i>how to solve an import error for the json2html module</i>

![region.png](img/errorImporting.gif)

## 3. Tide Gauges (TG) data
[Go back to the "Table of Contents"](#Table-of-Contents)

Tide Gauges are fixed platforms that measure sea level over time.<br>
The next video from MLA College shows how these platforms look, work and are deployed: `run the next cell`

In [None]:
YouTubeVideo('IUhrY1NfFxA', width="100%", height=500)

Let's see the data reported by one of the available Tide Gauges in the GLO region. 

### 3.1. Reading file

`Run the next cell` to see the tide gauges files already available in the `/data/nc_files/TG` folder:

In [None]:
dir_TG = os.path.join(os.getcwd(),'data','nc_files','TG') 
os.listdir(dir_TG)
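
`os.listdir` returns every entry in the folder, whatever its type. As a minimal sketch, assuming you only want the netCDF files and that the folder may be missing on your machine, a small helper can filter the listing. The name `list_netcdf_files` is hypothetical, not part of the training material:

```python
from pathlib import Path


def list_netcdf_files(folder):
    """Return the sorted names of the .nc files found directly inside `folder`."""
    folder = Path(folder)
    if not folder.is_dir():
        return []  # missing folder: nothing to list
    return sorted(p.name for p in folder.glob("*.nc"))


# Same folder layout as the cell above, relative to the current working directory
tg_folder = Path.cwd() / "data" / "nc_files" / "TG"
print(list_netcdf_files(tg_folder))
```

An empty list means either that the folder does not exist or that it holds no `.nc` files, so it is worth checking the path first in that case.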

`Set one` of the `file names` listed above and `run the next cells`:

In [None]:
file = 'IR_TS_TG_ElHierroTG.nc'
path = os.path.join(dir_TG, file)

In [None]:
ds = xarray.open_dataset(path)
ds.load()   # read the data into memory so it stays usable after the file is closed
ds.close()
ds

The output above gives an overview of the file content: variables, dimensions, coordinates, global attributes...
<br>For example, we can already see the platform's last reported position. Let's draw it on a map: `run the next cell`

In [None]:
# Last reported position of the platform, read from the global attributes
last_position = [float(ds.attrs['last_latitude_observation']),
                 float(ds.attrs['last_longitude_observation'])]
m = folium.Map(location=last_position, zoom_start=6)
folium.Marker(last_position, tooltip=ds.attrs['platform_code']).add_to(m)
m

<div class="alert alert-block alert-warning">
<b>WARNING</b>
    
***  
If you do not see any map after running the previous cell, please try another browser (for example, Chrome).

<br>Let's now list the available variables: `run the next cell`

In [None]:
for var in ds.variables:
    # not every variable is guaranteed to carry a 'long_name' attribute
    print(var + ': ' + ds[var].attrs.get('long_name', 'n/a'))

Let's focus on one of the parameters: `run the next cell` to see its attributes:

In [None]:
param = 'SLEV'
ds[param][:,0]
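
The `[:,0]` index above is positional: it keeps every value along the first dimension and only the first entry along the second. The sketch below uses a small synthetic dataset, assuming the dimensions are named `TIME` and `DEPTH` (an assumption about the file layout; check the dataset overview printed earlier), to show that `isel` expresses the same selection by dimension name:

```python
import numpy as np
import pandas as pd
import xarray

# Synthetic stand-in for a tide gauge file: 4 daily samples, 1 depth level.
# The dimension names TIME and DEPTH are assumptions, not read from a real file.
time = pd.date_range("2018-01-01", periods=4, freq="D")
demo = xarray.Dataset(
    {"SLEV": (("TIME", "DEPTH"), np.array([[0.10], [0.25], [0.40], [0.15]]))},
    coords={"TIME": time},
)

by_position = demo["SLEV"][:, 0]      # all times, first depth level
by_name = demo["SLEV"].isel(DEPTH=0)  # same selection, dimension named explicitly

print(by_position.dims)               # only the time dimension remains
print(by_name.equals(by_position))    # both selections hold the same values
```

Selecting by name tends to be easier to read and does not depend on the order of the dimensions.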

Let's have a look at the whole time series of the parameter: `run the next cell`

In [None]:
ds[param][:,0].plot(aspect=2, size=10, color='k', marker='o')

### 3.2. Subsetting Operations

Let's select a specific time range:

In [None]:
start = '2018-01-01'
end = '2018-03-30'

In [None]:
subset = ds[param][:,0].sel(TIME=slice(start, end))
subset.plot(aspect=2, size=10, color='k', marker='o')
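
Label-based slices in `sel` include both bounds, unlike positional Python slices. A minimal sketch on synthetic daily data (the values and the `TIME` coordinate name are illustrative assumptions):

```python
import numpy as np
import pandas as pd
import xarray

# Ten synthetic daily values, 0.0 to 9.0, indexed by a TIME coordinate
time = pd.date_range("2018-01-01", "2018-01-10", freq="D")
demo = xarray.DataArray(
    np.arange(time.size, dtype=float),
    coords={"TIME": time},
    dims="TIME",
    name="SLEV",
)

# Both the start and the end label are part of the selection
window = demo.sel(TIME=slice("2018-01-03", "2018-01-05"))
print(window["TIME"].values)
print(window.values)
```

Here the selection holds three days: the 3rd, the 4th and the 5th of January.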

<div class="alert alert-block alert-success">
<b>EMMA STORM</b>
    
***  
Do you see anything peculiar in the overall sea level time series between 2018/02/26 and 2018/03/05? 
<br>Yes! There is a period of 'high seas' or 'high tides' caused by Storm Emma, a low-pressure event that caused much damage on the coasts of Spain, Portugal and the UK back in 2018. Read more about this episode!

### 3.3. Sampling Operations

We will upsample and downsample the subset above, not the original series, to better see the differences.

#### 3.3.1. Selecting Good data (QC flags)

Is there any bad data in the above time series? `Run the next cell` to check the quality flags assigned to the parameter:

In [None]:
subset_QC = ds[param+'_QC'][:,0].sel(TIME=slice(start, end))  # quality flags for the same time range as the subset
subset_QC.plot(aspect=2, size=5)

Every In Situ TAC variable is paired with another variable of the same name plus the '_QC' suffix. This 'twin' variable contains a quality flag for each value of the paired variable. <br>
Let's check all possible 'QC' values: `run the next cell`

In [None]:
pd.DataFrame(data=ds[param+'_QC'][:,0].attrs['flag_values'],
             index=ds[param+'_QC'][:,0].attrs['flag_meanings'].split(' '), 
             columns=['quality flag'])

From the above list, users are recommended to use only the data flagged as 1. To be safe, when working with any of the variables we will therefore first clean the data and keep only the 'good data'. See how to do it by `running the next cells`:

In [None]:
subset_good = subset.where(subset_QC == 1)

In [None]:
subset.plot(aspect=2, size=10, color='r', marker='o')  # red: all data; points left uncovered are not flagged as good
subset_good.plot(color='k', marker='o')  # black: good data only, drawn on top
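
To see what `where` does to the values themselves, here is a minimal sketch on synthetic data. The values and flags are made up for illustration; only the rule "keep flag 1" comes from the text above:

```python
import numpy as np
import pandas as pd
import xarray

time = pd.date_range("2018-01-01", periods=5, freq="D")
# Made-up sea level values with one obvious outlier, plus made-up quality flags
values = xarray.DataArray(
    [0.10, 0.20, 9.99, 0.30, 0.25], coords={"TIME": time}, dims="TIME", name="SLEV"
)
flags = xarray.DataArray(
    [1, 1, 4, 1, 3], coords={"TIME": time}, dims="TIME", name="SLEV_QC"
)

# where() keeps a value only if the condition holds and writes NaN otherwise
good = values.where(flags == 1)
print(good.values)
print(int(good.notnull().sum()), "good values out of", good.size)
```

The array keeps its original length: rejected values become NaN rather than being dropped, so the time axis stays intact for plotting and resampling.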

#### 3.3.2. Upsampling/Downsampling

<ul> <li>Downsampling </li></ul>
Let's look at a downsampling example next, that is, retrieving fewer observations in a given period by aggregating the original ones in some way <i>(e.g. taking their mean)</i>. Let's get a weekly resolution by averaging: `run the next cell`

In [None]:
(subset_good.resample(TIME='1W').mean()).plot(aspect=2, size=10, color='k', marker='o')
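
As a minimal sketch of weekly averaging on synthetic data (the values are illustrative), fourteen daily samples collapse into one mean per week. With the weekly frequency used here, the bins end on Sunday, and 2018-01-01 is a Monday, so the two weeks are complete:

```python
import numpy as np
import pandas as pd
import xarray

# Fourteen synthetic daily values, 0.0 to 13.0, starting on Monday 2018-01-01
time = pd.date_range("2018-01-01", periods=14, freq="D")
daily = xarray.DataArray(
    np.arange(14, dtype=float), coords={"TIME": time}, dims="TIME", name="SLEV"
)

# One mean value per week; each result is labelled with the Sunday closing its week
weekly = daily.resample(TIME="1W").mean()
print(weekly["TIME"].values)
print(weekly.values)  # mean of 0..6 and mean of 7..13
```

Note that `mean` skips NaN values by default, so a week containing some rejected data is averaged over the remaining good values.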

<ul><li>Upsampling</li></ul>

Let's look at an upsampling example next, that is, retrieving more observations in a given period by inferring new ones in some way <i>(e.g. interpolating the original observations)</i>. Let's get a 0.5 minute resolution by interpolating:

In [None]:
(subset_good.resample(TIME='30s').interpolate('linear')).plot(aspect=2, size=10, color='k', marker='o')  # 30 s = 0.5 min
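
The sketch below shows the idea of linear upsampling on two made-up samples. It uses pandas rather than xarray to keep the example small; this is a stand-in for the cell above, not the same call:

```python
import pandas as pd

# Two made-up samples taken one minute apart
series = pd.Series(
    [1.0, 2.0],
    index=pd.to_datetime(["2018-01-01 00:00:00", "2018-01-01 00:01:00"]),
)

# Upsample to a 30 second step and fill the new time stamp by linear interpolation
upsampled = series.resample("30s").interpolate("linear")
print(upsampled)
```

The value created at 00:00:30 lies halfway between its two neighbours. Interpolated points are estimates, not measurements, so they should not be treated as new observations.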

### 3.4. Exporting data to csv

Let's export the cleaned time series subset to CSV.

1) `Run the next cell` to create a dataframe:

In [None]:
dataframe = subset_good.to_dataframe()
dataframe.transpose()

2) `Run the next cell` to reset the time axis to readable dates:

In [None]:
df_with_readable_time = dataframe.set_index(dataframe.index.astype(str).str[:19])
df_with_readable_time.transpose()

3) Export the dataframe to CSV: `run the next cells`!

In [None]:
out_put_dir = os.getcwd()  # by default: current working directory. Set a different path if you want

In [None]:
file_name = ds.attrs['platform_code']+'_time_serie.csv'
file_name

In [None]:
df_with_readable_time.to_csv(os.path.join(out_put_dir, file_name))
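
One way to check an export is to read the file back. The sketch below is a self-contained round trip on a made-up two-row dataframe written to a temporary folder; the file name `demo_time_serie.csv` and the column names are illustrative assumptions, not the ones produced above:

```python
import os
import tempfile

import pandas as pd

# Made-up dataframe with a string time index, similar in shape to the export above
df = pd.DataFrame(
    {"SLEV": [0.10, 0.25]},
    index=pd.Index(["2018-01-01 00:00:00", "2018-01-01 01:00:00"], name="TIME"),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "demo_time_serie.csv")
    df.to_csv(csv_path)
    # index_col restores the time index; parse_dates turns the strings back into dates
    restored = pd.read_csv(csv_path, index_col="TIME", parse_dates=True)

print(restored)
print(type(restored.index).__name__)
```

To check your own export the same way, point `read_csv` at the file written by the cell above and use the name of its index column.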

<div class="alert alert-block alert-info" style="margin-left: 2em">
<b>TIP</b>
    
***  
Check your output directory for the exported file and inspect its content!

---



## 4. Wrap-up
[Go back to the "Table of Contents"](#Table-of-Contents)

By now you should know how to deal with time series data from fixed platforms (tide gauges, moorings, river flows...).<br> `If you don't, please ask us! Now is the moment!`