<img src='./img/LogoWekeo_Copernicus_RGB_0.png' alt='' align='centre' width='30%'></img>

***

# COPERNICUS MARINE BIO BLACK-SEA TRAINING

<div style="text-align: right"><i> 07-03-BIO </i></div>


    License: This code is offered as open source and free-to-use in the public domain, 
             with no warranty, under the MIT license associated with this code repository.


***

<center><h1> How to visualize maps, sections of nutrients, chlorophyll, oxygen and CO2 in the Black-Sea </h1></center>

***
**General Note 1**: Execute each cell through the <button class="btn btn-default btn-xs"><i class="icon-play fa fa-play"></i></button> button from the top MENU (or keyboard shortcut `Shift` + `Enter`).<br>
<br>
**General Note 2**: If, for any reason, the kernel is not working anymore, in the top MENU, click on the <button class="btn btn-default btn-xs"><i class="fa fa-repeat icon-repeat"></i></button> button. Then, in the top MENU, click on "Cell" and select "Run All Above Selected Cell".<br>

***
# Table of contents
- [1. Introduction](#1.-Introduction)
- [2. About the data](#2.-About-the-data)
- [3. Required Python modules](#3.-Required-Python-modules)
- [4. Download data with HDA](#4.-Download-data-with-HDA)
- [5. Exercise n.1: Vertical profiles](#5.-Exercise-n.1:-Vertical-profiles)
- [6. Exercise n.2: Plot of transects](#6.-Exercise-n.2:-Plot-of-transects)
- [7. Conclusion](#7.-Conclusion)
***

# 1. Introduction

[Go back to the "Table of contents"](#Table-of-contents)

The objective of this exercise is to use the Copernicus Marine (CMEMS) BIOgeochemical products to visualize some typical coastal biogeochemical features in the Black Sea.

In particular, we will display:
 
- exercise 1: typical vertical profiles of oxygen and of pH
- exercise 2: plot of transects

We will use the near-real-time (NRT) products because they already use the latest CMEMS naming conventions. After July 2020, the multi-year (MY) product will use the same conventions.
***

# 2. About the data

[Go back to the "Table of contents"](#Table-of-contents)

## Model description

### This example is based on the product: [EO:MO:DAT:BLKSEA_ANALYSIS_FORECAST_BIO_007_010](https://moi.wekeo.eu/data?view=dataset&dataset=EO%3AMO%3ADAT%3ABLKSEA_ANALYSIS_FORECAST_BIO_007_010)

**BLKSEA_ANALYSIS_FORECAST_BIO_007_010** is the nominal product of the Black Sea Biogeochemistry NRT system and is generated by the NEMO-BAMHBI modelling system. The Biogeochemical Model for Hypoxic and Benthic Influenced areas (BAMHBI) is an innovative biogeochemical model with a 28-variable pelagic component (including the carbonate system) and a 6-variable benthic component; it explicitly represents processes in the anoxic layer. The product provides analyses and forecasts of the 3D concentrations of chlorophyll, nutrients (nitrate and phosphate), dissolved oxygen, phytoplankton carbon biomass, net primary production, pH, dissolved inorganic carbon and total alkalinity, and of the 2D fields of bottom oxygen concentration (for the North-Western shelf), surface partial pressure of CO2 and surface flux of CO2. These variables are computed on the same grid as the PHY product, at ~3 km horizontal resolution with 31 vertical levels, and are provided as daily and monthly means.

<img src="https://wekeo-broker.apps.mercator.dpi.wekeo.eu/previews/EO_MO_DAT_BLKSEA_ANALYSIS_FORECAST_BIO_007_010_bs-ulg-pft-an-fc-m.png">

## Get more info on the product
You can find more information on this product, and access the download services, in the [products viewer on Wekeo](https://moi.wekeo.eu/data?view=dataset&dataset=EO%3AMO%3ADAT%3ABLKSEA_ANALYSIS_FORECAST_BIO_007_010).
<br><br>

## Parameters used for downloading the data
| Parameter | Value |
| :---: | :---|
| **Product** | BLKSEA_ANALYSIS_FORECAST_BIO_007_010 |
| **Datasets** | <ul><li>bs-ulg-bio-an-fc-m</li><li>bs-ulg-pft-an-fc-m</li><li>bs-ulg-nut-an-fc-m</li></ul> |
| **Frequency** | monthly |
| **Lat min** | 40.86 |
| **Lat max** | 46.80 |
| **Lon min** | 27.37 |
| **Lon max** | 41.96 |
| **Timesteps** | 2019-01-01, 2019-09-01 |
| **Service for downloading** | HDA (FTP) |
| **Files total dimension** | ~40 MB |

<div class="alert alert-block alert-warning">
    <b>Get the WEkEO User credentials</b>
<hr>
If you want to download the data to use this notebook, you will need WEkEO User credentials. If you do not have these, you can register <a href="https://www.wekeo.eu/web/guest/user-registration" target="_blank">here</a>.

***

# 3. Required Python modules

[Go back to the "Table of contents"](#Table-of-contents)

The table below lists the Python modules imported to run the notebook's code. They are all widely used for handling scientific data.

| Module name | Description |
| :---: | :---|
| **os** | [ Miscellaneous operating system interfaces](https://docs.python.org/3.7/library/os.html) for managing paths, creating directories,... |
| **sys** | [sys](https://docs.python.org/3/library/sys.html) for accessing variables used by the interpreter |
| **json** | [json](https://docs.python.org/3/library/json.html) is for manipulating JSON files |
| **requests** | [requests](https://requests.readthedocs.io/en/master/) is for sending HTTP requests |
| **numpy** | [NumPy](https://numpy.org/) is the fundamental package for scientific computing with Python and for managing ND-arrays |
| **xarray** | [Xarray](http://xarray.pydata.org/en/stable/) introduces labels in the form of dimensions, coordinates and attributes on top of raw NumPy-like arrays, which allows for a more intuitive, more concise, and less error-prone developer experience. |
| **matplotlib** |[Matplotlib](https://matplotlib.org/) is a Python 2D plotting library which produces publication quality figures |

### Code cells allow you to enter and run Python code 
Run a code cell using <code>Shift-Enter</code> or pressing the <button class="btn btn-default btn-xs"><i class="icon-play fa fa-play"></i></button> button in the toolbar above:

## Import the modules

To keep warning messages from cluttering the output during execution and installation, filter them out first:

In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
import os
import sys
import json
import requests
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt

If you don't have the right module, please install it with the command:
```
conda install module_name
```
and then re-try to execute the cell for importing it. **Please install the modules one by one**.

***

# 4. Download data with HDA

### Install the WEkEO HDA client

The WEkEO HDA client is a Python-based library. It provides support for both Python 2.7.x and Python 3.

To install the WEkEO HDA client with the package management system pip on Unix/Linux, run the command shown below.

In [None]:
pip install -U hda

Please verify the following requirements are installed before skipping to the next step:
   - Python 3
   - requests
   - tqdm

#### Load WEkEO HDA client

The hda client provides a fully compliant Python 3 client that can be used to search and download products using the Harmonized Data Access WEkEO API.
HDA is a RESTful interface allowing users to search and download WEkEO datasets.
Documentation about its usage can be found at https://www.wekeo.eu/.

In [None]:
from hda import Client

<hr>

### Configure the WEkEO API Authentication

To interact with WEkEO's Harmonised Data Access API, first make sure that the file "$HOME/.hdarc" exists and contains the URL of the API end point together with your user name and password.

To check for the file, open a terminal and look for `.hdarc` in your $HOME directory.

If the file is missing, create "$HOME/.hdarc" (in your Unix/Linux environment) and fill it in with the credentials of your WEkEO account.

If you don't have a WEkEO account, please register at the WEkEO registration page https://my.wekeo.eu/web/guest/user-registration.
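As a minimal sketch of the check described above, the snippet below looks for `.hdarc` in the home directory and reads its `key: value` lines. The key names (`url`, `user`, `password`) are assumed from the description in this section; the helper `read_hdarc` and the placeholder values are hypothetical and only serve the illustration.

```python
from pathlib import Path


def read_hdarc(text):
    """Parse 'key: value' lines into a dict (hypothetical helper, not part of hda)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # split on the first ':' only, so a URL value keeps its own colons
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()
    return config


# Placeholder content: replace the values with your own end point and credentials
example_hdarc = """
url: <URL_OF_THE_API_END_POINT>
user: <YOUR_WEKEO_USERNAME>
password: <YOUR_WEKEO_PASSWORD>
"""

hdarc_path = Path.home() / ".hdarc"
if hdarc_path.exists():
    settings = read_hdarc(hdarc_path.read_text())
    print("Found", hdarc_path, "with keys:", sorted(settings))
else:
    print(hdarc_path, "not found; expected keys:", sorted(read_hdarc(example_hdarc)))
```

Only the key names are printed, never the values, so the password does not end up in the notebook output.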

<hr>

In [None]:
download_dir_path = os.path.join(os.getcwd(),'products', "07-03")

# make the output directory if required
if not os.path.exists(download_dir_path):
    os.makedirs(download_dir_path)

This tutorial covers three different datasets. To look at all of them, uncomment the line for the dataset you want to download (remove the leading # and put it back on the other two lines), then run the remaining cells in this section. Repeat this for each dataset.

In [None]:
dataset_id = "EO:MO:DAT:BLKSEA_ANALYSIS_FORECAST_BIO_007_010"

## Uncomment the product together with its variable

#product = "bs-ulg-pft-an-fc-m";variable="chl"
#product = "bs-ulg-bio-an-fc-m";variable="o2"
product = "bs-ulg-nut-an-fc-m";variable="no3"

The request parameters, as described in the previous section, are given below. You can prepare such a request with the data access feature of the WEkEO data viewer.

In [None]:
query = {
  "datasetId": dataset_id+":"+product,
  "boundingBoxValues": [
    {
      "name": "bbox",
      "bbox": [
        36.29404162525542,
        43.07629007394045,
        38.025762720953,
        44.03246025765181
      ]
    }
  ],
  "dateRangeSelectValues": [
    {
      "name": "position",
      "start": "2020-09-02T00:00:00.000Z",
      "end": "2020-11-02T00:00:00.000Z"
    }
  ],
  "multiStringSelectValues": [
    {
      "name": "variable",
      "value": [
        variable
      ]
    }
  ],
  "stringChoiceValues": [
    {
      "name": "service",
      "value": "BLKSEA_ANALYSIS_FORECAST_BIO_007_010-TDS"
    },
    {
      "name": "product",
      "value": product
    },
    {
      "name": "startDepth",
      "value": "22.66373634338379"
    },
    {
      "name": "endDepth",
      "value": "78.25402069091797"
    }
  ]
}

In [None]:
print('Downloading data...')
c = Client(debug=True)

matches = c.search(query)
print(matches)
matches.download()
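Instead of editing the selection cell three times, the three requests could be prepared in one pass. The sketch below only builds the query dictionaries and does not contact the server; `build_query` is a hypothetical helper that reuses the bounding box, dates and depths of the query above.

```python
dataset_id = "EO:MO:DAT:BLKSEA_ANALYSIS_FORECAST_BIO_007_010"

# product / variable pairs listed in the selection cell above
pairs = [
    ("bs-ulg-pft-an-fc-m", "chl"),
    ("bs-ulg-bio-an-fc-m", "o2"),
    ("bs-ulg-nut-an-fc-m", "no3"),
]


def build_query(product, variable):
    """Return a query with the same layout as the one above (hypothetical helper)."""
    return {
        "datasetId": dataset_id + ":" + product,
        "boundingBoxValues": [
            {"name": "bbox",
             "bbox": [36.29404162525542, 43.07629007394045,
                      38.025762720953, 44.03246025765181]}
        ],
        "dateRangeSelectValues": [
            {"name": "position",
             "start": "2020-09-02T00:00:00.000Z",
             "end": "2020-11-02T00:00:00.000Z"}
        ],
        "multiStringSelectValues": [{"name": "variable", "value": [variable]}],
        "stringChoiceValues": [
            {"name": "service", "value": "BLKSEA_ANALYSIS_FORECAST_BIO_007_010-TDS"},
            {"name": "product", "value": product},
            {"name": "startDepth", "value": "22.66373634338379"},
            {"name": "endDepth", "value": "78.25402069091797"},
        ],
    }


queries = [build_query(product, variable) for product, variable in pairs]
for q in queries:
    print(q["datasetId"])
    # With valid credentials, each query could then be submitted as in the cell above:
    #   matches = c.search(q)
    #   matches.download()
```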

***

# 5. Exercise n.1: Vertical profiles

[Go back to the "Table of contents"](#Table-of-contents)

## Scientific question

What do typical vertical profiles of oxygen and pH look like in the Black Sea?

*Don't change the following constants, which define the training and the notebook codes*:

In [None]:
REGION_CODE = "07"
NB_CODE = "03"

**checkDir**: function for creating a path, if needed

In [None]:
def checkDir(outPath):
    if not os.path.exists(outPath):
        os.makedirs(outPath)

**getRangeIndexes**: function for getting the indexes of the array *arr* between the *var_min* and *var_max* values:

In [None]:
def getRangeIndexes(arr, var_min, var_max):
    return np.where((arr >= var_min) & (arr <= var_max))[0]
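A quick check of `getRangeIndexes` on a made-up depth array (the values are illustrative and are not the model's real levels). Both bounds are inclusive:

```python
import numpy as np


def getRangeIndexes(arr, var_min, var_max):
    # same definition as in the cell above, repeated so the example runs on its own
    return np.where((arr >= var_min) & (arr <= var_max))[0]


# illustrative depth levels in metres (not taken from the product)
example_depths = np.array([2.5, 7.5, 12.5, 25.0, 50.0, 100.0, 250.0])

idx = getRangeIndexes(example_depths, 10, 100)
print(idx)                  # positions of the levels inside [10, 100]
print(example_depths[idx])  # the matching depth values
```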

Set the paths:

In [None]:
ex_code = REGION_CODE+'-'+NB_CODE

In [None]:
# Path for netcdf files
data_path = './products/'+ex_code
# Path for the output files (images, etc)
out_path = './out/'+ex_code

Create the directories if they do not exist yet:

In [None]:
# Create directories
checkDir(data_path)
checkDir(out_path)

Check if the new directories have been created... 

In [None]:
for filename in os.listdir('.'):
    print(filename)

... and if the data files are available:

In [None]:
for filename in os.listdir(data_path):
    print(filename)

## 5.1. Access the data

In [None]:
# Input netcdf file (please note - if you haven't downloaded all three different data sets, you'll need to comment the relevant lines below 
#  or go back to the section above to download the data.)

# Monthly means of January and September 2019
nut_f = ["20190115_m-ULg--NUTR-nemo_bamhbi-BS-b20200603_an-sv09.00.nc",
         "20190915_m-ULg--NUTR-nemo_bamhbi-BS-b20200603_an-sv09.00.nc"]
pft_f = ["20190115_m-ULg--PFTC-nemo_bamhbi-BS-b20200603_an-sv09.00.nc",
         "20190915_m-ULg--PFTC-nemo_bamhbi-BS-b20200603_an-sv09.00.nc"]
bio_f = ["20190115_m-ULg--BIOL-nemo_bamhbi-BS-b20200603_an-sv09.00.nc",
         "20190915_m-ULg--BIOL-nemo_bamhbi-BS-b20200603_an-sv09.00.nc"]

nut_nc = [os.path.join(data_path, f) for f in nut_f]
pft_nc = [os.path.join(data_path, f) for f in pft_f]
bio_nc = [os.path.join(data_path, f) for f in bio_f]

In [None]:
# Open the nc datasets
nut_ds = [ xr.open_dataset(nc) for nc in nut_nc]
pft_ds = [ xr.open_dataset(nc) for nc in pft_nc]
bio_ds = [ xr.open_dataset(nc) for nc in bio_nc]

Set the array index for accessing the desired dataset in the datasets arrays:

In [None]:
# index for the ds arrays
# 0 = winter, 1 = summer
j = 0

### 5.1.1. Get info about the dataset of **"Nitrate and Phosphate 3D"**:

In [None]:
nut_ds[j].info

The products used in this training session are NetCDF files (.nc files). NetCDF is a common format for storing scientific data. A NetCDF file contains:
- The **dimensions** of the data (here depth, latitude, longitude and time)
- Several **variables** depending on one or more of these dimensions.
    - Each variable comes with its own attributes such as units, long_name, FillValue...
- General information about the product (**Attributes**)
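To make this structure concrete, here is a small sketch that builds a synthetic xarray dataset with the same kind of layout and then inspects its dimensions, variables and attributes. All values, units and attribute texts are made up for the illustration and are not taken from the CMEMS files.

```python
import numpy as np
import xarray as xr

# Tiny synthetic dataset (made-up values) laid out like the product files
toy_ds = xr.Dataset(
    data_vars={
        "no3": (
            ("time", "depth", "latitude", "longitude"),
            np.zeros((1, 3, 2, 4)),
            {"units": "mmol m-3", "long_name": "Nitrate (synthetic)"},
        )
    },
    coords={
        "time": np.array(["2019-01-15"], dtype="datetime64[ns]"),
        "depth": [2.5, 7.5, 12.5],
        "latitude": [43.0, 43.5],
        "longitude": [30.0, 30.5, 31.0, 31.5],
    },
    attrs={"title": "Toy example, not a CMEMS product"},
)

print(dict(toy_ds.sizes))      # dimensions and their lengths
print(list(toy_ds.data_vars))  # variable names
print(toy_ds["no3"].attrs)     # per-variable attributes
print(toy_ds.attrs)            # global attributes
```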

And about the variables:

In [None]:
nut_ds[j].data_vars

... and the coordinates:

In [None]:
nut_ds[j].coords

<div class="alert alert-block alert-warning">
    
**The dataset is 3D! You have depth levels!**

The xarray dataset ***nut_ds*** is extracted from a **3D dataset**        

Try to check the depth levels. Type ```nut_ds[j].depth``` in the cell below and execute it:

### 5.1.2. Get info about the dataset for **"Phytoplankton Carbon Biomass and Chlorophyll"**:

In [None]:
# please note - if you haven't downloaded all three data sets, you may not be able to run this section. 
# If you get errors saying data isn't found, go back to section 4 to download the relevant data.

pft_ds[j].info

And about the variables:

In [None]:
pft_ds[j].data_vars

... and the coordinates:

In [None]:
pft_ds[j].coords

### 5.1.3. Get info about the dataset for **"Primary Production and Oxygen"**:

In [None]:
bio_ds[j].info

And about the variables:

In [None]:
bio_ds[j].data_vars

... and the coordinates:

In [None]:
bio_ds[j].coords

Press the ***TAB*** key (on your keyboard) for obtaining the other methods and properties of the dataset: 

Select the cell below, press enter and then type:
```
bio_ds[j]. (and press TAB)
```

## 5.2. Set the configuration

Check the coordinate names in the cells above (see the ***ds.coords*** outputs) and set the corresponding variables below:

In [None]:
# Set the coordinate names (used later for accessing the data)
lon_name = "longitude"
lat_name = "latitude"
time_name = "time"
depth_name = "depth"

Do the same for the variable names (see the ***ds.data_vars*** outputs): 

In [None]:
# Set the variable names

# mass_concentration_of_chlorophyll_a_in_sea_water (CHL)
chl_name = "chl"

# mole_concentration_of_dissolved_molecular_oxygen_in_sea_water (O2)
oxy_name = "o2"

# mole_concentration_of_nitrate_in_sea_water (NO3)
no3_name = "no3"

# mole_concentration_of_phosphate_in_sea_water (PO4)
po4_name = "po4"

# net_primary_production_of_biomass_expressed_as_carbon_per_unit_volume_in_sea_water (PP)
npp_name = "nppv"

## 5.3. Plot the variables

### 5.3.1. Configure the variables for the plots

In [None]:
# selected latitude and longitude point 
lat_sel, lon_sel = 43, 32

In [None]:
# set the season: 0 = winter, 1 = summer
season_index = 0

# extract the columns for phyc and o2
phyc_column = np.squeeze(pft_ds[season_index].phyc.sel(latitude=lat_sel, longitude=lon_sel, method="nearest"))
o2_column = np.squeeze(bio_ds[season_index].o2.sel(latitude=lat_sel, longitude=lon_sel, method="nearest"))

# Plot configuration
width_inch = 16
height_inch = 8

# Axes labels
fontsize = 14
xlabel = 'longitude [deg]'
ylabel = 'latitude [deg]'
xlabelpad = 30
ylabelpad = 60

# Colorbar configuration
cmap = "jet"
cbar_position = "right"

title_fontstyle = {
    "fontsize": "14",
    "fontstyle": "italic",
    "fontweight": "bold",
    "pad": 30
}

label_fontstyle = {
    "fontsize": "12",
    "labelpad": 30
}
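The column extraction above relies on `.sel(..., method="nearest")`, which returns the grid node closest to the requested point rather than failing when the point is not on the grid. A minimal sketch on a synthetic array (made-up coordinates and values):

```python
import numpy as np
import xarray as xr

# Synthetic 3-D field (depth, latitude, longitude); values are made up
toy = xr.DataArray(
    np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4),
    dims=("depth", "latitude", "longitude"),
    coords={
        "depth": [5.0, 15.0],
        "latitude": [42.5, 43.1, 43.7],
        "longitude": [31.0, 31.6, 32.2, 32.8],
    },
)

# Ask for (43, 32): neither value is on the grid, so the nearest node is used
column = toy.sel(latitude=43, longitude=32, method="nearest")

print(float(column.latitude), float(column.longitude))  # grid node actually used
print(column.dims)                                       # only depth remains
```

Printing the coordinates of the returned column is a simple way to see how far the selected node is from the point you asked for.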

### 5.3.2 Plot the variables

In [None]:
fig, ax1 = plt.subplots(figsize=(width_inch, height_inch))

# set depths as negative
depths_column = -o2_column.depth

# define the signals
sig1 = o2_column
sig2 = phyc_column

# set the colors
color1 = 'r'
color2 = 'b'

# plot sig1
ax1.plot(sig1,depths_column,color1)
ax1.set_xlabel('o2', fontsize=14, color=color1)
ax1.set_ylabel("Depth [m]", fontsize=14)
ax1.tick_params(axis='x', labelcolor=color1)  # colour the o2 (bottom) x-axis ticks
ax1.set_ylim([-300, 0])
ax1.grid()

# plot sig2
ax2 = ax1.twiny()
ax2.plot(sig2, depths_column, color2)
ax2.set_xlabel("phyc", fontsize=14, color=color2)
ax2.tick_params(axis='x', labelcolor=color2)  # colour the phyc (top) x-axis ticks

# set title
title = "phyc and o2 vertical profiles"
plt.title(title, **title_fontstyle)

# output file
output_file = os.path.join(out_path,title.replace(' ','_')) + ".png"

# save the output file
plt.savefig(output_file)

plt.show()

plt.close()

***

# 6. Exercise n.2: Plot of transects

[Go back to the "Table of contents"](#Table-of-contents)

## 6.1. Define the area of interest

Select a fixed latitude and a range of longitudes. You can also choose the range of depths.

In [None]:
depth_min = 0
depth_max = 200

lat_point = 43

lon_min = 27.5
lon_max = 42

## 6.2. Access the data and set the configuration

In [None]:
# index for the ds arrays
j = 0

# --- Choose the variable to plot (but comment the others with the symbol #): ---
var_sel = nut_ds[j][no3_name]
#var_sel = pft_ds[j][chl_name]
#var_sel = bio_ds[j][oxy_name]

# Set the variable min and max values for the plot and the colorbar (otherwise assign None):
min_value, max_value = 0, 30
#min_value, max_value = None, None
#min_value, max_value = 260, 330


dataset_3D = False
if depth_name in var_sel.coords:
    dataset_3D = True

# --- Set up the arrays of coordinates for the selected dataset ---
# 
lats = var_sel[lat_name]
lons = var_sel[lon_name]
times = var_sel[time_name]
depths = var_sel[depth_name]

# Extract the coordinates subsets
ds = pft_ds[j] # the chosen dataset is not important here: we are only extracting the coordinates
lats_ds, lons_ds, depths_ds = ds[lat_name], ds[lon_name], ds[depth_name]

# Set the indexes of coordinates
depth_indexes = getRangeIndexes(depths_ds, depth_min, depth_max) if dataset_3D else [0]
lat_indexes = np.abs(lats_ds-lat_point).argmin()
lon_indexes = getRangeIndexes(lons_ds, lon_min, lon_max)
time_indexes = 0

lons_sel = ds[lon_name][lon_indexes]
lats_sel = ds[lat_name][lat_indexes]
depths_sel = -ds[depth_name][depth_indexes] # note the minus '-ds'. Why?

# Set the variables to plot
CHL = [ds[chl_name][time_indexes, depth_indexes, lat_indexes, lon_indexes] for ds in pft_ds]
OXY = [ds[oxy_name][time_indexes, depth_indexes, lat_indexes, lon_indexes] for ds in bio_ds]
NPP = [ds[npp_name][time_indexes, depth_indexes, lat_indexes, lon_indexes] for ds in bio_ds]
PHO = [ds[po4_name][time_indexes, depth_indexes, lat_indexes, lon_indexes] for ds in nut_ds]
NO3 = [ds[no3_name][time_indexes, depth_indexes, lat_indexes, lon_indexes] for ds in nut_ds]

# Set the variable min and max values for the plot and the colorbar (otherwise assign None):
min_value, max_value = None, None
# min_value, max_value = 0, 250
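The index-based extraction above can also be written with label-based selection. The sketch below compares both approaches on a synthetic array (made-up coordinates and random values); it assumes coordinates sorted in increasing order, which the slices passed to `.sel` rely on here.

```python
import numpy as np
import xarray as xr

toy = xr.DataArray(
    np.random.default_rng(0).random((1, 4, 3, 5)),
    dims=("time", "depth", "latitude", "longitude"),
    coords={
        "time": np.array(["2019-01-15"], dtype="datetime64[ns]"),
        "depth": [5.0, 50.0, 150.0, 400.0],
        "latitude": [42.0, 43.0, 44.0],
        "longitude": [27.0, 30.0, 33.0, 36.0, 43.0],
    },
)

depth_min, depth_max = 0, 200
lat_point = 43
lon_min, lon_max = 27.5, 42

# index-based, following the same steps as the cell above
depth_idx = np.where((toy.depth.values >= depth_min) & (toy.depth.values <= depth_max))[0]
lon_idx = np.where((toy.longitude.values >= lon_min) & (toy.longitude.values <= lon_max))[0]
lat_idx = int(np.abs(toy.latitude.values - lat_point).argmin())
by_index = toy[0, depth_idx, lat_idx, lon_idx]

# label-based alternative
by_label = (
    toy.isel(time=0)
    .sel(latitude=lat_point, method="nearest")
    .sel(depth=slice(depth_min, depth_max), longitude=slice(lon_min, lon_max))
)

print(by_index.shape, by_label.shape)
print(bool(np.allclose(by_index.values, by_label.values)))
```

The label-based form keeps the selection criteria readable in one expression; the index-based form makes the chosen positions reusable across several datasets that share a grid, which is what the notebook does.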

If you want to check a variable content, write its name in the cell below and press RUN (or shift-enter):

In [None]:
depths_sel

In [None]:
# Create the meshgrid for the plot 
X, Y = np.meshgrid(lons_sel, depths_sel)

# Plot configuration
width_inch = 14
height_inch = 8

# Axes labels
fontsize = 14
xlabel = "longitude [degE]"
ylabel = "depth [m]"

# Colorbar configuration
cmap = "jet"
cbar_position = "right"

contour_levels = 100

title_fontstyle = {
    "fontsize": "14",
    "fontstyle": "italic",
    "fontweight": "bold",
    "pad": 30
}
label_fontstyle = {
    "fontsize": "12",
#     "labelpad": 10
}

## 6.3. Define the plot function

In [None]:
# function for plotting the transect (it uses many "hidden" variables...)
def plot_transect(data, min_value, max_value,step_value):
    fig = plt.figure(figsize=(width_inch, height_inch))

    # Get the timestep
    timestep = np.datetime_as_string(data.time,'h')

    # Create the meshgrid for the plot 
    xx, yy = np.meshgrid(lons_sel, depths_sel)

    # set variable limits
    min_value = data.min() if min_value is None else min_value
    max_value = data.max() if max_value is None else max_value
    contour_levels = np.arange(min_value, max_value, step_value)
    
    ## contour fill
    plt.contourf(xx, yy, data, contour_levels, cmap=cmap, vmin=min_value, vmax=max_value)
    
    plt.grid()
    plt.colorbar(extend='both')

    title_sel = data.long_name
    var_str = "{} [{}]".format(title_sel, data.units)
    title = ' - '.join((var_str, timestep))

    plt.title(title, **title_fontstyle)
    plt.xlabel(xlabel, **label_fontstyle)
    plt.ylabel(ylabel, **label_fontstyle)

    # output file
    output_file = os.path.join(out_path,title.replace(' ','_')) + ".png"

    # save the output file
    plt.savefig(output_file)

    plt.show()

    plt.close()
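One detail of `plot_transect` worth knowing: `np.arange` excludes its stop value, so the highest contour level sits one step below `max_value`. The sketch below shows this with illustrative numbers and, as an alternative that the notebook does not use, builds levels that include the upper bound with `np.linspace`.

```python
import numpy as np

min_value, max_value, step_value = 0, 3, 0.5  # illustrative values

# levels as built in plot_transect: the stop value is excluded
levels_arange = np.arange(min_value, max_value, step_value)
print(levels_arange)

# alternative (an assumption, not the notebook's choice): include the upper bound
n_levels = int(round((max_value - min_value) / step_value)) + 1
levels_linspace = np.linspace(min_value, max_value, n_levels)
print(levels_linspace)
```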

## 6.4. Plot the variables along a transect

### We're now ready to plot the section along the chosen transect

### We plot a section of phytoplankton 

Phytoplankton, the autotrophic component of the marine ecosystem, needs light and nutrients to grow.

These drivers greatly vary throughout the year, so that the distribution of the phytoplankton in the water column has a seasonal cycle.

### A winter section of chlorophyll

Chlorophyll is often used to describe phytoplankton biomass. Note, however, that the CMEMS model also provides the carbon biomass, which is better suited to investigating the carbon cycle, carbon production and trophic transfer efficiency through the food web.

The winter section of chlorophyll shows 
 - the effect of the Danube input on the western-most part of the transect.
 - no chlorophyll below ~100 m depth
 - the surface layers are well mixed (vertically), and we observe a Deep Chlorophyll Maximum around 40-50m depth


In [None]:
data = CHL[0]

plot_transect(data, 0, 3, 0.01)
#plot_transect(data, None, None, 0.01)

### And in summer?
We select the second time frame of the chlorophyll dataset.

In [None]:
data = CHL[1]

plot_transect(data, 0, 3, 0.01)
#plot_transect(data, None, None, 0.01)

At the surface in summer, the chlorophyll content is much lower than in winter because the strong stratification prevents the vertical transport of nutrients. The Deep Chlorophyll Maximum is still present and occurs where there is still sufficient light as well as nutrients.

***

# 7. Conclusion

[Go back to the "Table of contents"](#Table-of-contents)

<div class="alert alert-block alert-success">
    <b>CONGRATULATIONS</b><br>
  
--- 

#### And thank you for your attention! :) We hope you enjoyed this training on the Black Sea biogeochemical model data provided through WEkEO by the Copernicus Marine Service, for free, thanks to the European Commission.

#### Now try to download new data and variables, then access and visualize them. You can make new maps and plots, and don't forget to try the others.

We'd love to hear from you about how we could improve it (topics, tools, storytelling, format, speed etc). 

We do thank you in advance for your kind collaboration :)

<img src='./img/all_partners_wekeo.png' alt='' align='center' width='75%'></img>

<p style="text-align:left;">This project is licensed under the <a href="./LICENSE">MIT License</a> <span style="float:right;"><a href="https://github.com/wekeo/wekeo-jupyter-lab">View on GitHub</a> | <a href="https://www.wekeo.eu/">WEkEO Website</a> | <a href=mailto:support@wekeo.eu>Contact</a></span></p>