# EUMETSAT Sentinel-3 OLCI over Coastal Waters - Neural Net ML algorithm

This notebook will introduce you to Python. You will learn how to import a netCDF file into your workspace, carry out some simple operations, and plot (map) an image. In this case we will use a Level-2 OLCI image, but the script can easily be adapted to plot any netCDF variable.

Sentinel-3's Ocean and Land Colour Instrument (OLCI) acquires spectral information on the colour of the oceans. These data can be used to monitor phytoplankton, the foundation of nearly all life in our seas. Ocean colour data are also useful for helping us to measure and track sediment and coloured dissolved organic matter (CDOM). 

Using two different algorithms to derive phytoplankton abundance (through Chl-a concentrations), we will assess OLCI Level-2 Water Full Resolution data (300 m) collected over (1) open oceans and (2) complex coastal waters.

### 1(a) Setting up your Code: Import the Libraries you will need. 

Libraries are usually code modules that perform specific tasks or provide specific capability (e.g. statistical analysis or plotting routines). In this case we will import the xarray library for handling netCDF files, the numpy library which will help to conduct various operations on the data, and the matplotlib plotting library to generate some images.

In [None]:
%matplotlib inline

# Import the libraries you need, each with its alias (short 'nickname')
# Data processing
import xarray as xr
import netCDF4 as nc
import numpy as np
import os

# Data vis/ mapping
import matplotlib.pyplot as plt
from matplotlib import gridspec
import matplotlib.colors as mcolors
import cartopy.crs as ccrs
import cartopy.feature as cfeature

import warnings
warnings.filterwarnings('ignore')

### 1(b) Setting up your Code: Define any Functions you may call. 

Usually we also define functions at the top of a Python script. Functions are routines that can be called elsewhere in our script and perform a specific task. Typically we would use a function to take care of any process that we are going to perform more than once. The box below defines a function that will mask our data according to quality flags. We will call this function later on.

In [None]:
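def flag_data_fast(flags_we_want, flag_names, flag_values, flag_data, flag_type = 'WQSF'):
    """Return a boolean array that is True wherever any of the requested flags is set."""
    # Pick an unsigned integer type that matches the flag field being read
    flag_bits = np.uint64()
    if flag_type == 'SST':
        flag_bits = np.uint8()
    elif flag_type == 'WQSF_lsb':
        flag_bits = np.uint32()

    # Combine the bit values of all requested flags into a single mask
    for flag in flags_we_want:
        try:
            flag_bits = flag_bits | flag_values[flag_names.index(flag)]
        except ValueError:
            # list.index raises ValueError when the flag name is not in flag_names
            print(flag + " not present")

    # A pixel is flagged if it shares at least one bit with the combined mask
    return (flag_data & flag_bits) > 0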
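
To see what the bitwise test in `flag_data_fast` does, here is a minimal sketch on made-up data. The three flag names, their bit values and the tiny flag array are invented for illustration; they are not the real WQSF masks.

```python
import numpy as np

# Hypothetical flag table: each flag owns one bit (values invented for this example)
toy_names = ['LAND', 'CLOUD', 'GLINT']
toy_values = [1, 2, 4]

# One flag word per pixel; a pixel can carry several flags at once
toy_flags = np.array([[0, 1, 2],
                      [4, 3, 6]], dtype=np.uint64)

# Combine the bits of the flags we want to mask (here CLOUD and GLINT)
wanted_bits = np.uint64(0)
for name in ['CLOUD', 'GLINT']:
    wanted_bits = wanted_bits | np.uint64(toy_values[toy_names.index(name)])

# A pixel is flagged when it shares at least one bit with the combined mask
toy_mask = (toy_flags & wanted_bits) > 0
print(toy_mask)
```

Pixels carrying only the LAND bit (value 1) or no flag at all stay `False`, while any pixel with the CLOUD or GLINT bit set becomes `True`.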

### 2. Data Processing

We only really start to run the script from this point. A key step is to correctly point Python to where you have saved your data.

In [None]:
pwd # Confirming your current path

#### Replace my path with your own path (from the pwd command above), with '/data' added at the end

In [None]:
# Path to your Sentinel-3 OLCI data
inpath = os.path.join('/Users/lbiermann1/Dropbox/2425_ipython_ynb/S3_OLCI_L2/data')
infile = 'S3B_OL_2_WFR____20241014T160151_20241014T160451_20241015T234749_0179_098_325_2520_MAR_O_NT_003.SEN3' 

In [None]:
filename_oc = 'chl_oc4me.nc'
filename_nn = 'chl_nn.nc'
file_latlon = 'geo_coordinates.nc'

Let's quickly make sure your path is fine, and the data are where you pointed the script to! 

In [None]:
# Check the length of your path name - some Windows systems break at long paths
if len(os.path.join(inpath, infile, "chl_nn.nc")) > 248:
    print('Beware, your path name is quite long. Consider moving your data to a new directory')
else:
    print('Your path name length is fine')
# check that you have pointed python to the right places
if os.path.exists(os.path.join(inpath,infile,"chl_nn.nc")):
    print('Found your required data file')
else:
    print('Data file missing. Please check your path and file name')
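
The same existence check can be extended to all three files imported in the next step. This is a minimal sketch, assuming a hypothetical helper called `missing_files`; it is demonstrated on a temporary folder rather than on your real `.SEN3` directory.

```python
import os
import tempfile

def missing_files(folder, required):
    """Return the names in `required` that are not present in `folder`."""
    return [name for name in required
            if not os.path.exists(os.path.join(folder, name))]

required = ['chl_oc4me.nc', 'chl_nn.nc', 'geo_coordinates.nc']

# Demonstrate on a throwaway folder that contains only one of the files
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, 'chl_nn.nc'), 'w').close()
    demo_missing = missing_files(demo_dir, required)

print(demo_missing)
```

With your own data you would call `missing_files(os.path.join(inpath, infile), required)`; an empty list means everything was found.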

#### Import your variables - CHL from OC4Me and NN algorithms, Latitude (lat), Longitude (lon)

In [None]:
# Latitude and Longitude from geo_coordinates
lat_lon = xr.open_dataset(os.path.join(inpath, infile, file_latlon))
lat = lat_lon.latitude
lon = lat_lon.longitude

# CHL concentrations calculated using the traditional OC4Me Algorithm
OLCI_oc = xr.open_dataset(os.path.join(inpath, infile, filename_oc))
OC4me = OLCI_oc.CHL_OC4ME.data

# CHL concentrations calculated using the Neural Net (ML) Algorithm
OLCI_nn = xr.open_dataset(os.path.join(inpath, infile, filename_nn))
CHLnn = OLCI_nn.CHL_NN.data

# Close holders
lat_lon.close()
OLCI_oc.close()
OLCI_nn.close()

#### We can plot our imported CHL data to check it looks right, and also to subset.

In [None]:
# Quick Check of your [CHL] Data
f, axarr = plt.subplots(1, 2)  # subplots(r, c): number of rows and columns
axarr[0].imshow(OC4me)
axarr[1].imshow(CHLnn)

In [None]:
# Subset using Rows and Column Values from above Plots
row1 = 0
row2 = 2000
col1 = 1600
col2 = 4800

In [None]:
# Now Subset your Data to speed up Processing and Plotting
LAT_ss = lat[row1:row2, col1:col2 ]
LON_ss = lon[row1:row2, col1:col2 ]
OC4_ss = OC4me[row1:row2,col1:col2]
CHL_ss = CHLnn[row1:row2,col1:col2]
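
One thing to be aware of: slicing a NumPy array returns a view, not a copy, so writing NaN into a subset later on also changes the full-size array it was cut from. A small sketch on a made-up array (the names `full`, `view_ss` and `copy_ss` are illustrative only):

```python
import numpy as np

full = np.arange(12, dtype=float).reshape(3, 4)

view_ss = full[0:2, 1:3]          # plain slice: shares memory with `full`
copy_ss = full[0:2, 1:3].copy()   # independent copy

copy_ss[0, 0] = np.nan            # does not touch `full`
print(np.isnan(full).any())       # False at this point

view_ss[0, 0] = np.nan            # also overwrites full[0, 1]
print(np.isnan(full[0, 1]))       # True
```

If you want to keep the unmasked full-size arrays untouched, one option is to add `.copy()` to the `OC4_ss` and `CHL_ss` subsetting lines.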

In [None]:
# Quick Check of your subset [CHL] Data
f, axarr = plt.subplots(1, 2)  # subplots(r, c): number of rows and columns
axarr[0].imshow(OC4_ss)
axarr[1].imshow(CHL_ss)

### 3. Data Visualisation

#### There are several different ways to plot these kinds of data - this is one way :)

In [None]:
# Create a new figure for the plot, with a size of 16x16 inches and a high resolution (300 dpi)
fig1 = plt.figure(figsize=(16, 16), dpi=300)

# Use a grid layout to place two maps side by side (1 row, 2 columns)
gs = gridspec.GridSpec(1, 2)

# Define the projection for your data (PlateCarree is often used for geographic data)
data_projection = ccrs.PlateCarree()

# Set up a higher-resolution land feature to make the land look more detailed on the map
land_poly = cfeature.NaturalEarthFeature('physical', 'land', '10m',
                                         edgecolor='k', facecolor='black')

# ----- Plot 1: CHL OC4ME Algorithm -----
#########################################

# Create the first subplot in the left panel (gs[0,0]) with a PlateCarree projection
ax = plt.subplot(gs[0, 0], projection=ccrs.PlateCarree(central_longitude=0.0))

# Plot your OC4ME data (LON_ss, LAT_ss, and OC4_ss are the longitude, latitude, and data arrays)
# We use vmin and vmax to set the minimum and maximum values for the color scale (using viridis colormap)
im = plt.pcolormesh(LON_ss, LAT_ss, OC4_ss, vmin=np.nanmin(OC4_ss), vmax=np.nanmax(OC4_ss),
                    cmap=plt.cm.viridis, transform=data_projection)

# Add coastlines and land features to make the map clearer
ax.coastlines(resolution='10m', color='black', linewidth=1)
ax.add_feature(land_poly)

# Add gridlines to the map (with labels for latitude and longitude)
g1 = ax.gridlines(draw_labels=True, linewidth=1, color='lightgray', alpha=0.5, linestyle='--')
g1.top_labels = False  # Don't show labels on the top of the plot
g1.right_labels = False  # Don't show labels on the right of the plot
g1.xlabel_style = {'size': 11, 'color': 'gray'}
g1.ylabel_style = {'size': 11, 'color': 'gray'}

# Add a color bar at the bottom of the map to show what the colors mean (representing chlorophyll levels)
cbar = plt.colorbar(im, orientation="horizontal", fraction=0.1, pad=0.03)
cbar.set_label('CHL_OC4ME mg.m$^{-3}$', fontsize=11)

# ----- Plot 2: CHL Neural Network Algorithm -----
##################################################

# Create the second subplot in the right panel (gs[0,1]) with a PlateCarree projection
ax = plt.subplot(gs[0, 1], projection=ccrs.PlateCarree(central_longitude=0.0))

# Plot your CHL Neural Network data (LON_ss, LAT_ss, and CHL_ss are the longitude, latitude, and data arrays)
# Again, we use vmin and vmax to define the color scale with the viridis colormap
im = plt.pcolormesh(LON_ss, LAT_ss, CHL_ss, vmin=np.nanmin(CHL_ss), vmax=np.nanmax(CHL_ss),
                    cmap=plt.cm.viridis, transform=data_projection)

# Add coastlines and land features to the map
ax.coastlines(resolution='10m', color='black', linewidth=1)
ax.add_feature(land_poly)

# Add gridlines to the map with labels
g1 = ax.gridlines(draw_labels=True, linewidth=1, color='lightgray', alpha=0.5, linestyle='--')
g1.top_labels = False  # Don't show labels on the top of the plot
g1.right_labels = False  # Don't show labels on the right of the plot
g1.xlabel_style = {'size': 11, 'color': 'gray'}
g1.ylabel_style = {'size': 11, 'color': 'gray'}

# Add a color bar at the bottom of the second map to represent chlorophyll levels from the Neural Network model
cbar = plt.colorbar(im, orientation="horizontal", fraction=0.1, pad=0.03)
cbar.set_label('CHL_NN mg.m$^{-3}$', fontsize=11)

# Display the plot
plt.show()


In [None]:
# Uncomment the line below to save your figure (1)
# fig1.savefig('OLCI__all_flags_off.png', bbox_inches='tight')

### 4. Call Function to Mask Flagged Data

In [None]:
# FLAG and MASK clouds
######################
inflag = 'wqsf.nc'
# Choose and Add Flags to Mask Cloud
flag_vars = ['CLOUD', 'CLOUD_AMBIGUOUS', 'CLOUD_MARGIN', 'HIGHGLINT']
FLAG_file = xr.open_dataset(os.path.join(inpath, infile, inflag))

# Flag names
flag_name = FLAG_file['WQSF'].flag_meanings.split(' ')
# flag bit values
flag_vals = FLAG_file['WQSF'].flag_masks
# flag field itself
FLAGS = FLAG_file.variables['WQSF'].data
FLAG_file.close()

# Make flag mask using the function we defined at the start: "flag_data_fast" (cell2)
flag_mask = flag_data_fast(flag_vars, flag_name, flag_vals, FLAGS, flag_type='WQSF')
flag_mask = flag_mask.astype(float)
flag_mask[flag_mask == 0.0] = np.nan

# subset flag mask
FLAG_subset1 = flag_mask[row1:row2, col1:col2]
print(flag_name)

In [None]:
CHL_ss[np.isfinite(FLAG_subset1)] = np.nan
OC4_ss[np.isfinite(FLAG_subset1)] = np.nan
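
The masking above goes via a float array (flagged pixels become 1.0, everything else NaN) and then tests it with `np.isfinite`. As a sketch on made-up data, the same result can be reached by using the boolean mask directly; `toy_chl` and `toy_flagged` are invented for this example.

```python
import numpy as np

toy_chl = np.array([[0.1, 0.5], [1.2, -0.3]])
toy_flagged = np.array([[False, True], [False, False]])  # True = flagged pixel

# Route used in this notebook: flagged -> 1.0, unflagged -> NaN, then isfinite
as_float = toy_flagged.astype(float)
as_float[as_float == 0.0] = np.nan
masked_a = toy_chl.copy()
masked_a[np.isfinite(as_float)] = np.nan

# Equivalent shortcut: use the boolean mask directly, leaving toy_chl unchanged
masked_b = np.where(toy_flagged, np.nan, toy_chl)

print(np.array_equal(masked_a, masked_b, equal_nan=True))
```

Unlike the in-place assignment, `np.where` returns a new array, so the unmasked data remain available for comparison.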

In [None]:
fig2 = plt.figure(figsize=(18, 18), dpi=300)

# Set data & output map projections:
data_projection = ccrs.PlateCarree()
output_projection=ccrs.PlateCarree()

# Land resolution and polygon (you can use 50m for faster rendering)
land_poly=cfeature.NaturalEarthFeature('physical', 'land', '10m', 
                                       edgecolor='k', facecolor='k')

# Create axis with a specific projection to best match where you are in the world:
ax = fig2.add_subplot(1, 1, 1, projection=ccrs.PlateCarree(central_longitude=0.0))

# Plot data using pcolormesh
im = ax.pcolormesh(LON_ss, LAT_ss, CHL_ss, cmap = plt.cm.viridis, transform = data_projection)

# Set color limits for better visualization
im.set_clim(-1.5, 1.5)   # these are adjustable

# Add coastlines and land features
ax.coastlines(resolution='50m', color='black', linewidth=1)
ax.add_feature(land_poly)

# Add gridlines and labels with more control over intervals
g1 = ax.gridlines(draw_labels=True, linewidth=1, color='silver', alpha=0.9, linestyle='--')
g1.top_labels = False
g1.right_labels = False
g1.xlocator = plt.MaxNLocator(6)  # Control the number of x ticks
g1.ylocator = plt.MaxNLocator(6)  # Control the number of y ticks
g1.xlabel_style = {'size': 11, 'color': 'gray'}
g1.ylabel_style = {'size': 11, 'color': 'gray'}

# Adding a colorbar, ensuring correct formatting and size
cbar = plt.colorbar(im, ax=ax, orientation="horizontal", fraction=0.1, pad=0.04)
cbar.set_label('CHL_NN mg.m$^{-3}$', fontsize=12)

# Show the plot
plt.show()

In [None]:
# Uncomment the line below to save your figure (2)
# fig2.savefig('OC_nn_flagged.png', bbox_inches='tight')
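
A note on the colour limits of -1.5 to 1.5: these suggest that the chlorophyll values are log10-transformed concentrations rather than linear ones. That is an assumption here; check the `units` and `long_name` attributes of `CHL_NN` and `CHL_OC4ME` in your own file to confirm. Under that assumption, a sketch of the back-transform on made-up values:

```python
import numpy as np

# Made-up log10(CHL) values, including one masked (NaN) pixel
toy_log_chl = np.array([-1.0, 0.0, 1.0, np.nan])

# Back-transform to linear concentration (mg m^-3); NaN stays NaN
toy_lin_chl = 10.0 ** toy_log_chl
print(toy_lin_chl)
```

On this scale, limits of -1.5 and 1.5 would correspond to roughly 0.03 and 32 mg m$^{-3}$.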

In [None]:
fig3 = plt.figure(figsize=(18, 18), dpi=300)

# Set data & output map projections:
data_projection = ccrs.PlateCarree()
output_projection=ccrs.PlateCarree()

# Land resolution and polygon (you can use 50m for faster rendering)
land_poly=cfeature.NaturalEarthFeature('physical', 'land', '10m', 
                                       edgecolor='k', facecolor='k')

# Create axis with a specific projection to best match where you are in the world:
ax = fig3.add_subplot(1, 1, 1, projection=ccrs.PlateCarree(central_longitude=0.0))

# Plot data using pcolormesh
im = ax.pcolormesh(LON_ss, LAT_ss, OC4_ss, cmap = plt.cm.viridis, transform = data_projection)

# Set color limits for better visualization
im.set_clim(-1.5, 1.5)   # these are adjustable

# Add coastlines and land features
ax.coastlines(resolution='50m', color='black', linewidth=1)
ax.add_feature(land_poly)

# Add gridlines and labels with more control over intervals
g1 = ax.gridlines(draw_labels=True, linewidth=1, color='silver', alpha=0.9, linestyle='--')
g1.top_labels = False
g1.right_labels = False
g1.xlocator = plt.MaxNLocator(6)  # Control the number of x ticks
g1.ylocator = plt.MaxNLocator(6)  # Control the number of y ticks
g1.xlabel_style = {'size': 11, 'color': 'gray'}
g1.ylabel_style = {'size': 11, 'color': 'gray'}

# Adding a colorbar, ensuring correct formatting and size
cbar = plt.colorbar(im, ax=ax, orientation="horizontal", fraction=0.1, pad=0.04)
cbar.set_label('OC4Me CHL mg.m$^{-3}$', fontsize=12)

# Show the plot
plt.show()

In [None]:
# Uncomment the line below to save your figure (3)
# fig3.savefig('OC4me_flagged.png', bbox_inches='tight')

### BONUS STEP: widgets!

#### Well done for making it this far. As a reward, here is some quick and simple code that will let you play with widgets :)

In [None]:
import ipywidgets as widgets
from ipywidgets import interact

# Plotting function
def plot_chl(vmin, vmax):
    plt.figure(figsize=(8,6))
    plt.pcolormesh(LON_ss, LAT_ss, CHL_ss, cmap='viridis', vmin=vmin, vmax=vmax)
    plt.colorbar(label='CHL_NN mg.m$^{-3}$')
    plt.title('Flagged Chlorophyll - Neural Net Algorithm')
    plt.show()

# Create interactive plot with sliders for vmin and vmax
interact(plot_chl, vmin=(-2, 0, 0.5), vmax=(0.5, 3, 0.5))  # (min, max, steps)

#### Quick lesson on why you need to be careful of applying ALL the flags without consideration...

In [None]:
# FLAG and MASK everything else
###############################
flag_var2 = ['INVALID', 'LAND', 'CLOUD', 'TURBID_ATM', 'CLOUD_AMBIGUOUS', 'CLOUD_MARGIN', 'SNOW_ICE', 
             'INLAND_WATER', 'COASTLINE', 'TIDAL', 'COSMETIC', 'SUSPECT', 'HISOLZEN', 'SATURATED', 
             'MEGLINT', 'HIGHGLINT', 'WHITECAPS', 'ADJAC', 'AC_FAIL', 'OC4ME_FAIL', 'OCNN_FAIL', 'KDM_FAIL']
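# Re-open the flag file (it was closed at the end of the cloud-masking cell)
FLAG_file = xr.open_dataset(os.path.join(inpath, infile, inflag))
# Flag names
flag_name = FLAG_file['WQSF'].flag_meanings.split(' ')
# flag bit values
flag_vals = FLAG_file['WQSF'].flag_masks
# flag field itself
FLAGS = FLAG_file.variables['WQSF'].data
FLAG_file.close()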

# Make flag mask using the function we defined at the start: "flag_data_fast" (cell2)
flag_mask = flag_data_fast(flag_var2, flag_name, flag_vals, FLAGS, flag_type= 'WQSF')
flag_mask = flag_mask.astype(float)
flag_mask[flag_mask == 0.0] = np.nan

# subset flag mask
FLAG_subset2 = flag_mask[row1:row2, col1:col2]

In [None]:
CHL_ss[np.isfinite(FLAG_subset2)] = np.nan
OC4_ss[np.isfinite(FLAG_subset2)] = np.nan
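
One way to put a number on how much each set of flags removes is to count the pixels that are still valid. A minimal sketch, assuming a hypothetical helper `valid_fraction` and a made-up array:

```python
import numpy as np

def valid_fraction(arr):
    """Fraction of pixels in `arr` that are finite (not NaN or inf)."""
    return np.isfinite(arr).sum() / arr.size

toy = np.array([[0.2, np.nan, 0.4],
                [np.nan, np.nan, 0.1]])
print(valid_fraction(toy))   # 0.5
```

Calling `valid_fraction(CHL_ss)` before and after each masking step would show how much of the scene survives the cloud flags alone compared with the full list of flags.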

In [None]:
fig4= plt.figure(figsize=(16, 16), dpi=300)

gs  = gridspec.GridSpec(1, 2)
# set data projection and request output projection
data_projection = ccrs.PlateCarree()
output_projection=ccrs.PlateCarree()
land_resolution = '10m'
land_poly = cfeature.NaturalEarthFeature('physical', 'land', "10m",
                                        edgecolor='k',
                                        facecolor='black')
# PLOT CHL OC4ME Algorithm
##########################
ax = plt.subplot(gs[0,0], projection = ccrs.PlateCarree(central_longitude= 0.0))
# Plotting your variable with defined min - max values with viridis:
im = plt.pcolormesh(LON_ss, LAT_ss, OC4_ss, vmin = -1.5, vmax = 1.5,
                    cmap = plt.cm.viridis, transform = data_projection)

ax.coastlines(resolution = land_resolution, color = 'black', linewidth = 1)
ax.add_feature(land_poly)
g1 = ax.gridlines(draw_labels = True, linewidth = 1, color = 'lightgray', alpha = 0.5, linestyle = '--')
g1.top_labels   = False  # Don't show labels on the top of the plot
g1.right_labels = False  # Don't show labels on the right of the plot
g1.xlabel_style = {'size': 11, 'color': 'gray'}
g1.ylabel_style = {'size': 11, 'color': 'gray'}
# Adding a Colourbar
cbar = plt.colorbar(im, orientation = "horizontal", fraction = 0.1, pad = 0.03)
cbar.set_label('CHL_OC4ME mg.m$^{-3}$', fontsize = 11)

# PLOT CHL Neural Net Algorithm 
################################
ax = plt.subplot(gs[0,1], projection = ccrs.PlateCarree(central_longitude= 0.0))
# Plotting your variable with defined min - max values with viridis:
im = plt.pcolormesh(LON_ss, LAT_ss, CHL_ss, vmin = -1.5, vmax = 1.5,
                    cmap = plt.cm.viridis, transform = data_projection)
ax.coastlines(resolution = land_resolution, color = 'black', linewidth = 1)
ax.add_feature(land_poly)
g1 = ax.gridlines(draw_labels = True, linewidth = 1, color = 'lightgray', alpha = 0.5, linestyle = '--')
g1.top_labels   = False  # Don't show labels on the top of the plot
g1.right_labels = False  # Don't show labels on the right of the plot
g1.xlabel_style = {'size': 11, 'color': 'gray'}
g1.ylabel_style = {'size': 11, 'color': 'gray'}
# Adding a Colourbar
cbar = plt.colorbar(im, orientation = "horizontal", fraction = 0.1, pad = 0.03)
cbar.set_label('CHL_NN mg.m$^{-3}$', fontsize = 11)

plt.show()

In [None]:
# Uncomment the line below to save your figure (4)
# fig4.savefig('OC4me_bloom__extra_flags_on.png', bbox_inches='tight')