Author: Maxime Marin  
@: mff.marin@gmail.com

# Accessing IMOS data case studies: Walk-through and interactive session

The next few notebooks provide case studies that use the cloud cluster demonstrated earlier to investigate, load, visualise, analyse and extract data stored on external servers.  
The main advantages are speed, ease of use and repeatability. Although we are using a Python environment, novice Python users can easily follow the recipes and adapt them to their needs.

This session will be divided into different notebooks focusing on different tasks including:

- <span style="color:green">**Notebook 1 - Start**:</span> Imports libraries, fires up the cluster, provides quick description of datasets available and loads dataset of interest.
- <span style="color:green">**Notebook 2 - Interactive**:</span> Provides interactive tools to investigate the chosen dataset and make some quick plots.
- <span style="color:green">**Notebook 3 - Analysis**:</span> Performs further analysis on the chosen dataset including climatology, linear trends and anomalies.
- <span style="color:green">**Notebook 4 - Data Extraction**:</span> Extracts data into a format of choice and saves it locally for the user to perform more in-depth analysis.



***
## 1) LIBRARIES

In each notebook, it is necessary to load the packages that are used within it to run the code. Note that some libraries might be called within other "bits" of code hidden in other files.  

"Importing" a library or package assumes that it was already installed in the Python environment before running the notebooks. Thankfully, we have taken care of that.  

It is customary to import libraries in the cell where they are used, or in a cell just before that. If a package is used throughout, we can import it in the first cell:


In [1]:
import sys
import os
sys.path.append('/home/jovyan/intake-aodn')

import intake_aodn  # part of the intake-aodn folder that we cloned; contains functions we created for this project
import intake
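
As a minimal sketch of what `sys.path.append` does in the cell above, the example below creates a throwaway folder containing a tiny module, adds that folder to the search path, and then imports the module. The module name `demo_helpers` and its `greet` function are hypothetical and exist only for this illustration; they are not part of intake-aodn.

```python
import importlib
import os
import sys
import tempfile

# Create a temporary folder holding a tiny, hypothetical module.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "demo_helpers.py"), "w") as f:
    f.write("def greet(name):\n    return 'Hello ' + name\n")

# Python only searches the folders listed in sys.path,
# so the new folder has to be added before the import can succeed.
sys.path.append(demo_dir)
importlib.invalidate_caches()  # make sure the freshly written file is noticed

demo_helpers = importlib.import_module("demo_helpers")
message = demo_helpers.greet("IMOS")
print(message)
```

The same idea applies to the cloned folder: once its location is on `sys.path`, the package inside it can be imported like any installed library.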

***
## 2) CLUSTER FIRING 

Let's now create a cluster to allow for a significant increase in speed.  
Of course, we first call the necessary libraries.

In [2]:
from intake_aodn.utils import get_distributed_cluster
client = get_distributed_cluster()
client

Creating new cluster. Please wait for this to finish.



Connection method: Cluster object  
Cluster type: dask_gateway.GatewayCluster  
Dashboard: https://hub.csiro.easi-eo.solutions/services/dask-gateway/clusters/easihub.6b9302fad8e94288bfc5796ebff089e8/status


Notice how we imported `get_distributed_cluster` from `intake_aodn.utils`. A plain `import intake_aodn` only makes the package available under its name, so we need to tell Jupyter where inside the package to find `get_distributed_cluster`.  
`utils` is the name of a file (a module) in that package containing some functions (including `get_distributed_cluster`), so we can reach it using a `.`.
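
The dotted notation works the same way for any Python package. As a small illustration using only the standard library (the path string below is a made-up placeholder, not a real location in the project), the two calls reach the same function:

```python
import os
from os.path import basename

# Placeholder path, used only to have something to pass to the function.
path = "some_folder/some_package/utils.py"

# Two equivalent ways of reaching the same function:
name_a = os.path.basename(path)  # package.module.function
name_b = basename(path)          # imported directly with "from ... import ..."

print(name_a, name_b)
```

Importing the function directly simply saves typing the full dotted name each time it is called.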

Our cluster is stored under the variable `client`. Click on the dashboard link to see details about the cluster we have created.  

***
## 3) DATA DESCRIPTION 

Finally, let's have a look at what the available datasets are, along with some information about their metadata.

For this workshop, we have provided users with a choice between:
  
- IMOS SST: Satellite SST product created by IMOS hosted on the AODN servers.
- BRAN 2020: The CSIRO BlueLink Reanalysis product hosted on NCI. Only a few variables have been chosen for the sake of the workshop.
- IMOS Chlorophyll: Satellite Chlorophyll product created by IMOS hosted on the AODN servers.
- IMOS SLA: Sea Level Anomaly derived from satellite, created by IMOS and hosted on the AODN servers.

In [3]:
import ipywidgets as widgets
from ipywidgets import interact, interact_manual
from intake_aodn.utils import display_entry 
from intake_aodn.utils import get_list_datasets

catal = intake_aodn.cat # defined at notebook level, so it is visible to all functions below

da_list,ser_list = get_list_datasets(catal) # gets the lists of datasets and servers

def display_detail(Dataset):
    display_entry(catal[ser_list[da_list.index(Dataset)]][Dataset])

interact(display_detail,Dataset = da_list);

interactive(children=(Dropdown(description='Dataset', options=('SST_L3S_1d_ngt', 'SSTAARS_Daily_Climatology', …

The dropdown list includes all the datasets that are currently available (mapped) for the user.  
The output also shows some basic information on each dataset, including a link to the catalogue of the hosting server. Those links contain a more complete list of parameters such as spatial coverage and the variables included in the files.
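
The lookup inside `display_detail` relies on the two lists being parallel: the server at a given position hosts the dataset at the same position. The sketch below mimics that pattern with a plain nested dictionary. The catalogue contents and the `list_datasets` helper are hypothetical stand-ins; the actual implementation of `get_list_datasets` is not shown here and may differ.

```python
# Hypothetical two-level catalogue: server name -> dataset name -> description.
# The names below are placeholders, not real catalogue entries.
catalogue = {
    "server_a": {"dataset_1": "first dataset", "dataset_2": "second dataset"},
    "server_b": {"dataset_3": "third dataset"},
}

def list_datasets(cat):
    """Return two parallel lists: dataset names and the server hosting each one."""
    datasets, servers = [], []
    for server, entries in cat.items():
        for dataset in entries:
            datasets.append(dataset)
            servers.append(server)
    return datasets, servers

da_list, ser_list = list_datasets(catalogue)

def lookup(dataset):
    """Find the server stored at the same position as the dataset, then fetch the entry."""
    server = ser_list[da_list.index(dataset)]
    return catalogue[server][dataset]

print(lookup("dataset_3"))
```

Because the dropdown only offers names taken from `da_list`, the call to `index` always finds a match.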

***
## 4) REGION SELECTION & LOADING

Once we have chosen the dataset we want to investigate, let's go ahead and select our region/location of interest.

To help the user visualise where the region/location is, the map below automatically updates as the user changes the text boxes defining the min and max latitude and longitude.

In [4]:
import cartopy
import cartopy.crs as ccrs
from ipywidgets import interactive
import matplotlib.pyplot as plt


def map_WA(lon_min,lon_max,lat_min,lat_max):
    lon = lon_min if lon_min == lon_max else [lon_min,lon_max]
    lat = lat_min if lat_min == lat_max else [lat_min,lat_max]

    fig = plt.figure(figsize=(30,8))
    ax = plt.axes(projection = ccrs.PlateCarree());
    ax.set_extent([90,140,-45,-5],crs=ccrs.PlateCarree())
    ax.coastlines()
    ax.gridlines(draw_labels=True,linestyle = '--')
    if isinstance(lon,list)  and isinstance(lat,list):
        ax.plot([lon[0],lon[0],lon[1],lon[1],lon[0]],[lat[0],lat[1],lat[1],lat[0],lat[0]],transform = ccrs.PlateCarree(),color='red')
    elif isinstance(lon,float)  and isinstance(lat,float):
        ax.scatter(lon,lat,s=55,marker="o",edgecolor = 'black',color = 'red',zorder=3)
    else:
        print('Selection cannot be a line')
    return lon, lat

lon_min=112.5
lon_max=114.5
lat_min=-27
lat_max=-25

w = interactive(map_WA,lon_min=widgets.FloatText(lon_min),lon_max=widgets.FloatText(lon_max),lat_min=widgets.FloatText(lat_min),lat_max=widgets.FloatText(lat_max));
display(w)

interactive(children=(FloatText(value=112.5, description='lon_min'), FloatText(value=114.5, description='lon_m…

Note that if the user indicates the same min-max latitude and longitude, it defines a point and places it on the map.  
However, selecting points along a line is not possible, so no selection is drawn on the map in that case.  
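
The three cases handled by `map_WA` can be summarised in a small standalone function. This is a sketch of the same decision logic without any plotting; the name `classify_selection` is introduced here for illustration only.

```python
def classify_selection(lon_min, lon_max, lat_min, lat_max):
    """Return 'point', 'box' or 'line' for the given bounds."""
    same_lon = lon_min == lon_max
    same_lat = lat_min == lat_max
    if same_lon and same_lat:
        return "point"  # both dimensions collapse to a single value
    if not same_lon and not same_lat:
        return "box"    # both dimensions span a range
    return "line"       # only one of the two dimensions collapses

print(classify_selection(112.5, 114.5, -27.0, -25.0))  # box
print(classify_selection(113.0, 113.0, -26.0, -26.0))  # point
print(classify_selection(113.0, 113.0, -27.0, -25.0))  # line
```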

To continue, select a region of your choice. Remember not to make it too big (no more than 2x2 degrees!). We will see later how to "scale it up".
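
If it helps, the size guideline can be checked before loading anything. The helper below is a hypothetical convenience, not part of intake-aodn, and the 2 degree limit is simply the workshop guideline expressed as a constant.

```python
MAX_SPAN_DEG = 2.0  # workshop guideline: no more than 2 x 2 degrees

def box_span(lon_min, lon_max, lat_min, lat_max):
    """Return the (longitude, latitude) extent of the box in degrees."""
    return abs(lon_max - lon_min), abs(lat_max - lat_min)

def within_limit(lon_min, lon_max, lat_min, lat_max, max_span=MAX_SPAN_DEG):
    """True if neither side of the box exceeds max_span degrees."""
    lon_span, lat_span = box_span(lon_min, lon_max, lat_min, lat_max)
    return lon_span <= max_span and lat_span <= max_span

print(within_limit(112.5, 114.5, -27.0, -25.0))  # the default box used above
print(within_limit(110.0, 115.0, -27.0, -25.0))  # 5 degrees wide
```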

For the purpose of this workshop, I won't give you the choice: we will all download some satellite SST data provided by IMOS. Hey, you get to choose your region at least!  
NB: In future versions, users will be allowed to select any available product.

Once we are happy with our choice of coordinates, let's load our data!

In [10]:
%%time 
coord = w.result
print('Selected coordinates:' + str(coord))
lat = coord[1]
lon = coord[0]

if 'ds' in locals():
    del ds
    
if isinstance(lat,float) and isinstance(lon,float): #point
    ds=intake_aodn.cat.aodn_s3.SST_L3S_1d_ngt(startdt='1992-07-01',
                                          enddt='2021-06-30',
                                          cropto=dict(latitude=lat,longitude=lon,method = 'nearest')).read()
    
elif isinstance(lat,list) and isinstance(lon,list): #box
    lat.sort() # slice bounds must follow index order, not coordinate order: in this product, latitude is indexed from high to low values
    lon.sort()
    ds=intake_aodn.cat.aodn_s3.SST_L3S_1d_ngt(startdt='1992-07-01',
                                          enddt='2021-06-30',
                                          cropto=dict(latitude=slice(lat[1],lat[0]),longitude=slice(lon[0],lon[1]))).read()    
else:
    print('Coordinates not a point or orthogonal box')

Selected coordinates:([112.5, 114.5], [-27.0, -25.0])
CPU times: user 6.06 s, sys: 21.6 s, total: 27.7 s
Wall time: 1min 54s
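
The box branch of the loading cell passes `slice(lat[1], lat[0])` for latitude because, according to its code comment, latitude is stored from high to low values in this product. A minimal sketch of that rule, using a hypothetical helper and plain Python slices (no data access involved), could look like this:

```python
def index_order_slice(bounds, descending):
    """Return a slice whose start and stop follow the coordinate's storage order."""
    low, high = sorted(bounds)
    # A coordinate stored from high to low must be sliced from high to low.
    return slice(high, low) if descending else slice(low, high)

# Latitude is assumed to be stored high-to-low, longitude low-to-high.
lat_slice = index_order_slice([-25.0, -27.0], descending=True)
lon_slice = index_order_slice([114.5, 112.5], descending=False)

print(lat_slice)
print(lon_slice)
```

Whether a coordinate is ascending or descending depends on the product, so it is worth checking the coordinate values of a dataset before reusing this pattern elsewhere.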


In [11]:
ds

### Save the data to NetCDF

In [None]:
from intake_aodn.utils import save_netcdf          
save_netcdf(ds,'Example_Data.nc')