# MARS3888 - Finding and Using Online Datasets 

## Notebook 4 - Example Datasets


+ Use this notebook and Notebooks 1 to 3 as a guide to access these example datasets

#### Libraries

First of all, load the necessary libraries:

+ xarray

If you wish to plot or analyse the datasets listed in this notebook, you will need to add these libraries to your first section of code:

+ numpy
+ matplotlib
+ xarray
+ netCDF4
+ datetime
+ cartopy
+ cmocean
+ seaborn
+ scipy
+ pymannkendall
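
Before adding these imports, it can help to check which of the libraries are installed. The following is a minimal sketch using only the standard library; the helper name `missing_libraries` is hypothetical, and the import names are assumed to match the list above.

```python
from importlib.util import find_spec

# Import names of the libraries listed above (assumed to match the installed packages).
LIBRARIES = ["numpy", "matplotlib", "xarray", "netCDF4", "datetime",
             "cartopy", "cmocean", "seaborn", "scipy", "pymannkendall"]


def missing_libraries(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_libraries(LIBRARIES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All listed libraries are available.")
```

Any name reported as not installed needs to be added to your environment before the corresponding `import` line will run.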

In [10]:
import os
import xarray as xr

import warnings
warnings.filterwarnings("ignore", category=RuntimeWarning) 

# Dataset: NOAA Coral Reef Watch via OPeNDAP (Thredds) server

## 1985 - 2021

Info: https://coralreefwatch.noaa.gov/

NOAA OPeNDAP [NOAA Coral Reef Watch](https://www.ncei.noaa.gov/thredds-ocean/catalog/crw/5km/catalog.html) 

Example data:

+ Monthly max DHW data: [ct5km_dhw-max_v3.1_201701.nc](https://www.ncei.noaa.gov/thredds-ocean/dodsC/crw/5km/v3.1/nc/v1.0/monthly/2017/ct5km_dhw-max_v3.1_201701.nc.html)

Variables of interest:

+ DHW - Degree heating weeks
+ SST - Sea surface temperature

NB: on the server, the files for this dataset are organised into one folder per year.

![image.png](attachment:image.png)

Load a single netCDF file to understand the structure:

In [11]:
url="https://www.ncei.noaa.gov/thredds-ocean/dodsC/crw/5km/v3.1/nc/v1.0/monthly/2017/ct5km_dhw-max_v3.1_201703.nc"
ds = xr.open_dataset(url)
ds
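
Some of the "Example data" links in this notebook end in `.nc.html`, which opens the OPeNDAP access form in a browser, whereas the URL passed to `xr.open_dataset` ends in `.nc`. A small sketch of that conversion is shown below; the helper name `opendap_url` is hypothetical.

```python
def opendap_url(link):
    """Strip the '.html' suffix of an OPeNDAP access-form link, if present."""
    suffix = ".html"
    return link[:-len(suffix)] if link.endswith(suffix) else link


# Access-form link copied from the "Example data" entry above.
form_link = ("https://www.ncei.noaa.gov/thredds-ocean/dodsC/crw/5km/v3.1/nc/v1.0/"
             "monthly/2017/ct5km_dhw-max_v3.1_201701.nc.html")

print(opendap_url(form_link))
```

Links that already end in `.nc` are returned unchanged, so the helper can be applied to any link in this notebook.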

The data is stored on the server in year folders. The following code builds a list of data URLs that follows this subfolder structure. Edit the start and stop dates in the code below to change the date range.

In [12]:
base_url = "https://www.ncei.noaa.gov/thredds-ocean/dodsC/crw/5km/v3.1/nc/v1.0/monthly/"

start_date = "2017-01"  # Start date in yyyy-mm format (first file = "1985-04")
stop_date = "2017-01"   # Stop date in yyyy-mm format (last file = "2021-11")

start_year, start_month = start_date.split("-")
stop_year, stop_month = stop_date.split("-")

start_year = int(start_year)
start_month = int(start_month)
stop_year = int(stop_year)
stop_month = int(stop_month)

noaafiles = []

# Build the file URLs following the server's folder and file naming convention
for year in range(start_year, stop_year+1):
    if year == start_year:
        year_month_st = start_month
    else:
        year_month_st = 1
    if year == stop_year:
        year_month_ed = stop_month
    else:
        year_month_ed = 12
    yearfiles = [f"{base_url}{year}/ct5km_dhw-max_v3.1_{year}{month:02}.nc" for month in range(year_month_st, year_month_ed+1)]
    noaafiles.extend(yearfiles)

noaafiles

['https://www.ncei.noaa.gov/thredds-ocean/dodsC/crw/5km/v3.1/nc/v1.0/monthly/2017/ct5km_dhw-max_v3.1_201701.nc']
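
The loop above can also be wrapped in a reusable function. This is one possible sketch rather than the only way to do it: the function name `monthly_urls` is hypothetical, and it counts months as a single running index so that date ranges crossing a year boundary need no special cases.

```python
BASE_URL = "https://www.ncei.noaa.gov/thredds-ocean/dodsC/crw/5km/v3.1/nc/v1.0/monthly/"


def monthly_urls(start_date, stop_date, base_url=BASE_URL):
    """List monthly DHW-max file URLs from start_date to stop_date ('yyyy-mm', inclusive)."""
    start_year, start_month = (int(part) for part in start_date.split("-"))
    stop_year, stop_month = (int(part) for part in stop_date.split("-"))

    # Convert each date to a running month index (year * 12 + zero-based month).
    first = start_year * 12 + (start_month - 1)
    last = stop_year * 12 + (stop_month - 1)

    urls = []
    for index in range(first, last + 1):
        year, month = divmod(index, 12)  # month is zero-based here
        urls.append(f"{base_url}{year}/ct5km_dhw-max_v3.1_{year}{month + 1:02}.nc")
    return urls


for url in monthly_urls("2016-11", "2017-02"):
    print(url)
```

The returned list can be passed to `xr.open_mfdataset` in the same way as `noaafiles`.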

In [13]:
ds_dhw = xr.open_mfdataset(noaafiles)
ds_dhw

Dask array layouts reported in the output:

| Data type | Shape | Chunk shape | Bytes | Dask graph |
|---|---|---|---|---|
| int8 | (3600, 7200) | (3600, 7200) | 24.72 MiB | 1 chunk in 2 graph layers |
| float32 | (1, 3600, 7200) | (1, 3600, 7200) | 98.88 MiB | 1 chunk in 2 graph layers |
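
The array sizes reported in the output above follow directly from the grid shape and the data type. The short calculation below reproduces them and estimates the in-memory size of a longer date range. It is a rough sketch: the helper name `size_mib` is hypothetical, and the estimate covers only the uncompressed values of one variable.

```python
GRID_SHAPE = (3600, 7200)                    # grid shape reported in the output above
BYTES_PER_VALUE = {"int8": 1, "float32": 4}  # storage size of one value of each type


def size_mib(shape, dtype, n_months=1):
    """In-memory size in MiB of one variable on the given grid for n_months files."""
    cells = 1
    for n in shape:
        cells *= n
    return cells * BYTES_PER_VALUE[dtype] * n_months / 2**20


print(f"int8 grid:    {size_mib(GRID_SHAPE, 'int8'):.2f} MiB")
print(f"float32 grid: {size_mib(GRID_SHAPE, 'float32'):.2f} MiB")
print(f"12 months of one float32 variable: {size_mib(GRID_SHAPE, 'float32', 12) / 1024:.2f} GiB")
```

Because the size grows linearly with the number of months, it is worth estimating it before widening the date range.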


# Dataset: eReefs via AIMS (Thredds) server

## 2010 - Now

Info: https://ereefs.org.au/ereefs

AIMS [Thredds server](http://thredds.ereefs.aims.gov.au/thredds/catalog.html)

Example data: 

+ Monthly data: https://thredds.ereefs.aims.gov.au/thredds/dodsC/s3://aims-ereefs-public-prod/derived/ncaggregate/ereefs/gbr4_v2/monthly-monthly/EREEFS_AIMS-CSIRO_gbr4_v2_hydro_monthly-monthly-2010-09.nc

Variables of interest:

+ eta - sea surface elevation
+ u - eastward current
+ v - northward current
+ salt - salinity
+ temp - temperature
+ wspeed_u - eastward wind
+ wspeed_v - northward wind
+ temp_expose - DHW temp exposure

## See Notebook 1 for more...

In [14]:
url="https://thredds.ereefs.aims.gov.au/thredds/dodsC/s3://aims-ereefs-public-prod/derived/ncaggregate/ereefs/gbr4_v2/monthly-monthly/EREEFS_AIMS-CSIRO_gbr4_v2_hydro_monthly-monthly-2010-09.nc"
ds = xr.open_dataset(url)
ds
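
The `u` and `v` variables listed above are the eastward and northward components of the current, so speed and direction have to be derived from them. The sketch below shows the calculation for a single pair of values. The function name `current_speed_direction` is hypothetical, and the direction convention (degrees clockwise from north, pointing where the water flows to) is an assumption chosen for illustration.

```python
import math


def current_speed_direction(u, v):
    """Return (speed, direction) from eastward (u) and northward (v) components.

    Speed has the same units as u and v. Direction is in degrees clockwise
    from north and points towards where the current is flowing.
    """
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(u, v)) % 360.0
    return speed, direction


speed, direction = current_speed_direction(0.3, 0.4)
print(f"speed = {speed:.2f}, direction = {direction:.1f} degrees")
```

The same formula applies to the wind components `wspeed_u` and `wspeed_v`.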

# Dataset: Waves (WWIII / CMAR_CAWCR) via CSIRO (Thredds) server 

## 1979 - 2023

Info: https://data.csiro.au/collection/csiro:39819

CSIRO [Thredds Server](https://data-cbr.csiro.au/thredds/catalog/catch_all/CMAR_CAWCR-Wave_archive/CAWCR_Wave_Hindcast_aggregate/gridded/catalog.html)

Example data: 

+ Daily data: https://data-cbr.csiro.au/thredds/dodsC/catch_all/CMAR_CAWCR-Wave_archive/CAWCR_Wave_Hindcast_aggregate/gridded/ww3.pac_4m.202301.nc.html

Variables of interest:

+ uwnd - eastward wind
+ vwnd - northward wind
+ hs - significant wave height
+ t0m1 - mean wave period
+ dir - wave direction
+ fp - wave peak frequency
+ ...


In [15]:
url="https://data-cbr.csiro.au/thredds/dodsC/catch_all/CMAR_CAWCR-Wave_archive/CAWCR_Wave_Hindcast_aggregate/gridded/ww3.pac_4m.202301.nc"
ds = xr.open_dataset(url)
ds
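
The `fp` variable listed above is a peak frequency, while wave analyses often work with the peak period, which is its reciprocal. A minimal sketch of the conversion for a single value follows. The function name `peak_period` is hypothetical, and `fp` is assumed to be in hertz (cycles per second).

```python
import math


def peak_period(fp):
    """Convert a peak frequency (assumed Hz) to a peak period in seconds.

    Returns NaN for missing or non-positive frequencies.
    """
    if fp is None or math.isnan(fp) or fp <= 0:
        return math.nan
    return 1.0 / fp


for frequency in (0.08, 0.125, 0.0):
    print(frequency, "->", peak_period(frequency))
```

On the dataset opened above, the equivalent whole-array operation would be `1.0 / ds["fp"]`, assuming the variable is present under that name.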

# Dataset: BRAN2020 Ocean Currents via NCI/CSIRO (Thredds) server

## 1993 - 2022

Info: https://research.csiro.au/bluelink/bran2020-data-released/

NCI [Thredds Server](https://dapds00.nci.org.au/thredds/catalog/gb6/BRAN/BRAN2020/catalog.html)

Example data: 

+ Monthly data: https://dapds00.nci.org.au/thredds/dodsC/gb6/BRAN/BRAN2020/month/ocean_force_mth_2004_04.nc.html

Variables of interest:

+ atm_flux - total alkalinity
+ bmf_u - Bottom u-stress via bottom drag
+ bmf_v - Bottom v-stress via bottom drag
+ ...

In [16]:
url="https://dapds00.nci.org.au/thredds/dodsC/gb6/BRAN/BRAN2020/month/ocean_force_mth_2004_04.nc"
ds = xr.open_dataset(url)
ds
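
To open a different month, only the year and month at the end of the file name need to change. The sketch below builds the URL from those two values, following the pattern of the example file above. The function name `bran_monthly_url` and its `prefix` parameter are hypothetical; only the `ocean_force` prefix appears in this notebook, so check the catalogue for other file names before relying on them.

```python
BRAN_MONTH_URL = "https://dapds00.nci.org.au/thredds/dodsC/gb6/BRAN/BRAN2020/month/"


def bran_monthly_url(year, month, prefix="ocean_force"):
    """Build the OPeNDAP URL of a BRAN2020 monthly file, e.g. ocean_force_mth_2004_04.nc."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be between 1 and 12, got {month}")
    return f"{BRAN_MONTH_URL}{prefix}_mth_{year}_{month:02}.nc"


print(bran_monthly_url(2004, 4))
```

The resulting string can be passed to `xr.open_dataset` in place of `url`.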

# Now search for datasets that you think might be useful for your projects