# Cross-Calibrated Multi-Platform (CCMP) ocean surface wind 6-hourly, Version 2 (RSS) <img align="right" src="../../resources/easi_logo.jpg">

_Work in progress_

#### Index
- [Overview](#Overview)
- [Setup (dask, imports, query)](#Setup)
- [Product definition (measurements, flags)](#Product-definition)
- [Quality layer (mask)](#Quality-layer)
- [Masking and scaling](#Masking-and-Scaling)
- [Visualisation](#Visualisation)
- [Appendix](#Appendix)

## Overview

The Cross-Calibrated Multi-Platform (CCMP) gridded surface vector winds are produced using satellite, moored buoy, and model wind data, and as such, are considered to be a Level-3 ocean vector wind analysis product. CCMP V2.0 is created by Remote Sensing Systems (RSS), as an update to the version 1 CCMP product, using improved and additional input data.

Please include the following statement in the acknowledgement section of your paper:
"CCMP Version-2.0 analyses are produced by Remote Sensing Systems and sponsored by NASA Earth Science funding. Data are available at www.remss.com ." (http://data.remss.com/ccmp/readme_ccmp.pdf)

#### Data source and documentation

Version 2 documentation: http://www.remss.com/measurements/ccmp
- The current (June 2021) version 2 data extend from 1987 to 2019/04. We have processed and indexed only the 2000-2019 data, to keep the product small while the data are assessed and to allow reasonable comparison with other satellite data series.
- RSS provide a "v2.0 + NRT" version that extends the series from 2015 to present; see the documentation link on the RSS website. Please let us know if the "v2.0 + NRT" dataset is required for your project, and we will look to add it as a separate product.

Version 1 documentation: http://podaac.jpl.nasa.gov/dataset/CCMP_MEASURES_ATLAS_L4_OW_L3_0_WIND_VECTORS_FLK

#### EASI pipeline

Data are downloaded from https://podaac-tools.jpl.nasa.gov/drive/files/allData/ccmp/rss/v2.0/historical. These data are equivalent to those available from RSS.

| Task | Summary |
|------|---------|
| Source | https://podaac-tools.jpl.nasa.gov/drive/files/allData/ccmp/rss/v2.0/historical |
| Download | 2000-2019 |
| Preprocess | Separate time layers into TIFFs |
| Format | COG |
| Prepare | Y |
| TODO | Update to 6-hourly indexing |

__Note__: Data are currently (first cut) organised and indexed per day, with the 6-hourly fields available as `[variable]_[hour]` (e.g., `uwind_00`). It would be better to index each 6-hourly field with its own `day + 6-hourly` timestamp, so that variables are available simply as `uwind`. This change is planned but not yet done.
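
A minimal sketch of the name-to-timestamp mapping that such a re-indexing implies, assuming every per-day variable name ends in a two-digit hour as in `uwind_00`. The helper `split_hourly_name`, the example date and the hours other than `00` are hypothetical, chosen for illustration only; this is not the EASI pipeline code.

```python
from datetime import datetime, timedelta
import re

# Assumption: per-day variables are named "<variable>_<HH>", e.g. "uwind_06"
HOURLY_NAME = re.compile(r"^(?P<variable>.+)_(?P<hour>\d{2})$")

def split_hourly_name(day: datetime, name: str):
    """Return (variable, timestamp) for a per-day name such as 'uwind_06'."""
    match = HOURLY_NAME.match(name)
    if match is None:
        raise ValueError(f"Unexpected variable name: {name}")
    hour = int(match.group("hour"))
    return match.group("variable"), day + timedelta(hours=hour)

# Hypothetical example day and 6-hourly names
day = datetime(2019, 4, 1)
for name in ["uwind_00", "uwind_06", "uwind_12", "uwind_18"]:
    variable, timestamp = split_hourly_name(day, name)
    print(variable, timestamp.isoformat())
```

With a mapping like this, the four fields of one day could be placed on a single `time` axis under one variable name.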

## Setup

#### Dask

In [None]:
from dask.distributed import Client

client = Client("tcp://10.0.57.215:34417")
client

#### Imports

In [None]:
# Data tools
import numpy as np
import xarray as xr
import pandas as pd
from datetime import datetime

# Datacube
import datacube
from datacube.utils import masking  # https://github.com/opendatacube/datacube-core/blob/develop/datacube/utils/masking.py
from odc.algo import enum_to_bool   # https://github.com/opendatacube/odc-tools/blob/develop/libs/algo/odc/algo/_masking.py
from odc.algo import xr_reproject   # https://github.com/opendatacube/odc-tools/blob/develop/libs/algo/odc/algo/_warp.py
from datacube.utils.geometry import GeoBox, box  # https://github.com/opendatacube/datacube-core/blob/develop/datacube/utils/geometry/_base.py
from datacube.utils.rio import configure_s3_access

# Holoviews, Datashader and Bokeh
import hvplot.pandas
import hvplot.xarray
import holoviews as hv
import panel as pn
import colorcet as cc
import cartopy.crs as ccrs
from datashader import reductions
from holoviews import opts
# import geoviews as gv
# from holoviews.operation.datashader import rasterize
hv.extension('bokeh', logo=False)

# Python
import sys, os, re

# Optional EASI tools
sys.path.append(os.path.expanduser('~/hub-notebooks/scripts'))
import notebook_utils

#### ODC database

In [None]:
# For development products:
#  - This is a development ODC database while we test and demo this product.
CONF = """
[datacube]
db_hostname: v2-db-easihub-csiro-eks.cluster-ro-cvaedcg0qvwd.ap-southeast-2.rds.amazonaws.com
db_database: user_dev_odc
db_username: user
db_password: secretpassword
"""
from datacube.config import read_config, LocalConfig
dc = datacube.Datacube(config=LocalConfig(read_config(CONF)), env='datacube')

# dc = datacube.Datacube()

#### Example query

Change any of the parameters in the query object below to adjust the location, time, projection, or spatial resolution of the returned datasets.

Use the Explorer interface to check the temporal and spatial coverage for each product:
- https://explorer.csiro.easi-eo.solutions  + /product (when available)

In [None]:
# Area of interest (global), time range and product
min_longitude, max_longitude = (0,360)
min_latitude, max_latitude = (-80,80)
min_date = '2000-01-01'
max_date = '2019-05-01'
product = 'ccmp_surfacewind_l3v2'

native_crs = 'epsg:4326'

query = {
    'product': product,                     # Product name
    'x': (min_longitude, max_longitude),    # "x" axis bounds
    'y': (min_latitude, max_latitude),      # "y" axis bounds
    'time': (min_date, max_date),           # Any parsable date strings
    'output_crs': native_crs,               # EPSG code
    'resolution': (0.25, 0.25),             # Target resolution
    'group_by': 'solar_day',                # Scene ordering
    'dask_chunks': {'latitude': 2048, 'longitude': 2048},  # Dask chunks
}
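
As a rough check on the query above, the sketch below estimates the size of the returned grid from the bounds and resolution, and how many spatial dask chunks each time slice would contain. It is plain arithmetic on the query values; the grid that `dc.load` actually returns may differ slightly, for example if the extent is snapped to pixel boundaries.

```python
import math

# Values copied from the query above
min_longitude, max_longitude = (0, 360)
min_latitude, max_latitude = (-80, 80)
resolution = 0.25    # degrees per pixel
chunk_size = 2048    # pixels per dask chunk along each spatial dimension

# Approximate number of pixels along each spatial dimension
n_lon = round((max_longitude - min_longitude) / resolution)
n_lat = round((max_latitude - min_latitude) / resolution)

# Number of spatial chunks needed to cover one time slice
chunks_per_time_slice = math.ceil(n_lon / chunk_size) * math.ceil(n_lat / chunk_size)

print(f"Approximate grid: {n_lat} x {n_lon} pixels (latitude x longitude)")
print(f"Spatial chunks per time slice: {chunks_per_time_slice}")
```

Because the chunk size exceeds the grid in both dimensions, each time slice of a variable is expected to fit in a single chunk.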

In [None]:
# Optional. Some products require AWS S3 credentials to be supplied

# S3 credentials - required for s2_l2a
# configure_s3_access(aws_unsigned=True,requester_pays=False,client=client)
# print("Configured s3 requester pays data access")

In [None]:
# Load data
data = dc.load(**query)

notebook_utils.heading(notebook_utils.xarray_object_size(data))
display(data)

# Calculate valid (not nodata) masks for each layer
valid_mask = masking.valid_data_mask(data)
notebook_utils.heading('Valid data masks for each variable')
display(valid_mask)
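
As a rough illustration of what a valid-data mask represents, the sketch below marks pixels that are neither a nodata value nor NaN. The array and the nodata value of -9999 are made up for the example; the product's actual nodata value is listed in its measurement definitions, and `masking.valid_data_mask` may treat edge cases differently.

```python
import numpy as np

nodata = -9999.0   # hypothetical nodata value, for illustration only
values = np.array([[1.5, nodata, -3.2],
                   [np.nan, 0.0, 7.1]])

# A pixel is valid when it is neither the nodata value nor NaN
valid = (values != nodata) & ~np.isnan(values)
print(valid)

# Keep valid pixels and set the rest to NaN (what .where(mask) does in xarray)
masked = np.where(valid, values, np.nan)
print(masked)
```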

## Product definition

Display the measurement definitions for the selected product.

Use `list_measurements` to show the details for a product, and `masking.describe_variable_flags` to show the flag definitions.

In [None]:
# Measurement definitions for the selected product
measurement_info = dc.list_measurements().loc[query['product']]
notebook_utils.heading(f'Measurement table for product: {query["product"]}')
notebook_utils.display_table(measurement_info)

# Separate lists of measurement names and flag names
measurement_names = measurement_info[ pd.isnull(measurement_info.flags_definition)].index
flag_names        = measurement_info[pd.notnull(measurement_info.flags_definition)].index

notebook_utils.heading('Selected Measurement and Flag names')
notebook_utils.display_table(pd.DataFrame({
    'group': ['Measurement names', 'Flag names'],
    'names': [', '.join(measurement_names), ', '.join(flag_names)]
}))

# Flag definitions
for flag in flag_names:
    notebook_utils.heading(f'Flag definition table for flag name: {flag}')
    notebook_utils.display_table(masking.describe_variable_flags(data[flag]))

## Quality layer

In [None]:
# No quality layer

## Roll longitudes to -180,180

In [None]:
# Adjust longitudes to -180,180 for plotting with geoviews/datashader
# https://discourse.holoviz.org/t/with-longitude-0-360-data-briefly-appears-on-map-then-disappears-also-hover-broken/1213/2

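def roll_longitude(data: xr.Dataset, name: str) -> xr.Dataset:
    """Shift a 0..360 longitude grid to -180..180; return other data unchanged."""
    if data['longitude'].max() > 180:
        print(f'Rolling: {name}')
        # Move the eastern half of the grid to the front, leaving the coordinate labels in place
        data = data.roll(longitude=data.sizes['longitude'] // 2, roll_coords=False)
        # Relabel the coordinates so that they match the rolled data
        data['longitude'] = data['longitude'] - 180
        display(data['longitude'].min().values, data['longitude'].max().values)
    return data
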
data_roll = roll_longitude(data, 'data')
valid_mask_roll = roll_longitude(valid_mask, 'valid_mask')
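
The roll can be checked on a small array. The sketch below uses a hypothetical 8-column grid (not the product grid) in which each value records its own 0-360 longitude, so after rolling and relabelling it is easy to confirm that every value still sits at the same physical longitude.

```python
import numpy as np

# Hypothetical coarse grid: 8 cell centres spanning 0..360 degrees
lon_0_360 = np.arange(22.5, 360, 45.0)   # 22.5, 67.5, ..., 337.5
values = lon_0_360.copy()                # each value records its own 0-360 longitude

# Move the eastern half (180..360) to the front, then relabel the coordinates
half = lon_0_360.size // 2
rolled = np.roll(values, half)
lon_180 = lon_0_360 - 180                # -157.5, ..., 157.5

# Express each rolled value's original longitude in the -180..180 convention
original_lon_wrapped = np.where(rolled > 180, rolled - 360, rolled)

# True when every value sits at the same physical longitude as before
print(np.allclose(original_lon_wrapped, lon_180))
```

Relabelling by subtracting 180 relies on an even number of evenly spaced columns that cover the full 0-360 range.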

In [None]:
# To test that the roll is working correctly
#
# with np.printoptions(threshold=np.inf, precision=3, linewidth=160, formatter={'float':lambda x:f'{x:0.3f}'}):
#     for x in range(int(1440/20)):
#         print(f'{x:4d}: {data["uwind_00"][0,10,x*20:(x+1)*20].values}')

## Masking and Scaling

In [None]:
# Select a layer, apply the valid-data mask, then persist in dask.
# No scaling is applied to this layer.

layer_name = 'uwind_00'

# Apply the valid-data mask (this product has no quality layer)
layer = data_roll[[layer_name]].where(valid_mask_roll[layer_name])
layer = layer.persist()

## Visualisation

In [None]:
# Generate a plot

options = {
    'title': f'{query["product"]}: {layer_name}',
    'width': 800,
    'height': 400,
    'aspect': 'equal',
    'cmap': cc.rainbow,
    'clim': (-25, 25),                          # Limit the color range depending on the layer_name
    'colorbar': True,
    'tools': ['hover'],
}

# Set the Dataset CRS
plot_crs = native_crs
if plot_crs == 'epsg:4326':
    plot_crs = ccrs.PlateCarree()


# Native data and coastline overlay:
# - Comment out `crs`, `projection` and `coastline` to plot in native_crs coords
# TODO: Update the axis labels to 'longitude', 'latitude' if `coastline` is used

layer_plot = layer.hvplot.image(
    x = 'longitude', y = 'latitude',         # Dataset x,y dimension names
    rasterize = True,                        # Use Datashader
    aggregator = reductions.mean(),          # Datashader selects mean value
    precompute = True,                       # Datashader precomputes what it can
    crs = plot_crs,                          # Dataset crs
    projection = ccrs.PlateCarree(),         # Output projection (use ccrs.PlateCarree() when coastline=True)
    coastline = '10m',                       # Coastline = '10m'/'50m'/'110m'
).options(opts.Image(**options)).hist(bin_range = options['clim'])

# display(layer_plot)
# Optional: Change the default time slider to a dropdown list, https://stackoverflow.com/a/54912917
fig = pn.panel(layer_plot, widgets={'time': pn.widgets.Select})  # widget_location='top_left'
display(fig)

## Appendix