# Wave Analysis using Altimeter data


Here we will illustrate how to extract wave conditions based on altimeter data for a specific geographical region. 

We will query data already downloaded from the Australian Ocean Data Network portal ([AODN](https://portal.aodn.org.au/)).

> You should look at the **RADWave** [documentation](https://radwave.readthedocs.io/en/latest/usage.html#getting-altimeter-values-from-data-providers) and the embedded video, which explain how to select both a spatial bounding box and a temporal extent from the portal and how to export the file containing the `List of URLs`. This `TXT` file contains a list of `NETCDF` files for each available satellite.

In [None]:
from IPython.display import IFrame
IFrame(src='https://bit.ly/2ROFoLY', width=900, height=600)

## Loading RADWave library and initialisation

We start by importing the **RADWave** library into our workspace.

In [None]:
import RADWave as rwave

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

Once the list of `NETCDF` data files has been saved on disk, you can load it by initialising the main **RADWave** Python class, called `waveAnalysis`.

For a detailed overview of the options available in this class, you can have a look at the [waveAnalysis API](https://radwave.readthedocs.io/en/latest/RADWave.html#RADWave.altiwave.waveAnalysis).

Here, we will use the following parameters:

+ `altimeterURL` (str): list of NetCDF URLs downloaded from the wave data portal containing the radar altimeter data ['../../dataset/IMOSURLs.txt']
+ `bbox` (list): bounding box specifying the geographical extent of the uploaded dataset following the convention [lon min,lon max,lat min,lat max]  [here we use a region located offshore Sydney]
+ `stime` (list):  starting time of wave climate analysis following the convention [year, month, day] [we chose the 1st of January 1985]
+ `etime` (list): ending time of wave climate analysis following the convention [year, month, day] [we chose the 31st of December 2018]
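
The `bbox` convention is easy to get wrong because both longitudes come before the latitudes. The following minimal sketch validates a box and tests whether a point falls inside it. The helper names `check_bbox` and `in_bbox` are hypothetical and are not part of **RADWave**:

```python
def check_bbox(bbox):
    """Return bbox unchanged if it follows [lon min, lon max, lat min, lat max]."""
    if len(bbox) != 4:
        raise ValueError("bbox needs exactly four values")
    lon_min, lon_max, lat_min, lat_max = bbox
    if not lon_min < lon_max:
        raise ValueError("lon min must be smaller than lon max")
    if not lat_min < lat_max:
        raise ValueError("lat min must be smaller than lat max")
    if lat_min < -90.0 or lat_max > 90.0:
        raise ValueError("latitudes must lie within [-90, 90]")
    return bbox


def in_bbox(lon, lat, bbox):
    """True if the point (lon, lat) lies inside the bounding box."""
    lon_min, lon_max, lat_min, lat_max = bbox
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max


# Region used in this notebook (offshore Sydney)
sydney_offshore = check_bbox([152.0, 155.0, -36.0, -34.0])

print(in_bbox(153.5, -35.0, sydney_offshore))        # a point in the middle of the box
print(in_bbox(151.2093, -33.8688, sydney_offshore))  # Sydney itself, west of the box
```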

For this example, we don't specify a list of particular satellites to use (`satNames` keyword) so all of them will be queried. In other words we will look at all the records from the 10 altimeters: 

JASON-2 - JASON-3 - SARAL - SENTINEL-3A - CRYOSAT-2 - ENVISAT - GEOSAT - ERS-2 - GFO - TOPEX.

In [None]:
wa = rwave.waveAnalysis(altimeterURL='../../dataset/IMOSURLs.txt', bbox=[152.0,155.0,-36.0,-34.0], 
                  stime=[1985,1,1], etime=[2018,12,31])

# Processing altimeter data

After class initialisation, the actual dataset is queried by calling the `processingAltimeterData` function. The description of this function is available from the [API](https://radwave.readthedocs.io/en/latest/RADWave.html#RADWave.altiwave.waveAnalysis.processingAltimeterData).

The function can take some time to execute depending on the number of NETCDF files to load and the size of the dataset to query (here it should not take more than **30 s**).

> **RADWave** uses the uploaded file containing the list of URLs to query the remote data via `THREDDS`. This operation can take *several minutes*; when looking at a large region, it is recommended to divide the analysis into smaller regions and download a series of URL text files rather than a single one for the entire domain.
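
One way to divide a large region is to cut its bounding box into a regular grid of smaller boxes, each of which can then be used on the portal to export its own list of URLs. The sketch below assumes the same `[lon min, lon max, lat min, lat max]` convention as above. The function name `split_bbox` is hypothetical and not part of **RADWave**:

```python
def split_bbox(bbox, nlon, nlat):
    """Split a bounding box into nlon x nlat equally sized sub-boxes."""
    lon_min, lon_max, lat_min, lat_max = bbox
    dlon = (lon_max - lon_min) / nlon
    dlat = (lat_max - lat_min) / nlat
    tiles = []
    for j in range(nlat):          # south to north
        for i in range(nlon):      # west to east
            tiles.append([
                lon_min + i * dlon,
                lon_min + (i + 1) * dlon,
                lat_min + j * dlat,
                lat_min + (j + 1) * dlat,
            ])
    return tiles


# Cut the region used in this notebook into 1 by 1 degree sub-regions
sub_regions = split_bbox([152.0, 155.0, -36.0, -34.0], nlon=3, nlat=2)
for tile in sub_regions:
    print(tile)
```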

In [None]:
wa.processingAltimeterData(altimeter_pick='all', saveCSV = 'altimeterData.csv')

If the `processingAltimeterData` function has already been executed, the processed data can be loaded directly and more efficiently from the created `CSV` file by running the `readingAltimeterData` function as follows:

In [None]:
wa.readingAltimeterData(saveCSV = 'altimeterData.csv')
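
The choice between the two functions can be automated by checking whether the `CSV` file already exists. The sketch below shows the pattern with stand-in callables so that it runs without **RADWave** or network access. The helper `load_or_process` and the two `fake_*` functions are hypothetical:

```python
import os
import tempfile


def load_or_process(csv_path, process, read):
    """Call read(csv_path) if the file exists, otherwise process(csv_path)."""
    if os.path.exists(csv_path):
        return read(csv_path)
    return process(csv_path)


# Stand-ins for the slow remote query and the fast local read
def fake_process(path):
    with open(path, "w") as f:
        f.write("lon,lat,wh\n153.0,-35.0,2.1\n")
    return "processed"


def fake_read(path):
    return "read"


with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "altimeterData.csv")
    first = load_or_process(csv_path, fake_process, fake_read)   # file missing
    second = load_or_process(csv_path, fake_process, fake_read)  # file now exists

print(first, second)
```

In this notebook, `process` could be `lambda p: wa.processingAltimeterData(altimeter_pick='all', saveCSV=p)` and `read` could be `lambda p: wa.readingAltimeterData(saveCSV=p)`.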

Once the dataset has been queried, we can plot the altimeter data points on a map using the `visualiseData` function.

This function **plots** and **saves** in a figure the geographical coordinates of processed altimeter data.

In [None]:
wa.visualiseData(title="Altimeter data tracks", extent=[149.,158.,-38.,-32.], 
                 addcity=['Sydney', 151.2093, -33.8688], markersize=40, zoom=8,
                 fsize=(8, 7), fsave='altimeterdata')

# Computing wave regime for specified location


To perform wave analysis and compute the wave parameters discussed in the [documentation](https://radwave.readthedocs.io/en/latest/method.html#), we run the `timeSeries` function.

This function computes time series of wave characteristics from the available altimeter data, namely the significant wave height and the wind speed.

It computes both **instantaneous** and **monthly** wave variables:

+ significant wave height (m) - wh & wh_rolling
+ wave period (s)  - period & period_rolling
+ wave energy flux (kW/m)  - power & power_rolling
+ wave average energy density (J/m2)  - energy & energy_rolling
+ wave group velocity (m/s)  - speed & speed_rolling
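
For orientation, the last three quantities are related to wave height and period through standard deep-water linear wave theory. The sketch below uses those textbook relations with assumed constants (a sea-water density of 1025 kg/m3 and g = 9.81 m/s2). It is not taken from the **RADWave** source, whose exact formulation (including how the period is estimated from altimeter measurements) is described in the method documentation:

```python
import math

RHO = 1025.0  # assumed sea-water density (kg/m3)
G = 9.81      # gravitational acceleration (m/s2)


def energy_density(hs):
    """Average wave energy density (J/m2) from significant wave height (m)."""
    return RHO * G * hs ** 2 / 16.0


def group_velocity(period):
    """Deep-water group velocity (m/s) from wave period (s)."""
    return G * period / (4.0 * math.pi)


def energy_flux(hs, period):
    """Wave energy flux (kW/m): energy density times group velocity."""
    return energy_density(hs) * group_velocity(period) / 1000.0


# Made-up sea state: 2 m significant wave height, 8 s period
hs, period = 2.0, 8.0
print("energy (J/m2):", round(energy_density(hs), 1))
print("speed (m/s):", round(group_velocity(period), 2))
print("power (kW/m):", round(energy_flux(hs, period), 1))
```

The result is close to the common rule of thumb of roughly half of `hs**2 * period` in kW/m.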

In [None]:
wa.timeSeries()

The class `waveAnalysis` stores a Pandas dataframe (called `timeseries`) of computed wave parameters that can be subsequently used to perform additional analysis.

To visualise this dataframe, one can do:

In [None]:
display(wa.timeseries)

and to list the header names:

In [None]:
list(wa.timeseries)

## Plotting time series

We can now plot time series of **RADWave** calculated wave parameters. This is done by calling the `plotTimeSeries` function. 

Among the available options (the complete list is available in the [API](https://radwave.readthedocs.io/en/latest/RADWave.html#RADWave.altiwave.waveAnalysis.plotTimeSeries)), one can:
+ restrict the plot to a specific temporal extent with the keyword `time`, which provides the range of years for the time series.
+ define the wave parameter to visualise using the keyword `series`, which takes one of the following choices: 'H', 'T', 'P', 'E' and 'Cg'.
            
In addition to the time series, the function provides additional information:  
 
+ Maximum parameter value
+ Mean parameter value
+ Median parameter value
+ 95th percentile parameter value
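
These four statistics can also be computed by hand for any series. The sketch below uses made-up wave heights and a percentile based on linear interpolation between sorted values. Whether `plotTimeSeries` uses the same interpolation rule is an assumption, and the helper names are hypothetical:

```python
import statistics


def percentile(values, q):
    """Percentile q (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    weight = rank - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


def summarise(values):
    """Maximum, mean, median and 95th percentile of a series."""
    return {
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "p95": percentile(values, 95),
    }


# Made-up significant wave heights (m)
hs = [1.2, 1.8, 2.4, 0.9, 3.1, 2.0, 1.5, 2.7, 1.1, 4.0]
print(summarise(hs))
```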

In [None]:
# Significant wave height
wa.plotTimeSeries(series='H', fsize=(12, 5), fsave='seriesH')

# Wave period
wa.plotTimeSeries(time=[1995,2016], series='T', fsize=(10, 5), fsave=None)

# Wave power
wa.plotTimeSeries(time=[1995,2016], series='P', fsize=(10, 5), fsave=None)

# Wave energy
wa.plotTimeSeries(time=[1995,2016], series='E', fsize=(10, 5), fsave=None)

# Wave group velocity
wa.plotTimeSeries(time=[1995,2016], series='Cg', fsize=(10, 5), fsave=None)

# Processing wave seasonality trends

In addition to time series, one can analyse the seasonal characteristics of each parameter computed from the altimeter dataset. 

For a specified time interval and geographical extent, the `seriesSeasonMonth` function computes the monthly seasonality of a specific wave variable (the options for the `series` keyword are: wh, period, power, energy and speed).

The monthly averaged values obtained are stored and returned as a `Pandas` dataframe.

> The user has the option to plot the computed wave parameter characteristics as a heatmap, a box plot and a standard deviation graph.

For the wave height series, a **Seasonal Mann-Kendall** test is also performed to determine monotonic trends in the computed dataset using the package from Hussain & Mahmud (2019).

Hussain & Mahmud, 2019: pyMannKendall: a python package for non parametric Mann Kendall family of trend tests - JOSS, 4(39), 1556.
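
To give an idea of what the seasonal test measures, the sketch below computes the Seasonal Mann-Kendall score `S` and its normalised statistic `Z` by summing the classical Mann-Kendall score over each calendar month. It is a simplified illustration, not the pyMannKendall implementation: it assumes a complete monthly record starting in January, at least two years of data, and no tied values (so no tie correction is applied). The function names are hypothetical:

```python
import math


def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def mk_score(series):
    """Mann-Kendall score: sum of signs over all ordered pairs."""
    n = len(series)
    return sum(sign(series[j] - series[i])
               for i in range(n - 1) for j in range(i + 1, n))


def seasonal_mann_kendall(values, period=12):
    """Seasonal Mann-Kendall score S and statistic Z (no tie correction)."""
    s_total = 0
    var_total = 0.0
    for season in range(period):
        sub = values[season::period]      # same calendar month, every year
        n = len(sub)
        s_total += mk_score(sub)
        var_total += n * (n - 1) * (2 * n + 5) / 18.0
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    return s_total, z


# Made-up record: 5 years of monthly wave heights with a steady increase
monthly_wh = [1.5 + 0.01 * k for k in range(60)]
s, z = seasonal_mann_kendall(monthly_wh)
print("S =", s, " Z =", round(z, 2))
```

A large positive `Z` indicates an increasing trend and a large negative `Z` a decreasing one. The p-value would follow from the standard normal distribution.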

A full explanation on the available options for the `seriesSeasonMonth` function is provided in the [API](https://radwave.readthedocs.io/en/latest/RADWave.html#RADWave.altiwave.waveAnalysis.seriesSeasonMonth).

In [None]:
wh_all = wa.seriesSeasonMonth(series='wh', time=[1998,2018], lonlat=None, fsave='whall', plot=True)

As mentioned above, the function `seriesSeasonMonth` returns a **Pandas dataframe** containing the mean monthly values of the specified wave series for the considered time interval.

This information can be displayed with:

In [None]:
display(wh_all)

Below we provide an example of how this function can be used to process seasonality for different geographical extents.

In [None]:
# First we create a list of 1 by 1 degree tiles within our regional area of interest
tiles = []
tiles.append([152.0,153.0,-36.0,-35.0])
tiles.append([153.0,154.0,-36.0,-35.0])
tiles.append([152.0,153.0,-35.0,-34.0])
tiles.append([153.0,154.0,-35.0,-34.0])

# We also store the geographical locations of the center of each tile 
lonlat = []
lonlat.append([152.5,-35.5])
lonlat.append([153.5,-35.5])
lonlat.append([152.5,-34.5])
lonlat.append([153.5,-34.5])


# And we define a new list that will be filled with regional wave seasonality
seasons = []

# Finally we loop over the defined tiles and perform seasonality analysis for significant wave height
for k in range(4):
    seasons.append(wa.seriesSeasonMonth(series='wh', time=[1998,2018], 
                                        lonlat=tiles[k], plot=False))

This can then be used to plot the annual mean values of significant wave height for each tile over the temporal range of interest... 

In [None]:
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
from matplotlib.transforms import offset_copy
from mpl_toolkits.axes_grid1 import make_axes_locatable
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()

plt.rcParams['mathtext.fontset'] = 'cm'

fig, ax = plt.subplots(figsize = (8,4))

for k in range(4):
    yearwh = seasons[k]['mean']
    yearwh.plot(ax=ax, marker='o', linestyle='dashed', linewidth=2, markersize=8)
    
ax.set_title('Annual value for each tile',fontsize = 12)
ax.set_ylabel("Hs (m)",fontsize = 12)
ax.set_xlabel('Years',fontsize = 11)
ax.legend([lonlat[0],lonlat[1],lonlat[2],lonlat[3]])
ax.yaxis.set_tick_params(labelsize=10)
ax.xaxis.set_tick_params(labelsize=10, rotation=45)
plt.tight_layout()
plt.show()

# 20-year analysis of the impact of climate trends

Oscillations in atmospheric patterns are known to alter regional weather conditions and associated trends in wave climate [Godoi et al., 2016].

Here we illustrate how the results obtained with **RADWave** can be used to investigate how climate patterns may affect wave parameters.

For the sake of the demonstration, we will focus our analysis on the following indices:

+ SOI - Southern Oscillation Index / [information](http://www.bom.gov.au/climate/enso/history/ln-2010-12/SOI-what.shtml)
+ AOI - Antarctic Oscillation Index / [information](https://www.cpc.ncep.noaa.gov/products/precip/CWlink/daily_ao_index/aao/aao_index.html)
+ SAMI - Southern Annular Mode Index / [information](http://www.bom.gov.au/climate/sam/)

We first load the data associated with each index using `Pandas` functionalities.

+ Godoi, V.A., Bryan, K.R. and Gorman, R.M., 2016. [Regional influence of climate patterns on the wave climate of the southwestern Pacific: The New Zealand region](https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1002/2015JC011572). Journal of Geophysical Research: Oceans, 121(6), pp.4056-4076.

In [None]:
# Defining the 20-year timeframe
time = [1998,2018] 

Monthly means of the SOI, AOI & SAMI indices are sourced from the Australian Bureau of Meteorology (**BOM**), the National Oceanic and Atmospheric Administration (**NOAA**) and the British Antarctic Survey of the Natural Environment Research Council (**NERC**), respectively.

For each climate index, the anomalies are computed by subtracting the overall mean from the monthly means.

Then, the same is done for the wave parameters in order to investigate how they are modulated by the climate modes.
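
The anomaly calculation itself is a one-line operation. A minimal sketch with made-up monthly values (the function name `anomalies` is hypothetical):

```python
def anomalies(monthly_means):
    """Subtract the overall mean from each monthly mean."""
    overall_mean = sum(monthly_means) / len(monthly_means)
    return [value - overall_mean for value in monthly_means]


# Made-up monthly index values
soi_like = [4.0, -2.0, 7.0, -5.0, 1.0]
print(anomalies(soi_like))
```

By construction, the anomalies sum to zero over the interval used to compute the mean.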


## SOI - Southern Oscillation Index

In [None]:
import io
import scipy.stats
import requests
import numpy as np 
import pandas as pd

# Dataset URL
url = "http://www.bom.gov.au/climate/enso/soi_monthly.txt"

names = [
    "Date",
    "Value"
]


# Using Pandas to load the file content
soi = requests.get(url).content
soi_data = pd.read_csv(io.StringIO(soi.decode('utf-8')), names=names,skiprows=1)

# Define year and month of each record
soi_data['year'] = soi_data['Date'] // 100
soi_data['month'] = soi_data['Date'] % 100 

# Extract the information for the specified time interval 
soi_df = soi_data.drop(soi_data[soi_data.year < time[0]].index)
soi_df = soi_df.drop(soi_df[soi_df.year > time[1]].index)

# Calculate the 20-year mean
soi_mean = soi_df['Value'].mean()

# Compute and store the anomalies in the dataframe
soi_df['anomaly'] = soi_df['Value']-soi_mean

soi_df["day"] = np.ones(len(soi_df["Value"]),dtype=int)
soi_df['time'] = pd.to_datetime(soi_df[['year','month','day']])

## AOI - Antarctic Oscillation Index

In [None]:
# Dataset URL
url = "https://www.cpc.ncep.noaa.gov/products/precip/CWlink/daily_ao_index/aao/monthly.aao.index.b79.current.ascii"

# Using Pandas to load the file content
aoi = requests.get(url).content
aoi_data = pd.read_csv(io.StringIO(aoi.decode('utf-8')),delimiter=r"\s+",header=None)

# Rename columns to fit with RADWave dataframe
aoi_data = aoi_data.rename(columns={0:"year", 1:"month", 2:"Value"})

# Extract the information for the specified time interval 
aoi_df = aoi_data.drop(aoi_data[aoi_data.year < time[0]].index)
aoi_df = aoi_df.drop(aoi_df[aoi_df.year > time[1]].index)

# Calculate the 20-year mean
aoi_mean = aoi_df['Value'].mean()

# Compute and store the anomalies in the dataframe
aoi_df['anomaly'] = aoi_df['Value']-aoi_mean

## SAMI - Southern Annular Mode Index

In [None]:
# Dataset URL
url = "http://www.nerc-bas.ac.uk/public/icd/gjma/newsam.1957.2007.txt"

# Using Pandas to load the file content
sam = requests.get(url).content
sam_data = pd.read_csv(io.StringIO(sam.decode('utf-8')),delimiter=r"\s+")

# Rename month values to fit with RADWave dataframe
sam_data = sam_data.rename(columns={"JAN":1, "FEB":2,
                      "MAR":3, "APR":4,
                      "MAY":5, "JUN":6,
                      "JUL":7, "AUG":8,
                      "SEP": 9, "OCT":10,
                      "NOV":11, "DEC":12})

# Rename columns to fit with RADWave dataframe
sam_data = sam_data.stack().reset_index()
sam_data = sam_data.rename(columns={"level_0":"year", "level_1":"month", 0:"Value"})


# Extract the information for the specified time interval 
sam_df = sam_data.drop(sam_data[sam_data.year < time[0]].index)
sam_df = sam_df.drop(sam_df[sam_df.year > time[1]].index)

# Calculate the 20-year mean
sam_mean = sam_df['Value'].mean()

# Compute and store the anomalies in the dataframe
sam_df['anomaly'] = sam_df['Value']-sam_mean

## RADWave significant wave height and wave period anomalies

### Significant wave height

In [None]:
# Get monthly significant wave height 
wh_data = wa.timeseries.groupby(['year', 'month'])[['wh']].mean().reset_index()

# Extract the information for the specified time interval 
wh_df = wh_data.drop(wh_data[wh_data.year < time[0]].index)
wh_df = wh_df.drop(wh_df[wh_df.year > time[1]].index)

# Calculate the 20-year mean
wh_mean = wh_df['wh'].mean()

# Compute and store the anomalies in the dataframe
wh_df['anomaly'] = wh_df['wh']-wh_mean

### Wave period

In [None]:
# Get monthly mean wave period 
T_data = wa.timeseries.groupby(['year', 'month'])[['period']].mean().reset_index()

# Extract the information for the specified time interval 
T_df = T_data.drop(T_data[T_data.year < time[0]].index)
T_df = T_df.drop(T_df[T_df.year > time[1]].index)

# Calculate the 20-year mean
T_mean = T_df['period'].mean()

# Compute and store the anomalies in the dataframe
T_df['anomaly'] = T_df['period']-T_mean

## Correlations

Monthly mean anomalies of significant wave height and wave period can be correlated with the monthly mean anomaly time series of the SOI, AOI & SAMI indices by computing the **Pearson’s correlation coefficient** (R) for the region of interest.

We use the [scipy.stats.pearsonr](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pearsonr.html) function to make this calculation. This function returns 2 values:

+ r: Pearson’s correlation coefficient.
+ p: Two-tailed p-value.
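
For reference, the coefficient `r` is the covariance of the two series divided by the product of their standard deviations. The sketch below implements that definition in plain Python for illustration only. It does not compute the p-value, and the function name `pearson_r` is hypothetical:

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y):
        raise ValueError("both series must have the same length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up series: perfectly correlated, then perfectly anti-correlated
x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(pearson_r(x, [2.0, 4.0, 6.0, 8.0, 10.0]))
print(pearson_r(x, [10.0, 8.0, 6.0, 4.0, 2.0]))
```

Values close to 1 or -1 indicate a strong linear relationship, while values close to 0 indicate little or none.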

> Examples of **Pearson’s correlation coefficient** calculation between climate indices and the significant wave height are provided below:

In [None]:
# Pearson correlation between significant wave height and SOI
monthly_wh_soi = scipy.stats.pearsonr(soi_df['anomaly'],wh_df['anomaly']) 
print('+ Pearson correlation between significant wave height and SOI:',monthly_wh_soi[0],'\n')

# Pearson correlation between significant wave height and AOI
monthly_wh_aoi = scipy.stats.pearsonr(aoi_df['anomaly'],wh_df['anomaly']) 
print('+ Pearson correlation between significant wave height and AOI:',monthly_wh_aoi[0],'\n')

# Pearson correlation between significant wave height and SAMI
monthly_wh_sam = scipy.stats.pearsonr(sam_df['anomaly'],wh_df['anomaly']) 
print('+ Pearson correlation between significant wave height and SAMI:',monthly_wh_sam[0],'\n')

In [None]:
# Pearson correlation between wave period and SOI
monthly_tp_soi = scipy.stats.pearsonr(soi_df['anomaly'],T_df['anomaly']) 
print('+ Pearson correlation between wave period and SOI:',monthly_tp_soi[0],'\n')

# Pearson correlation between wave period and AOI
monthly_tp_aoi = scipy.stats.pearsonr(aoi_df['anomaly'],T_df['anomaly']) 
print('+ Pearson correlation between wave period and AOI:',monthly_tp_aoi[0],'\n')

# Pearson correlation between wave period and SAMI
monthly_tp_sam = scipy.stats.pearsonr(sam_df['anomaly'],T_df['anomaly']) 
print('+ Pearson correlation between wave period and SAMI:',monthly_tp_sam[0],'\n')
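
`scipy.stats.pearsonr` requires both inputs to have the same length, so the calls above implicitly assume that every month of the interval is present in both the index and the wave dataframes. If some months had no altimeter record, the series would need to be aligned first. A minimal sketch of that step, using plain dictionaries keyed by `(year, month)` with made-up values (the helper name `align_on_month` is hypothetical):

```python
def align_on_month(series_a, series_b):
    """Keep only the (year, month) keys present in both series, in time order."""
    common = sorted(set(series_a) & set(series_b))
    return [series_a[k] for k in common], [series_b[k] for k in common]


# Made-up anomalies; February 1998 is missing from the wave record
index_anomaly = {(1998, 1): 0.5, (1998, 2): -0.2, (1998, 3): 0.1, (1998, 4): 0.4}
wave_anomaly = {(1998, 1): 0.3, (1998, 3): -0.1, (1998, 4): 0.2}

a, b = align_on_month(index_anomaly, wave_anomaly)
print(a)
print(b)
```

With the dataframes used in this notebook, the same alignment could be obtained by merging on the shared columns, for example `soi_df.merge(wh_df, on=['year', 'month'], suffixes=('_soi', '_wh'))`, and then correlating the two resulting anomaly columns.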