# Tutorial: Climate Networks

The objective of this tutorial is to introduce climate networks and to explain and illustrate their application with the __pyunicorn__ package. It first gives some theoretical background on climate networks in general and then illustrates some of the methods provided by `pyunicorn.climate.ClimateNetwork`. An introduction to coupled climate networks and their application will follow. For a detailed discussion and further references, please consult __[Donges et al., 2015](https://aip.scitation.org/doi/10.1063/1.4934554)__, on which this tutorial is based.

## Introduction

_Climate networks (CN)_ are a way to apply complex network theory to the climate system, by assuming that each node represents a varying dynamical system. Of interest are then the collective behaviour of these interacting dynamical systems and the structure of the resulting network. This approach was first introduced by __[Tsonis and Roebber, 2004](https://www.sciencedirect.com/science/article/abs/pii/S0378437103009646)__.

Climate network analysis is a versatile approach for investigating climatological data and can be used as a complementary method to classical techniques from multivariate statistics. The approach allows for the analysis of single fields of climatological time series, e.g. surface air temperature observed on a grid, or even two or more fields. It has been successfully applied in many cases, for example to the dynamics and predictability of the El Niño phenomenon \[__[Radebach et al., 2013](https://arxiv.org/abs/1310.5494)__\].

## Theory of Climate Networks (CN)

Climate networks (class `climate.ClimateNetwork`) are a typical application of _functional networks_, which make it possible to study the dynamical relationships between subsystems of a high-dimensional complex system by constructing networks from it. The package provides classes for the construction and analysis of such networks, representing the statistical interdependency structure within and between fields of time series using various similarity measures.

### Coupling Analysis

Climate networks represent strong statistical interrelationships between time series of climatological fields. These interrelationships can be estimated with methods from the `timeseries.CouplingAnalysis` class in terms of matrices of _statistical similarities_ $\textbf{S}$, such as the _(lagged) classical linear Pearson product-moment correlation coefficient_ (CC).

The CC of two zero-mean time series $X$ and $Y$, implemented in `CouplingAnalysis.cross_correlation`, is given by

$$\rho_{XY}(\tau)=\frac{\langle X_{t-\tau}, Y_t \rangle}{\sigma_X \sigma_Y}$$

which depends on the covariance $\langle X_{t-\tau}, Y_t \rangle$ and the standard deviations $\sigma_X, \sigma_Y$. Lags $\tau > 0$ correspond to the linear association of past values of $X$ with $Y$, and vice versa for $\tau < 0$.
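
As a rough illustration of the formula above, the lagged correlation can be sketched in plain NumPy. This is not the implementation behind `CouplingAnalysis.cross_correlation`; the function name `lagged_correlation` and the synthetic data are assumptions made for this example, and the sketch standardises the overlapping segments instead of assuming zero-mean input.

```python
import numpy as np


def lagged_correlation(x, y, tau):
    """Pearson correlation between X_{t - tau} and Y_t (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if tau > 0:
        # Pair past values of x with current values of y
        x_seg, y_seg = x[:-tau], y[tau:]
    elif tau < 0:
        # Pair past values of y with current values of x
        x_seg, y_seg = x[-tau:], y[:tau]
    else:
        x_seg, y_seg = x, y
    # Standardise the overlapping segments, then average their product
    x_seg = (x_seg - x_seg.mean()) / x_seg.std()
    y_seg = (y_seg - y_seg.mean()) / y_seg.std()
    return float(np.mean(x_seg * y_seg))


# Hypothetical data: y is a copy of x delayed by three time steps
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = np.roll(x, 3)

rho_lag3 = lagged_correlation(x, y, 3)  # x leads y by exactly this lag
rho_lag0 = lagged_correlation(x, y, 0)  # no lag: white noise decorrelates
print(rho_lag3, rho_lag0)
```

In this toy setting the correlation at $\tau = 3$ is essentially perfect, while the zero-lag value stays close to zero, which is the behaviour the sign convention for $\tau$ describes.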

#### Similarity Measures for Climate Networks

By thresholding the matrix of a statistical similarity measure $\textbf{S}$, e.g. one based on the CC from above, the strong interrelationships between the time series can be represented as the adjacency matrix $\textbf{A}$ of a climate network:

$$A_{pq} = \Theta(S_{pq}-\beta), \text{ if } p \neq q$$

and 0 otherwise. $\Theta$ is the Heaviside function, $\beta$ denotes a threshold parameter and $A_{pp} = 0$ is set for all nodes $p$ to exclude self-loops. 
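
The thresholding step can be sketched in a few lines of NumPy. The helper name `adjacency_from_similarity` and the small matrix are hypothetical, and the convention $\Theta(0) = 0$ is a choice made for this example; the code is meant to illustrate the equation, not to reproduce pyunicorn's internals.

```python
import numpy as np


def adjacency_from_similarity(S, beta):
    """A_pq = Theta(S_pq - beta) for p != q, and A_pp = 0 (sketch)."""
    S = np.asarray(S, dtype=float)
    # Heaviside step, using Theta(0) = 0 as an assumed convention
    A = (S > beta).astype(int)
    # Exclude self-loops
    np.fill_diagonal(A, 0)
    return A


# Hypothetical similarity matrix for three nodes
S = np.array([[1.0, 0.8, 0.2],
              [0.8, 1.0, 0.6],
              [0.2, 0.6, 1.0]])

A = adjacency_from_similarity(S, beta=0.5)
print(A)
```

With $\beta = 0.5$, only the pairs with similarities 0.8 and 0.6 are linked, and the diagonal is zero.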

A climate network that is reconstructed using the Pearson correlation from above is called a _Pearson correlation climate network_.

## Constructing CN with pyunicorn

After establishing some basic theoretical background, we can use pyunicorn to try out some tools for climate networks. First, download the data set following this __[link](https://psl.noaa.gov/repository/entry/show?entryid=synth%3Ae570c8f9-ec09-4e89-93b4-babd5651e7a9%3AL25jZXAucmVhbmFseXNpcy5kZXJpdmVkL3N1cmZhY2UvYWlyLm1vbi5tZWFuLm5j)__ and copy it to the directory "notebooks" of this script, or change the path below.

In [1]:
DATA_FILENAME = "./air.mon.mean.nc"

Now we will start with some imports and some specifications regarding the data set.

In [18]:
import numpy as np
from pyunicorn import climate
from matplotlib import pyplot as plt

In [3]:
FILE_TYPE = "NetCDF"
#  Type of data file ("NetCDF" indicates a NetCDF file with data on a regular
#  lat-lon grid, "iNetCDF" allows for arbitrary grids - > see documentation).
#  For example, the "NetCDF" FILE_TYPE is compatible with data from the IPCC
#  AR4 model ensemble or the reanalysis data provided by NCEP/NCAR.

In [4]:
#  Indicate data source (optional)
DATA_SOURCE = "ncep_ncar_reanalysis"

In [5]:
#  Name of observable in NetCDF file ("air" indicates surface air temperature
#  in NCEP/NCAR reanalysis data)
OBSERVABLE_NAME = "air"

In [6]:
#  Select a subset in time and space from the data (e.g., a particular region
#  or a particular time window, or both)
WINDOW = {"time_min": 0., "time_max": 0., "lat_min": 0, "lon_min": 0,
          "lat_max": 30, "lon_max": 0}  # selects the whole data set

In [7]:
#  Indicate the length of the annual cycle in the data (e.g., 12 for monthly
#  data). This is used for calculating climatological anomaly values
#  correctly.
TIME_CYCLE = 12
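
To make the role of `TIME_CYCLE` more concrete, the following sketch shows one common way to compute climatological anomalies: the long-term mean of each phase of the cycle (e.g. all Januaries) is subtracted from the samples at that phase. The function `climatological_anomalies` and the synthetic series are assumptions for illustration; pyunicorn's own anomaly routine may differ in detail.

```python
import numpy as np


def climatological_anomalies(series, time_cycle):
    """Subtract the long-term mean of each phase of the cycle (sketch)."""
    series = np.asarray(series, dtype=float)
    anomalies = np.empty_like(series)
    for phase in range(time_cycle):
        # All samples at the same position in the cycle, e.g. all Januaries
        same_phase = series[phase::time_cycle]
        anomalies[phase::time_cycle] = same_phase - same_phase.mean(axis=0)
    return anomalies


# Hypothetical monthly series: four years of a pure seasonal cycle
months = np.arange(48)
temperature = 15.0 + 10.0 * np.sin(2 * np.pi * months / 12)

anomalies = climatological_anomalies(temperature, time_cycle=12)
max_abs_anomaly = float(np.abs(anomalies).max())
print(max_abs_anomaly)
```

Because the toy series contains nothing but the annual cycle, its anomalies vanish up to rounding error. With a wrong cycle length, the seasonal signal would leak into the anomalies and dominate the correlations.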

Now we set some values related to the climate network construction, the first being the threshold $\beta$ from above.

In [8]:
#  For setting fixed threshold
THRESHOLD = 0.5

#  For setting fixed link density
LINK_DENSITY = 0.005

#  Indicates whether to use only data from winter months (DJF) for calculating
#  correlations
WINTER_ONLY = False
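
A fixed link density can be understood as choosing the threshold $\beta$ from the data instead of fixing it in advance. The sketch below picks $\beta$ as a quantile of the pairwise similarities; the helper `threshold_for_link_density` and the random matrix are hypothetical, and it is an assumption that this matches how pyunicorn derives the threshold internally.

```python
import numpy as np


def threshold_for_link_density(S, link_density):
    """Threshold exceeded by roughly `link_density` of all node pairs (sketch)."""
    S = np.asarray(S, dtype=float)
    # Take each unordered pair p < q exactly once
    pair_values = S[np.triu_indices_from(S, k=1)]
    return float(np.quantile(pair_values, 1.0 - link_density))


# Hypothetical symmetric similarity matrix for 50 nodes
rng = np.random.default_rng(1)
M = rng.uniform(-1.0, 1.0, size=(50, 50))
S = (M + M.T) / 2.0
np.fill_diagonal(S, 1.0)

beta = threshold_for_link_density(S, link_density=0.1)
A = (S > beta).astype(int)
np.fill_diagonal(A, 0)

# Fraction of ordered node pairs (p != q) that are linked
achieved_density = A.sum() / (A.shape[0] * (A.shape[0] - 1))
print(beta, achieved_density)
```

The achieved density only approximates the requested value, because the number of node pairs is finite and ties in the similarity values can shift the count.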

Now we create a ClimateData object containing our data and then print the information.

In [9]:
data = climate.ClimateData.Load(
    file_name=DATA_FILENAME, observable_name=OBSERVABLE_NAME,
    data_source=DATA_SOURCE, file_type=FILE_TYPE,
    window=WINDOW, time_cycle=TIME_CYCLE)

#  Print some information on the data set
print(data)

Reading NetCDF File and converting data to NumPy array...
File format: NETCDF4_CLASSIC
Global attributes:
description: Data from NCEP initialized reanalysis (4x/day).  These are the 0.9950 sigma level values
platform: Model
Conventions: COARDS
NCO: 20121012
history: Thu May  4 20:11:16 2000: ncrcat -d time,0,623 /Datasets/ncep.reanalysis.derived/surface/air.mon.mean.nc air.mon.mean.nc
Thu May  4 18:11:50 2000: ncrcat -d time,0,622 /Datasets/ncep.reanalysis.derived/surface/air.mon.mean.nc ./surface/air.mon.mean.nc
Mon Jul  5 23:47:18 1999: ncrcat ./air.mon.mean.nc /Datasets/ncep.reanalysis.derived/surface/air.mon.mean.nc /dm/dmwork/nmc.rean.ingest/combinedMMs/surface/air.mon.mean.nc
/home/hoop/crdc/cpreanjuke2farm/cpreanjuke2farm Mon Oct 23 21:04:20 1995 from air.sfc.gauss.85.nc
created 95/03/13 by Hoop (netCDF2.3)
Converted to chunked, deflated non-packed NetCDF4 2014/09
title: monthly mean air.sig995 from the NCEP Reanalysis
dataset_title: NCEP-NCAR Reanalysis 1
References: http://www

Now we create a climate network based on Pearson correlation without lag and with fixed threshold.

In [10]:
net = climate.TsonisClimateNetwork(
    data, threshold=THRESHOLD, winter_only=WINTER_ONLY)

Generating a Tsonis climate network...
Calculating daily (monthly) anomaly values...
Calculating correlation matrix at zero lag from anomaly values...
Extracting network adjacency matrix by thresholding...
Setting area weights according to type surface ...
Setting area weights according to type surface ...


Alternatively, several similarity measures and construction mechanisms may be chosen here.

In [12]:
#  Create a climate network based on Pearson correlation without lag and with
#  fixed link density
# net = climate.TsonisClimateNetwork(
#     data, link_density=LINK_DENSITY, winter_only=WINTER_ONLY)

#  Create a climate network based on Spearman's rank order correlation without
#  lag and with fixed threshold
# net = climate.SpearmanClimateNetwork(
#     data, threshold=THRESHOLD, winter_only=WINTER_ONLY)

#  Create a climate network based on mutual information without lag and with
#  fixed threshold
# net = climate.MutualInfoClimateNetwork(
#     data, threshold=THRESHOLD, winter_only=WINTER_ONLY)
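
To illustrate how a rank-based similarity measure differs from the Pearson correlation, the following sketch computes Spearman's rank correlation as the Pearson correlation of the ranks. The function `spearman_correlation` is a hypothetical helper that assumes no tied values; it is not the code used by `climate.SpearmanClimateNetwork`.

```python
import numpy as np


def spearman_correlation(x, y):
    """Pearson correlation of the ranks of x and y (sketch, assumes no ties)."""
    # A double argsort turns values into ranks 0, 1, 2, ...
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


# Hypothetical monotonic but strongly nonlinear relationship
x = np.linspace(0.1, 5.0, 50)
y = np.exp(x)

pearson = float(np.corrcoef(x, y)[0, 1])
spearman = spearman_correlation(x, y)
print(pearson, spearman)
```

For this monotonic relationship the rank correlation is perfect, whereas the Pearson coefficient is noticeably smaller because it only captures the linear part of the association.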

We finish by doing some calculations and saving them to text files.

In [11]:
print("Link density:", net.link_density)

#  Get degree
degree = net.degree()
#  Get closeness
closeness = net.closeness()
#  Get betweenness
betweenness = net.betweenness()
#  Get local clustering coefficient
clustering = net.local_clustering()
#  Get average link distance
ald = net.average_link_distance()
#  Get maximum link distance
mld = net.max_link_distance()

#
#  Save results to text file
#

#  Save the grid (mainly vertex coordinates) to text files
data.grid.save_txt(filename="grid.txt")

#  Save the degree sequence. Other measures may be saved similarly.
np.savetxt("degree.txt", degree)

Link density: 0.025814135861437906
Calculating closeness...
Calculating node betweenness...
Calculating local clustering coefficients...
Calculating average link distance...
Calculating angular great circle distance using Cython...
Calculating maximum link distance...
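
For intuition about the simplest of these measures, degree and link density can be read directly off an adjacency matrix. The toy matrix and variable names below are made up for this sketch, which is independent of the `net` object above and ignores any area weighting.

```python
import numpy as np

# Hypothetical undirected network with four nodes and four links
toy_adjacency = np.array([[0, 1, 1, 0],
                          [1, 0, 1, 0],
                          [1, 1, 0, 1],
                          [0, 0, 1, 0]])

n_nodes = toy_adjacency.shape[0]

# Degree: number of links attached to each node
toy_degree = toy_adjacency.sum(axis=1)

# Each undirected link appears twice in a symmetric adjacency matrix
n_links = toy_adjacency.sum() // 2

# Link density: realised links divided by the number of possible node pairs
toy_link_density = n_links / (n_nodes * (n_nodes - 1) / 2)

print(toy_degree, n_links, toy_link_density)
```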


### Plotting

`pyunicorn` provides basic plotting functionality based on the cartopy package and matplotlib, which can be used for a first look at the generated data. Plotting with the `pyNGL` package is still supported but not recommended, since `pyNGL` is deprecated and its development has been halted in favor of the cartopy project. An older tutorial on plotting with `pyNGL` in pyunicorn can be found in `examples\tutorials\climate_networks.py`.

#### Cartopy

For more information on cartopy and how to install it, please see the project webpage: https://scitools.org.uk/cartopy/docs/latest/

*Copyright: Cartopy. Met Office. git@github.com:SciTools/cartopy.git.* 

We start by creating an instance of the plot class, which we can later modify by accessing its axes.

In [12]:
# create a Cartopy plot instance called cn_plot (cn for climate network)
# from the data with title DATA_SOURCE
cn_plot = climate.CartopyPlots(data.grid, DATA_SOURCE)

Created plot class.


Now we add the network measures that we want to plot via the `.add_dataset()` method, which takes a title and a network measure. The title is also used as the name of the plot that will be saved.

In [13]:
# Add network measures to the plotting queue
cn_plot.add_dataset("Degree", degree)
cn_plot.add_dataset("Closeness", closeness)
cn_plot.add_dataset("Betweenness (log10)", np.log10(betweenness + 1))
cn_plot.add_dataset("Clustering", clustering)
cn_plot.add_dataset("Average link distance", ald)
cn_plot.add_dataset("Maximum link distance", mld)

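Before plotting, we can adjust the appearance of the plots through `matplotlib`, on which cartopy is based. For example, we can set the default colormap via the `pyplot` interface.

In [19]:
# Set the default colormap (plt.set_cmap returns None, so nothing is assigned)
plt.set_cmap('plasma')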

<Figure size 640x480 with 0 Axes>

Now we can generate the plots in the current directory.

In [24]:
# Plot with cartopy and matplotlib
cn_plot.generate_plots(file_name="climate_network_measures",
                       title_on=False, labels_on=True)

Created and saved plots @ current directory.
