# Practical for 'introduction to the NCAS CF Data Tools, cf-python and cf-plot'

<div class="alert alert-block alert-success">
<i>Practical instructions:</i> these green boxes provide instructions and tips about doing this practical (blue boxes are the same as in the teaching notebook and provide useful information). First of all, there is a copy of the context and learning objectives from the main/presented Notebook below - you are advised to re-read this as a reminder. There is also a copy of the final section from the main Notebook which provides links to further information - you might find the documentation links especially useful here, so they are highlighted in bold.
</div>

## A reminder: context, learning objectives and guidance links

### What are the NCAS CF Data Tools and why do they all have 'cf' in the name?

The _NCAS CF Data Tools_ are a suite of complementary Python libraries which are designed to facilitate working with data for research in the earth sciences and aligned domains. The two that are of most relevance to the average user, and those wanting to process, analyse and visualise atmospheric data, are *cf-python* (https://ncas-cms.github.io/cf-python/) and *cf-plot* (https://ncas-cms.github.io/cf-plot/build/). We will be focusing on use of cf-python and cf-plot today.

The 'cf' in the names of the NCAS CF Data Tools refers to the _CF Conventions_, a metadata standard. The tools are built around this standard through their use of the CF Data Model, which, alongside performance, is considered a 'unique selling point' of the tools.


### What are the CF Conventions?

The _CF Conventions_, usually referred to in this way but also known by the full name of the **C**limate and **F**orecast (CF) metadata conventions, are a metadata standard which is becoming the de facto convention for describing geoscientific data, so that sharing and intercomparison are simpler. See https://cfconventions.org/ for more information.


### What are we going to learn in this session?

Our **learning aim** is to be able to use the NCAS CF Data Tools Python libraries, namely cf-python and cf-plot, to process, analyse and visualise netCDF and PP datasets. Along the way we aim to appreciate the context and 'unique selling point' of the libraries: they are built to use the CF Conventions, a metadata standard for earth science data, and they work on top of a Data Model for CF, which makes it simpler to do what you want to do with the datasets.

We have **six distinct objectives**, matching the sections in this notebook and in the practical notebook you will work through. By the end of this lesson you should be familiar with, and have practised, using cf-python and cf-plot to:

1. read dataset(s) and view the (meta)data at different detail levels;
2. edit the (meta)data and write out the edited version to file;
3. reduce datasets by subspacing and collapsing;
4. visualise datasets as contour and vector plots;
5. analyse data: applying mathematical and statistical operations and plotting trends;
6. change the underlying grid of data through regridding.

### Guidance: where to find more information and resources on the NCAS CF Data Tools

Here are some links relating to the NCAS CF Data Tools and this training.

* This training, with further material, is hosted online and there are instructions for setting up the environment so you can work through it in your own time: https://github.com/NCAS-CMS/cf-tools-training.
* **The cf-python documentation lives at https://ncas-cms.github.io/cf-python/.**
* The cf-python code lives on GitHub at https://github.com/NCAS-CMS/cf-python. There is an Issue Tracker to report queries or questions at https://github.com/NCAS-CMS/cf-python/issues.
* **The cf-plot documentation lives at https://ncas-cms.github.io/cf-plot/build/.**
* The cf-plot code lives on GitHub at https://github.com/NCAS-CMS/cf-plot. There is an Issue Tracker to report queries or questions at https://github.com/NCAS-CMS/cf-plot/issues.
* There is a technical presentation about the NCAS CF Data Tools available from https://hps.vi4io.org/_media/events/2020/summer-school-cfnetcdf.pdf.
* The website of the CF Conventions can be found at https://cfconventions.org/.
* The landing page for training on the CF Conventions is found within the website above: https://cfconventions.org/Training/.

If you have any queries after this course, please either use the Issue Trackers linked above or email me at: sadie.bartholomew@ncas.ac.uk.

***

<div class="alert alert-block alert-success">
<i>Practical instructions:</i> run all of the cells in this section to do the set up.
</div>

## Setting up

**In this section we set up this Notebook, import the libraries and check the data we will work with, ready to use the libraries within this notebook.**

Run some set up for nice outputs in this Jupyter Notebook (not required in interactive Python or a script):

In [1]:
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

Import cf-python and cf-plot:

In [2]:
import cfplot as cfp
import cf

Inspect the versions of cf-python and cf-plot and the version of the CF Conventions those are matched to:

In [3]:
print("cf-python version is:", cf.__version__)
print("cf-plot version is:", cfp.__version__)
print("CF Conventions version is:", cf.CF())

cf-python version is: 3.17.0
cf-plot version is: 3.3.0
CF Conventions version is: 1.11


<div class="alert alert-block alert-info">
<i>Note:</i> you can work with data compliant with any other version of the CF Conventions, or without (much) compliance, but the CF Conventions version shown is the newest version whose features these versions of the tools understand.
</div>
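
Version labels such as 1.7 and 1.11 are not decimal numbers, so comparing them needs a little care: compared as text, 1.11 sorts before 1.7. The sketch below shows one possible way to check whether a dataset's CF version label (for example a `Conventions` value of `'CF-1.7'`) is no newer than the maximum version reported by `cf.CF()`. The helper names `parse_cf_version` and `is_understood` are hypothetical and not part of cf-python, and the code assumes a simple `CF-<major>.<minor>` label.

```python
def parse_cf_version(label):
    """Turn a label such as 'CF-1.7' or '1.11' into a tuple like (1, 7)."""
    text = label.strip()
    if text.upper().startswith("CF-"):
        text = text[3:]
    return tuple(int(part) for part in text.split("."))


def is_understood(dataset_conventions, tool_max_version):
    """Return True if the dataset's CF version is not newer than the tool's."""
    return parse_cf_version(dataset_conventions) <= parse_cf_version(tool_max_version)


# Example values: a 'CF-1.7' dataset and tools matched to CF 1.11.
print(parse_cf_version("CF-1.7"))        # (1, 7)
print(is_understood("CF-1.7", "1.11"))   # True
```

A `False` result would not mean the data cannot be read, only that some newer features might not be recognised.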

Finally, see what datasets we have to explore:

<div class="alert alert-block alert-info">
<i>Note:</i> in a Jupyter Notebook, '!' precedes a shell command, so this is a terminal command and not Python.
</div>

In [4]:
!ls ../ncas_data

aaaaoa.pmh8dec.pp			   precip_2010.nc
alpine_precip_DJF_means.nc		   precip_DJF_means.nc
data1.nc				   qbo.nc
data1-updated.nc			   regions.nc
data2.nc				   ta.nc
data3.nc				   tripolar.nc
data5.nc				   ua.nc
IPSL-CM5A-LR_r1i1p1_tas_n96_rcp45_mnth.nc  u_n216.nc
land.nc					   u_n96.nc
model_precip_DJF_means_low_res.nc	   vaAMIPlcd_DJF.nc
model_precip_DJF_means.nc		   va.nc
precip_1D_monthly.nc			   wapAMIPlcd_DJF.nc
precip_1D_yearly.nc
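
Outside a notebook the `!ls` shortcut is not available. A plain-Python equivalent is sketched below using the standard library's `pathlib`. The function name `list_datasets` is hypothetical, and the suffix filter assumes that only netCDF (`.nc`) and PP (`.pp`) files are of interest.

```python
from pathlib import Path


def list_datasets(directory, suffixes=(".nc", ".pp")):
    """Return the sorted names of the data files found directly in a directory."""
    folder = Path(directory)
    if not folder.is_dir():
        # Nothing to list if the directory does not exist.
        return []
    return sorted(
        path.name
        for path in folder.iterdir()
        if path.is_file() and path.suffix in suffixes
    )


# The same relative path as used by the shell command above.
for name in list_datasets("../ncas_data"):
    print(name)
```

The ordering may differ slightly from that of `ls`, since Python's default sort is case-sensitive.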


***

<div class="alert alert-block alert-success">
<i>Practical instructions:</i> now we can start the practical. We will follow the same sectioning as in the teaching notebook, so please consult the notes there in the matching section for guidance and you can also consult the cf-python and cf-plot documentation linked above.
</div>

## 1. Reading dataset(s) and viewing the (meta)data at different detail levels

**In this section we look at the basic use of cf-python, reading in one or more datasets from file and inspecting the data and the metadata at different levels of detail to suit the amount of information you want to see.**

### a) Reading in data and extracting the _field_ of interest

**1.a.1)** Use `cf` to read in the netCDF dataset `qbo.nc` which is found (as shown at the end of the section above) under the directory `../ncas_data`, assigning it to a variable called `fieldlist`.


In [5]:
fieldlist = cf.read("../ncas_data/qbo.nc")

**1.a.2)** Use the standard Python function `len` to see how long the read-in fieldlist is.

In [6]:
len(fieldlist)

1

**1.a.3)** Access the first field in the fieldlist and assign it to the variable name `field`.

In [7]:
field = fieldlist[0]

### b) Inspecting the _field_ of interest with different amounts of detail

**1.b.1)** View the field from (1.a.3) above in minimal detail.

In [8]:
field

<CF Field: eastward_wind(time(398), pressure(37), latitude(2), longitude(48)) m s**-1>

**1.b.2)** Now try viewing the field from (1.a.3) above at a medium detail level.

In [9]:
print(field)

Field: eastward_wind (ncvar%U)
------------------------------
Data            : eastward_wind(time(398), pressure(37), latitude(2), longitude(48)) m s**-1
Dimension coords: time(398) = [1979-01-16 09:00:00, ..., 2012-02-15 09:00:00] gregorian
                : pressure(37) = [1000.0, ..., 1.0] mbar
                : latitude(2) = [30.0, 0.0] degrees_north
                : longitude(48) = [0.0, ..., 352.5] degrees_east


**1.b.3)** OK, finally let's see it in its full glory - with maximal detail. Take a minute or two to compare these outputs and familiarise yourself with the formats of the different views and how they present the metadata (and preview of the data) of a field.

In [10]:
field.dump()

------------------------------
Field: eastward_wind (ncvar%U)
------------------------------
Conventions = 'CF-1.7'
_FillValue = 2.0000000400817547e+20
long_name = 'U velocity'
missing_value = 2.0000000400817547e+20
name = 'U'
standard_name = 'eastward_wind'
time = '00:00'
title = 'U velocity'
units = 'm s**-1'

Data(time(398), pressure(37), latitude(2), longitude(48)) = [[[[0.48084383185078244, ..., 8.244485042555938]]]] m s**-1

Domain Axis: latitude(2)
Domain Axis: longitude(48)
Domain Axis: pressure(37)
Domain Axis: time(398)

Dimension coordinate: time
    calendar = 'gregorian'
    long_name = 't'
    standard_name = 'time'
    time_origin = '01-JAN-1979:00:00:00'
    units = 'days since 1979-01-01 00:00:00'
    Data(time(398)) = [1979-01-16 09:00:00, ..., 2012-02-15 09:00:00] gregorian

Dimension coordinate: pressure
    long_name = 'p'
    positive = 'down'
    standard_name = 'pressure'
    units = 'mbar'
    Data(pressure(37)) = [1000.0, ..., 1.0] mbar

Dimension coordinate: 
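
The time coordinate in the dump stores numbers counted from a reference date, described by `units = 'days since 1979-01-01 00:00:00'`, and cf-python displays them as date-times. As a rough illustration of that decoding using only the standard library: the first displayed value, 1979-01-16 09:00:00, is 15 days and 9 hours after the reference, i.e. 15.375 days. The function name `decode_days_since` is hypothetical, and Python's `datetime` uses the proleptic Gregorian calendar, which agrees with the `gregorian` calendar for modern dates such as these.

```python
from datetime import datetime, timedelta


def decode_days_since(offset_days, reference):
    """Convert a number of days since a reference date-time into a date-time."""
    return reference + timedelta(days=offset_days)


# Reference date-time taken from the 'units' property of the time coordinate.
reference = datetime(1979, 1, 1, 0, 0, 0)

# 15 days and 9 hours is 15.375 days.
first_time = decode_days_since(15.375, reference)
print(first_time)   # 1979-01-16 09:00:00
```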

### c) Inspecting a metadata _construct_ e.g. _coordinate_ from the _field_ of interest

**1.c.1)** Let's assume we want to know about a specific metadata construct; in this case we are interested in the pressure. Assign to a new variable called 'pressure' the pressure coordinate of the field stored in the variable 'field' from section (1a), as just inspected in section (1b).

In [11]:
pressure = field.coordinate("pressure")

**1.c.2)** View this coordinate with minimal detail level.

In [12]:
pressure

<CF DimensionCoordinate: pressure(37) mbar>

**1.c.3)** Now use the standard approach to view it with medium detail level.

In [13]:
print(pressure)

pressure(37) mbar


**1.c.4)** Finally, let's use the approach for full detail level and see everything about this coordinate.

In [14]:
pressure.dump()

Dimension coordinate: pressure
    long_name = 'p'
    positive = 'down'
    standard_name = 'pressure'
    units = 'mbar'
    Data(37) = [1000.0, ..., 1.0] mbar


### d) Inspecting a data array of interest

**1.d.1)** Access the underlying data of the pressure coordinate from the previous sub-section, (1c), assigning it to a variable called 'pressure_data'.

In [15]:
pressure_data = pressure.data

**1.d.2)** Inspect the pressure coordinate data with minimal detail, noticing the units.

In [16]:
pressure_data

<CF Data(37): [1000.0, ..., 1.0] mbar>

**1.d.3)** Access the data array of the pressure coordinate. Note that, because it is small, it is not computationally expensive to access this and similarly with other metadata data arrays, but accessing the underlying data array of the whole field (i.e. its main stored variable) could be intensive because for datasets in real usage the data can be very large and/or multi-dimensional.

In [17]:
pressure_array = pressure_data.array

**1.d.4)** Use the standard Python `print` function to view the pressure array.

In [18]:
print(pressure_array)

[1000.  975.  950.  925.  900.  875.  850.  825.  800.  775.  750.  700.
  650.  600.  550.  500.  450.  400.  350.  300.  250.  225.  200.  175.
  150.  125.  100.   70.   50.   30.   20.   10.    7.    5.    3.    2.
    1.]
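
To put rough numbers on the note in (1.d.3): the number of elements in an array is the product of its axis sizes, so the sizes shown in a field's summary give a quick estimate of how much memory the full array would need. The sketch below assumes 8 bytes per element (64-bit floats), which is an assumption rather than something read from the file, and the helper name `array_size_bytes` is hypothetical.

```python
from math import prod


def array_size_bytes(shape, bytes_per_element=8):
    """Estimate the memory needed for an array with the given shape."""
    return prod(shape) * bytes_per_element


coordinate_shape = (37,)          # pressure(37)
field_shape = (398, 37, 2, 48)    # time, pressure, latitude, longitude

print(prod(coordinate_shape), array_size_bytes(coordinate_shape))   # 37 296
print(prod(field_shape), array_size_bytes(field_shape))             # 1413696 11309568
```

Under that assumption the coordinate needs a few hundred bytes while this field needs roughly 11 MB, and real model output is often far larger.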


***

## 2. Editing the (meta)data and writing out the edited version to file

**In this section we demonstrate how to change the data that has been read in from file, both in terms of the data arrays and the metadata that describes them, and then how to write data back out to file with a chosen name, so that you can see how cf-python can be used to edit data or to make new data.**

### a) Changing the underlying data

**2.a.1)**

**2.a.2)**

**2.a.3)**

**2.a.4)**

### b) Changing some metadata

**2.b.1)**

**2.b.2)**

**2.b.3)**

**2.b.4)**

### c) Writing a (list of) fields out to a file

**2.c.1)**

**2.c.2)**

**2.c.3)**

**2.c.4)**

***

## 3. Reducing datasets by subspacing and collapsing

**In this section we show how multi-dimensional data can be tamed using cf-python so that you can get a reduced form that can be analysed or plotted, by reducing the dimensions by selecting a subset of point(s) along the axes or collapsing down according to some statistic such as the mean or an extremum.**
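
The two kinds of reduction can be pictured with a small plain-Python example that does not use cf-python at all: subspacing keeps only the points whose coordinate values satisfy a condition, while collapsing replaces an axis by a single statistic. The grid, coordinate values and function names below are made up for illustration, and the mean is a simple unweighted one; cf-python provides its own methods for both operations, as covered in the teaching notebook.

```python
from statistics import fmean

# A made-up 3 x 4 grid: one row per latitude, one column per longitude.
latitudes = [-30.0, 0.0, 30.0]
longitudes = [0.0, 90.0, 180.0, 270.0]
values = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]


def subspace_latitude(lats, grid, lat_min, lat_max):
    """Keep only the rows whose latitude lies within [lat_min, lat_max]."""
    kept = [(lat, row) for lat, row in zip(lats, grid) if lat_min <= lat <= lat_max]
    return [lat for lat, _ in kept], [row for _, row in kept]


def collapse_longitude_mean(grid):
    """Collapse the longitude axis by taking the unweighted mean of each row."""
    return [fmean(row) for row in grid]


sub_lats, sub_values = subspace_latitude(latitudes, values, 0.0, 90.0)
print(sub_lats)                              # [0.0, 30.0]
print(collapse_longitude_mean(sub_values))   # [6.5, 10.5]
```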

### a) Subspacing using metadata conditions

**3.a.1)**

**3.a.2)**

**3.a.3)**

**3.a.4)**

### b) Subspacing using indexing, including equivalency to the above

**3.b.1)**

**3.b.2)**

**3.b.3)**

**3.b.4)**

### c) Statistical collapses

**3.c.1)**

**3.c.2)**

**3.c.3)**

**3.c.4)**

***

## 4. Visualising datasets as contour and vector plots

**In this section we demonstrate how to plot using cf-plot the data we have read and then processed and/or analysed using cf-python, notably showing how to create contour plots and vector plots as examples of some of the available plot types.**

### a) Making a contour plot

**4.a.1)**

**4.a.2)**

**4.a.3)**

**4.a.4)**

### b) Customising the (contour) plot

**4.b.1)**

**4.b.2)**

**4.b.3)**

**4.b.4)**

### c) Making a vector plot with basic customisation

**4.c.1)**

**4.c.2)**

**4.c.3)**

**4.c.4)**

***

## 5. Analysing data: applying mathematical and statistical operations and plotting trends

**In this section we demonstrate how to do some data analysis including performing arithmetic and statistical calculations on the data, showing how cf-python's CF Conventions metadata awareness means that the metadata is automatically updated to account for the operations that are performed.**

### a) Applying mathematics e.g. arithmetic and trigonometry on fields

**5.a.1)**

**5.a.2)**

**5.a.3)**

**5.a.4)**

### b) Line plotting

**5.b.1)**

**5.b.2)**

**5.b.3)**

**5.b.4)**

### c) Calculating seasonal means

**5.c.1)**

**5.c.2)**

**5.c.3)**

**5.c.4)**

### d) Plotting the seasonal means on one (line)plot

**5.d.1)**

**5.d.2)**

**5.d.3)**

**5.d.4)**

***

## 6. Changing the underlying grid of data through regridding

**In this section we demonstrate how to change the underlying grid of the data to another grid, which could be a higher- or lower-resolution one or a completely different grid. This is called regridding or interpolation, and we indicate various options cf-python supports for doing it.**
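
Regridding is easiest to picture in one dimension. The sketch below linearly interpolates made-up values from a coarse set of latitudes onto a finer one in plain Python. It is only an illustration of the idea, not how cf-python performs regridding; the function name `regrid_linear_1d` is hypothetical, and the code assumes at least two source points given in strictly ascending order.

```python
def regrid_linear_1d(src_coords, src_values, dst_coords):
    """Linearly interpolate values from ascending source points to destination points."""
    result = []
    for x in dst_coords:
        if not src_coords[0] <= x <= src_coords[-1]:
            raise ValueError(f"{x} lies outside the source grid")
        # Find the source interval [left, right] that contains x.
        for i in range(len(src_coords) - 1):
            left, right = src_coords[i], src_coords[i + 1]
            if left <= x <= right:
                weight = (x - left) / (right - left)
                result.append((1 - weight) * src_values[i] + weight * src_values[i + 1])
                break
    return result


source_lats = [0.0, 10.0, 20.0]                    # coarse, made-up grid
source_temps = [300.0, 290.0, 270.0]               # made-up values
destination_lats = [0.0, 5.0, 10.0, 15.0, 20.0]    # finer, made-up grid

print(regrid_linear_1d(source_lats, source_temps, destination_lats))
# [300.0, 295.0, 290.0, 280.0, 270.0]
```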

### a) Getting a _source_ field ready to regrid

**6.a.1)**

**6.a.2)**

**6.a.3)**

**6.a.4)**

### b) Getting the _destination_ field: another field in order to regrid the previous _onto its grid_

**6.b.1)**

**6.b.2)**

**6.b.3)**

**6.b.4)**

### c) Performing the regrid operation from the source to the destination fields

**6.c.1)**

**6.c.2)**

**6.c.3)**

**6.c.4)**

### d) Finally, some more advanced cf-plot plotting to compare the source, destination, and regridded results

**6.d.1)**

**6.d.2)**

**6.d.3)**

**6.d.4)**

***