# NetCDF and CF: The Basics



### Abstract and other notes 
This workshop will teach some of the basics of CF metadata for netCDF data files,
with some hands-on work available in Jupyter Notebooks using Python. Along with
an introduction to netCDF and CF, we will introduce the CF data model and discuss
some netCDF implementation details to consider when deciding how to write data
with CF and netCDF. We will cover gridded data as well as in situ data
(stations, soundings, etc.) and touch on storing geometry data in CF.

Assume: Basic understanding of netCDF and CF (what they are and how they work
together?)

Target Audience: Data producer or manager
- Have data they want to (or have been told they should) write to CF compliant
  netCDF files

### Some References

- See CF Conventions doc ([1.7](http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html))
- See Jonathan Gregory's old [CF presentation](http://cfconventions.org/Data/cf-documents/overview/viewgraphs.pdf)
- See [CF presentation](https://docs.google.com/presentation/d/1OImxWBNxyj-zdreIarH5GSIuDyREGB62rDah19g6M94/edit#) I gave at the Oct 2018 netCDF training workshop
- See NASA ESDS “Dataset Interoperability Recommendations for Earth Science” ([web page](https://earthdata.nasa.gov/user-resources/standards-and-references/dataset-interoperability-recommendations-for-earth-science))
- See CF Data Model (cfdm) Python package [tutorial](https://ncas-cms.github.io/cfdm/tutorial.html)
- See Tim Whiteaker's cfgeom Python package (GitHub [repo](https://github.com/twhiteaker/CFGeom), [tutorial](https://twhiteaker.github.io/CFGeom/tutorial.html))

## Overview of netCDF and CF



### netCDF Data Model

### CF Data Model


## Gridded Data
### Basic gridded data
Let's start out with two 3-D arrays of temperature and dew point temperature:
```
netcdf twoarrays {
  dimensions:
      lat = 12 ;
      lon = 19 ;
      time = 4 ;
  variables:
      float temp(time, lat, lon) ;
      float dewpoint(time, lat, lon) ;
}
```

### Describe the data

The `units` attribute should be used for all variables that represent a
dimensional quantity. With only a few exceptions, the value of the `units`
attribute must be recognizable by the Unidata UDUNITS package.

CF standard names are used to describe the physical quantity a variable
represents. The `standard_name` attribute should be used for all data variables,
whenever possible. The `units` attribute must be consistent with (or convertible
to) the canonical units of the given standard name (and any modifier).

missing and valid ...

```
netcdf twoarrays {
  dimensions:
      lat = 12 ;
      lon = 19 ;
      time = 4 ;
  variables:
      float temp(time, lat, lon) ;
        temp:units = "Celsius" ;
        temp:standard_name = "surface_temperature" ;
      float dewpoint(time, lat, lon) ;
        dewpoint:units = "Celsius" ;
        dewpoint:standard_name = "dew_point_temperature" ;
}
```

### NUG Coordinate Variables
The Unidata NetCDF Users Guide (NUG) defines a coordinate variable as a 1-D
variable that has the same name as its dimension. These variables define the
physical coordinate corresponding to the dimension. Many generic software
packages understand how to use NUG coordinate variables.

```
netcdf twoarrays_coordvars {
  dimensions:
      lat = 12 ;
      lon = 19 ;
      time = 4 ;
  variables:
      float lat(lat) ;
      float lon(lon) ;
      float time(time) ;
      float temp(time, lat, lon) ;
        temp:units = "Celsius" ;
        temp:standard_name = "surface_temperature" ;
      float dewpoint(time, lat, lon) ;
        dewpoint:units = "Celsius" ;
        dewpoint:standard_name = "dew_point_temperature" ;
}
```
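
The NUG rule above is simple enough to check mechanically. The following is a minimal sketch in plain Python, assuming a hypothetical in-memory description of the file header (a dict mapping each variable name to its tuple of dimension names) rather than a real netCDF library. The function name `find_coordinate_variables` is illustrative and not part of any package.

```python
def find_coordinate_variables(variables):
    """Return the names of NUG coordinate variables, sorted.

    `variables` is a hypothetical stand-in for a file header:
    a dict mapping variable name -> tuple of dimension names.
    """
    return sorted(
        name
        for name, dims in variables.items()
        # NUG rule: 1-D and named after its own dimension.
        if len(dims) == 1 and dims[0] == name
    )


# Header of the twoarrays_coordvars example, written out by hand.
twoarrays_coordvars = {
    "lat": ("lat",),
    "lon": ("lon",),
    "time": ("time",),
    "temp": ("time", "lat", "lon"),
    "dewpoint": ("time", "lat", "lon"),
}

coord_vars = find_coordinate_variables(twoarrays_coordvars)
print(coord_vars)  # ['lat', 'lon', 'time']
```

The data variables `temp` and `dewpoint` are excluded because they are 3-D and share no name with a dimension.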

### CF coordinate variables

#### Latitude, Longitude, and Vertical

For latitude, include a units attribute with a 'degrees_north' value.
```
float lat(lat) ;
  lat:long_name = "latitude" ;
  lat:units = "degrees_north" ;
  lat:standard_name = "latitude" ;
```

For longitude, a units attribute must be included with a 'degrees_east' value. A
standard name of 'longitude' is another mechanism for recognizing longitude.

```
float lon(lon) ;
  lon:long_name = "longitude" ;
  lon:units = "degrees_east" ;
  lon:standard_name = "longitude" ;
```
A vertical coordinate can be recognized by its units and the 'positive'
attribute with value of 'up' or 'down'.

Height, depth
```
axis_name:units = "meters" ;
axis_name:positive = "down" ;
```
Pressure
```
float pres(pres) ;
  pres:long_name = "pressure" ;
  pres:units = "hPa" ;
```
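
To make the recognition rules above concrete, here is a rough sketch of how software might classify a coordinate variable from its attributes alone. It assumes the attributes are available as a plain dict; `identify_coordinate` is a hypothetical helper, and the pressure unit list is only an illustrative subset (a real implementation would ask UDUNITS whether the units are convertible to pressure).

```python
# Unit spellings that CF accepts for latitude and longitude.
LATITUDE_UNITS = {"degrees_north", "degree_north", "degree_N",
                  "degrees_N", "degreeN", "degreesN"}
LONGITUDE_UNITS = {"degrees_east", "degree_east", "degree_E",
                   "degrees_E", "degreeE", "degreesE"}
# Illustrative subset only; not an exhaustive list of pressure units.
PRESSURE_UNITS = {"Pa", "hPa", "bar", "millibar", "mbar", "decibar", "atm"}


def identify_coordinate(attrs):
    """Classify a coordinate variable from its attribute dict.

    Returns 'latitude', 'longitude', 'vertical', or None.
    """
    units = attrs.get("units")
    standard_name = attrs.get("standard_name")
    positive = str(attrs.get("positive", "")).lower()

    if units in LATITUDE_UNITS or standard_name == "latitude":
        return "latitude"
    if units in LONGITUDE_UNITS or standard_name == "longitude":
        return "longitude"
    # Vertical: a 'positive' attribute, or units of pressure.
    if positive in ("up", "down") or units in PRESSURE_UNITS:
        return "vertical"
    return None


print(identify_coordinate({"units": "degrees_north"}))
print(identify_coordinate({"units": "meters", "positive": "down"}))
print(identify_coordinate({"long_name": "pressure", "units": "hPa"}))
```

Note that a height or depth axis in meters is only recognized through `positive`, since meters alone could describe any length.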

#### Time

Time coordinates must include a 'units' attribute with a string value with a
form similar to
    'seconds since 2019-01-06 12:00:00.00'

'seconds', 'minutes', 'hours', and 'days' are the most commonly used units for
time. 'months' and 'years' are not recommended because their lengths vary.
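
As an illustration of how a reader interprets such a units string, the sketch below decodes a time value using only the Python standard library. It is a simplification under stated assumptions: the standard calendar, no time zone handling, and a short hypothetical list of reference-time layouts (`REFERENCE_FORMATS`) that does not cover everything UDUNITS accepts. `decode_time` is an illustrative name.

```python
from datetime import datetime, timedelta

# Seconds per unit for the fixed-length units named above.
SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600, "days": 86400}

# Reference-time layouts tried in order (an assumption, not exhaustive).
REFERENCE_FORMATS = (
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%dT%H:%M:%SZ",
    "%Y-%m-%d",
)


def decode_time(value, units):
    """Convert a numeric time value to a naive datetime."""
    unit, sep, reference = units.partition(" since ")
    if not sep or unit not in SECONDS_PER_UNIT:
        raise ValueError(f"unsupported time units: {units!r}")
    for fmt in REFERENCE_FORMATS:
        try:
            origin = datetime.strptime(reference.strip(), fmt)
            break
        except ValueError:
            continue
    else:
        raise ValueError(f"unrecognized reference time: {reference!r}")
    return origin + timedelta(seconds=value * SECONDS_PER_UNIT[unit])


print(decode_time(3600, "seconds since 2019-01-06 12:00:00.00"))
# 2019-01-06 13:00:00
```

The sketch deliberately rejects 'months' and 'years' rather than guessing a length for them.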

#### Example

```
netcdf   mydataset {
  dimensions:
    lat = 12 ;
    lon = 19 ;
    time = 4 ;
  variables:
    float lat(lat) ;
      lat:units = "degrees_north" ;
      lat:standard_name = "latitude" ;
    float lon(lon) ;
      lon:units = "degrees_east" ;
      lon:standard_name = "longitude" ;
    float time(time) ;
      time:units = "seconds since 2019-01-06 12:00:00.00";
    float temp(time, lat, lon) ;
      temp:units = "Celsius" ;
      temp:standard_name = "surface_temperature" ;
    float dewpoint(time, lat, lon) ;
      dewpoint:units = "Celsius" ;
      dewpoint:standard_name = "dew_point_temperature" ;
  // global attributes:
    :Conventions = "CF-1.7";
}

```

### Direct axis identification

The 'axis' attribute can be used with a value of 'X', 'Y', 'Z', and 'T' to
simplify identification of space and time coordinates and to identify generic
spatial coordinates, e.g., a projected coordinate system.

```
netcdf   mydataset {
  dimensions:
    x = 12 ;
    y = 19 ;
    time = 4 ;
  variables:
    float x(x) ;
      x:units = "m" ;
      x:axis = "X" ;
    float y(y) ;
      y:units = "m" ;
      y:axis = "Y" ;
    float time(time) ;
      time:units = "seconds since 2019-01-06 12:00:00.00";
      time:axis = "T" ;
    float temp(time, y, x) ;
      temp:units = "Celsius" ;
      temp:standard_name = "surface_temperature" ;
    float dewpoint(time, y, x) ;
      dewpoint:units = "Celsius" ;
      dewpoint:standard_name = "dew_point_temperature" ;
  // global attributes:
    :Conventions = "CF-1.7";
}

```
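
A short sketch of why the `axis` attribute simplifies things for software: with it, finding the X, Y, and T coordinates is a dictionary lookup rather than an inspection of units. The attribute dicts and the helper names `axes_by_letter` and `axis_signature` are hypothetical stand-ins for what a netCDF library would provide.

```python
def axes_by_letter(variables):
    """Map each axis letter to the variable that declares it."""
    found = {}
    for name, attrs in variables.items():
        letter = attrs.get("axis")
        if letter is not None:
            found[letter] = name
    return found


def axis_signature(dims, variables):
    """Axis letters for a tuple of dimension names; '?' if undeclared."""
    return "".join(variables.get(d, {}).get("axis", "?") for d in dims)


# Coordinate variables of a projected grid, written out by hand.
coords = {
    "x": {"units": "m", "axis": "X"},
    "y": {"units": "m", "axis": "Y"},
    "time": {"units": "seconds since 2019-01-06 12:00:00.00", "axis": "T"},
}

axes = axes_by_letter(coords)
signature = axis_signature(("time", "y", "x"), coords)
print(axes)       # {'X': 'x', 'Y': 'y', 'T': 'time'}
print(signature)  # TYX
```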

### Grid Mapping


For a dataset with 1-D X and Y coordinate variables, the 2-D latitude and
longitude coordinates may be indicated with the `coordinates` attribute.
To describe the mapping between the X and Y coordinate variables and the
latitude and longitude coordinates, a grid mapping variable must be used.
A data variable points to it through its `grid_mapping` attribute.

For instance, here's a temperature field whose X and Y coordinate variables map
to latitude and longitude using a Lambert Conformal projection. 
 
```
dimensions:
  y = 228;
  x = 306;
  time = 41;

variables:
  int Lambert_Conformal;
    Lambert_Conformal:grid_mapping_name = "lambert_conformal_conic";
    Lambert_Conformal:standard_parallel = 25.0;
    Lambert_Conformal:longitude_of_central_meridian = 265.0;
    Lambert_Conformal:latitude_of_projection_origin = 25.0;
  double y(y);
    y:units = "km";
    y:long_name = "y coordinate of projection";
    y:standard_name = "projection_y_coordinate";
  double x(x);
    x:units = "km";
    x:long_name = "x coordinate of projection";
    x:standard_name = "projection_x_coordinate";
  double lat(y, x);
    lat:units = "degrees_north";
    lat:long_name = "latitude coordinate";
    lat:standard_name = "latitude";
  double lon(y, x);
    lon:units = "degrees_east";
    lon:long_name = "longitude coordinate";
    lon:standard_name = "longitude";
  int time(time);
    time:long_name = "forecast time";
    time:units = "hours since 2004-06-23T22:00:00Z";
  float Temperature(time, y, x);
    Temperature:units = "K";
    Temperature:long_name = "Temperature @ surface";
    Temperature:missing_value = 9999.0;
    Temperature:coordinates = "lat lon";
    Temperature:grid_mapping = "Lambert_Conformal";
```

A grid mapping variable may also be used to describe the figure of the Earth
used to define the latitude and longitude coordinates, or to describe
another coordinate reference system definition used by some coordinates
or auxiliary coordinates.
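
To show what the grid mapping attributes actually encode, the sketch below applies the standard spherical Lambert Conformal Conic forward formulas with the parameter values from the example above. The example does not state a figure of the Earth, so a sphere of radius 6371.229 km is assumed here purely for illustration; `lcc_forward` is a hypothetical helper, and a real workflow would normally use a projection library instead.

```python
import math

# Assumed spherical Earth radius in km; not given in the CDL example.
EARTH_RADIUS_KM = 6371.229


def lcc_forward(lat, lon,
                standard_parallel=25.0,
                longitude_of_central_meridian=265.0,
                latitude_of_projection_origin=25.0,
                radius=EARTH_RADIUS_KM):
    """Project (lat, lon) in degrees to (x, y) in the units of `radius`.

    Spherical Lambert Conformal Conic with one standard parallel.
    """
    phi = math.radians(lat)
    phi1 = math.radians(standard_parallel)
    phi0 = math.radians(latitude_of_projection_origin)
    # Wrap the longitude difference into [-180, 180).
    dlon = (lon - longitude_of_central_meridian + 180.0) % 360.0 - 180.0

    def t(p):
        return math.tan(math.pi / 4.0 + p / 2.0)

    n = math.sin(phi1)  # cone constant for a single standard parallel
    f = math.cos(phi1) * t(phi1) ** n / n
    rho = radius * f / t(phi) ** n
    rho0 = radius * f / t(phi0) ** n
    theta = n * math.radians(dlon)
    return rho * math.sin(theta), rho0 - rho * math.cos(theta)


# The projection origin maps to (0, 0); points north of it have y > 0.
print(lcc_forward(25.0, 265.0))
print(lcc_forward(30.0, 265.0))
```

The keyword names mirror the attribute names on the `Lambert_Conformal` variable, which is the information a reader needs to reproduce the 2-D `lat` and `lon` arrays from `x` and `y`.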

### Cell Bounds

  - an example
  
```
netcdf precip_bucket_bounds {
  dimensions:
      lat = 12 ;
      lon = 19 ;
      time = 8 ;
      tbv = 2;
  variables:
      float lat(lat) ;
      float lon(lon) ;
      float time(time) ;
        time:units = "hours since 2019-07-12 00:00:00.00";
        time:bounds = "time_bounds" ;
      float time_bounds(time, tbv) ;
      float precip(time, lat, lon) ;
        precip:units = "inches" ;
  data:
    time = 3, 6, 9, 12, 15, 18, 21, 24;
    time_bounds = 0, 3, 0, 6, 6, 9, 6, 12, 12, 15, 12, 18, 18, 21, 18, 24;
}

```
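
The bounds in this example overlap, which appears to be intentional: the cells alternate between 3-hour and 6-hour accumulation periods. The following sketch, using only plain Python lists copied from the `data:` section, pairs up the flat bounds values and checks that each time value lies inside its own cell. The helper name `pair_up` is illustrative.

```python
def pair_up(flat):
    """Reshape a flat list into (lower, upper) pairs, as in time_bounds(time, tbv)."""
    if len(flat) % 2:
        raise ValueError("bounds list must have an even number of values")
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


# Values copied from the data: section of precip_bucket_bounds.
time = [3, 6, 9, 12, 15, 18, 21, 24]
time_bounds = pair_up([0, 3, 0, 6, 6, 9, 6, 12, 12, 15, 12, 18, 18, 21, 18, 24])

# Each coordinate value should fall within its own cell.
inside = all(lo <= t <= hi for t, (lo, hi) in zip(time, time_bounds))

# Length of each accumulation period in hours.
lengths = [hi - lo for lo, hi in time_bounds]

print(inside)   # True
print(lengths)  # [3, 6, 3, 6, 3, 6, 3, 6]
```

Without the bounds variable, a reader would see only the end times and could not tell a 3-hour total from a 6-hour total.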


## Observational data (CF DSG)
- Overview of Point, station, sounding, trajectory
- Examples
- Reference to [NOAA NCEI netCDF Templates](https://www.nodc.noaa.gov/data/formats/netcdf/v2.0/)


## Geometries

- Overview
- Example
- Link to cfgeom python package


## CF Standard Names
- Overview:
  - Controlled vocabulary of names that describe physical quantity
    - (name, description, canonical units)
  - Canonical units (the units used must be convertible to them)
  - Help users to decide if data from two sources are comparable
    

## Other topics
- Missing data
  - _FillValue, missing_value, valid_*
- Packed data (scale_factor/add_offset)
- Chunking and compression
- CRS, shape of earth, horizontal and vertical datum
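
For the packed data item above, the unpacking relation is `unpacked = packed * scale_factor + add_offset`. The sketch below shows one common way to choose the two parameters so that a known data range fills a signed 16-bit integer. The helper names are hypothetical, and the parameter choice is an assumption for illustration; in particular it reserves no integer value for `_FillValue`.

```python
def packing_parameters(vmin, vmax, nbits=16):
    """Choose scale_factor and add_offset so [vmin, vmax] spans a signed nbits integer."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    scale_factor = (vmax - vmin) / (2 ** nbits - 1)
    add_offset = vmin + 2 ** (nbits - 1) * scale_factor
    return scale_factor, add_offset


def pack(value, scale_factor, add_offset):
    """Convert a physical value to its packed integer."""
    return round((value - add_offset) / scale_factor)


def unpack(packed, scale_factor, add_offset):
    """Recover the physical value, as a reader of the file would."""
    return packed * scale_factor + add_offset


# Temperatures between -40 and 50 Celsius stored as 16-bit integers.
scale_factor, add_offset = packing_parameters(-40.0, 50.0)
packed = pack(12.34, scale_factor, add_offset)
restored = unpack(packed, scale_factor, add_offset)
print(packed, restored)
```

The round trip is lossy: the restored value can differ from the original by up to half of `scale_factor`, which is the precision traded for the smaller storage type.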

### Cell methods
  - an example
- ??"climatology" attribute??


## Recommendations from NASA 
- [NASA Dataset Interoperability Recommendations for Earth Science](https://earthdata.nasa.gov/esdis/eso/standards-and-references/dataset-interoperability-recommendations-for-earth-science) - two PDF documents.
- NASA Data Product Developer's Guide (from the [previous session](https://2019esipsummermeeting.sched.com/event/PtOg/data-product-developers-guide-workshop))
