# NetCDF format description

## What is NetCDF

![NetCDFLogo](https://www.unidata.ucar.edu/images/logos/netcdf-150x150.png)

NetCDF stands for **Network Common Data Form**. It is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. It is also a community standard for sharing scientific data.

NetCDF is maintained by **Unidata**, one of the University Corporation for Atmospheric Research (UCAR)'s Community Programs (UCP). Unidata also supports and maintains netCDF programming interfaces for C, C++, Java, and Fortran. Programming interfaces are also available for Python, IDL, MATLAB, R, Ruby, and Perl.

https://www.unidata.ucar.edu/

## Properties of the netCDF format

**Self-Describing** A netCDF file includes information about the data it contains.

**Portable** A netCDF file can be accessed by computers with different ways of storing integers, characters, and floating-point numbers.

**Scalable** Small subsets of large datasets in various formats may be accessed efficiently through netCDF interfaces, even from remote servers.

**Appendable** Data may be appended to a properly structured netCDF file without copying the dataset or redefining its structure.

**Sharable** One writer and multiple readers may simultaneously access the same netCDF file.

**Archivable** Access to all earlier forms of netCDF data will be supported by current and future versions of the software.

## Using netCDF data

Let's look at most of these properties of a netCDF file using data from the **NOAA OI SST V2 High Resolution Dataset**,
a high-resolution blended analysis of daily SST and ice ([more details here](https://www.psl.noaa.gov/data/gridded/data.noaa.oisst.v2.highres.html#detail)).

I have pre-downloaded the daily data for 2019. You can use this command to download it:

`! wget ftp://ftp2.psl.noaa.gov/Datasets/noaa.oisst.v2.highres/sst.day.mean.2019.nc `

In [14]:
fileExampleNC='./Data/sst.day.mean.2019.nc'

There are several ways to read netCDF data in Python. Here we will show two of them. In the first one, we will use [netcdf4-python](https://unidata.github.io/netcdf4-python/netCDF4/), a Python interface to the netCDF C library.

### First, import libraries

In [15]:
import netCDF4
import numpy as np

In [16]:
SST = netCDF4.Dataset(fileExampleNC)
type(SST)

netCDF4._netCDF4.Dataset

- **`SST`** is a `Dataset` object, representing an open netCDF file.
- Printing the object gives you summary information.

In [8]:
print(SST)

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4_CLASSIC data model, file format HDF5):
    Conventions: CF-1.5
    title: NOAA/NCEI 1/4 Degree Daily Optimum Interpolation Sea Surface Temperature (OISST) Analysis, Version 2.1
    institution: NOAA/National Centers for Environmental Information
    source: NOAA/NCEI https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/
    References: https://www.psl.noaa.gov/data/gridded/data.noaa.oisst.v2.highres.html
    dataset_title: NOAA Daily Optimum Interpolation Sea Surface Temperature
    version: Version 2.1
    comment: Reynolds, et al.(2007) Daily High-Resolution-Blended Analyses for Sea Surface Temperature (available at https://doi.org/10.1175/2007JCLI1824.1). Banzon, et al.(2016) A long-term record of blended satellite and in situ sea-surface temperature for climate monitoring, modeling and environmental studies (available at https://doi.org/10.5194/essd-8-165-2016). Huang et al. (2020) Impr

And here comes the **Self-Describing** property of the netCDF format: the file carries the information about the data it contains.
The information provided depends on the particular dataset, as we will see for Argo data, but for this SST data you can find:
- 'source: NOAA/NCEI https://www.ncei.noaa.gov/data/sea-surface-temperature-optimum-interpolation/v2.1/access/avhrr/'

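In the second method, we use xarray, which opens the same file as an `xarray.Dataset`.

In [6]:
import xarray as xr

test = xr.open_dataset(fileExampleNC)
# info() prints an ncdump-like summary directly, so no print() is needed
test.info()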

xarray.Dataset {
dimensions:
	lat = 720 ;
	lon = 1440 ;
	time = 365 ;

variables:
	datetime64[ns] time(time) ;
		time:long_name = Time ;
		time:delta_t = 0000-00-01 00:00:00 ;
		time:avg_period = 0000-00-01 00:00:00 ;
		time:axis = T ;
		time:actual_range = [79988. 80352.] ;
	float32 lat(lat) ;
		lat:long_name = Latitude ;
		lat:standard_name = latitude ;
		lat:units = degrees_north ;
		lat:actual_range = [-89.875  89.875] ;
		lat:axis = Y ;
	float32 lon(lon) ;
		lon:long_name = Longitude ;
		lon:standard_name = longitude ;
		lon:units = degrees_east ;
		lon:actual_range = [1.25000e-01 3.59875e+02] ;
		lon:axis = X ;
	float32 sst(time, lat, lon) ;
		sst:long_name = Daily Sea Surface Temperature ;
		sst:units = degC ;
		sst:valid_range = [-3. 45.] ;
		sst:precision = 2.0 ;
		sst:dataset = NOAA High-resolution Blended Analysis ;
		sst:var_desc = Sea Surface Temperature ;
		sst:level_desc = Surface ;
		sst:statistic = Mean ;
		sst:parent_stat = Individual Observations ;
		sst:actual_range
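
Once a dataset is open in xarray, subsets can be selected by coordinate value, which is how the **Scalable** property shows up in practice. The sketch below uses a small synthetic dataset with the same dimension names as the SST file (`time`, `lat`, `lon`). The coordinate values and the data are made up so that the example runs without the file.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in: 5 days on a 3 x 4 grid, values 0..59
time = pd.date_range("2019-01-01", periods=5, freq="D")
lat = np.array([-0.5, 0.0, 0.5])
lon = np.array([10.0, 20.0, 30.0, 40.0])
data = np.arange(5 * 3 * 4, dtype="float32").reshape(5, 3, 4)

demo = xr.Dataset(
    {"sst": (("time", "lat", "lon"), data)},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Select a time window and a latitude/longitude box by label
box = demo["sst"].sel(
    time=slice("2019-01-02", "2019-01-03"),
    lat=slice(0.0, 0.5),
    lon=slice(15.0, 35.0),
)

# Time series at the grid point nearest to a requested location
point = demo["sst"].sel(lat=0.1, lon=21.0, method="nearest")

print(box.shape, point.shape)
```

The same `.sel(...)` calls should work on `test["sst"]` from the cell above, with the slice bounds adapted to the real coordinates.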

In [1]:
#http://xarray.pydata.org/en/latest/io.html#netcdf

https://confluence.ecmwf.int/display/CKB/What+are+NetCDF+files+and+how+can+I+read+them

xarray includes support for OPeNDAP (via the netCDF4 library or Pydap), which lets us access large datasets over HTTP.

In [None]:
remote_data = xr.open_dataset(
    "http://iridl.ldeo.columbia.edu/SOURCES/.OSU/.PRISM/.monthly/dods",
    decode_times=False,
)
