# Read netCDF

obrero comes with a small module that uses the `xarray.open_dataset()` function to read netCDF files. After reading a file, it makes sure the coordinates are named `latitude`, `longitude` and `level`, so that the rest of obrero's code can rely on consistent names. Additionally, it adds a new bound method to xarray's `DataArray` class, `.convert_units()`, which uses the cf-units module to convert units when possible. So let's first import obrero:

In [1]:
# small hack to be able to import the module without installing it
import os
import sys
sys.path.append(os.getcwd() + '/../')

import obrero
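
The coordinate renaming mentioned in the introduction can be pictured as building a mapping from whatever names a file uses to the three standard ones. The following is only a minimal sketch, not obrero's actual implementation: the `ALIASES` table and the `build_rename_map` helper are hypothetical, and the aliases obrero really recognizes may differ.

```python
# Hypothetical alias table (an assumption made for illustration only).
ALIASES = {
    'latitude': ('lat', 'lats'),
    'longitude': ('lon', 'lons'),
    'level': ('lev', 'plev'),
}


def build_rename_map(coord_names):
    """Map nonstandard coordinate names to latitude/longitude/level."""
    rename = {}
    for name in coord_names:
        for standard, aliases in ALIASES.items():
            if name.lower() in aliases:
                rename[name] = standard
    return rename


# Names that are already standard, or unknown (like 'time'), are left out.
mapping = build_rename_map(['time', 'lat', 'lon', 'lev'])
print(mapping)
```

A dictionary of this shape is the kind of argument xarray's `Dataset.rename()` accepts, so it could be applied as `ds.rename(mapping)`.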

Now we can simply read the netCDF file using the function `obrero.read_nc()`:

In [5]:
# read data
fname = 'data/ctl.nc'

# read as dataset
ds = obrero.read_nc(fname)

Depending on the contents of the netCDF file, `ds` will be a `DataArray` (single variable) or a `Dataset` (multiple variables). In this case we can see that the object contains several arrays, one for each variable:

In [6]:
ds

<xarray.Dataset>
Dimensions:    (latitude: 32, longitude: 64, time: 72)
Coordinates:
  * time       (time) object 2005-01-01 00:00:00 ... 2010-12-01 00:00:00
  * longitude  (longitude) float64 0.0 5.625 11.25 16.88 ... 343.1 348.8 354.4
  * latitude   (latitude) float64 85.76 80.27 74.74 ... -74.74 -80.27 -85.76
Data variables:
    tas        (time, latitude, longitude) float32 ...
    pr         (time, latitude, longitude) float32 ...
    gpp        (time, latitude, longitude) float32 ...
Attributes:
    CDI:          Climate Data Interface version 1.9.6 (http://mpimet.mpg.de/...
    Conventions:  CF-1.0
    history:      Mon Jun 10 21:34:13 2019: cdo selvar,tas,gpp,pr ctl.nc ctl2...
    title:        PUMA/PLASIM DATA
    CDO:          Climate Data Operators version 1.9.6 (http://mpimet.mpg.de/...

If we want a single variable, we can pass its name as the second argument of the function call, right after the file name string:

In [8]:
# read as data array
da = obrero.read_nc(fname, 'pr')
da

<xarray.DataArray 'pr' (time: 72, latitude: 32, longitude: 64)>
[147456 values with dtype=float32]
Coordinates:
  * time       (time) object 2005-01-01 00:00:00 ... 2010-12-01 00:00:00
  * longitude  (longitude) float64 0.0 5.625 11.25 16.88 ... 343.1 348.8 354.4
  * latitude   (latitude) float64 85.76 80.27 74.74 ... -74.74 -80.27 -85.76
Attributes:
    standard_name:  total_precipitation
    long_name:      total_precipitation
    units:          m s-1
    code:           260
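
The `units` attribute above shows that precipitation is stored in `m s-1`, which is the kind of case where a unit conversion is handy. As a worked example of the arithmetic only (obrero delegates the actual conversion to cf-units, and the helper name below is hypothetical), converting to millimetres per day means multiplying by 1000 mm per metre and by 86400 seconds per day:

```python
MM_PER_M = 1000.0          # millimetres in one metre
SECONDS_PER_DAY = 86400.0  # 24 h * 60 min * 60 s


def m_per_s_to_mm_per_day(value):
    """Convert a rate in m s-1 to mm day-1 (illustrative helper)."""
    return value * MM_PER_M * SECONDS_PER_DAY


# Overall conversion factor between the two units.
factor = m_per_s_to_mm_per_day(1.0)
print(factor)
```

So a value of 5e-8 m s-1 corresponds to roughly 4.32 mm per day.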

We can even select several variables using a list object:

In [9]:
# read as dataset
ds2 = obrero.read_nc(fname, ['pr', 'tas'])
ds2

<xarray.Dataset>
Dimensions:    (latitude: 32, longitude: 64, time: 72)
Coordinates:
  * time       (time) object 2005-01-01 00:00:00 ... 2010-12-01 00:00:00
  * longitude  (longitude) float64 0.0 5.625 11.25 16.88 ... 343.1 348.8 354.4
  * latitude   (latitude) float64 85.76 80.27 74.74 ... -74.74 -80.27 -85.76
Data variables:
    pr         (time, latitude, longitude) float32 ...
    tas        (time, latitude, longitude) float32 ...
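
The three calls above suggest a simple rule for the type of the returned object. The sketch below restates that rule in plain Python, without reading any file. The function `result_kind` is hypothetical and not part of obrero, and the case of a list holding a single name is an assumption, since it is not shown here.

```python
def result_kind(available, requested=None):
    """Return 'DataArray' or 'Dataset' for a hypothetical read call.

    available: variable names present in the file
    requested: None, a single name, or a list of names
    """
    if requested is None:
        # No selection: a single-variable file gives a DataArray.
        return 'DataArray' if len(available) == 1 else 'Dataset'
    if isinstance(requested, str):
        if requested not in available:
            raise KeyError(requested)
        return 'DataArray'
    # Assumption: any list, even of length one, gives a Dataset.
    missing = [name for name in requested if name not in available]
    if missing:
        raise KeyError(missing)
    return 'Dataset'


variables = ['tas', 'pr', 'gpp']
print(result_kind(variables))                 # whole file
print(result_kind(variables, 'pr'))           # one variable
print(result_kind(variables, ['pr', 'tas']))  # several variables
```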