#  CDMS Python Application Programming Interface

## A First Example


The following Python script reads January and July monthly temperature
data from an input dataset, averages over time, and writes the results
to an output file. The input temperature data is ordered (time,
latitude, longitude).

Makes the CDM2S and MV2 modules available.

* MV2 defines arithmetic functions.

In [1]:
import cdms2, cdat_info
from cdms2 import MV2

Opens a netCDF file read-only. 
* The result jones is a dataset object.

In [2]:
jones = cdms2.open('https://aims3.llnl.gov/thredds/dodsC/user_pub_work/CDAT-sample/v1/tas_mo.nc')

Gets the surface air temperature variable. ‘tas’ is the name of the variable in the input dataset. 
* This does not actually read the data."

In [3]:
tasvar = jones['tas']

Read all January monthly mean data into a variable jans.
* Variables can be sliced like arrays.
* The slice operator [0::12] means take every 12th slice from dimension 0, starting at index 0 and ending at the last index. 
* If the stride 12 were omitted, it would default to 1. 

**Note:** that the variable is actually 3-dimensional. Since no slice is specified for the second or third dimensions, all values of those 2,3 dimensions are retrieved. The slice operation could also have been written [0::12, : , :]. 


**Also note:** that the same script works for multi-file datasets. CDMS opens the needed data files, extracts the appropriate slices, and concatenates them into the result array."
       

In [4]:
jans = tasvar[0::12]

Reads all July data into a masked array julys.

In [5]:
julys = tasvar[6::12]

Calculate the average January value for each grid zone.
* Any missing data is handled automatically.

In [6]:
janavg = MV2.average(jans)

Set the variable id and long\_name attributes. 
* The id is used as the name of the variable when plotted or written to a file."

In [7]:
janavg.id = "tas_jan"
janavg.long_name = "mean January surface temperature"


Calculate the average July value for each grid zone.

In [8]:
julyavg = MV2.average(julys)
julyavg.id = "tas_jul"
julyavg.long_name = "mean July surface temperature"

Create a new netCDF output file named "***janjuly.nc***" to hold the results.

Write the January average values to the output file. 
* The variable will have id “tas\_jan” in the file.
* "**write**" is a utility function which creates the variable in the file, then writes data to the variable.
* A more general method of data output is first to create a variable, then set a slice of the variable.

**Note:** that janavg and julavg have the same latitude and longitude information as tasvar. It is carried along with the computations."

In [9]:
out = cdms2.open('janjuly.nc','w')
out.write(janavg)
out.write(julyavg)


You can query different values of compression using the functions:
cdms2.getNetcdfShuffleFlag() returning 1 if shuffling is enabled, 0 otherwise
cdms2.getNetcdfDeflateFlag() returning 1 if deflate is used, 0 otherwise
cdms2.getNetcdfDeflateLevelFlag() returning the level of compression for the deflate method

If you want to turn that off or set different values of compression use the functions:
value = 0
cdms2.setNetcdfShuffleFlag(value) ## where value is either 0 or 1
cdms2.setNetcdfDeflateFlag(value) ## where value is either 0 or 1
cdms2.setNetcdfDeflateLevelFlag(value) ## where value is a integer between 0 and 9 included

To produce NetCDF3 Classic files use:
cdms2.useNetCDF3()
To Force NetCDF4 output with classic format and no compressing use:
cdms2.setNetcdf4Flag(1)
NetCDF4 file with no shuffling or deflate and noclassic will be open for parallel i/o


<cdms2.fvariable.FileVariable at 0x7f719809f630>

Set the global attribute "**comment**".
* Close the output file.

In [10]:
out.comment = "Average January/July from Jones dataset"


In [11]:
jones.close()
out.close()

Look at the resulting file using ncdump.  The written file follows the CF-1 connvention.

In [12]:
!ncdump -h janjuly.nc

netcdf janjuly {
variables:
	float tas_jan ;
		tas_jan:subgrid = "time:mean" ;
		tas_jan:long_name = "mean January surface temperature" ;
		tas_jan:units = "K" ;
		tas_jan:axis = "TZYX" ;
		tas_jan:missing_value = 1.e+20f ;
		tas_jan:_FillValue = 1.e+20f ;
	float tas_jul ;
		tas_jul:subgrid = "time:mean" ;
		tas_jul:long_name = "mean July surface temperature" ;
		tas_jul:units = "K" ;
		tas_jul:axis = "TZYX" ;
		tas_jul:missing_value = 1.e+20f ;
		tas_jul:_FillValue = 1.e+20f ;

// global attributes:
		:Conventions = "CF-1.0" ;
		:comment = "Average January/July from Jones dataset" ;
}


---

## Cdms2 Module


The cdms2 module is the Python interface to CDMS. The objects and methods in this chapter are made accessible with the command:

In [13]:
import cdms2

The functions described in this section are not associated with a class.
Rather, they are called as module functions, e.g.,

In [14]:
file = cdms2.open('https://aims3.llnl.gov/thredds/dodsC/user_pub_work/CDAT-sample/v1/clt.nc')

**See Also**: [Cdms Module Functions](https://cdms.readthedocs.io/en/latest/manual/cdms_2.html#cdms-module-functions)

### CdmsObj

Get a list of all external attributes of obj.

In [15]:
extatts = file.attributes.keys()
print(extatts)

dict_keys(['Conventions', 'comments', 'model', 'center', 'DODS_EXTRA.Unlimited_Dimension'])


### CoordinateAxis

A CoordinateAxis is a variable that represents coordinate information.
It may be contained in a file or dataset, or may be transient
(memoryresident). Setting a slice of a file CoordinateAxis writes to the file, and referencing a file CoordinateAxis slice reads data from the file. Axis objects are also used to define the domain of a Variable.

CDMS defines several different types of CoordinateAxis objects. See [MV module](https://cdms.readthedocs.io/en/latest/manual/cdms_2.html#id4) documents methods that are common to all CoordinateAxis types. See [HorizontalGrid]( https://cdms.readthedocs.io/en/latest/manual/cdms_2.html#id6) specifies methods that are unique to 1D Axis objects.

In [16]:
clt=file['clt']
axis=clt.getAxis(2)
print(axis)

   id: longitude
   Designated a longitude axis.
   units:  degrees_east
   Length: 72
   First:  -180.0
   Last:   175.0
   Other axis attributes:
      long_name: Longitude
   Python id:  0x7f7196f07240



### isCircular()

Returns True if the axis has circular topology.

An axis is defined as circular if:
* axis.topology == 'circular', or
* axis.topology is undefined, and the axis is a longitude.

The default cycle for circular axes is 360.0

In [17]:
print(axis.isCircular())

1


### mapIntervalExt(interval)

Map a coordinate interval to an index
interval. interval is a tuple having one of the forms:
* (x,y)
* (x,y,indicator)
* (x,y,indicator,cycle)

None or ':'

* where x and y are coordinates indicating the interval [x,y], and:
  * indicator is a two or three-character string, where the first character is 'c' if the interval is closed on the left, 'o' if open, and the second character has the same meaning for the right-hand point. If present, the third character specifies how the interval should be intersected with the axis
  * 'n' - select node values which are contained in the interval
  * 'b' -select axis elements for which the corresponding cell boundary intersects the interval
  * 'e' - same as n, but include an extra node on either side
  * 's' - select axis elements for which the cell boundary is a subset of the interval
  
The default indicator is ‘ccn’, that is, the interval is closed, and nodes in the interval are selected.

If cycle is specified, the axis is treated as circular with the given cycle value.

By default, if axis.isCircular() is true, the axis is treated as circular with a default modulus of 360.0.

An interval of None or ':' returns the full index interval of the axis.

The method returns the corresponding index interval as a 3tuple (i,j,k), where k is the integer stride, and  (i.j) is the half-open index interval (i <= k < j),  (i >= k > j if k < 0), or none if the intersection is empty.

For an axis which is circular (axis.topology == 'circular'), (i,j) is interpreted as follows, where n = len(axis)

if **0 <= i < n and 0 <= j <= n**, the interval does not wrap around the axis endpoint.

otherwise the interval wraps around the axis endpoint.

**see also**: mapinterval, variable.subregion()

In [18]:
print(axis.mapIntervalExt((-5.0,5.0,'co')))

(35, 37, 1)


### CdmsFile

A ``CdmsFile`` is a physical file, accessible via the ``cdunif``
interface. netCDF files are accessible in read-write mode. All other
formats (DRS, HDF, GrADS/GRIB, POP, QL) are accessible read-only.

As of CDMS V3, the legacy cuDataset interface is also supported by
Cdms-Files. See “cu Module”.

### The following reads data for variable ‘clt’, year 1980.

In [19]:
import cdms2
f = cdms2.open('https://aims3.llnl.gov/thredds/dodsC/user_pub_work/CDAT-sample/v1/clt.nc')
x = f('clt', time=('1980-1','1981-1'))

### The following gets the axis named time

In [20]:
t = f.axes['time']
print("axis name:", t.id)
t = f['time']
print("axis name:", t.id)

axis name: time
axis name: time


## 2.9. MV2 Module

The fundamental CDMS data object is the variable. A variable is comprised of:

    a masked data array, as defined in the NumPy "ma" module.
    a domain is an ordered list of axes and/or grids.
    an attribute dictionary.

MV2 can be imported with the command:

In [21]:
import MV2

For completeness MV2 provides access to all the "numpy.ma" functions. The functions not listed in the following tables are identical to the corresponding numpy.ma function: **allclose, allequal, common_fill_value, compress, create_mask, dot, e, fill_value, filled, get_print_limit, getmask, getmaskarray, identity, indices, innerproduct, isMV2, isMaskedArray, is_mask, isarray, make_mask, make_mask_none, mask_or, masked, pi, put, putmask, rank, ravel, set_fill_value, set_print_limit, shape, size**. See the documentation at https://github.com/numpy/numpy for a description of these functions.

In [22]:
import cdms2
fh = cdms2.open("https://aims3.llnl.gov/thredds/dodsC/user_pub_work/CDAT-sample/v1/ta_ncep_87-6-88-4.nc")
ta=fh['ta']
print(ta.shape)
data = ta.subRegion(':', (-45.0,45.0,'co'), (0.0, 180.0))
print(data.shape)

(11, 17, 73, 144)
(11, 3, 37, 144)


  result = result[revlist]


or equivalently:

In [23]:
data = ta.subRegion(latitude=(-45.0,45.0,'co'), longitude=(0.0,180.0))
print(data.shape)

(11, 17, 36, 73)


Read all data for ``March, 1980``:

In [24]:
data = ta.subRegion(time=('1988-3','1988-4','co'))
print(data.shape)

(1, 17, 73, 144)


### Selectors

A selector is a specification of a region of data to be selected from a variable. For example, the statement:

In [25]:
x = ta(time='1988-1-1', level=(1000.0,100.0))
print(x.shape)

(1, 12, 73, 144)


means `select the values of variable v for time ‘1979-1-1’ and levels 1000.0 to 100.0 inclusive, setting x to the result.` Selectors are generally used to represent regions of space and time.

The form for using a selector is:

In [26]:
from cdms2.selectors import Selector
s =  Selector(time='1988-1-1', level=(1000.0,100.0))
result = ta(s)
print(result.shape)

(1, 12, 73, 144)


where v is a variable and s is the selector. An equivalent form is:

In [27]:
result = fh('ta', s)
print(result.shape)

(1, 12, 73, 144)


where f is a file or dataset, and ‘varid’ is the string ID of a variable.

A selector consists of a list of selector components. 

For example, the selector:

* **time='1979-1-1', level=(1000.0,100.0)**
 
has two components: time=’1979-1-1’, and level=(1000.0,100.0). This illustrates that selector components can be defined with keywords, using the form:

<div class=note>
    <b>NOTE</b>
    
For the keywords time, level, latitude, and longitude, the selector can be used with any variable. If the corresponding axis is not found, the selector component is ignored. This is very useful for writing general purpose scripts. The required keyword overrides this behavior. These keywords take values that are coordinate ranges or index ranges as defined in See Index and Coordinate Intervals.</div>

Another form of selector components is the positional form, where the component order corresponds to the axis order of a variable. For example:
</div>

In [28]:
x9 = ta(('1988-1-1','1988-2-1'),1000.0)

reads data for the range (‘1988-1-1’,’1988-2-1’) of the first axis, and coordinate value ``1000.0`` of the second axis. Non-keyword arguments of the form(s) listed in Index and Coordinate Intervals are treated as positional. Such selectors are more concise, but not as general or flexible as the other types described in this section.

Selectors are objects in their own right. This means that a selector can be defined and reused, independent of a particular variable. Selectors are constructed using the cdms.selectors.Selector class. The constructor takes an argument list of selector components. For example:

In [29]:
from cdms2.selectors import Selector
sel = Selector(time=('1988-1-1','1988-2-1'), level=1000.)
x1 = ta(sel)
x2 = ta(sel)


For convenience CDMS provides several predefined selectors, which can be used directly or can be combined into more complex selectors. 

The selectors **time**, **level**, **latitude**, **longitude**, and **required** are equivalent to their keyword counterparts. For example:

In [30]:
from cdms2 import time, level
x = ta(time('1988-1-1','1988-2-1'), level(1000.))
print(x.shape)

(2, 1, 73, 144)


and

are equivalent. Additionally, the predefined selectors **latitudeslice**, **longitudeslice**, **levelslice**, and **timeslice** take arguments ``(startindex, stopindex[, stride])``:

In [31]:
from cdms2 import timeslice, levelslice
x = ta(timeslice(0,2), levelslice(16,17))
print(x.shape)

(2, 1, 73, 144)


Finally, a collection of selectors is defined in module cdutil.region:

In [32]:
from cdutil.region import *

NH=NorthernHemisphere=domain(latitude=(0., 90.))
SH=SouthernHemisphere=domain(latitude=(-90., 0.))
Tropics=domain(latitude=(-23.4,23.4))
SPZ=AAZ=AntarcticZone=domain(latitude=(-90., -66.6))
                             

Selectors can be combined using the `&` operator, 

or by refining them in the call:

In [33]:
from cdms2.selectors import Selector
from cdms2 import level
sel2 = Selector(time=('1988-1-1','1988-2-1'))
sel3 = sel2 & level(1000.0)
x1 = ta(sel3)
print(x1.shape)
x2 = ta(sel2, level=1000.0)
print(x2.shape)

(2, 1, 73, 144)
(2, 1, 73, 144)


### Selector Examples

CDMS provides a variety of ways to select or slice data. In the following examples, variable hus is contained in file ``sample.nc``, and is a function of **(time, level, latitude, longitude)**. Time values are monthly starting at 1987-6-1. There are 17 levels, the last level being 1000.0. The name of the vertical level axis is ‘level’. All the examples select the first two times and the last level. The last two examples remove the singleton level dimension from the result array.

In [34]:
import cdms2
f = cdms2.open('https://aims3.llnl.gov/thredds/dodsC/user_pub_work/CDAT-sample/v1/ta_ncep_87-6-88-4.nc')
ta = f.variables['ta']

#### Keyword selection

In [35]:
x = ta(time=('19988-1-1','1988-2-1'), level=1000.)
print(x.shape)

(3, 1, 73, 144)


#### Interval indicator (see mapIntervalExt)

In [36]:
x = ta(time=('1988-1-1','1988-3-1','co'), level=1000.)
print(x.shape)

(2, 1, 73, 144)


##### Axis ID (plev) as a keyword

In [37]:

x = ta(time=('1988-1-1','1988-2-1'), plev=1000.)
print(x.shape)

(2, 1, 73, 144)


##### Positional

In [38]:

x = ta(('1988-1-1','1988-2-1'),1000.0)
print(x.shape)

(2, 1, 73, 144)


##### Predefined selectors

In [39]:

from cdms2 import time, level
x = ta(time('1988-1-1','1988-2-1'), level(1000.))
print(x.shape)

from cdms2 import timeslice, levelslice
x = ta(timeslice(0,2), levelslice(16,17))
print(x.shape)

(2, 1, 73, 144)
(2, 1, 73, 144)


##### Call file as a function

In [40]:

x = f('ta', time=('1988-1-1','1988-2-1'), level=1000.)
print(x.shape)

(2, 1, 73, 144)


##### Python slices

In [41]:
x = ta(time=slice(0,2), level=slice(16,17))
print(x.shape)

(2, 1, 73, 144)


##### Selector objects

In [42]:

from cdms2.selectors import Selector
sel = Selector(time=('1988-1-1','1988-2-1'), level=1000.)
x = ta(sel)
print(x.shape)
sel2 = Selector(time=('1988-1-1','1988-2-1'))
sel3 = sel2 & level(1000.0)
x = ta(sel3)
print(x.shape)
x = ta(sel2, level=1000.0)
print(x.shape)

(2, 1, 73, 144)
(2, 1, 73, 144)
(2, 1, 73, 144)


##### Squeeze singleton dimension (level)

In [43]:
x = ta[0:2,16]
print(x.shape)

x = ta(time=('1988-1-1','1988-2-1'), level=1000., squeeze=1)
print(x.shape)

f.close()

(2, 73, 144)
(2, 73, 144)


  dout = self.data[indx]


In [44]:
from IPython.core.display import HTML
def css_styling():
    styles = open("./styles/custom.css", "r").read()
    return HTML(styles)
css_styling()