# netCDF4

* Opening and creating netCDF
* Groups
* Variables
* Writing and retrieving data
* Attributes
* Dimensions
* Multi-file datasets
* Variable length datasets
* Strings
* Time coordinates


## Opening and creating netCDF
To create a netCDF file from Python, call the `Dataset` constructor. The same constructor opens an existing netCDF file. If the file is open for write access (`mode='w'`, `'r+'` or `'a'`), you may write any type of data, including new dimensions, groups, variables and attributes.

When creating a new file, the format may be specified using the format keyword in the `Dataset` constructor. The default format is NETCDF4. To see how a given file is formatted, you can examine the `data_model` attribute. Closing the netCDF file is accomplished via the `Dataset.close` method of the `Dataset` instance.

In [1]:
from netCDF4 import Dataset

rootgrp = Dataset("test.nc", "w", format="NETCDF4")
print(rootgrp.data_model)

rootgrp.close()

NETCDF4


## Groups
To create `Group` instances, use the `Dataset.createGroup` method of a `Dataset` or `Group` instance. `Dataset.createGroup` takes a single argument, a Python string containing the name of the new group. The new `Group` instances contained within the root group can be accessed by name using the `groups` dictionary attribute of the `Dataset` instance.

To simplify the creation of nested groups, you can use a unix-like path as an argument to `Dataset.createGroup`. 

In [2]:
rootgrp = Dataset("test.nc", "w")
fcstgrp = rootgrp.createGroup("forecasts")
analgrp = rootgrp.createGroup("analyses")
print(rootgrp.groups)

fcstgrp1 = rootgrp.createGroup("/forecasts/model1")
fcstgrp2 = rootgrp.createGroup("/forecasts/model2")

print("\n", fcstgrp.groups)

{'forecasts': <class 'netCDF4._netCDF4.Group'>
group /forecasts:
    dimensions(sizes): 
    variables(dimensions): 
    groups: , 'analyses': <class 'netCDF4._netCDF4.Group'>
group /analyses:
    dimensions(sizes): 
    variables(dimensions): 
    groups: }

 {'model1': <class 'netCDF4._netCDF4.Group'>
group /forecasts/model1:
    dimensions(sizes): 
    variables(dimensions): 
    groups: , 'model2': <class 'netCDF4._netCDF4.Group'>
group /forecasts/model2:
    dimensions(sizes): 
    variables(dimensions): 
    groups: }


Here's an example that shows how to navigate all the groups in a `Dataset`. The function `walktree` is a Python generator that is used to walk the directory tree. Note that printing the `Dataset` or `Group` object yields summary information about its contents.

In [3]:
def walktree(top):
    yield top.groups.values()
    for value in top.groups.values():
        yield from walktree(value)

print('\n', rootgrp, '\n')

for children in walktree(rootgrp):
    for child in children:
        print(child)


 <class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    dimensions(sizes): 
    variables(dimensions): 
    groups: forecasts, analyses 

<class 'netCDF4._netCDF4.Group'>
group /forecasts:
    dimensions(sizes): 
    variables(dimensions): 
    groups: model1, model2
<class 'netCDF4._netCDF4.Group'>
group /analyses:
    dimensions(sizes): 
    variables(dimensions): 
    groups: 
<class 'netCDF4._netCDF4.Group'>
group /forecasts/model1:
    dimensions(sizes): 
    variables(dimensions): 
    groups: 
<class 'netCDF4._netCDF4.Group'>
group /forecasts/model2:
    dimensions(sizes): 
    variables(dimensions): 
    groups: 


## Variables
To create a netCDF variable, use the `Dataset.createVariable` method of a `Dataset` or `Group` instance. The `Dataset.createVariable` method has two mandatory arguments, the variable name (a Python string), and the variable datatype. The variable's dimensions are given by a tuple containing the dimension names (defined previously with `Dataset.createDimension`). To create a scalar variable, simply leave out the dimensions keyword.

Valid datatype specifiers include: 
- `f4` (32-bit floating point), 
- `f8` (64-bit floating point), 
- `i4` (32-bit signed integer), 
- `i2` (16-bit signed integer), 
- `i8` (64-bit signed integer), 
- `i1` (8-bit signed integer), 
- `u1` (8-bit unsigned integer), 
- `u2` (16-bit unsigned integer), 
- `u4` (32-bit unsigned integer), 
- `u8` (64-bit unsigned integer), 
- or `S1` (single-character string). 

In [4]:
# we have to first make dimensions to create variables (more on this later)
level = rootgrp.createDimension("level", None)
time = rootgrp.createDimension("time", None)
lat = rootgrp.createDimension("lat", 73)
lon = rootgrp.createDimension("lon", 144)

# one dimensional data
times = rootgrp.createVariable("time","f8",("time",))
levels = rootgrp.createVariable("level","i4",("level",))
latitudes = rootgrp.createVariable("lat","f4",("lat",))
longitudes = rootgrp.createVariable("lon","f4",("lon",))

# two dimensions unlimited
temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",))
temp.units = "K"

Summary info on a variable instance:

In [5]:
print(temp)

<class 'netCDF4._netCDF4.Variable'>
float32 temp(time, level, lat, lon)
    units: K
unlimited dimensions: time, level
current shape = (0, 0, 73, 144)
filling on, default _FillValue of 9.969209968386869e+36 used


Using a path to create a variable within a hierarchy of groups (intermediate groups will be made):

In [6]:
ftemp = rootgrp.createVariable("/forecasts/model1/temp","f4",("time","level","lat","lon",)) 

Querying a dataset or group instance:

In [7]:
print(rootgrp["/forecasts/model1"])  # a Group instance

print(rootgrp["/forecasts/model1/temp"])  # a Variable instance

<class 'netCDF4._netCDF4.Group'>
group /forecasts/model1:
    dimensions(sizes): 
    variables(dimensions): float32 temp(time, level, lat, lon)
    groups: 
<class 'netCDF4._netCDF4.Variable'>
float32 temp(time, level, lat, lon)
path = /forecasts/model1
unlimited dimensions: time, level
current shape = (0, 0, 73, 144)
filling on, default _FillValue of 9.969209968386869e+36 used


The variables in the `Dataset` or `Group` are stored in a Python dictionary:

In [8]:
print(rootgrp.variables)

{'time': <class 'netCDF4._netCDF4.Variable'>
float64 time(time)
unlimited dimensions: time
current shape = (0,)
filling on, default _FillValue of 9.969209968386869e+36 used, 'level': <class 'netCDF4._netCDF4.Variable'>
int32 level(level)
unlimited dimensions: level
current shape = (0,)
filling on, default _FillValue of -2147483647 used, 'lat': <class 'netCDF4._netCDF4.Variable'>
float32 lat(lat)
unlimited dimensions: 
current shape = (73,)
filling on, default _FillValue of 9.969209968386869e+36 used, 'lon': <class 'netCDF4._netCDF4.Variable'>
float32 lon(lon)
unlimited dimensions: 
current shape = (144,)
filling on, default _FillValue of 9.969209968386869e+36 used, 'temp': <class 'netCDF4._netCDF4.Variable'>
float32 temp(time, level, lat, lon)
    units: K
unlimited dimensions: time, level
current shape = (0, 0, 73, 144)
filling on, default _FillValue of 9.969209968386869e+36 used}


## Writing and retrieving data
Simple data assignment:

In [9]:
import numpy as np

lats =  np.arange(-90,91,2.5)
lons =  np.arange(-180,180,2.5)
latitudes[:] = lats
longitudes[:] = lons
print("latitudes =\n{}".format(latitudes[:]))

latitudes =
[-90.  -87.5 -85.  -82.5 -80.  -77.5 -75.  -72.5 -70.  -67.5 -65.  -62.5
 -60.  -57.5 -55.  -52.5 -50.  -47.5 -45.  -42.5 -40.  -37.5 -35.  -32.5
 -30.  -27.5 -25.  -22.5 -20.  -17.5 -15.  -12.5 -10.   -7.5  -5.   -2.5
   0.    2.5   5.    7.5  10.   12.5  15.   17.5  20.   22.5  25.   27.5
  30.   32.5  35.   37.5  40.   42.5  45.   47.5  50.   52.5  55.   57.5
  60.   62.5  65.   67.5  70.   72.5  75.   77.5  80.   82.5  85.   87.5
  90. ]


Unlike NumPy's array objects, netCDF Variable objects with unlimited dimensions will grow along those dimensions if you assign data outside the currently defined range of indices.
Note that the size of the levels variable grows when data is appended along the level dimension of the variable temp, even though no data has yet been assigned to levels.


In [10]:
# append along two unlimited dimensions by assigning to slice.
nlats = len(rootgrp.dimensions["lat"])
nlons = len(rootgrp.dimensions["lon"])
print("temp shape before adding data = {}".format(temp.shape))

from numpy.random import uniform
temp[0:5, 0:10, :, :] = uniform(size=(5, 10, nlats, nlons))
print("temp shape after adding data = {}".format(temp.shape))

# levels have grown, but no values yet assigned.
print("levels shape after adding pressure data = {}".format(levels.shape))

# now, assign data to levels dimension variable.
levels[:] =  [1000.,850.,700.,500.,300.,250.,200.,150.,100.,50.]

temp shape before adding data = (0, 0, 73, 144)
temp shape after adding data = (5, 10, 73, 144)
levels shape after adding pressure data = (10,)


## Attributes
There are two types of attributes in a netCDF file: global and variable. Global attributes provide information about a group, or the entire dataset, as a whole. Variable attributes provide information about one of the variables in a group. Global attributes are set by assigning values to `Dataset` or `Group` instance variables. Variable attributes are set by assigning values to `Variable` instance variables. Attributes can be strings, numbers or sequences. Returning to our example,

In [11]:
import time
rootgrp.description = "an example script"
rootgrp.history = "Created " + time.ctime(time.time())
rootgrp.source = "netCDF4 python module tutorial"
latitudes.units = "degrees north"
longitudes.units = "degrees east"
levels.units = "hPa"
temp.units = "K"
times.units = "hours since 0001-01-01 00:00:00.0"
times.calendar = "gregorian"

The `Dataset.ncattrs` method of a `Dataset`, `Group` or `Variable` instance can be used to retrieve the names of all the netCDF attributes. This method is provided as a convenience, since using the built-in dir Python function will return a bunch of private methods and attributes that cannot (or should not) be modified by the user.

In [12]:
for name in rootgrp.ncattrs():
    print("Global attr {} = {}".format(name, getattr(rootgrp, name)))

Global attr description = an example script
Global attr history = Created Tue Oct 11 15:37:08 2022
Global attr source = netCDF4 python module tutorial


The `__dict__` attribute of a `Dataset`, `Group` or `Variable` instance provides all the netCDF attribute name/value pairs in a Python dictionary:

In [13]:
print(rootgrp.__dict__)

{'description': 'an example script', 'history': 'Created Tue Oct 11 15:37:08 2022', 'source': 'netCDF4 python module tutorial'}


Attributes can be deleted from a netCDF `Dataset`, `Group` or `Variable` using the Python `del` statement (i.e. `del grp.foo` removes the attribute `foo` from the group `grp`).

## Dimensions
A dimension is created using the `Dataset.createDimension` method of a `Dataset` or `Group` instance. We used this method earlier because a dimension has to be defined before any variable that uses it is created.

A Python string is used to set the name of the dimension, and an integer value is used to set the size. To create an unlimited dimension (a dimension that can be appended to), the size value is set to `None` or `0`. 

In [14]:
print(rootgrp.dimensions)

for dimobj in rootgrp.dimensions.values():
    print(dimobj)
    
rootgrp.close()

{'level': <class 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'level', size = 10, 'time': <class 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'time', size = 5, 'lat': <class 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 73, 'lon': <class 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 144}
<class 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'level', size = 10
<class 'netCDF4._netCDF4.Dimension'> (unlimited): name = 'time', size = 5
<class 'netCDF4._netCDF4.Dimension'>: name = 'lat', size = 73
<class 'netCDF4._netCDF4.Dimension'>: name = 'lon', size = 144


## Multi-file datasets
You can use the `MFDataset` class to read the data as if it were contained in a single file. Instead of using a single filename to create a `Dataset` instance, create a `MFDataset` instance with either a list of filenames, or a string with a wildcard (which is then converted to a sorted list of files using the Python glob module). Variables in the list of files that share the same unlimited dimension are aggregated together, and can be sliced across multiple files. 

In [15]:
for nf in range(10):
    with Dataset("mftest%s.nc" % nf, "w", format="NETCDF4_CLASSIC") as f:
        _ = f.createDimension("x",None)
        x = f.createVariable("x","i",("x",))
        x[0:10] = np.arange(nf*10,10*(nf+1))

# Now read all the files back in at once with MFDataset

from netCDF4 import MFDataset
f = MFDataset("mftest*nc")
print(f.variables["x"][:])

#[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
# 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
# 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
# 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
# 96 97 98 99]

# Note that MFDataset can only be used to read, not write, multi-file datasets.

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
 96 97 98 99]


#### `MFDataset(files, check=False, aggdim=None, exclude=[], master_file=None)`

`__init__(self, files, check=False, aggdim=None, exclude=[], master_file=None)`

Open a Dataset spanning multiple files, making it look as if it were a single file. Variables in the list of files that share the same dimension (specified with the keyword `aggdim`) are aggregated. If `aggdim` is not specified, the unlimited dimension is aggregated. Currently, `aggdim` must be the leftmost (slowest varying) dimension of each of the variables to be aggregated.

`files`: either a sequence of netCDF files or a string with a wildcard (converted to a sorted list of files using glob). If the `master_file` kwarg is not specified, the first file in the list will become the "master" file, defining all the variables with an aggregation dimension which may span subsequent files. Attribute access returns attributes only from the "master" file. The files are always opened in read-only mode.

`check`: True if you want consistency checking to ensure that the variable structure is correct in all of the netCDF files. Checking makes the initialization of the `MFDataset` instance much slower. Default is False.

`aggdim`: The name of the dimension to aggregate over (must be the leftmost dimension of each of the variables to be aggregated). If None (default), aggregate over the unlimited dimension.

`exclude`: A list of variable names to exclude from aggregation. Default is an empty list.

`master_file`: file to use as "master file", defining all the variables with an aggregation dimension and all global attributes.

## Variable length datasets
NetCDF 4 has support for variable-length or "ragged" arrays. These are arrays of variable length sequences having the same type. To create a variable-length data type, use the `Dataset.createVLType` method of a `Dataset` or `Group` instance.

In [16]:
f = Dataset("tst_vlen.nc","w")
vlen_t = f.createVLType(np.int32, "phony_vlen")

The numpy datatype of the variable-length sequences and the name of the new datatype must be specified. Any of the primitive datatypes can be used (signed and unsigned integers, 32 and 64 bit floats, and characters), but compound data types cannot. A new variable can then be created using this datatype.

In [17]:
x = f.createDimension("x",3)
y = f.createDimension("y",4)
vlvar = f.createVariable("phony_vlen_var", vlen_t, ("y","x"))

Since there is no native vlen datatype in NumPy, vlen arrays are represented in Python as object arrays (arrays of dtype object). These are arrays whose elements are Python object pointers, and can contain any type of python object. For this application, they must contain 1-D numpy arrays all of the same type but of varying length. In this case, they contain 1-D NumPy int32 arrays of random length between 1 and 10.

In [18]:
import random
random.seed(54321)
data = np.empty(len(y)*len(x),object)
for n in range(len(y)*len(x)):
    data[n] = np.arange(random.randint(1,10),dtype="int32")+1
data = np.reshape(data,(len(y),len(x)))
vlvar[:] = data
print("vlen variable =\n{}".format(vlvar[:]))

print('\n', f)

print('\n', f.variables["phony_vlen_var"])

print('\n', f.vltypes["phony_vlen"])

vlen variable =
[[array([1, 2, 3, 4, 5, 6, 7, 8], dtype=int32) array([1, 2], dtype=int32)
  array([1, 2, 3, 4], dtype=int32)]
 [array([1, 2, 3], dtype=int32)
  array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)
  array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)]
 [array([1, 2, 3, 4, 5, 6, 7], dtype=int32) array([1, 2, 3], dtype=int32)
  array([1, 2, 3, 4, 5, 6], dtype=int32)]
 [array([1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)
  array([1, 2, 3, 4, 5], dtype=int32) array([1, 2], dtype=int32)]]

 <class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    dimensions(sizes): x(3), y(4)
    variables(dimensions): int32 phony_vlen_var(y, x)
    groups: 

 <class 'netCDF4._netCDF4.Variable'>
vlen phony_vlen_var(y, x)
vlen data type: int32
unlimited dimensions: 
current shape = (4, 3)

 <class 'netCDF4._netCDF4.VLType'>: name = 'phony_vlen', numpy dtype = int32


NumPy object arrays containing Python strings can also be written as vlen variables. For vlen strings, you don't need to create a vlen data type. Instead, simply use the Python `str` builtin (or a NumPy string datatype with fixed length greater than 1) when calling the `Dataset.createVariable` method.

In [19]:
z = f.createDimension("z",10)
strvar = f.createVariable("strvar", str, "z")

In this example, an object array is filled with random Python strings with random lengths between 2 and 12 characters, and the data in the object array is assigned to the vlen string variable.

In [20]:
chars = "1234567890aabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
data = np.empty(10,"O")
for n in range(10):
    stringlen = random.randint(2,12)
    data[n] = "".join([random.choice(chars) for i in range(stringlen)])
strvar[:] = data
print("variable-length string variable:\n{}".format(strvar[:]))

print('\n', f)

print(f.variables["strvar"])

variable-length string variable:
['Lh' '25F8wBbMI' '53rmM' 'vvjnb3t63ao' 'qjRBQk6w' 'aJh' 'QF'
 'jtIJbJACaQk4' '3Z5' 'bftIIq']

 <class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    dimensions(sizes): x(3), y(4), z(10)
    variables(dimensions): int32 phony_vlen_var(y, x), <class 'str'> strvar(z)
    groups: 
<class 'netCDF4._netCDF4.Variable'>
vlen strvar(z)
vlen data type: <class 'str'>
unlimited dimensions: 
current shape = (10,)


It is also possible to set contents of vlen string variables with NumPy arrays of any string or unicode data type. Note, however, that accessing the contents of such variables will always return NumPy arrays with dtype object.

## Strings
The most flexible way to store arrays of strings is with the variable-length (vlen) string data type. However, this requires the use of the NETCDF4 data model, and the vlen type does not map very well onto NumPy arrays (you have to use NumPy arrays of `dtype=object`, which are arrays of arbitrary Python objects). NumPy does have a fixed-width string array data type, but unfortunately the netCDF data model does not. Instead, fixed-width byte strings are typically stored as arrays of 8-bit characters. To convert between character arrays and fixed-width NumPy string arrays, the Python interface follows this convention:
- If the `_Encoding` special attribute is set for a character array (dtype S1) variable, the chartostring utility function is used to convert the array of characters to an array of strings with one less dimension (the last dimension is interpreted as the length of each string) when reading the data. The character set (usually ascii) is specified by the `_Encoding` attribute. 
- If `_Encoding` is 'none' or 'bytes', then the character array is converted to a NumPy fixed-width byte string array (dtype S#), otherwise a NumPy unicode (dtype U#) array is created. When writing the data, stringtochar is used to convert the numpy string array to an array of characters with one more dimension. For example,

In [21]:
from netCDF4 import stringtochar
nc = Dataset('stringtest.nc','w',format='NETCDF4_CLASSIC')
_ = nc.createDimension('nchars',3)
_ = nc.createDimension('nstrings',None)
v = nc.createVariable('strings','S1',('nstrings','nchars'))
datain = np.array(['foo','bar'],dtype='S3')
v[:] = stringtochar(datain)           # manual conversion to char array
print(v[:])                           # data returned as char array

v._Encoding = 'ascii'                 # this enables automatic conversion
v[:] = datain                         # conversion to char array done internally
print(v[:])                           # data returned in numpy string array

nc.close()

[[b'f' b'o' b'o']
 [b'b' b'a' b'r']]
['foo' 'bar']


Even if the `_Encoding` attribute is set, the automatic conversion of char arrays to/from string arrays can be disabled with `Variable.set_auto_chartostring`.

A similar situation is often encountered with NumPy structured arrays with subdtypes containing fixed-width byte strings (dtype=S#). Since there is no native fixed-length string netCDF datatype, these NumPy structured arrays are mapped onto netCDF compound types with character array elements. In this case the string <-> char array conversion is handled automatically (without the need to set the `_Encoding` attribute) using NumPy views. The structured array dtype (including the string elements) can even be used to define the compound data type: the string dtype will be converted to a character array dtype under the hood when creating the netCDF compound type. Here's an example:

In [22]:
nc = Dataset('compoundstring_example.nc','w')
dtype = np.dtype([('observation', 'f4'),
                     ('station_name','S10')])
station_data_t = nc.createCompoundType(dtype,'station_data')
_ = nc.createDimension('station',None)
statdat = nc.createVariable('station_obs', station_data_t, ('station',))
data = np.empty(2,dtype)
data['observation'][:] = (123.,3.14)
data['station_name'][:] = ('Boulder','New York')
print('\n', statdat.dtype)                   # strings actually stored as character arrays

statdat[:] = data                            # strings converted to character arrays internally
print('\n', statdat[:])                      # character arrays converted back to strings

print('\n', statdat[:].dtype)

statdat.set_auto_chartostring(False)         # turn off auto-conversion
statdat[:] = data.view(dtype=[('observation', 'f4'),('station_name','S1',10)])
print('\n', statdat[:])                      # now structured array with char array subtype is returned

nc.close()


 {'names': ['observation', 'station_name'], 'formats': ['<f4', ('S1', (10,))], 'offsets': [0, 4], 'itemsize': 16, 'aligned': True}

 [(123.  , b'Boulder') (  3.14, b'New York')]

 {'names': ['observation', 'station_name'], 'formats': ['<f4', 'S10'], 'offsets': [0, 4], 'itemsize': 16, 'aligned': True}

 [(123.  , [b'B', b'o', b'u', b'l', b'd', b'e', b'r', b'', b'', b''])
 (  3.14, [b'N', b'e', b'w', b' ', b'Y', b'o', b'r', b'k', b'', b''])]


Note that there is currently no support for mapping NumPy structured arrays with unicode elements (dtype U#) onto netCDF compound types, nor is there support for netCDF compound types with vlen string components.

## Time coordinates
The functions `num2date` and `date2num` are provided by `cftime` to convert time values to and from calendar dates.

In [23]:
import netCDF4
from netCDF4 import Dataset
import numpy as np

# Setting up a file
rootgrp = Dataset("test.nc", "w")
fcstgrp = rootgrp.createGroup("forecasts")
analgrp = rootgrp.createGroup("analyses")

fcstgrp1 = rootgrp.createGroup("/forecasts/model1")
fcstgrp2 = rootgrp.createGroup("/forecasts/model2")

# we have to first make dimensions to create variables (more on this later)
level = rootgrp.createDimension("level", None)
time = rootgrp.createDimension("time", None)
lat = rootgrp.createDimension("lat", 73)
lon = rootgrp.createDimension("lon", 144)

# one dimensional data
times = rootgrp.createVariable("time","f8",("time",))
levels = rootgrp.createVariable("level","i4",("level",))
latitudes = rootgrp.createVariable("lat","f4",("lat",))
longitudes = rootgrp.createVariable("lon","f4",("lon",))

# two dimensions unlimited
temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",))
temp.units = "K"

# creating attributes
import time
rootgrp.description = "an example script"
rootgrp.history = "Created " + time.ctime(time.time())
rootgrp.source = "netCDF4 python module tutorial"
latitudes.units = "degrees north"
longitudes.units = "degrees east"
levels.units = "hPa"
temp.units = "K"
times.units = "hours since 0001-01-01 00:00:00.0"
times.calendar = "gregorian"

# Setting up values
import numpy as np
lats =  np.arange(-90,91,2.5)
lons =  np.arange(-180,180,2.5)
latitudes[:] = lats
longitudes[:] = lons
nlats = len(rootgrp.dimensions["lat"])
nlons = len(rootgrp.dimensions["lon"])

from numpy.random import uniform
temp[0:5, 0:10, :, :] = uniform(size=(5, 10, nlats, nlons))
levels[:] =  [1000.,850.,700.,500.,300.,250.,200.,150.,100.,50.]
temp[0, 0, [0,1,2,3], [0,1,2,3]].shape
tempdat = temp[::2, [1,3,6], lats>0, lons>0]

# fill in times.
from datetime import datetime, timedelta
from cftime import num2date, date2num
dates = [datetime(2001,3,1)+n*timedelta(hours=12) for n in range(temp.shape[0])]
times[:] = date2num(dates,units=times.units,calendar=times.calendar)
print("time values (in units {}):\n{}".format(times.units, times[:]))

dates = num2date(times[:],units=times.units,calendar=times.calendar)
print("dates corresponding to time values:\n{}".format(dates))

time values (in units hours since 0001-01-01 00:00:00.0):
[17533104. 17533116. 17533128. 17533140. 17533152.]
dates corresponding to time values:
[cftime.DatetimeGregorian(2001, 3, 1, 0, 0, 0, 0, has_year_zero=False)
 cftime.DatetimeGregorian(2001, 3, 1, 12, 0, 0, 0, has_year_zero=False)
 cftime.DatetimeGregorian(2001, 3, 2, 0, 0, 0, 0, has_year_zero=False)
 cftime.DatetimeGregorian(2001, 3, 2, 12, 0, 0, 0, has_year_zero=False)
 cftime.DatetimeGregorian(2001, 3, 3, 0, 0, 0, 0, has_year_zero=False)]


`num2date` converts numeric values of time in the specified units and calendar to datetime objects, and `date2num` does the reverse. All the calendars currently defined in the CF metadata convention are supported. A function called `date2index` is also provided which returns the indices of a netCDF time variable corresponding to a sequence of datetime instances.

#### `date2num(dates, units, calendar=None, has_year_zero=None)`

Return numeric time values given datetime objects. The units of the numeric time values are described by the units argument and the `calendar` keyword. The datetime objects must be in UTC with no time-zone offset. If there is a time-zone offset in units, it will be applied to the returned numeric values.

`dates`: A datetime object or a sequence of datetime objects. The datetime objects should not include a time-zone offset. They can be either native Python datetime instances (which use the proleptic gregorian calendar) or `cftime.datetime` instances.

`units`: a string of the form `<time units> since <reference time>` describing the time units. `<time units>` can be days, hours, minutes, seconds, milliseconds or microseconds. `<reference time>` is the time origin. `months since` is allowed only for the `360_day` calendar and `common_years since` is allowed only for the `365_day` calendar.

`calendar`: describes the calendar to be used in the time calculations. All the values currently defined in the CF metadata convention (http://cfconventions.org/cf-conventions/cf-conventions#calendar) are supported. Valid calendars are 'standard', 'gregorian', 'proleptic_gregorian', 'noleap', '365_day', '360_day', 'julian', 'all_leap', '366_day'. Default is `None`, which means the calendar associated with the first input datetime instance will be used.

`has_year_zero`: If set to True, astronomical year numbering is used and the year zero exists. If set to False for real-world calendars, then historical year numbering is used and the year 1 is preceded by year -1 and no year zero exists. The defaults are set to conform with CF version 1.9 conventions (False for 'julian' and 'gregorian'/'standard', True for 'proleptic_gregorian' (ISO 8601), and True for the idealized calendars 'noleap'/'365_day', '360_day' and '366_day'/'all_leap'). Note that CF v1.9 does not specifically mention whether year zero is allowed in the proleptic_gregorian calendar, but ISO 8601 has a year zero, so this has been adopted as the default. The defaults can only be overridden for the real-world calendars; for the idealized calendars the year zero always exists and the `has_year_zero` kwarg is ignored. This kwarg is not needed to define calendar systems allowed by CF (the calendar-specific defaults do this).

returns a numeric time value, or an array of numeric time values with approximately 1 microsecond accuracy.

#### `num2date(times, units, calendar=u'standard', only_use_cftime_datetimes=True, only_use_python_datetimes=False, has_year_zero=None)`

Return datetime objects given numeric time values. The units of the numeric time values are described by the units argument and the calendar keyword. The returned datetime objects represent UTC with no time-zone offset, even if the specified units contain a time-zone offset.

`times`: numeric time values.

`units`: a string of the form `<time units> since <reference time>` describing the time units. `<time units>` can be days, hours, minutes, seconds, milliseconds or microseconds. `<reference time>` is the time origin. `months since` is allowed only for the `360_day` calendar and `common_years since` is allowed only for the `365_day` calendar.

`calendar`: describes the calendar used in the time calculations. All the values currently defined in the CF metadata convention (http://cfconventions.org/cf-conventions/cf-conventions#calendar) are supported. Valid calendars are 'standard', 'gregorian', 'proleptic_gregorian', 'noleap', '365_day', '360_day', 'julian', 'all_leap', '366_day'. Default is 'standard', which is a mixed Julian/Gregorian calendar.

`only_use_cftime_datetimes`: if False, Python `datetime.datetime` objects are returned from `num2date` where possible; if True dates which subclass `cftime.datetime` are returned for all calendars. Default True.

`only_use_python_datetimes`: always return Python `datetime.datetime` objects and raise an error if this is not possible. Ignored unless `only_use_cftime_datetimes=False`. Default False.

`has_year_zero`: if set to True, astronomical year numbering is used and the year zero exists. If set to False for real-world calendars, then historical year numbering is used and the year 1 is preceded by year -1 and no year zero exists. The defaults are set to conform with CF version 1.9 conventions (False for 'julian' and 'gregorian'/'standard', True for 'proleptic_gregorian' (ISO 8601), and True for the idealized calendars 'noleap'/'365_day', '360_day' and '366_day'/'all_leap'). The defaults can only be overridden for the real-world calendars; for the idealized calendars the year zero always exists and the `has_year_zero` kwarg is ignored. This kwarg is not needed to define calendar systems allowed by CF (the calendar-specific defaults do this).

returns a datetime instance, or an array of datetime instances with microsecond accuracy, if possible.

Note: If `only_use_cftime_datetimes=False` and `only_use_python_datetimes=False`, the datetime instances returned are 'real' Python datetime objects if `calendar='proleptic_gregorian'`, or if `calendar='standard'` or `'gregorian'` and the date is after the breakpoint between the Julian and Gregorian calendars (1582-10-15). Otherwise, they are `cftime.datetime` objects, which support some but not all of the methods of native Python datetime objects. The datetime instances do not contain a time-zone offset, even if the specified units contain one.

#### `MFTime(time, units=None, calendar=None)`
`__init__(self, time, units=None, calendar=None)`

Create a time Variable with units consistent across a multifile dataset.

`time`: Time variable from a `MFDataset`.

`units`: Time units, for example, 'days since 1979-01-01'. If `None`, use the units from the master variable.

`calendar`: Calendar overload to use across all files, for example, 'standard' or 'gregorian'. If `None`, check that the calendar attribute is present on each variable and values are unique across files raising a `ValueError` otherwise.

## Review Questions

Create a file. Create a hierarchy of groups and datasets with the following details:

- These
    - crazy
        - dataset1 (1-100)
    - weird
- Many
    - odd
        - dataset2 (2D)
    - wacky
    - unusual
        - varlendataset (a variable length dataset) 

Print dataset2 using iteration.

Create a minimum of 2 attributes for each dataset.