# NeXus Data Format

NeXus is an international standard for storing neutron, x-ray and muon data. NeXus data are stored in HDF5 files, using a logical design that makes them easy to navigate, and a pre-defined dictionary of groups, fields, and attributes that describe commonly occurring experimental metadata. Further details can be found on the [NeXus website](http://www.nexusformat.org). This notebook provides a tutorial on using the Python [nexusformat package](http://nexpy.github.io/nexpy/pythonshell.html) to read, write, and manipulate NeXus data.

## Installation
Before running this notebook, you should make sure that the `nexusformat` package has been installed. If you have an Anaconda Python environment, type:

    $ conda install -c nexpy nexusformat

Otherwise, type:

    $ pip install nexusformat

> *N.B.* The `nexusformat` package uses [h5py](http://www.h5py.org) to read and write the underlying HDF5 files. It also requires [numpy](http://numpy.scipy.org/) and, if plotting is required, [matplotlib](http://www.matplotlib.org).

If already installed, you can skip the following cell:

In [None]:
!pip install nexusformat

Once installed, it can be imported using:

In [None]:
from nexusformat.nexus import *

> *N.B.* Wildcard imports are usually discouraged, but all the imported names have `NX` or `nx` as a prefix, reducing the risk of name clashes.

## NeXus Fields
NeXus data and metadata are stored as either numeric fields (scalar or array) or text strings. In the `nexusformat` package, fields can be created dynamically from the command line, even if no NeXus file is open.

In [None]:
import numpy as np
x=NXfield(np.linspace(1,10.0,10))
x

In this case, the `NXfield` object contains the NumPy array, and can be manipulated using standard NumPy operations.

In [None]:
x + 10

In [None]:
np.sin(x)

In [None]:
x+np.linspace(11,20.0,10)

Like NumPy arrays, NeXus fields have datatypes and shape, which can be manipulated using standard NumPy operations.

In [None]:
x.dtype

In [None]:
x.reshape((2,5))

It is also possible to create large empty arrays and fill them incrementally as slabs. The array is stored in an in-memory HDF5 file until it is saved to disk. 

In [None]:
x=NXfield(shape=(1000,1000), dtype=np.float32, fillvalue=np.nan)
x[0,3:6]=range(3)
x[0,0:7]

NeXus fields can contain scalars and arrays of any NumPy data type, as well as text, which is stored in NeXus files as variable-length strings, by default.

In [None]:
title=NXfield('Small-Angle Scattering')
title

## NeXus Groups
NeXus fields are stored in groups, each of which has a class that defines the type of content it contains. In `nexusformat`, these are subclasses of the `NXgroup` class.

In [None]:
instrument=NXgroup(name='instrument', nxclass='NXinstrument')
instrument

For classes defined by the NeXus standard, the group can be instantiated by its class name. You can find a list of NeXus classes in the [online manual](http://download.nexusformat.org/doc/html/classes/base_classes/).

In [None]:
sample=NXsample()
sample

Fields and groups can be added to any group, using either attribute (*i.e.*, `a.b=c`) or dictionary (*i.e.*, `a['b']=c`) assignments.

In [None]:
entry=NXentry()
entry.title=title
entry.sample=sample
entry['instrument']=instrument
print(entry.tree)

> *N.B.* Dictionary assignments are safer when there are potential name clashes with, *e.g.*, NumPy attributes, and should be used when writing scripts. However, attribute assignments are allowed as a convenience as they are quicker to type in interactive sessions.

Since group entries are defined in a dictionary, the usual dictionary operations are available.

In [None]:
'sample' in entry

In [None]:
for name in entry:
    if isinstance(entry[name], NXfield):
        print("'%s' is a field" % name)
    elif isinstance(entry[name], NXgroup):
        print("'%s' is a group" % name)

The attribute or dictionary name used in the group assignment overrides any internal name that might have been defined when creating the NeXus object.

In [None]:
temperature=NXfield(40.0, name='temp')
entry.sample.temperature=temperature
print(entry.sample.tree)

Extended path strings are recognized in dictionary assignments.

In [None]:
entry['sample/mass']=10.0
print(entry.tree)

## Attributes
Both groups and fields can have attributes that provide additional metadata or serve a structural purpose, such as identifying axes. These are stored in the `attrs` dictionary.

In [None]:
entry.sample.temperature.attrs['units']='K'
print(sample.tree)

In the case of NXfields, attributes can also be defined using keyword arguments to the `NXfield` constructor.

In [None]:
mass=NXfield(10.0,units='g')

## NXdata Groups
The NeXus format defines groups of class `NXdata`, which are designed to contain the results of an experimental measurement and make them easy to plot. To accomplish this, each group contains a field, holding a one-dimensional or multidimensional array, that is designated as the **signal**, and one or more one-dimensional fields that are designated as the **axes**.
### Creating NXdata Groups
The `nexusformat` API has a special constructor for creating `NXdata` groups, which allows the signal and axes to be specified as the first two positional arguments.

In [None]:
en=NXfield(np.linspace(10,35,26),name='energy_transfer',dtype=np.float32,units='meV',
          long_name='Energy Transfer (meV)')
co=NXfield([26,28,49,52,96,165,369,932,1933,3510,5170,6265,6863,6766,5566,4366,2789,1611,928,635,392,266,169,141,113,73],
           name='counts',dtype=np.int16,long_name='Detector Counts')
data1D=NXdata(co, [en], title='Example 1D Data')
print(data1D.tree)

In this example, the first two arguments contain fields representing the signal and axes, respectively. 
> *N.B.* In the case of one-dimensional data, it is not necessary to specify the axes as a list.

> *N.B.* We used the keyword argument, `name`, when creating the NXfields. This is the name used for the field within the NXdata group. Otherwise, default names would have been assigned to the fields, such as `signal`, `axis1`, `axis2`, *etc.*

If you want to use group assignments to generate the name automatically, you can create an empty NXdata group and define its `nxsignal` and `nxaxes` properties.

In [None]:
data2D=NXdata()
data2D.xvals=data2D.yvals=np.linspace(0,2.*np.pi,101)
X,Y=np.meshgrid(data2D.xvals,data2D.yvals)
data2D.zvals=np.sin(X)*np.sin(Y)
data2D.nxsignal=data2D.zvals
data2D.nxaxes=[data2D.yvals,data2D.xvals]
data2D.title='Example 2D Data'
print(data2D.tree)

In this example, the `signal` and `axes` attributes of the `NXdata` group were automatically set when the `nxsignal` and `nxaxes` properties were specified.
### Plotting NXdata Groups
The information within an NXdata group is sufficient to allow it to be plotted automatically.

In [None]:
%matplotlib inline
data1D.plot()
data2D.plot()

The `nexusformat` package uses Matplotlib for plotting. The first positional argument of the `plot` function can contain a Matplotlib formatting string.

In [None]:
data1D.plot('r+')

### Slicing NXdata Groups
NXdata groups can be sliced as though they were NumPy arrays, since each contains a single signal array.

In [None]:
print(data2D[0:51,0:51].tree)
data2D[0:51,0:51].plot()

In an extension of the usual Python slicing syntax, floating point values can be used to extract a slice using the axis values, rather than the signal array indices, to set the limits.

In [None]:
print(data2D[0:np.pi,0:np.pi].tree)
data2D[0:np.pi,0:np.pi].plot()
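
Conceptually, slicing by axis value means converting the floating point limits into array indices before slicing. The following NumPy-only sketch illustrates the idea with `np.searchsorted`; it is not the package's own implementation, and the way `nexusformat` treats values that fall exactly on a limit may differ.

```python
import numpy as np

axis = np.linspace(0, 2.0 * np.pi, 101)
signal = np.sin(axis)

# Limits expressed as axis values rather than array indices.
lower, upper = 0.0, np.pi

# Convert the axis values into index positions on the sorted axis.
start = np.searchsorted(axis, lower, side="left")
stop = np.searchsorted(axis, upper, side="right")

subset_axis = axis[start:stop]
subset_signal = signal[start:stop]
print(start, stop, subset_axis[0], subset_axis[-1])
```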

### Manipulating NXdata Groups
The NXdata groups can be used in arithmetic expressions involving addition, subtraction, multiplication, and division, using the same broadcast rules as NumPy arrays.

In [None]:
bkgd=100.*data1D.energy_transfer
(2*data1D+bkgd).plot()

Some statistical functions also make use of the information stored within the group. For example, the `sum`, `average`, and `moment` functions can be called directly on the group.

In [None]:
print("Sum: ", str(data1D.sum()), "Average: ", str(data1D.average()), "Moment: ", str(data1D.moment()))
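
To show how axis information can enter such statistics, here is a NumPy-only sketch using the same counts and energies. It assumes that the average is the signal-weighted mean of the axis values and that the moment is taken about that mean; the definitions used inside `nexusformat` are not spelled out here and may differ in detail.

```python
import numpy as np

energy = np.linspace(10, 35, 26)
counts = np.array([26, 28, 49, 52, 96, 165, 369, 932, 1933, 3510, 5170,
                   6265, 6863, 6766, 5566, 4366, 2789, 1611, 928, 635,
                   392, 266, 169, 141, 113, 73])

# Total number of counts in the signal.
total = counts.sum()

# Signal-weighted mean of the axis values (assumed definition).
mean_energy = (energy * counts).sum() / total

# Second moment about the mean, i.e. the weighted variance (assumed definition).
second_moment = (counts * (energy - mean_energy) ** 2).sum() / total

print(total, mean_energy, second_moment)
```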

## NeXus Design
NeXus files should normally follow a standard layout, which is designed to make it easy to understand the contents without any external documentation.

Here is a typical group hierarchy - the NXinstrument groups that are included will depend on the nature of the instrument:

* **NXentry**: This typically encapsulates all the data for a specific run or scan. There can be multiple NXentry groups, which don't necessarily have to contain the same types of data.
  * **NXsample**: Information concerning the measured sample, *e.g.*, temperature or mass
  * **NXinstrument**: The components making up the instrument
    * **NXmonochromator**: The parameters defining a monochromator, such as the incident energy
    * **NXcollimator**: The parameters defining a collimator
    * **NXdetector**: The parameters defining the detector(s)
  * **NXdata**: The measured data

This can be easily initialized by the `nexusformat` package. If the default names are acceptable, then just type the following:

In [None]:
root=NXroot(NXentry(sample,NXinstrument(NXmonochromator(),NXcollimator(),NXdetector()),data1D))
print(root.tree)

> The `NXroot` is a virtual class that defines the file-level of a NeXus tree.

Metadata can then be added to the created groups.

Once a root group has been defined, it can be saved to a file.
## Saving and Loading NeXus Data
So far, we have created the NeXus tree purely in memory, but it can be saved to disk at any time. Once it has been saved, all subsequent operations on the data are automatically written to the file.

In [None]:
print(root.tree)
root.save('example.nxs','w')

This data can be reloaded using the `nxload` function. Possible file modes are:
* `r`: Open as read-only
* `rw`: Open as read-write

In [None]:
saved_file=nxload('example.nxs','rw')
print(saved_file.tree)

If the file is opened in read-write mode, all changes to the file are saved automatically.

In [None]:
saved_file['entry/instrument/monochromator/energy']=NXfield(120.0, units='meV')
print(saved_file.tree)

NeXus files can be backed up if required.

In [None]:
saved_file.backup()
saved_file['entry/instrument/monochromator/energy']=NXfield(160.0, units='meV')
print(saved_file.nxbackup)
saved_file.restore(overwrite=True)
print(saved_file.tree)