###### Extended NeXus hierarchy for ARPES experiments

This notebook is built as a demonstrator. It collects data and metadata from a beamtime performed at FLASH by the Structural and Electronic Surface Dynamics and Dynamics of Correlated Materials groups of the Physical Chemistry department of the Fritz-Haber-Institute.

It is designed to:
1. showcase the capabilities of NeXus hierarchy in a real-world application
2. create a shared dictionary of items in the hierarchy for metadata of ARPES experiments
3. provide a tool for the conversion to NeXus of data and metadata from experiments where metadata cannot be automatically parsed.

For this, I hand-picked the parameters from a combination of a logbook, a preprocessed data file, a raw file containing the data from FLASH, and two files containing the metadata from the data processing software. If the NeXus format is adopted, higher levels of automation will be implemented.

I tried to create an entry for every piece of information that might be available and relevant for an ARPES experiment. Where I could not find information in the metadata, I still added the field, but filled with NaN or "Not found" strings.

In [1]:
import h5py
import numpy as np
import os
import six
import pytest

from nexusformat.nexus import *


def printname(name):
    print(name)

# Packages needed to convert a naive date into an unambiguous Unix timestamp

import pytz
from timezonefinder import TimezoneFinder
tf = TimezoneFinder()
from datetime import datetime as dt
from datetime import date as d
from datetime import timedelta as td
    
# Routine to calculate the unix timestamp a posteriori from naive date 
# (not timezone and daylight saving time aware) and geographical location.

def naivedate2stamp(longitude,latitude,datestring):
    localmoment_naive = dt.strptime(datestring, '%Y-%m-%dT%H:%M:%S')
    localmoment_zone = pytz.timezone(tf.timezone_at(lng=longitude,lat=latitude))
    epochmoment_aware = pytz.utc.localize(dt(1970,1,1,0,0,0),is_dst=False)
    try:
        localmoment_aware = localmoment_zone.localize(localmoment_naive, is_dst=None)
        utcmoment_aware = localmoment_aware.astimezone(pytz.utc)
        unixmoment = (utcmoment_aware-epochmoment_aware)/td(seconds=1.0)
        return unixmoment
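    except pytz.exceptions.InvalidTimeError as e:
        # With is_dst=None, pytz raises NonExistentTimeError or AmbiguousTimeError
        # for local times skipped or repeated by a daylight saving transition.
        print(type(e).__name__)
        return np.nan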

# Read data from hdf5 file

In [2]:
#Reading the volumetric binned data from hdf5 file
in_d = h5py.File('Original Files/201805_WSe2_LCPpump_LPprobe.h5', 'r')
#Show hdf5 hierarchy (dictionary-like syntax)
in_d.visit(printname) # May have large output

axes
axes/E
axes/kx
axes/ky
axes/tpp
binned
binned/V


# Create NXentry object

General entry object. An entry relates to a single measurement. Parameters directly at this level contain very general information that puts the experiment in context, such as the proposal title, the beamtime ID, and similar.

In [3]:
root=NXroot(NXentry())

In [4]:
# Fill general parameters of NXentry
root.entry.experiment_title=NXfield('Excited-state dynamics of WSe2 in the Valence Band and Core-Levels')
root.entry.experiment_location=NXfield('Hamburg, Germany')
root.entry.experiment_geotag=NXfield([9.8813158,53.580646], units='deg')
# The geotag might seem excessive, but it is the only way to recover the timestamp unambiguously
# if the dates are naive. The list contains first longitude, then latitude.
root.entry.experiment_start_date=NXfield('2018-04-23T00:00:00')
start_tstamp=naivedate2stamp(root.entry.experiment_geotag[0],root.entry.experiment_geotag[1],\
                             str(root.entry.experiment_start_date))
root.entry.experiment_start_date_timestamp=NXfield(start_tstamp, units='s')
root.entry.experiment_end_date=NXfield('2018-05-03T00:00:00')
end_tstamp=naivedate2stamp(root.entry.experiment_geotag[0],root.entry.experiment_geotag[1],\
                           str(root.entry.experiment_end_date))
root.entry.experiment_end_date_timestamp=NXfield(end_tstamp, units='s')
# hdf5 cannot recognize datetime formats, so we propose to store dates as ISO8601
# i.e. as a string with the format yyyy-mm-ddThh:mm:ss.
# For improved machine readability, it is also possible to save the unix timestamp.
# A posteriori timestamping can be done using the conversion tool demonstrated here.
root.entry.experiment_summary=NXfield('Characterization of WSe2 electronic structure evolution \
with soft X-ray probe and NIR pump, measuring shallow core-levels and valence band. \
Pump polarization is changed through various states.')
root.entry.experiment_identifier=NXfield('F-20170538')
root.entry.experiment_run_cycle=NXfield('2018 User Run Block 2')
root.entry.experiment_institution=NXfield('Deutsches Elektronen Synchrotron - Helmholtz-Gemeinschaft')
root.entry.experiment_facility=NXfield('FLASH')
root.entry.experiment_laboratory=NXfield('PG2')
root.entry.entry_title=NXfield('Left Circularly Polarized Pump - Linearly Polarized Probe - Valence Band Dynamics')
root.entry.entry_identifier=NXfield('Run 22118')
root.entry.entry_start_time=NXfield('2018-05-01T07:22:00')
meas_start_tstamp=naivedate2stamp(root.entry.experiment_geotag[0],root.entry.experiment_geotag[1],\
                             str(root.entry.entry_start_time))
root.entry.entry_start_time_timestamp=NXfield(meas_start_tstamp, units='s')
root.entry.entry_end_time=NXfield('2018-05-01T09:22:00')
meas_end_tstamp=naivedate2stamp(root.entry.experiment_geotag[0],root.entry.experiment_geotag[1],\
                             str(root.entry.entry_end_time))
root.entry.entry_end_time_timestamp=NXfield(meas_end_tstamp, units='s')
root.entry.entry_duration=NXfield(7200, units='s')
root.entry.entry_collection_time=NXfield(7200, units='s')
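
The ISO 8601 strings above are naive, which is why the geotag is needed. As a cross-check, here is a minimal sketch that uses only the standard library. It assumes the local UTC offset is already known (+02:00, CEST, for Hamburg in May 2018) and written into the string; the helper name `iso_to_unix` is hypothetical and not part of the notebook.

```python
from datetime import datetime

def iso_to_unix(iso_string):
    """Convert an ISO 8601 string with an explicit UTC offset to a Unix timestamp."""
    moment = datetime.fromisoformat(iso_string)
    if moment.tzinfo is None:
        # A naive date cannot be mapped to a unique instant without extra information
        raise ValueError("naive date: the UTC offset is missing")
    return moment.timestamp()

# Assumption: the measurement times are local Hamburg times in CEST (+02:00)
start = iso_to_unix('2018-05-01T07:22:00+02:00')
end = iso_to_unix('2018-05-01T09:22:00+02:00')
duration = end - start  # seconds
```

The difference between the two timestamps reproduces the 7200 s stored in entry_duration.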

## Create NXuser object

User parameter folder. Contains information on the user.

In [5]:
root.entry.user=NXuser()

In [6]:
# Fill general parameters of NXuser
root.entry.user.name=NXfield('Hex')
root.entry.user.role=NXfield('Not Found')
root.entry.user.affiliation=NXfield('Not Found')
root.entry.user.address=NXfield('Not Found')
root.entry.user.telephone_number=NXfield('Not found')
root.entry.user.email=NXfield('Not found')
root.entry.user.facility_user_id=NXfield('Not found')

## Create NXinstrument object

General "Instrument" folder. The "instrument" designation refers to the whole set-up. Each individual component is defined by subclasses of this. As parameters of this group, I added the ones that are referring to an overall performance of the set-up and not of its single components, like temporal, energetic and spatial resolution.

In [7]:
root.entry.instrument=NXinstrument()

In [8]:
# Fill general parameters of NXinstrument
root.entry.instrument.instrument_name=NXfield('HEXTOF @ PG-2')
root.entry.instrument.instrument_temporal_resolution=NXfield(100, units='fs')
root.entry.instrument.instrument_spatial_resolution=NXfield(500, units='um')
root.entry.instrument.instrument_energy_resolution=NXfield(100, units='meV')
root.entry.instrument.instrument_description=NXfield('Time-of-flight momentum microscope equipped with two level hexagonal delay line detector, at the endstation of beamline PG-2 of FLASH')

### Create source:NXsource object

The "source" group refers to the properties of the light source used in the experiment, NOT the properties of the beam at the sample. This group is useful for complex sources such as large facilities, complex laser systems, or sources with a complex time structure (as in this case).

We propose to use "source" for the primary probe source, to maximize consistency across the community. Further sources in multi-beam experiments are then given specific names such as "source_pump", "source_2", etc.

In [9]:
root.entry.instrument.source=NXsource()

In [10]:
root.entry.instrument.source.name=NXfield('FLASH')
root.entry.instrument.source.type=NXfield('Free Electron Laser')
root.entry.instrument.source.probe=NXfield('x-ray')
root.entry.instrument.source.photon_energy=NXfield(36.41060, units='eV')
root.entry.instrument.source.pulse_energy=NXfield(1.65677118301392, units='uJ')
root.entry.instrument.source.center_wavelength=NXfield(34.05168, units='nm')
root.entry.instrument.source.average_power=NXfield(8.28385591506958, units='mW')
root.entry.instrument.source.peak_power=NXfield(23.6681597573417, units='MW')
root.entry.instrument.source.emittance_x=NXfield(1.4, units='um')
root.entry.instrument.source.emittance_y=NXfield(1.4, units='um')
root.entry.instrument.source.size_x=NXfield(np.nan, units='um')
root.entry.instrument.source.size_y=NXfield(np.nan, units='um')
root.entry.instrument.source.flux=NXfield(4.569e17, units='1/(s cm^2)')
root.entry.instrument.source.energy=NXfield(427, units='MeV')
root.entry.instrument.source.current=NXfield(1, units='uA')
root.entry.instrument.source.frequency=NXfield(10, units='Hz')
root.entry.instrument.source.pulse_duration=NXfield(70, units='fs')
source_photon_energy=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Photon energy',units='eV')
source_photon_counts=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Intensity',units='cts')
root.entry.instrument.source.spectrum=NXdata(source_photon_counts,source_photon_energy)
# Owing to the complex time structure of DESY pulses, I had to distinguish between bursts (10 Hz) and bunches (1 MHz)
root.entry.instrument.source.mode=NXfield('Burst')
root.entry.instrument.source.number_of_bursts=NXfield(1)
root.entry.instrument.source.burst_length=NXfield(500, units='us')
root.entry.instrument.source.burst_distance=NXfield(199.5, units='ms')
root.entry.instrument.source.bunch_length=NXfield(100, units='fs')
root.entry.instrument.source.bunch_distance=NXfield(1, units='us')
root.entry.instrument.source.number_of_bunches=NXfield(500)
root.entry.instrument.source.top_up=NXfield(True)
source_bunch_time=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Time',units='ps')
source_bunch_count=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Bunches',units='cts')
root.entry.instrument.source.bunch_pattern=NXdata(source_bunch_count,source_bunch_time)
root.entry.instrument.source.burst_number_start=NXfield(102644001)
root.entry.instrument.source.burst_number_end=NXfield(102680129)
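
The power values stored in the source group can be cross-checked against the pulse parameters. The sketch below assumes that the average power is the pulse energy times the number of bunches per burst times the burst rate, and that the peak power is the pulse energy divided by the pulse duration (a rectangular pulse). The source does not state how the values were obtained; these relations simply reproduce them. The function names are hypothetical.

```python
def average_power_W(pulse_energy_J, bunches_per_burst, burst_rate_Hz):
    """Average power, assuming every bunch carries the same pulse energy."""
    return pulse_energy_J * bunches_per_burst * burst_rate_Hz

def peak_power_W(pulse_energy_J, pulse_duration_s):
    """Peak power, assuming a rectangular temporal profile."""
    return pulse_energy_J / pulse_duration_s

# Values copied from the source group above (1.657 uJ, 500 bunches, 10 Hz, 70 fs)
fel_average = average_power_W(1.65677118301392e-6, 500, 10)
fel_peak = peak_power_W(1.65677118301392e-6, 70e-15)

print(f"average power: {fel_average * 1e3:.5f} mW")
print(f"peak power: {fel_peak * 1e-6:.4f} MW")
```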

### Create source_pump:NXsource object

A second source in a pump-probe experiment is the pump source. Creating a second group of the NXsource class does not make the hierarchy ambiguous, because:
1. the class is only an attribute; what appears in the hierarchy structure is the name
2. definitions may require the existence of a group named "source" of class "NXsource" that is fully general and refers only to the probe source, so it cannot be confused with "source_pump"

In [11]:
root.entry.instrument.source_pump=NXsource()

In [12]:
root.entry.instrument.source_pump.name=NXfield('User Laser @ FLASH')
root.entry.instrument.source_pump.type=NXfield('OPCPA')
root.entry.instrument.source_pump.probe=NXfield('NIR')
root.entry.instrument.source_pump.photon_energy=NXfield(1.55, units='eV')
root.entry.instrument.source_pump.pulse_energy=NXfield(98.2, units='uJ')
root.entry.instrument.source_pump.center_wavelength=NXfield(800, units='nm')
root.entry.instrument.source_pump.average_power=NXfield(392.8, units='mW')
root.entry.instrument.source_pump.peak_power=NXfield(1.964, units='GW')
root.entry.instrument.source_pump.frequency=NXfield(10, units='Hz')
root.entry.instrument.source_pump.pulse_duration=NXfield(50, units='fs')
pump_photon_energy=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Photon energy',units='eV')
pump_photon_counts=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Intensity',units='cts')
root.entry.instrument.source_pump.spectrum=NXdata(pump_photon_counts,pump_photon_energy)
root.entry.instrument.source_pump.mode=NXfield('Burst')
root.entry.instrument.source_pump.number_of_bursts=NXfield(1)
root.entry.instrument.source_pump.burst_length=NXfield(400, units='us')
root.entry.instrument.source_pump.burst_distance=NXfield(199.6, units='ms')
root.entry.instrument.source_pump.bunch_length=NXfield(50, units='fs')
root.entry.instrument.source_pump.bunch_distance=NXfield(1, units='us')
root.entry.instrument.source_pump.number_of_bunches=NXfield(400)
root.entry.instrument.source_pump.rms_jitter=NXfield(204.68816194453154, units='fs')
pump_bunch_time=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Time',units='ps')
pump_bunch_count=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Bunches',units='cts')
root.entry.instrument.source_pump.bunch_pattern=NXdata(pump_bunch_count,pump_bunch_time)
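
Both source groups store photon_energy and center_wavelength side by side. A minimal sketch of the consistency check between the two, using E = hc/λ with hc ≈ 1239.841984 eV·nm; the function name is hypothetical.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV*nm

def wavelength_nm_to_ev(wavelength_nm):
    """Photon energy in eV for a given vacuum wavelength in nm."""
    return HC_EV_NM / wavelength_nm

# Values copied from the source and source_pump groups above
probe_ev = wavelength_nm_to_ev(34.05168)  # stored photon_energy: 36.41060 eV
pump_ev = wavelength_nm_to_ev(800)        # stored photon_energy: 1.55 eV
```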

### Create beam_pump_0 object

To define the beam properties at the sample location, the most appropriate tool in the NeXus hierarchy is NXbeam. It is designed to define the beam properties at any point where metadata are collected in a beamline. The first entry is the distance from the sample, so I propose the naming convention "beam_pump_n" or "beam_probe_n" where n is the distance from the sample in cm.

In [13]:
root.entry.instrument.beam_pump_0=NXbeam()

In [14]:
root.entry.instrument.beam_pump_0.distance=NXfield(0, units='cm')
root.entry.instrument.beam_pump_0.pulse_energy=NXfield(15.71, units='uJ')
root.entry.instrument.beam_pump_0.average_power=NXfield(62.848, units='mW')
root.entry.instrument.beam_pump_0.photon_energy=NXfield(1.55, units='eV')
root.entry.instrument.beam_pump_0.center_wavelength=NXfield(800, units='nm')
root.entry.instrument.beam_pump_0.polarization_angle=NXfield(np.nan, units='deg') # Angle of polarization ellipse 
# from plane of incidence. For circular polarization, it is not defined.
root.entry.instrument.beam_pump_0.polarization_ellipticity=NXfield(-1.0) # Ellipticity of polarization.
# 0 for linear, +1 for RCP, -1 for LCP.
root.entry.instrument.beam_pump_0.size_x=NXfield(500, units='um')
root.entry.instrument.beam_pump_0.size_y=NXfield(200, units='um')
root.entry.instrument.beam_pump_0.fluence=NXfield(5, units='mJ/cm^2')
root.entry.instrument.beam_pump_0.pulse_duration=NXfield(50, units='fs')
# Parameters to define pulse chirp
pump_frequency=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Frequency',units='nm')
pump_rel_time=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Time relative to pulse center',units='fs')
root.entry.instrument.beam_pump_0.chirp_spectrum=NXdata(pump_frequency,pump_rel_time)
root.entry.instrument.beam_pump_0.chirp_type=NXfield('None')
root.entry.instrument.beam_pump_0.chirp_GDD=NXfield(np.nan, units='fs^2') # Constant for linear chirp
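
The fluence, pulse energy and spot size stored above are related. The sketch below assumes that size_x and size_y are the semi-axes of an elliptical, uniformly illuminated spot. The notebook does not state this convention; it is inferred only because it reproduces the stored 5 mJ/cm^2. The function name is hypothetical.

```python
import math

def fluence_mJ_per_cm2(pulse_energy_uJ, semi_axis_x_um, semi_axis_y_um):
    """Fluence of a uniformly illuminated elliptical spot (assumed geometry)."""
    # 1 um = 1e-4 cm, so the area comes out in cm^2
    area_cm2 = math.pi * (semi_axis_x_um * 1e-4) * (semi_axis_y_um * 1e-4)
    # 1 uJ = 1e-3 mJ
    return pulse_energy_uJ * 1e-3 / area_cm2

# Values copied from beam_pump_0 above
pump_fluence = fluence_mJ_per_cm2(15.71, 500, 200)
```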

### Create beam_probe_0 object

Analogous structure for the probe beam.

In [15]:
root.entry.instrument.beam_probe_0=NXbeam()

In [16]:
root.entry.instrument.beam_probe_0.distance=NXfield(0, units='cm')
root.entry.instrument.beam_probe_0.pulse_energy=NXfield(1.24258, units='nJ')
root.entry.instrument.beam_probe_0.average_power=NXfield(6.21289, units='uW')
root.entry.instrument.beam_probe_0.photon_energy=NXfield(36.49699020385742, units='eV')
root.entry.instrument.beam_probe_0.polarization_angle=NXfield(0.0, units='deg') # Angle of polarization ellipse 
# from plane of incidence. For circular polarization, it is not defined.
root.entry.instrument.beam_probe_0.polarization_ellipticity=NXfield(0.0) # Ellipticity of polarization.
# 0 for linear, +1 for RCP, -1 for LCP.
root.entry.instrument.beam_probe_0.size_x=NXfield(500, units='um')
root.entry.instrument.beam_probe_0.size_y=NXfield(200, units='um')
root.entry.instrument.beam_probe_0.fluence=NXfield(np.nan, units='mJ/cm^2')
root.entry.instrument.beam_probe_0.pulse_duration=NXfield(70, units='fs')
# Parameters for pulse chirp
probe_frequency=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Frequency',units='nm')
probe_rel_time=NXfield(shape=(1000),dtype=np.float32,fillervalue=np.nan,name='Time relative to pulse center',units='fs')
root.entry.instrument.beam_probe_0.chirp_spectrum=NXdata(probe_frequency,probe_rel_time)
root.entry.instrument.beam_probe_0.chirp_type=NXfield('None')
root.entry.instrument.beam_probe_0.chirp_GDD=NXfield(np.nan, units='fs^2') # Constant for linear chirp

### Create NXslit object

X-ray sources often have beam shaping apertures that can be used to control beam properties.

In [17]:
root.entry.instrument.slit_WAU=NXslit()

In [18]:
root.entry.instrument.slit_WAU.x_gap=NXfield(0.44, units='mm')
root.entry.instrument.slit_WAU.y_gap=NXfield(0.447, units='mm')
root.entry.instrument.slit_WAU.x_pos=NXfield(16, units='mm')
root.entry.instrument.slit_WAU.y_pos=NXfield(0.5, units='mm')

### Create NXmonochromator object

Group for monochromator metadata, can include specific information about motor positions etc.

In [19]:
root.entry.instrument.monochromator=NXmonochromator()
root.entry.instrument.monochromator.slit=NXslit()
root.entry.instrument.monochromator.grating=NXgrating()

In [20]:
root.entry.instrument.monochromator.energy=NXfield(36.49699020385742, units='eV')
root.entry.instrument.monochromator.energy_error=NXfield(0.21867309510707855, units='eV')
root.entry.instrument.monochromator.grating.dispersion=NXfield(0.10933390259742737, units='eV/mm')
root.entry.instrument.monochromator.slit.y_gap=NXfield(2000.04833984375, units='um')

### Create NXattenuator objects

Many sources are not easily dimmable and require attenuators. The NeXus hierarchy does not include the "type" and "thickness/pressure" properties that are necessary for soft X-ray/UV attenuators.

In [21]:
root.entry.instrument.attenuator_gas=NXattenuator()

In [22]:
root.entry.instrument.attenuator_gas.type=NXfield('Gas')
root.entry.instrument.attenuator_gas.material=NXfield('Argon')
root.entry.instrument.attenuator_gas.pressure=NXfield(0.008980000391602516,units='mbar')
root.entry.instrument.attenuator_gas.status=NXfield('in')

In [23]:
root.entry.instrument.attenuator_al1=NXattenuator()

In [24]:
root.entry.instrument.attenuator_al1.type=NXfield('Freestanding film')
root.entry.instrument.attenuator_al1.material=NXfield('Aluminium')
root.entry.instrument.attenuator_al1.thickness=NXfield(2043,units='nm')
root.entry.instrument.attenuator_al1.status=NXfield('Out')

In [25]:
root.entry.instrument.attenuator_al2=NXattenuator()

In [26]:
root.entry.instrument.attenuator_al2.type=NXfield('Freestanding film')
root.entry.instrument.attenuator_al2.material=NXfield('Aluminium')
root.entry.instrument.attenuator_al2.thickness=NXfield(960,units='nm')
root.entry.instrument.attenuator_al2.status=NXfield('in')

In [27]:
root.entry.instrument.attenuator_pump=NXattenuator()

In [28]:
root.entry.instrument.attenuator_pump.type=NXfield('Waveplate-Polarizer')
root.entry.instrument.attenuator_pump.position=NXfield(1.4112176722846925E-4, units='deg')
root.entry.instrument.attenuator_pump.attenuation=NXfield(0.84)
root.entry.instrument.attenuator_pump.status=NXfield('in')

### Create manipulator:NXpositioner object 

NXpositioner appears to be the class intended for manipulators and other micromotors. I added a "type" field, because a manipulator is generally either a hexapod (six translational axes, as in the present case) or a rod manipulator. Since hexapods often provide a way to convert the translations into angles, I also included the angular coordinates.

In [29]:
root.entry.instrument.manipulator=NXpositioner()

In [30]:
root.entry.instrument.manipulator.type=NXfield('Hexapod')
root.entry.instrument.manipulator.pos_x1=NXfield(11.3,units='um')
root.entry.instrument.manipulator.pos_x2=NXfield(11.3,units='um')
root.entry.instrument.manipulator.pos_y=NXfield(7.2,units='um')
root.entry.instrument.manipulator.pos_z1=NXfield(20.77,units='um')
root.entry.instrument.manipulator.pos_z2=NXfield(21.20,units='um')
root.entry.instrument.manipulator.pos_z3=NXfield(20.22,units='um')
root.entry.instrument.manipulator.pos_azimuth=NXfield(np.nan,units='deg')
root.entry.instrument.manipulator.pos_tilt=NXfield(np.nan,units='deg')
root.entry.instrument.manipulator.pos_polar=NXfield(np.nan,units='deg')
root.entry.instrument.manipulator.cryocoolant=NXfield('None')
root.entry.instrument.manipulator.cryostat_temperature=NXfield(300,units='K')
root.entry.instrument.manipulator.sample_temperature=NXfield(300,units='K')
root.entry.instrument.manipulator.drain_current=NXfield(np.nan,units='pA')
root.entry.instrument.manipulator.sample_bias=NXfield(29,units='V')
root.entry.instrument.manipulator.heater_power=NXfield(0,units='W')

### Create NXdetector object

NXdetector is the class of objects that store the metadata of the analyser.

In [31]:
root.entry.instrument.analyser=NXdetector()

In [32]:
root.entry.instrument.analyser.extractor_voltage=NXfield(6030, units='V')
root.entry.instrument.analyser.extractor_current=NXfield(-1.5e-6, units='A')
root.entry.instrument.analyser.working_distance=NXfield(4, units='mm')
root.entry.instrument.analyser.lens_mode=NXfield('20180430_KPEEM_M_-2.5_FoV6.2_rezAA_20ToF_focused.sav')
root.entry.instrument.analyser.lens_names=NXfield(['Extr','Foc1','Z1','Z2','UFA','A','B','C','D','E','F','G','H','I','TOF','MCPfront'])
root.entry.instrument.analyser.lens_voltages=NXfield([6030,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,20,np.nan], units='V')
root.entry.instrument.analyser.projection=NXfield('reciprocal')
root.entry.instrument.analyser.magnification=NXfield(-1.5)
root.entry.instrument.analyser.field_aperture=NXfield(750, units='um')
root.entry.instrument.analyser.field_aperture_x=NXfield(-0.200, units='um')
root.entry.instrument.analyser.field_aperture_y=NXfield(5.350, units='um')
root.entry.instrument.analyser.contrast_aperture=NXfield('Out', units='um')
root.entry.instrument.analyser.contrast_aperture_x=NXfield(np.nan, units='um')
root.entry.instrument.analyser.contrast_aperture_y=NXfield(np.nan, units='um')
root.entry.instrument.analyser.dispersion_scheme=NXfield('Time of flight')
root.entry.instrument.analyser.pass_energy=NXfield(20, units='eV')
root.entry.instrument.analyser.energy_scan_mode=NXfield('fixed')
root.entry.instrument.analyser.tof_distance=NXfield(0.9, units='m')
root.entry.instrument.analyser.amplifier_type=NXfield('MCP')
root.entry.instrument.analyser.detector_type=NXfield('DLD')
root.entry.instrument.analyser.detector_voltage=NXfield(np.nan, units='V')
root.entry.instrument.analyser.sensor_size=NXfield(40, units='mm')
root.entry.instrument.analyser.sensor_count=NXfield(4)
root.entry.instrument.analyser.pixel_size=NXfield(np.nan, units='um')

## Create NXsample object

NXsample is a base class (subordinate only to the entry) that stores information about the sample. Note that the temperature is linked from the sample_temperature field of the manipulator, as an example of exposing data from the NXinstrument hierarchy at a higher, more descriptive level.

In [33]:
root.entry.sample=NXsample()

In [34]:
root.entry.sample.name=NXfield('WSe2')
root.entry.sample.sample_id=NXfield('None')
root.entry.sample.state=NXfield('monocrystalline solid')
root.entry.sample.purity=NXfield(0.999)
root.entry.sample.surface_orientation=NXfield('0001')
root.entry.sample.layer=NXfield('bulk')
root.entry.sample.space_group=NXfield(194)
root.entry.sample.chemical_formula=NXfield('WSe2')
root.entry.sample.chemical_name=NXfield('Tungsten diselenide')
root.entry.sample.chem_id_cas=NXfield('12067-46-8')
root.entry.sample.temperature=NXlink(root.entry.instrument.manipulator.sample_temperature)
root.entry.sample.pressure=NXfield(3.27e-10, units='mbar')
root.entry.sample.gas=NXfield('residual')
root.entry.sample.E_field=NXfield(np.nan, units='V')
root.entry.sample.B_field=NXfield(np.nan, units='T')
root.entry.sample.stress_field=NXfield(np.nan, units='Pa')
root.entry.sample.thickness=NXfield(0.5, units='mm')
root.entry.sample.surface_dopant=NXfield('None')
root.entry.sample.surface_dopant_coverage=NXfield(0, units='Å')
root.entry.sample.bias=NXlink(root.entry.instrument.manipulator.sample_bias)
root.entry.sample.drain_current=NXlink(root.entry.instrument.manipulator.drain_current)
root.entry.sample.growth_method=NXfield('Chemical Vapor Deposition')
root.entry.sample.preparation_method=NXfield('in-vacuum cleave')
root.entry.sample.preparation_date=NXfield('Not found')
root.entry.sample.vendor=NXfield('HQ Graphene')
root.entry.sample.substrate_material=NXfield('Not found')
root.entry.sample.substrate_state=NXfield('Not found')
root.entry.sample.substrate_vendor=NXfield('Not found')
root.entry.sample.substrate_notes=NXfield('Not found')
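
To illustrate what linking means at the HDF5 level, here is a minimal sketch written directly with h5py rather than nexusformat. It uses an in-memory file and a plain HDF5 hard link, so that the same dataset is reachable from two paths. By NeXus convention a link also carries a `target` attribute, which this sketch omits; the file name is a placeholder.

```python
import h5py

# In-memory file: nothing is written to disk
with h5py.File('link_demo.h5', 'w', driver='core', backing_store=False) as f:
    # The value is stored once, under the manipulator
    source = f.create_dataset(
        'entry/instrument/manipulator/sample_temperature', data=300.0)
    source.attrs['units'] = 'K'

    # Hard link: a second name for the same dataset, no copy is made
    f.create_group('entry/sample')
    f['entry/sample/temperature'] = source

    # A change made through one path is visible through the other
    source[()] = 250.0
    linked_value = float(f['entry/sample/temperature'][()])
    linked_units = f['entry/sample/temperature'].attrs['units']
```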

## Create NXprocess object

NXprocess is the class of groups describing the processing the data has undergone. For our heavily preprocessed data the standard substructure is too simple, so I had to create subgroups. These are not real NX classes; they are generic groups wrapped with the attributes that I supply.

Note that the final calibrated axes are saved here and only linked in the main data group.

In [35]:
root.entry.process=NXprocess()

In [36]:
# NXprocess doesn't have a substructure, but in our case of complex preprocessing it is needed. 
root.entry.process.distortion=NXgroup(name='NXdistortion')
root.entry.process.registration=NXgroup(name='NXregistration')
root.entry.process.calibration_k=NXgroup(name='NXcalibration')
root.entry.process.calibration_e=NXgroup(name='NXcalibration')
root.entry.process.correction=NXgroup(name='NXcorrection')
root.entry.process.enhancement=NXgroup(name='NXenhancement')

In [37]:
# Define program characteristics
root.entry.process.program_name=NXfield('FLASH_CBSignalAnalysis_QWP=183_80x80x146x230_deposit.ipynb')
root.entry.process.program_version=NXfield('alpha')
root.entry.process.program_codebase=NXfield('mpes')
root.entry.process.program_sequence=NXfield(['distortion_correction','translation','rotation', 'calibration'])
# Some transformations involved in this process are non-commutative so the order is important

# Metadata on distortion correction
root.entry.process.distortion.applied=NXfield(True)
root.entry.process.distortion.symmetry=NXfield(6)
root.entry.process.distortion.symmetry_angle=NXfield(60)
root.entry.process.distortion.original_centre=NXfield(shape=(2),dtype=np.float32,fillervalue=np.nan)
root.entry.process.distortion.original_points=NXfield(shape=(2,6),dtype=np.float32,fillervalue=np.nan)
root.entry.process.distortion.field=NXfield(shape=(80,80),dtype=np.float32,fillervalue=np.nan)

# Metadata on image registration
root.entry.process.registration.applied=NXfield(True)
root.entry.process.registration.x_translation=NXfield(np.nan)
root.entry.process.registration.y_translation=NXfield(np.nan)
root.entry.process.registration.new_centre=NXfield([np.nan,np.nan])
root.entry.process.registration.rotation_angle=NXfield(np.nan)
root.entry.process.registration.rotation_centre=NXfield([np.nan,np.nan])
root.entry.process.registration.scaling=NXfield(np.nan)

# Metadata on calibration
root.entry.process.calibration_k.applied=NXfield(True)
root.entry.process.calibration_k.x_scaling=NXfield(np.nan)
root.entry.process.calibration_k.y_scaling=NXfield(np.nan)

root.entry.process.calibration_e.applied=NXfield(True)
root.entry.process.calibration_e.files=NXfield(shape=(7),fillervalue='Not found.h5')
root.entry.process.calibration_e.biases=NXfield(shape=(7),dtype=np.float32,fillervalue=np.nan)
root.entry.process.calibration_e.peaks=NXfield(shape=(7),dtype=np.float32,fillervalue=np.nan)
root.entry.process.calibration_e.E0=NXfield(shape=(7),dtype=np.float32,fillervalue=np.nan)
root.entry.process.calibration_e.offset=NXfield(shape=(7),dtype=np.float32,fillervalue=np.nan)
root.entry.process.calibration_e.coefficients=NXfield(shape=(2),dtype=np.float32,fillervalue=np.nan)

# Define calculated axes
root.entry.process.calculated_kx=NXfield(in_d['axes']['kx'],name='k_x',units='1/Å')
root.entry.process.calculated_ky=NXfield(in_d['axes']['ky'],name='k_y',units='1/Å')
root.entry.process.calculated_Energy=NXfield(in_d['axes']['E'],name='Energy',units='eV')
root.entry.process.calculated_Tpp=NXfield(in_d['axes']['tpp'],name='Pump-probe delay',units='fs')

# Define original axes
px_x=NXfield(np.linspace(0,1,80),dtype=np.float64,name='pixel x')
px_y=NXfield(np.linspace(0,1,80),dtype=np.float64,name='pixel y')
tof=NXfield(shape=(146),dtype=np.float32,fillervalue=np.nan,name='time of flight', units='ns')
ADC=NXfield(shape=(195),dtype=np.float32,fillervalue=np.nan,name='delay stage position')

# Define NXdata fields for the conversion axes
z1=NXlink(root.entry.process.calculated_kx)
root.entry.process.x_to_k_x=NXdata(z1,px_x)
z2=NXlink(root.entry.process.calculated_ky)
root.entry.process.y_to_k_y=NXdata(z2,px_y)
z3=NXlink(root.entry.process.calculated_Energy)
root.entry.process.tof_to_E=NXdata(z3,tof)
z4=NXlink(root.entry.process.calculated_Tpp)
root.entry.process.stage_to_delay=NXdata(z4,ADC)

root.entry.process.other_converts=NXfield('Not found')

# Metadata on other data preprocessing (to be further expanded)
root.entry.process.enhancement.MCLAHE_applied=NXfield(False)
root.entry.process.enhancement.MCLAHE_parameters=NXfield(shape=(3),dtype=np.float32,fillervalue=np.nan)
root.entry.process.correction.space_charge_applied=NXfield(False)
root.entry.process.correction.space_charge_parameters=NXfield(shape=(3),dtype=np.float32,fillervalue=np.nan)
root.entry.process.correction.flat_field_applied=NXfield(False)
root.entry.process.correction.flat_field_map=NXfield(shape=(80,80),dtype=np.float32,fillervalue=np.nan)

## Create NXdata object

Main data group containing the data relevant for the entry. The calibrated axes on which it should be plotted are linked from the NXprocess group, where they are stored.

NXfield inherits its data allocation from h5py. When the data exceeds a certain size, it is split into chunks for better memory stability, and lossless compression is applied automatically. While chunking may be desirable here, compression may not be. Setting "compression_opts" to a value between 0 and 9 goes from no compression to maximum compression, respectively.
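
The effect of the compression level can be seen with h5py alone. The sketch below is not the notebook's data path: it writes a hypothetical all-zero array twice to an in-memory file, once with gzip level 0 and once with level 9, and compares the space each dataset occupies.

```python
import h5py
import numpy as np

# Hypothetical stand-in for the binned volume (all zeros, so it compresses well)
volume = np.zeros((80, 80, 20), dtype=np.float32)

with h5py.File('compression_demo.h5', 'w', driver='core', backing_store=False) as f:
    # Level 0: chunked and passed through gzip, but effectively uncompressed
    raw = f.create_dataset('raw', data=volume, chunks=True,
                           compression='gzip', compression_opts=0)
    # Level 9: maximum gzip compression
    packed = f.create_dataset('packed', data=volume, chunks=True,
                              compression='gzip', compression_opts=9)
    raw_bytes = raw.id.get_storage_size()
    packed_bytes = packed.id.get_storage_size()
    chunk_shape = raw.chunks  # chunk shape chosen automatically by h5py

print(raw_bytes, packed_bytes, chunk_shape)
```

Real photoemission data compresses far less than an array of zeros, so the ratio printed here is an upper bound rather than a typical figure.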

In [38]:
# Create the NXdata object
#Build the axes with names and units
kx=NXlink(root.entry.process.calculated_kx)
ky=NXlink(root.entry.process.calculated_ky)
E=NXlink(root.entry.process.calculated_Energy)
tpp=NXlink(root.entry.process.calculated_Tpp)
data_vol=NXfield(in_d['binned']['V'],name='Photoemission intensity',units='counts', compression="gzip", compression_opts=0)

In [39]:
# Wrap the NXfields in a NXdata object
root.entry.data=NXdata(data_vol,[kx,ky,E,tpp])

# Save File

The save command wraps all fields, groups and data with the attributes required by the NeXus standard and writes them to an HDF5 file.

In [40]:
root.save('201805_WSe2_max.nxs', mode='w')

NXroot('root')