<img src="figures/ICALPECS_2025_logo.png">

# Convert myspot multimodal experiment data into the NeXus standard format

- [x] Step 1: Store measurement at each point over the sample into 3D NXdata --> 1 hdf per ET (here, NXfluo and NXxrd)
- [x] Step 2: Prepare json mapping file
- [x] Step 3: Make use of the ELN to demonstrate the capability of structured ELN data
- [x] Step 4: Run converter
- [x] Step 5: Visualize
- Step 6: Evaluation ?

### Describe the general workflow at the beamline, e.g.
- uploading/sending data directly to an upload
- --> access it through NORTH
- --> data reading and a plotting notebook
- --> quick analysis without additional setup

## Note:
- /entry/instrument/(from the SPEC h5 file, i.e. combined_output.h5) --> flush into NXcollection?
- /entry/xas/instrument/(linked here 3D data): Default 3D
- /entry/xrd/instrument/(linked here 3D data): Default 3D
- User and sample data are taken from the ELN
- Make notebook 2: Evaluation --> put the summed XRF and XRD data in the default NXdata

# Step 1: One file with NXdata per ET
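
The idea behind this step can be sketched in plain Python: the spectra recorded at the individual scan points are arranged on the scan grid, giving a 3D block indexed by row, column and detector channel. The sketch assumes a row-major raster scan, which may not match the actual scan order at the beamline. The `stack_scan` helper and the toy data are hypothetical. In practice the resulting block would be written as the signal of an NXdata group.

```python
def stack_scan(spectra, n_rows, n_cols):
    """Arrange per-point spectra into a [row][col][channel] nested list.

    Assumes a row-major raster scan (an assumption; check the real scan order).
    """
    if len(spectra) != n_rows * n_cols:
        raise ValueError("number of spectra does not match the scan grid")
    return [
        [list(spectra[row * n_cols + col]) for col in range(n_cols)]
        for row in range(n_rows)
    ]


# Toy data: a 2 x 3 scan grid with 4 detector channels per point.
spectra = [[point * 10 + channel for channel in range(4)] for point in range(6)]
cube = stack_scan(spectra, n_rows=2, n_cols=3)

# Shape of the resulting block: rows, columns, channels.
print(len(cube), len(cube[0]), len(cube[0][0]))
```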

# Step 2: Configuration file

In [58]:
import json
from pprint import pprint

# The with-statement closes the file automatically; no explicit close() is needed
with open('myspot_fluo.mapping_icalepcs2025.json') as json_data:
    d = json.load(json_data)

pprint(d)

{'/@default': 'entry',
 '/ENTRY[entry]/@default': 'data',
 '/ENTRY[entry]/DATA[data]/@axes': ['energy'],
 '/ENTRY[entry]/DATA[data]/@signal': 'data',
 '/ENTRY[entry]/DATA[data]/data': {'link': '/entry/instrument/fluorescence/data'},
 '/ENTRY[entry]/DATA[data]/energy': {'link': '/entry/instrument/fluorescence/energy'},
 '/ENTRY[entry]/DATA[data]/energy/@units': 'eV',
 '/ENTRY[entry]/INSTRUMENT[instrument]/SOURCE[source]/name': 'BESSY II',
 '/ENTRY[entry]/INSTRUMENT[instrument]/SOURCE[source]/probe': 'x-ray',
 '/ENTRY[entry]/INSTRUMENT[instrument]/SOURCE[source]/type': 'Synchrotron '
                                                             'X-ray Source',
 '/ENTRY[entry]/INSTRUMENT[instrument]/fluorescence/data': '/entry_NXfluo/data/intensity_sum',
 '/ENTRY[entry]/INSTRUMENT[instrument]/fluorescence/energy': '/entry_NXfluo/data/energy',
 '/ENTRY[entry]/INSTRUMENT[instrument]/monochromator/wavelength': 1.0,
 '/ENTRY[entry]/INSTRUMENT[instrument]/monochromator/wavelength/@units': 'angs
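
The mapping above contains three kinds of values, which a few lines of Python can tell apart. The sketch below assumes, as the printed file suggests, that a `{'link': ...}` dictionary creates a link inside the output file, that a string starting with `/` is a path into the input measurement file, and that anything else is written as a literal value. The `classify` helper and the shortened `mapping` excerpt are illustrative and not part of pynxtools.

```python
# A short excerpt of the mapping shown above, retyped by hand for illustration.
mapping = {
    "/@default": "entry",
    "/ENTRY[entry]/DATA[data]/@signal": "data",
    "/ENTRY[entry]/DATA[data]/data": {"link": "/entry/instrument/fluorescence/data"},
    "/ENTRY[entry]/INSTRUMENT[instrument]/fluorescence/data": "/entry_NXfluo/data/intensity_sum",
    "/ENTRY[entry]/INSTRUMENT[instrument]/monochromator/wavelength": 1.0,
}


def classify(value):
    """Hypothetical helper: guess how a mapping value is meant to be used."""
    if isinstance(value, dict) and "link" in value:
        return "link"      # link to another location in the output file
    if isinstance(value, str) and value.startswith("/"):
        return "path"      # assumed: path into the input measurement file
    return "literal"       # written to the output file as given


summary = {key: classify(value) for key, value in mapping.items()}
for key, kind in summary.items():
    print(f"{kind:8s} {key}")
```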

# Step 3: Demonstrate the use of a structured ELN

In [2]:
import yaml
import pprint

with open("eln_data_xrd.yaml") as stream:
    try:
        pprint.pprint(yaml.safe_load(stream))
    except yaml.YAMLError as exc:
        pprint.pprint(exc)


{'process': {'note': {'description': 'This program does the azimuthal '
                                     'integration of the data acquired at '
                                     'myspot'},
             'program': 'bluesky_v0.1.1'},
 'sample': {'name': 'My Wonderful Sample', 'rotation_angle': 999.0},
 'user': {'email': 'sonal.patel@helmholtz-berlin.de', 'name': 'Sonal'}}
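
Because the ELN data is nested, each value can be addressed by a slash-separated path, much like a location in an HDF5 file. The sketch below flattens an excerpt of the dictionary printed above into such paths. The `flatten` helper is hypothetical and only illustrates the structure. How pynxtools itself maps ELN entries to NeXus paths may differ.

```python
# Excerpt of the ELN content printed above, retyped by hand for illustration.
eln = {
    "process": {
        "note": {"description": "This program does the azimuthal integration "
                                "of the data acquired at myspot"},
        "program": "bluesky_v0.1.1",
    },
    "sample": {"name": "My Wonderful Sample", "rotation_angle": 999.0},
    "user": {"name": "Sonal"},
}


def flatten(node, prefix=""):
    """Hypothetical helper: turn a nested dict into {'/a/b/c': value} pairs."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


flat = flatten(eln)
for path, value in flat.items():
    print(path, "=", value)
```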


# Step 4: Run the converter

To convert the available files to the NeXus format, we use the `convert` function supplied by pynxtools. It takes the downloaded measurement file, a JSON config file and, optionally, an electronic lab notebook (ELN) YAML file that collects the (meta)data not captured during the experiment. The JSON config file maps specific metadata from the h5 measurement file to the nxs file, e.g. a pressure reading that is collected automatically during the measurement. The ELN file supplies additional metadata to be written into the NeXus file. This is data that is not collected automatically, such as the person conducting the experiment. It can be written manually or generated, e.g. by the NOMAD ELN functionality.

The conversion can also be run from the command line with the `dataconverter` command:

```
dataconverter \
    --reader json_map \
    --nxdl NXbessy \
    --input-file myspot_multimodal_xrd.mapping.json \
    --input-file eln_data.yaml \
    --output myspot_multimodal.nxs 
```

In [55]:
from pynxtools.dataconverter.convert import convert

The input parameters are defined as follows:

**input_file**: The list of input files: the JSON mapping file, the measurement file(s) and, optionally, the ELN YAML file.

**reader**: The specific reader that gets called inside pynxtools. It is supplied in the pynxtools Python code. If you create a specific reader for your measurement file, it is selected here.

**nxdl**: The specific NXDL file to be used. For fluorescence this should always be `NXfluo` and for diffraction measurements `NXmonopd`.

**output**: The output filename of the NeXus file.
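
Since the input files are passed as a plain list of file names, a quick check that each one exists can catch a typo before the conversion starts. The helper below is a hypothetical convenience and not part of pynxtools. The file names are the ones used in the fluorescence conversion in this notebook.

```python
from pathlib import Path


def missing_inputs(input_files, base_dir="."):
    """Hypothetical helper: return the input files not found under base_dir."""
    base = Path(base_dir)
    return [name for name in input_files if not (base / name).is_file()]


inputs = [
    "myspot_fluo.mapping_icalepcs2025.json",
    "combined_output.h5",
    "eln_data_fluo.yaml",
]

missing = missing_inputs(inputs)
if missing:
    print("Missing input files:", ", ".join(missing))
else:
    print("All input files found.")
```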

In [63]:
convert(input_file=["myspot_fluo.mapping_icalepcs2025.json", "myspot_multimodal_fluo_2025-08-28.nxs", "combined_output.h5", "eln_data_fluo.yaml"],
        reader='json_map',
        nxdl='NXfluo',
        output='myspot_fluo.nxs')

Using json_map reader to convert the given files:  
• myspot_fluo.mapping_icalepcs2025.json
• myspot_multimodal_fluo_2025-08-28.nxs
• combined_output.h5
• eln_data_fluo.yaml
The output file generated: myspot_fluo.nxs.


In [5]:
convert(input_file=["myspot_xrd.mapping_icalepcs2025.json", "myspot_multimodal_xrd_2025-08-14.nxs", "eln_data_xrd.yaml"],
        reader="json_map",
        nxdl="NXmonopd",
        output="myspot_xrd.nxs")

Using json_map reader to convert the given files:  
• myspot_xrd.mapping_icalepcs2025.json
• myspot_multimodal_xrd_2025-08-14.nxs
• eln_data_xrd.yaml
The output file generated: myspot_xrd.nxs.


# Step 5: Visualize NeXus data

## View the data with H5Web

H5Web is a tool for visualizing data stored in the HDF5 (h5) format. Since the NeXus format builds upon HDF5, it can be used to view NeXus files as well. We import the package and call H5Web with the output filename from the convert command above. For an analysis of NeXus data files, please refer to the analysis example.

You can also view this data with the H5Viewer or other tools from your local filesystem.

In [16]:
import h5py

def print_h5_structure(name, obj):
    print(name, "->", type(obj))

with h5py.File("myspot_fluo.nxs", "r") as f:
    f.visititems(print_h5_structure)

entry -> <class 'h5py._hl.group.Group'>
entry/data -> <class 'h5py._hl.group.Group'>
entry/definition -> <class 'h5py._hl.dataset.Dataset'>
entry/instrument -> <class 'h5py._hl.group.Group'>
entry/instrument/fluorescence -> <class 'h5py._hl.group.Group'>
entry/instrument/fluorescence/data -> <class 'h5py._hl.dataset.Dataset'>
entry/instrument/fluorescence/energy -> <class 'h5py._hl.dataset.Dataset'>
entry/instrument/monochromator -> <class 'h5py._hl.group.Group'>
entry/instrument/monochromator/wavelength -> <class 'h5py._hl.dataset.Dataset'>
entry/instrument/source -> <class 'h5py._hl.group.Group'>
entry/instrument/source/name -> <class 'h5py._hl.dataset.Dataset'>
entry/instrument/source/probe -> <class 'h5py._hl.dataset.Dataset'>
entry/instrument/source/type -> <class 'h5py._hl.dataset.Dataset'>
entry/monitor -> <class 'h5py._hl.group.Group'>
entry/monitor/data -> <class 'h5py._hl.dataset.Dataset'>
entry/monitor/mode -> <class 'h5py._hl.dataset.Dataset'>
entry/monitor/preset -> <class

In [53]:
from jupyterlab_h5web import H5Web

H5Web("myspot_multimodal_fluo_2025-08-14.nxs")

<jupyterlab_h5web.widget.H5Web object>

In [9]:
H5Web("myspot_xrd.nxs", path="/entry")

<jupyterlab_h5web.widget.H5Web object>



In [4]:
%pip install silx

In [2]:
%pip install pynxtools

Note: you may need to restart the kernel to use updated packages.


In [15]:
%pip install jupyterlab_h5web

Collecting jupyterlab_h5web
  Downloading jupyterlab_h5web-12.4.0-py3-none-any.whl.metadata (5.6 kB)
Downloading jupyterlab_h5web-12.4.0-py3-none-any.whl (810 kB)
Installing collected packages: jupyterlab_h5web
Successfully installed jupyterlab_h5web-12.4.0
Note: you may need to restart the kernel to use updated packages.


In [13]:
%pip install ipywidgets jupyterlab_widgets


Note: you may need to restart the kernel to use updated packages.


In [6]:
import ipywidgets as widgets
widgets.IntSlider()

IntSlider(value=0)

In [3]:
pip install jupyterlab_h5web[notebook]

Note: you may need to restart the kernel to use updated packages.


In [None]:
import os
import h5py
from silx.io.convert import write_to_h5

# Path to your .mca SPEC files
spec_dir = "./mca/"


# Output HDF5 file
output_h5_file = "combined_output.h5"

# Get all matching SPEC files
spec_files = sorted([
    os.path.join(spec_dir, f)
    for f in os.listdir(spec_dir)
    if f.endswith(".mca") and f.startswith("2022-09-30_scans_00003")
])

for spec_file in spec_files:
    base = os.path.splitext(os.path.basename(spec_file))[0]
    temp_path = f"/tmp_{base}"         # Temporary group for silx
    final_path = f"/entry_{base}"      # Final NeXus-compliant NXentry

    print(f"Converting {spec_file} to temporary group {temp_path}")

    # Positional arguments: input file, output HDF5 file, target path, file mode ("a" = append)
    write_to_h5(spec_file, output_h5_file, temp_path, "a")

    # Open HDF5 and move inner NXentry to final path
    with h5py.File(output_h5_file, "a") as h5f:
        tmp_group = h5f[temp_path]
        entry_names = list(tmp_group.keys())

        if len(entry_names) != 1:
            print(f"Unexpected structure inside {temp_path}. Skipping.")
            continue

        nxentry_name = entry_names[0]
        src_group = tmp_group[nxentry_name]

        # Skip if the destination group already exists, and drop the temporary group
        if final_path in h5f:
            print(f"{final_path} already exists. Skipping.")
            del h5f[temp_path]
            continue

        h5f.copy(src_group, final_path)
        h5f[final_path].attrs["NX_class"] = "NXentry"

        # Clean up: remove temporary group
        del h5f[temp_path]

        print(f"Moved {nxentry_name} → {final_path}")
