Enhancement of NeXus file IO #2725

Merged
22 commits
21a47ab
Limited the axes list to be the same or smaller than keys from dataset
ptim0626 Mar 26, 2021
f210a38
Set correct chunks by info from signal axes for lazy data
ptim0626 Mar 26, 2021
7d9a731
Load related hdf dataset with correct dimension and chunks
ptim0626 Mar 26, 2021
0608c91
Allowed writing hyperspy signal into NeXus file
ptim0626 Apr 6, 2021
3309855
Saved NXdata with correct chunks and checked units
ptim0626 Apr 6, 2021
e077c6f
Skipped duplicated metadata groups when writing
ptim0626 Apr 7, 2021
0e89e4e
Added support for reading specific dataset by absolute data path
ptim0626 Apr 27, 2021
9c258ba
Added option to avoid loading array metadata
ptim0626 Apr 28, 2021
9a1c8d4
Fixed missing fclose and added new implemented option in read_metadat…
ptim0626 Apr 29, 2021
1e5d65e
Added option to skip saving certain original metadata keys
ptim0626 Apr 29, 2021
7c6565b
Merge branch 'RELEASE_next_minor' into nexus_reader_improvement
ptim0626 Apr 29, 2021
cc7de2f
Checked all elements in _check_search_keys are str
ptim0626 Apr 30, 2021
ad66893
Handled metadata as dictionary when writing
ptim0626 Apr 30, 2021
0f7ff34
Added tests for new features
ptim0626 Apr 30, 2021
1d42f83
Added more tests
ptim0626 May 4, 2021
60e2505
Handled IO of HyperSpy signal class as metadata
ptim0626 May 5, 2021
0a4466c
Updated relevant part of user guide
ptim0626 May 6, 2021
92ff84a
Updated CHANGES.rst
ptim0626 May 6, 2021
d9cc3cf
Updated user guide to show difference of dataset_keys and dataset_paths
ptim0626 May 27, 2021
1cb4ff3
Moved the change to towncrier
ptim0626 May 27, 2021
b19417b
Merge branch 'RELEASE_next_minor' into nexus_reader_improvement
ptim0626 May 27, 2021
2a206a5
Deprecated plural arguments in NeXuS IO
ptim0626 Jun 1, 2021
3 changes: 2 additions & 1 deletion CHANGES.rst
@@ -13,7 +13,7 @@ New
* Add `filter_zero_loss_peak` argument to the `spikes_removal_tool` method (`#1412 <https://github.com/hyperspy/hyperspy/pull/1412>`_)
* Add `vacuum_mask` method for EELSSpectrum (`#2183 <https://github.com/hyperspy/hyperspy/pull/2183>`_)
* Plot overlayed images (`#2599 <https://github.com/hyperspy/hyperspy/pull/2599>`_)
* Support for reading JEOL EDS data (`#2488 <https://github.com/hyperspy/hyperspy/pull/2488>`_,
* Support for reading JEOL EDS data (`#2488 <https://github.com/hyperspy/hyperspy/pull/2488>`_,
`#2607 <https://github.com/hyperspy/hyperspy/pull/2607>`_, `#2620 <https://github.com/hyperspy/hyperspy/pull/2620>`_)
* Add `height` property to the `Gaussian2D` component (`#2688 <https://github.com/hyperspy/hyperspy/pull/2688>`_)

@@ -23,6 +23,7 @@ Enhancements
* Read Cathodoluminescence metadata (`#2590 <https://github.com/hyperspy/hyperspy/pull/2590>`_)
* Document reading Attolight data with the sur/pro format reader. (`#2559 <https://github.com/hyperspy/hyperspy/pull/2559/files>`_)
* mpfit cleanup (`#2494 <https://github.com/hyperspy/hyperspy/pull/2494>`_)
* More options when reading and writing NeXus files (`#2725 <https://github.com/hyperspy/hyperspy/pull/2725>`_)

API changes
-----------
26 changes: 21 additions & 5 deletions doc/user_guide/io.rst
@@ -1361,8 +1361,10 @@ some additional loading arguments are provided.
Extra loading arguments
+++++++++++++++++++++++

- ``dataset_keys``: ``None``, ``str`` or ``list`` of strings - Default is ``None`` . Absolute path(s) or string(s) to search for in the path to find one or more datasets.
- ``dataset_keys``: ``None``, ``str`` or ``list`` of strings - Default is ``None`` . String(s) to search for in the path to find one or more datasets.
- ``dataset_paths``: ``None``, ``str`` or ``list`` of strings - Default is ``None`` . Absolute path(s) to search for in the path to find one or more datasets.
- ``metadata_keys``: ``None``, ``str`` or ``list`` of strings - Default is ``None`` . Absolute path(s) or string(s) to search for in the path to find metadata.
- ``skip_array_metadata``: ``bool`` - Default is False. Option to skip loading array-valued metadata, to avoid loading the same data twice.
- ``nxdata_only``: ``bool`` - Default is False. Option to only convert NXdata formatted data to signals.
- ``hardlinks_only``: ``bool`` - Default is False. Option to ignore soft or External links in the file.
- ``use_default``: ``bool`` - Default is False. Only load the ``default`` dataset, if defined, from the file. Otherwise load according to the other keyword options.
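The distinction between ``dataset_keys`` (substring search) and ``dataset_paths`` (exact absolute path) can be pictured in plain Python. This is only an illustration of the matching behaviour, not HyperSpy's actual implementation, and the example paths are hypothetical:

```python
# Illustration only: how substring keys vs. absolute paths could
# select datasets from a NeXus file's list of HDF5 paths.
all_paths = [
    "/entry/experiment/EDS/data",
    "/entry/experiment/EELS/data",
    "/entry/instrument/scan/data",
]

def match_keys(paths, keys):
    """dataset_keys-style: keep any path containing a search string."""
    keys = [keys] if isinstance(keys, str) else keys
    return [p for p in paths if any(k in p for k in keys)]

def match_paths(paths, targets):
    """dataset_paths-style: keep only exact absolute-path matches."""
    targets = [targets] if isinstance(targets, str) else targets
    return [p for p in paths if p in targets]

print(match_keys(all_paths, "EDS"))
print(match_paths(all_paths, "/entry/experiment/EDS/data"))
```

Both sketches accept a single string or a list, mirroring the ``str`` or ``list`` of strings accepted by the loader arguments.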
@@ -1391,16 +1393,16 @@ hdf datasets will be returned as signals:

.. code-block:: python

>>> sig = hs.load("sample.nxs",nxdata_only=False)
>>> sig = hs.load("sample.nxs", nxdata_only=False)

We can load a specific dataset using the ``dataset_keys`` keyword argument.
We can load a specific dataset using the ``dataset_paths`` keyword argument.
Setting it to the absolute path of the desired dataset will cause
the single dataset to be loaded:

.. code-block:: python

>>> # Loading a specific dataset
>>> hs.load("sample.nxs", dataset_keys='/entry/experiment/EDS/data')
>>> hs.load("sample.nxs", dataset_paths="/entry/experiment/EDS/data")

We can also choose to load datasets based on a search key using the
``dataset_keys`` keyword argument. This can also be used to load NXdata not
@@ -1431,6 +1433,12 @@ Metadata can also be filtered in the same way using ``metadata_keys``:

The Nexus loader removes any NXdata blocks from the metadata.

Metadata that are arrays can be skipped by using ``skip_array_metadata``:

.. code-block:: python

>>> # Load data while skipping metadata that are arrays
>>> hs.load("sample.nxs", skip_array_metadata=True)

Nexus files also support parameters or dimensions that have been varied
non-linearly. Since HyperSpy Signals expect linear variation of parameters /
@@ -1439,7 +1447,8 @@ replaced with indices.
NeXus and HDF files can contain large metadata structures, with large datasets stored in
the loaded original_metadata. If lazy loading is used this may not be a concern, but care
must be taken when saving the data. To control whether large datasets are loaded or saved,
use the ``metadata_keys`` to load only the most relevant information.
use the ``metadata_keys`` to load only the most relevant information. Alternatively,
set ``skip_array_metadata`` to ``True`` to avoid loading those large datasets in original_metadata.
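The effect of ``skip_array_metadata`` can be pictured as pruning array-valued entries from the nested original_metadata dictionary. The following is a simplified sketch under that assumption, not HyperSpy's code, and the metadata contents are invented:

```python
import numpy as np

def drop_array_metadata(metadata):
    """Recursively drop numpy-array values from a nested metadata dict."""
    pruned = {}
    for key, value in metadata.items():
        if isinstance(value, dict):
            pruned[key] = drop_array_metadata(value)
        elif not isinstance(value, np.ndarray):
            pruned[key] = value
    return pruned

meta = {
    "beam_energy": 200.0,
    "scan": {"positions": np.arange(100_000), "pattern": "raster"},
}
print(drop_array_metadata(meta))
# {'beam_energy': 200.0, 'scan': {'pattern': 'raster'}}
```

Scalar and string entries survive at every depth; only the array-valued ``positions`` entry is dropped.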


Writing
@@ -1450,6 +1459,7 @@ function.
Extra saving arguments
++++++++++++++++++++++
- ``save_original_metadata``: ``bool`` - Default is True, option to save the original_metadata when storing to file.
- ``skip_metadata_keys``: ``None``, ``str`` or ``list`` of strings - Default is ``None``. Option to skip certain metadata keys when storing to file.
- ``use_default``: ``bool`` - Default is False. Set the ``default`` attribute for the Nexus file.

.. code-block:: python
@@ -1477,6 +1487,12 @@ The original_metadata can be omitted using ``save_original_metadata``.

>>> sig.save("output.nxs", save_original_metadata=False)

If only certain metadata are to be ignored, use ``skip_metadata_keys``:

.. code-block:: python

>>> sig.save("output.nxs", skip_metadata_keys=['xsp3', 'solstice_scan'])
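A plausible reading of ``skip_metadata_keys`` is that matching keys are pruned at any depth of the metadata tree before writing. A minimal sketch assuming that behaviour (the key names echo the example above, but the metadata values are invented):

```python
def skip_metadata(metadata, skip_keys):
    """Recursively drop entries whose key appears in skip_keys."""
    skip_keys = [skip_keys] if isinstance(skip_keys, str) else skip_keys
    pruned = {}
    for key, value in metadata.items():
        if key in skip_keys:
            continue
        pruned[key] = skip_metadata(value, skip_keys) if isinstance(value, dict) else value
    return pruned

original = {
    "instrument": {"xsp3": {"frames": 4096}, "name": "detector"},
    "solstice_scan": {"command": "scan x 0 10"},
}
print(skip_metadata(original, ["xsp3", "solstice_scan"]))
# {'instrument': {'name': 'detector'}}
```

Note that ``xsp3`` is removed even though it is nested inside ``instrument``, while its sibling ``name`` is kept.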

To save multiple signals, the file_writer method can be called directly.

.. code-block:: python