Clarify how NXDL versions relate to application definitions #1038

Open
kalcutter opened this issue Mar 18, 2022 · 22 comments

Comments

@kalcutter
Contributor

kalcutter commented Mar 18, 2022

We are seeking clarification on how NXDL versions relate to application definitions. Does a particular NXDL version imply fixed versions of each application definition? For example: Does NXDL v2020.10 imply NXmx 1.0? Could it ever make sense to use a newer/older NXDL version with a particular version of an application definition?

We are writing NeXus data files and may let the user choose the application definition (e.g. NXmx vs. NXsas). If we end up supporting multiple versions of these application definitions, what is the recommended way to present this to the user?

For example:
a) Nexus 2020.10 NXmx
b) NXmx 1.0
c) Nexus 2020.10 NXmx 1.0

Is simply specifying (a) or (b) enough? Or are there arguments for presenting it as (c)? Please provide guidance about which option (a)-(c) would be preferred.
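
For illustration, all three presentations could be generated from the same underlying choice. This is a minimal sketch; the `VersionChoice` class and its field names are hypothetical and not part of any NeXus tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionChoice:
    """Hypothetical container for what a user selects in a file writer."""

    release: str             # tagged release of the definitions, e.g. "2020.10"
    definition: str          # application definition name, e.g. "NXmx"
    definition_version: str  # application definition version, e.g. "1.0"

    def label_a(self) -> str:
        # Option (a): release plus application definition name
        return f"Nexus {self.release} {self.definition}"

    def label_b(self) -> str:
        # Option (b): application definition name plus its own version
        return f"{self.definition} {self.definition_version}"

    def label_c(self) -> str:
        # Option (c): release, definition name and definition version
        return f"Nexus {self.release} {self.definition} {self.definition_version}"


choice = VersionChoice("2020.10", "NXmx", "1.0")
print(choice.label_a())
print(choice.label_b())
print(choice.label_c())
```

Storing all three pieces of information keeps the choice of label a display question only.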

@prjemian
Contributor

Great question! I consider the repository version number as being the set of all the different NXDL versions at the time when the repository version was tagged. That set includes the nxdl.xsd and nxdlTypes.xsd files, too.

Your (a) case describes which version of the repository and the corresponding version of NXmx that was included in the release. (The various rules for NXmx will also depend on the various items in the nxdl.xsd schema file and that might be dependent on the repository version).

If the application definition was updated since the repository version and if the new version was tagged, then (c) would be used to clarify.

Option (b) does not describe the version of the nxdl.xsd schema to be used so it seems incomplete to me.

@prjemian
Contributor

> Does NXDL v2020.10 imply NXmx 1.0?

Yes.

@prjemian
Contributor

> Could it ever make sense to use a newer/older NXDL version with a particular version of an application definition?

Only if these versions existed together in the default branch of the repository at the same time. The validating programs will look at a specific snapshot of the repository (tagged version, commit hash, ...).

@prjemian
Contributor

Examples of different repository snapshots:

============== =================== =======
NXDL reference date & time         commit
============== =================== =======
a4fd52d        2016-11-19 01:07:45 a4fd52d
v3.3           2017-07-12 10:41:12 9285af9
Schema-3.4     2018-05-15 08:24:34 aa1ccd1
v2018.5        2018-05-15 16:34:19 a3045fd
main           2021-12-17 13:09:18 041c2c0
============== =================== =======

@prjemian
Contributor

Despite those examples, it is best to use one of the releases (in the set above, either v3.3 or v2018.5). The latest release is v2020.10.
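
As a rough sketch, the release entries could be picked out of the snapshot table programmatically. The rows are copied from the table in this thread; the tag pattern is an assumption that release tags look like `v<number>.<number>`, which happens to match the two releases named here.

```python
import re

# Rows copied from the snapshot table: (NXDL reference, date & time, commit)
SNAPSHOTS = [
    ("a4fd52d", "2016-11-19 01:07:45", "a4fd52d"),
    ("v3.3", "2017-07-12 10:41:12", "9285af9"),
    ("Schema-3.4", "2018-05-15 08:24:34", "aa1ccd1"),
    ("v2018.5", "2018-05-15 16:34:19", "a3045fd"),
    ("main", "2021-12-17 13:09:18", "041c2c0"),
]

# Assumption: a release tag is "v" followed by dot-separated numbers.
RELEASE_TAG = re.compile(r"^v\d+(\.\d+)+$")


def release_snapshots(rows):
    """Keep only the rows whose reference looks like a release tag."""
    return [row for row in rows if RELEASE_TAG.match(row[0])]


for reference, timestamp, commit in release_snapshots(SNAPSHOTS):
    print(reference, timestamp, commit)
```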

@kalcutter
Contributor Author

kalcutter commented Mar 18, 2022

@prjemian Thanks so much for the quick reply!

> Great question! I consider the repository version number as being the set of all the different NXDL versions at the time when the repository version was tagged. That set includes the nxdl.xsd and nxdlTypes.xsd files, too.

Let me make sure I understand what you're saying. So by repository version "number" you mean any specific commit including a tagged version like v2020.10. So for a specific tag say v2020.10 what are the NXDL "versions"? I guess you mean the versions of the application definitions? I ask because NXDL_VERSION is a single file containing the same version as the tag.
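
To make the observation about NXDL_VERSION concrete, here is a minimal sketch that compares the file's content with a tag name. The helper name `release_matches_tag` is hypothetical, and whether the file content carries a leading "v" is an assumption, so the sketch strips it from both sides. A temporary directory stands in for a real checkout.

```python
import tempfile
from pathlib import Path


def release_matches_tag(repo_dir, tag):
    """Return True if NXDL_VERSION in a checkout agrees with a tag name."""
    recorded = Path(repo_dir, "NXDL_VERSION").read_text(encoding="utf-8").strip()
    # Assumption: the tag may have a leading "v" that the file lacks.
    return recorded.lstrip("v") == tag.lstrip("v")


# Stand-in for a checkout of the definitions repository at some tag.
with tempfile.TemporaryDirectory() as checkout:
    Path(checkout, "NXDL_VERSION").write_text("2020.10\n", encoding="utf-8")
    same = release_matches_tag(checkout, "v2020.10")
    different = release_matches_tag(checkout, "v2018.5")

print(same, different)
```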

Also, related to this, what is the correct nomenclature to use? You have a file NXDL_VERSION which sounds like it is the version of the NX "definition language" itself. I guess we should just say "nexus version" (for the tagged release)?

@kalcutter
Contributor Author

> Your (a) case describes which version of the repository and the corresponding version of NXmx that was included in the release. (The various rules for NXmx will also depend on the various items in the nxdl.xsd schema file and that might be dependent on the repository version).
>
> If the application definition was updated since the repository version and if the new version was tagged, then (c) would be used to clarify.

Understood. On the other hand, it seems a bit problematic to use a non-released nexus version (e.g. with (c)). If various items in nxdl.xsd change and the next version of some application definition depends on these items before the next official nexus release, then (c) would kind of specify two contradicting points in time? For example, if tomorrow NXmx is bumped to 2.0 and we talk about Nexus 2020.10 NXmx 2.0 then we are kind of specifying both (1) the state of the repository at tag v2020.10 and (2) the state of the repository at tag NXmx-2.0?

@kalcutter
Contributor Author

kalcutter commented Mar 18, 2022

In this light, one could argue that (b) is actually less problematic than (c) for the example above? (b) would simply specify the state of the repository at the corresponding application definition version tag (e.g. tag NXcanSAS-1.1). This could still be problematic, however, if the application definition continued to evolve after getting its latest version tag up until the next official nexus release. Is this allowed to happen?

A definite issue with (b) is that some application definitions have had their version numbers go backwards: NXmx, for example, used to be version 1.4 and now the latest version is 1.0.
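
A small sketch of why this matters for software that picks "the latest" version by comparing numbers. The two version strings come from the NXmx example above; the helper name is hypothetical.

```python
def parse_version(text):
    """Turn a dotted version string such as "1.4" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


# NXmx versions in chronological order, as described above.
history = ["1.4", "1.0"]

latest_by_time = history[-1]
latest_by_number = max(history, key=parse_version)

print(latest_by_time)    # 1.0
print(latest_by_number)  # 1.4
```

Any tool that sorts by version number alone would select 1.4 here, which is not the current definition.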

@kalcutter
Contributor Author

Also, what is the policy for changing the versions of application definitions? For example, if a new field/attribute is added to some related class definition, does this warrant a version bump of an application definition? Or do version changes only signify changes that are incompatible in some way?

@woutdenolf
Contributor

woutdenolf commented Jun 30, 2022

> For example, if tomorrow NXmx is bumped to 2.0 and we talk about Nexus 2020.10 NXmx 2.0 then we are kind of specifying both

Thinking out loud here.

If you think about it as a python programmer, our NXDL version (the one in the NXDL_VERSION file) is the version of our language and our libraries. In analogy, suppose

  • NXDL is python
  • the NeXus standard is a collection of python libraries numpy, scipy, pandas, ... (each NeXus class is like a python library)
  • a NeXus HDF5 file is like a python program that uses numpy, scipy, pandas, ...

Then our NXDL version is the python version and at the same time the numpy/scipy/pandas version.

Each NXDL version gets released under releases: v2020.1, v2020.10, ... In our analogy, this release contains the python source code (.xsd) and the numpy source code (NXmx.nxdl.xml), scipy source code (NXdetector.nxdl.xml) etc., like the repo does, but as a snapshot at a certain version.

In our analogy, you don't want python and numpy/scipy/pandas to have the same version. So now the confusion starts. We have

  1. /NXroot@NeXus_version: this refers to the NXDL version. In our analogy, that would be like specifying the python and numpy/scipy/pandas version in the requirements.txt file of our project dependencies. So for example python==v2020.1, numpy==v2020.1, scipy==v2020.1 ...
  2. /NXentry/definition@version: in the docs it says NXDL version number. If it really is the NXDL version, then that would be the same as /NXroot@NeXus_version. But I guess it is meant to be the version of the application definition, not the NXDL version? So a version like 1.0, 1.1, ...

Other NeXus classes that mention a version field other than the ones above (I'm not including NXprocess and other classes that refer to the version of a program or something similar, which has nothing to do with the NeXus standard):

  1. /NXcanSAS/ENTRY@version
  2. /NXmx/ENTRY@version
  3. /NXapm/ENTRY@version
  4. /NXem/ENTRY@version

So in our analogy these would clearly be the version of numpy, scipy, pandas etc. So 1.0, 1.1, ... Some say SHA256 hashvalue of the file that specifies the application definition. I'm not sure what that means or what such a version looks like. I guess it means that if a single letter or space in the nxdl.xml file changes, the version changes. Which seems a bit strict. Imo we should use semantic versioning for this.
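
To illustrate how strict a hash-based version would be, here is a minimal sketch using Python's `hashlib`. The XML snippets are placeholders rather than real NXDL files, and `file_version_hash` is a hypothetical helper name.

```python
import hashlib


def file_version_hash(content: bytes) -> str:
    """SHA256 hex digest of a definition file's raw bytes."""
    return hashlib.sha256(content).hexdigest()


# Placeholder content; the second variant differs by a single space.
original = b'<definition name="NXexample">\n</definition>\n'
reformatted = b'<definition name="NXexample">\n </definition>\n'

h1 = file_version_hash(original)
h2 = file_version_hash(reformatted)

print(h1)
print(h2)
print(h1 == h2)  # False
```

A purely cosmetic edit therefore yields a different "version", whereas semantic versioning would leave the number unchanged.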

So to summarize using our analogy:

  1. /NXroot@NeXus_version is like the requirements.txt file of our program saying python==v2020.1, numpy==v2020.1, scipy==v2020.1, ...
  2. /NXentry/definition@version: this is either the same as the previous one or it is the version of numpy/scipy/pandas. So for example numpy==1.0. That would solve the problem of python and numpy having the same version.
  3. /NXcanSAS/ENTRY@version: this is like the version of numpy. So for example numpy==1.0.
  4. /NXmx/ENTRY@version: this is like the version of scipy. So for example scipy==1.0.
  5. /NXapm/ENTRY@version: this is like the version of pandas. Not sure what the SHA256 comment is about.
  6. /NXem/ENTRY@version: this is like the version of matplotlib. Not sure what the SHA256 comment is about.
  7. NXDL versions are released under releases
  8. The versions of application definitions (the numpy/scipy/pandas versions in the analogy) are tags that are not releases: NXcanSAS-1.1, NXtomo-2.0, ...

Note that NXtomo does not mention any version, which leads me to believe that /NXentry/definition@version is NOT the NXDL version, it is the version of the NeXus class (the numpy/scipy/pandas version in the analogy).

Also note that like scipy depends on numpy, NXtomo depends on NXentry. However there is currently no version for base classes so this issue does not arise.

Of course another analogy is to look at NeXus classes as classes in object oriented programming. That's useful when talking about inheritance but not when talking about versioning.

All this has evolved historically and probably will evolve in the future.

@prjemian
Contributor

NIAC2022 suggested this become an issue for a future code camp.

@padraic-shafer
Contributor

So, is the consensus so far that we should rely on the NX release-date version, the application version, some hybrid of these, or an as yet to-be-defined version ID?

@prjemian
Contributor

prjemian commented Oct 11, 2022 via email

@woutdenolf
Contributor

woutdenolf commented Oct 11, 2022

That contradicts NXcanSAS-1.1, NXtomo-2.0, ... But maybe we don't do those anymore?

@woutdenolf
Contributor

woutdenolf commented Oct 11, 2022

The problem is that the NXDL version covers much more than "the version of the NeXus Definition Language".

It's like python 3.7.11 pins numpy to 1.19.4. So in our case the version of python (NXDL) pins the version of numpy (the application definition).

So from that pov I would agree with @prjemian : use the NXDL version.
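
The pinning idea can be sketched as a lookup table keyed by release. This is illustrative only: the table is hypothetical, and the single entry shown is the one pair confirmed earlier in this thread (v2020.10 implies NXmx 1.0).

```python
# Hypothetical table: release tag -> application definition -> pinned version.
# Only the v2020.10 / NXmx pair is stated in this thread.
PINNED = {
    "v2020.10": {"NXmx": "1.0"},
}


def pinned_version(release, definition):
    """Look up which definition version a given release implies."""
    try:
        return PINNED[release][definition]
    except KeyError:
        raise LookupError(
            f"no pinned version recorded for {definition} in {release}"
        ) from None


print(pinned_version("v2020.10", "NXmx"))  # 1.0
```

With such a table, recording the release alone is enough to recover the application definition version.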

@padraic-shafer
Contributor

I always hope that these reaction emojis get communicated to you. In any case, I want to say thank you for the quick feedback. :)

@woutdenolf
Contributor

woutdenolf commented Jun 20, 2023

Proposal for an nxdl_version attribute to NXroot, NXentry and NXsubentry (code camp 2023)

Ping @sanbrock @PeterC-DLS @mkuehbach

nxdl_version

@woutdenolf
Contributor

As for the deployment of NXDL versions: we could put it here: http://definition.nexusformat.org/nxdl/
(see #1031 (comment))

@g-guenther

g-guenther commented Dec 6, 2023

Concerning the proposal of @woutdenolf, I would like to have a version attribute at the NXroot level because

  1. it is logical to define the used standard at the top level of your file, e.g. before looking into an NXentry group.
  2. the definition field of NXentry is optional and, thus, you couldn't refer to an NXDL version when your data doesn't meet an application definition. This may be related to #1124 (NXsubentry: The use of NXsubentry could be fleshed out).

@prjemian
Contributor

prjemian commented Dec 7, 2023

> definition field of NXentry is optional

The definition field is used for one purpose, to state the application definition being used. It is required when the application definition is used in the NXentry or NXsubentry group.

@prjemian
Contributor

prjemian commented Dec 7, 2023

> logical to define the used standard at the top level of your file

You can do this now. Why must it be part of the standard?

@g-guenther

> You can do this now. Why must it be part of the standard?

It would be machine-readable. If I add something that is not part of the standard, it is human-readable in the best case and intelligible only to me in the worst case.
