math in the <dim> element #1084

Open
prjemian opened this issue Jun 14, 2022 · 12 comments

@prjemian
Contributor

Some NXDL files use mathematics involving one or more symbols to describe a <dim /> element.

<dim index="1" value="nDarkFrames + nBrightFrames + nSampleFrame" />

<dim index="1" value="numtof + 1"/>

<dim index="1" value="numtimechannels + 1"/>

@prjemian prjemian added the NIAC should review The NIAC should review/discuss label Jun 14, 2022
@prjemian prjemian added this to the NXDL 2023.06 milestone Jun 14, 2022
@prjemian
Contributor Author

It was suggested at the 2022 Code Camp that such math be expressed as a single symbol (in the symbol table) and that the symbol be used in the <dim /> element. For example:

	<symbols>
		<doc>symbolic array lengths to be coordinated between various fields</doc>
		<symbol name="nDarkFrames"><doc>number of dark frames</doc></symbol>
		<symbol name="nBrightFrames"><doc>number of bright frames</doc></symbol>
		<symbol name="nSampleFrame"><doc>number of sample frames</doc></symbol>
		<symbol name="nTotalFrames">
		  <doc>total number of frames</doc>
		  <math>nDarkFrames + nBrightFrames + nSampleFrame</math>
		</symbol>
	</symbols>

then later

    <dim index="1" value="nTotalFrames" />
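
To make the proposal concrete, here is a minimal sketch of how a validator might resolve such a derived symbol. It is not an existing NeXus tool: the function names are hypothetical, the NXDL namespace is omitted, and it uses Python's `ast` module as a stand-in expression parser, restricted to integers, symbol names, and `+`, `-`, `*`, so nothing outside the symbol table can be evaluated.

```python
import ast
import xml.etree.ElementTree as ET

# Symbol table from the proposal above (namespace omitted for brevity).
SYMBOLS_XML = """
<symbols>
  <symbol name="nDarkFrames"><doc>number of dark frames</doc></symbol>
  <symbol name="nBrightFrames"><doc>number of bright frames</doc></symbol>
  <symbol name="nSampleFrame"><doc>number of sample frames</doc></symbol>
  <symbol name="nTotalFrames">
    <doc>total number of frames</doc>
    <math>nDarkFrames + nBrightFrames + nSampleFrame</math>
  </symbol>
</symbols>
"""

# Only these binary operators are accepted (an assumption of this sketch).
_OPS = {
    ast.Add: lambda a, b: a + b,
    ast.Sub: lambda a, b: a - b,
    ast.Mult: lambda a, b: a * b,
}


def _eval(node, values):
    """Evaluate a whitelisted expression node against known symbol values."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return node.value
    if isinstance(node, ast.Name):
        if node.id not in values:
            raise KeyError(f"undefined symbol: {node.id}")
        return values[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left, values), _eval(node.right, values))
    raise ValueError(f"unsupported expression element: {type(node).__name__}")


def resolve_symbols(xml_text, measured):
    """Return measured symbol values plus every symbol derived via <math>."""
    root = ET.fromstring(xml_text)
    values = dict(measured)
    for symbol in root.iter("symbol"):
        math = symbol.findtext("math")
        if math is not None:
            tree = ast.parse(math.strip(), mode="eval")
            values[symbol.get("name")] = _eval(tree.body, values)
    return values


sizes = resolve_symbols(
    SYMBOLS_XML, {"nDarkFrames": 2, "nBrightFrames": 3, "nSampleFrame": 10}
)
print(sizes["nTotalFrames"])  # prints 15
```

A `<dim index="1" value="nTotalFrames" />` could then be checked against `sizes["nTotalFrames"]`. The sketch resolves symbols in document order, so a derived symbol must appear after the symbols it depends on.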

@prjemian
Contributor Author

We would add a math element to the symbol element (inside symbols) in the nxdl.xsd file. How should the math element be described, both in the documentation and in how it is prescribed in the XML Schema?

@prjemian
Contributor Author

For example, the description could state that the math expression is interpreted as JavaScript and that every variable in it must be defined in the symbol table.

@prjemian
Contributor Author

prjemian commented Sep 16, 2022

@mkoennecke asked:

cnxvalidate is written in C, is there a library it can use?

@yayahjb suggests:

https://github.com/samthor/gumnut

@RussBerg
Contributor

RussBerg commented Sep 16, 2022

One C library for processing math strings is the one EPICS uses in the calc record:
https://epics.anl.gov/EpicsDocumentation/AppDevManuals/RecordRef/Recordref-13.html

Scrolling down a bit, that page discusses the supported operators and gives some examples.

@prjemian
Contributor Author

The EPICS library likely needs additional work to use in this context.

@RussBerg
Contributor

RussBerg commented Sep 16, 2022

At the next Code Camp I could look at how much work this would be.

@benajamin
Contributor

benajamin commented Sep 16, 2022

The proposal is to explore the use of JavaScript syntax for mathematical expressions in NXDL symbol tables (and elsewhere in NeXus) and to encourage the production of a technical demonstration.

@PeterC-DLS
Contributor

Other candidates for a parser/interpreter:

@yayahjb
Contributor

yayahjb commented Sep 16, 2022

Sounds worth exploring

@benajamin benajamin removed the NIAC should review The NIAC should review/discuss label Sep 17, 2022
@PeterC-DLS
Contributor

@sanbrock
Contributor

I also mentioned this topic at the Code Camp in relation to #1271, where it is suggested that an NX_CHAR metadata field whose purpose is to describe relationships between other data objects shall follow a specific syntax (provided by an EBNF grammar).

Please note that we do not introduce any formal language into NXDL in #1271; we only encourage text entries to be written so that they are easier for a machine to read (which may become less and less important with the evolution of AI).

What I mean:

  • It is important to separate two topics that I do not think we have properly distinguished in the past:
    (a) elements in NXDL (comparable to an ontology language describing concepts/definitions and their relationships)
    (b) elements in a NeXus data file.
    E.g.
    Energy - sdkjfsdkfsdf an array of ...
    Threshold - sdfkjsdhfks an array of …
    type - dskjhfksjdhf one of "kinetic, binding"
    where
    rule 1: the length of Energy must be the same as the length of Threshold (this we somehow manage with symbol tables)
    rule 2: the Energy type must be either “kinetic” or “binding” (this we somehow manage using docstrings and enumerations)
    rule 3a: Threshold values must be half of the Energy values (this we manage only via docstrings, and sometimes we require a reference to the software (as in NXprocess) that did the conversion/calculation, e.g. a calibration)
    rule 3b: Threshold is calculated from Energy following … (less specific: we do not state the exact relationship) (this we also support only in docstrings, and we can allow a separate enumeration, description, NXnote, or NXprocess to specify how it was done)

Hence, some rules are a formal part of the definitions expressed in NXDL and are expected to be interpreted and verified by a generic(!) NXDL tool, while others are expressed only in docstrings and are expected to be understood only(!) by specific software written after someone has read and understood the docstring.
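
The distinction can be sketched in a few lines of Python. The function name and the sample values are made up for illustration; the point is only that rules 1 and 2 can be derived from the NXDL itself (symbol table, enumeration), whereas rule 3a has to be hand-coded by someone who has read the docstring.

```python
import math


def check_rules(energy, threshold, energy_type):
    """Return a list of rule violations for the Energy/Threshold example."""
    problems = []

    # Rule 1: equal lengths. A generic tool can derive this from a shared symbol.
    if len(energy) != len(threshold):
        problems.append("rule 1: Energy and Threshold differ in length")

    # Rule 2: enumeration. A generic tool can derive this from the NXDL enumeration.
    if energy_type not in ("kinetic", "binding"):
        problems.append("rule 2: type must be 'kinetic' or 'binding'")

    # Rule 3a: Threshold is half of Energy. Today this lives only in a docstring,
    # so this check is application-specific code, not something NXDL prescribes.
    if len(energy) == len(threshold) and not all(
        math.isclose(t, e / 2) for e, t in zip(energy, threshold)
    ):
        problems.append("rule 3a: Threshold is not half of Energy")

    return problems


print(check_rules([10.0, 20.0], [5.0, 10.0], "kinetic"))  # prints []
```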

  • Indeed, we have also had discussions on introducing math, e.g. into dim, into symbols, into … (e.g. math in the <dim> element #1084), where the question was which syntax (language) NeXus (namely NXDL) could incorporate into its own definitions. That would let us add expressions inside nxdl.xml files, which would have to be machine readable. As a consequence, ANY generic NXDL interpreter/verifier would have to interpret and evaluate these expressions. This opens a security hole and has to be considered carefully, because we are speaking about a generic solution that affects the whole of NeXus.
  • Here, on the other hand, we do NOT propose any new element in NXDL. The NXDL stays the same as above; only in the docstring do we say that certain (meta)DATA items should be formulated according to some specification. We have done this in many places, wherever we specify how an array should be filled or what the numbers mean, or that chemical_formula shall follow the Hill system, etc.
  • Here, we specifically suggest, in one particular docstring, using a human-readable way of documenting how certain calculations have been performed.
    Knowing the convention, people can (write code to) interpret and run such a calculation to verify or reproduce the data stored in the NeXus data file. It is exactly the same as when someone references software in NXprocess that other people can then run. Note that the developers of the data acquisition software CAMELS also use NXnote to store the data acquisition protocol (a Bluesky script in Python syntax) directly in the data file. NeXus has never forbidden referencing or storing software as a data artefact.
  • Note that we have not proposed introducing a specific language with a given syntax (like Python, JS, …) to be interpreted in NXDL. Instead, we used a language description language (EBNF grammar, https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form) to specify the syntax in which the data should be stored. (This is the same as saying in the docstring that you have to list elements from the Periodic Table, etc.)

One more interesting point:

Note that here we do not use an existing syntax but define one, and so we control what can be expressed with it.
Although we have not proposed it, adopting EBNF in NXDL itself could also be discussed: next to the existing XSD rules, NXDL definitions could have certain parts that must comply with certain EBNF rules. This is a much safer way than using a generic programming language, because we define the syntax and we implement its interpretation.
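
As an illustration of that last point, here is a minimal sketch of a hand-written interpreter for a deliberately tiny grammar. The grammar is an assumption made up for this example (it is not the one from #1271 and is not part of NXDL); it is just large enough to cover expressions such as `numtof + 1`:

    expr   = term , { ( "+" | "-" ) , term } ;
    term   = factor , { "*" , factor } ;
    factor = integer | symbol | "(" , expr , ")" ;

```python
import re

# One token per match: an integer, a symbol name, or a single operator/parenthesis.
TOKEN = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|([-+*()]))")
KINDS = {1: "int", 2: "name", 3: "op"}


def tokenize(text):
    """Split text into (kind, value) tokens; reject anything outside the grammar."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input near position {pos}")
        tokens.append((KINDS[match.lastindex], match.group(match.lastindex)))
        pos = match.end()
    return tokens


class Parser:
    """Recursive-descent evaluator with one method per grammar rule."""

    def __init__(self, tokens, symbols):
        self.tokens, self.pos, self.symbols = tokens, 0, symbols

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            operator = self.take()[1]
            rhs = self.term()
            value = value + rhs if operator == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() == ("op", "*"):
            self.take()
            value *= self.factor()
        return value

    def factor(self):
        kind, text = self.take()
        if kind == "int":
            return int(text)
        if kind == "name":
            if text not in self.symbols:
                raise KeyError(f"undefined symbol: {text}")
            return self.symbols[text]
        if (kind, text) == ("op", "("):
            value = self.expr()
            if self.take() != ("op", ")"):
                raise SyntaxError("missing closing parenthesis")
            return value
        raise SyntaxError(f"unexpected token: {text!r}")


def evaluate(text, symbols):
    """Evaluate text under the grammar above, using only the given symbols."""
    parser = Parser(tokenize(text), symbols)
    value = parser.expr()
    if parser.peek() != (None, None):
        raise SyntaxError("unexpected trailing input")
    return value


print(evaluate("numtof + 1", {"numtof": 100}))  # prints 101
```

Because the grammar has no function calls, attribute access, or strings, input such as `__import__('os')` is rejected by the tokenizer instead of being executed, which is the safety argument made above.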
