Multi-part uncertainties for numbers #24

josephwright · 2013-05-06T08:07:26Z

[Slightly edited from an e-mail from Robert Riemann]

In physics, the following syntax is sometimes used to indicate different types of errors: 2149 ± 46 ± 51. It would be nice to get that behaviour with something like

\num{10.55(18)(16)}

josephwright · 2013-05-06T08:09:23Z

Looking at this again, I feel that this is really beyond the scope of siunitx: I cannot find an example of this anywhere I've looked. It can be achieved by hand:

\num[parse-numbers = false]{10.55 \pm 0.18 \pm 0.16}

and so I am closing this WONTFIX.
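For anyone wanting to try the by-hand route, a minimal sketch of a complete document follows. It assumes only that siunitx is installed; with parse-numbers = false the argument is not parsed but typeset in math mode as typed, so number-formatting options such as rounding have nothing to act on.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% With parse-numbers = false the argument is not parsed:
% it is typeset in math mode exactly as typed.
\num[parse-numbers = false]{10.55 \pm 0.18 \pm 0.16}
\end{document}
```

The trade-off is that siunitx no longer knows which part is the value and which parts are uncertainties, which is exactly the gap this issue asks to close.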

josephwright · 2013-05-06T08:10:00Z

I now have some examples for this: page 10 of http://www.springerlink.com/content/545u2ml70u605x42/ and page 13 of http://www.springerlink.com/content/ml21044675647532/. There seem to be three cases:

Numbers with a statistical error only, given as $(1.23 \pm 4.5)$\,pb
Numbers with an asymmetric error only, given as $\left( 1.23 \substack{+4.4 \\ -5.5} \right)$\,pb
Numbers with both types of error, given as $\left( 1.23 \pm 4.5 (\text{stat.}) \substack{+4.4 \\ -5.5} (\text{sys.}) \right)$\,pb
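The three cases can be reproduced by hand without siunitx. The sketch below assumes amsmath is loaded, since \substack and \text come from that package; the numbers are the placeholder values from the list above, not real data.

```latex
\documentclass{article}
\usepackage{amsmath} % provides \substack and \text
\begin{document}
% 1. Statistical (symmetric) error only
$(1.23 \pm 4.5)$\,pb

% 2. Asymmetric error only
$\left( 1.23 \substack{+4.4 \\ -5.5} \right)$\,pb

% 3. Both types of error, each labelled
$\left( 1.23 \pm 4.5 (\text{stat.})
  \substack{+4.4 \\ -5.5} (\text{sys.}) \right)$\,pb
\end{document}
```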

josephwright · 2013-05-06T08:10:44Z

I'm still struggling to come up with an interface for this. I think this is the sort of thing that would be best covered by my plans for v3, where I'd like to increase the separation of parts within the package to make 'pluggable' extension easier.

joleroi · 2017-09-07T15:39:20Z

I came here from this rather old post on stackexchange. Has there been any development towards multi-part or asymmetric uncertainties?

josephwright · 2017-09-07T15:47:58Z

I've considered it but am still not that happy: the internals one needs to cope with such values are very complex and for almost all use cases are not needed. There's therefore a performance hit which I'm not keen on, plus a lot of work for me at the 'back end'. I'm also concerned that it ends up mixing concepts: the (...) value in siunitx has always conceptually been an uncertainty.

alexshpilkin · 2018-06-13T18:29:49Z

@josephwright For an example, try any recent results report in hep-ex. Literally the most recent submitted paper mentions “0.67 ± 0.18 (stat) ± 0.05 (syst)”. [If there were more than one result of this kind in the paper, the (stat) and (syst) labels would most probably appear “out of band” in the surrounding text.] I recently looked at an astrophysics paper that had four uncertainties (perturbative calculation, numerical modelling, averaging, instrument).

Note that while it is perhaps correct to say (about this as well as about #273) that the people who request such advanced features are perhaps a minority, the number of such uses might not be that insignificant (the RPP alone is about two thousand pages). And they care a lot. I mean, I’d like to recommend siunitx to my experimenter friends as the typographically correct way, but I can’t, because it doesn’t satisfy their needs.

(To be honest, the 123(4) notation also looks rather specialized to me—even more specialized—, but that might just be my bias towards scientific rather than engineering literature.)

josephwright · 2018-06-13T18:54:49Z

@alexshpilkin Sure, the amount of use of a particular style of output is hard to judge: my impression to date is that multiple uncertainties are common in astrophysics, but not elsewhere. (The only examples I've ever been sent are from that area.)

On the 123(4) format, it's common enough to be mentioned in the BIPM documentation (https://www.bipm.org/en/publications/si-brochure/section5-3-5.html): ultimately that's the reference for SI units, and so for siunitx.
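As a reminder of how the parenthetical notation behaves in siunitx, here is a small sketch. It assumes the separate-uncertainty option is available in the installed release (it is long-standing, but newer releases may prefer uncertainty-mode instead).

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% Default: the uncertainty stays in parentheses, as typed
\num{123(4)}

% Printed as a separate "value \pm uncertainty" pair
\num[separate-uncertainty = true]{123(4)}

% The digits in parentheses apply to the last places of the value:
% here the uncertainty is 0.18
\num[separate-uncertainty = true]{10.55(18)}
\end{document}
```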

I've not closed the issue precisely because I know it's important. At the moment, I'm imagining I'll need to look at a swap-out parser, etc. (I've already got to cover complex numbers, exponents, multi-part numbers, ...: it's a tricky mix!)

alexshpilkin · 2018-06-13T19:14:13Z

@josephwright Well, the paper I referenced is in high-energy (collider) physics (which is where I first encountered this as well). The th/model/stat/syst split is actually quite common there when the experiment is complex enough (and each of these parts may well be asymmetrical as per #273, except the statistical one).

As to the 123(4) notation, well, SI itself is essentially an engineering system—in the sense that it’s best at dealing with mostly everyday values, so e.g. chemistry also counts as engineering here. (The most frequent sources of the parenthetical notation in my experience are actually chemists, with tables of constants in the second place.) It’s not a bad thing, it’s just useful to know what informed its design (and, it seems, documentation) and understand its limits.

I hear you on parsing in TeX, it’s surprisingly painful for what’s essentially a macro language. It’s not a simple problem you solve, and if you don’t consider this issue to be unimportant, then I’m fine just pointing you at arXiv’s hep-ex as another source of examples.

josephwright · 2018-06-14T08:33:40Z

@alexshpilkin Hmm, the need or at least possibility of 'open ended' lists of uncertainties is itself a bit tricky. I wonder if I can come up with some 'container' syntax, for example multi-part-uncertainty = true, which then allows a 'pluggable' parser just for that part. I could go with something like [ ... ] for such multi-part uncertainties:

\num{1.2[\pm 1.8(stat) \pm 2.1 (syst)]}

I'd then need an interface for creating 'sub parsers' and 'sub printers' for such things: that would address my concerns over ordering. It still looks a bit awkward, but it might be workable.

josephwright · 2018-06-21T10:14:38Z

Carrying forward some ideas from #273 (closed as a duplicate of this question), there are essentially three things which need to be done here:

Ensure that the internal number format has flexibility in the nature of an uncertainty
Provide one or more parsers for the various types of multi-part uncertainty
Provide printing routines for the various types of multi-part uncertainty

I'll use this issue as a 'meta' one, and open specific issues for each of those ideas.

josephwright · 2018-06-29T08:31:08Z

I've now implemented the necessary data storage in v3: see #342. Writing print routines will likely be easy enough, so those might also get done for v3.0. The issue will be parsers: I'm currently thinking of perhaps having uncertainty-mode and using that to determine what type of uncertainty to look for.

maxnoe · 2019-03-18T17:57:10Z

I think most of us will be very happy with a new macro; that should make it much easier to come up with a good interface. E.g.

\SIAsymUncert{value}{lower}{upper}{unit}
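A user-level stopgap along these lines can be sketched in a few lines. \SIAsymUncert here is a hypothetical helper defined in the preamble, not something siunitx provides; the argument order follows the suggestion above, and \unit assumes siunitx version 3 (older releases would use \si instead).

```latex
\documentclass{article}
\usepackage{siunitx}
% Hypothetical helper, not part of siunitx:
% #1 = value, #2 = lower, #3 = upper, #4 = unit
\newcommand{\SIAsymUncert}[4]{%
  \ensuremath{{\num{#1}}^{+\num{#3}}_{-\num{#2}}\,\unit{#4}}%
}
\begin{document}
\SIAsymUncert{1.23}{0.05}{0.04}{\metre}
\end{document}
```

Because each part goes through \num separately, siunitx cannot round the value and its uncertainties together, which is one reason a built-in interface is still worth having.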

BoostCookie · 2020-08-21T06:07:47Z

Because you've closed #273 and #342 I'm writing here regarding asymmetric uncertainties. Because the symmetric uncertainty can be parsed as \num{number+-uncert} I think the asymmetric uncertainty should be parsed as \num{number+upper-lower}.

Phidica · 2020-11-18T01:15:27Z

If there is still sense in making suggestions about the syntax, then to mirror the \num{123(4)} style, which produces 123 ± 4, I would like to suggest extending what can appear inside of the parentheses. I guess I'll just show some examples of what I'm imagining:

\num{123(4,5,6)}                             -->   123 ± 4 ± 5 ± 6
\num{123(+4,-5)}                             -->   123^{+4}_{-5}
\num{123(+1,-2[stat],5[syst],6[any text])}   -->   123^{+1}_{-2} (stat) ± 5 (syst) ± 6 (any text)

By using a comma separated list we can enumerate any number of uncertainty sources. One downside is with ensuring that asymmetric uncertainties are always properly defined and have exactly one component with a + and one with a -.

Anyway, I don't know whether this kind of syntax parsing is easy or incredibly difficult with the package as it stands. Just wanted to voice how this feature looks in the ideal world of my imagination :p

josephwright · 2021-04-26T18:06:30Z

@Phidica An interesting idea and perhaps one to pursue, although I'm not 100% sure about trying to freely mix symmetrical and asymmetrical uncertainties (I need to have some internal structures to print things).

josephwright · 2021-04-26T18:17:05Z

To update everyone, my current plan is to take small steps. I think I'm going with uncertainty-type (I need uncertainty-mode elsewhere, and in a sense it doesn't quite fit the other mode uses, which are more output-oriented).

The first 'new' type of uncertainty will, I think, be a single asymmetric one: something like uncertainty-type = single-symmetrical for the current approach and uncertainty-type = single-asymmetrical for the 12.3+4-5 type. I don't fancy 'auto-detection' between the two. That would probably mean an input syntax 12.3(4)(5) would be hard-coded as equivalent to 12.3 +4 -5 in this case. Output can then be a straight copy of the input or 12.3^{+4}_{-5}.

I can then look at more open-ended types. @Phidica's suggestion for non-bounded lists is interesting, but I do wonder if that's common. It's also a lot easier at the internal level if I know how many components I'm handling. I wonder if the stat/sys split needs to have free text in the input, or could be covered by uncertainty-parts with then uncertainty-type = named-symmetrical or named-unsymmetrical (number of parts required then taken from uncertainty-parts). I guess that depends on whether the same names always turn up: do I need to cope with 'This value has a sys and a stat, this value only has a stat'?

For those interested, the internal format at the moment uses {S}{nnn} to represent the symmetrical value. I'm thinking of {A}{{nn}{mm}} for a single asymmetrical, then {S2}{{nnn}{nnn}} for a two-part symmetrical, etc. That way internally the code won't care about the naming: they are just 'a list in order'.

maxfl · 2021-04-27T07:19:43Z

stat/syst is not the only possibility for the uncertainties. I've met the following cases:

triplets of stat/syst/theory were used;
asymmetric stat and syst uncertainties, here;
in the case of error budget estimation, the groups may be arbitrary (detector, background, etc.). It is worth noting that when the number of uncertainties is larger than 3 they are usually typeset in a table.
different labels are used in papers: stat, stat., syst, syst., (stat), (stat.), etc.

My personal impression is that a split into 2-3 groups is used most often. The labels vary.

Hope this helps.

josephwright · 2022-03-25T07:14:21Z

I'm working on an implementation for this area. Looking again at the parser problem, I suspect @Phidica's idea slightly modified is best. I'm imagining

\num{1.23(+1:-2;5;6)}

which will result in something like

1.23 \substack{+0.01 \\ -0.02} \pm 0.05 \pm 0.06

or similar. I think 'labelled' uncertainties are best handled by having an option uncertainty-classes or similar, so if that is set then we take the label from there

\num[uncertainty-classes = sys;stat]{1.23(+1:-2;5;6)}

or

\num[uncertainty-classes = {sys,stat}]{1.23(+1:-2;5;6)}

That leaves open how to best give the uncertainty parts. One might use the approach I've suggested above or might prefer

\num{1.23(+0.01:-0.02;\pm0.05;\pm0.06)}

perhaps then allowing a 'mix'

\num{1.23(+0.01:-0.02;5;\pm0.06)}

where with no leading \pm the uncertainty is treated like the current bracketed ones (given in the last places).

I'll probably try to come up with something for beta testing over the next couple of weeks.

maxfl · 2022-03-25T07:30:37Z

I like the proposed solution with no \pm, but the mixture is also ok.

Phidica · 2022-03-25T10:58:37Z

Having the flexibility for either syntax seems good for different user preferences. Controlling the labels with an option is also a good, clean approach that I like.

What were you thinking should happen if only one "class" name has been set in the preamble? Would it show up on all uncertainties, even simple (ie, single-part) ones? Or should there need to be at least two class names set, by definition of the circumstance of needing multi-part uncertainties?

josephwright · 2022-03-25T11:55:45Z

> Having the flexibility for either syntax seems good for different user preferences. Controlling the labels with an option is also a good, clean approach that I like.

I have the parser code to build on, so I hope I can pull this off - it's a question of tracking the data internally correctly.

> What were you thinking should happen if only one "class" name has been set in the preamble? Would it show up on all uncertainties, even simple (ie, single-part) ones? Or should there need to be at least two class names set, by definition of the circumstance of needing multi-part uncertainties?

I was thinking something like this

If there is a single uncertainty (either a symmetrical or an asymmetrical), ignore any classes - so 1.23 \pm 0.04 prints the same as now
If there are multiple uncertainties, take the 'labels' in order, so with uncertainty-classes = sys;stat and 1.23(4;5;6) you'd get 1.23 \pm 0.04 \, (sys) \pm 0.05 \, (stat) \pm 0.06, i.e. if there are more uncertainty parts than labels, the remaining values are anonymous

(Implied there is some setting to decide how to format the classes)

josephwright · 2022-03-26T10:29:56Z

Continuing to think, I'm not keen on \num{1.23(\pm0.04;+0.05:-0.06)} as that confuses the existing 'short' and 'long' syntaxes. So I think it needs to be \num{1.23 \pm 0.04 + 0.05 - 0.06} or \num{1.23(4;5:-6)} or similar. The only question then, for the 'short' form, is whether it is better to have \num{1.23(4)(+5:-6)} or \num{1.23(4;+5:-6)} or ... I'm thinking the second form, i.e. (...) is 'the entire uncertainty part'. I think overall I do want +...:-... explicitly for asymmetric uncertainties.

Phidica · 2022-03-26T10:48:26Z

I will say that mentally parsing the difference between the semicolons and colons when they're all in one big set of parentheses does take some focus, I think. In practice I'd probably be wanting to put whitespace around them so I can read them in my code. More fully "encapsulating" each uncertainty part in a different set of parentheses seems a lot more readable at a glance without needing to pad them out with spaces, if you're set on keeping the colon as the asymmetric separator (and I do like it for that). It also still feels fairly consistent with the existing design: one set of parentheses = one \pm uncertainty, therefore more parentheses in sequence = more uncertainty parts.

josephwright · 2022-03-28T12:37:50Z

I've closed sub-issue #344 with working code. What I don't have there yet is an interface for adjusting how multi-part uncertainties are printed. Probably I will do that after sorting out the parser extension, at which point users can test.

For the present, if you want to check out the new code as far as it works, try something like

\documentclass{article}
\usepackage{siunitx}
\begin{document}
\ExplSyntaxOn
% One "A" uncertainty, one "S" one
% "A" = +75:-80, "S" = 15
% Likely input syntax \num{123.456(75:80)(15)}
\tl_set:Nn \l_tmpa_tl
  { { } { } { 123 } { 456 } { {AS} { {75} {80} } {15} } { }{ 0 } }
\exp_args:Nx \siunitx_print_number:n
  { \siunitx_number_output:N \l_tmpa_tl }
\ExplSyntaxOff
\end{document}

josephwright · 2022-04-03T11:31:10Z

I have

\documentclass{article}
\usepackage{siunitx}
\begin{document}
\num[uncertainty-descriptors = {sys,stat}]{1.23(4)(5)}
\end{document}

working. Next is likely the 1.23 \pm 0.04 \pm 0.05 format, then I'll look at asymmetrical values (I have the formatting all ready, it's just the parsing).
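Building on the working example above, the sketch below puts the plain and labelled forms side by side. It assumes a siunitx release that includes the multi-part code (the thread's milestones suggest v3.1 or later) and that uncertainty-mode accepts the value separate in that release; the exact printed form depends on the installed version and settings.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% Two-part uncertainty, no descriptors
\num{1.23(4)(5)}

% Same input, each part labelled in order
\num[uncertainty-descriptors = {sys,stat}]{1.23(4)(5)}

% Each part written out with \pm
% (assumes uncertainty-mode = separate is available)
\num[uncertainty-mode = separate,
     uncertainty-descriptors = {sys,stat}]{1.23(4)(5)}
\end{document}
```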

josephwright · 2022-04-05T09:54:47Z

The parser for 1.23 \pm 0.04 \pm 0.05 is now sorted. I'm now going to tidy up some aspects of that before even thinking about asymmetric values. In particular, I realise that one needs to worry about ambiguous number detection, which means I likely can't simply ignore uncertainty-mode.
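To compare the two input styles once this parser is available, a test file along the following lines may help. It is a sketch that assumes a siunitx release containing the multi-part parser; both lines are expected to describe the same value, with the printed form governed by the current uncertainty settings rather than by which input style was used.

```latex
\documentclass{article}
\usepackage{siunitx}
\begin{document}
% Long form: each part introduced by \pm
\num{1.23 \pm 0.04 \pm 0.05}

% Short form of the same value, for comparison
\num{1.23(4)(5)}
\end{document}
```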

josephwright · 2022-04-22T12:04:31Z

I am pushing to v3.2 for the asymmetrical aspect: I want to get some real usage of the multi-part symmetrical system first.

ghost assigned josephwright May 6, 2013

josephwright mentioned this issue Jun 14, 2018

Asymmetrical uncertainties and round-mode w/ omit-uncertainty #273

Closed

josephwright added this to the v3.1 milestone Jun 21, 2018

This was referenced Jun 21, 2018

Extend internal number format to allow varied uncertainty/tolerance/error types #342

Closed

Parser(s) for multi-part numbers #343

Closed

Print routines for multi-part uncertainties #344

Closed

josephwright mentioned this issue Apr 26, 2021

"compact" uncertainty format is incorrect #371

Closed

josephwright modified the milestones: v3.1, v3.2 Apr 22, 2022

josephwright modified the milestones: v3.2, v3.3 Jan 2, 2023

josephwright mentioned this issue Jul 23, 2023

Support asymmetry uncertainties/tolerances #675

Open

josephwright closed this as completed Jul 23, 2023
