# OCaml lacks a proper implementation of min and max for floats #7892

Open
opened this Issue Jan 3, 2019 · 6 comments

Projects
None yet
2 participants
Collaborator

### vicuna commented Jan 3, 2019

Original bug ID: 7892
Reporter: sfuric
Status: new
Resolution: open
Priority: normal
Severity: minor
Version: 4.07.0
Category: language features
Monitored by: @nojb @shindere @yakobowski

## Bug description

There is an old issue (#5781) concerning IEEE 754 conformance of min and max with float arguments. Many advanced floating point users would expect min and max functions to have this behavior

min nan neg_infinity = min neg_infinity nan = neg_infinity

and

max nan infinity = max infinity nan = infinity

The rationale is: if partial evaluation of a two-argument float function yields a constant one-argument function for all floats but nans, then it should also return the same value when evaluated with nan. Indeed, nan is not only a value representing erroneous calculations; it also represents "absence of value". This explains the current (and correct, i.e., expected) behavior of ** in OCaml:

1.0 ** nan = 1.0

and

nan ** 0.0 = 1.0

Having a separate Float module is an opportunity to provide correct implementations of min and max that do not interfere with polymorphic min and max provided by Pervasives.

Collaborator Author

### vicuna commented Jan 3, 2019

 Comment author: @dbuenzli This has been added in #1794 and will be part of 4.08. See the {min,max}_num functions in that PR.
Collaborator Author

### vicuna commented Jan 4, 2019

 Comment author: sfuric I was not taking about the {min, max}_num functions but about the minm and maxm functions as Kahan (the "father" of IEEE 754) calls them. He says: "maxm(+Infinity, NaN) must be +Infinity, NOT NaN, to honor the rule that if a function f(x, y) has the property for some value X that f(X, y) is independent of y , be it finite or infinite, then that f(X, NaN) must be the same as f(X, y) .". Notice that [min_num 1.0 nan] returns [1.0] while [min 1.0 nan] returns [nan]. I incorrectly used "absence of value" above instead of "inability to deliver a meaningful result", sorry, my fault. I had a look at the current implementation of {min, max} in Float and I disagree with the treatment of NaNs. It does not follow Kahan recommendation (however I like the proper treatment of [+0] and [-0], despite it is not mandatory, since it honors [f (min x y)] <= [f (max x y)] when [f] is monotically increasing). A last remark: many comments in Float say "[nan] is returned". This is a bit confusing IMO because it gives the wrong impression of a unique [nan] value, which goes against the idea of having NaNs possibly containing user defined error codes (there are reserved bits for that purpose). IMO documentation should follow IEEE 754 recommendations by saying that the same NaN is returned in case it is unique and if returning [nan] is appropriate, and by saying that any of the NaNs is returned in case both arguments are NaNs (IEEE 754 does not say more however it is possible the return the greater error code in order to honor "commutativity" when appropriate).
Collaborator Author

### vicuna commented Jan 4, 2019

 Comment author: @dbuenzli I don't understand what your objections are here. Basically depending on context you want two kinds of min,max function: Those that treat NaN as a problem in your computation and propagate the NaNs. If that is the case you can use Float.{min,max}. Those that treat NaN as absence of value ({min,max}Num in IEEE 754). If that is the case you can use Float.{min,max}_num. What is exactly bothering you here, the names, the implementations ? Regarding your last remark the dev team decided it was not worth the trouble bothering end users with NaN subtelties see #4948
Collaborator Author

### vicuna commented Jan 4, 2019

 Comment author: sfuric Yes, depending on context I want two kinds of {min,max} functions. But I don't want to choose between one which ALWAYS returns a NaN whenever one of its arguments is a Nan and one which NEVER returns a NaN whenever one of its arguments is not a NaN. The "correct" (i.e. expected by advanced users) behavior, as explained in my first note (partial evaluation of a two-argument function) and in my last note (see quotation from Kahan) is: [min neg_infinity x] = [min x neg_infinity] = [neg_infinity] WHATEVER THE VALUE OF [x], otherwise the result is the one returned by the current implementation of [Float.min]. [max infinity x] = [max x infinity] = [infinity] WHATEVER THE VALUE OF [x], otherwise the result is the one returned by the current implementation of [Float.max]. This is clearly a highly questionable choice, at least from our functional programmer point of view, but this is an expectation from many advanced users (see the behavior of ** in my first note). Regarding the remark about "NaN subtelties", this is really a pity to choose the wrong option since implementation is correct (e.g. it does what IEEE 754 suggests)...
Collaborator Author

### vicuna commented Jan 4, 2019

 Comment author: @dbuenzli Okay so that's yet another possible definition and I can see the point of the definition (for reference the full Kahan quote seems to stem from here [1]). However you mention "this is an expectation from many advanced users" but as far as I remember the things we reviewed during GPR1794 no one (libraries, languages) seemed to have this particular "expected" definition. It was either the 1 or 2 (under the name nan{min,max}) I mentioned. One should be careful in introducing yet another definition since this could introduce subtle result discrepancies when porting numerical algorithms from other languages. Do you have references to systems that actually implement the semantics your mention ?
Collaborator Author

### vicuna commented Jan 7, 2019

 Comment author: sfuric I cannot give you a reference of a system that implements this semantics. This is a typical problem with IEEE 754 software implementation: everybody has a good reason not to follow the spirit of original designers as far as Infs and NaNs are concerned. Also, if your argument is to "follow the masses" then the winner is, by far, the polymorphic {min, max} pair! My suggestion is to follow the convention already adopted for ** (i.e. when partial evaluation yields a constant function, then NaNs are ignored), because it is common and because OCaml already adopted it (BTW this avoids having to remember which convention is followed by which function).