Add Stdlib.Float module #1638
It would be good to have a
module Array : sig
  type t = floatarray
  external create : int -> t = "caml_floatarray_create"
  external length : t -> int = "%floatarray_length"
  external get : t -> int -> float = "%floatarray_safe_get"
  external set : t -> int -> float -> unit = "%floatarray_safe_set"
  external unsafe_get : t -> int -> float = "%floatarray_unsafe_get"
  external unsafe_set : t -> int -> float -> unit = "%floatarray_unsafe_set"
end
and to make See previous discussion under the "Configure-time option for float array optimization" PR.
(** [modf f] returns the pair of the fractional and integral
    part of [f]. *)

type t = float
I think it would be a nice convention if such modules related to a specific type came with the type t = ... alias as their first exposed component, perhaps immediately followed by common functions (compare, equal).
Let's also add Float.hash.
Well, in modules such as Int32 or String they currently appear at the end.
Perhaps the convention you suggest should be the subject of another PR, to be applied to all relevant modules.
Let's also add Float.hash.
I think this will require that I duplicate the definition of Hashtbl.hash to avoid a circular dependency (Array -> Float -> Hashtbl -> Array).
Hashtbl.hash is a one-line wrapper around a primitive, so the duplication is fine (there are plenty of other cases of such duplication in the stdlib).
And IMO it would also be ok to drop the dependency from Array to Float (by duplicating the current definition of the module, marking Array.Float as deprecated, and only extending Float.Array going forward).
This is now done.
stdlib/float.mli
Outdated
    a negative integer if [x] is less than [y], and a positive integer
    if [x] is greater than [y]. The ordering implemented by [compare]
    is compatible with the comparison predicates [=], [<] and [>]
    defined above, with one difference on the treatment of the float value
Should these predicates be exposed in Float (it seems they aren't, currently)? If so, perhaps in a sub-module Floats.Ops?
Also, concerning the equality, I always found it unfortunate that there is no easy way to create a "Leibniz-equality" for floats, which is needed to implement memoization correctly (for instance). compare x y = 0 is also good, but it doesn't make the difference between positive and negative zero, which behave differently (for x -> 1/x).
Would it be a good time to add such a function (or just the associated total ordering)?
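To make the signed-zero concern concrete, here is a small sketch (the names are made up for illustration, nothing here is part of the PR): compare reports the two zeros as equal, yet x -> 1/x tells them apart, so a memo table keyed on compare could hand back the wrong cached result.

```ocaml
(* compare treats 0. and -0. as the same value ... *)
let same_by_compare = compare 0. (-0.) = 0

(* ... yet the two zeros are observably different through x -> 1/x. *)
let inv x = 1. /. x
let inv_pos = inv 0.      (* positive infinity *)
let inv_neg = inv (-0.)   (* negative infinity *)

let () =
  Printf.printf "compare says equal: %b\n" same_by_compare;
  Printf.printf "results differ: %b\n" (inv_pos <> inv_neg)
```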
Could we rely on another compare function, one that would compare float values by first converting them into int64 ones (bit patterns)?
So far there seem to be 3 notions of comparison which are useful in different situations:
- Pervasives.compare: total order, useful for using with Map.Make, etc.
- Pervasives.(<): IEEE 754 comparison, can detect if x is nan with x <> x.
- "bitwise", "Leibniz" or "physical"-equality: useful for memoization, etc.
Why not expose all of them? We need to decide on suitable names. And we need to choose one of them for equal.
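As a rough sketch of how the three notions differ, the following puts them side by side. The function names are hypothetical (the thread has not settled on any), and the bitwise version assumes Int64.bits_of_float is an acceptable way to get at the bit pattern.

```ocaml
(* 1. Total order, as given by compare: nan equals itself, zeros collapse. *)
let equal_total (x : float) (y : float) = compare x y = 0

(* 2. IEEE 754 equality: nan differs from everything, itself included. *)
let equal_ieee (x : float) (y : float) = x = y

(* 3. Bitwise equality: compares the underlying 64-bit patterns, so it
   separates 0. from -0. and also nans with different encodings. *)
let equal_bits (x : float) (y : float) =
  Int64.equal (Int64.bits_of_float x) (Int64.bits_of_float y)

let () =
  Printf.printf "nan vs nan:  total=%b ieee=%b bits=%b\n"
    (equal_total nan nan) (equal_ieee nan nan) (equal_bits nan nan);
  Printf.printf "0. vs -0.:   total=%b ieee=%b bits=%b\n"
    (equal_total 0. (-0.)) (equal_ieee 0. (-0.)) (equal_bits 0. (-0.))
```

Only the bitwise version separates the two zeros, and only the IEEE version rejects nan = nan, which is why picking a single one for equal is a real choice.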
Note: bitwise would differentiate nan with different encodings, while I'm not sure we can make the difference otherwise (well, except inspecting the bits).
I think we want a total order as our main comparison function, and this should be the only kind of comparison exposed cleanly.
For nan detection, an is_nan function makes more sense to me than needing any kind of comparison. Bitwise comparison is a niche and should have an appropriately specific name.
I think we want a total order as our main comparison function, and this should be the only kind of comparison exposed cleanly.
The question is which one? The current one is fine for many uses, but it collapses -0. and 0. which makes it dangerous to use in some cases. But a refined version with -0. < 0. would not be suitable in all cases either. Both are total.
For nan detection, we already have classify_float (or x = x).
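For reference, the two existing ways of detecting nan mentioned here could be written as below. Both function names are hypothetical; the second relies on nan being the only float for which x = x is false.

```ocaml
(* Detection through the classification function. *)
let is_nan_by_class (x : float) = classify_float x = FP_nan

(* Detection through IEEE comparison: only nan is different from itself. *)
let is_nan_by_cmp (x : float) = x <> x

let () =
  List.iter
    (fun x ->
       Printf.printf "%h: by_class=%b by_cmp=%b\n"
         x (is_nan_by_class x) (is_nan_by_cmp x))
    [ 1.; infinity; nan; 0. /. 0. ]
```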
I'm happy to see more orders exposed and documented, but there is no doubt in my mind that (Float.compare x y) must have the exact same semantics as (Pervasives.compare x y) when x, y are floats.
stdlib/float.mli
Outdated
*)

val equal: t -> t -> bool
(** The equal function for floating-point numbers. *)
It would be useful to document whether this is compare x y = 0 or x = y.
@@ -555,14 +555,14 @@ bytecomp/emitcode.cmo : bytecomp/translmod.cmi typing/primitive.cmi \
    bytecomp/opcodes.cmo utils/misc.cmi bytecomp/meta.cmi \
    parsing/location.cmi bytecomp/lambda.cmi bytecomp/instruct.cmi \
    typing/ident.cmi typing/env.cmi utils/config.cmi bytecomp/cmo_format.cmi \
    utils/clflags.cmi typing/btype.cmi parsing/asttypes.cmi \
    bytecomp/emitcode.cmi
    utils/clflags.cmi bytecomp/bytegen.cmi typing/btype.cmi \
(Just to be sure: I assume this change is unrelated to this PR, right?)
Yes, I guess it is a side-effect of make depend.
external modf : float -> float * float = "caml_modf_float"
type t = float
external compare : float -> float -> int = "%compare"
let equal x y = compare x y = 0
Float.equal and Pervasives.(=) will behave differently, since
# nan = nan;;
- : bool = false
# Pervasives.compare nan nan;;
- : int = 0
If that's the desired behaviour it'd be good to have a comment in the documentation for Float.equal.
Would it make sense to add functions for conversion to/from bits? And also maybe conversion to/from integer types?
Possibly, but in the existing numeric modules the conversion functions are only put in one of the sides, not in both (e.g. there is no
Sorry, it wasn't clear, I was actually suggesting to only add
Cf also #1354.
So many other nice PRs depend on this new Float module; we don't want to delay it too much. So let's keep the discussed additions (conversions with int32/int64; other ordering/equality; etc) for later, and focus instead on polishing the existing PR (documentation, dropping dependency between Float and Array, etc).
I removed the dependency between I also rebased to avoid cluttering the commit history with
stdlib/float.ml
Outdated
@@ -85,6 +85,8 @@ external modf : float -> float * float = "caml_modf_float"
type t = float
external compare : float -> float -> int = "%compare"
let equal x y = compare x y = 0
external seeded_hash_param : int -> int -> int -> float -> int = "caml_hash" [@@noalloc]
let hash x = seeded_hash_param 10 100 0 x
(Later, it could be useful to create an ad hoc runtime primitive for Float.hash, with an unboxed form.)
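As an illustration of why equal and hash belong together, here is a sketch of a float-keyed hash table used for memoization. Float_key, Float_tbl and memo_sqrt are hypothetical names; the sketch uses Hashtbl.hash and compare directly rather than the new module, and it assumes the runtime hash is consistent with compare-based equality on floats.

```ocaml
(* A key module in the shape expected by Hashtbl.Make. *)
module Float_key = struct
  type t = float
  (* Same definition as the PR's equal: nan is equal to itself. *)
  let equal (x : float) (y : float) = compare x y = 0
  let hash (x : float) = Hashtbl.hash x
end

module Float_tbl = Hashtbl.Make (Float_key)

(* Memoized square root: each distinct key is computed once. *)
let memo_sqrt =
  let cache = Float_tbl.create 16 in
  fun x ->
    match Float_tbl.find_opt cache x with
    | Some y -> y
    | None ->
      let y = sqrt x in
      Float_tbl.add cache x y;
      y

let () = Printf.printf "%g %g\n" (memo_sqrt 4.) (memo_sqrt 4.)
```

With IEEE equality instead of the compare-based one, a nan key could be added but never found again.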
(* *)
(* OCaml *)
(* *)
(* Xavier Leroy, projet Cristal, INRIA Rocquencourt *)
Please change the line above!
val to_string : float -> string
(** Return the string representation of a floating-point number. *)

type fpclass = Pervasives.fpclass =
Shouldn't this be Stdlib.fpclass?
I tried, but got into trouble with ocamldoc:
../../byterun/ocamlrun ../../ocamlc -nostdlib -nopervasives -I -c -open Pervasives float.mli
File "float.mli", line 109, characters 15-29:
Error: Unbound module Stdlib
Makefile:17: recipe for target 'float.cmi' failed
make[4]: *** [float.cmi] Error 2
make[4]: Leaving directory '/home/nojebar/ocaml/ocamldoc/stdlib_non_prefixed'
Makefile.unprefix:109: recipe for target '../ocamldoc/stdlib_non_prefixed/pervasives.cmi' failed
make[3]: *** [../ocamldoc/stdlib_non_prefixed/pervasives.cmi] Error 2
Any ideas how to fix this? @diml ?
The -I -c is odd, it might be because of -I $(HERE) in the Makefile, not sure where this is defined. @Octachron I believe you wrote this line, do you know where $(HERE) is defined?
Nowhere. The whole flag -I $(HERE) should be removed. Nevertheless, this does not affect the issue at hand: ocamldoc simply knows nothing about a Stdlib when building the documentation and is only aware of the extracted Pervasives module. Moreover, ocamldoc's limited support of module type of means that it would not be able to link to Stdlib.fpclass even after building the Stdlib documentation.
Hmm, indeed. I don't see a simple solution to this problem; we could sed s/Stdlib/Pervasives/ when copying the mli files. Though it's slightly unsatisfactory that the documentation will mention Pervasives rather than Stdlib.
Ok, so we keep Pervasives, and people who insist on deprecating this module will find a solution ;)
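For readers unfamiliar with the type fpclass = ... = form under discussion: it re-exports a variant type together with its constructors, so both paths name the same type. A minimal sketch, using a hypothetical module My_float and assuming a compiler recent enough to have the Stdlib module (the PR itself keeps Pervasives for the ocamldoc reasons above):

```ocaml
module My_float = struct
  (* Re-export: My_float.fpclass and Stdlib.fpclass are the same type,
     and the constructors are visible through both paths. *)
  type fpclass = Stdlib.fpclass =
    | FP_normal
    | FP_subnormal
    | FP_zero
    | FP_infinite
    | FP_nan

  let classify : float -> fpclass = Stdlib.classify_float
end

(* Values flow freely between the two names. *)
let c : Stdlib.fpclass = My_float.classify 1.0
let d : My_float.fpclass = Stdlib.classify_float nan
```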
LGTM (except perhaps the minor comment about Pervasives vs Stdlib). We can always add components or refine the doc later. I'll merge soon if nobody objects.
A minor point: the new module is missing an entry in the documentation
I will add one, thanks!
@@ -62,6 +62,7 @@ from being garbage-collected \\
\subsubsection*{Arithmetic:}
\begin{tabular}{lll}
"Complex" & p.~\pageref{Complex} & Complex numbers \\
"Float" & p.~\pageref{Float} & Floating-point numbers \\
You missed the index part starting from line 100 for the html version of the manual, and line 147 for the latex version.
Oops, thanks! Should be fixed now.
The manual looks good now, thanks!
Shouldn't it be explicitly said that it is Double precision floating-point numbers?
Maybe. In http://caml.inria.fr/pub/docs/manual-ocaml/core.html#sec547, type float is described as being the type of "floating-point numbers" though.
Sure, but there is a chance that people find the module Float before the type float.
It seems that the PR left out interesting constants like pi or e; also, signature-wise it could be made more IntX-like (e.g. adding zero, one).
    floating-point number greater than [1.0]. *)

external of_int : int -> float = "%floatofint"
(** Convert an integer to floating-point. *)
I think it would be nice to add a constant max_exact_int and indicate that integers in the range [-max_exact_int;max_exact_int] will be represented exactly. This constant is also useful e.g. when you serialize integers to json numbers.
I agree, but perhaps for a future addition. As simple as such an addition seems, there will be space for discussion, such as:
- The suggested naming is confusing since of course this is not the maximum integer represented as a float (that would be max_float).
- What would be the type of this value? float? int? int64?
Yes, I was not convinced by the name I gave in Gg.Float and tried max_exact_int, which is silly. Regarding the type of the value, it can't be int (which can be 31-bits) and I don't really see why this should be an int64; it's a particular value of that type, you can convert it to something else if you need.
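To pin down what the constant would be: for IEEE double precision the exactly representable integer range ends at 2^53. The sketch below uses hypothetical names and goes through int64 so that it does not depend on the width of int.

```ocaml
(* 2^53 as a float; the name is made up for this sketch. *)
let max_exact_int_as_float = ldexp 1. 53

(* An int64 survives the trip through float iff it is exactly representable. *)
let round_trips (n : int64) =
  Int64.equal (Int64.of_float (Int64.to_float n)) n

let limit = Int64.shift_left 1L 53

let () =
  Printf.printf "2^53 round-trips: %b\n" (round_trips limit);
  Printf.printf "2^53 + 1 round-trips: %b\n" (round_trips (Int64.succ limit))
```

This is the bound that matters when integers are serialized as JSON numbers, as mentioned above.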
Indeed. About pi, there is #964. The goal here is to make Float enter Stdlib rather quickly, without real additions compared to existing features (to make the PR consensual) in order to unlock many other PRs that are pending on it.
stdlib/float.mli
Outdated
val min_float : float
(** The smallest positive, non-zero, non-denormalized value of type [float]. *)

val epsilon_float : float
About the naming of these constants:
- Should that be just named epsilon?
- For max_float/min_float, one could use max_value (and later add Int32.max_value as an alias for Int32.max_int, etc).
I don't have a strong opinion on epsilon_float. For the others, I would leave it for a different PR that addresses all numeric modules at once.
(Also, min_value is really a bad name for min_float.)
I also think it should simply be epsilon.
What about smallest or lowest or the more explicit smallest_pos instead of min_float?
OK, I renamed it epsilon.
A nice thing about epsilon_float is the similarity with float.h (where it is DBL_EPSILON).
Float.epsilon is just as similar; moreover, I do not think we aim to mimic C.
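Whatever the names end up being, the meaning of the two constants can be checked with the existing Pervasives values. This sketch assumes IEEE double arithmetic with round-to-nearest; it also shows why min_value would be a misleading name for min_float, since smaller positive floats (and of course negative ones) exist.

```ocaml
(* epsilon_float is the gap between 1.0 and the next float above it. *)
let next_after_one = 1.0 +. epsilon_float

(* Half that gap is lost to rounding. *)
let half_step = 1.0 +. epsilon_float /. 2.

(* min_float is only the smallest positive *normalized* float:
   halving it gives a smaller, subnormal, still positive value. *)
let below_min = min_float /. 2.

let () =
  Printf.printf "1 + eps > 1: %b\n" (next_after_one > 1.0);
  Printf.printf "1 + eps/2 = 1: %b\n" (half_step = 1.0);
  Printf.printf "0 < min_float/2 < min_float: %b\n"
    (below_min > 0. && below_min < min_float)
```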
external div : float -> float -> float = "%divfloat"
(** Floating-point division. *)
I think we should also add prefix (~+, ~-) and infix operators (+., -., *., /., **). The idea is that if these operators are redefined by another module, one can locally open this one to recover the usual behavior.
Definitely something that needs to be considered - but I think it is better to do it in a different PR that addresses all numeric modules at the same time.
BTW, wouldn't it make sense to use (+) instead of (+.) now that we have a separate module for it ?
We could, but it would make it heavier to mix integers and floating-point numbers in the same expression, a situation that occurs quite often.
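A sketch of the "locally open to recover the usual behavior" idea. Float_ops and Vec are both hypothetical: the first stands in for the operators that would be added, the second for some user module that shadows +.; neither is part of this PR.

```ocaml
(* Stand-in for the proposed operators. *)
module Float_ops = struct
  let ( +. ) = Stdlib.( +. )
  let ( -. ) = Stdlib.( -. )
  let ( *. ) = Stdlib.( *. )
  let ( /. ) = Stdlib.( /. )
end

(* A module that redefines +. for pairs of floats. *)
module Vec = struct
  type t = float * float
  let ( +. ) ((a, b) : t) ((c, d) : t) : t =
    (Stdlib.( +. ) a c, Stdlib.( +. ) b d)
end

(* Under Vec, +. adds vectors. *)
let sum_vec = Vec.((1., 2.) +. (3., 4.))

(* With Vec still open, a local open brings back the scalar meaning. *)
let shifted =
  let open Vec in
  let v = (1., 2.) +. (3., 4.) in
  Float_ops.(fst v +. snd v)
```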
    ordering relation. *)

val equal: t -> t -> bool
(** The equal function for floating-point numbers, compared using {!compare}. *)
I think specialized versions of min and max should be provided (clearly describing their behavior on NaN: maybe several versions are required, one that ignores NaN, another one that returns NaN as soon as one is present).
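The two NaN policies could look roughly like this. The names are hypothetical and the sketch deliberately ignores the ordering of 0. and -0., which a real version would also have to specify.

```ocaml
(* Returns nan as soon as one argument is nan. *)
let min_nan_propagating (x : float) (y : float) =
  if x <> x then x
  else if y <> y then y
  else if x < y then x
  else y

(* Ignores a nan argument; the result is nan only if both are. *)
let min_ignoring_nan (x : float) (y : float) =
  if x <> x then y
  else if y <> y then x
  else if x < y then x
  else y

let () =
  Printf.printf "propagating: %h %h\n"
    (min_nan_propagating nan 1.) (min_nan_propagating 1. nan);
  Printf.printf "ignoring:    %h %h\n"
    (min_ignoring_nan nan 1.) (min_ignoring_nan 1. nan)
```

Both are symmetric in their arguments, which, as far as I can tell, the polymorphic min is not when one argument is nan.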
The documentation for [equal] is obscure. What about:
(** [equal x y] compares [x] and [y] for equality. Unlike standard equality on floating-point numbers, [equal] treats [nan] as equal to itself and different from any other floating-point value. This treatment of [nan] ensures that [equal] defines an equivalence relation. [equal x y] is equivalent to [compare x y = 0]. *)
I would drop the
external get : t -> int -> float = "%floatarray_safe_get"
external set : t -> int -> float -> unit = "%floatarray_safe_set"
external unsafe_get : t -> int -> float = "%floatarray_unsafe_get"
external unsafe_set : t -> int -> float -> unit = "%floatarray_unsafe_set"
Shouldn't there be more operations such as blit, append, ...?
The current version is just a copy of the previous Array.Floatarray. We can extend it later.
Exactly. Trying to match C is silly -- they're dealing with a global namespace. The whole point here is that people can automatically guess the names of functions they need without looking them up because they're the same name as non-float functions, except in the Float module. They need to think about what the function does -- not about arcane prefixes or suffixes.
I suggest the following set of changes:
The current state LGTM. Does anyone want to suggest other changes? (Additions can be left for later.)
@nojb Let's merge this one! Can you rebase/fix the conflicts?
Done.
Ok, hopefully the CI will finish before the next conflict pops up.
Just out of curiosity: why is a bootstrap needed for this change?
Good catch, it is not! I guess it was needed at some intermediate stage of the PR (or at least I thought it was) and it was left there after that. I have removed the bootstrap commit.
Let the floodgates open! 😄
Some remarks:
- abs_float -> Float.abs, mod_float -> Float.rem, string_of_float -> Float.to_string, etc.
- Pervasives float functions?
See also #1010, #964, #944, #1294, #1354.
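For a quick feel of the renamings listed above, old and new spellings side by side. This assumes a compiler that already ships the Float module from this PR; the value names are only for the example.

```ocaml
let a = Float.abs (-3.)        (* previously abs_float (-3.) *)
let r = Float.rem 7.5 2.       (* previously mod_float 7.5 2. *)
let s = Float.to_string 1.5    (* previously string_of_float 1.5 *)

let () = Printf.printf "%g %g %s\n" a r s
```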
Comments welcome!