Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Stdlib.Float module #1638

Merged
merged 10 commits into from Mar 16, 2018
Merged

Add Stdlib.Float module #1638

merged 10 commits into from Mar 16, 2018

Conversation

nojb
Copy link
Contributor

@nojb nojb commented Feb 27, 2018

Let the floodgates open! 😄

Some remarks:

  • Most names are the same, except that for analogy with the other numeric modules I renamed: abs_float -> Float.abs, mod_float -> Float.rem, string_of_float -> Float.to_string, etc.
  • Do we want to deprecate the "old" Pervasives float functions?

See also #1010, #964, #944, #1294, #1354.

Comments welcome!

@yallop
Copy link
Member

yallop commented Feb 27, 2018

It would be good to have a Float.Array submodule like this:

module Array : sig
  type t = floatarray
  external create : int -> t = "caml_floatarray_create"
  external length : t -> int = "%floatarray_length"
  external get : t -> int -> float = "%floatarray_safe_get"
  external set : t -> int -> float -> unit = "%floatarray_safe_set"
  external unsafe_get : t -> int -> float = "%floatarray_unsafe_get"
  external unsafe_set : t -> int -> float -> unit = "%floatarray_unsafe_set"
end

and to make Array.Floatarray an alias for this module.

See previous discussion under the "Configure-time option for float array optimization" PR.

(** [modf f] returns the pair of the fractional and integral
part of [f]. *)

type t = float
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a nice convention if such module related to a specific type would come with the type t = .... alias as their first exposed component, perhaps immediately followed by common functions (compare, equal).

Let's also add Float.hash.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in modules such as Int32 or String they currently appear at the end.
Perhaps the convention you suggest should be the subject of another PR,
to be applied to all relevant modules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also add Float.hash.

I think this will require that I duplicate the definition of Hashtbl.hash to avoid a circular dependency (Array -> Float -> Hashtbl -> Array).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hashtbl.hash is a one-line wrapper around a primitive, so the duplication is fine (there are plenty of other cases of such duplication in the stdlib).

And IMO it would also be ok to drop the dependency from Array to Float (by duplicating the current definition of the module, marking Array.Float as deprecated, and only extending Float.Array going forward).

@nojb
Copy link
Contributor Author

nojb commented Feb 27, 2018

It would be good to have a Float.Array submodule like this:

This is now done.

stdlib/float.mli Outdated
a negative integer if [x] is less than [y], and a positive integer
if [x] is greater than [y]. The ordering implemented by [compare]
is compatible with the comparison predicates [=], [<] and [>]
defined above, with one difference on the treatment of the float value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these predicates be exposed in Float (it seems they aren't, currently)? If so, perhaps in a sub-mode Floats.Ops?

Also, concerning the equality, I always found it unfortunate that there is no easy way to create a "Leibniz-equality" for floats, which is needed to implement memoization correctly (for instance). compare x y = 0 is also good, but it doesn't make the difference between positive and negative, which behave differently (for x -> 1/x).

Would it be a good time to add such a function (or just the associated total ordering)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we rely on a another compare function, that would compare
float values by first converting them into int64 ones (bit patterns)?

Copy link
Contributor Author

@nojb nojb Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far there seem to be 3 notions of comparison which are useful in different situations:

  • Pervasives.compare: total order, useful for using with Map.Make, etc.
  • Pervasives.(<): IEEE 754 comparison, can detect if x is nan with x <> x.
  • "bitwise", "Leibniz" or "physical"-equality: useful for memoization, etc.

Why not expose all of them? We need to decide on suitable names. And we need to choose one of them for equal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: bitwise would differentiate nan with different encodings, while I'm not sure we can make the difference otherwise (well, except inspecting the bits).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want a total order as our main comparison function, and this should be the only kind of comparison exposed cleanly.

For nan detection, an is_nan function makes more sense to me than needing any kind of comparison. Bitwise comparison is a niche and should have an appropriately specific name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want a total order as our main comparison function, and this should be the only kind of comparison exposed cleanly.

The question is which one? The current one is fine for many uses, but it collapses -0. and 0. which makes it dangerous to use in some cases. But a refined version with -0. < 0. would not be suitable in all cases either. Both are total.

For nan detection, we already have classify_float (or x = x).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to see more orders exposed and documented, but there is no doubt in my mind that (Float.compare x y) must have the exact same semantics as (Pervasives.compare x y) when x, y are floats.

stdlib/float.mli Outdated
*)

val equal: t -> t -> bool
(** The equal function for floating-point numbers. *)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If would be useful to document where this is compare x y = 0 or x = y.

@@ -555,14 +555,14 @@ bytecomp/emitcode.cmo : bytecomp/translmod.cmi typing/primitive.cmi \
bytecomp/opcodes.cmo utils/misc.cmi bytecomp/meta.cmi \
parsing/location.cmi bytecomp/lambda.cmi bytecomp/instruct.cmi \
typing/ident.cmi typing/env.cmi utils/config.cmi bytecomp/cmo_format.cmi \
utils/clflags.cmi typing/btype.cmi parsing/asttypes.cmi \
bytecomp/emitcode.cmi
utils/clflags.cmi bytecomp/bytegen.cmi typing/btype.cmi \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Just to be sure: I assume this change is unrelated to this PR, right?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I guess it is a side-effect of make depend.

external modf : float -> float * float = "caml_modf_float"
type t = float
external compare : float -> float -> int = "%compare"
let equal x y = compare x y = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Float.equal and Pervasives.(=) will behave differently, since

# nan = nan;;
- : bool = false
# Pervasives.compare nan nan;;
- : int = 0

If that's the desired behaviour it'd be good to have a comment in the documentation for Float.equal

@xclerc
Copy link
Contributor

xclerc commented Feb 27, 2018

Would it make sense to add functions for conversion to/from bits?
(`Int{32,64}.{bits_of_float,float_of_bits})

And also maybe conversion to/from integer types
({Nativeint,Int32,Int64}.{of,to}_float).

@nojb
Copy link
Contributor Author

nojb commented Feb 27, 2018

Would it make sense to add functions for conversion to/from bits?
(`Int{32,64}.{bits_of_float,float_of_bits})

And also maybe conversion to/from integer types
({Nativeint,Int32,Int64}.{of,to}_float).

Possibly, but in the existing numeric modules the conversion functions are only put in one of the sides, not in both (e.g. there is no Int32.to_int64), so this is a change that could be the subject of a different PR and that would affect existing numeric modules as well.

@xclerc
Copy link
Contributor

xclerc commented Feb 27, 2018

Sorry, it wasn't clear, I was actually suggesting to only add
thing to the new Float module:

  • Float.to_bits32 : t -> int32;
  • Float.of_bits32 : int32 -> t;
  • Float.to_bits64 : t -> int64;
  • Float.of_bits64 : int64 -> t;
  • Float.to_int32 : t -> int32;
  • Float.of_int32 : int32 -> t;
  • Float.to_int64 : t -> int64;
  • Float.of_int64 : int64 -> t;
  • Float.to_nativeint : t -> nativeint;
  • Float.of_nativeint : nativeint -> t.

@nojb nojb added the stdlib label Feb 27, 2018
@alainfrisch
Copy link
Contributor

Cf also #1354.

@alainfrisch alainfrisch self-assigned this Feb 28, 2018
@alainfrisch
Copy link
Contributor

So many nice other PRs depends on this new Float module; we don't want to delay it too much.

So let's keep the discussed additions (conversions with int32/int64; other ordering/equality; etc) for later, and focus instead on polishing the existing PR (documentation, dropping dependency between Float and Array, etc).

@nojb
Copy link
Contributor Author

nojb commented Feb 28, 2018

I removed the dependency between Array and Float, added a note to the doc of Float.equal specifying that it uses compare as comparison function and added Float.hash.

I also rebased to avoid cluttering the commit history with make depend and bootstrapping commits.

stdlib/float.ml Outdated
@@ -85,6 +85,8 @@ external modf : float -> float * float = "caml_modf_float"
type t = float
external compare : float -> float -> int = "%compare"
let equal x y = compare x y = 0
external seeded_hash_param : int -> int -> int -> float -> int = "caml_hash" [@@noalloc]
let hash x = seeded_hash_param 10 100 0 x
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Later, it could be useful to create an ad hoc runtime primitive for Float.hash, with an unboxed form.)

(* *)
(* OCaml *)
(* *)
(* Xavier Leroy, projet Cristal, INRIA Rocquencourt *)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change the line above!

val to_string : float -> string
(** Return the string representation of a floating-point number. *)

type fpclass = Pervasives.fpclass =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be Stdlib.fpclass?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried, but got into trouble with ocamldoc:

../../byterun/ocamlrun ../../ocamlc -nostdlib -nopervasives -I  -c -open Pervasives float.mli
File "float.mli", line 109, characters 15-29:
Error: Unbound module Stdlib
Makefile:17: recipe for target 'float.cmi' failed
make[4]: *** [float.cmi] Error 2
make[4]: Leaving directory '/home/nojebar/ocaml/ocamldoc/stdlib_non_prefixed'
Makefile.unprefix:109: recipe for target '../ocamldoc/stdlib_non_prefixed/pervasives.cmi' failed
make[3]: *** [../ocamldoc/stdlib_non_prefixed/pervasives.cmi] Error 2

Any ideas how to fix this? @diml ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The -I -c is odd, it might be because of -I $(HERE) in the Makefile, not sure where this is defined. @Octachron I believe you wrote this line, do you know where $(HERE) is defined?

Copy link
Member

@Octachron Octachron Feb 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nowhere. The whole flag -I $(HERE) should be removed. Nevertheless, this does not affect the issue at hand: ocamldoc simply knows nothing about a Stdlib when building the documentation and is only aware of the extracted Pervasives module. Moreover, ocamldoc limited support of module type of means that it would not be able to link to Stdlib.fpclass even after building the Stdlib documentation.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, indeed. I don't see a simple solution to this problem, we could sed s/Stdlib/Pervasives/ when copying the mli files. Thought it's slightly not satisfactory that the documentation will mention pervasives rather than stdlib

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so we keep Pervasives, and people who insist on deprecating this module will find a solution ;)

@alainfrisch
Copy link
Contributor

LGTM (except perhaps the minor comment about Pervasives vs Stdlib). We can always add components or refine the doc later.

I'll merge soon if nobody objects.

@Octachron
Copy link
Member

A minor point: the new module is missing an entry in the documentation manual/manual/libref/stdlib.etex, should I take care of this point after the merge?

@nojb
Copy link
Contributor Author

nojb commented Feb 28, 2018

A minor point: the new module is missing an entry in the documentation manual/manual/libref/stdlib.etex, should I take care of this point after the merge?

I will add one, thanks!

@@ -62,6 +62,7 @@ from being garbage-collected \\
\subsubsection*{Arithmetic:}
\begin{tabular}{lll}
"Complex" & p.~\pageref{Complex} & Complex numbers \\
"Float" & p.~\pageref{Float} & Floating-point numbers \\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You missed the index part starting from line 100 for the html version of the manual, and line 147 for the latex version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, thanks! Should be fixed now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The manual looks good now, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't it be explicitly said that it is Double precision floating-point numbers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe. In http://caml.inria.fr/pub/docs/manual-ocaml/core.html#sec547, type float is described as being the type of "floating-point numbers" though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure but there is a chance that people find the module Float before the type float.

Copy link
Contributor

@dbuenzli dbuenzli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the PR left adding interesting constants like pi or e, also signature-wise it could be made more IntX like (e.g. adding zero, one).

floating-point number greater than [1.0]. *)

external of_int : int -> float = "%floatofint"
(** Convert an integer to floating-point. *)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be nice to add a constant max_exact_int and indicate that integers in the range [-max_exact_int;max_exact_int] will be represented exactly. This constant is also useful e.g. when you serialize integers to json numbers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but perhaps for a future addition. As simple as such an addition seems, there will be space for discussion, such as:

  • The suggested naming is confusing since of course this is not the maximum integer represented as a float (that would be max_float).

  • What would be the type of this value? float? int? int64?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I was not convinced by the name I gave in Gg.Float and tried max_exact_int which is silly. Regarding the type of the value it can't be int (which can be 31-bits) and I don't really see why this should be an int64, it's a particular value of that type, you can convert it to something else if you need.

@alainfrisch
Copy link
Contributor

It seems that the PR left adding interesting constants like pi or e, also signature-wise it could be made more IntX like (e.g. adding zero, one).

Indeed. About pi, there is #964. The goal here is to make Float enter Stdlib rather quickly, without real additions compared to existing features (to make the PR consensual) in order to unlock many other PRs that are pending on it.

stdlib/float.mli Outdated
val min_float : float
(** The smallest positive, non-zero, non-denormalized value of type [float]. *)

val epsilon_float : float
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the naming of these constants:

  • Should that be just named epsilon?

  • For max_float/min_float, one could use max_value (and later add Int32.max_value as an alias for Int32.max_int, etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong opinion on epsilon_float. For the others, I would leave it for a different PR that addresses all numeric modules at once.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also, min_value is really a bad name for min_float.)

Copy link
Member

@Chris00 Chris00 Mar 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also think it should simply be epsilon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about smallest or lowest or the more explicit smallest_pos instead of min_float?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I renamed it epsilon.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a nice thing about epsilon_float is the similarity with float.h (where it is DBL_EPSILON).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Float.epsilon is as similar — moreover, I do not think we aim to mimic C.


external div : float -> float -> float = "%divfloat"
(** Floating-point division. *)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also add prefix (~+, ~-) and infix operators (+., -., *., /., **). The idea is that if these operators are redefined by another module, one can locally open this one to recover the usual behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely something that needs to be considered - but I think it is better to do it in a different PR that addresses all numeric modules at the same time.

BTW, wouldn't it make sense to use (+) instead of (+.) now that we have a separate module for it ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could but it would make heavier to mix integer and floating points in the same expression — a situation that occurs quite often.

ordering relation. *)

val equal: t -> t -> bool
(** The equal function for floating-point numbers, compared using {!compare}. *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think specialized versions of min and max should be provided (clearly describing their behavior on NaN — maybe several versions are required: one that ignores NaN, another one that returns NaN as soon as one is present).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation for [equal] is obscure. What about:

(** [equal x y] compares [x] and [y] for equality.  Unlike standard equality on floating-point numbers, [equal] treats [nan] as equal to itself and different from any other floating-point value.  This treatment of [nan] ensures that [equal] defines an equivalence relation.  [equal x y] is equivalent to [compare x y = 0].  *)

@Chris00
Copy link
Member

Chris00 commented Mar 12, 2018

I would drop the f since this is due to the poor man module system (hum) of C. For classify_float, I would prefer Float.classify.

external get : t -> int -> float = "%floatarray_safe_get"
external set : t -> int -> float -> unit = "%floatarray_safe_set"
external unsafe_get : t -> int -> float = "%floatarray_unsafe_get"
external unsafe_set : t -> int -> float -> unit = "%floatarray_unsafe_set"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be more operations such as blit, append,...?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current version is just a copy of the previous Array.Floatarray. We can extend it later.

@bluddy
Copy link
Contributor

bluddy commented Mar 12, 2018

I would drop the f since this is due to the poor man module system (hum) of C. For classify_float, I would prefer Float.classify.

Exactly. Trying to match C is silly -- they're dealing with a global namespace. The whole point here is that people can automatically guess the names of functions they need without looking them up because they're the same name as non-float functions, except in the Float module. They need to think about what the function does -- not about arcane prefixes or suffixes.

@alainfrisch
Copy link
Contributor

I suggest the following set of changes:

  • classify_float -> classify
  • drop the prefixes for Float.fpclass (Normal|Subnormal|Zero|Infinite|Nan)
  • drop modf: the naming is not so clear, and one might want to return a record with 2 (unboxed) floats instead for performance reasons; let's keep that for later
  • drop rem/fmod, unless a consensus emerges on the naming

@alainfrisch
Copy link
Contributor

The current state LGTM.

Does anyone want to suggest other changes? (Additions can be left for later.)

@alainfrisch
Copy link
Contributor

@nojb Let's merge this one! Can you rebase/fix the conflicts?

@nojb
Copy link
Contributor Author

nojb commented Mar 15, 2018

Can you rebase/fix the conflicts?

Done.

@alainfrisch
Copy link
Contributor

Ok, hopefully the CI will finish before the next conflict pops up.

@trefis
Copy link
Contributor

trefis commented Mar 15, 2018

Just out of curiosity: why is a bootstrap needed for this change?

@nojb
Copy link
Contributor Author

nojb commented Mar 15, 2018

Just out of curiosity: why is a bootstrap needed for this change?

Good catch, it is not! I guess it was needed at some intermediate stage of the PR (or at least I thought it was) and it was left there after that.

I have removed the bootstrap commit.

@nojb nojb closed this Mar 15, 2018
@nojb nojb reopened this Mar 15, 2018
@nojb nojb merged commit 4724e39 into ocaml:trunk Mar 16, 2018
@nojb nojb deleted the float_module branch March 16, 2018 05:22
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet