Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve stdlib support for option values #1940

Merged
merged 2 commits into from Aug 8, 2018

Conversation

@dbuenzli
Copy link
Contributor

commented Jul 29, 2018

This PR improves support stdlib support for option values. It adds an Option module to the standard library and the Format.pp_print_option function.

I have reviewed the usual standard library extension libraries. Those tend to define quite a few other functions. I didn't find these particularly enlighting from a code reading point of view: the type being simple, pattern matching on the datatype to indicate what you want to do is often clearer than calling a named function whose behaviour you have to remember in your head. So the definitions are kept to an essential (in my eyes) feature set.

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from 23327f9 to 0d8ab5c Jul 29, 2018

@pmetzger

This comment has been minimized.

Copy link
Member

commented Jul 29, 2018

The exact functions to be included (and their names) can be debated, but the usefulness of something like this seems obvious.

@Drup
Copy link
Contributor

left a comment

Thanks for volunteering for doing this one, @dbuenzli , it is highly appreciated. :)

I think the subset of functions you choose is perfectly fine. I would only like two additional ones, iter and to_seq.

(** [some v] is [Some v]. *)

val evict : none:'a -> 'a option -> 'a
(** [evict ~none o] is [none] if [o] is [None] and [v] if [o] is [Some v]. *)

This comment has been minimized.

Copy link
@Drup

Drup Jul 29, 2018

Contributor

On one hand, this name is typical "bunzli", which makes it very amusing. On the other hand, it's completely unhelpful. may is sometimes use for this, but I don't find it particularly clear either. I quite like Container's convention which uses the _or prefix for operation that define a default value, which gives you get_or and map_or.

This comment has been minimized.

Copy link
@rgrinberg

rgrinberg Jul 29, 2018

Member

The signature in cute should also be considered:

value : 'a option -> default:'a -> 'a

In my opinion, both core and containers chose better names here.

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jul 29, 2018

Author Contributor

I did use Option.value in the past but eventually dismissed it: I don't find it compelling since options are also values themselves.

Regarding the label the advantage of using ~none rather than ~default is that you can consistently use it everywhere to denote what happens in this case. E.g. Option.to_result, Format.pp_print_option, or whatever may be added in the future that needs a special argument for None.

This comment has been minimized.

Copy link
@pmetzger

pmetzger Jul 29, 2018

Member

extract perhaps? evict reads very weirdly to a native English speaker, I would assume it would destroy something rather than extracting the wrapped value.

This comment has been minimized.

Copy link
@kennetpostigo

kennetpostigo Jul 29, 2018

evict doesn't make much sense to me either

This comment has been minimized.

Copy link
@yallop

yallop Jul 30, 2018

Member

The Haskell equivalent is called fromMaybe, which suggests from_option or of_option.

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jul 30, 2018

Author Contributor

@yallop doesn't Option.{of,from}_option read very weirdly to you ? It seems at odds with the OCaml M.of conventions.

@bluddy I explained why the argument is none (note that this pattern also works for e.g. result, see the various labellings in the API of rresult) do you suggest:

  1. To be inconsistent and use ~default only in that function and ~none in the others ?
  2. To use ~default in all of these other functions; I doubt that it is a great idea.

If none is mentioned in the name though we get could then label with use, something like:

let v = Option.on_none ~use:x (Map.find_opt k m) 

This comment has been minimized.

Copy link
@yallop

yallop Jul 30, 2018

Member

@dbuenzli: it reads okay to me, although I can see how it seems at odds with the other of functions.

One more possibility (in case there aren't enough under consideration already): the SML basis uses getOPT; we could use Option.get here, and drop the function that raises Invalid_argument.

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jul 30, 2018

Author Contributor

Actually those two (raising and non-raising) are those I miss the most...

This comment has been minimized.

Copy link
@pmetzger

pmetzger Jul 30, 2018

Member

Having a raising and a non-raising version would be good, yes.

As for this:

What is so strange about Option.evict?

To a native speaker, eviction has completely incorrect connotations. One evicts a person who is late on their rent, for example, and the connotation is not of extracting something to examine it but rather of banishment. The word really isn't correct. One is not evicting the value (which sounds like what you would do when removing an entry from a hash table), one is fetching the value. Words like fetch, get, peek, look, examine, extract, etc. come to mind. Evict really means "banish", and you are not banishing the value, you're getting it.

@Armael

This comment has been minimized.

Copy link
Member

commented Jul 29, 2018

I agree with @dbuenzli that in many cases it is enough to just use pattern matching, and most of the time I do not feel the need for many option-related combinators.
There is however one infix operator defined in Batteries which I am quite fond of: (|?) : 'a option -> 'a -> 'a -- I find it nice and natural to use in |> pipelines. I wonder what people think about it? Personally that's maybe the only option combinator I really want to use :).

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Jul 29, 2018

The exact functions to be included (and their names) can be debated, but the usefulness of something like this seems obvious.

@pmetzger concrete contributions to the discussion are welcome but comments like this bring nothing but noise in my opinion. May I ask you to refrain doing them ? github reactions are perfect for this.

@Drup Added Option.to_seq and Option.iter. Regarding evict I think it's a good name, for now I'll let the bike-shedding continue on that one. If you have a strong opinion please make a precise signature proposal so that if that needs to happen we can finally list a few contenders to choose from.

@Armael Except for constructing some values (e.g. {Map,Set}.t), I'm personally not very fond of the point-free pipelining style (especially some people tend to use totally gratuitously, I saw some let y = x |> f in the wild, not kidding). Unless there's a strong desire I'll keep your request out for now.

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch 2 times, most recently from a927b2f to da73324 Jul 29, 2018

@Armael

This comment has been minimized.

Copy link
Member

commented Jul 29, 2018

Fair enough. To be clear, I'm not particularly advocating for the pipelining style -- it's just that when I use it (which still happens sometimes), I find it convenient to have |? to deal with options.

@xavierleroy xavierleroy added the stdlib label Jul 29, 2018

@xavierleroy

This comment has been minimized.

Copy link
Contributor

commented Jul 29, 2018

Adding a back reference to #1920.

let get = function Some v -> v | None -> invalid_arg "option is None"
let map f o = match o with None -> None | Some v -> Some (f v)
let flat_map f o = match o with None -> None | Some v -> f v
let flatten = function Some (Some _ as o) -> o | _ -> None

This comment has been minimized.

Copy link
@xavierleroy

xavierleroy Jul 29, 2018

Contributor

flat_map is also known as monadic bind. Could we make sure this Option module will work with whatever PPX implementing monads we have today? And perhaps with a future, long-overdue "monadic let" syntax?

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jul 29, 2018

Author Contributor

I can certainly add a bind and a return function if that's desired, but then which signature should be supported ?

A quick search in opam suggest there are already at least four ppxs to this: ocaml-monadic, omonad, ppx_monadic and ppx_let and as far as I can tell the interface needed by the latter is incompatible with the interface of the others.

Couldn't we simply add these name aliases in a Monad submodule once this future, long-overdue "monadic let" syntax lands and/or a proper Monad.S signature has been defined in the stdlib ?

This comment has been minimized.

Copy link
@lpw25

lpw25 Jul 29, 2018

Contributor

Can I suggest just removing flat_map from this PR? The name really comes from the equivalent function on lists, but we don't actually have that function in the standard library. By removing it we can avoid getting into issues about how people want the standard library to deal with monads and other computation abstractions on a PR that is about bringing support for options into line with support for other built-in types.

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jul 29, 2018

Author Contributor

@lpw25 I don't understand your concern and request:

  1. flat_map as defined in this PR would in any case not be a good candidate for a monadic interface due to the order in which I wrote it (bind has the argument swapped).
  2. I wouldn't mind if List.flat_map was actually added to the stdlib.
  3. I don't see why adding these functions poses a problem and how not adding these useful functions "avoid[s] getting into issues about how people want the standard library to deal with monads and other computation abstractions ".

I mean at that rate why don't you tell me to remove Option.flatten, this is just monadic join in disguise.

This comment has been minimized.

Copy link
@lpw25

lpw25 Jul 29, 2018

Contributor

I wouldn't mind if List.flat_map was actually added to the stdlib.

I have no particular opinion on whether List.flat_map should be added or not -- my point is just that I would rather see List.flat_map and Option.flat_map added together in a different PR -- where discussions about how they should named and the order of their arguments can be discussed. Essentially, I find it very strange to have Option.flat_map without List.flat_map because the name flat_map derives from the list version of the function.

I mean at that rate why don't you tell me to remove Option.flatten

Because we already have List.flatten so adding Option.flatten is consistent.

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jul 29, 2018

Author Contributor

Not really convinced but I'm a pleasing machine this evening. Function removed from the PR.

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jul 29, 2018

Author Contributor

(and for interested persons I have no plan to carry the work for {List,Option}.flat_map myself)

@gasche

This comment has been minimized.

Copy link
Member

commented Jul 29, 2018

@Armael: one problem with |? : 'a option -> 'a -> 'a is that it computes the default value (the second argument) even when it is not needed, which makes it a potential source of performance bad-surprises.

?none:(formatter -> unit -> unit) ->
(formatter -> 'a -> unit) -> (formatter -> 'a option -> unit)
(** [pp_print_option ?none pp_v ppf o] prints [o] on [ppf]
using [pp_v] if [o] is [Some v] and [none] if it is [None] (defaults

This comment has been minimized.

Copy link
@hcarty

hcarty Jul 29, 2018

Contributor

Maybe [none] prints nothing by default?

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jul 29, 2018

Author Contributor

Thanks. Tweaked.

@pmetzger

This comment has been minimized.

Copy link
Member

commented Jul 29, 2018

The Menhir option.ml/mli also has fold (and hash). See: https://gitlab.inria.fr/fpottier/menhir/blob/master/src/option.mli

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from 62cf37a to 17166b4 Jul 29, 2018

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Jul 29, 2018

Now that I added Option.iter I guess I can add Option.fold though I wonder if people actually use that stuff. I used Menhir's signature as it correspond to the order in {Map,Set}.fold; it is however not the order of CCOpt.fold.

I also added Option.hash. To avoid making Option dependent on the Hashtbl module I simply hash None to 0, a hash expert should tell me if that's ok or if I should rather reorder the dependencies and use Hashtbl.hash on None.

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from 17166b4 to cf2db61 Jul 29, 2018

(** [map f o] is [None] if [o] is [None] and [Some (f v)] is [o] is [Some v]. *)

val flatten : 'a option option -> 'a option
(** [flatten oo] is [Some v] is [oo] is [Some (Some v)] and [None]

This comment has been minimized.

Copy link
@yallop

yallop Jul 30, 2018

Member

typo: is [oo]if [oo]

(** [to_result ~none o] is [Ok v] if [o] is [Some v] and [none] if *)

val of_result : ('a, _) result -> 'a option
(** [of_result r] is [Some v] is [r] is [Ok v] and [None] otherwise. *)

This comment has been minimized.

Copy link
@yallop

yallop Jul 30, 2018

Member

typo: is [r]if [r]

@garrigue

This comment has been minimized.

Copy link
Contributor

commented Jul 30, 2018

+1 for Option.unwrap, since it reads nicely as a qualified identifier.

For what it is worth, lablgtk internally uses the following functions:

val may : f:('a -> 'b) -> 'a option -> unit
val may_map : f:('a -> 'b) -> 'a option -> 'b option
val default : 'a -> opt:'a option -> 'a
val may_default : ('a -> 'b) -> 'a -> opt:'b option -> 'b
(** {1:convert Converting} *)

val to_result : none:'e -> 'a option -> ('a, 'e) result
(** [to_result ~none o] is [Ok v] if [o] is [Some v] and [none] if *)

This comment has been minimized.

Copy link
@yallop

yallop Jul 30, 2018

Member

typo: [none] if *)[none] otherwise. *)

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from ba50e24 to af52f45 Jul 30, 2018

(** {1:convert Converting} *)

val to_result : none:'e -> 'a option -> ('a, 'e) result
(** [to_result ~none o] is [Ok v] if [o] is [Some v] and [none] otherwise. *)

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Jul 30, 2018

Contributor

... and [Error none] otherwise...

(** [of_result r] is [Some v] if [r] is [Ok v] and [None] otherwise. *)

val to_seq : 'a option -> 'a Seq.t
(** [to_seq o] is [o] as a sequence. *)

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Jul 30, 2018

Contributor

Perhaps this is obvious enough for most participants to this discussion, but making it explicit that Some (resp. None) produces a sequence of length 1 (resp 0) might be useful.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Jul 30, 2018

Independently of the naming for the function and the argument, what about using a single function for raising and non raising variants:

val get: ?default:'a -> 'a option -> 'a

One could argue that keeping them separate makes it easier to grep for the raising variant, or mark its type as such (when typed effects will be there).

Of course, one should also have a non-raising, option returning variant to be coherent with the rest of the stdlib


(** {1:preds Predicates and comparisons} *)

val is_none : 'a option -> bool

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Jul 30, 2018

Contributor

It's completely minor, but to be coherent with of_result, you should either use _ for all ignored type variables (val is_none : _ option -> bool) or none of them (val of_result : ('a, 'b) result -> 'a option). I believe using _ might be slightly confusing for beginner, so I'd have a preference for introducing (useless) variable names.

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jul 30, 2018

Author Contributor

Done (named it so that comment won't outdate itself).

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from 4bf16ab to 3491e85 Jul 30, 2018


val compare : ('a -> 'a -> int) -> 'a option -> 'a option -> int
(** [compare cmp o0 o1] is a total order on options using [cmp] to compare
values wrapped by [Some _]. *)

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Jul 30, 2018

Contributor

I'd suggest documenting the respective ordering of Some vs None.

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Jul 30, 2018

@alainfrisch that's an interesting get proposal but I worry this might be slightly more error prone to write and harder to keep in the head when reading or looking for a bug.


(** {1:options Options} *)

type 'a t = 'a option

This comment has been minimized.

Copy link
@alainfrisch

alainfrisch Jul 30, 2018

Contributor

What about re-exporting constructors here, which would allow explicit qualification Option.Some (also in patterns)?

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from 9865f75 to 3cc587b Aug 1, 2018

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Aug 1, 2018

I have the impression that most of the hard points have been discussed. I have thus rebased the work to two commit. In that process:

  1. I have changed Option.evict to Option.value with the exact signature @diml suggested (the polls seems to be in favor of it aswell at the time I write this).
  2. I removed the Option.hash function. I think it is better to handle this precisely at the point where you need it (e.g. if you need the seeding stuff).
  3. Removed the Option.of_result function (this was the only of_ among all these to_ functions and will become clearer soon, see below).
  4. Became optimist about the version in which that should become available.

I will now prepare a PR for Result so that the consistency with certain names can be cross-checked.

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from 3cc587b to 27f1b97 Aug 1, 2018

@mshinwell

This comment has been minimized.

Copy link
Contributor

commented Aug 1, 2018

I think it's a mistake to remove Option.hash. Even if there is another claimed solution, what I reckon will happen here is simply more use of the polymorphic hash function, which is bad.

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Aug 1, 2018

@mshinwell if I get a definitive ok by people who are more into hashes than I am that these two would be good for option and result I will add them to the proposal(s):

let hash_option : ('a -> int) -> 'a option -> int = Option.fold 0
let hash_result ('a -> int) -> ('e -> int) -> ('a, 'e) result -> int = Result.fold

(or whatever else these people would recommend; I do however worry a bit about potentially not applying any form of seeding to None and not distinguishing the cases in hash_result).

@yminsky

This comment has been minimized.

Copy link

commented Aug 1, 2018

These definitions of hash_option and hash_result seem at least a little suspect to me. I wonder if @braibant or @aalekseyev have any thoughts on this, since they did a lot of work on hashing as part of the development of ppx_hash.

@aalekseyev

This comment has been minimized.

Copy link
Contributor

commented Aug 1, 2018

I would say the main problem with the proposed hash functions is that they create systematic collisions.
In ppx_hash we generally followed the rule of never creating systematic collisions (the only collisions are the ones created by the user or by the mixing primitive).

What I'm talking about is Some None colliding with None and Ok x colliding with Error x.
So you end up with very big families of collisions in types like (unit, unit) result list.

We do is things like:

let hash_result hash_a hash_e x = function
  | Ok a -> mix 0 (hash_a a)
  | Error e -> mix 1 (hash_e e)

For some reasonable somewhat-collision-resistant mixing primitive mix.

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Aug 1, 2018

@mshinwell given words of @aalekseyev (thanks !) I think it's better if we leave the hash functions out until suitable Hashtlb.mix primitives are added to the stdlib.

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from 27f1b97 to 52ccb79 Aug 6, 2018

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Aug 6, 2018

So I rebased the PR in the process:

  1. Since this and this comment received no objections I unified the interface with the Result PR. This means replaced Option.flatten by Option.join and added Option.bind.
  2. Added a test suite.
(* TEST
*)

module Option = Stdlib__option (* why ? *)

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Aug 6, 2018

Author Contributor

@shindere Could you please have a look at this. It seems that the testing framework is not able to compile the this file if I don't add this (same problem in Result PR). If you take for example lib-uchar this is not needed so I suspect a bootstrap issue but I don't think a bootstrap should be needed if you add a module to the stdlib and it's not used by the compiler itself.

Also it would be nice to report the actual compilation error when you fail to compile the test.ml file rather than a simple exit code.

This comment has been minimized.

Copy link
@nojb

nojb Aug 6, 2018

Contributor

@dbuenzli I could run this test without issue with a fresh build of your branch; maybe try cleaning your tree ?

This comment has been minimized.

Copy link
@nojb

nojb Aug 6, 2018

Contributor

(that is, I could run it without the module Option = ... line)

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Aug 6, 2018

Author Contributor

@nojb how clean ? somehow with make clean doesn't work here.

This comment has been minimized.

Copy link
@nojb

nojb Aug 6, 2018

Contributor

git clean -fdx

This comment has been minimized.

Copy link
@nojb

nojb Aug 6, 2018

Contributor

What is the error given by the compiler ?

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Aug 6, 2018

Author Contributor

Tried make distclean, didn't work but git clean -fdx did, thanks ! (in case you care the error was that Option module didn't exist)

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch 2 times, most recently from 152f7a1 to cc3e561 Aug 6, 2018

@nojb nojb closed this Aug 7, 2018

@nojb nojb reopened this Aug 7, 2018

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from cc3e561 to 2213abc Aug 7, 2018

@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 7, 2018

CI green! The PR seems to have reached a stable point, so I am planning to merge this tomorrow unless someone speaks up, or beats me to it.

@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2018

@dbuenzli Could you update Changes ?

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from 2213abc to 30e02ba Aug 8, 2018

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Aug 8, 2018

Done. For credits see this comment.

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Aug 8, 2018

I'll rebase that to solve the conflicts now that #1956 is in.

@dbuenzli dbuenzli force-pushed the dbuenzli:option-support branch from 30e02ba to f9c97d1 Aug 8, 2018

@dbuenzli

This comment has been minimized.

Copy link
Contributor Author

commented Aug 8, 2018

Done.

@nojb nojb merged commit 09b4aee into ocaml:trunk Aug 8, 2018

1 of 2 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@dbuenzli dbuenzli deleted the dbuenzli:option-support branch Nov 7, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.