Add "monadic" let operators #1947
One place this was once supported was MetaOCaml:
|
I'm in favor of such a feature. We started using applicative parsers in dune and it's really painful to write applicative code without something like this. |
It would be lovely if ppx_let could be entirely obsoleted by this new syntax. Some missing features:
I suspect we could quite easily live without support for |
I'd be very happy to see basic syntactic support for monads. By which I mean a "let"-like construct for monadic "bind" and nothing else. There is no "and" in any theory of monads I know of. Not to mention the other stuff that @yminsky mentions. Could we leave the extra syntax to existing PPXs, for people who can't live without, and just have basic monads that everyone understands in the core system? |
It's probably worth me describing some alternative translations that are worth considering. One possibility is to use something halfway between this PR and

    let+ x = A
    and y = B
    and z = C in
    expr

and translate it to:

    let def = A
    and def2 = B
    and def3 = C
    and func = fun ((x, y), z) -> expr
    and (bind, single, and_) = (let+) in
    bind func (and_ (and_ (single def) def2) def3)

This prevents supporting A slight variation would be to use the same translation but to force people to write it as:

    let+ x = A
    and+ y = B
    and+ z = C in
    expr

which I think is aesthetically nicer, although I don't like that there is no actual Another possibility would be to separate out the

    let (let+) = map
    let (let+..and+) = (map, id, product)

with individual |
In particular, monads where |
Sure there is.

    val pure : 'a -> 'a t
    val prod : 'a t -> 'b t -> ('a * 'b) t

If you don't support In addition, in most useful monads the applicative operations are cheaper than the monad operations. This is true of: lwt, async, incremental, build monads in Jenga and dune, monadic parsers etc. Which means that writing efficient code requires you to use |
I'm not really sure how to enforce that without using higher-order unification or requiring the operator to be defined as a module. It also rules out using the construct for applicatives. Beyond that I don't think it is really a necessary restriction. I prefer to view these operations as interpretations of the let syntax; if a user wishes to use an esoteric interpretation I don't see why we should stop them. |
I'm not used to thinking about the let-based syntax and its translation to monads/applicatives in OCaml - unfortunately, these things work pretty well if you just treat them as black boxes - but I think this suggestion is great. Can't wait for the chapter in the manual about monads and applicatives :) |
Well, call me old-school and fixated on the literature, but in the original McBride-Patterson definition of applicatives, there is no product but there is a I agree applicatives are very useful, nearly as much as monads. I'd still like either or both to be supported in a way that is closer to established literature and formalizations. Efficiency considerations come second to understandability. |
Is there a clean way to support |
See section 7 of that paper, where the "Monoidal" typeclass is described as isomorphic to the other signature they present. I've always preferred the monoidal presentation because it makes the nature of applicatives much clearer. For example, the laws for the monoidal presentation are basically just the monoid laws, which I find easier to follow than the laws for the application-based presentation. |
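The isomorphism between the two presentations can be sketched in a few lines. This is only an illustration using `option` as the applicative; the module name `Monoidal` and the functions `apply` and `prod'` are names chosen here, not taken from the paper or from this PR.

```ocaml
(* Monoidal presentation: [pure], [map] and a product. *)
module Monoidal = struct
  let pure x = Some x
  let map f = function None -> None | Some x -> Some (f x)
  let prod a b =
    match a, b with
    | Some x, Some y -> Some (x, y)
    | _ -> None
end

(* Application-based presentation, derived from the monoidal one. *)
let apply f a = Monoidal.map (fun (g, x) -> g x) (Monoidal.prod f a)

(* And back again: the product rebuilt from [pure] and [apply]. *)
let prod' a b = apply (apply (Monoidal.pure (fun x y -> (x, y))) a) b
```

For `option`, `prod'` and `Monoidal.prod` agree on every input, which is the sense in which either signature can be taken as primitive.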
I feel a bit more mellow than @xavierleroy about the proposal. I suspect that there is a large part we all agree on (but it requires a bit of time and effort to find it out), but that maybe this is blurred by dubious salesmanship -- making your first example an ASCII christmas tree was maybe not the best move. One thing that would be nice is to not look at mixed-operator settings at first, and show and discuss examples that correspond to the use of a single monad (bind and return) and a single applicative (pure and |
I think inserting any syntactic sugar into OCaml is going to be hard, since the language doesn't currently have much sugar as a precedent. It may be worth considering putting the minimal amount into the language at first to get people used to the idea and expanding later. Also, while |
Yes, I was just using different names for each |
Regarding examples of applicatives, a good example is the cmdliner library. It is a widely used command line parsing library that uses an applicative parser. Using an applicative makes it possible, in particular, to extract the man page from the parsing code, which is really nice. It is used by 190 packages in opam, and we have a similar library at Jane Street that is used by hundreds (possibly thousands) of our programs. Code using it usually looks like this:

    let x1 = flag "-x1" type1 in
    let x2 = flag "-x2" type2 in
    ...
    let xn = flag "-xn" typen in
    let make x1 x2 ... xn =
      { x1; x2; ...; xn }
    in
    Term.(const make $ x1 $ x2 $ ... $ xn)

where

    let+ x1 = flag "-x1" type1
    and+ x2 = flag "-x2" type2
    ...
    and+ xn = flag "-xn" typen
    in
    { x1; x2; ...; xn }

Just to give a real life example, here is some code extracted from opam:

    let create_global_options
        git_version debug debug_level verbose quiet color opt_switch yes strict
        opt_root external_solver use_internal_solver
        cudf_file solver_preferences best_effort safe_mode json no_auto_upgrade
        working_dir ignore_pin_depends =
      let debug_level = OpamStd.Option.Op.(
          debug_level >>+ fun () -> if debug then Some 1 else None
        ) in
      let verbose = List.length verbose in
      { git_version; debug_level; verbose; quiet; color; opt_switch; yes;
        strict; opt_root; external_solver; use_internal_solver;
        cudf_file; solver_preferences; best_effort; safe_mode; json;
        no_auto_upgrade; working_dir; ignore_pin_depends; }
    ...
    let global_options =
      let section = global_option_section in
      let git_version =
        mk_flag ~section ["git-version"]
          "Print the git version of opam, if set (i.e. you are using a development \
           version), and exit."
      in
      let debug =
        mk_flag ~section ["debug"]
          "Print debug message to stderr. \
           This is equivalent to setting $(b,\\$OPAMDEBUG) to \"true\"." in
      ... about a 100 lines of code like this ...
      Term.(const create_global_options
            $git_version $debug $debug_level $verbose $quiet $color $switch $yes
            $strict $root $external_solver
            $use_internal_solver $cudf_file $solver_preferences $best_effort
            $safe_mode $json_flag $no_auto_upgrade $working_dir
            $ignore_pin_depends)

|
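To illustrate why an applicative parser can expose its documentation without running, here is a minimal sketch of a toy term type with `let+` and `and+` defined in the form accepted by OCaml 4.08 and later. The type `term`, the functions `flag`, `map` and `both`, and the `config` record are all hypothetical and are not cmdliner's API.

```ocaml
(* A toy applicative "term": the flags it reads are known statically,
   before any parsing happens. Hypothetical type, not cmdliner's. *)
type 'a term = {
  flags : string list;                  (* flags this term reads *)
  eval : (string * string) list -> 'a;  (* run against parsed arguments *)
}

let flag name = { flags = [name]; eval = (fun argv -> List.assoc name argv) }

let map f t = { flags = t.flags; eval = (fun argv -> f (t.eval argv)) }

let both a b =
  { flags = a.flags @ b.flags;
    eval = (fun argv -> (a.eval argv, b.eval argv)) }

let ( let+ ) t f = map f t
let ( and+ ) = both

type config = { host : string; port : string }

let config_term =
  let+ host = flag "--host"
  and+ port = flag "--port" in
  { host; port }
```

Here `config_term.flags` lists both flags without evaluating anything, which is the property a man-page generator relies on. A term built with a monadic bind could not offer this, because the flags read by the continuation would depend on a parsed value.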
So it is probably worth me describing how I arrived at the current proposal, or at least arrived at the general idea of the current proposal. The translation used in In practice, this worked pretty well but it couldn't capture some useful cases which was frustrating. This encouraged me to try a slightly different approach. This time I'm trying to essentially define the operators as eliminators of the

    val elim_let : single:(term -> 'a) -> cons:(term -> 'a -> 'a) -> pair:('a -> term -> 'b) -> 'b

Obviously, there are other ways you could structure this eliminator, but by choosing this form we still get that the applicative let and the monadic let are built out of fundamental components. This time they are
Yeah, I jumped somewhat into the middle of my train of thought without including details from my considerations of this subject over the last few years. |
The unmixed applicative case needs to translate:

    let+ x = a
    and+ y = b
    and+ z = c in
    body

into

    map (fun ((x, y), z) -> body) (prod (prod a b) c)

where

    val map : ('a -> 'b) -> 'a t -> 'b t
    val prod : 'a t -> 'b t -> ('a * 'b) t

are the map and monoidal product respectively. The unmixed monadic case needs to translate:

    let* x = a in
    expr

into:

    bind (fun x -> expr) a

where:

    val bind : ('a -> 'b t) -> 'a t -> 'b t

is the bind operation of the monad.

However, I would really like to emphasize how important the mixed monadic case is. One of my favorite papers in this area is "Idioms are oblivious, arrows are meticulous, monads are promiscuous" by Lindley, Wadler and Yallop. It really emphasizes how the relationship between monads and applicatives is all about dependencies between computations. Accurately expressing the dependencies in your computations is really important, and from this perspective:

    let* x = a in
    let* y = b in
    expr

is not a suitable replacement for:

    let* x = a
    and* y = b in
    expr

because they do not really represent the same computations. For some simple monads like option or list it's not an important distinction, but for many monads the distinction is key to using them effectively. For this mixed monadic case we need to translate:

    let* x = a
    and* y = b in
    expr

into

    bind (fun (x, y) -> expr) (prod a b)

where

    val bind : ('a -> 'b t) -> 'a t -> 'b t
    val prod : 'a t -> 'b t -> ('a * 'b) t

are the bind and monoidal product respectively. |
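These translations can be checked against the binding operators as they exist in OCaml 4.08 and later. The sketch below uses `option` as the example monad, with `map`, `bind` and `prod` written in the argument order used above. The sample values and the names `sugared_*` and `desugared_*` are illustrative only.

```ocaml
(* Option used as the example monad. *)
let map f = function None -> None | Some x -> Some (f x)
let bind f = function None -> None | Some x -> f x
let prod a b =
  match a, b with
  | Some x, Some y -> Some (x, y)
  | _ -> None

(* Binding operators take the bound value first and the body second. *)
let ( let+ ) a f = map f a
let ( and+ ) = prod
let ( let* ) a f = bind f a
let ( and* ) = prod

let a = Some 1 and b = Some 2 and c = Some 3

(* Unmixed applicative case. *)
let sugared_map =
  let+ x = a
  and+ y = b
  and+ z = c in
  x + y + z
let desugared_map = map (fun ((x, y), z) -> x + y + z) (prod (prod a b) c)

(* Mixed monadic case. *)
let sugared_bind =
  let* x = a
  and* y = b in
  if y > x then Some (y - x) else None
let desugared_bind =
  bind (fun (x, y) -> if y > x then Some (y - x) else None) (prod a b)
```

Each sugared value equals its hand-written counterpart, and the nested pair pattern `((x, y), z)` mirrors the left-nested calls to `prod`.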
(Note that the above translations are a bit simpler than the ones I have been giving before because they ignore the issue of evaluation order. Since OCaml evaluates lets top-to-bottom, but function arguments right-to-left, you need to first bind the arguments in a let to make sure not to surprise users. I know technically all these evaluation orders are undefined, but I'd still rather not needlessly confuse people.) |
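The evaluation-order concern can be made concrete with a small sketch. It assumes nested `let`s rather than a `let ... and ...` group, since nested `let`s are the form whose order is guaranteed. The helpers `traced`, `ordered` and `unordered` are hypothetical names used only for illustration.

```ocaml
let log : string list ref = ref []

(* Record that [name] was evaluated, then return [v]. *)
let traced name v =
  log := name :: !log;
  v

let prod a b = (a, b)

(* Nested lets run top-to-bottom, so naming the arguments first fixes
   the order in which their effects happen. *)
let ordered () =
  let def1 = traced "a" 1 in
  let def2 = traced "b" 2 in
  prod def1 def2

(* Here the order of the two [traced] calls is left unspecified by the
   language, so the log may come out in either order. *)
let unordered () = prod (traced "a" 1) (traced "b" 2)
```

After calling `ordered ()`, the log always records `"a"` before `"b"`. No such guarantee holds for `unordered ()`.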
To make Leo's point about dependencies more concrete, consider the case of Incremental. The whole point of Incremental is to optimize the recomputation of the calculation in question, and dependency tracking is critical. So, the following two computations:

    let+ x = f a
    and+ y = g b
    in
    x + y

    let* x = f a in
    let+ y = g b in
    x + y

have very different recomputation semantics. In particular, if the value of Similarly, moving from the applicative subset of an API to the full monadic one gives up on the analyzability of the resulting object, which matters for things like the cmdliner example that @diml mentioned. Generally speaking, I think there is now a decent amount of prior art (at least ppx_let and Haskell) that suggests that one should make it possible to be explicit about when one is using the applicative subset of a monad, vs when one is using the full power of bind. Leo's proposal allows that, but only providing a monadic let binding does not. |
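The difference in dependency structure shows up even in a very small model. The sketch below does not use Incremental. It assumes `option` as a stand-in monad and a counter, `calls`, that records how many of the two sub-computations were started. The definitions of `f` and `g` are made up for the example.

```ocaml
let calls = ref 0

(* [f] may fail; [g] always succeeds. Both count their invocations. *)
let f a = incr calls; if a >= 0 then Some a else None
let g b = incr calls; Some (2 * b)

let ( let+ ) o k = Option.map k o
let ( and+ ) a b =
  match a, b with Some x, Some y -> Some (x, y) | _ -> None
let ( let* ) o k = Option.bind o k

(* Applicative form: [f a] and [g b] are independent, so both always run. *)
let independent a b =
  let+ x = f a
  and+ y = g b in
  x + y

(* Monadic form: [g b] sits inside the continuation of [f a], so it only
   runs once [f a] has produced a value. *)
let dependent a b =
  let* x = f a in
  let+ y = g b in
  x + y
```

With a failing `f`, `independent` still starts both computations while `dependent` starts only the first. A library such as Incremental can exploit exactly this distinction, because in the applicative form it knows up front that `g b` does not depend on `x`.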
My pay grade is not high enough to comment on the specifics of the proposal. However, I will note that there are no corresponding changes to the documentation included in your patches. I think writing the documentation, in addition to its more obvious use, also has the effect of helping one think through whether one's feature is optimal. If it's hard to explain, it might be not quite right... |
I expect there are a few possible ways to handle that. For example, you could translate

    match+ expr with
    | pat1 -> case1
    | pat2 -> case2
    | exception E1 -> case3
    | exception E2 -> case4

to something like this:

    (match+)
      (function Ok pat1 -> case1 | Ok pat2 -> case2 | Error E1 -> case3 | Error E2 -> case4)
      (match expr with v -> Ok v | exception e -> Error e)

or perhaps to something like this:

    match expr with
    | v -> (match+) (function pat1 -> case1 | pat2 -> case2) v
    | exception e -> (match+) (function E1 -> case3 | E2 -> case4) e

Personally, I think it'd be useful to see the
|
Second that. |
Sorry to add to the volume of the discussion, but it so happens that I have been obsessively thinking about such a feature for quite some time and wanted to compare notes. So, how about this: For any identifier

    let.bind x = y in z ⇒ bind y ~f:(fun x -> z)
    let.map x = y in z ⇒ map y ~f:(fun x -> z)
    let.lwt x = y in z ⇒ lwt y ~f:(fun x -> z)
    let.foo x = y in z ⇒ foo y ~f:(fun x -> z)

This allows for easier mixing of, for example, Mixed monadic case:

    let.bind a = x
    and.prod b = y
    and.prod c = z in e
    ⇒
    bind ~f:(fun ((a, b), c) -> e) (prod (prod x y) z)

This assumes the following:

    val map : 'a m -> f:('a -> 'b) -> 'b m
    val bind : 'a m -> f:('a -> 'b m) -> 'b m
    val prod : 'a t -> 'b t -> ('a * 'b) t

Importantly, it requires

    val map : 'a m -> ('a -> 'b) -> 'b m
    val bind : 'a m -> ('a -> 'b m) -> 'b m

But not like this:

    val map : ('a -> 'b) -> 'a m -> 'b m
    val bind : 'a m -> ('a -> 'b m) -> 'b m

If nothing else, I second the support for the mixed monadic case, which I use a lot with ppx_let. |
We use some internal parsing library, using applicative-style combinators, but "postfix style":

    val ( @@ ): 'a t -> ('a -> 'b) t -> 'b t
    val ( @@@ ): unit t -> 'a t -> 'a t
    val (!!): 'a -> 'a t

leading to code such as:

    str "(" @@@ number @@ str "," @@@ number @@ str ")" @@@ !!(fun y x -> (x, y))

Ideally the code above could be written as:

    str "(";@
    let@ x = number in
    str ",";@
    let@ y = number in
    str ")";@
    (x, y)

|
That feels oddly reminiscent of #508 , which I still find ugly as hell. |
I don't find that particularly nice either, but the alternative would be quite heavy:

    let@ () = str "(" in
    let@ x = number in
    let@ () = str "," in
    let@ y = number in
    let@ () = str ")" in
    (x, y)

(at this point, using the original infix combinator syntax would probably work better in practice) We don't necessarily need to support ad hoc sequencing; one could also interpret |
@alainfrisch we are now using an applicative s-expression parser in Dune, and to avoid positional argument matching we introduced a text preprocessor to interpret

    let%map loc = loc
    and () = keyword "select"
    and result_fn = file
    and () = keyword "from"
    and choices = repeat choice in
    Select { result_fn; choices; loc }

|
With the current proposal, would you be able to use While writing my example above, it occurred to me that |
@alainfrisch Personally, I quite like the form using

    let+ (x, y) = str "(" @@ number @@@ str "," @@ number @@@@ str ")" in
    ...

which might be more to your taste.
With this proposal it has to be
No, and to be honest I don't like the look of that. If people want to avoid equals then I would rather we use the fairly traditional: let+ x <= ... in although personally I think |
I think this is a good point of comparison. The existing options of |
My opinion is that the resistance to |
I think @bluddy just put his finger on much of what was bothering me. |
Precisely what @bluddy stated, that combined with you don't know where the new |
This applies to every single value -- you have no idea where it's coming from, and instead have to find the relevant open. You do have the option of specifying a module for values, and maybe we could allow that option, so we could have |
In addition to above |
I've had a go at improving the documentation in #2206. I didn't change much, but hopefully it's a bit clearer. Further suggestions welcome. |
So I'm finally playing around with this, pretty cool! I'm pretty confused as to what is expected of the

    let (let+) = Lwt.map
    let (and+) = Lwt.map
    let (let*) = Lwt.bind
    let (and*) = Lwt.bind
    let () =
      Lwt_main.run begin
        let* line1 = Lwt_io.(read_line stdin)
        and* line2 = Lwt_io.(read_line stdin) in
        Lwt_io.(write_line stdout (line1 ^ line2))
      end

yields:
Coming from JavaScript, I'm pretty excited for something that basically looks like a more-general async/await (if I'm reading this correctly? Forgive me, talk of applicatives and monads sometimes goes over my head 🤣); but I can't quiiiite figure out how to apply this with Lwt for the similar purposes of cleanly interleaving asynchronous and synchronous code … |
You can use the following for the

    let (and*) a b =
      Lwt.bind a (fun x -> Lwt.bind b (fun y -> Lwt.return (x, y)))

although there is not much point using them when they are defined that way. I would have thought that lwt could provide a more efficient implementation, but I can only see the confusingly named

    val join : unit t list -> unit t
    val ( <&> ) : unit t -> unit t -> unit t

for composing multiple threads. What you need is a slightly more general function like

    val both : 'a t -> 'b t -> ('a * 'b) t

Note that the lwt ppx just uses |
@lpw25 Would it be possible for a tutorial to be written on the use of these operators? I suspect they're going to be confusing to a lot of people, especially since monads aren't a common thing (yet) in the OCaml world. |
val both : 'a t -> 'b t -> ('a * 'b) t There's an issue open about this: More general join and choose operators (#325) There's also some discussion under the issue of an apparent problem with the Lwt translation of |
* Fix problems with character literals in comments
* Do not parse unclosed comments and quoted strings
* Update known failures
* Only allow line number directives with filename (ocaml/ocaml#931)
* Rename dot_operator to indexing_operator
* Disallow .~ in indexing operator (ocaml/ocaml#2106)
* Add test for indexing operator
* Support empty variants (ocaml/ocaml#1546)
* Support binding operators (ocaml/ocaml#1947)
* Use tree-sitter 0.14.0
* Cleanup travis config
Do we have some plans to optimize such pattern? |
There are three character classes in user-choosable operator symbols in OCaml:

- 'symbolchar', which allows everything one can use in infix operators
- 'dotsymbolchar', which removes
  + '.', which creates ambiguities with ..> as valid object type syntax
  + '<', incompatible with Camlp4
  + '~', incompatible with MetaOCaml
- 'kwdopchar', which was introduced by Leo in ocaml#1947 (binding operators), which is similar to 'dotsymbolchar' but also
  + removes '!' and '?', only allowed in prefix operators
  + removes '%', incompatible with ppx syntax
  + removes ':' (why not?)
  + adds '<' back in (why ?)

It's fairly hard to justify these choices and have a coherent story for the character classes. It would already be easier if at least we had a monotonic hierarchy of more and more permissive classes.

The present commit removes '<' from the 'kwdopchar' set, so that it corresponds to a core "safe characters" class that is included in all others. (This means that user programs that would use a let-operator such as "let<@>" or "and<@>", accepted in 4.08 and 4.09, would now be rejected in future OCaml releases incorporating this change.)
Note that val both : 'a t -> 'b t -> ('a * 'b) t was added to Lwt in ocsigen/lwt@d7e23c7 in March 2019. |
It would be worth editing the examples in the first comment. Apparently some information is not correct:
The committed version does not include
Are https://caml.inria.fr/pub/docs/manual-ocaml/manual046.html picks Some experiments suggest that the compiler checks the return type of

    let (let*) o f = match o with None -> None | Some x -> f x;;
    let* a = Some 3 in Some 5 (* >>= *)

    let (let*) o f = match o with None -> None | Some x -> Some (f x);;
    let* a = Some 3 in 5 (* <*> *)

|
Based on a few recent comments and PRs, I thought it might be a good time to revive the idea of adding some support for "monadic" syntax to OCaml. I can't find the last attempt at this on GitHub or Mantis, but like last time the idea is to allow people to define "let" operators like `let*` and `let+`. The general form of the operators is `let` followed by a sequence of infix operator characters. (Note that this deliberately excludes `let!` -- the existence of `open!` and `method!` would make that too confusing).

Outline

This proposal aims to be fully general whilst making it easy to implement the common cases. In particular, it focuses on supporting functors, applicatives and monads easily. The translation chosen is a generalisation of the one used by ppx_let. Essentially, given a let binding like:

it is translated directly to:

This translation essentially treats the input to a let binding as a pair of a tagged list of bindings and a function. The `bind` function handles pairing the bindings with the function, whilst the `single`, `and++` and `and+++` functions create the list of bindings. The `bind` and `single` functions are combined to form the `(let+)` value.

Example

For a monad, I suggest the following approach to using these let operators, using `list` as an example:

So we use `let+` for map, `let*` for bind and set both the `and`s to the monoidal product. This gives the expected behaviours:

(I've never found list to be a particularly useful monad and I think it makes these examples a bit less compelling than if they were using a more useful monad like `lwt` or `async`, however list is the simplest monad that makes clear how these operators work so I used it anyway).

Observe that this approach aligns cleanly for the functor and applicative cases: for functors we can define `let+`, for applicatives we can define `let+` and `and+`, and for monads we can define all four operators. This also naturally supports monads for which the applicative operations are more efficient than the equivalent monadic operations -- i.e. where `let+ ... and+ ...` is more efficient than `let* ... let+ ...`.

Differences from `ppx_let`

For comparison, the `ppx_let` translation essentially takes:

and translates it to:

A key difference between the translation in this PR and the one used by `ppx_let` is the addition of the `single` part of the `let` operators. Note that it was always `id` in the example above. One reason for this addition is that with some monads it is more efficient to gather together multiple computations and bind/map them all at once rather than to merge them together using their monoidal product and then bind/map that -- one notable example is `Deferred.t` in the async library. Having the `single` combinator allows the type of the argument to `bind` to differ from the type of the things being bound.

Another difference from `ppx_let` is the support for mixing different `let` and `and` operators together. This is partly to allow the direct alignment from operators to the functor, applicative and monad use cases -- as described above. It also makes the construct more general allowing, for example, a computation that can bind on multiple sorts of things to mix them together in a single `let ... and ...` construct.

`match` operators

These have now been split into #1955
Open questions

Currently the implementation is entirely as a translation in the parser, much as other operators are. I generally prefer to avoid elaboration and instead to give things their own AST nodes, but I wanted to keep its implementation in line with other operators for this initial attempt. Should I change the implementation before we merge? Should I give all operators their own AST nodes -- that would certainly make `pprintast.ml`'s job a lot easier?

Is there any reasonable way to support qualified versions of these operators? By this I mean allowing you to use `List.(let*)` without opening `List`. I personally think it will be quite difficult to find an aesthetic syntax, and I think that much of the value of these operators does not require it. Once we have implicits this will be much less of an issue.

Currently, the `let*` is a single token from the lexer. Should it instead be two separate tokens?

How should this be integrated into the standard library? My personal preference is not to do any such integration yet, and to wait for the arrival of implicits before adding generic support for monads etc.
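The list example from the Example section of the description can be sketched with the operators in the form that OCaml 4.08 and later accept, where each operator is a plain function and there is no separate `single` component. This is a reconstruction under that assumption, not the code from the original description. The helper `product` and the sample lists are illustrative.

```ocaml
(* Monoidal product on lists: every pairing of an element of [xs]
   with an element of [ys]. *)
let product xs ys =
  List.concat (List.map (fun x -> List.map (fun y -> (x, y)) ys) xs)

let ( let+ ) xs f = List.map f xs                (* map *)
let ( and+ ) = product
let ( let* ) xs f = List.concat (List.map f xs)  (* bind *)
let ( and* ) = product

(* Applicative use: the two lists do not depend on each other. *)
let mapped =
  let+ x = [1; 2; 3]
  and+ y = [10; 20] in
  x + y

(* Monadic use: the second list is computed from [x]. *)
let bound =
  let* x = [1; 2; 3] in
  let+ y = [x; x * 10] in
  y + 1
```

Here `mapped` is `[11; 21; 12; 22; 13; 23]` and `bound` is `[2; 11; 3; 21; 4; 31]`. A functor would define only `let+`, an applicative would add `and+`, and a monad would define all four.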