Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "finally" function to Pervasives #1855

Merged
merged 9 commits into from Aug 8, 2018

Conversation

Projects
None yet
@mseri
Copy link
Member

commented Jun 22, 2018

This exposes Pervasives.try_finally function, already implemented in the Misc module and used in the compiler.

This function is very often used to make sure that resources are clened up on failure, and practically reimplemented over and over in many modules and standard librarires.
The following are from different sources in the ocaml ecosystem:

  1. From containers (CCFun)

    val finally : h:(unit > _) ‑> f:(unit > 'a) ‑> 'a
    (* finally h f calls f () and returns its result. If it raises, the same exception is raised;
       in any case, h () is called after f () terminates. *)
    
    val finally1 : h:(unit > _) ‑> ('a> 'b) ‑> 'a> 'b
    (* finally1 ~h f x is the same as f x, but after the computation, h () is called whether
       f x rose an exception or not.
        Since: 0.16 *)
    
    val finally2 : h:(unit > _) ‑> ('a> 'b> 'c) ‑> 'a> 'b> 'c
    (* finally2 ~h f x y is the same as f x y, but after the computation, h () is called whether
       f x y rose an exception or not.
        Since: 0.16 *)
  2. From base (Base.Exn)

    val protectx : f:('a > 'b) ‑> 'a> finally:('a > unit) ‑> 'b
    (* Executes f and afterwards executes finally, whether f throws an exception or not. *)
    
    val protect : f:(unit > 'a) ‑> finally:(unit > unit) ‑> 'a
  3. From batteries (BatPervasives)

    val finally : (unit -> unit) -> ('a -> 'b) -> 'a -> 'b
    (* finally fend f x calls f x and then fend() even if f x raised an exception. *)
    
    val with_dispose : dispose:('a -> unit) -> ('a -> 'b) -> 'a -> 'b
    (* with_dispose dispose f x invokes f on x, calling dispose x when f
       terminates (either with a return value or an exception). *)
  4. From xapi-stdext-pervasives (Pervasiveext) (the same module is also in xen-oxenstored)

    val finally : (unit -> 'a) -> (unit -> unit) -> 'a
    (* finally f g returns f () guaranteeing to run clean-up actions g ()
       even if f () throws an exception. *)
  5. From Unix System Programming in Ocaml:

    let try_finalize f x finally y =
       let res = try f x with exn -> finally y; raise exn in
       finally y;
           res

As you also see from the implementations, there is a tendency to already use finally to mention this function.
In other languages this idea seems to vary: haskell has a bracket function, python, javascript, C# and F# have finally as a keyword in the exception matching constructs, ruby uses ensure.
I think that even if it is not an optimal name, it would ease adoption. But I am happy to get any feedback on what could be the right name and implementation. See also the discussion in #640

An alternative implementation could be the one from @c-cube here: 6bfb795

@@ -51,6 +51,12 @@ exception Exit
(** The [Exit] exception is not raised by any library function. It is
provided for use in your programs. *)

val try_finally : (unit -> 'a) -> (unit -> unit) -> 'a
(** [try_finally f g] calls [f ()] and returns its result. If it raises,
the same exception is raised; in {b: any} case, [g ()] is called after

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jun 22, 2018

Contributor

What you write here does not hold. If g raises then it's not the same exception that is raised, it's the exception of g. (also no need for this colon in {b:).

This comment has been minimized.

Copy link
@mseri

mseri Jun 22, 2018

Author Member

True. I'd rather reword it than hide the exception from g, what do you think?

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jun 22, 2018

Contributor

True. I'd rather reword it than hide the exception from g, what do you think?

Yes it's not a good idea, as you could hide e.g. an Out_of_memory or asynchronous exceptions like Sys.Break.

This comment has been minimized.

Copy link
@mseri

mseri Jun 22, 2018

Author Member

Yes, that's exactly what I was thinking about

@nojb

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

I would leave the uses in the compiler unchanged, I would rather avoid a bootstrap for such a small addition.

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jun 22, 2018

I had understood that the fact that the signature of Pervasives changed was enough to require the bootstrap. If that is not the case, I am happy to leave the copy in Misc even if it is a bit redundant.

@nojb

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

No, it should not be necessary to bootstrap if you do not use the added functionality in the compiler proper.

let result = (try work () with e -> cleanup (); raise e) in
cleanup ();
result

This comment has been minimized.

Copy link
@nojb

nojb Jun 22, 2018

Contributor

Personally, I find

match work () with
| result -> cleanup (); result
| exception e -> cleanup (); raise e

easier to read.

This comment has been minimized.

Copy link
@mseri

mseri Jun 22, 2018

Author Member

I agree. I thought of keeping the changes minimal by just moving the old code from Misc, but I don't see why not update the style! I'll update the code, and remove the changes to the compiler as you suggested

@mseri mseri force-pushed the mseri:pervasive-finally branch from 8ee0bb6 to eb4e52a Jun 22, 2018

@mshinwell

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

I would like to argue for not leaving the copy in Misc. Having redundant copies of functions in the tree isn't going to further maintainability of the code, a goal I think we should all be striving for.

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jun 22, 2018

I have a backup of the branch, so it's easily done. I probably prefer to have just the implementation in Pervasives (that already has to be copied to the threads version), but I am fine to keep changes to a minimum (it could be done in two stages, cleaning up Misc once something bigger requires a bootstrap anyway)

@gasche

This comment has been minimized.

Copy link
Member

commented Jun 22, 2018

Why you are at it, when you remove Misc.foo into Pervasives.foo, I think you could just write foo as Pervasives is opened by default.

But I haven't seen the Real Bikeshedding start yet: are we sure that we don't want to label the argument(s)?

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jun 22, 2018

@gasche I had written Pervasives.try_finalize because I had noticed Pervasives explicitly mentioned in a few parts of the code and thought that that was some kind of preferred style.

I did not name the arguments because I was not really happy with their names and because in Pervasives there is no other use of named arguments. I'd rather not label them but I'll leave that open to votes, like the Misc cleanup.

@xclerc

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

But I haven't seen the Real Bikeshedding start yet: are we sure that we don't want to label the argument(s)?

Not only that... we might prefer a version where work/cleanup accept
a non-unit parameter...

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jun 22, 2018

@xclerc I'd like that discussion though. Is it more ergonomic/used as try_finalize f x finally y or as try_finally f finally? Almost all the use I've seen have been of the second type.

@bluddy

This comment has been minimized.

Copy link

commented Jun 22, 2018

I like the convention of naming the cleanup function so you could reorder, and cleanup is a nice, clear name.

@yakobowski

This comment has been minimized.

Copy link

commented Jun 22, 2018

Could we also discuss the opportunity of storing the backtrace raised by work, and using Printexc.reraise instead of raise? This version of try_finally makes debugging really painful.

@damiendoligez

This comment has been minimized.

Copy link
Member

commented Jun 22, 2018

Why do you need to bootstrap if you change Misc?

@xclerc

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

@mseri I guess might depend on your computations, but if you
consider the more-or-less canonical example of reading data
from a file, try_finally will lead to something akin to:

let first_line filename =
  let chan = open_in filename in
  try_finally
    (fun () -> input_line chan)
    (fun () -> ignore (close_in chan))

while try_finalize would avoid the closures:

let first_line filename  =
  let chan = open_in filename in
  try_finalize input_line chan close_in chan

Also, I cannot remember ever wanting the cleanup function
to take anything else than the input of the work function (possibly
just ignoring it). It has the arguably nice property that you do not
have to let-bind the value:

let first_line filename  =
  try_finalize input_line chan close_in

(To be clear, it is not the terseness that I find appealing, but the
fact that you can avoid namespace pollution.)

@bobot

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

To follow @yakobowski you could look at #374 for a discussion on which implementation for Misc.try_finally that use reraise. You could also adopt the PR if you want.

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jun 22, 2018

@yakobowski I completely agree on the use of reraise.
@xclerc indeed the third option would probably be the most common.
Thanks @bobot, I was completely unaware of that PR, I am going to read the discussion there. I wonder if that implementation is too general for the common use, on the other hand the arguments are optional so it should not make too much difference.

If we settle on a form of try_finally for Pervasives, I am happy to update that PR next. I suggest to postpone the cleanup of Misc to that stage then

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jun 22, 2018

I have read the discussion on #374. I am going to use that implementation as it is allowing for greater flexibility without too much additional burden, but I think ~always should be non optional. As far as I can see: calling it with nothing makes not sense in general, calling it with just ~exceptional but no ~always is equivalent to calling it with ~always. Am I missing something?

I am still undecided in the balance between simplicity and flexibility. I think if we use this other version then making the three functions get different parameters gets too verbose. In that case I'd stick with unit.

@c-cube

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

Seems like this would help for #640💯

@bobot

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

As far as I can see: calling it with just ~exceptional:f but no ~always is equivalent to calling it with ~always:f.

It is not the case, ~exceptional is run if and only if work or always raises. So if always is not given exceptional is run only if work raises.

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jun 22, 2018

Oh, yes, you are right! I'll update the PR

@mseri mseri force-pushed the mseri:pervasive-finally branch 2 times, most recently from 84cfa4e to 6c8a85d Jun 23, 2018

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jun 23, 2018

I had to locally re-define raw_backtrace, get_raw_backtrace, raise_with_backtrace because I could not refer to Printexc. It seemed better than trying to change the build to make it work.

?always:(unit -> unit) ->
?exceptionally:(unit -> unit) ->
(unit -> 'a) -> 'a
(** [try_finally work ~always ~exceptionally] is designed to run code

This comment has been minimized.

Copy link
@gasche

gasche Jun 23, 2018

Member

The argument order should be the same in the signature and in the documentation.

This comment has been minimized.

Copy link
@mseri

mseri Jun 23, 2018

Author Member

Oh yes, well spotted!

changes that would only make sense if the function completes
correctly).}
}
For example:

This comment has been minimized.

Copy link
@gasche

gasche Jun 23, 2018

Member

This example will get some people to think: I cannot understand it because I don't know the compiler codebase. In fact they probably could, but still it is not engaging. Could you use a realistic but simpler example, or at least using less unknown third-party functions?

This comment has been minimized.

Copy link
@mseri

mseri Jun 23, 2018

Author Member

I agree, but I had not touched it on purpose. I will add a simpler example (if needed) after we settle on an implementation

~always:(fun () -> close_out oc)
~exceptionally:(fun _exn -> remove_file objfile);
]}
If [exceptionally] fail with an exception, it is propagated as

This comment has been minimized.

Copy link
@gasche

gasche Jun 23, 2018

Member

fails with an exception

(Maybe this could be formulated as: "Exceptions raised by exceptionnaly are not caught by this function, they will be propagated to the caller as usual."

This comment has been minimized.

Copy link
@mseri

mseri Jun 23, 2018

Author Member

I like your rewording, I’ll use that

If [always] or [exceptionally] use exceptions internally for
control-flow but do not raise, then [try_finally] is careful to
preserve any exception backtrace coming from [work] or [always]
for easier debugging.

This comment has been minimized.

Copy link
@gasche

gasche Jun 23, 2018

Member
[try_finally] is careful to preserve the backtrace of any exception it re-raises from [work], 
even if `[always]` or `[exceptionally]` use exceptions internally. For more details,
see {!Printexc.raise_with_backtrace}.

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jun 23, 2018

Contributor

I'm not sure this implementation detail is worth spelling out in the documentation. That's what I expect from the OCaml system.

This comment has been minimized.

Copy link
@mseri

mseri Jun 23, 2018

Author Member

I agree that I would expect it from a Stdlib implementation, but that‘s often not what happens: in fact, I went to check, and none of the implementations in the various standard libraries mentioned above does that (I am going to send a PR to all of them in the next few days though)

This comment has been minimized.

Copy link
@gasche

gasche Jun 23, 2018

Member

raise_with_backtrace wasn't available until recently, so using it in stdlibs may require backward-compatibility hacks.

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jun 23, 2018

Contributor

True but I expect that the number of persons who would not use the function fearing that it might not be the case is border to nil. So I rather see it as documentation noise.

This comment has been minimized.

Copy link
@mseri

mseri Jun 23, 2018

Author Member

@gasche oh it was 4.05. I thought it was 4.04, in that case it would not have been a problem for some of them. I’ll check and discuss if it is worth in the various cases.

@dbuenzli Probably you are right, If we drop the exceptionally argument I think it will become even less important

This comment has been minimized.

Copy link
@gasche

gasche Jun 23, 2018

Member

Note that saving and restoring backtraces has a small-but-nonempty performance cost (you have to copy the backtrace out and move it back). So I thought that being explicit about the behavior might be useful in documentation for cost-reasoning purposes. That said, I think that people that are very careful about performance in this way would go read the implementation anyway, and others would probably not notice the difference.

{ul
{- [always], that must be run after {b any} execution
of the function (typically, freeing system resources), and}
{- [exceptionally], that should be run {b only} if [work] or [always]

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jun 23, 2018

Contributor

I'm not sold on this signature. I don't see any advantage in it, the primitive is bundling too many feature in my opinion. Somehow I find:

try try_finally work ~finally:always with 
| e -> exceptionally (); raise e

to be perfectly fine and maybe even more clear; which is the semantics of your function if I understand correctly.

This comment has been minimized.

Copy link
@mseri

mseri Jun 23, 2018

Author Member

I do agree with you. Keeping the simple version makes actually for clearer code imho, as you don’t need to carefully read the documentation to understand what is going on with exceptionally, and it also makes the implementation much simpler.

This comment has been minimized.

Copy link
@gasche

gasche Jun 23, 2018

Member

The two versions are not equivalent when backtraces are enabled, if exceptionally terminates correctly but uses exceptions for internal control flow. On the other hand, the complexity reduction may still be worth it -- it's not like the rest of the stdlib is careful with backtraces anyway, and we could think of improving backtrace handling in the compiler/semantics instead of in the standard library.

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jun 24, 2018

I have updated the implementation by dropping the ~exceptionally argument, as discussed here. I think that this version behaves much more closely to what most users would expect, and adding the exceptional handling is possibly even easier to understand by dealing with it explicitly (as @dbuenzli was suggesting):

try try_finally work ~always with 
| e -> exceptionally (); raise e

or

try try_finally work ~always with 
| e ->
  let bt = Printexc.get_raw_backtrace in
  exceptionally ();
  Printexc.raise_with_backtrace e bt

I have kept a link to Printexc.raise_with_backtrace in the docstrings but I made it a bit less prominent. Let me know if I should remove it completely but I think it is a reasonable middle ground.

Should I keep an example in the docstring, maybe something like the one in @xclerc comment?

let first_line filename =
  let chan = open_in filename in
  try_finally
    (fun () -> input_line chan)
    (fun () -> close_in chan)

It may be worth adding an example on the line of the second snippet above (without try_finally) to the Printexc.raise_with_backtrace documentation, to show an example of use (and why it can be useful there).

Concerning @xclerc comments about the signature, I am not convinced that going to the full generality of try_finalize with something like

val try_finally : work:('a -> 'b) -> 'a -> always:('c -> unit) -> 'c -> 'b

is worth. It reduces the number of closures needed to possibly none but complicates unnecessarily the function call. Regarding the other suggestion of passing the same input

val try_finally : always:('a -> unit) -> ('a -> 'b) -> 'a -> 'b

I think it would cover nicely a common use case, but I think the assumption on always is unnecessarily restrictive. I have seen many examples in which you need to pass different values to the cleanup function, that would practically require to add closure that ignores the input and deals with the rest.

In this respect I find the current version

val try_finally : always:(unit -> unit) -> (unit -> 'a) -> 'a

explicit enough and easy to understand, and I think that the fact that it usually requires to add two closures is a minor pain and does not affect the readability of the code.

caught, they will be propagated to the caller as usual.
In any other case [try_finally] will return the result of the
[work] funciton or re-raise its exception, preserving the
backtrace (For more details, see {!Printexc.raise_with_backtrace}).

This comment has been minimized.

Copy link
@dbuenzli

dbuenzli Jun 24, 2018

Contributor

I think this doc string is unnecessarily long-winded. Here would be my take on it:

try_finally ~always work invokes work () and then always () before work returns with its value or an exception. In the latter case the exception is re-raised after always (), preserving the backtrace (see {!Printexc.raise_with_backtrace}). If always () raises, this exception is not caught and may shadow one work () may have raised.

Or even shorter (I'm still convinced the discussion about the backtrace is pointless):

try_finally ~always work invokes work () and then always () before work returns with its value or an exception. In the latter case the exception is re-raised after always (). If always () raises, this exception is not caught and may shadow one work () may have raised.

This comment has been minimized.

Copy link
@gasche

gasche Jun 24, 2018

Member

I think that if we use the term "re-raise" explicitly, then it is reasonable to assume that this implicitly suggests "with the same backtrace" -- and indeed I hope that we can move in the future to a world where reraise preserves the backtrace always.

@mseri

This comment has been minimized.

Copy link
Member Author

commented Jul 24, 2018

Appveyor failed because it took too long to compile and run the tests: "Build execution time has reached the maximum allowed time for your plan (60 minutes)."

@mseri

This comment has been minimized.

Copy link
Member Author

commented Aug 6, 2018

Is there any reason not to merge this?

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 6, 2018

I realize now that @alainfrisch, who has been driving most of the standard-library issue, is not cc-ed in this PR, so maybe that's one reason why he hasn't given an opinion yet. (Or maybe people are in holidays in August or less active.)

@nojb

nojb approved these changes Aug 6, 2018

Copy link
Contributor

left a comment

Looks good to go, except for a couple of small tweaks.

external _get_raw_backtrace:
unit -> _raw_backtrace = "caml_get_exception_raw_backtrace"
external _raise_with_backtrace: exn -> _raw_backtrace -> 'a
= "%raise_with_backtrace"

This comment has been minimized.

Copy link
@nojb

nojb Aug 6, 2018

Contributor

This is nitpicking, but: I am guessing the _ is suppose to mean that these identifiers are "for internal use"; however this is already expressed by not exposing these values in the interface. Moreover, underscores in OCaml have another meaning (disabling warning 32), so I would rather remove them.

external _get_raw_backtrace:
unit -> _raw_backtrace = "caml_get_exception_raw_backtrace"
external _raise_with_backtrace: exn -> _raw_backtrace -> 'a
= "%raise_with_backtrace"

This comment has been minimized.

Copy link
@nojb

nojb Aug 6, 2018

Contributor

Idem.

If [finally ()] raises, this exception is not caught and may shadow
one [work ()] may have raised.
@since NEXT_RELEASE *)

This comment has been minimized.

Copy link
@nojb

nojb Aug 6, 2018

Contributor

Please write 4.08.0 instead of NEXT_RELEASE.

mseri added some commits Jun 22, 2018

stdlib: introduce try_finally in Pervasives
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
stdlib: update docstring wording
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
stdlib: update try_finally to newer syntax
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
stdlib: use a version of @bobot try_finally
`try_finally work ~always ~exceptionally` is designed to run code
in `work` that may fail with an exception, and has two kind of
cleanup routines:

- `always`, that must be run after **any** execution of the function
  (typically, freeing system resources), and

- `exceptionally`, that should be run **only** if `work` or `always`
  failed with an exception (typically, undoing user-visible state
  changes that would only make sense if the function completes
  correctly).

I had to locally re-define `rab_backtrace`, `get_raw_backtrace`,
`raise_with_backtrace` because I could not refer to `Printexc`.

Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
stdlib: simplify try_finally signature and implementation
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
stdlib: simplify try_finally docstring
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
Update Changes
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>
stdlib: rename try_finally to protect
As a result of the poll in #1855.
The votes at the time of commiting are:

- 18 for `protect ~finally`
- 12 for `try_finally ~finally`
- 1  for `finally ~cleanup`
- 0  for `try_finally ~always`
- 0  for `try_finally ~cleanup`

Signed-off-by: Marcello Seri <marcello.seri@gmail.com>

@mseri mseri force-pushed the mseri:pervasive-finally branch from d4c74d9 to 129458b Aug 7, 2018

stdlib: address @nojb review comments
Signed-off-by: Marcello Seri <marcello.seri@gmail.com>

@nojb nojb closed this Aug 8, 2018

@nojb nojb reopened this Aug 8, 2018

@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2018

I would be inclined to merge once the CI passes. @gasche do you agree? (BTW, Alain is on holiday, so I don't expect him to chime in soon).

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 8, 2018

@nojb: I don't have an opinion.

Personally I find it too difficult to make decisions on standard library changes because the design space is very large and so subjective. When I don't see objective reasons to prefer one choice to another, I prefer to not make a choice at all. I'm glad that @alainfrisch (and you, apparently) volunteered to take on the burden of making those decisions, and I'll let you do just that.

(I think that the Option/Result modules that went in today as part of this Standard Library Festival Week are just fine, and I'm glad that you merged them. Thanks!)

@nojb nojb merged commit 4a2b27a into ocaml:trunk Aug 8, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 8, 2018

Merged; unfortunately I didn't notice that the history was quite a mess; it would have been much better to squash before.

@mseri

This comment has been minimized.

Copy link
Member Author

commented Aug 8, 2018

Sorry, I was waiting for all the comments before squashing them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.