Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Format.dprintf, the delayed printing function #1959

Merged
merged 5 commits into from Aug 3, 2018

Conversation

Projects
None yet
7 participants
@Drup
Copy link
Contributor

commented Aug 2, 2018

In #1952, @Armael uses the type Format.formatter -> unit as the vehicle for "messages with delayed formatting decisions". Delaying the formatting decisions mean they can be taken when all the information regarding the final error message, the context and the output are known, instead of being translated into string now, resulting in bad formatting. Format.formatter -> unit also corresponds to the "%t" modifier in format strings.

Unfortunately, it means introducing a function

val errorf: ?loc:t -> ?sub:msg list -> (formatter -> unit) -> error

with usages that look like

Location.errorf ~loc:(Location.in_file filename) (fun ppf ->
        Format.fprintf ppf "Cannot open file %s@." (Printexc.to_string e))

This is not very nice. This PR introduces dprintf which works like a usual printf function, except it returns a closure of type formatter -> unit containing the printing actions.

Using kdprintf (the version that takes a continuation), Location.errorf can be redefined to be of type ?loc:t -> ?sub:msg list -> ('a, formatter, unit, error) format4 -> 'a, which is more in line with the usual formatting functions and nicer to use.

I'm not particularly convinced neither by the current name nor the documentation, so feel free to propose improvements.

Implementation

The first two commits are somewhat independent, but necessary for being able to express dprintf. All printf-like functions are defined in term of CamlinternalFormat.make_printf which used to be of type:

val make_printf :
  ('b -> ('b, 'c) acc -> 'd) -> 'b -> ('b, 'c) acc ->
  ('a, 'b, 'c, 'c, 'c, 'd) CamlinternalFormatBasics.fmt -> 'a

The 'b parameter corresponds (in our case) to Format.formatter. This means that you must have a formatter at hand to do anything with a format string. However, this formatter is not actually used, it's only passed around everywhere until the continuation is called. The two first commits simply remove that argument everywhere and modify the continuations in Format and Printf to be closure that capture the required formatter.

As a side note, adding functions to Format's api shift the ids, which makes two test fails for no real reason.

@Drup Drup force-pushed the Drup:formatfun branch 2 times, most recently from 1344789 to e5ce068 Aug 2, 2018

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 2, 2018

@dbuenzli I just realized that this might be useful to Logs, to simplify your API similarly (but this time for performance reasons).

@alainfrisch alainfrisch added the stdlib label Aug 2, 2018

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Aug 2, 2018

Could the change impact performance of existing code?

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 2, 2018

@alainfrisch I'm not sure why. make_printf seems too big to be unrolled/specialized and this simply removes an argument. kprintf's continuation was already capturing, and ksprintf's continuation doesn't capture more than it used too.

Do you have benchmarks that I can try? IIRC, you made some for the GADT version.

@alainfrisch

This comment has been minimized.

Copy link
Contributor

commented Aug 2, 2018

I don't remember doing any benchmark for the GADT version (what I did was to measure some gains when changing stypes.ml to avoid using too many format strings, and perhaps the GADT version, which came later, would have given a close-enough performance than the direct version).

I'd just do a micro-benchmark, just with simple calls to Printf.sprintf in a loop.

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 2, 2018

I have reviewed the code and believe it is correct.

I don't think there will be any performance difference with sprintf, but there might be one with i{,k}fprintf because there is an extra allocation when it is invoked, if I read the code correctly. As you micro-benchmark, it would be important to also benchmark ifprintf -- which is even more performance-sensitive than actual printing functions.

@yallop

This comment has been minimized.

Copy link
Member

commented Aug 2, 2018

Do you have benchmarks that I can try? IIRC, you made some for the GADT version.

@alainfrisch had a post comparing the performance of the pre-GADT format functions to low-level code, which may be what you had in mind:

http://web.archive.org/web/20160327103236/http://www.lexifi.com/blog/note-about-performance-printf-and-format

@gasche and @bvaugon also compared the GADT-based format implementation with the previous version:

https://ocaml.org/meetings/ocaml/2013/slides/vaugon.pdf

And I had some benchmarks for the later optimised version of ifprintf:

mirage/ocaml-git#130 (comment)

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 2, 2018

Following @alainfrisch post, I measured a loop that looked like this:

let () =
  for i = 1 to 10000000 do
    Format.fprintf Format.std_formatter "(%d,%d)@." i i
  done

And I ran that while piping the output to a local file.

Implementation Time Allocations
fprintf 13.890903s 10400470624
Modified fprintf 13.386733s 10240490632
ifprintf 0.233529s 1280060720
Modified ifprintf 0.257059s 1680075784
asprintf 8.439024s 23360885824
Modified asprintf 8.592807s 23200490384

The 'i' version is indeed a bit slower (or at least, it allocates more). I could actually drop that commit, since it's not necessary at all to implement dprintf, I just did it for consistency. Unless someone has an idea to make it faster of course...

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 2, 2018

For the sake of it, I measured dprintf and a version of may_printf based on kdprintf.

  for i = 1 to 10000000 do
    let t = Format.dprintf "(%d,%d)@." i i in
    t Format.std_formatter ;
  done

and

let may_fprintf b ppf fmt =
  Format.kdprintf (fun f -> if b then f ppf else ()) fmt
Implementation Time Allocations
dprintf 13.407084s 10000373192
may_printf false 2.459119s 4240138320
may_printf true 13.538509s 10400375576

So dprintf will not help for Logs, but it will not incur any additional costs when used for @Armael's patches.

@pmetzger

This comment has been minimized.

Copy link
Member

commented Aug 2, 2018

Just a side note: there's a C routine called "dprintf" on several OSes already that does something quite different. That said, it may be that this is the best name anyway, I'm not sure.

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 2, 2018

Thanks for the benchmarks! I would be more comfortable with the PR without the ifprintf performance regression.

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 2, 2018

If I define a simplified ifprintf like that:

let ifprintf _ppf (Format (fmt, _)) = make_iprintf ignore fmt

With 10 times more loops, I get the following perfs:

Implementation Time Allocations
ikfprintf 2.357069s 8000312736
Modified ikfprintf 2.314199s 12000466624
ifprintf 2.417592s 12800544152
Modified ifprintf 2.267771s 8000314024

The times are pretty volatile, but it seems like:

  1. There are more allocations with the modified ikfprintf, but it doesn't impact timing significantly
  2. The simplified ifprintf significantly outperforms the old one.

I haven't tested with flambda, but I'm pretty sure as soon as ikfprintf is specialized on the continuation, we get the old behavior back.

I'll probably keep the argument on make_iprintf, but make the type more liberal to accept any kind of state and use the simplified definition of ifprintf.

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 2, 2018

Your simplification means that whichever is preferable depends on whether people use ifprintf or ikfprintf more in practice. I'm not sure (would @yallop or @dbuenzli be able to say?), but I would guess ifprintf, which is good for you.

There are more allocations [...], but it doesn't impact timing significantly

Well, it's always that way with micro-benchmarks, you can't see a cost for allocations as long as they are short-lived. But the point of ifprintf is to reassure its users that it does not affect the performance profile of their codebase at all, and in this context modifications to the allocation profile can be an issue (because it changes when collections happens, etc.). (The argument would be stronger to decide between two versions, one which does not allocate at all, but even here a 50% difference is not neglegible.)

@dbuenzli

This comment has been minimized.

Copy link
Contributor

commented Aug 2, 2018

@dbuenzli can't say since in fact logs uses a technique (see here for the high-level pattern) suggested by @yallop which eschews the need for ifprintf or ifkprintf. So neither of these functions are used by the logs library or its users.

@Drup Drup force-pushed the Drup:formatfun branch 2 times, most recently from 31508af to d41a2bb Aug 2, 2018

Drup added some commits Aug 2, 2018

Remove the state argument in CamlinternalFormat.make_printf.
This state argument was passed around, but never used, except
by the continuation.
@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 2, 2018

@gasche Final version, implemented in ffc28e2. It doesn't remove the argument of make_iprintf, but it relaxes its type, which allow the simplification I presented before. The results are below. This should satisfy everyone. I added a change entry for this simplification as well.

Implementation Time Allocations
Modified ikfprintf 2.219763s 8000314024
Modified ifprintf 2.186827s 8000314024

P.S.: No, the identical numbers are not a copy paste error. :]

@Drup Drup force-pushed the Drup:formatfun branch 2 times, most recently from 04e10e0 to e162d28 Aug 2, 2018

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 3, 2018

I'm very satisfied with your i-related results. I think the only thing that was up in the air for discussion was the naming; @pmetzger notes a collision with C's dprintf where d means "(file) descriptor". I think this is not a big deal, but maybe there are other names that people like and who don't have that? Any suggestions?

(Personally I think that "d" for "delayed" makes sense, and I can't think of a much better name right now.)

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 3, 2018

An other suggestion was tprintf, for the association with %t, but @Octachron pointed out that it was too reminiscent of tabulations.

@gasche

This comment has been minimized.

Copy link
Member

commented Aug 3, 2018

Let me demonstrate my lack of inspiration by mentioning that fprintf with the f argument coming last could be named printff.

@pmetzger

This comment has been minimized.

Copy link
Member

commented Aug 3, 2018

@gasche Truly "print" should be the name for "printf", and "printf" should be the name for "fprintf", but we're kind of stuck with history.

@gasche

gasche approved these changes Aug 3, 2018

Copy link
Member

left a comment

I'm satisfied with the PR. I'm not very hopeful that a better name would emerge, and I think the current one is not-perfect-but-ok. So approved.

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 3, 2018

The patchset is ready on my end.

@gasche gasche merged commit 7f51aae into ocaml:trunk Aug 3, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@pmetzger

This comment has been minimized.

Copy link
Member

commented Aug 3, 2018

One last note: the documentation on this isn't that good. In particular, your description in the PR motivates why you would want this and gives a bit more information on how to use it, but the documentation in the .mli file itself is much less descriptive. I suspect people with needs for this sort of thing will not notice it while reading through the docs. Perhaps a future PR could add a bit of an explanation and an example.

('a, formatter, unit, formatter -> unit) format4 -> 'a
(** Same as {!fprintf}, except the formatter is the last argument.
All printing actions are delayed until the formatter is provided.
In particular, [printff "..." a b c] is a

This comment has been minimized.

Copy link
@yallop

yallop Aug 3, 2018

Member

printffdprintf

@rbonichon

This comment has been minimized.

Copy link
Contributor

commented Aug 31, 2018

@Drup @gasche -- this is a point that I think has not been made during the discussion -- and I know I am very late in the discussion.

There is a drawback to using dprintf instead of the closure. In effect, dprintf is actually less delayed than what the closure does in the motivating example.

For example, if you are debugging (I am assuming this might be a use case), you may call a computationally costly function to print useful information. The closure permits to delay any call until its return value is needed to be printed. Using dprintf will evaluate the value whenever dprintf is called (and, effectively, not print it). While there is no observable difference with already evaluated values, there might be some in particular use cases --- which we may want to mention (or not since this derives from OCaml's strict evaluation) --- where the behavior may not match what the expectations. What do you think ?

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 31, 2018

@rbonichon Absolutely. In a way, that's the goal dprintf: it delays the printing action, not the evaluation of the printing arguments. As you say, it directly derives from OCaml's strict evaluation. You can also easily recover the printing delay by including the conditionals in a pp function and use %a.

So, if I see people bumping into this in practice, I'll make a proposition to clarify the documentation, but I do not think it is necessary.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.