Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MPR#7500: Remove Uchar.dump #1081

Merged
merged 1 commit into from
Mar 7, 2017
Merged

Conversation

dbuenzli
Copy link
Contributor

@dbuenzli dbuenzli commented Mar 5, 2017

and make the Uchar independent from the Format and Printf
modules. Previously this made it impossible to use the type in the
natural habitat that the String, Bytes and Buffer modules could be.

https://caml.inria.fr/mantis/view.php?id=7500

Copy link
Member

@dra27 dra27 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IANDG, but a breaking change at this stage for 4.05 seems unlikely. However, perhaps Format.print_dump_uchar and Format.pp_print_dump_uchar with a deprecation of Uchar.dump might be OK for 4.05? Further changes to the stdlib which rely on this change must obviously be at least 4.06.

{{:http://www.unicode.org/versions/latest/appA.pdf}notational
convention for code points}.

@since 4.05 *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be very surprised if @damiendoligez will permit a breaking change this close to the 4.05 release - this'd be 4.06 (and below)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a breaking change. However if people don't want this in 4.05 I can remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did mean the entire GPR (which is breaking) - I had the deprecation idea seconds later!

@@ -520,6 +528,9 @@ val pp_print_as : formatter -> int -> string -> unit
val pp_print_int : formatter -> int -> unit
val pp_print_float : formatter -> float -> unit
val pp_print_char : formatter -> char -> unit
val pp_print_dump_uchar : formatter -> Uchar.t -> unit
(** @since 4.05 see {!print_dump_uchar}. *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as previous

@dbuenzli
Copy link
Contributor Author

dbuenzli commented Mar 5, 2017

IANDG, but a breaking change at this stage for 4.05 seems unlikely

A review of the all the public uses of Uchar.dump on github indicates that only my libraries are affected. As @gasche mentioned on mantis I think the sooner we get rid of it, the better.

@xavierleroy
Copy link
Contributor

With all due respect I believe Uchar.dump was a bad API and Format.pp_print_dump_uchar is nearly as bad. If I understand correctly the need is to produce a printable representation of uchars for debugging and other purposes. In which case the function should take a Uchar.t and return a string. Call it Uchar.to_string or Uchar.escaped for example. The actual printing is left to the user.

Why tie this production of a printable representation with the printing on a formatter? A lot of OCaml users never use the Format module...

@dbuenzli
Copy link
Contributor Author

dbuenzli commented Mar 5, 2017

Call it Uchar.to_string or Uchar.escaped for example.

Neither of these names would make sense in my opinion, the first one would be ambiguous in the context of Unicode characters and the second one would be inconsistent with what String.escape does.

A lot of OCaml users never use the Format module...

#install_printer does.

@dbuenzli
Copy link
Contributor Author

dbuenzli commented Mar 5, 2017

Just to make things clear I'm all for having a better name than pp_print_dump_uchar, but this better name has to be unambiguous.

For all it's badness one virtue of it is that it is sufficiently curious that a reader won't take it for what it's not (e.g. things like pp_print_ascii_uchar or _uchar_to_ascii would be ambiguous). The actual printing follows Unicode notational convention, but this notation doesn't have a particular name.

@gasche
Copy link
Member

gasche commented Mar 5, 2017

I think it would make sense to separate the change in two:

  • removing UChar.dump, which I would be in favor of doing as soon as possible. If we decide to do the change in 4.05, I would like to merge the change in the next few days.
  • deciding what is the best interface for this feature (and in which module to go), which may take a bit more time to converge (if it's ready for 4.05, fine, but we can also have it in trunk)

This split would mean that it would be possible to be release with UChar.dump removed but without an equivalent function somewhere. I think that the feature is easy enough for the user to reimplement for this not to be so much of an issue.

@dbuenzli
Copy link
Contributor Author

dbuenzli commented Mar 5, 2017

Ok, I'll wait for @damiendoligez's decision about either removal or deprecation before proceeding as you suggest.

I'll just mention that:

deciding what is the best interface for this feature (and in which module to go), which may take a bit more time to converge (if it's ready for 4.05, fine, but we can also have it in trunk)

This is not really about an interface. It's just a name for a pretty printer that can be used either for printing your own datastructures or for quickly turning:

# Uchar.max;;
- : Uchar.t = <abstr>

into

# Uchar.max;;
- : Uchar.t = U+10FFFF

via an #install_printer. (If you want to know why I didn't actually set it up by default in the toplevel the reason is that if I'm not mistaken, currently, by default, when not <abstr> right hand sides of equations are always valid OCaml values, something that is not the case with this pretty-printer).

I think that the feature is easy enough for the user to reimplement for this not to be so much of an issue.

Anyone seriously using Uchar will need at some point (and if you want a string you have Format.asprintf) I thought that the time when the OCaml stdlib wasn't supporting its own mechanisms were gone.

@xavierleroy
Copy link
Contributor

I stand by my claim that the functionality should be exposed as a function in Uchar with type t -> string. Name it however you want. A dependency-free implementation is easy, just declare "caml_format_int" as an external.

@gasche
Copy link
Member

gasche commented Mar 5, 2017

I see Daniel's point that it is convenient to have pp_print_* functions in the Format module for various datatypes of interest, but looking at the list of current such functions it seems that we have pp_print_* functions for built-in types only, not for Int32 or Int64 for example. One could argue for extending this to stdlib-defined types in general, but I'm not sure that this is the best design -- in particular, this makes Format depend on everything on earth, which may cause further dependency issues down the road.

and make the Uchar independent from the Format and Printf
modules. Previously this made it impossible to use the type in the
natural habitat that the String, Bytes and Buffer modules could be.
@dbuenzli dbuenzli changed the title MPR#7500: move Uchar.dump to Format.[pp_]print_dump_uchar MPR#7500: Remove Uchar.dump Mar 5, 2017
@dbuenzli
Copy link
Contributor Author

dbuenzli commented Mar 5, 2017

So I amended the commit to simply remove Uchar.dump.

Regarding:

I stand by my claim that the functionality should be exposed as a function in Uchar with type t -> string. Name it however you want. A dependency-free implementation is easy, just declare "caml_format_int" as an external.

The module already declares caml_format_int. I'll let someone else do that and find the good name as 1) I have no interest in such a function 2) find it inferior to a formatter

One could argue for extending this to stdlib-defined types in general, but I'm not sure that this is the best design -- in particular, this makes Format depend on everything on earth, which may cause further dependency issues down the road.

If one wants to improve the usability of the toplevel and the stdlib this should be given a serious thought at some point. Type directed usage of formatters in the wider ecosystem is an established idiom (libraries that provide toplevel and printf debugging support on their values, for usage in testing frameworks to print offending values, for usage in logging frameworks, or, tupled with corresponding parsers, to support textual interaction with humans).

@alainfrisch
Copy link
Contributor

Since this PR is now only about removing Uchar.dump, let's focus on this part (and not a future replacement). I support this change, even for inclusion in 4.05.

@alainfrisch
Copy link
Contributor

I think we need @damiendoligez input before merging the PR in 4.05.

@damiendoligez
Copy link
Member

I'm OK with merging this in 4.05 and I agree with Xavier: the basic functionality is a Uchar.t -> string function on top of which the formatter functions should be built, not the other way around.

@damiendoligez damiendoligez merged commit 991ffbe into ocaml:trunk Mar 7, 2017
damiendoligez pushed a commit that referenced this pull request Mar 7, 2017
and make the Uchar independent from the Format and Printf
modules. Previously this made it impossible to use the type in the
natural habitat that the String, Bytes and Buffer modules could be.
@dbuenzli dbuenzli deleted the remove-uchar-dump branch March 7, 2017 12:48
@damiendoligez
Copy link
Member

Cherry-picked to 4.05 (commit 07e4594).

camlspotter pushed a commit to camlspotter/ocaml that referenced this pull request Oct 17, 2017
and make the Uchar independent from the Format and Printf
modules. Previously this made it impossible to use the type in the
natural habitat that the String, Bytes and Buffer modules could be.
@ELLIOTTCABLE
Copy link
Contributor

Does this mean there's no functionality in 4.05 / 4.06 to produce a debugging-representation of a Uchar.t?

@dra27
Copy link
Member

dra27 commented Jun 22, 2018

@ELLIOTTCABLE - there’s still Uchar.to_int so you can still write let dump ppf u = Format.fprintf ppf "U+%04X" (Uchar.to_int u)

@vicuna vicuna mentioned this pull request Mar 14, 2019
stedolan added a commit to stedolan/ocaml that referenced this pull request Mar 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants