New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MPR#7500: Remove Uchar.dump #1081

Merged
merged 1 commit into from Mar 7, 2017

Conversation

Projects
None yet
7 participants
@dbuenzli
Contributor

dbuenzli commented Mar 5, 2017

and make the Uchar independent from the Format and Printf
modules. Previously this made it impossible to use the type in the
natural habitat that the String, Bytes and Buffer modules could be.

https://caml.inria.fr/mantis/view.php?id=7500

@dra27

IANDG, but a breaking change at this stage for 4.05 seems unlikely. However, perhaps Format.print_dump_uchar and Format.pp_print_dump_uchar with a deprecation of Uchar.dump might be OK for 4.05? Further changes to the stdlib which rely on this change must obviously be at least 4.06.

Show outdated Hide outdated stdlib/format.mli
{{:http://www.unicode.org/versions/latest/appA.pdf}notational
convention for code points}.
@since 4.05 *)

This comment has been minimized.

@dra27

dra27 Mar 5, 2017

Contributor

I'd be very surprised if @damiendoligez will permit a breaking change this close to the 4.05 release - this'd be 4.06 (and below)

@dra27

dra27 Mar 5, 2017

Contributor

I'd be very surprised if @damiendoligez will permit a breaking change this close to the 4.05 release - this'd be 4.06 (and below)

This comment has been minimized.

@dbuenzli

dbuenzli Mar 5, 2017

Contributor

This is not a breaking change. However if people don't want this in 4.05 I can remove it.

@dbuenzli

dbuenzli Mar 5, 2017

Contributor

This is not a breaking change. However if people don't want this in 4.05 I can remove it.

This comment has been minimized.

@dra27

dra27 Mar 5, 2017

Contributor

I did mean the entire GPR (which is breaking) - I had the deprecation idea seconds later!

@dra27

dra27 Mar 5, 2017

Contributor

I did mean the entire GPR (which is breaking) - I had the deprecation idea seconds later!

Show outdated Hide outdated stdlib/format.mli
@@ -520,6 +528,9 @@ val pp_print_as : formatter -> int -> string -> unit
val pp_print_int : formatter -> int -> unit
val pp_print_float : formatter -> float -> unit
val pp_print_char : formatter -> char -> unit
val pp_print_dump_uchar : formatter -> Uchar.t -> unit
(** @since 4.05 see {!print_dump_uchar}. *)

This comment has been minimized.

@dra27

dra27 Mar 5, 2017

Contributor

Same as previous

@dra27

dra27 Mar 5, 2017

Contributor

Same as previous

@dbuenzli

This comment has been minimized.

Show comment
Hide comment
@dbuenzli

dbuenzli Mar 5, 2017

Contributor

IANDG, but a breaking change at this stage for 4.05 seems unlikely

A review of the all the public uses of Uchar.dump on github indicates that only my libraries are affected. As @gasche mentioned on mantis I think the sooner we get rid of it, the better.

Contributor

dbuenzli commented Mar 5, 2017

IANDG, but a breaking change at this stage for 4.05 seems unlikely

A review of the all the public uses of Uchar.dump on github indicates that only my libraries are affected. As @gasche mentioned on mantis I think the sooner we get rid of it, the better.

@xavierleroy

This comment has been minimized.

Show comment
Hide comment
@xavierleroy

xavierleroy Mar 5, 2017

Contributor

With all due respect I believe Uchar.dump was a bad API and Format.pp_print_dump_uchar is nearly as bad. If I understand correctly the need is to produce a printable representation of uchars for debugging and other purposes. In which case the function should take a Uchar.t and return a string. Call it Uchar.to_string or Uchar.escaped for example. The actual printing is left to the user.

Why tie this production of a printable representation with the printing on a formatter? A lot of OCaml users never use the Format module...

Contributor

xavierleroy commented Mar 5, 2017

With all due respect I believe Uchar.dump was a bad API and Format.pp_print_dump_uchar is nearly as bad. If I understand correctly the need is to produce a printable representation of uchars for debugging and other purposes. In which case the function should take a Uchar.t and return a string. Call it Uchar.to_string or Uchar.escaped for example. The actual printing is left to the user.

Why tie this production of a printable representation with the printing on a formatter? A lot of OCaml users never use the Format module...

@dbuenzli

This comment has been minimized.

Show comment
Hide comment
@dbuenzli

dbuenzli Mar 5, 2017

Contributor

Call it Uchar.to_string or Uchar.escaped for example.

Neither of these names would make sense in my opinion, the first one would be ambiguous in the context of Unicode characters and the second one would be inconsistent with what String.escape does.

A lot of OCaml users never use the Format module...

#install_printer does.

Contributor

dbuenzli commented Mar 5, 2017

Call it Uchar.to_string or Uchar.escaped for example.

Neither of these names would make sense in my opinion, the first one would be ambiguous in the context of Unicode characters and the second one would be inconsistent with what String.escape does.

A lot of OCaml users never use the Format module...

#install_printer does.

@dbuenzli

This comment has been minimized.

Show comment
Hide comment
@dbuenzli

dbuenzli Mar 5, 2017

Contributor

Just to make things clear I'm all for having a better name than pp_print_dump_uchar, but this better name has to be unambiguous.

For all it's badness one virtue of it is that it is sufficiently curious that a reader won't take it for what it's not (e.g. things like pp_print_ascii_uchar or _uchar_to_ascii would be ambiguous). The actual printing follows Unicode notational convention, but this notation doesn't have a particular name.

Contributor

dbuenzli commented Mar 5, 2017

Just to make things clear I'm all for having a better name than pp_print_dump_uchar, but this better name has to be unambiguous.

For all it's badness one virtue of it is that it is sufficiently curious that a reader won't take it for what it's not (e.g. things like pp_print_ascii_uchar or _uchar_to_ascii would be ambiguous). The actual printing follows Unicode notational convention, but this notation doesn't have a particular name.

@gasche

This comment has been minimized.

Show comment
Hide comment
@gasche

gasche Mar 5, 2017

Member

I think it would make sense to separate the change in two:

  • removing UChar.dump, which I would be in favor of doing as soon as possible. If we decide to do the change in 4.05, I would like to merge the change in the next few days.
  • deciding what is the best interface for this feature (and in which module to go), which may take a bit more time to converge (if it's ready for 4.05, fine, but we can also have it in trunk)

This split would mean that it would be possible to be release with UChar.dump removed but without an equivalent function somewhere. I think that the feature is easy enough for the user to reimplement for this not to be so much of an issue.

Member

gasche commented Mar 5, 2017

I think it would make sense to separate the change in two:

  • removing UChar.dump, which I would be in favor of doing as soon as possible. If we decide to do the change in 4.05, I would like to merge the change in the next few days.
  • deciding what is the best interface for this feature (and in which module to go), which may take a bit more time to converge (if it's ready for 4.05, fine, but we can also have it in trunk)

This split would mean that it would be possible to be release with UChar.dump removed but without an equivalent function somewhere. I think that the feature is easy enough for the user to reimplement for this not to be so much of an issue.

@dbuenzli

This comment has been minimized.

Show comment
Hide comment
@dbuenzli

dbuenzli Mar 5, 2017

Contributor

Ok, I'll wait for @damiendoligez's decision about either removal or deprecation before proceeding as you suggest.

I'll just mention that:

deciding what is the best interface for this feature (and in which module to go), which may take a bit more time to converge (if it's ready for 4.05, fine, but we can also have it in trunk)

This is not really about an interface. It's just a name for a pretty printer that can be used either for printing your own datastructures or for quickly turning:

# Uchar.max;;
- : Uchar.t = <abstr>

into

# Uchar.max;;
- : Uchar.t = U+10FFFF

via an #install_printer. (If you want to know why I didn't actually set it up by default in the toplevel the reason is that if I'm not mistaken, currently, by default, when not <abstr> right hand sides of equations are always valid OCaml values, something that is not the case with this pretty-printer).

I think that the feature is easy enough for the user to reimplement for this not to be so much of an issue.

Anyone seriously using Uchar will need at some point (and if you want a string you have Format.asprintf) I thought that the time when the OCaml stdlib wasn't supporting its own mechanisms were gone.

Contributor

dbuenzli commented Mar 5, 2017

Ok, I'll wait for @damiendoligez's decision about either removal or deprecation before proceeding as you suggest.

I'll just mention that:

deciding what is the best interface for this feature (and in which module to go), which may take a bit more time to converge (if it's ready for 4.05, fine, but we can also have it in trunk)

This is not really about an interface. It's just a name for a pretty printer that can be used either for printing your own datastructures or for quickly turning:

# Uchar.max;;
- : Uchar.t = <abstr>

into

# Uchar.max;;
- : Uchar.t = U+10FFFF

via an #install_printer. (If you want to know why I didn't actually set it up by default in the toplevel the reason is that if I'm not mistaken, currently, by default, when not <abstr> right hand sides of equations are always valid OCaml values, something that is not the case with this pretty-printer).

I think that the feature is easy enough for the user to reimplement for this not to be so much of an issue.

Anyone seriously using Uchar will need at some point (and if you want a string you have Format.asprintf) I thought that the time when the OCaml stdlib wasn't supporting its own mechanisms were gone.

@xavierleroy

This comment has been minimized.

Show comment
Hide comment
@xavierleroy

xavierleroy Mar 5, 2017

Contributor

I stand by my claim that the functionality should be exposed as a function in Uchar with type t -> string. Name it however you want. A dependency-free implementation is easy, just declare "caml_format_int" as an external.

Contributor

xavierleroy commented Mar 5, 2017

I stand by my claim that the functionality should be exposed as a function in Uchar with type t -> string. Name it however you want. A dependency-free implementation is easy, just declare "caml_format_int" as an external.

@gasche

This comment has been minimized.

Show comment
Hide comment
@gasche

gasche Mar 5, 2017

Member

I see Daniel's point that it is convenient to have pp_print_* functions in the Format module for various datatypes of interest, but looking at the list of current such functions it seems that we have pp_print_* functions for built-in types only, not for Int32 or Int64 for example. One could argue for extending this to stdlib-defined types in general, but I'm not sure that this is the best design -- in particular, this makes Format depend on everything on earth, which may cause further dependency issues down the road.

Member

gasche commented Mar 5, 2017

I see Daniel's point that it is convenient to have pp_print_* functions in the Format module for various datatypes of interest, but looking at the list of current such functions it seems that we have pp_print_* functions for built-in types only, not for Int32 or Int64 for example. One could argue for extending this to stdlib-defined types in general, but I'm not sure that this is the best design -- in particular, this makes Format depend on everything on earth, which may cause further dependency issues down the road.

MPR#7500: remove Uchar.dump
and make the Uchar independent from the Format and Printf
modules. Previously this made it impossible to use the type in the
natural habitat that the String, Bytes and Buffer modules could be.

@dbuenzli dbuenzli changed the title from MPR#7500: move Uchar.dump to Format.[pp_]print_dump_uchar to MPR#7500: Remove Uchar.dump Mar 5, 2017

@dbuenzli dbuenzli referenced this pull request Mar 5, 2017

Closed

Add Fmt.Dump.uchar #21

@dbuenzli

This comment has been minimized.

Show comment
Hide comment
@dbuenzli

dbuenzli Mar 5, 2017

Contributor

So I amended the commit to simply remove Uchar.dump.

Regarding:

I stand by my claim that the functionality should be exposed as a function in Uchar with type t -> string. Name it however you want. A dependency-free implementation is easy, just declare "caml_format_int" as an external.

The module already declares caml_format_int. I'll let someone else do that and find the good name as 1) I have no interest in such a function 2) find it inferior to a formatter

One could argue for extending this to stdlib-defined types in general, but I'm not sure that this is the best design -- in particular, this makes Format depend on everything on earth, which may cause further dependency issues down the road.

If one wants to improve the usability of the toplevel and the stdlib this should be given a serious thought at some point. Type directed usage of formatters in the wider ecosystem is an established idiom (libraries that provide toplevel and printf debugging support on their values, for usage in testing frameworks to print offending values, for usage in logging frameworks, or, tupled with corresponding parsers, to support textual interaction with humans).

Contributor

dbuenzli commented Mar 5, 2017

So I amended the commit to simply remove Uchar.dump.

Regarding:

I stand by my claim that the functionality should be exposed as a function in Uchar with type t -> string. Name it however you want. A dependency-free implementation is easy, just declare "caml_format_int" as an external.

The module already declares caml_format_int. I'll let someone else do that and find the good name as 1) I have no interest in such a function 2) find it inferior to a formatter

One could argue for extending this to stdlib-defined types in general, but I'm not sure that this is the best design -- in particular, this makes Format depend on everything on earth, which may cause further dependency issues down the road.

If one wants to improve the usability of the toplevel and the stdlib this should be given a serious thought at some point. Type directed usage of formatters in the wider ecosystem is an established idiom (libraries that provide toplevel and printf debugging support on their values, for usage in testing frameworks to print offending values, for usage in logging frameworks, or, tupled with corresponding parsers, to support textual interaction with humans).

@alainfrisch

This comment has been minimized.

Show comment
Hide comment
@alainfrisch

alainfrisch Mar 6, 2017

Contributor

Since this PR is now only about removing Uchar.dump, let's focus on this part (and not a future replacement). I support this change, even for inclusion in 4.05.

Contributor

alainfrisch commented Mar 6, 2017

Since this PR is now only about removing Uchar.dump, let's focus on this part (and not a future replacement). I support this change, even for inclusion in 4.05.

@alainfrisch

This comment has been minimized.

Show comment
Hide comment
@alainfrisch

alainfrisch Mar 6, 2017

Contributor

I think we need @damiendoligez input before merging the PR in 4.05.

Contributor

alainfrisch commented Mar 6, 2017

I think we need @damiendoligez input before merging the PR in 4.05.

@damiendoligez

This comment has been minimized.

Show comment
Hide comment
@damiendoligez

damiendoligez Mar 7, 2017

Member

I'm OK with merging this in 4.05 and I agree with Xavier: the basic functionality is a Uchar.t -> string function on top of which the formatter functions should be built, not the other way around.

Member

damiendoligez commented Mar 7, 2017

I'm OK with merging this in 4.05 and I agree with Xavier: the basic functionality is a Uchar.t -> string function on top of which the formatter functions should be built, not the other way around.

@damiendoligez damiendoligez merged commit 991ffbe into ocaml:trunk Mar 7, 2017

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

damiendoligez added a commit that referenced this pull request Mar 7, 2017

MPR#7500: remove Uchar.dump (#1081)
and make the Uchar independent from the Format and Printf
modules. Previously this made it impossible to use the type in the
natural habitat that the String, Bytes and Buffer modules could be.

@dbuenzli dbuenzli deleted the dbuenzli:remove-uchar-dump branch Mar 7, 2017

@damiendoligez

This comment has been minimized.

Show comment
Hide comment
@damiendoligez

damiendoligez Mar 7, 2017

Member

Cherry-picked to 4.05 (commit 07e4594).

Member

damiendoligez commented Mar 7, 2017

Cherry-picked to 4.05 (commit 07e4594).

camlspotter pushed a commit to camlspotter/ocaml that referenced this pull request Oct 17, 2017

MPR#7500: remove Uchar.dump (#1081)
and make the Uchar independent from the Format and Printf
modules. Previously this made it impossible to use the type in the
natural habitat that the String, Bytes and Buffer modules could be.
@ELLIOTTCABLE

This comment has been minimized.

Show comment
Hide comment
@ELLIOTTCABLE

ELLIOTTCABLE Jun 22, 2018

Does this mean there's no functionality in 4.05 / 4.06 to produce a debugging-representation of a Uchar.t?

ELLIOTTCABLE commented Jun 22, 2018

Does this mean there's no functionality in 4.05 / 4.06 to produce a debugging-representation of a Uchar.t?

@dra27

This comment has been minimized.

Show comment
Hide comment
@dra27

dra27 Jun 22, 2018

Contributor

@ELLIOTTCABLE - there’s still Uchar.to_int so you can still write let dump ppf u = Format.fprintf ppf "U+%04X" (Uchar.to_int u)

Contributor

dra27 commented Jun 22, 2018

@ELLIOTTCABLE - there’s still Uchar.to_int so you can still write let dump ppf u = Format.fprintf ppf "U+%04X" (Uchar.to_int u)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment