Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Format semantic tags as extensible sum types. #1966

Merged
merged 2 commits into from Sep 2, 2018

Conversation

Projects
None yet
4 participants
@Drup
Copy link
Contributor

commented Aug 3, 2018

This PR redefines Format "semantic tags" as an extensible sum types.

Semantics tags are annotations that are inserted into the formatting queue (either through combinators or trough the syntax "@{<foo>..@}") and allow to mark some part of the text. Formatters can then act on these tags to do arbitrary things either before or after computation of the layout.

A very common usage, including in the compiler itself, is to insert terminal sequences for colors. There are however more original use cases (Interactive UIs, S-exprs, richer terminal informations, structured HTML, ... tune in for the upcoming OCaml workshop presentation by @let-def and myself for many crazy format use cases </self promotion>).

While semantic tags allows fairly original hacks, one big issue is that they are strings. Whichever information is passed down through the tags must be encoded into strings, passed through the formatting mechanisms and recovered at the other end. This has three major problems:

  • Lack of encapsulation. semantic tags are open for anyone to interpret. If I remember @dbuenzli's remarks correctly, this is one of the major reason that led Fmt to avoid them.
  • Even without considering encapsulation, recognizing that a tag belongs to a library requires careful care.
  • Everything must be serialized to strings or use IDs coupled with additional tables.

Hence this PR! Extensible sum types offer a very nice solution to all these problems. Libraries can declare new constructors that will contain arbitrary data. By keeping the constructor hidden, they can prevent users to mess with them. Constructors also allows to easily differentiate the various family of tags.

Given all that, the implementation is rather straightforward. On top of the new extensible sum type stag, I introduce the String_tag constructor which represent string tags. The format syntax still output string tags, and cannot be used to create arbitrary tags. I do not believe this is a problem in practice (especially given the various use cases presented above).

Compatibility

Currently, all the Format "user facing" functions such as {close,open}_tag are kept, but marked as deprecated and to be replaced with the new ones. The tag type is kept unchanged. Functions expected to interpret formats (formatter_tag_functions) are all changed to use the new type stag.

There are many ways we could go about this, so I'll wait for comments.

Going further

Currently, I have not changed the internal tags used by the compiler to use the new semantic tags. I will be happy to if people deem it worthwhile.

@Drup Drup force-pushed the Drup:htags branch 2 times, most recently from 6e6055c to 0d86fd7 Aug 3, 2018

@nojb
Copy link
Contributor

left a comment

Thanks very much for this very nice change! An unqualified improvement with a very simple implementation.

For backwards-compatibility, I would leave the type formatter_tag_functions as it is (but deprecated) for the special case of string tags (ignore other tags) and introduce a new type (and associated functions) formatter_stag_functions.

Updating the compiler code base (is it just the one use in Misc?) will require a bootstrap, so personally I would leave that out of this PR.

Default tag-marking functions behave the HTML way: tags are enclosed in "<"
and ">"; hence, opening marker for tag [t] is ["<t>"] and closing marker is
["</t>"].
Default tag-marking functions behave the HTML way: string tags are enclosed

This comment has been minimized.

Copy link
@nojb

nojb Aug 5, 2018

Contributor

Small nit: string tags are only introduced after this paragraph. Maybe add a reference (see {!tag}) ?

*)

val pp_open_stag : formatter -> stag -> unit
val open_stag : stag -> unit
(** [pp_open_tag ppf t] opens the semantic tag named [t].

This comment has been minimized.

Copy link
@nojb

nojb Aug 5, 2018

Contributor

s/pp_open_tag/pp_open_stag/

val pp_close_tag : formatter -> unit -> unit
val close_tag : unit -> unit
val pp_close_stag : formatter -> unit -> unit
val close_stag : unit -> unit
(** [pp_close_tag ppf ()] closes the most recently opened semantic tag [t].

This comment has been minimized.

Copy link
@nojb

nojb Aug 5, 2018

Contributor

s/pp_close_tag/pp_close_stag/

@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 5, 2018

I forgot to mention the problem of pp_get_formatter_tag_functions in the review above which will not be easy to maintain.

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 5, 2018

I forgot to mention the problem of pp_get_formatter_tag_functions in the review above which will not be easy to maintain.

Yes, this is precisely the part that led me not to provide backward compatible versions for any of these functions. I don't see a reasonable way to maintain that, and honestly, I'm not sure it's worth it.

@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 5, 2018

Would it be too heretic to do:

let pp_get_formatter_tag_functions fmt () =
  let funs = pp_get_formatter_stag_functions fmt () in
  let mark_open_tag s = funs.mark_open_stag (String_tag s) in
  let mark_close_tag s = funs.mark_close_stag (String_tag s) in
  let print_open_tag s = funs.print_open_stag (String_tag s) in
  let print_close_tag s = funs.print_close_stag (String_tag s) in
  {mark_open_tag; mark_close_tag; print_open_tag; print_close_tag}

?

@Drup Drup force-pushed the Drup:htags branch from 0d86fd7 to f980cb6 Aug 6, 2018

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 6, 2018

@nojb Yes, but it is a good idea, so I implemented the heretical version! Now all the previously existing functions are still available, but all the one that mentioned tag in their type are deprecated.

The unfortunate consequence is that half the functions have stag in their name and the other half have tag.....

@Drup Drup force-pushed the Drup:htags branch from f980cb6 to ce4b1ea Aug 6, 2018


val pp_set_formatter_tag_functions :
formatter -> formatter_tag_functions -> unit
[@@ocaml.deprecated "Use Format.pp_set_formatter_stag_functions."]

This comment has been minimized.

Copy link
@Octachron

Octachron Aug 6, 2018

Contributor

Should the deprecation comment warn users that calling pp_set_formatter_tag_functions on stag-enabled formatters results in a loss of all non-String_tag functions (since there is an alternative implementation that captures the old tag functions and only call the new function on String_tags ) ?

This comment has been minimized.

Copy link
@Drup

Drup Aug 6, 2018

Author Contributor

Done.

@Drup Drup force-pushed the Drup:htags branch from ce4b1ea to f9dc742 Aug 6, 2018

@nojb

nojb approved these changes Aug 7, 2018

@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 7, 2018

I approved since my concerns were all addressed. Someone else may want to chime in about the deprecation path.

@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 7, 2018

Also, as it stands, this patch is fully backwards-compatible, so * can be changed to - in Changes.

@Drup Drup force-pushed the Drup:htags branch 2 times, most recently from 62f8546 to f4bccc4 Aug 17, 2018

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 18, 2018

Anyone has further comments ? Everything is ready on my end.

@pierreweis

This comment has been minimized.

Copy link
Contributor

commented Aug 18, 2018

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 18, 2018

@pierreweis The new semantic tags are exactly as dynamic as the old ones. Furthermore, backward compatibility is completely preserved since string tags are just one particular constructor of semantic tags. There should be no loss of expressive power whatsoever and no code breakage.

@nojb

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2018

Since no objections have been raised, am planning to merge this in a few days unless someone speaks up.

@pierreweis

This comment has been minimized.

Copy link
Contributor

commented Aug 24, 2018

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Aug 24, 2018

@pierreweis As I said before, backward compatibility is completely preserved. String tags are still accessible, in and out of format strings. Old tags are completely preserved, as a special case of the new fancier tags.

As for format strings, the new non-string semantic tags are simply not directly accessible with them. Many of the new Format uses blend a combinator style with the old format-string style (using the trusty %t and %a). Not having a concrete syntax for semantic tags is not an issue for practical uses, especially given it comes at no costs whatsoever to the current uses of tags. String tags are still accessible using the usual @{<foo>..@} syntax.

If you still have concerns, I would suggest to read the patch. It's very short, and it would address all your comments.

[@@ocaml.deprecated
"This function will erase non-string tag formatting functions. \
Use Format.pp_set_formatter_stag_functions."]
[@@warning "-3"]

This comment has been minimized.

Copy link
@nojb

nojb Aug 31, 2018

Contributor

Why is this warning annotation needed?

This comment has been minimized.

Copy link
@Drup

Drup Aug 31, 2018

Author Contributor

The type formatter_tag_functions is deprecated. It would raise a warning 3.

@nojb

This comment has been minimized.

Copy link
Contributor

commented Sep 2, 2018

@Drup Could you rebase please? We can merge afterwards.

@Drup Drup force-pushed the Drup:htags branch from f4bccc4 to b271ef7 Sep 2, 2018

@Drup

This comment has been minimized.

Copy link
Contributor Author

commented Sep 2, 2018

@nojb Done

@nojb nojb merged commit eb7982b into ocaml:trunk Sep 2, 2018

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@Drup Drup deleted the Drup:htags branch Feb 9, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.