Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start transition from tinyformat to fmt/std::format -- multi-stage #2076

Merged
merged 1 commit into from Nov 27, 2018

Conversation

lgritz
Copy link
Collaborator

@lgritz lgritz commented Nov 24, 2018

For many years we have used Chris Foster's magnificent "tinyformat"
package to do a lot of the heavy lifting for our type-safe string
formatting with printf-like notation, which we use for Strutil::format,
nearly every error reporting facility in the major classes, etc. Thanks,
Chris!

The C++ committee has finally caught up to Chris, advancing drafts of a
"string formatting" proposal that now seems very likely to be part of
C++20. The best implementation at this point is the {fmt} package.

Unfortunately, {fmt} and the C++ draft papers differ from tinyformat in
that rather than traditional C printf notation for formatting, they use
a notation very similar to Python string formatting, e.g.,

format ("%d %.3f", 1, 23.4)  -->  format ("{} {.3f}", 1, 23.4)

This compatibility break is one reason I've stayed away from {fmt},
despite that (no slight to Chris) it's at least twice twice the speed of
tinyformat.

But knowing that this is almost certainly the shape of the future C++
standard, I am confident that the best long-term path is to migrate to
its conventions. And it will be nice to reuse the same cognitive
machinery whether we're programming in C++ or Python, and get the speed
boost.

But how to do this migration without breaking every string format
operation in OIIO, and additionally in all the packages that call OIIO
and use any of its string formatting or error reporting facilities (and
indeed, we use it extensively in OSL)? This PR is the first step of
implementing the transition, which will occur over a couple major
releases. Here we go:

  1. Introduce Strutil::sprintf(), ustring::sprintf(), and various
    errorf(), warningf(), etc., functions as needed that now and forever
    will use the printf notation (and for now are still implemented
    primarily with tinyformat, but that's not a detail you need to worry
    about).

    Neary all internal uses of the format() commands within OIIO internals
    have switched to sprintf().

    Calling apps that make use of OIIO's string formatting can therefore
    also change their calls from format() to sprintf() and everything
    will continue working for the forseeable future.

  2. Introduce, in a namespace, Strutil::fmt::format() and certain
    fmterror(), etc., that use the NEW Python-like notation, implemented
    underneath with fmt::format().

    Calling apps may start using these now if they prefer the Python
    notation...

    But BEWARE! I have discovered that the fmt library does not correctly
    implement the promised locale-inedependent formatting for floating
    point values (it does for integers), and so could cause problems for
    those of you who might have apps running on platforms wth certain
    locales that, for example, use ',' as the decimal (sheesh). This is a
    known problem with fmt, will be fixed soon, and as soon as it does,
    I'll update our embedded copy of fmt and give you the all clear to go
    nuts with the new python style formatting directives.

  3. For OIIO 2.0, Strutil::format, ustring::format, and the various error()
    functions will behave identically to sprintf (C printf notation), but
    for OIIO 2.1, I am planning on having them all switch to to the new
    python (a.k.a. C++20-ish std::format()) format designations.

So, the upshot is that NOTHING should break for your existing code's
format()/error() calls when switching from OIIO 1.x to 2.0.
(Proof: OSL master compiles against it and passes all tests, without a
single change.)

But after you upgrade to 2.0 and before 2.1 is released (I'm assuming in
the latter half of 2019 at the earliest), you'll want to either switch
all those calls to Strutil::sprintf()/errorf() and stick to the
printf-style formatting OR change them to
Strutil::fmt::format()/fmterror() using the new formatting specs.
Pretty much all of the OIIO internals (error calls, etc.) have been
changed to use the explicit printf-like forms. So they seem safe.

After 2.1, then, the old names format()/error() will be the new behavior
and you can switch back to those if you want. And by the time C++20
comes out, we'll all be on the same page, and some day OIIO won't even
need to use the fmt library because it can rely on std::format.

There you go. Here is the PR that implements the first phase.

We have put the relevant parts of fmt library in
include/OpenImageIO/fmt. We are currently using commit 982ee5c, with
one fix by me on top of it that fixes a Windows compile issue. We will
update the version of fmt as fixes and improvements come.

Sorry for such a weird change in the middle of the beta period. But
sometimes a looming deadline of a code freeze and major release (after
which I wouldn't want to introduce this kind of disrupction) really
makes one focus about doing it right now instead of having to wait
potentially an additional year.

@lgritz lgritz force-pushed the lg-fmt branch 2 times, most recently from 6142108 to a4212f3 Compare November 25, 2018 06:12
@fpsunflower
Copy link
Contributor

At first glance of the fmt docs, they do appear to support printf style arguments as well. Is there any reason, not to use their code to implement the printf style variants on our end?

/// The main (or "full detail") method -- takes a code (with high
/// bits being an ErrCode) and writes the message, with a prefix
/// indicating the error category (no prefix for "MESSAGE") and
/// error string.
virtual void operator()(int errcode, const std::string& msg);

/// Info message with printf-like formatted error message.
/// Will not print unless verbosity >= VERBOSE.
// Base cases -- take a single string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these base cases necessary?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or at the very least -- can we use string_view here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use string_view, and the caller passes a string, it will have to make a second string when it passes along to operator(). In an ideal world, we would have made operator() take a string_view, but we established ErrorHandler's interface before we had string_view. And every renderer that uses OSL supplies or uses ErrorHandler, so we can't change the interface now. But remember, this is an error handler, it's only called rarely, it's ok if there are a few extra string copies.

The base cases are unfortunately necessary because people (including, um, me, in OSL, sorry) sometimes pass just one string to send an error that's not supposed to be formatted at all. If that should contain a "%" or someday a "{}", it will go kablooey if there isn't the right kind of argument following.

template<typename... Args>
void info(string_view format, const Args&... args)
void info(const char* format, const Args&... args)
{
if (verbosity() >= VERBOSE)
info(Strutil::format(format, args...));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be a way to avoid the allocation here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which allocation? The string produced by the format() call?

@fpsunflower
Copy link
Contributor

At first glance of the fmt docs, they do appear to support printf style arguments as well. Is there any reason, not to use their code to implement the printf style variants on our end?

Nevermind -- I see the comment about this being because of the float bug. Why not wait for that to land and switch all at once? It seems like it would be a smoother transition.

template<typename... Args>
void info(string_view format, const Args&... args)
void info(const char* format, const Args&... args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to use Arg&&... args and std::forward to get move semantics if appropriate?

It probably wouldn't matter in the vast majority of cases, but it might be a tiny bit more efficient if handling some temporary objects?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno. If that were so important, wouldn't fmt itself (and tinyformat before it) do its declarations like that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do see what you're getting at here. Seems like the right place to use the "perfect forwarding" idiom. But then why is that not what's used in format and tinyformat themselves when they pass things around?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll defer to the wisdom of those other projects. For tinyformat it might be due to its pre-C++11 compatibility, not sure about fmt ...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's also something that we can do later if we decide forwarding is better. It's just an inline template, changing to && later won't break source or link compatibility. From the caller's point of view, the args are the same.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I want to merge what I've got (which basically preserves the source call signatures we've had for a long time, but just adds new ones that can be counted on to stay explicitly printf-like), without having to do a lot more testing and soul-searching while I'm trying to get a release out. Then I will do more careful testing to see if there's advantage to switching to &&, which I think we can do any time without a second compatibility break.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, I have a theory. Perfect forwarding doesn't matter here, because all it needs is a reference. Example 1:

format ("{}", existing_string_variable);

passes a const ref to the string var (no copy).

And, example 2:

format ("{}", string_a+string_b);
// creates a temp of the concatenation, passes const ref to that temp

No extra allocations are involved. What exactly would be saved with perfect forwarding?

As an example of where you want to forward, consider:

std::vector<std::string> vec;
vec.push_back (A_str_variable_will_be_used_again);
vec.push_back (B_str_variable_never_used_again);
vec.push_back (C_str_temporary_expression);

Example A needs to make a copy of a string to put into the vector, but B and C can move the string into the vector, that's why you want && and perfect forwarding in that case -- because it's a sink for constructed data structures. But format() and friends are not -- they are not taking possession of the data, they just need a ref for the duration of the call.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes perfect sense.

For many years we have used Chris Foster's magnificent "tinyformat"
package to do a lot of the heavy lifting for our type-safe string
formatting with printf-like notation, which we use for Strutil::format,
nearly every error reporting facility in the major classes, etc. Thanks,
Chris!

The C++ committee has finally caught up to Chris, advancing drafts of a
"string formatting" proposal that now seems very likely to be part of
C++20. The best implementation at this point is the {fmt} package.

Unfortunately, {fmt} and the C++ draft papers differ from tinyformat in
that rather than traditional C printf notation for formatting, they use
a notation very similar to Python string formatting, e.g.,

    format ("%d %.3f", 1, 23.4)  -->  format ("{} {.3f}", 1, 23.4)

This compatibility break is one reason I've stayed away from {fmt},
despite that (no slight to Chris) it's at least twice twice the speed of
tinyformat.

But knowing that this is almost certainly the shape of the future C++
standard, I am confident that the best long-term path is to migrate to
its conventions. And it will be nice to reuse the same cognitive
machinery whether we're programming in C++ or Python, and get the speed
boost.

But how to do this migration without breaking every string format
operation in OIIO, and additionally in all the packages that call OIIO
and use any of its string formatting or error reporting facilities (and
indeed, we use it extensively in OSL)? This PR is the first step of
implementing the transition, which will occur over a couple major
releases. Here we go:

1. Introduce Strutil::sprintf(), ustring::sprintf(), and various
   errorf(), warningf(), etc., functions as needed that now and forever
   will use the printf notation (and for now are still implemented
   primarily with tinyformat, but that's not a detail you need to worry
   about).

   Neary all internal uses of the format() commands within OIIO internals
   have switched to sprintf().

   Calling apps that make use of OIIO's string formatting can therefore
   also change their calls from format() to sprintf() and everything
   will continue working for the forseeable future.

2. Introduce, in a namespace, Strutil::fmt::format() and certain
   fmterror(), etc., that use the NEW Python-like notation, implemented
   underneath with fmt::format().

   Calling apps may start using these now if they prefer the Python
   notation...

   But BEWARE! I have discovered that the fmt library does not correctly
   implement the promised locale-inedependent formatting for floating
   point values (it does for integers), and so could cause problems for
   those of you who might have apps running on platforms wth certain
   locales that, for example, use ',' as the decimal (sheesh). This is a
   known problem with fmt, will be fixed soon, and as soon as it does,
   I'll update our embedded copy of fmt and give you the all clear to go
   nuts with the new python style formatting directives.

3. For OIIO 2.0, Strutil::format, ustring::format, and the various error()
   functions will behave identically to sprintf (C printf notation), but
   for OIIO 2.1, I am planning on having them all switch to to the new
   python (a.k.a. C++20-ish std::format()) format designations.

So, the upshot is that NOTHING should break for your existing code's
format()/error() calls when switching from OIIO 1.x to 2.0.
(Proof: OSL master compiles against it and passes all tests, without a
single change.)

But after you upgrade to 2.0 and before 2.1 is released (I'm assuming in
the latter half of 2019 at the earliest), you'll want to either switch
all those calls to Strutil::sprintf()/errorf() and stick to the
printf-style formatting *OR* change them to
Strutil::fmt::format()/fmterror() using the new formatting specs.
Pretty much all of the OIIO internals (error calls, etc.) have been
changed to use the explicit printf-like forms. So they seem safe.

After 2.1, then, the old names format()/error() will be the new behavior
and you can switch back to those if you want. And by the time C++20
comes out, we'll all be on the same page, and some day OIIO won't even
need to use the fmt library because it can rely on std::format.

There you go. Here is the PR that implements the first phase.

We have put the relevant parts of fmt library in
include/OpenImageIO/fmt.  We are currently using commit 01640f44, with
one fix by me on top of it that fixes a Windows compile issue. We will
update the version of fmt as fixes and improvements come.

Sorry for such a weird change in the middle of the beta period. But
sometimes a looming deadline of a code freeze and major release (after
which I wouldn't want to introduce this kind of disrupction) really
makes one focus about doing it right now instead of having to wait
potentially an additional year.
@lgritz lgritz merged commit fb9f7b3 into AcademySoftwareFoundation:master Nov 27, 2018
@lgritz lgritz deleted the lg-fmt branch November 28, 2018 01:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants