Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add full support of format string parsing in compile-time API #2129

Conversation

alexezeder
Copy link
Contributor

@alexezeder alexezeder commented Feb 7, 2021

Compile-time API functionality extended to support manual ordering and named arguments. Unlike my first attempt to do this in #2111, where I tried to use a format part array, here I'm just reusing that recursion of functions compile_format_string() and parse_tail().

Some points for the changes:

  • works with C++17 as it did before
  • unlike Add full support of format string parsing in compile-time API #2111, this PR fully supports custom parsing in formatters
  • manual indexing with {0} and automatic indexing with {name} work exactly as they work in the runtime API
  • a switch of the argument indexing between automatic to manual or manual to automatic is detected at compile-time, and static_asserts fail with corresponding messages
  • usage of named arguments with specs is unsupported because these arguments cannot provide type information unless their names are available at compile-time (I wrote about that with more details here), in case if named argument is used with specs in format string then unknown_format() is returned from string compilation procedure, thus we fallback to the runtime API for this string (but this fallback is currently broken)
  • tests added

Copy link
Contributor

@vitaut vitaut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for another high quality PR!

include/fmt/compile.h Outdated Show resolved Hide resolved
include/fmt/compile.h Outdated Show resolved Hide resolved
include/fmt/compile.h Show resolved Hide resolved
include/fmt/compile.h Outdated Show resolved Hide resolved
constexpr void on_error(const char* message) { throw format_error(message); }

constexpr int on_arg_id() {
throw format_error("handler cannot be used for empty arg_id");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"for empty arg_id" -> "with automatic indexing"

Also can this be an assert?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it actually can be used for automatic indexing with named identifiers. Both runtime and (now) compile-time APIs keep automatic indexing when a named argument identifier is used. So it just cannot be used for an unnamed argument identifier in the automatic indexing mode, which this message is trying to say.
By the way, this function wouldn't be used in normal conditions because the code that invokes this handler actually controls that this handler is used only for numeric or named arguments. As long as it's true, no one would see this message, but when someone breaks the parsing code, they will get this message.

Also can this be an assert?

As I said, it just indicates an internal error, so the cause of this compile-time error can be everything not compile-time friendly. I saw several usages of throw format_error(...) and use it too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure what you mean by "unnamed argument identifier". Both "{}" and "{:...}" denote automatic indexing which is why I'm suggesting this minor wording change. It doesn't matter much since it's an internal error but a bit more consistent with the wording elsewhere.

it just indicates an internal error

Right and this is exactly why I'm suggesting to use an assert if possible. This will distinguish an internal error from a user error even though they both result in a compilation error. If assert doesn't work for some reason, then throw is OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, "unnamed argument identifier" sounds a bit strange. 🙂
But the problem is probably in my wrong understanding of how named arguments work. After updating this PR (as I wrote here), this wording problem would be probably eliminated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +607 to +610
template <typename Char> struct parse_arg_id_result {
arg_ref<Char> arg_id;
const Char* arg_id_end;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we pass begin by reference in parse_arg_id and avoid introducing this struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... it would be a reference to the pointer, or (IMHO better) a pointer to the pointer, is it ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I think reference is better unless it can be null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it's probably impossible because there is a need to have arg_id_end as a constexpr variable or, more importantly, begin has to be a non-constexpr variable in that case, but it should be used in a constexpr context.

Comment on lines 142 to 177
struct test_custom_formattable {};

FMT_BEGIN_NAMESPACE
template <> struct formatter<test_custom_formattable> {
enum class output_type { two, four } type{output_type::two};

FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
auto it = ctx.begin(), end = ctx.end();
while (it != end && *it != '}') {
++it;
}
auto spec = string_view(ctx.begin(), static_cast<size_t>(it - ctx.begin()));
auto tag = string_view("custom");
if (spec.size() == tag.size()) {
bool is_same = true;
for (size_t index = 0; index < spec.size(); ++index) {
if (spec[index] != tag[index]) {
is_same = false;
break;
}
}
type = is_same ? output_type::four : output_type::two;
} else {
type = output_type::two;
}
return it;
}

template <typename FormatContext>
auto format(const test_custom_formattable&, FormatContext& ctx) const
-> decltype(ctx.out()) {
return format_to(ctx.out(), type == output_type::two ? "{:>2}" : "{:>4}",
42);
}
};
FMT_END_NAMESPACE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest using one of the existing formatters such as duration formatter instead of introducing a new one here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem here is that the chrono::duration formatter is not ready to be used with compile-time API because of that format() constness requirement. Should I update it in this PR or the separate one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I update it in this PR or the separate one?

This PR is OK since it should be a small change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with the weirdest looking format string from chrono-test

Comment on lines 188 to 227
FMT_BEGIN_NAMESPACE
template <> struct formatter<test_dynamic_formattable> {
size_t amount = 0;
detail::arg_ref<char> width_refs[3];

FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
amount = static_cast<size_t>(*ctx.begin() - '0');
if (amount >= 1) {
width_refs[0] = detail::arg_ref<char>(ctx.next_arg_id());
}
if (amount >= 2) {
width_refs[1] = detail::arg_ref<char>(ctx.next_arg_id());
}
if (amount >= 3) {
width_refs[2] = detail::arg_ref<char>(ctx.next_arg_id());
}
return ctx.begin() + 1;
}

template <typename FormatContext>
auto format(const test_dynamic_formattable&, FormatContext& ctx) const
-> decltype(ctx.out()) {
int widths[3]{};
for (size_t i = 0; i < amount; ++i) {
detail::handle_dynamic_spec<detail::width_checker>(widths[i],
width_refs[i], ctx);
}
if (amount == 1) {
return format_to(ctx.out(), "{:{}}", 41, widths[0]);
} else if (amount == 2) {
return format_to(ctx.out(), "{:{}}{:{}}", 41, widths[0], 42, widths[1]);
} else if (amount == 3) {
return format_to(ctx.out(), "{:{}}{:{}}{:{}}", 41, widths[0], 42,
widths[1], 43, widths[2]);
} else {
throw format_error("formatting error");
}
}
};
FMT_END_NAMESPACE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. duration formatter has dynamic field support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the previous one (about custom formatter) and this are not the same.
Yes, it has dynamic field support. But as far as I can see, it supports the same set of nested replacement fields as the default formatter, {:{}.{}}. So handling 2 dynamic fields for the default formatter would probably be enough to pass the test with chrono::duration formatter.
While this custom formatter has a custom syntax for nested replacement fields (non {:{}.{}}), and it has 3 of them. So handling default dynamic fields wouldn't be enough to pass the test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to test the implementation of exotic formatter specializations here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with format string from chrono-test that uses dynamic specs

include/fmt/compile.h Show resolved Hide resolved
constexpr void on_error(const char* message) { throw format_error(message); }

constexpr int on_arg_id() {
throw format_error("handler cannot be used for empty arg_id");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure what you mean by "unnamed argument identifier". Both "{}" and "{:...}" denote automatic indexing which is why I'm suggesting this minor wording change. It doesn't matter much since it's an internal error but a bit more consistent with the wording elsewhere.

it just indicates an internal error

Right and this is exactly why I'm suggesting to use an assert if possible. This will distinguish an internal error from a user error even though they both result in a compilation error. If assert doesn't work for some reason, then throw is OK.

Comment on lines +607 to +610
template <typename Char> struct parse_arg_id_result {
arg_ref<Char> arg_id;
const Char* arg_id_end;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I think reference is better unless it can be null.

Comment on lines 142 to 177
struct test_custom_formattable {};

FMT_BEGIN_NAMESPACE
template <> struct formatter<test_custom_formattable> {
enum class output_type { two, four } type{output_type::two};

FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
auto it = ctx.begin(), end = ctx.end();
while (it != end && *it != '}') {
++it;
}
auto spec = string_view(ctx.begin(), static_cast<size_t>(it - ctx.begin()));
auto tag = string_view("custom");
if (spec.size() == tag.size()) {
bool is_same = true;
for (size_t index = 0; index < spec.size(); ++index) {
if (spec[index] != tag[index]) {
is_same = false;
break;
}
}
type = is_same ? output_type::four : output_type::two;
} else {
type = output_type::two;
}
return it;
}

template <typename FormatContext>
auto format(const test_custom_formattable&, FormatContext& ctx) const
-> decltype(ctx.out()) {
return format_to(ctx.out(), type == output_type::two ? "{:>2}" : "{:>4}",
42);
}
};
FMT_END_NAMESPACE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I update it in this PR or the separate one?

This PR is OK since it should be a small change.

Comment on lines 188 to 227
FMT_BEGIN_NAMESPACE
template <> struct formatter<test_dynamic_formattable> {
size_t amount = 0;
detail::arg_ref<char> width_refs[3];

FMT_CONSTEXPR auto parse(format_parse_context& ctx) -> decltype(ctx.begin()) {
amount = static_cast<size_t>(*ctx.begin() - '0');
if (amount >= 1) {
width_refs[0] = detail::arg_ref<char>(ctx.next_arg_id());
}
if (amount >= 2) {
width_refs[1] = detail::arg_ref<char>(ctx.next_arg_id());
}
if (amount >= 3) {
width_refs[2] = detail::arg_ref<char>(ctx.next_arg_id());
}
return ctx.begin() + 1;
}

template <typename FormatContext>
auto format(const test_dynamic_formattable&, FormatContext& ctx) const
-> decltype(ctx.out()) {
int widths[3]{};
for (size_t i = 0; i < amount; ++i) {
detail::handle_dynamic_spec<detail::width_checker>(widths[i],
width_refs[i], ctx);
}
if (amount == 1) {
return format_to(ctx.out(), "{:{}}", 41, widths[0]);
} else if (amount == 2) {
return format_to(ctx.out(), "{:{}}{:{}}", 41, widths[0], 42, widths[1]);
} else if (amount == 3) {
return format_to(ctx.out(), "{:{}}{:{}}{:{}}", 41, widths[0], 42,
widths[1], 43, widths[2]);
} else {
throw format_error("formatting error");
}
}
};
FMT_END_NAMESPACE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to test the implementation of exotic formatter specializations here.

@alexezeder
Copy link
Contributor Author

Actually, I'm going to make this PR a draft (yep, again 😄).
For some reason, I assumed that named arguments should be used with automatic indexing only, which is false for runtime API - https://godbolt.org/z/oTPqKK. This ended up with the wrong implementation here. I don't know why I just haven't checked all those cases before.
I need to make few changes to make the same behavior in compile-time API as runtime API has. Anyway, I think all comments will be valid even after applying these changes.

@alexezeder alexezeder marked this pull request as draft February 15, 2021 22:30
@alexezeder alexezeder marked this pull request as ready for review February 16, 2021 05:15
Copy link
Contributor

@vitaut vitaut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two minor comments, otherwise looks good.

const T& arg = get<N>(args...);
return write<Char>(out, arg);
if constexpr (is_named_arg<typename std::remove_cv<T>::type>::value) {
decltype(T::value) arg = get<N>(args...).value;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think decltype(T::value) can be replaced with a bit simpler const auto&

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

if constexpr (str.size() == 2 && str[0] == '{' && str[1] == '}')
return fmt::to_string(detail::first(args...));
if constexpr (str.size() == 2 && str[0] == '{' && str[1] == '}') {
auto first = detail::first(args...);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may introduce an extra copy. Please use const auto& instead of auto&.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@vitaut vitaut merged commit ab0f7d7 into fmtlib:master Feb 20, 2021
@vitaut
Copy link
Contributor

vitaut commented Feb 20, 2021

Merged, thanks!

@alexezeder alexezeder deleted the feature/support_full_syntax_in_compile_time_api branch March 8, 2021 00:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants