Fix interaction between debug presentation, precision, and width for strings #4478

Open. localspook wants to merge 1 commit into master.

localspook (Contributor) commented Jun 23, 2025

Consider our function that formats strings. When using debug presentation, it has two issues, marked in the comments:

template <typename Char, typename OutputIt>
FMT_CONSTEXPR auto write(OutputIt out, basic_string_view<Char> s,
                         const format_specs& specs) -> OutputIt {
  auto data = s.data();
  auto size = s.size();
  if (specs.precision >= 0 && to_unsigned(specs.precision) < size)
    size = convert_precision_to_size(s, to_unsigned(specs.precision)); // Issue 1: we truncate the string
                                                                       // to fit within the precision...
  bool is_debug = specs.type() == presentation_type::debug;            //
  if (is_debug) {                                                      //
    auto buf = counting_buffer<Char>();                                //
    write_escaped_string(basic_appender<Char>(buf), s); // ...but don't account for the fact that it
    size = buf.count();                                 // can expand when escaped.
  } // ^^^^^^^^^^^^^^^^──────┐
                          // │ Issue 2: we reinterpret size in bytes as 
  size_t width = 0;       // │ display width. We can't do that, because
  if (specs.width != 0) { // │ debug presentation doesn't mean "just ASCII".
    width =     // vvvv──────┘
        is_debug ? size : compute_width(basic_string_view<Char>(data, size));
  }
  return write_padded<Char>(
      out, specs, size, width, [=](reserve_iterator<OutputIt> it) {
        return is_debug ? write_escaped_string(it, s)
                        : copy<Char>(data, data + size, it);
      });
}

Issue 1 manifests like this:

fmt::println("{:.2?}", "\n"); // Prints "\n", but should print "\
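
To make the expected result concrete, here is a minimal standalone sketch of "escape first, then apply the precision". It is not the code in this PR and does not use {fmt}. `escape_debug` and `debug_with_precision` are hypothetical helpers, the escape set is deliberately tiny, and truncation is byte-based, so it assumes ASCII input.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of {fmt}): escapes a handful of characters
// and wraps the result in double quotes, roughly like debug presentation.
std::string escape_debug(std::string_view s) {
  std::string out = "\"";
  for (char c : s) {
    switch (c) {
      case '\n': out += "\\n"; break;
      case '\t': out += "\\t"; break;
      case '\r': out += "\\r"; break;
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      default:   out += c; break;
    }
  }
  out += '"';
  return out;
}

// Hypothetical helper: the precision limits the *escaped* text, so the
// expansion caused by escaping is accounted for before truncating.
// Assumes ASCII input: truncation here is by bytes, not by code points.
std::string debug_with_precision(std::string_view s, std::size_t precision) {
  std::string escaped = escape_debug(s);
  if (escaped.size() > precision) escaped.resize(precision);
  return escaped;
}
```

With this ordering, `debug_with_precision("\n", 2)` yields the two characters `"\`, which matches the expected output in the comment above.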

And Issue 2 like this:

fmt::println("{:*<5?}", "щ"); // Prints "щ"*, but should print "щ"** 
                              // (щ is two bytes, so {fmt} thinks it takes two terminal cells)
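
The arithmetic behind this can be sketched without {fmt}. The helpers below are hypothetical and only illustrate the byte count versus display width mismatch. `count_code_points` assumes valid UTF-8 and treats every code point as one terminal cell, which is a simplification (wide and combining characters are ignored).

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper: counts UTF-8 code points by skipping continuation
// bytes (those of the form 10xxxxxx). Assumes the input is valid UTF-8.
std::size_t count_code_points(std::string_view s) {
  std::size_t n = 0;
  for (unsigned char c : s)
    if ((c & 0xC0) != 0x80) ++n;
  return n;
}

// Hypothetical helper: number of fill characters needed to reach `width`
// when the content occupies `content_cells` terminal cells.
std::size_t padding_for(std::size_t width, std::size_t content_cells) {
  return width > content_cells ? width - content_cells : 0;
}
```

For the quoted string `"щ"` (bytes `"`, `0xD1`, `0x89`, `"`), the byte size is 4 but the code point count is 3. With a width of 5, `padding_for(5, 4)` gives one fill character, while `padding_for(5, 3)` gives the two that the example expects.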

Godbolt link. This PR adds the logic to handle such cases.

localspook (Contributor, Author) commented:

Tangent: while working in this area of the code, I noticed that we escape characters differently from the standard. We use \uXXXX and \UXXXXXXXX, while the standard uses \u{...}. Is this intentional?
