Collaborator

@gennaroprota gennaroprota commented Oct 2, 2025

In the case of the floating point conversions, this effectively avoids a copy of the buffer contents. In the case of the integer conversions, it doesn't eliminate the copy, but still removes the extra buffer.

@gennaroprota gennaroprota force-pushed the feature/dont_use_an_additional_buffer_in_the_arithmetic_conversions branch 2 times, most recently from aa894b1 to 6945249 Compare October 2, 2025 17:14
@Flamefire
Contributor

Maybe merge #64 first to let the full CI run. I disabled C++26 testing just now so that it passes until #66 is resolved.

@vinniefalco vinniefalco requested a review from Copilot October 3, 2025 17:28

Copilot AI left a comment

Pull Request Overview

This PR optimizes arithmetic conversion functions by eliminating the use of intermediate character buffers. The changes directly write formatted values into the result string objects instead of using temporary buffers and then copying the data.

  • Replaced temporary char/wchar_t arrays with direct manipulation of static_string objects
  • Added friend function declarations to enable access to private members
  • Modified floating-point conversions to write directly to result string data

Comment on lines 552 to 557
+ static_string<N> result;
+ result.set_size(N);
+ const auto digits_end = result.end();
  const auto digits_begin = integer_to_string<std::char_traits<char>, Integer>(
      digits_end, value, std::is_signed<Integer>{});
- return static_string<N>(digits_begin, std::distance(digits_begin, digits_end));
+ result.set_size(std::distance(digits_begin, digits_end));
+ std::char_traits<char>::move(result.data(), digits_begin, result.size());

Copilot AI Oct 3, 2025

Setting the size to N initially and then immediately changing it creates an inefficient pattern. The string is first sized to N, then resized based on the actual digits needed. This could lead to unnecessary initialization of the buffer contents.

  // we know that a formatting error will not occur, so
  // we assume that the result is always positive
- if (std::size_t(std::snprintf(buffer, N + 1, "%f", value)) > N)
+ std::size_t length = std::snprintf(result.data(), N + 1, "%f", value);

Copilot AI Oct 3, 2025

Writing N + 1 characters to result.data() when the result string has capacity N is a buffer overflow. The snprintf call should use N + 1 for the buffer size only if result has been allocated with that capacity.
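
Whether passing N + 1 overflows depends on how much storage sits behind result.data(). As a minimal sketch of the std::snprintf size contract, assuming the destination holds N characters plus a terminator (an assumption about the layout, not something stated in this thread; `fixed_buffer` and `format_fixed` are hypothetical names):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical stand-in for a fixed-capacity string: room for N characters
// plus the null terminator. This layout is an assumption of the sketch.
template<std::size_t N>
struct fixed_buffer
{
    char data[N + 1];
    std::size_t size;
};

// std::snprintf(buf, count, ...) writes at most count - 1 characters and
// then a terminator, so count == N + 1 stays inside N + 1 chars of storage.
// The return value is the length the complete output would have had, which
// is how truncation is detected.
template<std::size_t N>
bool format_fixed(double value, fixed_buffer<N>& out)
{
    const int needed = std::snprintf(out.data, N + 1, "%f", value);
    if (needed < 0 || static_cast<std::size_t>(needed) > N)
    {
        out.size = 0;
        out.data[0] = '\0';
        return false; // formatting error, or more than N characters needed
    }
    out.size = static_cast<std::size_t>(needed);
    assert(out.data[out.size] == '\0'); // snprintf terminated the output
    return true;
}
```

If the storage behind data() were only N characters, the same call would indeed write one past the end whenever the output has N or more characters.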

  // we know that a formatting error will not occur, so
  // we assume that the result is always positive
- if (std::size_t(std::snprintf(buffer, N + 1, "%Lf", value)) > N)
+ std::size_t length = std::snprintf(result.data(), N + 1, "%Lf", value);

Copilot AI Oct 3, 2025

Same buffer overflow issue as with the double version - writing N + 1 characters to a buffer with capacity N.

- const long long num_written =
-     std::swprintf(buffer, N + 1, L"%f", value);
+ long long num_written =
+     std::swprintf(result.data(), N + 1, L"%f", value);

Copilot AI Oct 3, 2025

Buffer overflow issue - attempting to write N + 1 wide characters to a buffer with capacity N.

- const long long num_written =
-     std::swprintf(buffer, N + 1, L"%Lf", value);
+ long long num_written =
+     std::swprintf(result.data(), N + 1, L"%Lf", value);

Copilot AI Oct 3, 2025

Buffer overflow issue - attempting to write N + 1 wide characters to a buffer with capacity N.

return last;
}

template<std::size_t N, typename Integer>
Contributor

Why are those declarations required after the definitions above?

Collaborator Author

Yes, those are not needed, thanks.

template<std::size_t, class, class>
friend class basic_static_string;

template<std::size_t P, typename Integer>
Contributor

They don't use any private members, do they? So there is no need for friends.

Collaborator Author

They use set_size(), which is private in basic_static_string due to private inheritance.
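
To illustrate the access rule behind this reply, here is a minimal sketch with hypothetical names (`storage_base`, `fixed_string`, `make_filled`); it only mirrors the situation described and is not the library's actual class hierarchy:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base that owns the size bookkeeping.
class storage_base
{
public:
    void set_size(std::size_t n) noexcept { size_ = n; }
    std::size_t size() const noexcept { return size_; }

private:
    std::size_t size_ = 0;
};

// Private inheritance: every public member of storage_base becomes a
// private member of fixed_string unless it is explicitly re-exposed.
template<std::size_t N>
class fixed_string : private storage_base
{
public:
    using storage_base::size; // re-exposed, so callers can query the size
    char* data() noexcept { return data_; }

private:
    char data_[N + 1] = {};

    // Without this declaration, make_filled could not call set_size().
    template<std::size_t M>
    friend fixed_string<M> make_filled(char c);
};

template<std::size_t M>
fixed_string<M> make_filled(char c)
{
    fixed_string<M> s;
    for (std::size_t i = 0; i < M; ++i)
        s.data()[i] = c;
    s.data()[M] = '\0';
    s.set_size(M); // accessible only because make_filled is a friend
    return s;
}
```

A user-written function in the same position has no way to add itself to the friend list, which is the limitation raised in the next comment.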

Adding all these friend functions for to_string implementations feels odd. Ideally, users should be able to create their own similar utility functions with comparable performance, but they can't make their functions friends of static_string. Would adding a resize_and_overwrite member function remove the need for those friends?

Collaborator Author

OK. I've added resize_and_overwrite() and implemented the arithmetic conversions in terms of it. Please have a look.
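
A rough sketch of the idea, not the code that was merged: a hypothetical `mini_string` with a `resize_and_overwrite` member modelled on the C++23 `std::basic_string` function, and a conversion (`to_mini_string`, also hypothetical) written purely against the public interface. The empty-string fallback on overflow is a simplification of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <utility>

// Hypothetical fixed-capacity string; not Boost.StaticString itself.
template<std::size_t N>
class mini_string
{
public:
    std::size_t size() const noexcept { return size_; }
    static constexpr std::size_t max_size() noexcept { return N; }
    const char* c_str() const noexcept { return data_; }

    // Lets op write into the storage directly. op receives the buffer and
    // n, and returns the final length, which must not exceed n.
    template<class Operation>
    void resize_and_overwrite(std::size_t n, Operation op)
    {
        if (n > max_size())
            throw std::length_error("n > max_size()");
        const std::size_t new_size = std::move(op)(data_, n);
        assert(new_size <= n);
        size_ = new_size;
        data_[size_] = '\0';
    }

private:
    char data_[N + 1] = {};
    std::size_t size_ = 0;
};

// A conversion that needs neither friend access nor a temporary buffer.
template<std::size_t N>
mini_string<N> to_mini_string(double value)
{
    mini_string<N> result;
    result.resize_and_overwrite(N, [&](char* buffer, std::size_t n) -> std::size_t
    {
        // Assumption of this sketch: the storage has room for a terminator
        // after n characters, so passing n + 1 to snprintf is in bounds.
        const int written = std::snprintf(buffer, n + 1, "%f", value);
        if (written < 0 || static_cast<std::size_t>(written) > n)
            return 0; // report failure as an empty string, for brevity only
        return static_cast<std::size_t>(written);
    });
    return result;
}
```

Because the callback returns the final length, the size is set once, after the characters are already in place.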

Ah, it was just a suggestion, but you've already implemented it, and it seems to work.
@pdimov, does adding resize_and_overwrite() to static_string make sense? Context: Gennaro wants to rewrite the to_static_string functions and eliminate the use of a temporary buffer:

char buffer[N];
const auto digits_end = std::end(buffer);
const auto digits_begin = integer_to_string<std::char_traits<char>, Integer>(
digits_end, value, std::is_signed<Integer>{});
return static_string<N>(digits_begin, std::distance(digits_begin, digits_end));

This enables RVO and produces more efficient assembly: https://godbolt.org/z/bbesGEPj8

Member

I suppose it does.

- char buffer[N];
- const auto digits_end = std::end(buffer);
+ static_string<N> result;
+ result.set_size(N);
Contributor

This is redundant here, isn't it?

Suggested change
- result.set_size(N);

Collaborator Author

The first set_size() is not redundant with the constructor, because the constructor creates a string with size zero. I could avoid it by writing:

const auto digits_end = result.begin() + N;

but I don't see any way around the second set_size().

Contributor

@Flamefire Flamefire Oct 6, 2025

Ah yes, I missed the use of result.end(), so it is fine then, although a bit confusing to some (like me).

To me the most readable approach would be const auto digits_end = result.data() + N; as it highlights the buffer use and mirrors the use in the move below.
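
The pattern under discussion, digits written right to left so that they end at data() + N and are then moved to the front, can be sketched on a plain array. `write_unsigned` is a hypothetical helper for illustration and handles unsigned values only:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string> // std::char_traits

// Writes the decimal digits of value so that they end at buffer + N, moves
// them to the front, and returns their count. N must be large enough for
// every value passed in (20 digits cover a 64-bit unsigned integer).
template<std::size_t N>
std::size_t write_unsigned(char (&buffer)[N], unsigned long long value)
{
    char* const digits_end = buffer + N;
    char* digits_begin = digits_end;
    do
    {
        assert(digits_begin != buffer); // buffer too small otherwise
        *--digits_begin = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);

    const std::size_t length =
        static_cast<std::size_t>(std::distance(digits_begin, digits_end));
    // Source and destination can overlap, so this has to be move, not copy.
    std::char_traits<char>::move(buffer, digits_begin, length);
    return length;
}
```

The move is the copy that the PR description says remains for the integer conversions; only the separate temporary buffer goes away.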

- wchar_t buffer[N];
- const auto digits_end = std::end(buffer);
+ static_wstring<N> result;
+ result.set_size(N);
Contributor

See above

Suggested change
- result.set_size(N);

@gennaroprota gennaroprota force-pushed the feature/dont_use_an_additional_buffer_in_the_arithmetic_conversions branch from 6945249 to c103f63 Compare October 6, 2025 14:49
Comment on lines 570 to 571
result.set_size(N);
const auto digits_end = result.end();
Contributor

Suggested change
- result.set_size(N);
- const auto digits_end = result.end();
+ const auto digits_end = result.data() + N;

@gennaroprota gennaroprota force-pushed the feature/dont_use_an_additional_buffer_in_the_arithmetic_conversions branch 3 times, most recently from 31d70c8 to 15a1e18 Compare October 6, 2025 15:19
@ashtum
ashtum commented Oct 6, 2025

> this effectively avoids a copy of the buffer contents. In the case of the integer conversions, it doesn't eliminate the copy, but still removes the extra buffer.

Nice, it also enables RVO and produces more efficient assembly because it skips the call to the constructor that is not noexcept: https://godbolt.org/z/bbesGEPj8

Operation op)
{
if (n > max_size()) {
detail::throw_exception<std::length_error>("n > max_size() in resize_and_overwrite()");
I believe the message "n > max_size()" (identical to what's thrown in resize()) is sufficient. There's no need to proliferate string literals.

Collaborator Author

Is it sufficient? It gives no clue to the user about where the problem occurred.

I believe detail::throw_exception uses BOOST_THROW_EXCEPTION under the hood, which would also print the source location. But generally, the exception message doesn't need to include the current function name; "n > max_size()" is descriptive enough for this kind of error message in the code.

Member

BOOST_THROW_EXCEPTION, if used inside detail::throw_exception, would emit the source location of detail::throw_exception.

You probably want boost::throw_with_location here, or the two argument overload of boost::throw_exception, so that the source location of the throw is captured, not that of detail::throw_exception.

@gennaroprota, it might be more convenient to create an issue for this and address it in a separate PR once you’re done with the current one.

Collaborator Author

What if BOOST_STATIC_STRING_STANDALONE is defined?

Member

If this configuration is to still be supported (is it?) you'd probably need to define BOOST_STATIC_STRING_CURRENT_LOCATION and then use detail::throw_exception( E(...), BOOST_STATIC_STRING_CURRENT_LOCATION ) instead of boost::throw_exception( E(...), BOOST_CURRENT_LOCATION ).

Something better might also be possible, I'm not sure.
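
The distinction drawn here, between a location computed inside the throwing helper and one captured where the throw is requested, can be sketched without Boost. Every name below (`sketch_location`, `SKETCH_CURRENT_LOCATION`, `located_length_error`, `throw_length_error`) is hypothetical and only stands in for the macros and functions mentioned in this thread:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical location record; Boost and C++20 offer richer equivalents.
struct sketch_location
{
    const char* file;
    int line;
};

// Expands where it is written, so it records the caller's file and line.
#define SKETCH_CURRENT_LOCATION sketch_location{ __FILE__, __LINE__ }

struct located_length_error : std::length_error
{
    located_length_error(const char* what, sketch_location loc)
        : std::length_error(what), location(loc)
    {
        assert(loc.file != nullptr);
    }

    sketch_location location;
};

// The helper receives the location instead of computing it. Had it used
// __LINE__ itself, every exception would point at this function.
[[noreturn]] inline void throw_length_error(const char* what, sketch_location loc)
{
    throw located_length_error(what, loc);
}
```

A call such as `throw_length_error("n > max_size()", SKETCH_CURRENT_LOCATION)` then carries the line of the call, so the message itself does not have to name the function.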

Collaborator Author

> If this configuration is to still be supported (is it?)

I'd be happy to remove the support. The less conditional compilation, the better.

Member

We should probably ask on the list.

}

void
testResizeAndOverwrite()
It might be better to split this PR into two commits: the first adds the resize_and_overwrite function, and the second improves the performance of the to_string implementations.

/**
Resize the string and overwrite its contents.
Resizes the string to contain `n` characters, and uses the
Contributor

I'm not sure that is useful. The size passed here doesn't really matter, does it? Maybe a write_and_resize(op) would be more useful. Or is there any realistic use case for this parameter being less than N?

Member

That's how the function is specified for std::string.

For static_string specifically the only thing the passed size does is throw if it's too big.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense

@gennaroprota gennaroprota force-pushed the feature/dont_use_an_additional_buffer_in_the_arithmetic_conversions branch 3 times, most recently from 07d2e93 to 4796985 Compare October 9, 2025 15:39
In the case of the floating point conversions, this effectively avoids a
copy of the buffer contents. In the case of the integer conversions, it
doesn't eliminate the copy, but still removes the extra buffer.

This fixes issue #65.
Reason: This is in preparation of the next commit. See its commit
message.
Reason: Performing the conversions without accessing private members,
providing a model for users to implement their own with comparable
efficiency.
@gennaroprota gennaroprota force-pushed the feature/dont_use_an_additional_buffer_in_the_arithmetic_conversions branch from 4796985 to 9c5d694 Compare October 9, 2025 15:47
@gennaroprota gennaroprota merged commit 9c5d694 into develop Oct 9, 2025