- 
                Notifications
    You must be signed in to change notification settings 
- Fork 32
Don't use an additional buffer in the arithmetic conversions #67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't use an additional buffer in the arithmetic conversions #67
Conversation
aa894b1    to
    6945249      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR optimizes arithmetic conversion functions by eliminating the use of intermediate character buffers. The changes directly write formatted values into the result string objects instead of using temporary buffers and then copying the data.
- Replaced temporary char/wchar_t arrays with direct manipulation of static_string objects
- Added friend function declarations to enable access to private members
- Modified floating-point conversions to write directly to result string data
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| static_string<N> result; | ||
| result.set_size(N); | ||
| const auto digits_end = result.end(); | ||
| const auto digits_begin = integer_to_string<std::char_traits<char>, Integer>( | ||
| digits_end, value, std::is_signed<Integer>{}); | ||
| return static_string<N>(digits_begin, std::distance(digits_begin, digits_end)); | ||
| result.set_size(std::distance(digits_begin, digits_end)); | ||
| std::char_traits<char>::move(result.data(), digits_begin, result.size()); | 
    
      
    
      Copilot
AI
    
    
    
      Oct 3, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting the size to N initially and then immediately changing it creates an inefficient pattern. The string is first sized to N, then resized based on the actual digits needed. This could lead to unnecessary initialization of the buffer contents.
| // we know that a formatting error will not occur, so | ||
| // we assume that the result is always positive | ||
| if (std::size_t(std::snprintf(buffer, N + 1, "%f", value)) > N) | ||
| std::size_t length = std::snprintf(result.data(), N + 1, "%f", value); | 
    
      
    
      Copilot
AI
    
    
    
      Oct 3, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Writing N + 1 characters to result.data() when the result string has capacity N is a buffer overflow. The snprintf call should use N + 1 for the buffer size only if result has been allocated with that capacity.
| // we know that a formatting error will not occur, so | ||
| // we assume that the result is always positive | ||
| if (std::size_t(std::snprintf(buffer, N + 1, "%Lf", value)) > N) | ||
| std::size_t length = std::snprintf(result.data(), N + 1, "%Lf", value); | 
    
      
    
      Copilot
AI
    
    
    
      Oct 3, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same buffer overflow issue as with the double version - writing N + 1 characters to a buffer with capacity N.
| const long long num_written = | ||
| std::swprintf(buffer, N + 1, L"%f", value); | ||
| long long num_written = | ||
| std::swprintf(result.data(), N + 1, L"%f", value); | 
    
      
    
      Copilot
AI
    
    
    
      Oct 3, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffer overflow issue - attempting to write N + 1 wide characters to a buffer with capacity N.
| const long long num_written = | ||
| std::swprintf(buffer, N + 1, L"%Lf", value); | ||
| long long num_written = | ||
| std::swprintf(result.data(), N + 1, L"%Lf", value); | 
    
      
    
      Copilot
AI
    
    
    
      Oct 3, 2025 
    
  
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Buffer overflow issue - attempting to write N + 1 wide characters to a buffer with capacity N.
| return last; | ||
| } | ||
|  | ||
| template<std::size_t N, typename Integer> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are those declarations required after the definitions above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, those are not needed, thanks.
| template<std::size_t, class, class> | ||
| friend class basic_static_string; | ||
|  | ||
| template<std::size_t P, typename Integer> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They don't use any private properties, do they? So no need for friends
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They use set_size(), which is private in basic_static_string due to private inheritance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding all these friend functions for to_string implementations feels odd. Ideally, users should be able to create their own similar utility functions with comparable performance, but they can't make their functions friends of static_string. Would adding a resize_and_overwrite member function remove the need for those friends?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. I've added resize_and_overwrite() and implemented the arithmetic conversions in terms of it. Please have a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it was just a suggestion, but you've already implemented it, and it seems to work.
@pdimov, does adding resize_and_overwrite() to static_string make sense? Context: Gennaro wants to rewrite the to_static_string functions and eliminate the use of a temporary buffer:
static_string/include/boost/static_string/static_string.hpp
Lines 552 to 556 in 0d255f7
| char buffer[N]; | |
| const auto digits_end = std::end(buffer); | |
| const auto digits_begin = integer_to_string<std::char_traits<char>, Integer>( | |
| digits_end, value, std::is_signed<Integer>{}); | |
| return static_string<N>(digits_begin, std::distance(digits_begin, digits_end)); | 
This enables RVO and produces more efficient assembly: https://godbolt.org/z/bbesGEPj8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it does.
| char buffer[N]; | ||
| const auto digits_end = std::end(buffer); | ||
| static_string<N> result; | ||
| result.set_size(N); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is redundant here, isn't it?
| result.set_size(N); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first set_size() is not redundant with the constructor, because the constructor creates a string with size zero. I could avoid it by writing:
const auto digits_end = result.begin() + N;
but I don't see any way around the second set_size().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I missed the use of result.end(), so it is fine then although a bit confusing to some (like me)
To me the most readable approach would be const auto digits_end = result.data() + N; as it highlights the buffer use and mirrors the use in the move below.
| wchar_t buffer[N]; | ||
| const auto digits_end = std::end(buffer); | ||
| static_wstring<N> result; | ||
| result.set_size(N); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above
| result.set_size(N); | 
6945249    to
    c103f63      
    Compare
  
    | result.set_size(N); | ||
| const auto digits_end = result.end(); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| result.set_size(N); | |
| const auto digits_end = result.end(); | |
| const auto digits_end = result.data() + N; | 
31d70c8    to
    15a1e18      
    Compare
  
    | 
 Nice, it also enables RVO and produces more efficient assembly because it skips the call to the constructor that is not  | 
| Operation op) | ||
| { | ||
| if (n > max_size()) { | ||
| detail::throw_exception<std::length_error>("n > max_size() in resize_and_overwrite()"); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the message "n > max_size()" (identical to what's thrown in resize()) is sufficient. there's no need to proliferate string literals.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it sufficient? It gives no clue to the user about where the problem occurred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe detail::throw_exception uses BOOST_THROW_EXCEPTION under the hood, which would also print the source location. But generally, the exception message doesn't need to include the current function name, "n > max_size()" is descriptive enough for this kind of error message in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BOOST_THROW_EXCEPTION, if used inside detail::throw_exception, would emit the source location of detail::throw_exception.
You probably want boost::throw_with_location here, or the two argument overload of boost::throw_exception, so that the source location of the throw is captured, not that of detail::throw_exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gennaroprota, it might be more convenient to create an issue for this and address it in a separate PR once you’re done with the current one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if BOOST_STATIC_STRING_STANDALONE is defined?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this configuration is to still be supported (is it?) you'd probably need to define BOOST_STATIC_STRING_CURRENT_LOCATION and then use detail::throw_exception( E(...), BOOST_STATIC_STRING_CURRENT_LOCATION ) instead of boost::throw_exception( E(...), BOOST_CURRENT_LOCATION ).
Something better might also be possible, I'm not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this configuration is to still be supported (is it?)
I'd be happy to remove the support. The less conditional compilation, the better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably ask on the list.
| } | ||
|  | ||
| void | ||
| testResizeAndOverwrite() | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to split this PR into two commits: the first adds the resize_and_overwrite function, and the second improves the performance of the to_string implementations.
| /** | ||
| Resize the string and overwrite its contents. | ||
| Resizes the string to contain `n` characters, and uses the | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that is useful, the size passed here doesn't really matter, does it? Maybe a write_and_resize(op) is more useful? Or is there any realistic use case for this param being less than N?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's how the function is specified for std::string.
For static_string specifically the only thing the passed size does is throw if it's too big.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, makes sense
07d2e93    to
    4796985      
    Compare
  
    In the case of the floating point conversions, this effectively avoids a copy of the buffer contents. In the case of the integer conversions, it doesn't eliminate the copy, but still removes the extra buffer. This fixes issue #65.
Reason: This is in preparation of the next commit. See its commit message.
Reason: Performing the conversions without accessing private members, providing a model for users to implement their own with comparable efficiency.
4796985    to
    9c5d694      
    Compare
  
    
In the case of the floating point conversions, this effectively avoids a copy of the buffer contents. In the case of the integer conversions, it doesn't eliminate the copy, but still removes the extra buffer.