Skip to content

Simplify the management of .c_str() at the cost of never sharing the …#145

Merged
EmJayGee merged 1 commit intomainfrom
micgrier/really-fix-sstring
Dec 12, 2025
Merged

Simplify the management of .c_str() at the cost of never sharing the …#145
EmJayGee merged 1 commit intomainfrom
micgrier/really-fix-sstring

Conversation

@EmJayGee
Copy link
Copy Markdown
Collaborator

…value if constructed mid-view

The existing code attempted to allow for sharing of the .c_str() value across copied/moved/swapped basic_sstring<> instances even if the .c_str() had had to be constructed. (For the uninitiated, basic_sstring stores the "basic value" in the "basic_const_string<>" type which is always null terminated. The basic_sstring has an arefc_ptr<> to the basic_const_string<>, which is essentially a slimmed down shared_ptr<> modeled on Rust's Arc<T>. When a basic_sstring<> is first created, "obviously" the c_str() is the whole thing because basic_const_string<> enables that. If a basic_sstring operation trims a string such that the end still aligns with the end of the basic_const_string, the c_str() is still just a pointer into the initial basic_const_string.

But if the basic_sstring<> operation returns say one character in the middle of the string, c_str() needs to allocate a new string.

The old code tried to be clever an made a new arefc<basic_const_string>, so that if the basic_sstring were assigned etc, the c_str() that had been computed could be reused.

This can still be done but what I had didn't work and I wasn't converging on a solution right away so instead this change reduces the size of a basic_sstring which bothered me also and changes the logic so that either the stored c_str is:

  • nullptr
  • a pointer into the basic_const_string
  • a pointer to a dynamically allocated character array via std::make_unique<char[]>(n)

so the destructor is slightly more complicated but we'll deal with this mess and perhaps get the desired efficiency some other time. The bottom three bits of the m_c_str member variable are unused and could be used without having to bloat basic_sstring albeit at further cost of complexity.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR simplifies the c_str() management in basic_sstring by eliminating the sharing of dynamically allocated c_str values across copied/moved/swapped instances. The implementation reduces the size of basic_sstring by replacing the stored string_view with an offset_and_size struct, and changes c_str storage to use one of three states: nullptr, a pointer into the shared basic_const_string buffer, or a std::unique_ptr-managed allocation.

Key Changes:

  • Replaced m_view member with offset_and_size struct containing offset and size fields
  • Removed m_c_str_v member (arefc_ptr to const_string for shared c_str)
  • Changed c_str() allocation strategy to use std::unique_ptr instead of shared arefc_ptr
  • Updated copy/move/assignment operations to not share dynamically allocated c_str values

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/libraries/sstring/include/m/sstring/sstring.h Outdated
Comment thread src/libraries/sstring/include/m/sstring/sstring.h
Comment thread src/libraries/sstring/include/m/sstring/sstring.h
Comment thread src/libraries/sstring/include/m/sstring/sstring.h
Comment thread src/libraries/sstring/include/m/sstring/sstring.h Outdated
Comment thread src/libraries/sstring/include/m/sstring/sstring.h
Comment thread src/libraries/sstring/include/m/sstring/sstring.h Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@EmJayGee EmJayGee force-pushed the micgrier/really-fix-sstring branch 2 times, most recently from d39677f to 7f25ca6 Compare December 12, 2025 18:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/libraries/sstring/include/m/sstring/sstring.h
Comment on lines +204 to +206
auto t = other.m_c_str.exchange(nullptr, std::memory_order_acq_rel);
auto u = m_c_str.exchange(t, std::memory_order_acq_rel);
other.m_c_str.exchange(u, std::memory_order_acq_rel);
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The move constructor and move assignment operator have unnecessarily complex atomic exchange logic. Lines 204-206 perform three exchanges when a simpler approach would suffice. The code exchanges other.m_c_str with nullptr (getting t), exchanges m_c_str with t (getting u), then exchanges other.m_c_str with u again. A clearer and more efficient implementation would be: auto temp = other.m_c_str.exchange(nullptr, std::memory_order_acq_rel); auto old = m_c_str.exchange(temp, std::memory_order_acq_rel); other.m_c_str.store(old, std::memory_order_release);

Suggested change
auto t = other.m_c_str.exchange(nullptr, std::memory_order_acq_rel);
auto u = m_c_str.exchange(t, std::memory_order_acq_rel);
other.m_c_str.exchange(u, std::memory_order_acq_rel);
auto temp = other.m_c_str.exchange(nullptr, std::memory_order_acq_rel);
auto old = m_c_str.exchange(temp, std::memory_order_acq_rel);
other.m_c_str.store(old, std::memory_order_release);

Copilot uses AI. Check for mistakes.
Comment on lines +673 to +682
if (m_arefc)
{
if (cstr == m_arefc->view().data() + m_offset_and_size.m_offset)
{
//
// This is the case that we *don't* have to do
// anything, so just exit early.
//
return;
}
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a potential memory leak in unmake_c_str(). If m_arefc is null but m_c_str is not null (which can happen when make_c_str(nullptr, 0) is called), the function returns early at line 671 without deallocating the dynamically allocated c_str. The logic should check if cstr needs deallocation even when m_arefc is null. Since make_c_str(nullptr, 0) can allocate a single-character null-terminated string, this needs to be handled.

Suggested change
if (m_arefc)
{
if (cstr == m_arefc->view().data() + m_offset_and_size.m_offset)
{
//
// This is the case that we *don't* have to do
// anything, so just exit early.
//
return;
}
// Only skip deallocation if m_arefc is non-null and cstr points to the internal buffer.
if (m_arefc && cstr == m_arefc->view().data() + m_offset_and_size.m_offset)
{
//
// This is the case that we *don't* have to do
// anything, so just exit early.
//
return;

Copilot uses AI. Check for mistakes.
// may be performing either of these at a time.
//
// The .m_c_str() handles multithreaded construction of its
// cached value as a "convenience". The overal basic_sstring<>
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling error: "overal" should be "overall".

Suggested change
// cached value as a "convenience". The overal basic_sstring<>
// cached value as a "convenience". The overall basic_sstring<>

Copilot uses AI. Check for mistakes.
Comment on lines 200 to +206
using std::swap;
swap(m_arefc, other.m_arefc);
swap(m_view, other.m_view);
swap(m_c_str_v, other.m_c_str_v);
m_c_str.store(nullptr, std::memory_order_release);
swap(m_offset_and_size, other.m_offset_and_size);

auto t = other.m_c_str.exchange(nullptr, std::memory_order_acq_rel);
auto u = m_c_str.exchange(t, std::memory_order_acq_rel);
other.m_c_str.exchange(u, std::memory_order_acq_rel);
Copy link

Copilot AI Dec 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The move assignment operator swaps m_arefc and m_offset_and_size before handling m_c_str, but doesn't clean up the existing m_c_str values first. This creates a complex scenario where cached c_str pointers that were valid for the old arefc/offset combinations are swapped to contexts where they may not be valid. While the atomic exchange logic attempts to swap the c_str values, it would be clearer and more maintainable to call unmake_c_str() on this object before performing the swaps, similar to how the copy assignment operator should work. This would ensure proper cleanup of any dynamically allocated c_str before taking ownership of other's state.

Copilot uses AI. Check for mistakes.
Comment thread src/libraries/sstring/include/m/sstring/sstring.h
@EmJayGee EmJayGee merged commit 8e1ee92 into main Dec 12, 2025
11 checks passed
@EmJayGee EmJayGee deleted the micgrier/really-fix-sstring branch December 12, 2025 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants