-
Notifications
You must be signed in to change notification settings - Fork 14.9k
Description
When using libcxx and clang 14, we observe a miscompilation of the following simple std::string
code:
std::string r;
...
r = std::string("Hello, ") + "world";
We expect string r to be equal to "Hello, world" but we end up with "H" followed by 11 garbage characters, depending on the earlier contents of r. This is a godbolt demo. The only non-trivial factor to trigger the wrong behavior, is the use of the new TBAA struct path format (using clang option -new-struct-path-tbaa
).
Analysis
After appending to the left-hand side string, the statement triggers a move constructor and move assignment to get the result in place. Each move consists of a copy of the string representation, followed by a zero operation that clears the representation of the stolen string. It seems these memory access operations do not have proper TBAA information, leading to incorrect memory optimization.
TBAA tags
In the LLVM IR for the move assignment operator and move constructor, we can find back the TBAA tags for the memory operations. These tags should be compatible as they may refer to overlapping memory access. The internal representation of a basic_string is the following structure:
union __ulx{__long __lx; __short __lxx;};
enum {__n_words = sizeof(__ulx) / sizeof(size_type)};
struct __raw
{
size_type __words[__n_words];
};
struct __rep
{
union
{
__long __l;
__short __s;
__raw __r;
};
};
__compressed_pair<__rep, allocator_type> __r_;
Type-punning using a union violates strict aliasing, but is supported by e.g. GCC provided the memory is accessed through the union type. Also in clang, access using a union is normally given the tag of "omnipotent char".
The move assignment operator is doing (in __move_assign
):
__r_.first() = __str.__r_.first();
__str.__set_short_size(0);
The assignment at the level of __rep
gives a memcpy
intrinsic with TBAA base and access tag of __rep
, which has a member (the anonymous union) of omnipotent char. __set_short_size
uses the access expression __r_.first().__s.__size_
, the union access has base and access tag of omnipotent char.
The move constructor is doing:
: __r_(_VSTD::move(__str.__r_))
{
__str.__zero();
The move at the level of the __compressed_pair
gives a memcpy
intrinsic with TBAA base and access tag of the __compressed_pair
, but this structure has no members in the TBAA info, this seems incomplete:
!8 = !{!9, !9, i64 0, i64 24}
!9 = !{!6, i64 24, !"_ZTSNSt3__117__compressed_pairINS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE5__repES5_EE"}
The function __zero
is doing:
void __zero() _NOEXCEPT
{
size_type (&__a)[__n_words] = __r_.first().__r.__words;
for (unsigned __i = 0; __i < __n_words; ++__i)
__a[__i] = 0;
}
The store of the zero value in the loop body is given TBAA base and access tag of long
, this is clearly wrong. However, function __zero
violates the constraint "memory is accessed through the union type" from the GCC manual (cited above). The use of the intermediate reference variable a
is equivalent to the counter-example (using an intermediate pointer variable int* ip
) given there. If the source code is changed to remove the use of the reference variable, the base and access tag is omnipotent char, as expected.
Workaround
Changing the code of __move_assign
do the assignment at the level of __raw
:
__r_.first().__r = __str.__r_.first().__r;
Gives a memcpy with TBAA tag of omnipotent char (since we are accessing a union member) and prevents the incorrect memory optimization. Presumably because this forces to alias with the __compressed_pair
(but as noted above, that tag seems incomplete).
Credits
Bhramar Bhushan Vatsa, Wouter Vermaelen and Jeroen Dobbelaere contributed to this bug report.