Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Favor bit_cast over reinterpret_cast #1517

Open
ghost opened this issue Sep 18, 2019 · 22 comments
Open

Favor bit_cast over reinterpret_cast #1517

ghost opened this issue Sep 18, 2019 · 22 comments
Assignees

Comments

@ghost
Copy link

ghost commented Sep 18, 2019

Now that std::bit_cast is coming in C++20, the valid use cases for reinterpret_cast are slim to none in the majority of applications. I believe there should be a rule heavily discouraging reinterpret_cast in favor of std::bit_cast or another named cast.

Often times switching from reinterpret_cast to std::bit_cast would be the difference between invoking undefined behavior or not, due to type aliasing rules.

@beinhaerter
Copy link
Contributor

beinhaerter commented Sep 18, 2019

@hsutter
Copy link
Contributor

hsutter commented Oct 1, 2020

Editors call: Even though it's a C++20 features and the Guidelines don't yet support/require C++20, enough people are interested in this that we should say something. Assigning to @GabrielDosReis. Thanks!

@N-Dekker
Copy link
Contributor

Just for my understanding, do I understand correctly that this pull request favors bit_cast over reinterpret_cast in any possible case? Including the cases depicted by three code examples below here?

void f1(std::uintptr_t address) {
  auto ptr1 = reinterpret_cast<void*>(address);
  auto ptr2 = std::bit_cast<void*>(address); // Favorable?
}

void f2(std::istream& input, std::vector<std::byte>& bytes) {
  input.read(reinterpret_cast<char*>(bytes.data()), bytes.size());
  input.read(std::bit_cast<char*>(bytes.data()), bytes.size()); // Favorable?
}

void f3(int i) {
  auto byte_ptr1 = reinterpret_cast<std::byte*>(&i);
  auto byte_ptr2 = std::bit_cast<std::byte*>(&i); // Favorable?
}

I'm very interested to hear if that is indeed going to be the official guideline, with respect to reinterpret_cast versus bit_cast

@jwakely
Copy link
Contributor

jwakely commented Apr 20, 2022

auto ptr2 = std::bit_cast<void*>(address); // Favorable?

This might be il-formed because sizeof(void*) == sizeof(uintptr_t) is not necessarily true, and even if it compiles, it might give a different answer. The reinterpret_cast does an implementation-defined conversion and does not necessarily produce a pointer value that is bitwise identical to the integer input.

input.read(std::bit_cast<char*>(bytes.data()), bytes.size()); // Favorable?

It's not guaranteed that byte* and char* have the same representation, so this might be ill-formed too.

auto byte_ptr2 = std::bit_cast<std::byte*>(&i); // Favorable?

Ditto for int* and byte*.

@N-Dekker
Copy link
Contributor

auto ptr2 = std::bit_cast<void*>(address); // Favorable?

This might be il-formed because sizeof(void*) == sizeof(uintptr_t) is not necessarily true, and even if it compiles, it might give a different answer. The reinterpret_cast does an implementation-defined conversion and does not necessarily produce a pointer value that is bitwise identical to the integer input.

Thanks @jwakely This particular void* example is actually based on an attempt of mine to replace some C-style casts like the one from https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/beginthread-beginthreadex doing:

hThread = (HANDLE)_beginthreadex( NULL, 0, &SecondThreadFunc, NULL, 0, &threadID );

_beginthreadex returns a uintptr_t, and HANDLE is just a type alias of void*. Would you use reinterpret_cast<HANDLE>, rather than std::bit_cast<HANDLE>, in such a case?

input.read(std::bit_cast<char*>(bytes.data()), bytes.size()); // Favorable?

It's not guaranteed that byte* and char* have the same representation, so this might be ill-formed too.

Would it be any better when doing reinterpret_cast<char*>(bytes.data()) instead?

auto byte_ptr2 = std::bit_cast<std::byte*>(&i); // Favorable?

Ditto for int* and byte*.

So again, would it be better to use reinterpret_cast here?

@jwakely
Copy link
Contributor

jwakely commented Apr 20, 2022

Your reinterpret cast examples are ok, I would not replace any of them with std::bit_cast.

@N-Dekker
Copy link
Contributor

@jwakely Thanks again, Jonathan. So do you have a suggestion how the core guideline on reinterpret_cast versus bit_cast should be like? Apparently it isn't simply "Favor bit_cast over reinterpret_cast"!

@MikeGitb
Copy link

MikeGitb commented Apr 20, 2022

Correct me if I'm wrong, but I thought using bit_cast as a 1:1 replacement for reinterpret cast ist almost always wrong.
Afaik the idea is not use bit_cast to cast the pointer as you would with `reinterpret_cast (which is often UB, but usually works as long as alignment is correct), but to cast the value the pointer points to. E.g.

#include <iostream>
#include <bit>

struct Foo {
    std::uint32_t low;
    std::uint32_t high;       
};

int main()
{
    alginas(alignof(std::uint64_t)) Foo f{ 0x1, 0x2};

    //pretty sure this is UB, but seems to be common practice
    auto* p = reinterpret_cast<uint64_t*>(&f);
    std::cout << *p << std::endl;  

    auto v = std::bit_cast<uint64_t>(f);
    std::cout << v << std::endl;
}

@jwakely
Copy link
Contributor

jwakely commented Apr 20, 2022

Right. But that doesn't make much sense for Niels' examples, where he wants to read the char values making up the object representation of some other objects. You could read each byte as a char using bit_cast but that's not very useful when you want to process N bytes at a time.

@MikeGitb
Copy link

Right. But that doesn't make much sense for Niels' examples,

Absolutely. My point is that bit_cast isn't a replacement for reinterpret_cast any more than std::unique_ptr is a replacement for raw pointers. It can be used to avoid a specific subset of usecases for reinterpret_cast. And any Guideline should explain that Otherwise people get confused just like we see here or the people that initially thought they should replace all their raw pointers with either unique or shared pointer.

@MikeGitb
Copy link

On another, but related topic: Is my understanding correct that recent changes to the standard (don't remember the name unfortunately and whether it was c++20, c++23 or a DR) now officially permit to cast a pointer to a range of raw bytes (e.g. received over the network or mmap) to a pointer to a c++ object and dereference this pointer?

@sweemer
Copy link

sweemer commented Apr 21, 2022

@MikeGitb This might be the paper you are thinking of: cplusplus/papers#592

@MikeGitb
Copy link

MikeGitb commented Apr 22, 2022

@sweemer : Thanks, but I don't think that is it.

From crossreading that paper talks about viewing the storage of an Foo object as a sequence of char/unsigned char/std::byte, which did and was supposed to work for ages, but the actual wording in the standard didn't permit it.

What I'm talking aobut is the reverse case: I'm getting a pointer to a char/std::byte buffer that has been filled with bytes that contain a valid representation of a Foo object and I'd like to access that buffer through a Foo* pointer. However, as no actual c++ object has ever been created in that buffer this is UB. While this too works in practice (and has to work in order vor c- code to be compiled as c++ code), as long as alignment requirements are observed, I think there was a much bigger debate around whether that should be officielly blessed by the standard.

@frederick-vs-ja
Copy link
Contributor

frederick-vs-ja commented Apr 25, 2022

What I'm talking aobut is the reverse case: I'm getting a pointer to a char/std::byte buffer that has been filled with bytes that contain a valid representation of a Foo object and I'd like to access that buffer through a Foo* pointer. However, as no actual c++ object has ever been created in that buffer this is UB. While this too works in practice (and has to work in order vor c- code to be compiled as c++ code), as long as alignment requirements are observed, I think there was a much bigger debate around whether that should be officielly blessed by the standard.

Given the Foo (an implicit-lifetime type) shown in your example (#1517 (comment)), there can be a Foo object created within that buffer when the buffer is created, due to implicit object creation (introduced as DR by P0593R6).

@frederick-vs-ja
Copy link
Contributor

On another, but related topic: Is my understanding correct that recent changes to the standard (don't remember the name unfortunately and whether it was c++20, c++23 or a DR) now officially permit to cast a pointer to a range of raw bytes (e.g. received over the network or mmap) to a pointer to a c++ object and dereference this pointer?

I guess this usage sometimes requires std::launder after reinterpret_cast (see P0137R1) currently, although AFAIK no compiler thinks std::launder is in effect in this case.

@MikeGitb
Copy link

Given the Foo (an implicit-lifetime type) shown in your example (#1517 (comment)), there can be a Foo object created within that buffer when the buffer is created, due to implicit object creation (introduced as DR by P0593R6).

Thats the paper I had in mind. Thank you very much.

N-Dekker added a commit to N-Dekker/ITK that referenced this issue Apr 26, 2022
It appears that commit 548b45f ("BUG: Replace
C-style casts from `_beginthreadex` with `bit_cast<HANDLE>`", from pull request
InsightSoftwareConsortium#3380) was wrong:
`bit_cast<HANDLE>` might do a different conversion than the corresponding
C-style cast, `(HANDLE)`.

The C-style cast `(HANDLE)` behaves exactly like `reinterpret_cast<HANDLE>`, by
definition.

`reinterpret_cast<HANDLE>` does an implementation-defined conversion, as was
explained by Jonathan Wakely at isocpp/CppCoreGuidelines#1517 (comment)
(issue "Favor bit_cast over reinterpret_cast").

Fixed by replacing all `bit_cast<HANDLE>` calls with `reinterpret_cast<HANDLE>`.
dzenanz pushed a commit to InsightSoftwareConsortium/ITK that referenced this issue Apr 26, 2022
It appears that commit 548b45f ("BUG: Replace
C-style casts from `_beginthreadex` with `bit_cast<HANDLE>`", from pull request
#3380) was wrong:
`bit_cast<HANDLE>` might do a different conversion than the corresponding
C-style cast, `(HANDLE)`.

The C-style cast `(HANDLE)` behaves exactly like `reinterpret_cast<HANDLE>`, by
definition.

`reinterpret_cast<HANDLE>` does an implementation-defined conversion, as was
explained by Jonathan Wakely at isocpp/CppCoreGuidelines#1517 (comment)
(issue "Favor bit_cast over reinterpret_cast").

Fixed by replacing all `bit_cast<HANDLE>` calls with `reinterpret_cast<HANDLE>`.
@N-Dekker
Copy link
Contributor

N-Dekker commented May 1, 2022

@jwakely Can you please confirm that the line of code that does a reinterpret_cast in the following example has undefined behavior?

void f( uint32_t u ) {
  int i1 = *reinterpret_cast< int32_t * >(&u); // Undefined behavior, right?
  int i2 = std::bit_cast< int32_t>(u); // Proper bit_cast use case?
}

Assuming that a bitwise conversion is intended, I guess that would be a proper use case for bit_cast. Right? I just proposed this to the ITK library, pull request InsightSoftwareConsortium/ITK#3402

@jwakely
Copy link
Contributor

jwakely commented May 1, 2022

The reinterpret_cast itself is ok, but dereferencing the result of the cast is undefined.

The bit_cast version is ok, but redundant. For two's complement integers the implicit conversion from uint32_t to int32_t preserves the original bit pattern, and C++ now requires two's complement integers (and all known implementations already use them anyway).

@N-Dekker
Copy link
Contributor

N-Dekker commented May 1, 2022

For two's complement integers the implicit conversion from uint32_t to int32_t preserves the original bit pattern, and C++ now requires two's complement integers

@jwakely Ah, thanks Jonathan. I see now that with p0907R2 - Signed Integers are Two’s Complement, in case of out of range, the result is no longer implementation-defined. That's also new to me.

So in this particular case (from uint32_t to int32_t), one might say "Favor bit_cast over reinterpret_cast", but then both implicit conversion and static_cast<int32_t> would still be more favorable than bit_cast. 🤔

With respect to my ITK pull request , I'm not entirely sure if all compilers supported by ITK already implement integer conversion exactly like that (without any disagreeing implementation-defined behavior), as ITK still supports C++14 compilation by GCC 5.1, LLVM Clang 3.4, and VS2017 (v141). So I think ITK might still use its own hand-written bit_cast replica for a while. But of course, that's beyond this CppCoreGuidelines issue.

@jwakely
Copy link
Contributor

jwakely commented May 2, 2022

GCC, clang and msvc always implemented that rule on all supported architectures. That's why changing the standard was completely uncontroversial. It used to be implementation defined to allow for other representations, but no C++ compiler ever used anything except two's complement.

@frederick-vs-ja
Copy link
Contributor

frederick-vs-ja commented May 2, 2022

@N-Dekker I think *reinterpret_cast< int32_t * >(&u) is well-defined according to [basic.lval]/(11.2):

a type that is the signed or unsigned type corresponding to the dynamic type of the object, or

But I also suggest you to use static_cast in this case.

@leimao
Copy link

leimao commented Dec 29, 2022

@N-Dekker @jwakely I agree with @frederick-vs-ja that the int32_t i1 = *reinterpret_cast< int32_t * >(&u); (I changed the int to int32_t to eliminate the assignment undefined behavior (caused by integer overflow) which is irrelevant to this topic.) behavior should be well-defined. However, if it were float i1 = *reinterpret_cast< float * >(&u);, then the behavior is undefined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants