Trouble of a reinterpret_cast from a std::string::data() #2346

Closed
usagi opened this issue May 11, 2014 · 2 comments

Comments

@usagi
Contributor

usagi commented May 11, 2014

  • Environments
    • emcc --version: 1.16.0
    • clang --version: 3.3
    • nodejs --version: 0.10.15

On a native PC (clang++ 3.3 on x86_64 GNU/Linux), the example code below prints "3.14159" in all three cases. With Emscripten (1.16.0 with nodejs 0.10.15), however, case 2 prints an unexpected value.

Example:

#include <iostream>
#include <string>

auto main() -> int
{
  // case 1: char[]
  const char a[] = { char(0xdb), char(0x0f), char(0x49), char(0x40) };
  for ( auto n = 0; n < sizeof(float); ++n )
    std::cout << "a[" << n << ":" << std::hex << (void*)(&a[n]) << "] "
              << std::hex << ((+a[ n ])&0xff) << "\n";
  std::cout << "a: " << *reinterpret_cast<const float*>(a) << "\n";

  // case 2: std::string
  std::string s( { char(0xdb), char(0x0f), char(0x49), char(0x40) } );
  for ( auto n = 0; n < sizeof(float); ++n )
    std::cout << "s[" << n << ":" << std::hex << (void*)(&s[n]) << "] "
              << std::hex << ((+s[ n ])&0xff) << "\n";
  std::cout << "s.data(): " << std::hex << (void*)(s.data()) << "\n";
  std::cout << "s: " << *reinterpret_cast<const float*>(s.data()) << "\n";

  // case 3: std::string with std::copy ( same as std::memcpy result )
  float f;
  std::copy( std::begin(s), std::end(s), reinterpret_cast<char*>(&f) );
  std::cout << "f: " << f;
}

The result on the native PC:

a[0:0x4015f4] db
a[1:0x4015f5] f
a[2:0x4015f6] 49
a[3:0x4015f7] 40
a: 3.14159
s[0:0x1bc8028] db
s[1:0x1bc8029] f
s[2:0x1bc802a] 49
s[3:0x1bc802b] 40
s.data(): 0x1bc8028
s: 3.14159
f: 3.14159

The result of Emscripten with nodejs:

(Note the second-to-last line.)

a[0:0x8] db
a[1:0x9] f
a[2:0xa] 49
a[3:0xb] 40
a: 3.14159
s[0:0x3679] db
s[1:0x367a] f
s[2:0x367b] 49
s[3:0x367c] 40
s.data(): 0x3679
s: 589233
f: 3.14159

Is this a bug in Emscripten?

@juj
Collaborator

juj commented May 11, 2014

The issue is with unaligned memory accesses, which JavaScript doesn't support. Try compiling with -s SAFE_HEAP=1 and you will see the program abort on the line

std::cout << "s: " << *reinterpret_cast<const float*>(s.data()) << "\n";

The problem here is that you are reinterpreting the memory address as a pointer to a float, and to dereference that pointer the address must be 32-bit (4-byte) aligned. ARM has the same alignment requirement, so for example that code wouldn't work on Android either. Unaligned memory loads and stores are also undefined behavior under the C and C++ standards, even though the x86 platform still supports them.
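As a rough illustration of the alignment point, the addresses printed in the report can be checked directly. The helper name `is_aligned` is hypothetical (it is not part of Emscripten or the standard library), and the sketch assumes `alignof(float)` is 4, as it is on the platforms discussed here. In the Emscripten run the array starts at 0x8, a multiple of 4, while `s.data()` is 0x3679, which is one byte past a 4-byte boundary. In the native run `s.data()` was 0x1bc8028, which happens to be 4-byte aligned.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the issue): true when `address` is a
// multiple of `alignment`.
constexpr bool is_aligned(std::uintptr_t address, std::size_t alignment)
{
  return address % alignment == 0;
}

// Addresses copied from the Emscripten output in the report.
// Assumption: alignof(float) == 4 on the target.
static_assert(is_aligned(0x8, alignof(float)),
              "case 1: a[] starts on a 4-byte boundary");
static_assert(!is_aligned(0x3679, alignof(float)),
              "case 2: s.data() is one byte past a 4-byte boundary");
```

The same check can be applied at runtime to `reinterpret_cast<std::uintptr_t>(s.data())` before deciding whether a direct typed load is safe.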

To perform a 16bit/32bit/64bit load from an address that you explicitly know to be unaligned (but which Clang cannot know), either use memcpy to copy the bytes to a float, or try decorating the pointer with __attribute__((aligned(1))) to signal that it is unaligned.
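A minimal sketch of the memcpy approach mentioned above. The helper name `load_float` is hypothetical, and the caller is assumed to guarantee that at least `sizeof(float)` bytes are readable at the given address. Because the bytes are copied into a properly aligned local variable, the source pointer may have any alignment.

```cpp
#include <cstring>

// Hypothetical helper (not from the issue): read a float from raw bytes
// at any alignment. Assumes `bytes` points to at least sizeof(float)
// readable bytes stored in the machine's native byte order.
inline float load_float(const char* bytes)
{
  float value;
  std::memcpy(&value, bytes, sizeof value);  // byte-wise copy, no aligned load needed
  return value;
}
```

With the string from case 2, `load_float(s.data())` would take the place of `*reinterpret_cast<const float*>(s.data())`. This is equivalent to what case 3 already does with `std::copy`.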

As a general recommendation, it's a good idea to do all your debug builds with the -s SAFE_HEAP=1 flag to catch the bad memory accesses.

@usagi
Contributor Author

usagi commented May 11, 2014

Understood. I'll close the issue.
@juj Thank you for the careful explanation. 😃

@usagi usagi closed this as completed May 11, 2014