fastcomp: extractvalue generates unaligned loads #3070
Comments
cc: @chadaustin
waywardmonkeys
Dec 10, 2014
Contributor
The code that emits this stuff in the first place for a member function pointer load is here:
waywardmonkeys
Dec 10, 2014
Contributor
In the IMVU codebase, this change to use aligned loads appears to result in an 80k size reduction in the minified build and nearly 200k in the -g3 debug build.
kripken
Dec 11, 2014
Owner
I don't think we can just assume full alignment, but taking into account the struct's alignment plus the field's alignment inside it should give the right result. I don't remember how to do that in llvm, but there is a method that does this, I am fairly sure. Likely @sunfishcode would know off the top of his head.
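As a rough illustration of the "struct's alignment plus the field's offset" idea: the alignment that can be proven for a field is the largest power of two dividing both the alignment of the whole load and the field's byte offset. The names `minAlign` and `fieldAlignment` below are hypothetical and are not taken from the pass; LLVM's `MathExtras.h` has, I believe, a `MinAlign` helper along the same lines.

```javascript
// Largest power of two that divides both values.
// Assumes `a` is a power-of-two alignment and `b` is a non-negative byte offset.
function minAlign(a, b) {
  const bits = a | b;
  return bits & -bits; // isolate the lowest set bit
}

// Hypothetical helper: alignment provable for a field located
// `fieldOffset` bytes into a struct that was loaded with `structAlign`.
function fieldAlignment(structAlign, fieldOffset) {
  return minAlign(structAlign, fieldOffset);
}

// A member function pointer is a { i32, i32 } pair. If the pair is loaded
// with align 4, both fields (offsets 0 and 4) keep alignment 4.
const ptrFieldAlign = fieldAlignment(4, 0);
const adjFieldAlign = fieldAlignment(4, 4);
```

With an offset of 6 and the same base alignment, the sketch yields 2, so only a 2-byte-aligned access could be assumed for such a field.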
chadaustin
Dec 11, 2014
Collaborator
Isn't there an option to assume full alignment? The old compiler had an option like that, iirc. It's undefined behavior in C to rely on unaligned reads so, at least for our use case, I would prefer ALL memory operations have full alignment.
kripken
Dec 11, 2014
Owner
There isn't currently an option to force full alignment in fastcomp, but it sounds ok to me to add one.
sunfishcode
Dec 11, 2014
Collaborator
Offhand, I don't think Emscripten on asmjs-unknown-emscripten ever needs to discard alignment information, in contrast to PNaCl, which does. I think we can do "full alignment" by default.
kripken
Dec 11, 2014
Owner
Yes, we don't need to discard alignment information, as we can depend on JS portability, unlike PNaCl. We should preserve alignment information here, as mentioned above.
However, I don't think we can do full alignment by default. While undefined behavior, it would break many real-world apps, e.g. Cube 2/BananaBread. But again, adding an option for full alignment sounds fine.
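To make the trade-off concrete, here is a minimal sketch of the two load shapes in asm.js-style code. The heap setup and the function names `loadAligned` and `loadUnaligned` are made up for illustration, and the exact code fastcomp emits for an alignment-1 load may differ. The point is that the byte-wise form is much larger, while the aligned form silently reads the wrong bytes when the pointer is not actually a multiple of 4.

```javascript
// A tiny stand-in for the asm.js heap: two typed-array views over one buffer.
const buffer = new ArrayBuffer(16);
const HEAPU8 = new Uint8Array(buffer);
const HEAP32 = new Int32Array(buffer);

// Store the little-endian i32 0x11223344 at the unaligned address 1.
HEAPU8.set([0x44, 0x33, 0x22, 0x11], 1);

// "Full alignment" load: the shift throws away the low two address bits,
// so address 1 is treated as address 0.
function loadAligned(ptr) {
  return HEAP32[ptr >> 2] | 0;
}

// Alignment-1 load: assemble the value one byte at a time.
function loadUnaligned(ptr) {
  return HEAPU8[ptr] | (HEAPU8[ptr + 1] << 8) |
         (HEAPU8[ptr + 2] << 16) | (HEAPU8[ptr + 3] << 24);
}

const viaBytes = loadUnaligned(1); // reads the four bytes stored above
const viaShift = loadAligned(1);   // reads bytes 0..3 instead
```

Here `viaBytes` recovers 0x11223344 while `viaShift` does not, which is the kind of breakage a program that relies on unaligned reads would hit if full alignment were forced.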
Can we talk about this again?
kripken
Feb 27, 2015
Owner
Sure. Is there new data or other info? Re-reading back here, I would be interested to see whether Cube 2 does in fact break with full alignment. I'm 99% sure it would, but maybe I am missing something. That could easily change my position here.
kripken
Mar 20, 2015
Owner
The pnacl pass code looks fixed on the merge-pnacl-mar-13-2015 branches, so we might see this fixed next week or so if those merge to incoming.
waywardmonkeys
Apr 13, 2015
Contributor
Oddly, while the LLVM IR still has some extractvalues without alignment, it seems like the generated JS is much better:
function __ZN10emscripten8internal13MethodInvokerIMN9northstar6CanvasEFvRKN4gmtl3VecIiLj2EEEEvPS3_JS8_EE6invokeERKSA_SB_PS6_(i3, i1, i2) {
i3 = i3 | 0;
i1 = i1 | 0;
i2 = i2 | 0;
var i4 = 0;
i4 = HEAP32[i3 >> 2] | 0;
i3 = HEAP32[i3 + 4 >> 2] | 0;
if (i3 & 1) i4 = HEAP32[(HEAP32[i1 + (i3 >> 1) >> 2] | 0) + i4 >> 2] | 0;
FUNCTION_TABLE_vii[i4 & 1023](i1 + (i3 >> 1) | 0, i2);
return;
}

Closing.
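For anyone reading the generated function above, this is a hand-written, readable sketch of what it appears to do. It looks like the ARM-style variant of the Itanium C++ ABI member function pointer layout, a `{ ptr, adj }` pair where the low bit of `adj` marks a virtual call, but that reading is an assumption. The heap, the table, and every name below are hypothetical stand-ins.

```javascript
// Stand-in heap and function table; all names here are hypothetical.
const HEAP32 = new Int32Array(64);
const calls = [];
const FUNCTION_TABLE = [
  (self, arg) => calls.push(["f0", self, arg]),
  (self, arg) => calls.push(["f1", self, arg]),
];

function invokeMemberPointer(memptr, self, arg) {
  // Both fields are read with plain aligned HEAP32 loads.
  let ptr = HEAP32[memptr >> 2];          // table index, or vtable offset if virtual
  const adj = HEAP32[(memptr + 4) >> 2];  // (this-adjustment << 1) | isVirtual
  const adjustedThis = self + (adj >> 1);
  if (adj & 1) {
    // Virtual call: look the function up through the object's vtable.
    const vtable = HEAP32[adjustedThis >> 2];
    ptr = HEAP32[(vtable + ptr) >> 2];
  }
  // Mask the index to the table size, as the generated code does with 1023.
  FUNCTION_TABLE[ptr & (FUNCTION_TABLE.length - 1)](adjustedThis, arg);
}

// Non-virtual member pointer stored at address 0: table index 1, no adjustment.
HEAP32[0] = 1;
HEAP32[1] = 0;
invokeMemberPointer(0, 32, 7);
```

The two loads of `ptr` and `adj` correspond to the `$$field` and `$$field2` reads discussed in the issue; in this shape each is a single `HEAP32` access rather than a byte-wise sequence.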
waywardmonkeys commented Dec 10, 2014
Here's a pattern that pops up in code from embind:
We can see that $$field and $$field2 are unaligned reads.
That's from this C++:
And this is the LLVM IR involved:
Now, the old (non-fastcomp) compiler pretended everything in an extractvalue was aligned:
However, in fastcomp, there is the ExpandStructRegs pass in lib/Transforms/NaCl/ExpandStructRegs.cpp and it throws away the alignment in ProcessLoadOrStoreAttrs:
All of that said ... if I try to preserve the alignment by modifying the above to be:
Then it ends up generating this JS for the $$field and $$field2 loads: