clang assumes zero-extension of 8-bit arguments in x86, causing interop issues with gcc #43573

Open
emilio opened this issue Dec 5, 2019 · 10 comments
Labels
ABI Application Binary Interface bugzilla Issues migrated from bugzilla clang:codegen

Comments

@emilio
Contributor

emilio commented Dec 5, 2019

Bugzilla Link 44228
Version trunk
OS Linux
Attachments test-case
CC @topperc,@davezarzycki,@froydnj,@jrmuizel,@josephcsible,@jdm,@jyknight,@RKSimon,@zygoloid,@rjmccall,@tstellar

Extended Description

When receiving 8-bit-wide arguments in an extern function, clang seems to assume the argument has been zero-extended by the caller.

According to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92821#c2:

I believe it is a LLVM bug.
At least, reading https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf, I can't find in the Parameter Passing section anything that would say that arguments smaller than 64-bit are passed sign or zero extended to 64-bit like some other psABIs require. The only related thing is
"When a value of type _Bool is returned or passed in a register or on the stack, bit 0 contains the truth value and bits 1 to 7 shall be zero."
with a footnote:
"Other bits are left unspecified, hence the consumer side of those values can rely on it being 0 or 1 when truncated to 8 bit."
which says that _Bool has only significant low 8 bits and the rest is unspecified.

https://godbolt.org/z/BNHxEY has a comparison of clang and gcc output for the attached test-case. GCC correctly does an 8-bit load, disregarding the rest of the bits in the register.

This causes real problems when gcc-built functions call into llvm-built functions. See https://bugzilla.mozilla.org/show_bug.cgi?id=1600735 for an example that happens on Firefox. GCC may not always sign-extend in the caller.

In the Firefox case the LLVM-built function is Rust code, but per the above godbolt link it also seems to reproduce with C / C++.
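
A minimal sketch of the kind of function this affects, assuming a hypothetical `widen_u8`; it is not the attached test-case. At the source level the result is always in [0, 255]. The ABI question is only whether the generated code zero-extends the low byte itself or trusts the upper bits of the incoming register.

```cpp
#include <cstdint>

// Hypothetical stand-in for the attached test-case (not the original file).
// An extern function that receives an 8-bit argument and widens it.
extern "C" std::uint32_t widen_u8(std::uint8_t value) {
    // In C++ terms `value` can only hold 0..255. A callee that assumes the
    // caller already zero-extended may simply copy the whole 32-bit
    // register; a callee that makes no such assumption has to zero-extend
    // the low 8 bits itself before using the value.
    return value;
}
```

Comparing the assembly for a function like this across compilers (as in the godbolt link above) is one way to see which assumption a given callee makes.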

@emilio
Contributor Author

emilio commented Dec 5, 2019

I believe this may be a dupe / related to bug 12207, though it seems like a somewhat serious issue.

@emilio
Contributor Author

emilio commented Dec 9, 2019

I sent a tentative patch for this in https://reviews.llvm.org/D71178.

I think it'd do the right thing, but I still have 10 or so tests to update.

@jyknight
Member

jyknight commented Mar 17, 2020

Ugh....

LLVM has been using the caller-extends-to-32bit ABI for...just about ever. And, except perhaps for this particular bug, GCC also has been generating calls compliant with that. And per https://gcc.gnu.org/PR46942 wanted to depend on it as well -- only didn't due to (at the time) difficulties in codegen.

There also appeared to be overall a desire to change the ABI document to state that....but for some reason never happened (with no further followup AFAIK), which is really unfortunate.

So, I don't agree this is an LLVM bug, at least not without reopening the discussion on the x86-64 ABI list and coming to some final conclusion on how to update the document to clarify matters.

I think:

  1. The ABI document should be actually modified to record that caller-extends is the ABI. (Because that is de-facto the ABI already.)

  2. Any odd corner-case where the value is not extended by callers should be fixed. Sounds like the one known case in GCC has indeed already been fixed, but it's not clear if that was intentional.

  3. If the outcome of a discussion on the ABI list is that, despite historical precedent, no zero/sign extension to 32 bits is required on the caller side, LLVM should still continue to extend there, for compatibility with older versions of itself.

@rjmccall
Contributor

That all seems reasonable, thank you.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
@LuoYuanke
Contributor

Since the ABI document doesn't define the behavior, we can't assume that compilers other than Clang sign/zero-extend parameters in the caller. We should be compatible with other compilers as much as possible, so I think we should remove the reliance on sign/zero extension in the callee. Removing that assumption (that the caller has sign/zero-extended) in the callee would keep Clang compatible with previous Clang versions, GCC, and ICC. I don't see any side effects from doing this.

@jyknight
Member

And yet, de-facto, other compilers DO sign/zero extend in caller, and therefore DO interop with Clang. It seems silly to require extension on both sides of the call (though, of course, GCC has been doing that).

@LuoYuanke
Contributor

However, ICC doesn't sign/zero-extend in the caller. See https://gcc.godbolt.org/z/8aaPP33cG. So if the caller is built by ICC and the callee is built by Clang, the issue occurs. We can't assume all compilers sign/zero-extend in the caller.

@jyknight
Member

Ah, yes; that's an unfortunate new piece of information here. :(

@emilio
Contributor Author

emilio commented Mar 31, 2022

Also, GCC doesn't guarantee the sign extension.

It usually does, due to integer promotion rules, but it might not if you're passing e.g. an enum class or so. The Mozilla bug referenced at the top of the issue was caused by that, and it was GCC code calling into Rust (LLVM) code.
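
To illustrate the language-level difference only (this is a sketch, not a claim about GCC's exact codegen): arithmetic on a `uint8_t` is carried out in `int`, so such values tend to exist in widened form already, whereas a scoped enum with an 8-bit underlying type never undergoes integer promotion. `Flag` and `flag_to_int` are hypothetical names.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical scoped enum with an 8-bit underlying type.
enum class Flag : std::uint8_t { Off = 0, On = 1 };

// Arithmetic on uint8_t operands is performed in int (integer promotion).
static_assert(
    std::is_same<decltype(std::uint8_t{1} + std::uint8_t{1}), int>::value,
    "uint8_t operands are promoted to int");

// A scoped enum is never implicitly converted (or promoted) to int.
static_assert(!std::is_convertible<Flag, int>::value,
              "enum class has no implicit conversion to int");

// It occupies a single byte, like its underlying type.
static_assert(sizeof(Flag) == 1, "Flag is 8 bits wide");

// Widening a scoped enum has to be requested explicitly.
constexpr int flag_to_int(Flag f) { return static_cast<int>(f); }
```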

@jyknight
Member

Since it's not clear on the bug trail (discussions happened elsewhere) --

The consensus from discussions on the ABI list and elsewhere is that Clang is indeed incorrect per the x86 psABI specification. I think that's unfortunate (and it also would've been nice if the spec had been more clearly written from the get-go) but be that as it may: the consensus is that the x86 psABI specifies that no zero/sign-extension is required for callers passing 8/16-bit values, so this is a Clang bug that needs to be fixed.

The only issue for Clang is that it should fix this without breaking compatibility with old versions of itself. Thus, it must stop assuming values have been extended in the callee, but continue to (possibly-redundantly) extend in the caller.
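
The compatibility argument can be sketched as a small model, assuming a 32-bit register carrying an unsigned 8-bit argument. All names here (`Caller`, `Callee`, `pass`, `receive`, `interoperates`, `kStaleUpperBits`) are hypothetical and only model the behaviors discussed in this thread; this is not compiler code.

```cpp
#include <cstdint>

// What a caller does with the upper 24 bits of the register.
enum class Caller { Extends, LeavesUpperBits };
// What a callee assumes about those bits.
enum class Callee { AssumesExtended, ReadsLowByte };

// Arbitrary stale register contents, chosen only for illustration.
constexpr std::uint32_t kStaleUpperBits = 0xABCDEF00u;

// Register value the callee sees for an 8-bit argument `v`.
constexpr std::uint32_t pass(Caller c, std::uint8_t v) {
    return c == Caller::Extends ? std::uint32_t{v} : (kStaleUpperBits | v);
}

// Value the callee ends up computing with.
constexpr std::uint32_t receive(Callee c, std::uint32_t reg) {
    return c == Callee::AssumesExtended ? reg : static_cast<std::uint8_t>(reg);
}

// True when the callee observes the value the caller meant to pass.
constexpr bool interoperates(Caller a, Callee b, std::uint8_t v) {
    return receive(b, pass(a, v)) == v;
}

// Extending caller works with either kind of callee.
static_assert(interoperates(Caller::Extends, Callee::AssumesExtended, 200), "");
static_assert(interoperates(Caller::Extends, Callee::ReadsLowByte, 200), "");
// Non-extending caller only works with a callee that reads the low byte.
static_assert(interoperates(Caller::LeavesUpperBits, Callee::ReadsLowByte, 200), "");
static_assert(!interoperates(Caller::LeavesUpperBits, Callee::AssumesExtended, 200), "");
```

In this model, a compiler that extends in the caller and reads only the low byte in the callee interoperates with every other combination, which is the behavior proposed for Clang above.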
