Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong cast of u16 to usize on aarch64 #97463

Closed
fparat opened this issue May 27, 2022 · 12 comments · Fixed by #97800
Closed

Wrong cast of u16 to usize on aarch64 #97463

fparat opened this issue May 27, 2022 · 12 comments · Fixed by #97800
Assignees
Labels
A-abi Area: Concerning the application binary interface (ABI). A-codegen Area: Code generation C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-AArch64 Armv8-A or later processors in AArch64 mode P-critical Critical priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@fparat
Copy link

fparat commented May 27, 2022

Hello,

A cast of a u16 value returned from a C function to a usize is invalid when enabling optimization.

I tried this code (with cargo flag --release):

#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>

struct bloc {
    uint16_t a;
    uint16_t b;
    uint16_t c;
};

uint16_t c_read_value(void) {
    struct bloc *data = malloc(sizeof(struct bloc));
    data->a = rand() & 0xFFFF;
    data->b = rand() & 0xFFFF;
    data->c = rand() & 0xFFFF;

    printf("C struct: a = %u, b = %u, c = %u\n",
        (unsigned) data->a, (unsigned) data->b, (unsigned) data->c);
    printf("C function returns %u\n", (unsigned) data->b);

    return data->b; /* leak data */
}
//#![feature(bench_black_box)]

#[link(name = "bad")]
extern "C" {
    pub fn c_read_value() -> u16;
}

fn main() {
    let value = unsafe { c_read_value() };
    dbg!(value); // ok
    dbg!(value as usize); // nok
    dbg!(usize::from(value)); // nok
    dbg!((value as usize) & 0xFFFF); // nok

    //dbg!(std::hint::black_box(value) as usize); // ok
}

I expected to print the same value when casting to usize.

Instead, the value is wrong, the cast seems to "absorb" the neighboring bytes. Worse, I cannot even mask with & 0xFFFF!

C struct: a = 17767, b = 9158, c = 39017
C function returns 9158
[src/main.rs:10] value = 9158
[src/main.rs:11] value as usize = 846930886
[src/main.rs:12] usize::from(value) = 846930886
[src/main.rs:13] (value as usize) & 0xFFFF = 846930886

Using black_box with nightly on the value fixes the issue. I'd like to find a good workaround for stable though.

No issue in debug mode.

To reproduce, this repo can be used: https://github.com/fparat/badcast

Meta

Bug present on stable and nightly.
Found on Linux aarch64 (Jetson Nano). No issue found on amd64 or armv7.

rustc +nightly --version --verbose:

rustc 1.63.0-nightly (490324f7b 2022-05-26)
binary: rustc
commit-hash: 490324f7b29d5f1a1e063a563045d790091ed639
commit-date: 2022-05-26
host: aarch64-unknown-linux-gnu
release: 1.63.0-nightly
LLVM version: 14.0.4

Same issue with both gcc and clang for the C code

clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0

Backtrace: N.A.

@fparat fparat added the C-bug Category: This is a bug. label May 27, 2022
@fparat
Copy link
Author

fparat commented May 27, 2022

I'd like to find a good workaround for stable though.

Criterion has stable-compatible implementation of black_box: https://docs.rs/criterion/0.3.5/src/criterion/lib.rs.html#172-178

@HeroicKatora
Copy link
Contributor

HeroicKatora commented May 28, 2022

This appears to be an ABI issue for u16 return values. (More investigation below). The LLVM IR in Rust expects a i16 zeroextended to the architecture width, but is returned a 16-bit value with unspecified additional bits. Of course it then removes any instruction masking bits it assumes to be 0 already.

  %value = tail call zeroext i16 @c_read_value(), !dbg !10

This is probably incorrect. EABI defines returning in terms of calling fn (T) and says

When an argument is assigned to a register, any unused bits in the register have unspecified value.

I can't find anything specific to masking/zero-extending for return values though.


The assembly of the original C-code does not zero-extend but instead expects the caller to mask bits. As produced by

  • ARM64 gcc 7.5
  • -O3

I've added stars for annotating the instruction sequence delivering the result. Note that there's no bitmasking in that sequence, this is only performed on w20. Since all printing on the C side use w20 this goes unnoticed on the printf doesn't indicate any incorrect value.

        bl      rand
        mov     w21, w0
*       bl      rand
        and     w20, w0, 65535
*       mov     w19, w0
        bl      rand
        and     w3, w0, 65535
        mov     w2, w20
        and     w1, w21, 65535
        adrp    x0, .LC0
        add     x0, x0, :lo12:.LC0
        bl      printf
        mov     w1, w20
        adrp    x0, .LC1
        add     x0, x0, :lo12:.LC1
        bl      printf
*       mov     w0, w19

On the Rust side the w0 result is never modified by any instruction as per llvm qualifier.

  • rustc: 1.61
  • -Copt-level=3 --edition 2021 --target=aarch64-unknown-linux-gnu
        bl      c_read_value
        adrp    x22, .L__unnamed_1
        adrp    x20, <&T as core::fmt::Display>::fmt
        mov     x8, sp
        add     x22, x22, :lo12:.L__unnamed_1
        add     x20, x20, :lo12:<&T as core::fmt::Display>::fmt
        adrp    x23, :got:_ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$u32$GT$3fmt17h78bd006cfbffcd0fE
        strh    w0, [sp]

@HeroicKatora
Copy link
Contributor

HeroicKatora commented May 30, 2022

cc: @jonas-schievink the bot seems to not be assigning anyone, could you verify the tags: A-codegen, I-unsound?

@jonas-schievink jonas-schievink added A-codegen Area: Code generation I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-AArch64 Armv8-A or later processors in AArch64 mode labels May 30, 2022
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label May 30, 2022
@apiraino
Copy link
Contributor

apiraino commented May 31, 2022

Assigning priority as discussed in the Zulip thread of the Prioritization Working Group.

@rustbot label -I-prioritize +P-high +T-compiler

@rustbot rustbot added P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels May 31, 2022
@Mark-Simulacrum Mark-Simulacrum added P-critical Critical priority and removed P-high High priority labels May 31, 2022
@Mark-Simulacrum
Copy link
Member

Raising priority as this affects a tier 1 target and seems relatively easy to trigger.

@nbdd0121
Copy link
Contributor

This behaviour is introduced in #32732, so regression happens in Rust 1.9.0. It's a bit horrifying to know that this miscompilation has been existing since 2016.

Removing ret.extend_integer_width_to(32) and arg.extend_integer_width_to(32) in rustc_target/src/abi/call/aarch64.rs should be sufficient, however I don't have an AArch64 dev machine for testing.

@pnkfelix pnkfelix self-assigned this Jun 2, 2022
@pnkfelix
Copy link
Member

pnkfelix commented Jun 2, 2022

self assigning: if I can reproduce this, then I'll try out the fix suggested by @nbdd0121

@pnkfelix
Copy link
Member

pnkfelix commented Jun 2, 2022

weird, this does not reproduce for me on a Mac M1...

@nbdd0121
Copy link
Contributor

nbdd0121 commented Jun 2, 2022

Apple uses a non-standard calling convention, see here.

@pnkfelix
Copy link
Member

pnkfelix commented Jun 2, 2022

Okay now I have reproduced atop aarch64-unknown-linux-gnu.

I'll work on making a patch now.

@pnkfelix
Copy link
Member

pnkfelix commented Jun 2, 2022

It's a bit horrifying to know that this miscompilation has been existing since 2016.

This might be explained by how "little" our test suite tries to exercise Rust/C interop:

line/word/byte counts for all the *.c files under `src/test`

% find src/test/ -name '*.c' | xargs wc

lines words bytes file
417 964 7536 src/test/auxiliary/rust_test_helpers.c
11 20 268 src/test/run-make/raw-dylib-stdcall-ordinal/exporter.c
16 30 369 src/test/run-make/raw-dylib-c/extern_1.c
6 9 114 src/test/run-make/raw-dylib-c/extern_2.c
123 422 3533 src/test/run-make/raw-dylib-alt-calling-convention/extern.c
18 57 494 src/test/run-make/x86_64-fortanix-unknown-sgx-lvi/enclave/foo.c
26 76 704 src/test/run-make/x86_64-fortanix-unknown-sgx-lvi/enclave/libcmake_foo/src/foo.c
5 7 84 src/test/run-make/raw-dylib-link-ordinal/exporter.c
1 6 24 src/test/run-make-fulldeps/c-static-rlib/cfoo.c
66 239 1207 src/test/run-make-fulldeps/arguments-non-c-like-enum/test.c
12 36 287 src/test/run-make-fulldeps/cross-lang-lto-pgo-smoketest/cmain.c
9 15 146 src/test/run-make-fulldeps/cross-lang-lto-pgo-smoketest/clib.c
6 9 63 src/test/run-make-fulldeps/sanitizer-staticlib-link/program.c
26 59 439 src/test/run-make-fulldeps/extern-fn-with-packed-struct/test.c
5 8 44 src/test/run-make-fulldeps/no-duplicate-libs/bar.c
1 3 14 src/test/run-make-fulldeps/no-duplicate-libs/foo.c
7 13 98 src/test/run-make-fulldeps/issue-14500/foo.c
6 9 53 src/test/run-make-fulldeps/lto-smoke-c/bar.c
6 9 53 src/test/run-make-fulldeps/issue-24445/foo.c
16 32 237 src/test/run-make-fulldeps/extern-fn-with-extern-types/ctest.c
12 31 263 src/test/run-make-fulldeps/c-unwind-abi-catch-lib-panic/add.c
85 243 1260 src/test/run-make-fulldeps/pass-non-c-like-enum-to-c/test.c
12 34 281 src/test/run-make-fulldeps/cross-lang-lto-clang/cmain.c
9 15 146 src/test/run-make-fulldeps/cross-lang-lto-clang/clib.c
10 22 148 src/test/run-make-fulldeps/cdylib/foo.c
5 8 48 src/test/run-make-fulldeps/issue-28595/b.c
1 3 16 src/test/run-make-fulldeps/issue-28595/a.c
1 6 24 src/test/run-make-fulldeps/pointer-auth-link-with-c/test.c
8 14 103 src/test/run-make-fulldeps/extern-fn-mangle/test.c
63 174 1049 src/test/run-make-fulldeps/return-non-c-like-enum/test.c
1 6 38 src/test/run-make-fulldeps/link-path-order/wrong.c
1 6 38 src/test/run-make-fulldeps/link-path-order/correct.c
6 9 76 src/test/run-make-fulldeps/issue-15460/foo.c
11 24 137 src/test/run-make-fulldeps/static-extern-type/define-foo.c
3 6 26 src/test/run-make-fulldeps/lto-no-link-whole-rlib/bar.c
3 6 26 src/test/run-make-fulldeps/lto-no-link-whole-rlib/foo.c
4 10 67 src/test/run-make-fulldeps/c-dynamic-rlib/cfoo.c
3 7 35 src/test/run-make-fulldeps/interdependent-c-libraries/bar.c
1 3 14 src/test/run-make-fulldeps/interdependent-c-libraries/foo.c
50 191 1595 src/test/run-make-fulldeps/c-link-to-rust-va-list-fn/test.c
16 31 250 src/test/run-make-fulldeps/extern-fn-generic/test.c
4 10 67 src/test/run-make-fulldeps/c-dynamic-dylib/cfoo.c
12 31 263 src/test/run-make-fulldeps/c-unwind-abi-catch-panic/add.c
1 6 24 src/test/run-make-fulldeps/c-static-dylib/cfoo.c
6 9 67 src/test/run-make-fulldeps/glibc-staticlib-args/program.c
1 3 14 src/test/run-make-fulldeps/archive-duplicate-names/bar.c
1 3 14 src/test/run-make-fulldeps/archive-duplicate-names/foo.c
1 3 22 src/test/run-make-fulldeps/static-nobundle/aaa.c
314 1284 7587 src/test/run-make-fulldeps/extern-fn-struct-passing-abi/test.c
1 3 14 src/test/run-make-fulldeps/redundant-libs/bar.c
2 6 30 src/test/run-make-fulldeps/redundant-libs/foo.c
7 12 76 src/test/run-make-fulldeps/redundant-libs/baz.c
6 9 53 src/test/run-make-fulldeps/c-link-to-rust-staticlib/bar.c
6 9 53 src/test/run-make-fulldeps/issue-68794-textrel-on-minimal-lib/bar.c
10 19 156 src/test/run-make-fulldeps/extern-fn-with-union/ctest.c
61 147 781 src/test/run-make-fulldeps/return-non-c-like-enum-from-c/test.c
18 28 231 src/test/run-make-fulldeps/longjmp-across-rust/foo.c
6 10 77 src/test/run-make-fulldeps/link-cfg/return2.c
6 10 77 src/test/run-make-fulldeps/link-cfg/return1.c
6 10 77 src/test/run-make-fulldeps/link-cfg/return3.c
6 10 60 src/test/run-make-fulldeps/static-dylib-by-default/main.c
10 22 148 src/test/run-make-fulldeps/cdylib-dylib-linkage/foo.c
6 9 53 src/test/run-make-fulldeps/c-link-to-rust-dylib/bar.c
1 3 14 src/test/run-make-fulldeps/manual-link/bar.c
1 3 14 src/test/run-make-fulldeps/manual-link/foo.c
7 11 77 src/test/run-make-fulldeps/linkage-attr-on-static/foo.c
15 30 258 src/test/run-make-fulldeps/issue-25581/test.c
1 3 23 src/test/run-make-fulldeps/compiler-lookup-paths/native.c
1604 4575 31737 total

@pnkfelix
Copy link
Member

pnkfelix commented Jun 3, 2022

Here's something else that's scary in our test infrastructure exposed by this issue: This problem fails to replicate if you remove the optimization flag (i.e. -O3) from the compilation of the bad.c file to bad.o.

Now It reproduces for all -O levels >= 1 that I tried, ...

... but here's why its scary: Our run-make-fulldeps suite doesn't exercise -O by default. There are individual Makefiles that opt-into using -O, but problems like this probably aren't getting exercised in practice by our run-make suites...

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 20, 2022
…ll-abi-does-not-zeroext, r=wesleywiser

Aarch64 call abi does not zeroext (and one cannot assume it does so)

Fix rust-lang#97463
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 20, 2022
…ll-abi-does-not-zeroext, r=wesleywiser

Aarch64 call abi does not zeroext (and one cannot assume it does so)

Fix rust-lang#97463
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 20, 2022
…ll-abi-does-not-zeroext, r=wesleywiser

Aarch64 call abi does not zeroext (and one cannot assume it does so)

Fix rust-lang#97463
@bors bors closed this as completed in 95a992a Sep 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-abi Area: Concerning the application binary interface (ABI). A-codegen Area: Code generation C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-AArch64 Armv8-A or later processors in AArch64 mode P-critical Critical priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants