Strange Code Generation for Custom DST methods #106683

Closed
ZhennanWu opened this issue Jan 10, 2023 · 2 comments
Labels
C-bug Category: This is a bug.

Comments

@ZhennanWu

Godbolt link: https://godbolt.org/z/YM65qf41h

I tried this simple custom DST method:

use std::any::TypeId;
use std::any::Any;
use std::hint::black_box;

struct A<T: ?Sized + 'static> {
    a: i32,
    b: T,
}

impl<T: ?Sized + 'static> A<T> {
    fn bb(&self) -> TypeId {
        self.b.type_id()
    }
}

pub fn main() {
    let mut a0 = black_box(A { a: 8, b: 9 as i32 });
    let mut a: &mut A<dyn Any> = &mut a0;
    a = black_box(a);
    black_box(a.bb());
}

The generated assembly (rustc 1.68.0-nightly (659e169d3 2023-01-04), -Copt-level=3) contains strange operations on the %rdi register, yet %rdi never appears to be used afterwards. It doesn't look like the caller is reserving space for anything, and this method doesn't seem to need that anyway.

<T as core::any::Any>::type_id:
        movabs  rax, 3735189839305137790
        ret

core::ptr::drop_in_place<i32>:
        ret

example::main:
        push    rbx
        sub     rsp, 32
        movabs  rax, 38654705672
        mov     qword ptr [rsp + 24], rax
        lea     rax, [rsp + 24]
        mov     qword ptr [rsp + 8], rax
        lea     rax, [rip + .L__unnamed_1]
        mov     qword ptr [rsp + 16], rax
        lea     rbx, [rsp + 8]
        mov     rax, qword ptr [rsp + 16]
# OPERATION ON RDI
        mov     rdi, qword ptr [rax + 16]
        add     rdi, 3
        and     rdi, -4
        add     rdi, qword ptr [rsp + 8]
# THE ACTUAL FUNCTION CALL
        call    qword ptr [rax + 24]
        mov     qword ptr [rsp + 8], rax
        add     rsp, 32
        pop     rbx
        ret

.L__unnamed_1:
        .quad   core::ptr::drop_in_place<i32>
        .asciz  "\004\000\000\000\000\000\000\000\004\000\000\000\000\000\000"
        .quad   <T as core::any::Any>::type_id

If we re-implement it as a trait object method call, there are no operations on the %rdi register, just a direct call.

I haven't tried more complex code yet, but with simple methods there seems to be a consistent pattern: a custom DST produces longer code than a typical trait object.
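
For comparison, here is a minimal sketch of the plain trait-object variant mentioned above. The helper name `type_id_of` is hypothetical, and this is not necessarily the code that was used for the comparison. It only illustrates that when a `&dyn Any` already points at the value itself, the data pointer can be handed to `type_id` unchanged.

```rust
use std::any::{Any, TypeId};
use std::hint::black_box;

// Hypothetical helper: `x` already points at the value, so the data pointer
// is forwarded to the vtable method without any offset computation.
#[inline(never)]
pub fn type_id_of(x: &dyn Any) -> TypeId {
    x.type_id()
}

fn main() {
    let v = black_box(9_i32);
    // Dynamic dispatch still reports the concrete type behind the reference.
    assert_eq!(type_id_of(&v), TypeId::of::<i32>());
}
```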

Meta

rustc 1.68.0-nightly (659e169d3 2023-01-04):


@erikdesjardins
Contributor

erikdesjardins commented Jan 13, 2023

rdi is the first argument to Any::type_id, that is, &self, which must point to the dynamically sized b field.

The instructions you've identified compute that pointer. This takes a few instructions because b is not at a constant offset: it depends on the alignment of T and the size of the non-DST fields in A.

Isolating just the function call: https://godbolt.org/z/374oroPjq

example::example:
; example() arguments:
; rdi = data ptr to `A<dyn Any>`
; rsi = vtable ptr to an instantiation of `Any`

        mov     rax, qword ptr [rsi + 16] ; load alignment of DST field (`alignof(T)`) from vtable

        add     rax, 3  ; these two instructions effectively compute
        and     rax, -4 ; `max(sizeof(i32), alignof(T))`, relying on both being powers of 2

        add     rdi, rax ; offset data pointer from start of `A` to `A.b`

        jmp     qword ptr [rsi + 24] ; tail call to `Any::type_id` through the vtable
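
The `add`/`and` pair above can be checked in isolation. The sketch below is only an illustration: the function name `offset_from_align` and the constant `PREFIX_SIZE` are made up here, and the 4-byte prefix is the `a: i32` field from the issue. It mirrors the two instructions and compares the result against `max(sizeof(i32), alignof(T))` for a few power-of-two alignments.

```rust
// Size of the sized prefix of `A` (the `a: i32` field from the issue).
const PREFIX_SIZE: usize = 4;

// Mirrors `add rax, 3` / `and rax, -4`: round `align` up to a multiple of 4.
fn offset_from_align(align: usize) -> usize {
    (align + (PREFIX_SIZE - 1)) & !(PREFIX_SIZE - 1)
}

fn main() {
    for align in [1usize, 2, 4, 8, 16, 32] {
        // For power-of-two alignments this equals max(4, align).
        assert_eq!(offset_from_align(align), PREFIX_SIZE.max(align));
        println!("align {align:>2} -> offset {}", offset_from_align(align));
    }
}
```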

Looking at different instantiations of A:

A<i64> will be laid out like {a: i32, <4 bytes padding>, b: i64}, since i64 has alignment 8.
In order to call Any::type_id, the pointer to the start of A has to be offset by 8 bytes to point to b.

A<i8> will be laid out like {a: i32, b: i8}, since i8 has alignment 1.
In order to call Any::type_id, the pointer to the start of A has to be offset by 4 bytes to point to b.
Note that unlike the previous case, this offset (4) isn't equal to the alignment (1). This is what the two instructions in the middle deal with.
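
The two layouts above can also be observed at run time. This is a rough sketch, assuming a 64-bit target such as x86-64 where `i64` has alignment 8. The helper `offset_of_b` is a name invented for this example. It measures the distance from the start of `A` to `b` through the same `&A<dyn Any>` view used in the issue.

```rust
use std::any::Any;
use std::mem::align_of_val;

#[allow(dead_code)]
struct A<T: ?Sized + 'static> {
    a: i32,
    b: T,
}

// Hypothetical helper: byte distance from the start of `A` to its `b` field.
fn offset_of_b<T: ?Sized + 'static>(v: &A<T>) -> usize {
    let base = v as *const A<T> as *const u8 as usize;
    let field = &v.b as *const T as *const u8 as usize;
    field - base
}

fn main() {
    let wide = A { a: 8, b: 9_i64 };
    let narrow = A { a: 8, b: 9_i8 };

    // Unsize both values; the concrete type of `b` is now only known via the vtable.
    let wide_dyn: &A<dyn Any> = &wide;
    let narrow_dyn: &A<dyn Any> = &narrow;

    // Alignment of the erased field, as reported for the dynamic type.
    assert_eq!(align_of_val(&wide_dyn.b), 8);
    assert_eq!(align_of_val(&narrow_dyn.b), 1);

    // Offset 8 for `A<i64>` (4 bytes of padding), offset 4 for `A<i8>`.
    assert_eq!(offset_of_b(wide_dyn), 8);
    assert_eq!(offset_of_b(narrow_dyn), 4);
}
```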

The instruction sequence gets even more complex when the size of the non-DST fields is no longer a power of 2, because it's even harder to find the offset of the b field: https://godbolt.org/z/EEGhTfnoj.
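
One way to express the general case is a plain round-up of the prefix size to the alignment of `T`. The sketch below is not taken from the linked example: the function `round_up` and the 3-byte prefix are assumptions chosen for illustration. With a prefix size that is not a power of two, the `max` shortcut no longer applies and the full round-up is needed.

```rust
// Smallest multiple of `align` that is >= `prefix_size`.
// `align` must be a power of two, which Rust alignments always are.
fn round_up(prefix_size: usize, align: usize) -> usize {
    (prefix_size + align - 1) & !(align - 1)
}

fn main() {
    // Hypothetical 3-byte prefix, e.g. a `[u8; 3]` field in front of `b`.
    assert_eq!(round_up(3, 1), 3);
    assert_eq!(round_up(3, 2), 4);
    assert_eq!(round_up(3, 8), 8);

    // With the 4-byte prefix from the issue this is max(4, align) again.
    assert_eq!(round_up(4, 1), 4);
    assert_eq!(round_up(4, 8), 8);
}
```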

@ZhennanWu
Author

The padding issue indeed makes sense. I really didn't realize those were alignment calculations for the custom DST. Thanks for the detailed explanation! @erikdesjardins

It seems to me, though, that this is going to complicate the case for custom DSTs a lot, since any method call into the DST field has the extra cost of an alignment calculation compared to a normal trait object implementation.
