Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

__builtin_dynamic_object_size mode 1 falls back to mode 0 when array index is not a constant expression #66975

Open
kees opened this issue Sep 21, 2023 · 6 comments
Assignees
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema"

Comments

@kees
Copy link
Contributor

kees commented Sep 21, 2023

https://godbolt.org/z/MaEG8eEMd

For a struct like this:

struct suspend_stats {
        int     failed_resume_noirq;
#define REC_FAILED_NUM  2
        int     last_failed_dev;
        char    failed_devs[REC_FAILED_NUM][40];
        int     last_failed_errno;
        int     bar;
};

__bdos(x, 1) should match sizeof(x) for the multidimensional array. When the array index is a constant expression, this works correctly. If it's not, it breaks:

   argc == 1
    __builtin_dynamic_object_size(foo.failed_devs[1], 1) == sizeof(foo.failed_devs[1])
    __builtin_dynamic_object_size(foo.failed_devs[argc], 1)  != sizeof(foo.failed_devs[1]) !!!

The first bdos is correct, the second is not.

@nickdesaulniers @isanbard

@Endilll Endilll added clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed new issue labels Sep 28, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 28, 2023

@llvm/issue-subscribers-clang-frontend

https://godbolt.org/z/MaEG8eEMd

For a struct like this:

struct suspend_stats {
        int     failed_resume_noirq;
#define REC_FAILED_NUM  2
        int     last_failed_dev;
        char    failed_devs[REC_FAILED_NUM][40];
        int     last_failed_errno;
        int     bar;
};

__bdos(x, 1) should match sizeof(x) for the multidimensional array. When the array index is a constant expression, this works correctly. If it's not, it breaks:

   argc == 1
    __builtin_dynamic_object_size(foo.failed_devs[1], 1) == sizeof(foo.failed_devs[1])
    __builtin_dynamic_object_size(foo.failed_devs[argc], 1)  != sizeof(foo.failed_devs[1]) !!!

The first bdos is correct, the second is not.

@nickdesaulniers @isanbard

@bwendling bwendling self-assigned this Sep 29, 2023
@bwendling
Copy link
Collaborator

The __builtin_*object_size "type" isn't passing down a key piece of information. If the least significant bit is clear, then it should return the size of the whole object, otherwise it should return the subobject's size.

@bwendling
Copy link
Collaborator

#78526 is the PR. It's getting pushback.

@bwendling
Copy link
Collaborator

bwendling commented Mar 25, 2024

[Moving the conversation from https://github.com//issues/72032 to the correct PR]

@bwendling

This rabbit hole runs very deep. Consider this:

https://godbolt.org/z/v1s6brz9f

I would expect the value returned by foo() to be something else than 9. Maybe something closer to 5193? (unless I'm doing the math incorrectly.)

@kees

9 looks correct to me. A fixed-size array instances of size 9. I think GCC is wrong here. But that's not what this bug is about. This bug is about flexible array members at the end of a composite structure and Clang fails to notice. Here's a shorter PoC: https://godbolt.org/z/KMxY9xexW

@zygoloid

Re 9 vs 54, I think GCC is right -- the operand is a pointer to the second char[9] in an array of 7 char[9]s, so it's valid to access 6 * sizeof(char[9]) == 54 bytes forward from that pointer.

Hm, actually, no -- the operand should be a char[9] lvalue, which should decay to a pointer to the first char in that array, so 9 seems right. Maybe GCC isn't decaying the operand?

@bwendling

So what I'm thinking is that an offset into an N-dimensional array is calculated by the formula in Eli Bendersky's blog post: https://eli.thegreenplace.net/2015/memory-layout-of-multi-dimensional-arrays. In the case of char d[23][5][7][9], it's more like a 23 * 5 * 7 * 9 = 7245 bytes array. The remainder after an offset of d[14][3][1] is...well some number from the formula. So is a programmer asking for the amount left from the right-most array (9 in this case), or the amount from the offset to the end of the object? I think it's the latter, but I'm not sure how people use this in practice. The documentation says that:

The second bit determines if maximum or minimum of remaining bytes is computed.

Perhaps that should determine which value to return?

@bwendling
Copy link
Collaborator

BTW: here's a WIP that implements the idea I had. I can convert it to what you are suggesting easily. But I'd like to confirm what the expected result is.

https://github.com/bwendling/llvm-project/tree/__bdos-subobject

@bwendling
Copy link
Collaborator

After further investigation, it looks like I'm idea isn't how either __bdos or __bos are implemented. It's more according to what @kees and @zygoloid said. I made the changes. I'll create a PR soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema"
Projects
None yet
Development

No branches or pull requests

4 participants