Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

__builtin_object_size(P->M, 1) where M ends with a flex-array behaves like sizeof() #72032

Open
kees opened this issue Nov 11, 2023 · 7 comments
Assignees
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema"

Comments

@kees
Copy link
Contributor

kees commented Nov 11, 2023

Using __builtin_object_size (and __builtin_dynamic_object_size) on a composite structure's member that has a flexible array loses the sense of how large it is.

For example, on a struct that has a flexible array, __bdos correctly says it doesn't know the size (bounded here by alloc_size):

    expect(__builtin_object_size(wrap->msg.nlmsg_content, 1), 4076);
    expect(__builtin_object_size(wrap, 0), 4096);
    expect(__builtin_object_size(wrap, 1), 4096);

But if it is part of a wrapper, it start behaving like sizeof():

    expect(__builtin_object_size(wrap->msg.nlmsg_content, 1), 4076);
    expect(__builtin_object_size(&wrap->msg, 0), 4092);
    /* But suddenly gets it wrong? */
    expect(__builtin_object_size(&wrap->msg, 1), 4092);

https://godbolt.org/z/YrGsh8Ybs

This was recently fixed in GCC:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832

PoC:
composite.c.txt

@EugeneZelenko EugeneZelenko added clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed new issue labels Nov 12, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 12, 2023

@llvm/issue-subscribers-clang-frontend

Author: Kees Cook (kees)

Using `__builtin_object_size` (and `__builtin_dynamic_object_size`) on a composite structure's member that has a flexible array loses the sense of how large it is.

For example, on a struct that has a flexible array, __bdos correctly says it doesn't know the size (bounded here by alloc_size):

    expect(__builtin_object_size(wrap->msg.nlmsg_content, 1), 4076);
    expect(__builtin_object_size(wrap, 0), 4096);
    expect(__builtin_object_size(wrap, 1), 4096);

But if it is part of a wrapper, it start behaving like sizeof():

    expect(__builtin_object_size(wrap->msg.nlmsg_content, 1), 4076);
    expect(__builtin_object_size(&wrap->msg, 0), 4092);
    /* But suddenly gets it wrong? */
    expect(__builtin_object_size(&wrap->msg, 1), 4092);

https://godbolt.org/z/YrGsh8Ybs

This was recently fixed in GCC:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101832

PoC:
composite.c.txt

@kees
Copy link
Contributor Author

kees commented Feb 27, 2024

@bwendling

@bwendling
Copy link
Collaborator

This rabbit hole runs very deep. Consider this:

https://godbolt.org/z/v1s6brz9f

I would expect the value returned by foo() to be something else than 9. Maybe something closer to 5193? (unless I'm doing the math incorrectly.)

@kees
Copy link
Contributor Author

kees commented Mar 22, 2024

9 looks correct to me. Pointer to a fixed-size array instance of size 9. I think GCC is wrong here. But that's not what this bug is about. This bug is about flexible array members at the end of a composite structure and Clang fails to notice. Here's a shorter PoC:
https://godbolt.org/z/KMxY9xexW

GCC is correct here: we don't know the size of msg, so it must be SIZE_MAX.

@zygoloid
Copy link
Collaborator

Re 9 vs 54, I think GCC is right -- the operand is a pointer to the second char[9] in an array of 7 char[9]s, so it's valid to access 6 * sizeof(char[9]) == 54 bytes forward from that pointer.

@zygoloid
Copy link
Collaborator

Hm, actually, no -- the operand should be a char[9] lvalue, which should decay to a pointer to the first char in that array, so 9 seems right. Maybe GCC isn't decaying the operand?

@isanbard
Copy link
Contributor

So what I'm thinking is that an offset into an N-dimensional array is calculated by the formula in Eli Bendersky's blog post: https://eli.thegreenplace.net/2015/memory-layout-of-multi-dimensional-arrays. In the case of char d[23][5][7][9], it's more like a 23 * 5 * 7 * 9 = 7245 bytes array. The remainder after an offset of d[14][3][1] is...well some number from the formula. So is a programmer asking for the amount left from the right-most array (9 in this case), or the amount from the offset to the end of the object? I think it's the latter, but I'm not sure how people use this in practice. The documentation says that:

The second bit determines if maximum or minimum of remaining bytes is computed.

Perhaps that should determine which value to return?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema"
Projects
None yet
Development

No branches or pull requests

6 participants