Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent claims about padding in structs #49318

Open
Seelengrab opened this issue Apr 10, 2023 · 13 comments · May be fixed by #49362
Open

Inconsistent claims about padding in structs #49318

Seelengrab opened this issue Apr 10, 2023 · 13 comments · May be fixed by #49362

Comments

@Seelengrab
Copy link
Contributor

Seelengrab commented Apr 10, 2023

This came up while experimenting with @flaggify and @bitfieldify from FieldFlags.jl (name pending).

julia> primitive type Foo_prim 40 end

julia> struct Foo_p
           a::Foo_prim
       end

julia> struct Foo
           a::NTuple{5, UInt8}
       end

julia> sizeof(Foo_prim)
5

julia> sizeof(Foo_p)
8

julia> sizeof(NTuple{5, UInt8})
5

julia> sizeof(Foo)
5

julia> Base.datatype_haspadding(NTuple{5, UInt8})
false

julia> Base.datatype_haspadding(Foo)
false

julia> Base.datatype_haspadding(Foo_prim)
false

julia> Base.datatype_haspadding(Foo_p)
false # ?!

How can Foo_p grow in size from 5 to 8 bytes, when there isn't supposed to be any padding there?

Additionally, Base.padding doesn't seem to take this "magic end-struct padding" into account, not even for structs that do claim to have padding:

julia> sizeof(Tuple{UInt32, UInt8})
8

julia> Base.datatype_haspadding(Tuple{UInt32, UInt8})
true

julia> Base.padding(Tuple{UInt32, UInt8})
Base.Padding[] # ?!

julia> Base.padding(Foo_p)
Base.Padding[]

So either Foo_p doesn't have padding, which begs the question why it's 8 bytes in size, or it does, which begs the question why it claims that it doesn't.

@JeffBezanson
Copy link
Sponsor Member

I guess we should change sizeof for primitive types, since the space they take in memory is always the aligned size.

@Seelengrab
Copy link
Contributor Author

Seelengrab commented Apr 13, 2023

You mean to say that sizeof of a primitive type Foo 40 end should be something other than 5? To what are they aligned (presumably sizeof(Ptr{Any})?), and would that mean that Base.datatype_haspadding should return true as well?

This seems undesirable to me - the purpose of FieldFlags.jl is to pack as much data in as little space as possible, to save space in an embedded environment. Primitive types make that MUCH easier to implement than fiddling with an NTuple{N, UInt8}, since everything can just be bitshifts & logical masks instead of having to figure out which entries of the tuple are needed & splitting the input up correctly. Clearly, both our compiler and LLVM can handle non-aligned things just fine, since NTuple{N, UInt8} has the correct size & no padding.

I do want to be clear that having alignment to a multiple of 8 bit i.e. a byte is perfectly fine and acceptable. That is what I take the current behavior to be and is consistent with sizeof(Foo_p) == 5 (i.e., 5 bytes, meaning 40 bits, exactly as specified in the primitive type declaration). Aligning the primitive type to something other than a byte seems incorrect - all investigations so far have led to LLVM IR that's aligned to a multiple of 8 bits.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 13, 2023

C23 now has a standard ABI for this, so we should follow whatever that is, as per our usual policy

@Seelengrab
Copy link
Contributor Author

Are you referring to _BitInt? If so, while that could be used for implementing FieldFlags.jl, it's a bit orthogonal to the issue of "Schrödinger's padding" at hand.

@Seelengrab
Copy link
Contributor Author

Seelengrab commented Apr 13, 2023

Perhaps I should add that (as I understand it) a primitive type Foo N end maps precisely to an iN in LLVM (according to this, _BitInt corresponds to exactly that too). This is exactly what I want - what I don't want is to have mysterious padding that may or may not exist appear when using that object in a struct. The i40 should stay as an i40 and not introduce additional padding by itself, since there isn't anything placed after that.

@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 13, 2023

They map to whatever the relevant C standard says they map to, which at this point in time appears to be the next power-of-two up to a maximum of 64 bytes: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2709.pdf

@Seelengrab
Copy link
Contributor Author

That's fine, I guess, but that still means our padding reflection is inconsistent. If the padding is introduced on the LLVM level after julia, that's ok by me, as long as it's still transparent & consistent on the julia level. The abstraction we're targeting is LLVM after all, and as such, our types should be consistent in whether there is padding, and how much that is (i.e. either return true on datatype_haspadding on Foo_p in the first post, or return 5 on sizeof(Foo_p)).

Also, the standard actually says this:

_BitInt(N) types align with existing calling conventions. They have the same size and alignment as the
smallest basic type that can contain them. Types that are larger than __int64_t are conceptually treated
as struct of register size chunks. The number of chunks is the smallest number that can contain the type.

which is the additional complication that "register size" on AVR means "a byte", while it's "8 bytes" on the (64 bit) host system (4 byte on 32 bit). So enforcing/hardcoding the host system behavior here is something I'd rather avoid.

@JeffBezanson
Copy link
Sponsor Member

Ok, so we should switch to computing the size of a primitive type using this algorithm:

if nbits <= 64
    nbits = next_power_of_2(nbits)
else
    nbits = align(nbits, word_size)
end
size = nbits/8

So an Int40 is 8 bytes, and the extra 3 bytes are not padding but part of the data itself even though they're not used.

@Seelengrab
Copy link
Contributor Author

Seelengrab commented Apr 13, 2023

Well... I think that's actually a bad thing the C standard defines, because on AVR you really don't want to promote an Int40 to 64 bits - those extra 3 bytes are just wasted space there. I don't think LLVM currently does that either, but I (admittedly) haven't checked. I think following C here to the letter would be a mistake, at least in terms of guaranteeing that a primitive type Foo 24 end ends up as a i32 in the LLVM IR that we emit. Currently, it ends up as a i24:

julia> using FieldFlags

# 17 flags need 3 bytes, so this generates a `primitive type Foo 24 end`
julia> @flaggify struct Foo
           a
           b
           c
           d
           e
           f
           g
           h
           i
           j
           k
           l
           m
           n
           o
           p
           q
       end

julia> sizeof(Foo) 
3

julia> f() = Core.Intrinsics.zext_int(Foo, 0x0)
f (generic function with 1 method)

julia> @code_llvm f()
;  @ REPL[6]:1 within `f`
define i24 @julia_f_243() #0 {
top:
   # ...
  ret i24 0
}

This only ends up as a 32 bit integer when it's actually compiled to x86_64, on AVR I'd expect this to end up as three bytes, not four.

Alternatively, we also change what we emit based on the target architecture.. which seems more complicated than just letting LLVM handle it.

@JeffBezanson
Copy link
Sponsor Member

I don't think LLVM currently does that either

It does; I believe it follows the C ABI here. In fact we had to specifically fix this in #37974

@vtjnash
Copy link
Sponsor Member

vtjnash commented Apr 13, 2023

the extra 3 bytes are not padding but part of the data itself even though they're not used

I think LLVM generally expects these to act as padding. In particular, it specifies that their alignment is rounded up to the next power of 2 (https://llvm.org/docs/LangRef.html#data-layout) if using default alignment (somewhat rare these days). and that stores of the value are permitted to clobber extra bytes up their alignment value (https://llvm.org/docs/LangRef.html#store-instruction):

An alignment value higher than the size of the loaded type implies memory up to the alignment value bytes can be safely loaded(sic) without trapping in the default address space.

But note also there are explicit semantics on store that clarify further (if alignment is not specified per the default computation):

For example, storing an i24 writes at most three bytes. When writing a value of a type like i20 with a size that is not an integral number of bytes, it is unspecified what happens to the extra bits that do not belong to the type, but they will typically be overwritten

@JeffBezanson
Copy link
Sponsor Member

Ah, looks like the logic we're missing is to add padding for values whose alignment is bigger than their size?

@Seelengrab
Copy link
Contributor Author

Seelengrab commented Apr 13, 2023

It does; I believe it follows the C ABI here.

I just don't want to end up in a situation where we specify "no matter where julia runs, we must align a primitive type to 64 bits", because on AVR that would imply wasting a whole lot of space :) That's why I'd rather align to the target-register-size than raw 64 bits, as _BitInt(N) seems to require.. #37974 seems to have been about ignoring the padding that LLVM required due to running on x86_64. On AVR, this would not have happened, because the alignment is smaller than the data - it's just a byte after all. It's a good reference to check though, to see what happens now 🤔 I'll have to try this.

Ah, looks like the logic we're missing is to add padding for values whose alignment is bigger than their size?

Yes, that sounds like it will be consistent. We're missing the (reflection on) padding on x86(_64), which is not an issue on AVR due to the alignment there being to 1 byte (instead of 4/8). As long as the padding we add is target-specific, we should be good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants