Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Packed structs: failure in assigning to struct fields #16615

Closed
ethindp opened this issue Jul 30, 2023 · 7 comments · Fixed by #17375
Closed

Packed structs: failure in assigning to struct fields #16615

ethindp opened this issue Jul 30, 2023 · 7 comments · Fixed by #17375
Labels
bug Observed behavior contradicts documented or intended behavior miscompilation The compiler reports success but produces semantically incorrect code.
Milestone

Comments

@ethindp
Copy link

ethindp commented Jul 30, 2023

Zig Version

0.11.0-dev.4308+417b92f08

Steps to Reproduce and Observed Behavior

I'm trying to set up the GDT. I'm using a packed struct with the alignment of u64. However, when I assign to the struct, the statically-assigned entries have all the bits on the first field set (which is correct) but no other field is set (even though I fully initialized the struct). This happens both when using packed unions to represent bitfields and when just using normal integral types.

To reproduce, all that's needed is this code:

const std = @import("std");
const debug = std.debug;

const GdtEntry = packed struct {
    limit_lo: u16 = 0,
    base_lo: u24 = 0,
    access: u8 = 0x00,
    limit_hi: u4 = 0,
    flags: u4 = 0x0,
    base_hi: u8 = 0,

    pub fn asU64(self: GdtEntry) u64 {
        return @bitCast(self);
    }
};
comptime {
    if (@sizeOf(GdtEntry) != 8 or @bitSizeOf(GdtEntry) != 64) @compileError("GdtEntry must be 8 bytes!");
    if (@bitOffsetOf(GdtEntry, "limit_lo") != 0) @compileError("Limit lo must be at bit offset 0!");
    if (@bitOffsetOf(GdtEntry, "base_lo") == 15) @compileError("base_lo must be at bit offset 16!");
    if (@bitOffsetOf(GdtEntry, "access") != 40) @compileError("access byte must be at bit offset 40!");
    if (@bitOffsetOf(GdtEntry, "limit_hi") != 48) @compileError("limit_hi must be at bit offset 48!");
    if (@bitOffsetOf(GdtEntry, "flags") != 52) @compileError("flags must be a bit offset 52!");
    if (@bitOffsetOf(GdtEntry, "base_hi") != 56) @compileError("base_hi must be at bit offset 56!");
}

var gdt = [_]GdtEntry{.{}} ** 8;
var ist1 = [_]u8{0} ** 2048;
var ist2 = [_]u8{0} ** 2048;
var ist3 = [_]u8{0} ** 2048;
var ist4 = [_]u8{0} ** 2048;
var ist5 = [_]u8{0} ** 2048;
var ist6 = [_]u8{0} ** 2048;
var ist7 = [_]u8{0} ** 2048;
var stack0 = [_]u8{0} ** 4096;
var stack1 = [_]u8{0} ** 4096;
var stack2 = [_]u8{0} ** 4096;
const TssDescriptor = packed struct {
    reserved1: u32 = 0,
    rsp0: u64,
    rsp1: u64,
    rsp2: u64,
    reserved2: u64 = 0,
    ist1: u64,
    ist2: u64,
    ist3: u64,
    ist4: u64,
    ist5: u64,
    ist6: u64,
    ist7: u64,
    reserved3: u32 = 0,
    reserved4: u32 = 0,
    reserved5: u8 = 0,
    iopb: u16,
};
comptime {
    if (@sizeOf(TssDescriptor) != 104) @compileError("TSS descriptor must be 104 bytes in size");
}

var tss: TssDescriptor = undefined;

pub fn main() void {
    // Initial GDT
    // 64-bit kernel code
    gdt[1] = .{
        .limit_lo = 0xFFFF,
        .limit_hi = 0xF,
        .access = 0x9B,
        .flags = 0xA,
    };
    // 64-bit kernel data
    gdt[2] = .{
        .limit_lo = 0xFFFF,
        .limit_hi = 0xF,
        .access = 0x93,
        .flags = 0xC,
    };
    // 64-bit user code
    gdt[3] = .{
        .limit_lo = 0xFFFF,
        .limit_hi = 0xF,
        .access = 0xFB,
        .flags = 0xA,
    };
    // 64-bit user data
    gdt[4] = .{
        .limit_lo = 0xFFFF,
        .limit_hi = 0xF,
        .access = 0xF3,
        .flags = 0xC,
    };
    tss = TssDescriptor{
        .rsp0 = @intFromPtr(&stack0),
        .rsp1 = @intFromPtr(&stack1),
        .rsp2 = @intFromPtr(&stack2),
        .ist1 = @intFromPtr(&ist1),
        .ist2 = @intFromPtr(&ist2),
        .ist3 = @intFromPtr(&ist3),
        .ist4 = @intFromPtr(&ist4),
        .ist5 = @intFromPtr(&ist5),
        .ist6 = @intFromPtr(&ist6),
        .ist7 = @intFromPtr(&ist7),
        .iopb = @sizeOf(TssDescriptor),
    };
    const tss_base = @intFromPtr(&tss);
    const tss_limit = @sizeOf(TssDescriptor) - 1;
    gdt[5] = .{
        .base_lo = @truncate(tss_base),
        .base_hi = @truncate(tss_base >> 24),
        .limit_lo = @truncate(tss_limit),
        .limit_hi = @truncate(tss_limit >> 16),
        .access = 0x89,
    };
    gdt[6] = .{
        .base_lo = @truncate(tss_base >> 32),
        .base_hi = @truncate(tss_base >> 56),
    };
    for (gdt, 0..) |entry, i| {
        std.debug.print("GDT entry {}: {X:0<16}\n", .{ i, entry.asU64() });
    }
}

When run, this program prints the following output:

GDT entry 0: 0000000000000000
GDT entry 1: FFFF000000000000
GDT entry 2: FFFF000000000000
GDT entry 3: FFFF000000000000
GDT entry 4: FFFF000000000000
GDT entry 5: 7B0E894800000067
GDT entry 6: F700000000000000
GDT entry 7: 0000000000000000

The zeroth and seventh entries are correct (and I strongly believe the fifth and sixth are also correct).

Expected Behavior

The assignments should work fine, without any problems, and should reflect the properly assigned values.

@ethindp ethindp added the bug Observed behavior contradicts documented or intended behavior label Jul 30, 2023
@andrewrk
Copy link
Member

Is this a duplicate of #12852? It looks like what you want is an extern struct with all fields having align(1), like this:

const TssDescriptor = extern struct {
    reserved1: u32 align(1) = 0,
    rsp0: u64 align(1),
    rsp1: u64 align(1),
    rsp2: u64 align(1),
    reserved2: u64 align(1) = 0,
    ist1: u64 align(1),
    ist2: u64 align(1),
    ist3: u64 align(1),
    ist4: u64 align(1),
    ist5: u64 align(1),
    ist6: u64 align(1),
    ist7: u64 align(1),
    reserved3: u32 align(1) = 0,
    reserved4: u32 align(1) = 0,
    reserved5: u8 align(1) = 0,
    iopb: u16 align(1),
};

Instead what you have is a 824-bit integer that gets loaded and stored via bitwise operations.

Proposal to make this require less syntax: #13009

@ethindp
Copy link
Author

ethindp commented Jul 30, 2023

I mean, maybe, but this happens for the GdtEntry, not the TSS

@andrewrk
Copy link
Member

Understood. I apologize for not reading the issue carefully. In that case this certainly looks like a miscompilation (my advice stands for the TSS struct however).

@andrewrk andrewrk added the miscompilation The compiler reports success but produces semantically incorrect code. label Jul 30, 2023
@andrewrk andrewrk added this to the 0.11.0 milestone Jul 30, 2023
@ethindp
Copy link
Author

ethindp commented Jul 30, 2023

I'll apply your advice -- I wasn't aware that I was misusing packed structs in that instance.

@bibo38
Copy link

bibo38 commented Jul 30, 2023

Mimimalistic example to reproduce (0.11.0-dev.4315+f5239677e):

const std = @import("std");

const Small = packed struct {
    val: u8 = 0,
    lo: u4 = 0,
    hi: u4 = 0,
};

var small = Small{};
export fn val() u16 {
    small = .{
        .val = 0x12,
        .lo = 3,
        .hi = 4,
    };
    return @bitCast(small);
}

pub fn main() void {
    std.log.info("Value: {x}", .{val()});
}

Outputs: info: Value: 4012

Defining the variable locally works as expected:

const std = @import("std");

const Small = packed struct {
    val: u8 = 0,
    lo: u4 = 0,
    hi: u4 = 0,
};

export fn val() u16 {
    var small = Small{};
    small = .{
        .val = 0x12,
        .lo = 3,
        .hi = 4,
    };
    return @bitCast(small);
}

pub fn main() void {
    std.log.info("Value: {x}", .{val()});
}

Outputs: info: Value: 4312

The asm of the buggy code indicates a wrong offset (Godbolt):

mov     byte ptr [example.small], 18 ; assigning val = 0x12
mov     ax, word ptr [example.small+1]  ; here should be no +1 offset, as we only operate on the word
and     ax, -3841
or      ax, 768
mov     word ptr [example.small+1], ax  ; here the .low value is saved one byte after the packed struct
mov     ax, word ptr [example.small]  ; for .hi zig correctly omits the offset, therefore it is applyed correctly
and     ax, 4095
or      ax, 16384
mov     word ptr [example.small], ax

But I'm impressed that Zig assigns each field individually, given that the whole struct is overwritten. When defining the variable locally (second working example), Zig just overrides everything in one go (Godbolt):

mov     word ptr [rbp - 2], 17170

@ethindp
Copy link
Author

ethindp commented Jul 30, 2023

@bibo38 Frankly, I'm surprised that that assembly doesn't cause some kind of exception, or corrupt program code. Especially if it writes one byte past the end of the packed struct -- there's no telling what that could be, so it could potentially be corrupting other data or code.

@ethindp
Copy link
Author

ethindp commented Jul 31, 2023

Update: so the (proposed) fix doesn't quite work. If I localize the definitions, the results are:

GDT entry 0: 0000000000000000
GDT entry 1: 0AF9B000000FFFF0
GDT entry 2: 0CF93000000FFFF0
GDT entry 3: 0AFFB000000FFFF0
GDT entry 4: 0CFF3000000FFFF0
GDT entry 5: 420089FFF9000067
GDT entry 6: 000003D000000000
GDT entry 7: 0000000000000000

According to the OSDev forums, the values are shifted by a byte (I don't know in which direction though).

@andrewrk andrewrk modified the milestones: 0.11.0, 0.11.1 Aug 1, 2023
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Aug 2, 2023
when a field was 'byte_aligned', the offset was applied once to the pointer
and again when dereferencing it (because it's a ptr with a 'packed_offset')

resolves ziglang#16615
and maybe ziglang#16621 ?
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Aug 7, 2023
A field smaller that the abisize is still be accessed with a packed_offset decorated ptr, even when aligned to a byte boundary.
The test was incorrect and the offset was applied once to the pointer (as if it would be used as an undecorated ptr),
and again when dereferencing it (by the packed_offset decoration)

resolves ziglang#16615
and maybe ziglang#16621 ?
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Aug 8, 2023
A field smaller that the abisize is still be accessed with a packed_offset decorated ptr, even when aligned to a byte boundary.
The test was incorrect and the offset was applied once to the pointer (as if it would be used as an undecorated ptr),
and again when dereferencing it (by the packed_offset decoration)

resolves ziglang#16615
and maybe ziglang#16621 ?
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Aug 9, 2023
Fields smaller that the abisize, even when aligned to a byte boundary, are accessed via a packed_offset decorated ptr.
This was not always considered by the code, and the offset was applied once to the pointer (as if it would be used as an undecorated ptr),
and again when dereferencing it (by the packed_offset decoration)

resolves ziglang#16615
and probaly ziglang#16621
xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Aug 9, 2023
Fields smaller that the abisize, even when aligned to a byte boundary, are accessed via a packed_offset decorated ptr.
This was not always considered by the code, and the offset was applied once to the pointer (as if it would be used as an undecorated ptr),
and again when dereferencing it (by the packed_offset decoration)

resolves ziglang#16615
and probaly ziglang#16621
andrewrk pushed a commit to xxxbxxx/zig that referenced this issue Sep 17, 2023
Fields smaller that the abisize, even when aligned to a byte boundary, are accessed via a packed_offset decorated ptr.
This was not always considered by the code, and the offset was applied once to the pointer (as if it would be used as an undecorated ptr),
and again when dereferencing it (by the packed_offset decoration)

resolves ziglang#16615
and probaly ziglang#16621
@andrewrk andrewrk modified the milestones: 0.11.1, 0.12.0 Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior miscompilation The compiler reports success but produces semantically incorrect code.
Projects
None yet
3 participants