Big executable size on Windows when using both openFile and createFile, but not when using one of the two. #18849

Open
shimamura-sakura opened this issue Feb 7, 2024 · 3 comments
Labels
bug Observed behavior contradicts documented or intended behavior

@shimamura-sakura

Zig Version

0.12.0-dev.2619+5cf138e51

Steps to Reproduce and Observed Behavior

Build command: zig build-exe -OReleaseSmall -flto -fstrip -fsingle-threaded -target x86_64-windows main.zig

Code:

const std = @import("std");

pub fn main() void {
    _ = std.fs.cwd().openFile("test", .{}) catch {};
    _ = std.fs.cwd().createFile("test", .{}) catch {};
}

If I comment out either of the two lines, I get a 12 KB exe file.
However, if I keep both lines, I get a 140 KB exe file. I also see very long runs of zero bytes in the exe file.

big-small.zip

big.exe: both lines.
small.exe: only the "openFile" line.

Expected Behavior

The exe should be smaller, maybe 10-20KB.

@shimamura-sakura shimamura-sakura added the bug Observed behavior contradicts documented or intended behavior label Feb 7, 2024
@Vexu
Member

Vexu commented Feb 7, 2024

I would expect this to be caused by PE format alignment requirements, but someone else will have to confirm.

@drew-gpf
Contributor

drew-gpf commented Feb 8, 2024

The problem is that the compiler is trying to initialize two structs on the stack, each ~64 KB in size, using weak_memcpy_default__alloca, which requires placing a similarly sized source buffer for each one in the .rdata section.

/// > The maximum path of 32,767 characters is approximate, because the "\\?\"
/// > prefix may be expanded to a longer string by the system at run time, and
/// > this expansion applies to the total length.
/// from https://docs.microsoft.com/en-us/windows/desktop/FileIO/naming-a-file#maximum-path-length-limitation
pub const PATH_MAX_WIDE = 32767;

pub const PathSpace = struct {
    data: [PATH_MAX_WIDE:0]u16,
    len: usize,

    pub fn span(self: *const PathSpace) [:0]const u16 {
        return self.data[0..self.len :0];
    }
};

We see initialization points in sliceToPrefixedFileW and wToPrefixedFileW:

pub fn sliceToPrefixedFileW(dir: ?HANDLE, path: []const u8) !PathSpace {
    var temp_path: PathSpace = undefined;
    temp_path.len = try std.unicode.utf8ToUtf16Le(&temp_path.data, path);
    temp_path.data[temp_path.len] = 0;
    return wToPrefixedFileW(dir, temp_path.span());
}

In wToPrefixedFileW, the undefined PathSpace is initialized depending on a branch. Comparing source code with decompilation, for example:

var path_space: PathSpace = undefined;
// ...
const path_byte_len = ntdll.RtlGetFullPathName_U(
    path_to_get.ptr,
    buf_len * 2,
    path_space.data[path_buf_offset..].ptr,
    null,
);
if (path_byte_len == 0) {
    // TODO: This may not be the right error
    return error.BadPathName;
} else if (path_byte_len / 2 > buf_len) {
    return error.NameTooLong;
}

This becomes:

    v32 = v133;
    v33 = RtlGetFullPathName_U_0(FileName, 2 * v30, (PWSTR)&dest[2 * v29], 0i64);
    if ( !v33 )
    {
      weak_memcpy_default__alloca(v163, (unsigned __int8 *)&byte_42F0D0, 0x10008ui64);
      v131 = 0;
      v132 = 0;
      v35 = 8;
      goto exit;
    }
    v34 = v33 >> 1;
    if ( v34 > v30 )
    {
      weak_memcpy_default__alloca(v163, (unsigned __int8 *)&::src, 0x10008ui64);
      v131 = 0;
      v132 = 0;
      v35 = 6;
      goto exit;
    }

Both ::src and byte_42F0D0 point to separate buffers, each 0x10008 bytes long (the size of PathSpace on x64) and consisting entirely of zeroes.
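
For reference, the 0x10008 figure can be checked directly against the struct definition. The snippet below is only a sketch: it copies the PathSpace fields from the excerpt above into a standalone file (it does not import the real std.os.windows type) and assumes a 64-bit target where usize is 8 bytes. The array holds 32767 elements plus one sentinel, so 32768 × 2 = 0x10000 bytes, and the len field adds another 8.

```zig
const std = @import("std");

// Copied from the excerpt quoted above; a standalone stand-in for illustration,
// not the actual declaration from std.os.windows.
const PATH_MAX_WIDE = 32767;

const PathSpace = struct {
    data: [PATH_MAX_WIDE:0]u16,
    len: usize,
};

comptime {
    // 32767 elements plus the sentinel, 2 bytes each.
    std.debug.assert(@sizeOf([PATH_MAX_WIDE:0]u16) == 0x10000);

    // Only check the total where usize is 8 bytes (assumption: 64-bit target).
    if (@sizeOf(usize) == 8) {
        std.debug.assert(@sizeOf(PathSpace) == 0x10008);
    }
}

pub fn main() void {
    const size: usize = @sizeOf(PathSpace);
    std.debug.print("PathSpace is 0x{x} bytes\n", .{size});
}
```

Two zero-filled buffers of this size come to roughly 128 KB, which is consistent with the reported jump from 12 KB to 140 KB.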

I'm not sure why the compiler wants to do this, but it's probably related to the fact that PathSpace is so large that the compiler needs to call __chkstk in the function prologue:

image

As an aside, I would really like if PathSpace wasn't 64 kilobytes so that I don't have to worry about extreme stack usage when trying to just open a file. This is especially bad if the function is inlined but rarely called.

@notcancername
Contributor

notcancername commented Feb 14, 2024

> As an aside, I would really like if PathSpace wasn't 64 kilobytes so that I don't have to worry about extreme stack usage when trying to just open a file. This is especially bad if the function is inlined but rarely called.

#225 can eliminate this problem in release mode.
