Size saving opportunity in glibc include directories #21258

Closed
alexrp opened this issue Aug 30, 2024 · 6 comments · Fixed by #24025
Labels
enhancement: Solving this issue will likely involve adding new logic or components to the codebase.
libc: Issues related to libzigc and Zig's vendored libc code.

@alexrp
Member

alexrp commented Aug 30, 2024

Right now, we have a directory for each glibc-based target triple under lib/libc/include. But if you do a recursive diff between many (all?) of the related ones (riscv32-linux-gnu vs riscv64-linux-gnu, arm-linux-gnueabi vs arm-linux-gnueabihf, and so on), it becomes apparent that the only difference is the presence of the appropriate lib-names-<abi>.h and stubs-<abi>.h headers. The appropriate versions of these headers are picked by lib-names.h and stubs.h based on preprocessor defines.
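To make the observation above concrete, here is a minimal sketch of a predicate that recognizes the per-ABI headers by name. The function name `isAbiSpecificHeader` is hypothetical and is not taken from process_headers.zig; the file names in the test are only illustrative examples of the `lib-names-<abi>.h` and `stubs-<abi>.h` pattern.

```zig
const std = @import("std");

/// Hypothetical helper (not part of the Zig repository): returns true for the
/// per-ABI headers `lib-names-<abi>.h` and `stubs-<abi>.h`. The dispatching
/// headers `lib-names.h` and `stubs.h` themselves return false.
fn isAbiSpecificHeader(basename: []const u8) bool {
    if (!std.mem.endsWith(u8, basename, ".h")) return false;
    return std.mem.startsWith(u8, basename, "lib-names-") or
        std.mem.startsWith(u8, basename, "stubs-");
}

test "only the per-ABI headers are flagged" {
    try std.testing.expect(isAbiSpecificHeader("lib-names-lp64d.h"));
    try std.testing.expect(isAbiSpecificHeader("stubs-hard.h"));
    try std.testing.expect(!isAbiSpecificHeader("lib-names.h"));
    try std.testing.expect(!isAbiSpecificHeader("stubs.h"));
    try std.testing.expect(!isAbiSpecificHeader("stdio.h"));
}
```

A recursive diff that ignores every file for which this predicate is true would, if the claim holds, report no remaining differences between the related directories.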

Also, we already have a bunch of logic for picking the right paths based on target info:

zig/src/glibc.zig

Lines 416 to 663 in e084c46

fn start_asm_path(comp: *Compilation, arena: Allocator, basename: []const u8) ![]const u8 {
    const arch = comp.getTarget().cpu.arch;
    const is_ppc = arch.isPowerPC();
    const is_aarch64 = arch.isAARCH64();
    const is_sparc = arch.isSPARC();
    const is_64 = comp.getTarget().ptrBitWidth() == 64;
    const s = path.sep_str;
    var result = std.ArrayList(u8).init(arena);
    try result.appendSlice(comp.zig_lib_directory.path.?);
    try result.appendSlice(s ++ "libc" ++ s ++ "glibc" ++ s ++ "sysdeps" ++ s);
    if (is_sparc) {
        if (mem.eql(u8, basename, "crti.S") or mem.eql(u8, basename, "crtn.S")) {
            try result.appendSlice("sparc");
        } else {
            if (is_64) {
                try result.appendSlice("sparc" ++ s ++ "sparc64");
            } else {
                try result.appendSlice("sparc" ++ s ++ "sparc32");
            }
        }
    } else if (arch.isARM()) {
        try result.appendSlice("arm");
    } else if (arch.isMIPS()) {
        if (!mem.eql(u8, basename, "crti.S") and !mem.eql(u8, basename, "crtn.S")) {
            try result.appendSlice("mips");
        } else {
            if (is_64) {
                const abi_dir = if (comp.getTarget().abi == .gnuabin32)
                    "n32"
                else
                    "n64";
                try result.appendSlice("mips" ++ s ++ "mips64" ++ s);
                try result.appendSlice(abi_dir);
            } else {
                try result.appendSlice("mips" ++ s ++ "mips32");
            }
        }
    } else if (arch == .x86_64) {
        try result.appendSlice("x86_64");
    } else if (arch == .x86) {
        try result.appendSlice("i386");
    } else if (is_aarch64) {
        try result.appendSlice("aarch64");
    } else if (arch.isRISCV()) {
        try result.appendSlice("riscv");
    } else if (is_ppc) {
        if (is_64) {
            try result.appendSlice("powerpc" ++ s ++ "powerpc64");
        } else {
            try result.appendSlice("powerpc" ++ s ++ "powerpc32");
        }
    } else if (arch.isLoongArch()) {
        try result.appendSlice("loongarch");
    }
    try result.appendSlice(s);
    try result.appendSlice(basename);
    return result.items;
}

fn add_include_dirs(comp: *Compilation, arena: Allocator, args: *std.ArrayList([]const u8)) error{OutOfMemory}!void {
    const target = comp.getTarget();
    const opt_nptl: ?[]const u8 = if (target.os.tag == .linux) "nptl" else "htl";
    const s = path.sep_str;
    try args.append("-I");
    try args.append(try lib_path(comp, arena, lib_libc_glibc ++ "include"));
    if (target.os.tag == .linux) {
        try add_include_dirs_arch(arena, args, target, null, try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps" ++ s ++ "unix" ++ s ++ "sysv" ++ s ++ "linux"));
    }
    if (opt_nptl) |nptl| {
        try add_include_dirs_arch(arena, args, target, nptl, try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps"));
    }
    if (target.os.tag == .linux) {
        try args.append("-I");
        try args.append(try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps" ++ s ++
            "unix" ++ s ++ "sysv" ++ s ++ "linux" ++ s ++ "generic"));
        try args.append("-I");
        try args.append(try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps" ++ s ++
            "unix" ++ s ++ "sysv" ++ s ++ "linux" ++ s ++ "include"));
        try args.append("-I");
        try args.append(try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps" ++ s ++
            "unix" ++ s ++ "sysv" ++ s ++ "linux"));
    }
    if (opt_nptl) |nptl| {
        try args.append("-I");
        try args.append(try path.join(arena, &[_][]const u8{ comp.zig_lib_directory.path.?, lib_libc_glibc ++ "sysdeps", nptl }));
    }
    try args.append("-I");
    try args.append(try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps" ++ s ++ "pthread"));
    try args.append("-I");
    try args.append(try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps" ++ s ++ "unix" ++ s ++ "sysv"));
    try add_include_dirs_arch(arena, args, target, null, try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps" ++ s ++ "unix"));
    try args.append("-I");
    try args.append(try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps" ++ s ++ "unix"));
    try add_include_dirs_arch(arena, args, target, null, try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps"));
    try args.append("-I");
    try args.append(try lib_path(comp, arena, lib_libc_glibc ++ "sysdeps" ++ s ++ "generic"));
    try args.append("-I");
    try args.append(try path.join(arena, &[_][]const u8{ comp.zig_lib_directory.path.?, lib_libc ++ "glibc" }));
    try args.append("-I");
    try args.append(try std.fmt.allocPrint(arena, "{s}" ++ s ++ "libc" ++ s ++ "include" ++ s ++ "{s}-{s}-{s}", .{
        comp.zig_lib_directory.path.?, @tagName(target.cpu.arch), @tagName(target.os.tag), @tagName(target.abi),
    }));
    try args.append("-I");
    try args.append(try lib_path(comp, arena, lib_libc ++ "include" ++ s ++ "generic-glibc"));
    const arch_name = target.osArchName();
    try args.append("-I");
    try args.append(try std.fmt.allocPrint(arena, "{s}" ++ s ++ "libc" ++ s ++ "include" ++ s ++ "{s}-linux-any", .{
        comp.zig_lib_directory.path.?, arch_name,
    }));
    try args.append("-I");
    try args.append(try lib_path(comp, arena, lib_libc ++ "include" ++ s ++ "any-linux-any"));
}

fn add_include_dirs_arch(
    arena: Allocator,
    args: *std.ArrayList([]const u8),
    target: std.Target,
    opt_nptl: ?[]const u8,
    dir: []const u8,
) error{OutOfMemory}!void {
    const arch = target.cpu.arch;
    const is_x86 = arch.isX86();
    const is_aarch64 = arch.isAARCH64();
    const is_ppc = arch.isPowerPC();
    const is_sparc = arch.isSPARC();
    const is_64 = target.ptrBitWidth() == 64;
    const s = path.sep_str;
    if (is_x86) {
        if (arch == .x86_64) {
            if (opt_nptl) |nptl| {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "x86_64", nptl }));
            } else {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "x86_64" }));
            }
        } else if (arch == .x86) {
            if (opt_nptl) |nptl| {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "i386", nptl }));
            } else {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "i386" }));
            }
        }
        if (opt_nptl) |nptl| {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "x86", nptl }));
        } else {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "x86" }));
        }
    } else if (arch.isARM()) {
        if (opt_nptl) |nptl| {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "arm", nptl }));
        } else {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "arm" }));
        }
    } else if (arch.isMIPS()) {
        if (opt_nptl) |nptl| {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "mips", nptl }));
        } else {
            if (is_64) {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "mips" ++ s ++ "mips64" }));
            } else {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "mips" ++ s ++ "mips32" }));
            }
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "mips" }));
        }
    } else if (is_sparc) {
        if (opt_nptl) |nptl| {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "sparc", nptl }));
        } else {
            if (is_64) {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "sparc" ++ s ++ "sparc64" }));
            } else {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "sparc" ++ s ++ "sparc32" }));
            }
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "sparc" }));
        }
    } else if (is_aarch64) {
        if (opt_nptl) |nptl| {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "aarch64", nptl }));
        } else {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "aarch64" }));
        }
    } else if (is_ppc) {
        if (opt_nptl) |nptl| {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "powerpc", nptl }));
        } else {
            if (is_64) {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "powerpc" ++ s ++ "powerpc64" }));
            } else {
                try args.append("-I");
                try args.append(try path.join(arena, &[_][]const u8{ dir, "powerpc" ++ s ++ "powerpc32" }));
            }
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "powerpc" }));
        }
    } else if (arch.isRISCV()) {
        if (opt_nptl) |nptl| {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "riscv", nptl }));
        } else {
            try args.append("-I");
            try args.append(try path.join(arena, &[_][]const u8{ dir, "riscv" }));
        }
    } else if (arch.isLoongArch()) {
        try args.append("-I");
        try args.append(try path.join(arena, &[_][]const u8{ dir, "loongarch" }));
    }
}

Given these facts, I think we could enhance process_headers.zig to exploit this knowledge and merge these include directories together (while asserting for safety that there are no actual diffs between them). Then we'd just update src/glibc.zig to have very slightly smarter include directory selection.
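As a rough illustration of the "merge while asserting there are no actual diffs" idea, the sketch below merges two in-memory header sets keyed by relative path and refuses to continue if the same path carries different contents. Everything here is an assumption made for illustration: `mergeHeaders`, the error name `ConflictingHeader`, and the use of `std.StringHashMap` are not taken from process_headers.zig, which may represent headers quite differently.

```zig
const std = @import("std");

const MergeError = error{ ConflictingHeader, OutOfMemory };

/// Hypothetical sketch: copies every (path, contents) pair from `src` into `dst`.
/// A path present in both sets must have byte-identical contents, otherwise
/// the merge is rejected. Entries copied before a conflict is found stay in `dst`.
fn mergeHeaders(
    dst: *std.StringHashMap([]const u8),
    src: *const std.StringHashMap([]const u8),
) MergeError!void {
    var it = src.iterator();
    while (it.next()) |entry| {
        const gop = try dst.getOrPut(entry.key_ptr.*);
        if (gop.found_existing) {
            // Same path in both directories: only acceptable if identical.
            if (!std.mem.eql(u8, gop.value_ptr.*, entry.value_ptr.*))
                return error.ConflictingHeader;
        } else {
            // Path only exists in `src` (e.g. a per-ABI stubs header).
            gop.value_ptr.* = entry.value_ptr.*;
        }
    }
}

test "identical headers merge, differing ones are rejected" {
    const gpa = std.testing.allocator;
    var a = std.StringHashMap([]const u8).init(gpa);
    defer a.deinit();
    var b = std.StringHashMap([]const u8).init(gpa);
    defer b.deinit();

    try a.put("gnu/stubs.h", "dispatch");
    try a.put("gnu/stubs-lp64d.h", "64-bit stubs");
    try b.put("gnu/stubs.h", "dispatch");
    try b.put("gnu/stubs-ilp32d.h", "32-bit stubs");

    try mergeHeaders(&a, &b);
    try std.testing.expect(a.count() == 3);

    // Introduce a real difference and check that the merge is refused.
    try b.put("gnu/stubs.h", "something else");
    try std.testing.expectError(error.ConflictingHeader, mergeHeaders(&a, &b));
}
```

Failing hard on a conflict, rather than picking one side, matches the "asserting for safety" wording: a mismatch means the directories are not actually mergeable and needs a human decision.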

@wooster0
Contributor

wooster0 commented Sep 3, 2024

Sounds a bit like https://github.com/ziglang/universal-headers?

@alexrp
Member Author

alexrp commented Sep 3, 2024

Not exactly. The idea there is to have a single set of headers to cover everything. Here I'm just talking about merging headers for targets when they're literally identical already.

@alexrp alexrp added the zig cc and enhancement labels Oct 3, 2024
@alexrp alexrp added this to the unplanned milestone Oct 3, 2024
@alexrp alexrp removed the zig cc label Oct 4, 2024
@alexrp
Member Author

alexrp commented Feb 18, 2025

Looks like we can delete somewhere in the ballpark of ~550 headers by doing this, which takes lib/libc/include/*gnu* from ~2.9M to ~1.3M.

@alexrp alexrp self-assigned this Apr 11, 2025
@alexrp alexrp added the libc label Apr 11, 2025
@alexrp alexrp modified the milestones: unplanned, 0.15.0 Apr 11, 2025
@alexrp
Member Author

alexrp commented May 19, 2025

Note to self, these groups of directories can be merged:

  • lib/libc/include/{aarch64,aarch64_be}-linux-gnu
  • lib/libc/include/{arm,armeb}-linux-{gnueabi,gnueabihf}
  • lib/libc/include/csky-linux-{gnueabi,gnueabihf}
    • Caveat: gnu/lib-names.h and gnu/stubs.h will need manual patching for float ABI.
  • lib/libc/include/loongarch64-linux-{gnu,gnusf}
  • lib/libc/include/{{mips,mipsel}-linux-{gnueabi,gnueabihf},{mips64,mips64el}-linux-{gnuabi64,gnuabin32}}
  • lib/libc/include/{powerpc-linux-{gnueabi,gnueabihf},{powerpc64,powerpc64le}-linux-gnu}
    • Caveat: bits/long-double.h may need manual patching for long double ABI on powerpc64le.
  • lib/libc/include/{riscv32,riscv64}-linux-gnu
  • lib/libc/include/{sparc,sparc64}-linux-gnu
  • lib/libc/include/{x86-linux-gnu,x86_64-linux-{gnu,gnux32}}
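The "very slightly smarter include directory selection" could look roughly like the sketch below, which collapses the architecture variants from the groups above into one directory name per group. This is a sketch under stated assumptions: it works on plain architecture name strings rather than `std.Target`, the helper names `mergedArchName` and `mergedIncludeDir` are hypothetical, and the merged directory names are guesses, except that `csky-linux-gnu` and `powerpc-linux-gnu` appear in the patch list of the commit referenced in this issue.

```zig
const std = @import("std");

const Entry = struct { arch: []const u8, merged: []const u8 };

// Assumed mapping from architecture name to the shared name of its group.
// Architectures not listed here are assumed to keep their own name.
const merged_archs = [_]Entry{
    .{ .arch = "aarch64_be", .merged = "aarch64" },
    .{ .arch = "armeb", .merged = "arm" },
    .{ .arch = "mipsel", .merged = "mips" },
    .{ .arch = "mips64", .merged = "mips" },
    .{ .arch = "mips64el", .merged = "mips" },
    .{ .arch = "powerpc64", .merged = "powerpc" },
    .{ .arch = "powerpc64le", .merged = "powerpc" },
    .{ .arch = "riscv32", .merged = "riscv" },
    .{ .arch = "riscv64", .merged = "riscv" },
    .{ .arch = "sparc64", .merged = "sparc" },
    .{ .arch = "x86_64", .merged = "x86" },
};

/// Hypothetical helper: returns the group name for `arch`, or `arch` itself.
fn mergedArchName(arch: []const u8) []const u8 {
    for (merged_archs) |e| {
        if (std.mem.eql(u8, e.arch, arch)) return e.merged;
    }
    return arch;
}

/// Hypothetical helper: formats the merged directory name into `buf`.
/// The ABI component is always "gnu" because the float/pointer ABI variants
/// (gnueabihf, gnux32, ...) are assumed to share one directory after merging.
fn mergedIncludeDir(buf: []u8, arch: []const u8) ![]const u8 {
    return std.fmt.bufPrint(buf, "{s}-linux-gnu", .{mergedArchName(arch)});
}

test "variants of one group map to the same directory" {
    var buf: [64]u8 = undefined;
    try std.testing.expectEqualStrings("powerpc-linux-gnu", try mergedIncludeDir(&buf, "powerpc64le"));
    try std.testing.expectEqualStrings("csky-linux-gnu", try mergedIncludeDir(&buf, "csky"));
    try std.testing.expectEqualStrings("x86-linux-gnu", try mergedIncludeDir(&buf, "x86_64"));
}
```

An explicit table is used instead of prefix matching so that adding a new architecture never silently lands in an existing group.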

@alexrp
Member Author

alexrp commented May 19, 2025

> Looks like we can delete somewhere in the ballpark of ~550 headers by doing this, which takes lib/libc/include/*gnu* from ~2.9M to ~1.3M.

Correction: 616 headers can be removed, for a size reduction from 2.9M to 1.2M.

@alexrp
Member Author

alexrp commented May 29, 2025

Semi-related: lib/libc/include/arm-netbsd-{eabi,eabihf}, lib/libc/include/mips-netbsd-{eabi,eabihf}, and lib/libc/include/powerpc-netbsd-{eabi,eabihf} can be merged for some modest wins.

alexrp added a commit to alexrp/zig that referenced this issue May 29, 2025

…able.

Manual patches:

* lib/libc/include/csky-linux-gnu/gnu/{lib-names,stubs}.h
* lib/libc/include/powerpc-linux-gnu/bits/long-double.h

Takes lib/libc/include from 115.5 MB to 113.4 MB.

Closes ziglang#21258.
@alexrp alexrp closed this as completed in 63a9048 Jun 4, 2025
@alexrp alexrp removed their assignment Jun 4, 2025