Add support for target details (CPUs and their supported features) #4264

Merged: 64 commits from layneson-cpus_and_features into master on Jan 26, 2020

Conversation

@andrewrk (Member) commented Jan 22, 2020

This is @layneson's pull request #3927, merged into a branch, and then a bunch of my commits piled on top of it, with several major changes:

  • Using a bit set for CPU features, to avoid having std.Target ever require heap allocation (see the sketch after this list)
  • Adding the concept of "baseline" CPU features into Zig and exposing it, rather than implicitly relying on LLVM for this.
  • Exposing the set of CPU features to builtin.zig so that comptime code has access to the information.
  • Moving zig targets entirely into userland, and implementing "make zig targets emit well-formed json" (#4213). (Btw @LemonBoy, I experimented with piping the output into jq and it works quite well.)
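
For the first item, here is a minimal sketch of the bit-set idea, with illustrative names (the real std.Target.Cpu.Feature.Set stores an array of integer words; a single wide integer is enough here to show why no allocator is ever needed):

    const std = @import("std");

    // Illustrative sketch, not the actual std.Target.Cpu.Feature.Set: a
    // fixed-width integer acts as the bit set, so building or combining
    // feature sets never touches the heap.
    const FeatureSet = struct {
        bits: u512,

        pub const empty = FeatureSet{ .bits = 0 };

        pub fn addFeature(self: *FeatureSet, index: u9) void {
            self.bits |= @as(u512, 1) << index;
        }

        pub fn isEnabled(self: FeatureSet, index: u9) bool {
            return (self.bits >> index) & 1 != 0;
        }
    };

    test "feature sets need no heap allocation" {
        var set = FeatureSet.empty;
        set.addFeature(42);
        std.debug.assert(set.isEnabled(42));
        std.debug.assert(!set.isEnabled(43));
    }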

This is complete and ready to be merged into master. The only thing left is getting the CI green.

layneson and others added 30 commits on January 19, 2020; selected commit messages follow.
Previously, buffers were used with toOwnedSlice() to create C strings
for the LLVM cpu/feature names. However, toOwnedSlice() shrinks the
string memory to the buffer's length, which cuts off the null terminator.
Now toSliceConst() is used instead, and the buffer is not deinited,
so the string memory is not freed.
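
A hedged illustration of the bug (not the original code, which used std.Buffer): shrinking to the logical string length is exactly what drops the terminator, while a sentinel-checked view of the same memory keeps it.

    const std = @import("std");

    test "llvm cpu string must keep its null terminator" {
        // "skylake" plus the explicit 0 byte that LLVM's C API expects.
        const buf = [_]u8{ 's', 'k', 'y', 'l', 'a', 'k', 'e', 0 };

        // A [:0] slice checks that the byte just past the end is 0; nothing
        // is shrunk or freed, so the terminator survives.
        const c_str: [:0]const u8 = buf[0 .. buf.len - 1 :0];
        try std.testing.expectEqualStrings("skylake", c_str);
    }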
see BRANCH_TODO file
to avoid an illegal instruction error with the older qemu
version that is available on the CI server.
See #508. These can be re-enabled when we upgrade to LLVM 10.
comment from this commit reproduced here:

I have observed the CPU name reported by LLVM being incorrect. On
the SourceHut build services, LLVM 9.0 reports the CPU as "athlon-xp",
which is a 32-bit CPU, even though the system is 64-bit and the reported
CPU features include, among other things, +64bit.
So the strategy taken here is to observe both the reported CPU and the
reported CPU features. The features are trusted more; but if the reported
features exactly match the feature set of the reported CPU, then we trust
the reported CPU name.
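
A hedged sketch of that strategy (illustrative types, not the actual detection code; u512 stands in for the feature bit set):

    const std = @import("std");

    const CpuModel = struct {
        name: []const u8,
        features: u512,
    };

    /// Trust the reported features; trust the reported CPU name only when
    /// its known feature set matches the reported features exactly.
    fn resolveCpu(
        reported_name: []const u8,
        reported_features: u512,
        known_models: []const CpuModel,
    ) CpuModel {
        for (known_models) |model| {
            if (std.mem.eql(u8, model.name, reported_name) and
                model.features == reported_features)
            {
                return model;
            }
        }
        // The name disagrees with the features, so fall back to a generic
        // CPU carrying only the features that were actually reported.
        return CpuModel{ .name = "generic", .features = reported_features };
    }

    test "fall back to generic when name and features disagree" {
        const known = [_]CpuModel{
            .{ .name = "athlon-xp", .features = 0b0011 },
        };
        const cpu = resolveCpu("athlon-xp", 0b0111, &known);
        std.debug.assert(std.mem.eql(u8, cpu.name, "generic"));
        std.debug.assert(cpu.features == 0b0111);
    }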
This reverts commit 4640ef5.

This attempted workaround did not have the desired effect.
@@ -496,7 +864,7 @@ pub const Target = union(enum) {
     pub fn parseArchSub(text: []const u8) ParseArchSubError!Arch {
         const info = @typeInfo(Arch);
         inline for (info.Union.fields) |field| {
-            if (mem.eql(u8, text, field.name)) {
+            if (mem.startsWith(u8, text, field.name)) {

Collaborator:

Why startsWith instead of eql?

@andrewrk (Member, Author) replied Jan 22, 2020:

example: aarch64v8_5a-linux-musl

related: #4261
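
To make that concrete, a small illustrative test (not the real parseArchSub):

    const std = @import("std");
    const mem = std.mem;

    // An exact eql can never match an arch name carrying a sub-arch suffix,
    // so the parser matches the arch name as a prefix and parses the
    // remainder as the sub-architecture.
    test "arch component may carry a sub-arch suffix" {
        const text = "aarch64v8_5a"; // from "aarch64v8_5a-linux-musl"
        const arch_name = "aarch64";

        std.debug.assert(!mem.eql(u8, text, arch_name));
        std.debug.assert(mem.startsWith(u8, text, arch_name));

        const sub_arch = text[arch_name.len..];
        std.debug.assert(mem.eql(u8, sub_arch, "v8_5a"));
    }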

src/main.cpp: outdated review thread, resolved
@andrewrk (Member, Author) commented Jan 22, 2020:

It looks like there is a problem with determining what CPU features to report to LLVM for the "baseline" CPUs of aarch64. @LemonBoy if you are interested in this, I could really use your expertise here.

std/target.zig

        /// The "default" set of CPU features for cross-compiling. A conservative set
        /// of features that is expected to be supported on most available hardware.
        pub fn baselineFeatures(arch: Arch) Cpu.Feature.Set {
            return switch (arch) {
                .arm, .armeb, .thumb, .thumbeb => arm.cpu.generic.features,
                .aarch64, .aarch64_be, .aarch64_32 => aarch64.cpu.generic.features,
                .avr => avr.baseline_features,
                .bpfel, .bpfeb => bpf.cpu.generic.features,
                .hexagon => hexagon.cpu.generic.features,
                .mips, .mipsel => mips.cpu.mips32.features,
                .mips64, .mips64el => mips.cpu.mips64.features,
                .msp430 => msp430.cpu.generic.features,
                .powerpc, .powerpc64, .powerpc64le => powerpc.cpu.generic.features,
                .amdgcn => amdgpu.cpu.generic.features,
                .riscv32 => riscv.baseline_32_features,
                .riscv64 => riscv.baseline_64_features,
                .sparc, .sparcv9, .sparcel => sparc.cpu.generic.features,
                .s390x => systemz.cpu.generic.features,
                .i386 => x86.cpu.pentium4.features,
                .x86_64 => x86.cpu.x86_64.features,
                .nvptx, .nvptx64 => nvptx.cpu.sm_20.features,
                .wasm32, .wasm64 => wasm.cpu.generic.features,

                else => Cpu.Feature.Set.empty,
            };
        }

I tried to make this match what we are doing in the master branch, but I must have done something wrong for aarch64. There's a new flag, --verbose-llvm-cpu-features, to help troubleshoot this. The master branch reports an empty string (leaving LLVM to choose the defaults); in this branch we take matters into our own hands and always specify a set of CPU features to LLVM.

@andrewrk (Member, Author) commented Jan 22, 2020:

I'm pretty sure every CI failure comes down to passing createTargetMachine a different llvm_target_cpu_features string than master does (master passes an empty string when cross-compiling, and the result of ZigLLVMGetTargetFeatures() directly for native builds).

But rather than reverting to that behavior, the goal is to improve zig's handling of target CPU features so that it can expose the options at comptime as well as give the correct string to LLVM.
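
For a sense of what exposing the options at comptime buys, here is a hedged sketch; featureSetHas and builtin.cpu reflect the API as it eventually settled, and the exact names at the time of this PR are an assumption.

    const std = @import("std");
    const builtin = @import("builtin");

    pub fn main() void {
        // Resolved entirely at compile time: no runtime CPU detection.
        const has_avx2 = comptime blk: {
            if (builtin.cpu.arch != .x86_64) break :blk false;
            break :blk std.Target.x86.featureSetHas(builtin.cpu.features, .avx2);
        };
        std.debug.print("compiled with avx2: {}\n", .{has_avx2});
    }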

};
result[@enumToInt(Feature.e)] = .{
.llvm_name = "e",
.description = "Implements RV32E (provides 16 rather than 32 GPRs)",

Collaborator:

Should RV32I/RV32E/RV64I/RV128I be sub-architectures rather than features?
CC @xobs

@andrewrk (Member, Author) replied:

related: #4261

lib/std/build.zig: outdated review thread, resolved
@LemonBoy (Contributor) commented Jan 22, 2020:

Ok so, here's some good news. The problem with the sr.ht servers is due to the fact that the cpu parameter is set to qemu64 (see here), because FreeBSD apparently has some issues with the host cpu.

You can reproduce the problem with qemu user-mode emulation for x86_64:

qemu-x86_64 -cpu qemu64 ~/code/zig/build/zig0 --verbose-llvm-cpu-features build-exe --cache off ~/code/zig/test/standalone/hello_world/hello.zig

This reports

target_specific_cpu_args=athlon-xp

You can also double-check how qemu defines the qemu64 CPU.

The commit that worked around this problem can be reverted and the sr.ht CI script amended to pick a suitable target.

Edit: Here's the upstream bug report.

@andrewrk (Member, Author) replied:

> The commit that worked around this problem can be reverted and the sr.ht CI script amended to pick a suitable target.

This would result in Zig failing to compile anything on that platform unless the user worked around the issue. That's not right; zig should be able to compile code in this environment without a workaround hack done by the user. LLVM is incorrect: the CPU is not a (32-bit!) athlon-xp, despite whatever the OS configuration says otherwise.

I'm also concerned about LLVM's reports of cpu features on one of my laptops. It reports "skylake" for the CPU, which is correct, but then it lists the features as:

"64bit",
"adx",
"aes",
"avx",
"avx2",
"bmi",
"bmi2",
"clflushopt",
"cmov",
"cx16",
"cx8",
"f16c",
"fma",
"fsgsbase",
"fxsr",
"invpcid",
"lzcnt",
"macrofusion",
"mmx",
"movbe",
"mpx",
"nopl",
"pclmul",
"popcnt",
"prfchw",
"rdrnd",
"rdseed",
"rtm",
"sahf",
"sgx",
"slow_3ops_lea",
"slow_incdec",
"sse",
"sse2",
"sse3",
"sse4_1",
"sse4_2",
"ssse3",
"x87",
"xsave",
"xsavec",
"xsaveopt",
"xsaves",

But the list of features in LLVM's skylake definition is:

"64bit",
"adx",
"aes",
"avx",
"avx2",
"bmi",
"bmi2",
"clflushopt",
"cmov",
"cx16",
"cx8",
"ermsb",
"f16c",
"false_deps_popcnt",
"fast_gather",
"fast_scalar_fsqrt",
"fast_shld_rotate",
"fast_variable_shuffle",
"fast_vector_fsqrt",
"fma",
"fsgsbase",
"fxsr",
"idivq_to_divl",
"invpcid",
"lzcnt",
"macrofusion",
"merge_to_threeway_branch",
"mmx",
"movbe",
"mpx",
"nopl",
"pclmul",
"popcnt",
"prfchw",
"rdrnd",
"rdseed",
"sahf",
"sgx",
"slow_3ops_lea",
"sse4_2",
"x87",
"xsave",
"xsavec",
"xsaveopt",
"xsaves",

This list contains more entries than what was reported as available CPU features. So it would presumably be incorrect to depend on, for example, fast_gather just because the CPU name is skylake and the skylake definition includes that feature: for some reason, LLVM has detected that the native host does not actually support it.
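
The discrepancy is easy to check mechanically; an illustrative test over a few of the entries above:

    const std = @import("std");
    const mem = std.mem;

    // LLVM's "skylake" definition claims features (e.g. ermsb, fast_gather)
    // that the detected host did not report.
    test "cpu model may claim more features than the host reports" {
        const detected = [_][]const u8{ "avx2", "fma", "sse4_2" };
        const skylake = [_][]const u8{ "avx2", "ermsb", "fast_gather", "fma", "sse4_2" };

        var missing: usize = 0;
        for (skylake) |model_feature| {
            var found = false;
            for (detected) |host_feature| {
                if (mem.eql(u8, model_feature, host_feature)) found = true;
            }
            if (!found) missing += 1;
        }
        std.debug.assert(missing == 2); // ermsb and fast_gather
    }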

Previously it was a tagged union which was one of:
 * baseline
 * a specific CPU
 * a set of features

Now, it's possible to have a CPU but also modify the CPU's feature set
on top of that. This is closer to what LLVM does.

This is more correct because Zig's notion of CPUs (and LLVM's) is not
exact CPU models. For example "skylake" is not one very specific model;
there are several different pieces of hardware that match "skylake" that
have different feature sets enabled.
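
A hedged sketch of the new shape (illustrative declarations, not the exact std.Target code; u512 again stands in for the feature bit set):

    const std = @import("std");

    const Model = struct {
        name: []const u8,
        features: u512,
    };

    // A CPU is a model plus a feature set initialized from that model,
    // which can then be edited per target, e.g. for hypothetical
    // "skylake+featA-featB" style overrides.
    const Cpu = struct {
        model: *const Model,
        features: u512,

        fn fromModel(model: *const Model) Cpu {
            return Cpu{ .model = model, .features = model.features };
        }

        fn addFeature(self: *Cpu, index: u9) void {
            self.features |= @as(u512, 1) << index;
        }

        fn removeFeature(self: *Cpu, index: u9) void {
            self.features &= ~(@as(u512, 1) << index);
        }
    };

    test "a model's feature set can be edited per target" {
        const skylake = Model{ .name = "skylake", .features = 0b1011 };
        var cpu = Cpu.fromModel(&skylake);
        cpu.removeFeature(1);
        cpu.addFeature(2);
        std.debug.assert(cpu.features == 0b1101);
    }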
hopefully this avoids the older qemu version crashing
@andrewrk (Member, Author):

These were never working with native CPU features. In this branch,
we fix native CPU features not being enabled on Windows, and regress
f128 language features. In the llvm10 branch, all this is fixed,
and the tests are re-enabled.
tests use older sub-arch that works in the older qemu
@andrewrk added the "breaking" label (implementing this issue could cause existing code to no longer compile or have different behavior) on Jan 23, 2020.
in stack tracing code, the idea was to detect the tty settings at the
top of the stack and pass the information down. somewhere along the way
this got changed, so that setTtyColor assumed the global stderr_file
was related to the output stream the stack trace was being printed to.

now, tty_color is changed to tty_config, and it is an enum rather than a
bool, telling how tty colors are expected to be handled. windows is
still incorrectly looking at stderr_file.
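
A sketch of the enum described above (illustrative variant names; the shipped std.debug type may spell them differently):

    const std = @import("std");

    // "How to color" is three-valued, not a bool: off, ANSI escape codes
    // written into the same output stream, or Windows console API calls
    // (the variant that still peeks at stderr_file).
    const TtyConfig = enum {
        no_color,
        escape_codes,
        windows_api,
    };

    fn redPrefix(config: TtyConfig) []const u8 {
        return switch (config) {
            .escape_codes => "\x1b[31;1m",
            .no_color, .windows_api => "",
        };
    }

    test "only escape-code ttys get ANSI sequences" {
        try std.testing.expectEqualStrings("", redPrefix(.no_color));
        try std.testing.expectEqualStrings("\x1b[31;1m", redPrefix(.escape_codes));
    }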
@andrewrk merged commit 96e5f47 into master on Jan 26, 2020.
@andrewrk deleted the layneson-cpus_and_features branch on January 26, 2020 at 14:57.