thumb bit no longer set in reset entry in vector table #4631

Closed
markfirmware opened this issue Mar 5, 2020 · 8 comments

@markfirmware
Contributor

Before the last few commits to master, this build.zig and this vector table would set the LSB of the .long mission1_main entry in the vector table. That bit is needed to start in Thumb mode. I think something has changed with regard to CPU and features, but I don't know how to address this.

Thanks,
Mark

Updated for latest zig:
https://github.com/markfirmware/zig-bare-metal-microbit/tree/thumb-bit

exe.setTheTarget(try std.Target.parse(.{
    .arch_os_abi = "thumb-freestanding-none",
}));

comptime {
    asm (
        \\.section .text.start.mission1
        \\.globl mission1_vector_table
        \\.balign 0x80
        \\mission1_vector_table:
        \\ .long 0x20004000 // sp top of 16KB ram
        \\ .long mission1_main
    );
}
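
To illustrate what the .long mission1_main entry is expected to contain, here is a minimal Python sketch of the Thumb-bit convention: on Cortex-M parts, bit 0 of every handler address stored in the vector table must be 1. The function names and the example address 0x100 are hypothetical and are not taken from the linked repository.

```python
def reset_vector_entry(handler_address: int) -> int:
    """Return the 32-bit word to store in the vector table for a Thumb handler."""
    # Thumb instructions are halfword-aligned, so bit 0 of the real address is 0.
    if handler_address & 1:
        raise ValueError("handler address must be halfword-aligned")
    # Setting bit 0 marks the branch target as Thumb code.
    return handler_address | 1


def is_thumb_entry(entry: int) -> bool:
    """Check whether a vector table word has the Thumb bit set."""
    return (entry & 1) == 1


if __name__ == "__main__":
    entry = reset_vector_entry(0x100)  # hypothetical address of mission1_main
    print(hex(entry), is_thumb_entry(entry))
```

An entry without the bit set (for example 0x100 itself) is what the regression described here would produce.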
andrewrk added this to the 0.6.0 milestone Mar 5, 2020
andrewrk added the bug and frontend labels Mar 5, 2020
@LemonBoy
Contributor

LemonBoy commented Mar 5, 2020

The whole build target thing has become a Rube Goldberg...

Your problem is that thumb-freestanding-none is not enough to specify all your target features: it defaults to something without noarm and with thumb2 only, which is definitely not what your armv6m wants.

Use something like:

    exe.setTarget(.{
        .cpu_arch = .thumb,
        .os_tag = .freestanding,
        .abi = .none,
        .cpu_model = .{ .explicit = &std.Target.arm.cpu.cortex_m0 },
    });

@andrewrk
Member

andrewrk commented Mar 5, 2020

> The whole build target thing has become a Rube Goldberg...

Why do you say that? I think if you take some time to see how it works, you will find it to be quite sane.

I don't think @markfirmware needs to change the target. This looks like a regression. Previously, thumb-freestanding-none was enabling some (correct) set of CPU features, and now it is not. That can be fixed.

@LemonBoy
Contributor

LemonBoy commented Mar 5, 2020

> This looks like a regression

Not really. The previous (and bad) baseline was ARMv6, and as usual it only worked by luck :) This is the correct way to build the project as the Microbit is equipped with a Cortex-M0, there's no need to play the baseline-cpu lottery anymore.

@justinbalexander
Contributor

justinbalexander commented Mar 5, 2020 via email

How is a user supposed to know what features are enabled for a given cpu? I believe the features are from LLVM so would the following page be a good reference for ARM?

http://www.llvm.org/svn/llvm-project/llvm/trunk/lib/Target/ARM/ARM.td

From that page, here is the definition of cortex-m7.

def : ProcNoItin<"cortex-m7", [ARMv7em, FeatureFPARMv8_D16]>;

You can see(?) that it includes the floating point unit for that architecture.

If you look at the datasheet for the cortex-m7 however, the floating point unit is listed as optional. Here it is included by default.

So it would appear that you don't have to specify the FPU if you are using the cortex-m7? How could a user know that?

Am I misunderstanding something? Why do you need to specify the fpu when using clang for example?

Also, how would you select b/t the three ABIs for floating point? Is thumbv7em-freestanding-eabi for a cortex-m7 soft-float using FPU or software floating point emulation?

Thanks,

@andrewrk
Member

andrewrk commented Mar 5, 2020

> This is the correct way to build the project as the Microbit is equipped with a Cortex-M0, there's no need to play the baseline-cpu lottery anymore.

Ah! OK, this is a success story for the new CPU features then. I got confused by the Rube Goldberg remark and thought you were suggesting a workaround.

> How is a user supposed to know what features are enabled for a given cpu?

andy@ark ~> zig targets | jq keys
[
  "abi",
  "arch",
  "cpuFeatures",
  "cpus",
  "glibc",
  "libc",
  "native",
  "os"
]
andy@ark ~> zig targets | jq .cpus | jq keys
[
  "aarch64",
  "aarch64_32",
  "aarch64_be",
  "amdgcn",
  "amdil",
  "amdil64",
  "arc",
  "arm",
  "armeb",
  "avr",
  "bpfeb",
  "bpfel",
  "hexagon",
  "hsail",
  "hsail64",
  "i386",
  "kalimba",
  "lanai",
  "le32",
  "le64",
  "mips",
  "mips64",
  "mips64el",
  "mipsel",
  "msp430",
  "nvptx",
  "nvptx64",
  "powerpc",
  "powerpc64",
  "powerpc64le",
  "r600",
  "renderscript32",
  "renderscript64",
  "riscv32",
  "riscv64",
  "s390x",
  "shave",
  "sparc",
  "sparcel",
  "sparcv9",
  "spir",
  "spir64",
  "tce",
  "tcele",
  "thumb",
  "thumbeb",
  "wasm32",
  "wasm64",
  "x86_64",
  "xcore"
]
andy@ark ~> zig targets | jq .cpus.arm
{
  "arm1020e": [
    "v5te"
  ],
  "arm1020t": [
    "v5t"
  ],
  "arm1022e": [
    "v5te"
  ],
  "arm10e": [
    "v5te"
  ],
  "arm10tdmi": [
    "v5t"
  ],
  "arm1136j_s": [
    "v6"
  ],
  "arm1136jf_s": [
    "slowfpvmlx",
    "v6",
    "vfp2"
  ],
  "arm1156t2_s": [
    "v6t2"
  ],
  "arm1156t2f_s": [
    "slowfpvmlx",
    "v6t2",
    "vfp2"
  ],
  "arm1176j_s": [
    "v6kz"
  ],
  "arm1176jz_s": [
    "v6kz"
  ],
  "arm1176jzf_s": [
    "slowfpvmlx",
    "v6kz",
    "vfp2"
  ],
  "arm710t": [
    "v4t"
  ],
  "arm720t": [
    "v4t"
  ],
  "arm7tdmi": [
    "v4t"
  ],
  "arm7tdmi_s": [
    "v4t"
  ],
  "arm8": [
    "v4"
  ],
  "arm810": [
    "v4"
  ],
  "arm9": [
    "v4t"
  ],
  "arm920": [
    "v4t"
  ],
  "arm920t": [
    "v4t"
  ],
  "arm922t": [
    "v4t"
  ],
  "arm926ej_s": [
    "v5te"
  ],
  "arm940t": [
    "v4t"
  ],
  "arm946e_s": [
    "v5te"
  ],
  "arm966e_s": [
    "v5te"
  ],
  "arm968e_s": [
    "v5te"
  ],
  "arm9e": [
    "v5te"
  ],
  "arm9tdmi": [
    "v4t"
  ],
  "cortex_a12": [
    "avoid_partial_cpsr",
    "mp",
    "ret_addr_stack",
    "trustzone",
    "v7a",
    "vfp4",
    "virtualization",
    "vmlx_forwarding"
  ],
  "cortex_a15": [
    "avoid_partial_cpsr",
    "dont_widen_vmovs",
    "mp",
    "muxed_units",
    "ret_addr_stack",
    "splat_vfp_neon",
    "trustzone",
    "v7a",
    "vfp4",
    "virtualization",
    "vldn_align"
  ],
  "cortex_a17": [
    "avoid_partial_cpsr",
    "mp",
    "ret_addr_stack",
    "trustzone",
    "v7a",
    "vfp4",
    "virtualization",
    "vmlx_forwarding"
  ],
  "cortex_a32": [
    "crc",
    "crypto",
    "hwdiv",
    "hwdiv_arm",
    "v8a"
  ],
  "cortex_a35": [
    "crc",
    "crypto",
    "hwdiv",
    "hwdiv_arm",
    "v8a"
  ],
  "cortex_a5": [
    "mp",
    "ret_addr_stack",
    "slow_fp_brcc",
    "slowfpvmlx",
    "trustzone",
    "v7a",
    "vfp4",
    "vmlx_forwarding"
  ],
  "cortex_a53": [
    "crc",
    "crypto",
    "fpao",
    "hwdiv",
    "hwdiv_arm",
    "v8a"
  ],
  "cortex_a55": [
    "dotprod",
    "hwdiv",
    "hwdiv_arm",
    "v8_2a"
  ],
  "cortex_a57": [
    "avoid_partial_cpsr",
    "cheap_predicable_cpsr",
    "crc",
    "crypto",
    "fpao",
    "hwdiv",
    "hwdiv_arm",
    "v8a"
  ],
  "cortex_a7": [
    "mp",
    "ret_addr_stack",
    "slow_fp_brcc",
    "slowfpvmlx",
    "trustzone",
    "v7a",
    "vfp4",
    "virtualization",
    "vmlx_forwarding",
    "vmlx_hazards"
  ],
  "cortex_a72": [
    "crc",
    "crypto",
    "hwdiv",
    "hwdiv_arm",
    "v8a"
  ],
  "cortex_a73": [
    "crc",
    "crypto",
    "hwdiv",
    "hwdiv_arm",
    "v8a"
  ],
  "cortex_a75": [
    "dotprod",
    "hwdiv",
    "hwdiv_arm",
    "v8_2a"
  ],
  "cortex_a76": [
    "a76",
    "crc",
    "crypto",
    "dotprod",
    "fullfp16",
    "hwdiv",
    "hwdiv_arm",
    "v8_2a"
  ],
  "cortex_a76ae": [
    "a76",
    "crc",
    "crypto",
    "dotprod",
    "fullfp16",
    "hwdiv",
    "hwdiv_arm",
    "v8_2a"
  ],
  "cortex_a8": [
    "nonpipelined_vfp",
    "ret_addr_stack",
    "slow_fp_brcc",
    "slowfpvmlx",
    "trustzone",
    "v7a",
    "vmlx_forwarding",
    "vmlx_hazards"
  ],
  "cortex_a9": [
    "avoid_partial_cpsr",
    "expand_fp_mlx",
    "fp16",
    "mp",
    "muxed_units",
    "neon_fpmovs",
    "prefer_vmovsr",
    "ret_addr_stack",
    "trustzone",
    "v7a",
    "vldn_align",
    "vmlx_forwarding",
    "vmlx_hazards"
  ],
  "cortex_m0": [
    "v6m"
  ],
  "cortex_m0plus": [
    "v6m"
  ],
  "cortex_m1": [
    "v6m"
  ],
  "cortex_m23": [
    "no_movt",
    "v8m"
  ],
  "cortex_m3": [
    "loop_align",
    "m3",
    "no_branch_predictor",
    "use_aa",
    "use_misched",
    "v7m"
  ],
  "cortex_m33": [
    "dsp",
    "fp_armv8d16sp",
    "loop_align",
    "no_branch_predictor",
    "slowfpvmlx",
    "use_aa",
    "use_misched",
    "v8m_main"
  ],
  "cortex_m35p": [
    "dsp",
    "fp_armv8d16sp",
    "loop_align",
    "no_branch_predictor",
    "slowfpvmlx",
    "use_aa",
    "use_misched",
    "v8m_main"
  ],
  "cortex_m4": [
    "loop_align",
    "no_branch_predictor",
    "slowfpvmlx",
    "use_aa",
    "use_misched",
    "v7em",
    "vfp4d16sp"
  ],
  "cortex_m7": [
    "fp_armv8d16",
    "v7em"
  ],
  "cortex_r4": [
    "avoid_partial_cpsr",
    "r4",
    "ret_addr_stack",
    "v7r"
  ],
  "cortex_r4f": [
    "avoid_partial_cpsr",
    "r4",
    "ret_addr_stack",
    "slow_fp_brcc",
    "slowfpvmlx",
    "v7r",
    "vfp3d16"
  ],
  "cortex_r5": [
    "avoid_partial_cpsr",
    "hwdiv_arm",
    "ret_addr_stack",
    "slow_fp_brcc",
    "slowfpvmlx",
    "v7r",
    "vfp3d16"
  ],
  "cortex_r52": [
    "fpao",
    "use_aa",
    "use_misched",
    "v8r"
  ],
  "cortex_r7": [
    "avoid_partial_cpsr",
    "fp16",
    "hwdiv_arm",
    "mp",
    "ret_addr_stack",
    "slow_fp_brcc",
    "slowfpvmlx",
    "v7r",
    "vfp3d16"
  ],
  "cortex_r8": [
    "avoid_partial_cpsr",
    "fp16",
    "hwdiv_arm",
    "mp",
    "ret_addr_stack",
    "slow_fp_brcc",
    "slowfpvmlx",
    "v7r",
    "vfp3d16"
  ],
  "cyclone": [
    "avoid_movs_shop",
    "avoid_partial_cpsr",
    "crypto",
    "disable_postra_scheduler",
    "hwdiv",
    "hwdiv_arm",
    "mp",
    "neonfp",
    "ret_addr_stack",
    "slowfpvmlx",
    "swift",
    "use_misched",
    "v8a",
    "vfp4",
    "zcz"
  ],
  "ep9312": [
    "v4t"
  ],
  "exynos_m1": [
    "exynos",
    "v8a"
  ],
  "exynos_m2": [
    "exynos",
    "v8a"
  ],
  "exynos_m3": [
    "exynos",
    "v8_2a"
  ],
  "exynos_m4": [
    "dotprod",
    "exynos",
    "fullfp16",
    "v8_2a"
  ],
  "exynos_m5": [
    "dotprod",
    "exynos",
    "fullfp16",
    "v8_2a"
  ],
  "generic": [],
  "iwmmxt": [
    "v5te"
  ],
  "krait": [
    "avoid_partial_cpsr",
    "fp16",
    "hwdiv",
    "hwdiv_arm",
    "muxed_units",
    "ret_addr_stack",
    "v7a",
    "vfp4",
    "vldn_align",
    "vmlx_forwarding"
  ],
  "kryo": [
    "crc",
    "crypto",
    "hwdiv",
    "hwdiv_arm",
    "v8a"
  ],
  "mpcore": [
    "slowfpvmlx",
    "v6k",
    "vfp2"
  ],
  "mpcorenovfp": [
    "v6k"
  ],
  "sc000": [
    "v6m"
  ],
  "sc300": [
    "m3",
    "no_branch_predictor",
    "use_aa",
    "use_misched",
    "v7m"
  ],
  "strongarm": [
    "v4"
  ],
  "strongarm110": [
    "v4"
  ],
  "strongarm1100": [
    "v4"
  ],
  "strongarm1110": [
    "v4"
  ],
  "swift": [
    "avoid_movs_shop",
    "avoid_partial_cpsr",
    "disable_postra_scheduler",
    "hwdiv",
    "hwdiv_arm",
    "mp",
    "neonfp",
    "prefer_ishst",
    "prof_unpr",
    "ret_addr_stack",
    "slow_load_D_subreg",
    "slow_odd_reg",
    "slow_vdup32",
    "slow_vgetlni32",
    "slowfpvmlx",
    "swift",
    "use_misched",
    "v7a",
    "vfp4",
    "vmlx_hazards",
    "wide_stride_vfp"
  ],
  "xscale": [
    "v5te"
  ]
}

It needs to be augmented with feature descriptions which are not exposed yet. But you can find the source here: https://github.com/ziglang/zig/blob/master/lib/std/target/arm.zig
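
As a rough sketch of how the same lookup could be scripted, the Python snippet below parses an excerpt of the JSON printed above and returns the feature list for one CPU. The excerpt is copied from this thread; the helper name cpu_features is hypothetical, and the JSON layout of zig targets may differ in other Zig versions.

```python
import json

# Excerpt copied from the `zig targets | jq .cpus.arm` output in this thread,
# wrapped in the top-level "cpus" key shown by `zig targets | jq keys`.
TARGETS_EXCERPT = """
{
  "cpus": {
    "arm": {
      "cortex_m0": ["v6m"],
      "cortex_m7": ["fp_armv8d16", "v7em"],
      "generic": []
    }
  }
}
"""


def cpu_features(targets: dict, arch: str, cpu: str) -> list:
    """Return the feature names listed for `cpu` under `arch` (hypothetical helper)."""
    return targets["cpus"][arch][cpu]


if __name__ == "__main__":
    targets = json.loads(TARGETS_EXCERPT)
    for cpu in ("cortex_m0", "cortex_m7", "generic"):
        print(cpu, cpu_features(targets, "arm", cpu))
```

With real output, the string passed to json.loads would be whatever zig targets printed rather than the hard-coded excerpt.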

> I believe the features are from LLVM so would the following page be a good reference for ARM?

Zig's CPU models & features are now independent from LLVM. They are a superset of LLVM's CPU models & features. However they are integrated with LLVM so that LLVM is informed correctly when doing code generation. Point being that we are taking steps towards a non-llvm backend being a possibility.

> If you look at the datasheet for the cortex-m7 however, the floating point unit is listed as optional. Here it is included by default.

Here is where zig specifies this:

zig/lib/std/target/arm.zig

Lines 1823 to 1830 in ad27041

pub const cortex_m7 = CpuModel{
    .name = "cortex_m7",
    .llvm_name = "cortex-m7",
    .features = featureSet(&[_]Feature{
        .v7em,
        .fp_armv8d16,
    }),
};

It seems you have found a bug in the CPU feature set, and a pull request would be most welcome! Especially with a cited source of the data sheet.

> So it would appear that you don't have to specify the FPU if you are using the cortex-m7? How could a user know that?

If the FPU is not determinable only by specifying the cpu model as cortex-m7, then the user should have to specify the FPU. The features attached to a given CPU model should be only what is guaranteed to be available in the hardware, and optional features are supposed to be supplied separately.

> Am I misunderstanding something? Why do you need to specify the fpu when using clang for example?

It's likely that clang has extra logic associated with this target on top of LLVM.

> Also, how would you select b/t the three ABIs for floating point? Is armv7em-freestanding-eabi for a cortex-m7 soft-float using FPU or software floating point emulation?

Currently how the ABI works is that the "abi" portion of the target triple selects "hard" or "soft":

  • ("eabihf", "musleabihf", "gnueabihf") => hard
  • ("eabi", "musleabi", "gnueabi") => soft

Next, the CPU feature set determines whether it's soft-float using FPU or software floating point emulation. If there is an FPU, it's the former, otherwise, the latter.

2 caveats:

  • This is how it works in theory. There may be bugs; please open issues if it does not work as expected.
  • This may not be a good way for it to work. It's just the first thing I thought of. You are more knowledgeable than me in this domain, and proposals are welcome and appreciated.
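
The selection rule described above can be sketched in Python as follows. This is only an illustration of the two-step decision (ABI name first, then FPU presence), not Zig's actual implementation; the function name, the has_fpu parameter, and the returned strings are assumptions made for the example.

```python
# ABI names taken from the two groups listed in this comment.
HARD_ABIS = {"eabihf", "musleabihf", "gnueabihf"}
SOFT_ABIS = {"eabi", "musleabi", "gnueabi"}


def float_mode(abi: str, has_fpu: bool) -> str:
    """Classify floating point handling from the ABI and FPU presence.

    `has_fpu` stands in for "the CPU feature set contains an FPU feature";
    how that is derived from the feature list is not modelled here.
    """
    if abi in HARD_ABIS:
        # The thread does not say what happens for a hard-float ABI
        # without an FPU, so that case is not treated specially.
        return "hard"
    if abi in SOFT_ABIS:
        if has_fpu:
            return "soft-float using FPU"
        return "software floating point emulation"
    raise ValueError(f"ABI not covered by this sketch: {abi}")


if __name__ == "__main__":
    print(float_mode("eabihf", True))
    print(float_mode("eabi", True))
    print(float_mode("eabi", False))
```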

andrewrk removed the bug and frontend labels Mar 5, 2020
@andrewrk
Member

andrewrk commented Mar 5, 2020

The suggestion #4631 (comment) is correct. When using freestanding with ARM or Thumb, you'll want to specify either a specific CPU model, or at least a subarchitecture.

Using the parse function:

exe.setTarget(std.zig.CrossTarget.parse(.{
    .arch_os_abi = "thumb-freestanding-none",
    .mcpu = "cortex_m0",
}) catch unreachable);

If you wanted to use a generic CPU rather than cortex_m0:

exe.setTarget(std.zig.CrossTarget.parse(.{
    .arch_os_abi = "thumb-freestanding-none",
    .mcpu = "generic+v6m",
}) catch unreachable);
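
For readers wondering how a string such as "generic+v6m" breaks down, here is a small Python sketch that splits it into a CPU model name and the features added to it. It handles only the + form shown in this thread and is not Zig's parser; the function name split_mcpu is hypothetical.

```python
def split_mcpu(mcpu: str):
    """Split an mcpu string into (cpu_model, added_features).

    Only the "model+feature+feature" form seen in this thread is handled.
    """
    model, *features = mcpu.split("+")
    if not model or any(not f for f in features):
        raise ValueError(f"malformed mcpu string: {mcpu!r}")
    return model, features


if __name__ == "__main__":
    print(split_mcpu("cortex_m0"))
    print(split_mcpu("generic+v6m"))
```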

Please let me know if this does not solve the problem.

@andrewrk andrewrk closed this as completed Mar 5, 2020
@markfirmware
Contributor Author

Thanks - working well now.
