morello: Add initial support for a CPU and a QEMU platform #1157

heshamelmatary · 2024-01-08T15:17:12Z

Support for building and running seL4 on the new Arm's Morello which has CHERI extension. However, this PR only supports AArch64 without CHERI yet. This servers as a baseline architecture for future CHERI support and could be used as another normal seL4 AArch64 platform.

See this issue for related progress/PRs adding baseline Morello support.

libsel4/sel4_plat_include/morello-qemu/sel4/plat/api/constants.h

axel-h · 2024-01-08T15:50:51Z

See also seL4/seL4_tools#182

Indanz · 2024-01-08T15:53:37Z

This relates to https://sel4.atlassian.net/browse/RFC-15. I converted this to draft as it doesn't seem ready yet.

src/plat/morello-qemu/config.cmake

tools/dts/morello-qemu-gicv2.dts

Indanz · 2024-01-08T16:05:27Z

I think treating Morello as a CPU type is wrong and will lead to problems in the future when there are multiple hardware Morello implementations. So I think there should be one config option for Morello on a higher level, which is independent from ARM, and additional config type per CPU Morello implementations. But without users of the config options it's hard to see what the best approach is and what makes sense to add and what not.

heshamelmatary · 2024-01-08T16:26:25Z

This relates to https://sel4.atlassian.net/browse/RFC-15. I converted this to draft as it doesn't seem ready yet.

RFC-15 is mainly about CHERI support, this is just another AArch64 platform without any CHERI stuff which is the baseline. AFAIR, @lsf37 said it can go ahead without an RFC

Indanz · 2024-01-08T16:40:53Z

This relates to https://sel4.atlassian.net/browse/RFC-15. I converted this to draft as it doesn't seem ready yet.

RFC-15 is mainly about CHERI support, this is just another AArch64 platform without any CHERI stuff which is the baseline. AFAIR, @lsf37 said it can go ahead without an RFC

Then don't add 'KernelArmMorello' and only add morello-qemu and tell us how to run it with qemu. And perhaps call it morello-arm-qemu or something, in case other qemu adds morello support for risc-v in the future. Does it need CHERI-QEMU or does it uses standard qemu?

Please tell us what the goal and purpose is in the description, that would make reviewing easier and less confusing. Please remove everything that isn't needed yet.

Indanz · 2024-01-08T16:42:05Z

(And feel free to remove it from draft status.)

heshamelmatary · 2024-01-08T16:49:05Z

I think treating Morello as a CPU type is wrong and will lead to problems in the future when there are multiple hardware Morello implementations. So I think there should be one config option for Morello on a higher level, which is independent from ARM, and additional config type per CPU Morello implementations. But without users of the config options it's hard to see what the best approach is and what makes sense to add and what not.

In the Arm world, "Morello" is being interchangeably (and confusingly) used to represent an architecture[extension], an SoC, a CPU, and a board. It's arguably just a CHERI implementation on armv8a, and I don't expect there's going to be multiple different "Morello" implementations with the same name. It's also used as a -cpu flag in QEMU, and in the QEMU's CPU nodes there's compatible = "arm,morello"; This same CPU name works for QEMU, FVP, and the Morello board. I intentionally named it as a new seL4 CPU so that it's easy to spot this is a Morello-based CPU

heshamelmatary · 2024-01-08T17:04:05Z

This relates to https://sel4.atlassian.net/browse/RFC-15. I converted this to draft as it doesn't seem ready yet.

RFC-15 is mainly about CHERI support, this is just another AArch64 platform without any CHERI stuff which is the baseline. AFAIR, @lsf37 said it can go ahead without an RFC

Then don't add 'KernelArmMorello' and only add morello-qemu and tell us how to run it with qemu. And perhaps call it morello-arm-qemu or something, in case other qemu adds morello support for risc-v in the future. Does it need CHERI-QEMU or does it uses standard qemu?

It's an KernelArmMorello but without CHERI support. I added instructions and support to run it in QEMU in seL4/seL4_tools#182. In this commit message, 1a0cfea I state that it requires CHERI-QEMU. There's no (and won't be) other QEMU adding Morello support; the naming convention is just following other Linux/EDK2/Morello support such as morello-fvp and morello-soc which I'll submit soon.

Please tell us what the goal and purpose is in the description, that would make reviewing easier and less confusing. Please remove everything that isn't needed yet.

Sure, happy to add to the PR description. I just relied on the commit messages for the description. 3e576c0 states the goal and purpose

axel-h · 2024-01-08T19:14:40Z

src/arch/arm/config.cmake

@@ -32,6 +32,9 @@ elseif(KernelArmCortexA72)
    # (https://developer.arm.com/documentation/100095/0001/memory-management-unit/about-the-mmu)
    set(KernelArmPASizeBits44 ON)
    math(EXPR KernelPaddrUserTop "(1 << 44)")
+elseif(KernelArmMorello)
+    set(KernelArmPASizeBits48 ON)


Seems KernelArmPASizeBits48 is a new CMake variable, but it's not used anywhere at the moment. Can't we leave this at 44 bit for now to keep things simple and add the rest later if this is needed?

If we punt on the address space size, we can as well pretend it's a KernelArmCortexA72 and make it even simpler.

I'd rather make naming/numbers correct from the start if they're not causing issues with verification, functional correctness, or affecting other platforms. Simplicity is good, but in this case it would be misleading. Further, it's just a few lines. This will also save us renaming later when/if we add more Morello features (e.g., CHERI, hypervisor support, etc).

Regarding the size, it's used to calculate the USER_TOP. Even though KernelArmPASizeBits48 is currently unused, the fact is Morello has 48 PA and it's not going to be misleading for others who are reading this to think that Morello only supports 44 PA. So, if we don't want to define KernelArmPASizeBits48 now, I'd just delete it but not make it 44.
Related to that, I don't remember why the full 48 PA size isn't allowed in non hypervisor config? I could only have 47 due to this assert? AFAIK, canonical PPTRs are virtual, but this is a 48-bit physical address

Related to that, I don't remember why the full 48 PA size isn't allowed in non hypervisor config? I could only have 47 due to this assert? AFAIK, canonical PPTRs are virtual, but this is a 48-bit physical address

Device untyped caps are created for the full physical address space. Device untyped caps are created using generated bitfield accessor functions. These asserts were needed to avoid asserts in the bitfield generator accessor functions during boot. You could try removing that assert and see if there's bitfield asserts during boot (in debug build).

Yeah, when I delete this assert, there's another one from the bitfield generator that fails after the root task starts. Is that something expected or is it an issue to do with the bitfield generator? This cuts the device untype range to half to the potential physical address space on Morello

capPtr on device untypeds is absolutely used, so #1163 definitely is not possible. They are not dereferenced for device memory, but they are used esp for comparison with other kernel virtual pointers and need to satisfy the same basic conditions as other kernel virtual pointers. You will actually get wrong results if they don't.

The only way I can see to relax that would be to find all places where that happens and to make special cases, comparing physical instead of virtual addresses in those case. I think that might work, but there are a lot of assumptions in how the CDT is laid out and how pointers relate to each other, so I might be wrong. In any case, that would need verification funding to do.

One of the assumptions that the 64-bit kernel design currently makes is that the kernel window is large enough to hold all of addressable physical memory. With 48 bits of addressable virtual memory, 47-bit PA limit is probably the larges as there needs to be parts of the kernel virtual address space that isn't used for the direct mapping window. Getting larger than 47-bit would probably require a different approach for the kernel window mapping.

In hyp mode, the kernel uses a separate page table for the kernel mappings. @kent-mcleod I think you mentioned a while ago that armv8 would support that in non-hyp mode as well. That might solve at least some of the problem, and since we're only verifying hyp mode at the moment, it would not have any verification impact.

There is still the problem that addresses might overflow 48 bits in the translation between kernel virtual and physical, but that could probably at least affect a smaller region of memory, not the full top half of the physical address space.

@lsf37 Thanks for the clarification. The Morello port could go ahead with just 47-bit PA as per this PR then, and we could create an issue for future 48-bit PA support in seL4 in general

Indanz · 2024-01-09T12:23:26Z

In the Arm world, "Morello" is being interchangeably (and confusingly) used to represent an architecture[extension], an SoC, a CPU, and a board.

That is highly confusing and I want to avoid spreading that confusion further.

It's arguably just a CHERI implementation on armv8a, and I don't expect there's going to be multiple different "Morello" implementations with the same name.
It's also used as a -cpu flag in QEMU, and in the QEMU's CPU nodes there's compatible = "arm,morello"; This same CPU name works for QEMU, FVP, and the Morello board. I intentionally named it as a new seL4 CPU so that it's easy to spot this is a Morello-based CPU

Looking into it:

CHERI stands for Capability Hardware Enhanced RISC Instructions and is defined for multiple architectures.
Morello is the name of the research program and also the name of ARM's experimental CHERI-extended, multicore, superscalar ARMv8-A processor, SoC and prototype board.

But now ARM is also calling the architectural extension "Morello" instead of CHERI, and that's where it becomes confusing.

So adding morello-qemu as a new platform makes perfect sense.

What still doesn't make sense is adding KernelArmMorello though, as that should be instead KernelArchArmV8_2 or KernelArmCortex?? + CHERI. However, it seems the hole ARM dug is deep enough that it's impossible to get out of, so to be consistent with others and the DTS files we'll need to add KernelArmMorello and who knows what the next CPU will be called.

It's an KernelArmMorello but without CHERI support. I added instructions and support to run it in QEMU in seL4/seL4_tools#182. In this commit message, 1a0cfea I state that it requires CHERI-QEMU.

But without CHERI support it's just a normal aarch64 CPU with nothing special. I think you can just run it in a normal qemu too with slight modifications of the dts file. That's the whole point of CHERI, to be an extension to an existing architecture instead of being new and incompatible.

Sure, happy to add to the PR description. I just relied on the commit messages for the description. 3e576c0 states the goal and purpose

That commit message doesn't make the goal of adding support for a specific Morello Qemu implementation clear to me. I think this PR would be clearer if you would have also added the hardware Morello board as a platform, because that's apparently what Qemu tries to emulate.

heshamelmatary · 2024-01-09T15:07:27Z

In the Arm world, "Morello" is being interchangeably (and confusingly) used to represent an architecture[extension], an SoC, a CPU, and a board.

That is highly confusing and I want to avoid spreading that confusion further.

Yes, I agree, but we're stuck with it unfortunately. Just to be more clear, "Morello" is only Arm(v8a) specific, and there's only going to be one "Morello".

It's arguably just a CHERI implementation on armv8a, and I don't expect there's going to be multiple different "Morello" implementations with the same name.
It's also used as a -cpu flag in QEMU, and in the QEMU's CPU nodes there's compatible = "arm,morello"; This same CPU name works for QEMU, FVP, and the Morello board. I intentionally named it as a new seL4 CPU so that it's easy to spot this is a Morello-based CPU

Looking into it:

CHERI stands for Capability Hardware Enhanced RISC Instructions and is defined for multiple architectures.

Morello is the name of the research program and also the name of ARM's experimental CHERI-extended, multicore, superscalar ARMv8-A processor, SoC and prototype board.

But now ARM is also calling the architectural extension "Morello" instead of CHERI, and that's where it becomes confusing.

True. But still, Morello is a CHERI implementation on armv8. So it's not not CHERI. They just happen to name the specification, implementation, CPU, board, etc the same name. In RISC-V, the specification is commonly known as CHERI-RISC-V, and there are multiple implementations like in QEMU, Sail, Piccolo, Flute, Toooba, GFE, (Codasip 700?) etc

So adding morello-qemu as a new platform makes perfect sense.

What still doesn't make sense is adding KernelArmMorello though, as that should be instead KernelArchArmV8_2 or KernelArmCortex?? + CHERI. However, it seems the hole ARM dug is deep enough that it's impossible to get out of, so to be consistent with others and the DTS files we'll need to add KernelArmMorello and who knows what the next CPU will be called.

Technically speaking, the CPU name is called Rainier to make it even more confusing; it's not KernelArmCortex. But the rainier name is only mentioned in the board's manual and no where else. Rainier is based on Neoverse N1 Core which is based on Cortex-A76 core, but they're not the same. I just renamed Rainier to Morello in this case to be consistent and clear, and since Rainier isn't going to be used beyond Morello. KernelArmMorello is thus, in this PR, the CPU name. There's not going to multiple different Morello CPUs, so it should be fine and clear to keep that name/macro.

It's an KernelArmMorello but without CHERI support. I added instructions and support to run it in QEMU in seL4/seL4_tools#182. In this commit message, 1a0cfea I state that it requires CHERI-QEMU.

But without CHERI support it's just a normal aarch64 CPU with nothing special. I think you can just run it in a normal qemu too with slight modifications of the dts file. That's the whole point of CHERI, to be an extension to an existing architecture instead of being new and incompatible.

Yes, true. You can consider it another high-performance, bigger, feature-rich CPU in this case; according to my numbers (see this issue), it still beats the performance for A57/Tx1/Jetson. But this PR is also making it easier if/when we support CHERI on Morello.

Sure, happy to add to the PR description. I just relied on the commit messages for the description. 3e576c0 states the goal and purpose

That commit message doesn't make the goal of adding support for a specific Morello Qemu implementation clear to me. I think this PR would be clearer if you would have also added the hardware Morello board as a platform, because that's apparently what Qemu tries to emulate.

Sure, I already have local PRs for hardware and FVP. #1158 now has an overall tracker of what I am going to submit, which I'll also refine. I thought I'd start with QEMU as a platform because anyone can build, test, and run it, and not everyone has a Morello board, then submit PRs for FVP (which is tricky/slow to run seL4 on), then the Morello board.

This is an initial commit to support building and running seL4 on the new Arm's Morello CPU [1] which has CHERI extension [2]. However, this commit only supports AArch64 without CHERI yet. This servers as a baseline architecture for future CHERI support. [1] https://www.morello-project.org [2] https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/ Signed-off-by: Hesham Almatary <hesham.almatary@cl.cam.ac.uk>

Currently only supports running vanilla AArch64 code built with LLVM/lld on QEMU's morello from the CHERI group [1] [1] https://github.com/CTSRD-CHERI/qemu Signed-off-by: Hesham Almatary <hesham.almatary@cl.cam.ac.uk>

Indanz · 2024-01-25T12:42:24Z

Any particular reason why using GICv2 instead of GICv3?

heshamelmatary · 2024-01-26T10:04:56Z

Any particular reason why using GICv2 instead of GICv3?

Two reasons, the first is that I don't think GICv3 is completely properly emulated in QEMU; seL4 hangs on boot with that error GICv3: GICR_WAKER returned non-zero 6. The GICv3 driver works fine on FVP and the board though, so I didn't want to have custom workarounds just for QEMU. The second reason is to be able to CHERIfy GICv2 in the future if needed in addition to GICv3.

Indanz · 2024-01-26T10:34:12Z

Two reasons, the first is that I don't think GICv3 is completely properly emulated in QEMU;

I don't see any open bugs related to GICv3 in QEMU. Main thing seems to be that you may need to run it without secure=on.

seL4 hangs on boot with that error GICv3: GICR_WAKER returned non-zero 6.

The code comment above says:

GICR_WAKER should be Read-all-zeros in Non-secure world
and we expect redistributors to be alread awoken by an earlier loader.
However if we get a value back then something is probably wrong.

So it seems worth looking into, it may be overly strict. If it works without the halt() we can leave it in as a warning, but it doesn't look like an actual hard error, more like defensive programming.

The GICv3 driver works fine on FVP and the board though, so I didn't want to have custom workarounds just for QEMU. The second reason is to be able to CHERIfy GICv2 in the future if needed in addition to GICv3.

GICv3 is architecturally defined in Aarch64, so it should always be available, unlike GICv2. I just don't see the need for GICv2 support for new ARM hardware in general.

heshamelmatary · 2024-01-26T11:16:11Z

Two reasons, the first is that I don't think GICv3 is completely properly emulated in QEMU;

I don't see any open bugs related to GICv3 in QEMU. Main thing seems to be that you may need to run it without secure=on.

The command I use doesn't have secure=on. I tried to be more explicit and passed secure=off and this didn't help unfortunately.

seL4 hangs on boot with that error GICv3: GICR_WAKER returned non-zero 6.

The code comment above says:

GICR_WAKER should be Read-all-zeros in Non-secure world
and we expect redistributors to be alread awoken by an earlier loader.
However if we get a value back then something is probably wrong.

So it seems worth looking into, it may be overly strict. If it works without the halt() we can leave it in as a warning, but it doesn't look like an actual hard error, more like defensive programming.

It proceeds without halt() but sel4test hangs on FPU0001. I didn't really want to spend more time debugging and getting GICv3 working with QEMU if it's already working fine on FVP and hardware.

The GICv3 driver works fine on FVP and the board though, so I didn't want to have custom workarounds just for QEMU. The second reason is to be able to CHERIfy GICv2 in the future if needed in addition to GICv3.

GICv3 is architecturally defined in Aarch64, so it should always be available, unlike GICv2. I just don't see the need for GICv2 support for new ARM hardware in general.

Indanz · 2024-01-26T11:31:39Z

So it seems worth looking into, it may be overly strict. If it works without the halt() we can leave it in as a warning, but it doesn't look like an actual hard error, more like defensive programming.

It proceeds without halt() but sel4test hangs on FPU0001. I didn't really want to spend more time debugging and getting GICv3 working with QEMU if it's already working fine on FVP and hardware.

That's fine, but it would be nice if you would at least report problems like these so they're documented somewhere.

It seems unlikely that the FPU problem is related to the GICR_WAKER warning.

The only report I know about is: https://sel4.discourse.group/t/sel4test-get-stuck-in-fpu0001-test-on-qemu-arm-virt-with-givc3-enable/772

But like I said there, Odroid had the same hang with GICv2, so I thought it wasn't GICv3 related, but a timing issue (handleFPUFault() runs without the kernel lock held). That said, Odroid was build with clang, which may have other issues. Do you get the hang with gcc?

heshamelmatary · 2024-01-26T11:48:04Z

So it seems worth looking into, it may be overly strict. If it works without the halt() we can leave it in as a warning, but it doesn't look like an actual hard error, more like defensive programming.

It proceeds without halt() but sel4test hangs on FPU0001. I didn't really want to spend more time debugging and getting GICv3 working with QEMU if it's already working fine on FVP and hardware.

That's fine, but it would be nice if you would at least report problems like these so they're documented somewhere.

Yes, sure. I also noticed that the existing qemu-virt-arm is using GICv2, so it seems like GICv3 wasn't tested with QEMU and seL4 before?

It seems unlikely that the FPU problem is related to the GICR_WAKER warning.

The only report I know about is: https://sel4.discourse.group/t/sel4test-get-stuck-in-fpu0001-test-on-qemu-arm-virt-with-givc3-enable/772

But like I said there, Odroid had the same hang with GICv2, so I thought it wasn't GICv3 related, but a timing issue (handleFPUFault() runs without the kernel lock held). That said, Odroid was build with clang, which may have other issues. Do you get the hang with gcc?

I don't think it's FPU related either, this test requires scheduling, so I suspect we're not getting timer interrupts at all. It hangs with GCC and Clang. A quick GDB-debugging confirms that; handleInterrupt() isn't triggered at all.

Indanz · 2024-01-26T12:01:33Z

I don't think it's FPU related either, this test requires scheduling, so I suspect we're not getting timer interrupts at all. It hangs with GCC and Clang. A quick GDB-debugging confirms that; handleInterrupt() isn't triggered at all.

Thanks. Great, so it's just the first test that relies on the scheduling timer to work. Tests are sorted by name, and FPU is before SCHED.