Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Large kernel images and FW_JUMP_FDT_ADDR #169

Closed
arichardson opened this issue Jun 4, 2020 · 16 comments
Closed

Large kernel images and FW_JUMP_FDT_ADDR #169

arichardson opened this issue Jun 4, 2020 · 16 comments

Comments

@arichardson
Copy link
Contributor

I am trying to determine how to handle loading of an S-mode OS kernel that is > 32MB (it has a root filesystem embedded in the kernel binary) in QEMU (with -kernel) using the generic fw_jump.

The problem I see is that the default FDT address is FW_TEXT_START + 34 MB and the kernel starts at FW_TEXT_START+2MB.
If the kernel is larger than 32 MB the FDT will overwrite the mapped kernel image.
Since the FDT is quite small this is unlikely to write over an important file, but could have fatal consequences if we're unlucky.

As a workaround we can build OpenSBI with FW_JUMP_FDT_ADDR to some larger value (e.g. start + 256MB). This will work for us since we don't support < 256MB RAM, but does not seem ideal.
I'm not sure we can move the FDT before the kernel start since some kernels may assume 2MB alignment?

@avpatel
Copy link
Collaborator

avpatel commented Jun 4, 2020

We have few other options as well:

  1. Place FDT between kernel and OpenSBI at 1MB offset from FW_TEXT_START
  2. Place FDT at 64MB or 128MB offset from FW_TEXT_START

The problem with any X offset for FDT that we are restricting minimum RAM size by X.

Regards,
Anup

arichardson added a commit to arichardson/cheribsd that referenced this issue Jun 4, 2020
By default OpenSBI and BBL will pass the DTB at a 2MB-aligned address.
However, by default there are no 2MB aligned regions between the SBI and
the kernel, so we have to choose a 2MB aligned region after the kernel.
OpenSBI defaults to placing the DTB 32MB after the start of the kernel but
this is not sufficient for a kernel with a large MFS embedded.
We could increase this offset to a larger number (e.g. 64/128/256) but that
imposes restrictions on the minimum RAM size.
Another solution would be to place the DTB between OpenSBI and the kernel
at 1MB alignment, but current locore.S code assumes 2MB alignment.

With this change I can now boot on QEMU with an OpenSBI configured to
store the DTB at an offset of 1MB.

See also riscv-software-src/opensbi#169
arichardson added a commit to CTSRD-CHERI/opensbi that referenced this issue Jun 4, 2020
This removes the minimum RAM size and maximum kernel size restriction.
See also riscv-software-src#169
@arichardson
Copy link
Contributor Author

Hi Anup,

It turns out our kernel (CheriBSD, a fork of FreeBSD) assumed 2MB alignment of the FDT, but after fixing that, we can now boot with the FDT at FW_TEXT_START+1MB.
I much prefer this solution, but it's possible other kernels will also not handle it.

arichardson added a commit to arichardson/cheribsd that referenced this issue Jun 4, 2020
By default OpenSBI and BBL will pass the DTB at a 2MB-aligned address.
However, by default there are no 2MB aligned regions between the SBI and
the kernel, so we have to choose a 2MB aligned region after the kernel.
OpenSBI defaults to placing the DTB 32MB after the start of the kernel but
this is not sufficient for a kernel with a large MFS embedded.
We could increase this offset to a larger number (e.g. 64/128/256) but that
imposes restrictions on the minimum RAM size.
Another solution would be to place the DTB between OpenSBI and the kernel
at 1MB alignment, but current locore.S code assumes 2MB alignment.

With this change I can now boot on QEMU with an OpenSBI configured to
store the DTB at an offset of 1MB.

See also riscv-software-src/opensbi#169

Co-authored-by: Jessica Clarke <jrtc27@jrtc27.com>
@avpatel
Copy link
Collaborator

avpatel commented Jun 5, 2020

Yes, I agree. FW_TEXT_START+1MB is good but it will not be acceptable to everyone. In fact, we cannot have FDT bigger than 1M at FW_TEXT_START+1MB because it might corrupt kernel at FW_TEXT_START+2MB

I think FW_TEXT_START+64MB is reasonable location for FDT placement. This will be fine for Linux kernel as well. Will this work for you?

I would like to highlight that this FDT placement issue is only for FW_JUMP and FW_PAYLOAD. The FW_DYNAMIC does not have this issue because prior booting puts FDT at desired location and FW_DYNAMIC will not change FDT location. Most mainstream bootloader (U-Boot, Coreboot, etc) use FW_DYNAMIC for better control and flexibility.

Regards,
Anup

@arichardson
Copy link
Contributor Author

I just checked and my current kernel image is about 90MB so using +64 doesn't really change anything.

We need a modified OpenSBI or BBL anyway to support our newly added instructions, so building with a non-default 1MB offset should fix the problem and the DTB is quite small.

However, OpenSBI is bundled with QEMU (I believe using the +32MB offset) as default BIOS.
If I want to run a FreeBSD kernel with an embedded filesystem on QEMU, I will have to build my own BIOS too.
The QEMU dtb is < 0x1000, so using a 1MB offset should also work there. Or possibly QEMU should be using FW_DYNAMIC by default?

@avpatel
Copy link
Collaborator

avpatel commented Jun 5, 2020

I think it is possible to fix this in a clean way.

Does the BSD flat image have header (like Linux flat image) from which we can determine size of image??

If yes then we can fix fw_jump.bin to get image size from flat image header and try to place FDT after flat image. If fw_jump.bin cannot understand the image header then it will place FDT at FW_JUMP_FDT_ADDR.

For fw_payload.bin, we can completely remove FW_PAYLOAD_FDT_ADDR.

Suggestions??

Regards,
Anup

@arichardson
Copy link
Contributor Author

Hi Anup,

By header do you mean this: https://www.kernel.org/doc/html/latest/riscv/boot-image-header.html ?
The FreeBSD kernel does not have such a header, but that could be added easily.

@jrtc27
Copy link
Contributor

jrtc27 commented Jun 5, 2020

FWIW, the largest DTS file (ie device tree source, not the encoded output which should be smaller) in the FreeBSD tree is tegra30-beaver.dts that sits at 60K. A 1 MiB DTB would be a monster, so putting the DTB 1 MiB before the payload seems like a safer default than an address at some arbitrary offset from the pre-loaded payload. Since OpenSBI is doing the copy itself, it could even be safe and stop if it hits the end of the allocated region, printing an error message and halting, unlike the current situation where it has no way of knowing whether it did anything problematic.

@atishp04
Copy link
Collaborator

atishp04 commented Jun 6, 2020

Here are few issues with the approaches discussed above.

  1. Keeping the FDT at 1MB offset below kernel
    - This will work for now but may create issues in future because we may want to relax 2MB restriction for kernel in future. It also restricts how much memory OpenSBI consumes.

  2. Parse flat image header to detect next payload size and put FDT after at a certain offset from that
    - This seems to be a better option but every supervisor OS has to adapt the image header format. Adding the header may not be complicated but just doing it for fw_jump may be an overkill.

Currently, the issue is with only fw_jump which is mostly used in Qemu. I believe most of the ASIC platform will end up using fw_dynamic while the FPGA platforms may prefer fw_payload to keep things simple. That leaves Qemu is the sole user of fw_jump.

We can modify the Qemu loader to use fw_dynamic instead of fw_jump. Qemu loader can calculate the kernel image size and place the fdt/kernel accordingly. I think that will solve the dt placement issue once and for all.

Any suggestions ?

@arichardson
Copy link
Contributor Author

Here are few issues with the approaches discussed above.

  1. Keeping the FDT at 1MB offset below kernel
    • This will work for now but may create issues in future because we may want to relax 2MB restriction for kernel in future. It also restricts how much memory OpenSBI consumes.

I'm not sure how easy it will be to change the 2MB offset for kernels in the future since I'm sure many assume that the start address is 0x8... + 2MB.

  1. Parse flat image header to detect next payload size and put FDT after at a certain offset from that
    • This seems to be a better option but every supervisor OS has to adapt the image header format. Adding the header may not be complicated but just doing it for fw_jump may be an overkill.

Currently, the issue is with only fw_jump which is mostly used in Qemu. I believe most of the ASIC platform will end up using fw_dynamic while the FPGA platforms may prefer fw_payload to keep things simple. That leaves Qemu is the sole user of fw_jump.

We can modify the Qemu loader to use fw_dynamic instead of fw_jump. Qemu loader can calculate the kernel image size and place the fdt/kernel accordingly. I think that will solve the dt placement issue once and for all.

Any suggestions ?

Modifying QEMU to use fw_dynamic would definitely solve the issue for me.
However, we do use fw_jump for our FPGAs (well we currently use a fw_jump BBL) as this means we don't have to rebuild the SBI every time the kernel changes.

For now I think +1MB default offset for fw_jump DTB seems like a good solution.
However, I don't know the details of our FPGA booting procedure.

@avpatel
Copy link
Collaborator

avpatel commented Jun 6, 2020

The biggest problem with FW_TEXT_START+1MB for FDT is that it cause space crunch for OpenSBI because OpenSBI creates this per-HART scratch space which is currently 8KB per-HART. The total OpenSBI scratch space increases with growing number of HARTs or if we increase per-HART scratch space size ( which is likely in future).

I think FW_TEXT_START+1MB for FDT is a temporary solution.

Atish will certainly update QEMU to allow using fw_dynamic but we will have to figure out better fix for fw_jump.

I think having image header (similar to Linux) in BSD flat image can help fw_jump determine better location of FDT
for both Linux and BSD.

In fact, if BSD flat image uses image header same as Linux then:

  1. BSD folks can use booti command of U-Boot without any changes
  2. We can put common flat image header format as recommended header format in Unix platform spec for all S-mode flat kernel images. The flat image header used by Linux is not at all Linux spec it just that we have documented it as part of Linux documentation

Regards,
Anup

@avpatel
Copy link
Collaborator

avpatel commented Jun 6, 2020

I have one more proposal.

How about:

  1. Restricting max FDT size to 512KB
  2. Place FDT at FW_TEXT_START+1.5MB

This is only for fw_jump and fw_payload. We don't have to worry about large number of HARTs because fw_jump and fw_payload are usually used on FPGA or simulation.

The space crunch will still be there for OpenSBI so I am not fan of this proposal so eventually we have to consider image header approach.

Regards,
Anup

uqs pushed a commit to freebsd/freebsd-src that referenced this issue Jun 8, 2020
By default OpenSBI and BBL will pass the DTB at a 2MB-aligned address.
However, by default there are no 2MB aligned regions between the SBI and
the kernel, so we have to choose a 2MB aligned region after the kernel.
OpenSBI defaults to placing the DTB 32MB after the start of the kernel but
this is not sufficient for a kernel with a large MFS embedded.
We could increase this offset to a larger number (e.g. 64/128/256) but that
imposes restrictions on the minimum RAM size.
Another solution would be to place the DTB between OpenSBI and the kernel
at 1MB alignment, but current locore.S code assumes 2MB alignment.

With this change I can now boot on QEMU with an OpenSBI configured to
store the DTB at an offset of 1MB.

See also riscv-software-src/opensbi#169

Reviewed By:	mhorne
Differential Revision: https://reviews.freebsd.org/D25151


git-svn-id: svn+ssh://svn.freebsd.org/base/head@361903 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
uqs pushed a commit to freebsd/freebsd-src that referenced this issue Jun 8, 2020
By default OpenSBI and BBL will pass the DTB at a 2MB-aligned address.
However, by default there are no 2MB aligned regions between the SBI and
the kernel, so we have to choose a 2MB aligned region after the kernel.
OpenSBI defaults to placing the DTB 32MB after the start of the kernel but
this is not sufficient for a kernel with a large MFS embedded.
We could increase this offset to a larger number (e.g. 64/128/256) but that
imposes restrictions on the minimum RAM size.
Another solution would be to place the DTB between OpenSBI and the kernel
at 1MB alignment, but current locore.S code assumes 2MB alignment.

With this change I can now boot on QEMU with an OpenSBI configured to
store the DTB at an offset of 1MB.

See also riscv-software-src/opensbi#169

Reviewed By:	mhorne
Differential Revision: https://reviews.freebsd.org/D25151
@atishp04
Copy link
Collaborator

atishp04 commented Jun 9, 2020

I'm not sure how easy it will be to change the 2MB offset for kernels in the future since I'm sure many assume that the start address is 0x8... + 2MB.

2MB alignment can be relaxed by using a 4KB alignment. Thus, the existing platform will continue to work as 2MB is already 4KB alignment.

I got the fw_dynamic working in Qemu. I will send the patch.

@atishp04
Copy link
Collaborator

atishp04 commented Jun 9, 2020

Here are few issues with the approaches discussed above.

  1. Keeping the FDT at 1MB offset below kernel

    • This will work for now but may create issues in future because we may want to relax 2MB restriction for kernel in future. It also restricts how much memory OpenSBI consumes.

I'm not sure how easy it will be to change the 2MB offset for kernels in the future since I'm sure many assume that the start address is 0x8... + 2MB.

  1. Parse flat image header to detect next payload size and put FDT after at a certain offset from that

    • This seems to be a better option but every supervisor OS has to adapt the image header format. Adding the header may not be complicated but just doing it for fw_jump may be an overkill.

Currently, the issue is with only fw_jump which is mostly used in Qemu. I believe most of the ASIC platform will end up using fw_dynamic while the FPGA platforms may prefer fw_payload to keep things simple. That leaves Qemu is the sole user of fw_jump.
We can modify the Qemu loader to use fw_dynamic instead of fw_jump. Qemu loader can calculate the kernel image size and place the fdt/kernel accordingly. I think that will solve the dt placement issue once and for all.
Any suggestions ?

Modifying QEMU to use fw_dynamic would definitely solve the issue for me.
However, we do use fw_jump for our FPGAs (well we currently use a fw_jump BBL) as this means we don't have to rebuild the SBI every time the kernel changes.

It is not too hard to adopt to fw_dynamic. As fw_dynamic is more flexible than fw_jump, I would suggest you to migrate to fw_dynamic when your platfrom starts using OpenSBI. This would allow you to use the same OpenSBI image to be used in both Qemu & FPGA as well.
To the best of my knowledge, your fpga platform will be the first one to do so :). Every other FPGA platform (with OpenSBI support) has been using fw_payload only.

For now I think +1MB default offset for fw_jump DTB seems like a good solution.
However, I don't know the details of our FPGA booting procedure.

mat813 pushed a commit to mat813/freebsd that referenced this issue Jun 9, 2020
By default OpenSBI and BBL will pass the DTB at a 2MB-aligned address.
However, by default there are no 2MB aligned regions between the SBI and
the kernel, so we have to choose a 2MB aligned region after the kernel.
OpenSBI defaults to placing the DTB 32MB after the start of the kernel but
this is not sufficient for a kernel with a large MFS embedded.
We could increase this offset to a larger number (e.g. 64/128/256) but that
imposes restrictions on the minimum RAM size.
Another solution would be to place the DTB between OpenSBI and the kernel
at 1MB alignment, but current locore.S code assumes 2MB alignment.

With this change I can now boot on QEMU with an OpenSBI configured to
store the DTB at an offset of 1MB.

See also riscv-software-src/opensbi#169

Reviewed By:	mhorne
Differential Revision: https://reviews.freebsd.org/D25151


git-svn-id: https://svn.freebsd.org/base/head@361903 ccf9f872-aa2e-dd11-9fc8-001c23d0bc1f
brooksdavis pushed a commit to CTSRD-CHERI/cheribsd that referenced this issue Jun 12, 2020
By default OpenSBI and BBL will pass the DTB at a 2MB-aligned address.
However, by default there are no 2MB aligned regions between the SBI and
the kernel, so we have to choose a 2MB aligned region after the kernel.
OpenSBI defaults to placing the DTB 32MB after the start of the kernel but
this is not sufficient for a kernel with a large MFS embedded.
We could increase this offset to a larger number (e.g. 64/128/256) but that
imposes restrictions on the minimum RAM size.
Another solution would be to place the DTB between OpenSBI and the kernel
at 1MB alignment, but current locore.S code assumes 2MB alignment.

With this change I can now boot on QEMU with an OpenSBI configured to
store the DTB at an offset of 1MB.

See also riscv-software-src/opensbi#169

Reviewed By:	mhorne
Differential Revision: https://reviews.freebsd.org/D25151
arichardson added a commit to CTSRD-CHERI/opensbi that referenced this issue Jun 17, 2020
This removes the minimum RAM size and maximum kernel size restriction.
See also riscv-software-src#169
@avpatel
Copy link
Collaborator

avpatel commented Oct 23, 2020

QEMU has been updated to use fw_dynamic.bin as default firmware. The fw_jump.bin also works fine in latest QEMU.

Regards,
Anup

thloh85-starfivetech pushed a commit to starfive-tech/opensbi that referenced this issue Sep 17, 2021
OpenSBI defaults to place the FDT 32MB after the start of the kernel but
this is not sufficient for a kernel with a large builtin initrd rootfs.

This patch change FW_JUMP_FDT_ADDR to 128MB after the FW_TEXT_START to
support > 32MB kernel image.

This is known limitation, see detail in
riscv-software-src/opensbi#169

Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com>
thloh85-starfivetech pushed a commit to starfive-tech/opensbi that referenced this issue Apr 21, 2022
OpenSBI defaults to place the FDT 32MB after the start of the kernel but
this is not sufficient for a kernel with a large builtin initrd rootfs.

This patch change FW_JUMP_FDT_ADDR to 128MB after the FW_TEXT_START to
support > 32MB kernel image.

This is known limitation, see detail in
riscv-software-src/opensbi#169

Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com>
@arichardson
Copy link
Contributor Author

Could you please explain why this is marked as closed now? It seems to me that the jump offset is still 2MB?

rebeccasf pushed a commit to starfive-tech/opensbi that referenced this issue Feb 22, 2023
OpenSBI defaults to place the FDT 32MB after the start of the kernel but
this is not sufficient for a kernel with a large builtin initrd rootfs.

This patch change FW_JUMP_FDT_ADDR to 128MB after the FW_TEXT_START to
support > 32MB kernel image.

This is known limitation, see detail in
riscv-software-src/opensbi#169

Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com>
@arichardson
Copy link
Contributor Author

ping. Could you please explain why this issue has been closed (especially considering forks of this repository are still applying patches to work around it)?

rebeccasf pushed a commit to starfive-tech/opensbi that referenced this issue Oct 24, 2023
OpenSBI defaults to place the FDT 32MB after the start of the kernel but
this is not sufficient for a kernel with a large builtin initrd rootfs.

This patch change FW_JUMP_FDT_ADDR to 128MB after the FW_TEXT_START to
support > 32MB kernel image.

This is known limitation, see detail in
riscv-software-src/opensbi#169

Signed-off-by: Ley Foon Tan <leyfoon.tan@starfivetech.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants