Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better handling of very large memory maps #259

Open
3 of 5 tasks
hawkw opened this issue Sep 20, 2022 · 4 comments
Open
3 of 5 tasks

better handling of very large memory maps #259

hawkw opened this issue Sep 20, 2022 · 4 comments

Comments

@hawkw
Copy link
Contributor

hawkw commented Sep 20, 2022

Currently, the bootloader has some issues handling BIOS memory maps that contain very high addresses, such as around the 1TB mark, and/or very large regions (in excess of 500 GB). Depending on the bootloader version and configuration, memory maps with high addresses or large regions may result in crashes or degrade boot performance significantly.

In particular, the following issues have been observed:

  • v0.9.x of bootloader contains an assertion that may be triggered when the memory map contains an address too big for a single PML4 entry if the map_physical_memory feature flag is enabled:

    bootloader/src/main.rs

    Lines 289 to 291 in 42da77b

    // If offset not manually provided, find a free p4 entry and map memory here.
    // One level 4 entry spans 2^48/512 bytes (over 500gib) so this should suffice.
    assert!(max_phys_addr < (1 << 48) / 512);
    (see also panicked at 'assertion failed: max_phys_addr < (1 << 48) / 512' in post-09 phil-opp/blog_os#1138)
  • v0.10.x of bootloader doesn't contain that assertion, but does have offset selection code that assumes no region will require multiple PML4 entries:
    let base =
    Page::from_page_table_indices_1gib(self.get_free_entry(), PageTableIndex::new(0))
    .start_address();
  • v0.10.x of bootloader exhibits very long boot times when a memory map contains a reserved region at a ~1TB offset. This is because the bootloader will identity map all pages up to the highest reserved address in the memory map using 4K pages, and does this twice (once for the bootloader itself, and a second time when setting up physical memory mappings for the kernel). This results in a very slow boot process.

These issues are relevant because AMD systems with an IOMMU have a reserved hole close to the 1TB mark. In recent QEMU versions (>= v7.1.0), QEMU will report this region as reserved in its e380 BIOS memory map, resulting in assertion failures or boot performance degradation. I would assume this would cause issues when booting on real AMD hardware, as well.

Some changes that would improve how bootloader handles high addresses in the memory map (many of which were suggested by @phil-opp on Gitter):

  • Change the automatic offset selection to support regions that need multiple entries in the level 4 page table (v0.9.x, v0.10.x, and probably v0.11.x)
  • Don't identity map reserved regions at all when performing the initial identity mapping for the bootloader itself (v0.10.x and probably v0.11.x)
    • The framebuffer would still need to be identity mapped so that the bootloader can write to it.
  • Use 2MB rather than 4KB pages when identity mapping the kernel address space, which would improve performance (v0.10.x and v0.11.x)
  • Consider not identity mapping holes in the memory map at all, only the reserved regions that would need to be mapped for the kernel. This way, we wouldn't map every page between the second-highest reserved region and the 1TB hole on AMD systems.
  • Consider not identity mapping that specific reserved region, at all. It's an unusable hole, not a MMIO region or BIOS structure that the kernel would actually want to access...
@hawkw
Copy link
Contributor Author

hawkw commented Sep 20, 2022

(as a side note, I'd be happy to work on some or all of these changes)

@phil-opp
Copy link
Member

Thanks a lot for creating this issue!

I'm a bit short on time right now, so I would appreciate any help! I would suggest that we fix things in the following order:

  1. Change the automatic offset selection to support regions that need multiple entries in the level 4 page table (in v0.9 and v0.10) and remove the related assertion (in v0.9). This should be enough to fix the boot errors.
  2. In the BIOS identity mapping of v0.10, ignore all reserved regions with start address > 4GIB when calculating max_phys_addr. This should be possible by adding an additional filter call here (before the map):
    let max_phys_addr = e820_memory_map
    .iter()
    .map(|r| r.start_addr + r.len)
    .max()
    .expect("no physical memory regions found");

    The reason for the 4GiB bound is that this is the maximum addressable memory in protected mode, so everything important should be contained there (e.g. the framebuffer).
  3. For the kernel-requested physical memory mapping we don't want to skip regions. However, instead of mapping the full 0..max_phys_addr range, we could instead iterate over the reported regions and map only them. The relevant part of the code is:
    for frame in PhysFrame::range_inclusive(start_frame, end_frame) {

    This could instead become something like the following (pseudo-code):
    for region in frame_allocator.regions() {
        for frame in PhysFrame::range_inclusive(region.start_frame(), region.end_frame()) { ...
    We could still use 2MiB pages for these mappings and skip any pages that are mapped already.

The code for version v0.10 is in the main branch and the code for v0.9 is in v0.9-base. We can ignore the upcoming v0.11 release for now, I'll cherry-pick the relevant parts later.


Regarding:

Use 2MB rather than 4KB pages when identity mapping the kernel address space, which would improve performance (v0.10.x and v0.11.x)

I think this is already happening. I was just confused because we don't specify the page size for the start frame here:

let start_frame = PhysFrame::containing_address(PhysAddr::new(0));
let max_phys = frame_allocator.max_phys_addr();
let end_frame: PhysFrame<Size2MiB> = PhysFrame::containing_address(max_phys - 1u64);

However, we do specify the size for the end frame. Since start and end frame must be of the same type, type inference should automatically choose Size2MiB for the start frame as well. It would still be a good idea to make this more explicit :D.

hawkw added a commit to hawkw/bootloader that referenced this issue Sep 22, 2022
When determining the maximum physical address for the BIOS bootloader's
identity mapping, we currently use the highest address in the E380
memory map, no matter how high it is. However, the bootloader runs in
protected mode, and therefore, it cannot address more than 4 GiB of
memory. We can save some time by not identity mapping addresses over 4
GiB, as the bootloader cannot address them anyway.

This commit changes the BIOS bootloader to skip addresses over 4 GiB
when determining the maximum physical address.

This is one of the changes described in issue rust-osdev#259.
@hawkw
Copy link
Contributor Author

hawkw commented Sep 22, 2022

Opened #260 to ignore physical addresses in excess of 4 GiB when identity mapping for the bootloader itself.

@Freax13
Copy link
Member

Freax13 commented Sep 22, 2022

* [ ]  Change the automatic offset selection to support regions that need multiple entries in the level 4 page table (v0.9.x, v0.10.x, and probably v0.11.x)

I'm working on implementing this. This will also fix #262.

phil-opp added a commit that referenced this issue Sep 25, 2022
hawkw added a commit to hawkw/mycelium that referenced this issue Sep 28, 2022
This updates the `bootloader` crate to version 0.10.13, which fixes
issues when booting on systems that report memory regions at high
physical addresses in their memory map (see rust-osdev/bootloader#259).

This should make it possible to run Mycelium in QEMU v7.1.0 and later,
without very poor boot performance, as well as on real hardware which
reports high addresses (many AMD systems).

In addition, there are some new bootloader features we might want to
play around with.

Fixes #321.
hawkw added a commit to hawkw/mycelium that referenced this issue Sep 28, 2022
This updates the `bootloader` crate to version 0.10.13, which fixes
issues when booting on systems that report memory regions at high
physical addresses in their memory map (see rust-osdev/bootloader#259).

This should make it possible to run Mycelium in QEMU v7.1.0 and later,
without very poor boot performance, as well as on real hardware which
reports high addresses (many AMD systems).

In addition, there are some new bootloader features we might want to
play around with.

Fixes #321
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants