Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for position independent executables #206

Merged
merged 9 commits into from
Jan 9, 2022

Conversation

Freax13
Copy link
Member

@Freax13 Freax13 commented Dec 15, 2021

This pr adds support for position independent executables. This is not particularly useful by itself, however I plan to follow this up with a pr implementing ASLR. For now the bootloader always loads the kernel at 0x400000.

This pr only implements one type of relocation: R_AMD64_RELATIVE. I've used code similar to this to relocate code in userspace and R_AMD64_RELATIVE was the only relocation type I ever came across. I didn't implement any others types because I don't how to make rust emit them and thus am unable to test the implementations for any other types. If the code encounters any other type it will panic, stating that the type is not implemented. If this ever happens it should be trivial to implement the relocation and add a corresponding test.

I also didn't implement relocations using the symbol table for the same reason: I never saw them emitted by rust and couldn't test them.

I've added a test kernel pie testing relocations.

@Freax13
Copy link
Member Author

Freax13 commented Dec 25, 2021

This is not particularly useful by itself, however I plan to follow this up with a pr implementing ASLR.

I've started working on an implementation in https://github.com/Freax13/bootloader/tree/aslr in case someone is interested.

Copy link
Member

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks a lot!

src/binary/load_kernel.rs Outdated Show resolved Hide resolved
src/binary/load_kernel.rs Outdated Show resolved Hide resolved
src/binary/load_kernel.rs Outdated Show resolved Hide resolved
tests/test_kernels/pie/Cargo.toml Outdated Show resolved Hide resolved
@Freax13
Copy link
Member Author

Freax13 commented Dec 28, 2021

I just noticed that simply writing to the elf file in https://github.com/Freax13/bootloader/blob/5d61efd72e201aff401031119ae0662d7a175f72/src/binary/load_kernel.rs#L400 is wildly unsound: We have an immutable reference to that memory, so we can't just write to it.

We can probably copy the memory to a new frame similar to what we do in handle_bss_section.

@phil-opp
Copy link
Member

phil-opp commented Dec 28, 2021

Oh yeah, you're right. Instead of copying, we could also wrap the kernel slice in an UnsafeCell.

Just FYI: I'm currently working on a redesign of the bootloader crate in the next branch. Instead of including the KERNEL static at compile time, the new version places the kernel ELF file in the FAT image and loads it dynamically when booting. The configuration is passed in a separate ELF section by the entry_point macro (instead of reading the config from the Cargo.toml). Since the kernel is loaded to freshly allocated memory in the new version, writing to its memory should not be a problem.

The UEFI implementation already seems to work. The challenge is to implement the same approach for the BIOS bootloader too.

@Freax13
Copy link
Member Author

Freax13 commented Dec 28, 2021

Oh yeah, you're right. Instead of copying, we could also wrap the kernel slice in an UnsafeCell.

That wouldn't work, we'd need to wrap each u8 inside the slice in a UnsafeCell and not the reference to the slice. However we can't do that because the slice is part of ElfFile, so we'd need to convert it to a &[u8] anyways.

Just FYI: I'm currently working on a redesign of the bootloader crate in the next branch. Instead of including the KERNEL static at compile time, the new version places the kernel ELF file in the FAT image and loads it dynamically when booting. The configuration is passed in a separate ELF section by the entry_point macro (instead of reading the config from the Cargo.toml).

Very cool! I have a feeling that will make a lot of things a lot easier!

Since the kernel is loaded to freshly allocated memory in the new version, writing to its memory should not be a problem.

Yes and no, we still need to make sure that the relocations don't overlap with any other structures of the elf file. Considering that elf files contain a bunch of tables and addresses that could all possibly overlap and reference each other, I don't think modifying the file in place is a good approach.
I think the best way going forward is to only modify copies of the frames. We could use unused page tables flags to signal whether or not a page has already been copied (of course we'd remove the flags after applying the relocations).

AFAICT, the redesign hasn't (yet) changed kernel loading by much, so I'll continue working on this pr support for now, however if you have plans to make significant changes load_kernel.rs I can also wait until after you've made those changes.

@phil-opp
Copy link
Member

Yeah, I think you're right that changing the file in-place is not a good idea.

AFAICT, the redesign hasn't (yet) changed kernel loading by much, so I'll continue working on this pr support for now, however if you have plans to make significant changes load_kernel.rs I can also wait until after you've made those changes.

I don't think that I'll change the kernel loading, only the loading of the kernel slice and the configuration.

@Freax13
Copy link
Member Author

Freax13 commented Dec 28, 2021

I fixed the aliasing issues by lazily coping to new frames before writing.

@Freax13
Copy link
Member Author

Freax13 commented Dec 28, 2021

Not sure why CI failed, I don't think it has anything to do with this pr.

@phil-opp
Copy link
Member

Not sure why CI failed, I don't think it has anything to do with this pr.

I also got these macOS test errors on CI occasionally over the past few days. Not sure what's going on, but I agree that it seems unrelated to this PR.

@phil-opp
Copy link
Member

phil-opp commented Jan 9, 2022

Not sure why CI failed, I don't think it has anything to do with this pr.

I also got these macOS test errors on CI occasionally over the past few days. Not sure what's going on, but I agree that it seems unrelated to this PR.

They might be related to rust-lang/cargo#8941. I think the problem is that multiple tests build and use the runner executable concurrently, which leads to a race. It's the same executable for each test, so it doesn't matter which runner executable is used by which test. However, macOS seems to do the overwriting non-atomically, so that the runner executable does not exist for a short time.

I pushed a workaround in e8dc356 that runs the tests sequentially on macOS. As a longer-term solution, I'm thinking about using the upcoming "artifact dependencies" feature of cargo, but I'm not sure if this use case will be supported by cargo (see my question on zulip).

Copy link
Member

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one comment, otherwise this looks good to me. I haven't tested the code though.

src/binary/load_kernel.rs Outdated Show resolved Hide resolved
@phil-opp
Copy link
Member

phil-opp commented Jan 9, 2022

There seems to be a small type error:

error[E0308]: mismatched types
Error:    --> src/binary/load_kernel.rs:235:39
    |
235 |                     new_bytes_ptr.add(data_bytes_before_zero),
    |                                       ^^^^^^^^^^^^^^^^^^^^^^ expected `usize`, found `u64`
    |
help: you can convert a `u64` to a `usize` and panic if the converted value doesn't fit
    |
235 |                     new_bytes_ptr.add(data_bytes_before_zero.try_into().unwrap()),
    |                                                             ++++++++++++++++++++

@Freax13
Copy link
Member Author

Freax13 commented Jan 9, 2022

Oops, that's embarassing. Should be fixed now.

@phil-opp
Copy link
Member

phil-opp commented Jan 9, 2022

Thanks!

@phil-opp phil-opp merged commit 6423165 into rust-osdev:main Jan 9, 2022
@Freax13 Freax13 deleted the pie branch January 9, 2022 15:40
@ethindp
Copy link
Contributor

ethindp commented Jan 9, 2022

So this doesn't actually initialize the kernel correctly and causes a #GP immediately when using x86_64-unknown-none which uses PIC code. The GOT apparently has to be initialized in a certain way -- I don't quite know the specifics -- and as such my kernel hasn't been bootable for a while, even when setting the codegen model to PIE. I don't know if this issue is linked to that problem but I thought I'd report it here.

@Freax13
Copy link
Member Author

Freax13 commented Jan 16, 2022

So this doesn't actually initialize the kernel correctly and causes a #GP immediately when using x86_64-unknown-none which uses PIC code.

Could you provide an example of a kernel that doesn't get initialized properly? I just ran the default_settings, map_phys_mem and pie tests with the x86_64-unknown-none target and didn't get any errors.

@Freax13
Copy link
Member Author

Freax13 commented Jan 16, 2022

[...] and as such my kernel hasn't been bootable for a while, even when setting the codegen model to PIE.

After overwriting the bootloader_location variable in Makefile.toml with a custom path pointing to the bootloader repo on my disk, I was able to run your kernel.

@ethindp
Copy link
Contributor

ethindp commented Jan 16, 2022

@Freax13 mine for some reason doesn't do that. And I'm waiting on the 0.11 release so the build process doesn't depend on the TOML dependencies list.

@ethindp
Copy link
Contributor

ethindp commented Jan 23, 2022

So I just tried adding -Crelocation-model=static and it still triple faults. Weird... I wonder if the bootloader is being built with PIC/PIE (or does that even matter)?

@Freax13
Copy link
Member Author

Freax13 commented Jan 23, 2022

So I just tried adding -Crelocation-model=static and it still triple faults. Weird... I wonder if the bootloader is being built with PIC/PIE (or does that even matter)?

The bootloader itself must be built without PIE. Did you make sure that you used the bootloader from the master branch and not from crates.io?

@ethindp
Copy link
Contributor

ethindp commented Jan 23, 2022

Yes. I added this to my dependencies section in Cargo.toml:

bootloader = { git = "https://github.com/rust-osdev/bootloader" }

I'm building it using this cargo-makefile ruleset:

[tasks.default]
run_task = { name = ["check_pciids", "format", "build", "run"] }

#...
[tasks.format]
command = "cargo"
args = ["fmt", "--all"]

[tasks.build]
run_task = { name = ["check_kernel", "clippy_kernel", "build_kernel", "build_bootloader"] }
dependencies = ["format"]

[tasks.build_kernel]
command = "cargo"
args = ["rustc", "-Zbuild-std=core,compiler_builtins,alloc", "-Zbuild-std-features=compiler-builtins-mem", "--target", "x86_64-unknown-none", "--", "-C", "relocation-model=static"]

[tasks.build_bootloader]
script_runner = "@rust"
script = '''
//!```cargo
//![dependencies]
//!bootloader-locator = "*"
//!```
use std::process::Command;
use std::env::var;
use bootloader_locator::locate_bootloader;

fn main() {
    let mut bootloader_location = locate_bootloader("bootloader").expect("Could not find bootloader");
    bootloader_location.pop();
    let bootloader_location = bootloader_location.into_os_string();
    let bootloader_location = bootloader_location.to_str().unwrap();
    let cwd = var("CARGO_MAKE_WORKING_DIRECTORY").expect("CARGO_MAKE_WORKING_DIRECTORY not set");
    let mut builder_command = Command::new("cargo");
    builder_command.current_dir(bootloader_location);
    builder_command.args(&["builder", "--kernel-manifest", &(cwd.clone() + "/Cargo.toml"), "--kernel-binary", &(cwd.clone() + "/target/x86_64-unknown-none/debug/kernel"), "--target-dir", &(cwd.clone() + "/target"), "--out-dir", &(cwd.clone() + "/target/x86_64-unknown-none/debug")]);
    builder_command.status().expect("Failed to run cargo builder");
}
'''

[tasks.run]
command = "qemu-system-x86_64"
args = ["-enable-kvm", "-machine", "q35,smm=off,vmport=off", "-cpu", "host,kvm=on", "-m", "8G", "-device", "virtio-balloon", "-nographic", "-device", "qemu-xhci,id=input", "-device", "usb-kbd,bus=input.0", "-device", "usb-tablet,bus=input.0", "-audiodev", "pa,id=audio0,out.mixing-engine=off,out.stream-name=kernel,in.stream-name=kernel", "-device", "intel-hda", "-device", "hda-duplex,audiodev=audio0", "-rtc", "base=localtime,clock=host,driftfix=slew", "-drive", "format=raw,file=${CARGO_MAKE_WORKING_DIRECTORY}/target/x86_64-unknown-none/debug/boot-uefi-kernel.img", "-drive", "if=pflash,format=raw,file=/usr/share/OVMF/x64/OVMF_CODE.fd,readonly=on", "-drive", "file=disk-nvme.qcow2,if=none,id=NVME01", "-device", "nvme,drive=NVME01,serial=0001", "-drive", "id=disk,file=disk-sata.qcow2,if=none", "-device", "ahci,id=ahci", "-device", "ide-hd,drive=disk,bus=ahci.0",  "-debugcon", "file:qemu.log", "-global", "isa-debugcon.iobase=0x402", "-d", "int", "-D", "qemu2.log", "-device", "qemu-xhci,id=audio", "-device", "usb-audio,audiodev=usbaudio,bus=audio.0", "-audiodev", "pa,id=usbaudio,out.mixing-engine=off,out.stream-name=kernel-alsa,in.stream-name=kernel-alsa", "-device", "virtio-net,netdev=nic", "-netdev", "user,hostname=kernel,id=nic",  "-device", "virtio-rng-pci,rng=rng0", "-object", "rng-random,id=rng0,filename=/dev/urandom", "-device", "virtio-gpu", "-global", "driver=cfi.pflash01,property=secure,value=on", "-no-reboot"]
dependencies=["build"]

[tasks.check_kernel]
command = "cargo"
args = ["check", "-Zbuild-std=core,compiler_builtins,alloc", "-Zbuild-std-features=compiler-builtins-mem", "--target", "x86_64-unknown-none"]

[tasks.clippy_kernel]
command = "cargo"
args = ["clippy", "-Zbuild-std=core,compiler_builtins,alloc", "-Zbuild-std-features=compiler-builtins-mem", "--target", "x86_64-unknown-none"]

I don't know if the above tasks (check/clippy) break the build in some way. It doesn't (appear) to do so, but my kernel (still) isn't booting, so I'm really not sure what else to try at this point.

@Freax13
Copy link
Member Author

Freax13 commented Jan 23, 2022

I don't know if the above tasks (check/clippy) break the build in some way. It doesn't (appear) to do so, but my kernel (still) isn't booting, so I'm really not sure what else to try at this point.

This works for me. I forked your repo and commited all the changes I made to it: https://github.com/Freax13/kernel. I recommend running cargo clean before trying to run it. If it still doesn't work for you, maybe send me an email and we can figure this out in a phone call with screen sharing or in a teamviewer session.

@ethindp
Copy link
Contributor

ethindp commented Jan 23, 2022

@Freax13 So I tried your build and it works for me. I then tried applying your changes and my copy still doesn't work. I am soooooooo confused at this point...

phil-opp added a commit that referenced this pull request Apr 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants