-
Notifications
You must be signed in to change notification settings - Fork 751
arch/x86: Changed system call ABI #4452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: x86-mmu
Are you sure you want to change the base?
Conversation
We are preparing a set of several pull requests to unify the usage of MPU and MMU, eventually with paging capability, for Tock. This pull request is required to simplify the work. |
I also decreased the value of |
Will/does this also require changes to libtock-rs/libtock-c for x86? |
There is no upstream x86 support for |
@reynoldsbd @HMiyaziwala any reason changing the ABI in this way would be a problem? |
On mobile so I haven't had a chance to review this PR yet. But the motivation behind the current ABI was to mirror cdecl, which had the specific benefit of making upcalls easier to implement and allowing the usermode callback to be a plain cdecl function. It also made the crt0 easier to write. I've got no special attachment to the ABI, however I would prefer to preserve the "upcall handler being regular cdecl" feature. So maybe it's possible to pass args in registers for syscalls but preserve the cdecl compatibility for upcalls? I also recognize our team has not yet published our libtock-c changes for x86. I'll try to make that happen soon, because I think it would be good to see the proposed ABI changes in both repos. |
I will take care of porting the
There are two upcalls in my Why can't the "kernel upcall" have its arguments pushed on the stack directly by the kernel?
[1] ^ [2] ^ [3] ^ [4] ==> the kernel accesses a virtual pointer that is not mapped. ==> a kernel fault is trigerred ==> the entire OS shuts down. Why don't we change the kernel's code such that when
|
Just noting that this part in particular, is not true. The ABI is part of a stable yet, and upstream libtock-c / libtock-rs so it's technically fine to just change it, but there are thousands? hundreds of thousands? millions? of devices in the field using the current ABI. So it's not true that "nobody care about the ABI change" |
Did you mean "The ABI is not part of a stable release yet"?
Is this a general statement about ABI changes? If so, you are right. Is this a statement about x86 Tock's ABI change? I doubt the veracity of your statement. Thousands of devices? Maybe. Hundreds of thousands or millions? That looks like an overestimate for me. Also, Tock doesn't support remote updates, so it is impossible to break any running device. Since remote updates of the kernel are impossible, the apps are either flashed together with the kernel, or separately thanks to dynamic application loader. This means that the ABI of the kernel is known prior to application loading. A developer can compile the apps for a specific kernel ABI. A minor problem might be closed-source Tock apps. In this case, a developer cannot recompile them, so an ABI change is indeed problematic. Should we care about closed-source apps? Isn't open-source the better way to develop? Lastly, do we really care about every downstream project using Tock? Due to Tock's licence, it is actually impossible to know which projects/organizations use Tock. And apart from Microsoft, I really doubt there is anyone using the x86 port. I also assume they have full control over the apps they flash with Tock. |
The high end of @alevy's statement is indeed what we are looking at. Unfortunately I'm not able to share any more detail than that. It is also true that we have full control to update our kernel and apps in lockstep, so this PR does not present any risk of unexpected breakage. What I will say as a general statement is that my team at Microsoft has a certain amount of capacity to work with and contribute to upstream Tock, and the more breakage we have to absorb the less time we have to participate upstream. Even some of the seemingly simple refactoring within tock and libtock-c repos over the past year have taken away from our capacity to help with the x86 upstream efforts. For example, I'm currently spending a ton of time wrestling with the Makefile-related changes in libtock-c over the past year, which is slowing down my ability to upstream the x86 port for libtock-c. That's the angle I'm coming from, and that's why I asked about libtock-c/-rs updates. I personally don't care as much about the ABI itself changing if it truly needs to, but I'd much prefer if the change was transparent to our downstream use case (i.e. that we don't have to rewrite all our usermode driver wrappers for the new ABI). Couple of side notes that maybe aren't directly relevant to this PR:
Like I said, these are more like side notes. Maybe there's a better venue (Slack channel or GH discussion?) to discuss x86 and paging? |
FWIW here's my working branch to add i386 support to libtock-c. I don't think it's quite ready for PR yet, but all ABI-related pieces are there and should provide a good baseline to evaluate these proposed ABI changes. |
Yes, sorry and thanks for the correction. The remainder of the comments, I think this became too adversarial. In general, yes, we care about downstream projects---both those that have not yet upstreamed all relevant support as well as those do not necessarily contribute code upstream but otherwise engage with the community. There is of course a limit, but we don't want to make life harder for deploying projects unnecessarily. ABI stability is important not specifically because of the potential for online updates (which do happen, actually) or separate updates of the kernel and applications (which are not deployed anywhere AFAIK, but they are certainly on the roadmap for several projects), but because applications and the kernel are developed and compiled separately, and the only way of ensuring an application works with a kernel is the stability of the ABI. Subtle and less subtle changes can (and do) break things in ways that can significantly increase the burden on testing and thus increase the burden to, for example, track upstream Tock. OK, none of this is to say that this PR is bad, or shouldn't be accepted (a note/question about that to come in a follow up comment, I'd like to put this high-level discussion of what's worth discussing to bed). It's a good PR and the effort to support actual virtual memory is welcome and the challenge appreciated! But, in considering whether changes are the right ones, it's important to keep in mind how they impact the ecosystem, not only what is strictly in one repo upstream right now. In this case, the problem is that because libtock-c does not yet have upstream support for x86, it is not immediately obvious what changes to libtock-c will be exactly needed to support the change in system ABI and what impact those changes will have (on performance, code size, application APIs, etc). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, now for a more substantive question and comment about these changes:
First, it's good to reconsider the ABI when significantly new features are going to be added to an architecture, like is happening in this PR. It's also good to try and unify the ABI and implementation for MMU and MPU-like architectures (though not strictly necessary if it makes sense for them to use different ABIs).
But i'm not convinced this ABI is the right one. Avoiding pushing to the stack avoids the having to align the kernel and process MMUs, which maybe saves some cycles (eventually the MMU will need to be adjusted before switching to the process and/or back to the kernel, so why not just do it earlier? the actual switch is just a single update to crt0, no?). However, if the system call ABI does not match the C calling convention (which both C and Rust use for external function calls) it would cost additional copying and cycles in userspace and/or the kernel to translate between the calling convention and the Tock ABI.
Moreover, dealing with the kernel not necessarily having a process's pages in it's VM map is going to need to be dealt with for allow buffers, so dealing with it in this way might just be kicking the can down the road.
I don't believe MMU vs. MPU has any impact on the calling convention which, for 32-bit x86 is the same (see the System-V ABI specification for example https://www.sco.com/developers/devspecs/abi386-4.pdf). On 64-bit x86, of course, the calling conventions are different, and indeed most arguments are stuck in registers (certainly enough for Tock system calls).
libtock-c build system is a complete mess. It looks more complicated than it should be. However, I expect source-related changes to be far easier to manage once the i386 build is fixed.
You won't for libtock-rs. Unfortunately, I couldn't compile libtock-c. I've tried both crosstool-ng and manual toolchain setup, both failed to build
As far as I know, there is none else involved with MMU support for Tock.
The kernel is aware of virtual addresses. Actually, the entire kernel will be aware of virtual addresses. This is different from
When a process makes an upcall, the kernel checks if it is in bounds, then translates its starting user virtual address to kernel virtual address. This should be in theory the sole point where offline translation is performed.
See my previous answer. |
Didn't intend to be adversarial. In French culture, people speak more frankly than in the US, so we might be perceived as rude by you, Americans. We also love defending our viewpoints. :)
I never said we shouldn't care about downstream projects. I said we should not care about EVERY downstream project. MIT and Apache licence allows organisations to use Tock without notice. As a consequence, it is impossible to know every downstream user of Tock, their needs and their capability to adapt to changes.
Here I have a different vision, but I would just stop here. Don't want to be "adversarial" again. :)
Can you update Tock online without an access to a debugger and using a communication protocol such as Ethernet or SPI?
For As for
Depending on your discussions, this PR may be bad or not. may be accepted or not. Check my next comment for more details. :)
x86 support is recent. I expect very few projects to use it. Also, the people who use the x86 port didn't bother to come up with a well formatted PR, neither for Tock nor for libtock-c. |
I have already said why this is bad. A process may not be scheduled immediately, so the kernel has to setup the MMU just to schedule an upcall or set system call return values.
What do you mean by crt0? Did you mean CR0 (control register 0)? The page map base address is hold in CR3. And no, changing MMU configuration is not done by changing the CR3 to point to another table. The current upstream x86 Tock MMU support to emulate an MPU uses a single table. Every time a process is going to be run, the table is changed. A similar approach is used for ARMv8-A. Why this approach? For a few reasons:
How expensive is a table change? Quite expensive for several reasons:
Look at subscribe disassembly from
You see that despite having the same ABI, the userspace still pushes the registers on the stack just before issuing the system call. This is done because some of the registers will be cloberred by the system call so they have to be saved on the stack. Also, if you care about performance and memory size, why don't you define system calls in header files so they can be inlined by the compiler? In theory, that could avoid arguments being pushed on the stack twice in some cases. For comparaison,
You read arguments from the stack, do a bit of computation and then push them again on the stack.
You read the arguments from the stack and do some computation with them, without needing to push them back on the stack. Binary size with old ABI:
Binary size with new ABI:
You saved 60 bytes of text. Why? Cuz xoring and moving registers around need 2 bytes, while reading from the stack needs 3 bytes + the additional overhead of constantly pushing registers on the stack (1 byte per push). Also, the performance implication of translating from function call ABI to system call ABI is insignificant. It takes at most a few cycles, since the processor can perform register renaming with
Check my answer to reynoldsbd.
Function call ABI != system call ABI. Both Linux and FreeBSD use the ABI pointed out by I hope I'll be able to share the PR with MMU support so we can discuss on concrete code rather than on etherical topics and suppositions. |
Here is my view on this:
|
So at least from my side, I have a broader underlying question about the overall MMU implementation being proposed: From your comments (@Ioan-Cristian and @alexandruradovici) it sounds like you envision a very flexible and general-purpose MMU subsystem which dynamically allocates pages and configures page tables at runtime. (Please correct me if that's untrue! This is why I mentioned kicking off a larger discussion about MMU, because right now I'm just reading between the lines of your comments.) In contrast, the limited MMU implementation currently in master was designed with up-front allocation and linear mapping in mind. This was very intentional on our part for a number of reasons, the biggest of which is that we view dynamic allocation as a reliability and timing hazard. So we took advantage of the Tock application model - specifically the fact that the full list of apps is generally known already when the kernel is booting - to produce what we feel is a more reliable and deterministic design with page tables that are linearly mapped and which do not change at runtime. To be clear, I don't think either of these viewpoints is wrong. Much of the community may favor a dynamic MMU, and at the same time some deployments may prefer stronger guarantees about reliability and determinism (certainly the case for Pluton, and quite possibly valuable for other applications like industrial or automotive). Hence my comment about a broader MMU conversation. In true Rust fashion, I'm confident we'll be able to find some solution that puts the choice into the user's (i.e. board author's) control. Back to the topic of the present ABI PR: I agree with @alevy's comment, this feels like kicking the can down the road. Regardless of whether the page tables are crafted statically or dynamically, I would still assert the same two observations from my previous comment:
|
The MMU support I have implemented is designed for static page tables. @alexandruradovici desire for dynamic page tables is more like a long term goal.
It is unavoidable for allow buffers. It is avoidable for x86 userspace kernel boundary. Can you please provide instructions on how to compile a toolchain for the
The kernel does not need to swap, rewrite page tables or perform TLB flushes when it wants to access process memory. Instead, it has to perform an offline translation to find the kernel's equivalent of a user virtual pointer. This is done each time a buffer is allowed. @alevy suggested to setup the MMU earlier in |
arch/x86-next/Cargo.toml
Outdated
kernel = { path = "../../kernel" } | ||
tock-cells = { path = "../../libraries/tock-cells" } | ||
tock-registers = { path = "../../libraries/tock-register-interface" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking for thoughts here, @Ioan-Cristian.
Does all of the x86 crate need to be duplicated? I had hoped that we could have one shared x86 crate, and then have two additional crates (x86 flat memory space and x86 virtual memory space) that build on top of it and each only have a few files that differ. For this PR, the only difference would be the boundary stuff. In the future, additional files could be pulled out of shared and instead differentiated for the two as necessary for implementing virtual memory.
However, you're the one actually doing the work. So maybe I'm totally missing something about why that's way too much work, totally unworkable, or bad for some other reason. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Common code may be shared. I don't like the approach of using two different crates, so I was lazy and simply duplicated the code.
The reasons I don't like the approach of using two different crates:
- They should be eventually merged, either by adopting the new ABI or by dropping it.
- It doesn't seem natural to have two x86 platforms, because, in reality, there is only one x86 platform.
- Duplicating the arch crate results in duplicating the chip and board crates. This means that the idea of using a shared crate for common code and two separate crates for specific code will propagate to chip and board crate.
As a consequence, I don't think it is worth the effort of separating common and distinct code. Eventually, the two x86
platforms will be merged by choosing one ABI over another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem natural to have two x86 platforms, because, in reality, there is only one x86 platform.
Do you not think that there will eventually be two separate platforms, one with virtual memory and one without?
Eventually, the two x86 platforms will be merged by choosing one ABI over another.
I do agree that even if there are two separate platforms, we would likely want to choose one ABI for both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you not think that there will eventually be two separate platforms, one with virtual memory and one without?
Isn't a platform with a MMU a superset of a platform with MPU? As for the new memory management system, it works with both MPUs and MMUs with little changes. As a proof, I was able to run the same applications, unmodified, on both ARMv6-M, RV32I and ARMv8-A, without changing any capsule at all. There are minor changes to chip
, arch
and board
crates, but nothing significant. Even x86 doesn't require any big change in order to work with the new memory management system. It just happens that one of the changes, the ABI one, is a breaking one for userspace.
As a result, I don't think Tock will need to have two separate x86 platforms, as the x86 with virtual memory provides all the functionality of the x86 without virtual memory. The only downside is a slight increase in binary size for x86, but given how small Tock is, this is negligible.
Removed the waiting-on-author tag for now. A request: could you split this into two (or more if necessary) commits. One that duplicates the x86 stuff and one that makes changes to it? As-is, I can't tell which files actually have changes. |
Done. |
In hindsight, having single crates for chips was, in my opinion, a mistake. We should have used a two-crate structure for all chips, one for the chip family and one for a specific MCU. I don't support merging this with the duplicated code and pushing the refactoring work on to the next contributor. It's not fair to someone who wants to fix something in the shared x86 code who either has to implement the prudent chip/chip-family split or duplicate their work. It's also not fair to reviewers who have to keep track of which patches need to be duplicated for the fork to not diverge. |
I also think it is a bad idea to duplicate code in this way, especially that the goal is to merge them one day. That's why I suggested a new branch, Branden suggested to create a common crate,
In a previous comment, I explained why there is no need for two Another approach is to use conditional compilation. This one has two disadvantages:
|
I wouldn't mind a feature branch |
The downsides of branches are:
So to me, it's not clear that a feature branch is any better than just keeping the code in your own downstream repo. However, we did find them really useful for the final push on Tock ethernet. Merging a huge change request was going to be hard, but merging little pieces into a feature branch was reviewable. Then the merge from the feature branch into master was quick because we'd already reviewed all the pieces within it. I do think a feature branch is better than duplication. |
@alevy If everyone agrees with a feature branch, can you create it, please? I don't have the required permissions to do it by myself. |
Also, if I understand correctly, the main branch of Tock is designed for:
Aren't the two goals mutually exclusive? Why doesn't Tock have multiple branches, a stable one, for downstream users, and an unstable one, for development? Or why doesn't Tock use tags with the guarantee that a tagged commit may be safely used by downstream users, while untagged commits are subjects to fast and various changes? |
Created |
Instead of pushing and popping arguments on the stack, arguments are passed in registers as on Linux x86 and Tock RISC-V. This new ABI simplifies the implementation of virtual memory. Signed-off-by: Ioan-Cristian CÎRSTEA <ioan.cirstea@oxidos.io>
@alevy Reverted the PR.
No problems on my side! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this is now going into a separate branch. The changes themselves look fine. So, I think we should merge this now.
Then later after other work on MMU support progresses we can revisit the question about whether these are the right ABI changes before merging in master.
To answer this question, Tock does the latter option. It has tagged releases that may be safely used by downstream users, while the master branch includes many recent PRs with less testing. But given that the master branch eventually becomes the next release, we do still care about stability for downstream users in it. It's not that things can't ever change, but that we want to be cautious about those changes. |
Pull Request Overview
This pull request changes the system call ABI on x86. Instead of passing arguments and return values on the stack, they are passed in registers. The following registers are used for argument/return values:
ebx
,ecx
,edx
,edi
.This change is needed to simplify the implementation of the MMU on the x86 architecture, where the virtual address space of a process doesn't match the virtual address space of the kernel. In that case, writing/reading from stack led to page faults due to missing pages.
The new system call ABI mimics the Linux kernel's ABI and Tock's RISC-V ABI.
Testing Strategy
This pull request was tested by running two
libtock-rs
applications on QEMU.TODO or Help Wanted
No help needed.
Documentation Updated
Formatting
make prepush
.