Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: rust: chrdev: fix use-after-free on module unload #207

Merged
merged 2 commits into from
May 12, 2021

Conversation

TheSven73
Copy link
Collaborator

@TheSven73 TheSven73 commented Apr 19, 2021

Note that this issue is potentially present on any driver module which
stores its cdev in kmalloc-ed memory. This is not seen as a problem,
as module unloading is currently "best effort" only.

The kernel's struct cdev is a reference-counted kobject. This
means that the object isn't guaranteed to be cleaned up after a
call to cdev_del - the cleanup may occur later.

Rust's chrdev places the struct cdev in kmalloc-ed memory.
On module unload, it calls cdev_del and kfrees all module memory,
including the struct cdev. But that structure might only be cleaned
up later - resulting in a potential use-after-free.

This issue is reliably triggered using CONFIG_DEBUG_KOBJECT_RELEASE,
which has been developed specifically to catch this subtle class of
bugs.

Fix by allocating the cdev using cdev_alloc, which stores the
object on the kernel's kalloc heap. Now it can outlive the
module, and be cleaned up+released when the kernel decides it's time.

@TheSven73
Copy link
Collaborator Author

TheSven73 commented Apr 19, 2021

This is probably an incomplete solution (hence the RFC) because:

 * cdev_del() removes @p from the system, possibly freeing the structure
 * itself.
 *
 * NOTE: This guarantees that cdev device will no longer be able to be
 * opened, however any cdevs already open will remain and their fops will
 * still be callable even after cdev_del returns.

https://github.com/Rust-for-Linux/linux/blob/rust/fs/char_dev.c#L583

What happens when the module is unloaded while someone in userspace has the chrdev still open? Can the freed fops still get called?

@jeffmahoney
Copy link

The owner field in the file operations is set to the module before the cdev is added.

(*cdev).owner = this.this_module.0;

That will pin the module via the get_fops call in do_dentry_open.

linux/fs/open.c

Line 807 in 30d13c6

f->f_op = fops_get(inode->i_fop);

@TheSven73
Copy link
Collaborator Author

The owner field in the file operations is set to the module before the cdev is added.

Thanks! That also seems to work in practice:

# insmod rust_chrdev.ko 
[ ] rust_chrdev: Rust character device sample (init)
# mknod /dev/rust c 240 0
<hold /dev/rust open>
# rmmod rust_chrdev
rmmod: can't unload module 'rust_chrdev': Resource temporarily unavailable
<close /dev/rust>
# rmmod rust_chrdev
[ ] rust_chrdev: Rust character device sample (exit)

So all good on that front.

Copy link
Member

@wedsonaf wedsonaf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tracking down the issue!

// SAFETY: Calling unsafe functions and manipulating `MaybeUninit`
// pointer.
unsafe {
let cdev = bindings::cdev_alloc();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lately we've been trying to keep the scope of unsafe as tight as possible, with proper justifications for each unsafe operation. Would you mind moving the above out to its own unsafe block?

Something like:

// SAFETY: This only allocates memory.
let cdev = unsafe { bindings::cdev_alloc() };
if cdev.is_null() {
    return Err(Error::ENOMEM);
}

@@ -121,8 +125,10 @@ impl<const N: usize> Registration<{ N }> {
(*cdev).owner = this.this_module.0;
let rc = bindings::cdev_add(cdev, inner.dev + inner.used as bindings::dev_t, 1);
if rc != 0 {
bindings::cdev_del(cdev);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm starting to think that we need a wrapper for a cdev, such that clean up is automatic. Without it, it is very much possible for us to add an extra error path and forget to clean up; this is the sort of thing we're trying to be better at than C.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the exact same thing while I was typing up the cdev_del in the error path. That didn't feel very oxidated!

A wrapper/abstraction would be cool. The more kobject instances we need to juggle around in Rust, the more useful such an abstraction would be. I guess its complexity has to balance its benefit?

// SAFETY: Calling unsafe functions and manipulating `MaybeUninit`
// pointer.
unsafe {
let cdev = bindings::cdev_alloc();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem with this approach is that it doesn't allow us to pass any context to open -- when we control cdev, we can embed it in a outer struct and use container_of to get per-device shared state. How can we do this sort of thing after this change? (Example here.)

More generally, how are drivers supposed to use cdev_init safely?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you can retrieve the cdev from the struct inode passed into the fops's open call. E.g.:

static int port_fops_open(struct inode *inode, struct file *filp)

static int port_fops_open(struct inode *inode, struct file *filp)
{
	struct cdev *cdev = inode->i_cdev;
        ....
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but if it is allocated via cdev_alloc, we can't use the traditional approach of defining a struct that contains a cdev, then using container_of to go from the cdev in inode->i_cdev to the outer struct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know how do drivers that do use cdev_init ensure that the memory is freed only when the refcount on the kobject goes to zero?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's something I've wanted to bring up as well. I've been testing the simplest module I could find which uses a cdev -(drivers/char/raw.ko). It appears to (mis)treat the cdev in the same way as the chrdev module. Interestingly, it doesn't appear to trigger DEBUG_KOBJECT_RELEASE warnings, like the chrdev module does! Not the few times I ran it, anyways. So I wonder what's going on there.

Copy link
Collaborator Author

@TheSven73 TheSven73 Apr 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reading of the raw module was that it couldn't possibly be working but I didn't want to poison your analysis :)

Thanks for letting me discover that on my own, it's the best way :)

I agree that raw is broken, or at least the module unload section of it. But a quick grep shows that many more drivers are in the same boat, e.g.

char/pc8736x_gpio.c
usb/mon/mon_bin.c
hid/hid-roccat.c
hid/hidraw.c

From what I can tell, any driver which uses cdev_init/cdev_del is at risk of triggering the debug warning, if the cdev is still alive when the module is unloaded. I guess for most drivers that won't happen so easily, as in practice there tends to be a delay between closing the last handle (which calls cdev_del) and running rmmod.

That said, I found out in the past that plenty of drivers have use-after-free issues (and more!) in their module unload path. A common mistake is forgetting to call cancel_work_sync, you can find it all over. A few years ago, I submitted a bunch of fixes for various drivers to try and address some of these issues.

From what I understand, the kernel community does not view module onload bugs as a problem. That's probably why so many remain, and few people care. See Greg KH's opinion here.

So module unload might be broken wrt DEBUG_KOBJECT_RELEASE simply because no-one cares enough right now, and the "correct" mechanism does not yet exist. Should we work around that in the Rust code?

Copy link
Collaborator Author

@TheSven73 TheSven73 Apr 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a thought: using cdev_set_parent the kernel will make sure that the parent (which holds the cdev's memory) is not freed before the cdev. So maybe we can make the module itself the cdevs parent? But will that increment the module's refcnt, and prevent it from getting unloaded?

Edit: looking at kernel/module.c, it's obviously way more complicated than that :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'm speaking to Greg about this and indeed this isn't something that they care about. He advises us to not expose these APIs in Rust just yet.

So for now, let's do the workaround that you implemented so that we don't get DEBUG_KOBJECT_RELEASE warnings and we don't have to disable it either.

We'll find a way to share per-device state later (we already have a TODO for this in convert below).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call. I will proceed with your review comments then, does that sound ok?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds great, thank you!

@TheSven73 TheSven73 force-pushed the rust-for-linux-cdev-uaf branch 2 times, most recently from 70a2bab to dad1a84 Compare April 21, 2021 04:48
}
}

fn new_none_array<T, const N: usize>() -> [Option<T>; N] {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function looks pretty bad. I couldn't work out how to initialize an [Option<SafeCdev>; N] array to None:

cdevs: [None; N],

Rust insists that SafeCdev needs the Copy trait, which is obviously a really bad idea for a refcounted object:

error[E0277]: the trait bound `Option<SafeCdev>: core::marker::Copy` is not satisfied
   --> rust/kernel/chrdev.rs:155:24
    |
155 |                 cdevs: [None; N],
    |                        ^^^^^^^^^ the trait `core::marker::Copy` is not implemented for `Option<SafeCdev>`
    |
    = help: the following implementations were found:
              <Option<T> as core::marker::Copy>
    = note: the `Copy` trait is required because the repeated element will be copied

Does anyone know of an elegant, safe way to accomplish this?

Copy link

@adamrk adamrk Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can fill an array with const values even if they aren't Copy: https://blog.rust-lang.org/2021/02/11/Rust-1.50.0.html#const-value-repetition-for-arrays. So

const NONE: Option<SafeCdev> = None;
[NONE; N]

should work.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works. Concise, elegant, and safe - thanks ! 👍

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the codegen is better, even, from a quick test in Compiler Explorer.

@@ -20,10 +20,66 @@ use crate::error::{Error, KernelResult};
use crate::file_operations;
use crate::types::CStr;

struct SafeCdev(*mut bindings::cdev);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I don't think we need this Safe prefix. How about just Cdev or CharDev?

fn init(&mut self, fops: *const bindings::file_operations) {
// SAFETY: call an unsafe function
unsafe {
bindings::cdev_init(self.0, fops);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we're supposed to call cdev_init on what's returned by cdev_alloc . Perhaps we should have the constructor above (alloc) also take a file_operations as argument and we just initialise cdev.ops?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are absolutely right, that pattern is incorrect! Yes, let's fold the cdev.ops assignment into alloc.


impl SafeCdev {
fn alloc() -> KernelResult<Self> {
// SAFETY: call an unsafe function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I 'm not sure if this is a final version meant for revision (in case it isn't, please ignore this), but we'd like to see a justification for the call being safe. Also, for consistency, please end the sentences in comments with a punctuation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that I understand the ins-and-outs of how we're planning to manage unsafe code in the kernel, so bear with me. How could I justify that this code is safe?

}
}

fn add(&mut self, dev: bindings::dev_t, pos: c_types::c_uint) -> KernelResult<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pos should be called count, no?

rust/kernel/chrdev.rs Show resolved Hide resolved
@@ -223,6 +223,10 @@ jobs:
| sed s:$'\r'$:: \
| tee qemu-stdout.log

# the kernel should not be generating any warnings
- run: |
grep -L '] WARNING:' qemu-stdout.log
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why -L?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-L inverts grep's return value. So the workflow will fail if a warning is present in the kernel log.

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wow -- this has changed 3 years ago. In my version of grep it didn't, and the manual didn't mention it either.

It turns out the behavior changed in https://git.savannah.gnu.org/cgit/grep.git/commit/?id=92526f7246464825c5547ceb08e01433e035c867

Since this is just for the CI, it is OK.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting ! Is there a better, more universal, etc way to invert the grep result? No problem switching to that.

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, it was changed back again: https://git.savannah.gnu.org/cgit/grep.git/commit/?id=0435ebca64fbafcd62008c991dd9377d8a792dfc

So only grep 3.2, 3.3 and 3.4 work that way. I think it is best to just do it ourselves even if we never change the image we run the CI on (e.g. we may want to move all this into a script so that the CI can be run somewhere else).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I'll use (! grep '] WARNING:' qemu-stdout.log) then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that should do the trick (not sure if the parenthesis are needed -- it seems to work without them)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This indicates that parentheses are needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are multiline strings in a "literal style block", so I think ! is not reserved there, but I may be wrong.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only one way to find out, lol

}

fn set_owner(&mut self, module: &crate::ThisModule) {
// SAFETY: dereference of a raw pointer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, as Wedson said -- i.e. the SAFETY comment is not about describing what operation is unsafe (the compiler can tell us that), but about describing why that operation is sound.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're looking for a description of a proof-of-safety, which runs inside a human brain? :) Could you point me to an instance of this in the existing codebase? I often find it easier to work from an existing example.

No concern that new commits will break existing proofs-of-safety?

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I inherited the SAFETY comment practice from Rust's standard library; and most of the code we have already follows the convention.

New commits that fail to update the comment are a bug. That is, any commit updating any unsafe block must re-check the SAFETY comment that goes with it. Same for unsafe functions and their Safety section. That is why having unsafe blocks as small as possible and as close as possible to the operation helps avoiding "comment bugs" :)

I know the C side of the kernel does not follow such strict comment guidelines, but I/we believe this leads to better code overall (specially when maintaining it). And yes, many of those comments will be trivial, but the point is not just to document what is non-trivial, but to also know there is nothing non-trivial going on.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree 100% with the need for rigour here. One of the big justifications for this effort is memory safety issues, so we better get this right, and keep it right. I would like to learn what's required here - I'll see if I can make a rigorous SAFETY argument in the next push.

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As to how to write the soundness "proof", a good way is to check the documentation of what you are calling to see why it is unsafe (or the language reference, for other operations). Then, think why those preconditions hold, and write it down.

So, for instance, here, the comment should explain why self.0 is guaranteed in all cases to be a valid (dereferenceable) pointer. Thus you can ask yourself the question "Is there any way someone can build a SafeCdev object without .0 initialized properly and call it?"

From a quick look, it seems the only way to build a SafeCdev object is through alloc(), which ensures it is a valid pointer, and it never gets modified. So you could write something like this.

However, this is one of those cases where what we have is a type invariant. For these, we started the convention of writing an # Invariants section in the type's doc-comment, and then referencing that in the SAFETY comments. Doing it this way makes things much easier to reason about, and we avoid having to repeat the same explanation in several places (e.g. every time we dereference the pointer member).

Examples of type invariants being used are at pages.rs, file_operations.rs and sync/arc.rs:

/// Wraps the kernel's `struct file`.
///
/// # Invariants
///
/// The pointer [`File::ptr`] is non-null and valid.
pub struct File {
    ptr: *const bindings::file,
}

And then:

    /// Returns the current seek/cursor/pointer position (`struct file::f_pos`).
    pub fn pos(&self) -> u64 {
        // SAFETY: `File::ptr` is guaranteed to be valid by the type invariants.
        unsafe { (*self.ptr).f_pos as u64 }
    }

I think I will put this explanation somewhere in coding.rst, since it will probably come up several times :)

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry -- I was writing this second comment after submitting the other; I didn't mean to leave you hanging without a proper explanation/example.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, don't worry about stuff like that! And thank you so much for the great explanation, this should give a chance of getting it right.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're welcome!

Also, please don't forget to add a // INVARIANTS: ... comment in the place you actually build (or modify) the object, explaining why the invariant starts to hold (or still holds, respectively). See the files I mentioned for examples or that.

rust/kernel/chrdev.rs Outdated Show resolved Hide resolved
rust/kernel/chrdev.rs Outdated Show resolved Hide resolved
}
}

fn new_none_array<T, const N: usize>() -> [Option<T>; N] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the codegen is better, even, from a quick test in Compiler Explorer.

fops: &'static bindings::file_operations,
module: &'static crate::ThisModule,
) -> KernelResult<Self> {
// SAFETY: `bindings::cdev_alloc` returns a valid pointer or null.
Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return value is not what makes this particular block sound (although it is what makes future things sound -- but those should be documented at that point; see the other comments). If there are no additional preconditions to satisfy (but just that the bindings are correct), you can say something like "FFI call".

Sometimes some kernel functions may not be called from certain contexts, so in those cases that "context precondition" needs to be lifted to the outer function's doc-comment # Context. Since this allocates memory, this may be one of those cases.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you tell me more about Contexts? Are you saying that every function in Rust needs a # Context comment which indicates whether or not it can run in atomic context? Does kernel Rust have atomic contexts yet? A grep for GFP_ATOMIC brings up nothing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only those that cannot be called from all contexts. This is one of the policies we introduced very recently and we do not really follow it in many places, so bear with us...

Furthermore, we do not require documenting private items like this struct (at least so far, still thinking about that one...), but callers that are public still need to be documented. So e.g. Registration::register() should have it.

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The long-term idea is to have some analyzer or lint tell us if we are following the rules -- or even the compiler itself by annotating the functions somehow (e.g. something like the "colored unsafe" I proposed).

Copy link
Collaborator Author

@TheSven73 TheSven73 Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying, that makes sense! From what I understand, only a relatively small subset of kernel functions can be called from atomic contexts. So it might make sense to add a documentation marker (or code attribute?) only to those functions that are atomic compatible.

I can't think of any reason why one would want to create/register a cdev from an atomic context. But it's an interesting area to think about, for sure!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS looks like our messages crossed :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, when making things like this so strict, it brings up situations that do not make much sense (like registering, as you say). Similar to the SAFETY comments, it is more about being consistent, even if it should never be the case that someone attempts to register something in an atomic context etc.

If we manage to get the # Context comments right, it is my hope that reasoning about contexts is easier compared to the C side, plus we would be in a much better position later on to switch to actual annotations and automated checking.

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for the # Context doc-comment, you can ignore it for the moment -- we still need to see how we will write them and do a pass over the current code (we can discuss it in one of the meetings).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good! I'll push the updated SAFETY comments in a few minutes.

if cdev.is_null() {
return Err(Error::ENOMEM);
}
// SAFETY: [`Cdev::cdev`] is valid and non-null.
Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we don't have a Cdev object yet, so the type invariants do not guarantee the pointer is valid.

Instead, consider something like:

Suggested change
// SAFETY: [`Cdev::cdev`] is valid and non-null.
// SAFETY: `cdev` should be valid since `cdev_alloc()` returned a non-null pointer.

}

fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult<()> {
// SAFETY: [`Cdev::cdev`] is valid and non-null.
Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// SAFETY: [`Cdev::cdev`] is valid and non-null.
// SAFETY: `self.0` is valid and non-null by the type invariants.

return Err(Error::ENOMEM);
}
// SAFETY: [`Cdev::cdev`] is valid and non-null.
// `fops` and `module` have `'static` lifetime.
Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second line does not matter for the soundness of this unsafe block. It may matter for future usage of this struct, such as other FFI calls etc. So this fact perhaps needs to be part of the type invariants instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that true, though? If fops or module get deallocated before the struct cdev they're part of... very interesting and unsound things could happen.

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is what I meant by "for future usage of this struct, such as other FFI calls". But for this unsafe block in particular, it is not making it sound or otherwise, i.e. even if both were null the block would still be safe.

Copy link
Collaborator Author

@TheSven73 TheSven73 Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True! I'm not sure where the crashes could happen in such a case. If owner is freed prematurely, the call to cdev_del in Drop could crash. But worse, if fops is freed prematurely, the crash could happen anywhere in the kernel's C code!

I'm not sure how this could/should be documented or analysed. (I'm sure this can be addressed, I just can't see how myself yet)

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suggestion (sorry if it was not clear) is to record this fact in the type # Invariants section, since it is something that the type guarantees. Then perhaps you can mention the fields are properly filled and will never become invalid thanks to the invariant when calling cdev_add().

///
/// # Invariants
///
/// The pointer [`Cdev::cdev`] is valid and non-null.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no Cdev::cdev since you have a tuple struct. Perhaps change to a named member if you prefer self.cdev etc.


impl Drop for Cdev {
fn drop(&mut self) {
// SAFETY: [`Cdev::cdev`] is guaranteed valid and non-null.
Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// SAFETY: [`Cdev::cdev`] is guaranteed valid and non-null.
// SAFETY: `self.0` is guaranteed valid and non-null by the type invariants.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for the review @ojeda ! I will fix up those SAFETY comments. My internal safety comment generator isn't properly calibrated yet, bear with me :)

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries at all -- thanks to you for being one of the early ones joining. We all needed to get accustomed, and if this effort is successful, we will have many other kernel developers asking the same things sooner or later. So hopefully these reviews help other readers as well (we can even reuse a few good review chains as documentation for newcomers etc., since they will likely have the same questions etc.).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's awesome being a guinea pig :)

Copy link
Member

@ojeda ojeda Apr 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, we should avoid linking to the PRs themselves unless everyone involved agrees and instead write better docs based on the review chains found in the PRs :-)

(The reason is that some people may prefer not to be shown in that light, i.e. asking "newbie" questions -- which they are not -- we all asked the same questions at some point)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this experience extremely interesting - it shows how much SAFETY rigour is/will be required to guarantee a safe interface between kernel C and Rust drivers. One spends a few orders of magnitude more effort in the kernel Rust core, in order to get Rust driver safety.

A few potential problems spring to mind, though:

  • human fallibility regarding initial SAFETY reasoning
  • even more importantly, kernel C behaviour changing due to refactoring or bug fixes, and existing SAFETY reasoning going stale. this could be a serious concern, as the kernel is in constant flux (internally).

On the whole I believe Rust-for-Linux has unbelievable potential (that's why I'm here!). The optics and politics of the above could be an issue. The more technical, automated mitigations we can find, the better.

rust/kernel/chrdev.rs Outdated Show resolved Hide resolved
@@ -223,6 +223,10 @@ jobs:
| sed s:$'\r'$:: \
| tee qemu-stdout.log

# the kernel should not be generating any warnings
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# the kernel should not be generating any warnings
# The kernel should not be generating any warnings

Comment on lines 48 to 49
// [`(*self.0).ops`] and [`(*self.0).owner`] are valid and have
// static lifetime.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would mention why, at least for owner since it may confuse a future reader, i.e.:

  • In the case of ops it holds because it was coerced from a reference with 'static lifetime.
  • However, the owner case is different: it is because the module.0 pointer comes from a ThisModule (the fact that the reference to the ThisModule has static lifetime does not imply the pointer inside ThisModule does -- it is the invariants of ThisModule that ensure that).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created #211 and #212 to improve ThisModule on its invariants.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like owner has module lifetime, not static lifetime. That means we could potentially cause bad stuff to happen by giving Cdev::alloc the this_module of a different module! I wonder if we need to consider that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry I haven't followed this closely, I'll take a closer look tomorrow. For now just answer the question above.)

We need to consider it. The ultimate one should ask to help answer this is: can someone write code without unsafe blocks that violate memory safety? If the answer is yes, then we need to do something about it; in general we have two possible routes: mark the function unsafe and document the safety requirements that callers are required to manually satisfy, or (preferred) find a way to prevent abuse of the function (e.g., by not allowing users to specify arbitrary modules) though it's not always possible...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ThisModule is only to be used only within the current module (the constructor is unsafe to ensure proper usage, but I opened those two issues because it should be improved). In fact, I think nobody should be creating any ThisModule object in normal circumstances, only using the one provided via the module! macro (which is static). We can create a lint to ensure this later on.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall I put a TODO in the SAFETY comments for now re. owner?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should ensure nobody falls into that trap, so I wouldn't put in chrdev.rs but in ThisModule. Probably we should create a get() unsafe method in ThisModule and explain that you cannot pass it to another module. I will add this point to the issue I created above.

Copy link
Collaborator Author

@TheSven73 TheSven73 Apr 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that we might be in a serious spot of lifetime trouble here. I'll push the updated SAFETYs, and place a review comment outlining the problem.

rust/kernel/chrdev.rs Outdated Show resolved Hide resolved
/// # Invariants
///
/// - [`self.0`] is valid and non-null.
/// - [`(*self.0).ops`] is valid and has static lifetime.
Copy link
Member

@ojeda ojeda Apr 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// - [`(*self.0).ops`] is valid and has static lifetime.
/// - [`(*self.0).ops`] is valid, non-null and has static lifetime.

// - [`(*self.0).ops`] is valid, non-null, and has a static lifetime.
// - [`(*self.0).owner`] is valid and, if non-null, has module lifetime.
// This guarantees that [`(*self.0).ops`] and [`(*self.0).owner`] (if non-null)
// will live at least as long as [`self.0`].
Copy link
Collaborator Author

@TheSven73 TheSven73 Apr 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong! The whole reason we're doing this, is because our struct cdev instance might outlive the module - it's a kobject. So the cdev ends up still alive, but with an owner pointer to freed memory.

But the kernel C code probably has the same issue - I'm guessing the cdev won't touch owner when all file handles to it are closed. In that case, we are good.

What's your opinion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, in the kernel, module removal is best effort. I don't know what is nowadays supposed to be guaranteed to work or not, we can ask.

cdev_del states that the ops can still be called even after it returns (that one should be fine, since they are built into the kernel crate rather than the modules), but does not mention the other fields. We would need to check if they dereference the module anywhere.

In any case, for the comment, I would mention the detail about ops, since it is the important one to guarantee to be available forever; and create an issue for owners in general (of all kinds of devices etc.).

Copy link
Member

@ojeda ojeda Apr 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, I would avoid repeating all the invariants. It is more important to explain why those invariants are good enough for the unsafe block (e.g. that cdev_del() requires ops to live forever, and so the 'static lifetime is OK to satisfy that).

// - [`(*self.0).owner`] is valid and, if non-null, has module lifetime.
// This guarantees that [`(*self.0).ops`] and [`(*self.0).owner`] (if non-null)
// will live at least as long as [`self.0`].
// dev and count are scalars.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line can be removed, i.e. it does not make it sound or not (nevertheless, if e.g. dev or count have a range precondition, then it would need to be here. From a quick look, cdev_add does not say; although kobj_map limits n to 255, so looks fine).

Suggested change
// dev and count are scalars.

Ok(Self(cdev))
}

fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult<()> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult<()> {
fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult {

// - [`(*self.0).ops`] is valid, non-null, and has a static lifetime.
// - [`(*self.0).owner`] is valid and, if non-null, has module lifetime.
// This guarantees that [`(*self.0).ops`] and [`(*self.0).owner`] (if non-null)
// will live at least as long as [`self.0`].
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, in the kernel, module removal is best effort. I don't know what is nowadays supposed to be guaranteed to work or not, we can ask.

cdev_del states that the ops can still be called even after it returns (that one should be fine, since they are built into the kernel crate rather than the modules), but does not mention the other fields. We would need to check if they dereference the module anywhere.

In any case, for the comment, I would mention the detail about ops, since it is the important one to guarantee to be available forever; and create an issue for owners in general (of all kinds of devices etc.).

@@ -149,12 +197,9 @@ unsafe impl<const N: usize> Sync for Registration<{ N }> {}
impl<const N: usize> Drop for Registration<{ N }> {
fn drop(&mut self) {
if let Some(inner) = self.inner.as_mut() {
// SAFETY: Calling unsafe functions, `0..inner.used` of
// `inner.cdevs` are initialized in `Registration::register`.
// SAFETY: TODO needs explanation why unregister_chrdev_region
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something that says that self.inner being Some(_) means that a previous call to alloc_chrdev_region completed successfully would be fine here.

rust/kernel/chrdev.rs Show resolved Hide resolved
@TheSven73
Copy link
Collaborator Author

No need to apologise at all! Your suggestion made perfect sense to everyone at the time, including me.

unsafe {
for i in 0..inner.used {
bindings::cdev_del(inner.cdevs[i].as_mut_ptr());
}
bindings::unregister_chrdev_region(inner.dev, N.try_into().unwrap());
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a comment here about why the unwrap() can never panic? E.g.

            // N.try_into() returned Some in [`register`], and N is constant, so it will return
            // Some here too, which ensures [`unwrap`] will not panic.
            bindings::unregister_chrdev_region(inner.dev, N.try_into().unwrap());

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please!

In fact, I think we could start // PANIC: and // NOPANIC: annotations nearby unwrap(), expect(), etc.

  • The former would annotate those that we believe may actually panic -- so it needs to be a justification of why the outer function was not made fallible to begin with.
  • The latter would annotate those that we believe cannot panic -- so it needs to justify why that is so. i.e. this is the category that could have potentially been unwrap_unchecked().

At the moment, we have likely too many places due to allocations, so it can wait since otherwise we would be flooded with this comments; but later on we really should not have too many of these comments, which makes it worth the noise to have them, I think.

Thoughts?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the whole fallible allocations thing is still in flux, so adding these annotations all over the place probably won't be productive. So let's defer until later.

That said, I'm not sure, but I think we can eliminate the unwrap here:

let mut inner = this.inner.as_mut().unwrap();

simply by folding the alloc_chrdev_region into new. In that case we wouldn't need self.inner to be an Option<RegistrationInner>, it could simply be a RegistrationInner. Things become simpler, and there's one fewer potential unwrap panic.

Not sure if it's worth doing a follow-up PR for this (if it does indeed work)? Maybe there are more important fish to fry at this point?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS I'm not sure what the CI failure is about? The CI worked fine in my cloned repo. Same git id. https://github.com/TheSven73/linux/actions/runs/778067926

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I just re-triggered them. kobject_add appears there, but looks it is not coming from us:

[    0.628134] RIP: 0010:sysfs_create_file_ns+0x73/0x90
...
[    0.628165]  kobject_add_internal+0x21a/0x3c0
[    0.628167]  kobject_add+0x88/0xe0
[    0.628170]  add_sysfs_fw_map_entry+0x61/0x70
[    0.628173]  firmware_memmap_init+0x1f/0x2a
[    0.628175]  ? firmware_map_add_early+0x57/0x57
[    0.628178]  do_one_initcall+0xb3/0x240

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, the whole fallible allocations thing is still in flux, so adding these annotations all over the place probably won't be productive. So let's defer until later.

We could still start annotating the ones unrelated to allocations, though (nevertheless, I am not asking to be added here, since we still need to discuss whether to do it or not).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's worth doing a follow-up PR for this (if it does indeed work)? Maybe there are more important fish to fry at this point?

Never mind, I'm running into borrow checker issue due to the Pin, a concept which I don't understand well enough yet. I'm going to leave this well alone :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pin is indeed a bit tricky due to limitations of the language; the concept itself is easy. Hopefully the language will improve the ergonomics. Also the docs around it could be improved a bit.

@TheSven73 TheSven73 changed the title RFC: rust: chrdev: fix potential use-after-free RFC: rust: chrdev: fix use-after-free on module unload Apr 26, 2021
@TheSven73
Copy link
Collaborator Author

Updated title and commit message to make it clear that this issue occurs widely but is not seen as a problem - because module onloading is best effort only.

Copy link
Member

@alex alex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got one question, and two places that'll need a rebase, but otherwise this looks good to me. Thanks for wokring on i!

fn alloc(
fops: &'static bindings::file_operations,
module: &'static crate::ThisModule,
) -> KernelResult<Self> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to rebase this on rust and change this to Result, see #259

this.inner = Some(RegistrationInner {
dev,
used: 0,
cdevs: [MaybeUninit::<bindings::cdev>::uninit(); N],
cdevs: [NONE; N],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does just putting None here not work?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not... see here

Ok(Self(cdev))
}

fn add(&mut self, dev: bindings::dev_t, count: c_types::c_uint) -> KernelResult {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Note that this issue is potentially present on any driver module which
stores its `cdev` in `kmalloc`-ed memory. This is not seen as a problem,
as module unloading is currently "best effort" only.

The kernel's `struct cdev` is a reference-counted `kobject`. This
means that the object isn't guaranteed to be cleaned up after a
call to `cdev_del` - the cleanup may occur later.

Rust's `chrdev` places the `struct cdev` in `kmalloc`-ed memory.
On module unload, it calls `cdev_del` and `kfree`s all module memory,
including the `struct cdev`. But that structure might only be cleaned
up later - resulting in a potential use-after-free.

This issue is reliably triggered using CONFIG_DEBUG_KOBJECT_RELEASE,
which has been developed specifically to catch this subtle class of
bugs.

Fix by allocating the `cdev` using `cdev_alloc`, which stores the
object on the kernel's `kalloc` heap. Now it can outlive the
module, and be cleaned up+released when the kernel decides it's time.

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Pull Requests should not introduce any kernel warnings. Fail the
workflow if warnings are spotted in the log.

Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
@TheSven73
Copy link
Collaborator Author

Rebased to latest master branch, as requested. Thank you @alex for the review, really appreciate it !

@alex alex merged commit fc2b177 into Rust-for-Linux:rust May 12, 2021
@alex
Copy link
Member

alex commented May 12, 2021

Thanks for your work on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants