Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Look into drop powered panic guards #103101

Open
SUPERCILEX opened this issue Oct 16, 2022 · 20 comments
Open

Look into drop powered panic guards #103101

SUPERCILEX opened this issue Oct 16, 2022 · 20 comments
Labels
T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@SUPERCILEX
Copy link
Contributor

Origin: #102023 (comment)

It's a bit weird to have forget and ManuallyDrop, especially since forget is just a let _ = ManuallyDrop::new(t); under the hood. The primary usage of forget that still makes sense is for panic guards, but using forget has the disadvantage of being unclear in its intent. Removing forget would probably make panic guards clearer as they would encourage writing a comment or seperate method explaining what's happening.

@scottmcm scottmcm added T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Oct 16, 2022
@thomcc
Copy link
Member

thomcc commented Oct 16, 2022

Strong -1 from me. That's a lot of churn and there are still totally reasonable cases to use mem::forget, although I do agree that in many cases where you can either use ManuallyDrop or mem::forget, ManuallyDrop is better.

More concretely, I think that let _ = ManuallyDrop::new(t); is a much less clear way to accomplish the same thing as mem::forget.

@RalfJung
Copy link
Member

The point that came up in #102023 was that forget is often used for guards, and maybe it'd be more clear if guards had a method to 'defuse' them rather than people having to se a method in std::mem? In fact when talking about memory, you almost always want ManuallyDrop and not forget, as the docs indicate. mem::forget is only good for non-memory uses, which is unfortunate given the module it lives in.

@thomcc
Copy link
Member

thomcc commented Oct 16, 2022

It's also reasonable to use if you want to run a function that returns an RAII type and just prevent its destructor from running. That is, cases like forget(something.lock().unwrap()) are reasonable.

I agree that perhaps it's in the wrong module, but I don't think it's worth the degree of ecosystem churn that would be required to change it (I also don't know where else would be better).

@RalfJung
Copy link
Member

RalfJung commented Oct 16, 2022

That is exactly the guard case I mentioned, which maybe should be something.lock().unwrap().defuse() or something.lock().unwrap().forget() or so.

I am not suggesting to deprecate mem::forget, so this issue should be closed IMO, but we should have an issue tracking whether we want such defuse/forget methods.

@thomcc
Copy link
Member

thomcc commented Oct 16, 2022

I think it would be pretty confusing for a MutexGuard, and I don't really think a dedicated method makes sense in all cases (it's quite rare that you want to leave a lock locked).

@ChrisDenton
Copy link
Contributor

See also #62553. It's true that mem::forget can be a subtle footgun in unsafe code because of the uniqueness assertion which is easy to overlook. But I'm not sure if that warrants deprecation.

I'd also note that a large part of the docs for mem::forget are devoted to describing alternatives.

@thomcc
Copy link
Member

thomcc commented Oct 17, 2022

I think a better solution there is to not make Vec/String/CString/Box assert uniqueness on typed move/copy. It doesn't help much for optimizations (reads and writes to these types still go through &/&mut respectively), and is a huge footgun for a number of reasons.

@RalfJung
Copy link
Member

RalfJung commented Oct 17, 2022

That's rust-lang/unsafe-code-guidelines#326.

Are you suggesting we entirely remove the Unique here and the noalias annotations in codegen, or some middle-ground between 'just a boring NonNull pointer' and 'full uniqueness'?

@SUPERCILEX
Copy link
Contributor Author

SUPERCILEX commented Oct 17, 2022

For the MutexGuard case, I can see forget kind of making sense? Then again, keeping a mutex locked forever is a super weird thing to do, so maybe the code that does so should look weird enough to wake people up during review.


Regarding the PanicGuards of the stdlib, I thought about this some more and I'm now extremely opposed to the current approach. It is fundamentally flawed in that it misattributes the unsafety of the program. The current approach is as follows:

PanicGuard = unsafe { deallocate memory on drop }

f() {
  let guard = PanicGuard(memory);
  do_stuff();
  // Yay, no panic
  forget(guard)
}

The unsafe is in the PanicGuard, but this is wrong!!! The unsafety is actually in the lack of code. If a developer forgets to forget (lol), they will UAF. This means that it should be unsafe to not call forget which can't be expressed in the language AFAIK. I haven't dug around, but I'd be kind of surprised if there hasn't been some bug caused by a missing forget.

Proposal: rewrite all panic guards to always drop and then check for thread::panicking to decide whether or not to deallocate. Any reason this wasn't done in the first place?

@RalfJung
Copy link
Member

RalfJung commented Oct 17, 2022

Checking thread::panicking seems rather fragile and IMO is much less clear than the current approach.

It seems better to abstract this pattern into a helper so not each and every user needs to get the forget right.

@SUPERCILEX
Copy link
Contributor Author

Checking thread::panicking seems rather fragile and IMO is much less clear than the current approach.

I don't know how thread::panicking works: why would it be fragile? Also saying thread::panicking is more fragile seems wrong. The current approach is: let me think really hard about all the places I return stuff to the user and where I could panic and then make sure I forget in all the right places but not before I could panic. The new approach would be: just check for panicking inside Drop and deallocate if so. To me, that's way easier and clearer.

It seems better to abstract this pattern into a helper so not each and every user needs to get the forget right.

I would prefer that to checking panicking, but I don't think it's possible to express "drop when panicking but not under normal operation"? Or are you thinking of a defuse wrapper? I don't like that for the reasons above: way too easy to UAF.

@RalfJung
Copy link
Member

I am thinking of a function that takes 2 closures, the 2nd one to be executed only if a panic occurs while the first is running.

@thomcc
Copy link
Member

thomcc commented Oct 17, 2022

Are you suggesting we entirely remove the Unique here and the noalias annotations in codegen, or some middle-ground between 'just a boring NonNull pointer' and 'full uniqueness'?

I feel much more strongly about Vec/String/etc than Box, but yeah, I'd prefer those be NonNull than Unique (and everything that implies). For Box, I'm not sure. If we don't have evidence that it helps performance I think we should go with NonNull, though.


Proposal: rewrite all panic guards to always drop and then check for thread::panicking to decide whether or not to deallocate. Any reason this wasn't done in the first place?

Performance. Checking thread::panicking is much more costly. I also disagree this is better -- I think it's much worse and needs much stronger motivation than you've given.

@SUPERCILEX
Copy link
Contributor Author

Performance. Checking thread::panicking is much more costly.

Ah, dang. Found the implementation:

pub fn count_is_zero() -> bool {
if GLOBAL_PANIC_COUNT.load(Ordering::Relaxed) & !ALWAYS_ABORT_FLAG == 0 {
// Fast path: if `GLOBAL_PANIC_COUNT` is zero, all threads
// (including the current one) will have `LOCAL_PANIC_COUNT`
// equal to zero, so TLS access can be avoided.
//
// In terms of performance, a relaxed atomic load is similar to a normal
// aligned memory read (e.g., a mov instruction in x86), but with some
// compiler optimization restrictions. On the other hand, a TLS access
// might require calling a non-inlinable function (such as `__tls_get_addr`
// when using the GD TLS model).
true
} else {
is_zero_slow_path()
}
}
I guess that's a no-go.

I also disagree this is better -- I think it's much worse and needs much stronger motivation than you've given.

Why? I think I'm missing something here because this is the analogy I'm seeing:

char* ptr = malloc(...); // Equivalent: PanicGuard::new()
// ...
free(ptr); // Equivalent: forget(guard). If you don't put this in, you're sad in C but much sadder with the guard because it's a UAF.

I am thinking of a function that takes 2 closures, the 2nd one to be executed only if a panic occurs while the first is running.

Interesting, so this-ish:

fn panic_guard<T>(block: impl FnOnce() -> T, panic_handler: impl FnOnce()) -> T {
    struct PanicGuard<PanicHandler: FnOnce()>(PanicHandler);

    impl<P: FnOnce()> Drop for PanicGuard<P> {
        fn drop(&mut self) {
            self.0();
        }
    }

    let guard = PanicGuard(panic_handler);
    let result = block();
    std::mem::forget(guard);
    result
}

I'd have to see what that looks like at the call site, but I think that makes sense.

@RalfJung
Copy link
Member

I feel much more strongly about Vec/String/etc than Box, but yeah, I'd prefer those be NonNull than Unique (and everything that implies). For Box, I'm not sure. If we don't have evidence that it helps performance I think we should go with NonNull, though.

The aliasing requirements we could make (as defined by Stacked Borrows) go well beyond what can be expressed in LLVM, so currently we don't have a way of evaluating their performance impact.

@RalfJung
Copy link
Member

Interesting, so this-ish:

Yeah, except both closures probably want to share some state so we might have to introduce another parameter that is mutably borrowed by both.

@faucct
Copy link

faucct commented Apr 23, 2023

I would like to mention the problems the existence of forget and ManualDrop causes in parallelism: by allowing freeing memory without calling destructors, it becomes impossible to notify the other threads that this memory is no longer safe to use, blocking until they get it, and maybe migrate the data somewhere else. This forbids implementing the async variant of scope, sharing the parent future data with child futures. It seems like it is more right to implement defusing resources, such as guards and file descriptors, on an individual basis, not with a ubiquitous forget.

@SUPERCILEX SUPERCILEX changed the title Consider deprecating std::mem::forget Look into drop powered panic guards Apr 23, 2023
@SUPERCILEX
Copy link
Contributor Author

@faucct it was basically decided that deprecating forget isn't going to happen. I'm only keeping this issue open to see if there's a better way to handle panic guards.

@RalfJung
Copy link
Member

@faucct ManuallyDrop/mem::forget are a red herring, you can implement leaking memory in safe code with just Rc. Forbidding them wouldn't help with async scopes, we would have to also remove Rc, Arc, channels, ...

@faucct
Copy link

faucct commented Apr 24, 2023

Yeah, I have already found the weird examples, where though the memory is not being reclaimed without the destructors being run, but the references stored in it are becoming unborrowed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-libs Relevant to the library team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants