Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make mem::uninitialized inaccessible in future editions #98862

Open
RalfJung opened this issue Jul 3, 2022 · 73 comments
Open

Make mem::uninitialized inaccessible in future editions #98862

RalfJung opened this issue Jul 3, 2022 · 73 comments
Labels
A-maybe-future-edition Something we may consider for a future edition. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@RalfJung
Copy link
Member

RalfJung commented Jul 3, 2022

We have reached the end of what we can do in #66151 (making mem::uninitialized panic when it is used incorrectly) without enormous amounts of crater failures. However, there is still one more thing we can do: we can make mem::uninitialized() inaccessible in future editions. We seem to have the support of at least one lang-team member as well. :) Cc @joshtriplett

@bstrie mentioned they have a patch doing this, though the RFC got closed. Reading the Zulip discussion, it looks like some libs team members were concerned, though the concern seems to have been mostly around hiding things from the docs. I am not talking about the docs here, I just want edition 2024 code to not compile when it calls mem::uninitialized. @m-ou-se I wonder if you would be fine with that?

@m-ou-se
Copy link
Member

m-ou-se commented Jul 3, 2022

I'm not sure if it makes sense to tie this to an edition. If it's so broken that it needs to be removed or cause a loud warning, we should probably do that in all editions.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2022

We will never be able to make this hard-error in old editions. There is too much old code out there that uses it.

At the same time we should do whatever we can to prevent new code that uses it from being written. I think this means it should hard error for new code.

@est31
Copy link
Member

est31 commented Jul 3, 2022

New editions are not just for new code, but also old. Do you know of a way to automatically migrate code away from mem::uninitialized? cargo fix-ability is a requirement for edition changes unless that requirement has been removed.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2022

There is no automatic migration possible.

AFAIK cargo-fixability of code that is already warning-free is a requirement for edition changes. This function is deprecated, so warning-free code will not call it.

@m-ou-se
Copy link
Member

m-ou-se commented Jul 3, 2022

We will never be able to make this hard-error in old editions.

No, but we can make it a deny-by-default lint. Perhaps even a future-incompatability lint, such that it becomes visible even when it occurs in dependencies.

for new code

I don't think it's right to mix up "new edition" and "new code". It's true that most new code gets written for the newest edition, and that cargo new automatically uses the latest edition, but that doesn't mean "new code" means "new edition" or the other way around. (As @est31 also just mentioned while I'm writing this comment. ^^)

I personally try not to think of editions as much as 'old' and 'new', but instead as just slightly different dialects that we're all maintaining in parallel. Naming them after years works nicely, but if you think of them as just "Rust Dialect A", "Rust Dialect B", etc., then it feels a bit odd to remove a function from Rust Dialect D but not from Rust Dialect C.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2022

if you think of them as just "Rust Dialect A", "Rust Dialect B", etc., then it feels a bit odd to remove a function from Rust Dialect D but not from Rust Dialect C.

I don't think this is odd at all. It's a case of "We screwed up and (lacking a time machine) the most we can do is to disallow this terrible thing in the next dialect. Bad enough that we have to deal with it in our previous dialects, let's at least contain this problem as much as we possibly can."

@m-ou-se
Copy link
Member

m-ou-se commented Jul 3, 2022

What value does a hard error provide over a deny-by-default lint? People could turn off the lint, sure, but people could also switch to another edition.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2022

I don't think it's right to mix up "new edition" and "new code".

Fair. But it somewhat corresponds to "maintained code", aka code that anybody still works on. (Not all maintained code is in a new edition due to MSRV concerns, but all new edition code is necessarily maintained.) Getting rid of mem::uninitialized() for (a large fraction of) maintained code seems like a good thing to me. It's the unmaintained code that depends on old hyper/crossbeam that is blocking our other avenues to make progress here.

What value does a hard error provide over a deny-by-default lint? People could turn off the lint, sure, but people could also switch to another edition.

It's a very clear message that we wish this thing just was not part of the language. It should not be part of the language. We should send the strongest possible signal to underpin that message. And that is making it a hard error.

Editions are the best mechanism we have to un-do past mistakes. But it's not un-doing them if here are ways around it. (I very much doubt people will stick to an old edition to keep this function. That's like saying, they might just pin an old rustc. So why ever deprecate/remove anything.)

@m-ou-se
Copy link
Member

m-ou-se commented Jul 3, 2022

That's like saying, they might just pin an old rustc. So why ever deprecate/remove anything.)

There are many reasons to upgrade to a newer Rust version, including being able to use dependencies that require the newer Rust version. There are very few reasons to use a later Rust edition. Using Rust 2018 for your crate today has almost no downsides.

@m-ou-se
Copy link
Member

m-ou-se commented Jul 3, 2022

And that is making it a hard error.

But it will never be a hard error, as you explained. All we're discussing now is whether the syntax for disabling the lint should be #![allow(mem_uninitialized)] or edition = "2021".

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2022

All we're discussing now is whether the syntax for disabling the lint should be #![allow(mem_uninitialized)] or edition = "2021".

That's not even in the same category of a thing. There is always the rustup install 1.40.0 syntax available.

So no I don't buy that switching editions is remotely comparable to allowing a lint.

@m-ou-se
Copy link
Member

m-ou-se commented Jul 3, 2022

No, because changing your Rust version can break things in your entire dependency tree. Many of my dependencies will not work with Rust 1.40. All of my dependencies will work fine if I switch my crate to Rust 2015 though, as Rust Edition is a decision per crate, just like enabling/disabling lints.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2022

Sure, pinning an old Rust has much bigger side-effects than pinning an old edition. Still to me they are much more comparable than allowing a lint. As you said, editions are basically Rust dialects. Different Rust releases are also Rust dialects. But we could and should at least make it so that the latest and greatest dialect doesn't have this thing that should never have existed.

But it will never be a hard error, as you explained.

It can be a hard error in some dialects of Rust. That's strictly better than being able (with enough allowed lints) to call this function in all dialects.

@m-ou-se
Copy link
Member

m-ou-se commented Jul 3, 2022

We don't maintain Rust 1.40 or release new versions of it. We do maintain Rust 2015 and release new vesions of it.

That's strictly better than being able (with enough allowed lints) to call this function in all dialects.

Why? There is virtually no cost to using an older edition, which is a per crate decision that doesn't affect anything around it.

If someone doesn't care how unsafe the function is and is willing to disable the lint with #![allow], why wouldn't they be willing to disable the lint through edition = "2021"? There is almost no reason for them to use the latest Rust edition. It's just a default they can easily change. 🤷

I'm all for adding a deny-by-default or even a forbid lint for mem::uninitialized to force maintained crates to update, using the unstoppable slowly forward moving force that is Rust version. We don't maintain old Rust versions, so there's a ton of reasons to keep moving to newer versions, otherwise you're missing out on bugfixes and new functionality. That's not the case for editions. Someone using Rust 2015 for their crate still gets all the cool new additions and bug fixes every six weeks.

@saethlin
Copy link
Member

saethlin commented Jul 3, 2022

If it's so broken that it needs to be removed or cause a loud warning

In the issue that this comes from, it looks like it was decided that even issuing a panic in the insta-UB cases is not acceptable right now. But even if users of such libraries with insta-UB mem::uninitialized never bump their edition but they bring in a new Rust version it's entirely possible their code will just... miscompile (or have surprising codegen, depending on how you look at it) some day with no warning. I think we should do something to help those people.

Or more generally, whatever we do to ratchet up optimizations that exploit UB should come in lock-step with help to fix problems when they arise.

I think it would be nice to see mem::uninitialized gone in future editions because it was a mistake to have in the first place. But also it would be nice to not just leave those who don't bump their edition to fend for themselves as they run more and more aggressive versions of LLVM across code that we already know is asking for trouble.

@m-ou-se
Copy link
Member

m-ou-se commented Jul 3, 2022

ratchet

Rust versions have this "ratchet" effect: they only go up and don't go down. Dependencies all slowly raise their MSRV over time, important bugs are discovered in now unmaintained versions, cool features are added to new versions and not to older ones, etc. If you don't move with it, you get left behind.

Rust editions do not have this effect. Any crate can move freely between editions, in both directions. If the older editions were only meant for unmaintained crates, we shouldn't be adding new functionality to them. But we do. Just three days ago we added #[derive(Default)] for enums to Rust 2015 (and the other editions), for example. In less than six weeks from now, we'll again add a whole bunch of new standard library APIs to Rust 2015 (and the other editions).

What's the issue with making this a forbid lint regardless of edition? It wouldn't break any unmaintained dependencies thanks to cap-lints, but it'd force any maintained crate to move away from mem::uninitialized, assuming the maintainer would use a somewhat recent version of Rust. Is there a reason to silence the lint in Rust 2015/2018/2021 crates?

@est31
Copy link
Member

est31 commented Jul 3, 2022

There is the traditional forward compat lint deprecation mechanism which is used to deprecate a lot of features. It's kinda new to have it for a standard library function like mem::uninitialized though, which has been the recommended way of creating uninitialized memory for a while.

But if there is the will to get it out of the language, why not use that mechanism? Add a lint for mem::uninitialized, make it warn by default, then deny by default, then use the forward compat lint integration of cargo to warn crate users about the upcoming breakage. Then error, and on all editions (as there might be miscompilations on all of them).

As a second question, regarding the motivation for this. Is there any upcoming work on optimizers that's being blocked by mem::uninitialized because it would lead to miscompilations? Are there miscompilations already? Or is it still only about some theoretical model?

@m-ou-se
Copy link
Member

m-ou-se commented Jul 3, 2022

If #[deprecated] isn't strong enough, maybe we need a #[deprecated_and_also_completely_broken] or something, to trigger something stronger than a warning. It doesn't have to be specific to mem::uninitialized.

@saethlin
Copy link
Member

saethlin commented Jul 3, 2022

which has been the recommended way of creating uninitialized memory for a while.

It has been deprecated since 1.39.0, almost exactly 3 years ago, and MaybeUninit was stable since 1.36.0. By the time the next edition rolls around, mem::uninitialized will have been deprecated for longer than it wasn't. So yes I agree that a lot of code could have been written in the time it was the way, but also we've given people a lot of opportunity to port.

Are there miscompilations already? Or is it still only about some theoretical model?

I am not personally aware of any miscompilations reported in the wild, but there are certain sectors of the community which tend to keep to themselves even when they encounter a critical issue, much to my dismay.

The IR that rustc generates depends on the type you construct. For example, if you create a &'static u8, we just codegen that to a panic. But if you create an array of that type, we emit LLVM IR which "returns" uninitialized data and is also marked as noundef. There are old versions of crates which do this pattern, which you can read about in the issue this descends from. Again, I am not personally aware of any reports of surprise codegen due to this situation, but we are offering a very blatant contradiction to LLVM.

@5225225
Copy link
Contributor

5225225 commented Jul 3, 2022

There's #94075 (comment) which seems to have at least a fair few segfaults due to "they're making an uninit enum, so changes to the way we use enums can cause LLVM to notice the UB / something else to go wrong".

At least that's what I assume is going on for at least some of them. The work there doesn't seem blocked because of this, though.

That, and #52898 was a case of UB for an unused variable causing problems (but curiously, I can't reproduce that on current rustc)

@est31
Copy link
Member

est31 commented Jul 3, 2022

If #[deprecated] isn't strong enough, maybe we need a #[deprecated_and_also_completely_broken] or something, to trigger something stronger than a warning. It doesn't have to be specific to mem::uninitialized.

The issue of deprecated is twofold. First, one can't escalate that lint to deny by default in the future, etc. Second, there might be code that added #[allow(deprecated)] a while ago to silence deprecation warnings of some unrelated piece of code and didn't notice the mem::uninitialized deprecation warning (see #62398). The latter is a bit rare though and given that deprecated is already warn-by-default one could make the deprecation lint deny-by-default from the start.

seems to have at least a fair few segfaults due to "they're making an uninit enum, so changes to the way we use enums can cause LLVM to notice the UB / something else to go wrong".

was a case of UB for an unused variable causing problems (but curiously, I can't reproduce that on current rustc)

Thanks for the concrete examples, @5225225 ! This should help quell the "this is just FUD" people.

@saethlin
Copy link
Member

saethlin commented Jul 3, 2022

I do not want to drag this off topic, but I think if Rust is going to live up to its safety claims/reputation, we should work to prevent situations like this and not wait for people to report problems first.

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2022

We don't maintain Rust 1.40 or release new versions of it. We do maintain Rust 2015 and release new vesions of it.

I view this as a two-dimensional matrix -- there is Rust 2015 version 1.40 and Rust 2018 Version 1.60 and so on.

We can't change the past, and we are bound by stability guarnatees within editions, but we can make some of the entries in this matrix be the language Rust would be if we hadn't made mistakes in the past -- in particular, a language without mem::uninitialized().

I view this as similar to how we now require dyn when using a trait as a type. We entirely removed the ability to use Send as a type since we now consider this a mistake. You have to write dyn Send in newer editions -- not only in newer editions. We should treat mem::uninitialized similarly. Granted, migration is harder, but the risk associated with using this broken feature is also much bigger.

If someone doesn't care how unsafe the function is and is willing to disable the lint with #![allow], why wouldn't they be willing to disable the lint through edition = "2021"?

Maybe someone slapped the attribute some place in the crate and other people elsewhere are not aware of it. Or maybe having to downgrade an entire edition gives people second-thoughts. I see a qualitative difference between allowing a lint and switching to a different language dialect.

Rust versions have this "ratchet" effect: they only go up and don't go down.

They have this effect on each edition individually -- the "ratchet" works along the rows of the matrix, if the columns are editions. It's one ratchet per edition, so to speak.

So we should use the ratchet where we can, i.e. the edition 2024 row, even if we cannot use it in the edition 2015 row.

As a second question, regarding the motivation for this. Is there any upcoming work on optimizers that's being blocked by mem::uninitialized because it would lead to miscompilations? Are there miscompilations already? Or is it still only about some theoretical model?

I am not aware of anything that is actively blocked. However, LLVM now has the noundef attribute, and we'd like to use it for more and more types -- including integer types. So the risk of miscompilations to happen in code that uses mem::uninitialized() is only going up.

@workingjubilee
Copy link
Contributor

I would hope we don't have to "quell" anyone, as that is a sign of fundamental disagreement deeper than anything we can reasonably expect to deal with merely through discourse. Never mind "in the context of a GitHub issue".

Not that I did not grok your meaning otherwise. Hard examples are quite useful.
I feel it is worth reiterating from RFC 1122 what was discussed as an option and eventually rejected as unviable alternatives when choosing Rust's specific stability promises:

Rather than simply fixing soundness bugs, we could issue new major releases, or use some sort of opt-in mechanism to fix them conditionally. This was initially considered as an option, but eventually rejected for the following reasons:

Opting in to type system changes would cause deep splits between minor versions; it would also create a high maintenance burden in the compiler, since both older and newer versions would have to be supported.
It seems likely that all users of Rust will want to know that their code is sound and would not want to be working with unsafe constructs or bugs.
We already have several mitigation measures, such as opt-out or temporary deprecation, that can be used to ease the transition around a soundness fix. Moreover, separating out new type rules so that they can be "opted into" can be very difficult and would complicate the compiler internally; it would also make it harder to reason about the type system as a whole.

I would, in fact, generically encourage everyone to revisit that document and read it entirely rather than rehashing all the points of discussion that lead to it. It is a complex issue that requires thought and nuance, and Rust has already made very specifically-shaped commitments here. Some of those commitments diverge from what we commonly remember them as (including my own memory, to be honest).

@RalfJung
Copy link
Member Author

RalfJung commented Jul 3, 2022

There is not, technically, a soundness bug here -- mem::uninitialized() was always unsafe. (Though that text pretty much says we should, as a breaking change, remove before_exec and env::set_var from the safe part of the language. Interesting.) It's "just" that a lot of old code uses mem::uninitialized() in totally unsound ways -- mostly because at the time there was no alternative and no guidance. And as is often the case with such subtle cases of UB, it is very hard to point at any data for bugs this caused -- these issues are latent, and if/when this does cause a concrete problem, it will be extremely hard to track down to its source.

Due to that I doubt that we will get consensus for removing mem::uninitialized() entirely from all editions. Hence removing it from as many editions as we can is the second best thing.

I think there is a gradient between "this is so critical that we have to reject it in all editions" and "this is 'fine', regular linting is enough". I think mem::uninitialized() is on that gradient (as are before_exec and the environment mutation functions -- assuming the safe forms of the latter will get deprecated). I don't think it is useful to insist that this must be a binary choice.

(The maintenance burden argument from the RFC does not apply, since we are not talking about compiler features here, but about library APIs. So there is no issue of ending up with a a combinatorial explosion of different type checkers, for instance.)

@workingjubilee
Copy link
Contributor

Correct. Not all of RFC 1122 nor RFC 1105 are actually applicable to this conversation. I just think it helps to get a nice cup of Context from the artifacts of previous discussions.

@JakobDegen
Copy link
Contributor

If someone doesn't care how unsafe the function is and is willing to disable the lint with #![allow], why wouldn't they be willing to disable the lint through edition = "2021"?

Why don't we make them write both? Deny by default (or future incompat, or whatever) on old editions and hard error in new editions?

@CAD97
Copy link
Contributor

CAD97 commented Jul 4, 2022

I'm all for adding a deny-by-default or even a forbid lint for mem::uninitialized [--m-ou-se]

Note: forbid lints cannot be latter overridden, except by --cap-lints. As such, the difference between making the use of mem::uninitialized a hard error in --edition-202X or a forbid-by-default lint in --edition-202X is effectively zero. The only difference is whether we call this

error[E0425]: cannot find function uninitialized in module std::mem

or

error: use of deprecated function std::mem::uninitialized: use mem::MaybeUninit instead
= note: #[forbid(hard_deprecated)] is on by default and cannot be overruled

and it really doesn't matter which. The latter error is even significantly more useful, and more what I think Ralf and Josh had in mind.

So here's my concrete proposal:

Add a new attribute, #[hard_deprecated(since = "1.XX.0", edition = "20XX")]. In editions < 20XX, using this item triggers a forward compatibility deny-by-default lint. In editions >= 20XX, using this item triggers a forbid-by-default lint.

Ideally, this would be a unique lint per item which has the #[hard_deprecated] attribute, so that if someone decides to allow(std::mem::uninitialized) for some reason, they still have deny(std::os::unix::process::CommandExt::before_exec), but this is not a hard requirement. The important part is that we loudly say that these functions must never be used.

@5225225
Copy link
Contributor

5225225 commented Jul 4, 2022

In editions < 20XX, using this item triggers a forward compatibility deny-by-default lint. In editions >= 20XX, using this item triggers a forbid-by-default lint.

Well, in editions >= 20XX, it should be a hard error. Since you can publish non-building crates to crates.io, and a forbid by default doesn't stop a crate being used as a dependency because of cap-lints. Not that I suspect that's likely to happen, but there's no reason not to.

a future compat warning for < 20XX code (probably only for types that we know are UB, don't emit it for mem::uninit::<ZST>()?) is a good idea. In theory it has false positives, where dead code calls mem::uninit<!>() or something, we'll emit a FCW for code that's not UB. Not sure if FCWs are allowed to have false positives like that.

But we're currently in the situation where we want code that currently uses, say old crossbeam to panic, but we're not actually telling people that it will, so they need to either use rustsec's audit, or just upgrade for unrelated reasons.

@RalfJung
Copy link
Member Author

Can mem::uninitialized be re-purposed to return arbitrary, but frozen bytes

It was recently changed to return memory filled with 0x01.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 26, 2022

Currently ability to build, but not to run some old e.g. tokio-core-based code seems like a major hit on "Rust stable forever" policy

Could you give more details on that?

Now that we do the 0x01-filling, we might want to adjust the panics to only fire if 0x01 would potentially lead to trouble.

@5225225
Copy link
Contributor

5225225 commented Aug 26, 2022

We do have an explicit hole in our stability promise for making our underspecified semantics more specified. "What you can do with uninit memory" was previously rather unspecified, this is making it more specified. Yes, this does break code that was written assuming that uninit data was fine.

See: https://github.com/rust-lang/rfcs/blob/master/text/1122-language-semver.md

There are a number of areas where the precise language semantics are currently somewhat underspecified. Over time, we expect to be fully defining the semantics of all of these areas. This may cause some existing code -- and in particular existing unsafe code -- to break or become invalid. Changes of this nature should be treated as soundness changes, meaning that we should attempt to mitigate the impact and ease the transition wherever possible.

Now that doesn't mean we should make changes without trying to minimise the impact on the ecosystem as much as we can. That's why crater runs were done every time the panics were made stricter and repos notified of the breakage manually.

@CAD97
Copy link
Contributor

CAD97 commented Aug 26, 2022

Can mem::uninitialized be re-purposed to return arbitrary, but frozen bytes, to allow old pre-MaybeUninit, pre-modern-definition-of-safety code to compile and work?

To be clear, this is proposed as an alternative to removing the function, correct?

This is merely an improvement in performance over the current state of things, where IIUC uninitialized will only panic if the code is UB even under the arbitrary but consistent semantics. Thus, code using uninitialized is at least as functional as it ever was. (It may be more due to the support of zeroing or oneing the value if those are potentially valid.)

Additionally, achieving a "frozen" arbitrary-but-consistent value wasn't something that LLVM even supported until somewhat recently. When it's available, it may be a reasonable choice, but it shouldn't change any of the cases which panic, because those would still be immediately UB.

@RalfJung
Copy link
Member Author

I don't think we should use freeze since we don't expose any other way of freezeing memory, so there is a non-0 risk that users might reach for mem::uninitiailized if they want frozen memory.

Also, the 0x01 filling was picked because uses of mem::uninitiailized with nonnull types was fairly common. Those would be UB even with freeze, so there'd still be planty of panics needed (only plain integers would would benefit from freeze, and we are not panicking for them anyway).

@vi
Copy link
Contributor

vi commented Aug 26, 2022

Currently ability to build, but not to run some old e.g. tokio-core-based code seems like a major hit on "Rust stable forever" policy

Could you give more details on that?

I remember two of my small projects: one using Tokio 0.1 and the other using old version of pest to be buildable (with big compat warnings about mem::uninitialized), but not runnable (e.g. futures::task_impl::Task uninitialized, which is invalid panic). This is not resolved even by cargo update (latest compatible version still relies on undefined, and major upgrade it non-trivial). I don't remember in detail, but maybe this one that depended on futures-await also got panic due to uninitialized before stopping being buildable completely (this one is nightly-only).

So I assume that somebody who developed some private high-performance networking tool 5 years ago, then left it behind (or just used a pre-built executable year after year), and now trying to revive it may be unpleasantly surprised.

@RalfJung
Copy link
Member Author

Assuming that is the 0.1 branch of futures, it's not looking good... Task contains a TaskUnpark which is an enum. So it's far from obvious whether the 0x01 pattern actually makes this fine from an LLVM point-of-view. (Both enum variants are fine. But enum layout is complicated so depending on how the discriminant is stored, there might still be UB.)

Here's a replication of that layout. TaskUnpark uses a 64bit tag field to store the discriminant. 0x0101010101010101 is not a valid tag, so the 0x01 pattern is still UB.

So, doesn't look like we have a good way to not make old tokio panic -- sorry. :/

@vi
Copy link
Contributor

vi commented Aug 26, 2022

Can rustc or cargo contain a list of specific formerly popular, but now broken-beyond-repair library versions and show some customized warnings explaining what users need to do in order to make their project work?

Imagine some Rust newbie tasked to revive old project with a mem::uninitialized dependency within some corporation. What error or warning message would be appropriate? Can something like This project depends on futures 0.1's tasks and will not work with modern Rust, as it relies on undefined behaviour. To build this project, you either need to use Rust version x.xx, migrate the project to modern async framework or use this patched version of futures 0.1 be shown, with solutions for some finite set of specific dependencies?

@5225225
Copy link
Contributor

5225225 commented Aug 26, 2022

Something basically exactly along those lines (but more general, just saying "you depend on a crate that uses mem::uninitialized") was implemented in #100342 but we're still deciding on the exact situations that it should warn about.

I don't think hard-coding specific crates are a good idea, since then it might feel like the compiler is targeting specific crates and not others. That, and it won't lint about the less common crates, or crates we don't know about (internal dependencies in companies?)

@CAD97
Copy link
Contributor

CAD97 commented Aug 26, 2022

tbf, the futures crate is a semiofficial rust-lang organization crate, so it may be a reasonable target for special behavior (in cargo, not rustc, though imho wait no, is tokio the one at fault for using uninitialized).

If they don't already have one, someone should add a RustSec advisory with that information, which cargo audit will surface.

random bad idea

In the case where we know it's UB, and we know there's a Default implementation, we could emit a call to Default. This would make more code defined at the cost of implementation complexity.

This is I'm sure very difficult due to the layering of the compiler, but at least theoretically possible. Unfortunately, it doesn't really solve the problem for all types, and most problem cases probably don't have a Default Implementation anyway.

@saethlin
Copy link
Member

saethlin commented Aug 26, 2022

In the past people (including myself) have suggested the we could emit a valid but unspecified bit pattern for any type. That would keep old code "working" for sure, but

  1. It's a significant maintenance burden to support code that was explicitly not part of the stability guarantee to begin with, so it may be a concerning precedent.
  2. We definitely don't want users to notice that this is the case and use this function willy-nilly because it never causes problems so long as there is any valid bit pattern for the type.

@RalfJung
Copy link
Member Author

tokio the one at fault for using uninitialized

Ah, I was about to suggest if futures is a semi-official crate maybe we can get an 0.1.x point release, but seems like that would not help.

@vi
Copy link
Contributor

vi commented Aug 27, 2022

futures ... 0.1.x point release, but seems like that would not help

Maybe tokio 0.1 and tokio-core point releases could help?

@LunarLambda
Copy link

I'm in favour of allowing editions to remove things that are known to be broken. Despite Rust's stability guarantees (which are a wonderful thing!), I don't think it's realistic for there to never be a "we screwed up big time" escape hatch, and I think it's good for Rust to "encourage"/enforce the use of safe replacements, even if it risks some churn/frustration.

It also means that if someone sees (e.g. when auditing dependencies) a crate is written against edition 2024 (or any future relevant edition), it means it cannot (or at the very least is very unlikely to) use certain broken features, which is nice. A good example of this are the efforts to either deprecate or restrict static mut in the 2024 edition.

@est31
Copy link
Member

est31 commented Dec 19, 2023

There has also been some discussion in #100342 which proposed adding a forward compat warning. I want to quote @Nilstrieb 's comment:

I don't think it's a good idea to add an FCW for mem::uninitialized now. The danger of the function has been diffused, it just returns 0x01 now. Programs using it are completely fine (though they are a little bit slower than they originally anticipated :D).

That is about FCWs and not edition lints/errors.

How much does the 0x01 value help really? If I do unsafe { let mut ref: &u8 = mem::uninitialized(); some_c_func(&mut ref); *ref}, is that safe now?

@KisaragiEffective
Copy link
Contributor

No, it would require reading uninitialized memory, which is UB.
The 0x01 constant is meant to be mitigating wrong code (i.e. where type argument is non-null pointers or bool which only accepts 0x00 or 0x01; creating invalid bools is also UB)

@saethlin
Copy link
Member

saethlin commented Dec 19, 2023

How much does the 0x01 value help really?

Quite a bit. Some callers still trip into UB because they use this function inside a generic and in an instantiation the 0x01-filling collides with a niche. But a very common use of this function is to "cheaply" create an array of u8.

If I do unsafe { let mut ref: &u8 = mem::uninitialized(); some_c_func(&mut ref); *ref}, is that safe now?

Probably. You're asking a bunch of technically-open opsem questions, including whether we have validity behind references. The only catch is that you can't test that in Miri, because Miri disables the 0x01-filling mitigation.

(note that I'm trying to answer the question you're trying to ask instead of the literal text, that program in question simply panics because there is an assertion in mem::uninitialized that get monomorphized into a non-unwinding panic in this program)

@RalfJung
Copy link
Member Author

RalfJung commented Dec 19, 2023

How much does the 0x01 value help really? If I do unsafe { let mut ref: &u8 = mem::uninitialized(); some_c_func(&mut ref); *ref}, is that safe now?

It's still language UB since the ref is a dangling reference. (And it's obviously library UB since we do not guarantee that mem::uninitialized behaves like this.) But the LLVM IR we generate does not have UB. That's not a guarantee though.

No, it would require reading uninitialized memory, which is UB.

Where does this read uninitialized memory?

Probably. You're asking a bunch of technically-open opsem questions, including whether we have validity behind references. The only catch is that you can't test that in Miri, because Miri disables the 0x01-filling mitigation.

This creates a dangling reference; I don't think there are open questions around whether this is language UB even with the mitigated mem::uninitialized.

@saethlin
Copy link
Member

saethlin commented Dec 19, 2023

Of course, I forgot about that 🤦 .

In practical terms, I think the 0x01-filling in combination with the way we will monomorphize this to just a non-unwinding panic is very effective at preventing surprise optimizations.

I would advocate for removing this function for two reasons. The mitigation makes it no longer match the documentation, which clearly says

while doing nothing at all

We've already effectively removed this function; the implementation doesn't match the documentation anymore.

There are also a few cases where people have used this function and mem::zeroed to acquire storage for generic containers. These are now very wonky to use because some monomorphizations will panic and some won't.

@bstrie bstrie added the A-edition-2024 Area: The 2024 edition label Jan 8, 2024
@traviscross traviscross added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Jan 28, 2024
@traviscross
Copy link
Contributor

@rustbot labels +I-lang-nominated

We discussed this in the edition call today. Let's nominate this on the lang side to discuss whether or not this is something that we wanted to see happen.

@rustbot rustbot added the I-lang-nominated The issue / PR has been nominated for discussion during a lang team meeting. label Jun 4, 2024
@traviscross
Copy link
Contributor

@rustbot labels -I-lang-nominated -A-edition-2024 +A-maybe-future-edition

We discussed this in the lang meeting this week. The consensus was that we'd be open to a lint-based approach.

We didn't feel that this needed to happen before Rust 2024, and we were open to the idea that perhaps the severity of such a lint could be tied to Rust 2024 even after it ships.

@rustbot rustbot added A-maybe-future-edition Something we may consider for a future edition. and removed I-lang-nominated The issue / PR has been nominated for discussion during a lang team meeting. A-edition-2024 Area: The 2024 edition labels Jun 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-maybe-future-edition Something we may consider for a future edition. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
Status: Needs help: Design
Development

No branches or pull requests