Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Audit the sentinel node for Unsafe Code Guidelines Violations #97

Closed
Gankra opened this issue Aug 16, 2019 · 10 comments
Closed

Audit the sentinel node for Unsafe Code Guidelines Violations #97

Gankra opened this issue Aug 16, 2019 · 10 comments

Comments

@Gankra
Copy link
Contributor

Gankra commented Aug 16, 2019

As discussed in #96, LinkedHashMap is some very old code that uses some old patterns that the UCG really doesn't like.

@Gankra
Copy link
Contributor Author

Gankra commented Aug 16, 2019

Whoops accidentally hit submit. Copying my comment from the PR:


Yikes. Ok so this is definitely an improvement but I think more needs to be done to make this rigorously correct. In particular it is very easy to accidentally assert that the K and V exist transiently, leading to technical but unlikely to matter UB.

I believe the correct fix here is a more subtle and frankly very annoying, in the same vein as these fixes we did for BTreeMap and LinkedList.


Extra info: we allocate a sentinel ("guard") node for the map with an uninitialized K/V, which is a standard doubly-linked-list trick to make things a bit simpler by basically making it a big ring, with this sentinel joining the two ends. The problem is that we don't have a type-level distinction between "a node" and "a node that has a K/V pair", so if you ever access the sentinel with a reference, we are, potentially (grey area in the UCG), implicitly asserting that the K/V is initialized to valid values (impossible for us to do).

Note that our situation isn't exactly the same as BTreeMap: it could never even allocate space for K/V, as every BTreeMap shares this one statically allocated node. This meant a reference to a Node was partially dangling, instead of "just" pointing at uninitialized memory. The former is more obviously busted in the eyes of the UCG, as I understand it.

As BTreeMap established, the "correct" fix for this is to have two types, Node and ValueNode, where everything points to Node and you only cast to ValueNode when you know you're not looking at the sentinel.

Alternatively, we can just very carefully avoid ever constructing &Node or &mut Node if it's ever possible that we're pointing at the sentinel (basically only manipulating it through a raw pointer). This is potentially possible!

Alternatively Alternatively, we can wrap the K/V in a MaybeUninit, which to me feels like a "bad solution" but honestly might just be the best one? It'll mostly just add lots of random noise anytime we want to look at the key and value, which admittedly so would the BTreeMap solution, just "higher up" in the logic.

@Gankra
Copy link
Contributor Author

Gankra commented Aug 16, 2019

fun unsafe evil for @RalfJung's catalogue

@Gankra
Copy link
Contributor Author

Gankra commented Aug 16, 2019

As fun trivia, although this sounds like a case where maybe the absence of &raw would be relevant, I think this perfectly dodges around the problem, as:

  • we have no interest in actually initializing the sentinel
  • we get a raw pointer to the sentinel directly from alloc
  • normal nodes just use Box::new

@RalfJung
Copy link

RalfJung commented Aug 16, 2019

My personal stanza on this is we should allow references to uninitialized memory. But we'll likely have to make an exception for references to uninhabited types (including things like (u32, !)), because optimizations.

If K or V is !, is it possible to ever hit a code path that creates a node reference?

@Gankra
Copy link
Contributor Author

Gankra commented Aug 16, 2019

No, the empty map doesn't allocate.

@kyren
Copy link

kyren commented Jan 21, 2020

Hey, I just wanted to drop in and say that I have a forked version of linked-hash-map that passes tests under miri and I'd like to maybe coordinate work here.

The solution to fixing this was the laziest, mechanical fixes I could manage, basically your last suggestion of the "bad" solution of wrapping the K/V in MaybeUninit. My version of this could still use a lot of improvement.

Unfortunately I actually did this before I even found this issue and looked at what you had written, but since I've done at least some of the work it's silly for me not to try to coordinate here.

It's also 100% possible that I've done a lot of this incorrectly or sloppily as well. I should have done more work to coordinate here, but the maintenance warning sort of scared me off of that and led me down a path of doing it all independently. I do not consider myself an expert at writing theoretically sound rust code, so it's probable there are unsolved problems that miri didn't catch, but I'd be willing to do more work here with some guidance.

@niluxv
Copy link

niluxv commented Jul 11, 2020

Is there an intent to fix the UB currently detected by miri? Or should users switch to e.g. hashlink?

@valpackett
Copy link

Looks like the rust stdlib itself now detects some UB (probably the same one discussed above) with assertions in mem::uninitialized rust-lang/rust#77585 and these assertions literally make Firefox crash for me..

Firefox uses rusqlite which uses lru_cache (also a maintenance mode project on this github org) which uses this. Maybe switching lru_cache to hashlink is a good idea, but I also raised an issue in rusqlite about switching to a different caching library rusqlite/rusqlite#810

@RalfJung
Copy link

RalfJung commented Oct 5, 2020

@myfreeweb I don't think the issue you are running into is this one... it likely is #100. This was fixed in the latest released version of the crate, and that update is semver-compatible, so you should be able to update your crates to get the fix.

@Gankra
Copy link
Contributor Author

Gankra commented Jun 9, 2022

This was fixed in #106 and published as 0.5.4

@Gankra Gankra closed this as completed Jun 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants