Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix `BTreeMap` UB caused by using pointer identity on a static #63338

Open
wants to merge 1 commit into
base: master
from

Conversation

@francesca64
Copy link

commented Aug 6, 2019

In #50352, EMPTY_ROOT_NODE was introduced to avoid
allocating when creating an empty BTreeMap. EMPTY_ROOT_NODE was
distinguished from regular nodes using pointer identity. However,
EMPTY_ROOT_NODE can have multiple addresses if multiple copies
of liballoc are used together, i.e. when interoperating with a
dynamic library. In that situation, all sorts of scary things will
happen, since it's possible for EMPTY_ROOT_NODE to be treated as
a regular node.

To fix that, this PR adds a flag to every node, and simply uses
that flag to distinguish EMPTY_ROOT_NODE from regular nodes.
This only increases the node size if keys previously used every
byte of padding after len, or if every byte of remaining padding
was previously used to also fit vals. In those cases, every node
will grow by 1 word. This playground link can be used to easily
test the space impact of this PR:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87850055d6049be048f9e73f6a5055d6

Fix `BTreeMap` UB caused by using pointer identity on a static
In #50352, `EMPTY_ROOT_NODE` was introduced to avoid
allocating when creating an empty `BTreeMap`. `EMPTY_ROOT_NODE` was
distinguished from regular nodes using pointer identity. However,
`EMPTY_ROOT_NODE` can have multiple addresses if multiple copies
of `liballoc` are used together, i.e. when interoperating with a
dynamic library. In that situation, all sorts of scary things will
happen, since it's possible for `EMPTY_ROOT_NODE` to be treated as
a regular node.

To fix that, this PR adds a flag to every node, and simply uses
that flag to distinguish `EMPTY_ROOT_NODE` from regular nodes.
This only increases the node size if `keys` previously used every
byte of padding after `len`, or if every byte of remaining padding
was previously used to also fit `vals`. In those cases, every node
will grow by 1 word. This playground link can be used to easily
test the space impact of this PR:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=87850055d6049be048f9e73f6a5055d6
@rust-highfive

This comment has been minimized.

Copy link
Collaborator

commented Aug 6, 2019

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @cramertj (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@cramertj

This comment has been minimized.

Copy link
Member

commented Aug 6, 2019

r? @gankro

@rust-highfive rust-highfive assigned Gankra and unassigned cramertj Aug 6, 2019

@Centril

This comment has been minimized.

Copy link
Member

commented Aug 7, 2019

@Gankra

This comment has been minimized.

Copy link
Contributor

commented Aug 7, 2019

Do we... actually support such a cursed linkage? I guess we have to..? Maybe..?

@Mark-Simulacrum

This comment has been minimized.

Copy link
Member

commented Aug 7, 2019

I think we only need to support it if LLVM is permitted to generate multiple copies with e.g. codegen units set to non-1 value; there's no way to pass a BTreeMap across different copies of liballoc since we don't have a stable ABI, if I understand our guarantees correctly.

Cc @alexcrichton who knows more about linkage I think

@Gankra

This comment has been minimized.

Copy link
Contributor

commented Aug 7, 2019

consts can freely diverge with codegen units but iirc we guarantee that statics have a single symbol (should be using something like linkonce/linkonce_odr).

@Gankra

This comment has been minimized.

Copy link
Contributor

commented Aug 7, 2019

Regardless of the answer we get here, here's a suggested Cute Hack to avoid bloating up the map: Just have two null parent pointer values, 0 and 1. e.g.

const IS_EMPTY_SINGLETON = 0
const HAS_NO_PARENT = 1

Then just change ascend (the only function that reads parent) to do <= HAS_NO_PARENT (<= 1) instead of == 0. Only overhead would be an extra load to check if we're the empty singleton.

But of course I'd rather not do this if we explicitly don't support this linkage scenario.

@RalfJung

This comment has been minimized.

Copy link
Member

commented Aug 7, 2019

AFAIK the entire point of statics is to guarantee having a unique address. If you have multiple liballoc in one binary you must make sure objects from one liballoc don't "leak" to the other; these two instances are basically ABI-incompatible.

@Gankra

This comment has been minimized.

Copy link
Contributor

commented Aug 7, 2019

also to be clear this question needs answering because the stdlib is not unique in using this pattern (see: thin_vec)

@RalfJung

This comment has been minimized.

Copy link
Member

commented Aug 7, 2019

@alexcrichton

This comment has been minimized.

Copy link
Member

commented Aug 7, 2019

Statics guarantee a unique address, yes, and passing something like BTreeMap across dynamic library boundaries is not supported. Almost no types in the standard library are supported to cross dynamic library boundaries.

@Gankra

This comment has been minimized.

Copy link
Contributor

commented Aug 7, 2019

It sounds like we should warn/error if a non-FFI-safe type appears in the public API of something compiled as a dynamic library..? I know the linker is an infinite source of terrors that we can't hope to tame, but it feels like this is a case where something better could be done.

@rkruppe

This comment has been minimized.

Copy link
Member

commented Aug 7, 2019

It sounds like we should warn/error if a non-FFI-safe type appears in the public API of something compiled as a dynamic library..?

See #19834, though that won't help if someone exposes Rust-ABI function symbols.

@ProgrammaticNajel

This comment has been minimized.

Copy link

commented Aug 16, 2019

Ping from triage. @Gankra any updates on this? Thanks.

@Gankra

This comment has been minimized.

Copy link
Contributor

commented Aug 16, 2019

I'm pretty sure the conclusion of this discussion is that the author is trying to "fix" a fundamentally unsupported way of using Rust, which doesn't make sense to do. In addition, I also reviewed the patch and provided an alternative implementation that would be acceptable if for some reason we concluded this configuration was worth supporting.

I think we're all implicitly waiting for the author to respond and try to provide a compelling argument for their use case, but if they don't do that we should just close this as WONTFIX.

@Dylan-DPC

This comment has been minimized.

Copy link
Member

commented Aug 16, 2019

@francesca64 any updates?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
You can’t perform that action at this time.