Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(When) Should we do intptrcast (and other opt-in features) per default? #785

Closed
8 tasks done
RalfJung opened this issue Jun 22, 2019 · 12 comments
Closed
8 tasks done
Labels
A-intptrcast Area: affects int2ptr and ptr2int casts C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps

Comments

@RalfJung
Copy link
Member

RalfJung commented Jun 22, 2019

So it looks like thanks to @christianpoveda we will soon have a first version of intrptrcast. That's great! For now it will be off-by-default and only enabled when a seed is given, as allocation comes with some non-determinism that the seed will be used to resolve (this part is not yet implemented).

However, I think it might make sense to have intptrcast on by default. The libstd code that runs before main() does a bunch of things that we currently carry hacks for in Miri (such as testing whether a pointer is > 0), and similar for the slice comparison code (we assume allocations all have a base address of at least 32 or so). Also the integer-binop and integer-cast code is extra messy because it has to support pointer values. We have to duplicate parts of our test suite to make sure the right thing happens both with and without intrptrcast. And every now and then people run into the problem that alignment checking code does not work in Miri without intptrcast, and they don't know that they have to do cargo miri run -- -Zmiri-seed=42. So, in terms of user experience, it strikes me as a bad default.

The current "no base address" mode is nice to help detect code that assumes that no allocation will ever be "low" (which e.g. Rc did for some time) -- but OTOH code that assumes an OS, like Tokio, legitimately makes that assumption. And the cost for carrying that mode is non-trivial.

Considering in particular the UX aspect, it seems to me that rather than having Miri support a minimal set of features per default and letting the user opt-in to more, it might be better to support all (implemented) features per default and allow disabling them. So per default we would actually pick a "truly random" seed on startup, but -Zmiri-seed=xx could make that choice deterministic (but Miri would always have an RNG it could use for stuff). Per default we would allow communicating with the host system (file system and network access, system time, environment variables, ...), but -Zmiri-isolate would disable that. And so on.

CTFE might at some point become powerful enough that we'll want some of these hacks in rustc proper, but certainly not all of them, and I think we can implement them in a better way there if no machine hook is in the way.

@oli-obk (and anyone else reading) what do you think?


The concrete plan for now:

  • When no seed is given, use some default. So we always have an RNG, we can remove the Option.
  • Get rid of intptrcast special handling in run-pass test-suite, and delete redundant compile-fail tests (some of tests/compile-fail/intptrcast_*).
  • Fix panic when an odd number of digits is given for the seed, better error.
  • Get rid of the error catching introduced in Enable intptrcast for explicit casts rust#62229 (just always force_bits).
  • Move cast.rs and operator.rs to type-based dispatching for all types.
  • Remove -Zmiri-seed from playground invocation. (Of course the attribute remains documented. We should probably move that section up above "Development" as it is more relevant.)
  • Remove -Zmiri-seed from miri-test-libstd
  • Re-enable "overlaps" check for copy_nonoverlapping (see update Miri rust#62823).
@RalfJung RalfJung added C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out A-interpreter Area: affects the core interpreter labels Jun 22, 2019
@bjorn3
Copy link
Member

bjorn3 commented Jun 22, 2019

So per default we would actually pick a "truly random" seed on startup

If so, please print it, so that when code fails for a certain seed, it is easier to reproduce.

@RalfJung
Copy link
Member Author

How do you suggest we print it in a way that does not get mixed up with what the program prints? Running miri should basically behave the same as running the program, also to enable 1:1 output comparisons and so on.

@bjorn3
Copy link
Member

bjorn3 commented Jun 22, 2019

Maybe write it to say /tmp/miri_seed. and/or print it when an eval error happens.

@RalfJung
Copy link
Member Author

Printing on error seems fine, and we can also have a flag -Zmiri-print-seed.

I'd rather not print stuff to random files on the system though.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 24, 2019

Yes, I think having the "do everything" mode be the default mode is better.

Note that cargo run also prints extra stuff, so cargo miri can print extra stuff, as long as miri itself doesn't

@RalfJung
Copy link
Member Author

However, before we enable intptrcast per default, I'd like to make sure that we "mis-align" buffers as much as we can. I want to be sure that we find the error in programs like this:

fn main() { unsafe {
    let x = &[0u8; 64];
    let y = &*(x as *const _ as usize as *const [u64; 8]);
} }

@RalfJung
Copy link
Member Author

So e.g. when picking the base address for a 2-aligned allocation, we should make sure that the base address is not divisible by 4.

This still hides some alignment errors though, because that means that the offset 2 of a 2-aligned allocation is always 4-aligned... hm.

@oli-obk
Copy link
Contributor

oli-obk commented Jun 25, 2019

oh lol, the current impl can actually create truly misaligned integer addresses. you can have a u16 end up at address 2^16 + 1:

let x = (&5u8 as *const u8 as usize) * 2;
let y = (&6u16 as *const u16 as usize) * 2 / 2;
assert!(y % 2 == 1); // will succeed

cc @christianpoveda

@RalfJung
Copy link
Member Author

RalfJung commented Jun 25, 2019

Regarding the seed problem, maybe we just have a default seed of 0?

(The rest of this post is actually off-topic as it is about external communication; I'll move that to a new issue when we close this one. Also see #653.)

And we make getrandom syscalls forward to "real" getrandom, and env vars forward to "real" env vars. So per default the "interpreter RNG" (the thing that -Zmiri-seed seeds) only gets used for "interpreter non-determinism", aka allocation base addresses (and maybe thread scheduling, once we support some form of concurrency).

And then we offer -Zmiri-isolate to shut down all host system communication, which also implies determinism. Or maybe we call it -Zmiri-deterministic, but that seems misleading since it's only deterministic for a fixed seed. And then env vars are emulated like they are now, and getrandom syscalls query the "interpreter RNG". And if we ever implement reading from stdin, or accessing the file system, then that would also be turned off by that flag.

Does that seem like a reasonable plan?

@RalfJung RalfJung added A-intptrcast Area: affects int2ptr and ptr2int casts and removed A-interpreter Area: affects the core interpreter labels Jun 28, 2019
@RalfJung RalfJung changed the title (When) Should we do intptrcast and other opt-in features per default? (When) Should we do intptrcast (and other opt-in features) per default? Jun 28, 2019
@RalfJung RalfJung added C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps and removed C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out C-enhancement Category: a PR with an enhancement or an issue tracking an accepted enhancement labels Jul 1, 2019
bors added a commit that referenced this issue Jul 24, 2019
enable Intrptrcast by default

As laid out in #785: we change Miri to always have an RNG, seeded per default with 0. Then we adjust everything to remove dead code and dead tests.

r? @oli-obk
Cc @christianpoveda
bors added a commit that referenced this issue Jul 24, 2019
enable Intrptrcast by default

As laid out in #785: we change Miri to always have an RNG, seeded per default with 0. Then we adjust everything to remove dead code and dead tests.

r? @oli-obk
Cc @christianpoveda
@RalfJung
Copy link
Member Author

All tasks are done, this issue can be closed. :)

@oli-obk
Copy link
Contributor

oli-obk commented Jul 26, 2019

🎉 🎈 awesome! This is an absolutely epic feature

@RalfJung
Copy link
Member Author

(Some of those checkboxes are only really done when rust-lang/rust#62946 lands; I ticked them when I created the PR to know what other things I still need to do.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-intptrcast Area: affects int2ptr and ptr2int casts C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps
Projects
None yet
Development

No branches or pull requests

3 participants