semantics of black_box and clobber in the memory model #311

Closed
gnzlbg opened this issue Mar 13, 2018 · 32 comments

@gnzlbg
Contributor

gnzlbg commented Mar 13, 2018

I've just closed rust-lang/rfcs#2360 due to @rkruppe's input that by trying to specify what black_box and clobber do there the RFC is basically specifying a subset of the memory model.

So I'd like to ask for feedback on these here first.

The specification in the RFC is very loose. mem::black_box(x) is specified as follows:

  • writes the address of x to memory
  • reads all memory
  • writes all memory

while mem::clobber():

  • reads all memory
  • writes all memory

where I have no idea how to specify memory but @rkruppe came up with a minimal example that helps:

{
    let tmp = x;
    clobber();
}

Here, the compiler is allowed to optimize the store of x to tmp away, because the only code that has the address of tmp is in that scope, and that code does not read or write from tmp. In the specification of clobber, when it states "read/write from/to all memory", "memory" does not refer to tmp. However, if I change the example to:

{
    let tmp = x;
    black_box(&tmp);
    clobber();
}

in this case clobber() requires the store of x to tmp to be live, because black_box has written the address of tmp to "memory", and thus the "read/write to/from memory" in clobber can do something with tmp.

So a big question I have is what is "memory" here, and how does tmp in the first example differ from the rest?

I don't know how these could make sense in the Rust memory model, what the wording for their specification should be, and I am barely qualified to follow the discussion at all. The RFC contains more information and lots of godbolt links, but the proposed implementation of black_box and clobber is this:

#[inline(always)]
fn clobber() {
    unsafe { asm!("" : : : "memory" : "volatile") };
}

#[inline(always)]
fn black_box<T>(x: T) -> T {
    unsafe {
        asm!("" // template
          : // output operands
          : "r"(&x) // input operands
          // r: any general purpose register
          : "memory"    // clobbers all memory
          : "volatile"  // has side-effects
        );
        x
    }
}

Also, @rkruppe asked:

I'd like to know the difference between this and compiler_fence(SeqCst)

To which @nagisa answered:

The difference between asm! with a memory clobber and compiler_fence lies in the fact that a memory clobber requires the compiler to actually reload memory if it wants to use it again (as memory is… clobbered – considered changed), whereas compiler_fence only enforces that memory accesses are not reordered, and the compiler may still use the usual rules to figure out that it needn’t reload stuff.


cc @rkruppe @nagisa
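
The asm! syntax in the proposed implementation predates the stabilized inline assembly macro. As a rough sketch only, assuming the stable std::arch::asm! syntax and an x86_64 or aarch64 target, the two functions and the compiler_fence alternative might be written as follows. The names black_box and clobber follow the RFC; ordering_only is a hypothetical name, and none of this is the RFC's or std's actual implementation.

```rust
use std::arch::asm;
use std::sync::atomic::{compiler_fence, Ordering};

/// Sketch of `clobber`: an empty asm statement. Because the `nomem` option is
/// absent, the compiler has to assume the asm may read and write memory, and
/// because `pure` is absent, the statement counts as having side effects.
#[inline(always)]
fn clobber() {
    // SAFETY: the template is empty, so no instruction is executed.
    unsafe { asm!("", options(nostack, preserves_flags)) };
}

/// Sketch of `black_box`: passes the address of `x` to an asm statement that
/// the compiler must assume can touch memory.
#[inline(always)]
fn black_box<T>(x: T) -> T {
    let ptr = &x as *const T as *const u8;
    // SAFETY: the template is only an assembler comment naming the register,
    // so the pointer is never dereferenced.
    unsafe { asm!("/* {0} */", in(reg) ptr, options(nostack, preserves_flags)) };
    x
}

/// For comparison: limits compiler reordering of memory accesses, but does
/// not claim that any memory was modified.
#[inline(always)]
fn ordering_only() {
    compiler_fence(Ordering::SeqCst);
}

fn main() {
    let v = black_box(41) + 1;
    clobber();
    ordering_only();
    assert_eq!(v, 42);
}
```

All three functions are no-ops at run time; they differ only in what the optimizer is allowed to assume around the call.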

@strega-nil

strega-nil commented Mar 13, 2018

These functions, by definition, are no-ops to the virtual machine. All they are, are hints to the optimizer. The definition of these functions is implementation-defined. It's similar to the volatile semantics of C++.

@hanna-kruppe

I tend to agree with that position, though then their docs need to be much more carefully worded, and they may not make sense to put in std. But the underlying questions -- what the effects of asm!(... "memory") and asm!(... "memory" : "volatile") are, for example -- are relevant for other unsafe code as well. The answers to that will also influence how robust the proposed implementation is under optimizer improvements.

@strega-nil

strega-nil commented Mar 13, 2018

@rkruppe It is all implementation defined. Defining asm in a cross-platform way seems... nearly impossible. This would be a question for rustc, not for the memory model. volatile semantics get into codegen on specific platforms, which would be impossible to specify in a useful way, imo. (that's not to say we shouldn't specify it, just that the memory model is not the correct place to specify it)

@hanna-kruppe

hanna-kruppe commented Mar 13, 2018

Defining asm in a cross-platform way is nonsense, sure. But its interactions (edit: or more specifically, the interactions of any loads and stores that may be made from the inline asm) with surrounding loads and stores? That seems perfectly possible (disclaimer: after having thought like two minutes about it).

@strega-nil

@rkruppe what could we define? Beyond it being UB to store to a memory location you don't have mutable access to, or load from a memory location you don't have access to, and it being fine to not do that. Assembler is completely outside the bounds of a reasonable memory model - the best you can do is treat it as any other linked thing, and force the visible effects of the "call" to be valid under the Rust object/memory model.

@hanna-kruppe

hanna-kruppe commented Mar 13, 2018

@ubsan That's pretty much what I was thinking of and it's already something more useful than "it's all implementation defined"! Saying precisely which memory locations the inline asm can and can't affect already tells us in which cases it can't possibly block optimizations, and (looking beyond the scope of clobber) makes quite a few "misuses" of inline asm unambiguously UB rather than being all implementation-defined.

@strega-nil

@rkruppe Yeah, fair. I'd define it similarly to how it'd be defined to call out to C with. For black_box and clobber specifically, since they're hints to the optimizer and not side-effecting in any way, it's not really necessary to define anything wrt them. Perhaps clobber should only be callable on valid objects which are reachable and mutable(?) in the current context? And I don't think it should deal with atomics in any way, shape, or form.

@gnzlbg
Contributor Author

gnzlbg commented Mar 16, 2018

Perhaps clobber should only be callable on valid objects which are reachable and mutable(?) in the current context?

@ubsan mem::clobber() takes no arguments :/

@strega-nil

@gnzlbg oh. Then there's like, literally nothing that clobber does in the memory model, besides being a side effect (maybe?)

@gnzlbg
Contributor Author

gnzlbg commented Mar 16, 2018

It's a side-effect that reads all memory and does something observable with it - but it does not alter any memory that the program can observe.

@hanna-kruppe

[clobber] does not alter any memory that the program can observe

This is off-topic but I keep forgetting to point it out so I'll do it now: that's not what clobbering usually means, and not what asm!(... : "memory") means. (And FWIW it's not what the Google Benchmark function ClobberMemory referenced in the RFC does, either.) I don't know if that's really what one would want for benchmarking, but either change your definition or don't call it clobber.

@strega-nil

@rkruppe clobber means "the optimizer cannot assume that any memory before this point wasn't changed by the call to clobber", no?

@hanna-kruppe

@ubsan When applied to "memory as a whole", yes -- the term is also commonly applied to specific memory locations and to registers. Although in the context of inline asm I wouldn't call it an optimizer hint. But I don't know what @gnzlbg has envisioned for the proposed function in std::mem. (Sorry, I should have laid all this out in the previous post.)

@gnzlbg
Contributor Author

gnzlbg commented Mar 16, 2018

@rkruppe

Yes, sorry for the confusion, the RFC currently gives it the

"the optimizer cannot assume that any memory before this point wasn't changed by the call to clobber"

semantics, so it makes sense to me to call that clobber.

For the purpose of benchmarking, clobber is heavier weight than what's actually needed, and what is needed is actually what I mentioned above:

It's a side-effect that reads all memory and does something observable with it - but it does not alter any memory that the program can observe.

Sorry for calling that clobber, that was a mistake. I don't know what we can call that, maybe mem::read_volatile / mem::read_all / mem::flush .

Sorry again for the confusion.

@gnzlbg
Contributor Author

gnzlbg commented Mar 18, 2018

So what would the std docs be allowed to say about black_box and clobber?

  • black_box: prevents the result value of an expression from being optimized away - no mention of reading the value x, reading all memory, or writing to all memory with side-effects.
  • clobber: inserts an unspecified side-effect - no mention of flushing, reading/writing all memory with side-effects

@strega-nil

honestly, it seems like you could just explain how it hints the optimizer in the std docs.

@RalfJung
Member

RalfJung commented Mar 19, 2018

From an optimization standpoint, if I had to describe these functions in the model, I'd describe them as calls to inherently unknown functions -- functions that the compiler cannot make any assumptions about, and thus has to be maximally pessimistic in terms of optimizations (or, equivalently, functions that just block any sort of interprocedural analysis).

So, I would think that

pub fn foo(...) {
  ...
  clobber();
  ...
}

can be optimized exactly the same way that the following can be optimized

pub fn foo(..., clobber: fn()) {
  ...
  clobber();
  ...
}

Similar for black_box: It's an unknown function that can do anything it wants with the pointer it gets, restricted only by the invariants that the unsafe code guidelines impose on all code (and that we do not fully know yet).

In particular, clobber() could observe and mutate any global state, as well as any state that was passed in by the environment (say foo takes a &Cell<i32>; clobber() could observe and change its value). This gives rise to the behavior described by @gnzlbg in the initial post: clobber() cannot observe or mutate foo's local variables, because an unknown function cannot access those. black_box, when passed a &mut T, can observe and mutate the data behind that pointer like any normal function could.
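
The unknown-function view described here can be made concrete with a small sketch. It assumes a hypothetical global GLOBAL and a hypothetical callee bump_global; foo receives the callee through a function pointer, so it has to treat the call pessimistically with respect to global state, while its own local stays out of the callee's reach.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

// Global state that any function, known or unknown, may observe and mutate.
static GLOBAL: AtomicI32 = AtomicI32::new(0);

// One possible "unknown" function. `foo` cannot tell which callee it gets.
fn bump_global() {
    GLOBAL.fetch_add(1, Ordering::SeqCst);
}

fn foo(clobber: fn()) -> (i32, i32) {
    let local = 7; // its address never escapes, so the callee cannot reach it
    let before = GLOBAL.load(Ordering::SeqCst);
    clobber(); // may observe and mutate GLOBAL
    let after = GLOBAL.load(Ordering::SeqCst); // has to be loaded again
    (local, after - before)
}

fn main() {
    assert_eq!(foo(bump_global), (7, 1));
}
```

The compiler may fold local to the constant 7, but it cannot assume that GLOBAL holds the same value before and after the call.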

However, I think the second example @gnzlbg gave cannot reasonably work:

{
    let tmp = x;
    black_box(&tmp);
    clobber();
}

Here, black_box is a function of type for<'a> fn(&'a T). Such a function is not allowed to mutate through the pointer, and the pointer must not be used again once the function returns. This is because we want to enable the following optimization without interprocedural analysis:

{
  let tmp = 3;
  bar(&tmp);
  let tmp2 = tmp;
  baz();
  return tmp == tmp2; // optimized to true (tmp will stay 3 all the time)
}

Imagine bar is black_box and baz is clobber -- the optimization is universally valid, so it must still be valid for these concrete choices.

So, I agree with @ubsan when they say

Yeah, fair. I'd define it similarly to how it'd be defined to call out to C with.

Calling C via FFI is another example of calling an unknown function.

The funny thing is that in C, if I follow the same rules I laid out above, I do get the behavior that @gnzlbg wants for black_box:

int tmp = 3;
bar(&tmp);
int tmp2 = tmp;
baz(); // bar could have stored the pointer to `tmp` somewhere accessible to baz, so this can change `tmp`
return (tmp == tmp2); // not optimizable

The reason for this difference is that Rust's stronger type system enables the optimizer to reason about aliasing without interprocedural analysis, making optimizations of pointer operations much simpler than they are in C. That's the entire point of the unsafe code guidelines team's struggle with the aliasing rules. Mostly, this is beneficial, but if you want to inhibit optimizations of course this becomes a problem.

Depending on how the rules around raw pointers work out, it may be that calling black_box(&mut x as *mut _) has the effect you are looking for, maybe together with making black_box unsafe (because otherwise it couldn't actually do anything with its raw pointer). I am pretty sure we want to have a strong stanza on "no mutation through shared references", so I would be very strongly opposed to black_box(&x) possibly "writing" to x.

Also, @gnzlbg what do you think about other cases that concern Rust's aliasing rules? For example

fn foo2(x: &i32) -> bool {
  let y = *x;
  clobber(); // cannot possibly write to x as we have a shared reference
  y == *x // optimized to true
}

Again, a C compiler would have to assume that clobber can have access to an alias of x and re-load *x, but Rust can do better because of its stronger types.
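
To illustrate the foo2 reasoning, here is a minimal sketch that contrasts a plain shared reference with shared interior-mutable data. The closure parameter unknown stands in for clobber, and foo2_cell is a hypothetical variant added only for comparison.

```rust
use std::cell::Cell;

// With a plain shared reference, no callee can legally change `*x` while the
// reference is live, so the comparison may be folded to `true`.
fn foo2(x: &i32, unknown: &dyn Fn()) -> bool {
    let y = *x;
    unknown();
    y == *x
}

// With interior mutability the callee may hold an alias and mutate through
// it, so the value has to be read again after the call.
fn foo2_cell(x: &Cell<i32>, unknown: &dyn Fn()) -> bool {
    let y = x.get();
    unknown();
    y == x.get()
}

fn main() {
    let plain = 1;
    assert!(foo2(&plain, &|| {}));

    let cell = Cell::new(1);
    assert!(!foo2_cell(&cell, &|| cell.set(2))); // the callee changed the value
    assert!(foo2_cell(&cell, &|| {})); // the callee left it alone
}
```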

@strega-nil
Copy link

@RalfJung I mostly agree, but slightly disagree about the way the compiler gets to that point. Specifically, I don't think it's to do with the function returning - I don't think it's invalid for bar to store the pointer to tmp anywhere. However, once you read from tmp, the borrow that that pointer has is "broken", and so you can't read from or write to the tmp object without invoking undefined behavior.

@gnzlbg
Contributor Author

gnzlbg commented Mar 20, 2018

@RalfJung your examples are great, thank you!

If

  • black_box is an unknown function with signature unsafe fn black_box(x: *mut T);
  • clobber is an unknown function with signature unsafe fn clobber();

then:

unsafe {
  let mut tmp = 3;
  black_box(&mut tmp as *mut _);
  let tmp2 = tmp;  // A
  clobber();
  return tmp == tmp2; // B
}

are the following statements:

  • at A tmp2 isn't necessarily equal to 3 because black_box might have modified tmp,
  • at B tmp == tmp2 isn't necessarily true because black_box might have stored a pointer to tmp somewhere that clobber might have written to

true, and is this behavior reasonable to expect from the memory models that have been explored in this repo?

Also, @gnzlbg what do you think about other cases that concern Rust's aliasing rules?

I think that clobber and black_box must behave within the memory model.

That is, for the foo2 example clobber wouldn't be able to write to the memory of x, and I don't think the example can be "fixed" to allow that.

For example, one could pass &x to black_box, so that black_box can write &x to memory. That's ok because black_box does not modify *x. However, this enables clobber to modify *x, violating the aliasing rules, and resulting in undefined behavior:

fn foo3(x: &i32) -> bool {
  unsafe {
      black_box(x as *const _ as *mut _);  // OK: writes &mut x to memory 
      let y = *x;
      clobber();  // modifies *x via *&mut -> UB
      y == *x 
  }
}

However, for writing benchmarks, we don't need the clobber semantics of reading and writing all memory. We could define a function similar but not equal to clobber with the following semantics:

  • unsafe fn mem::global_side_effect();: unknown function that reads all memory and performs a side-effecting operation - does not write to any memory.

which we can then use:

fn foo4(x: &mut i32) -> bool {
  unsafe {
      black_box(x as *mut _);  // OK: writes &mut x to memory 
      *x = 2;
      global_side_effect();  // OK: reads *x via *&mut but does not modify it. 
      // ^^ forces the compiler to emit a store for *x = 2
      2 == *x   // compiler can optimize this to true
  }
}

@RalfJung
Member

RalfJung commented Mar 20, 2018

@ubsan

I mostly agree, but slightly disagree about the way the compiler gets to that point. Specifically, I don't think it's to do with the function returning - I don't think it's invalid for bar to store the pointer to tmp anywhere. However, once you read from tmp, the borrow that that pointer has is "broken", and so you can't read from or write to the tmp object without invoking undefined behavior.

Ah, this is "check only on usage" vs. "check all the time" again. The difference between our two models becomes apparent in the following code:

{
  let mut tmp = 3;
  bar(&tmp);
  baz();
  tmp = 4;
}

Assuming baz does not panic, I think we are allowed to move the write tmp = 4 up across baz() because baz cannot possibly observe the value of tmp. Your model prohibits that optimization because baz would be allowed to read a pointer leaked by bar.


at A tmp2 isn't necessarily equal to 3 because black_box might have modified tmp,

Yes that sounds reasonable. This is behavior that (I think) we would want to allow unsafe code to have.

at B tmp == tmp2 isn't necessarily true because black_box might have stored a pointer to tmp somewhere that clobber might have written to

This is a subtle one. Already small variants of this should probably not be allowed. Consider

unsafe {
  let mut tmp = 3;
  black_box(&mut tmp as *mut _);
  let tmp_ptr = &mut tmp;
  *tmp_ptr = 3;
  clobber();
  return *tmp_ptr; // Can be optimized to return 3
}

The reasoning here would be that by writing &mut tmp, you are asserting that you actually have full control of this location. Of course, this could be an auto-borrow, which makes this rather subtle. But on the other hand, unless we want to special case unsafe code in our semantics (which comes with its own host of problems), this code is indistinguishable from code that we certainly want to optimize.
(Btw, if I just showed you that code, would you think clobber is allowed to write to tmp_ptr or not?)

For your concrete code, I am not sure. I am worried people will expect it to work, which means we better make it work. But see below for why that's rather funny.

I think that clobber and black_box must behave within the memory model.

That makes me happy =D

unsafe fn mem::global_side_effect();: unknown function that reads all memory and performs a side-effecting operation - does not write to any memory.

So this would be a function that can do strictly less than clobber? It has the same signature, and clobber is the "most unknown function possible" with that signature (I think).

fn foo4(x: &mut i32) -> bool {
  unsafe {
      black_box(x as *mut _);  // OK: writes &mut x to memory 
      *x = 2; // (A)
      global_side_effect();  // OK: reads *x via *&mut but does not modify it.  (B)
      // ^^ forces the compiler to emit a store for *x = 2
      2 == *x   // compiler can optimize this to true (C)
  }
}

Ah, this is an interesting one, actually not too dissimilar from my example above. It is not at all clear whether global_side_effect should be allowed to read x. After all, foo4 has a mutable (and hence unique) borrow of x that it uses right before and after calling global_side_effect. These uses signal that the borrow is valid at these points, and the reference is not "escaped" by turning it into a raw pointer between these points. This indicates to the compiler that it can rely on the aliasing information.

This is my understanding of (some variant of) the ACA (asserting-conflicting-access) model: (A) asserts that the reference is valid here, global_side_effect creates a conflict at (B), and then the final access at (C) completes the "UB witness". I am not convinced this is the model we want, but it seems plausible as a "lower bound" --- this is probably the least we need to perform useful optimizations.

The take-away is that the Rust memory model is pretty strict about what memory you can even read, and if you want global_side_effect above to be able to read x, you better explicitly give it access. Storing a pointer (like black_box would here) and then using it later is only legal under pretty severe restrictions -- essentially the reference this pointer came from must not be used at all between these two accesses. (A safe API would reborrow the original pointer, enforcing this restriction statically.)

The first example in your post skirts the edge of an ACA violation: Instead of using a reference, it uses the original variable again. It seems strange to let the rules be any different then -- a very surprising subtle difference. So maybe, if your example should be okay (as in, clobber is allowed to mess with tmp), then my example in this post should also permit clobber to mutate tmp_ptr as the two are too similar. But then if we change things to

{
  let mut tmp = 3;
  safe_black_box(&mut tmp as *mut _);
  let tmp2 = tmp;  // A
  safe_clobber();
  return tmp == tmp2; // B
}

We most certainly do want to optimize this to return true. So, if we want your suggested behavior of clobber to be legal, we kind of have to make unsafe significant in some way.

Well, this thread has certainly led to some very interesting examples, and some concrete questions that, once answered, will provide much clearer boundaries to the model. :)

@gnzlbg
Contributor Author

gnzlbg commented Mar 20, 2018

(Btw, if I just showed you that code, would you think clobber is allowed to write to tmp_ptr or not?)

Calling clobber would write to tmp_ptr (this is just what clobber does), but whether this behavior is defined or not, I don't know. The whole point of being able to get a *mut T from a &mut T is to be able to write to the value via the pointer. That is, the moment one obtains a *mut T there are multiple mutable aliases to a value. Here black_box is storing the pointer somewhere, and clobber is mutating the value through it. I guess it would boil down to whether the following is valid or not:

unsafe {
   let mut tmp = 3;
   let x = &mut tmp;
   let y = x as *mut _;  // raw pointer derived from the reference x
   let z = y; 
   *x = 2;
    {
        *y = 3;
        *z = 4;
    }
   *x == 2;  // is the compiler allowed to optimize this check away here?
}

It might also be that things are valid in this case because it happens "within a function", but not in the clobber/black_box cases because it would happen "across functions".

The take-away is that the Rust memory model is pretty strict about what memory you can even read, and if you want global_side_effect above to be able to read x, you better explicitly give it access. Storing a pointer (like black_box would here) and then using it later is only legal under pretty severe restrictions -- essentially the reference this pointer came from must not be used at all between these two accesses. (A safe API would reborrow the original pointer, enforcing this restriction statically.)

This is new to me (and makes sense). I think that answering whether clobber is able to write to the pointer or not is key for the RFC. If we require having to give access to clobber, then that makes it less useful and probably we can just call black_box twice (instead of black_box + clobber) to achieve a similar effect.

@hanna-kruppe

hanna-kruppe commented Mar 20, 2018

Hold on, why should clobber ever have UB? That seems like a huge footgun with precisely zero advantages. If an access to some memory location would trigger UB, clobber should not access it.

@gnzlbg
Contributor Author

gnzlbg commented Mar 20, 2018

@rkruppe IIUC, to avoid undefined behavior we would need to state that clobber does not write to memory that has live shared (&) aliases to it (though it could read from this memory), and that it neither reads from nor writes to memory that has live &mut aliases to it.

@hanna-kruppe

Yeah, to avoid UB from accessing all memory it needs to not access all memory. What's your point?

(What precisely a sufficient criterion would be I don't know for sure, I'll defer to @ubsan and @RalfJung.)

@hanna-kruppe

Actually, I don't think the docs of clobber need to say anything special about which memory it accesses. The optimizer already assumes programs don't have UB, including by accessing memory in ways they're not allowed to. That's pretty much what UB means to optimizers, after all. We just need to not specify or imply that clobber triggers UB.

@gnzlbg
Contributor Author

gnzlbg commented Mar 20, 2018

What's your point?

Doesn't that make it practically useless? For the original case of writing a benchmark for Vec::push it would not work anymore AFAICT:

fn foo() -> Duration {
    let mut vec = Vec::with_capacity(N);
    black_box(vec.as_ptr()); // writes a *const i32 (the buffer pointer) to memory
    let start = now();
    for i in 0..N {
        vec.push(0);
        clobber(); // A
    }
    (now() - start) / N
}

At A, clobber won't read the memory owned by the vector because the vector holds a &mut reference to it, so the whole loop could be optimized away.


The following might work, but at this point, at least for benchmarking purposes, we wouldn't need clobber anymore:

fn foo() -> Duration {
    let mut vec = Vec::with_capacity(N);
    black_box(vec.as_ptr()); // writes a *const i32 (the buffer pointer) to memory
    let start = now();
    for i in 0..N {
        vec.push(0);
        vec = black_box(vec);
        // black_box(&mut vec[vec.len() - 1]);
        // or similar
    }
    (now() - start) / N
}

@hanna-kruppe

I don't see how a clobber that can cause UB changes that, since the optimizer already assumes UB is not triggered (i.e., those memory locations are not accessed). You're probably right that this makes it useless, but that's because the strong optimization blocking effect is inherently incompatible with the strong optimization enabling effect of Rust's aliasing rules :P

@RalfJung
Member

RalfJung commented Mar 20, 2018

I probably also picked confusing terminology... I've been in the mental mode of clobber as some kind of "adversarial" unknown function that performs as much access as it is allowed to; this is exactly the mode the compiler has to operate in when dealing with an unknown function. It hence seems feasible to make the compiler treat clobber the same way. This is not adding a new feature to the compiler, and in fact when constructing examples for miscompilations, it is not uncommon to follow the approach I described above and have some function pointers sitting around to inhibit optimizations. (Then we just have to make sure that the main function that actually instantiates the function pointers is translated successfully.) It is my understanding that clobber/black_box try to achieve the same but without going through the effort of adding function pointers and using multiple translation units.

So, when I asked "would you think clobber is allowed", I implicitly assumed that clobber just does the worst possible thing that does not violate the memory model. IOW, I consider clobber to already be fully defined -- it has the same effect on optimizations as an unknown function call, and performs all side-effects that such a call could perform. But this is just defining clobber in terms of another unknown, namely "what is an unknown function call allowed to do, and more importantly, what is it not allowed to do". The purpose of this discussion, and indeed the entire unsafe code guidelines effort, is to answer that question.

The whole point of being able to get a *mut T from a &mut T is to be able to write to the value via the pointer. That is, the moment one obtains a *mut T there are multiple mutable aliases to a value.

Absolutely. And as long as you only use raw pointers, you are fine. The open question is what happens when you mix-and-match accesses through references (that are supposed to come with strong aliasing guarantees) and accesses through raw pointers.
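
A small sketch of the uncontroversial case, where every access goes through raw pointers: leak and write_through_stash are hypothetical stand-ins for black_box and clobber, and STASH is a hypothetical global that remembers the leaked pointer. The local variable itself is never touched while the raw pointer is in use, which is the assumption that keeps this on the safe side of the open question.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// Hypothetical global slot where the leaked pointer is remembered.
static STASH: AtomicPtr<i32> = AtomicPtr::new(ptr::null_mut());

// Stand-in for black_box: stores the raw pointer somewhere global.
fn leak(p: *mut i32) {
    STASH.store(p, Ordering::SeqCst);
}

/// Stand-in for clobber: writes through whatever pointer was leaked.
///
/// # Safety
/// The stashed pointer must be null or valid for writes.
unsafe fn write_through_stash(value: i32) {
    let p = STASH.load(Ordering::SeqCst);
    if !p.is_null() {
        unsafe { *p = value };
    }
}

fn raw_only() -> (i32, i32) {
    let mut tmp = 3;
    let p: *mut i32 = &mut tmp; // from here on, only `p` and its copies are used
    leak(p);
    let before = unsafe { *p };
    unsafe { write_through_stash(4) };
    let after = unsafe { *p }; // sees the write made through the stashed copy
    STASH.store(ptr::null_mut(), Ordering::SeqCst); // do not leave a dangling pointer
    (before, after)
}

fn main() {
    assert_eq!(raw_only(), (3, 4));
}
```

Replacing either read of *p with a direct read of tmp would turn this into the mixed reference and raw pointer case that the thread leaves open.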

Concerning @gnzlbg's Vec example, I think this is actually a prime example of code we would like to optimize; the bounds checks in vec.push should be removed after inlining.

The following might work, but at this point, at least for benchmarking purposes, we wouldn't need clobber anymore:

Yes, giving away a mutable reference to the vector on each round through the loop is the way to go.

Is there any fundamental problem with having only black_box, not clobber? Also, isn't clobber() equivalent to black_box(0)?

the strong optimization blocking effect is inherently incompatible with the strong optimization enabling effect of Rust's aliasing rules :P

Yes, that's really the key of the issue here.

@gnzlbg
Contributor Author

gnzlbg commented Mar 20, 2018

Is there any fundamental problem with having only black_box, not clobber? Also, isn't clobber() equivalent to black_box(0)?

There aren't any fundamental problems AFAICT.

clobber as in "reads and writes to all memory" is equivalent to black_box(0) as in "reads and writes all memory and writes 0 to memory".

However, all the limits discussed about clobber apply to black_box(x) as well. So the question is then what is black_box allowed to read and write. It probably is just an unknown function that takes x as an argument. Because of this, it can be used to "clobber" a single memory location, but not all memory. This definition is already useful and also points out how to use black_box correctly. I'll update the RFC to propose only black_box defined like this.


Concerning @gnzlbg's Vec example, I think this is actually a prime example of code we would like to optimize; the bounds checks in vec.push should be removed after inlining.

FWIW, LLVM currently optimizes that example to (now() - now()) / N. The function instantiates a vector, which allocates memory in the free store, writes to it, reads from it without side-effects, and frees it. So LLVM can remove the writes and thus the loop, and thus the vector, and all that's left is: (now() - now()) / N.

Here LLVM should always be able to completely remove the loop. Whether the memory allocation should be elidable or not is a completely different topic.

@RalfJung
Member

However, all the limits discussed about clobber apply to black_box(x) as well.

Absolutely.

So the question is then what is black_box allowed to read and write. It probably is just an unknown function that takes x as an argument. Because of this, it can be used to "clobber" a single memory location, but not all memory.

Right, so that much is uncontroversial -- if black_box takes a &mut, that is. If it takes just &, like in your example at the beginning of this issue, it could not even write to the location itself (unless interior mutability is involved).

So, a minimal safe API could be just having fn black_box<T>(x: &mut T). The memory model certainly has no reservations about this function observing and mutating x. In your vector example, I expect calling black_box(&mut vec) on each loop iteration helps?
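
As a sketch of that suggestion, the Vec::push benchmark could hand out a mutable reference on every iteration. This version assumes the now-stable std::hint::black_box rather than the function proposed in the RFC, and bench_push is a hypothetical name. Note that std documents black_box as best-effort only, so this is an illustration rather than a guarantee that the loop survives optimization.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Times `n` pushes into a pre-allocated vector and returns the elapsed time
// together with the final length.
fn bench_push(n: usize) -> (Duration, usize) {
    let mut vec: Vec<u32> = Vec::with_capacity(n);
    let start = Instant::now();
    for _ in 0..n {
        vec.push(0);
        // Give the "unknown function" mutable access to the vector, so the
        // optimizer is asked to treat its contents as observed and possibly
        // modified on every iteration.
        black_box(&mut vec);
    }
    (start.elapsed(), vec.len())
}

fn main() {
    let (elapsed, len) = bench_push(1_000);
    assert_eq!(len, 1_000);
    println!("{} pushes took {:?}", len, elapsed);
}
```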

If you want, you could include additional language saying that this function may also observe "leaked" shared data (&T), as well as observe and mutate "leaked" shared interior mutable data (&Cell<T>) -- even if that data is not passed to the function. "leaked" here means that the environment could have a shared reference to the data (whose lifetime is still valid!), for example because it was an argument to the function. Or is that not even something you want it to do?

FWIW, LLVM currently optimizes that example to (now() - now()) / N.

😂 (I assume that's without using clobber/black_box)

Okay, I was talking in a context that actually uses the vector later. The loop should not have to do a size check on each round as the vector was allocated with enough capacity. That's one of the advanced optimizations benchmarks @gankro would like us to get right. ;)

@RalfJung RalfJung transferred this issue from rust-lang/rust-memory-model Nov 29, 2021
@JakobDegen JakobDegen changed the title semantics of black_blox and clobber in the memory model semantics of black_box and clobber in the memory model Apr 25, 2023
@JakobDegen
Contributor

Closing - black_box is stable and the semantics are relatively well understood. There are possible alternatives in the future, tracked in separate issues.

@RalfJung
Member

To be clear, the semantics of the stable function are a NOP in the memory model.

Specifying a 'guaranteed' black_box remains a fun topic of conversation but not something we have to track.
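
A tiny sketch of what "NOP in the memory model" means in practice, assuming the stable std::hint::black_box: the function returns its argument unchanged, so program results never depend on it, and only the optimizer's freedom is affected. sum_to is a hypothetical helper used for illustration.

```rust
use std::hint::black_box;

// Hypothetical workload: sum of the integers from 1 to n.
fn sum_to(n: u64) -> u64 {
    (1..=n).sum()
}

fn main() {
    // Hiding the input discourages constant folding of the whole computation;
    // hiding the output discourages removing it as dead code.
    let n = black_box(10);
    let total = black_box(sum_to(n));
    assert_eq!(total, 55); // same result as without black_box
}
```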
