Document any Memory model/visibility changes when executing on X64 Linux/Mac #4906
@tzwlai: A strict (formal) definition of the CLR and the CLR memory model, plus implementation details, would be even better. Usually, the stricter the definition, the more possibility you have to optimize the code for specific restricted cases (e.g. JMM-style barrier elimination). Semantics example (the memory model not yet integrated): Memory model and barrier elimination example:
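The examples originally linked here did not survive in this thread, but the kind of barrier elimination mentioned above can be sketched generically. The class and method names below are invented for illustration, and the comments describe what a compiler operating under a stricter formal model could in principle prove, not what the CLR JIT actually does:

```csharp
using System.Threading;

class BarrierEliminationSketch
{
    static int _a, _b;

    static void Publish()
    {
        // A naive code generator emits ordering constraints for each
        // volatile (release) store independently:
        Volatile.Write(ref _a, 1);   // release store
        Volatile.Write(ref _b, 2);   // release store

        // Under a strict formal model, a compiler could prove that on
        // x86/x64 (TSO) ordinary stores already have release semantics,
        // so no extra fences are needed here at all; on weaker hardware
        // (e.g. ARM) two adjacent release barriers could potentially be
        // coalesced into one. The stricter the model, the more such
        // eliminations can be justified.
    }
}
```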
@clrjunkie "Future" is just our way of saying "after March".
The IL memory model is described in the CLI spec: http://www.ecma-international.org/publications/standards/Ecma-335.htm. I agree that it would be nice to have a more formal writeup. Nothing has changed about the IL memory model for the x64 Linux/Mac port.
@jkotas Thanks. I think this needs to be better communicated to the community as well as to MS devs, preferably on the CoreCLR GitHub repo front page. I noticed a piece of code in .NET Core that included a comment about a possible change in behavior in this area; if I recall correctly, the comment justified the use of Volatile.Read as a means to ensure consistent semantics across platforms other than Windows (which, by the way, is what triggered me to open this issue). I'm leaving this issue open so others know about this, but I suggest that your statement above, and the reference to the CLI spec, be published on a more visible page, even before a formal writeup.
Thanks. |
Could you please share a link to this piece of code?
Yes, I am aware of the five-page "I.12.6 Memory model and optimizations" section of the ECMA-335 spec. CarolEidt's comment from https://github.com/dotnet/coreclr/issues/422: "As far as I know there is no formal definition for the CLR memory model, though this blog post from some years ago describes the de facto model pretty accurately, I believe: http://joeduffyblog.com/2007/11/10/clr-20-memory-model/. A couple of additional points: First, the JIT is somewhat more conservative than Joe describes in his post. That is, I don't think it will eliminate stores to "non-local" fields even if they are "adjacent" (i.e. not separated by any calls or aliased references that could be accessing them). By "non-local" I mean any location that may be visible to other threads. Second, the JIT is also somewhat limited in its optimizations, which means that even the load will not be eliminated because it would require a loop transformation to enregister the counter prior to the loop." Joe Duffy's comment from the same page: "We have constructed our model over years of informal work and design-by-example, but something about the JMM approach is far more attractive. Lastly, what I've described applies to the implemented memory model, and not to what was specified in ECMA."
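The kind of JIT behavior being discussed above, conservative treatment of loads from "non-local" fields, can be illustrated with a classic polling loop. This is an invented sketch, not code from the thread; whether the load of `_stop` is re-executed on every iteration, or hoisted out of the loop by an aggressive compiler, is precisely the kind of question an informal memory model leaves open:

```csharp
using System;
using System.Threading;

class PollingLoopSketch
{
    static bool _stop;              // deliberately NOT volatile

    static void Worker()
    {
        int spins = 0;
        // _stop is visible to other threads ("non-local"). A compiler
        // licensed to eliminate this load could enregister it before
        // the loop and spin forever even after another thread sets it.
        // Per the comments quoted above, the CLR JIT has historically
        // been conservative here, but only `volatile` (or Volatile.Read)
        // makes the re-read a guarantee rather than an accident.
        while (!_stop)
        {
            spins++;
        }
        Console.WriteLine(spins);
    }

    static void Main()
    {
        var t = new Thread(Worker);
        t.Start();
        Thread.Sleep(10);
        _stop = true;               // may or may not be observed without volatile
        t.Join();
    }
}
```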
😕 |
@jkotas I don't quite remember the filename...
I ran into this paper from POPL 2016 that may help to build a (formal) concurrency semantics and memory model: Jean Pichon-Pharabod, Peter Sewell: "A concurrency semantics for relaxed atomics that permits optimisation and avoids thin-air executions". "Despite much research on concurrent programming languages, especially for Java and C/C++, we still do not have a satisfactory definition of their semantics, one that admits all common optimisations without also admitting undesired behaviour. Especially problematic are the "thin-air" examples involving high-performance concurrent accesses, such as C/C++11 relaxed atomics. The C/C++11 model is in a per-candidate-execution style, and previous work has identified a tension between that and the fact that compiler optimisations do not operate over single candidate executions in isolation; rather, they operate over syntactic representations that represent all executions. In this paper we propose a novel approach that circumvents this difficulty. We define a concurrency semantics for a core calculus, including relaxed-atomic and non-atomic accesses, and locks, that admits a wide range of optimisation while still forbidding the classic thin-air examples. It also addresses other problems relating to undefined behaviour. The basic idea is to use an event-structure representation of the current state of each thread, capturing all of its potential executions, and to permit interleaving of execution and transformation steps over that to reflect optimisation (possibly dynamic) of the code. These are combined with a non-multi-copy-atomic storage subsystem, to reflect common hardware behaviour. The semantics is defined in a mechanised and executable form, and designed to be implementable above current relaxed hardware and strong enough to support the programming idioms that C/C++11 does for this fragment. It offers a potential way forward for concurrent programming language semantics, beyond the current C/C++11 and Java models."
@jkotas Is it true that in .NET Framework/Core all assignments to shared variables within a given thread are guaranteed to be visible to all other threads that entered the running state after the assignment took place? For example, would these two programs produce consistent results when run on a multi-core machine? Example 1:
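The original code of Example 1 is not preserved in this thread. The following is a hedged reconstruction based on the surrounding discussion (a shared string is assigned before the printing threads are started, and the reads are guarded by a lock, the lock whose necessity is questioned later). All identifiers are invented:

```csharp
using System;
using System.Threading;

class Example1
{
    static string _message;                    // shared, non-volatile
    static readonly object _gate = new object();

    static void Main()
    {
        _message = "hello";                    // assignment BEFORE threads start

        for (int i = 0; i < 4; i++)
        {
            new Thread(() =>
            {
                lock (_gate)                   // is this needed for visibility?
                {
                    Console.WriteLine(_message);
                }
            }).Start();
        }
    }
}
```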
Example 2:
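Example 2's code is likewise missing; judging by the reply that mentions ManualResetEvent, it presumably resembled this reconstruction, in which the threads are started first, the assignment happens afterwards, and an event releases the printing threads (identifiers invented):

```csharp
using System;
using System.Threading;

class Example2
{
    static string _message;                    // shared, non-volatile
    static readonly object _gate = new object();
    static readonly ManualResetEvent _ready = new ManualResetEvent(false);

    static void Main()
    {
        for (int i = 0; i < 4; i++)
        {
            new Thread(() =>
            {
                _ready.WaitOne();              // threads exist BEFORE the write
                lock (_gate)                   // is this lock still needed?
                {
                    Console.WriteLine(_message);
                }
            }).Start();
        }

        _message = "hello";                    // assignment AFTER threads start
        _ready.Set();                          // release the printing threads
    }
}
```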
If so, is that guarantee (with respect to these particular examples) made implicitly by the underlying OS or by the CLR? Is the lock necessary to prevent the reordering of the string assignment from happening before the "printing threads" (re)enter the running state?
All classic .NET synchronization primitives (ManualResetEvent, ...) have an implicit memory barrier, so your second example can be simplified to this and still work fine:
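The simplified snippet referred to here was not preserved. A plausible reconstruction, dropping the lock and relying on the event's implicit barrier (the `WaitOne`/`Set` pair acts as an acquire/release pairing that publishes the write), would be:

```csharp
using System;
using System.Threading;

class Example2Simplified
{
    static string _message;
    static readonly ManualResetEvent _ready = new ManualResetEvent(false);

    static void Main()
    {
        for (int i = 0; i < 4; i++)
        {
            new Thread(() =>
            {
                _ready.WaitOne();               // implicit barrier on wake-up
                Console.WriteLine(_message);    // no lock needed for visibility
            }).Start();
        }

        _message = "hello";
        _ready.Set();                           // implicit barrier publishes _message
    }
}
```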
Your first example is about whether creating a new thread has an implicit memory barrier. I think it is reasonable to expect that it does: it does in the current implementation, and I do not think that is ever going to change. I do not think it is documented anywhere, though. cc @stephentoub
Does that mean that in the current CLR implementation I can remove the lock in the first example also? |
I think so. |
Thanks @jkotas! I'd highly appreciate a ping in case you gain any new insight about this.
@jkotas could you please clarify the rules of the memory model used by .NET (Core) today? You said that
There's some information on the web about a so-called ".NET / CLR 2.0 memory model" (e.g. here); most of the time people are referring to this article (link from the Web Archive; I hope it won't expire). The article claims that there's a ".NET Framework 2.0" memory model that offers more guarantees than the plain CLI specification. That confuses me a lot: can we actually rely on that model across platforms (i.e. in .NET Core, Mono, Xamarin, etc.)? Or are the only guarantees we have the ones given by the CLI spec, meaning we shouldn't rely on the stronger semantics that the informal .NET 2.0 memory model offers?
I do not think so. E.g., I do not think that every guarantee described there is contractual across implementations. The simple rule is: use volatile for anything lock-free.
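As a concrete illustration of that rule (an invented sketch, not code from the thread): the ECMA-335 volatile semantics make a volatile write a release store and a volatile read an acquire load, which is enough for safe publication without a lock:

```csharp
using System;

class LockFreePublication
{
    static int _value;
    static volatile bool _published;   // volatile: release store / acquire load

    static void Writer()
    {
        _value = 42;                   // ordinary store
        _published = true;             // release: _value cannot move below this
    }

    static void Reader()
    {
        if (_published)                // acquire: the read of _value below
        {                              // cannot move above this load
            Console.WriteLine(_value); // guaranteed to observe 42
        }
    }
}
```

This is exactly the pattern the CLI spec does guarantee, so relying on it is portable across .NET Core, Mono, Xamarin, etc., unlike the stronger informal ".NET 2.0" folklore.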
Closing as there hasn't been any recent discussion here. |
For example: Are all writes volatile? Does lock still generate a full barrier?