Document any Memory model/visibility changes when executing on X64 Linux/Mac #4906

clrjunkie · 2016-01-07T15:33:42Z

For example: Are all writes volatile? Does lock still generate a full barrier?

clrjunkie · 2016-01-07T17:49:14Z

@tzwlai
Hope future will arrive soon, otherwise it would be hard to right correct code..

zpodlovics · 2016-01-07T19:31:04Z

A strict (formal) definition of CLR and the CLR memory model, and implementation details would be even better. Usually, the more strict the definition the more possiblity you have to optimize the code for specific restricted cases (eg.: jmm barrier elimination).

Semantics example (the memory model not yet integrated):
https://github.com/kframework/jvm-semantics
http://fsl.cs.illinois.edu/index.php/K-Java:_A_Complete_Semantics_of_Java

Memory model and barrier elimination example:
http://gee.cs.oswego.edu/dl/jmm/cookbook.html
http://shipilev.net/blog/2014/jmm-pragmatics/
http://www.cl.cam.ac.uk/~pes20/weakmemory/index.html

tzwlai · 2016-01-07T23:24:27Z

@clrjunkie "Future" is just our way of saying "after March".

CC @jkotas @sergiy-k

jkotas · 2016-01-08T00:31:51Z

The IL memory model is described in the CLI spec http://www.ecma-international.org/publications/standards/Ecma-335.htm. I agree that it would be nice to have more formal writeup.

Nothing have changed about the IL memory model for x64 Linux/Mac port.

clrjunkie · 2016-01-08T10:01:09Z

@jkotas Thanks. I think this needs to be better communicated to the community as well as to MS Devs, preferably on the CoreClr GitHub Repo front-page. I noticed a piece of code in .NET Core that included a code comment about possible change in behavior in this area and if I recall correctly the code comment was about justifying the use of Volatile.Read as a means to ensure consistent semantics across different platforms other than Windows (which is by the way what triggered me to open this issue).

I'm leaving this issue open so others know about this, but I suggest that your above statement and reference to the CLI spec be published to a more visible page, even before a formal writeup.

clrjunkie · 2016-01-08T10:52:25Z

@tzwlai

"Future" is just our way of saying "after March".

Thanks.

jkotas · 2016-01-08T12:55:42Z

noticed a piece of code in .NET Core that included a code comment about possible change in behavior

Could you please share link to this piece of code?

zpodlovics · 2016-01-08T13:19:23Z

Yes, I am aware of the ECMA 335 spec five page "I.12.6 Memory model and optimizations" section.

CarolEidt comment from https://github.com/dotnet/coreclr/issues/422

"As far as I know there is no formal definition for the CLR memory model, though this blog post from some years ago describes the de facto model pretty accurately, I believe: http://joeduffyblog.com/2007/11/10/clr-20-memory-model/. A couple of additional points:

First, the JIT is somewhat more conservative than Joe describes in his post. That is, I don't think it will eliminate stores to "non-local" fields even if they are "adjacent" (i.e. not separated by any calls or aliased references that could be accessing them). By "non-local" I mean any location that may be visible to other threads.

Second, the JIT is also somewhat limited in its optimizations, which means that even the load will not be eliminated because it would require a loop transformation to enregister the counter prior to the loop."

Joe Duffy comment from the same page:

"We have constructed our model over years of informal work and design-by-example, but something about the JMM approach is far more attractive. Lastly, what I've described applies to the implemented memory model, and not to what was specified in ECMA."

clrjunkie · 2016-01-08T14:01:27Z

"We have constructed our model over years of informal work and design-by-example, but something about the JMM approach is far more attractive. Lastly, what I've described applies to the implemented memory model, and not to what was specified in ECMA."

😕

clrjunkie · 2016-01-08T15:07:33Z

@jkotas I don't quite remember the filename..

zpodlovics · 2016-01-09T18:08:46Z

I ran into this paper from POPL2016 that may help to build a (formal) concurrency semantics and memory model:

Jean Pichon-Pharabod, Peter Sewell: A concurrency semantics for relaxed atomics that permits optimisation and avoids thin-air executions

"Despite much research on concurrent programming languages, especially for Java and C/C++, we still do not have a satisfactory definition of their semantics, one that admits all common optimisations without also admitting undesired behaviour. Especially problematic are the ``thin-air'' examples involving high-performance concurrent accesses, such as C/C++11 relaxed atomics. The C/C++11 model is in a per-candidate-execution style, and previous work has identified a tension between that and the fact that compiler optimisations do not operate over single candidate executions in isolation; rather, they operate over syntactic representations that represent all executions.

In this paper we propose a novel approach that circumvents this difficulty. We define a concurrency semantics for a core calculus, including relaxed-atomic and non-atomic accesses, and locks, that admits a wide range of optimisation while still forbidding the classic thin-air examples. It also addresses other problems relating to undefined behaviour.

The basic idea is to use an event-structure representation of the current state of each thread, capturing all of its potential executions, and to permit interleaving of execution and transformation steps over that to reflect optimisation (possibly dynamic) of the code. These are combined with a non-multi-copy-atomic storage subsystem, to reflect common hardware behaviour.

The semantics is defined in a mechanised and executable form, and designed to be implementable above current relaxed hardware and strong enough to support the programming idioms that C/C++11 does for this fragment. It offers a potential way forward for concurrent programming language semantics, beyond the current C/C++11 and Java models."

https://www.cl.cam.ac.uk/~jp622/popl16-thinair/

clrjunkie · 2017-08-02T10:27:51Z

@jkotas Is it true that in .NET FW/Core all assignments to shared variables within a given thread are guaranteed to be visible to all other threads who entered the Running State after the assignment took place?

For example, would these two programs produce consistent results when run on a multi-core.

Example 1:

    class Program
    {
        static String message;
        static object _lock = new object();
        static CountdownEvent cde = new CountdownEvent(1);

        static void Main(string[] args)
        {
            lock (_lock)
            {
                message = "Hello World!";
            }

            new Thread(ThreadEntryPoint).Start();

            cde.Wait();
        }

        static void ThreadEntryPoint(object o)
        {
            Console.WriteLine(message);
            cde.Signal();
        }
    }

Example 2:

    class Program
    {
        static String message;
        static object _lock = new object();
        static CountdownEvent cde = new CountdownEvent(2);
        static ManualResetEvent mre = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            Thread t1 = new Thread(ThreadEntryPoint);
            t1.Start();

            Thread t2 = new Thread(ThreadEntryPoint);
            t2.Start();


            while (t1.ThreadState == ThreadState.Running || t2.ThreadState == ThreadState.Running)
            {
                Thread.Sleep(1000);
            }

            lock (_lock)
            {
                message = "Hello World!";
            }

            mre.Set();
            cde.Wait();
        }

        static void ThreadEntryPoint(object o)
        {
            mre.WaitOne();
            Console.WriteLine(message);
            cde.Signal();
        }
    }

If so, is that guarantee (with respect to these particular examples) made implicitly by the underline O/S or by the CLR?

Is the lock necessary to prevent the reordering of the String assignment from happening before the “printing threads” (re)enter the Running state?

jkotas · 2017-08-02T17:24:44Z

All classic .NET synchronization functions (ManualResetEvent, ...) have implicit memory barrier, so you second example can be simplified to this, and still work fine:

    class Program
    {
        static String message;
        static object _lock = new object();
        static CountdownEvent cde = new CountdownEvent(2);
        static ManualResetEvent mre = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            Thread t1 = new Thread(ThreadEntryPoint);
            t1.Start();

            Thread t2 = new Thread(ThreadEntryPoint);
            t2.Start();

            message = "Hello World!";

            mre.Set();
            cde.Wait();
        }

        static void ThreadEntryPoint(object o)
        {
            mre.WaitOne();
            Console.WriteLine(message);
            cde.Signal();
        }
    }

You first example is about whether creating a new thread has an implicit memory barrier. I think that it is reasonable to expect that it has: It has it in current implementation and I do not think it is ever going to change. I do not think it is documented anywhere though.

cc @stephentoub

clrjunkie · 2017-08-02T17:41:52Z

You first example is about whether creating a new thread has an implicit memory barrier. I think that it is reasonable to expect that it has: It has it in current implementation and I do not think it is ever going to change

Does that mean that in the current CLR implementation I can remove the lock in the first example also?

jkotas · 2017-08-02T17:43:06Z

I think so.

clrjunkie · 2017-08-02T17:49:02Z

Thanks @jkotas! highly appreciate a ping in case you gain any new insight about this.

ForNeVeR · 2017-10-13T13:51:47Z

@jkotas could you please clarify the rulings about the memory model used by .NET (Core) today?

You said that

The IL memory model is described in the CLI spec http://www.ecma-international.org/publications/standards/Ecma-335.htm

There's some information on the web about so-called ".NET / CLR 2.0 Memory Model" (e.g. here), most of the time people referring this article (link from the Web archive; I hope it won't expire).

The article claims that there's a ".NET Framework 2.0" memory model that offers us more guarantees than just plain CLI specification. That confuses me a lot: could we actually rely on that model across the platforms (i.e. in .NET Core, Mono, Xamarin etc.)? Or the only guarantees we have are the ones given by CLI spec, and we shouldn't rely on any more strong semantics that informal .NET 2.0 memory model offer to us?

jkotas · 2017-10-13T14:13:58Z

could we actually rely on that model across the platforms

I do not think so. E.g. I do not think that All stores have release semantics rule holds on .NET Core on ARM. Also, I doubt that Mono guarantees this accross all their platforms.

The simple rule is: use volatile for anything lock free.

ericstj · 2020-02-28T17:45:42Z

Closing as there hasn't been any recent discussion here.

msftgits transferred this issue from dotnet/coreclr Jan 30, 2020

msftgits added this to the Future milestone Jan 30, 2020

maryamariyan added the untriaged New issue has not been triaged by the area owner label Feb 26, 2020

ericstj closed this as completed Feb 28, 2020

ericstj removed the untriaged New issue has not been triaged by the area owner label Feb 28, 2020

ghost locked as resolved and limited conversation to collaborators Jan 3, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Document any Memory model/visibility changes when executing on X64 Linux/Mac #4906

Document any Memory model/visibility changes when executing on X64 Linux/Mac #4906

clrjunkie commented Jan 7, 2016

clrjunkie commented Jan 7, 2016

zpodlovics commented Jan 7, 2016

tzwlai commented Jan 7, 2016

jkotas commented Jan 8, 2016

clrjunkie commented Jan 8, 2016

clrjunkie commented Jan 8, 2016

jkotas commented Jan 8, 2016

zpodlovics commented Jan 8, 2016

clrjunkie commented Jan 8, 2016

clrjunkie commented Jan 8, 2016

zpodlovics commented Jan 9, 2016

clrjunkie commented Aug 2, 2017

jkotas commented Aug 2, 2017

clrjunkie commented Aug 2, 2017

jkotas commented Aug 2, 2017

clrjunkie commented Aug 2, 2017

ForNeVeR commented Oct 13, 2017

jkotas commented Oct 13, 2017 •

edited

Loading

ericstj commented Feb 28, 2020

Document any Memory model/visibility changes when executing on X64 Linux/Mac #4906

Document any Memory model/visibility changes when executing on X64 Linux/Mac #4906

Comments

clrjunkie commented Jan 7, 2016

clrjunkie commented Jan 7, 2016

zpodlovics commented Jan 7, 2016

tzwlai commented Jan 7, 2016

jkotas commented Jan 8, 2016

clrjunkie commented Jan 8, 2016

clrjunkie commented Jan 8, 2016

jkotas commented Jan 8, 2016

zpodlovics commented Jan 8, 2016

clrjunkie commented Jan 8, 2016

clrjunkie commented Jan 8, 2016

zpodlovics commented Jan 9, 2016

clrjunkie commented Aug 2, 2017

jkotas commented Aug 2, 2017

clrjunkie commented Aug 2, 2017

jkotas commented Aug 2, 2017

clrjunkie commented Aug 2, 2017

ForNeVeR commented Oct 13, 2017

jkotas commented Oct 13, 2017 • edited Loading

ericstj commented Feb 28, 2020

jkotas commented Oct 13, 2017 •

edited

Loading