Add Multicore Guide explaining the new memory model #145
9f024de to c372bcd
doc/multicore.md
Outdated
We can solve this by relying on a useful feature of atomics: every atomic also has a frontier of its own (a location on every non-atomic location's timeline).
Writing to an atomic updates its frontier with information from the writing CPU's frontier
(so it's at least as up-to-date as the writer).
I initially read this as updating only the frontier of the atomic.
Looking at the PLDI'18 paper https://kcsrk.info/papers/pldi18-memory.pdf
I understand rule (Write-AT) in Fig. 1c as updating both frontiers.
This is also how I understand KC's example of an atomic write on sl.21-22
https://speakerdeck.com/kayceesrk/bounding-data-races-in-space-and-time?slide=78
which updates both the red frontier for the atomic A
and thread 2's frontier for variable b.
I'm fairly new to the memory model though, so take this with a grain of salt.
If my understanding is correct, you could consider rephrasing to something like:
"Writing to an atomic updates both the atomic's frontier and the writing CPU's frontier
to contain the most up-to-date entries for each."
For the example it shouldn't change anything though.
I think you're right. I've fixed the text. I wonder why it's like that, though?
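To make the reading discussed in this thread concrete, here is a minimal toy model in OCaml. It is only a sketch of the rule as understood above, not the runtime's implementation: the names `Loc`, `frontier`, `join`, `write_atomic` and `read_atomic` are all hypothetical, and timestamps are assumed to be plain integers. An atomic write leaves both the writer and the atomic with the merged frontier, while an atomic read only advances the reader.

```ocaml
(* Toy model of frontiers. All names here are illustrative assumptions;
   this is not how the OCaml runtime represents anything. *)
module Loc = Map.Make (String)

(* A frontier records, for each non-atomic location, the timestamp of the
   latest write its owner knows about. *)
type frontier = int Loc.t

(* Pointwise maximum: the result is at least as up to date as both inputs. *)
let join (a : frontier) (b : frontier) : frontier =
  Loc.union (fun _loc t1 t2 -> Some (max t1 t2)) a b

(* An atomic location carries a frontier of its own alongside its value. *)
type 'a atomic = { frontier : frontier; value : 'a }

(* Atomic write: the thread and the atomic both end up with the merged
   frontier. *)
let write_atomic (thread : frontier) (a : 'a atomic) (v : 'a)
    : frontier * 'a atomic =
  let merged = join thread a.frontier in
  (merged, { frontier = merged; value = v })

(* Atomic read: only the thread's frontier advances; the atomic is
   left unchanged. *)
let read_atomic (thread : frontier) (a : 'a atomic) : frontier * 'a =
  (join thread a.frontier, a.value)

let () =
  let writer = Loc.(empty |> add "a" 3 |> add "b" 1) in
  let flag =
    { frontier = Loc.(empty |> add "a" 1 |> add "b" 2); value = false }
  in
  let writer', flag' = write_atomic writer flag true in
  (* The writer keeps its newer entry for "a" and gains the atomic's
     newer entry for "b". *)
  assert (Loc.find "a" writer' = 3 && Loc.find "b" writer' = 2);
  (* After the write, both frontiers are identical. *)
  assert (Loc.equal Int.equal writer' flag'.frontier)
```

In this model, the difference between the two readings of the sentence is whether `write_atomic` returns `thread` unchanged or `merged` as the writer's new frontier.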
Thanks! This is a very nice explanation IMO - well done! 😀🙏 While reading it I found two small nits (pointed out inline). Thanks again!
This is a good introduction to the memory model and I like the use of 'frontiers'.
There's one thing that's not quite right about the garbage collector. Multicore as upstreamed uses a 'parallel minor collector', in which there is no read barrier, no private minor heaps and no changes to the C API. However, the parallel minor collector does have synchronised minor heap promotion, which means there are periods when no domain can execute OCaml code because all minor heaps are being promoted.
It isn't clear this document gains from discussing minor heap collection, so a fix could be to remove those lines.
For those interested, https://arxiv.org/abs/2004.11663 has more on the parallel minor collector vs the concurrent minor collector.
doc/multicore.md
Outdated
The way this works on a real system is interesting.
Neither `x` nor the list items are atomic, so the system is free to optimise things as it pleases.
In particular, it might update `x` to point at the new list's address before writing the list.
However, the new list will be allocated in the writing domain's minor heap, which is private to that domain.
In the parallel minor GC, all minor heaps can be read by all domains and links can occur between minor heaps.
To make this work, all mutators must stop to perform minor heap promotion across all domains; we call these periods where all domains are stopped "stop-the-world" sections.
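For reference, the scenario in the quoted paragraph might look roughly like the sketch below. It assumes OCaml 5.0 or later (for the `Domain` module); `x` is named as in the guide, while `observe` is a hypothetical helper. The list is built with `List.init` so that it is allocated at run time rather than being a static constant. As I understand the memory model, the racy read is still memory-safe whichever collector design is in use: the reader should see either the old empty list or the complete new one.

```ocaml
(* Sketch only; assumes OCaml >= 5.0. Neither [x] nor the list cells are
   atomic, so the two domains race on [x]. *)
let x : int list ref = ref []

(* Hypothetical helper: one domain publishes a freshly allocated list while
   another reads [x] and sums whatever it finds. *)
let observe () : int =
  let writer =
    Domain.spawn (fun () -> x := List.init 3 (fun i -> i + 1))
  in
  let reader = Domain.spawn (fun () -> List.fold_left ( + ) 0 !x) in
  Domain.join writer;
  Domain.join reader

let () =
  let sum = observe () in
  (* Either the old list ([], sum 0) or the whole new list
     ([1; 2; 3], sum 6); never a partially initialised one. *)
  assert (sum = 0 || sum = 6)
```

Which of the two outcomes occurs depends on scheduling, so the sketch only checks that the result is one of the two allowed values.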
doc/multicore.md
Outdated
In particular, it might update `x` to point at the new list's address before writing the list.
However, the new list will be allocated in the writing domain's minor heap, which is private to that domain.
If the second branch sees the new pointer value, it will notice that it points into another domain's minor heap.
Instead of accessing it directly, it will send a message to that domain asking for the value to be promoted to the major heap.
In the 'concurrent minor collector' this is how things worked. However, the 'parallel minor collector' does not impose read barriers and promotion. Instead, it enforces stop-the-world sections where all minor heaps across all domains are promoted in parallel.
c372bcd to d1acac7
d1acac7 to 6d72268
@ctk21: thanks - I've removed that section. But now I'm curious how it works in the new system. I suppose mutating a field creates some kind of barrier to ensure the thing it now points to is written first?
This is my understanding of OCaml's memory model. Someone more knowledgeable should check it for accuracy.