New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move tls areas to a dedicated memory space #477
Move tls areas to a dedicated memory space #477
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I couldn't see in this patch is where caml_tls_areas_base
gets deallocated. Shouldn't we deallocate that on shutdown?
The other comment is more general, we're allocating the whole of caml_tls_areas_base
in one malloc
call from the startup pthread. This might have a bad NUMA outcome compared to the existing code where we are commiting (with caml_mem_commit
) in create_domain
from the pthread with which a domain is spawned. I admit the existing behaviour is weak in the sense that the pthread may subsequently be moved by the OS. However the behaviour we are introducing will very likely place all the TLS areas into one NUMA zone on large machines, which will certainly be bad for pthreads spawned in the other NUMA zones.
Is there a reason to strongly prefer using a single malloc
for the tls_area
over using the caml_mem_commit
for it in create_domain
?
runtime/domain.c
Outdated
size = (uintnat)Minor_heap_max * Max_domains; | ||
tls_size = sizeof(caml_domain_state) * Max_domains; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we round the sizeof(caml_domain_state)
up to a whole page?
Avoids any possibility of false sharing of cache lines between domains and keeps the pages between domains separate.
runtime/domain.c
Outdated
|
||
heaps_base = caml_mem_map(size*2, size*2, 1 /* reserve_only */); | ||
if (!heaps_base) caml_raise_out_of_memory(); | ||
tls_base = malloc(tls_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be better to use caml_stat_alloc_noexc
than malloc
for this.
There is no specific reason, we would need to round up each tls_area up to a page (in order to be able to commit them properly). This sounds reasonable to me. |
f1f7bf1
to
dd9bea3
Compare
dd9bea3
to
7024178
Compare
LGTM |
This PR changes the way we allocate individual domain's tls areas.
Currently, tls areas are allocated alongside the minor heaps, and the minor heaps segment is effectively divided as such:
Domain 0 tls area -> guard page -> d0 minor heap -> Domain 1 tls area ... ect
This is not an optimal layout for Domain Local Allocation buffers, we need to be able to bump the allocation pointer without any extra book keeping, besides knowing the end of the segment.
This PR move the tls areas to their own, malloc'ed space, and the minor_heaps segment is now only contiguously storing the minor heaps and nothing else.