Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move tls areas to a dedicated memory space #477

Merged

Conversation

abbysmal
Copy link
Collaborator

This PR changes the way we allocate individual domain's tls areas.

Currently, tls areas are allocated alongside the minor heaps, and the minor heaps segment is effectively divided as such:

Domain 0 tls area -> guard page -> d0 minor heap -> Domain 1 tls area ... ect

This is not an optimal layout for Domain Local Allocation buffers, we need to be able to bump the allocation pointer without any extra book keeping, besides knowing the end of the segment.

This PR move the tls areas to their own, malloc'ed space, and the minor_heaps segment is now only contiguously storing the minor heaps and nothing else.

Copy link
Collaborator

@ctk21 ctk21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I couldn't see in this patch is where caml_tls_areas_base gets deallocated. Shouldn't we deallocate that on shutdown?

The other comment is more general, we're allocating the whole of caml_tls_areas_base in one malloc call from the startup pthread. This might have a bad NUMA outcome compared to the existing code where we are commiting (with caml_mem_commit) in create_domain from the pthread with which a domain is spawned. I admit the existing behaviour is weak in the sense that the pthread may subsequently be moved by the OS. However the behaviour we are introducing will very likely place all the TLS areas into one NUMA zone on large machines, which will certainly be bad for pthreads spawned in the other NUMA zones.

Is there a reason to strongly prefer using a single malloc for the tls_area over using the caml_mem_commit for it in create_domain?

runtime/domain.c Outdated
size = (uintnat)Minor_heap_max * Max_domains;
tls_size = sizeof(caml_domain_state) * Max_domains;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we round the sizeof(caml_domain_state) up to a whole page?
Avoids any possibility of false sharing of cache lines between domains and keeps the pages between domains separate.

runtime/domain.c Outdated

heaps_base = caml_mem_map(size*2, size*2, 1 /* reserve_only */);
if (!heaps_base) caml_raise_out_of_memory();
tls_base = malloc(tls_size);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to use caml_stat_alloc_noexc than malloc for this.

@abbysmal
Copy link
Collaborator Author

abbysmal commented Feb 25, 2021

Is there a reason to strongly prefer using a single malloc for the tls_area over using the caml_mem_commit for it in create_domain?

There is no specific reason, we would need to round up each tls_area up to a page (in order to be able to commit them properly). This sounds reasonable to me.

@ctk21
Copy link
Collaborator

ctk21 commented Mar 2, 2021

LGTM

@ctk21 ctk21 merged commit 9b8b818 into ocaml-multicore:parallel_minor_gc Mar 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants