Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor gc to only need to allocate virtual addresses as needed #21135

Merged
merged 2 commits into from
Apr 17, 2017

Conversation

vtjnash
Copy link
Sponsor Member

@vtjnash vtjnash commented Mar 22, 2017

From local testing, this doesn't seem to have any performance impact. To get virtual address space to line up with physical allocation a bit more would require also adjusting openblas to avoid allocation of it's 512 MB buffer (32 MB * 16 threads) during initialization.

fix #17987
alterate fix for #10390

@vtjnash vtjnash added the GC Garbage collector label Mar 22, 2017
@vtjnash vtjnash requested a review from yuyichao March 22, 2017 17:27
@JeffBezanson
Copy link
Sponsor Member

Could you explain what region0, region1, and region2 mean?

Copy link
Contributor

@yuyichao yuyichao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. The page_metadata fast path can be tweaked a little bit. Other slow paths can probably be tweaked later.

@@ -590,7 +590,7 @@ STATIC_INLINE void gc_setmark_pool_(jl_ptls_t ptls, jl_taggedvalue_t *o,
STATIC_INLINE void gc_setmark_pool(jl_ptls_t ptls, jl_taggedvalue_t *o,
uint8_t mark_mode)
{
gc_setmark_pool_(ptls, o, mark_mode, page_metadata(o));
gc_setmark_pool_(ptls, o, mark_mode, page_metadata_ext(o).meta);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this need to be page_metadata_ext(o).meta and not page_metadata(o)? (Also a few cases below.)

Edit: I guess this is for NULL check? Seems that these can be replaced with jl_assume(page_metadata(o)). Or at least this one and the one in the pool allocator.

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, main difference is that _ext requires it to be a valid page, while page_metadata did not

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, _ext is the fast path

src/gc-pages.c Outdated
info.region0->freemap[info.region0_i32] |= msk; // is free
pmeta = &info.region0->meta[i];
info.meta = (*pmeta = &page_meta[pg]);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The allocation here can probably be cleaned up later..... (we can have a version of perm_alloc that guarantees zero filling).

src/gc.h Outdated
uintptr_t data = ((uintptr_t)_data);
unsigned i;
i = REGION_INDEX(data);
region1_t *r1 = memory_map.meta1[i];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can probably use a compile time optimization too. (Though I doubt 32bits matters too much.......)

@yuyichao
Copy link
Contributor

I believe the "regions" forms a 2-3 level "page table".

@ararslan
Copy link
Member

@nanosoldier runbenchmarks(ALL, vs=":master")

src/gc-pages.c Outdated
#ifdef _OS_WINDOWS_
VirtualAlloc(ptr, GC_PAGE_SZ, MEM_COMMIT, PAGE_READWRITE);
VirtualAlloc(meta->data, GC_PAGE_SZ, MEM_COMMIT, PAGE_READWRITE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this is using meta outside of its scope (defined on line 178 inside a block) which is causing the Windows failures.

@@ -96,113 +84,184 @@ static void jl_gc_alloc_region(region_t *region)
jl_throw(jl_memory_exception);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting hit on 32-bit Linux

@@ -9,32 +9,21 @@
extern "C" {
#endif

// A region is contiguous storage for up to DEFAULT_REGION_PG_COUNT naturally aligned GC_PAGE_SZ pages
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't just delete this, update it if no longer true

@nanosoldier
Copy link
Collaborator

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @jrevels

@tkelman
Copy link
Contributor

tkelman commented Mar 24, 2017

the difference between pagetable, pagetable0, and pagetable1 is still not described very clearly

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Mar 26, 2017

There is no difference, just level h=0, h=1, and h=2 of the table. C is just really encouraging of copy-pasting.

@tkelman
Copy link
Contributor

tkelman commented Mar 26, 2017

these look copy pasted but in several cases aren't quite, the 0 function refers to 1, or similar. is this intentional? if so, it should be explained.

@StefanKarpinski
Copy link
Sponsor Member

@vtjnash: I don't really understand why you can't just say what they are for. If they're the same, why are there three of them? Why not just one?

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Apr 3, 2017

Why not just one?

Exactly. This is why people don't like C. Three data-structures that differ only by their distance from the root of the tree each have their own independently unique type.

these look copy pasted but in several cases aren't quite

Of course, it's a table of height 3, so level 2 points at 1 and level 2 points at level 1 which contains the leaf nodes. That means I can hand-optimize away some of the conditional checks at each level, so that the copies of the code ends up littered with special cases.

@tkelman
Copy link
Contributor

tkelman commented Apr 3, 2017

Of course, it's a table of height 3, so level 2 points at 1 and level 2 points at level 1 which contains the leaf nodes.

Then describe that in comments, it's not obvious without any explanation.

@tkelman
Copy link
Contributor

tkelman commented Apr 4, 2017

gc_clear_mark_page*, gc_verify_tags_pagetable*, gc_stats_pagetable*, gc_count_pool_pagetable*, sweep_pool_page* still have no descriptive comments

@vtjnash
Copy link
Sponsor Member Author

vtjnash commented Apr 4, 2017

We don't usually have descriptive comments on helper functions (unless the work they do is non-intuitive). In this case, I've just split out the existing iterators into their own functions.

@tkelman
Copy link
Contributor

tkelman commented Apr 4, 2017

These aren't intuitive - the problem isn't that they're repetitive, it's that they look copy pasted but are each slightly different in ways that aren't at all obvious if you didn't write the code. For maintainability of the code base, please explain the scheme better. We need many many more comments throughout, and confusing aspects are the best place to fix that first.

src/gc.h Outdated
// - pagetable0_t: the top/root level
// - pagetable1_t: the middle level
// - pagetable2_t: the bottom/leaf level
// Corresponding to these simlar structures is a large amount of repetetive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repetetive -> repetitive (here and below)

src/gc.h Outdated
// various conditions at each level.

// The following constants define the branching factors at each level.
// They constants and GC_PAGE_LG2 must therefore sum to sizeof(void*).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they -> the

@tkelman
Copy link
Contributor

tkelman commented Apr 5, 2017

the gc.h comment helps, but I meant inline on the functions I referred to - the pieces that differ, when a function called 0 refers to a variable called 1, or a variable called 1 has a field named 0, etc, is what's confusing and could use point-of-use indications of intent

src/gc.h Outdated
// - pagetable0_t: the bottom/leaf level (covers the contiguous addresses)
// - pagetable1_t: the middle level
// - pagetable2_t: the top/leaf level (covers the entire virtual address space)
// Corresponding to these simlar structures is a large amount of repetitive
Copy link
Contributor

@tkelman tkelman Apr 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar

gc_clear_mark_*, gc_verify_tags_*, gc_stats_pagetable*, and gc_count_pool_pagetable* still look misleading and could use comments

Copy link
Sponsor Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are debugging functions not actually part of he gc. They are usually commented out, so I wouldn't worry about them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of course you wouldn't, you wrote them

rather than using multiple enormous flat regions,
this divides all memory into a 2-3 level page map
@vtjnash vtjnash merged commit 55c97fb into master Apr 17, 2017
@vtjnash vtjnash deleted the jn/gc-page-no-region branch April 17, 2017 23:06
@tkelman
Copy link
Contributor

tkelman commented Apr 18, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GC Garbage collector
Projects
None yet
Development

Successfully merging this pull request may close these issues.

command line flag to limit heap memory usage?
7 participants