Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GC Pinning? #1085

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

GC Pinning? #1085

wants to merge 4 commits into from

Conversation

Aidan63
Copy link
Contributor

@Aidan63 Aidan63 commented Jan 3, 2024

Makes the MemType enum and GetMemType function accessible from the hx namespace as mentioned in #1082.

@hughsando
Copy link
Member

I think there are some complications with using the GC memtype routines outside of a collection (ok in __Mark).
For example, the large list might change - it is locked inside a collect call, but not in general.
On the plus side, if you know it is a valid pointer inside a haxe object, then you can rely on the header bits, and it should not be memUnManaged except in case of Array.mAlloc=-1 .

With array data, it is possible that the the pointer is a foreign pointer (Set with NativeArray.setUnmanagedData). In this case, the arrays mAlloc will be -1, and no memory management on the pointer is needed - this should be checked before examining the header.

You can check for 'const-ness' with bool IsConstAlloc(const void *inData). In this case, the data does not need any management at all (eg, string data) and will not be deleted.

largeness can be determined from a valid header from the size ("size in allocated blocks"), which will be zero in this case

 int size = flags & 0xffff;
 // Size will be 0 for large allocs -> no need to mark block

You may need to add some code to prevent large allocations from getting destroyed in a realloc process, see InternalRealloc and InternalReleaseMem. The theory is that only 1 array will hold the byte data, so it can be cleaned when the array is resized.
This may need some thought, but one possibility could be set the least-significant for the large array size to 1 in the case where we do not want immediate reallocation.

If you are storing GC block pointers in a haxe object, some additional work must be done the in the generational case, HX_OBJ_WB should be called. This is for the case where the object holding the pointer is "old" and the pointer is generationally new. This call means your __Mark will get called when doing a minor generational collection.

@Aidan63
Copy link
Contributor Author

Aidan63 commented Jan 4, 2024

Thanks for the tips, I see what you mean about the large list locking.

Perhaps it makes sense to come at this from a different angle then, if arrays had a pin method which returned some sort of handle it could keep track if a pin was requested and not attempt that realloc optimisation if it was. That pin method could do all the checking to see if mBase is unmanaged, lives on the LOH, or in a block and act accordingly. The handle could keep a rooted marking object which calls hx::MarkAlloc in its __Mark to keep that object alive until the handle is released.

@Aidan63
Copy link
Contributor Author

Aidan63 commented Jan 5, 2024

OK, I've completely re-jigged this merge. MemType changes have been reverted and it adds pinning spport.
Pinning and unpinning checks the header to see if it lives on the LOH and BlockDataInfo now keeps track of the number of allocs in its lines and sets mPinned accordingly.
Array now has a pin function which will return a handle keeping the array storage at time of pinning alive and un-moving until the handle is deleted.
Hopefully this is less dodgy.

@Aidan63 Aidan63 changed the title Expose MemType GC Pinning? Jan 5, 2024
@hughsando
Copy link
Member

Probably better to use the atomic inc/dec ops for mPinned manipulation if pinning might happen from any haxe thread.
It would be better if possible to not increase the size of the Array class since this will be allocated a lot, and pinning will happen much less often. I was thinking a std::unordered map of pinned pointers might do the trick. These could be marked like the conservative marks - pinning the blocks at mark time. If you reuse the mLargeListLock for the pinned set you could solve the realloc problem by checking the pinned set inside FreeLarge for almost no cost. Marking the list like conservative pointers also solve the generational gc problem.
You still might need your pinned structure to allow unpinning of exactly the same pointer that was pinned (since the array might realloc in the meantime). You could possibly just use a cpp.Star pointer in a normal haxe structure to do this though.

@Aidan63
Copy link
Contributor Author

Aidan63 commented Jan 6, 2024

I thought of the atomic operations just as I was going to bed, I've changed things around now though so that mPinnedAllocs isn't needed any more.
There is now an sgPinSet of void* pointers which is protected by the mLargeListLock of the global allocator. Just before the roots are traversed this set is iterated over and any non object not on the LOH has its block pinned. FreeLarge now checks the pin set after aquiring mLargeListLock and exits early if the pointer to be released is in the set.
One thing I have noticed is the shrinking case of arrays, it doesn't re-allocate but just updates the length value. However, it also memsets the now extra bytes to 0 which is a problem when wanting to pass that data around.

Another thing I'm unsure of, when marking the pins do I need to do the whole process of checking the pointer type and then calling either mark object, mark string, etc, as is done in the conservative marking function, or is it just to just call hx::MarkAlloc as I'm doing now.

@hughsando
Copy link
Member

This seems good. Changing the length is the same as changing, say, the first element - more of a documentation issue if it does not crash. The only alternative is to copy, since there is not real way to do copy-on-write without incurring overhead.
The onus is on the user to ensure that the unmanaged pointers remain valid, so no problems there.

@Aidan63
Copy link
Contributor Author

Aidan63 commented Jan 7, 2024

Yeah, thats fair. The only way to solve it without the extra copy would probably be to introduce an extra BytesData type instead of that being a typedef to an array. Since this is primarially going to be used with haxe bytes that would stop the user peeking at the underlying implementation and resizing it.
But thats a lot of effort (and a breaking change) for something which the user should know is asking for trouble (haxe bytes don't expose any way in the formal API for resizing).

@skial skial mentioned this pull request Jan 9, 2024
1 task
@Aidan63
Copy link
Contributor Author

Aidan63 commented Jan 13, 2024

Probably should have opened this as a draft to begin with. I'm not planning on adding anything else so if this looks good its probably good to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants