-
Notifications
You must be signed in to change notification settings - Fork 722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Runtime compressed refs work #8324
Conversation
This is not complete, but there's enough here to start having a look. |
I walked through the changes and they seems correct, however probably I will need another look focus on possible missed changed in pointers math |
c42b2c5
to
69227f0
Compare
@@ -195,7 +195,7 @@ MM_IndexableObjectAllocationModel::layoutContiguousArraylet(MM_EnvironmentBase * | |||
GC_SlotObject slotObject(env->getOmrVM(), arrayoidPtr); | |||
slotObject.writeReferenceToSlot((omrobjectptr_t)leafOffset); | |||
leafOffset += arrayletLeafSize; | |||
arrayoidPtr += 1; | |||
arrayoidPtr = GC_SlotObject::addToSlotAddress(arrayoidPtr, 1, compressed); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could have a non-static variant of this method and call for each iteration slotObject.addToSlotAddress(1, compressed); while constructing slotObject only once before entering the loop. It would be slightly more readable to me, but I'm not pushing for it.
return convertPointerFromToken(*srcAddress); | ||
mm_j9object_t result = NULL; | ||
if (compressObjectReferences()) { | ||
result = convertPointerFromToken(*(uint32_t*)srcAddress); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm... convertPointerFromToken itself has already a runtime 'if CR' check.
arrayletLeafBase = (UDATA)convertPointerFromToken(*(U_32*)arrayletLeafSlot); | ||
} else { | ||
arrayletLeafBase = *(UDATA*)arrayletLeafSlot; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have up to 3 'if CR' runtime checks, while it's probably doable with only one :(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like this?
if (compressObjectReferences()) {
arrayletLeafBase = (UDATA)convertCompressedPointerFromToken(*((U_32*)arrayoidPointer) + arrayletIndex);
} else {
arrayletLeafBase = *(((UDATA*)arrayoidPointer) + arrayletIndex);
}
where convertCompressedPointerFromToken() does just shifting and no 'if CR' check. This helper could be used elsewhere (in readImpl() )
@@ -200,8 +200,9 @@ class MM_MetronomeDelegate : public MM_BaseNonVirtual | |||
MMINLINE uintptr_t | |||
scanPointerArraylet(MM_EnvironmentRealtime *env, fomrobject_t *arraylet) | |||
{ | |||
UDATA const referenceSize = env->compressObjectReferences() ? sizeof(uint32_t) : sizeof(uintptr_t); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this pattern repeats often. we should just have something like env->getObjectReferenceSize()
GC_SlotObject slotObject(_javaVM->omrVM, (fj9object_t*)scanPtr); | ||
_markingScheme->markObject(env, slotObject.readReferenceFromSlot()); | ||
scanPtr++; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pass is somewhat perf critical and extra runtime checks in readReferenceFromSlot() could be tangible (sparsly populated arrays). Would it make sense to have specialized GC_SlotObject::readCRReferenceFromSlot/readNonCRReferenceFromSlot APIs?
#endif /* OMR_GC_COMPRESSED_POINTERS */ | ||
{ | ||
return (fj9object_t)(UDATA)pointer; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This itself is good, but similarly to slow barrier code, this is invoked from a site where there is already a runtime CR check (see readObjectImpl).
As mentioned before, we could specialize this method further into two methods for CR and nonCR, and use proper one from the call site that already knows if it's CR or not.
We could keep this one too, for sites where we don't know if it's CR.
/** | ||
* @return the next leaf reference slot in the arraylet | ||
* @return NULL if there are no more reference slots in the object | ||
*/ | ||
MMINLINE GC_SlotObject *nextLeafPointer() | ||
{ | ||
if (_numLeafsCounted < _numLeafs) { | ||
_slotObject.writeAddressToSlot(&_arrayoid[_numLeafsCounted]); | ||
_slotObject.writeAddressToSlot(GC_SlotObject::addToSlotAddress(_arrayoid, _numLeafsCounted, compressObjectReferences())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as proposed before we could use a non-static addToSlotAddress
@amicic My current goal is to avoid regressing the single-mode builds. Optimizing multiple compressed checks is in plan for later on (likely by specializing entire code paths for compressed/full rather than microoptimizing the individual functions). |
706232d
to
e5a7217
Compare
90b19a6
to
a039663
Compare
@dmitripivkine @amicic With this PR, mixed mode now works on xa64 (metronome and balanced work is not complete). I'd prefer to make the suggested optimizations/clarifications in future PRs (as they require some OMR work first). |
@andrewcraik @fjeremic Can you please find someone to review the JIT changes? |
I believe I've found all of the places doing fobj pointer math (I temporarily replaced Some math changes remain in metronome and balanced, but I know where they are now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JIT changes look straightforward. I'd advise another x86 expert to take a look at the NASM changes.
448e7b9
to
5f169d3
Compare
Metronome is now working in mixed mode as well. |
@@ -58,7 +58,7 @@ class GC_PointerArrayObjectScanner : public GC_IndexableObjectScanner | |||
* @param[in] flags Scanning context flags | |||
*/ | |||
MMINLINE GC_PointerArrayObjectScanner(MM_EnvironmentBase *env, omrobjectptr_t arrayPtr, fomrobject_t *basePtr, fomrobject_t *limitPtr, fomrobject_t *scanPtr, fomrobject_t *endPtr, uintptr_t flags) | |||
: GC_IndexableObjectScanner(env, arrayPtr, basePtr, limitPtr, scanPtr, endPtr, ((endPtr - scanPtr) < _bitsPerScanMap) ? (((uintptr_t)1 << (endPtr - scanPtr)) - 1) : UDATA_MAX, sizeof(fomrobject_t), flags) | |||
: GC_IndexableObjectScanner(env, arrayPtr, basePtr, limitPtr, scanPtr, endPtr, (GC_SlotObject::subtractSlotAddresses(endPtr, scanPtr, env->compressObjectReferences()) < _bitsPerScanMap) ? ((uintptr_t)1 << GC_SlotObject::subtractSlotAddresses(endPtr, scanPtr, env->compressObjectReferences())) - 1 : UDATA_MAX, env->compressObjectReferences() ? sizeof(uint32_t) : sizeof(uintptr_t), flags) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is too long now... Can it be split somehow to be more read friendly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See if it's better now...
Jenkins test sanity all jdk8 |
Jenkins test sanity all jdk8 |
Remove assumptions about the compile-time size of various types (f9object_t/fomrobject_t/object headers). [ci skip] Signed-off-by: Graham Chapman <graham_chapman@ca.ibm.com>
Jenkins test sanity all jdk8 |
The failing testing looks like a build issue (JIT can't load due to protobuf stuff) - I suggest ignoring it and merging once the rest completes. |
Ya, it's a problem on ub16-ppcle-2 #8257. I had disabled, but every time jenkins restarts the machine comes back online. I'll disable the machine again. |
Remove assumptions about the compile-time size of various types (f9object_t/fomrobject_t/object headers).
[ci skip]
Signed-off-by: Graham Chapman graham_chapman@ca.ibm.com