Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Object.clone() #674

Closed
dmlloyd opened this issue Oct 4, 2021 · 10 comments · Fixed by #943
Closed

Implement Object.clone() #674

dmlloyd opened this issue Oct 4, 2021 · 10 comments · Fixed by #943
Assignees
Labels
kind: task 📋 A task to fulfill an implementation requirement

Comments

@dmlloyd
Copy link
Collaborator

dmlloyd commented Oct 4, 2021

One of the Big Two sources of compilation warnings right now is because Object.clone() is not fully implemented.

In order to implement this method, I propose something like this:

  • Add a field to Class which will be populated with the run time size of the corresponding object instance
  • Create a Object$_native implementation for Object.clone() whose implementation allocates an object of the size from the corresponding class, and uses the LLVM memcpy intrinsic to populate that object (be sure to exclude the object header so it doesn't get overwritten), ending with a release fence
  • Create overrides of this method on each of the generated array core classes which adds the allocated array size to the size of the corresponding element type, but otherwise uses the same process as above to populate & publish the object
  • Remove the Clone node and corresponding intrinsics completely from the codebase

The nodes needed to implement this method - specifically, reading an injected field from class and calling the memcpy intrinsic are not particularly easy to do on the Java side yet. For reading the injected field, we could define an intrinsic which returns the size given the corresponding Class object. For the memcpy intrinsic, I think we need to consider adding an LLVM runtime library module which defines a few intrinsic methods, starting with memcpy. Alternatively the standard C memcpy could be used. The release fence is accomplished by calling VarHandle.releaseFence().

Update: We do not presently have a good way to define native method bindings which apply to the array classes. We'll need to add a (public) override of clone onto the base array class at a minimum, or else overrides on to each subclass, when the core classes are created. The method bodies could be created by hand in CoreClasses, or, if someone wants to go through the trouble, the could ensure that e.g. binding internal_array$_native works correctly and implement the overrides there.

@dmlloyd dmlloyd added the kind: task 📋 A task to fulfill an implementation requirement label Oct 4, 2021
@DanHeidinga
Copy link
Collaborator

The overall approach of storing the instance size per class makes sense to me with the special handling of the header fields and memcpy of the body. We'll need to clear the instance hashcode and the lockword in the cloned object as well as any has_been_hashed | has_been_moved header bits.

Couldn't we add an if (this.class.isArray()) { /* do array clone */ } else { /* standard obj clone */} check to the Object$_native implementation? It saves trying to redefine the method on the array classes

@dmlloyd
Copy link
Collaborator Author

dmlloyd commented Oct 4, 2021

Couldn't we add an if (this.class.isArray()) { /* do array clone */ } else { /* standard obj clone */} check to the Object$_native implementation? It saves trying to redefine the method on the array classes

We could, in which case we'd need an intrinsic (or something) to read the array length (Array.getLength(Object) seems a good candidate) and the element size (no ideas for this one outside of Unsafe) and do the multiplication (well, probably bit shift).

@DanHeidinga
Copy link
Collaborator

We can write the array case entirely in normal Java:

Class<?> type = obj.getClass().getComponentType();
int length = Array.getLength(obj);
Object cloned = Array.newInstance(type, length);
System.arraycopy(obj, 0, cloned, 0, length);
return cloned;

(Borrowing the approach used in OpenJ9's JITHelpers class)

This assumes we're willing to support reflective array creation (and I don't see why we wouldn't given we'll have all the data to support it by default)

@DanHeidinga
Copy link
Collaborator

For the memcpy cases, we'll need to ensure that we can't hit a safepoint while doing the copy or the GC may see uninitialized / partly initialized objects. One way to avoid this is to write the code to do slot by slot copies

@dmlloyd
Copy link
Collaborator Author

dmlloyd commented Oct 5, 2021

For the memcpy cases, we'll need to ensure that we can't hit a safepoint while doing the copy or the GC may see uninitialized / partly initialized objects. One way to avoid this is to write the code to do slot by slot copies

OTOH if the (reference) array is very large it could end up blocking safepoints for an unreasonably long time, could it not?

@DanHeidinga
Copy link
Collaborator

I don't think that's an issue. The array would need to be incredibly large and the system would need to have severely constrained memory bandwidth and even then... it's never come up to my knowledge.

The benefit of reusing arraycopy is that we only need to get the gc barriers and array store checks right in one place rather than needing to duplicate that code...

@dgrove-oss
Copy link
Collaborator

Do we need to keep the clone node during the ADD phase so we can interpret it? Then lower to the VMHelper call in a later phase?

In any case, I'm going to start knocking off the easy parts of this (starting with the field on the java.lang.Class instance)

@dgrove-oss
Copy link
Collaborator

dgrove-oss commented Oct 6, 2021

For the memcpy cases, we'll need to ensure that we can't hit a safepoint while doing the copy or the GC may see uninitialized / partly initialized objects. One way to avoid this is to write the code to do slot by slot copies

A simpler (but less optimal) solution is for reference arrays to zero the backing storage of the new array, then call arraycopy without any special blocking of safe points. We'd end up doing two writes per slot, but optimizing clone of reference arrays is probably not at the top of our list of things to worry about.

@dmlloyd
Copy link
Collaborator Author

dmlloyd commented Oct 6, 2021

Do we need to keep the clone node during the ADD phase so we can interpret it? Then lower to the VMHelper call in a later phase?

TBH I think we should eliminate the Clone node. Having a "real" clone() method on the array base class and/or leaf classes means that we don't really have to do anything special to make array cloning work.

A simpler (but less optimal) solution is for reference arrays to zero the backing storage of the new array, then call arraycopy without any special blocking of safe points. We'd end up doing two writes per slot, but optimizing clone of reference arrays is probably not at the top of our list of things to worry about.

FWIW the simple proposal from Dan above would already do this, since Array.newInstance leaves you with a zeroed array. The allocator in turn would have to do the zeroing in this case though.

@dgrove-oss
Copy link
Collaborator

dgrove-oss commented Nov 3, 2021

In addition to a field for instance size, I believe we also need to add a field for alignment to Class (or force maximal alignment on clone, which seems less optimal).

@dgrove-oss dgrove-oss self-assigned this Nov 29, 2021
dgrove-oss added a commit to dgrove-oss/qbicc that referenced this issue Dec 20, 2021
dgrove-oss added a commit to dgrove-oss/qbicc that referenced this issue Dec 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: task 📋 A task to fulfill an implementation requirement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants