Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TR_J9VM::getClassFromSignature: class loader of the result does not always match the class loader of the method #10397

Closed
dmitry-ten opened this issue Aug 17, 2020 · 13 comments · Fixed by #10431
Labels
comp:jitserver Artifacts related to JIT-as-a-Service project

Comments

@dmitry-ten
Copy link
Contributor

In jitserver, we cache the result of TR_J9VM::getClassFromSignature call by {classLoader, signature} key where the class loader is obtained from the method passed as an argument.
However, I determined that in some cases, these 2 class loaders do not match. Sometimes, the class loader of the result is a system class loader, unlike the loader of the class owning the method.
This code explains some cases when the class loader might be different:
https://github.com/eclipse/openj9/blob/6a4bb2436cc5c15ad99da4f46e2881d9aaedd7b1/runtime/compiler/env/VMJ9.cpp#L6871-L6881
However, I see that sometimes the class loaders do not match, even if the above code is never executed. The reasons for that are not clear to me, but the conclusion is that we should not be using class loader of the method as a key when they do not match.
If the class loader is wrong, this will lead to segfaults if the class is unloaded. This problem is causing #10215 .

@dmitry-ten
Copy link
Contributor Author

I propose to modify the VM_getClassFromSignature message, to also return the class loader and whether the class loaders match.
If the loaders match, then the resulting class can be safely cached, otherwise do not cache.

@dmitry-ten dmitry-ten changed the title TR_J9VM::getClassFromSiganture: class loader of the result does not always match the class loader of method TR_J9VM::getClassFromSiganture: class loader of the result does not always match the class loader of the method Aug 17, 2020
@mpirvu mpirvu added the comp:jitserver Artifacts related to JIT-as-a-Service project label Aug 17, 2020
@mpirvu mpirvu added this to To do in JIT as a Service via automation Aug 17, 2020
@dsouzai
Copy link
Contributor

dsouzai commented Aug 17, 2020

If you look at this method's signature

TR_J9VM::getClassFromSignature(const char * sig, int32_t sigLength, J9ConstantPool * constantPool)

There's a constant pool there. What that means is that you're asking the question "Get me the J9Class that has a particular signature, that is visible to the J9Class that is associated with the constantPool that is passed in". Therefore, the result could vary depending on the constantPool that is passed in.

class loader of the result does not always match the class loader of the method

What do you mean by "the method" here; could you give a more concrete example of how getClassFromSignature is invoked, and how you're associating a classloader to the result?

@dmitry-ten
Copy link
Contributor Author

dmitry-ten commented Aug 17, 2020

When I talk about the method, I'm talking about the other 2 versions of getClassFromSignature that are passed a method and then get a constant pool from it.
https://github.com/eclipse/openj9/blob/b90bbb058b625b0f2ed5deae2bfaa8b226527ef8/runtime/compiler/env/VMJ9.cpp#L6838-L6849

The point is that right now jitserver assumes that
J9_CLASS_FROM_CP(constantPool)->classLoader == j9class->classLoader, where
j9class is the return value of getClassFromSignature which is not always the case.

@dsouzai
Copy link
Contributor

dsouzai commented Aug 17, 2020

The point is that right now jitserver assumes that
J9_CLASS_FROM_CP(constantPool)->classLoader == j9class->classLoader, where
j9class is the return value of getClassFromSignature which is not always the case.

Yeah there's definitely no guarantee that J9_CLASS_FROM_CP(constantPool)->classLoader == j9class->classLoader is always true. getClassFromSignature is used to get a J9Class that is visible to J9_CLASS_FROM_CP(constantPool); however visibility doesn't imply the same class loaders.

@dsouzai
Copy link
Contributor

dsouzai commented Aug 17, 2020

FWIW, it is always safe to cache using {J9_CLASS_FROM_CP(constantPool), signature} (which is basically the information in SVM validation records). However, I guess it doesn't cover as a big an area as the classloaders would.

I propose to modify the VM_getClassFromSignature message, to also return the class loader and whether the class loaders match.
If the loaders match, then the resulting class can be safely cached, otherwise do not cache.

Given the following situtation:

// classA->classLoader == classB->classLoader
const char *sig = ...;
J9Class * classByName1 = getClassFromSignature(sig, classA->constantPool)
J9Class * classByName2 = getClassFromSignature(sig, classB->constantPool)

I don't think it's possible for classByName1 != classByName2 if both are non-NULL. However, and I'm not 100% sure about this, I think it is possible for one or both of them to be NULL. The reason is because the class whose name is in sig might not be visible to, say, classB. That said, maybe it isn't possible for the compiler to make the getClassFromSignature query in the first place. Might be worth running this by someone on the VM team.

@mpirvu mpirvu changed the title TR_J9VM::getClassFromSiganture: class loader of the result does not always match the class loader of the method TR_J9VM::getClassFromSignature: class loader of the result does not always match the class loader of the method Aug 18, 2020
@mpirvu
Copy link
Contributor

mpirvu commented Aug 18, 2020

The chain of calls is

TR_J9VM::getClassFromSignature(const char * sig, int32_t sigLength, TR_OpaqueMethodBlock * method, bool isVettedForAOT)
  TR_J9VM::getClassFromSignature(sig, sigLength, cp=J9_CP_FROM_METHOD((J9Method*)method))
     jitGetClassFromUTF8(vmThread(), cp, (void *)sig, sigLength);
       jitGetClassInClassloaderFromUTF8(vmThread, cp->ramClass->classLoader, sig, sigLength)
         internalFindClassUTF8(vmThread, (U_8*)sig, sigLength, classLoader, J9_FINDCLASS_FLAG_EXISTING_ONLY);

So in the end we are asking the VM to give us a class based on name and method->constantPool->ramClass->classLoader. There are some cases when the VM returns NULL because the class is not initialized and then we may do that trick to find some special named classes:

      if ((sigLength > 5 && strncmp(sig, "java/", 5) == 0) ||
         (sigLength == 31 && strncmp(sig, "com/ibm/jit/DecimalFormatHelper", 31) == 0) ||
         (sigLength > 21 && strncmp(sig, "com/ibm/jit/JITHelpers", 22) == 0))
         {
         returnValue = getSystemClassFromClassName(sig, sigLength);
         }

Excepting the special case above I would like to see and understand some examples where:
found_j9class->classLoader != j9method->cp->ramClass->classLoader

I agree that our current caching scheme does not take into account the visibility aspect that Irwin mentioned, i.e., having cached the result of a query {j9method1, name} as {classLoader, name}, for another query involving {j9method2, name} we may find something in our cache, but the true result should have been NULL.

@mpirvu
Copy link
Contributor

mpirvu commented Aug 18, 2020

it is always safe to cache using {J9_CLASS_FROM_CP(constantPool), signature}

We could try that and measure the increase in the number of messages.

@mpirvu
Copy link
Contributor

mpirvu commented Aug 18, 2020

I propose to modify the VM_getClassFromSignature message, to also return the class loader and whether the class loaders match. If the loaders match, then the resulting class can be safely cached, otherwise do not cache.

I am ok with this, but with the following amendment: if the loaders do not match, store an entry in the JITServer j9class cache indicating the classloader that needs to be used when accessing the {classloader,name} hashtable.
Also, to check for correctness I would create a mode where we always check the client and compare the responses.

@dmitry-ten
Copy link
Contributor Author

I've actually realized something:
if most or all cases where the class loaders do not match are due to the actual class loader being the system class loader (and it seems to be the case), then we don't even need to clear them in processUnloadedClasses since they can't be unloaded before JVM termination. The segfaults seem to be actually coming from the signatures not being handled correctly at class unloading.

So we could cache the classes where class loaders match and cache the classes where they don't match but the actual class loader is a system class loader. Then at retrieval we would just need to check 2 cases: one with class loader of the referencing class, and one with the system class loader.

@mpirvu
Copy link
Contributor

mpirvu commented Aug 18, 2020

IIRC we treat class redefinition as class unloading, meaning that we delete from caches the classes that have been redefined. Classes from system class loaders cannot be unloaded, but they can be redefined.

@dmitry-ten
Copy link
Contributor Author

dmitry-ten commented Aug 18, 2020

Right. But if I understand your above solution correctly, the referencing class would cache the class loader of the class retrieved by signature. Wouldn't it be possible that the same class references multiple classes by signature and thus we would need a map? Or am I misunderstanding the idea.

@mpirvu
Copy link
Contributor

mpirvu commented Aug 19, 2020

Yes, we would need a linked list which would be used only for class unload or redefinition situations.

@dmitry-ten
Copy link
Contributor Author

I modified #10408 to only deliver the fix to incorrect signatures, but not to mismatching class unloaders since the former is a more serious bug but has a simple fix.
Regarding this issue I determined that caching with the actual class loader and then doing 2 queries at retrieval: one with class loader of the referencing method and one with system class loader sometimes produces incorrect results. It may be possible that we cached a system class but it's not necessarily visible from the referencing class.
In light of that, I can't think of a better solution than doing what @mpirvu proposed previously and caching by referencing class loader and keeping a vector/linked list of classes that reference system class.

dmitry-ten added a commit to dmitry-ten/openj9 that referenced this issue Aug 26, 2020
When caching the result of `TR_J9ServerVM::getClassFromSignature`
we incorrectly assume that the class loader of the referencing constant pool
is the same as class loader of the resulting class.
The problem is that when we delete unloaded classes from the cache,
we use the actual class loader as key, instead of referencing one
which will result in entries for unloaded classes not being deleted properly.

This commit fixes it by keeping track of which class loaders reference
a class in `ClassInfo`. When processing unloaded classes, we find
correct entries for deletion by retrieving class loaders from
`ClassInfo`.

Closes: eclipse-openj9#10397

Signed-off-by: Dmitry Ten <Dmitry.Ten@ibm.com>
JIT as a Service automation moved this from To do to Done Aug 27, 2020
dusanboskovic pushed a commit to dusanboskovic/openj9 that referenced this issue Sep 9, 2020
When caching the result of `TR_J9ServerVM::getClassFromSignature`
we incorrectly assume that the class loader of the referencing constant pool
is the same as class loader of the resulting class.
The problem is that when we delete unloaded classes from the cache,
we use the actual class loader as key, instead of referencing one
which will result in entries for unloaded classes not being deleted properly.

This commit fixes it by keeping track of which class loaders reference
a class in `ClassInfo`. When processing unloaded classes, we find
correct entries for deletion by retrieving class loaders from
`ClassInfo`.

Closes: eclipse-openj9#10397

Signed-off-by: Dmitry Ten <Dmitry.Ten@ibm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:jitserver Artifacts related to JIT-as-a-Service project
Projects
Development

Successfully merging a pull request may close this issue.

3 participants