Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is it possible to remove "modifyThreadGroup" checking from SecureSM? #1649

Closed
ylwu-amzn opened this issue Dec 2, 2021 · 20 comments
Closed
Labels
enhancement Enhancement or improvement to existing feature or request untriaged

Comments

@ylwu-amzn
Copy link
Contributor

ylwu-amzn commented Dec 2, 2021

Is your feature request related to a problem? Please describe.
We are developing a new ML plugin with Tribuo library. But found the KMeans algorithm can't run due to this exception.

java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "modifyThreadGroup")
        at java.security.AccessControlContext.checkPermission(AccessControlContext.java:472) ~[?:?]
        at java.security.AccessController.checkPermission(AccessController.java:1036) ~[?:?]
        at java.lang.SecurityManager.checkPermission(SecurityManager.java:408) ~[?:?]
        at org.opensearch.secure_sm.SecureSM.checkThreadGroupAccess(SecureSM.java:205) ~[opensearch-secure-sm-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at org.opensearch.secure_sm.SecureSM.checkAccess(SecureSM.java:160) ~[opensearch-secure-sm-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
        at java.lang.ThreadGroup.checkAccess(ThreadGroup.java:311) ~[?:?]
        at java.lang.Thread.<init>(Thread.java:421) ~[?:?]
        at java.lang.Thread.<init>(Thread.java:707) ~[?:?]
        at java.lang.Thread.<init>(Thread.java:540) ~[?:?]
        at java.util.concurrent.ForkJoinWorkerThread.<init>(ForkJoinWorkerThread.java:100) ~[?:?]
        at java.util.concurrent.ForkJoinPool$DefaultForkJoinWorkerThreadFactory$1.run(ForkJoinPool.java:720) ~[?:?]
        at java.util.concurrent.ForkJoinPool$DefaultForkJoinWorkerThreadFactory$1.run(ForkJoinPool.java:717) ~[?:?]
        at java.security.AccessController.doPrivileged(AccessController.java:391) ~[?:?]
        at java.util.concurrent.ForkJoinPool$DefaultForkJoinWorkerThreadFactory.newThread(ForkJoinPool.java:716) ~[?:?]
        at java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1329) ~[?:?]
        at java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1353) ~[?:?]
        at java.util.concurrent.ForkJoinPool.signalWork(ForkJoinPool.java:1478) ~[?:?]
        at java.util.concurrent.ForkJoinPool.externalPush(ForkJoinPool.java:1912) ~[?:?]
        at java.util.concurrent.ForkJoinPool.externalSubmit(ForkJoinPool.java:1930) ~[?:?]
        at java.util.concurrent.ForkJoinPool.submit(ForkJoinPool.java:2506) ~[?:?]
        at org.tribuo.clustering.kmeans.KMeansTrainer.train(KMeansTrainer.java:189) ~[tribuo-clustering-kmeans-4.0.2.jar:?]
        at org.tribuo.clustering.kmeans.KMeansTrainer.train(KMeansTrainer.java:253) ~[tribuo-clustering-kmeans-4.0.2.jar:?]

Have tried adding "modifyThreadGroup" permission in plugin security policy file, doesn't work.

Copy description from oracle/tribuo#158,

From Java 11, the security policies doesn’t get propagated to ForkJoinPool worker threads if a SecurityManager is present ( https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/ForkJoinPool.html )
Kmeans depends on ForkJoinPool to run the training tasks: https://github.com/oracle/tribuo/blob/v4.1.0/Clustering/KMeans/src/main/java/org/tribuo/clustering/kmeans/KMeansTrainer.java#L197
So, any java application relay on SecurityManager and security policies won't able to work with Tribuo Kmeans because of AccessControlException.

Describe the solution you'd like
Remove this "modifyThreadGroup" checking from SecureSM https://github.com/opensearch-project/OpenSearch/blob/main/libs/secure-sm/src/main/java/org/opensearch/secure_sm/SecureSM.java#L62. The default java security manager doesn't check this "modifyThreadGroup" permission, so Tribuo Kmeans can run successfully with default java security manager.

Describe alternatives you've considered

  1. Ask Tribuo team to replace ForkJoinPool with different pool, but from Tribuo team (check Tribuo KmeansTrainer doesn't work with Java 11 (or any later version) + SecurityManager because of using ForkJoinPool oracle/tribuo#158):

rewriting it to use a different threading mechanism for multi-threaded execution would be a breaking change and take some time

  1. Look for another ML library which supports KMeans. But this takes time and not easy to maintain as we need to integrate multiple ML library.
@ylwu-amzn ylwu-amzn added enhancement Enhancement or improvement to existing feature or request untriaged labels Dec 2, 2021
@ylwu-amzn ylwu-amzn changed the title Is it possible to remove "modifyThreadGroup" checking in SecureSM? Is it possible to remove "modifyThreadGroup" checking from SecureSM? Dec 2, 2021
@reta
Copy link
Collaborator

reta commented Dec 3, 2021

@ylwu-amzn I afraid that just dropping some checks from the SecureSM could solve your immediate issue with modifyThreadGroup but it is very likely may fail other security checks later on. Probably more robust solutions to explore could be:

  1. Supply own ForkJoinWorkerThreadFactory (using java.util.concurrent.ForkJoinPool.common.threadFactory system property) to use SecurityManager's ThreadGroup
  2. Do not use the ForkJoinPool.commonPool() but create own ForkJoinPool instance for KMeansTrainer (since ForkJoinPool.commonPool() could be used literally from anywhere). It would allow to properly set the ForkJoinWorkerThreadFactory for this specific ForkJoinPool instance only.

What do you think?

@Craigacp
Copy link

Craigacp commented Dec 5, 2021

Tribuo doesn't use the common ForkJoinPool, we use a separate instance, but we also don't expose the construction of that FJP. What would we need to do to make it work with SecureSM?

@reta
Copy link
Collaborator

reta commented Dec 5, 2021

Tribuo doesn't use the common ForkJoinPool, we use a separate instance, but we also don't expose the construction of that FJP.

@Craigacp would you consider the option to keep the construction sealed but make it aware of the presence of the SecurityManager (and use customized ForkJoinWorkerThreadFactory)? I will be more than happy to submit the pull request.

@Craigacp
Copy link

Craigacp commented Dec 6, 2021

I would like it to work, but I don't want to take a dependency on SecureSM. Preferably I'd like Tribuo not to know about any SecurityManager at all given they are deprecated for removal in Java 17, but I guess that's unavoidable if we want to fix this. According to https://github.com/openjdk/jdk/blob/jdk-17-ga/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L812 the DefaultForkJoinWorkerThreadFactory doesn't make innocuous threads, so I'm not sure why they don't get the right permissions. What's the nature of the fix? Do we need to make our own subclass of ForkJoinWorkerThreadFactory?

@reta
Copy link
Collaborator

reta commented Dec 6, 2021

I believe in case of Tribuo, not the DefaultForkJoinWorkerThreadFactory but DefaultCommonPoolForkJoinWorkerThreadFactory is used, since this constructor is being called https://github.com/openjdk/jdk/blob/jdk-17-ga/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L2595

Do we need to make our own subclass of ForkJoinWorkerThreadFactory?

Correct, that is an idea.

@Craigacp
Copy link

Craigacp commented Dec 6, 2021

No, we use this constructor https://github.com/openjdk/jdk/blob/jdk-17-ga/src/java.base/share/classes/java/util/concurrent/ForkJoinPool.java#L2397, here https://github.com/oracle/tribuo/blob/main/Clustering/KMeans/src/main/java/org/tribuo/clustering/kmeans/KMeansTrainer.java#L205. So we pull in DefaultForkJoinWorkerThreadFactory rather than DefaultCommonPoolForkJoinWorkerThreadFactory. We don't use the common FJP anywhere because it could cause conflicts for our users. The stack trace in the Tribuo issue shows we're using the DefaultForkJoinWorkerThreadFactory - oracle/tribuo#158 (comment).

We do use the trick where we execute a parallel stream inside the FJP via a lambda, but that doesn't use the common pool for anything as I understand it. We have a little hook which ensures that the stream sizes the work using the FJP it's executing in rather than the common FJP, but I think that hook is only necessary in Java 8 and 9 (https://bugs.openjdk.java.net/browse/JDK-8190974), and it might be fixed in newer OpenJDK 8u versions. Our workaround for that is here - https://github.com/oracle/olcut/blob/develop/olcut-core/src/main/java/com/oracle/labs/mlrg/olcut/util/StreamUtil.java#L247.

@reta
Copy link
Collaborator

reta commented Dec 6, 2021

Oh, you are right, the DefaultForkJoinWorkerThreadFactory is used, not the DefaultCommonPoolForkJoinWorkerThreadFactory, I believe that DefaultCommonPoolForkJoinWorkerThreadFactory would actually address the issue right away since the context it creates grants the modifyThreadGroup permission. I will try to reproduce the issue and update you shortly. In any case, providing own ForkJoinWorkerThreadFactory implementation should help.

@reta
Copy link
Collaborator

reta commented Dec 6, 2021

@Craigacp I have now clear understanding why this exception happens and why permission checks referred in oracle/tribuo#158 did not help. One of the solutions which seems to work:

    private static final class CustomForkJoinWorkerThreadFactory implements ForkJoinWorkerThreadFactory {
        public final ForkJoinWorkerThread newThread(ForkJoinPool pool) {
            return AccessController.doPrivileged(
                new PrivilegedAction<ForkJoinWorkerThread>() {
                    public ForkJoinWorkerThread run() {
                        return new ForkJoinWorkerThread(pool) {}; 
                    }
               });
        }
    }

And than we could use a different constructor to supply own ForkJoinWorkerThreadFactory, which is essentially needed only for JDK11+ with the presence of the SecurityManager.

final ForkJoinPool pool = new ForkJoinPool(parallelism, new CustomForkJoinWorkerThreadFactory(), null, false);

I am aware that you prefer not to dial with SecurityManager in Tribuo and I think it is a very valid point. In these regard, the option which would keep Tribuo "clean" but still allowing the non-standard environments would be to provide extensibility / configuration hook to customize ForkJoinPool instantiation (with the current approach being the default one), what do you think about this path?

@Craigacp
Copy link

Craigacp commented Dec 6, 2021

So the issue is that the FJP pool threads are created with only the "getClassLoader" and "setContextClassLoader" permissions in DefaultForkJoinWorkerThreadFactory, and the solution is to run the doPrivileged in the access control context of the Tribuo codebase, rather than the narrower one used by default?

Is there any downside to using that thread factory at all times?

@reta
Copy link
Collaborator

reta commented Dec 6, 2021

So the issue is that the FJP pool threads are created with only the "getClassLoader" and "setContextClassLoader" permissions in DefaultForkJoinWorkerThreadFactory, and the solution is to run the doPrivileged in the access control context of the Tribuo codebase, rather than the narrower one used by default?

Correct, with the permission granted to the Tribuo codebase.

Is there any downside to using that thread factory at all times?

There are slight differences in the context class loader configuration (comparing to default one) which may cause issues in certain deployments. Keeping it as opt-in option would help to fallback to the default FRJ factory.

@Craigacp
Copy link

Craigacp commented Dec 6, 2021

Hmm. I'm trying out the custom factory and it's not working for me. I get an identical error message to before. I believe my policy file is correct as when I comment out other things in the permissions I'm assigning to the Tribuo clustering jar then I get different error messages from earlier in the code (e.g. creating the provenance by reflectively accessing the host object).

    private static final class CustomForkJoinWorkerThreadFactory implements ForkJoinPool.ForkJoinWorkerThreadFactory {
        public final ForkJoinWorkerThread newThread(ForkJoinPool pool) {
            return AccessController.doPrivileged(new PrivilegedAction<ForkJoinWorkerThread>() {
                public ForkJoinWorkerThread run() {
                    return new ForkJoinWorkerThread(pool) {};
                }
            });
        }
    }

    private static final CustomForkJoinWorkerThreadFactory THREAD_FACTORY = new CustomForkJoinWorkerThreadFactory();

//inside train
ForkJoinPool fjp = parallel ? new ForkJoinPool(numThreads,THREAD_FACTORY,null,false) : null;

Any idea what's wrong?

Command:

$ java -cp .:Clustering/KMeans/target/tribuo-clustering-kmeans-4.2.0-SNAPSHOT-jar-with-dependencies.jar:./opensearch-secure-sm-1.2.0.jar -Djava.security.policy=tribuo.policy -Djava.security.debug=1 TestKMeans

Policy:

grant codeBase "file:/path/to/Tribuo/" {
        permission java.lang.RuntimePermission "modifyThread";
        permission java.lang.RuntimePermission "modifyThreadGroup";
};

grant codeBase "file:/path/to/Tribuo/Clustering/KMeans/target/tribuo-clustering-kmeans-4.2.0-SNAPSHOT-jar-with-dependencies.jar" {
        permission java.lang.RuntimePermission "modifyThread";
        permission java.lang.RuntimePermission "modifyThreadGroup";
        permission java.lang.RuntimePermission "accessDeclaredMembers";
        permission java.lang.reflect.ReflectPermission "suppressAccessChecks";
        permission java.util.logging.LoggingPermission "control";
        permission java.io.FilePermission "<<ALL FILES>>", "read";
        permission java.util.PropertyPermission "*", "read,write";
};

Stack trace:

access: caller thread=Thread[main,5,main]
access: caller group=java.lang.ThreadGroup[name=main,maxpri=10]
access: target group=java.lang.ThreadGroup[name=main,maxpri=10]
Exception in thread "main" java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "modifyThreadGroup")
	at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:472)
	at java.base/java.security.AccessController.checkPermission(AccessController.java:897)
	at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:322)
	at org.opensearch.secure_sm.SecureSM.checkThreadGroupAccess(SecureSM.java:204)
	at org.opensearch.secure_sm.SecureSM.checkAccess(SecureSM.java:159)
	at java.base/java.lang.ThreadGroup.checkAccess(ThreadGroup.java:313)
	at java.base/java.lang.Thread.<init>(Thread.java:423)
	at java.base/java.lang.Thread.<init>(Thread.java:709)
	at java.base/java.lang.Thread.<init>(Thread.java:542)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.<init>(ForkJoinWorkerThread.java:94)
	at org.tribuo.clustering.kmeans.KMeansTrainer$CustomForkJoinWorkerThreadFactory$1$1.<init>(KMeansTrainer.java:528)
	at org.tribuo.clustering.kmeans.KMeansTrainer$CustomForkJoinWorkerThreadFactory$1.run(KMeansTrainer.java:528)
	at org.tribuo.clustering.kmeans.KMeansTrainer$CustomForkJoinWorkerThreadFactory$1.run(KMeansTrainer.java:526)
	at java.base/java.security.AccessController.doPrivileged(Native Method)
	at org.tribuo.clustering.kmeans.KMeansTrainer$CustomForkJoinWorkerThreadFactory.newThread(KMeansTrainer.java:526)
	at java.base/java.util.concurrent.ForkJoinPool.createWorker(ForkJoinPool.java:1328)
	at java.base/java.util.concurrent.ForkJoinPool.tryAddWorker(ForkJoinPool.java:1352)
	at java.base/java.util.concurrent.ForkJoinPool.signalWork(ForkJoinPool.java:1476)
	at java.base/java.util.concurrent.ForkJoinPool.externalPush(ForkJoinPool.java:1903)
	at java.base/java.util.concurrent.ForkJoinPool.externalSubmit(ForkJoinPool.java:1921)
	at java.base/java.util.concurrent.ForkJoinPool.submit(ForkJoinPool.java:2497)
	at org.tribuo.clustering.kmeans.KMeansTrainer.train(KMeansTrainer.java:285)
	at org.tribuo.clustering.kmeans.KMeansTrainer.train(KMeansTrainer.java:196)
	at org.tribuo.clustering.kmeans.KMeansTrainer.train(KMeansTrainer.java:319)
	at TestKMeans.main(TestKMeans.java:47)
access: caller thread=Thread[DestroyJavaVM,5,]
access: caller group=null
access: target group=java.lang.ThreadGroup[name=main,maxpri=10]

@reta
Copy link
Collaborator

reta commented Dec 6, 2021

@Craigacp hm, I suspect the policy is not precise, I did that (since TestKMeans in test classes), and exception is gone:

grant codeBase "file:/path/to/tribuo/Clustering/KMeans/target/test-classes" {
        permission java.lang.RuntimePermission "modifyThread";
        permission java.lang.RuntimePermission "modifyThreadGroup";
};

Would you mind to share a brach (or something) so I could replicate the same setup? Thank you.

@Craigacp
Copy link

Craigacp commented Dec 6, 2021

It's here - https://github.com/Craigacp/tribuo/tree/kmeans-securesm. You can run it like this:

java -cp Clustering/KMeans/target/test-classes:Clustering/KMeans/target/tribuo-clustering-kmeans-4.2.0-SNAPSHOT-jar-with-dependencies.jar:opensearch-secure-sm-1.2.0.jar -Djava.security.policy=tribuo.policy -Djava.security.debug=1 org.tribuo.clustering.kmeans.TestKMeansSM

after compiling with mvn clean package, downloading opensearch-secure-sm-1.2.0.jar into the Tribuo folder, and then fixing the paths in tribuo.policy to be wherever you check out Tribuo.

@reta
Copy link
Collaborator

reta commented Dec 6, 2021

@Craigacp ah nailed it, it comes from SecureSM protection domain, needs that:

grant codeBase "file:/path/to/opensearch-secure-sm-1.2.0.jar" {
  permission java.security.AllPermission;
};

This is basically the OpenSearch server security policy.

@Craigacp
Copy link

Craigacp commented Dec 6, 2021

Ah, ok, thanks for figuring that out. It's surprisingly difficult to find the OpenSearch security policy file on the internet. Is there a canonical location for it?

WRT the custom FJP factory approach, I think I'd prefer that we have a custom thread factory inside Tribuo rather than exposing a constructor which accepts a FJP or a ForkJoinWorkerThreadFactory. If we accepted the factory then we couldn't provenance it, and we try to track everything that could change the execution of a Tribuo model. I'll work on a patch for Tribuo for the places where we use FJPs.

@reta
Copy link
Collaborator

reta commented Dec 6, 2021

Ah, ok, thanks for figuring that out. It's surprisingly difficult to find the OpenSearch security policy file on the internet. Is there a canonical location for it?

https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/resources/org/opensearch/bootstrap/security.policy

WRT the custom FJP factory approach, I think I'd prefer that we have a custom thread factory inside Tribuo rather than exposing a constructor which accepts a FJP or a ForkJoinWorkerThreadFactory. If we accepted the factory then we couldn't provenance it, and we try to track everything that could change the execution of a Tribuo model. I'll work on a patch for Tribuo for the places where we use FJPs.

No objections, the only advice would be to have a check if the SecurityManager is present or not (since you already depend on it through AccessController):

        ForkJoinPool fjp = System.getSecurityManager() == null ?  new ForkJoinPool(numThreads) : new ForkJoinPool(numThreads, new CustomForkJoinWorkerThreadFactory(), null, false);

thank you!

@Craigacp
Copy link

Craigacp commented Dec 6, 2021

Ok, I'll add that check in too.

@ylwu-amzn
Copy link
Contributor Author

ylwu-amzn commented Dec 6, 2021

Thanks for the quick fix @reta and @Craigacp ! I planned to try custom thread factory but found Tribuo doesn't expose interface to specify thread factory. That will be perfect as @Craigacp adding the custom thread factory in Tribuo. That will unblock our integration for KMeans.

@Craigacp , when/how do you plan to release this patch? A patch version 4.1.1 and included in 4.2.0?

And I think KNN also use FJP, we may integrate KNN later too. Not so urgent, but will be perfect if you can also fix KNN part. Thanks!

@ylwu-amzn
Copy link
Contributor Author

ylwu-amzn commented Dec 6, 2021

Oh, never mind, I see your PR already fixed KNN part. Perfect! Thanks @Craigacp

@ylwu-amzn
Copy link
Contributor Author

As Tribuo team fixing by adding custom thread factory. We don't need to change OpenSearch SecureSM now. Close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement or improvement to existing feature or request untriaged
Projects
None yet
Development

No branches or pull requests

3 participants