STS not on class path when using parallel stream #2123

Closed
BartXZX opened this issue Oct 28, 2020 · 9 comments
Labels
bug This issue is a bug. closing-soon This issue will close in 4 days unless further comments are made.

Comments

@BartXZX

BartXZX commented Oct 28, 2020

Running EcrClient.listImagesPaginator() in a Collection.parallelStream() using Spring Boot sometimes results in WebIdentityCredentialsUtils.factory() not finding STS on its class path.

Describe the issue

Sometimes when I execute the above I get the error message: "To use web identity tokens, the 'sts' service module must be on the class path." I know that parallel streams delegate work to the ForkJoin common pool, and it seems that some threads from this pool can have a context class loader that does not have STS on its path. (Maybe because of Spring? I noticed that the loader called "app" does not have STS.)

Is there a reason you guys are using Thread.currentThread().getContextClassLoader(), instead of Object.getClass().getClassLoader() in the WebIdentityCredentialsUtils class?

I admit I know little about class loaders and how you use them, but I cannot use parallelStream() reliably at the moment. Or do you advise against using parallelStream at all?

Not filing as bug, since it might be intended behaviour.
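
For readers less familiar with the two lookups compared in the question: a class's own loader is fixed when the class is defined, whereas the context class loader is a per-thread setting, so it can differ between the calling thread and pool workers. The following is a minimal sketch, not SDK code; `LoaderComparison` and its method names are made up for illustration. It records which loaders each lookup returns across the threads that run a parallel stream.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.Supplier;
import java.util.stream.IntStream;

// Hypothetical demo class; none of these names come from the AWS SDK.
final class LoaderComparison {

    // Runs `lookup` on every thread that takes part in a parallel stream and
    // returns the distinct loaders (compared by identity) that were observed.
    static Set<ClassLoader> loadersSeen(Supplier<ClassLoader> lookup, int tasks) {
        Set<ClassLoader> seen = Collections.synchronizedSet(
                Collections.newSetFromMap(new IdentityHashMap<ClassLoader, Boolean>()));
        IntStream.range(0, tasks).parallel().forEach(i -> seen.add(lookup.get()));
        return seen;
    }

    // Per-thread lookup: the answer depends on which thread runs the task.
    static Set<ClassLoader> contextLoadersSeen(int tasks) {
        return loadersSeen(() -> Thread.currentThread().getContextClassLoader(), tasks);
    }

    // Per-class lookup: the answer is the same on every thread.
    static Set<ClassLoader> definingLoadersSeen(int tasks) {
        return loadersSeen(LoaderComparison.class::getClassLoader, tasks);
    }

    public static void main(String[] args) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        // Mimic a framework that installs its own context loader on the caller.
        current.setContextClassLoader(new ClassLoader(original) { });
        try {
            System.out.println("context loaders seen:  " + contextLoadersSeen(64).size());
            System.out.println("defining loaders seen: " + definingLoadersSeen(64).size());
        } finally {
            current.setContextClassLoader(original);
        }
    }
}
```

On a multi-core machine the context-loader set may contain more than one loader (the custom one installed on the caller plus whatever the common-pool workers carry), while the defining-loader set always has exactly one entry. The exact numbers depend on the JDK version and on how many worker threads pick up work.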

Steps to Reproduce

I am listing docker images from multiple repositories in parallel.

@Override
public Stream<String> listImages(List<String> repositories) {
    return repositories.parallelStream()
        .flatMap(repoName -> ecr.listImagesPaginator(builder -> builder.registryId(this.registryId).repositoryName(repoName).filter(f -> f.tagStatus(TagStatus.TAGGED)))
            .imageIds()
            .stream()
            .map(ImageIdentifier::imageTag));
}
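
One possible way to keep a parallel stream like this while controlling the context class loader is to run it inside a dedicated pool whose workers have the loader set explicitly. This is only a sketch under stated assumptions: `ContextAwareParallel` and its methods are hypothetical names, and the approach relies on the widely used but not formally specified behavior that a parallel stream started from inside a `ForkJoinPool` task runs on that pool's workers rather than on the common pool.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinWorkerThread;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper; not part of the JDK or the AWS SDK.
final class ContextAwareParallel {

    // Builds a pool whose worker threads all carry the given context class loader.
    static ForkJoinPool poolWithContextLoader(int parallelism, ClassLoader loader) {
        ForkJoinPool.ForkJoinWorkerThreadFactory factory = pool -> {
            ForkJoinWorkerThread worker =
                    ForkJoinPool.defaultForkJoinWorkerThreadFactory.newThread(pool);
            worker.setContextClassLoader(loader);
            return worker;
        };
        return new ForkJoinPool(parallelism, factory, null, false);
    }

    // Maps the items in parallel on a dedicated pool and shuts the pool down afterwards.
    static <T, R> List<R> mapInParallel(List<T> items, Function<T, R> fn,
                                        ClassLoader loader, int parallelism) {
        ForkJoinPool pool = poolWithContextLoader(parallelism, loader);
        try {
            Callable<List<R>> task =
                    () -> items.parallelStream().map(fn).collect(Collectors.toList());
            return pool.submit(task).join();
        } finally {
            pool.shutdown();
        }
    }
}
```

A caller would typically pass the loader that is known to see the required modules, for example the context loader of the thread that starts the work. Creating a pool per call is wasteful, so a long-lived application would more likely keep one such pool around.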

Current Behavior

I get the message "To use web identity tokens, the 'sts' service module must be on the class path." instead of it finding STS and using the token credentials.

Your Environment

  • AWS Java SDK version used: 2.14.18
  • JDK version used: 13
  • Operating System and version: Using the openjdk:13 image, which uses Oracle Linux 7 I believe.
BartXZX added the guidance and needs-triage labels Oct 28, 2020
BartXZX changed the title from "STS not on class path when using Stream.parallelStream()" to "STS not on class path when using parallel Stream" Oct 28, 2020
BartXZX changed the title from "STS not on class path when using parallel Stream" to "STS not on class path when using parallel stream" Oct 28, 2020
zoewangg added the bug label and removed the guidance and needs-triage labels Oct 29, 2020
@zoewangg
Contributor

Thank you for reporting the issue! I think we should use the class's own class loader instead of the thread context class loader. We already have the ClassLoaderHelper#classLoader helper for this. Marking this as a bug.
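
The idea of preferring a class's own loader over the thread context loader can be sketched roughly as follows. This is an illustration only and is not the source of the SDK's ClassLoaderHelper; `FallbackClassLoading` and its `loadClass` signature are hypothetical. It tries the loader of a known "anchor" class first and only then falls back to the context and system loaders.

```java
// Hypothetical illustration; not the AWS SDK's actual ClassLoaderHelper.
final class FallbackClassLoading {

    // Tries several loaders in order and returns the first successful result.
    static Class<?> loadClass(String className, Class<?> anchor) throws ClassNotFoundException {
        ClassLoader[] candidates = {
            anchor.getClassLoader(),                          // loader that defined the anchor class
            Thread.currentThread().getContextClassLoader(),   // per-thread loader, may vary by thread
            ClassLoader.getSystemClassLoader()                // application class path
        };
        ClassNotFoundException last = null;
        for (ClassLoader loader : candidates) {
            try {
                // A null loader makes Class.forName use the bootstrap loader.
                return Class.forName(className, false, loader);
            } catch (ClassNotFoundException e) {
                last = e;
            }
        }
        throw last;
    }
}
```

Because the anchor's loader is tried first, the result no longer depends on which thread happens to run the lookup, as long as the anchor class and the target class are visible to the same loader.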

@joviegas
Contributor

joviegas commented Nov 6, 2020

Hi @BartXZX
Could you please confirm whether you have added the "sts" module as a dependency, like this:

        <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>sts</artifactId>
            <version>2.15.20</version>
        </dependency>

Alternatively, it would be a great help if you could list all of your dependencies related to software.amazon.awssdk.

@BartXZX
Author

BartXZX commented Nov 6, 2020

Hi @joviegas, we use gradle. I have the following dependencies.

implementation platform('software.amazon.awssdk:bom:2.14.23')
implementation 'software.amazon.awssdk:s3'
implementation 'software.amazon.awssdk:ecr'
implementation 'software.amazon.awssdk:sts'
implementation 'software.amazon.awssdk:eks'
implementation 'software.amazon.awssdk:dynamodb'
implementation 'software.amazon.awssdk:rds'

Also, I removed all the parallel streams that used the SDK from the codebase and the issue goes away.
I did some debugging and have stack traces from both cases. What stands out to me is that STS could not be found when the stream delegated some of its work to another thread, a ForkJoinWorkerThread.

STS was not found:

2020-10-28 13:11:30.774 ERROR 1 --- [onPool-worker-3] c.p.c.p.o.util.ClassloaderUtils          : STS not found, classloader: app

java.lang.ClassNotFoundException: software.amazon.awssdk.services.sts.internal.StsWebIdentityCredentialsProviderFactory
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:602) ~[na:na]
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178) ~[na:na]
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521) ~[na:na]
	at java.base/java.lang.Class.forName0(Native Method) ~[na:na]
	at java.base/java.lang.Class.forName(Class.java:416) ~[na:na]
	at com.planonsoftware.cloud.pco.orchestrator.util.ClassloaderUtils.printIsStsOnClasspath(ClassloaderUtils.java:13) ~[classes!/:na]
	at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.registries.EcrRegistry.listImages(EcrRegistry.java:81) ~[classes!/:na]
	at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.DockerImageStoreImpl.lambda$listImages$1(DockerImageStoreImpl.java:35) ~[classes!/:na]
	at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:271) ~[na:na]
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1621) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[na:na]
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[na:na]
	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952) ~[na:na]
	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926) ~[na:na]
	at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327) ~[na:na]
	at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746) ~[na:na]
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290) ~[na:na]
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1016) ~[na:na]
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1665) ~[na:na]
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1598) ~[na:na]
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177) ~[na:na]

And right below that in our logs, STS was found:

2020-10-28 13:11:30.779  INFO 1 --- [   scheduling-1] c.p.c.p.o.util.ClassloaderUtils          : STS found, classloader: null
java.lang.Exception: Stack trace
	at java.base/java.lang.Thread.dumpStack(Thread.java:1379)
	at com.planonsoftware.cloud.pco.orchestrator.util.ClassloaderUtils.printIsStsOnClasspath(ClassloaderUtils.java:16)
	at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.registries.EcrRegistry.listImages(EcrRegistry.java:81)
	at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.DockerImageStoreImpl.lambda$listImages$1(DockerImageStoreImpl.java:35)
	at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:271)
	at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1621)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952)
	at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926)
	at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
	at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinTask.doInvoke(ForkJoinTask.java:408)
	at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:736)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:919)
	at java.base/java.util.stream.DistinctOps$1.reduce(DistinctOps.java:64)
	at java.base/java.util.stream.DistinctOps$1.opEvaluateParallelLazy(DistinctOps.java:110)
	at java.base/java.util.stream.AbstractPipeline.sourceSpliterator(AbstractPipeline.java:434)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578)
	at com.planonsoftware.cloud.pco.orchestrator.services.software.docker.DockerImageStoreImpl.listImages(DockerImageStoreImpl.java:37)
	at com.planonsoftware.cloud.pco.orchestrator.services.application.ApplicationServiceImpl.refresh(ApplicationServiceImpl.java:115)
	at com.planonsoftware.cloud.pco.orchestrator.services.application.ApplicationServiceImpl.refreshCache(ApplicationServiceImpl.java:85)
	at com.planonsoftware.cloud.pco.orchestrator.scheduled.ScheduledTasks.refreshApplicationList(ScheduledTasks.java:22)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:830)
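
A diagnostic along the lines of the two log entries above could look like the sketch below. The original ClassloaderUtils is not shown in this thread, so `ClasspathProbe` and its methods are hypothetical stand-ins, and the message format is only an approximation. It assumes Java 9 or later for `ClassLoader.getName()`, which returns "app" for the JDK's built-in application loader and null for loaders created without a name; that may be why the second entry prints "classloader: null", although this is an assumption.

```java
// Hypothetical stand-in for the ClassloaderUtils helper seen in the stack traces.
final class ClasspathProbe {

    // Returns true if the given loader can resolve the class, without initializing it.
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Describes whether the current thread's context class loader can see the class.
    static String describe(String className) {
        Thread thread = Thread.currentThread();
        ClassLoader loader = thread.getContextClassLoader();
        // getName() is null for unnamed loaders, which prints as "null".
        String loaderName = loader == null ? "bootstrap" : String.valueOf(loader.getName());
        String status = isVisible(className, loader) ? "found" : "not found";
        return String.format("[%s] %s %s, classloader: %s",
                thread.getName(), className, status, loaderName);
    }
}
```

Calling `ClasspathProbe.describe(...)` with the factory class name from the trace, both on the calling thread and inside the parallel stream, would show whether the two threads disagree about visibility.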

@joviegas
Contributor

joviegas commented Nov 11, 2020

Hi @BartXZX ,
Thanks for the above data. Appreciate your quick responses.
We have fixed the issue with PR 2139.
Could you please try to reproduce the issue with release 2.15.26 and help me validate whether the above PR fixed it?

@BartXZX
Author

BartXZX commented Nov 11, 2020

Is there a preview build somewhere for 26, or should I just wait for it to be released?

@BartXZX
Author

BartXZX commented Nov 11, 2020

Never mind, I see it's already in 25 :-)
And I'm happy to say I no longer see the issue!

Of course, it was an issue we saw 'sometimes' but usually we had a 50/50 chance to see it on startup.
I've tested many times now and have not seen it since, so I'm happy.
Will let you guys know if I see it pop up again.

Thank you for the quick response!

@debora-ito
Member

Marking this to auto close soon, feel free to reach out if the issue persists after the fix.

debora-ito added the closing-soon label Nov 12, 2020
@casperbiering

I've also hit the bug, and can confirm that the fix is working.

@joviegas
Contributor

Thank you so much @casperbiering and @BartXZX .
Closing the issue.

aws-sdk-java-automation added a commit that referenced this issue Aug 16, 2022
…02a67e4b7

Pull request: release <- staging/e474196e-4969-4cda-932d-39a02a67e4b7