Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Painless - ASM/org.objectweb errors - 2.0.0 JDK17 #2664

Closed
dbbaughe opened this issue Mar 30, 2022 · 31 comments
Closed

[BUG] Painless - ASM/org.objectweb errors - 2.0.0 JDK17 #2664

dbbaughe opened this issue Mar 30, 2022 · 31 comments
Assignees
Labels
bug Something isn't working Severity-Blocker v2.0.0 Version 2.0.0

Comments

@dbbaughe
Copy link

dbbaughe commented Mar 30, 2022

Describe the bug
I'm getting some issues w/ running tests in Index Management for 2.0.
At a certain point while running integration tests the entire cluster crashes, so I tried tracking it down to see if it's a specific test.. but it seems to be flaky and sometimes crashes and sometimes doesn't.

An example of an integ test suite run and failing and running again and passing:
https://gist.github.com/dbbaughe/898d02beeb593fcf66ef97b2546a713b

It appears to be related to the org.objectweb.asm/org.ow2.asm dependencies, and I can see we recently committed some dependency upgrades for them that were also backported to the 2.0 branch.

lang-painless dependencies

From googling around..
apache/camel-quarkus#1182
quarkusio/quarkus#9832
CodeIntelligenceTesting/jazzer#55
https://youtrack.jetbrains.com/issue/IDEA-245330

I was wondering if perhaps IM was pulling in the asm dependency somewhere too and it was causing issues.. or I can see that the current lang-painless module still has an asm dependency of 7.2 for the analysis one and 9.2 for everything else which might of been causing it.

But, I tried bumping that 7.2 to 9.2 and I still see the failures when running the Index Management tests.
Also tried downgrading 9.2 to 7.2 and still see the failures when running the Index Management tests.
And then I tried just running the lang-painless tests to see and I'm getting failures (so unrelated to Index Management) by using ./gradlew :modules:lang-painless:check
https://gist.github.com/dbbaughe/a0eb025c7d9c950ea05161a58394b585

It's failing for me on JDK 17.
Passing for me on JDK 14.

Had a coworker try it out (just testing OpenSearch 2.0 w/ the lang-painless check) and it's passing for them on JDK 14 and failing on JDK 17 too.

@dbbaughe dbbaughe added bug Something isn't working untriaged labels Mar 30, 2022
@qreshi
Copy link
Contributor

qreshi commented Mar 30, 2022

Can confirm, I'm the coworker

@dbbaughe dbbaughe changed the title [BUG] Painless - ASM/org.objectweb errors - 2.0.0 [BUG] Painless - ASM/org.objectweb errors - 2.0.0 JDK17 Mar 30, 2022
@dblock dblock added the v2.0.0 Version 2.0.0 label Mar 30, 2022
@nknize
Copy link
Collaborator

nknize commented Mar 30, 2022

I agree that we should have consistent API versions. Looks like dependabot is not that.. dependa...ble.

However, I'm unable to reproduce this in the 2.0 branch w/ JDK 17. Even after upgrading api 'org.ow2.asm:asm-analysis:9.2' locally.

> ./gradlew :modules:lang-painless:check

=======================================
OpenSearch Build Hamster says Hello!
  Gradle Version        : 7.3.3
  OS Info               : Linux 4.15.0-171-generic (amd64)
  JDK Version           : 17 (Oracle JDK)
  JAVA_HOME             : /usr/lib/jvm/java-17-oracle
  Random Testing Seed   : 904D2056BEC6B699
  In FIPS 140 mode      : false
=======================================
...
BUILD SUCCESSFUL in 3m 1s
204 actionable tasks: 147 executed, 57 up-to-date

It's odd our PR gradle checks wouldn't have caught this as well. Did you clear gradle cache? Kill any stale gradle daemons?

@dbbaughe
Copy link
Author

I agree that we should have consistent API versions. Looks like dependabot is not that.. dependa...ble.

However, I'm unable to reproduce this in the 2.0 branch w/ JDK 17. Even after upgrading api 'org.ow2.asm:asm-analysis:9.2' locally.

> ./gradlew :modules:lang-painless:check

=======================================
OpenSearch Build Hamster says Hello!
  Gradle Version        : 7.3.3
  OS Info               : Linux 4.15.0-171-generic (amd64)
  JDK Version           : 17 (Oracle JDK)
  JAVA_HOME             : /usr/lib/jvm/java-17-oracle
  Random Testing Seed   : 904D2056BEC6B699
  In FIPS 140 mode      : false
=======================================
...
BUILD SUCCESSFUL in 3m 1s
204 actionable tasks: 147 executed, 57 up-to-date

It's odd our PR gradle checks wouldn't have caught this as well. Did you clear gradle cache? Kill any stale gradle daemons?

That's interesting, it passed for you on JDK 17 with asm-analysis on 7.2 and on 9.2?

I had Mohammad pull it down on to his dev desktop to test it as a way to get a second data point before nuking my caches as that always causes me other issues getting things set up again sometimes.

I guess I can now try and see what happens now since we have another data point of it passing for you..

@nknize
Copy link
Collaborator

nknize commented Mar 30, 2022

with asm-analysis on 7.2 and on 9.2?

Yes, it passes w/ both. Although it feels uncomfortable running 7.2 when the rest are on 9.2. Let's find out if this is a 9.2 runtime bug w/ JDK 17 before making a decision to either rollback or upgrade

@dbbaughe
Copy link
Author

@nknize Any chance there's a difference between the 17 JDK from Oracle and 17 JDK from OpenJDK?
I can see I was using

JDK Version           : 17 (OpenJDK)
JAVA_HOME             : /local/home/dbbaughe/jdk-17.0.2

And you have

JDK Version           : 17 (Oracle JDK)
JAVA_HOME             : /usr/lib/jvm/java-17-oracle

Mohammad also tested with OpenJDK. In the meantime I am nuking caches etc. and rerunning the test on my end.

@dbbaughe
Copy link
Author

Did this..

./gradlew clean && ./gradlew --stop
rm -rf ~/.gradle
rm -rf ~/.m2/repository/org/opensearch

Reran the test and still getting the failure on my end.

@dblock
Copy link
Member

dblock commented Mar 30, 2022

@dbbaughe Is the issue in org.objectweb.asm either way? Should we open a bug upstream?

»  java.lang.IllegalArgumentException: Unsupported api 589824
»       at org.objectweb.asm.MethodVisitor.<init>(MethodVisitor.java:89)
»       at org.objectweb.asm.commons.LocalVariablesSorter.<init>(LocalVariablesSorter.java:102)
»       at org.objectweb.asm.commons.GeneratorAdapter.<init>(GeneratorAdapter.java:227)
»       at org.objectweb.asm.commons.GeneratorAdapter.<init>(GeneratorAdapter.java:204)
»       at org.objectweb.asm.commons.GeneratorAdapter.<init>(GeneratorAdapter.java:245)

@nknize
Copy link
Collaborator

nknize commented Mar 30, 2022

Any chance there's a difference between the 17 JDK from Oracle and 17 JDK from OpenJDK?

Sure it's a possibility. I'm not in a position to test that for a while (maybe @mch2 can investigate?). If that ends up being the case then it sounds like a JDK 17 issue since you're seeing this failure on both 7.2 and 9.2 but only when using 17?

@dbbaughe
Copy link
Author

@dblock
I'm not sure if the problem is in the dependency or not, it could just be in OpenSearch itself w/ some other dependency pulling in an older version of this dependency and some shading issue. Or it could be the dependency isn't working as expected on OpenJDK 17 so it is a problem upstream.

@mch2 Can you try out the steps using OpenJDK 17 and see if you can replicate?

@nknize Yes, seeing it for OpenJDK 17 as is, upgrading asm-analysis to 9.2, and downgrading all asm in painless to 7.2 (did not try downgrading all asm dependencies in OpenSearch to 7.2 though).

@nknize
Copy link
Collaborator

nknize commented Mar 30, 2022

did not try downgrading all asm dependencies in OpenSearch to 7.2 though

I'd give this a shot. I did it and the tests pass but that's not saying anything since I can't reproduce w/ Oracle JDK.

Mixed dependency versions is always a problem and the only changes since the last release here are the dependency versions and jdk upgrade. If you can downgrade the dependency to 7.2 across the board and it passes for you in jdk 17 then I'd be fine reverting the upgrade for 2.0 and investigating the problem further for a future 2.x release.

There's no urgency for us to upgrade.

@mch2
Copy link
Member

mch2 commented Mar 30, 2022

@mch2 Can you try out the steps using OpenJDK 17 and see if you can replicate?

Trying this with 2.0 as is, first run after 13m failed with a timeout but no further output, might be flaky. Trying again with -i.

@dbbaughe
Copy link
Author

dbbaughe commented Mar 30, 2022

Checked out the advisory in the thread @mch2 found.

https://alas.aws.amazon.com/announcements/2021-001.html

Tried disabling the hot patch service
sudo systemctl stop log4j-cve-2021-44228-hotpatch
Was able to get the tests to successfully pass.

I tried upgrading it instead

sudo yum update log4j-cve-2021-44228-hotpatch     
Loaded plugins: priorities
amzn2-amazon                                                                                                                                                                                                                                     
amzn2-core                                                                                                                                                                                                                                       
amzn2-kernel                                                                                                                                                                                                                                     
amzn2-mate                                                                                                                                                                                                                                       
cloud-dev-dsk                                                                                                                                                                                                                                    
dcv                                                                                                                                                                                                                                              
firefox                                                                                                                                                                                                                                          
863 packages excluded due to repository priority protections
No packages marked for update

But it still fails, worried this will be a ticking time bomb for anyone running on Amazon EC2 (or higher abstraction services) then with that fat jar for the Log4j patch being injected.

@mch2
Copy link
Member

mch2 commented Mar 30, 2022

But it still fails, worried this will be a ticking time bomb for anyone running on Amazon EC2 (or higher abstraction services) then with that fat jar for the Log4j patch being injected.

CI is still running jdk 14, that is why it hasn't failed there.

For context, issue - elastic/elasticsearch#81915 (comment).

@mch2
Copy link
Member

mch2 commented Mar 30, 2022

I'm able to repro the failure with AL2 + JDK17. I've tried downgrading ASM to 7.2 with no success. Looking at shadowing those dependencies.

@dbbaughe
Copy link
Author

But it still fails, worried this will be a ticking time bomb for anyone running on Amazon EC2 (or higher abstraction services) then with that fat jar for the Log4j patch being injected.

CI is still running jdk 14, that is why it hasn't failed there.

For context, issue - elastic/elasticsearch#81915 (comment).

This seems like a bit of a concern and probably just a miss - did we open an issue to make sure we are running CI on the versions that we require our plugins to run on? I believe there has been work across all plugins to make sure their CI is running on the supported JDK versions.. and probably should have the expectation that upstream core is stable on those versions too.

@dblock
Copy link
Member

dblock commented Mar 30, 2022

CI for 2.0 should not/is not being run on 14, where do we see that? if we're talking gradle check, that's being fixed now in #2671

@mch2
Copy link
Member

mch2 commented Mar 30, 2022

CI for 2.0 should not/is not being run on 14, where do we see that? if we're talking gradle check, that's being fixed now in #2671

I checked directly on the jenkins job that runs gradle checks. It is not pulling in any script under version control.

I believe what has changed is the bundle build CI, which I would expect is now failing for this reason.

@owaiskazi19
Copy link
Member

owaiskazi19 commented Mar 30, 2022

CI for 2.0 should not/is not being run on 14, where do we see that? if we're talking gradle check, that's being fixed now in #2671

I checked directly on the jenkins job that runs gradle checks. It is not pulling in any script under version control.

I believe what has changed is the bundle build CI, which I would expect is now failing for this reason.

@mch2 the jenkinsfile will be used by public jenkins later. The current CI is still using JDK-14 and has the old setup as of now. More context here

@mch2
Copy link
Member

mch2 commented Mar 30, 2022

CI for 2.0 should not/is not being run on 14, where do we see that? if we're talking gradle check, that's being fixed now in #2671

I checked directly on the jenkins job that runs gradle checks. It is not pulling in any script under version control.
I believe what has changed is the bundle build CI, which I would expect is now failing for this reason.

@mch2 the jenkinsfile will be used by public jenkins later. The current CI is still using JDK-14 and has the old setup as of now. More context here

I'm not sure its wise for us to have inconsistent versions between the build & check jobs. I'd suggest we either manually change the check job to also be 17 or revert to 11.

With that said, I am still working on resolving this issue.

@owaiskazi19
Copy link
Member

I'm not sure its wise for us to have inconsistent versions between the build & check jobs. I'd suggest we either manually change the check job to also be 17 or revert to 11.

Makes sense. @peterzhuamazon can we run gradle check on the current CI with JDK-17 once this issue gets fixed?

@penghuo
Copy link
Contributor

penghuo commented Mar 30, 2022

+1. Same issue when query with any painless script.
Env:

  • Amazon AL2 EC2
  • OpenJDK 17
  • OpenSearch 2.0 main localDistro build

Solution
sudo touch /etc/log4j-cve-2021-44228-hotpatch.kill

@nknize
Copy link
Collaborator

nknize commented Mar 31, 2022

Solution
sudo touch /etc/log4j-cve-2021-44228-hotpatch.kill

If that's the case, then this is the most straightforward solution I've ever seen 😆

@mch2
Copy link
Member

mch2 commented Mar 31, 2022

sudo touch /etc/log4j-cve-2021-44228-hotpatch.kill

This will disable the log4j patch on those hosts. As a workaround we could use shadow pr here. This works for me with jdk 17 on AL2.

@aksingh-es
Copy link

aksingh-es commented Mar 31, 2022

AD plugin integ test are also failing due to this issue as the nodes are not coming up healthy . sudo touch /etc/log4j-cve-2021-44228-hotpatch.kill did work.Adding OS node bootup logs below

[2022-03-31T19:30:45,793][INFO ][o.o.a.t.StopDetectorTransportAction] [integTest-0] models of detector 1Gl14X8BSIHcmFuBSdJK get deleted [2022-03-31T19:30:45,794][INFO ][o.o.a.r.h.IndexAnomalyDetectorJobActionHandler] [integTest-0] AD model deleted successfully for detector 1Gl14X8BSIHcmFuBSdJK [2022-03-31T19:30:45,822][INFO ][o.o.a.m.CheckpointDao ] [integTest-0] 1 checkpoints docs get deleted [2022-03-31T19:30:45,849][WARN ][stderr ] [integTest-0] ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.9.3 [2022-03-31T19:30:45,879][WARN ][stderr ] [integTest-0] ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.9.3 [2022-03-31T19:30:45,943][ERROR][o.o.b.OpenSearchUncaughtExceptionHandler] [integTest-0] fatal error in thread [opensearch[integTest-0][generic][T#3]], exiting java.lang.IllegalAccessError: failed to access class org.objectweb.asm.Edge from class org.objectweb.asm.MethodWriter (org.objectweb.asm.Edge is in unnamed module of loader 'app'; org.objectweb.asm.MethodWriter is in unnamed module of loader java.net.FactoryURLClassLoader @126be319) at org.objectweb.asm.MethodWriter.addSuccessorToCurrentBasicBlock(MethodWriter.java:1782) ~[Log4jHotPatchFat.jar:?]

@mch2
Copy link
Member

mch2 commented Mar 31, 2022

An update here.

The root problem is the same conflict of ASM classes with jdk17. I think this is related to jdk17 blocking illegal access to runtime internals with JEP 403. The hotpatch is bringing in its version of these classes that are getting loaded by the system classloader and conflicting with the versions included by the painless module.

I thought maybe there was a difference in the jar contents between what is distributed in the gem, so I compared it to a locally built jar from https://github.com/corretto/hotpatch-for-apache-log4j2. There does look like a difference in the namespace of these dependencies and I'd expect this is related.

This is the hotpatch contents. The namespace is still the same org/objectweb/...

[~]$ unzip /usr/share/log4j-cve-2021-44228-hotpatch/jdk17/Log4jHotPatchFat.jar -d hotpatchjdk17
Archive:  /usr/share/log4j-cve-2021-44228-hotpatch/jdk17/Log4jHotPatchFat.jar
  inflating: hotpatchjdk17/META-INF/MANIFEST.MF
  inflating: hotpatchjdk17/Log4jHotPatch17$1.class
  inflating: hotpatchjdk17/Log4jHotPatch17$MethodInstrumentorClassVisitor.class
  inflating: hotpatchjdk17/Log4jHotPatch17$MethodInstrumentorMethodVisitor.class
  inflating: hotpatchjdk17/Log4jHotPatch17.class
   creating: hotpatchjdk17/org/
   creating: hotpatchjdk17/org/objectweb/
   creating: hotpatchjdk17/org/objectweb/asm/
  inflating: hotpatchjdk17/org/objectweb/asm/AnnotationVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/AnnotationWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/Attribute$Set.class
  inflating: hotpatchjdk17/org/objectweb/asm/Attribute.class
  inflating: hotpatchjdk17/org/objectweb/asm/ByteVector.class
  inflating: hotpatchjdk17/org/objectweb/asm/ClassReader.class
  inflating: hotpatchjdk17/org/objectweb/asm/ClassTooLargeException.class
  inflating: hotpatchjdk17/org/objectweb/asm/ClassVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/ClassWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/ConstantDynamic.class
  inflating: hotpatchjdk17/org/objectweb/asm/Constants.class
  inflating: hotpatchjdk17/org/objectweb/asm/Context.class
  inflating: hotpatchjdk17/org/objectweb/asm/CurrentFrame.class
  inflating: hotpatchjdk17/org/objectweb/asm/Edge.class
  inflating: hotpatchjdk17/org/objectweb/asm/FieldVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/FieldWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/Frame.class
  inflating: hotpatchjdk17/org/objectweb/asm/Handle.class
  inflating: hotpatchjdk17/org/objectweb/asm/Handler.class
  inflating: hotpatchjdk17/org/objectweb/asm/Label.class
  inflating: hotpatchjdk17/org/objectweb/asm/MethodTooLargeException.class
  inflating: hotpatchjdk17/org/objectweb/asm/MethodVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/MethodWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/ModuleVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/ModuleWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/Opcodes.class
  inflating: hotpatchjdk17/org/objectweb/asm/RecordComponentVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/RecordComponentWriter.class
  inflating: hotpatchjdk17/org/objectweb/asm/Symbol.class
  inflating: hotpatchjdk17/org/objectweb/asm/SymbolTable$Entry.class
  inflating: hotpatchjdk17/org/objectweb/asm/SymbolTable.class
  inflating: hotpatchjdk17/org/objectweb/asm/Type.class
  inflating: hotpatchjdk17/org/objectweb/asm/TypePath.class
  inflating: hotpatchjdk17/org/objectweb/asm/TypeReference.class
   creating: hotpatchjdk17/org/objectweb/asm/signature/
  inflating: hotpatchjdk17/org/objectweb/asm/signature/SignatureReader.class
  inflating: hotpatchjdk17/org/objectweb/asm/signature/SignatureVisitor.class
  inflating: hotpatchjdk17/org/objectweb/asm/signature/SignatureWriter.class

Whereas cloning https://github.com/corretto/hotpatch-for-apache-log4j2 and unpacking...

(22-03-31 20:27:59) <0> [/local/home/handalm/workplace/hotpatch-for-apache-log4j2/build/libs]
[~]$ unzip Log4jHotPatch.jar -d localbuild-hotpatch
Archive:  Log4jHotPatch.jar
   creating: localbuild-hotpatch/META-INF/
  inflating: localbuild-hotpatch/META-INF/MANIFEST.MF
  inflating: localbuild-hotpatch/Log4jHotPatch$MethodInstrumentorClassVisitor.class
  inflating: localbuild-hotpatch/Log4jHotPatch$1.class
  inflating: localbuild-hotpatch/Log4jHotPatch$MethodInstrumentorMethodVisitor.class
  inflating: localbuild-hotpatch/Log4jHotPatch.class
  inflating: localbuild-hotpatch/Log4jHotPatch17.class
   creating: localbuild-hotpatch/com/
   creating: localbuild-hotpatch/com/amazon/
   creating: localbuild-hotpatch/com/amazon/corretto/
   creating: localbuild-hotpatch/com/amazon/corretto/org/
   creating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/
   creating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/AnnotationVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/AnnotationWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Attribute$Set.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Attribute.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ByteVector.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ClassReader.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ClassTooLargeException.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ClassVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ClassWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ConstantDynamic.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Constants.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Context.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/CurrentFrame.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Edge.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/FieldVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/FieldWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Frame.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Handle.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Handler.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Label.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/MethodTooLargeException.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/MethodVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/MethodWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ModuleVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/ModuleWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Opcodes.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/RecordComponentVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/RecordComponentWriter.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Symbol.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/SymbolTable$Entry.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/SymbolTable.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/Type.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/TypePath.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/TypeReference.class
   creating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/signature/
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/signature/SignatureReader.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/signature/SignatureVisitor.class
  inflating: localbuild-hotpatch/com/amazon/corretto/org/objectweb/asm/signature/SignatureWriter.class

The fix I put in #2681 works because it creates a fat jar that rewrites the path of the objectweb dependencies but I don't think that should be required.

@mch2
Copy link
Member

mch2 commented Mar 31, 2022

Ok looks like the jar in the hotpatch is the issue. I'm able to reproduce using the static agent attached to the process with ./gradle run. Not sure why adding the agent to gradle.properties didn't load properly.

Repro steps:

  1. Disable the hotpatch sudo systemctl stop log4j-cve-2021-44228-hotpatch
  2. Start opensearch with ./gradlew run.
  3. Execute a few curl requests with painless, this will pass:
curl -X PUT "localhost:9200/test-index/_doc/1?pretty" -H 'Content-Type: application/json' -d'
{
  "foo": 1
}
'

curl -X GET "localhost:9200/test-index/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "script_fields": {
    "my_doubled_field": {
      "script": { 
        "source": "doc[\u0027foo\u0027].value * params[\u0027multiplier\u0027]", 
        "params": {
          "multiplier": 2
        }
      }
    }
  }
}
'
  1. restart opensearch and fetch its pid.
  2. Start the hotpatch agent. java -jar /usr/share/log4j-cve-2021-44228-hotpatch/jdk17/Log4jHotPatchFat.jar <pid>
  3. Execute step 3 again, the request should fail with the expected error log to stdout.
  4. Stop the service.
  5. Clone https://github.com/corretto/hotpatch-for-apache-log4j2 and execute ./gradlew build to build the jar.
  6. Restart the service, this time with the newly built jar. java -jar ~/path/to/repo/build/libs/Log4jHotPatch.jar <pid>
  7. Execute step 3 again, the requests should succeed.

@mch2
Copy link
Member

mch2 commented Apr 1, 2022

So to move this fwd I see two options. We either 1 - disable the hotpatch on CI & warn users of this issue with jdk17 + the hotpatch or 2 - move fwd with shadowing lang-painless's version of these jars ourselves. While we can't control whats running on user's hosts, I see 2 as a safer option, thoughts?

@dblock
Copy link
Member

dblock commented Apr 1, 2022

So to move this fwd I see two options. We either 1 - disable the hotpatch on CI & warn users of this issue with jdk17 + the hotpatch or 2 - move fwd with shadowing lang-painless's version of these jars ourselves. While we can't control whats running on user's hosts, I see 2 as a safer option, thoughts?

First, thanks for the thorough debugging and repro, solid work.

I think we can move forward with (2) now as a workaround. Let's leave the bug open or open a new one to undo the shadowing, and treat it as a bug in the hot patch. The latter shouldn't have this kind of effect on any application IMHO.

@mch2
Copy link
Member

mch2 commented Apr 1, 2022

To be clear, this is the commit missing from the hotpatch. This previous fix is also missing.

Am working on an update to our fix. Its also worth calling out this will increase the size of the lang-painless jar from ~700kb to ~1.1MB.

I've also not been able to repro the bug with lang-expression that also includes these dependencies. I think until we can we shouldn't make any change there.

@mch2
Copy link
Member

mch2 commented Apr 4, 2022

Closing with fix merged. Please reopen if this reappears.

@JoeJoe1989
Copy link

Hi, my projects use both asm (7.2) and asm-util (7.1). In order to upgrade my project to Open JDK 17 LTS, I need to update asm to 9.1 which supports JDK 17. However, the build failed due to:

"Unsupported api 589824" error for asm (9.1) and asm-util (9.1)

If I downgrade asm-util to 7.1, the build succeeded. asm (9.1) and asm-util (7.1)

Not sure if you have any idea. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Severity-Blocker v2.0.0 Version 2.0.0
Projects
None yet
Development

No branches or pull requests

9 participants