Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Security plugin transport layer(multi-node cluster) does not work due to missing required reflective access (JDK17) #1653

Closed
cliu123 opened this issue Feb 22, 2022 · 23 comments
Assignees
Labels
bug Something isn't working v2.0.0

Comments

@cliu123
Copy link
Member

cliu123 commented Feb 22, 2022

Describe the bug
Security plugin 1.3.0.0 does not work on OpenSearch 1.3.0 multi-node clusters without required reflective access.
This issue was originated from: #1619

To Reproduce
Steps to reproduce the behavior:

  1. Create a OpenSearch 1.3.0 cluster with at least 2 nodes.
  2. Install security plugin 1.3.0.0(security plugin repo main branch) on each of the nodes.
  3. Send a basic request(such as _cat/nodes or _cat/indices) to the OpenSearch cluster.
  4. See error:
% curl -k -XGET -u admin:admin https://localhost:9200/_cat/indices
{"error":{"root_cause":[{"type":"security_exception","reason":"Unexpected exception indices:monitor/stats"}],"type":"security_exception","reason":"Unexpected exception indices:monitor/stats"},"status":500}% 

Stacktrace in logs:

org.opensearch.OpenSearchSecurityException: Unexpected exception cluster:monitor/nodes/info
         at org.opensearch.security.filter.SecurityFilter.apply0(SecurityFilter.java:376) [opensearch-security-1.3.0.0.jar:1.3.0.0]
         at org.opensearch.security.filter.SecurityFilter.apply(SecurityFilter.java:154) [opensearch-security-1.3.0.0.jar:1.3.0.0]
         at org.opensearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:192) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.performanceanalyzer.action.PerformanceAnalyzerActionFilter.apply(PerformanceAnalyzerActionFilter.java:99) [opensearch-performance-analyzer-1.3.0.0.jar:1.3.0.0]
         at org.opensearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:192) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.action.support.TransportAction.execute(TransportAction.java:169) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.action.support.TransportAction.execute(TransportAction.java:97) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.client.node.NodeClient.executeLocally(NodeClient.java:108) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.client.node.NodeClient.doExecute(NodeClient.java:95) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.client.support.AbstractClient.execute(AbstractClient.java:433) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.client.support.AbstractClient$ClusterAdmin.execute(AbstractClient.java:730) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.client.support.AbstractClient$ClusterAdmin.nodesInfo(AbstractClient.java:813) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.rest.action.cat.RestNodesAction$1.processResponse(RestNodesAction.java:127) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.rest.action.cat.RestNodesAction$1.processResponse(RestNodesAction.java:115) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.rest.action.RestActionListener.onResponse(RestActionListener.java:60) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.action.support.TransportAction$1.onResponse(TransportAction.java:103) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.action.support.TransportAction$1.onResponse(TransportAction.java:97) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.action.ActionListener$2.onResponse(ActionListener.java:104) [opensearch-1.3.0.jar:1.3.0]
         at org.opensearch.action.ActionListener.completeWith(ActionListener.java:351) [opensearch-1.3.0.jar:1.3.0]
 

Digging into the reason of the error:
1.

{
  "error": {
    "root_cause": [
      {
        "type": "security_exception",
        "reason": "Unexpected exception cluster:monitor/nodes/info"
      }
    ],
    "type": "security_exception",
    "reason": "Unexpected exception cluster:monitor/nodes/info",
    "caused_by": {
      "type": "exception_in_initializer_error",
      "reason": null,
      "caused_by": {
        "type": "inaccessible_object_exception",
        "reason": "Unable to make field private java.lang.String java.io.ObjectStreamClass.name accessible: module java.base does not \"opens java.io\" to unnamed module @5b0575d0"
      }
    }
  },
  "status": 500
}

{
  "error": {
    "root_cause": [
      {
        "type": "security_exception",
        "reason": "Unexpected exception cluster:monitor/nodes/info"
      }
    ],
    "type": "security_exception",
    "reason": "Unexpected exception cluster:monitor/nodes/info",
    "caused_by": {
      "type": "no_class_def_found_error",
      "reason": "Could not initialize class org.opensearch.security.support.Base64Helper$DescriptorNameSetter"
    }
  },
  "status": 500
}

Expected behavior
Requests gets expected response like the following:

% curl -k -XGET -u admin:admin https://localhost:9200/_cat/indices                                                                                    
green open security-auditlog-2022.02.19 l9rrGwzOR9C4ZWqMyMYHwg 1 1  1 0  23.2kb  11.6kb
green open security-auditlog-2022.02.18 qvL83MJ1RbG1YgOW4MixAQ 1 1 22 0 244.9kb 154.3kb
green open .kibana_1                    peXycy4TSoWTYm8Uh6TemQ 1 1  1 0    10kb     5kb
green open .opendistro_security         3RaQH4lCQuayi6CKLXptWg 1 1  9 0 106.1kb    53kb

Plugins
Please list all plugins currently enabled.
Security plugin

Host/Environment (please complete the following information):

  • OS: [Linux]

Additional context
PR#1278 in security plugin requires these reflective access since the changes use reflection. But reverting the PR is not an option since the PR itself supports OpenSearch backward compatibility for ODFE. Without this PR(#1278), security plugin would not work in mixed cluster(ODFE nodes + OpenSearch nodes), which would fail rolling upgrade from ODFE to OpenSearch.
In order to fix the issue, JVM Options where OpenSearch cluster runs need to be appended with something like this to grant the reflective access. Security plugin had exactly same error message in Integration Tests, which was resolved by such changes.

@cliu123 cliu123 added bug Something isn't working untriaged Require the attention of the repository maintainers and may need to be prioritized labels Feb 22, 2022
@cliu123
Copy link
Member Author

cliu123 commented Feb 22, 2022

Adding @vrozov who is the author of the original PR. Do you have any suggestions on resolving this issue?

@cliu123 cliu123 changed the title [BUG] Security plugin transport layer(multi-node cluster) does not work due to lack of required reflective access. [BUG] Security plugin transport layer(multi-node cluster) does not work due to missing required reflective access. Feb 24, 2022
@setiah
Copy link

setiah commented Feb 24, 2022

This PR #1278 was part of 1.2 version as well where it worked fine. Did something change in 1.3 that caused this to break @cliu123 ?

@cliu123
Copy link
Member Author

cliu123 commented Feb 25, 2022

@setiah Good question! I had the same suspicion, but at least in security plugin, almost of all changes since 1.2.4 are CI/CD and build system changes. Nothing looks related to this. And when I reset security plugin all the way back to the very first commit after 1.2 versions, this issue still exists. So basically starting from bumping the OpenSearch version from 1.2.x to 1.3.0, the issue got introduced.
@vrozov Any thoughts on what change could cause this issue?

@setiah
Copy link

setiah commented Feb 28, 2022

Found the cause.

So the issue started happening after the bundled JDK was updated from AdoptOpenJDK 15 to Adoptium 17 with PR opensearch-project/OpenSearch#1922. This was a backport change from OpenSearch 2.0 to 1.x (then and now 1.3)

Prior to this JDK update, OpenSearch logs used to have warning messages to reflect improper reflective access, but, it still worked. See logs below

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.opensearch.security.support.Base64Helper$DescriptorNameSetter (file:/home/ubuntu/Debug/opensearch-1.3.0-1.x_18Jan_one_commit_before_686640d/plugins/opensearch-security/opensearch-security-1.3.0.0-SNAPSHOT.jar) to field java.io.ObjectStreamClass.name
WARNING: Please consider reporting this to the maintainers of org.opensearch.security.support.Base64Helper$DescriptorNameSetter
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release

Seems like the new release version of JDK no longer support this illegal reflective access operation, hence fails while initializing

Caused by: java.lang.ExceptionInInitializerError
        at org.opensearch.security.support.Base64Helper$DescriptorReplacer.lambda$replace$1(Base64Helper.java:166) ~[?:?]
        at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708) ~[?:?]
        at org.opensearch.security.support.Base64Helper$DescriptorReplacer.replace(Base64Helper.java:160) ~[?:?]

@setiah setiah removed the untriaged Require the attention of the repository maintainers and may need to be prioritized label Feb 28, 2022
@cliu123
Copy link
Member Author

cliu123 commented Feb 28, 2022

Cool! @setiah Thanks for identifying the cause! How should this be fixed then?
Reverting the PR#1278 would break backward compatibility and block rolling upgrade from ODFE to OpenSearch, which does not sound an option.

@dblock dblock added the v1.3.0 label Mar 1, 2022
@dblock
Copy link
Member

dblock commented Mar 1, 2022

@cliu123 @setiah This looks like a breaking change for 1.3.0, labelling as such

@setiah
Copy link

setiah commented Mar 1, 2022

The JDK update looks like a breaking change for a minor version update. It is breaking reflective access in plugins. Now breaking reflective access may be the right thing to do, but for a minor version update this can cause problems to the community plugins. I think we should push the JDK update change to 2.0 and exclude it from 1.3 (backport PR opensearch-project/OpenSearch#1922)

Another option is to allow reflective access by adding below jvmArgs. However, I am ruling out this option as it might be the wrong thing to do for non-testing purposes (also as JDK wants it deprecated and removed).

jvmArgs += "--add-opens=java.base/java.io=ALL-UNNAMED"

Having said that, the "illegal reflective access" would again show up as a problem for 2.0 release and needs to be dealt. But for 2.0, we would want to fix things on the Security plugin side (and community plugins which have similar issue can do the same). PR #1278 was meant to be a quick patchwork for retaining backward compatibility with ODFE (see issue #1259). There are longer term solutions discussed on the original issue to fix things "the right way", which we can explore. Also, for 2.0, we do not need the #1278 as the mixed cluster scenario with ODFE and 2.0 version nodes would not arise (PS: Since to upgrade from ODFE, ODFE 1.13 -> OpenSearch 1.x -> OpenSearch2.0).

@reta @dblock do you guys think it is reasonable to remove JDK update changes from 1.3 release?

On a side note, another concerning thing for me is why this breaking behavior was not detected early on when this change was merged (~18th Jan), and discovered now while planning for 1.3 release. The automated integration test infrastructure pipeline should be able to detect such issues. Anything I am missing? @bbarani @CEHENKLE

@reta
Copy link
Collaborator

reta commented Mar 1, 2022

@setiah @dblock the JDK-17 is only used to run tests (and consequently, used in builds), it should not break things (sadly, besides the build itself) and could be changed anytime by using -Druntime.java=11 or setting RUNTIME_JAVA_HOME to JDK of choice. JDK-15 is effectively dead and I think we should not use it in our builds (my opinion)

@dblock dblock changed the title [BUG] Security plugin transport layer(multi-node cluster) does not work due to missing required reflective access. [BUG] Security plugin transport layer(multi-node cluster) does not work due to missing required reflective access (JDK17) Mar 1, 2022
@dblock
Copy link
Member

dblock commented Mar 1, 2022

@reta This is all part of opensearch-project/opensearch-plugins#64 in which we're trying to switch builds to 11 and upgrade runtime to 17. This issue prevents upgrading the runtime to 17. I'm moving this issue to security and let's figure out whether we should fix this or ship 1.3 with JDK 11 instead.

@dblock dblock transferred this issue from opensearch-project/OpenSearch Mar 1, 2022
@reta
Copy link
Collaborator

reta commented Mar 1, 2022

@dblock gotcha, we could make org.opensearch.security.support.Base64Helper a good citizen here to conform to JDK-17 rules

@cliu123
Copy link
Member Author

cliu123 commented Mar 1, 2022

@dblock @reta Fixing this on security plugin would not resolve the root cause introduced by JDK 17 in runtime. Other plugins that require reflective accesses would still have this issue, for example PA plugin. And reverting the related PR would break backward compatibility of security plugin like I mentioned.
Cc: @davidlago @sruti1312 for any inputs.

@reta
Copy link
Collaborator

reta commented Mar 1, 2022

@cliu123 but we have to fix this access violation in any case, right? (in all the places)

@davidlago
Copy link

My vote would be to push JDK17 to 2.0 and open issues in all affected plugins like security, PA etc to fix the access violation by 2.0

@setiah
Copy link

setiah commented Mar 1, 2022

@reta This is all part of opensearch-project/opensearch-plugins#64 in which we're trying to switch builds to 11 and upgrade runtime to 17. This issue prevents upgrading the runtime to 17. I'm moving this issue to security and let's figure out whether we should fix this or ship 1.3 with JDK 11 instead.

Thanks @reta for the quick response. Since it defaults to jdk17 runtime now (bundled jdk), it runs OpenSearch and plugins on it by default. Setting RUNTIME_JAVA_HOME to a different (lower) version explicitly might work, but a user might have to set that explicitly (and also have that jdk installed beforehand). IMO it still breaks the existing behavior for users in terms of usability (for a minor version release).

@reta
Copy link
Collaborator

reta commented Mar 1, 2022

@setiah many do not use bundled JDK, moving to 11 [1] "solves" our problem but not user's problem, JDK-17 is the version many are moving to in production ...

[1] opensearch-project/OpenSearch#2301

@cliu123
Copy link
Member Author

cliu123 commented Mar 1, 2022

@dblock Security plugin has another tracking issue for this matter #1619. Would it make more sense to move this one to OpenSearch repo to track the fix? I don't have the permission to transfer this to OpenSearch repo though.

@cliu123
Copy link
Member Author

cliu123 commented Mar 2, 2022

@ylwu-amzn The PR has been merged. Has the issue been resolved?

@ylwu-amzn
Copy link
Contributor

ylwu-amzn commented Mar 2, 2022

We are using 1.2.4 doing PenTest. Please keep that issue open until you confirm the issue resolved on 1.3. We have to publish and merge PR before the issue fixed as we are reaching code complete cut-off date for 1.3 soon.

@cliu123
Copy link
Member Author

cliu123 commented Mar 2, 2022

We are using 1.2.4 doing PenTest. Please keep that issue open until you confirm the issue resolved on 1.3.

That issue that you created has been closed since the root cause(JDK 17 in runtime) has been identified, and this issue was created to track the resolution of the root cause. Please close this issue whenever you confirm that your issue is resolved.

@ylwu-amzn
Copy link
Contributor

ylwu-amzn commented Mar 2, 2022

@cliu123 , busy with finishing coding now, so will keep that issue open for a while. If you are sure that issue is gone, you can close that issue directly. If we find any issue, will reopen or create a new one. Thanks.

@dblock
Copy link
Member

dblock commented Mar 2, 2022

We've given up on JDK 17 for OpenSearch v.1.3.0 and downgraded OpenSearch 1.x branch to JDK 11. Re-labelling this issue as 2.0.

@cliu123
Copy link
Member Author

cliu123 commented Mar 14, 2022

@ylwu-amzn confirmed that the issue has been resovled. Closing the issue.

@cliu123
Copy link
Member Author

cliu123 commented Mar 22, 2022

Closing the issue with #1691

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working v2.0.0
Projects
None yet
Development

No branches or pull requests

6 participants