Elasticsearch crashes on startup when upgrading from 8.10.4 to 8.11.1 when S3 snapshots are in use #102173

Closed
dggreenbaum opened this issue Nov 14, 2023 · 4 comments · Fixed by #102230
Labels: >bug, :Distributed/Snapshot/Restore, Team:Distributed

dggreenbaum commented Nov 14, 2023

Elasticsearch Version

8.11.1

Installed Plugins

No response

Java Version

bundled

OS Version

ubuntu:20.04

Problem Description

I'm orchestrating my Elasticsearch deployment using Cloud on K8s 2.9. When attempting to upgrade my Elasticsearch cluster from 8.10.4 to 8.11.1 the first node to restart crashes with the access denied stack trace included in the logs section.

This occurs with an unmodified docker.elastic.co/elasticsearch/elasticsearch:8.11.1 image with no additional plugins installed. I am using S3 based snapshots with this cluster. I have been able to successfully do minor version upgrades of this cluster in the past.

I suspect this may be related to #101344 or #101245 since they relate to S3Service.java which appears in the stack trace and the patch notes for 8.11.0.

Steps to Reproduce

Starting from a functioning ES 8.10.4 cluster with S3 snapshots configured, attempt to upgrade an existing data node to 8.11.1. The node fails to start and produces the stack trace shown in the logs section below.

Logs (if relevant)

access denied (\"java.lang.RuntimePermission\" \"accessDeclaredMembers\")","error.stack_trace":"java.security.AccessControlException: access denied (\"java.lang.RuntimePermission\" \"accessDeclaredMembers\")
	 java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:488)
	 java.base/java.security.AccessController.checkPermission(AccessController.java:1071)
	 java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:411)
	 java.base/java.lang.Class.checkMemberAccess(Class.java:3227)
	 java.base/java.lang.Class.getDeclaredConstructors(Class.java:2725)
	 com.fasterxml.jackson.databind.util.ClassUtil.getConstructors(ClassUtil.java:1331)
	 com.fasterxml.jackson.databind.introspect.AnnotatedCreatorCollector._findPotentialConstructors(AnnotatedCreatorCollector.java:115)
	 com.fasterxml.jackson.databind.introspect.AnnotatedCreatorCollector.collect(AnnotatedCreatorCollector.java:70)
	 com.fasterxml.jackson.databind.introspect.AnnotatedCreatorCollector.collectCreators(AnnotatedCreatorCollector.java:61)
	 com.fasterxml.jackson.databind.introspect.AnnotatedClass._creators(AnnotatedClass.java:403)
	 com.fasterxml.jackson.databind.introspect.AnnotatedClass.getFactoryMethods(AnnotatedClass.java:315)
	 com.fasterxml.jackson.databind.introspect.BasicBeanDescription.getFactoryMethods(BasicBeanDescription.java:573)
	 com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._addExplicitFactoryCreators(BasicDeserializerFactory.java:641)
	 com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._constructDefaultValueInstantiator(BasicDeserializerFactory.java:278)
	 com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findValueInstantiator(BasicDeserializerFactory.java:222)
	 com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.createCollectionDeserializer(BasicDeserializerFactory.java:1421)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:403)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:350)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:264)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:244)
	 com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
	 com.fasterxml.jackson.databind.DeserializationContext.findNonContextualValueDeserializer(DeserializationContext.java:644)
	 com.fasterxml.jackson.databind.deser.BeanDeserializerBase.resolve(BeanDeserializerBase.java:539)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:294)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:244)
	 com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
	 com.fasterxml.jackson.databind.DeserializationContext.findNonContextualValueDeserializer(DeserializationContext.java:644)
	 com.fasterxml.jackson.databind.deser.BeanDeserializerBase.resolve(BeanDeserializerBase.java:539)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:294)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:244)
	 com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
	 com.fasterxml.jackson.databind.DeserializationContext.findContextualValueDeserializer(DeserializationContext.java:621)
	 com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.createContextual(CollectionDeserializer.java:188)
	 com.fasterxml.jackson.databind.deser.std.CollectionDeserializer.createContextual(CollectionDeserializer.java:28)
	 com.fasterxml.jackson.databind.DeserializationContext.handlePrimaryContextualization(DeserializationContext.java:836)
	 com.fasterxml.jackson.databind.deser.BeanDeserializerBase.resolve(BeanDeserializerBase.java:550)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:294)
	 com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:244)
	 com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
	 com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:654)
	 com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:4956)
	 com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4826)
	 com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3809)
	 com.amazonaws.partitions.PartitionsLoader.loadPartitionFromStream(PartitionsLoader.java:92)
	 com.amazonaws.partitions.PartitionsLoader.build(PartitionsLoader.java:84)
	 com.amazonaws.regions.RegionMetadataFactory.create(RegionMetadataFactory.java:30)
	 com.amazonaws.regions.RegionUtils.initialize(RegionUtils.java:64)
	 com.amazonaws.regions.RegionUtils.getRegionMetadata(RegionUtils.java:52)
	 com.amazonaws.regions.RegionUtils.getRegion(RegionUtils.java:106)
	 com.amazonaws.client.builder.AwsClientBuilder.getRegionObject(AwsClientBuilder.java:256)
	 com.amazonaws.client.builder.AwsClientBuilder.withRegion(AwsClientBuilder.java:245)
	 org.elasticsearch.repositories.s3.S3Service$CustomWebIdentityTokenCredentialsProvider.<init>(S3Service.java:373)
	 org.elasticsearch.repositories.s3.S3Service.<init>(S3Service.java:98)
	 org.elasticsearch.repositories.s3.S3RepositoryPlugin.s3Service(S3RepositoryPlugin.java:115)
	 org.elasticsearch.repositories.s3.S3RepositoryPlugin.createComponents(S3RepositoryPlugin.java:109)
	 org.elasticsearch.server@8.11.1/org.elasticsearch.node.Node.lambda$new$17(Node.java:759)
	 org.elasticsearch.server@8.11.1/org.elasticsearch.plugins.PluginsService.lambda$flatMap$1(PluginsService.java:263)
	 java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:273)
	 java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	 java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:722)
	 java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	 java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	 java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:575)
	 java.base/java.util.stream.AbstractPipeline.evaluateToArrayNode(AbstractPipeline.java:260)
	 java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:616)
	 java.base/java.util.stream.ReferencePipeline.toArray(ReferencePipeline.java:622)
	 java.base/java.util.stream.ReferencePipeline.toList(ReferencePipeline.java:627)
	 org.elasticsearch.server@8.11.1/org.elasticsearch.node.Node.<init>(Node.java:775)
	 org.elasticsearch.server@8.11.1/org.elasticsearch.node.Node.<init>(Node.java:344)
	 org.elasticsearch.server@8.11.1/org.elasticsearch.bootstrap.Elasticsearch$2.<init>(Elasticsearch.java:236)
	 org.elasticsearch.server@8.11.1/org.elasticsearch.bootstrap.Elasticsearch.initPhase3(Elasticsearch.java:236)
	 org.elasticsearch.server@8.11.1/org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:73)
@dggreenbaum dggreenbaum added >bug needs:triage Requires assignment of a team area label labels Nov 14, 2023
@DaveCTurner
Contributor

I suspect this is due to #101705; neither of the two suggested issues would explain this. Does it reproduce if you clear the AWS_STS_REGIONAL_ENDPOINTS environment variable in the environment in which ES runs?

@DaveCTurner DaveCTurner added :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs and removed needs:triage Requires assignment of a team area label labels Nov 14, 2023
@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team label Nov 14, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@dggreenbaum
Author

@DaveCTurner it no longer reproduces if I set AWS_STS_REGIONAL_ENDPOINTS to an empty string.

@DaveCTurner
Contributor

Thanks @dggreenbaum, that confirms the relationship with #101705.
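
Why an empty `AWS_STS_REGIONAL_ENDPOINTS` avoids the crash can be sketched as follows. This is only an illustration, under the assumption that the plugin configures a region on the STS client builder only when the variable is set to `regional`; the thread does not show the actual check. The class `StsRegionGate`, the method `regionToConfigure`, and the use of `AWS_REGION` as the region source are hypothetical names chosen for the example.

```java
import java.util.Map;
import java.util.Optional;

class StsRegionGate {
    // Hypothetical helper: decides whether a region would be passed to the
    // STS client builder, given a snapshot of the process environment.
    public static Optional<String> regionToConfigure(Map<String, String> env) {
        String mode = env.get("AWS_STS_REGIONAL_ENDPOINTS");
        // Assumption: only the value "regional" enables the regional code path.
        if (!"regional".equalsIgnoreCase(mode)) {
            return Optional.empty();
        }
        // Assumption: the region itself comes from AWS_REGION.
        return Optional.ofNullable(env.get("AWS_REGION"));
    }

    public static void main(String[] args) {
        // Regional mode: a region is returned, so the builder's region lookup would run.
        System.out.println(regionToConfigure(
            Map.of("AWS_STS_REGIONAL_ENDPOINTS", "regional", "AWS_REGION", "us-east-1")));
        // Empty string (the workaround from this thread): no region, no lookup.
        System.out.println(regionToConfigure(
            Map.of("AWS_STS_REGIONAL_ENDPOINTS", "", "AWS_REGION", "us-east-1")));
    }
}
```

Under that assumption, setting the variable to an empty string means the region lookup, and with it the reflective Jackson call seen in the stack trace, is never reached during startup.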

@arteam arteam self-assigned this Nov 15, 2023
arteam added a commit to arteam/elasticsearch that referenced this issue Nov 15, 2023
Unfortunately, `AWSSecurityTokenServiceClientBuilder#setRegion` is not just a setter on
the builder. It looks up the region by its name, which lazily initializes some regional
configuration. As a result, the call fails with an `access denied` error, because
the caller doesn't have permission to call `accessDeclaredMembers` in some Jackson
internals.

We fix that in two ways:
* Make sure the `withRegion` call is privileged
* Eagerly look up region metadata in `S3RepositoryPlugin`

Fixes elastic#102173
arteam added a commit that referenced this issue Nov 16, 2023
Unfortunately, `AWSSecurityTokenServiceClientBuilder#setRegion` is not just a setter on the builder. It looks up the region by its name, which lazily initializes some regional configuration. As a result, the call fails with an access denied error, because the caller doesn't have permission to call `accessDeclaredMembers` in some Jackson internals.

This bug wasn't caught by the `CustomWebIdentityTokenCredentialsProviderTests#testSupportRegionalizedEndpoints` test because that test runs under the test framework, which does allow naked reflection calls.

We fix that in two ways:

*  Make sure the `withRegion` call is privileged
*  Eagerly look up region metadata in `S3Repository`

Fixes #102173
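
The two-part fix described in this commit message can be sketched roughly as below. This is not the code from #102230; it is a self-contained illustration that assumes nothing beyond the JDK. The class `PrivilegedRegionLookup`, its methods, and the small endpoint map standing in for the SDK's lazily loaded region metadata are all hypothetical.

```java
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.Map;

class PrivilegedRegionLookup {
    // Hypothetical stand-in for region metadata that the SDK loads lazily on first use.
    private static Map<String, String> metadata;

    // Simulates the lazy lookup: the first call initializes the metadata as a side effect.
    public static synchronized String stsEndpoint(String region) {
        if (metadata == null) {
            metadata = Map.of(
                "us-east-1", "sts.us-east-1.amazonaws.com",
                "eu-west-1", "sts.eu-west-1.amazonaws.com");
        }
        return metadata.getOrDefault(region, "unknown");
    }

    public static synchronized boolean isInitialized() {
        return metadata != null;
    }

    // Part 1: run the lookup inside a privileged block, so that permission checks
    // consider only this code's protection domain and not less-privileged callers.
    @SuppressWarnings("removal") // AccessController is deprecated for removal since Java 17
    public static String stsEndpointPrivileged(String region) {
        PrivilegedAction<String> action = () -> stsEndpoint(region);
        return AccessController.doPrivileged(action);
    }

    // Part 2: trigger the lazy initialization eagerly, e.g. while the plugin loads,
    // so later lookups do not perform it from an arbitrary call site.
    public static void warmUp() {
        stsEndpointPrivileged("us-east-1");
    }

    public static void main(String[] args) {
        warmUp();
        System.out.println(isInitialized());
        System.out.println(stsEndpointPrivileged("eu-west-1"));
    }
}
```

In the real plugin, the privileged block would wrap the `withRegion` call named in the commit message. Either measure alone might suffice in this sketch; the commit applies both.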
arteam added a commit to arteam/elasticsearch that referenced this issue Nov 16, 2023
…c#102230)
elasticsearchmachine pushed a commit that referenced this issue Nov 16, 2023
… (#102285)
rjernst pushed a commit to rjernst/elasticsearch that referenced this issue Nov 16, 2023
…c#102230)
andreidan pushed a commit to andreidan/elasticsearch that referenced this issue Nov 22, 2023
…c#102230)