[CI] SearchableSnapshotsIntegTests.testCreateAndRestoreSearchableSnapshot failing #55513

iverase · 2020-04-21T09:40:23Z

There have been a couple of failures of this test in the last week:

https://gradle-enterprise.elastic.co/s/iqg3cqcqalyak
https://gradle-enterprise.elastic.co/s/3ilvp7ty52ybk

Both seem to fail with the same error:

 Caused by:
        java.lang.AssertionError: fake allocation id has to be removed, inSyncAllocationIds:[AWIAuDNXRKK3b0jtoJ-ePw, _forced_allocation_]
            at __randomizedtesting.SeedInfo.seed([FBB4D9C024EA5F0A]:0)
            at org.elasticsearch.cluster.routing.allocation.IndexMetadataUpdater.updateInSyncAllocations(IndexMetadataUpdater.java:186)
            at org.elasticsearch.cluster.routing.allocation.IndexMetadataUpdater.applyChanges(IndexMetadataUpdater.java:122)
            at org.elasticsearch.cluster.routing.allocation.RoutingAllocation.updateMetadataWithRoutingChanges(RoutingAllocation.java:238)
            at org.elasticsearch.cluster.routing.allocation.AllocationService.buildResult(AllocationService.java:147)
            at org.elasticsearch.cluster.routing.allocation.AllocationService.buildResultAndLogHealthChange(AllocationService.java:132)
            at org.elasticsearch.cluster.routing.allocation.AllocationService.applyStartedShards(AllocationService.java:128)
            at org.elasticsearch.cluster.action.shard.ShardStateAction$ShardStartedClusterStateTaskExecutor.execute(ShardStateAction.java:579)
            at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:697)
            at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:319)
            at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:214)
            at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:151)
            at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)
            at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)
            at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:632)
            at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
            at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
            at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
            at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
            at java.base/java.lang.Thread.run(Thread.java:834)

I have not been able to reproduce it:

./gradlew ':x-pack:plugin:searchable-snapshots:test' --tests "org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshotsIntegTests.testCreateAndRestoreSearchableSnapshot" -Dtests.seed=38EA7CA2AD868AF3 -Dtests.security.manager=true -Dtests.locale=ja-JP-u-ca-japanese-x-lvariant-JP -Dtests.timezone=Navajo -Dcompiler.java=14

elasticmachine · 2020-04-21T09:40:25Z

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

original-brownbear · 2020-04-22T18:45:36Z

This is reproducible locally at a fairly low rate (~1/1k empirically). Still tracking down how we get to this point.

We are using `FORCE_STALE_PRIMARY_INSTANCE` in instance equality checks `==` but were creating new instances of `ExistingStoreRecoverySource` when reading from the wire. This could break these checks in corner cases, causing `org.elasticsearch.cluster.routing.allocation.IndexMetadataUpdater#shardStarted` to not remove the force allocation fake id when starting a shard. Closes elastic#55513
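The failure mode described above can be illustrated with a small, self-contained sketch. It is not the Elasticsearch code: `StoreSource`, its `forceStale` flag, `readBuggy`, `readFixed`, and `isForcedStalePrimary` are hypothetical stand-ins, assuming only that the recovery source is serialized as a flag and later compared by identity (`==`) against a shared constant.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RecoverySourceSketch {

    /** Hypothetical stand-in for a recovery source with two shared instances. */
    static final class StoreSource {
        static final StoreSource INSTANCE = new StoreSource(false);
        static final StoreSource FORCE_STALE_PRIMARY_INSTANCE = new StoreSource(true);

        final boolean forceStale; // assumed wire representation: a single flag

        private StoreSource(boolean forceStale) {
            this.forceStale = forceStale;
        }

        void writeTo(DataOutputStream out) throws IOException {
            out.writeBoolean(forceStale);
        }

        // Buggy variant: every read allocates a fresh object, so the result
        // is never identical (==) to the shared constants.
        static StoreSource readBuggy(DataInputStream in) throws IOException {
            return new StoreSource(in.readBoolean());
        }

        // Fixed variant: map the flag back onto the shared constants.
        static StoreSource readFixed(DataInputStream in) throws IOException {
            return in.readBoolean() ? FORCE_STALE_PRIMARY_INSTANCE : INSTANCE;
        }
    }

    /** Identity check of the kind the issue describes. */
    static boolean isForcedStalePrimary(StoreSource source) {
        return source == StoreSource.FORCE_STALE_PRIMARY_INSTANCE;
    }

    /** Serializes the source to bytes and reads it back with the chosen reader. */
    static StoreSource roundTrip(StoreSource source, boolean fixed) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                source.writeTo(out);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return fixed ? StoreSource.readFixed(in) : StoreSource.readBuggy(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StoreSource forced = StoreSource.FORCE_STALE_PRIMARY_INSTANCE;
        // prints "buggy read passes identity check: false"
        System.out.println("buggy read passes identity check: " + isForcedStalePrimary(roundTrip(forced, false)));
        // prints "fixed read passes identity check: true"
        System.out.println("fixed read passes identity check: " + isForcedStalePrimary(roundTrip(forced, true)));
    }
}
```

In this sketch the buggy reader yields an object that carries the same flag but fails the `==` check, which matches the corner case described: a check that works while the value stays in memory on one node can fail once the value has crossed the wire. Comparing with `equals` would be another way to avoid the problem. The sketch instead resolves to the shared instances on read; the source only states that creating new instances on read was the problem.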

iverase added :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >test-failure Triaged test failures from CI labels Apr 21, 2020

original-brownbear self-assigned this Apr 22, 2020

original-brownbear mentioned this issue Apr 23, 2020

Fix Broken ExistingStoreRecoverySource Deserialization #55657

Merged

original-brownbear closed this as completed in #55657 Apr 23, 2020

original-brownbear mentioned this issue Apr 23, 2020

Fix Broken ExistingStoreRecoverySource Deserialization (#55657) #55665

Merged
