[CI] DiskThresholdDeciderIT testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards failing #105331

Open
iverase opened this issue Feb 9, 2024 · 9 comments
Labels
:Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs medium-risk An open issue or test failure that is a medium risk to future releases Team:Distributed Meta label for distributed team >test-failure Triaged test failures from CI

Comments

@iverase
Contributor

iverase commented Feb 9, 2024

Build scan:
https://gradle-enterprise.elastic.co/s/kxdqmnytyenuq/tests/:server:internalClusterTest/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT/testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards

Reproduction line:

./gradlew ':server:internalClusterTest' --tests "org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards" -Dtests.seed=ED5C019BB52E4B53 -Dtests.locale=is -Dtests.timezone=America/Anchorage -Druntime.java=21

Applicable branches:
main

Reproduces locally?:
No

Failure history:
Failure dashboard for org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT#testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards

Failure excerpt:

java.lang.Exception: Test abandoned because suite timeout was reached.

  at __randomizedtesting.SeedInfo.seed([ED5C019BB52E4B53]:0)

@iverase iverase added :Distributed/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >test-failure Triaged test failures from CI labels Feb 9, 2024
@elasticsearchmachine elasticsearchmachine added blocker Team:Distributed Meta label for distributed team labels Feb 9, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@idegtiarenko
Contributor

Oh, interesting, according to:

"TEST-DiskThresholdDeciderIT.testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards-seed#[ED5C019BB52E4B53]" ID=2402 WAITING on org.elasticsearch.action.support.PlainActionFuture$Sync@dcadc57
	at java.base@21.0.2/jdk.internal.misc.Unsafe.park(Native Method)
	- waiting on org.elasticsearch.action.support.PlainActionFuture$Sync@dcadc57
	at java.base@21.0.2/java.util.concurrent.locks.LockSupport.park(LockSupport.java:221)
	at java.base@21.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:754)
	at java.base@21.0.2/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1099)
	at app//org.elasticsearch.action.support.PlainActionFuture$Sync.get(PlainActionFuture.java:273)
	at app//org.elasticsearch.action.support.PlainActionFuture.get(PlainActionFuture.java:93)
	at app//org.elasticsearch.client.internal.support.AbstractClient$RefCountedFuture.get(AbstractClient.java:1534)
	at app//org.elasticsearch.client.internal.support.AbstractClient$RefCountedFuture.get(AbstractClient.java:1514)
	at app//org.elasticsearch.common.util.concurrent.FutureUtils.get(FutureUtils.java:45)
	at app//org.elasticsearch.action.support.PlainActionFuture.actionGet(PlainActionFuture.java:152)
	at app//org.elasticsearch.action.ActionRequestBuilder.get(ActionRequestBuilder.java:43)
	at app//org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards(DiskThresholdDeciderIT.java:204)
	at java.base@21.0.2/java.lang.invoke.LambdaForm$DMH/0x00007f2a3c300000.invokeVirtual(LambdaForm$DMH)
	at java.base@21.0.2/java.lang.invoke.LambdaForm$MH/0x00007f2a3c0a0800.invoke(LambdaForm$MH)

it was stuck waiting at this line:

final RestoreSnapshotResponse restoreSnapshotResponse = clusterAdmin().prepareRestoreSnapshot("repo", "snap")
    .setWaitForCompletion(true)
    .get();

@idegtiarenko
Contributor

Output also contains:

WARNING: Uncaught exception in thread: Thread[#2574,elasticsearch[node_t0][clusterApplierService#updateTask][T#1],5,TGRP-DiskThresholdDeciderIT]
java.lang.AssertionError: 
Expected: <0>
     but: was <1>
	at __randomizedtesting.SeedInfo.seed([ED5C019BB52E4B53]:0)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
	at org.elasticsearch.test.ESTestCase.assertThat(ESTestCase.java:2119)
	at org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT.lambda$testRestoreSnapshotAllocationDoesNotExceedWatermarkWithMultipleShards$2(DiskThresholdDeciderIT.java:176)
	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListener(ClusterApplierService.java:560)
	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateListeners(ClusterApplierService.java:546)
	at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:505)
	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:429)
	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:154)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:917)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:217)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:183)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1583)

so the actual failure happened at:

internalCluster().getCurrentMasterNodeInstance(ClusterService.class).addListener(event -> {
    ClusterInfoServiceUtils.refresh(clusterInfoService);
    if (allowRelocations.get() == false) {
        assertThat(numberOfShardsWithState(event.state().getRoutingNodes(), ShardRoutingState.RELOCATING), equalTo(0));
    }
});
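For context, a minimal, self-contained sketch (hypothetical code, not the actual test or framework wiring) of how the two observations above can fit together, assuming the assertion fires on the cluster-applier thread: the AssertionError surfaces only as an uncaught-exception WARNING on that thread, while the test thread stays blocked on the restore future until the suite timeout abandons it.

import java.util.concurrent.CompletableFuture;

// Hypothetical illustration only; names are simplified stand-ins.
public class ApplierAssertionSketch {
    public static void main(String[] args) throws Exception {
        CompletableFuture<String> restoreResponse = new CompletableFuture<>();

        Thread applier = new Thread(() -> {
            int relocatingShards = 1; // what the cluster-state listener observed
            if (relocatingShards != 0) {
                // Thrown on the applier thread: reported by the uncaught-exception handler,
                // never delivered to the thread blocked on restoreResponse.get().
                throw new AssertionError("Expected: <0> but: was <1>");
            }
            restoreResponse.complete("restored");
        }, "clusterApplierService#updateTask");

        applier.setUncaughtExceptionHandler((t, e) ->
            System.err.println("WARNING: Uncaught exception in thread: " + t.getName() + ": " + e));
        applier.start();
        applier.join();

        // In the real test this is prepareRestoreSnapshot(...).setWaitForCompletion(true).get(),
        // which keeps waiting until the suite timeout is reached.
        // restoreResponse.get();
    }
}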

@arteam arteam self-assigned this Feb 9, 2024
DaveCTurner added a commit that referenced this issue Feb 12, 2024
@arteam arteam assigned idegtiarenko and unassigned arteam Feb 13, 2024
@arteam arteam added medium-risk An open issue or test failure that is a medium risk to future releases and removed blocker labels Feb 15, 2024
@idegtiarenko
Contributor

idegtiarenko commented Mar 4, 2024

According to the recent failure info, the test created the following shards:

ShardSize[shardId=[btydusdduwrc][3], size=16919]
ShardSize[shardId=[btydusdduwrc][0], size=17886]
ShardSize[shardId=[btydusdduwrc][5], size=17972]
ShardSize[shardId=[btydusdduwrc][1], size=18195]
ShardSize[shardId=[btydusdduwrc][4], size=18198]
ShardSize[shardId=[btydusdduwrc][2], size=18472]

During the first allocation round, the desired balance was computed for only 5 shards, which were allocated accordingly:

[2024-03-03T17:22:30,358][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Reconciling desired balance: DesiredBalance[lastConvergedIndex=13, assignments={
[btydusdduwrc][0]=ShardAssignment[nodeIds=[CTSLnDyTQj2xDmatjmYTtQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][1]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][4]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][5]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][2]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0],
[btydusdduwrc][3]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1]
}]

Notice that the smallest one is ignored, possibly because its size had not yet been fetched from the repository.
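As a rough sketch of that hypothesis (hypothetical code, not the actual desired-balance allocator logic): a shard whose snapshot size has not yet been fetched from the repository is ignored for the round, which matches the unassigned=1, ignored=1 entry above.

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; the shard IDs and the one known size are taken from the logs above.
public class UnknownSnapshotSizeSketch {
    public static void main(String[] args) {
        Map<String, Long> snapshotShardSizes = new HashMap<>();
        snapshotShardSizes.put("[btydusdduwrc][0]", 17_886L);
        // "[btydusdduwrc][3]" has no entry yet: its size is still being fetched from the repository

        for (String shardId : new String[] { "[btydusdduwrc][0]", "[btydusdduwrc][3]" }) {
            Long size = snapshotShardSizes.get(shardId);
            if (size == null) {
                System.out.println(shardId + " -> ignored this round (total=1, unassigned=1, ignored=1)");
            } else {
                System.out.println(shardId + " -> allocated with expected size " + size);
            }
        }
    }
}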

Later the shard balance was computed as follows:

[2024-03-03T17:22:30,400][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Desired balance updated: DesiredBalance[lastConvergedIndex=14, assignments={[btydusdduwrc][0]=ShardAssignment[nodeIds=[CTSLnDyTQj2xDmatjmYTtQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][1]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][4]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][5]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][2]=ShardAssignment[nodeIds=[suT6FRFCTBiMDfiBCAU_nQ], total=1, unassigned=0, ignored=0], [btydusdduwrc][3]=ShardAssignment[nodeIds=[CTSLnDyTQj2xDmatjmYTtQ], total=1, unassigned=0, ignored=0]}]. Diff: 
[btydusdduwrc][3]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[CTSLnDyTQj2xDmatjmYTtQ], total=1, unassigned=0, ignored=0]

This suggests that the computation failed to take into account another non-empty shard that was (still?) initializing on the same node.

@idegtiarenko
Contributor

I wonder if in this case the shard had started, but the corresponding information was not yet available in ClusterInfo, so the second allocation round happened as if CTSLnDyTQj2xDmatjmYTtQ were empty.

@idegtiarenko
Contributor

I checked the latest failure logs and they contain:

[2024-03-11T18:25:16,539][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] Recomputing desired balance for [15]: DesiredBalance[lastConvergedIndex=14, assignments={[wggatuinx][2]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1], [wggatuinx][3]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1], [wggatuinx][4]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1], [wggatuinx][5]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1], [wggatuinx][0]=ShardAssignment[nodeIds=[CwlFGBJYSXywcA9Qqt25xg], total=1, unassigned=0, ignored=0], [wggatuinx][1]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1]}], routing_nodes:
-----node_id[CwlFGBJYSXywcA9Qqt25xg][V]
--------[wggatuinx][0], node[CwlFGBJYSXywcA9Qqt25xg], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=lHmaALapQ1qElJRtpQAgug], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11386]
-----node_id[B-MnLxupT2qkXpWstJb6mA][V]
---- unassigned
--------[wggatuinx][1], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[wggatuinx][2], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[wggatuinx][3], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[wggatuinx][4], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[wggatuinx][5], node[null], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
, 


{
  "nodes" : {
    "CwlFGBJYSXywcA9Qqt25xg" : {
      "node_name" : "node_t2",
      "least_available" : {
        "path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-2",
        "total_bytes" : 21632,
        "used_bytes" : 0,
        "free_bytes" : 21632,
        "free_disk_percent" : 100.0,
        "used_disk_percent" : 0.0
      },
      "most_available" : {
        "path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-2",
        "total_bytes" : 21632,
        "used_bytes" : 0,
        "free_bytes" : 21632,
        "free_disk_percent" : 100.0,
        "used_disk_percent" : 0.0
      }
    },
    "B-MnLxupT2qkXpWstJb6mA" : {
      "node_name" : "node_t1",
      "least_available" : {
        "path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-1",
        "total_bytes" : 63302156288,
        "used_bytes" : 9738104832,
        "free_bytes" : 53564051456,
        "free_disk_percent" : 84.6,
        "used_disk_percent" : 15.4
      },
      "most_available" : {
        "path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-1",
        "total_bytes" : 63302156288,
        "used_bytes" : 9738104832,
        "free_bytes" : 53564051456,
        "free_disk_percent" : 84.6,
        "used_disk_percent" : 15.4
      }
    }
  },
  "shard_sizes" : {
    "[wggatuinx][0][p]_bytes" : 11386
  },
  "shard_data_set_sizes" : {
    "[wggatuinx][0]_bytes" : 11386
  },
  "shard_paths" : {
    "NodeAndShard[nodeId=CwlFGBJYSXywcA9Qqt25xg, shardId=[wggatuinx][0]]" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-2"
  },
  "reserved_sizes" : [
    {
      "node_id" : "CwlFGBJYSXywcA9Qqt25xg",
      "path" : "/dev/shm/bk/bk-agent-prod-gcp-1710158680974142795/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_7FA2BD7AB6A11F7E-001/tempDir-003/node-2",
      "total" : 0,
      "shards" : [
        "[wggatuinx][0]"
      ]
    }
  ]
}, 
^^^ cluster info is complete and correct

SnapshotShardSizeInfo{snapshotShardSizes=[[snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][2]]=>11469, [snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][3]]=>11869, [snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][4]]=>11392, [snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][5]]=>12075, [snapshot=repo:snap/gLsztBgmR1G2j8MufCwrSQ, index=[wggatuinx/neUKsrfBR2iP0GAWlUpV2g], shard=[wggatuinx][1]]=>12389]}
^^^ all snapshot sizes are known and available. They were also available during the prior allocation round

[2024-03-11T18:25:16,540][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][0] marked shard as started (routing: [wggatuinx][0], node[CwlFGBJYSXywcA9Qqt25xg], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=lHmaALapQ1qElJRtpQAgug], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11386])
^^^ allocated correctly in the prior round

[2024-03-11T18:25:16,540][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] running delegate allocator
[2024-03-11T18:25:16,542][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][4] marked shard as started (routing: [wggatuinx][4], node[CwlFGBJYSXywcA9Qqt25xg], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=-gFVMSNgRoeo1Fv53EIBhQ], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11392])
^^^ allocated incorrectly by the balanced allocator, despite all available information being present

[2024-03-11T18:25:16,542][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][1] marked shard as started (routing: [wggatuinx][1], node[B-MnLxupT2qkXpWstJb6mA], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=Ehs2RXxxTaaYu-puiIjWOg], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[12389])
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][2] marked shard as started (routing: [wggatuinx][2], node[B-MnLxupT2qkXpWstJb6mA], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=v3Yh-Ie1SEWjezhmdIiwfg], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11469])
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][3] marked shard as started (routing: [wggatuinx][3], node[B-MnLxupT2qkXpWstJb6mA], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=wz8am3sZSGm7LteXF6GN2Q], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[11869])
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] [wggatuinx][5] marked shard as started (routing: [wggatuinx][5], node[B-MnLxupT2qkXpWstJb6mA], [P], recovery_source[snapshot recovery [Us0WnflFQ-eNYAoK_hOaBQ] from repo:snap/gLsztBgmR1G2j8MufCwrSQ], s[INITIALIZING], a[id=BJ7H_lu6QKyV8WvVZsaHRA], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-03-11T12:25:16.245Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[12075])
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] Desired balance computation for [15] is still not converged after [0s] and [1] iterations
[2024-03-11T18:25:16,543][TRACE][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] running delegate allocator
[2024-03-11T18:25:16,549][DEBUG][o.e.c.r.a.a.DesiredBalanceComputer] [node_t0] Desired balance computation for [15] converged after [0s] and [2] iterations
[2024-03-11T18:25:16,550][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Desired balance updated: DesiredBalance[lastConvergedIndex=15, assignments={[wggatuinx][2]=ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0], [wggatuinx][3]=ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0], [wggatuinx][4]=ShardAssignment[nodeIds=[CwlFGBJYSXywcA9Qqt25xg], total=1, unassigned=0, ignored=0], [wggatuinx][5]=ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0], [wggatuinx][0]=ShardAssignment[nodeIds=[CwlFGBJYSXywcA9Qqt25xg], total=1, unassigned=0, ignored=0], [wggatuinx][1]=ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]}]. Diff: 
[wggatuinx][2]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]
[wggatuinx][3]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]
[wggatuinx][4]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[CwlFGBJYSXywcA9Qqt25xg], total=1, unassigned=0, ignored=0]
[wggatuinx][5]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]
[wggatuinx][1]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[B-MnLxupT2qkXpWstJb6mA], total=1, unassigned=0, ignored=0]

@idegtiarenko
Contributor

According to the logs from the latest failure:

Shard sizes
shard=[zoxdkcqpuouh][0]=>12166
shard=[zoxdkcqpuouh][3]=>12451
shard=[zoxdkcqpuouh][2]=>12535
shard=[zoxdkcqpuouh][4]=>12594
shard=[zoxdkcqpuouh][5]=>13006

Nodes
"r4ludfGvTh2wZ7UGP-SjWA" / "node_t1" <-- big node
"zIF6uDFXRYKPVDXFSMTRQQ" / "node_t2" <-- small node

Allocation of all shards happens in 2 rounds.
Round 1:

routing_nodes:
-----node_id[zIF6uDFXRYKPVDXFSMTRQQ][V]
-----node_id[r4ludfGvTh2wZ7UGP-SjWA][V]
---- unassigned
--------[zoxdkcqpuouh][0], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][1], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][2], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][3], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][4], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]
--------[zoxdkcqpuouh][5], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[fetching_shard_data]], failed_attempts[0]

  "shard_sizes" : { },
  "reserved_sizes" : [ ]

SnapshotShardSizeInfo{snapshotShardSizes=[
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][0]]=>12166, 
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][1]]=>13187, 
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][2]]=>12535, 
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][5]]=>13006, 
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][4]]=>12594, 
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][3]]=>12451
]}

[2024-04-06T16:17:35,503][TRACE][o.e.c.r.a.a.DesiredBalanceShardsAllocator] [node_t0] Desired balance updated: DesiredBalance[lastConvergedIndex=16, assignments={
[zoxdkcqpuouh][1]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1],
[zoxdkcqpuouh][0]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1],
[zoxdkcqpuouh][5]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0],
[zoxdkcqpuouh][4]=ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1],
[zoxdkcqpuouh][3]=ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0],
[zoxdkcqpuouh][2]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]
}]. Diff:
[zoxdkcqpuouh][5]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]
[zoxdkcqpuouh][3]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0]
[zoxdkcqpuouh][2]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]

Nothing out of the ordinary here.

Round 2:

routing_nodes:
-----node_id[zIF6uDFXRYKPVDXFSMTRQQ][V]
--------[zoxdkcqpuouh][3], node[zIF6uDFXRYKPVDXFSMTRQQ], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[INITIALIZING], a[id=092Hbr2cSEakrjwApnkZow], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[12451]
-----node_id[r4ludfGvTh2wZ7UGP-SjWA][V]
--------[zoxdkcqpuouh][2], node[r4ludfGvTh2wZ7UGP-SjWA], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[INITIALIZING], a[id=TpE9Bw8VTAy1CJz-pDXquQ], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[12535]
--------[zoxdkcqpuouh][5], node[r4ludfGvTh2wZ7UGP-SjWA], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[INITIALIZING], a[id=2n_rRR9XQteq_585F1FliQ], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0], expected_shard_size[13006]
---- unassigned
--------[zoxdkcqpuouh][0], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[zoxdkcqpuouh][1], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]
--------[zoxdkcqpuouh][4], node[null], [P], recovery_source[snapshot recovery [ArTTnq9aTsel6jxW7feZ6Q] from repo:snap/qM6kXprETqqT0R_Qb0Wx-g], s[UNASSIGNED], unassigned_info[[reason=NEW_INDEX_RESTORED], at[2024-04-06T20:17:35.294Z], delayed=false, details[restore_source[repo/snap]], allocation_status[no_attempt]], failed_attempts[0]

  "shard_sizes" : {
    "[zoxdkcqpuouh][2][p]_bytes" : 0, <=== has reserved 12535 bytes
    "[zoxdkcqpuouh][3][p]_bytes" : 12451, <=== has reserved 0 bytes
    "[zoxdkcqpuouh][5][p]_bytes" : 0 <=== has no reserved entry
  },
  "reserved_sizes" : [
    {
      "node_id" : "zIF6uDFXRYKPVDXFSMTRQQ",
      "path" : "/dev/shm/bk/bk-agent-prod-gcp-1712433772034280571/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_E34CC9AAB7634065-001/tempDir-005/node-2",
      "total" : 0,
      "shards" : [
        "[zoxdkcqpuouh][3]"
      ]
    },
    {
      "node_id" : "r4ludfGvTh2wZ7UGP-SjWA",
      "path" : "/dev/shm/bk/bk-agent-prod-gcp-1712433772034280571/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_E34CC9AAB7634065-001/tempDir-005/node-1",
      "total" : 12535,
      "shards" : [
        "[zoxdkcqpuouh][2]"
      ]
    }
  ]
    "zIF6uDFXRYKPVDXFSMTRQQ" : {
      "node_name" : "node_t2",
      "least_available" : {
        "path" : "/dev/shm/bk/bk-agent-prod-gcp-1712433772034280571/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_E34CC9AAB7634065-001/tempDir-005/node-2",
        "total_bytes" : 22691,
        "used_bytes" : 0,
        "free_bytes" : 22691,
        "free_disk_percent" : 100.0,
        "used_disk_percent" : 0.0
      }
    },
    "r4ludfGvTh2wZ7UGP-SjWA" : {
      "node_name" : "node_t1",
      "least_available" : {
        "path" : "/dev/shm/bk/bk-agent-prod-gcp-1712433772034280571/elastic/elasticsearch-periodic/server/build/testrun/internalClusterTest/temp/org.elasticsearch.cluster.routing.allocation.decider.DiskThresholdDeciderIT_E34CC9AAB7634065-001/tempDir-005/node-1",
        "total_bytes" : 63302148096,
        "used_bytes" : 9228197888,
        "free_bytes" : 54073950208,
        "free_disk_percent" : 85.4,
        "used_disk_percent" : 14.6
      },

SnapshotShardSizeInfo{snapshotShardSizes=[
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][1]]=>13187, 
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][4]]=>12594,
[snapshot=repo:snap/qM6kXprETqqT0R_Qb0Wx-g, index=[zoxdkcqpuouh/ZsO0w7BsQRy3hAFwsIpZ2g], shard=[zoxdkcqpuouh][0]]=>12166
]}

Desired balance updated: DesiredBalance[lastConvergedIndex=17, assignments={
[zoxdkcqpuouh][5]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0], 
[zoxdkcqpuouh][4]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0], 
[zoxdkcqpuouh][3]=ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0], 
[zoxdkcqpuouh][2]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0], 
[zoxdkcqpuouh][1]=ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0], 
[zoxdkcqpuouh][0]=ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0]
}]. Diff: 
[zoxdkcqpuouh][4]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]
[zoxdkcqpuouh][1]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[r4ludfGvTh2wZ7UGP-SjWA], total=1, unassigned=0, ignored=0]
[zoxdkcqpuouh][0]: ShardAssignment[nodeIds=[], total=1, unassigned=1, ignored=1] -> ShardAssignment[nodeIds=[zIF6uDFXRYKPVDXFSMTRQQ], total=1, unassigned=0, ignored=0]

The round-2 balance is calculated incorrectly because it is based on incorrect cluster info (see the sketch below):

  • [zoxdkcqpuouh][5][p]_bytes has neither a size nor a reserved entry
  • zIF6uDFXRYKPVDXFSMTRQQ reports "used_bytes" : 0 even though one shard is already on it (possibly related to TestFileStore and how we mock it?)
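A back-of-the-envelope sketch (hypothetical code, using the numbers from the cluster info above, not the DiskThresholdDecider implementation) of the accounting that appears to be missing in round 2: with only 22691 bytes on the small node and [zoxdkcqpuouh][3] already initializing with an expected size of 12451 bytes, a second ~12 KB shard cannot fit, but with used_bytes reported as 0 and no reserved entry it still looks like it does.

// Hypothetical illustration with values taken from the logs above.
public class WatermarkAccountingSketch {
    public static void main(String[] args) {
        long totalBytes = 22_691;              // node_t2 (zIF6uDFXRYKPVDXFSMTRQQ) total_bytes
        long reportedUsedBytes = 0;            // "used_bytes" : 0 as reported in the cluster info
        long initializingShardBytes = 12_451;  // expected_shard_size of [zoxdkcqpuouh][3]
        long candidateShardBytes = 12_166;     // smallest remaining shard, [zoxdkcqpuouh][0]

        long naiveFree = totalBytes - reportedUsedBytes;                             // 22691
        long adjustedFree = totalBytes - reportedUsedBytes - initializingShardBytes; // 10240

        System.out.println("naive free:    " + naiveFree + " bytes, candidate fits: " + (candidateShardBytes <= naiveFree));
        System.out.println("adjusted free: " + adjustedFree + " bytes, candidate fits: " + (candidateShardBytes <= adjustedFree));
    }
}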

@idegtiarenko
Contributor

I have not found any particular reason why the balance was calculated in 2 rounds here; however, I consider this a valid scenario.
I also do not think we have any guarantee of a point-in-time snapshot of ClusterInfo that would explain the value the test observed.
