Add global checkpoint tracking on the primary #26666

Merged
jasontedor merged 25 commits into elastic:master from track-global-checkpoints on Sep 18, 2017

Conversation

jasontedor (Member):

This commit adds local tracking of the global checkpoints on all shard copies when a global checkpoint tracker is operating in primary mode. With this, we relay the global checkpoint on a shard copy back to the primary shard during replication operations. This serves as another step towards adding a background sync of the global checkpoint to the shard copies.

Relates #26591
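
Conceptually, the primary keeps a small piece of state per shard copy and advances its local knowledge of each copy's global checkpoint from replication responses. Below is a minimal, self-contained sketch of that idea, not the actual Elasticsearch classes; the field names mirror the CheckpointState usage visible in the diff later in this conversation.

import java.util.HashMap;
import java.util.Map;

// Sketch only: stands in for the tracker's per-allocation CheckpointState.
class CheckpointStateSketch {
    long localCheckpoint;   // highest sequence number processed on the copy
    long globalCheckpoint;  // global checkpoint as last known on the copy
    boolean inSync;         // whether the copy is in the in-sync set

    CheckpointStateSketch(long localCheckpoint, long globalCheckpoint, boolean inSync) {
        this.localCheckpoint = localCheckpoint;
        this.globalCheckpoint = globalCheckpoint;
        this.inSync = inSync;
    }
}

// Sketch only: the primary-mode side of the tracker.
class PrimaryModeTrackerSketch {
    final Map<String, CheckpointStateSketch> checkpoints = new HashMap<>();

    // Called on the primary when a replica's replication response relays its global checkpoint.
    void updateGlobalCheckpointForShard(String allocationId, long globalCheckpoint) {
        final CheckpointStateSketch cps = checkpoints.get(allocationId);
        if (cps != null && globalCheckpoint > cps.globalCheckpoint) {
            cps.globalCheckpoint = globalCheckpoint; // local knowledge only ever moves forward
        }
    }
}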

jasontedor (Member, Author):

@ywelsch I want to write more tests for this (and I'm open to any suggestions that you have) but I want to start review cycles early to keep moving.

@@ -385,12 +395,24 @@ void markShardCopyAsStaleIfNeeded(ShardId shardId, String allocationId, Runnable
}

/**
* An interface to encapsulate the metadata needed from replica shards when they respond to operations performed on them
* An interface to encapsulate the metadata needed from replica shards when they respond to operations performed on them.
Contributor:

💯

/*
* A replica should always know its own local checkpoint so this should always be a valid sequence number or the pre-6.0 local
* A replica should always know its own local checkpoints so this should always be valida sequence number or the pre-6.0
Contributor:

valida?

assert !primaryMode
|| getGlobalCheckpoint() <= inSyncCheckpointStates(checkpoints, CheckpointState::getLocalCheckpoint, LongStream::min);

// when in primary mode, the local knowledge of the global checkpoints on shard copies is bounded by the global checkpoint
Contributor:

👍

logger.trace("updating global checkpoint from [{}] to [{}] due to [{}]", this.globalCheckpoint, globalCheckpoint, reason);
this.globalCheckpoint = globalCheckpoint;
if (getGlobalCheckpoint() <= globalCheckpoint) {
logger.trace("updating global checkpoint from [{}] to [{}] due to [{}]", getGlobalCheckpoint(), globalCheckpoint, reason);
Contributor:

Store getGlobalCheckpoint() in a variable so that we don't have to do the map lookup twice? (It's called on the line above as well, and the updateGlobalCheckpointOnReplica method is called quite often.)
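
For illustration, a sketch of that suggestion (the local variable name is ours; the surrounding method shape is assumed from the diff above):

final long computedGlobalCheckpoint = getGlobalCheckpoint(); // single map lookup
if (computedGlobalCheckpoint <= globalCheckpoint) {
    logger.trace("updating global checkpoint from [{}] to [{}] due to [{}]",
        computedGlobalCheckpoint, globalCheckpoint, reason);
    // ... then apply the update to the stored checkpoint for this shard copy ...
}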

logger.trace("updating global checkpoint from [{}] to [{}] due to [{}]", getGlobalCheckpoint(), globalCheckpoint, reason);
final long unassignedSeqNo = SequenceNumbers.UNASSIGNED_SEQ_NO;
final CheckpointState cps =
checkpoints.computeIfAbsent(allocationId, k -> new CheckpointState(unassignedSeqNo, unassignedSeqNo, true));
Contributor:

checkpoints.get(allocationId) is always non-null, so there is no need for the computeIfAbsent. Also, this is a third map lookup. Maybe it would be better to write this method as:

final CheckpointState cps = checkpoints.get(allocationId);
if (cps.globalCheckpoint <= globalCheckpoint) {
   logger.trace(...);
   cps.globalCheckpoint = globalCheckpoint;
}

Contributor:

Maybe we can even share code with updateGlobalCheckpointForShard?
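
One possible shape of such a shared helper, as a sketch: the LongConsumer callback mirrors the updateGlobalCheckpoint(allocationId, globalCheckpoint, current -> {}) call that shows up later in this diff, but the exact signature is an assumption.

// requires java.util.function.LongConsumer
private void updateGlobalCheckpoint(final String allocationId, final long globalCheckpoint, final LongConsumer ifUpdated) {
    final CheckpointState cps = checkpoints.get(allocationId);
    assert cps != null;
    final long previousGlobalCheckpoint = cps.globalCheckpoint;
    if (previousGlobalCheckpoint <= globalCheckpoint) { // mirrors the comparison suggested above
        cps.globalCheckpoint = globalCheckpoint;
        ifUpdated.accept(previousGlobalCheckpoint);     // e.g. so callers can trace-log the old value
    }
}

Both updateGlobalCheckpointForShard and updateGlobalCheckpointOnReplica could then delegate to this helper, each passing its own trace-logging callback.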

@@ -50,7 +57,7 @@
*/
public class GlobalCheckpointTracker extends AbstractIndexShardComponent {

private final String allocationId;
final String allocationId;
Contributor:

Some methods in this class also take allocationId as a parameter, which is a bit unfortunate. Maybe rename this field (I don't have a good suggestion right now)?

jasontedor (Member, Author):

For now I renamed it to shardAllocationId. Let me know if you prefer something else.

if (lcps.localCheckpoint != SequenceNumbers.UNASSIGNED_SEQ_NO &&
lcps.localCheckpoint != SequenceNumbersService.PRE_60_NODE_LOCAL_CHECKPOINT) {
lcps.localCheckpoint = SequenceNumbers.UNASSIGNED_SEQ_NO;
// forget all checkpoint information except for current shard (should we forget local checkpoint for current shard as well?)
Contributor:

Change this comment to say "forget all checkpoint information except for the global checkpoint of the current shard".

routingTable = IndexShardRoutingTable.Builder.readFrom(in);
}

public long clusterStateVersion() {
return clusterStateVersion;
}

public Map<String, LocalCheckpointState> getLocalCheckpoints() {
public Map<String, CheckpointState> getLocalCheckpoints() {
Contributor:

rename this method

@@ -2028,7 +2028,7 @@ public void testSeqNoAndCheckpoints() throws IOException {
final Set<String> indexedIds = new HashSet<>();
long localCheckpoint = SequenceNumbers.NO_OPS_PERFORMED;
long replicaLocalCheckpoint = SequenceNumbers.NO_OPS_PERFORMED;
long globalCheckpoint = SequenceNumbers.UNASSIGNED_SEQ_NO;
long globalCheckpoint;
Contributor:

AFAICS this can be made final

builder.addShard(primaryShard);

if (primaryShard.relocating()) {
// builder.addShard(primaryShard.getTargetRelocatingShard());
Contributor:

The RoutingTable never has the relocation target shard; it's automatically included with the relocating shard. Just remove these two lines.

* master:
  fix testSniffNodes to use the new error message
  Add check for invalid index in WildcardExpressionResolver (elastic#26409)
  Docs: Use single-node discovery.type for dev example
  Filter unsupported relation for range query builder (elastic#26620)
  Fix kuromoji default stoptags (elastic#26600)
  [Docs] Add description for missing fields in Reindex/Update/Delete By Query (elastic#26618)
  [Docs] Update ingest.asciidoc (elastic#26599)
  Better message text for ResponseException
  [DOCS] Remove edit link from ML node
  enable bwc testing
  fix StartRecoveryRequestTests.testSerialization
  Add bad_request to the rest-api-spec catch params (elastic#26539)
  Introduce a History UUID as a requirement for ops based recovery  (elastic#26577)
  Add missing catch arguments to the rest api spec (elastic#26536)
ywelsch (Contributor) left a comment:

LGTM

assert primaryMode;
assert handoffInProgress == false;
assert invariant();
updateGlobalCheckpoint(allocationId, globalCheckpoint, current -> {});
Contributor:

Also add trace logging when updating the global checkpoint knowledge of another shard copy?
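
For example, a sketch only (the trace message wording is illustrative, not the final text):

final CheckpointState cps = checkpoints.get(allocationId);
if (cps.globalCheckpoint <= globalCheckpoint) {
    logger.trace("updating local knowledge of the global checkpoint for [{}] from [{}] to [{}]",
        allocationId, cps.globalCheckpoint, globalCheckpoint);
    cps.globalCheckpoint = globalCheckpoint;
}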

final IndicesService indicesService =
internalCluster().getInstance(IndicesService.class, node.getName());
final IndexService indexService = indicesService.indexService(primaryShardRouting.index());
final IndexShard indexShard = indexService.getShardOrNull(primaryShardRouting.id());
Contributor:

shorter: indicesService.getShardOrNull(primaryShardRouting.shardId())

jasontedor merged commit c238b79 into elastic:master Sep 18, 2017
jasontedor added a commit that referenced this pull request Sep 18, 2017
This commit adds local tracking of the global checkpoints on all shard
copies when a global checkpoint tracker is operating in primary
mode. With this, we relay the global checkpoint on a shard copy back to
the primary shard during replication operations. This serves as another
step towards adding a background sync of the global checkpoint to the
shard copies.

Relates #26666
jasontedor added a commit that referenced this pull request Sep 18, 2017
This commit adds local tracking of the global checkpoints on all shard
copies when a global checkpoint tracker is operating in primary
mode. With this, we relay the global checkpoint on a shard copy back to
the primary shard during replication operations. This serves as another
step towards adding a background sync of the global checkpoint to the
shard copies.

Relates #26666
jasontedor added a commit that referenced this pull request Sep 18, 2017
This commit reenables the BWC tests after they were disabled for
backporting the change to track global checkpoints of shard copies on
the primary.

Relates #26666
jasontedor added a commit that referenced this pull request Sep 18, 2017
When checking that the global checkpoint on the primary is consistent
with the local checkpoints of the in-sync shards, we have to filter
pre-6.0 nodes from the check or the invariant will trivially trip. This
commit filters these nodes out when checking this invariant.

Relates #26666
jasontedor added a commit that referenced this pull request Sep 18, 2017
When checking that the global checkpoint on the primary is consistent
with the local checkpoints of the in-sync shards, we have to filter
pre-6.0 nodes from the check or the invariant will trivially trip. This
commit filters these nodes out when checking this invariant.

Relates #26666
jasontedor added a commit that referenced this pull request Sep 18, 2017
This commit reenables the BWC tests after they were disabled for
backporting the change to track global checkpoints of shard copies on
the primary.

Relates #26666
jasontedor added a commit that referenced this pull request Sep 18, 2017
When checking that the global checkpoint on the primary is consistent
with the local checkpoints of the in-sync shards, we have to filter
pre-6.0 nodes from the check or the invariant will trivially trip. This
commit filters these nodes out when checking this invariant.

Relates #26666
jasontedor added a commit that referenced this pull request Sep 18, 2017
This commit reenables the BWC tests after they were disabled for
backporting the change to track global checkpoints of shard copies on
the primary.

Relates #26666
jasontedor (Member, Author):

Thanks @ywelsch. Safely backported and BWC tests reenabled.

jasontedor deleted the track-global-checkpoints branch September 18, 2017 11:11
jasontedor added a commit that referenced this pull request Sep 18, 2017
After recovery completes from a primary, we now update the local
knowledge on the primary of the global checkpoint on the recovery
target. However if this occurs concurrently with a relocation, an
assertion could trip that we are no longer in primary mode. As this
local knowledge should only be tracked when we are in primary mode,
updating this local knowledge should be done under a permit. This commit
causes that to be the case.

Relates #26666
jasontedor added a commit that referenced this pull request Sep 18, 2017
After recovery completes from a primary, we now update the local
knowledge on the primary of the global checkpoint on the recovery
target. However if this occurs concurrently with a relocation, an
assertion could trip that we are no longer in primary mode. As this
local knowledge should only be tracked when we are in primary mode,
updating this local knowledge should be done under a permit. This commit
causes that to be the case.

Relates #26666
jasontedor added a commit that referenced this pull request Sep 18, 2017
After recovery completes from a primary, we now update the local
knowledge on the primary of the global checkpoint on the recovery
target. However if this occurs concurrently with a relocation, an
assertion could trip that we are no longer in primary mode. As this
local knowledge should only be tracked when we are in primary mode,
updating this local knowledge should be done under a permit. This commit
causes that to be the case.

Relates #26666
lcawl removed the v6.1.0 label Dec 12, 2017
clintongormley added the :Distributed/Engine label (Anything around managing Lucene and the Translog in an open shard.) and removed the :Sequence IDs label Feb 14, 2018
Labels: blocker, :Distributed/Engine (Anything around managing Lucene and the Translog in an open shard.), >enhancement, v6.0.0-rc1, v7.0.0-beta1
Projects: none yet
Development: successfully merging this pull request may close these issues: none yet
5 participants