Set global checkpoint before open engine from store (#27972)
In PR #27965, we set the global checkpoint from the translog in a store
recovery. However, we set it after the engine is opened. This causes the
global checkpoint assertion in TranslogWriter to be violated if we are
forced to close the engine before we set the global checkpoint. A
closing engine closes the translog, which in turn reads the current
global checkpoint; however, it is still unassigned and therefore smaller
than the initial global checkpoint from the translog.

Closes #27970
dnhatn authored and ywelsch committed Dec 23, 2017
1 parent ac43544 commit 146e82d
Showing 1 changed file with 7 additions and 4 deletions.
11 changes: 7 additions & 4 deletions core/src/main/java/org/elasticsearch/index/shard/IndexShard.java
@@ -1342,16 +1342,19 @@ private void innerOpenEngineAndTranslog(final EngineConfig.OpenMode openMode, fi
         // we disable deletes since we allow for operations to be executed against the shard while recovering
         // but we need to make sure we don't loose deletes until we are done recovering
         config.setEnableGcDeletes(false);
+        if (openMode == EngineConfig.OpenMode.OPEN_INDEX_AND_TRANSLOG) {
+            // we have to set it before we open an engine and recover from the translog because
+            // acquiring a snapshot from the translog causes a sync which causes the global checkpoint to be pulled in,
+            // and an engine can be forced to close in ctor which also causes the global checkpoint to be pulled in.
+            globalCheckpointTracker.updateGlobalCheckpointOnReplica(Translog.readGlobalCheckpoint(translogConfig.getTranslogPath()),
+                "read from translog checkpoint");
+        }
         Engine newEngine = createNewEngine(config);
         verifyNotClosed();
         if (openMode == EngineConfig.OpenMode.OPEN_INDEX_AND_TRANSLOG) {
             // We set active because we are now writing operations to the engine; this way, if we go idle after some time and become inactive,
             // we still give sync'd flush a chance to run:
             active.set(true);
-            // we have to set it before we recover from the translog as acquring a snapshot from the translog causes a sync which
-            // causes the global checkpoint to be pulled in.
-            globalCheckpointTracker.updateGlobalCheckpointOnReplica(getEngine().getTranslog().getLastSyncedGlobalCheckpoint(),
-                "read from translog");
             newEngine.recoverFromTranslog();
         }
         assertSequenceNumbersInCommit();
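
The ordering problem described in the commit message can be illustrated with a small standalone sketch. It does not use the Elasticsearch classes: `CheckpointOrderingSketch`, `ToyTranslog`, the `UNASSIGNED` sentinel, and `openEngineThatFailsInCtor` are all hypothetical names invented for this illustration, and the check in `close()` only approximates the kind of invariant the TranslogWriter assertion enforces. The point is simply that a translog closed during engine construction reads the tracker's value, so the tracker must already hold the persisted checkpoint by then.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class CheckpointOrderingSketch {
    // Hypothetical sentinel meaning "no global checkpoint known yet".
    static final long UNASSIGNED = -2L;

    /** Toy translog: remembers the checkpoint persisted on disk and re-reads the live value on close. */
    static final class ToyTranslog {
        private final long persistedCheckpoint;
        private final LongSupplier globalCheckpointSupplier;

        ToyTranslog(long persistedCheckpoint, LongSupplier globalCheckpointSupplier) {
            this.persistedCheckpoint = persistedCheckpoint;
            this.globalCheckpointSupplier = globalCheckpointSupplier;
        }

        void close() {
            long current = globalCheckpointSupplier.getAsLong();
            // Stand-in for the invariant: the checkpoint seen on close must not be behind the persisted one.
            if (current < persistedCheckpoint) {
                throw new IllegalStateException(
                    "global checkpoint " + current + " is behind persisted checkpoint " + persistedCheckpoint);
            }
        }
    }

    /**
     * Simulates an engine whose constructor fails and closes its translog.
     * Returns true if the close passes the invariant check, false otherwise.
     */
    static boolean openEngineThatFailsInCtor(long persistedCheckpoint, boolean setCheckpointBeforeOpen) {
        AtomicLong tracker = new AtomicLong(UNASSIGNED);
        if (setCheckpointBeforeOpen) {
            // The ordering this commit introduces: seed the tracker before the engine exists.
            tracker.set(persistedCheckpoint);
        }
        ToyTranslog translog = new ToyTranslog(persistedCheckpoint, tracker::get);
        try {
            translog.close(); // simulated forced close inside the engine constructor
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("set before open: " + openEngineThatFailsInCtor(42L, true));
        System.out.println("set after open:  " + openEngineThatFailsInCtor(42L, false));
    }
}
```

With the tracker seeded first, the simulated close succeeds; with the old ordering, the tracker is still unassigned when the translog reads it and the check fails.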