Make setting index.translog.sync_interval be dynamic #37382
Pinging @elastic/es-distributed
This PR has the same flaw as the attempt in #34104. Repeating my comment here:
While there's testing on the IndexSettings level, I don't think that the update logic actually takes effect on a live shard. The reason for this is that the fsyncTask (see IndexService) is currently only rescheduled when translog durability changes, not when the translog sync interval changes.
@elasticmachine test this please
Thanks for the update @liketic. I think there's still a bug in this code, which means we should also look at beefing up the tests around setting both durability and the sync interval. I've added a comment to detail how to simplify the implementation. Let me know if you need more assistance on this.
if (durability != oldTranslogDurability) {
final TimeValue syncInterval = indexSettings.getTranslogSyncInterval();
if (syncInterval.equals(oldSyncInterval) == false) {
rescheduleFsyncTask(() -> new AsyncTranslogFSync(IndexService.this));
What if both durability and sync interval are changed in the same update? This update here ignores the change to durability. In particular, if durability is REQUEST, this will now suddenly schedule a sync task when the sync interval is changed, even though no such task is supposed to run under REQUEST durability.
It might be better to restructure the sync task scheduling code to not look at oldTranslogDurability, but just at whether we have a current task and whether we're supposed to have one. If both hold, check that the intervals match (you can get the current task's interval directly from the task; see refreshTask.getInterval() a few lines above as an example), and then just do a plain rescheduleFsyncTask without parameters, where it gets all it needs from the indexSettings (which have been updated).
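The decision described in this comment can be sketched as a small stateless function. This is only an illustrative model, not the actual IndexService code: the class FsyncDecision, the Action enum, and the decide method are hypothetical names, and java.time.Duration stands in for Elasticsearch's TimeValue.

```java
import java.time.Duration;

// Hypothetical, simplified model of the suggested logic; not the real IndexService code.
class FsyncDecision {
    enum Durability { REQUEST, ASYNC }
    enum Action { NONE, CANCEL, CREATE, RESCHEDULE }

    /**
     * Decides what to do with the periodic fsync task after a settings update.
     *
     * @param currentInterval interval of the running task, or null if no task is scheduled
     * @param durability      the already-updated translog durability
     * @param wantedInterval  the already-updated translog sync interval
     */
    static Action decide(Duration currentInterval, Durability durability, Duration wantedInterval) {
        boolean hasTask = currentInterval != null;
        boolean wantTask = durability == Durability.ASYNC;
        if (!wantTask) {
            // REQUEST durability: no periodic task should exist.
            return hasTask ? Action.CANCEL : Action.NONE;
        }
        if (!hasTask) {
            return Action.CREATE;
        }
        // Both present and wanted: only act if the interval actually differs.
        return currentInterval.equals(wantedInterval) ? Action.NONE : Action.RESCHEDULE;
    }

    public static void main(String[] args) {
        Duration five = Duration.ofSeconds(5);
        Duration ten = Duration.ofSeconds(10);
        System.out.println(decide(five, Durability.REQUEST, ten)); // CANCEL
        System.out.println(decide(null, Durability.ASYNC, ten));   // CREATE
        System.out.println(decide(five, Durability.ASYNC, ten));   // RESCHEDULE
        System.out.println(decide(ten, Durability.ASYNC, ten));    // NONE
    }
}
```

Because the function looks only at the current task and the updated settings, it never needs the old durability value, which is the simplification the comment is asking for.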
Thanks @ywelsch, so are we going to do something like:
    if (fsyncTask != null && durability == Translog.Durability.ASYNC
        && fsyncTask.getInterval().equals(indexSettings.getTranslogSyncInterval()) == false) {
        rescheduleFsyncTask();
    }
    private void rescheduleFsyncTask() {
        try {
            if (fsyncTask != null) {
                fsyncTask.close();
            }
        } finally {
            fsyncTask = indexSettings.getTranslogDurability() == Translog.Durability.REQUEST ? null : new AsyncTranslogFSync(this);
        }
    }
I'm a little confused: what if the current fsyncTask is null, or durability is changed from ASYNC to REQUEST? In those cases rescheduleFsyncTask() will not be called. Or we can check whether durability has changed first, like:
            final Translog.Durability oldTranslogDurability = indexSettings.getTranslogDurability();
            if (oldTranslogDurability != durability) {
                rescheduleFsyncTask();
            } else if (fsyncTask != null && fsyncTask.getInterval().equals(indexSettings.getTranslogSyncInterval()) == false) {
                rescheduleFsyncTask();
            }
        }
    }
    private void rescheduleFsyncTask() {
        try {
            if (fsyncTask != null) {
                fsyncTask.close();
            }
        } finally {
            fsyncTask = indexSettings.getTranslogDurability() == Translog.Durability.REQUEST ? null : new AsyncTranslogFSync(this);
        }
    }
Could you help me? Thanks in advance.
How about the following:
diff --git a/server/src/main/java/org/elasticsearch/index/IndexService.java b/server/src/main/java/org/elasticsearch/index/IndexService.java
index 604d27a1e70..da95ef3eff2 100644
--- a/server/src/main/java/org/elasticsearch/index/IndexService.java
+++ b/server/src/main/java/org/elasticsearch/index/IndexService.java
@@ -197,7 +197,7 @@ public class IndexService extends AbstractIndexComponent implements IndicesClust
         this.refreshTask = new AsyncRefreshTask(this);
         this.trimTranslogTask = new AsyncTrimTranslogTask(this);
         this.globalCheckpointTask = new AsyncGlobalCheckpointTask(this);
-        rescheduleFsyncTask(indexSettings.getTranslogDurability());
+        updateFsyncTaskIfNecessary();
     }
     public int numberOfShards() {
@@ -636,8 +636,6 @@ public class IndexService extends AbstractIndexComponent implements IndicesClust
     @Override
     public synchronized void updateMetaData(final IndexMetaData currentIndexMetaData, final IndexMetaData newIndexMetaData) {
-        final Translog.Durability oldTranslogDurability = indexSettings.getTranslogDurability();
-
         final boolean updateIndexMetaData = indexSettings.updateIndexMetaData(newIndexMetaData);
         if (Assertions.ENABLED
@@ -689,20 +687,23 @@ public class IndexService extends AbstractIndexComponent implements IndicesClust
                 });
                 rescheduleRefreshTasks();
             }
-            final Translog.Durability durability = indexSettings.getTranslogDurability();
-            if (durability != oldTranslogDurability) {
-                rescheduleFsyncTask(durability);
-            }
+            updateFsyncTaskIfNecessary();
         }
     }
-    private void rescheduleFsyncTask(Translog.Durability durability) {
-        try {
-            if (fsyncTask != null) {
-                fsyncTask.close();
+    private void updateFsyncTaskIfNecessary() {
+        if (indexSettings.getTranslogDurability() == Translog.Durability.REQUEST) {
+            try {
+                if (fsyncTask != null) {
+                    fsyncTask.close();
+                }
+            } finally {
+                fsyncTask = null;
             }
-        } finally {
-            fsyncTask = durability == Translog.Durability.REQUEST ? null : new AsyncTranslogFSync(this);
+        } else if (fsyncTask == null) {
+            fsyncTask = new AsyncTranslogFSync(this);
+        } else {
+            fsyncTask.updateIfNeeded();
         }
     }
@@ -841,6 +842,13 @@ public class IndexService extends AbstractIndexComponent implements IndicesClust
             super(indexService, indexService.getIndexSettings().getTranslogSyncInterval());
         }
+        void updateIfNeeded() {
+            final TimeValue newInterval = indexService.getIndexSettings().getTranslogSyncInterval();
+            if (newInterval.equals(getInterval()) == false) {
+                setInterval(newInterval);
+            }
+        }
+
         @Override
         protected String getThreadPool() {
             return ThreadPool.Names.FLUSH;
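To make the behaviour of the proposed updateFsyncTaskIfNecessary easier to exercise, in particular the case where durability and sync interval change in the same update, here is a rough standalone model. It is a sketch under assumptions: FsyncTaskModel, its nested Task class, and the apply method are hypothetical stand-ins, java.time.Duration replaces TimeValue, and no real scheduling takes place.

```java
import java.time.Duration;

// Hypothetical stand-in for IndexService's fsync-task bookkeeping; names are illustrative only.
class FsyncTaskModel {
    enum Durability { REQUEST, ASYNC }

    // Minimal fake of a periodic task: it only records its interval and whether it was closed.
    static final class Task {
        private Duration interval;
        private boolean closed;

        Task(Duration interval) { this.interval = interval; }
        Duration getInterval() { return interval; }
        void setInterval(Duration interval) { this.interval = interval; }
        boolean isClosed() { return closed; }
        void close() { closed = true; }
    }

    private Task task; // null means no periodic fsync is scheduled

    Task task() { return task; }

    // Applies already-updated settings; durability and interval may both have changed.
    void apply(Durability durability, Duration syncInterval) {
        if (durability == Durability.REQUEST) {
            if (task != null) {
                task.close();
            }
            task = null;
        } else if (task == null) {
            task = new Task(syncInterval);
        } else if (!task.getInterval().equals(syncInterval)) {
            task.setInterval(syncInterval); // update in place instead of recreating
        }
    }

    public static void main(String[] args) {
        FsyncTaskModel model = new FsyncTaskModel();
        model.apply(Durability.ASYNC, Duration.ofSeconds(5));
        Task first = model.task();

        // Interval-only change: the same task is kept and its interval is updated.
        model.apply(Durability.ASYNC, Duration.ofSeconds(10));
        System.out.println(model.task() == first); // true

        // Durability and interval change together: the task is cancelled, none is created.
        model.apply(Durability.REQUEST, Duration.ofSeconds(30));
        System.out.println(model.task() == null); // true
    }
}
```

A test along these lines against the real IndexService would cover the combined durability and interval update raised earlier in the review.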
@liketic Are you still interested in moving this forward?
@michaelbaamonde Yes, will update soon.
ping @liketic
@ywelsch please review again.
LGTM. Thank you @liketic!
@elasticmachine retest this please
@elasticmachine retest this please
@elasticmachine retest this please
CI run failed for unrelated reasons:
@elasticmachine retest this please
another unrelated test failure:
/cc: @dnhatn @elasticmachine run elasticsearch-ci/1
Currently, the index setting index.translog.sync_interval cannot be updated while the index is open: it is not dynamic, so it can only be changed on a closed index. Closes #32763