-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(elasticsearch): Implement optimization to use reindexing instead… #8352
Conversation
@@ -326,6 +328,12 @@ public void configure() { | |||
} | |||
} | |||
|
|||
@Override | |||
public String reindexAsync(String index, @Nullable QueryBuilder filterQuery, BatchWriteOperationsOptions options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At first glance, this doesn't seem right to me. The graph service index is not timeseries
and would not be subject to timeseries-like truncation. Remove reindexAsync
from the common interface. It is only needed on the time series related classes.
@@ -44,6 +46,12 @@ public void configure() { | |||
indexBuilders.reindexAll(); | |||
} | |||
|
|||
@Override | |||
public String reindexAsync(String index, @Nullable QueryBuilder filterQuery, BatchWriteOperationsOptions options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, remove from the interface (also applies to below) non-timeseries Service classes.
@@ -220,6 +224,27 @@ public void buildIndex(ReindexConfig indexState) throws IOException { | |||
} | |||
} | |||
|
|||
public String reindexInPlaceAsync(String indexAlias, @Nullable QueryBuilder filterQuery, BatchWriteOperationsOptions options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can do, I'll move that. That seems better than taking in the batch size and timeout as parameters since we may want to do other tuning (or support other storages that have different parameters) in the future
@@ -220,6 +224,27 @@ public void buildIndex(ReindexConfig indexState) throws IOException { | |||
} | |||
} | |||
|
|||
public String reindexInPlaceAsync(String indexAlias, @Nullable QueryBuilder filterQuery, BatchWriteOperationsOptions options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very much just for timeseries, I'd say we should name it differently to make that clear. It also then makes since that a timeseries method would take a timeseries package options parameter.
|
||
|
||
public interface ElasticSearchIndexed { | ||
String reindexAsync(String index, @Nullable QueryBuilder filterQuery, BatchWriteOperationsOptions options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets remove it from this interface, it is not generally a method for all indices at this time.
@@ -33,6 +36,12 @@ public void reindexAll() { | |||
} | |||
} | |||
|
|||
@Override |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should exist, perhaps not an override but depending on how the clean-up of the ElasticsearchIndex interface goes, you may end up wanting to create a new interface like ElasticSearchIndexedTimeseries (yikes that's long). The timeseries interface applies to timeseries Services' indices (or a subset). Most Services will not be implementing the tiimeseries methods: entity, system metadata, and graph for example do not require the timeseries truncations.
Testing on quickstart instance: Dry run:
Execute:
Number of records after the reindex:
|
… of deleteByQuery when a large amount of records will be deleted
… of deleteByQuery when a large amount of records will be deleted
Checklist