New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Downsampling: copy the _tier_preference setting #96982
Downsampling: copy the _tier_preference setting #96982
Conversation
We need to override the actual value because the target index already has a default value set, "data_content". Because of the default value set on the target index we were not overriding it with the settings coming from the source index. This results in potential usage of different tiers for the downsampling source and target index.
Pinging @elastic/es-analytics-geo (Team:Analytics) |
Hi @salvatore-campagna, I've created a changelog YAML for you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for fixing this Salvatore 🚀
@@ -595,6 +598,10 @@ private IndexMetadata.Builder copyIndexMetadata(IndexMetadata sourceIndexMetadat | |||
if (FORBIDDEN_SETTINGS.contains(key)) { | |||
continue; | |||
} | |||
|
|||
if (OVERRIDE_SETTINGS.contains(key)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method could be package private and even static in order to be unit tested. Perhaps in a subsequent PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to unit test the copyIndexMetadata()
method and changing the visibility to package-protected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also need to make this static and then make indexScopedSettings
a parameter to this method.
This PR assumes that the downsample index lives in the same tier as the source index. If that's the desired requirement here this is ready to be merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a unit test, otherwise LGTM.
@@ -91,6 +91,275 @@ | |||
- match: { test-rollup.settings.index.default_pipeline: null } | |||
- match: { test-rollup.settings.index.final_pipeline: null } | |||
|
|||
--- | |||
"Downsample index with tier preference": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe only keep one yaml test? (downsampling data stream with tier preference?)
I think rest should be tested via unit tests. We have been adding a lot of yaml tests,
which is good, but also are more costly compared to unit tests. If things can be tested
with unit tests we should always prefer that and keep using yaml tests to real test integration at the rest api level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote different tests because in the two scenarios the default value we are overwriting is different. In one case it is "data_content" while in the other is "data_hot". My understanding is that this setting is applied differently and I wanted to make sure we are ALWAYS overwriting that.
I assume this for the following reasons:
|
|
||
private static void assertSourceSettings(final IndexMetadata indexMetadata, final Settings settings) { | ||
assertEquals(indexMetadata.getIndex().getName(), settings.get(IndexMetadata.INDEX_DOWNSAMPLE_SOURCE_NAME_KEY)); | ||
assertEquals( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While writing this test I realised that copying the tier preference from the source index might not always be the right thing to do.
I wonder if the following scenario is possible considering that, at least at the moment, the name of the downsample target index is generated automatically and not user provided.
A user might define an index template that matches the downsample target index name and that template applies setting to the rollup index, including, for instance, the _tier_preference
. In that case we would override a user defined setting which probably we don't want. I think, anyway, that we have no way to know if the setting is coming from a template. Also consider that the value we see at this stage changes:
- if the index is a data stream backing index the value we see is "data_hot"
- if the index is a "standalone" index the value we see if "data_content"
Does it make sense that we override the value only if the target index setting is the default (data_hot or data_content) and the source index setting is set with a different value? Or should we introduce an additional downsampling request parameter to define the tier to use for downsampling?
I am not sure what is the right strategy here to apply...because if for instance the template matches and applies a certain user-selected _tier_preference
using the template, then we would override the user choice.
@martijnvg @andreidan any idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it is possible to setup an index template for downsample-
index prefix, I don't think this will happen often?
And I see this is an anti-pattern. My thinking here is that all the templates are defined on the data stream level. Whenever a new backing index of a data stream is created, the mappings/settings from the template that matches with the data stream is applied. I don't think that during the lifecycle of a backing index whenever we downsample, shrink or do create a searchable snapshot, we should apply a whole different template. Even though the concrete backing index gets replaced by another concrete backing index.
I don't think we should let custom _tier_preference
get applied here. ILM is actually in charge of that. I think we should maybe even prevent templates from being applied the backing indices getting downsampled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++ I think we shouldn't worry about _tier_preference
being custom configured in index templates and we should copy it from the source index. As Martijn said, ILM is in charge of transitioning the indices through tiers and will change this setting as part of the migrate
action in different phases.
Custom, user-defined allocations will still be able to exist via the custom node attributes.
IMO we shouldn't add special handling in index templates for this particular namespace (which might already exist for certain users out there - outside of the downsampling feature)
We need to override the actual value because the target index already has a default value set, "data_content" or "data_hot" (when the index is a datastream backing index). Because of the default value set on the target index we were not overriding it with the settings coming from the source index. This results in potential usage of different tiers for the downsampling source and target index.
We need to override the actual value because the target index already has a default value set,
"data_content". Because of the default value set on the target index we were not overriding it
with the settings coming from the source index. This results in potential usage of different tiers
for the downsampling source and target index.
Here we explicitly override the default with the setting coming from the source index.
As a result the downsampling target index will use the same
_tier_preference
of thedownsampling source index.
Resolves #96733