Allow snapshot restore after write alias has been moved by ILM #73934
Comments
Pinging @elastic/es-distributed (Team:Distributed)
We (the @elastic/es-distributed team) discussed possible solutions in our team meeting today. Our favourite idea was to introduce a new option that would let you preserve the aliases of an existing index rather than overwriting them or clearing them as we do today. The reasoning was that when restoring an index like this you're really trying to put its data back without changing its place in the cluster, so the aliases of the existing index are likely more useful than the aliases in the snapshot. We discussed changing the default behaviour but decided it'd be surprising for the API to behave differently from today by default. Instead we would expect tooling that restores indices like this to use this new option explicitly. We also discussed whether to preserve any other metadata (mappings, settings, ...) rather than overwriting them from those in the snapshot but decided that there are too many ways that such a mechanism might lead to operational surprises. How does that sound @matschaffer?
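To make the three alias outcomes described in this comment concrete, here is a minimal Python sketch. It is purely illustrative: the mode names (`from_snapshot`, `none`, `preserve_existing`) and the function `aliases_after_restore` are hypothetical and do not correspond to any existing or planned Elasticsearch parameter.

```python
def aliases_after_restore(snapshot_aliases, existing_aliases, mode):
    """Return the aliases an index would carry after a restore.

    `mode` is a hypothetical label, not a real Elasticsearch option:
      "from_snapshot"     - overwrite with the aliases stored in the snapshot
      "none"              - restore without aliases, clearing any existing ones
      "preserve_existing" - keep the aliases of the index already in the cluster
    """
    if mode == "from_snapshot":
        return dict(snapshot_aliases)
    if mode == "none":
        return {}
    if mode == "preserve_existing":
        return dict(existing_aliases)
    raise ValueError(f"unknown mode: {mode!r}")


# The snapshot was taken while the index was still the write index;
# ILM has since rolled over, so the live index is read-only for this alias.
snapshot_aliases = {"matschaffer-filebeat-7.7.1": {"is_write_index": True}}
existing_aliases = {"matschaffer-filebeat-7.7.1": {"is_write_index": False}}

for mode in ("from_snapshot", "none", "preserve_existing"):
    print(mode, aliases_after_restore(snapshot_aliases, existing_aliases, mode))
```

Only the `preserve_existing` case leaves the index readable through the alias without reintroducing the stale write flag, which is the behaviour the comment argues for.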
Hard to say without a little more detail. My expectation would be that you have some ability to restore If the new option would do this, then that's probably fine. It'd be good if we make this the default in Kibana's restore UI, or maybe even in Elasticsearch itself. We see this with some frequency when orchestrating snapshot restore after VM failure on non-HA indices.
On closer inspection it seems that
cc @elastic/cloud-orchestration for comment/prioritization
I don't have a strong understanding of all the implications here, but if the recommendation from ES is to just set
Yep +1 here, though Dave, your wording here has me a little concerned.
Should we just always be setting
One additional thing that happens to us after a snapshot restore: by default, it restores the ILM policy, which means that ILM usually kicks in and removes the restored index shortly after the restore has completed, which is very annoying. We opened a support case on this and pretty much arrived at the conclusion that the snapshot web interface can't be used, so we have since used dev tools for this, which is kinda sad.
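One possible way to keep ILM from acting on a restored index is to drop the policy setting at restore time. The Python sketch below only builds the request body. `ignore_index_settings` is a documented restore parameter and `index.lifecycle.name` is the setting that attaches an ILM policy to an index, but the index name is a placeholder, and this is not necessarily what the support case recommended.

```python
import json


def restore_body_without_ilm(indices):
    """Build a restore request body that omits the ILM policy setting.

    `indices` is a list of index names to restore (placeholders here).
    """
    return {
        "indices": ",".join(indices),
        # Do not carry the ILM policy over from the snapshot, so ILM
        # does not immediately manage (and possibly delete) the index.
        "ignore_index_settings": ["index.lifecycle.name"],
    }


body = restore_body_without_ilm(["my-restored-index"])
print(json.dumps(body, indent=2))
```

The resulting body would be sent to `POST /_snapshot/<repository>/<snapshot>/_restore`, for example from dev tools.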
I've seen some cases where a snapshot restore has failed with an error like this:
The sequence of events is roughly:
1. `matschaffer-filebeat-7.7.1-2021.03.21-000095` is created via the `matschaffer-filebeat-7.7.1` write alias
2. A snapshot is taken of `matschaffer-filebeat-7.7.1-2021.03.21-000095` with the alias information
3. ILM rolls over from `matschaffer-filebeat-7.7.1-2021.03.21-000095` to `matschaffer-filebeat-7.7.1-2021.03.21-000096` and updates the write alias
4. `matschaffer-filebeat-7.7.1-2021.03.21-000095` is lost
5. The restore of `matschaffer-filebeat-7.7.1-2021.03.21-000095` fails because it attempts to also use the `matschaffer-filebeat-7.7.1` write index, currently backed by `matschaffer-filebeat-7.7.1-2021.03.21-000096`
To work around this I had to perform the restore manually without aliases:
Then replace the read alias so the restored data would be available via normal query load:
It'd be great if restore could be more ILM-aware such that it won't try to re-claim write indices already backed by a more-current index.
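The two workaround steps above (restore without aliases, then re-add the read alias) can be sketched roughly as follows. This is not the exact set of commands used in the report: the repository and snapshot names (`my-repository`, `my-snapshot`) are placeholders, and the helper functions are hypothetical. The sketch relies only on the documented `include_aliases` restore parameter and the `add` action of the `_aliases` API, and it just builds the requests rather than sending them.

```python
import json


def restore_without_aliases(repository, snapshot, index):
    """Build (method, path, body) for restoring one index without its aliases."""
    path = f"/_snapshot/{repository}/{snapshot}/_restore"
    body = {"indices": index, "include_aliases": False}
    return "POST", path, body


def add_read_alias(index, alias):
    """Build (method, path, body) for attaching `alias` to `index` as read-only."""
    body = {
        "actions": [
            # is_write_index=False leaves the current write index untouched.
            {"add": {"index": index, "alias": alias, "is_write_index": False}}
        ]
    }
    return "POST", "/_aliases", body


INDEX = "matschaffer-filebeat-7.7.1-2021.03.21-000095"
ALIAS = "matschaffer-filebeat-7.7.1"

requests_to_send = [
    restore_without_aliases("my-repository", "my-snapshot", INDEX),
    add_read_alias(INDEX, ALIAS),
]

for method, path, body in requests_to_send:
    print(method, path)
    print(json.dumps(body))
```

Keeping the alias step separate means the restored index only becomes visible to queries once the restore itself has succeeded, and the write alias stays with the newer `-000096` index.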