Cleanup new excludeOnUpgrade Saved Object API #106991

Open
1 of 6 tasks
joshdover opened this issue Jul 28, 2021 · 10 comments
Labels: Feature:Migrations; project:ResilientSavedObjectMigrations (Reduce Kibana upgrade failures by making saved object migrations more resilient); Team:Core (Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc)

Comments

@joshdover
Contributor

joshdover commented Jul 28, 2021

As a follow-up to #106534, there are several tasks we'd like to complete to "lock down" the API and improve the test coverage:

@joshdover joshdover added Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc project:ResilientSavedObjectMigrations Reduce Kibana upgrade failures by making saved object migrations more resilient Feature:Migrations labels Jul 28, 2021
@elasticmachine
Contributor

Pinging @elastic/kibana-core (Team:Core)

@joshdover
Contributor Author

@elastic/kibana-alerting-services Would it make sense for you all to take the last item in this list, "Generate task & action types from task manager registry instead of using hard-coded ones"? I imagine it's a pretty small change, and you all know better than I do how to do it in the most robust way.
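
To illustrate the idea behind that last item, here is a minimal sketch of deriving the list of task types from a registry rather than a hard-coded array, and then building an exclusion filter from it. Everything here is hypothetical and simplified for illustration: `TaskTypeRegistry`, `InMemoryTaskTypeRegistry`, `buildExcludeQuery`, the type prefixes, and the `task.taskType` field name are assumptions, not Kibana's actual task manager or saved objects API.

```typescript
// Hypothetical stand-in for a task type registry; not Kibana's real interface.
interface TaskTypeRegistry {
  getAllTypes(): string[];
}

// Minimal shape of the Elasticsearch-style filter this sketch produces.
interface ExcludeQuery {
  bool: {
    must: Array<{ terms: { [field: string]: string[] } }>;
  };
}

// Simple in-memory registry so the example is self-contained.
class InMemoryTaskTypeRegistry implements TaskTypeRegistry {
  private readonly types: string[] = [];

  register(type: string): void {
    // Ignore duplicate registrations.
    if (this.types.indexOf(type) === -1) {
      this.types.push(type);
    }
  }

  getAllTypes(): string[] {
    // Return a sorted copy so callers cannot mutate internal state.
    return this.types.slice().sort();
  }
}

// Build a filter matching every registered type that starts with one of the
// given prefixes, instead of relying on a hand-maintained list of type names.
function buildExcludeQuery(registry: TaskTypeRegistry, prefixes: string[]): ExcludeQuery {
  const matchingTypes = registry
    .getAllTypes()
    .filter((type) => prefixes.some((prefix) => type.indexOf(prefix) === 0));

  return {
    bool: {
      // "task.taskType" is an assumed field name used only for illustration.
      must: [{ terms: { 'task.taskType': matchingTypes } }],
    },
  };
}

// Usage: types registered later are picked up without touching the filter code.
const registry = new InMemoryTaskTypeRegistry();
registry.register('actions:.email');
registry.register('alerting:.index-threshold');
registry.register('reporting:execute');
const excludeQuery = buildExcludeQuery(registry, ['actions:', 'alerting:']);
```

The point of the design is that the filter cannot drift out of date when a new type is registered, which is the risk with a hard-coded list.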

@mikecote
Contributor

@joshdover, yup, that makes sense to me 👍 we can prioritize a separate issue on our side. Let me know if you want me to create it.

@joshdover
Contributor Author

@mikecote Created a new issue here: #107012

@joshdover
Contributor Author

Alternatively, we may be able to remove this API from master/8.0. The reason is that our recommended upgrade path is to go through 7.16 first, use the Upgrade Assistant, and then upgrade to 8.0. For users who follow the guidance, this should result in these old tasks & action_task_params getting filtered out on their upgrade to 7.16. Users who don't follow the guidance and migrate from < 7.13 to 8.0 may experience a slower upgrade, but it should still work.

Any thoughts on this @mikecote?

@mikecote
Contributor

mikecote commented Aug 5, 2021

@joshdover There is still a backlog item on the alerting side to clean up documents as failures are encountered (#55340). Until this is complete, there will be scenarios (not sure how likely) where the user could end up in the same state as before, with a lot of documents to clean up.

We did create a background cleanup task in 7.13 (#96971) to mitigate this problem, but there is still a chance a scenario like the above could happen. The user would need to have > 1,200 actions failing per hour, which is possible.

With that said, it sounds preferable to solve #55340 and remove this core API at the same time, I'm not sure if it can be deferred to 8.x instead of 8.0?
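
As a rough illustration of the 1,200-per-hour figure above, the sketch below estimates how many leftover documents would accumulate when failures outpace cleanup. It assumes, and this is only an assumption, that 1,200 documents per hour is the effective throughput of the background cleanup task; `projectedBacklog` is a hypothetical helper, not part of Kibana.

```typescript
// Estimate the number of uncleaned documents after a given number of hours.
// ASSUMPTION: cleanupPerHour = 1200 reflects the cleanup task's throughput,
// inferred from the "> 1,200 actions failing per hour" threshold in the thread.
function projectedBacklog(
  failuresPerHour: number,
  hours: number,
  cleanupPerHour: number = 1200
): number {
  // The backlog only grows when failures outpace cleanup.
  const netGrowthPerHour = Math.max(0, failuresPerHour - cleanupPerHour);
  return netGrowthPerHour * hours;
}

// Below the threshold, cleanup keeps up and nothing accumulates.
const belowThreshold = projectedBacklog(1000, 24);

// Above the threshold, the surplus accumulates every hour.
const aboveThreshold = projectedBacklog(1500, 24);
```

Under this assumption, a deployment with 1,500 failing actions per hour would accumulate 300 documents per hour, so the backlog grows steadily until the underlying cleanup issue is addressed.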

@joshdover
Contributor Author

> With that said, it sounds preferable to solve #55340 and remove this core API at the same time, I'm not sure if it can be deferred to 8.x instead of 8.0?

I think we can really only remove this in the short term if we fix #55340 for the 7.16 release, since that is a release we'd expect users to upgrade to before upgrading to 8.x. If #55340 is fixed in a later release, it's still possible for their 7.16 -> 8.x upgrade to be slow, since there may be many task objects created in 7.16 that will have to be processed in the migration.

If it's not feasible to fix #55340 for 7.16 then I think we're stuck with supporting this for some time and should proceed with the items in this issue.

@mikecote
Contributor

> If it's not feasible to fix #55340 for 7.16 then I think we're stuck with supporting this for some time and should proceed with the items in this issue.

Thanks for the insight. I'll have to default to the team's position that we don't foresee having the capacity to complete this in time for 7.16. That said, PRs are welcome if the ROI is worth it for the core team.

@mikecote
Contributor

In the meantime, I've added #55340 to our 7.16/8.0 candidates list so we don't lose sight of the issue.

@mikecote
Contributor

@lukeelmers good news, from the conversation above ^^, we've fixed #55340 so it should unblock this issue.
