Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up ShardFollowTasks for deleted indices #44702

Merged

Conversation

tlrx
Copy link
Member

@tlrx tlrx commented Jul 22, 2019

Deleting a follower index does not delete its ShardFollowTasks, potentially leaving many persistent tasks in the cluster that cannot be allocated on nodes and unnecessary fill the logs. This pull request adds a cluster state listener (ShardFollowTaskCleaner) that completes (with a failure) any persistent task that refers to a non existent follower index.

I think that this bug has been introduced by #34404: before this change the task would have been completed as failed and removed from the cluster state.

@tlrx tlrx added >bug :Distributed/CCR Issues around the Cross Cluster State Replication features v8.0.0 v7.4.0 labels Jul 22, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed

@tlrx
Copy link
Member Author

tlrx commented Jul 23, 2019

@elasticmachine run elasticsearch-ci/1
@elasticmachine run elasticsearch-ci/default-distro
@elasticmachine run elasticsearch-ci/packaging-sample

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up. I left a suggestion.

@tlrx tlrx requested a review from jasontedor July 23, 2019 12:09
@tlrx
Copy link
Member Author

tlrx commented Jul 23, 2019

Thanks @jasontedor, I applied your feedback. Can you please have another look?

Copy link
Member

@jasontedor jasontedor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@jasontedor
Copy link
Member

Can you apply this to 6.8 as well?

@tlrx tlrx merged commit 5afdca2 into elastic:master Jul 24, 2019
@tlrx tlrx deleted the clean-up-shard-follow-tasks-for-deleted-indices branch July 24, 2019 08:13
@tlrx
Copy link
Member Author

tlrx commented Jul 24, 2019

Thanks @jasontedor. I'll backport this change down the line to 6.8.

tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by elastic#34404: before this change the
task would have been completed as failed and removed from the cluster state.
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by elastic#34404: before this change the
task would have been completed as failed and removed from the cluster state.
tlrx added a commit that referenced this pull request Jul 24, 2019
…ex (#44801)

This commit unmutes and renames the test that failed on CI (#44796) 
after #44702 has been merged.

This test assumes that follow stats still exist after a follower index has 
been deleted. The follow stats are based on persistent tasks, and 
since #44702 the persistent tasks of deleted following indices are now
 automatically cleaned up to avoid to bloat the cluster state.

I don't think we should report any follow stats for deleted indices and I 
don't think that this test makes much sense now the tasks are cleaned 
up. This is why the test has been renamed.

closes #44796
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
…ex (elastic#44801)

This commit unmutes and renames the test that failed on CI (elastic#44796)
after elastic#44702 has been merged.

This test assumes that follow stats still exist after a follower index has
been deleted. The follow stats are based on persistent tasks, and
since elastic#44702 the persistent tasks of deleted following indices are now
 automatically cleaned up to avoid to bloat the cluster state.

I don't think we should report any follow stats for deleted indices and I
don't think that this test makes much sense now the tasks are cleaned
up. This is why the test has been renamed.

closes elastic#44796
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
…ex (elastic#44801)

This commit unmutes and renames the test that failed on CI (elastic#44796)
after elastic#44702 has been merged.

This test assumes that follow stats still exist after a follower index has
been deleted. The follow stats are based on persistent tasks, and
since elastic#44702 the persistent tasks of deleted following indices are now
 automatically cleaned up to avoid to bloat the cluster state.

I don't think we should report any follow stats for deleted indices and I
don't think that this test makes much sense now the tasks are cleaned
up. This is why the test has been renamed.

closes elastic#44796
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Jul 24, 2019
…ex (elastic#44801)

This commit unmutes and renames the test that failed on CI (elastic#44796)
after elastic#44702 has been merged.

This test assumes that follow stats still exist after a follower index has
been deleted. The follow stats are based on persistent tasks, and
since elastic#44702 the persistent tasks of deleted following indices are now
 automatically cleaned up to avoid to bloat the cluster state.

I don't think we should report any follow stats for deleted indices and I
don't think that this test makes much sense now the tasks are cleaned
up. This is why the test has been renamed.

closes elastic#44796
tlrx added a commit that referenced this pull request Jul 25, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by #34404: before this change the
task would have been completed as failed and removed from the cluster state.

Backport of #44702 and #44801 on 7.x
tlrx added a commit that referenced this pull request Jul 25, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by #34404: before this change the
task would have been completed as failed and removed from the cluster state.

Backport of #44702 and #44801 to 7.3
tlrx added a commit that referenced this pull request Jul 25, 2019
Deleting a follower index does not delete its ShardFollowTasks, potentially
leaving many persistent tasks in the cluster that cannot be allocated on
nodes and unnecessary fill the logs. This commit adds a cluster state listener
(ShardFollowTaskCleaner) that completes (with a failure) any persistent task
that refers to a non existent follower index.

I think that this bug has been introduced by #34404: before this change the
task would have been completed as failed and removed from the cluster state.

Backport of #44702 and #44801 on 6.8
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Distributed/CCR Issues around the Cross Cluster State Replication features v6.8.3 v7.3.1 v7.4.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants