Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean-up phase in krator fixed where a channel was closed prematurely #557

Merged
merged 3 commits into from
Apr 9, 2021
Merged

Clean-up phase in krator fixed where a channel was closed prematurely #557

merged 3 commits into from
Apr 9, 2021

Conversation

siegfriedweber
Copy link
Contributor

Fixes #556

@kflansburg
Copy link
Collaborator

Thanks for catching this. It sounds like keeping the manifest object around until the end of run_object_task has fixed the issue for you, but I would expect there to still be a race condition here. Do you think that we should await at the end of run_object_task until the Deleted event arrives?

@siegfriedweber
Copy link
Contributor Author

siegfriedweber commented Mar 31, 2021

I am fairly new to Rust so treat my proposals with caution.

We have the following options:

  1. Do not send a manifest after the resource was deleted, i.e. remove krator/src/runtime.rs#L207-L213.
  2. Add another Notify channel, e.g. resource_task_terminated besides deleted, which is awaited at the end of run_object_task. The deleted channel is actually notified very often:
    • in the Running phase when the deletion_timestamp is applied
    • in the Terminated phase when the new status is applied
    • when the run_object_task deletes the resource and the deletion_timestamp without a grace period is applied
    • when the Deleted event is handled

I would opt for option 1 because the resource is already deregistered and nobody listens anymore to manifest updates.

@kflansburg
Copy link
Collaborator

I think its important that Deleted events get sent to the channel in case one arrives when Krator was not already shutting down. I think this happens when someone runs kubectl delete --force --grace-period=0.

I think my suggestion was that Manifest would have a new method on it that allows you to await for it to be deleted.

@kflansburg
Copy link
Collaborator

@siegfriedweber Let me know what you think of my commit. I think your solution makes sense, I just want to make sure that the Manifest isn't dropped until the Deleted event is received.

We now have a lot of Notifies, but I think that can be refactored later.

@siegfriedweber
Copy link
Contributor Author

I fixed a copy&paste error and moved the notification of deleted_event after the manifest was sent.

Copy link
Collaborator

@kflansburg kflansburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@kflansburg kflansburg merged commit 7d2467e into krustlet:main Apr 9, 2021
@siegfriedweber siegfriedweber deleted the krator_cleanup_fixed branch April 9, 2021 06:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Channel in krator is closed prematurely
2 participants