fix: reorder cleanup sequence in node-cache teardown #690


Open · wants to merge 1 commit into master

Conversation


@zooneon zooneon commented May 21, 2025

The previous implementation removed the dummy interface before cleaning up iptables rules, which could leave orphaned iptables rules referencing the removed interface. This fix ensures proper cleanup order by:

  1. First removing all iptables rules
  2. Then removing the dummy interface

This change prevents potential issues where iptables rules might reference a non-existent interface after teardown.

I found this in my testing and confirmed that it was indeed causing errors.
On deletion or update, the dummy interface was deleted first, causing a disconnect and DNS queries to fail.
After I changed the order of the teardown logic, it worked without errors.
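The reordering described above can be sketched as follows. This is a minimal illustration, not the actual kubernetes/dns code: the type and method names (`cacheApp`, `teardownIptables`, `teardownInterface`) are hypothetical stand-ins for the real teardown logic.

```go
package main

import "fmt"

// cacheApp is a hypothetical stand-in for the node-cache app; it
// records the order in which cleanup steps run.
type cacheApp struct {
	log []string
}

// teardownIptables stands in for deleting the iptables rules that
// steer DNS traffic to the node-local listen IPs.
func (c *cacheApp) teardownIptables() error {
	c.log = append(c.log, "iptables")
	return nil
}

// teardownInterface stands in for deleting the dummy interface that
// holds the node-local DNS listen IPs.
func (c *cacheApp) teardownInterface() error {
	c.log = append(c.log, "interface")
	return nil
}

// teardown removes the iptables rules first, so no rule can still
// redirect queries toward an interface that no longer exists.
func (c *cacheApp) teardown() {
	if err := c.teardownIptables(); err != nil {
		fmt.Println("failed to clean up iptables rules:", err)
	}
	if err := c.teardownInterface(); err != nil {
		fmt.Println("failed to remove dummy interface:", err)
	}
}

func main() {
	c := &cacheApp{}
	c.teardown()
	fmt.Println(c.log) // iptables rules removed before the interface
}
```

With the previous order reversed, there is no window in which a rule can forward a DNS query to an already-deleted interface.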

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 21, 2025
@k8s-ci-robot
Contributor

Welcome @zooneon!

It looks like this is your first PR to kubernetes/dns 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/dns has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 21, 2025
@k8s-ci-robot k8s-ci-robot requested a review from bowei May 21, 2025 03:02
@k8s-ci-robot
Contributor

Hi @zooneon. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zooneon
Once this PR has been reviewed and has the lgtm label, please assign damiansawicki for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from MrHohn May 21, 2025 03:02
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label May 21, 2025
@Michcioperz
Contributor

Rules created in initIptables don't reference the dummy interface, only the IP addresses that are assigned to it.

Did you run into an issue which is fixed by this PR? Can you talk more about it?

@zooneon
Author

zooneon commented May 22, 2025

@Michcioperz, ah, I'm sorry my explanation wasn't clear enough.
As you said, the rules created by initIptables do not reference the dummy interface directly. However, because of those rules, DNS query packets are forwarded to the local DNS rather than the cluster DNS, and the dummy interface is used in that process.
So when the dummy interface is deleted while the iptables rules remain, the packets cannot reach the local DNS through the interface and are forced to error out.
As mentioned in the description, I discovered this while testing and actually saw the error occur; once I changed the order of the tearDown logic, it worked without errors.

How I tested

  • From a test pod, query an NGINX pod running in a different namespace with the dig command every 0.1 seconds.
  • On the worker node where the test pod is running, check the iptables rules and the dummy interface deletion order, and check whether the test pod throws an error.

Finally, I think it makes sense to delete them in the reverse order in which they were created.
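The "reverse order of creation" principle maps naturally onto Go's defer, which runs cleanups in LIFO order. A minimal sketch (event names are illustrative, not the actual kubernetes/dns identifiers):

```go
package main

import "fmt"

// setupAndTeardown records setup and cleanup events. Each defer is
// registered right after the resource it cleans up is created, so
// deferred cleanups run in the reverse order of creation: the
// iptables rules (created last) are deleted first, and the dummy
// interface (created first) is deleted last.
func setupAndTeardown() (events []string) {
	events = append(events, "create interface")
	defer func() { events = append(events, "delete interface") }()

	events = append(events, "create iptables rules")
	defer func() { events = append(events, "delete iptables rules") }()

	return events // deferred cleanups append to the named result
}

func main() {
	fmt.Println(setupAndTeardown())
}
```

Pairing each setup step with its cleanup this way makes the correct teardown order fall out automatically, rather than having to remember it in a separate teardown function.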

@zooneon
Author

zooneon commented May 22, 2025

I'm attaching videos of my test.

Deleting the dummy interface first, then the iptables rules:
current_version

Deleting the iptables rules first, then the dummy interface:
modified_version

Each screen is described below.

  • Top left: watch 'iptables -t raw -nvL' every 0.5 seconds.
  • Bottom left: watch 'ip a' every 0.5 seconds.
  • Top right: the command to delete node-local DNS.
  • Bottom right: a test pod querying an NGINX pod in a different namespace with the dig command, to show the error being thrown.

@Michcioperz
Contributor

Ohh, yeah, I think I understand now. If the interface is deleted first but the rules are still there, there is a short window where traffic is still kept on the node, because it bypasses the Kubernetes Service iptables rules; at the same time there is no interface on the node those packets can go to, so they go into the void.

I was worried this could be related to an issue I'm fixing, and it has the same symptoms, but a different timescale and cause.

@zooneon
Author

zooneon commented May 22, 2025

Yep, I read the PR (#689) that you created. My understanding is that it addresses processes not shutting down gracefully due to deadlocks (so the pod is terminated without the teardown logic completing).
What I want to resolve is that even when the pod shuts down gracefully, as that PR intends, errors can still occur because of iptables rules that remain due to the deletion order.
Thanks for clarifying. 😄

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
3 participants