[YUNIKORN-282] K8shim may have staled nodes causing problems. #158

Closed

yangwwei
Contributor

When nodes are added and removed back and forth, a node can be re-created with the same name. In that case the shared index informer may trigger an Update event instead of an Add, because the resource key is the same. If the cache still holds stale info for that node, the fix we made for YUNIKORN-220 does not work. We need to make sure that when a node is deleted, the corresponding info is also deleted from the cache.
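The scenario can be illustrated with a minimal sketch. It is not the actual k8shim code: `Node`, `nodeCache`, and the `onAdd`/`onUpdate`/`onDelete` handlers are hypothetical names, and the assumption is a cache keyed by node name whose handlers are called from the informer callbacks. The point it shows is that a delete must remove the cache entry, and an update must tolerate a node the cache does not know yet.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a Kubernetes node object.
// UID differs between two nodes that share the same name.
type Node struct {
	Name string
	UID  string
}

// nodeCache is a hypothetical, simplified shim-side cache keyed by node name.
type nodeCache struct {
	nodes map[string]Node
}

func newNodeCache() *nodeCache {
	return &nodeCache{nodes: make(map[string]Node)}
}

// onAdd stores the node under its name.
func (c *nodeCache) onAdd(n Node) {
	c.nodes[n.Name] = n
}

// onUpdate replaces the cached entry wholesale. Because the informer may
// deliver an Update instead of an Add for a re-created node, a missing
// entry is treated the same way as an Add.
func (c *nodeCache) onUpdate(oldNode, newNode Node) {
	c.nodes[newNode.Name] = newNode
}

// onDelete removes the entry so no stale info survives the node.
func (c *nodeCache) onDelete(n Node) {
	delete(c.nodes, n.Name)
}

func main() {
	c := newNodeCache()
	c.onAdd(Node{Name: "node-1", UID: "a"})
	c.onDelete(Node{Name: "node-1", UID: "a"})
	// The node comes back with the same name; the event arrives as an Update.
	c.onUpdate(Node{Name: "node-1", UID: "a"}, Node{Name: "node-1", UID: "b"})
	fmt.Println(len(c.nodes), c.nodes["node-1"].UID)
}
```

Keying by name mirrors the informer's resource key, which is why the delete handler has to clean up explicitly: nothing else distinguishes the old node from its replacement.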

@yangwwei yangwwei added the bug Something isn't working label Jul 15, 2020
@yangwwei yangwwei self-assigned this Jul 15, 2020
Contributor

@wilfred-s wilfred-s left a comment

All checks pass.
The fix looks good and is in line with how pods are processed.

@wilfred-s wilfred-s closed this in eade57d Jul 15, 2020
wilfred-s pushed a commit that referenced this pull request Jul 15, 2020
Nodes can be removed and added back again using the same identifier.
The informer can change that remove and add into an update. This can
cause stale information in the shim cache.
The fix makes sure the cache is correctly cleaned up in this case.
YUNIKORN-220 relies on the cache to be clean.

Fixes: #158