Fix ENI compatibility regression between 1.7 <-> 1.8 #14991
In 1.8, some ENI-specific fields were moved from spec.ENI to spec.IPAM to make them available to Azure IPAM. The old fields still need to be set to account for:

* Downgrade of the operator from 1.8 to 1.7 while the agents remain on 1.8.
* A race condition in which the agents are upgraded from 1.7 to 1.8 before the operator is upgraded to 1.8.

While this version mismatch exists, the operator assumes a MinAllocate of 0 and an InstanceID of "". This leads to the following incorrect behavior:

* The operator will not allocate up to MinAllocate.
* The empty InstanceID results in incorrect attachment of ENIs to the Status.ENI field, i.e., ENIs of other nodes get reported on a node. This in turn results in IPs being used on a node for ENIs that are not attached to that node.

Fixes: 04d2538 ("eni: No longer set Spec.ENI.PreAllocate")
Fixes: f0dfcb4 ("k8s: CiliumNode: Move generic IPAM parmeters into Spec.IPAM")
Spotted-by: André Martins <andre@cilium.io>
Signed-off-by: Thomas Graf <thomas@cilium.io>
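A minimal sketch of the compatibility idea described above: when the agent writes its node spec, it fills in both the new and the legacy field locations, so that an operator of either version reads real values instead of zero values. The types `NodeSpec`, `ENISpec`, `IPAMSpec` and the helper `mirrorLegacyFields` are hypothetical, simplified stand-ins and not the actual CiliumNode types; in particular, where InstanceID lives in the new layout is an assumption made for illustration.

```go
package main

// ENISpec is a hypothetical stand-in for the legacy ENI section of the
// node spec, which an older (1.7) operator reads.
type ENISpec struct {
	InstanceID  string
	MinAllocate int
}

// IPAMSpec is a hypothetical stand-in for the generic IPAM section,
// which a newer (1.8) operator reads.
type IPAMSpec struct {
	MinAllocate int
}

// NodeSpec is a simplified, hypothetical node spec. Placing InstanceID
// at the top level is an assumption of this sketch.
type NodeSpec struct {
	InstanceID string
	ENI        ENISpec
	IPAM       IPAMSpec
}

// mirrorLegacyFields copies the values from the new locations into the
// legacy ENI fields so that an operator that only knows the old layout
// does not fall back to MinAllocate == 0 and InstanceID == "".
func mirrorLegacyFields(spec *NodeSpec) {
	spec.ENI.InstanceID = spec.InstanceID
	spec.ENI.MinAllocate = spec.IPAM.MinAllocate
}
```

Calling `mirrorLegacyFields(&spec)` just before the spec is written keeps the two locations in sync for as long as mixed operator and agent versions may coexist.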
Fix ForeachInterface to not return all ENIs if the instanceID is not known. None of the callers expect this behavior, and if the instanceID is not known for some reason (for example, due to the recent bug in 1.7 <-> 1.8 compatibility handling), all ENIs are iterated over instead of just the ENIs of the desired instance.

Signed-off-by: Thomas Graf <thomas@cilium.io>
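The behavior change can be illustrated with a small sketch. It assumes a flat list of interfaces and a hypothetical `InterfaceStore` type; the real ForeachInterface signature and data structures in Cilium are not shown in this PR text and may differ. The point is only that an empty instanceID is treated as "match nothing" rather than "match everything".

```go
package main

// Interface is a hypothetical, simplified record of an ENI and the
// instance it is attached to.
type Interface struct {
	ID         string
	InstanceID string
}

// InterfaceStore is a hypothetical container used only for this sketch.
type InterfaceStore struct {
	interfaces []Interface
}

// ForeachInterface calls fn for every interface attached to instanceID.
// An empty instanceID is treated as unknown and matches no interface,
// so a caller with a missing ID never sees ENIs of other instances.
func (s *InterfaceStore) ForeachInterface(instanceID string, fn func(Interface)) {
	if instanceID == "" {
		return
	}
	for _, iface := range s.interfaces {
		if iface.InstanceID == instanceID {
			fn(iface)
		}
	}
}
```

With this guard, a node whose instanceID was lost during a version mismatch reports no ENIs at all instead of reporting every ENI known to the operator.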
test-me-please
GKE CI failure:
1.12-kernel-4.9 failure:
Both are unrelated to the change.
retest-gke
retest-4.9
LGTM, nice catch
An empty instanceID will cause issues. Add a check to ensure that instanceID is not empty. If it is empty, retry. We also increase the number of retries from 5 to 10.

An option I considered was to also avoid the `Get()` operation if there is a conflict. I did not do this since it would go against the explicit intention of the original code: cilium#11673.

Related: cilium#14991
Suggested-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
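The retry logic described in this commit message could look roughly like the following sketch. The function `resolveInstanceID`, the `fetch` callback, and the error value are hypothetical names introduced for illustration; only the ideas of "retry while the instanceID is empty" and "up to 10 attempts" come from the commit message. The sketch omits any backoff between attempts and the conflict handling around `Get()` mentioned above.

```go
package main

import "errors"

// maxRetries mirrors the retry count of 10 named in the commit message.
const maxRetries = 10

// errEmptyInstanceID is a hypothetical error for this sketch.
var errEmptyInstanceID = errors.New("instance ID is still empty")

// resolveInstanceID calls fetch until it yields a non-empty instance ID
// or maxRetries attempts have been made. fetch is a hypothetical
// callback standing in for whatever lookup produces the ID.
func resolveInstanceID(fetch func() (string, error)) (string, error) {
	var lastErr error
	for attempt := 0; attempt < maxRetries; attempt++ {
		id, err := fetch()
		if err != nil {
			lastErr = err
			continue
		}
		if id != "" {
			return id, nil
		}
		// An empty ID is treated like a failed attempt and retried.
		lastErr = errEmptyInstanceID
	}
	return "", lastErr
}
```

Treating an empty ID as a retryable condition keeps it from ever being written out, which is what the checks in the related change guard against on the reading side.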
[ upstream commit 91e68c2 ]

An empty instanceID will cause issues. Add a check to ensure that instanceID is not empty. If it is empty, retry. We also increase the number of retries from 5 to 10.

An option I considered was to also avoid the `Get()` operation if there is a conflict. I did not do this since it would go against the explicit intention of the original code: #11673.

Related: #14991
Suggested-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
@christarazi I think it's just generic code that is intended to be used from ENI, Azure, and potentially GCP in the future. It looks like we could change it. The merged PR already did so for ENI.
[ upstream commit 91e68c2 ]

An empty instanceID will cause issues. Add a check to ensure that instanceID is not empty. If it is empty, retry. We also increase the number of retries from 5 to 10.

An option I considered was to also avoid the `Get()` operation if there is a conflict. I did not do this since it would go against the explicit intention of the original code: #11673.

Related: #14991
Suggested-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com>
Signed-off-by: Jarno Rajahalme <jarno@covalent.io>