Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Cilium Operator IPAM #11083

Merged
merged 19 commits into from May 13, 2020
Merged

Add Cilium Operator IPAM #11083

merged 19 commits into from May 13, 2020

Conversation

aanm
Copy link
Member

@aanm aanm commented Apr 21, 2020

This PR adds the Cilium Operator IPAM which is based on the Kubernetes IPAM module. It will be enabled by default for new installations which means that podCIDRs allocated in the Kubernetes node spec can be different from the podCIDR allocated by Cilium Operator as both IPAM modules are independent. All pods managed by Cilium will have their IPs allocated from the podCIDR set in the CiliumNode, by the Cilium Operator.

@aanm aanm added kind/feature This introduces new functionality. release-note/major This PR introduces major new functionality to Cilium. area/operator Impacts the cilium-operator component labels Apr 21, 2020
@aanm aanm requested a review from a team April 21, 2020 18:52
@aanm aanm requested review from a team as code owners April 21, 2020 18:52
@aanm aanm requested a review from a team April 21, 2020 18:52
@aanm aanm requested a review from a team as a code owner April 21, 2020 18:52
@maintainer-s-little-helper
Copy link

Commit 3eea2cf5b308bd5811b18ce71f24c9f06d044ae2 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Apr 21, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.8.0 Apr 21, 2020
@aanm aanm force-pushed the pr/add-ipam-operator-based branch from 3eea2cf to aaac3ec Compare April 21, 2020 18:53
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Apr 21, 2020
@aanm
Copy link
Member Author

aanm commented Apr 21, 2020

test-me-please

@coveralls
Copy link

coveralls commented Apr 21, 2020

Coverage Status

Coverage increased (+0.2%) to 38.074% when pulling b35b4bd on pr/add-ipam-operator-based into f8972f8 on master.

@joestringer
Copy link
Member

[Disclaimer: I didn't yet review.]

If I follow right, I can see the maintenance and operations perspective of this which should collapse the different IPAM paths into one common path with clear visibility / debugging steps via CRDs.

On the other hand, does this mean that for instance if the operator is down or there is a connectivity issue connecting to the operator, then nodes cannot allocate IPs or spin up pods? Is there a meaningful tradeoff here that may impact Cilium resiliency? Or are we pushing the core allocation logic into the operator and delegating IPAM management from the operator into Cilium so there's no impact?

@aanm aanm requested a review from a team as a code owner April 22, 2020 10:03
@aanm aanm force-pushed the pr/add-ipam-operator-based branch 2 times, most recently from aaac3ec to 6d5776b Compare April 24, 2020 08:16
@aanm aanm requested a review from a team as a code owner April 24, 2020 08:16
@aanm aanm force-pushed the pr/add-ipam-operator-based branch from 6d5776b to 0b6180f Compare April 24, 2020 08:16
@aanm
Copy link
Member Author

aanm commented Apr 24, 2020

test-me-please

@aanm aanm force-pushed the pr/add-ipam-operator-based branch from 0b6180f to 47dd512 Compare April 24, 2020 09:46
@aanm aanm marked this pull request as draft April 24, 2020 09:46
@aanm
Copy link
Member Author

aanm commented Apr 24, 2020

test-me-please

@aanm aanm force-pushed the pr/add-ipam-operator-based branch from 47dd512 to 99656a4 Compare April 24, 2020 11:22
@aanm
Copy link
Member Author

aanm commented Apr 24, 2020

test-me-please

@aanm aanm force-pushed the pr/add-ipam-operator-based branch from 99656a4 to cbc2027 Compare April 24, 2020 12:41
@aanm
Copy link
Member Author

aanm commented Apr 24, 2020

test-me-please

@aanm aanm force-pushed the pr/add-ipam-operator-based branch from 8ae878d to e1a1eaa Compare May 13, 2020 09:23
@aanm
Copy link
Member Author

aanm commented May 13, 2020

test-me-please

aanm added 19 commits May 13, 2020 12:43
Signed-off-by: André Martins <andre@cilium.io>
Initializing k8s client should not require to wait for the node spec to
be retrieved from k8s.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: André Martins <andre@cilium.io>
Distinguish between an update an create will be helpful for future
implementations of the `nodeEventHandler` interface.

Signed-off-by: André Martins <andre@cilium.io>
These functions will be helpful for the cilium operator CIDR allocator.

Signed-off-by: André Martins <andre@cilium.io>
This field will be useful to report cilium-operator error messages into
the CiliumNode structure.

Signed-off-by: André Martins <andre@cilium.io>
It will be helpful to use this interface to Create and Delete nodes
for the operator IPAM based mode.

Signed-off-by: André Martins <andre@cilium.io>
None of the implementation were respecting the arguments order of the
CiliumNodeGetterUpdater interface. This commit fixes all implementations
to respect such order.

Signed-off-by: André Martins <andre@cilium.io>
The implementation of the Update* methods do not necessary require the
origNode to be provided. In case the origNode is nil the client should
try updating the node regardless.

Signed-off-by: André Martins <andre@cilium.io>
All allocators should share the same interface to handle node events so
we can create a common interface for this purpose.

Signed-off-by: André Martins <andre@cilium.io>
All IPAM providers share the same code, we can remove the duplicated
code.

Signed-off-by: André Martins <andre@cilium.io>
This commit introduces the podCIDR implementation based on the k8s
podCIDR allocation. This commit only introduces the package with the
podCIDR implementation and integration with cilium-agent and
cilium-operator will be made in follow up commits.

Signed-off-by: André Martins <andre@cilium.io>
Some methods and constructors are useful for public usage, so this
commit exposes them.

Signed-off-by: André Martins <andre@cilium.io>
This commit adds the cilium-operator podCIDR allocator that is intended
to replace kubernetes podCIDR allocator, which might not always be
available. Cilium agents will rely on Cilium nodes to retrieve the
podCIDR allocated by the cilium-operator and will use that podCIDR to
allocate IPs for pods running on the nodes.

Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: André Martins <andre@cilium.io>
To avoid new CiliumNodes from getting a podCIDR of existing CiliumNodes,
when the Cilium Operator is starting, and synchronizing with k8s, the
allocation of new CiliumNodes should only happen once Cilium Operator
has synced with k8s.

Signed-off-by: André Martins <andre@cilium.io>
This way all ipam provider tags will be consistent

Signed-off-by: André Martins <andre@cilium.io>
@aanm aanm force-pushed the pr/add-ipam-operator-based branch from e1a1eaa to b35b4bd Compare May 13, 2020 10:44
@aanm
Copy link
Member Author

aanm commented May 13, 2020

test-me-please

@aanm
Copy link
Member Author

aanm commented May 13, 2020

hit #11512 and #11442

@aanm aanm merged commit 9a8a69a into master May 13, 2020
1.8.0 automation moved this from In progress to Merged May 13, 2020
@aanm aanm deleted the pr/add-ipam-operator-based branch May 13, 2020 14:01
@pires
Copy link

pires commented May 22, 2020

[Disclaimer: I didn't yet review.]

If I follow right, I can see the maintenance and operations perspective of this which should collapse the different IPAM paths into one common path with clear visibility / debugging steps via CRDs.

On the other hand, does this mean that for instance if the operator is down or there is a connectivity issue connecting to the operator, then nodes cannot allocate IPs or spin up pods? Is there a meaningful tradeoff here that may impact Cilium resiliency? Or are we pushing the core allocation logic into the operator and delegating IPAM management from the operator into Cilium so there's no impact?

Are there answers to these questions?

@joestringer
Copy link
Member

Are there answers to these questions?

I believe that @aanm answered them in another location. My understanding at this point is it's basically the latter option. During startup, Cilium retrieves the range from the operator then from that point onwards there's no dependency from a pod scheduling perspective. As always, there would still be other chores that the cilium-operator works on such as identity garbage collection but that's not at all impacted by these changes.

@aanm
Copy link
Member Author

aanm commented Jul 10, 2020

@pires @joestringer the answers are here #11083 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/operator Impacts the cilium-operator component kind/feature This introduces new functionality. release-note/major This PR introduces major new functionality to Cilium.
Projects
No open projects
1.8.0
  
Merged
Development

Successfully merging this pull request may close these issues.

None yet

7 participants