Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cilium BGPv1 Reconciler - Handle updated and deprecated Cidr fields for CiliumLoadBalancerIPPool #32694

Merged
merged 1 commit into from
May 31, 2024

Conversation

dswaffordcw
Copy link
Contributor

@dswaffordcw dswaffordcw commented May 23, 2024

Cilium BGPv1 Reconciler - Handle updated and deprecated Cidr fields for CiliumLoadBalancerIPPool

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

Fixes: #32693

Description

In 27322f3, the CiliumLoadBalancerIPPool's field named "cidrs" was deprecated. The documentation on https://docs.cilium.io/en/stable/network/lb-ipam/ provides an example of configuring a CiliumLoadBalancerIPPool using the field named "blocks". While testing a BGP policy configured with the Advertised Path Attributes feature (https://docs.cilium.io/en/stable/network/bgp-control-plane/#advertised-path-attributes), I was not able to achieve the desired policy. BGP attributes configured were not being applied.

While discussing this in Cilium's Slack channel, it was pointed out that the BGPv1 reconciler was only aware of the deprecated field.

This commit updates Cilium's BGPv1 reconciler to support both the deprecated and updated fields.

Unit Tests

Unit tests were updated as follows:

  • To ensure that the deprecated field continues to be supported
  • To ensure that the new field is supported
  • To ensure that mixing both new and deprecated fields is supported
  • To ensure that mixing both new and deprecated fields containing the same prefix is handled (by removing duplicate prefixes)

Manual Testing

For manual testing, I brought up the BGP development environment (see https://docs.cilium.io/en/stable/contributing/development/bgp_cplane/).

Steps included:

make kind-bgp-v4

KIND_CLUSTER_NAME=bgp-cplane-dev-v4 make kind-image
cilium install --chart-directory install/kubernetes/cilium -f contrib/containerlab/bgp-cplane-dev-v4/values.yaml --set image.override="localhost:5000/cilium/cilium-dev:local" --set image.pullPolicy=Never --set operator.image.override="localhost:5000/cilium/operator-generic:local" --set operator.image.pullPolicy=Never

Next, I applied the policy below:

cat bgpp.yaml
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: control-plane
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/hostname: bgp-cplane-dev-v4-control-plane
  virtualRouters:
  - localASN: 65001
    neighbors:
    - peerASN: 65000
      peerAddress: 10.0.1.1/32
---
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPPeeringPolicy
metadata:
  name: worker
spec:
  nodeSelector:
    matchLabels:
      kubernetes.io/hostname: bgp-cplane-dev-v4-worker
  virtualRouters:
  - localASN: 65002
    serviceSelector:
      matchExpressions:
       - { key: app, operator: In, values: [hello-node, hello-node2, hello-node3] }
    serviceAdvertisements:
      - LoadBalancerIP # <-- default
      #- ClusterIP      # <-- options
      #- ExternalIP     # <-- options
    neighbors:
    - peerASN: 65000
      peerAddress: 10.0.2.1/32
      gracefulRestart:
        enabled: true
        restartTimeSeconds: 60
      advertisedPathAttributes:
      - selectorType: CiliumLoadBalancerIPPool
        selector:
          matchLabels:
            environment: mgmt
        communities:
          standard:
          - 65002:100
          large:
          - 65002:100:1
---
apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "mgmt"
  labels:
    environment: mgmt
spec:
  cidrs:
    - cidr: "100.64.8.0/22"

Using:

k apply -f bgpp.yaml

Next, I checked the state of the peers:

$  cilium bgp peers
Node                              Local AS   Peer AS   Peer Address   Session State   Uptime   Family         Received   Advertised
bgp-cplane-dev-v4-control-plane   65001      65000     10.0.1.1       established     3m24s    ipv4/unicast   0          0
                                                                                               ipv6/unicast   0          0
bgp-cplane-dev-v4-worker          65002      65000     10.0.2.1       established     3m27s    ipv4/unicast   0          1
                                                                                               ipv6/unicast   0          1

At this point, I have not yet spun up any services/pods. I then brought up a "hello-world" application using:

kubectl create deployment hello-node --image=registry.k8s.io/e2e-test-images/agnhost:2.39 -- /agnhost netexec --http-port=8080

kubectl expose deployment hello-node --type=LoadBalancer --port=8080

And observed:

$  k get services
NAME         TYPE           CLUSTER-IP    EXTERNAL-IP   PORT(S)          AGE
hello-node   LoadBalancer   10.2.118.17   100.64.8.1    8080:30934/TCP   21s

Followed by:

$  cilium bgp routes
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node                              VRouter   Prefix          NextHop    Age   Attrs
bgp-cplane-dev-v4-control-plane   65001     100.64.8.1/32   10.0.1.1   38s   [{Origin: i} {AsPath: 65000 65002} {Nexthop: 10.0.1.1} {Communities: 65002:100} {LargeCommunity: [ 65002:100:1]}]
bgp-cplane-dev-v4-worker          65002     100.64.8.1/32   0.0.0.0    38s   [{Origin: i} {Nexthop: 0.0.0.0}]

Prior to this PR, when a prefix was defined on a CiliumLoadBalancerIPPool using cidrs, there were no routes advertised. In the above, we see routes advertised.

Next, I tore down the lab by running:

make kind-bgp-v4-down

After rebuilding the lab, I changed to:

---
apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "mgmt"
  labels:
    environment: mgmt
spec:
  blocks:
  - cidr: "100.64.8.0/22"

After applying, I see routes as expected:

$  cilium bgp routes
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node                              VRouter   Prefix          NextHop    Age     Attrs
bgp-cplane-dev-v4-control-plane   65001     100.64.8.1/32   10.0.1.1   1m34s   [{Origin: i} {AsPath: 65000 65002} {Nexthop: 10.0.1.1} {Communities: 65002:100} {LargeCommunity: [ 65002:100:1]}]
bgp-cplane-dev-v4-worker          65002     100.64.8.1/32   0.0.0.0    1m42s   [{Origin: i} {Nexthop: 0.0.0.0}]

In one test, I specified both the new and deprecated fields:

apiVersion: "cilium.io/v2alpha1"
kind: CiliumLoadBalancerIPPool
metadata:
  name: "mgmt"
  labels:
    environment: mgmt
spec:
  blocks:
    - cidr: "100.64.8.0/22"
  cidrs:
    - cidr: "100.64.12.0/22"

Here, I observed the last prefix being used:

$  k get services
NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
hello-node   LoadBalancer   10.2.238.103   100.64.12.1   8080:30608/TCP   2m30s
kubernetes   ClusterIP      10.2.0.1       <none>        443/TCP          4m34s

$  cilium bgp routes
(Defaulting to `available ipv4 unicast` routes, please see help for more options)

Node                              VRouter   Prefix           NextHop    Age     Attrs
bgp-cplane-dev-v4-control-plane   65001     100.64.12.1/32   10.0.1.1   1m36s   [{Origin: i} {AsPath: 65000 65002} {Nexthop: 10.0.1.1} {Communities: 65002:100} {LargeCommunity: [ 65002:100:1]}]
bgp-cplane-dev-v4-worker          65002     100.64.12.1/32   0.0.0.0    1m44s   [{Origin: i} {Nexthop: 0.0.0.0}]
Cilium BGPv1 Reconciler - Handle updated and deprecated Cidr fields for CiliumLoadBalancerIPPool

@dswaffordcw dswaffordcw requested a review from a team as a code owner May 23, 2024 21:21
@maintainer-s-little-helper
Copy link

Commit 971fe08 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels May 23, 2024
@dswaffordcw dswaffordcw marked this pull request as draft May 23, 2024 21:21
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label May 23, 2024
@rastislavs
Copy link
Contributor

@dswaffordcw Thanks for the fix! Looks good, could you please also extend the test in route_policy_test.go to cover this case as well?

@rastislavs rastislavs added kind/bug This is a bug in the Cilium logic. area/bgp release-note/bug This PR fixes an issue in a previous release of Cilium. labels May 24, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 24, 2024
@rastislavs
Copy link
Contributor

Note: the CiliumLoadBalancerIPPool.Spec.Cidrs field has been deprecated as of 27322f3, so we also need a backport to v1.15 (labeling for backport).

@rastislavs rastislavs added the needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch label May 24, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in 1.15.6 May 24, 2024
@harsimran-pabla
Copy link
Contributor

Hi @dswaffordcw Thanks for the quick fix. Couple of suggestions

  • Add sign-off in the commit message.
  • Add a brief commit message explaining the change.

@maintainer-s-little-helper
Copy link

Commit 32e729b does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper
Copy link

Commit 5e93a93 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper
Copy link

Commit dd60d44 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label May 24, 2024
@dswaffordcw
Copy link
Contributor Author

dswaffordcw commented May 24, 2024

@dswaffordcw Thanks for the fix! Looks good, could you please also extend the test in route_policy_test.go to cover this case as well?

Hi @rastislavs ,

I updated the tests as follows:

  • The existing test was changed to reference the field blocks
  • A copy of the existing test was made and continues to reference the original field cidrs.
  • Another copy of the existing test was made that specifies different prefixes using blocks and cidrs.
  • Another copy of the existing test was made that specifies the same prefix using both blocks and cidrs. This test exercises a check that prevents duplicates. When the same prefix is specified in both new and old fields, the duplicate is ignored.

I don't like all the copy-and-paste that's going on there. I thought about re-factoring some of that test out with another function or two but held off. Let me know if you'd like me to explore that.

@dswaffordcw dswaffordcw changed the title Cilium BGPv1 - Support updated CRD field for CIDR definition on Ciliu… Cilium BGPv1 Reconciler - Support updated CRD field for CIDR definition on Ciliu… May 24, 2024
@dswaffordcw dswaffordcw changed the title Cilium BGPv1 Reconciler - Support updated CRD field for CIDR definition on Ciliu… Cilium BGPv1 Reconciler - Handle updated and deprecated Cidr fields for CiliumLoadBalancerIPPool May 24, 2024
@dswaffordcw dswaffordcw marked this pull request as ready for review May 24, 2024 20:44
@maintainer-s-little-helper
Copy link

Commit 9b8262e does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@dswaffordcw dswaffordcw marked this pull request as ready for review May 24, 2024 21:21
pkg/bgpv1/manager/reconciler/route_policy.go Outdated Show resolved Hide resolved
pkg/bgpv1/manager/reconciler/route_policy.go Outdated Show resolved Hide resolved
pkg/bgpv1/manager/reconciler/route_policy.go Outdated Show resolved Hide resolved
@harsimran-pabla
Copy link
Contributor

Thanks @dswaffordcw for the changes. Some minor comments, but overall change looks good.

I don't like all the copy-and-paste that's going on there. I thought about re-factoring some of that test out with another function or two but held off. Let me know if you'd like me to explore that.

Let's hold on to that for now - this code will eventually deprecate as we move to BGPv2 reconcilers. Please let me know if you want to contribute to the BGP subsystem. There is plenty to do :)

@harsimran-pabla
Copy link
Contributor

One natural change is to make sure BGPv2 Reconciler also have similar fix. Currently, it is only checking for Blocks here.

@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label May 28, 2024
Copy link
Contributor

@rastislavs rastislavs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot for updating the tests as well!

Copy link
Contributor

@harsimran-pabla harsimran-pabla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@harsimran-pabla
Copy link
Contributor

One of the check patch is failing with this error
Error: ERROR:CUSTOM: Please avoid long commit subjects (max: 75, found: 96)

@harsimran-pabla
Copy link
Contributor

Hi @dswaffordcw I think check is still failing because of merge commit 5d8961c, instead of merging the main branch into this branch - rebasing this branch on top of the main should resolve this issue.

In cilium@27322f3, the CiliumLoadBalancerIPPool's field named "cidrs" was deprecated.  The documentation on https://docs.cilium.io/en/stable/network/lb-ipam/ provides an example of configuring a CiliumLoadBalancerIPPool using the field named "blocks".  While testing a BGP policy configured with the Advertised Path Attributes feature, I was not able to achieve the desired policy.  BGP attributes configured were not being applied.

While discussing this in Cilium's Slack channel, it was pointed out that the BGPv1 reconciler was only aware of the deprecated field.

This commit updates Cilium's BGPv1 reconciler to support both the deprecated and updated fields.

Fixes: cilium#32693

Signed-off-by: David Swafford <dswafford@coreweave.com>
@dswaffordcw
Copy link
Contributor Author

dswaffordcw commented May 30, 2024

Thanks @harsimran-pabla! You were right. I had messed up the rebase w/ my local fork. Corrected now.

@harsimran-pabla
Copy link
Contributor

/test

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label May 31, 2024
@aanm aanm added this pull request to the merge queue May 31, 2024
Merged via the queue into cilium:main with commit 686876c May 31, 2024
62 of 64 checks passed
@dswaffordcw dswaffordcw deleted the bug/GH_ISSUE_32693 branch May 31, 2024 18:10
@viktor-kurchenko viktor-kurchenko mentioned this pull request Jun 4, 2024
5 tasks
@viktor-kurchenko viktor-kurchenko added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Jun 4, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport pending to v1.15 in 1.15.6 Jun 4, 2024
@github-actions github-actions bot added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. labels Jun 5, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed this from Backport pending to v1.15 in 1.15.6 Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/bgp backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. kind/bug This is a bug in the Cilium logic. kind/community-contribution This was a contribution made by a community member. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
Status: Released
Development

Successfully merging this pull request may close these issues.

BGP Advertised Path Attributes fails to match CiliumLoadBalancerIPPool
5 participants