cilium-operator not respecting hostname config on HTTPRoutes and Gateway Listeners #30685

Closed
2 tasks done
cjvirtucio87 opened this issue Feb 8, 2024 · 0 comments · Fixed by #30686
Labels
  • area/servicemesh: GH issues or PRs regarding servicemesh
  • feature/k8s-gateway-api
  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, e.g. via Slack.
  • sig/agent: Cilium agent related.

Comments

@cjvirtucio87
Contributor

cjvirtucio87 commented Feb 8, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

Given the following Gateway manifest deployed in the cilium namespace:

---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: default
  namespace: cilium
  labels:
    app: cilium
spec:
  gatewayClassName: cilium
  addresses:
    - value: "<ip address>"
  listeners:
  - hostname: bar-dev.home
    name: bar-dev-home-http
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            kubernetes.io/metadata.name: "bar-dev"
  - hostname: bar-dev.home
    name: bar-dev-home-https
    port: 443
    protocol: HTTPS
    tls:
      certificateRefs:
      - group: ""
        kind: Secret
        name: frigate-tls
        namespace: bar-dev
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            kubernetes.io/metadata.name: "bar-dev"
  - hostname: foo-dev.home
    name: foo-dev-home-http
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            kubernetes.io/metadata.name: "foo-dev"
  - hostname: foo-dev.home
    name: foo-dev-home-https
    port: 443
    protocol: HTTPS
    tls:
      certificateRefs:
      - group: ""
        kind: Secret
        name: foo-dev.home-tls
        namespace: foo-dev
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            kubernetes.io/metadata.name: "foo-dev"

and the following HTTPRoute manifest deployed in a separate namespace:

---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: keycloakx
  namespace: foo-dev
spec:
  hostnames:
    - foo-dev.home.io
  parentRefs:
    - name: default
      namespace: cilium
  rules:
    - backendRefs:
        - name: keycloakx
          port: 8080
      matches:
        - path:
            type: PathPrefix
            value: /

the HTTPRoute is stuck with the following status:

Status:
  Parents:
    Conditions:
      Last Transition Time:  2024-02-08T21:15:52Z
      Message:               HTTPRoute is not allowed to attach to this Gateway due to namespace selector restrictions
      Observed Generation:   1
      Reason:                NotAllowedByListeners
      Status:                False
      Type:                  Accepted
      Last Transition Time:  2024-02-08T21:15:52Z
      Message:               Service reference is valid
      Observed Generation:   1
      Reason:                ResolvedRefs
      Status:                True
      Type:                  ResolvedRefs
    Controller Name:         io.cilium/gateway-controller
    Parent Ref:
      Group:      gateway.networking.k8s.io
      Kind:       Gateway
      Name:       default
      Namespace:  cilium

Cilium Version

quay.io/cilium/operator-generic:v1.15.0@sha256:e26ecd316e742e4c8aa1e302ba8b577c2d37d114583d6c4cdd2b638493546a79

Kernel Version

Linux cluster-control.home 5.14.0-362.13.1.el9_3.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Dec 13 14:07:45 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Kubernetes Version

Client Version: v1.29.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.27.2+k3s1

Sysdump

sysdump.zip

Relevant log output

No response

Anything else?

PR with proposed changes.

Also, here is the pertinent portion of the kubectl explain gateway.spec.listeners.hostname output:

    For HTTPRoute and TLSRoute resources, there is an interaction with the
    `spec.hostnames` array. When both listener and route specify hostnames,
    there MUST be an intersection between the values for a Route to be accepted.
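
The intersection rule quoted above can be sketched in Go as follows. This is only an illustration of the rule as I read it, not the code in the Cilium operator or in the proposed PR; the function names `hostnameMatches` and `hostnamesIntersect` are hypothetical, and the wildcard handling assumes that a `*.` prefix matches one or more leading labels but not the bare domain.

```go
package main

import (
	"fmt"
	"strings"
)

// hostnameMatches reports whether two hostnames overlap. Either side may be
// an exact name or a wildcard of the form "*.example.com".
// Hypothetical helper, not taken from the Cilium code base.
func hostnameMatches(a, b string) bool {
	a, b = strings.ToLower(a), strings.ToLower(b)
	aWild := strings.HasPrefix(a, "*.")
	bWild := strings.HasPrefix(b, "*.")
	switch {
	case aWild && bWild:
		// Compare the ".example.com" parts; one must be a suffix of the other.
		as, bs := a[1:], b[1:]
		return strings.HasSuffix(as, bs) || strings.HasSuffix(bs, as)
	case aWild:
		return strings.HasSuffix(b, a[1:])
	case bWild:
		return strings.HasSuffix(a, b[1:])
	default:
		return a == b
	}
}

// hostnamesIntersect applies the rule to one listener and one route.
// An empty listener hostname or an empty route hostname list means
// "no restriction" on that side, so the route is not rejected.
func hostnamesIntersect(listener string, route []string) bool {
	if listener == "" || len(route) == 0 {
		return true
	}
	for _, h := range route {
		if hostnameMatches(listener, h) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hostnamesIntersect("*.example.com", []string{"foo.example.com"}))   // true
	fmt.Println(hostnamesIntersect("foo.example.com", []string{"bar.example.com"})) // false
}
```

With a check like this, a route should only be compared against listeners whose hostname it intersects, so a namespace selector on an unrelated listener would not by itself cause a rejection.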

Code of Conduct

  • I agree to follow this project's Code of Conduct
@cjvirtucio87 cjvirtucio87 added kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack. needs/triage This issue requires triaging to establish severity and next steps. labels Feb 8, 2024
@youngnick youngnick added sig/agent Cilium agent related. area/servicemesh GH issues or PRs regarding servicemesh and removed needs/triage This issue requires triaging to establish severity and next steps. labels Feb 13, 2024
cjvirtucio87 added a commit to cjvirtucio87/cilium that referenced this issue Feb 17, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>

added test

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>
cjvirtucio87 added a commit to cjvirtucio87/cilium that referenced this issue Feb 17, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>

added test

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>
cjvirtucio87 added a commit to cjvirtucio87/cilium that referenced this issue Feb 19, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>

added test

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>
cjvirtucio87 added a commit to cjvirtucio87/cilium that referenced this issue Feb 19, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>
cjvirtucio87 added a commit to cjvirtucio87/cilium that referenced this issue Feb 19, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>
cjvirtucio87 added a commit to cjvirtucio87/cilium that referenced this issue Feb 28, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>
cjvirtucio87 added a commit to cjvirtucio87/cilium that referenced this issue Feb 29, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: Christopher Virtucio <cjv287@gmail.com>
cjvirtucio87 added a commit to cjvirtucio87/cilium that referenced this issue Mar 2, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: CJ Virtucio <cjv287@gmail.com>
cjvirtucio87 added a commit to cjvirtucio87/cilium that referenced this issue Mar 3, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: CJ Virtucio <cjv287@gmail.com>
sayboras pushed a commit to sayboras/cilium that referenced this issue Mar 10, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: CJ Virtucio <cjv287@gmail.com>
sayboras pushed a commit to cjvirtucio87/cilium that referenced this issue Mar 10, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: CJ Virtucio <cjv287@gmail.com>
sayboras pushed a commit to cjvirtucio87/cilium that referenced this issue Mar 10, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: cilium#30685

Signed-off-by: CJ Virtucio <cjv287@gmail.com>
github-merge-queue bot pushed a commit that referenced this issue Mar 10, 2024
This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: #30685

Signed-off-by: CJ Virtucio <cjv287@gmail.com>
sayboras pushed a commit that referenced this issue Apr 4, 2024
[ upstream commit a052869 ]

This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: #30685

Signed-off-by: CJ Virtucio <cjv287@gmail.com>
sayboras pushed a commit that referenced this issue Apr 5, 2024
[ upstream commit a052869 ]

This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.

Fixes: #30685

Signed-off-by: CJ Virtucio <cjv287@gmail.com>
ubergesundheit added a commit to giantswarm/cilium-upstream that referenced this issue May 21, 2024
* ipsec: Update existing states when a node's bootid changes

When we detect that a node's bootid has changed, we need to update the
IPsec states.

Unfortunately this is not as straightforward as it should be, because we
may receive the new boot ID before a CiliumInternalIP is assigned to the
node. In such a case, we can't install the XFRM states yet because we
don't have the CiliumInternalIP, but we need to remember that the boot
ID changed and states should be replaced.

We therefore record that information in a map, ipsecUpdateNeeded, which
is later read to see if the boot ID changed.

Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com>
Co-authored-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* docs: Document Xfrm{In,Out}NoStates on node reboots

When a node reboots the key used to communicate with it is expected to
change due to the new boot id generated. While the new key is being
installed we may need to do it non-atomically (delete + insert), so
packets to/from that node might be dropped which would cause increases
in the XfrmInNoStates/XfrmOutNoStates counters. Add a note about it in the docs so users are
not surprised.

Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com>
Co-authored-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* ipsec: Enable ESN anti-replay protection

Now we can enable ESN anti-replay with window size of 1024. If a node
reboots then everyone updates the related keys with the new one due to
the different bootid, the node itself is already generating the keys
with the new bootid. The window is used to allow for out-of-order
packets, anti-replay still doesn't allow to replay any packet but keeps
a bitmap and can accept out-of-order packets within window size range.
For more information, see section "A2. Anti-Replay Window" of RFC 4303.

Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com>
Co-authored-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* ipsec: Control use of per-node-pair keys from secret bit

The ESN bit in the IPsec secret will be used to indicate whether
per-node-pair keys should be used or if the global key should remain in
use. Specifically, it consists of a '+' sign after the SPI number in the
secret.

This ESN bit will be used to transition from a global key system to a
per-node-pair system at runtime. We would typically rely on an agent
flag for such a configuration. However, in this case, we need to perform
a key rotation at the same time as we change the key system. Encoding
the key system in the IPsec secret achieves that.

By transitioning from the global to the per-node-pair keys via a key
rotation, we ensure that the two can coexist during the transition. The
old, global key will have XFRM rules with SPI n, whereas the new,
per-node-pair keys will have XFRM rules with SPI n+1.

Using a bit in the IPsec secret is also easier to test because we
already have all the logic to test key rotation (whereas we would need
new logic to test a flag change).

The users therefore need to perform a key rotation from e.g.:

    3 rfc4106(gcm(aes)) [...] 128

to:

    4+ rfc4106(gcm(aes)) [...] 128

The key rotation test in CI is updated to cover a rotation from 3 to 4+
(meaning a rotation into the new per-node-pair key system).

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
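
As a rough illustration of the '+' marker described in the commit message above, the following Go sketch reads the first field of a secret line and reports the SPI together with whether per-node-pair keys are requested. It is not Cilium's parser: `parseSPIField` is a hypothetical name, and the sketch assumes the SPI is the first whitespace-separated field, as in the two example lines.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSPIField takes the first field of an IPsec secret line, such as "3"
// or "4+", and returns the numeric SPI and whether the '+' marker is present.
// Hypothetical helper for illustration only.
func parseSPIField(field string) (spi uint32, perNodePair bool, err error) {
	digits := strings.TrimSuffix(field, "+")
	perNodePair = digits != field
	n, err := strconv.ParseUint(digits, 10, 32)
	if err != nil {
		return 0, false, fmt.Errorf("invalid SPI %q: %w", field, err)
	}
	return uint32(n), perNodePair, nil
}

func main() {
	// The key material is elided ("[...]") exactly as in the commit message.
	for _, secret := range []string{
		"3 rfc4106(gcm(aes)) [...] 128",
		"4+ rfc4106(gcm(aes)) [...] 128",
	} {
		fields := strings.Fields(secret)
		spi, perNodePair, err := parseSPIField(fields[0])
		fmt.Println(spi, perNodePair, err)
	}
}
```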

* conn-disrupt: Allowlist XfrmInNoStates packet drops

The IPsec fixes will introduce a few XfrmInNoStates packet drops on
up/downgrades due to non-atomic Linux APIs (can't replace XFRM states
atomically). Those are limited to a very short time (time between two
netlink syscalls).

We however need to allowlist them in the CI. Since we're using the
conn-disrupt GitHub action from main, we need to allowlist in main for
the pull request's CI to pass.

Note that despite the expected-xfrm-errors flag, the tests will still
fail if we get 10 or more such drops. We don't expect so many
XfrmInNoStates drops so we still want to fail in that case.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* workflows: Extend IPsec key rotation coverage

[ backporter's notes: couple minor conflicts due to a switch from simple
  to double quote for parameters and the addition of "-c cilium-agent"
  in the kubectl exec commands. ]

Since commit 4cf468b91b ("ipsec: Control use of per-node-pair keys from
secret bit"), IPsec key rotations can be used to switch from the
single-key system to the per-tunnel key system (also referred to as
per-node-pair key system). Our key rotation test in CI was updated to
cover such a switch.

This commit extends it to also cover traditional key rotations, with
both the new and old key systems. The switch back into a single-key
system is also covered.

These special key rotations are controlled with a single + sign. Adding
it after the SPI in the IPsec Kubernetes secret is enough to switch to a
per-tunnel key system. We thus simply need to cover all 4 cases of
having or not having the + sign in the old and new secrets.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* ipsec: disallow empty bootid for key generation

A node update that doesn't contain a BootID will cause the creation
of non-matching XFRM IN and OUT states across the cluster as the
BootID is used to generate per-node key pairs. Non-matching XFRM
states will result in XfrmInStateProtoError, causing packet drops.
An empty BootID should thus be treated as an error, and Cilium
should not attempt to derive per-node keys from it.

Signed-off-by: Robin Gögge <r.goegge@gmail.com>

* k8s: bump CRD schema version

When adding the BootID field to the CiliumNode CRD, we forgot to bump
the version, which is an issue when, after a Cilium upgrade, the
operator tries to update the CiliumNode objects to include the BootID
field.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* ipsec: fix per-node-pair-key computation

This commit ensures that

- each time we compute a per-node-pair-key we create an empty slice with
  the correct length first, and then append all the input data instead
  of appending to one of the input slices (`globalKey`) directly.
- the IPs that are used as arguments in `computeNodeIPsecKey` are
  canonical, meaning IPv4 IPs consist of 4 bytes and IPv6 IPs consist of
  16 bytes.

This is necessary to always have the same inputs on all nodes when
computing the per-node-pair-key. Without this IPs might not match on the
byte level, e.g on one node the input is a v6 mapped v4 address (IPv4
address in 16 bytes) and on the other it isn't when used as input to the
hash function. This will generate non-matching keys.

Co-authored-by: Zhichuan Liang <gray.liang@isovalent.com>
Signed-off-by: Robin Gögge <r.goegge@gmail.com>

* node: Log local boot ID

We have very little logging of the boot IDs. Really fixing that will
require a bit of work to not be too verbose, but in the meantime, we
should at least log the local boot ID.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* identitybackend: address race condition in test

[ upstream commit deb06875717b47d5ea286f5fe86f9cff778eacb8 ]

[ backporter's note: the onadd func was not backported; doing it as part
  of this commit ]

TestGetIdentity has been unreliable, even withstanding some previous
attempts at deflaking.

The issue lies in the use of the k8s fake infrastructure: the simple
testing object tracker of client-go does _not_ set the ResourceVersion
for resources created. This interacts badly with the logic of the
client-go reflector's ListAndWatch method, which relies on the resource
version to close the racy window between its List and Watch calls. The
real k8s api-server will replay events which occur after the completion
of List and before the establishment of the Watch, thanks to the
ResourceVersion. The object tracker's Watch implementation, however,
does (and can) not do so, as it doesn't have a resource version to
determine which events it would need to replay.

Notably, the HasSynced method of the informer will return true once the
initial List has succeeded. This isn't a guarantee for the Watch to be
established (and indeed, the reflector establishes the Watch _after_ the
list). This is fine for reality, again thanks to the resource version
and the api-server replaying.

The race, hence, is that the creation of the identities can happen
concurrently to the establishment of the watch (HasSynced guarantees
that it happens _after_ the list), and thus we race the creation of the
"RaceFreeWatcher" in the object tracker. If the watcher is late, it
misses the creation of an identity, and we time out waiting on the wait
group.

To fix this, instead of attempting to wait for the Watch
establishment (which doesn't seem easy, at first glance), just create
the resources _before_ list and watch are started, so that they are
returned in the initial list call.
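
The ordering problem can be reproduced with a toy model that uses only the standard library. The `tracker` type below is a hypothetical stand-in for an object tracker without resource versions; it is not the client-go implementation, and `observe` merely imitates the List-then-Watch sequence described above.

```go
package main

import (
	"fmt"
	"sync"
)

// tracker is a toy object store: Watch only delivers objects created
// after the watch was registered, and nothing is ever replayed.
type tracker struct {
	mu       sync.Mutex
	objects  []string
	watchers []chan string
}

// Create stores the object and notifies the watchers registered so far.
// The channels are buffered, which is enough for this small example.
func (t *tracker) Create(name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.objects = append(t.objects, name)
	for _, w := range t.watchers {
		w <- name
	}
}

// List returns a copy of all objects created so far.
func (t *tracker) List() []string {
	t.mu.Lock()
	defer t.mu.Unlock()
	return append([]string(nil), t.objects...)
}

// Watch registers a new watcher that sees future creations only.
func (t *tracker) Watch() <-chan string {
	t.mu.Lock()
	defer t.mu.Unlock()
	w := make(chan string, 16)
	t.watchers = append(t.watchers, w)
	return w
}

// observe runs List, then the between callback, then Watch, and returns
// how many objects were seen through either path.
func observe(t *tracker, between func()) int {
	seen := len(t.List())
	between()
	w := t.Watch()
	for {
		select {
		case <-w:
			seen++
		default:
			return seen
		}
	}
}

func main() {
	racy := &tracker{}
	fmt.Println("created in the gap:", observe(racy, func() { racy.Create("id-1") }))

	fixed := &tracker{}
	fixed.Create("id-1")
	fmt.Println("created before list:", observe(fixed, func() {}))
}
```

In this model an object created between List and Watch is never observed, while an object created before List always is.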

Prior to this patch, the following commandline typically failed quickly:

while true; do go test ./pkg/k8s/identitybackend -run 'TestGetIdentity' -v -count=1 -timeout=10s || break; done

After this patch, it ran thousands of times reliably.

Co-authored-by: Fabian Fischer <fabian.fischer@isovalent.com>
Signed-off-by: David Bimmler <david.bimmler@isovalent.com>

* identitybackend: clean up TestGetIdentity

[ upstream commit 190733402d6b190cbdba220f29d87b56b674f87b ]

The previous patch explains and fixes a flake. This patch removes some
of the remaining cruft from earlier attempts at fixing said flake, and
runs the test in parallel (for efficiency).

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>

* Prepare for release v1.15.3

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>

* install: Update image digests for v1.15.3

Generated from https://github.com/cilium/cilium/actions/runs/8438952573.

`quay.io/cilium/cilium:v1.15.3@sha256:da74ab61d1bc665c1c088dff41d5be388d252ca5800f30c7d88844e6b5e440b0`
`quay.io/cilium/cilium:stable@sha256:da74ab61d1bc665c1c088dff41d5be388d252ca5800f30c7d88844e6b5e440b0`

`quay.io/cilium/clustermesh-apiserver:v1.15.3@sha256:da4573f8fe4415bdb786c4fdcbc3b518e5a485f930cd4292416eb80800cbd7fc`
`quay.io/cilium/clustermesh-apiserver:stable@sha256:da4573f8fe4415bdb786c4fdcbc3b518e5a485f930cd4292416eb80800cbd7fc`

`quay.io/cilium/docker-plugin:v1.15.3@sha256:1d302b643fe70c6036c5e991520fbf87df92f0fd862ee7b96a5d9c937211f91a`
`quay.io/cilium/docker-plugin:stable@sha256:1d302b643fe70c6036c5e991520fbf87df92f0fd862ee7b96a5d9c937211f91a`

`quay.io/cilium/hubble-relay:v1.15.3@sha256:b9c6431aa4f22242a5d0d750c621d9d04bdc25549e4fb1116bfec98dd87958a2`
`quay.io/cilium/hubble-relay:stable@sha256:b9c6431aa4f22242a5d0d750c621d9d04bdc25549e4fb1116bfec98dd87958a2`

`quay.io/cilium/operator-alibabacloud:v1.15.3@sha256:59d5c0c5782163d38151dd06bae0118144f6c080598901a632c628b1143ccd10`
`quay.io/cilium/operator-alibabacloud:stable@sha256:59d5c0c5782163d38151dd06bae0118144f6c080598901a632c628b1143ccd10`

`quay.io/cilium/operator-aws:v1.15.3@sha256:2b05dc6b88037a5ce05e4030ef616b1f7be9e65083e35abd36a1b66953fd0b6a`
`quay.io/cilium/operator-aws:stable@sha256:2b05dc6b88037a5ce05e4030ef616b1f7be9e65083e35abd36a1b66953fd0b6a`

`quay.io/cilium/operator-azure:v1.15.3@sha256:b85a2671a74903c6e9a45e884654bb970b5b8d6a6e20371811a6cc0ad92b2f87`
`quay.io/cilium/operator-azure:stable@sha256:b85a2671a74903c6e9a45e884654bb970b5b8d6a6e20371811a6cc0ad92b2f87`

`quay.io/cilium/operator-generic:v1.15.3@sha256:c97f23161906b82f5c81a2d825b0646a5aa1dfb4adf1d49cbb87815079e69d61`
`quay.io/cilium/operator-generic:stable@sha256:c97f23161906b82f5c81a2d825b0646a5aa1dfb4adf1d49cbb87815079e69d61`

`quay.io/cilium/operator:v1.15.3@sha256:3d1f8f3c208064a78ae851b2e3cef28d5f484a0a368164fb8f95b92d1a974251`
`quay.io/cilium/operator:stable@sha256:3d1f8f3c208064a78ae851b2e3cef28d5f484a0a368164fb8f95b92d1a974251`

Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>

* gh/actions: Add ipsec-key-rotate

[ upstream commit 687a4f5d86fd816f59b316cda48f0c8a7f45708b ]

[ backporter's notes: also apply diff from e448644f497eb83bc7184a77e0e3045646e7e216
  and e8ddc88aa3bcd6333c21b775e2b18aea9514654f to support new key system ]

The action is for testing whether IPsec key rotations do not cause
any packet drops.

NB for backporters: this commit just moves the code for the workflow
into the new action, and the timeout increase.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

* ci-eks: Add IPsec key rotation tests

[ upstream commit 5c06c8e2ea5c575276f84d7f90d967f3c9cb3551 ]

[backporter's notes: apply diff from e448644f497eb83bc7184a77e0e3045646e7e216
 and e8ddc88aa3bcd6333c21b775e2b18aea9514654f ]

First, this commit includes the IPsec key rotation tests action.

Second, it changes the CLI exec name and path to "./cilium-cli", so that
it can be used by the key rotation action and friends.

Third, it runs the IPsec tests only if the matrix.ipsec is set to
"true". A subsequent commit will extend the matrix configuration
accordingly.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

* gh/actions: Add IPsec config to aws/k8s-versions.yaml

[ upstream commit f99ddb99ed2cc1a4d96dee25abeea4689ca02628 ]

The file name is non-ideal, but changing it would require changing many
files :-(

For each PR we will run 1.25 w/o IPsec and 1.28 w/ IPsec.

Signed-off-by: Martynas Pumputis <m@lambda.lt>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

* metrics: prepare to increase cardinality of BPF metrics key

[ upstream commit b41b78db2c9c8894eee57d80f2aac7b1840c7c83 ]

This commit is split off from subsequent commits to backport to 1.15.

In a follow-up commit, the reserved space in metricsmap.Key will be used for
storing line and file info. Since older versions of Cilium don't decode these
fields yet, this either causes duplicate metrics to be displayed, or causes
the last metricsmap entry with a given reason/direction combination to
overwrite the counters of other entries, resulting in wrong metrics.

metricsmapCollector.Collect() was believed to handle this correctly, but the
code turned out to be wrong. The updated implementation sums up all values
that resolve to the same label set.
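
A minimal sketch of the summing behaviour, assuming a simplified key layout: the `key` and `labels` types and the `collect` function are invented for this example and do not mirror `metricsmap.Key` or `metricsmapCollector.Collect()`.

```go
package main

import "fmt"

// key mimics a map key whose extra fields are not part of the label set.
type key struct {
	reason, dir uint8
	line, file  uint16 // assumed extra fields, ignored by labelsOf
}

// labels is the label set that a key resolves to.
type labels struct{ reason, dir uint8 }

func labelsOf(k key) labels { return labels{k.reason, k.dir} }

// collect sums every entry that resolves to the same label set, instead
// of letting the last entry overwrite the others.
func collect(entries map[key]uint64) map[labels]uint64 {
	out := make(map[labels]uint64)
	for k, v := range entries {
		out[labelsOf(k)] += v
	}
	return out
}

func main() {
	entries := map[key]uint64{
		{reason: 1, dir: 0, line: 10, file: 1}: 5,
		{reason: 1, dir: 0, line: 42, file: 2}: 7,
		{reason: 2, dir: 1}:                    3,
	}
	fmt.Println(collect(entries)[labels{1, 0}]) // prints 12
}
```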

Various cleanups were made to remove type conversions and improve legibility.

Signed-off-by: Timo Beckers <timo@isovalent.com>

* endpointmanager: fix data race in test

[ upstream commit 234e7d092020b5c1509de849443e36b30bbe499f ]

Go's race detector was unhappy with this test due to unserialised
concurrent access to the last value of the fake gauge. Use an atomic
float value instead, to ensure no weirdness can occur, and placate the
race detector.
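
One way to get an atomic float with only the standard library is to store the IEEE 754 bits in an `atomic.Uint64`. This is a sketch under that assumption; the actual test may use a different atomic float type, and the `Load` method is a name chosen for this example.

```go
package main

import (
	"fmt"
	"math"
	"sync"
	"sync/atomic"
)

// fakeGauge keeps the last value as float64 bits in an atomic uint64.
type fakeGauge struct{ bits atomic.Uint64 }

// Set stores the value atomically.
func (g *fakeGauge) Set(v float64) { g.bits.Store(math.Float64bits(v)) }

// Load reads the last stored value atomically.
func (g *fakeGauge) Load() float64 { return math.Float64frombits(g.bits.Load()) }

func main() {
	var g fakeGauge
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		g.Set(0.5)
	}()
	_ = g.Load() // a concurrent read no longer races with Set
	wg.Wait()
	fmt.Println(g.Load())
}
```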

Race detector warning (mildly edited):

WARNING: DATA RACE
Read at 0x00c000930488 by goroutine 205:
  github.com/cilium/cilium/pkg/endpointmanager.TestPolicyMapPressure.TestPolicyMapPressure.func1.func2()
      cilium/pkg/endpointmanager/policymap_pressure_test.go:27 +0x69
  github.com/stretchr/testify/assert.Eventually.func1()
      cilium/vendor/github.com/stretchr/testify/assert/assertions.go:1902 +0x33

Previous write at 0x00c000930488 by goroutine 203:
  github.com/cilium/cilium/pkg/endpointmanager.(*fakeGague).Set()
      cilium/pkg/endpointmanager/policymap_pressure_test.go:45 +0x30
  github.com/cilium/cilium/pkg/endpointmanager.(*policyMapPressure).update()
      cilium/pkg/endpointmanager/policymap_pressure.go:82 +0x32a
  github.com/cilium/cilium/pkg/endpointmanager.newPolicyMapPressure.func1()
      cilium/pkg/endpointmanager/policymap_pressure.go:57 +0x2e
  github.com/cilium/cilium/pkg/trigger.(*Trigger).waiter()
      cilium/pkg/trigger/trigger.go:201 +0x771
  github.com/cilium/cilium/pkg/trigger.NewTrigger.gowrap1()
      cilium/pkg/trigger/trigger.go:122 +0x33

Goroutine 205 (running) created at:
  github.com/stretchr/testify/assert.Eventually()
      cilium/vendor/github.com/stretchr/testify/assert/assertions.go:1902 +0x3d5
  github.com/stretchr/testify/assert.(*Assertions).Eventually()
      cilium/vendor/github.com/stretchr/testify/assert/assertion_forward.go:319 +0xc7
  github.com/cilium/cilium/pkg/endpointmanager.TestPolicyMapPressure.func1()
      cilium/pkg/endpointmanager/policymap_pressure_test.go:26 +0x2a8
  github.com/cilium/cilium/pkg/endpointmanager.TestPolicyMapPressure()
      cilium/pkg/endpointmanager/policymap_pressure_test.go:30 +0x205

Goroutine 203 (running) created at:
  github.com/cilium/cilium/pkg/trigger.NewTrigger()
      cilium/pkg/trigger/trigger.go:122 +0x36d
  github.com/cilium/cilium/pkg/endpointmanager.newPolicyMapPressure()
      cilium/pkg/endpointmanager/policymap_pressure.go:52 +0x287
  github.com/cilium/cilium/pkg/endpointmanager.TestPolicyMapPressure()
      cilium/pkg/endpointmanager/policymap_pressure_test.go:18 +0x84

Fixes: 28ce005918 (endpointmanager: fix bpf policy pressure getting stuck.)

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* idpool: return pointer to pool

[ upstream commit 0b56a2156595a2867bd8ebc2e35297c057e1b79f ]

IDPool contains a mutex, passing copies around is a potential footgun. I
don't think we ever used it incorrectly, but I don't see a reason for
all the copying either.
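
The concern can be sketched with a simplified pool. The `pool` type, `newPool`, and `lease` are hypothetical names for this example and not the IDPool API; the point is only that a constructor returning a pointer makes every holder share one mutex and one state.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a simplified stand-in for an ID pool guarded by a mutex.
type pool struct {
	mu   sync.Mutex
	next int
	max  int
}

// newPool returns a pointer, so callers never copy the mutex.
func newPool(min, max int) *pool { return &pool{next: min, max: max} }

// lease hands out the next free ID, or false once the pool is exhausted.
func (p *pool) lease() (int, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.next > p.max {
		return 0, false
	}
	id := p.next
	p.next++
	return id, true
}

func main() {
	p := newPool(1, 2)
	alias := p // same pool, same mutex
	a, _ := p.lease()
	b, _ := alias.lease()
	_, ok := p.lease()
	fmt.Println(a, b, ok) // prints: 1 2 false
}
```

Had `newPool` returned a value, `alias` would be an independent copy with its own mutex and counter, and both holders could lease the same ID.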

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* endpointmanager: idallocator: remove checkmate

[ upstream commit 59a31008b088400423aef9d5896952df781f81a2 ]

Transform the test from using checkmate to standard Go tests, as it was
not using any of the features anyway.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* endpointmanager: make idallocator a struct

[ upstream commit 58ad35a84faab45fa0dab5f0078287f4cceb3926 ]

The endpointmanagers idallocator package was using a package global pool
for its identifier allocation. That's fine for running the agent, but
causes flakes in testing when multiple tests access the same pool. It's
also not idiomatic Go.

This patch makes the local endpoint identifier allocator a struct, and
the next patch will move it into the endpointmanager package itself, as
there is no other consumer.

While at it, also ensure that the RemoveAll method is only called from a
testing context, by taking a testing.TB as an argument. We cannot simply
move the method into the _test.go files, as tests from other packages
use it.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>

* endpointmanager: move EP identifier alloc into pkg

[ upstream commit a1a03cf55c98ca17ba99fba8f6c8e32c37c3c166 ]

The endpoint manager assumed it was the only consumer of the idallocator
pkg anyway. Having a pkg that only has one consumer is pointless, hence
move it, including tests. This also allows unexporting everything, and
reducing API surface.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* endpointmanager: deflake TestLookup

[ upstream commit 0f27591862016285a9f2fc77298fa687b4b8477a ]

TestLookup could fail with

--- FAIL: Test (0.12s)
    --- FAIL: Test/EndpointManagerSuite (0.12s)
        --- FAIL: Test/EndpointManagerSuite/TestLookup (0.00s)
            manager_test.go:438:
                ... value *endpoint.Endpoint = &endpoint.Endpoint{[...]")
                ... Test Name: endpoint does not exist

In about 0.02% of the runs. Specifically, it would fail iff the endpoint
created happened to randomly get ID 1234, out of the pool of 4096 IDs.

Fix this by not creating an endpoint unconditionally in the test, so
that there is no chance of creating one with the "wrong" id.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* endpointmanager: check expose errors in test

[ upstream commit fce11213dc1770272cbdd2b6c1adb49258ba0379 ]

For some reason, lots of calls to expose didn't check the error
returned. Specifically in the test context, this is not great, as it
makes debugging test failures more difficult than necessary.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* endpointmanager: test: fix fakeGauge spelling

[ upstream commit 6045c6190cb47c0bf3f7304e94c596220b83f3c8 ]

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* endpointmanager: remove RemoveAll from interface

[ upstream commit e2fbd38eebcf1dfb27baeb36bded4a1ba2f4efc3 ]

Instead of having to import the testing pkg in non-testing code, let's
remove the blocker on just having this method in the _test.go files.

The endpoint manager in the daemon_test is re-initialized for every
test, and since we don't have package global state any more we can just
remove the call to RemoveAll, which solves the problem nicely.

Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* datapath: Move IPsec logic to get local IPs

[ upstream commit bee85354c6c4f82e504f0183f17d007a80168d7e ]

Those helper functions to retrieve the local IPs are all IPsec specific
so let's move them to the ipsec.go file. No functional changes in this
commit.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* ipsec: Refactor logic to get default IPsec interface

[ upstream commit 4687325d23582ad3e7f927c35b3ba86b8b205bd7 ]

This commit extends getDefaultEncryptionInterface to also handle
tunneling mode. That change allows us to start using
getDefaultEncryptionInterface everywhere we need to retrieve the default
IPsec interface.

The unit test for IPsec in subnet encryption mode (ENI and Azure IPAM
modes) must be updated. Subnet encryption is only ever possible in
native routing mode. If we were doing subnet encryption with tunneling,
it would cause undefined behaviors (and the test would fail :)).

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* cmd, datapath: Support --devices for encryption interfaces

[ upstream commit 3c6f95710f4b18caea564c88be052091a8bcf911 ]

The agent supported attaching the IPsec decryption logic to interfaces
given via --devices. In that case, this logic was contained in bpf_host
instead of bpf_network. This support is partially covered in ginkgo
end-to-end tests.

That support is however broken, as there doesn't seem to be anything
preventing bpf_network from being reloaded in place of bpf_host on the
same interfaces.

This commit fixes it by implementing proper support for --devices in
IPsec. If no devices flag is given then we fall back to using the
encrypt-interface flag. That should allow us to deprecate
encrypt-interface at a later time.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* workflows: Cover support for devices in IPsec tests

[ upstream commit f153f423ae635da8975b01d63b7fc136bb5789d4 ]

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>

* ingress/gateway-api: ordered envoy filterchain

[ upstream commit d8082f908ed7ed773d74abc10745985d027bb4b3 ]

Currently, while translating K8s Ingress or Gateway API resources into Envoy resources,
the filterchain is in random order. This leads to situations (especially in combination
with Shared Ingress) where the order of the filterchains isn't guaranteed -
resulting in unnecessary reconciliations.

Therefore, this commit orders the filterchains within an Envoy Listener by the namespace
and name of the TLS secret. This makes the translation deterministic.
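
A sketch of such an ordering, assuming a simplified `filterChain` type that carries only the TLS secret reference (the real Envoy filter chain type is much richer, and the function name is invented here):

```go
package main

import (
	"fmt"
	"sort"
)

// filterChain is a simplified stand-in that keeps only the secret reference.
type filterChain struct {
	secretNamespace, secretName string
}

// sortFilterChains orders chains by secret namespace, then secret name,
// so the same input set always yields the same output order.
func sortFilterChains(chains []filterChain) {
	sort.SliceStable(chains, func(i, j int) bool {
		if chains[i].secretNamespace != chains[j].secretNamespace {
			return chains[i].secretNamespace < chains[j].secretNamespace
		}
		return chains[i].secretName < chains[j].secretName
	})
}

func main() {
	chains := []filterChain{
		{"prod", "web-tls"},
		{"dev", "api-tls"},
		{"dev", "admin-tls"},
	}
	sortFilterChains(chains)
	fmt.Println(chains) // prints: [{dev admin-tls} {dev api-tls} {prod web-tls}]
}
```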

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* ingress/gateway-api: ordered envoy filterchain for TLS listener

[ upstream commit 305ea7493c79df825d0c95ed8c4e17c9cc74503a ]

Currently, while translating K8s Ingress or Gateway API resources into Envoy resources,
the filterchain for TLS listeners is in random order. This leads to situations (especially in combination
with Shared Ingress) where the order of the filterchains isn't guaranteed -
resulting in unnecessary reconciliations.

Therefore, this commit orders the filterchains within an Envoy Listener by the name of the backends.
This makes the translation deterministic.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* bpf,multicast: add map infrastructure

[ upstream commit b7822a33ee9ae6c758e54f6ed8b4d969cfaa8242 ]

This commit adds the eBPF map used to implement the synthetic multicast
feature.

A `BPF_MAP_TYPE_HASH_OF_MAPS`, which employs a `BPF_MAP_TYPE_HASH`
inner map, is added to the datapath.

The outer eBPF map is keyed by IPv4 multicast group addresses in big
endian format and the values are `BPF_MAP_TYPE_HASH` maps.

The inner hash map associates IPv4 source addresses with their
subscriber multicast metadata.

Each key/value in the inner hash map is a subscriber of the owning
multicast group.
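
The nesting can be modelled in user space with plain Go maps. This is only a model of the layout described above, not the eBPF map definition; the `subscriber` fields and all function names are assumptions, and the rule that a group must exist before subscribers can be added is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"net"
)

// addr holds an IPv4 address as four bytes in network (big-endian) order.
type addr [4]byte

// subscriber is hypothetical per-subscriber metadata for this sketch.
type subscriber struct {
	ifindex uint32 // device to deliver to (assumed field)
	remote  bool   // true if reached through the overlay (assumed field)
}

// groups models the outer map: multicast group address -> inner map.
// Each inner map models source address -> subscriber metadata.
type groups map[addr]map[addr]subscriber

// mustAddr converts dotted-quad text into an addr and panics on bad input.
func mustAddr(s string) addr {
	ip := net.ParseIP(s).To4()
	if ip == nil {
		panic("not an IPv4 address: " + s)
	}
	var a addr
	copy(a[:], ip)
	return a
}

// subscribe adds a subscriber only if the group's inner map already exists.
func (g groups) subscribe(group, src addr, s subscriber) bool {
	inner, ok := g[group]
	if !ok {
		return false
	}
	inner[src] = s
	return true
}

func main() {
	group := mustAddr("239.1.1.1")
	g := groups{group: {}}
	g.subscribe(group, mustAddr("10.0.0.5"), subscriber{ifindex: 12})
	g.subscribe(group, mustAddr("10.0.1.9"), subscriber{ifindex: 3, remote: true})
	ok := g.subscribe(mustAddr("239.2.2.2"), mustAddr("10.0.0.5"), subscriber{ifindex: 12})
	fmt.Println(len(g[group]), ok) // prints: 2 false
}
```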

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>

* bpf,mcast: initial IGMPv3 parsing in bpf_lxc

[ upstream commit d7e580ff93595a9f4b4d8834f04c149f8264e678 ]

This commit introduces IGMPv3 detection and parsing.

When bpf_lxc recognizes IGMP messages egressing the Pod we attempt to
parse them.

The parsing logic is as follows:
1. Determine if traffic is IGMP
2. Determine the IGMP message type
3. If the type is not a membership report simply drop it (for now)
4. Parse each group record in the membership report
5. For any group records which indicate a join add a subscriber to the
   multicast subscriber map, if it exists.
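
The steps above can be sketched in user-space Go (the real logic runs in eBPF inside bpf_lxc). The message layout follows the IGMPv3 membership report format from RFC 3376. Treating an EXCLUDE-mode record with an empty source list as a join is an assumption of this sketch, as are all names; the checksum is not verified here.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	ipProtoIGMP     = 2    // IP protocol number for IGMP
	igmpV3Report    = 0x22 // IGMPv3 membership report type
	modeIsExclude   = 2    // group record type MODE_IS_EXCLUDE
	changeToExclude = 4    // group record type CHANGE_TO_EXCLUDE_MODE
)

var (
	errNotIGMP   = errors.New("not IGMP")
	errNotReport = errors.New("not an IGMPv3 membership report")
	errTruncated = errors.New("truncated message")
)

// joinsFromReport returns the group addresses that a report joins.
func joinsFromReport(proto uint8, msg []byte) ([][4]byte, error) {
	if proto != ipProtoIGMP { // step 1
		return nil, errNotIGMP
	}
	if len(msg) < 8 {
		return nil, errTruncated
	}
	if msg[0] != igmpV3Report { // steps 2 and 3
		return nil, errNotReport
	}
	records := int(binary.BigEndian.Uint16(msg[6:8]))
	var joins [][4]byte
	off := 8
	for i := 0; i < records; i++ { // step 4
		if len(msg) < off+8 {
			return nil, errTruncated
		}
		recType := msg[off]
		auxWords := int(msg[off+1])
		sources := int(binary.BigEndian.Uint16(msg[off+2 : off+4]))
		var group [4]byte
		copy(group[:], msg[off+4:off+8])
		// Skip the record header, its sources and its auxiliary data.
		off += 8 + 4*sources + 4*auxWords
		if off > len(msg) {
			return nil, errTruncated
		}
		// Step 5: assumed join condition, see the note above.
		if (recType == modeIsExclude || recType == changeToExclude) && sources == 0 {
			joins = append(joins, group)
		}
	}
	return joins, nil
}

func main() {
	report := []byte{
		0x22, 0, 0, 0, // type, reserved, checksum (unchecked here)
		0, 0, 0, 1, // reserved, number of group records
		4, 0, 0, 0, // CHANGE_TO_EXCLUDE, no aux data, no sources
		239, 1, 1, 1, // multicast group address
	}
	joins, err := joinsFromReport(ipProtoIGMP, report)
	fmt.Println(joins, err) // prints: [[239 1 1 1]] <nil>
}
```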

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>

* bpf,igmp: igmpv2 join and leave parsing

[ upstream commit 8c488dd21254b8d801de2a44f4d66fb6a4dfc97f ]

This commit adds parsing of IGMPv2 messages in a similar fashion as
IGMPv3 messages.

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>

* bpf: implement multicast delivery

[ upstream commit 2afcb614a3eeef0df963fffc52006063f96bcac9 ]

This commit implements replication and delivery of multicast packets.

This commit also enables the Cilium datapath to access both `bpf_clone_redirect`
and `bpf_map_for_each_elem` helpers.

The datapath flow is illustrated below:

┌──────────────────────────────────────────┐
│                                          │
│  Sender                                  │
│  ┌──────┐     ┌─────────┐                │
│  │ pod  ├─────► bpf_lxc │                │
│  └──────┘     └────┬────┘                │
│  Local Receivers   │  eBPF Replication   │
│  ┌──────┐ ┌──────┐ │  and Redirection    │
│  │ pod  ◄─┤ veth ◄─┤(cil_from_container) │
│  └──────┘ └──────┘ │ ┌───────┐           │
│                    ├─► vxlan │           │
│  ┌──────┐ ┌──────┐ │ └───┬───┘           │
│  │ pod  ◄─┤ veth ◄─┘     │               │
│  └──────┘ └──────┘  ┌────┘               │
│                     │                    │
└─────────────────────┼────────────────────┘
                      │
┌─────────────────────┼────────────────────┐
│                     │                    │
│                 ┌───▼───┐                │
│                 │ vxlan │                │
│                 └───┬───┘                │
│   Remote Receivers  │  eBPF Replication  │
│   ┌──────┐ ┌──────┐ │  and Redirection   │
│   │ pod  ◄─┤ veth ◄─┤  (from_overlay)    │
│   └──────┘ └──────┘ │                    │
│                     │                    │
│   ┌──────┐ ┌──────┐ │                    │
│   │ pod  ◄─┤ veth ◄─┘                    │
│   └──────┘ └──────┘                      │
│                                          │
└──────────────────────────────────────────┘

A multicast sender sends a multicast packet.

The sender's bpf_lxc program does a lookup in the multicast group map to
discover who has subscribed to the group.

The program then clones and redirects the packets to the subscriber's
ingress device on the host namespace.

If the subscriber is remote the packet is cloned and redirected to a
vxlan device for encapsulation.

Once the host stack forwards the vxlan encap'd packet to the receiving
vxlan device on the remote host a similar "clone and redirect" process
is performed once the vxlan driver decaps the packet.

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>

* bpf: sync test data

Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com>

* chore(deps): update dependency cilium/cilium-cli to v0.16.4

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* chore(deps): update all-dependencies

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* chore(deps): update cilium/little-vm-helper action to v0.0.17

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* chore(deps): update stable lvh-images

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* loader: refactor/cleanup replaceNetworkDatapath

[ upstream commit 1409a371d911e828514405f038161ef9c4cf26e9 ]

replaceNetworkDatapath() is only called from one place and adds an
additional loop over the encryption devices. This commit removes the
function and calls replaceDatapath() from reinitializeIPSec() directly.

There are no functional changes.

Signed-off-by: Robin Gögge <r.goegge@isovalent.com>
Signed-off-by: Timo Beckers <timo@isovalent.com>

* bpf: make BPF unit tests reproducible

[ upstream commit 0b9b390e823274698b40f6b7c437441916d32c7e ]

Currently, we develop the datapath against multiple versions of LLVM: various local
(host) toolchains during development, CI uses an LLVM version installed by a GH workflow,
and the agent uses the cilium-builder container image.

This PR changes the BPF unit tests to use cilium-builder by default using the
`run_bpf_tests` make target in the root Makefile of the project.

Small overview of the changes:
- run bpf unit tests CI using the cilium-builder image so we use the same LLVM
  toolchain across all tests
- set -j<numcpu> on the root Docker invocation to build .o's in parallel, as
  building the tests was becoming rather slow
- moved `test/bpf_tests/` to `bpf/tests/bpftest` to keep the BPF test runner closer
  to the .c test files it's used with
- removed the layer of indirection through `test/Makefile`; the root Makefile now
  calls `bpf/tests/Makefile` directly
- added a `run` target to `bpf/tests/Makefile` to make it easier to invoke the tests
  using the host Go toolchain without rebuilding the world. sudo is now used
  automatically for 'go test' if `make` is invoked as a non-root user.
- cleaned up output generated by bpf/tests/Makefile

Signed-off-by: Timo Beckers <timo@isovalent.com>

* loader: aggregate replaceDatapath arguments

[ upstream commit e2d90dad6ea18242a3ba67230eb09a7340bfbc5c ]

The arguments to the replaceDatapath functions are already quite numerous
and make the function signature hard to read. In preparation for future
commits, this patch aggregates almost all arguments to the function into
one option parameter.

Signed-off-by: Robin Gögge <r.goegge@isovalent.com>
Signed-off-by: Timo Beckers <timo@isovalent.com>

* loader: clean up tcx bpf_links created by newer Cilium versions

[ upstream commit f2d804babb4ac4098733cf0a15bd15bd834d1380 ]

A follow-up commit will introduce attaching TC programs using tcx. Those
attachments cannot be overridden using netlink. If an older version of
Cilium wants to replace a TC program on a managed interface, it'll need to
remove the tcx attachment first.

This commit teaches the agent to remove leftover tcx link objects from previous
installs, before reattaching it using netlink. Note that this transition is
never seamless, since some time passes between deleting the link and attaching
the new program using netlink. However, as explained in 7a8e3c810c
("loader: clean up XDP bpf_links created by newer Cilium versions"), this
downgrade path should rarely happen.

Signed-off-by: Robin Gögge <r.goegge@isovalent.com>
Co-authored-by: Timo Beckers <timo@isovalent.com>
Signed-off-by: Timo Beckers <timo@isovalent.com>

* Makefile: declare CILIUM_BUILDER_IMAGE in Makefile.defs

[ upstream commit 5c35dc31c1acb0cba013f20a026c57e291573e55 ]

Centralize the declaration so we can assume it's present in other Makefiles
importing Makefile.defs.

Signed-off-by: Timo Beckers <timo@isovalent.com>

* testdata: minimize build output by reducing header includes

[ upstream commit cd5bc4e03b5ebe3af56e639f252ff2a4a239f2a2 ]

This patch should make testdata play a bit nicer with backports, since
including headers like node_config.h, ep_config.h and maps.h cause potential
churn in the resulting BTF info.

Include a minimal subset of headers and reduce testdata code to what's
strictly necessary for the Go tests to run.

Signed-off-by: Timo Beckers <timo@isovalent.com>

* images: update cilium-{runtime,builder}

Signed-off-by: Cilium Imagebot <noreply@cilium.io>

* ingress/gateway-api: sorted virtual hosts

Currently, while translating K8s Ingress or Gateway API resources into
Envoy resources, the virtualhosts aren't sorted. This leads to situations
(especially in combination with Shared Ingress) where the order of the virtual
hosts isn't guaranteed.

Therefore, this commit orders the virtualhosts within an Envoy RouteConfiguration
by their name. This influences the Envoy route matching process
(https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/route_matching),
but only by making it constant and not random.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>

* chore(deps): update gcr.io/distroless/static-debian11:nonroot docker digest to f41b84c

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* chore(deps): update go to v1.21.9

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* images: update cilium-{runtime,builder}

Signed-off-by: Cilium Imagebot <noreply@cilium.io>

* envoy: Bump envoy image for golang 1.21.9

This is mainly for Go 1.21.9 and the Ubuntu 22.04 builder image.

Related build: https://github.com/cilium/proxy/actions/runs/8552986534/job/23435236145

Signed-off-by: Tam Mach <tam.mach@cilium.io>

* bugfix: hostname config in httproute and gateway

[ upstream commit a0528696705fee551c941784467d5b6dd20482c6 ]

This fixes a bug where the hostname config isn't
respected when set on a Gateway Listener and
on an HTTPRoute's spec.
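
For context, a rough sketch of how a Listener hostname and an HTTPRoute hostname can be intersected, paraphrasing the matching rules in the Gateway API specification. The function name `intersect` is hypothetical and this is not the code changed by the fix; the hostnames in `main` are the ones from the issue's manifest.

```go
package main

import (
	"fmt"
	"strings"
)

// intersect reports whether a route hostname is compatible with a
// listener hostname and returns the more specific of the two.
// An empty hostname means "no restriction".
func intersect(listener, route string) (string, bool) {
	switch {
	case listener == "":
		return route, true
	case route == "":
		return listener, true
	case listener == route:
		return route, true
	case strings.HasPrefix(listener, "*.") && strings.HasSuffix(route, listener[1:]):
		// The wildcard listener covers the route hostname.
		return route, true
	case strings.HasPrefix(route, "*.") && strings.HasSuffix(listener, route[1:]):
		// The wildcard route covers the listener hostname.
		return listener, true
	}
	return "", false
}

func main() {
	fmt.Println(intersect("bar-dev.home", "bar-dev.home")) // prints: bar-dev.home true
	fmt.Println(intersect("bar-dev.home", "foo-dev.home")) // prints:  false
	fmt.Println(intersect("*.home", "foo-dev.home"))       // prints: foo-dev.home true
}
```

Under these rules, a route that only names `foo-dev.home` should not be served by a listener restricted to `bar-dev.home`.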

Fixes: #30685

Signed-off-by: CJ Virtucio <cjv287@gmail.com>

* gateway-api: shorten the length of the value of the svc's label.

[ upstream commit d6fbccf96cdc0a5f3bdf7aa7ac6006a100a09ba9 ]

Fixes #31285

When creating a gateway-api with a name exceeding 64 characters, it is impossible to create svc.

This is because the label of svc references the name of gateway-api.

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>

* gateway-api: RequestRedirect picks wrong port with multiple listeners

[ upstream commit 74119bea8b4dbe2b8236f50bf9be9716171f51f5 ]

Fixes: 29099

If RequestRedirect does not specify a port and the scheme is empty, the port of the listener is used by default.
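
A sketch of the port selection, assuming the Gateway API rules as I understand them: an explicit port wins, a non-empty scheme maps to its well-known port, and otherwise the listener port is used. `redirectPort` is a hypothetical helper, not the function touched by this commit.

```go
package main

import "fmt"

// redirectPort picks the port for a RequestRedirect filter.
// port is nil when the filter does not specify one.
func redirectPort(port *int32, scheme string, listenerPort int32) int32 {
	if port != nil {
		return *port
	}
	switch scheme {
	case "http":
		return 80
	case "https":
		return 443
	}
	// Empty (or unknown) scheme: default to the listener's own port.
	return listenerPort
}

func main() {
	explicit := int32(9443)
	fmt.Println(redirectPort(nil, "", 8080))       // prints 8080
	fmt.Println(redirectPort(nil, "https", 8080))  // prints 443
	fmt.Println(redirectPort(&explicit, "", 8080)) // prints 9443
}
```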

Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>

* envoy: Bump envoy version to v1.27.4

This is mainly to resolve CVE-2024-30255.

Related: https://github.com/envoyproxy/envoy/security/advisories/GHSA-j654-3ccm-vfmm
Related build: https://github.com/cilium/proxy/actions/runs/8579381142/job/23514301386
Related release: https://github.com/envoyproxy/envoy/releases/tag/v1.27.4

Signed-off-by: Tam Mach <tam.mach@cilium.io>

* chore(deps): update all github action dependencies

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* docs: Restructure OpenShift installation instructions

[ upstream commit efff613258445ca9986e1b6b74af07875f20099c ]

This commit restructures the OpenShift installation instructions to
point to the Red Hat Ecosystem Catalog, so users may find vendor
maintained OLM images.

The old installation instructions which refer to the deprecated
cilium/cilium-olm repository will be moved to the
isovalent/olm-for-cilium repository.

Fixes: #24270

Signed-off-by: Ryan Drew <ryan.drew@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* nodemap: add node-map-max flag to configure nodemap bpf size.

[ upstream commit 0c3570c738aa260999a126b7501465d8e58e0fe0 ]

Using the node-map-max flag, one can now override the default 16k node map size.
This may be needed for large clusters, where the number of distinct node IPs in the cluster exceeds the standard size.

Also provides Size() to nodemap.Map interface such that loader can use this to set the NODE_MAP_MAX var while building bpf programs.

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>

* helm: add bpf.nodeMapMax helm val to configure node map size.

[ upstream commit a596a5aa27129056830ff780a54e1e53c6e0dd50 ]

This can be used to override the default node-map-max value which sets bpf node map size.
In some cases, node map size may need to be overridden for very large clusters.

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>

* nodemap: add validation to check that node map max is at least 16384.

[ upstream commit 3d641e1742370be37a9e4b48ead0ba43a867480e ]

This is the constant default size prior to adding the flag.
There's not much reason to lower this value so to avoid edge cases we'll just say that this is the lower bound.
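
The lower-bound check amounts to something like the following. The constant and function names are chosen for this sketch, and the error text is illustrative rather than the agent's actual message.

```go
package main

import "fmt"

// minNodeMapMax is the previous fixed size, now used as the lower bound.
const minNodeMapMax = 16384

// validateNodeMapMax rejects values below the lower bound.
func validateNodeMapMax(v uint32) error {
	if v < minNodeMapMax {
		return fmt.Errorf("node-map-max %d is below the minimum of %d", v, minNodeMapMax)
	}
	return nil
}

func main() {
	fmt.Println(validateNodeMapMax(8192))  // prints: node-map-max 8192 is below the minimum of 16384
	fmt.Println(validateNodeMapMax(32768)) // prints: <nil>
}
```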

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* docs: add section for scale implications of nodemap size.

[ upstream commit 4367ffa198955c589a195a9b0ea8df73297fb9d0 ]

With previous commits adding the ability to adjust nodemap size, this adds a section explaining the implications of the nodemap sizing.

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* docs: ipsec: document native-routing + Egress proxy case

[ upstream commit a2bf108c4d1652760953518397dedaf8755189f2 ]

Let the docs reflect the limitation from
https://github.com/cilium/cilium/security/advisories/GHSA-j89h-qrvr-xc36.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* ingress: sort all shared ingresses during model generation

[ upstream commit 16f6afe19fc51b96f5f6484f37adc8fccbf2c47d ]

Currently, when building the model for shared Ingress, all Ingresses
in the cluster are listed and processed. The order of the Ingresses can
differ and potentially influence the generated CiliumEnvoyConfig. This
can lead to unnecessary reconciles. (Even though the internal translation
already handles a stable CiliumEnvoyConfig generation where possible.)

In addition to the existing stable translation logic, this commit sorts
all shared Ingresses by their namespace and name before processing. This way
a consistent translation is more likely to be guaranteed.
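
The ordering step can be sketched as follows. This is an illustration under assumptions, not the operator's code: `ingressRef` is a hypothetical stand-in for an Ingress object and `sortIngresses` is a made-up helper name.

```go
package main

import (
	"fmt"
	"sort"
)

// ingressRef is a hypothetical stand-in for an Ingress object; only the
// fields needed for ordering are included.
type ingressRef struct {
	Namespace string
	Name      string
}

// sortIngresses orders the slice in place by namespace, then by name, so
// that later processing sees the same sequence on every reconcile.
func sortIngresses(ings []ingressRef) {
	sort.SliceStable(ings, func(i, j int) bool {
		if ings[i].Namespace != ings[j].Namespace {
			return ings[i].Namespace < ings[j].Namespace
		}
		return ings[i].Name < ings[j].Name
	})
}

func main() {
	ings := []ingressRef{
		{Namespace: "prod", Name: "web"},
		{Namespace: "dev", Name: "web"},
		{Namespace: "dev", Name: "api"},
	}
	sortIngresses(ings)
	for _, ing := range ings {
		fmt.Printf("%s/%s\n", ing.Namespace, ing.Name)
	}
}
```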

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* ci-e2e: Add e2e test with WireGuard + Host Firewall

[ upstream commit 54b2ce4c5023c64a30d47e2be3b9cb1b2c7cec14 ]

To get more coverage about the host firewall, let's add a new job in the
e2e test suites to run it alongside WireGuard encryption.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* fqdn: Update DNS Restore to Index to PortProto

[ upstream commit 1941679572fd10932f20beb30a0dc6fd4c70c05f ]

DNS Proxy needs to account for protocol when indexing
L7 DNS rules that it needs to adhere to, otherwise
L7 rules with differing port-protocols can override
each other (nondeterministically) and create overly
restrictive, and incorrect DNS rules. The problem with
accounting for protocol is that Endpoint restoration
logic uses DNS rules that index to port-only as JSON
saved to disk. Adding an additional protocol index to
a map structure changes the JSON structure and breaks
restoration logic between Cilium versions.

This change makes the map index backwards compatible,
since it changes the index from a uint16 to a uint32,
both of which marshal the same into a JSON structure.
The endpoint restoration logic will succeed between
versions, because the older version will be
automatically differentiated with a lack of a 1-bit
at bit position 24. Version 2 will save a 1 bit at the
24th bit going forward to differentiate when protocol
is indexed or not present.
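
One way to picture the versioned index is the sketch below. The bit layout is an assumption made for illustration (port in bits 0-15, protocol number in bits 16-23, version marker at bit 24); only the widening to uint32 and the marker at bit position 24 come from this message, and all identifiers are hypothetical.

```go
package main

import "fmt"

// portProto packs a port and a protocol into one uint32. Assumed layout:
// bits 0-15 hold the port, bits 16-23 the protocol number, and bit 24 is
// the version marker.
type portProto uint32

// versionBit marks an index that carries a protocol (the "Version 2" form).
const versionBit portProto = 1 << 24

// makeV1 builds the legacy, port-only index.
func makeV1(port uint16) portProto { return portProto(port) }

// makeV2 builds an index that carries both port and protocol.
func makeV2(port uint16, proto uint8) portProto {
	return versionBit | portProto(proto)<<16 | portProto(port)
}

func (pp portProto) isV2() bool      { return pp&versionBit != 0 }
func (pp portProto) port() uint16    { return uint16(pp & 0xFFFF) }
func (pp portProto) protocol() uint8 { return uint8((pp >> 16) & 0xFF) }

func main() {
	legacy := makeV1(53)
	udp := makeV2(53, 17) // 17 is the IANA protocol number for UDP
	fmt.Println(legacy.isV2(), legacy.port())
	fmt.Println(udp.isV2(), udp.port(), udp.protocol())
}
```

Because a port-only value never exceeds 16 bits, a restored legacy index can never have the marker bit set, which is what lets the two forms share one JSON number field.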

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* fqdn: Add Protocol to DNS Proxy Cache

[ upstream commit bc7fbf384bd2179c943130fc6842e27045c372de ]

DNS Proxy indexes domain selectors by port
only. In cases where protocols collide on port
the DNS proxy may have a more restrictive selector
than it should because it does not merge port
protocols for L7 policies (only ports).

All callers of the DNS Proxy are updated
to add protocol to any DNS Proxy entries, and all
tests are updated to test for port-protocol
merge errors.

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

* endpoint: Create a New Restore Field for DNS

[ upstream commit 6baab364c2fe71e54b50f3d746175ef1db75f6e2 ]

DNSRulesV2 accounts for protocol and DNSRules does not.
DNSProxy needs to account for both, and endpoint needs
to be able to restore from a downgrade. DNSRulesV2 is used
by default now, but DNSRules is maintained in case of a
downgrade.

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

* fqdn: Fallback to Version 1 Port Lookups

[ upstream commit abd7c6e7fdca4352f2d83c0701d95d53cf3e10af ]

In cases where a port-protocol is not present
in a restored port protocol, look up
the Version 1 version of the PortProto
in case a Version 1 PortProto was restored.

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* testing: Update Restore Sort Method Signatures

[ upstream commit 51852524f8315d98fa82b292ac7254f0564bea3a ]

The Sort methods are updated to take an unused
testing.T structure to indicate to all callers
that they are only for testing purposes.

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* docs: Document No node ID drops in case of remote node deletion

[ upstream commit e2e97f3e07843f568813f90500ba75b21c462b8b ]

While testing cluster scale downs, we noticed that under constant
traffic load, we sometimes had drops of type "No node ID found". We
confirmed that these are expected when the remote node was just deleted,
the delete event received by the local agent, but a local pod is still
sending traffic to pods on that node. In that case, the node is removed
from the node ID map, but information on pods hosted by that node may
still be present.

This commit documents it with the other expected reasons for "No node
ID found" drops.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* cilium-health: Fix broken retry loop in `cilium-health-ep` controller

[ upstream commit 43bd8c17f020eea053aab71216c37e2814fc4570 ]

This commit fixes a bug in the `cilium-health-ep` controller restart
logic where it did not give the cilium-health endpoint enough time to
start up before it was re-created.

For context, the `cilium-health-ep` performs two tasks:

  1. Launch the cilium-health endpoint when the controller is started
     for the first time.
  2. Ping the cilium-health endpoint, and if it does not reply, destroy
     and re-create it.

The controller has a `RunInterval` of 60 seconds and a default
`ErrorRetryBaseDuration` of 1 second. This means that after launching
the initial cilium-health endpoint, we wait for 60 seconds before we
attempt to ping it. If that ping succeeds, we then keep pinging the
health endpoint every 60 seconds.

However, if a ping fails, the controller deletes the existing endpoint
and creates a new one. Because the controller then also returns an
error, it is immediately re-run after one second, because in the failure
case a controller retries with an interval of `consecutiveErrors *
ErrorRetryBaseDuration`.

This meant that after a failed ping, we deleted the unreachable
endpoint, recreated a new one, and after 1s would immediately try to
ping it. Because the newly launched endpoint is unlikely to be
reachable after just one second (it requires a full endpoint
regeneration with BPF compilation), the `cilium-health-ep` logic would
declare the still starting endpoint as dead and re-create it. This loop
would continue endlessly, causing lots of unnecessary CPU churn, until
enough consecutive errors have happened for the wait time between launch
and the first ping to be long enough for a cilium-health endpoint to be
fully regenerated.

This commit attempts to fix the logic by not immediately killing an
unreachable health endpoint and instead waiting for three minutes to
pass before we attempt to try again. Three minutes should hopefully be
enough time for the initial endpoint regeneration to succeed.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>

* workflows: ipsec-e2e: add missing key types for some configs

[ upstream commit 283cb040ba65681e9e8776190af545769179f9ac ]

These configs were recent additions, and missed the introduction of
the key-type-* parameters. Add them now.

Suggested-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* workflows: Debug info for key rotations

[ upstream commit 820aa07acdcdcb160b62574cdf2a766cf47f5da0 ]

During the key rotations, we compare the number of keys to the expected
number to know where we are in the process (started the rotation or
finished it). The expected number of keys depends on the configuration
so let's print it in the logs to help debug.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* fix: Delegated ipam not configure ipv6 in ipv6 disabled case

[ upstream commit 034aee74f57905898741b89549430c409bef99e3 ]

Delegated IPAM may return an IPv6 address to the Cilium CNI plugin even
when IPv6 is disabled in the Cilium agent config. In that case IPv6 node
addressing is not set, which causes the Cilium CNI plugin to crash.

Signed-off-by: Tamilmani <tamanoha@microsoft.com>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* install/kubernetes: use digest for nodeinit image

[ upstream commit 2d32dab5451d6ecc1dd1de3bc39f1070ff02b6b5 ]

Like other images used in the Cilium helm chart, use a digest in
addition to the tag for the nodeinit image.

Signed-off-by: Tobias Klauser <tobias@cilium.io>

* install/kubernetes: use renovate to update quay.io/cilium/startup-script

[ upstream commit ac804b6980aac59950e23484809cbc2cafa318c2 ]

Make sure the latest version of the image is used in the helm charts by
letting renovatebot update it automatically.

Signed-off-by: Tobias Klauser <tobias@cilium.io>

* ci/ipsec: Print more info to debug credentials removal check failures

[ upstream commit 129f2e235e62445b73a1b5630f1f7a3a36bf5014 ]

In commit 6fee46f9e753 ("ci/ipsec: Fix downgrade version retrieval") we
added a check to make sure that GitHub credentials are removed before
pulling the untrusted branch from the Pull Request's author. It appears
that this check occasionally fails and causes the whole job to abort.
But Cilium's repository _is_ public, and it's unclear why ".private ==
false" does not evaluate to "false" as we expected in that case. Did the
curl request fail? Did the reply miss the expected .private field? We'll
probably loosen the check as a workaround, but before that it would be
interesting to understand better what's going on. Here we remove the -s
flag from curl and print the reply from the GitHub API request, so we
can better understand what's going on next time we observe a failure.

Signed-off-by: Quentin Monnet <qmo@qmon.net>
Signed-off-by: Jussi Maki <jussi@isovalent.com>

* chore(deps): update docker/setup-buildx-action action to v3.3.0

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* route: Specify "proto kernel" for ip routes and rules

[ upstream commit: 318a64874832bbc9c0047c114a0723d8bfb9219d ]

v1.14 installs routes and rules with an explicit "proto kernel", but
v1.15 omitted it. Without "proto kernel", downgrading from v1.15 to
v1.14 causes trouble, because v1.14 deletes routes with "proto kernel"
and finds no matching ones.

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>

* route: Clean up ip rules with "proto unspec"

This commit adds "removeStaleProxyRulesIPvX()" which removes any ip
rules with "proto unspec" to ensure upgrade/downgrade goes smoothly.

Scenario 1: upgrade from v1.15-old to v1.15-tip
  v1.15-old cilium installs ip rules with "proto unspec", then
  v1.15-tip will install "duplicate" ip rules with "proto kernel".
  This is the moment when "removeStaleProxyRulesIPvX()" plays a role,
  it cleans those "proto unspec" stale rules without breaking
  connectivity.

Scenario 2: downgrade from v1.15-tip to v1.15-old
  v1.15-tip has rules with "proto kernel". When v1.15-old tries to
  "ReplaceRule()" with "proto unspec", thanks to
  [this](https://github.com/cilium/cilium/blob/v1.15.3/pkg/datapath/linux/route/route_linux.go#L402),
  "ReplaceRule()" won't replace the rules because they already exist
  (with a different proto). This ensures connectivity can survive
  the downgrade too.

Scenario 3: upgrade from v1.15-tip to v1.16
  Since v1.15-tip installs correct rules with "proto kernel", v1.16
  will do nothing after confirming existence by "lookupRule()". It
  should be painless as well.

This is a v1.15-only commit because:
1. v1.14 is still using bpf/init.sh which sets rules with "proto kernel"
   properly;
2. v1.16 has been fixed to set "proto kernel";
3. v1.15-tip -> v1.16 upgrade has been discussed above without any
   issue;

Also please note that we don't have to clean up leftover ip routes with
"proto unspec", because we replace them via "route.Upsert()" which
replaces the old ones unconditionally, leaving no stale routes.

Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com>

* bpf: tests: don't define HAVE_ENCAP in IPsec tests

[ upstream commit 3e32efc962afca58548144ab6b126640e0ec0794 ]

This is an internal macro that's selected by common.h (based on
TUNNEL_MODE and a few other config options). There should be no need to
explicitly set it.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

* bpf: nodeport: avoid setting SNAT-done mark for to-overlay

[ upstream commit 5b37dc9d6a4fd5ccde4ca78d2b5fa7cc27b99781 ]

When a packet gets SNATed in to-overlay, there is no point in setting the
SNAT-done mark. Firstly the mark refers to the *inner* packet, but all
subsequent users will only see the outer headers. Also our own netfilter
rules installed by installHostTrafficMarkRule() currently clear the mark
for overlay traffic.

Free up the mark for future usage.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

* datapath: mark to-overlay traffic

[ upstream commit 2860ded6cc37bc2d63e6c3bce2ec8aceddbf9fe1 ]

To make smarter decisions at the native-device level (in to-netdev), mark
traffic that is created by cilium's overlay interface.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

* bpf: nat: skip SNAT in to-netdev for overlay traffic

[ upstream commit 7c789e52e8713dfe5a45bdd88f69fea3fb698cb5 ]

Creating SNAT entries for our own overlay traffic makes little sense. In
particular as the replies will not be addressed to the egressing packet's
source port, but to TUNNEL_PORT.

Avoiding such SNAT tracking reduces the pressure on the CT and NAT maps.

Fixes: https://github.com/cilium/cilium/issues/26908
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

* bpf: host: restore HostFW for overlay traffic in to-netdev

[ upstream commit b523a92ce76feeb59e4c325e8f6be0139e8a8e67 ]

Prior to 2860ded6cc37 ("datapath: mark to-overlay traffic"), overlay
traffic would reach the HostFW egress path in to-netdev with
MARK_MAGIC_HOST set. Restore this behaviour by also assigning HOST_ID for
traffic that has MARK_MAGIC_OVERLAY set.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

* bpf: host: provide src_sec_identity in to-netdev's drop events

[ upstream commit 8984c20d7ec1ada83b6c9db61a12bee651abf6d0 ]

As we now have a mark-derived src_sec_identity available, we might as well
share this bit of information with the user.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>

* fqdn: Fix Restore Check Logic

[ upstream commit 79029db115743b9884a06e1acf0067140d8a33fe ]

Sometimes restored IPRules do not have the
default "nil" populating their IP maps, but instead have
an empty map structure. We need to check for this
restore possibility.

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>

* update cilium/certgen to v0.1.11

v0.1.10 and v0.1.11 are patch releases which update the Go version and
dependencies. See releases for details:

v0.1.10: https://github.com/cilium/certgen/releases/tag/v0.1.10
v0.1.11: https://github.com/cilium/certgen/releases/tag/v0.1.11

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

* enable renovate for cilium/certgen

Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net>

* service: Update for L7 LB while locked

[ upstream commit d913b6298123064f51a8b97495f956b5ebbe62b7 ]

Keep service manager locked while reupserting services after L7 LB
redirection updates. This removes the race where a service would be
re-upserted while having been (concurrently) removed.

Fixes: #18894

Reported-by: Jussi Maki <joamaki@isovalent.com>
Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>

* bugtool: Collect hubble metrics

[ upstream commit 3d95fbcdf655bdb30ca144866b38435a9b0ec3e3 ]

Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* feat: Add the http return code to metric api_processed_total

[ upstream commit 76867e23700ea899bdfdfe247998723cbe9512b2 ]

Signed-off-by: Vipul Singh <vipul21sept@gmail.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* update azure k8s versions

[ upstream commit e22c108f92716c453c8b99a5f30654f80203163c ]

This commit updates tested azure k8s versions according to supported versions

Signed-off-by: Birol Bilgin <birol@cilium.io>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* metric: Avoid memory leak/increase

[ upstream commit 4d05e9f9bedf3fe5022adec256c8d4dcbe224d48 ]

This commit makes sure that each processed item in the pod deletion
queue is removed by explicitly calling the Done() function, as suggested
in the godoc[^1].

Without this change, memory usage in the cilium agent keeps growing when
hubble metrics are enabled. This might take days (if not weeks) to
observe in a normal Cilium deployment due to the low number of Pod
deletion events; in a high-churn environment, memory grows at a faster
pace.

Testing is done before and after the changes as per below.

Sample workload to simulate high number of pod deletion events

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pod-churn-job
spec:
  completions: 50000000
  parallelism: 100
  template:
    metadata:
      labels:
        app: pod-churn-job
    spec:
      containers:
      - name: churn-app
        image: sandeshkv92/highpodchurn:linux_amd64
      restartPolicy: Never
```

Before this change, the cilium agent memory keeps increasing from 150MB
to ~500MB in less than 3 hours, while with the same workload configured
and this change, the memory is quite stable for a longer period (e.g. 5
hours).

[^1]: https://pkg.go.dev/k8s.io/client-go@v0.29.3/util/workqueue#Type.Get

Fixes: 782f934641df5bafd4a9ee737e00872f65f56b64
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* cni: Allow text-ts log format value

[ upstream commit 0a79dbd66f7f29b0fdfa622f7554803c6443d42e ]

A new log format (e.g. text-ts) was added recently in the commit
referenced below, so we need to allow it in the regex. Additionally,
text-ts is used as the default value if the format is unspecified or
invalid.
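
A validation of this shape might look like the sketch below. The list of accepted values other than text-ts is an assumption, and `logFormatRe`, `defaultLogFormat`, and `normalizeLogFormat` are hypothetical names rather than identifiers from the CNI plugin.

```go
package main

import (
	"fmt"
	"regexp"
)

// logFormatRe lists the accepted values. Only text-ts is confirmed by the
// commit message; the other alternatives are assumptions for illustration.
var logFormatRe = regexp.MustCompile(`^(text|text-ts|json|json-ts)$`)

// defaultLogFormat is the fallback named in the commit message.
const defaultLogFormat = "text-ts"

// normalizeLogFormat returns v if it is an accepted format and falls back
// to the default when v is empty or not recognised.
func normalizeLogFormat(v string) string {
	if logFormatRe.MatchString(v) {
		return v
	}
	return defaultLogFormat
}

func main() {
	for _, v := range []string{"text-ts", "json", "", "xml"} {
		fmt.Printf("%q -> %q\n", v, normalizeLogFormat(v))
	}
}
```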

Fixes: a099bf1571f1a090ccfd6ccbba545828a6b3b63c
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* daemon: add BackendSlot to Service6Key.String and Service4Key.String

[ upstream commit ecf6ff19e7a8011c73ad62d337dabddb34ad72cf ]

    This commit adds BackendSlot value to the Service6Key.String
    and Service4Key.String methods. This is to prevent the
    service key from being deleted when the backend endpoint is deleted.

    Fixes: #29580

Signed-off-by: xyz-li <hui0787411@163.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* cilium-dbg: don't write to file on error opening

[ upstream commit 09572e237ce4d3142f2f5387915c9aadb84ec307 ]

If os.OpenFile returns an error we shouldn't be writing to the returned
os.(*File) instance which might be nil.

Signed-off-by: Tobias Klauser <tobias@cilium.io>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* cilium-dbg: avoid leaking file resources

[ upstream commit 7aa2c74a7448076599a5ef7bd22df62e62daf5d9 ]

Files opened using os.Open{,File} need to be closed manually using
os.(*File).Close to avoid leaking os.(*File) instances and file
descriptors.

Signed-off-by: Tobias Klauser <tobias@cilium.io>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* Fix spelling in DNS-based proxy info

[ upstream commit d4ec036b415514bc2956a910c8ce289d2f71edab ]

Signed-off-by: Dean <22192242+saintdle@users.noreply.github.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* etcd: fix lock leases reporting in status

[ upstream commit 559eb5a423487c23f944009263292cc4db4deee8 ]

Status was incorrectly reporting the number of generic leases twice,
rather than generic and lock leases. Let's fix it.

Fixes: c6eb358b7bab ("etcd: switch lock session to lease manager")
Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* bpf: nat: tolerate non-CT L4 protocols when checking for reply traffic

[ upstream commit 20fbf45f98077e936c2ad018cb53bf5761ea37f9 ]

Prior to the blamed commit we ignored any errors from the L4 port
extraction, and so traffic that's not supported by CT (eg ESP) would be
allowed through. Restore this behaviour by explicitly checking for
DROP_CT_UNKNOWN_PROTO.

Fixes: 76217a1b328d ("bpf: nat: Handle errors from snat_v(4|6)_prepare_state()")
Reported-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* Move governance docs to the Cilium Community repo

[ upstream commit 3847833e01eb99da79017f149bd5b4fa77e1d86c ]

We are moving the governance docs out of the main Cilium repo and into the Cilium community repo. They are currently published in the community repo, and this commit will finish the move by deleting them from this main repo.

Changes include deleting the governance docs from this location and updating links to point to the governance docs in the new location.

Fixes: cilium/community#78

Signed-off-by: Katie Struthers <99215338+katiestruthers@users.noreply.github.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* docs: Remove Hubble-OTel from roadmap

[ upstream commit 0976a1b6bd085f6c95a8f0ed4c24fac83e244bcd ]

The Hubble OTel repo is going to be archived so it should be removed from the roadmap

Signed-off-by: Bill Mulligan <billmulligan516@gmail.com>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* bitlpm: Document and Fix Descendants Bug

[ upstream commit 9e89397d2d81fe915bfbd74d41bd72e5d0c6ad5b ]

Descendants and Ancestors cannot share the same
traversal method, because Descendants needs to be
able to select at least one in-trie key-prefix match
that may not be a full match for the argument key-prefix.
The old traversal method worked for the Descendants
method if there happened to be an exact match of the
argument key-prefix in the trie. These new tests ensure
that Descendants will still return a proper list of
Descendants even if there is not an exact match in the
trie.

Signed-off-by: Nate Sweet <nathanjsweet@pm.me>
Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* Prepare for release v1.15.4

Signed-off-by: Andrew Sauber <2046750+asauber@users.noreply.github.com>

* install: Update image digests for v1.15.4

Generated from https://github.com/cilium/cilium/actions/runs/8654016733.

`quay.io/cilium/cilium:v1.15.4@sha256:b760a4831f5aab71c711f7537a107b751d0d0ce90dd32d8b358df3c5da385426`
`quay.io/cilium/cilium:stable@sha256:b760a4831f5aab71c711f7537a107b751d0d0ce90dd32d8b358df3c5da385426`

`quay.io/cilium/clustermesh-apiserver:v1.15.4@sha256:3fadf85d2aa0ecec09152e7e2d57648bda7e35bdc161b25ab54066dd4c3b299c`
`quay.io/cilium/clustermesh-apiserver:stable@sha256:3fadf85d2aa0ecec09152e7e2d57648bda7e35bdc161b25ab54066dd4c3b299c`

`quay.io/cilium/docker-plugin:v1.15.4@sha256:af22e26e927ec01633526b3d2fd5e15f2c7f3aab9d8c399081eeb746a4e0db47`
`quay.io/cilium/docker-plugin:stable@sha256:af22e26e927ec01633526b3d2fd5e15f2c7f3aab9d8c399081eeb746a4e0db47`

`quay.io/cilium/hubble-relay:v1.15.4@sha256:03ad857feaf52f1b4774c29614f42a50b370680eb7d0bfbc1ae065df84b1070a`
`quay.io/cilium/hubble-relay:stable@sha256:03ad857feaf52f1b4774c29614f42a50b370680eb7d0bfbc1ae065df84b1070a`

`quay.io/cilium/operator-alibabacloud:v1.15.4@sha256:7c0e5346483a517e18a8951f4d4399337fb47020f2d9225e2ceaa8c5d9a45a5f`
`quay.io/cilium/operator-alibabacloud:stable@sha256:7c0e5346483a517e18a8951f4d4399337fb47020f2d9225e2ceaa8c5d9a45a5f`

`quay.io/cilium/operator-aws:v1.15.4@sha256:8675486ce8938333390c37302af162ebd12aaebc08eeeaf383bfb73128143fa9`
`quay.io/cilium/operator-aws:stable@sha256:8675486ce8938333390c37302af162ebd12aaebc08eeeaf383bfb73128143fa9`

`quay.io/cilium/operator-azure:v1.15.4@sha256:4c1a31502931681fa18a41ead2a3904b97d47172a92b7a7b205026bd1e715207`
`quay.io/cilium/operator-azure:stable@sha256:4c1a31502931681fa18a41ead2a3904b97d47172a92b7a7b205026bd1e715207`

`quay.io/cilium/operator-generic:v1.15.4@sha256:404890a83cca3f28829eb7e54c1564bb6904708cdb7be04ebe69c2b60f164e9a`
`quay.io/cilium/operator-generic:stable@sha256:404890a83cca3f28829eb7e54c1564bb6904708cdb7be04ebe69c2b60f164e9a`

`quay.io/cilium/operator:v1.15.4@sha256:4e42b867d816808f10b38f555d6ae50065ebdc6ddc4549635f2fe50ed6dc8d7f`
`quay.io/cilium/operator:stable@sha256:4e42b867d816808f10b38f555d6ae50065ebdc6ddc4549635f2fe50ed6dc8d7f`

Signed-off-by: Andrew Sauber <2046750+asauber@users.noreply.github.com>

* cilium-dbg: remove section with unknown health status.

The "unknown" status simply refers to components that accept a health
reporter scope, but have not declared their state as being either "ok"
or degraded.

This is a bit confusing, as this does not necessarily mean any problems
with Cilium.

In the future we may want to rework this state to distinguish between
unreported states and components that are "timing-out" reconciling a desired
state.

This PR simply removes displaying this information in `cilium-dbg
status`

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>

* chore(deps): update all github action dependencies

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* chore(deps): update azure/login action to v2.1.0

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* fix k8s versions tested in CI

- Remove older versions we do not officially support anymore on v1.15.
- Make K8s 1.29 the default version on all platforms.

Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com>

* chore(deps): update docker.io/library/golang:1.21.9 docker digest to 81811f8

Signed-off-by: renovate[bot] <bot@renovateapp.com>

* images: update cilium-{runtime,builder}

Signed-off-by: Cilium Imagebot <noreply@cilium.io>

* golangci: Enable errorlint

[ upstream commit d20f15ecab7c157f6246a07c857662bec491f6ee ]

Enable er…