
v1.13 backports 2022-12-21 #22822

Merged 56 commits on Dec 22, 2022
Commits on Dec 21, 2022

  1. test/l4lb,nat46x64: pass k8s api server to the standalone proxy

    [ upstream commit bdc23f5 ]
    
    These tests deploy the standalone agent in a funny way: using the
    existing helm chart, but not configuring it fully. This breaks the
    config-resolver, which expects to be able to reach a functioning
    in-cluster apiserver.
    
    So, just pass the correct apiserver to the config resolver.
    
    Signed-off-by: Casey Callendrello <cdc@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    squeed authored and joestringer committed Dec 21, 2022
    Commit: 74ec29a
  2. clustermesh: Add some helper functions required for v3 backend map

    [ upstream commit 0f2f3fe ]
    
    Add some missing helper functions needed to implement the v3 backend map;
    a sketch follows below.
    1. A new AddrCluster constructor, AddrClusterFrom
    2. A new AddrCluster method to extract the ClusterID
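
    A minimal sketch of what these helpers might look like (only the names
    AddrClusterFrom and ClusterID come from the commit; the field layout is an
    assumption for illustration):

    ```go
    import "net/netip"

    // AddrCluster pairs an IP address with the ID of the cluster it belongs
    // to. The layout here is illustrative, not the exact upstream definition.
    type AddrCluster struct {
        addr      netip.Addr
        clusterID uint32
    }

    // AddrClusterFrom constructs an AddrCluster from an address and cluster ID.
    func AddrClusterFrom(addr netip.Addr, clusterID uint32) AddrCluster {
        return AddrCluster{addr: addr, clusterID: clusterID}
    }

    // ClusterID extracts the cluster ID part of an AddrCluster.
    func (ac AddrCluster) ClusterID() uint32 {
        return ac.clusterID
    }
    ```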
    
    Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    YutaroHayakawa authored and joestringer committed Dec 21, 2022
    Commit: 948fc8d
  3. Introduce v3 backend maps

    [ upstream commit c6e53ea ]
    
    Extend current backend map values (struct lb4_backend and struct
    lb6_backend) to contain a new cluster_id field. With this field, we can
    distinguish two backends which have the same IP address, but belong to
    different clusters. Since we don't have an available padding bit in either
    struct anymore, we need to extend the structs and bump the map version
    as well. This commit also contains migration code from v2 backend map to
    v3 backend map.
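
    On the Go side, the extended map value could look roughly like the sketch
    below (field order and padding are assumptions, not the upstream
    definition):

    ```go
    // Backend4ValueV3 mirrors struct lb4_backend after the v3 bump.
    // The layout is illustrative only.
    type Backend4ValueV3 struct {
        Address   [4]byte // backend IPv4 address
        Port      uint16  // backend port (network byte order)
        Proto     uint8
        Flags     uint8
        ClusterID uint8   // new in v3: distinguishes same-IP backends across clusters
        _         [3]byte // explicit padding, since no spare bits remained in v2
    }
    ```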
    
    Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    YutaroHayakawa authored and joestringer committed Dec 21, 2022
    Commit: bc28587
  4. ipam/crd: Fix agent fatal on router initialization

    [ upstream commit 4d7acac ]
    
    Currently, when a cilium-agent in eni/alibabacloud mode initializes
    router info, it might encounter the following fatal:
    
    ```
    level=fatal msg="Error while creating daemon" error="failed to create router info invalid ip: " subsys=daemon
    ```
    
    The gateway IP of the routing info is derived from the CIDR of the subnet,
    but eni.Subnet.CIDR in InstanceMap is left empty after ENI creation.
    In normal cases it will be filled by a later resyncTrigger after maintainIPPool.
    But if another goroutine (periodic resync or pool maintainer of another node)
    happens to sync the local InstanceMap cache to k8s, cilium-agent would be
    informed of that ENI IP pool with an empty CIDR, and router IP allocation
    would fail fatally due to the empty gateway IP.
    
    This patch fixes this by filling the CIDR right after ENI creation.
    
    Signed-off-by: Jaff Cheng <jaff.cheng.sh@gmail.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jaffcheng authored and joestringer committed Dec 21, 2022
    Commit: 96dc218
  5. daemon: Do not remove PERM L2 entries in L4LB

    [ upstream commit ced77b3 ]
    
    In the L4LB mode, the PERM L2 entries are managed by users. So, in this
    mode Cilium should not mess with them.
    
    Signed-off-by: Martynas Pumputis <m@lambda.lt>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    brb authored and joestringer committed Dec 21, 2022
    Commit: f9318f5
  6. clustermesh: Introduce CiliumClusterConfig

    [ upstream commit 857060b ]
    
    Add a new type CiliumClusterConfig which represents a cluster
    configuration. This will be serialized and stored into the kvstore during
    clustermesh-apiserver startup. Later on, cilium-agent on each
    node reads it when connecting to new clusters.
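
    A minimal sketch of such a config type and its serialization (any field
    beyond a cluster ID is an assumption for illustration):

    ```go
    import "encoding/json"

    // CiliumClusterConfig is written to the kvstore by clustermesh-apiserver
    // and read by agents when connecting to a remote cluster.
    type CiliumClusterConfig struct {
        ID uint32 `json:"id"` // ClusterID of the cluster that wrote the config
    }

    // Marshal serializes the config for storage in the kvstore.
    func (c *CiliumClusterConfig) Marshal() ([]byte, error) {
        return json.Marshal(c)
    }
    ```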
    
    The current use case of this is getting ClusterID at connect time, but by
    exposing the cluster configuration, we can also do some useful validation
    such as
    
    - Make sure the cluster id is not conflicting with existing clusters.
    - Make sure the new cluster doesn't have any capability mismatch.
    
    Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    YutaroHayakawa authored and joestringer committed Dec 21, 2022
    Commit: a2e3aa2
  7. clustermesh: Add helper functions to set/get cluster configuration

    [ upstream commit b24973e ]
    
    Add helper functions to set/get cluster information in the kvstore.
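
    A hedged sketch of what such helpers could look like (the key path and the
    minimal backend interface below are illustrative assumptions, not the
    actual kvstore API):

    ```go
    import (
        "context"
        "encoding/json"
        "path"
    )

    // kvBackend is a minimal stand-in for the kvstore backend interface.
    type kvBackend interface {
        Update(ctx context.Context, key string, value []byte) error
        Get(ctx context.Context, key string) ([]byte, error)
    }

    // SetClusterConfig stores the serialized config under a per-cluster key.
    func SetClusterConfig(ctx context.Context, kv kvBackend, cluster string, cfg *CiliumClusterConfig) error {
        val, err := json.Marshal(cfg)
        if err != nil {
            return err
        }
        return kv.Update(ctx, path.Join("cilium", "cluster-config", cluster), val)
    }

    // GetClusterConfig reads a remote cluster's config; a missing value can be
    // treated as "no config" for compatibility with older clusters.
    func GetClusterConfig(ctx context.Context, kv kvBackend, cluster string) (*CiliumClusterConfig, error) {
        val, err := kv.Get(ctx, path.Join("cilium", "cluster-config", cluster))
        if err != nil || val == nil {
            return nil, err
        }
        cfg := &CiliumClusterConfig{}
        return cfg, json.Unmarshal(val, cfg)
    }
    ```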
    
    Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    YutaroHayakawa authored and joestringer committed Dec 21, 2022
    Commit: dad8659
  8. clustermesh: Implement a basic connect-time validation

    [ upstream commit 5e5a26e ]
    
    Implement a basic connect-time validation using CiliumClusterConfig.
    clustermesh-apiserver is modified to set the local CiliumClusterConfig at
    start-up time, and cilium-agent is modified to get the CiliumClusterConfig
    of remote clusters and validate it.
    
    Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    YutaroHayakawa authored and joestringer committed Dec 21, 2022
    Commit: c7da54d
  9. clustermesh: Add test case for the case that cluster config is missing

    [ upstream commit 398cf5e ]
    
    For compatibility with an old Cilium that doesn't support the cluster
    configuration feature, we should be able to connect to the remote
    cluster even if the configuration is missing. Test it.
    
    Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    YutaroHayakawa authored and joestringer committed Dec 21, 2022
    Commit: 82f50de
  10. gha: Add retry mechanism for conformance ingress (shared)

    [ upstream commit 303d2e7 ]
    
    This is to make sure the HTTP retry is done as part of the test.
    
    Fixes: #21710
    Fixes: #21993
    
    Signed-off-by: Tam Mach <tam.mach@cilium.io>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    sayboras authored and joestringer committed Dec 21, 2022
    Commit: 093a1be
  11. daemon, cmd: Add benchmark for notifyOnDNSMsg()

    [ upstream commit 25fa14c ]
    
    This allows us to stress / benchmark the main callback function of the
    DNS proxy within Cilium. This callback is called on each DNS request and
    response made by each endpoint managed by Cilium.
    
    For this commit, we are only considering the DNS response path because
    it is the path which contains the most work to be done. This includes
    policy calculation, identity allocation, and ipcache upsert.
    
    It is useful to establish baseline performance numbers so that we can
    compare against changes to this code path. In the following commits, a
    mutex will be added to this path to fix a race condition during parallel
    DNS response handling.
    
    Baseline:
    
    ```
    $ go test -v -tags integration_tests ./daemon/cmd -check.b -check.bmem -check.f DaemonFQDNSuite.Benchmark_notifyOnDNSMsg -test.benchtime 100x
    ...
    PASS: fqdn_test.go:180: DaemonFQDNSuite.Benchmark_notifyOnDNSMsg           20000             96078 ns/op        36567 B/op         482 allocs/op
    ```
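
    For reference, a gocheck-style benchmark as invoked above has roughly the
    shape below (a sketch, not the actual test body; check is
    gopkg.in/check.v1):

    ```go
    // c.N is the iteration count chosen by the framework; -test.benchtime 100x
    // pins it to 100 iterations per run.
    func (ds *DaemonFQDNSuite) Benchmark_notifyOnDNSMsg(c *check.C) {
        for i := 0; i < c.N; i++ {
            // feed one synthetic DNS response through the proxy's
            // NotifyOnDNSMsg callback and discard the result
        }
    }
    ```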
    
    Signed-off-by: Chris Tarazi <chris@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    christarazi authored and joestringer committed Dec 21, 2022
    Commit: 0ab3209
  12. daemon, fqdn: Fix race between parallel DNS requests for the same name

    [ upstream commit 604eb21 ]
    
    It is possible for parallel DNS requests handled by separate goroutines
    (controlled by --dnsproxy-concurrency-limit) to race if they are for the
    same DNS name, i.e. both goroutines handling cilium.io.
    
    Here is a diagram for this race:
    
    ```
                 G1                                    G2
    
    T0 --> NotifyOnDNSMsg()               NotifyOnDNSMsg()            <-- T0
    
    T1 --> UpdateGenerateDNS()            UpdateGenerateDNS()         <-- T1
    
    T2 ----> mutex.Lock()                 +---------------------------+
                                          |No identities need updating|
    T3 ----> mutex.Unlock()               +---------------------------+
    
    T4 --> UpsertGeneratedIdentities()    UpsertGeneratedIdentities() <-- T4
    
    T5 ---->  Upsert()                    DNS released back to pod    <-- T5
                                                       |
    T6 --> DNS released back to pod                    |
                 |                                     |
                 |                                     |
                 v                                     v
          Traffic flows fine                   Leads to policy drop
    ```
    
    Note how G2 releases the DNS msg back to the pod at T5 because
    UpdateGenerateDNS() was a no-op. It's a no-op because G1 had executed
    UpdateGenerateDNS() first at T1 and performed the necessary identity
    allocation for the response IPs. Due to G1 performing all the work
    first, G2 executes T4 also as a no-op and releases the msg back to the
    pod at T5 before G1 would at T6.
    
    The above results in a policy drop because it is assumed that when the
    DNS msg is released back to the pod / endpoint, then the ipcache has
    been updated with the proper IP <-> identity mapping. In other words,
    because G2 releases the DNS msg back to the pod / endpoint, the app
    attempts to use the IPs returned in the DNS response, but the ipcache
    lookup fails in the datapath because the IP <-> identity mappings don't
    exist. In essence, this is a race between DNS requests to the ipcache.
    
    To fix, we create a critical section when handling the DNS response IPs.
    Allocating identities and associating them to IPs in the ipcache is now
    atomic because of the critical section. This is especially important as
    we've already seen that multiple DNS requests can be in-flight for the
    same name (i.e. cilium.io).
    
    The critical section is implemented using a fixed-size array of
    mutexes. Why? Because the other options (listed below) all have flaws
    which may result in drops.
    
    Other options considered:
    
    1) A map of mutexes keyed on DNS name
    2) A map of mutexes keyed on set of DNS response IPs
    3) A map of mutexes keyed on endpoint ID
    4) A slice of mutexes with hash based on individual IPs from the set of
       DNS response IPs
    
    (1) could cause a race between the DNS response IPs from `a.com` and the
    set of DNS response IPs from `b.com` especially if the sets are the
    same.
    (2) could race between a super / subset of DNS response IPs, i.e. only
    an intersection of sets would have protection. Any IPs that are not in
    the intersection are vulnerable to race.
    (3) could cause a race between different endpoints querying the same
    name, i.e. two endpoints querying cilium.io.
    (4) could work, but this approach is not ideal as the slice scales
    with the number of unique IPs seen by the DNS proxy. This has memory
    consumption concerns.
    
    Therefore, we opt for a fixed-size array of mutexes where we hash the
    individual IPs from the set of DNS response IPs. For example, say we
    attempt to resolve cilium.io and the response IPs are A and B.
    
      hash(A) % mod = 8
      hash(B) % mod = 3
    
      where mod is --dnsproxy-lock-count, i.e. size of array
    
    What's returned is a slice containing the above array indices that's
    sorted in ascending order. This is the order in which the mutexes must
    be locked and unlocked. To be specific, [3, 8] is the sorted result
    and it means we acquire & release mutex at index 3 and then do the same
    for mutex at index 8.
    
    What this does is essentially serialize parallel DNS requests which
    involve the same IP <-> identity mappings in the ipcache.
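
    A minimal sketch of the scheme described above (FNV is assumed as the hash
    here; the commit does not name a specific hash function):

    ```go
    import (
        "hash/fnv"
        "net/netip"
        "sort"
        "sync"
    )

    var locks = make([]sync.Mutex, 128) // sized by --dnsproxy-lock-count

    // lockIndices hashes each response IP into the fixed-size mutex array and
    // returns the de-duplicated indices sorted in ascending order, which is
    // the order in which the mutexes must be acquired to avoid deadlock.
    func lockIndices(ips []netip.Addr) []int {
        seen := make(map[int]struct{}, len(ips))
        for _, ip := range ips {
            h := fnv.New32a()
            h.Write(ip.AsSlice())
            seen[int(h.Sum32()%uint32(len(locks)))] = struct{}{}
        }
        idx := make([]int, 0, len(seen))
        for i := range seen {
            idx = append(idx, i)
        }
        sort.Ints(idx)
        return idx
    }

    // Usage: lock in ascending order, upsert identities and ipcache entries,
    // then release:
    //
    //   idx := lockIndices(responseIPs)
    //   for _, i := range idx { locks[i].Lock() }
    //   // ... allocate identities, upsert ipcache ...
    //   for _, i := range idx { locks[i].Unlock() }
    ```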
    
    It's possible that the hash of the IPs collide (map to the same mutex)
    when the IPs are different, but this is OK because it's overprotecting,
    rather than underprotecting, which the other options above suffer from.
    Overprotecting results in a small performance hit because we are
    potentially serializing completely different / unrelated DNS requests,
    i.e. no intersection in the response IP sets.
    
    For --dnsproxy-lock-count, a default of 128 was chosen as a reasonable
    balance between memory usage and hash collisions, for most deployments
    of Cilium. An increased lock count will result in more memory usage, but
    faster DNS processing as there are fewer IP hash collisions, and
    therefore fewer mutexes to acquire. A decreased lock count will save
    memory at the cost of slower DNS processing as the IP hash collisions
    are more likely, and this tends towards behavior similar to a single,
    shared mutex. Users who need to tune this value because they have many
    parallel DNS requests (1000s) can make the trade-off of using more
    memory to avoid hash collisions. It's worth mentioning that the memory
    usage is for storing an instance of a mutex, which is 64 bits as of Go
    1.19 [1].
    
    Benchmark run comparing the code before this commit (no-lock), the
    fixed-size array reduced to a single entry to simulate a single mutex
    (single-lock), and this commit (multiple-locks):
    
    ```
    $ go test -v -tags integration_tests ./daemon/cmd -check.b -check.bmem -check.f DaemonFQDNSuite.Benchmark_notifyOnDNSMsg -test.benchtime 100x -test.count 8
    ```
    
    ```
    $ benchstat no-lock.txt single-lock.txt multiple-locks.txt
    name \ time/op    no-lock.txt  single-lock.txt  multiple-locks.txt
    _notifyOnDNSMsg   93.2µs ± 9%     179.6µs ± 3%        143.2µs ±28%
    
    name \ alloc/op   no-lock.txt  single-lock.txt  multiple-locks.txt
    _notifyOnDNSMsg   36.5kB ± 2%      35.3kB ± 0%         33.8kB ± 1%
    
    name \ allocs/op  no-lock.txt  single-lock.txt  multiple-locks.txt
    _notifyOnDNSMsg      485 ± 1%         474 ± 0%            453 ± 1%
    ```
    
    Reproducing the bug and performance testing on a real cluster included
    running the 3 different modes and comparing the average processing time
    of the DNS proxy code. The cluster was 2 nodes with 32 replicas of the
    following workload:
    
    ```
    # websites.txt contains different common URLs, some duplicated.
    while [[ true ]]; do
      curl --parallel --parallel-immediate --parallel-max 5 --config websites.txt
      sleep 0.25s
    done
    ```
    
    No lock:        4.61ms, 18.3ms
    Single lock:    7.42ms, 18.3ms
    Multiple locks: 5.25ms, 17.9ms
    
    [1]: https://cs.opensource.google/go/go/+/refs/tags/go1.19.3:src/sync/mutex.go;l=34
    
    Co-authored-by: Michi Mutsuzaki <michi@isovalent.com>
    Signed-off-by: Chris Tarazi <chris@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    2 people authored and joestringer committed Dec 21, 2022
    Commit: d5aca96
  13. policy: Precompute redirect type on Attach()

    [ upstream commit 9eb0bfb ]
    
    Optimize policy processing by precomputing the policy's redirect types
    when the policy is ready, to avoid multiple scans of the policy filters later.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 38041b1
  14. policy: Add ParserTypeCRD

    [ upstream commit 38589e7 ]
    
    Add a sticky parser type for CiliumEnvoyConfig CRDs. This will be used
    for policy based redirect to custom Envoy listeners.
    
    While the CRD parser type will be redirected to Envoy, it is generally
    handled by a custom Listener, which may not perform HTTP policy
    enforcement. Thus, this parser type is incompatible with HTTP rules.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 9430986
  15. policy: Cache redirect status on PerSelectorPolicy

    [ upstream commit 27861ae ]
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: c571a77
  16. envoy: Wait for Cluster before Listeners

    [ upstream commit 7b01459 ]
    
    Listener configuration can fail if we don't wait for clusters to be
    configured first, so wait for the clusters when both clusters and
    listeners are configured at the same time.
    
    While a failure like this was once seen in local testing of the custom
    listener support in a CNP, a similar failure could also happen for Cilium
    Ingress and Gateway API.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 1f8bbe5
  17. envoy: Inject Cilium Envoy filters to CEC with TcpProxy filter

    [ upstream commit 0036e24 ]
    
    Support injecting the Cilium network filter into TcpProxy filter chains
    as well. This enables SNI and other L3/L4 policy enforcement features on
    custom Envoy listeners applied via Cilium Envoy Config resources whose
    filter chains use the TcpProxy filter, in addition to the HTTP Connection
    Manager filter.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 3373c68
  18. envoy: Configure CEC Listeners without L7LB for policy redirection

    [ upstream commit 41fe8c8 ]
    
    Tell envoy.ParseResources() if the resources are intended for an L7 load
    balancer. If so, the Envoy filter chain is configured to not use the
    original source address and to use the socket mark as the source endpoint
    ID. This was previously the behavior for all CEC CRDs.
    
    When the CEC CRD does not contain a frontend service that is redirected
    to the defined listener, then the new 'isL7LB' flag is passed as
    'false'. In this case the Envoy filter chain is configured to allow use
    of the original source address, and mark the upstream socket with the
    numeric source security identity. This is also the way in which Cilium
    uses Envoy Listener filter chains when enforcing policy for HTTP, for
    example. In this mode, L4 LB is assumed to have taken place, if applicable,
    before traffic is redirected to the Envoy listener.
    
    Prior to this change, CEC CRDs were mostly only usable for L7 LB use cases,
    such as with Cilium Ingress. After this change, CEC CRDs without a Service
    redirect can be used in place of Cilium policy enforcement filter chains,
    if desired.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: c2ae5bc
  19. proxy: use egress in test as CRD proxy ports only work for egress for now
    
    [ upstream commit a679c40 ]
    
    Specify proxy port as egress (or not-ingress) in test cases as the
    datapath currently supports ProxyTypeCRD only for L7 LB which is always
    on the egress path.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 9906f06
  20. proxy: Do not create Envoy redirects for listeners defined in CEC CRDs

    [ upstream commit aff0655 ]
    
    Add a new no-op CRDRedirect type to be used with Envoy listeners defined
    in CEC CRDs. In this case the listeners already exist and new Envoy
    Listener resources do not need to be created for them.
    
    This is needed for the forthcoming policy feature where policy can refer
    to a Listener defined in CEC CRD.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 8b81a83
  21. proxy: Find CRD proxy ports by name instead of by type

    [ upstream commit 06ed0a4 ]
    
    Find CRD proxy port by name instead of type. This is needed for enabling
    CEC CRD defined listeners to be used in CNPs. Prior to this CRD proxy
    ports did not use this code path, which is only called from endpoint
    policy updates, so there was no need to find CRD proxy ports by name.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 0c4b8c2
  22. envoy: Move resourceQualifiedName() to policy/api

    [ upstream commit ede26fb ]
    
    Move resourceQualifiedName() to policy/api and export it so that it can
    be used in policy as well as in envoy package.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: b43c8e4
  23. policy: Add CRD Listener support for egress policies

    [ upstream commit 9159609 ]
    
    Add a `listener` field to CNP and CCNP that causes traffic to the
    specified port(s) to be redirected to the named Envoy listener. If the
    listener does not exist, the traffic is not allowed at all.
    
    When `listener.envoyConfig.kind` is left out it defaults to namespaced
    `CiliumEnvoyConfig` for rules in namespaced policies (CNP) or to
    cluster-scoped `CiliumClusterwideEnvoyConfig` for rules in cluster-scoped
    policies (CCNP). Namespaced policies can also refer to cluster-scoped
    listeners with an explicit `listener.envoyConfig.kind:
    CiliumClusterwideEnvoyConfig`. Cluster-scoped policies can not refer to
    namespaced listeners.
    
    Endpoint policies are regenerated whenever Envoy listeners change to
    update potential listener redirections in the bpf policy maps.
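
    A sketch of the corresponding policy API surface (names follow the commit
    description; the exact Go definitions are assumptions for illustration):

    ```go
    // EnvoyConfig points at the CEC/CCEC resource defining the listener.
    type EnvoyConfig struct {
        // Kind defaults to CiliumEnvoyConfig for namespaced policies (CNP)
        // and to CiliumClusterwideEnvoyConfig for cluster-scoped ones (CCNP).
        Kind string `json:"kind,omitempty"`
        Name string `json:"name"`
    }

    // Listener names an Envoy listener within the referenced resource.
    type Listener struct {
        EnvoyConfig *EnvoyConfig `json:"envoyConfig"`
        Name        string       `json:"name"`
    }
    ```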
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 93ba663
  24. test/helpers: Disambiguate executor logs

    [ upstream commit 527f5fd ]
    
    This makes it much easier to track down which code path the executor
    logs are coming from, so that when debugging issues, we can focus on the
    relevant code path.
    
    Signed-off-by: Chris Tarazi <chris@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    christarazi authored and joestringer committed Dec 21, 2022
    Commit: f2d411b
  25. test/helpers: Use switch case for readability

    [ upstream commit bf3fd5f ]
    
    Signed-off-by: Chris Tarazi <chris@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    christarazi authored and joestringer committed Dec 21, 2022
    Commit: 5e5b17b
  26. test/helpers: Fix retry condition for CiliumExecContext

    [ upstream commit dab8723 ]
    
    Previously, 11cb4d0 assumed that 137 was the exit code for when a
    process exits due to a SIGKILL. However, upon reading the Go source
    code as of 1.20 rc1, this is not the case: -1 is set as the exit code
    whenever a process is terminated by a signal [1].
    
    Fixes: 11cb4d0 ("test: Keep trying exec if killed")
    Fixes: #22570
    
    [1]:
    https://github.com/golang/go/blob/go1.20rc1/src/os/exec_posix.go#L128-L130
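
    A small standalone illustration of the Go behavior in question
    (ProcessState.ExitCode() is documented to return -1 when the process was
    terminated by a signal):

    ```go
    package main

    import (
        "fmt"
        "os/exec"
        "syscall"
        "time"
    )

    func main() {
        cmd := exec.Command("sleep", "60")
        _ = cmd.Start()
        time.Sleep(100 * time.Millisecond)
        _ = cmd.Process.Signal(syscall.SIGKILL)
        _ = cmd.Wait() // returns an error because the process was signaled
        // Prints -1, not 137: the 128+9 convention is a shell-ism, not Go's.
        fmt.Println(cmd.ProcessState.ExitCode())
    }
    ```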
    
    Signed-off-by: Chris Tarazi <chris@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    christarazi authored and joestringer committed Dec 21, 2022
    Commit: 27a2999
  27. datapath: Don't enforce policies at overlay if ep routes are enabled

    [ upstream commit e49ab12 ]
    
    When endpoint routes are enabled, we should enforce ingress policies at
    the destination lxc interface, to which a BPF program will be attached.
    Nevertheless, today, for packets coming from the overlay, we enforce
    ingress policies twice: once at the overlay interface (e.g. cilium_vxlan)
    and a second time at the lxc device.
    
    This is happening for two reasons:
      1. bpf_overlay is not aware of the endpoint routes settings so it
         doesn't even know that it's not responsible for enforcing ingress
         policies.
      2. We have a flag to force the enforcement of ingress policies at the
         source in this case. This flag exists for historic reasons that are
         not valid anymore.
    
    A separate patch will fix reason 2 above. This commit fixes reason 1
    by telling bpf_overlay to *not* enforce ingress policies when endpoint
    routes are enabled.
    
    Note that we do not support the case where some endpoints have endpoint
    routes enabled and others don't. If we did, additional logic would be
    required.
    
    Fixes: 3179a47 ("datapath: Support enable-endpoint-routes with encapsulation")
    Signed-off-by: Paul Chaignon <paul@cilium.io>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    pchaigno authored and joestringer committed Dec 21, 2022
    Commit: c9b877f
  28. bpf: Preserve overlay->lxc path with kube-proxy

    [ upstream commit 3d2ceaf ]
    
    The previous commit changed the packet handling on the path
    overlay->lxc to fix a bug. More precisely, when endpoint routes are
    enabled, we won't enforce ingress policies on both the overlay and the
    lxc devices but only on the latter.
    
    However, as a consequence of that patch, we don't go through the
    policy-only program in bpf_lxc and we therefore changed the way the
    packet is transmitted between overlay and lxc devices in some cases. As
    a summary of changes made in the previous path, consider the following
    table for the path overlay -> lxc.
    
    Before the previous patch:
    | Endpoint routes | Enforcement     | Path                 |
    |-----------------|-----------------|----------------------|
    | Enabled         | overlay AND lxc | bpf_redirect if KPR; |
    |                 |                 | stack otherwise      |
    | Disabled        | overlay         | bpf_redirect         |
    
    Now:
    | Endpoint routes | Enforcement | Path         |
    |-----------------|-------------|--------------|
    | Enabled         | lxc         | bpf_redirect |
    | Disabled        | overlay     | bpf_redirect |
    
    The previous patch intended to fix the enforcement to avoid the double
    policy enforcement, but it also changed the packet path in case endpoint
    routes are enabled.
    
    This patch now fixes this by adding the same exception we have in
    bpf_lxc to the l3.h logic we have. Hence, with the current patch, the
    table will look like:
    | Endpoint routes | Enforcement | Path                 |
    |-----------------|-------------|----------------------|
    | Enabled         | lxc         | bpf_redirect if KPR; |
    |                 |             | stack otherwise      |
    | Disabled        | overlay     | bpf_redirect         |
    
    I've kept this in a separate commit from the previous one in an attempt
    to split up the logic and more clearly show the deltas.
    
    Signed-off-by: Paul Chaignon <paul@cilium.io>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    pchaigno authored and joestringer committed Dec 21, 2022
    Commit: 549cf7a
  29. gh/workflows: Enable encryption tests in conformance DP

    [ upstream commit 843c072 ]
    
    The encryption test details -
    cilium/cilium-cli#1241.
    
    Signed-off-by: Martynas Pumputis <m@lambda.lt>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    brb authored and joestringer committed Dec 21, 2022
    Commit: 4e3097e
  30. CRDs: add CiliumNodeConfig CRD and scaffolding

    [ upstream commit 884ccc1 ]
    
    This generates the CiliumNodeConfig type, a new way to set
    configuration overrides on a set of nodes.
    
    Signed-off-by: Casey Callendrello <cdc@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    squeed authored and joestringer committed Dec 21, 2022
    Commit: cf53659
  31. cmd: add build-config command

    [ upstream commit 6c62ce6 ]
    
    The build-config command is responsible for doing configuration
    resolution on the node.
    
    By default, it retrieves ConfigMaps, CiliumConfigOverrides, and Node
    objects. However, the list of sources is customizable. The intended
    usage is to allow administrators to roll changes out in a controlled
    manner to a cluster.
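
    Conceptually, the resolution step is an ordered merge where later sources
    override earlier ones per key; a minimal sketch of that idea (the real
    command's source handling is more involved):

    ```go
    // mergeConfig applies sources in priority order; a key set by a later
    // source (e.g. a per-node override) wins over an earlier one (e.g. the
    // cluster-wide ConfigMap).
    func mergeConfig(sources ...map[string]string) map[string]string {
        resolved := make(map[string]string)
        for _, src := range sources {
            for k, v := range src {
                resolved[k] = v
            }
        }
        return resolved
    }
    ```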
    
    This implements the proposed "Per-node configuration overrides" feature.
    
    Signed-off-by: Casey Callendrello <cdc@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    squeed authored and joestringer committed Dec 21, 2022
    Commit: a43ff3d
  32. helm: wire in the config-builder

    [ upstream commit 0dbde96 ]
    
    This changes the agent daemonset to use the config-builder rather than
    directly reading from the ConfigMap. This adds an initContainer that
    does the configuration resolution and writes to a temporary directory.
    
    Signed-off-by: Casey Callendrello <cdc@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    squeed authored and joestringer committed Dec 21, 2022
    Commit: df26172
  33. Documentation: add additional place where CRDs must be referenced

    [ upstream commit 1ef7889 ]
    
    The existing documentation missed a point where new CRDs need to be
    added.
    
    Signed-off-by: Casey Callendrello <cdc@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    squeed authored and joestringer committed Dec 21, 2022
    Commit: 2e1878c
  34. install/kubernetes: label all RBAC objects

    [ upstream commit 08b33b2 ]
    
    This is so we can delete them as part of cleanup. Also, it's good
    practice, especially for non-namespaced objects (ClusterRole /
    ClusterRoleBinding).
    
    Signed-off-by: Casey Callendrello <cdc@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    squeed authored and joestringer committed Dec 21, 2022
    Commit: 8bd4a55
  35. .github: manually clean up RBAC artifacts

    [ upstream commit 43cb8e9 ]
    
    [ Backporter's notes: Dropped changes to files deleted from v1.13 tree. ]
    
    We just need this until cilium/cilium-cli#1257
    is fixed.
    
    The problem is that, right now, reinstalling after "cilium uninstall" is
    broken.
    
    Signed-off-by: Casey Callendrello <cdc@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    squeed authored and joestringer committed Dec 21, 2022
    Commit: 8e2eb10
  36. install/kubernetes: make securityContext SELinux options configurable

    [ upstream commit 76837ea ]
    
    Make the hardcoded SELinux options in the helm charts configurable.
    
    Fixes #22703
    
    Signed-off-by: Tobias Klauser <tobias@cilium.io>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    tklauser authored and joestringer committed Dec 21, 2022
    Commit: d57f773
  37. bpf: add drop notification for missed L7 LB tailcall in to-netdev

    [ upstream commit 49db467 ]
    
    cil_to_netdev() is the upper-most function in the program. So don't just
    return a raw DROP reason to the kernel, but translate it to CTX_ACT_DROP
    and raise a drop notification.
    
    Fixes: d1d8e7a ("datapath: Add support for re-entering LXC egress path after L7 LB")
    Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    julianwiedmann authored and joestringer committed Dec 21, 2022
    Commit: 341f444
  38. backporting: leave backport/author PRs alone

    [ upstream commit 3961355 ]
    
    This label means that the backport will be performed by the author, and
    thus the backporting automation should ignore it.
    
    It's okay to not filter by backport-done/<branch> because these labels
    should not be present at the same time. (Notably, if resiliency was the
    intention, we should already be filtering backport-pending too.)
    
    Signed-off-by: David Bimmler <david.bimmler@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    bimmlerd authored and joestringer committed Dec 21, 2022
    Commit: 07308b6
  39. cilium: Add deprecation warning for service ids

    [ upstream commit e50bf2e ]
    
    The NodePort service frontends are currently expanded early in
    the K8sWatcher and the IDs used by datapath are allocated for
    these expanded frontends in pkg/service. A NodePort frontend is
    created for the Node IP and other routable IPs on the system.
    As this is now something that can be reconfigured at runtime when
    devices change, we would like to make the frontend expansion and the
    service and backend identifiers implementation details of the datapath
    and not expose them to the user via the REST API (PUT /service/{id}) or
    the "cilium service update" command.
    
    In v1.14 we will implement this by changing the {id} from int into a string
    (something like "1.2.3.4:80:TCP" or "[f00d::1]:80:TCP").
    
    We're expecting this change to only affect standalone load-balancer users
    that are using the "cilium service" commands directly. We do not expect there
    to be direct use of the "/service/" REST endpoints. Based on this we deem the
    backwards incompatible change to the type of the {id} parameter acceptable.
    
    In order to give advance warning to the users, this commit adds deprecation
    warnings to the cilium-agent logs when the /service endpoints are used and to
    the cilium command-line utility when "cilium service update" or
    "cilium service delete" is used.
    
    Signed-off-by: Jussi Maki <jussi@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    joamaki authored and joestringer committed Dec 21, 2022
    Commit: 8f9d8a2
  40. fqdn: dnsproxy: fix data race in dns proxy implementation

    [ upstream commit 06c5754 ]
    
    A recent commit patched dnsproxy to configure a net.Dialer for every outgoing
    request. However, the dialer was assigned to a single shared copy of dns.Client,
    which led to data corruption. Create a new dns.Client for each request, so
    that state is not shared between threads.
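
    A sketch of the pattern using the miekg/dns API (the function name and
    surrounding plumbing are illustrative):

    ```go
    import (
        "net"

        "github.com/miekg/dns"
    )

    // forward builds a fresh dns.Client per request instead of mutating a
    // shared client's Dialer, so no state is shared between goroutines.
    func forward(req *dns.Msg, upstream string, dialer *net.Dialer) (*dns.Msg, error) {
        client := &dns.Client{
            Net:    "udp",
            Dialer: dialer, // per-request dialer, e.g. pinning the original source address
        }
        reply, _, err := client.Exchange(req, upstream)
        return reply, err
    }
    ```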
    
    Fixes: cf3cc16 ("fqdn: dnsproxy: fix forwarding of the original security identity for TCP")
    
    Signed-off-by: Anton Protopopov <aspsk@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    aspsk authored and joestringer committed Dec 21, 2022
    Commit: 788c30d
  41. Add tests for hubble metrics handlers: Drop, Tcp, PortDistribution.

    [ upstream commit 00eb46e ]
    
    Signed-off-by: Marek Chodor <mchodor@google.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    marqc authored and joestringer committed Dec 21, 2022
    Commit: f449491
  42. test: service: fix formatting of error msg in doFragmentedRequest()

    [ upstream commit 965ca0d ]
    
    Fix up the string formatting to include the `dstIP` parameter.
    
    Spotted in a Jenkins run:
    
    /home/jenkins/workspace/Cilium-PR-K8s-1.26-kernel-net-next/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:515
    Failed to account for INGRESS IPv4 fragments in BPF metrics%!(EXTRA string=10.100.188.224)
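
    The `%!(EXTRA ...)` suffix is what fmt produces when an argument has no
    matching verb; the fix is simply to add one. A standalone illustration
    (values taken from the log above):

    ```go
    package main

    import "fmt"

    func main() {
        dstIP := "10.100.188.224"
        // Missing verb for dstIP: fmt appends "%!(EXTRA string=10.100.188.224)".
        fmt.Println(fmt.Sprintf("Failed to account for INGRESS IPv4 fragments in BPF metrics", dstIP))
        // Fixed: dstIP now has a matching verb.
        fmt.Println(fmt.Sprintf("Failed to account for INGRESS IPv4 fragments in BPF metrics %s", dstIP))
    }
    ```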
    
    Fixes: 938b494 ("bpf: add metrics for fragmented IPv4 packets")
    Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    julianwiedmann authored and joestringer committed Dec 21, 2022
    Commit: 2eb4486
  43. ci: Replace deprecated hubble observe -o json with -o jsonpb

    [ upstream commit 31093c1 ]
    
    This commit replaces the use of the deprecated `-o json` flag on Hubble
    observe with `-o jsonpb`. Future versions of Hubble will make `-o json`
    an alias to `-o jsonpb`, so this commit should be future-proof.
    
    The new output wraps each flow in a `GetFlowsResponse`, meaning the old
    object is now accessible in the `.flow` attribute. All jsonpath queries
    in the code have been changed to reflect this change.
    
    Notably, some parts of CI (namely the `hubble-flows-*.json` files used
    for troubleshooting CI flakes) already used `-o jsonpb` and thus this
    commit should not cause any change in usual the CI troubleshooting
    workflow.
    
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    gandro authored and joestringer committed Dec 21, 2022
    Commit: db33ae7
  44. Add sphinxcontrib-googleanalytics to doc requirements

    [ upstream commit 4bc5629 ]
    
    Signed-off-by: Patrice Chalin <chalin@cncf.io>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    chalin authored and joestringer committed Dec 21, 2022
    Commit: 7660245
  45. daemon/cmd: improve stale cilium endpoint error handling.

    [ upstream commit cec554b ]
    
    A CEP that is already gone by the time cleanup occurs (but may still
    be attempted to be cleaned up due to it still being in Cilium's k8s cache)
    should be skipped.
    This happens occasionally in CI as Pods/CEPs are deleted in close
    proximity to Cilium agent restarts.
    Logging an error was causing the tests to fail unnecessarily.
    This captures the NotFound case separately and instead logs an info
    message.
    
    Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    tommyp1ckles authored and joestringer committed Dec 21, 2022
    Commit: 2dad8a6
  46. policy: Refactor per-selector policy handling

    [ upstream commit 67b1460 ]
    
    Rename L7RulesPerSelector in L4Filter as PerSelectorPolicies, as the map
    values already hold more than just the L7 rules (e.g., if the L3 or L4
    rule is a deny rule or not). Following commits will add more non-L7
    related fields, which would be even more confusing with the old name.
    
    The JSON name 'l7-rules' is kept for backwards compatibility, though we
    should consider whether backwards compatibility is actually needed here.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: f138d3d
  47. policy: Add Auth member to CNP ingress and egress

    [ upstream commit df0eebc ]
    
    Add optional Auth object to L3 policy (ingress/egress) rules.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: aaf5b82
  48. bpf: Add drop due to missing authentication

    [ upstream commit 9882a49 ]
    
    Add new drop reason for missing authentication.
    
    Pass authentication type as extended error in the drop event.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 733077d
  49. examples: Add auth policy example

    [ upstream commit e4685f2 ]
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: c2dee09
  50. bpf: Make proxy_port an explicit parameter to policy lookup

    [ upstream commit be21f88 ]
    
    Later on we need to be able to return both a drop reason and a proxy
    port, so make the proxy port an explicit parameter rather than folding it into
    the function return value.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: f45dca4
  51. bpf: Flip emit_policy_verdict logic

    [ upstream commit a7a1add ]
    
    Set 'emit_policy_verdict' to 'true' only if policy lookup is performed.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: a14cac2
  52. bpf: Skip policy enforcement for return traffic

    [ upstream commit c9d40dd ]
    
    Policy lookup results were already ignored for return traffic, so we can
    skip policy lookup for CT_REPLY and CT_RELATED altogether. This allows
    simplification of the policy verdict reporting logic as well.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 2f06347
  53. bpf: Use ENABLE_L7_LB in complexity test

    [ upstream commit a3f6796 ]
    
    Add -DENABLE_L7_LB=1 to MAX_BASE_OPTIONS to pull that code in for load testing.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 037fdd5
  54. bpf: Add CT map flag for auth required

    [ upstream commit 1f7ed72 ]
    
    Add a new conntrack flag for auth required, and create CT entries for
    connections that are dropped due to not being authenticated yet.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 17bb9c5
  55. bpf: Enforce zero dst_id for egress

    [ upstream commit a246aa0 ]
    
    The drop notification 'dst_id' is non-zero only for ingress. By enforcing
    this, we can distinguish ingress from egress based on this property when
    receiving drop notifications.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: db8e6c8
  56. auth: Update conntrack entry

    [ upstream commit fd24b10 ]
    
    Tell the datapath that the connection is authenticated by clearing the
    auth_required flag on the conntrack entry.
    
    Signed-off-by: Jarno Rajahalme <jarno@isovalent.com>
    Signed-off-by: Joe Stringer <joe@cilium.io>
    jrajahalme authored and joestringer committed Dec 21, 2022
    Commit: 42d0880