Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.13 Backports 2024-03-19 #31496

Merged
merged 9 commits into from Mar 21, 2024
Merged

Commits on Mar 19, 2024

  1. Handle InvalidParameterValue as well for PD fallback

    [ upstream commit 5a487b5 ]
    
    #30536 prematurely concluded that
    AWS now uses InsufficientCidrBlocks to indicate the subnet is out of
    prefixes. Looks like AWS still uses InvalidParameterValue and "There
    aren't sufficient free Ipv4 addresses or prefixes" to indicate subnet is
    at capacity. In addition to this InsufficientCidrBlocks is returned
    when subnet is at capacity potentially due to fragmentation. In either
    case, it's worth trying to fallback since /32 IPs might still be
    available compared to /28. See PR for details from AWS support ticket.
    
    Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    hemanthmalla authored and gandro committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    0b0a11c View commit details
    Browse the repository at this point in the history
  2. Adding unit test for PD fallback

    [ upstream commit 0007e35 ]
    [ backporter notes: Minor import conflict in pkg/aws/ec2/mock/mock.go
      and replaced `Upsert` with `Update` in node_manager_test.go ]
    
    Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    hemanthmalla authored and gandro committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    ff032e0 View commit details
    Browse the repository at this point in the history
  3. doc: Clarified GwAPI KPR prerequisites

    [ upstream commit 1321e03 ]
    [ backporter notes: Adjusted wording for v1.13 ]
    
    Before the GwAPI doc listed KPR mode as a prerequisite. However,
    it's actually only required to enable BPF nodePort support.
    
    Signed-off-by: Philip Schmid <philip.schmid@isovalent.com>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    PhilipSchmid authored and gandro committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    06d4778 View commit details
    Browse the repository at this point in the history
  4. ci: Bump lvh-kind ssh-startup-wait-retries

    [ upstream commit a5eafe0 ]
    
    Recently, we frequently see the CI failure with lvh-kind startup
    failure with exit code 41. This indicates the timeout of the task
    waiting for the SSH startup. Bump the timeout (retry) to 600 (10min) as
    a workaround.
    
    Fixes: #31336
    
    Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    YutaroHayakawa authored and gandro committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    155e07a View commit details
    Browse the repository at this point in the history
  5. ipam: fix azure ipam test panics due to shared pointers.

    [ upstream commit 1ca6141 ]
    
    pkg/ipam/types.(*InstanceMap).DeepCopy(...) will iterate for all instances/interfaces in order to copy the data.
    However, unlike what the name suggests, underlying instance pkg/ipam/types.Interface pointers are copied and shared in the returned instance map.
    In some cases, this case result in memory corruption issues resulting in confusing panics while running tests such as:
    
    ```
    panic: runtime error: makeslice: cap out of range
    goroutine 1366 [running]:
    strconv.appendQuotedWith({0xc000576208, 0x0, 0x3f?}, {0x1000000, 0x100000001000000}, 0x22, 0x0, 0x0)
    	/opt/hostedtoolcache/go/1.22.0/x64/src/strconv/quote.go:35 +0x85
    strconv.AppendQuote(...)
    
    ...
    ```
    
    Capturing such an event in a debugger you would see a AzureInterface struct such as this with the IP string memory being corrupt (likely due to an interleaved read/write) being passed to logrus causing a crash.
    
    ```
    github.com/cilium/cilium/pkg/azure/types.AzureAddress {
    	IP: "\x10\x01\x00\x00\x00\x00\x00\x00\x10\x01\x00\x00\x00\x00\x00\x007d�_\x02\b\b\x19\x00\x00\x00\x00\x00\x00\x00\x00�\x1f�\x03\x00\x00\x00\x00W�\b\x00\x00\x00\x00\x00\x00p \x03\x00\x00\x00\x00�qi\x03\x00\x00\x00\x00...+51559946186908214 more",
    	Subnet: "subsys",
    	State: "instanceID",}
    ```
    This ensures that the revision interface is correctly deepcopied such that the underlying resource is also safely copied.
    
    Fixed: #31059
    
    Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    tommyp1ckles authored and gandro committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    6639022 View commit details
    Browse the repository at this point in the history
  6. gateway-api: Retrieve LB service from same namespace

    [ upstream commit e8bed8d ]
    [ backporter notes: conflicts in unit tests. Had to change introduce
      the `NoError` check instead of expecting a specific error ]
    
    This commit is to add the same namespace while listing generated LB
    service, the main reason is to avoid wrong status update for gateways
    having the same name, but belonged to different namespace.
    
    Testing was done locally as per below:
    
    Before the fix:
    ```
    $ kg gateway -A
    NAMESPACE   NAME         CLASS    ADDRESS          PROGRAMMED   AGE
    another     my-gateway   cilium   10.110.222.237   True         4s
    default     my-gateway   cilium   10.110.222.237   True         56s
    ```
    
    After the fix:
    
    ```
    $ kg gateway -A
    NAMESPACE   NAME         CLASS    ADDRESS          PROGRAMMED   AGE
    another     my-gateway   cilium   10.102.170.180   True         14m
    default     my-gateway   cilium   10.110.222.237   True         14m
    
    $ kg services -A
    NAMESPACE     NAME                           TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)                      AGE
    another       cilium-gateway-my-gateway      LoadBalancer   10.102.170.180   10.102.170.180   80:31424/TCP                 15m
    default       cilium-gateway-my-gateway      LoadBalancer   10.110.222.237   10.110.222.237   80:31889/TCP                 15m
    ...
    ```
    
    Fixes: #31270
    Signed-off-by: Tam Mach <tam.mach@cilium.io>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    sayboras authored and gandro committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    a2e3596 View commit details
    Browse the repository at this point in the history
  7. gha: disable fail-fast on integration tests

    [ upstream commit 0fb203e ]
    
    So that the failure of one matrix entry (e.g., caused by a flake) doesn't
    cancel the other ongoing tests, if any.
    
    Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    giorio94 authored and gandro committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    5826983 View commit details
    Browse the repository at this point in the history
  8. docs: Warn on key rotations during upgrades

    [ upstream commit b639eab ]
    
    In general, it is not recommended to carry several admin. operations on
    the cluster at the same time, as it can make troubleshooting in case of
    issues a lot more complicated. Mixing operations is also less likely to
    be covered in CI so more likely to hit corner cases.
    
    Performing IPsec key rotations during Cilium up/downgrades is one such
    case. Let's document it explicitly to discourage users from doing that.
    
    Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    pchaigno authored and gandro committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    771f2c9 View commit details
    Browse the repository at this point in the history
  9. hubble: fix traffic direction and reply on encrypted trace notifications

    [ upstream commit 9939fa2 ]
    [ backporter notes: No SRV6 support on v1.13, I removed those test cases ]
    
    Before this patch, Hubble would wrongly report known traffic direction
    and reply status when IPSec was enabled.
    
    Signed-off-by: Alexandre Perrin <alex@isovalent.com>
    Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
    kaworu authored and gandro committed Mar 19, 2024
    Configuration menu
    Copy the full SHA
    515d1dc View commit details
    Browse the repository at this point in the history