Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc: Ensure ConfigMap remains compatible across 1.7 -> 1.8 upgrade #12097

Merged
merged 1 commit into from Jun 18, 2020

Conversation

tgraf
Copy link
Member

@tgraf tgraf commented Jun 16, 2020

No description provided.

@tgraf tgraf requested review from a team as code owners June 16, 2020 12:04
@tgraf tgraf requested a review from a team June 16, 2020 12:04
@maintainer-s-little-helper
Copy link

Please set the appropriate release note label.

@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.8.0 Jun 16, 2020
@tgraf tgraf marked this pull request as draft June 16, 2020 12:04
@tgraf tgraf added kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium. labels Jun 16, 2020
@coveralls
Copy link

coveralls commented Jun 16, 2020

Coverage Status

Coverage decreased (-0.01%) to 37.06% when pulling 3886ce5 on pr/tgraf/fix-upgrade-compatibility into 989eced on master.

@tgraf tgraf force-pushed the pr/tgraf/fix-upgrade-compatibility branch from 008f502 to 805a16f Compare June 16, 2020 12:57
@tgraf tgraf force-pushed the pr/tgraf/fix-upgrade-compatibility branch 2 times, most recently from 10827bd to 4d354ea Compare June 16, 2020 13:06
@tgraf
Copy link
Member Author

tgraf commented Jun 16, 2020

test-me-please

@tgraf tgraf marked this pull request as ready for review June 16, 2020 13:24
Documentation/install/upgrade.rst Show resolved Hide resolved
Documentation/install/upgrade.rst Show resolved Hide resolved
Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we be updating the K8sUpdates test to use this new method?

@joestringer
Copy link
Member

Some of the failures look likely related to disabling CNP updates:

Error creating resource /home/jenkins/workspace/Cilium-PR-K8s-newest-kernel-4.9/k8s-1.18-gopath/src/github.com/cilium/cilium/test/k8sT/manifests/l7-policy-demo.yaml: Timed out while waiting CNP to be enforced: 4m0s timeout expired
Expected
    <*errors.errorString | 0xc000566390>: {
        s: "Timed out while waiting CNP to be enforced: 4m0s timeout expired",
    }
to be nil

@tgraf
Copy link
Member Author

tgraf commented Jun 16, 2020

Some of the failures look likely related to disabling CNP updates:

This construct still doesn't work:

    enable-session-affinity: {{ default $defaultSessionAffinity .Values.sessionAffinity | quote }}

I think we have to write it as:

  {{- if hasKey .Values "sessionAffinity" }}
    enable-session-affinity: {{ .Values.sessionAffinity | quote }}
  {{- else if eq $defaultSessionAffinity "true" }}
    enable-session-affinity: {{ $defaultSessionAffinity | quote }}
  {{- end }}

default true $var will always result in true, regardless of whether $var is true, false, or empty/unset.

@tgraf tgraf force-pushed the pr/tgraf/fix-upgrade-compatibility branch 2 times, most recently from 5f78ae9 to 282d1df Compare June 16, 2020 20:23
@tgraf tgraf requested a review from aanm June 16, 2020 20:25
@tgraf
Copy link
Member Author

tgraf commented Jun 16, 2020

Shouldn't we be updating the K8sUpdates test to use this new method?

That's a fantastic idea!

@tgraf tgraf requested a review from joestringer June 16, 2020 20:38
@tgraf tgraf force-pushed the pr/tgraf/fix-upgrade-compatibility branch from 282d1df to 4d6d8b3 Compare June 16, 2020 20:38
@tgraf
Copy link
Member Author

tgraf commented Jun 17, 2020

test-me-please

@tgraf tgraf force-pushed the pr/tgraf/fix-upgrade-compatibility branch from 421d240 to f692c56 Compare June 17, 2020 12:35
@tgraf
Copy link
Member Author

tgraf commented Jun 17, 2020

test-me-please

@tgraf tgraf added the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Jun 17, 2020
@tgraf
Copy link
Member Author

tgraf commented Jun 17, 2020

Will rebase after tests have passed

When running with option `--set upgradeCompatibility=1.7`, then the diff
between the ConfigMaps is:

```
@@ -60,8 +60,7 @@
   #
   # Only effective when monitor aggregation is set to "medium" or higher.
   monitor-aggregation-flags: all
-
-  # ct-global-max-entries-* specifies the maximum number of connections
+  # bpf-ct-global-*-max specifies the maximum number of connections
   # supported across all endpoints, split by protocol: tcp or other. One pair
   # of maps uses these values for IPv4 connections, and another pair of maps
   # use these values for IPv6 connections.
@@ -71,10 +70,9 @@
   # policy drops or a change in loadbalancing decisions for a connection.
   #
   # For users upgrading from Cilium 1.2 or earlier, to minimize disruption
-  # during the upgrade process, comment out these options.
+  # during the upgrade process, set bpf-ct-global-tcp-max to 1000000.
   bpf-ct-global-tcp-max: "524288"
   bpf-ct-global-any-max: "262144"
-
   # bpf-policy-map-max specified the maximum number of entries in endpoint
   # policy map (per endpoint)
   bpf-policy-map-max: "16384"
@@ -140,9 +138,6 @@
   install-iptables-rules: "true"
   auto-direct-node-routes: "false"
   kube-proxy-replacement:  "probe"
-  enable-host-reachable-services: "false"
-  enable-external-ips: "false"
-  enable-node-port: "false"
   node-port-bind-protection: "true"
   enable-auto-protect-node-port-range: "true"
   enable-endpoint-health-checking: "true"
```

When running without upgradeCompatibility, the diff is:

```
@@ -43,6 +43,7 @@
   # Enable IPv6 addressing. If enabled, all endpoints are allocated an IPv6
   # address.
   enable-ipv6: "false"
+  enable-bpf-clock-probe: "true"

   # If you want cilium monitor to aggregate tracing for packets, set this level
   # to "low", "medium", or "maximum". The higher the level, the less packets
@@ -60,24 +61,12 @@
   #
   # Only effective when monitor aggregation is set to "medium" or higher.
   monitor-aggregation-flags: all
-
-  # ct-global-max-entries-* specifies the maximum number of connections
-  # supported across all endpoints, split by protocol: tcp or other. One pair
-  # of maps uses these values for IPv4 connections, and another pair of maps
-  # use these values for IPv6 connections.
-  #
-  # If these values are modified, then during the next Cilium startup the
-  # tracking of ongoing connections may be disrupted. This may lead to brief
-  # policy drops or a change in loadbalancing decisions for a connection.
-  #
-  # For users upgrading from Cilium 1.2 or earlier, to minimize disruption
-  # during the upgrade process, comment out these options.
-  bpf-ct-global-tcp-max: "524288"
-  bpf-ct-global-any-max: "262144"
-
   # bpf-policy-map-max specified the maximum number of entries in endpoint
   # policy map (per endpoint)
   bpf-policy-map-max: "16384"
+  # Specifies the ratio (0.0-1.0) of total system memory to use for dynamic
+  # sizing of the TCP CT, non-TCP CT, NAT and policy BPF maps.
+  bpf-map-dynamic-size-ratio: "0.0025"

   # Pre-allocation of map entries allows per-packet latency to be reduced, at
   # the expense of up-front memory allocation for the entries in the maps. The
@@ -136,18 +125,24 @@
   wait-bpf-mount: "false"

   masquerade: "true"
+  enable-bpf-masquerade: "true"
   enable-xt-socket-fallback: "true"
   install-iptables-rules: "true"
   auto-direct-node-routes: "false"
   kube-proxy-replacement:  "probe"
-  enable-host-reachable-services: "false"
-  enable-external-ips: "false"
-  enable-node-port: "false"
   node-port-bind-protection: "true"
   enable-auto-protect-node-port-range: "true"
+  enable-session-affinity: "true"
+  k8s-require-ipv4-pod-cidr: "true"
+  k8s-require-ipv6-pod-cidr: "false"
   enable-endpoint-health-checking: "true"
   enable-well-known-identities: "false"
   enable-remote-node-identity: "true"
+  operator-api-serve-addr: "127.0.0.1:9234"
+  ipam: "cluster-pool"
+  cluster-pool-ipv4-cidr: "10.0.0.0/8"
+  cluster-pool-ipv4-mask-size: "24"
+  disable-cnp-status-updates: "true"
```

Signed-off-by: Thomas Graf <thomas@cilium.io>
@tgraf tgraf force-pushed the pr/tgraf/fix-upgrade-compatibility branch from f692c56 to 3886ce5 Compare June 17, 2020 17:46
@tgraf
Copy link
Member Author

tgraf commented Jun 17, 2020

test-me-please

1 similar comment
@tgraf
Copy link
Member Author

tgraf commented Jun 17, 2020

test-me-please

@tgraf
Copy link
Member Author

tgraf commented Jun 17, 2020

Single test failed:

Suite-k8s-1.17.K8sServicesTest Bookinfo Demo Tests bookinfo demo

This is a completely new test failure so it is likely related to the PR.

@christarazi
Copy link
Member

It looks it hit a newly discovered flake: #12130

It is passing locally for me.

@tgraf
Copy link
Member Author

tgraf commented Jun 17, 2020

retest-4.19

@tgraf tgraf merged commit 8ec15e2 into master Jun 18, 2020
1.8.0 automation moved this from In progress to Merged Jun 18, 2020
@tgraf tgraf deleted the pr/tgraf/fix-upgrade-compatibility branch June 18, 2020 07:04
@tgraf tgraf added needs-backport/1.8 and removed dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. labels Jun 18, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.8.0 Jun 18, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.8 in 1.8.0 Jun 18, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.8 to Backport done to v1.8 in 1.8.0 Jun 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug This is a bug in the Cilium logic. release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
1.8.0
  
Merged
1.8.0
Backport done to v1.8
Development

Successfully merging this pull request may close these issues.

None yet

9 participants