Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: Use correct agent value in preflight check #14393

Merged
merged 1 commit into from
Dec 15, 2020

Conversation

gandro
Copy link
Member

@gandro gandro commented Dec 14, 2020

The K8sUpdates deploys the Helm preflight check on top of an existing
older Cilium installation. To deploy the preflight check, the agent
must be disabled. The flag to disable the agent has been renamed in the
Cilium 1.9 charts. Therefore the upgrade test needs to set the proper
value.

This fixes an issue with the upgrade test on GKE where it fails as
follows:

coalesce.go:165: warning: skipped value for agent: Not a table.
Error: rendered manifests contain a resource that already exists. Unable to continue with install: existing resource conflict: kind: ResourceQuota, namespace: cilium, name: cilium-resource-quota

I have been able to pass the K8sUpdates test with this commit manually on GKE against the master branch.

The `K8sUpdates` deploys the Helm preflight check on top of an existing
older Cilium installation. To deploy the preflight check, the agent
must be disabled. The flag to disable the agent has been renamed in the
Cilium 1.9 charts. Therefore the upgrade test needs to set the proper
value.

This fixes an issue with the upgrade test on GKE where it fails as
follows:

```
coalesce.go:165: warning: skipped value for agent: Not a table.
Error: rendered manifests contain a resource that already exists. Unable to continue with install: existing resource conflict: kind: ResourceQuota, namespace: cilium, name: cilium-resource-quota
```

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@gandro gandro added area/CI Continuous Integration testing issue or flake release-note/ci This PR makes changes to the CI. labels Dec 14, 2020
@gandro gandro requested a review from a team December 14, 2020 11:02
@gandro gandro requested a review from a team as a code owner December 14, 2020 11:02
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Dec 14, 2020
@gandro gandro requested a review from nebril December 14, 2020 11:02
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.9.2 Dec 14, 2020
@gandro
Copy link
Member Author

gandro commented Dec 14, 2020

test-me-please

@gandro
Copy link
Member Author

gandro commented Dec 14, 2020

GKE failed, maybe due to the Google outage. The symptom is tests failing with Error: rendered manifests contain a resource that already exists. Unable to continue with install: existing resource conflict: kind: Secret, namespace: cilium, name: hubble-server-certs, but I believe that's due to some clean-up routine getting sigkilled, as this would have had to occur previously. Restarting GKE.

https://jenkins.cilium.io/job/Cilium-PR-K8s-GKE/3661/

@gandro
Copy link
Member Author

gandro commented Dec 14, 2020

retest-gke

Copy link
Member

@aanm aanm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM but I'm wondering about all other options

@gandro
Copy link
Member Author

gandro commented Dec 14, 2020

Changes LGTM but I'm wondering about all other options

What other options? I did quickly skim over the other options in the map I modified, none of them have changed between 1.8 and 1.9. Did you have anything specific in mind?

Edit: Discussed offline. This was about the other values in the map, which were unaffected by the 1.9 Helm change.

@gandro gandro added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Dec 14, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Dec 14, 2020
@aanm aanm merged commit f3fa271 into master Dec 15, 2020
@aanm aanm deleted the pr/gandro/ci-upgrade-tests-use-correct-agent-value branch December 15, 2020 09:37
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.9 in 1.9.2 Dec 15, 2020
@joestringer
Copy link
Member

Looks like this was needed on the v1.9 branch, but we should be able to remove it on master when we bump the upgrade test to upgrade from v1.9 to master. (Along with convertOptionsToLegacyOptions() which is the generic code that handles this type of change).

@gandro
Copy link
Member Author

gandro commented Dec 16, 2020

Looks like this was needed on the v1.9 branch, but we should be able to remove it on master when we bump the upgrade test to upgrade from v1.9 to master. (Along with convertOptionsToLegacyOptions() which is the generic code that handles this type of change).

Thanks for pointing out convertOptionsToLegacyOptions(), I was not aware of that!

I'm not sure I fully understand your comment however: The preflight check (which on master is currently deployed with the 1.9.90 chart) does also need agent instead of agent.enabled.

Is your comment solely referring to the if-logic? In that case yes, arguably, we don't need the if logic on master, assuming InstallAndValidateCiliumUpgrades is never called with newHelmChartVersion != helpers.CiliumLatestHelmChartVersion. I was not sure this assumption always holds, which is why I added logic.

gandro added a commit that referenced this pull request Dec 21, 2020
This removes the quarantine check for the K8sUpdates test on GKE. The
issues with upgrade tests had on GKE were addressed in #14187, #14294,
 #14295, and #14393.

The last fix was merged 6 days ago. The quarantined GKE pipeline had
since no failures that look related to the issues with DNS that we
previously observed.

Fixes: #13833
Fixes: #13898

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
vadorovsky pushed a commit that referenced this pull request Dec 21, 2020
This removes the quarantine check for the K8sUpdates test on GKE. The
issues with upgrade tests had on GKE were addressed in #14187, #14294,
 #14295, and #14393.

The last fix was merged 6 days ago. The quarantined GKE pipeline had
since no failures that look related to the issues with DNS that we
previously observed.

Fixes: #13833
Fixes: #13898

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.2 Jan 5, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Backport pending to v1.9 to Backport done to v1.9 in 1.9.2 Jan 5, 2021
christarazi pushed a commit that referenced this pull request Jan 6, 2021
[ upstream commit 699ca46 ]

This removes the quarantine check for the K8sUpdates test on GKE. The
issues with upgrade tests had on GKE were addressed in #14187, #14294,
 #14295, and #14393.

The last fix was merged 6 days ago. The quarantined GKE pipeline had
since no failures that look related to the issues with DNS that we
previously observed.

Fixes: #13833
Fixes: #13898

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
Signed-off-by: Chris Tarazi <chris@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI Continuous Integration testing issue or flake release-note/ci This PR makes changes to the CI.
Projects
No open projects
1.9.2
Backport done to v1.9
Development

Successfully merging this pull request may close these issues.

None yet

7 participants