ci: [CNI] Adding aks cluster creation steps for k8s e2e test #2052
os: linux
clusterType: linux-cniv1-up
clusterName: "ubuntu18e2e"
k8sVersion: 1.24.9
Doesn't it default to 1.24.x? I don't think we should pin versions, since they will become outdated.
Also, 1.24.x reaches End of Life this month. Should we consider moving this to Ubuntu 22 and 1.25.x+?
No, the default is 1.25 or newer. The version is pinned because AKS does not support Ubuntu 18 with k8s 1.25.
The pipeline currently tests Ubuntu 18, so we need a cluster that runs Ubuntu 18.
https://learn.microsoft.com/en-us/azure/aks/cluster-configuration
Regarding Ubuntu 22, you are right that we need to support it. Once this is merged, it is just a matter of adding another set of parameters for Ubuntu 22.
Ubuntu 22 is a must. Ubuntu 18 will reach EOL soon.
.pipelines/pipeline.yaml
Outdated
displayName: AKS Windows 1903
arch: amd64
os: linux
clusterType: windows-cniv1-up
clusterName: "win19e2e"
windowsOsSku: Windows2019
Nit: Is this testing Windows 1903 or 2019?
Windows 2019 update version was 1903.
6fe9fb4 to 777f8b1
clusterName: "ubuntu18e2e"
k8sVersion: 1.24.9

- template: singletenancy/aks/e2e-job-template.yaml
Can we add Windows 2022 first? Most customers use Windows 2022.
Sure, let me add both Windows 2022 and Ubuntu 22.
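As a rough sketch of what those two additional parameter sets might look like, the fragment below reuses the parameter names visible in the diffs in this thread. The `parameters:` nesting, both `displayName` values, the `win22e2e` cluster name, and the `Windows2022` SKU value are assumptions made for illustration. They are not taken from the PR.

```yaml
# Sketch only. Names marked "hypothetical" or "assumed" do not come from the PR.
- template: singletenancy/aks/e2e-job-template.yaml
  parameters:                          # assumed nesting; the diff does not show it
    displayName: AKS Ubuntu 22         # hypothetical label
    arch: amd64
    os: linux
    clusterType: linux-cniv1-up
    clusterName: "ubuntu22e2e"
    k8sVersion: 1.25                   # value used in a later revision of this diff

- template: singletenancy/aks/e2e-job-template.yaml
  parameters:
    displayName: AKS Windows 2022      # hypothetical label
    arch: amd64
    os: linux                          # copied from the Windows 2019 entry in the diff
    clusterType: windows-cniv1-up
    clusterName: "win22e2e"            # hypothetical name
    windowsOsSku: Windows2022          # assumed by analogy with Windows2019
```

Keeping each OS variant as its own parameter set means retiring Ubuntu 18 later should only require deleting one entry.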
else
export DROP_GZ_URL=$( make cni-dropgz-image-name-and-tag OS=${{ parameters.os }} ARCH=${{ parameters.arch }} CNI_DROPGZ_VERSION=${{ parameters.version }})
envsubst < ./test/integration/manifests/cni/cni-installer-v1.yaml | kubectl apply -f -
kubectl rollout status daemonset/azure-cni -n kube-system
The nodes have to be restarted after installing the CNI. Will that be a separate PR?
Yes, but I am still testing the Windows part of it, so I will add it to this PR.
b2d6283 to 000bc83
os: 'linux'
clusterType: linux-cniv1-up
clusterName: 'ubuntu22e2e'
k8sVersion: 1.25
Can we specify the latest k8s version for Ubuntu 22?
I tried "latest", but it produced an error. I set 1.25 because AKS then picks the latest 1.25 patch it supports, i.e. 1.25.6.
We will either remove this once we no longer support Ubuntu 18, or update it to 1.26 once that becomes the default.
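One optional hardening of the minor-version pin described in this reply is to quote the value. This is a sketch, not something the PR does: the thread does not say how the pipeline's YAML parser treats an unquoted `1.25`, so the quoting is a defensive assumption.

```yaml
# Sketch: pin only the minor version, quoted so it is always read as a string.
clusterName: 'ubuntu22e2e'
k8sVersion: "1.25"   # per the reply above, AKS resolved this to 1.25.6 at the time
# Unquoted, a future value such as 1.30 is a YAML float that a generic parser reads as 1.3.
# A three-part value such as 1.24.9 is not a number, so it is a string either way.
```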
- /opt/cni/bin/azure-vnet-ipam
- azure-vnet-telemetry
- -o
- /opt/cni/bin/azure-vnet-telemetry
We should copy the telemetry conflist file as well. Check the released tar.gz.
Do you mean the azure-vnet-telemetry config?
640d509 to fd345f0
5c46376 to 8201471
COPY --from=azure-vnet /azure-container-networking/cni/azure-$OS-swift.conflist pkg/embed/fs/azure-swift.conflist
COPY --from=azure-vnet /azure-container-networking/cni/azure-$OS-swift-overlay.conflist pkg/embed/fs/azure-swift-overlay.conflist
COPY --from=azure-vnet /azure-container-networking/cni/azure-$OS-swift-overlay-dualstack.conflist pkg/embed/fs/azure-swift-overlay-dualstack.conflist
COPY --from=azure-vnet /azure-container-networking/bin/* pkg/embed/fs
COPY --from=azure-vnet /azure-container-networking/telemetry/azure-vnet-telemetry.config pkg/embed/fs
@rbtr I added the azure conflist and telemetry config for testing cniv1 on the AKS clusters. This change is in the cniTest Dockerfile. I guess we need to make the same change in the Linux Dockerfile, but I will keep that separate because it has production impact for the dropgz release.
cc: @tamilmani1989
lgtm
* ci: [CNI] Adding aks cluster creation steps for k8s e2e test
* Add validate step to the pipeline
* Adding the telemetry config to the cluster
(cherry picked from commit 846e508)
* build azure-vnet-telemetry and azure-vnet-ipam in dropgz-test (#1846)
  build azure-vnet-telemetry and azure-vnet-ipam in dropgz-test for parity with release image
  Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
  (cherry picked from commit f619259)
* ci: disable kube-proxy for test clusters (#1965)
  * disable kube-proxy for byocni cluster creation
  * test config mapping
  * shell pwd
  * use CURDIR
  * check current directory
  * test with repo root dir
  * test azp format
  * test azp format
  * test azp format
  * change e2e steps to remove kube proxy
  * fix load test update args
  * fix ns and rg in update
  * update ciliume2e
  * fix kubectl cmd in load test
  * adding new targets for no kube proxy
  * remove cluster update
  * update overlay e2e
  * test behavior of load test
  * test grep for azure-cns
  * look for container deployment
  * testing
  * restart node variable check
  * update if condition
  * add skip node case
  Co-authored-by: tamilmani1989 <tamanoha@microsoft.com>
  (cherry picked from commit 024819d)
* CI: [CNI] Replace the bash scripts for CNI load testing with golang test cases (#2003)
  CI:[CNI] Replace the bash scripts with the golang test cases
  (cherry picked from commit 008ae45)
* ci: [CNI] Move Nightly Cilium Pipeline test to ACN (#1963)
  * CNS to be able to generate dualstack overaly CNI conflist (#1981)
  * fix: Eliminating duplicate lines
  * ci: Add update permission for ciliumidentity
  * fix: Parameterize Image Registry
    add retry to nnc update during scaledown (#1970)
      * add retry to nnc update during scaledown
        Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
      * test for panic in pool monitor
        Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
      Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
    fix: reserve 0th IP as gateway for overlay on Windows (#1968)
      * fix: reserve 0th IP as gateway for overlay on Windows
      * fix: allow gateway to be updated
    ci: windows profile container image (#1988)
    Always use 0 for NC version in Overlay (#1979)
      always use 0 for NC version in overlay
      Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
    [Vnet Scale - CNS]: Flattening CIDR ranges for Node NNC to a list (#1921)
      * Read secondary CIDRs from VnetScale NNC
      * fix comment
      * update comment
      * For VnetScale mode, Use 1st IP for def gateway instead of 0th for windows
      * fix/add import
      * address pr comments
      * add comments
      * address pr comments
      * wrap error
      * fix typo
      * fix UT
    fix: [NPM] check if policy exists in case of nil pointer (#1974)
      fix: check for nil first
    ci: disable kube-proxy for test clusters (#1965)
      * disable kube-proxy for byocni cluster creation
      * test config mapping
      * shell pwd
      * use CURDIR
      * check current directory
      * test with repo root dir
      * test azp format
      * test azp format
      * test azp format
      * change e2e steps to remove kube proxy
      * fix load test update args
      * fix ns and rg in update
      * update ciliume2e
      * fix kubectl cmd in load test
      * adding new targets for no kube proxy
      * remove cluster update
      * update overlay e2e
      * test behavior of load test
      * test grep for azure-cns
      * look for container deployment
      * testing
      * restart node variable check
      * update if condition
      * add skip node case
      Co-authored-by: tamilmani1989 <tamanoha@microsoft.com>
    perf: [WIN-NPM] fast bootup (#1900)
      * wip
      * wip2
      * use other apply DP func
      * address comment about if statement
      * finish bootup for both DPs
      * fix lint
      * fix lint 2
      * fix lint 3
      * longer UT timeout and add missing UTs for apply in background
    tool: [NPM] script to clean up iptable chains (#1978)
      tool: script to clean up NPM iptable chains
    feat: [WIN-NPM] metrics for latencies and failures (#1959)
      * implement metrics
      * add npm prefix
      * rename windows files
      * metrics pkg UTs
      * allow reinitializing prometheus metrics
      * fix: hns wrapper should not throw error for empty SetPolicy values
      * test: metric UTs in dataplane
      * fix: record list endpoint latency always
      * remove flaky UT
      * feat: metric for max ipset members
      * fix lint
      * fix lint 2
      * fix build
      * fix lint 3
      * simplify conditionals and protect against maxMembers becoming negative
      * remove bottom 4 histogram buckets. start at 16 ms
      * reset metrics for ipset UTs
      * style: don't check for windows dp in *_windows.go files
      * build: remove unused import
      * test: reset windows metrics in UT
    Remove SSH port 22 rule from aks-engine clusters (#1983)
    ci: change overlaye2e stage to cilium-overlay (#1997)
      * renaming overlaye2e for cilium
      * update display names for stages
    Initial getHomeAZ 404 changes (#1994)
      * initial getHomeAZ 404 changes
      * treat 404 as success
      * address comments
    CNS to be able to generate dualstack overaly CNI conflist (#1981)
  * fix: File Directory
  * style: Comments
  * Addressing Comments
  Co-authored-by: Paul Johnston <35265851+pjohnst5@users.noreply.github.com>
  (cherry picked from commit 1514d95)
* ci:[CNI] Add windows CNIv1 datapath test (#2016)
  * ci: Transfer files
  * test: Working Datapath Test
  * test: apierror Tests
  * style: Datapath Package
  * test: Deployment timing
  * fix: Error check
  * fix: Lint
  (cherry picked from commit 390977d)
* fix: [CNI] CNI load test failing due to namespace already created (#2031)
  fix: CNI load test failing due to namespace already created
  (cherry picked from commit c10900e)
* ci:[CNI] Windows cniv1 load test pipeline (#2024)
  CI:[CNI] Windows cniv1 load test pipeline
  (cherry picked from commit e45ad21)
* ci: [CNI] Adding aks cluster creation steps for k8s e2e test (#2052)
  * ci: [CNI] Adding aks cluster creation steps for k8s e2e test
  * Add validate step to the pipeline
  * Adding the telemetry config to the cluster
  (cherry picked from commit 846e508)
* ci:[CNI] Replace AKS-Engine Tests with k8s conformance tests (#2062)
  * Initial Commit
  * Add attempts to prevent flakyness
  * Add taint for windows tests
  * Add k8s e2e tests
  * Testing vmSizes
  * Artifact k8se2e binary
  * Remove NPM E2E
  * Add testing and increase processes
  * Addressing comments
  (cherry picked from commit 451c691)
* CI: Removing AKS engine related code (#2089)
  (cherry picked from commit b45c2c7)
* feat: [dropgz] Dropgz for windows (#2075)
  * feat: [dropgz] Dropgz for windows
  * Removing the code for killing the process from dropgz for windows
  (cherry picked from commit 7a41178)
* ci: Update dns tests for k8s conformance (#2104)
  Update dns tests for k8s v1.26
  (cherry picked from commit bbf2fd4)
* ci: adding cni package as a trigger (#2108)
  (cherry picked from commit e6a8ea6)
* ci: add packages for submodule trigger (#2154)
  (cherry picked from commit 4aecfd6)
* set mellanox reg key (#1768)
  (cherry picked from commit fa2de6d)
Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: Camryn Lee <31013536+camrynl@users.noreply.github.com>
Co-authored-by: Vipul Singh <vipul21sept@gmail.com>
Co-authored-by: Rajvi <107083915+rajvinar@users.noreply.github.com>
This PR adds AKS cluster creation and replaces the CNI binaries on the created clusters. It is a step toward migrating away from aks-engine, which is deprecated. @jpayne3506 will be adding the datapath to these clusters.
Reason for Change:
AKS Engine is deprecated.
Issue Fixed:
Requirements:
Notes: