ci: [CNI] Move Nightly Cilium Pipeline test to ACN #1963

Merged 8 commits on Jun 9, 2023

@jpayne3506 jpayne3506 commented May 17, 2023

Reason for Change:

Moved pipeline to ACN to enable the reuse of existing code for this pipeline and others in the future. Changed the cilium image value to be parameterized for future use.

New directories were required to follow the same pattern as overlay-e2e-step, which applies a whole directory with kubectl to create the clusterrole, serviceaccount, and clusterrolebinding. Similarly, the daemonset and deployment for cilium-agent and cilium-operator had to be pulled out of those directories, because they cannot be given a dynamic value while they sit inside them under the current pattern.
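
A minimal sketch of the templating idea described above, assuming the daemonset and deployment manifests carry literal ${CILIUM_VERSION_TAG} and ${CILIUM_IMAGE_REPOSITORY} placeholders. The render_manifest helper and the example registry value are hypothetical, and sed is used here only as a dependency-free stand-in for a substitution tool such as envsubst:

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): reads a manifest on stdin and
# replaces only the two placeholders named in this PR. Any other ${...}
# text passes through untouched.
# Limitation: values containing '|', '&' or '\' would need escaping for sed.
render_manifest() {
  sed -e 's|[$]{CILIUM_VERSION_TAG}|'"${CILIUM_VERSION_TAG}"'|g' \
      -e 's|[$]{CILIUM_IMAGE_REPOSITORY}|'"${CILIUM_IMAGE_REPOSITORY}"'|g'
}

# Example values; "registry.example/cilium" is a made-up repository.
CILIUM_VERSION_TAG="v1.12.8"
CILIUM_IMAGE_REPOSITORY="registry.example/cilium"

# Render a one-line manifest fragment to stdout.
printf '%s\n' 'image: ${CILIUM_IMAGE_REPOSITORY}/cilium:${CILIUM_VERSION_TAG}' | render_manifest
```

In a pipeline step the rendered output would typically be piped to kubectl apply -f - rather than printed. Static files such as the clusterrole and serviceaccount need no rendering and can stay in a directory that is applied as a whole.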

Issue Fixed:

Requirements:

Notes:

Ensured that other pipelines that use the new file directory are updated and working. See below for pipeline status.
CNI Load Test - https://msazure.visualstudio.com/One/_build/results?buildId=73426710&view=logs&j=3713fd4c-b129-5ac5-2b38-c7386192ca24
Nightly latest - https://msazure.visualstudio.com/One/_build/results?buildId=73423183&view=results
Nightly v1.12.8 - https://msazure.visualstudio.com/One/_build/results?buildId=73426500&view=results
Azure PR - https://msazure.visualstudio.com/One/_build/results?buildId=73501911&view=results

PR runs with image repository parameter - commit f4e4de3
CNI Load Test - https://msazure.visualstudio.com/One/_build/results?buildId=74508070&view=results
Nightly latest - https://msazure.visualstudio.com/One/_build/results?buildId=74508121&view=results
Azure PR - https://msazure.visualstudio.com/One/_build/results?buildId=74540414&view=results

Added the CILIUM_VERSION_TAG and CILIUM_IMAGE_REPOSITORY env variables to:

  • Azure Container Networking PR
  • Azure Container Networking PR - Submodules
  • CNI Load Test
  • Cilium Nightly

Merged changes from

Changed number of operator pods to match TSG - https://eng.ms/docs/cloud-ai-platform/azure-core/azure-management-and-platforms/containers-bburns/azure-kubernetes-service/azure-kubernetes-service-troubleshooting-guide/doc/tsg/cilium

  • "cilium-operator" pods are running (2 pods if there are >= 2 nodes, otherwise 1).
  • The default is 2 nodes, from the Makefile:
overlay-byocni-up: rg-up overlay-net-up ## Brings up an Overlay BYO CNI cluster
	$(AZCLI) aks create -n $(CLUSTER) -g $(GROUP) -l $(REGION) \
		--node-count $(NODE_COUNT) \

NODE_COUNT ?= 2
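
The replica rule quoted from the TSG can be written as a small check. This is a sketch only: the function name expected_operator_pods is hypothetical, and the fallback of 2 simply mirrors the Makefile default NODE_COUNT ?= 2 shown above.

```shell
#!/bin/sh
# Hypothetical helper: expected number of cilium-operator pods for a cluster,
# following the TSG rule (2 pods when there are at least 2 nodes, otherwise 1).
expected_operator_pods() {
  node_count=$1
  if [ "$node_count" -ge 2 ]; then
    echo 2
  else
    echo 1
  fi
}

# Fall back to 2 nodes when NODE_COUNT is not provided by the caller.
NODE_COUNT="${NODE_COUNT:-2}"
echo "expect $(expected_operator_pods "$NODE_COUNT") cilium-operator pod(s) for $NODE_COUNT node(s)"
```

A pipeline could compare this number against the ready replica count of the operator deployment before running the tests.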

Once approved, I will start automatic scheduling with the new nightly pipeline file located in ACN.

@jpayne3506 jpayne3506 force-pushed the ciliumNightly branch 3 times, most recently from acb5acf to c4d836d Compare May 17, 2023 15:47
@jpayne3506 jpayne3506 added cni Related to CNI. ci Infra or tooling. labels May 17, 2023
@rbtr rbtr changed the title ci: [CNI] Move Nightly Cilium Pipeline test from Networking-Aquarius to ACN ci: [CNI] Move Nightly Cilium Pipeline test to ACN May 18, 2023
@jpayne3506
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s).

@jpayne3506
Contributor Author

Waiting for Azure Container Networking PR - submodules to be tested.

@jpayne3506 jpayne3506 marked this pull request as ready for review May 19, 2023 14:47
@jpayne3506 jpayne3506 requested a review from a team as a code owner May 19, 2023 14:47
@jpayne3506
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s), but failed to run 1 pipeline(s).

@jpayne3506 jpayne3506 enabled auto-merge (squash) May 22, 2023 18:04
camrynl
camrynl previously approved these changes May 22, 2023
@@ -48,6 +48,8 @@ stages:
echo "install Cilium onto Overlay Cluster"
kubectl apply -f test/integration/manifests/cilium/cilium-agent
kubectl apply -f test/integration/manifests/cilium/cilium-operator
envsubst '${CILIUM_VERSION_TAG}' < test/integration/manifests/cilium/daemonset.yaml | kubectl apply -f -
Contributor

Is there a default value for this tag, or do we need to provide it for each run?

Contributor Author

I currently have it set within each pipeline's user-defined variables. When an update occurs, we will have to change it in 3 locations (1 for each pipeline that currently uses the E2E-step-template for cilium).

The other way of doing this is to add parameters to the template call in order to pass the variable, which still results in having to change it per pipeline.

Contributor

Yeah, but in the pipeline we don't need a PR, so it would be faster than updating the file and then opening a PR.

Either way, we need to add docs so that people are aware of these variables.
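
One way to make these variables harder to miss, sketched under the assumption that the step runs in a POSIX shell: fail early when a variable is unset or empty instead of rendering a manifest with a blank tag. The require_var helper is hypothetical and not part of this PR.

```shell
#!/bin/sh
# Hypothetical guard: returns non-zero and prints a message when the named
# variable is unset or empty.
require_var() {
  name=$1
  # Indirect lookup of the variable whose name is stored in $name.
  eval "value=\${$name:-}"
  if [ -z "$value" ]; then
    echo "error: $name must be set in the pipeline's user-defined variables" >&2
    return 1
  fi
}

# Possible use at the top of the install step:
#   require_var CILIUM_VERSION_TAG || exit 1
#   require_var CILIUM_IMAGE_REPOSITORY || exit 1
```

This keeps the values in the pipeline's user-defined variables, as discussed in this thread, while turning a forgotten variable into an immediate, readable failure.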

Member

Who sets CILIUM_VERSION_TAG in the nightly pipeline?

Contributor Author

All pipelines have this tag set within their user-defined variables, so it will stay the same throughout unless we change it.

Contributor

I am wondering if it would be worth the effort to use the same cluster role binding or cluster role for the PR pipeline and the nightly pipeline.
Is there a scenario where this differs from the yaml currently used for the PR pipeline? If not, would it make sense to have a common folder for the yamls that do not depend on CILIUM_VERSION_TAG?
(Agent and operator can be in different directories.)

Contributor

I think that is what you are already doing (correct me if I am wrong). So can we delete the other duplicate yamls?

Contributor Author

The difference between the directories lies in clusterrole.yaml. To keep the same pattern, I needed to make the directories complete with their binding and serviceaccount files as well, which, as you mentioned, do match between agent/nightly-agent and operator/nightly-operator.

If we do it the way you propose, we will still need the 4 directories, because the agent and operator clusterroles also differ from each other, and we would then add a 5th and 6th directory containing the matching agent and operator files respectively.

My other solution was to give the files different names per agent and operator, with nightly variants as well, but avoiding matching file names that way starts to look very messy.

@vipul-21
Contributor

LGTM.
Make sure we change the nightly pipeline source from the Networking-Aquarius yaml to this one.

vipul-21
vipul-21 previously approved these changes May 25, 2023
add retry to nnc update during scaledown (#1970)

* add retry to nnc update during scaledown

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* test for panic in pool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

fix: reserve 0th IP as gateway for overlay on Windows (#1968)

* fix: reserve 0th IP as gateway for overlay on Windows

* fix: allow gateway to be updated

ci: windows profile container image (#1988)

Always use 0 for NC version in Overlay (#1979)

always use 0 for NC version in overlay

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

[Vnet Scale - CNS]: Flattening CIDR ranges for Node NNC to a list (#1921)

* Read secondary CIDRs from VnetScale NNC

* fix comment

* update comment

* For VnetScale mode, Use 1st IP for def gateway instead of 0th for windows

* fix/add import

* address pr comments

* add comments

* address pr comments

* wrap error

* fix typo

* fix UT

fix: [NPM] check if policy exists in case of nil pointer (#1974)

fix: check for nil first

ci: disable kube-proxy for test clusters (#1965)

* disable kube-proxy for byocni cluster creation

* test config mapping

* shell pwd

* use CURDIR

* check current directory

* test with repo root dir

* test azp format

* test azp format

* test azp format

* change e2e steps to remove kube proxy

* fix load test update args

* fix ns and rg in update

* update ciliume2e

* fix kubectl cmd in load test

* adding new targets for no kube proxy

* remove cluster update

* update overlay e2e

* test behavior of load test

* test grep for azure-cns

* look for container deployment

* testing

* restart node variable check

* update if condition

* add skip node case

---------

Co-authored-by: tamilmani1989 <tamanoha@microsoft.com>

perf: [WIN-NPM] fast bootup (#1900)

* wip

* wip2

* use other apply DP func

* address comment about if statement

* finish bootup for both DPs

* fix lint

* fix lint 2

* fix lint 3

* longer UT timeout and add missing UTs for apply in background

tool: [NPM] script to clean up iptable chains (#1978)

tool: script to clean up NPM iptable chains

feat: [WIN-NPM] metrics for latencies and failures (#1959)

* implement metrics

* add npm prefix

* rename windows files

* metrics pkg UTs

* allow reinitializing prometheus metrics

* fix: hns wrapper should not throw error for empty SetPolicy values

* test: metric UTs in dataplane

* fix: record list endpoint latency always

* remove flaky UT

* feat: metric for max ipset members

* fix lint

* fix lint 2

* fix build

* fix lint 3

* simplify conditionals and protect against maxMembers becoming negative

* remove bottom 4 histogram buckets. start at 16 ms

* reset metrics for ipset UTs

* style: don't check for windows dp in *_windows.go files

* build: remove unused import

* test: reset windows metrics in UT

Remove SSH port 22 rule from aks-engine clusters (#1983)

ci: change overlaye2e stage to cilium-overlay (#1997)

* renaming overlaye2e for cilium

* update display names for stages

Initial getHomeAZ 404 changes (#1994)

* initial getHomeAZ 404 changes

* treat 404 as success

* address comments

CNS to be able to generate dualstack overaly CNI conflist (#1981)

fix: Parameterize Image Registry

@jpayne3506
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 2 pipeline(s), but failed to run 2 pipeline(s).

@jpayne3506 jpayne3506 requested review from vipul-21 and camrynl and removed request for a team, rbtr and matmerr June 8, 2023 21:51
vipul-21
vipul-21 previously approved these changes Jun 9, 2023
@@ -0,0 +1,89 @@
apiVersion: v1
Member

What are the differences between the nightly config map and the regular PR pipeline config map? Can you document that somewhere?

Contributor Author

Documenting these changes here https://msazure.visualstudio.com/AzureWiki/_wiki/wikis/AzureWiki.wiki/480616/Cilium-Version-Upgrade-WIP-

It also captures the changes to the clusterroles and additional features.

There are also comments within nightly-config.yaml indicating which key-value pairs changed, to ensure that nothing is missed.

Comment on lines +50 to +51
kubectl apply -f cilium/configmap.yaml
kubectl apply -f test/integration/manifests/cilium/cilium${FILE_PATH}-config.yaml
Member

Why are we applying the config map twice?

Contributor Author

Line 50 is the CNI-configuration configmap for cilium.
Line 51 is a dynamic cilium-config configmap that contains all of the key-value pairs for the respective cilium-based pipelines. $FILE_PATH directs it to either the nightly values or the current 1.12.x values.
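
The selection described for line 51 could look roughly like the sketch below. The value "-nightly" for FILE_PATH is an assumption (the thread only shows the cilium${FILE_PATH}-config.yaml pattern and mentions a nightly config file), and the config_manifest function is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: choose the cilium-config manifest for a pipeline run.
# ASSUMPTION: FILE_PATH is "-nightly" for the nightly pipeline and empty for
# the pipelines that run a released 1.12.x version.
config_manifest() {
  tag=$1
  file_path=""
  if [ "$tag" = "cilium-nightly-pipeline" ]; then
    file_path="-nightly"
  fi
  echo "test/integration/manifests/cilium/cilium${file_path}-config.yaml"
}

# Print the manifest path each kind of run would apply.
config_manifest "cilium-nightly-pipeline"
config_manifest "v1.12.10"
```

Keeping the choice in one place means the kubectl apply line itself stays identical across the nightly and release pipelines.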

# Nightly does not build images per commit. Will use existing image.
if [ "$CILIUM_VERSION_TAG" = "cilium-nightly-pipeline" ]
then
CNS=v1.4.44_hotfix DROPGZ=v0.0.4 && echo "Running nightly"
Member

Let's use the latest CNS, which I guess is 1.5.3.

Contributor Author

Changed here and within /test/integration/manifests/cns/daemonset.yaml

@tamilmani1989
Member

@camrynl since we parameterized the image, can you make sure the PR pipeline uses the MCR registry?

@camrynl
Contributor

camrynl commented Jun 9, 2023

@camrynl since we parameterized the image, can you make sure the PR pipeline uses the MCR registry?

It looks like John has already taken care of setting the cilium image variables

@camrynl
Contributor

camrynl commented Jun 9, 2023

@jpayne3506 I have recently moved the pipelines to cilium v1.12.10 with this PR, please update the pipeline vars with the new version

@jpayne3506
Contributor Author

jpayne3506 commented Jun 9, 2023

Will do. Done @camrynl

@tamilmani1989 tamilmani1989 merged commit 1514d95 into master Jun 9, 2023
42 checks passed
@tamilmani1989 tamilmani1989 deleted the ciliumNightly branch June 9, 2023 18:52
@tamilmani1989 tamilmani1989 removed the cni Related to CNI. label Jun 9, 2023
jpayne3506 added a commit that referenced this pull request Sep 15, 2023
* CNS to be able to generate dualstack overaly CNI conflist (#1981)

* fix: Eliminating duplicate lines

* ci: Add update permission for ciliumidentity

* fix: Parameterize Image Registry

* fix: File Directory

* style: Comments

* Addressing Comments

---------

Co-authored-by: Paul Johnston <35265851+pjohnst5@users.noreply.github.com>
(cherry picked from commit 1514d95)
jpayne3506 added a commit that referenced this pull request Sep 15, 2023
(cherry picked from commit 1514d95)
jpayne3506 added a commit that referenced this pull request Sep 17, 2023
* CNS to be able to generate dualstack overaly CNI conflist (#1981)

* fix: Eliminating duplicate lines

* ci: Add update permission for ciliumidentity

* fix: Parameterize Image Registry

add retry to nnc update during scaledown (#1970)

* add retry to nnc update during scaledown

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* test for panic in pool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

fix: reserve 0th IP as gateway for overlay on Windows (#1968)

* fix: reserve 0th IP as gateway for overlay on Windows

* fix: allow gateway to be updated

ci: windows profile container image (#1988)

Always use 0 for NC version in Overlay (#1979)

always use 0 for NC version in overlay

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

[Vnet Scale - CNS]: Flattening CIDR ranges for Node NNC to a list (#1921)

* Read secondary CIDRs from VnetScale NNC

* fix comment

* update comment

* For VnetScale mode, Use 1st IP for def gateway instead of 0th for windows

* fix/add import

* address pr comments

* add comments

* address pr comments

* wrap error

* fix typo

* fix UT

fix: [NPM] check if policy exists in case of nil pointer (#1974)

fix: check for nil first

ci: disable kube-proxy for test clusters (#1965)

* disable kube-proxy for byocni cluster creation

* test config mapping

* shell pwd

* use CURDIR

* check current directory

* test with repo root dir

* test azp format

* test azp format

* test azp format

* change e2e steps to remove kube proxy

* fix load test update args

* fix ns and rg in update

* update ciliume2e

* fix kubectl cmd in load test

* adding new targets for no kube proxy

* remove cluster update

* update overlay e2e

* test behavior of load test

* test grep for azure-cns

* look for container deployment

* testing

* restart node variable check

* update if condition

* add skip node case

---------

Co-authored-by: tamilmani1989 <tamanoha@microsoft.com>

perf: [WIN-NPM] fast bootup (#1900)

* wip

* wip2

* use other apply DP func

* address comment about if statement

* finish bootup for both DPs

* fix lint

* fix lint 2

* fix lint 3

* longer UT timeout and add missing UTs for apply in background

tool: [NPM] script to clean up iptable chains (#1978)

tool: script to clean up NPM iptable chains

feat: [WIN-NPM] metrics for latencies and failures (#1959)

* implement metrics

* add npm prefix

* rename windows files

* metrics pkg UTs

* allow reinitializing prometheus metrics

* fix: hns wrapper should not throw error for empty SetPolicy values

* test: metric UTs in dataplane

* fix: record list endpoint latency always

* remove flaky UT

* feat: metric for max ipset members

* fix lint

* fix lint 2

* fix build

* fix lint 3

* simplify conditionals and protect against maxMembers becoming negative

* remove bottom 4 histogram buckets. start at 16 ms

* reset metrics for ipset UTs

* style: don't check for windows dp in *_windows.go files

* build: remove unused import

* test: reset windows metrics in UT

Remove SSH port 22 rule from aks-engine clusters (#1983)

ci: change overlaye2e stage to cilium-overlay (#1997)

* renaming overlaye2e for cilium

* update display names for stages

Initial getHomeAZ 404 changes (#1994)

* initial getHomeAZ 404 changes

* treat 404 as success

* address comments

CNS to be able to generate dualstack overaly CNI conflist (#1981)

fix: Parameterize Image Registry

add retry to nnc update during scaledown (#1970)

* add retry to nnc update during scaledown

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* test for panic in pool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

fix: reserve 0th IP as gateway for overlay on Windows (#1968)

* fix: reserve 0th IP as gateway for overlay on Windows

* fix: allow gateway to be updated

ci: windows profile container image (#1988)

Always use 0 for NC version in Overlay (#1979)

always use 0 for NC version in overlay

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

[Vnet Scale - CNS]: Flattening CIDR ranges for Node NNC to a list (#1921)

* Read secondary CIDRs from VnetScale NNC

* fix comment

* update comment

* For VnetScale mode, Use 1st IP for def gateway instead of 0th for windows

* fix/add import

* address pr comments

* add comments

* address pr comments

* wrap error

* fix typo

* fix UT

fix: [NPM] check if policy exists in case of nil pointer (#1974)

fix: check for nil first

ci: disable kube-proxy for test clusters (#1965)

* disable kube-proxy for byocni cluster creation

* test config mapping

* shell pwd

* use CURDIR

* check current directory

* test with repo root dir

* test azp format

* test azp format

* test azp format

* change e2e steps to remove kube proxy

* fix load test update args

* fix ns and rg in update

* update ciliume2e

* fix kubectl cmd in load test

* adding new targets for no kube proxy

* remove cluster update

* update overlay e2e

* test behavior of load test

* test grep for azure-cns

* look for container deployment

* testing

* restart node variable check

* update if condition

* add skip node case

---------

Co-authored-by: tamilmani1989 <tamanoha@microsoft.com>

perf: [WIN-NPM] fast bootup (#1900)

* wip

* wip2

* use other apply DP func

* address comment about if statement

* finish bootup for both DPs

* fix lint

* fix lint 2

* fix lint 3

* longer UT timeout and add missing UTs for apply in background

tool: [NPM] script to clean up iptable chains (#1978)

tool: script to clean up NPM iptable chains

feat: [WIN-NPM] metrics for latencies and failures (#1959)

* implement metrics

* add npm prefix

* rename windows files

* metrics pkg UTs

* allow reinitializing prometheus metrics

* fix: hns wrapper should not throw error for empty SetPolicy values

* test: metric UTs in dataplane

* fix: record list endpoint latency always

* remove flaky UT

* feat: metric for max ipset members

* fix lint

* fix lint 2

* fix build

* fix lint 3

* simplify conditionals and protect against maxMembers becoming negative

* remove bottom 4 histogram buckets. start at 16 ms

* reset metrics for ipset UTs

* style: don't check for windows dp in *_windows.go files

* build: remove unused import

* test: reset windows metrics in UT

Remove SSH port 22 rule from aks-engine clusters (#1983)

ci: change overlaye2e stage to cilium-overlay (#1997)

* renaming overlaye2e for cilium

* update display names for stages

Initial getHomeAZ 404 changes (#1994)

* initial getHomeAZ 404 changes

* treat 404 as success

* address comments

CNS to be able to generate dualstack overaly CNI conflist (#1981)

* fix: File Directory

* style: Comments

* Addressing Comments

---------

Co-authored-by: Paul Johnston <35265851+pjohnst5@users.noreply.github.com>
(cherry picked from commit 1514d95)
jpayne3506 added a commit that referenced this pull request Sep 22, 2023
* build azure-vnet-telemetry and azure-vnet-ipam in dropgz-test (#1846)

build azure-vnet-telemetry and azure-vnet-ipam in dropgz-test for parity with release image

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>
(cherry picked from commit f619259)

* ci: disable kube-proxy for test clusters (#1965)

* disable kube-proxy for byocni cluster creation

* test config mapping

* shell pwd

* use CURDIR

* check current directory

* test with repo root dir

* test azp format

* test azp format

* test azp format

* change e2e steps to remove kube proxy

* fix load test update args

* fix ns and rg in update

* update ciliume2e

* fix kubectl cmd in load test

* adding new targets for no kube proxy

* remove cluster update

* update overlay e2e

* test behavior of load test

* test grep for azure-cns

* look for container deployment

* testing

* restart node variable check

* update if condition

* add skip node case

---------

Co-authored-by: tamilmani1989 <tamanoha@microsoft.com>
(cherry picked from commit 024819d)

* CI: [CNI] Replace the bash scripts for CNI load testing with golang test cases (#2003)

CI:[CNI] Replace the bash scripts with the golang test cases
(cherry picked from commit 008ae45)

* ci: [CNI] Move Nightly Cilium Pipeline test to ACN (#1963)

* CNS to be able to generate dualstack overaly CNI conflist (#1981)

* fix: Eliminating duplicate lines

* ci: Add update permission for ciliumidentity

* fix: Parameterize Image Registry

add retry to nnc update during scaledown (#1970)

* add retry to nnc update during scaledown

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* test for panic in pool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

fix: reserve 0th IP as gateway for overlay on Windows (#1968)

* fix: reserve 0th IP as gateway for overlay on Windows

* fix: allow gateway to be updated

ci: windows profile container image (#1988)

Always use 0 for NC version in Overlay (#1979)

always use 0 for NC version in overlay

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

[Vnet Scale - CNS]: Flattening CIDR ranges for Node NNC to a list (#1921)

* Read secondary CIDRs from VnetScale NNC

* fix comment

* update comment

* For VnetScale mode, Use 1st IP for def gateway instead of 0th for windows

* fix/add import

* address pr comments

* add comments

* address pr comments

* wrap error

* fix typo

* fix UT

fix: [NPM] check if policy exists in case of nil pointer (#1974)

fix: check for nil first

ci: disable kube-proxy for test clusters (#1965)

* disable kube-proxy for byocni cluster creation

* test config mapping

* shell pwd

* use CURDIR

* check current directory

* test with repo root dir

* test azp format

* test azp format

* test azp format

* change e2e steps to remove kube proxy

* fix load test update args

* fix ns and rg in update

* update ciliume2e

* fix kubectl cmd in load test

* adding new targets for no kube proxy

* remove cluster update

* update overlay e2e

* test behavior of load test

* test grep for azure-cns

* look for container deployment

* testing

* restart node variable check

* update if condition

* add skip node case

---------

Co-authored-by: tamilmani1989 <tamanoha@microsoft.com>

perf: [WIN-NPM] fast bootup (#1900)

* wip

* wip2

* use other apply DP func

* address comment about if statement

* finish bootup for both DPs

* fix lint

* fix lint 2

* fix lint 3

* longer UT timeout and add missing UTs for apply in background

tool: [NPM] script to clean up iptable chains (#1978)

tool: script to clean up NPM iptable chains

feat: [WIN-NPM] metrics for latencies and failures (#1959)

* implement metrics

* add npm prefix

* rename windows files

* metrics pkg UTs

* allow reinitializing prometheus metrics

* fix: hns wrapper should not throw error for empty SetPolicy values

* test: metric UTs in dataplane

* fix: record list endpoint latency always

* remove flaky UT

* feat: metric for max ipset members

* fix lint

* fix lint 2

* fix build

* fix lint 3

* simplify conditionals and protect against maxMembers becoming negative

* remove bottom 4 histogram buckets. start at 16 ms

* reset metrics for ipset UTs

* style: don't check for windows dp in *_windows.go files

* build: remove unused import

* test: reset windows metrics in UT

Remove SSH port 22 rule from aks-engine clusters (#1983)

ci: change overlaye2e stage to cilium-overlay (#1997)

* renaming overlaye2e for cilium

* update display names for stages

Initial getHomeAZ 404 changes (#1994)

* initial getHomeAZ 404 changes

* treat 404 as success

* address comments

CNS to be able to generate dualstack overaly CNI conflist (#1981)

fix: Parameterize Image Registry

add retry to nnc update during scaledown (#1970)

* add retry to nnc update during scaledown

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

* test for panic in pool monitor

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

---------

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

fix: reserve 0th IP as gateway for overlay on Windows (#1968)

* fix: reserve 0th IP as gateway for overlay on Windows

* fix: allow gateway to be updated

ci: windows profile container image (#1988)

Always use 0 for NC version in Overlay (#1979)

always use 0 for NC version in overlay

Signed-off-by: Evan Baker <rbtr@users.noreply.github.com>

[Vnet Scale - CNS]: Flattening CIDR ranges for Node NNC to a list (#1921)

* Read secondary CIDRs from VnetScale NNC

* fix comment

* update comment

* For VnetScale mode, Use 1st IP for def gateway instead of 0th for windows

* fix/add import

* address pr comments

* add comments

* address pr comments

* wrap error

* fix typo

* fix UT

fix: [NPM] check if policy exists in case of nil pointer (#1974)

fix: check for nil first

ci: disable kube-proxy for test clusters (#1965)

* disable kube-proxy for byocni cluster creation

* test config mapping

* shell pwd

* use CURDIR

* check current directory

* test with repo root dir

* test azp format

* test azp format

* test azp format

* change e2e steps to remove kube proxy

* fix load test update args

* fix ns and rg in update

* update ciliume2e

* fix kubectl cmd in load test

* adding new targets for no kube proxy

* remove cluster update

* update overlay e2e

* test behavior of load test

* test grep for azure-cns

* look for container deployment

* testing

* restart node variable check

* update if condition

* add skip node case

---------

Co-authored-by: tamilmani1989 <tamanoha@microsoft.com>

perf: [WIN-NPM] fast bootup (#1900)

* wip

* wip2

* use other apply DP func

* address comment about if statement

* finish bootup for both DPs

* fix lint

* fix lint 2

* fix lint 3

* longer UT timeout and add missing UTs for apply in background

tool: [NPM] script to clean up iptable chains (#1978)

tool: script to clean up NPM iptable chains

feat: [WIN-NPM] metrics for latencies and failures (#1959)

* implement metrics

* add npm prefix

* rename windows files

* metrics pkg UTs

* allow reinitializing prometheus metrics

* fix: hns wrapper should not throw error for empty SetPolicy values

* test: metric UTs in dataplane

* fix: record list endpoint latency always

* remove flaky UT

* feat: metric for max ipset members

* fix lint

* fix lint 2

* fix build

* fix lint 3

* simplify conditionals and protect against maxMembers becoming negative

* remove bottom 4 histogram buckets. start at 16 ms

* reset metrics for ipset UTs

* style: don't check for windows dp in *_windows.go files

* build: remove unused import

* test: reset windows metrics in UT

Remove SSH port 22 rule from aks-engine clusters (#1983)

ci: change overlaye2e stage to cilium-overlay (#1997)

* renaming overlaye2e for cilium

* update display names for stages

Initial getHomeAZ 404 changes (#1994)

* initial getHomeAZ 404 changes

* treat 404 as success

* address comments

CNS to be able to generate dualstack overaly CNI conflist (#1981)

* fix: File Directory

* style: Comments

* Addressing Comments

---------

Co-authored-by: Paul Johnston <35265851+pjohnst5@users.noreply.github.com>
(cherry picked from commit 1514d95)

* ci:[CNI] Add windows CNIv1 datapath test (#2016)

* ci: Transfer files

* test: Working Datapath Test

* test: apierror Tests

* style: Datapath Package

* test: Deployment timing

* fix: Error check

* fix: Lint

(cherry picked from commit 390977d)

* fix: [CNI] CNI load test failing due to namespace already created (#2031)

fix: CNI load test failing due to namespace already created
(cherry picked from commit c10900e)

* ci:[CNI] Windows cniv1 load test pipeline (#2024)

CI:[CNI] Windows cniv1 load test pipeline
(cherry picked from commit e45ad21)

* ci: [CNI] Adding aks cluster creation steps for k8s e2e test (#2052)

* ci: [CNI] Adding aks cluster creation steps for k8s e2e test

* Add validate step to the pipeline

* Adding the telemetry config to the cluster

(cherry picked from commit 846e508)

* ci:[CNI] Replace AKS-Engine Tests with k8s conformance tests (#2062)

* Initial Commit

* Add attempts to prevent flakyness

* Add taint for windows tests

* Add k8s e2e tests

* Testing vmSizes

* Artifact k8se2e binary

* Remove NPM E2E

* Add testing and increase processes

* Addressing comments

(cherry picked from commit 451c691)

* CI: Removing AKS engine related code (#2089)

(cherry picked from commit b45c2c7)

* feat: [dropgz] Dropgz for windows (#2075)

* feat: [dropgz] Dropgz for windows

* Removing the code for killing the process from dropgz for windows

(cherry picked from commit 7a41178)

* ci: Update dns tests for k8s conformance (#2104)

Update dns tests for k8s v1.26

(cherry picked from commit bbf2fd4)

* ci: adding cni package as a trigger (#2108)

(cherry picked from commit e6a8ea6)

* ci: add packages for submodule trigger (#2154)

(cherry picked from commit 4aecfd6)

* set mellanox reg key (#1768)

(cherry picked from commit fa2de6d)

---------

Co-authored-by: Evan Baker <rbtr@users.noreply.github.com>
Co-authored-by: Camryn Lee <31013536+camrynl@users.noreply.github.com>
Co-authored-by: Vipul Singh <vipul21sept@gmail.com>
Co-authored-by: Rajvi <107083915+rajvinar@users.noreply.github.com>
Labels
ci Infra or tooling.

6 participants