Add CI for testing antrea compatibility with 4 K8s versions on CAPA #5476

Closed
wants to merge 1 commit into from

Contributor

@jainpulkit22 jainpulkit22 commented Sep 7, 2023

Add CI for testing Antrea compatibility with the previous four K8s versions using Cluster API, with AWS as the provider (CAPA).

@jainpulkit22 jainpulkit22 marked this pull request as draft September 7, 2023 06:54
@jainpulkit22 jainpulkit22 marked this pull request as ready for review September 14, 2023 05:13
@jainpulkit22 jainpulkit22 force-pushed the capi-aws branch 3 times, most recently from 7ecfa8d to 88787a4 Compare October 4, 2023 10:22
@jainpulkit22 jainpulkit22 marked this pull request as ready for review October 9, 2023 05:52
@rajnkamr rajnkamr added area/provider/aws Issues or PRs related to aws provider. area/test/infra Issues or PRs related to test infrastructure (Jenkins configuration, Ansible playbook, Kind wrappers labels Oct 12, 2023
@jainpulkit22 jainpulkit22 changed the title Add CI for testing compatibility of antrea with previous four K8s version Add CI for testing compatibility of antrea with previous four K8s versions Oct 12, 2023
@jainpulkit22
Contributor Author

/test-rancher-e2e

@rajnkamr
Contributor

rajnkamr commented Nov 9, 2023

@jainpulkit22,
The conflicting files on this PR need to be resolved.

using Cluster API, with provider as AWS.

Signed-off-by: Pulkit Jain <jainpu@vmware.com>
Signed-off-by: Shengkai Lin <jefflin@sjtu.edu.cn>
Signed-off-by: Zhengsheng Zhou <zhengshengz@vmware.com>
#!/bin/bash
set -ex
DOCKER_REGISTRY="$(head -n1 ci/docker-registry)"
export JOB_NAME="matrix-${TEST_OS}-k8s-${K8S_VERSION//./-}-build-num"
Contributor

I think Jenkins has a built-in variable JOB_NAME, and we assign a custom value to it here. Is that because the custom value is more readable? If so, maybe use CLUSTER_NAME as the variable name instead. The same applies to JOB_NAME in the other builders.

Contributor Author

We can do that as well. I kept this name because it was already in use when I started working on the CAPA task.
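
A minimal sketch of the rename discussed above, assuming the cluster name keeps the shape currently built from `JOB_NAME` and `BUILD_NUMBER`. `cluster_name` is a hypothetical helper, not something in the PR; building the name from arguments leaves the Jenkins built-in `JOB_NAME` untouched.

```bash
#!/bin/bash
# cluster_name TEST_OS K8S_VERSION BUILD_NUMBER
# Hypothetical helper: prints the per-build cluster name.
cluster_name() {
  local test_os="$1" k8s_version="$2" build_number="$3"
  # ${k8s_version//./-} replaces every "." with "-", e.g. v1.26.1 becomes v1-26-1.
  echo "matrix-${test_os}-k8s-${k8s_version//./-}-build-num-${build_number}"
}

# Possible use in the builder (variables as already used in the script):
#   CLUSTER_NAME="$(cluster_name "${TEST_OS}" "${K8S_VERSION}" "${BUILD_NUMBER}")"
#   ./ci/jenkins/test-vmc.sh --cluster-name "${CLUSTER_NAME}" --setup-only ...
```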

./ci/jenkins/test-vmc.sh --cluster-name "${JOB_NAME}-${BUILD_NUMBER}" --setup-only --provider aws --aws-region "us-west-2" --aws-access-key-id "${AWS_ACCESS_KEY}" --aws-secret-access-key "${AWS_SECRET_KEY}" --aws-service-user-name "${AWS_SERVICE_USER_NAME}" --aws-service-user-role "${AWS_SERVICE_USER_ROLE_ARN}" --aws-vpc-id "${CAPA_VPC}" --aws-subnet-id "${CAPA_SUBNET}"
testcases=("e2e" "conformance" "networkpolicy")
failure=0
for testcase in "${testcases[@]}"; do
Contributor

I think we need to turn off errexit before running the test, and turn on errexit after running the test. For example:

set +e
for testcase in ...; do
  ...
done
set -e
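
A fuller sketch of that pattern, under the assumption that each suite is started by a single command whose exit code signals pass or fail. `run_testcase` is a hypothetical stand-in for the real invocation and `failed` is an illustrative extra; only `testcases` and `failure` come from the diff.

```bash
#!/bin/bash
# Hypothetical stand-in for the real per-suite test command.
run_testcase() {
  case "$1" in
    conformance) return 1 ;;  # simulate one failing suite
    *) return 0 ;;
  esac
}

testcases=("e2e" "conformance" "networkpolicy")
failure=0
failed=()

set +e  # a failing suite must not abort the remaining suites
for testcase in "${testcases[@]}"; do
  run_testcase "${testcase}"
  rc=$?
  if [[ ${rc} -ne 0 ]]; then
    failure=1
    failed+=("${testcase}")
  fi
done
set -e  # restore fail-fast for the steps that follow

echo "failure=${failure} failed=${failed[*]}"
```

With errexit left on, the first failing suite would end the script before the later suites and the cleanup step get a chance to run.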

if [[ $result == 124 ]]; then
echo "Error: Clean up job of ${clustername} timeout"
fi
if [[ $result == 124 ]]; then
Contributor

Not sure why there are two result == 124 conditions?
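
For reference, a sketch of checking the exit code once, assuming the cleanup runs under GNU coreutils `timeout`, which exits with 124 when the time limit is hit. `cleanup_with_timeout` is a hypothetical wrapper, not the function used in the PR.

```bash
#!/bin/bash
# cleanup_with_timeout LIMIT CMD...
# Hypothetical wrapper: runs CMD under GNU timeout and reports the outcome once.
cleanup_with_timeout() {
  local limit="$1"
  shift
  local result=0
  # "|| result=$?" records the exit code without tripping errexit.
  timeout "${limit}" "$@" || result=$?
  if [[ ${result} -eq 124 ]]; then
    echo "Error: clean up job timed out after ${limit}"
  elif [[ ${result} -ne 0 ]]; then
    echo "Error: clean up job failed with exit code ${result}"
  fi
  return "${result}"
}
```

A single `if`/`elif` chain keeps the timeout case and other failures distinct, so a second `result == 124` check is not needed.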

branches:
- '*/main'
included_regions: []
cron: '' # 'H H * * *'
Contributor

Can you explain why it's '' # 'H H * * *' ? I thought it should be 'H H * * *'. Same question for cron on line 1115.

- v1.23.1
- v1.24.1
- v1.25.1
- v1.26.1
Contributor

Are K8s v1.27 and v1.28 supported? Maybe we can change this to v1.25-v1.28.

shift 2
if [ "$provider" = "aws" ]; then
SSH_USERNAME=ubuntu
SKIP_LIST="TestEgress|TestProxy|TestProxyHairpinIPv4"
Contributor

May I know why these test cases need to be skipped? If they do need to be skipped, maybe add a short comment here to explain why.

sed -i "s/SSHAUTHORIZEDKEYS/default/g" ${GIT_CHECKOUT_DIR}/jenkins/out/cluster.yaml
sed -i "s/CLUSTERNAMESPACE/${CLUSTER}/g" ${GIT_CHECKOUT_DIR}/jenkins/out/namespace.yaml

sleep 15s
Contributor

A bit confused, too. When is the initialization of the management cluster triggered or started? I thought the management cluster was created in advance and should be ready before any workload clusters are created.
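
If the `sleep 15s` is there to wait for something to become ready, an explicit poll would make the intent visible. A minimal sketch, where `wait_until` is a hypothetical helper and the readiness probe is left open because the PR does not say what the sleep is waiting for:

```bash
#!/bin/bash
# wait_until ATTEMPTS DELAY CMD...
# Hypothetical helper: re-runs CMD until it succeeds or ATTEMPTS is exhausted.
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0
    fi
    # No point sleeping after the final failed attempt.
    if ((i < attempts)); then
      sleep "${delay}"
    fi
  done
  return 1
}

# Possible use, with some_readiness_probe standing in for the real check:
#   wait_until 20 15 some_readiness_probe
```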

export AWS_CONTROLLER_IAM_ROLE=$AWS_SERVICE_USER_ROLE_ARN
clusterctl delete --infrastructure aws
export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile --region $AWS_REGION)
clusterctl init --infrastructure aws
Contributor

Can you explain why it needs to run clusterctl delete and init here? Is it because the temporary assumed role has expired?

Contributor Author

yes

export AWS_B64ENCODED_CREDENTIALS=$(clusterawsadm bootstrap credentials encode-as-profile --region $AWS_REGION)
clusterctl init --infrastructure aws

kubectl delete cluster ${CLUSTER} -n ${CLUSTER}
Contributor

After deleting the cluster, can it traverse through all kinds of resources in the cluster template, run kubectl delete --all $KIND -n ${CLUSTER} for each, and finally delete the namespace? If this still cannot delete the EC2 and LB instances, it can try deleting them via the ec2 and elb commands.
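
A rough sketch of that traversal, assuming the rendered template is a multi-document YAML file whose top-level `kind:` lines name the resource kinds. `template_kinds` and `delete_cluster_resources` are hypothetical helpers; the deletion order (alphabetical here) may need adjusting if finalizers require a specific sequence.

```bash
#!/bin/bash
# template_kinds FILE
# Hypothetical helper: prints the distinct top-level "kind:" values in FILE.
template_kinds() {
  awk '/^kind:/ {print $2}' "$1" | sort -u
}

# delete_cluster_resources TEMPLATE NAMESPACE
# Hypothetical helper: deletes every object of each kind, then the namespace.
delete_cluster_resources() {
  local template="$1" namespace="$2" kind
  for kind in $(template_kinds "${template}"); do
    kubectl delete "${kind}" --all -n "${namespace}"
  done
  kubectl delete namespace "${namespace}" --ignore-not-found
}

# Possible use after "kubectl delete cluster":
#   delete_cluster_resources "${GIT_CHECKOUT_DIR}/jenkins/out/cluster.yaml" "${CLUSTER}"
```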

aws elb delete-load-balancer --load-balancer-name ${loadbalancer_name}
sleep 90s

echo "=== Cleaning up Security Groups ==="
Contributor

Can it keep the security groups? If several clusters are created at the same time, a security group may be in use by other clusters.
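
One possible middle ground is to delete only the groups that belong to this cluster. The sketch below assumes CAPA tags the security groups it creates with `sigs.k8s.io/cluster-api-provider-aws/cluster/<cluster-name>=owned`; that tag should be verified on the actual resources before relying on it. Both functions are hypothetical helpers, not code from the PR.

```bash
#!/bin/bash
# cluster_security_groups CLUSTER REGION
# Hypothetical helper: prints the IDs of security groups tagged as owned by CLUSTER.
# The tag key is an assumption about how CAPA labels its resources.
cluster_security_groups() {
  local cluster="$1" region="$2"
  aws ec2 describe-security-groups \
    --region "${region}" \
    --filters "Name=tag:sigs.k8s.io/cluster-api-provider-aws/cluster/${cluster},Values=owned" \
    --query 'SecurityGroups[].GroupId' \
    --output text
}

# delete_cluster_security_groups CLUSTER REGION
# Hypothetical helper: deletes only the groups returned above.
delete_cluster_security_groups() {
  local cluster="$1" region="$2" sg
  for sg in $(cluster_security_groups "${cluster}" "${region}"); do
    aws ec2 delete-security-group --region "${region}" --group-id "${sg}"
  done
}
```

Groups used by other clusters carry a different cluster name in the tag key, so they would not match the filter.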

@rajnkamr rajnkamr removed this from the Antrea v1.15 release milestone Dec 22, 2023
@rajnkamr rajnkamr added this to the Antrea v1.16 release milestone Jan 3, 2024
@XinShuYang
Contributor

Our cloud CI has been migrated to jenkins.antrea.io. Please continue verifying the job changes there.

@luolanzone
Contributor

@jainpulkit22 are you still actively working on this? Please estimate your bandwidth and make sure you deliver this in Antrea 2.0, or remove it from the milestone if it's not a must-have.
cc @XinShuYang @rajnkamr

@rajnkamr
Contributor

@luolanzone,
This is a good-to-have candidate and is aligned with the cloud Jenkins goals with respect to CAPA. We are considering it for 2.0.
It was initially part of 1.15 but was moved out.

@luolanzone luolanzone removed this from the Antrea v2.0 release milestone Apr 16, 2024
@rajnkamr rajnkamr added this to the Antrea v2.1 release milestone May 3, 2024
@rajnkamr rajnkamr changed the title Add CI for testing compatibility of antrea with previous four K8s versions Add CI for testing antrea compatibility with 4 K8s versions on CAPA May 16, 2024
@rajnkamr rajnkamr removed this from the Antrea v2.1 release milestone May 31, 2024
@jainpulkit22
Contributor Author

Paused for now, shifting to CAPV testbeds or GCP testbeds.

Labels
area/provider/aws Issues or PRs related to aws provider. area/test/infra Issues or PRs related to test infrastructure (Jenkins configuration, Ansible playbook, Kind wrappers
5 participants