Add periodic conformance test run against cloud-provider-gcp using kubetest2 #18714

Merged

michaelmdresser (Contributor) commented Aug 6, 2020

Adds a new periodic job that runs the conformance ginkgo suite against the cloud-provider-gcp repository. The job is based on ci-kubernetes-gce-conformance-latest; the main difference is that it uses the new kubetest2 GCE deployer instead of kubetest. Because cloud-provider-gcp does not publish release tars, the job has to clone and build the cloud-provider-gcp repository. The job also omits the forking and alerting configured in the job it is based on, because it needs to be validated in production prow before meaningful decisions can be made from its results.

Additional note: due to #18706, this job uses KRTE instead of kubekins-e2e as the test runner.

Tested in a GKE cluster using mkpj and mkpod.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 6, 2020
@k8s-ci-robot k8s-ci-robot added area/config Issues or PRs related to code in /config area/jobs area/provider/aws Issues or PRs related to aws provider sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Aug 6, 2020
    preset-service-account: "true"
  spec:
    containers:
    - image: gcr.io/k8s-testimages/krte:v20200806-28035c4-1.16
Contributor Author:
I'm not sure which tag of this KRTE release I should be using. This is the postsubmit that built it.

Contributor:
We need your fix, so this is fine, but eventually we should use the auto-generated tag.

@@ -0,0 +1,32 @@
periodics:
- interval: 3h
Contributor Author:
Maybe this interval should be longer. Conformance tests take a while.

Contributor:
3h seems fine, that's also the cadence for the existing job.

go get sigs.k8s.io/kubetest2@latest;
go get sigs.k8s.io/kubetest2/kubetest2-gce@latest;
go get sigs.k8s.io/kubetest2/kubetest2-tester-ginkgo@latest;
kubetest2 gce -v 2 --repo-root $REPO_ROOT --build --up --down --test=ginkgo -- --focus-regex='\[Conformance\]' --parallel=30
Contributor Author:
The original job doesn't set a parallelism level for the tester. Setting one speeds up the tests. I would appreciate perspective on potential issues this could cause.

Contributor:
Not all jobs can safely run in parallel; doing so usually causes failures that don't appear when they run serially.
Let's stick to serial for now, or we can add a separate job for parallel runs.

Contributor Author:
Dropped the --parallel flag.
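
With the flag dropped, the tester runs the Conformance suite serially. A minimal sketch of how the container command might then look, assuming a plain bash wrapper (the actual entrypoint of the merged job is not shown in this thread) and assuming REPO_ROOT is set earlier in the script:

```yaml
# Sketch only: the bash wrapper is an assumption, not taken from the merged job.
command:
- /bin/bash
- "-c"
- |
  # Same kubetest2 invocation as in the diff above, minus --parallel=30,
  # so ginkgo runs the Conformance specs serially.
  kubetest2 gce -v 2 --repo-root "$REPO_ROOT" --build --up --down \
    --test=ginkgo -- --focus-regex='\[Conformance\]'
```

Keeping the serial run as the baseline makes it easier to tell later whether a failure in a separate parallel job comes from parallelism or from the provider itself.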

amwat (Contributor) left a comment:

Really excited to see this.

/cc @cheftako @cici37

  cluster: k8s-infra-prow-build
  name: ci-cloud-provider-gcp-conformance-latest
  annotations:
    testgrid-tab-name: Conformance - Cloud Provider GCP - master
Contributor:

Contributor Author:
Added, hopefully that's the correct syntax.
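
The annotation that was added is not shown in this thread. As a rough sketch, a Prow job is usually placed on a TestGrid dashboard with the testgrid-dashboards annotation next to testgrid-tab-name. The dashboard name below is an assumption, taken from the TestGrid URL quoted later in this conversation:

```yaml
annotations:
  # Assumed dashboard name (from the testgrid.k8s.io URL later in the thread).
  testgrid-dashboards: provider-gcp-periodics
  # Tab name as shown in the diff context above.
  testgrid-tab-name: Conformance - Cloud Provider GCP - master
```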

@k8s-ci-robot k8s-ci-robot added area/provider/gcp Issues or PRs related to gcp provider area/testgrid labels Aug 6, 2020
set -o nounset;
set -o pipefail;
set -o xtrace;
git clone https://github.com/kubernetes/cloud-provider-gcp;
Contributor:
Can we use extra_refs instead of cloning manually with git?

Contributor Author:
Periodics don't get access to extra_refs. I believe this is also indicated by the fact that there are no clonerefs containers in periodic job specs.

Contributor:
I'm not sure that's true. extra_refs is a ProwJob field, so it's independent of whether the job is a presubmit or a periodic.
We use it in a bunch of jobs:

extra_refs:
- org: kubernetes-sigs
  repo: kind
  base_ref: master
  path_alias: sigs.k8s.io/kind

Contributor Author:
Hmm. I tried adding it to my periodic, regenerating it, and then looking for the repo and couldn't find it.

Contributor:
you'll need decorate: true

Contributor Author:
decorate: true solves this problem. Updated the PR to use extra_refs to get the cloud-provider-gcp repo.
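
For reference, a hedged sketch of what a periodic using extra_refs can look like. The job name, cluster, interval, and image come from the diff context in this thread; base_ref and path_alias for cloud-provider-gcp are assumptions, and the container command is omitted because the final script is not shown here:

```yaml
periodics:
- interval: 3h
  name: ci-cloud-provider-gcp-conformance-latest
  cluster: k8s-infra-prow-build
  # Pod utilities (including clonerefs) are only added when decorate is true;
  # without it, extra_refs are not cloned into the test container.
  decorate: true
  extra_refs:
  - org: kubernetes
    repo: cloud-provider-gcp
    base_ref: master                        # assumption
    path_alias: k8s.io/cloud-provider-gcp   # assumption
  spec:
    containers:
    - image: gcr.io/k8s-testimages/krte:v20200806-28035c4-1.16
```

This matches the behavior reported above: the repository only appeared in the job once decoration was enabled.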

cd cloud-provider-gcp;
REPO_ROOT=$(git rev-parse --show-toplevel);
cd;
go get -u github.com/onsi/ginkgo/ginkgo;
Contributor:
FYI, the ginkgo binary also exists in the same tarball as e2e.test.
Let's start with this. As a follow-up, we can add logic inside the tester to use that binary and then remove this step from here.

Contributor Author:
Good idea

amwat (Contributor) left a comment:

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2020
amwat (Contributor) commented Aug 7, 2020

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Aug 7, 2020
amwat (Contributor) commented Aug 7, 2020

/cc @BenTheElder @cheftako
approval for adding the dashboard

Comment on lines 22 to 26
        cpu: 2000m
        memory: 2G
      requests:
        cpu: 2000m
        memory: 2G
Member:
The closest analogue I would have for this is maybe pull-kubernetes-e2e-gce, which is currently at cpu: 4, memory: 14Gi.

Can be tuned once job is running regularly and can be inspected.
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2020
amwat (Contributor) commented Aug 7, 2020

/lgtm
@spiffxp can you approve the dashboard changes as well?

spiffxp (Member) left a comment:

/approve
for the testgrid changes as well
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 7, 2020
k8s-ci-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: amwat, michaelmdresser, spiffxp

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 7, 2020
@k8s-ci-robot k8s-ci-robot merged commit 78767f2 into kubernetes:master Aug 7, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Aug 7, 2020
k8s-ci-robot (Contributor):

@michaelmdresser: Updated the job-config configmap in namespace default at cluster default using the following files:

  • key cloud-provider-gcp-periodics.yaml using file config/jobs/kubernetes/cloud-provider-gcp/cloud-provider-gcp-periodics.yaml

liggitt (Member) commented Aug 17, 2020

it looks like this hasn't passed since it was added: https://testgrid.k8s.io/provider-gcp-periodics#Conformance%20-%20Cloud%20Provider%20GCP%20-%20master&width=5

the failing tests look exclusively connected to the new APIs added in 1.19 (ingress v1, ingressclass v1, certificatesigningrequests v1)... what configuration is this running with, and are we sure it's running against the 1.19 artifacts?

liggitt (Member) commented Aug 17, 2020

opened kubernetes/kubernetes#94051 to track
