Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend tests for using images of each PR #6766

Open
kimwnasptd opened this issue Nov 24, 2022 · 12 comments
Open

Extend tests for using images of each PR #6766

kimwnasptd opened this issue Nov 24, 2022 · 12 comments

Comments

@kimwnasptd
Copy link
Member

/kind feature

Why you need this feature:
Right now we have tests for just applying the manifests in a KinD cluster for each component, i.e. https://github.com/kubeflow/kubeflow/blob/master/.github/workflows/jwa_kind_test.yaml

But when the actions are run for a PR they don't actually test the images that would be built from the current PR, but rather the ones already committed. We should instead aim at testing manifests by using images that are built from the code changes of each PR's commits.

This will also allow us to start doing more advanced integration tests, like:

  • Running API tests on the deployed web apps
  • Test that the Controllers can get objects to the desired state

And of course we will improve our testing quality by ensuring each PR results in images that will be working.

Describe the solution you'd like:
The end result I have in mind is to have

  1. A distinct workflow file for publishing the docker images, only when a PR is merged
  2. A distinct workflow file that will be testing on each PR commit that
    1. The docker image can be built
    2. It will spin up a KinD cluster and insert the image we built https://kind.sigs.k8s.io/docs/user/quick-start/#loading-an-image-into-your-cluster
    3. Do a simple sed and update the manifests to use that image
    4. Build and apply the manifests
    5. Wait for the Pods to become Ready

For the first task we'll need to update the existing *_docker_publish.yaml workflows to only get triggered when a PR is merged
https://github.com/kubeflow/kubeflow/blob/master/.github/workflows/jwa_docker_publish.yaml

For the second task we'll need to extend the *_kind_test.yaml files to do the above behavior.

Anything else you would like to add:
Regarding testing that Pods are ready, we've also done similar work on the Manifests repo https://github.com/kubeflow/manifests/blob/master/.github/workflows/jwa_kind_test.yaml#L31

@kimwnasptd
Copy link
Member Author

/good-first-issue
/help
/area testing

@google-oss-prow
Copy link

@kimwnasptd:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.

In response to this:

/good-first-issue
/help
/area testing

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@salilmishra23
Copy link

/assign salilmishra23

@apo-ger
Copy link
Contributor

apo-ger commented Dec 8, 2022

Hi @kimwnasptd, I saw your comments on my PR (extend tests for the Profile Controller)

Since this effort is relative to building/pushing images we'll need to first cherry pick #6548 (to use dockerhub for KF images) to master and then rebase my work (#6820) on top!

As soon as we merge the above two PRs I'll cherry-pick the rest of the missing PRs regarding publishing images when the releasing/version/VERSION file changes. WDYT?

@kimwnasptd
Copy link
Member Author

@apo-ger let's organize the pending items and have a clear view of which of them interact with the PRs of this effort.

My current understanding is that we'll need to ensure we use the DockerHub registry before these tests, since for this effort we are generating and applying the manifests. So #6825 should be merged before #6820 #6799.

Then items for future improvements that we should do, but are orthogonal with using container images in tests are:

  • Using the latest tag in the manifests in master
  • Updating the publish workflows to also publish the latest tags
  • Cherry pick the commits for NOT having a central workflow for building all the images
  • Cherry pick the commits for pushing images with specific tag, if the releasing/version/VERSION file changes

Is this your understanding as well?

(Note, using latest tag can allow us to create more "production ready" e2e tests for the CentralDashboard)

@apo-ger
Copy link
Contributor

apo-ger commented Dec 8, 2022

@kimwnasptd Yes this is my understanding as well!

@apo-ger
Copy link
Contributor

apo-ger commented Dec 9, 2022

/assign apo-ger

@apo-ger
Copy link
Contributor

apo-ger commented Dec 14, 2022

There's an issue with the Login to Dockerhub step in the publish docker image workflows. The step fails with:

Error: Username and password required

This is because the runner cannot access secrets.PASSWORD_DOCKER_TOKEN in the context of a pull_request event as it spawns an unprivileged environment to run the workflow:

A workaround for this is to trigger the workflows ONLY on push events as a merged PR always results in a push. See relative discussion in https://github.com/orgs/community/discussions/26724#discussioncomment-3253093

@apo-ger
Copy link
Contributor

apo-ger commented Dec 14, 2022

@kimwnasptd Also I noticed that we have the following two steps in each docker image publish workflow:

- name: Build multi-arch docker image
  run: |
      cd components/tensorboard-controller
      make docker-build-multi-arch

- name: Build and push multi-arch docker image
  run: |
      cd components/tensorboard-controller
      make docker-build-push-multi-arch

I propose to remove the Build multi-arch docker image step and keep the combined Build and push multi-arch docker image

@apo-ger
Copy link
Contributor

apo-ger commented Jan 18, 2023

We currently don't have any tests for testing that notebook server images can be built when a PR introduces changes. I've prepared a PR to introduce the required workflows which will be triggered on pull_request events.

@apo-ger
Copy link
Contributor

apo-ger commented Jan 30, 2023

During #6916, we found that the integration tests do not work as expected. The issue is with our sed command and how it expects a latest tag, which is not the case when we update the manifests for RCs.

To resolve this we will make use of the kustomize edit image <img>:<img>:<tag> command (https://github.com/kubernetes-sigs/kustomize/blob/master/examples/image.md) to set the appropriate tag in the manifests before they are applied.

@rudrakshkarpe
Copy link

Hello @kimwnasptd and @apo-ger, I'm new to contributing to Kubeflow and I'd like to know if there are any remaining tasks with this issue. I'm eager to take on the task and consider it as an opportunity learn about the Kubeflow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

4 participants