E2E tests need to verify that we can submit a TFJob #207

Closed
jlewi opened this issue Feb 5, 2018 · 1 comment

@jlewi
Contributor

jlewi commented Feb 5, 2018

We need an E2E test that verifies we can actually submit a TFJob after deploying Kubeflow.

The lack of such tests is why things are breaking, e.g. when the ksonnet config gets out of sync with the actual image.
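
To make the shape of such a test concrete, here is a minimal sketch of a submit-and-wait check. It is not the test from tensorflow/k8s: `submit` and `get_phase` are hypothetical callables that would wrap whatever client the test uses, and the phase names `Succeeded` and `Failed` are assumptions for illustration.

```python
import time


def run_tfjob_test(submit, get_phase, timeout_s=600, poll_s=10, sleep=time.sleep):
    """Submit a job and poll until it reaches a terminal phase.

    submit:    hypothetical callable that creates the TFJob and returns its name.
    get_phase: hypothetical callable mapping a job name to its current phase.
    Returns True if the job succeeded, False if it failed.
    Raises TimeoutError if no terminal phase is seen within timeout_s.
    """
    name = submit()
    waited = 0
    while waited <= timeout_s:
        phase = get_phase(name)
        # "Succeeded" and "Failed" are assumed terminal phase names.
        if phase in ("Succeeded", "Failed"):
            return phase == "Succeeded"
        sleep(poll_s)
        waited += poll_s
    raise TimeoutError(f"TFJob {name} did not finish within {timeout_s}s")
```

Passing the client calls in as arguments keeps the polling logic testable without a cluster; a real test would supply functions that talk to the Kubernetes API.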

I think the issue here is how to organize the tests so we don't end up duplicating code between tensorflow/k8s and kubeflow/kubeflow.

Here's my current thinking:

  • E2E tests for TFJob should be defined in tensorflow/k8s.
  • Kubeflow should check out a copy of tensorflow/k8s and run one or more of the tests.

We can do this as follows:

  • Modify checkout.sh so that for kubeflow/kubeflow we check out tensorflow/k8s.
  • Add appropriate steps to our E2E workflow to run the tensorflow/k8s steps.
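
The checkout step could look roughly like the sketch below. It is written in Python rather than as a change to checkout.sh, and the `EXTRA_REPOS` mapping, the `clone_commands` helper, and the `/src` layout are all hypothetical names chosen for illustration.

```python
# Hypothetical mapping: repo under test -> extra repos its E2E workflow needs.
EXTRA_REPOS = {
    ("kubeflow", "kubeflow"): [("tensorflow", "k8s")],
}


def clone_commands(repo_owner, repo_name, src_dir="/src"):
    """Return git clone commands for the repo under test plus any extra repos."""
    repos = [(repo_owner, repo_name)]
    repos += EXTRA_REPOS.get((repo_owner, repo_name), [])
    return [
        [
            "git",
            "clone",
            f"https://github.com/{owner}/{name}.git",
            f"{src_dir}/{owner}/{name}",  # assumed directory layout
        ]
        for owner, name in repos
    ]
```

With this, a run for kubeflow/kubeflow yields two clone commands (the repo itself and tensorflow/k8s), while a run for tensorflow/k8s yields one.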
@jlewi jlewi added this to the Kubecon Europe milestone Feb 6, 2018
@jlewi jlewi self-assigned this Feb 6, 2018
jlewi added a commit to jlewi/k8s that referenced this issue Feb 6, 2018
* This will be used as a replacement for using helm.
* The ksonnet template is used to run a K8s job which runs the E2E test;
  ksonnet makes it easy to parameterize the test (e.g. namespace, image).
* This will be used to add an E2E test to our ksonnet repository to
  actually verify we can successfully submit jobs.
  kubeflow/kubeflow#207
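
As an illustration of what parameterizing the test buys us, here is a rough Python equivalent of a template that produces a K8s Job from a namespace and an image. This is not the ksonnet template from the commit; the `e2e_test_job` helper, the container name, and the example values are hypothetical.

```python
def e2e_test_job(name, namespace, image, command):
    """Build a batch/v1 Job manifest that runs the test command once."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Do not retry: a failed test run should surface as a failed Job.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "e2e-test",  # hypothetical container name
                            "image": image,
                            "command": list(command),
                        }
                    ],
                }
            },
        },
    }
```

Calling `e2e_test_job("tfjob-e2e", "kubeflow-test", "example.com/tf-operator:latest", ["python", "run_test.py"])` (all placeholder values) would give a manifest that differs between runs only in the parameters passed in, which is the property the ksonnet template provides.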
jlewi added a commit to kubeflow/training-operator that referenced this issue Feb 12, 2018
* We want to easily run TFJobs corresponding to different TFJob specs (See #373).

* We currently do this with Jinja2 templates and a Python script, test_runner.py, that runs those templates.

* This PR migrates to using ksonnet to define those templates.

* This PR is prework to running some TFJob E2E tests as part of ksonnet tests (kubeflow/kubeflow#207)

Cleanup

* Include smoke_tf.py inside the TFJob operator docker image; no reason we should have to build a separate image just to get that.
@jlewi
Contributor Author

jlewi commented Feb 12, 2018

Fixed by #227

@jlewi jlewi closed this as completed Feb 12, 2018
elenzio9 pushed a commit to arrikto/kubeflow that referenced this issue Oct 31, 2022
added name as contributor