E2E tests need to verify that we can submit a TFJob #207

jlewi · 2018-02-05T23:59:06Z

We need an E2E test that verifies we can actually submit a TFJob after deploying Kubeflow.

The lack of such tests is why things keep breaking, e.g. when the ksonnet config gets out of sync with the actual image.

I think the issue here is how to organize the tests so we don't end up duplicating code between tensorflow/k8s and kubeflow/kubeflow.

Here's my current thinking:

* E2E tests for TFJob should be defined in tensorflow/k8s.
* Kubeflow should check out a copy of tensorflow/k8s and run one or more of the tests.

We can do this as follows:

* Modify checkout.sh so that for kubeflow/kubeflow we check out tensorflow/k8s.
* Add appropriate steps to our E2E workflow to run the tensorflow/k8s steps.
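For illustration, the "submit a TFJob and verify it" step of such a workflow could be sketched roughly as below. This is not the tensorflow/k8s test code: `submit`, `get_phase`, and the phase names `Succeeded`/`Failed` are hypothetical stand-ins for whatever the real test uses to create the job and read its status, and they are injected as callables so the sketch runs without a cluster.

```python
import time
from typing import Callable

# Hypothetical terminal phases; the real TFJob status fields may differ.
TERMINAL_PHASES = ("Succeeded", "Failed")


def wait_for_job(get_phase: Callable[[], str],
                 timeout_s: float = 600.0,
                 poll_s: float = 10.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> str:
    """Poll get_phase() until it reports a terminal phase or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        phase = get_phase()
        if phase in TERMINAL_PHASES:
            return phase
        if clock() >= deadline:
            raise TimeoutError(f"job still in phase {phase!r} after {timeout_s}s")
        sleep(poll_s)


def run_e2e(submit: Callable[[], None],
            get_phase: Callable[[], str],
            **wait_kwargs) -> bool:
    """Submit the job, wait for it to finish, and report whether it succeeded."""
    submit()
    return wait_for_job(get_phase, **wait_kwargs) == "Succeeded"
```

In a real test, `submit` would create the TFJob in the freshly deployed cluster and `get_phase` would read its status back; the test fails if `run_e2e` returns `False` or raises `TimeoutError`.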

* This will be used as a replacement for using helm.
* The ksonnet template is used to run a K8s job which runs the E2E test; ksonnet makes it easy to parameterize the test (e.g. namespace, image).
* This will be used to add an E2E test to our ksonnet repository to actually verify we can successfully submit jobs. kubeflow/kubeflow#207
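To make the parameterization concrete, here is a rough Python sketch of the kind of manifest such a template would produce: a plain Kubernetes `batch/v1` Job whose namespace and image are parameters. It is not the ksonnet template itself; the function name, the container name, and the example image and command are hypothetical.

```python
import json
from typing import Sequence


def e2e_job_manifest(name: str, namespace: str, image: str,
                     command: Sequence[str]) -> dict:
    """Build a batch/v1 Job manifest that runs the E2E test container once."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Do not retry: a failed test run should surface as a failed Job.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "e2e-test",  # hypothetical container name
                            "image": image,
                            "command": list(command),
                        }
                    ],
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    manifest = e2e_job_manifest(
        name="tfjob-e2e",
        namespace="kubeflow-test",
        image="example.com/tf-operator-test:latest",
        command=["python", "run_e2e_test.py"],
    )
    print(json.dumps(manifest, indent=2))
```

Changing the namespace or image then only means changing the arguments, which is the property the ksonnet parameters are meant to provide.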

* We want to easily run TFJobs corresponding to different TFJob specs (see #373).
* We currently do this using Jinja2 templates and then having a Python script, test_runner.py, to run those templates.
* This PR migrates to using ksonnet to define those templates.
* This PR is prework to running some TFJob E2E tests as part of ksonnet tests (kubeflow/kubeflow#207).

Cleanup

* Include smoke_tf.py inside the TFJob operator docker image; no reason we should have to build a separate image just to get that.

jlewi · 2018-02-12T14:40:06Z

Fixed by #227

jlewi added the area/testing label Feb 6, 2018

jlewi added this to the Kubecon Europe milestone Feb 6, 2018

jlewi self-assigned this Feb 6, 2018

jlewi mentioned this issue Feb 6, 2018

Move tensorflow/k8s to kubeflow/tf-operator kubeflow/training-operator#350

Closed

jlewi mentioned this issue Feb 7, 2018

Use ksonnet to easily define TFJobs to be run as tests kubeflow/training-operator#374

Merged

jlewi closed this as completed Feb 12, 2018

elenzio9 pushed a commit to arrikto/kubeflow that referenced this issue Oct 31, 2022

Update org.yaml (kubeflow#207)

cf69bff

added name as contributor
