Estimator e2etest #814
/uncc @ankushagarwal
4009416 to 7928851
author Richard Liu <ricliu@google.com> 1535586798 -0700
committer Richard Liu <ricliu@google.com> 1536118792 -0700
Modify dockerfile
Add runconfig to server
Remove test env var
Remove test env var
Add workflow
Dump response as json
Test sending runconfig rpc
Fix errors
Fix param parsing
Fix errors
Verify before waiting for job to end
Fixes
Fix lint error
Fix docker image
Verify runconfig
Fix pylint
Fix error message
Make test fail
Fix docker image version
Fix test
Fix a few things
58e5ce1 to b4b3ac4
/assign @jlewi
/lgtm
/assign @jlewi
nit: @richardsliu please reference issues in the PR description, e.g. #762.
terminate_replica(masterHost, namespace, full_target)

# TODO(richardsliu):
# There are lots of verifications in this file, consider refactoring them.
Yes; I would suggest creating a separate test driver for this test, e.g. like
https://github.com/kubeflow/tf-operator/blob/master/py/test_invalid_job.py
But it's fine to do this later.
containers: [
  {
    name: "tensorflow",
    image: "gcr.io/kubeflow-images-staging/tf-operator-test-server:v20180904-7d89548b",
nit: Make the image a parameter so you can easily update the image of the test server.
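As a rough illustration of the nit above, the idea of lifting the image into a parameter with a default can be sketched in Python. This is not the component's actual jsonnet; the function name `tensorflow_container` and the constant `DEFAULT_TEST_SERVER_IMAGE` are hypothetical, and only the image string itself is taken from the diff.

```python
# Hypothetical sketch: the image becomes a parameter with a default, so
# updating the test server only means passing (or changing) one value.
DEFAULT_TEST_SERVER_IMAGE = (
    "gcr.io/kubeflow-images-staging/tf-operator-test-server:v20180904-7d89548b"
)


def tensorflow_container(image=DEFAULT_TEST_SERVER_IMAGE):
    """Build the container entry for the test server with a configurable image."""
    return {
        "name": "tensorflow",
        "image": image,
    }


if __name__ == "__main__":
    # Default image, then an override with a made-up tag for illustration.
    print(tensorflow_container())
    print(tensorflow_container(image="example.invalid/test-server:some-tag"))
```

The same shape would apply in the jsonnet component: read the image from a params entry and fall back to the pinned tag when none is supplied.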
There are 3 different cases we want to test.
For cases 1 and 2 we already have
So maybe we could just add the TFConfig verification to those jobs? For 3 we could add a similar workflow.
@richardsliu I know you're headed OOO, so if you can't address the above before you do, just go ahead and submit this PR (cancel the hold) as is. This will make it easier for someone else to make the additional changes in a follow-on PR.
/lgtm
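The TFConfig verification mentioned above could be sketched roughly as follows. This is a minimal, hypothetical helper and not the check implemented in this PR: it assumes the value under test is available as a `TF_CONFIG`-style JSON string with `cluster` and `task` keys, and the function name `verify_tf_config` and the host names in the usage example are made up for illustration.

```python
import json


def verify_tf_config(raw_tf_config, expected_cluster, task_type, task_index):
    """Compare a TF_CONFIG JSON string against the expected values.

    Returns a list of mismatch messages; an empty list means everything matched.
    """
    config = json.loads(raw_tf_config)
    errors = []

    # The cluster spec maps each replica type to its list of host:port strings.
    actual_cluster = config.get("cluster", {})
    if actual_cluster != expected_cluster:
        errors.append(
            "cluster mismatch: expected %r, got %r" % (expected_cluster, actual_cluster)
        )

    # The task entry identifies which replica this process is.
    task = config.get("task", {})
    if task.get("type") != task_type:
        errors.append(
            "task type mismatch: expected %r, got %r" % (task_type, task.get("type"))
        )
    if task.get("index") != task_index:
        errors.append(
            "task index mismatch: expected %r, got %r" % (task_index, task.get("index"))
        )

    return errors


if __name__ == "__main__":
    # Hypothetical host names; real ones depend on how the operator names services.
    cluster = {"worker": ["job-worker-0:2222", "job-worker-1:2222"]}
    raw = json.dumps({"cluster": cluster, "task": {"type": "worker", "index": 1}})
    print(verify_tf_config(raw, cluster, "worker", 1))
```

Returning a list of messages rather than raising on the first mismatch lets a shared helper report every problem at once, which fits the idea of adding the same verification to several existing jobs.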
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: jlewi
Fixes #762
@jlewi Thanks. These are good suggestions but require a bit of refactoring. I'll address these when I come back.
/hold cancel
@richardsliu: The following test failed, say
Related to #762