Commit
Add watch function for TFJob python Client API (#1122)
jinchihe authored and k8s-ci-robot committed Jan 2, 2020
1 parent c588dae commit 2228b72
Showing 32 changed files with 229 additions and 175 deletions.
2 changes: 1 addition & 1 deletion pkg/apis/tensorflow/v1/openapi_generated.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/apis/tensorflow/v1/types.go
@@ -1,4 +1,4 @@
-// Copyright 2019 The Kubeflow Authors
+// Copyright 2020 The Kubeflow Authors
 //
 // Licensed under the Apache License, Version 2.0 (the "License");
 // you may not use this file except in compliance with the License.
2 changes: 1 addition & 1 deletion pkg/apis/tensorflow/v1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/apis/tensorflow/v1/zz_generated.defaults.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/clientset.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/doc.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/fake/clientset_generated.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/fake/doc.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/fake/register.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/scheme/doc.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/scheme/register.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/clientset/versioned/typed/tensorflow/v1/doc.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/informers/externalversions/factory.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/informers/externalversions/generic.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/listers/tensorflow/v1/expansion_generated.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pkg/client/listers/tensorflow/v1/tfjob.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 13 additions & 2 deletions sdk/python/docs/TFJobClient.md
@@ -96,7 +96,7 @@ namespace | str | Namespace for tfjob deploying to. If the `namespace` is not de
 object

 ## get
-> get(name=None, namespace=None)
+> get(name=None, namespace=None, watch=False, timeout_seconds=600)
 Get the created tfjob in the specified namespace

@@ -114,7 +114,8 @@ Name | Type | Description | Notes
 ------------ | ------------- | ------------- | -------------
 name | str | The TFJob name. If the `name` is not specified, it will get all tfjobs in the namespace.| Optional. |
 namespace | str | The tfjob's namespace. Defaults to current or default namespace.| Optional |
-
+watch | bool | Watch the created TFJob if `True`, otherwise will return the created TFJob object. Stop watching if TFJob reaches the optional specified `timeout_seconds` or once the TFJob status `Succeeded` or `Failed`. | Optional |
+timeout_seconds | int | Timeout seconds for watching. Defaults to 600. | Optional |

 ### Return type
 object
@@ -180,6 +181,7 @@ object
 > namespace=None,
 > timeout_seconds=600,
 > polling_interval=30,
+> watch=False,
 > status_callback=None):
 Wait for the specified job to finish.
@@ -191,6 +193,14 @@ from kubeflow.tfjob import TFJobClient

 tfjob_client = TFJobClient()
 tfjob_client.wait_for_job('mnist', namespace='kubeflow')
+
+# The API also supports watching the TFJob status till it's Succeeded or Failed.
+tfjob_client.wait_for_job('mnist', namespace=namespace, watch=True)
+NAME STATE TIME
+mnist Created 2019-12-31T09:20:07Z
+mnist Running 2019-12-31T09:20:19Z
+mnist Running 2019-12-31T09:20:19Z
+mnist Succeeded 2019-12-31T09:22:04Z
```

### Parameters
@@ -201,6 +211,7 @@ namespace | str | The tfjob's namespace. Defaults to current or default namespac
 timeout_seconds | int | How long to wait for the job, default wait for 600 seconds. | Optional|
 polling_interval | int | How often to poll for the status of the job.| Optional|
 status_callback | str | Callable. If supplied this callable is invoked after we poll the job. Callable takes a single argument which is the tfjob.| Optional|
+watch | bool | Watch the TFJob if `True`. Stop watching if TFJob reaches the optional specified `timeout_seconds` or once the TFJob status `Succeeded` or `Failed`. | Optional |

 ### Return type
 object
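
Review note: the `watch` behaviour documented in the hunks above (print each state change, then stop once the job is `Succeeded` or `Failed`, or once `timeout_seconds` has elapsed) can be illustrated with a small self-contained sketch. It neither uses the SDK nor talks to a cluster, and it is not the implementation added by this commit. The names `watch_states`, `sample_events` and the `clock` parameter, as well as the column widths, are hypothetical and chosen only for illustration.

```python
import time

# States after which watching stops, as described in the parameter table.
TERMINAL_STATES = {"Succeeded", "Failed"}
ROW_FORMAT = "{:<30}{:<20}{:<30}"  # assumed column widths, not taken from the SDK


def watch_states(events, timeout_seconds=600, clock=time.monotonic):
    """Print (name, state, time) events until a terminal state or the timeout.

    `events` is any iterable of 3-tuples; in a real client it would be a
    stream of updates from the API server. Returns the last state printed,
    or None if nothing was printed.
    """
    deadline = clock() + timeout_seconds
    last_state = None
    print(ROW_FORMAT.format("NAME", "STATE", "TIME"))
    for name, state, timestamp in events:
        if clock() > deadline:
            break  # timed out before this event could be reported
        print(ROW_FORMAT.format(name, state, timestamp))
        last_state = state
        if state in TERMINAL_STATES:
            break  # job finished, stop watching
    return last_state


# Hypothetical event list mirroring the sample output in the docs.
sample_events = [
    ("mnist", "Created", "2019-12-31T09:20:07Z"),
    ("mnist", "Running", "2019-12-31T09:20:19Z"),
    ("mnist", "Succeeded", "2019-12-31T09:22:04Z"),
    ("mnist", "Running", "2019-12-31T09:23:00Z"),  # ignored: watch already stopped
]
final_state = watch_states(sample_events)
```

Passing the clock in as a parameter is only a convenience for exercising the timeout path without sleeping; a real watch would rely on the API server's own timeout handling.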
84 changes: 24 additions & 60 deletions sdk/python/examples/kubeflow-tfjob-sdk.ipynb
@@ -120,13 +120,13 @@
 "text/plain": [
 "{'apiVersion': 'kubeflow.org/v1',\n",
 " 'kind': 'TFJob',\n",
-" 'metadata': {'creationTimestamp': '2019-12-17T05:40:26Z',\n",
+" 'metadata': {'creationTimestamp': '2019-12-31T09:20:07Z',\n",
 " 'generation': 1,\n",
 " 'name': 'mnist',\n",
 " 'namespace': 'default',\n",
-" 'resourceVersion': '13585452',\n",
+" 'resourceVersion': '20125141',\n",
 " 'selfLink': '/apis/kubeflow.org/v1/namespaces/default/tfjobs/mnist',\n",
-" 'uid': 'b9faefd7-208f-11ea-9e34-00000a1001ee'},\n",
+" 'uid': 'bcb3b867-2bae-11ea-8c04-00000a1001ee'},\n",
 " 'spec': {'cleanPodPolicy': 'None',\n",
 " 'tfReplicaSpecs': {'Worker': {'replicas': 1,\n",
 " 'restartPolicy': 'Never',\n",
@@ -166,13 +166,13 @@
 "text/plain": [
 "{'apiVersion': 'kubeflow.org/v1',\n",
 " 'kind': 'TFJob',\n",
-" 'metadata': {'creationTimestamp': '2019-12-17T05:40:26Z',\n",
+" 'metadata': {'creationTimestamp': '2019-12-31T09:20:07Z',\n",
 " 'generation': 1,\n",
 " 'name': 'mnist',\n",
 " 'namespace': 'default',\n",
-" 'resourceVersion': '13585464',\n",
+" 'resourceVersion': '20125155',\n",
 " 'selfLink': '/apis/kubeflow.org/v1/namespaces/default/tfjobs/mnist',\n",
-" 'uid': 'b9faefd7-208f-11ea-9e34-00000a1001ee'},\n",
+" 'uid': 'bcb3b867-2bae-11ea-8c04-00000a1001ee'},\n",
 " 'spec': {'cleanPodPolicy': 'None',\n",
 " 'tfReplicaSpecs': {'Worker': {'replicas': 1,\n",
 " 'restartPolicy': 'Never',\n",
@@ -183,14 +183,14 @@
 " '--batch_size=150'],\n",
 " 'image': 'gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0',\n",
 " 'name': 'tensorflow'}]}}}}},\n",
-" 'status': {'conditions': [{'lastTransitionTime': '2019-12-17T05:40:26Z',\n",
-" 'lastUpdateTime': '2019-12-17T05:40:26Z',\n",
+" 'status': {'conditions': [{'lastTransitionTime': '2019-12-31T09:20:07Z',\n",
+" 'lastUpdateTime': '2019-12-31T09:20:07Z',\n",
 " 'message': 'TFJob mnist is created.',\n",
 " 'reason': 'TFJobCreated',\n",
 " 'status': 'True',\n",
 " 'type': 'Created'}],\n",
 " 'replicaStatuses': {'Worker': {}},\n",
-" 'startTime': '2019-12-17T05:40:26Z'}}"
+" 'startTime': '2019-12-31T09:20:09Z'}}"
 ]
 },
 "execution_count": 5,
@@ -217,7 +217,7 @@
 {
 "data": {
 "text/plain": [
-"'Running'"
+"'Created'"
 ]
 },
 "execution_count": 6,
@@ -242,57 +242,19 @@
 "metadata": {},
 "outputs": [
 {
-"data": {
-"text/plain": [
-"{'apiVersion': 'kubeflow.org/v1',\n",
-" 'kind': 'TFJob',\n",
-" 'metadata': {'creationTimestamp': '2019-12-17T05:40:26Z',\n",
-" 'generation': 1,\n",
-" 'name': 'mnist',\n",
-" 'namespace': 'default',\n",
-" 'resourceVersion': '13586024',\n",
-" 'selfLink': '/apis/kubeflow.org/v1/namespaces/default/tfjobs/mnist',\n",
-" 'uid': 'b9faefd7-208f-11ea-9e34-00000a1001ee'},\n",
-" 'spec': {'cleanPodPolicy': 'None',\n",
-" 'tfReplicaSpecs': {'Worker': {'replicas': 1,\n",
-" 'restartPolicy': 'Never',\n",
-" 'template': {'spec': {'containers': [{'command': ['python',\n",
-" '/var/tf_mnist/mnist_with_summaries.py',\n",
-" '--log_dir=/train/logs',\n",
-" '--learning_rate=0.01',\n",
-" '--batch_size=150'],\n",
-" 'image': 'gcr.io/kubeflow-ci/tf-mnist-with-summaries:1.0',\n",
-" 'name': 'tensorflow'}]}}}}},\n",
-" 'status': {'completionTime': '2019-12-17T05:42:19Z',\n",
-" 'conditions': [{'lastTransitionTime': '2019-12-17T05:40:26Z',\n",
-" 'lastUpdateTime': '2019-12-17T05:40:26Z',\n",
-" 'message': 'TFJob mnist is created.',\n",
-" 'reason': 'TFJobCreated',\n",
-" 'status': 'True',\n",
-" 'type': 'Created'},\n",
-" {'lastTransitionTime': '2019-12-17T05:40:36Z',\n",
-" 'lastUpdateTime': '2019-12-17T05:40:36Z',\n",
-" 'message': 'TFJob mnist is running.',\n",
-" 'reason': 'TFJobRunning',\n",
-" 'status': 'False',\n",
-" 'type': 'Running'},\n",
-" {'lastTransitionTime': '2019-12-17T05:42:19Z',\n",
-" 'lastUpdateTime': '2019-12-17T05:42:19Z',\n",
-" 'message': 'TFJob mnist successfully completed.',\n",
-" 'reason': 'TFJobSucceeded',\n",
-" 'status': 'True',\n",
-" 'type': 'Succeeded'}],\n",
-" 'replicaStatuses': {'Worker': {'succeeded': 1}},\n",
-" 'startTime': '2019-12-17T05:40:26Z'}}"
-]
-},
-"execution_count": 7,
-"metadata": {},
-"output_type": "execute_result"
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"NAME STATE TIME \n",
+"mnist Created 2019-12-31T09:20:07Z \n",
+"mnist Running 2019-12-31T09:20:19Z \n",
+"mnist Running 2019-12-31T09:20:19Z \n",
+"mnist Succeeded 2019-12-31T09:22:04Z \n"
+]
 }
 ],
 "source": [
-"tfjob_client.wait_for_job('mnist', namespace=namespace)"
+"tfjob_client.wait_for_job('mnist', namespace=namespace, watch=True)"
 ]
 },
 {
@@ -305,7 +267,9 @@
 {
 "cell_type": "code",
 "execution_count": 8,
-"metadata": {},
+"metadata": {
+"scrolled": true
+},
 "outputs": [
 {
 "data": {
@@ -344,7 +308,7 @@
 " 'details': {'name': 'mnist',\n",
 " 'group': 'kubeflow.org',\n",
 " 'kind': 'tfjobs',\n",
-" 'uid': 'b9faefd7-208f-11ea-9e34-00000a1001ee'}}"
+" 'uid': 'bcb3b867-2bae-11ea-8c04-00000a1001ee'}}"
 ]
 },
 "execution_count": 9,