Update the TFJob image to the latest image and tag 0.3 #1608

Merged: 1 commit, Sep 24, 2018
34 changes: 33 additions & 1 deletion docs_dev/releasing.md
@@ -169,6 +169,9 @@ to point to the new image.
Update [workflows.libsonnet](https://github.com/kubeflow/kubeflow/blob/master/testing/workflows/components/workflows.libsonnet#L183)
to checkout kubeflow/tf-operator at the tag corresponding to the release.

**Note** We should make extra_repos and their versions a ksonnet parameter and
set it in prow_config.yaml. We can then set it differently on the release branch.

## Update PyTorchJob
Identify the [release](https://github.com/kubeflow/pytorch-operator/releases) of pytorch-operator you want to use.
* If you need to cut a new PyTorch operator release follow the instructions in [kubeflow/pytorch-operator](https://github.com/kubeflow/pytorch-operator/blob/master/releasing.md)
@@ -271,7 +274,7 @@ If you aren't already working on a release branch (of the form `v${MAJOR}.${MINO
## Updating ksonnet prototypes with docker image

Here is the general process for how we update our Docker prototypes to point to
-the correct Docker image.
+the correct Docker image. See sections below for component specific instructions.

1. Build a Docker image using whatever tagging schema you like

@@ -307,6 +310,35 @@ the correct Docker image.
* IMAGE_PATTERN should be a regex matching the images that you want to add the tag
* Create a PR checking **into master** the changes in image_tags.yaml

### TFJob Operator

1. Identify the docker image in [gcr.io/kubeflow-images-public/tf_operator](https://gcr.io/kubeflow-images-public/tf_operator)

* Docker images are pushed by kubeflow/tf-operator postsubmit jobs
* You should pick an image corresponding to a green postsubmit at the desired
commit

1. Update the entry for **gcr.io/kubeflow-images-public/tf_operator** in [image_tags.yaml](https://github.com/kubeflow/kubeflow/blob/master/releasing/image_tags.yaml#L288)

* Add a version that specifies the sha of the image you want to use and the release
tag you want to add e.g. "vX.Y.Z"

```

```
1. Run the following command to apply the new image tag

```
releasing/run_apply_image_tags.sh .*tf_operator.*:vX.Y.Z
```

* The command needs to be run by someone with write permissions on
gcr.io/kubeflow-images-public

* Typically this will be the release czar but you can also consult
[kubeflow-images-public.iam.policy.yaml](https://github.com/kubeflow/testing/blob/master/release-infra/kubeflow-images-public.iam.policy.yaml)
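
To illustrate the `IMAGE_PATTERN` argument in the command above, here is a minimal Python sketch of how a regex such as `.*tf_operator.*:vX.Y.Z` could select images. It assumes the pattern is matched against full `name:tag` strings, which this diff doesn't confirm; the candidate list is hypothetical and `X.Y.Z` is taken to be `0.3.0`.

```python
import re

# Hypothetical "name:tag" strings for illustration; only the first
# mirrors the image and tag this PR actually introduces.
candidates = [
    "gcr.io/kubeflow-images-public/tf_operator:v0.3.0",
    "gcr.io/kubeflow-images-public/pytorch-operator:v0.3.0",
    "gcr.io/kubeflow-images-public/tf_operator:v0.2.0",
]

# Same shape as the pattern passed to run_apply_image_tags.sh.
# The dots in the version are escaped so "." only matches a literal dot.
pattern = re.compile(r".*tf_operator.*:v0\.3\.0")

# Assumption: the whole "name:tag" string must match the pattern.
matched = [c for c in candidates if pattern.fullmatch(c)]
print(matched)
```

Left unescaped, the dots in `vX.Y.Z` match any character, so a looser pattern can select more tags than intended; escaping them is a cheap safeguard if the script treats the argument as a plain regex.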


### Release branching policy

A release branch should be substantially _feature complete_ with respect to the intended release. Code that is committed to `master` may be merged or cherry-picked on to a release branch, but code that is directly committed to the release branch should be solely applicable to that release (and should not be committed back to master). In general, unless you're committing code that only applies to the release stream (for example, temporary hotfixes, backported security fixes, or image hashes), you should commit to `master` and then merge or cherry-pick to the release branch.
2 changes: 1 addition & 1 deletion kubeflow/core/prototypes/tf-job-operator.jsonnet
@@ -5,7 +5,7 @@
// @param name string Name to give to each of the components
// @optionalParam namespace string null Namespace to use for the components. It is automatically inherited from the environment if not set.
// @optionalParam cloud string null String identifying the cloud to customize the deployment for.
-// @optionalParam tfJobImage string gcr.io/kubeflow-images-public/tf_operator:v20180822-b576c253 The image for the TfJob controller.
+// @optionalParam tfJobImage string gcr.io/kubeflow-images-public/tf_operator:v0.3.0 The image for the TfJob controller.
// @optionalParam tfDefaultImage string null The default image to use for TensorFlow.
// @optionalParam tfJobUiServiceType string ClusterIP The service type for the UI.
// @optionalParam tfJobVersion string v1alpha2 which version of the TFJob operator to use
10 changes: 5 additions & 5 deletions kubeflow/core/tests/tf-job_test.jsonnet
@@ -1,15 +1,15 @@
local tfjob = import "../tf-job-operator.libsonnet";
local paramsv1alpha1 = {
name:: "tf-job-operator",
-tfJobImage:: "gcr.io/kubeflow-images-public/tf_operator:v20180226-403",
+tfJobImage:: "gcr.io/kubeflow-images-public/tf_operator:v0.3.0",
tfDefaultImage:: "null",
deploymentScope:: "cluster",
deploymentNamespace:: "null",
tfJobVersion: "v1alpha1",
};
local paramsv1alpha2 = {
name:: "tf-job-operator",
-tfJobImage:: "gcr.io/kubeflow-images-public/tf_operator:v20180226-403",
+tfJobImage:: "gcr.io/kubeflow-images-public/tf_operator:v0.3.0",
tfDefaultImage:: "null",
deploymentScope:: "cluster",
deploymentNamespace:: "null",
@@ -66,7 +66,7 @@ std.assertEqual(
},
},
],
-image: "gcr.io/kubeflow-images-public/tf_operator:v20180226-403",
+image: "gcr.io/kubeflow-images-public/tf_operator:v0.3.0",
name: "tf-job-operator",
volumeMounts: [
{
@@ -358,7 +358,7 @@ std.assertEqual(
},
},
],
-image: "gcr.io/kubeflow-images-public/tf_operator:v20180226-403",
+image: "gcr.io/kubeflow-images-public/tf_operator:v0.3.0",
name: "tf-job-operator",
volumeMounts: [
{
@@ -415,7 +415,7 @@ std.assertEqual(
},
},
],
-image: "gcr.io/kubeflow-images-public/tf_operator:v20180226-403",
+image: "gcr.io/kubeflow-images-public/tf_operator:v0.3.0",
name: "tf-job-dashboard",
ports: [
{
3 changes: 3 additions & 0 deletions releasing/image_tags.yaml
@@ -290,6 +290,9 @@ images:
- digest: sha256:4f20e349f79059a009ef75aea158ca0c555fcc4a22e7c80a7cb9bff54fbab6c1
tags:
- v0.2.0
- digest: sha256:9007f398a8da9287e4693f7cb01e711c94d1404e8bf91885837fdd5fe3cca35
tags:
- v0.3.0
- name: gcr.io/kubeflow-images-public/pytorch-operator
versions:
- digest: sha256:33aa95a3aa0108d5bc631fa3f8a04e646d4eef08a0e8c4695842f92ef0c79027
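
The jsonnet hunks earlier in this diff replace the same image string in several places. A quick check like the following could help confirm that no older tag is left behind. It is only a sketch: the helper name `stale_tags` and the sample text are made up for illustration, and nothing here is part of the Kubeflow tooling.

```python
import re

# Matches tf_operator image references and captures the tag after the colon.
IMAGE_RE = re.compile(r"gcr\.io/kubeflow-images-public/tf_operator:([\w.\-]+)")


def stale_tags(text, expected_tag):
    """Return the sorted, de-duplicated tf_operator tags in text that differ from expected_tag."""
    return sorted({tag for tag in IMAGE_RE.findall(text) if tag != expected_tag})


# Hypothetical snippet mixing the new release tag with an older dated tag.
sample = '''
tfJobImage:: "gcr.io/kubeflow-images-public/tf_operator:v0.3.0",
image: "gcr.io/kubeflow-images-public/tf_operator:v20180226-403",
'''

print(stale_tags(sample, "v0.3.0"))
```

An empty list means every reference in the scanned text already uses the expected tag; in practice the text would come from reading the prototype and test files rather than an inline string.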