Tooling to make it easier to tag images and update the ksonnet prototypes (kubeflow#1066)

* Provide some Python scripts to match images and then apply a tag to
  them like "v0.2.0". This makes it possible to easily apply a new release
  tag to a set of images like the Jupyter images.

* Create a shell script to use sed and other tools to update images
  in our ksonnet prototypes.

* Add instructions for doing this.

* I used the scripts to add the v0.2.0 tag to our Jupyter images.

Related to kubeflow#1060
jlewi authored and k8s-ci-robot committed Jun 22, 2018
1 parent 61adfb4 commit 8366029
Showing 8 changed files with 510 additions and 24 deletions.
34 changes: 32 additions & 2 deletions docs_dev/releasing.md
@@ -6,7 +6,7 @@
- [Create Release Workflow](#create-release-workflow)
- [Update Release Config](#update-release-config)

-- [Manual Release Kubeflow](#manual-release-kubeflow)
+- [Release Kubeflow](#release-kubeflow)
- [Authenticate to GCP](#authenticate-to-gcp)
- [Update TFJob](#update-tfjob)
- [Build TF Serving Images](#build-tf-serving-images)
@@ -83,7 +83,7 @@ A prototype would be:

Your images will be auto released everyday.

-# Example: Manual Release Kubeflow
+# Release Kubeflow

Some preliminary instructions for how to cut a release.

@@ -223,6 +223,36 @@ If you aren't already working on a release branch (of the form `v${MAJOR}.${MINO
2. they allow sophisticated users to track the development of a release (by using the release branch as a `ksonnet` registry), and
3. they simplify backporting critical bugfixes to a patch-level release on a particular release stream (e.g., producing a `v0.1.1` from `v0.1-branch`), when appropriate.

## Updating ksonnet prototypes with docker image

Here is the general process for updating our ksonnet prototypes to point to
the correct Docker image.

1. Build a Docker image using whatever tagging scheme you like

* The general convention is v${DATE}-${COMMIT}

1. On the **release branch**, update all references to images that will be updated as part
of the release to use the tag v${RELEASE}, where ${RELEASE} is the next release

* e.g., if the next RC is v0.2.1-RC.0 then you would use the tag v0.2.1
* You can modify and then run the script `releasing/update_ksonnet.sh` to update
the prototypes

1. Update [image_tags.yaml](https://github.com/kubeflow/kubeflow/blob/master/releasing/image_tags.yaml) **on the master branch**

* You can do this by updating and then running **update_image_tags.sh**
* This invokes some Python scripts that use regexes to match
images and apply a tag to them
* You can use suitable regexes to select a group of images (e.g. all the
notebook images).
* There should be an entry for every image you want to use, referenced by the sha of the image
* If there was a previous release using an earlier image, remove the tag v${RELEASE}
from that entry
* Add the tag v${RELEASE} to the newly added image
* Run apply_image_tags.py
* Create a PR checking the changes in image_tags.yaml **into master**
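
As a rough illustration of what these steps operate on, here is a minimal sketch of the data the releasing scripts appear to expect in image_tags.yaml. The layout is inferred from the keys the scripts read (`images`, `name`, `versions`, `digest`, `tags`), not from the file itself; the digest is a placeholder and `tagged_digests` is a hypothetical helper.

```python
# Hypothetical in-memory form of image_tags.yaml, inferred from the keys the
# releasing scripts read. The digest below is a placeholder, not a real sha.
config = {
  "images": [
    {
      "name": "gcr.io/kubeflow-images-public/tensorflow-1.8.0-notebook-cpu",
      "versions": [
        {
          "digest": "sha256:" + "0" * 64,
          "tags": ["v20180619-c79194b3", "v0.2.0"],
        },
      ],
    },
  ],
}


def tagged_digests(config, tag):
  """Return {image name: digest} for every version carrying `tag`."""
  result = {}
  for image in config["images"]:
    for version in image["versions"]:
      if tag in version["tags"]:
        result[image["name"]] = version["digest"]
  return result


print(tagged_digests(config, "v0.2.0"))
```

Each image is pinned by digest, and a release tag such as v0.2.0 is just one more entry in the `tags` list of the chosen version.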

### Release branching policy

A release branch should be substantially _feature complete_ with respect to the intended release. Code that is committed to `master` may be merged or cherry-picked on to a release branch, but code that is directly committed to the release branch should be solely applicable to that release (and should not be committed back to master). In general, unless you're committing code that only applies to the release stream (for example, temporary hotfixes, backported security fixes, or image hashes), you should commit to `master` and then merge or cherry-pick to the release branch.
22 changes: 11 additions & 11 deletions kubeflow/core/kubeform_spawner.py
@@ -14,16 +14,16 @@ def _options_form_default(self):
<label for='image'>Image</label>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<input list="image" name="image" placeholder='repo/image:tag'>
<datalist id="image">
-<option value="{0}/{1}/tensorflow-1.4.1-notebook-cpu:v20180607-476e150e">
-<option value="{0}/{1}/tensorflow-1.4.1-notebook-gpu:v20180607-476e150e">
-<option value="{0}/{1}/tensorflow-1.5.1-notebook-cpu:v20180607-476e150e">
-<option value="{0}/{1}/tensorflow-1.5.1-notebook-gpu:v20180607-476e150e">
-<option value="{0}/{1}/tensorflow-1.6.0-notebook-cpu:v20180607-476e150e">
-<option value="{0}/{1}/tensorflow-1.6.0-notebook-gpu:v20180607-476e150e">
-<option value="{0}/{1}/tensorflow-1.7.0-notebook-cpu:v20180607-476e150e">
-<option value="{0}/{1}/tensorflow-1.7.0-notebook-gpu:v20180607-476e150e">
-<option value="{0}/{1}/tensorflow-1.8.0-notebook-cpu:v20180607-476e150e">
-<option value="{0}/{1}/tensorflow-1.8.0-notebook-gpu:v20180607-476e150e">
+<option value="{0}/{1}/tensorflow-1.4.1-notebook-cpu:v0.2.0">
+<option value="{0}/{1}/tensorflow-1.4.1-notebook-gpu:v0.2.0">
+<option value="{0}/{1}/tensorflow-1.5.1-notebook-cpu:v0.2.0">
+<option value="{0}/{1}/tensorflow-1.5.1-notebook-gpu:v0.2.0">
+<option value="{0}/{1}/tensorflow-1.6.0-notebook-cpu:v0.2.0">
+<option value="{0}/{1}/tensorflow-1.6.0-notebook-gpu:v0.2.0">
+<option value="{0}/{1}/tensorflow-1.7.0-notebook-cpu:v0.2.0">
+<option value="{0}/{1}/tensorflow-1.7.0-notebook-gpu:v0.2.0">
+<option value="{0}/{1}/tensorflow-1.8.0-notebook-cpu:v0.2.0">
+<option value="{0}/{1}/tensorflow-1.8.0-notebook-gpu:v0.2.0">
</datalist>
<br/><br/>
@@ -57,7 +57,7 @@ def singleuser_image_spec(self):
if cloud == 'ack':
image = 'registry.aliyuncs.com/kubeflow-images-public/tensorflow-notebook-cpu'
else:
-image = 'gcr.io/kubeflow-images-public/tensorflow-1.8.0-notebook-cpu:v20180619-c79194b3'
+image = 'gcr.io/kubeflow-images-public/tensorflow-1.8.0-notebook-cpu:v0.2.0'
if self.user_options.get('image'):
image = self.user_options['image']
return image
118 changes: 118 additions & 0 deletions releasing/add_image_shas.py
@@ -0,0 +1,118 @@
"""The script uses a regex to identify images in GCR and add
entries for them to image_tags.yaml
"""

import argparse
import logging
import re
import json
import yaml

from kubeflow.testing import util

def main(unparsed_args=None):  # pylint: disable=too-many-locals
  logging.getLogger().setLevel(logging.INFO)
  # create the top-level parser
  parser = argparse.ArgumentParser(
    description="Get Images by regex")

  parser.add_argument(
    "--pattern",
    default="",
    type=str,
    help="Regex pattern e.g. .*tensorflow.*notebook.*:v20180619.*")

  parser.add_argument(
    "--images_file",
    default="image_tags.yaml",
    type=str,
    help="Yaml file containing the tags to attach.")

  # Honor explicitly passed arguments; None falls back to sys.argv.
  args = parser.parse_args(unparsed_args)

  with open(args.images_file) as hf:
    # safe_load avoids constructing arbitrary Python objects from the YAML.
    config = yaml.safe_load(hf)

  # Index the images already in the file: name -> digest -> version entry.
  existing_images = {}

  for image in config["images"]:
    existing_images[image["name"]] = {}
    for v in image["versions"]:
      existing_images[image["name"]][v["digest"]] = v

  raw_images = util.run(["gcloud",
                         "--project=kubeflow-images-public",
                         "container", "images", "list",
                         "--format=json"])

  all_images = json.loads(raw_images)
  # The pattern has the form "<name regex>:<tag regex>".
  name_pattern, tag_pattern = args.pattern.split(":")

  name_re = re.compile(name_pattern)
  tag_re = re.compile(tag_pattern)

  matching = []
  for image in all_images:
    if not name_re.match(image["name"]):
      continue
    logging.info("Matching image: %s", image["name"])
    matching.append(image)

  # For each image list all tags and find the matching ones
  images_to_add = {}
  for image in matching:
    raw_tags = util.run(["gcloud",
                         "--project=kubeflow-images-public",
                         "container", "images", "list-tags", image["name"],
                         "--format=json"])

    tags = json.loads(raw_tags)

    for info in tags:
      for t in info["tags"]:
        if tag_re.match(t):
          versions = images_to_add.get(image["name"], {})
          versions[info["digest"]] = info
          images_to_add[image["name"]] = versions

  # Merge in any missing versions
  for name, versions in images_to_add.items():
    if name not in existing_images:
      existing_images[name] = {}

    for v in versions.values():
      if v["digest"] in existing_images[name]:
        logging.info("Image %s sha %s already defined.", name, v["digest"])
      else:
        logging.info("Image %s adding sha %s", name, v["digest"])
        existing_images[name][v["digest"]] = v

  # Convert to the expected output
  output = {}
  output["images"] = []

  # sorted() works on Python 2 and 3; dict.keys() has no sort() on Python 3.
  for name in sorted(existing_images):
    versions = existing_images[name]
    new_image = {}
    new_image["name"] = name
    new_image["versions"] = []
    for v in versions.values():
      new_image["versions"].append(v)

    output["images"].append(new_image)

  with open(args.images_file, "w") as hf:
    hf.write(yaml.safe_dump(output, default_flow_style=False))
  logging.info("Done.")

if __name__ == "__main__":
  logging.basicConfig(level=logging.INFO,
                      format=('%(levelname)s|%(asctime)s'
                              '|%(pathname)s|%(lineno)d| %(message)s'),
                      datefmt='%Y-%m-%dT%H:%M:%S',
                      )
  logging.getLogger().setLevel(logging.INFO)
  main()
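
The `--pattern` argument packs two regexes into one string separated by a colon. The sketch below isolates that step to show how the split and `re.match` behave; `split_pattern` is a hypothetical helper written for illustration, not a function from the script.

```python
import re


def split_pattern(pattern):
  """Split "<name regex>:<tag regex>" into two compiled regexes."""
  name_pattern, tag_pattern = pattern.split(":")
  return re.compile(name_pattern), re.compile(tag_pattern)


name_re, tag_re = split_pattern(".*tensorflow.*notebook.*:v20180619.*")

# re.match only anchors at the start of the string, so the leading ".*" is
# what lets the name regex skip over the registry prefix.
image = "gcr.io/kubeflow-images-public/tensorflow-1.8.0-notebook-cpu"
print(bool(name_re.match(image)))
print(bool(tag_re.match("v20180619-c79194b3")))
print(bool(tag_re.match("v0.2.0")))

# A pattern without exactly one ":" cannot be unpacked into two parts, which
# is what happens when --pattern is left at its default of "".
try:
  split_pattern("")
except ValueError as e:
  print("bad pattern:", e)
```

One consequence of splitting on the colon is that an image name containing a colon of its own (for example a registry host with a port) would not fit this pattern format.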
114 changes: 114 additions & 0 deletions releasing/add_image_tag.py
@@ -0,0 +1,114 @@
"""This script adds or moves a tag in image_tags.yaml

This script doesn't actually update the images. For that you need to
call apply_image_tags using image_tags.yaml

The script looks for images matching a regex and will add a tag to that
image. If that tag is already on an existing version of the image it is removed.

Example:
  python add_image_tag.py --pattern=.*tensorflow.*1.*notebook.*:v20180619.* \
    --tag=v0.2.0

This would add the tag v0.2.0 to images matching the pattern and remove it
from any existing images.
"""

import argparse
import logging
import re
import yaml

from kubeflow.testing import util

def main(unparsed_args=None):  # pylint: disable=too-many-locals
  logging.getLogger().setLevel(logging.INFO)
  # create the top-level parser
  parser = argparse.ArgumentParser(
    description="Apply tags to file")

  parser.add_argument(
    "--images_file",
    default="image_tags.yaml",
    type=str,
    help="Yaml file containing the tags to attach.")

  parser.add_argument(
    "--pattern",
    default="",
    type=str,
    help=("Regex pattern e.g. .*tensorflow.*notebook.*:v20180619.* "
          "to select the images to apply."))

  parser.add_argument(
    "--tag",
    default="",
    type=str,
    help="The tag to apply")

  # Honor explicitly passed arguments; None falls back to sys.argv.
  args = parser.parse_args(unparsed_args)

  with open(args.images_file) as hf:
    # safe_load avoids constructing arbitrary Python objects from the YAML.
    config = yaml.safe_load(hf)

  if not config:
    raise ValueError("No images could be loaded from %s" % args.images_file)
  # The pattern has the form "<name regex>:<tag regex>".
  name_pattern, tag_pattern = args.pattern.split(":")
  name_re = re.compile(name_pattern)
  tag_re = re.compile(tag_pattern)

  for image in config["images"]:
    name = image["name"]
    if not name_re.match(name):
      continue

    # Loop over all the versions of the image to find which ones already
    # carry the supplied tag and which version the tag should be added to.
    new_index = []       # indexes of the versions to add the tag to
    existing_index = []  # indexes of the versions that already have the tag
    for v_index, v in enumerate(image["versions"]):
      for tag in v["tags"]:
        if tag == args.tag:
          existing_index.append(v_index)

        if tag_re.match(tag):
          new_index.append(v_index)

    if len(existing_index) > 1:
      logging.error("Multiple images %s had tag %s", name, args.tag)

    # TODO(jlewi)
    if existing_index and not new_index:
      logging.error("Not moving tag for image %s because no images matched %s",
                    name, args.pattern)
      existing_index = []
    for e in existing_index:
      image["versions"][e]["tags"].remove(args.tag)

      logging.info("Image %s removing tag from sha %s", name,
                   image["versions"][e]["digest"])

    if len(new_index) > 1:
      # len() takes a single argument; the pattern is a separate format arg.
      raise ValueError("Image {0} had {1} images match {2}".format(
        name, len(new_index), args.pattern))

    if new_index:
      v = image["versions"][new_index[0]]
      logging.info("Image %s adding tag to sha %s", name,
                   v["digest"])
      v["tags"].append(args.tag)

  with open(args.images_file, "w") as hf:
    hf.write(yaml.safe_dump(config, default_flow_style=False))
  logging.info("Done.")

if __name__ == "__main__":
  logging.basicConfig(level=logging.INFO,
                      format=('%(levelname)s|%(asctime)s'
                              '|%(pathname)s|%(lineno)d| %(message)s'),
                      datefmt='%Y-%m-%dT%H:%M:%S',
                      )
  logging.getLogger().setLevel(logging.INFO)
  main()
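
The core of the script is moving a release tag from an older version of an image to a newer one. Here is a simplified sketch of that idea on in-memory data. `move_tag` is a hypothetical helper and the digests are placeholders; unlike the script, it raises when no version matches instead of logging and leaving the tag where it is.

```python
import re


def move_tag(versions, tag_re, new_tag):
  """Move new_tag onto the one version that has a tag matching tag_re."""
  targets = [v for v in versions
             if any(tag_re.match(t) for t in v["tags"])]
  if len(targets) != 1:
    raise ValueError(
      "Expected exactly one matching version, found %d" % len(targets))

  # Drop the tag from whichever versions currently hold it...
  for v in versions:
    if new_tag in v["tags"]:
      v["tags"].remove(new_tag)
  # ...then attach it to the newly selected version.
  targets[0]["tags"].append(new_tag)
  return versions


# Placeholder digests; the tags mirror the ones used in this commit.
versions = [
  {"digest": "sha256:" + "a" * 64, "tags": ["v20180607-476e150e", "v0.2.0"]},
  {"digest": "sha256:" + "b" * 64, "tags": ["v20180619-c79194b3"]},
]
move_tag(versions, re.compile("v20180619.*"), "v0.2.0")
for v in versions:
  print(v["digest"][:14], v["tags"])
```

Only the YAML data changes at this stage; the registry itself is untouched until the tags are applied.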
27 changes: 22 additions & 5 deletions releasing/apply_image_tags.py
@@ -2,6 +2,7 @@

import argparse
import logging
+import re
import yaml

from kubeflow.testing import util
Expand All @@ -18,17 +19,33 @@ def main(unparsed_args=None): # pylint: disable=too-many-locals
     type=str,
     help="Yaml file containing the tags to attach.")

+  parser.add_argument(
+    "--pattern",
+    default="",
+    type=str,
+    help=("Regex pattern e.g. .*tensorflow.*notebook.*:v20180619.* "
+          "to select the images to apply."))
   args = parser.parse_args()

   with open(args.images_file) as hf:
     config = yaml.load(hf)

+  name_pattern, tag_pattern = args.pattern.split(":")
+  name_re = re.compile(name_pattern)
+  tag_re = re.compile(tag_pattern)
+
   for image in config["images"]:
-    for tag in image["tags"]:
-      # TODO(jlewi): This appears to be really slow even when we aren't
-      # moving the image. Much slower than doing it in the UI
-      util.run(["gcloud", "container", "images", "add-tag", "--quiet",
-                image["image"], tag])
+    name = image["name"]
+    if not name_re.match(name):
+      continue
+    for v in image["versions"]:
+      for tag in v["tags"]:
+        if not tag_re.match(tag):
+          continue
+        source = name + "@" + v["digest"]
+        dest = name + ":" + tag
+        util.run(["gcloud", "container", "images", "add-tag", "--quiet",
+                  source, dest])

   logging.info("Done.")
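
To preview what the updated loop would run without touching the registry, the command construction can be separated from the call to `util.run`. The following is a dry-run sketch under that assumption; `add_tag_commands` and the sample `config` (with a placeholder digest) are hypothetical and exist only for illustration.

```python
import re


def add_tag_commands(config, pattern):
  """Return the gcloud add-tag commands selected by "<name re>:<tag re>"."""
  name_pattern, tag_pattern = pattern.split(":")
  name_re = re.compile(name_pattern)
  tag_re = re.compile(tag_pattern)

  commands = []
  for image in config["images"]:
    name = image["name"]
    if not name_re.match(name):
      continue
    for v in image["versions"]:
      for tag in v["tags"]:
        if not tag_re.match(tag):
          continue
        # The source pins the image by digest; the destination is name:tag.
        source = name + "@" + v["digest"]
        dest = name + ":" + tag
        commands.append(["gcloud", "container", "images", "add-tag",
                         "--quiet", source, dest])
  return commands


# Sample data with a placeholder digest, not a real image sha.
config = {
  "images": [{
    "name": "gcr.io/kubeflow-images-public/tensorflow-1.8.0-notebook-cpu",
    "versions": [{
      "digest": "sha256:" + "0" * 64,
      "tags": ["v20180619-c79194b3", "v0.2.0"],
    }],
  }],
}

for command in add_tag_commands(config, ".*notebook.*:v0.2.0"):
  print(" ".join(command))
```

Because the tag regex filters which tags are applied, a narrow pattern such as the one above limits the run to the release tag and skips the date-based tags.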
