# Docker Image Management

This notebook builds Docker images from the latest Dockerfiles (`dockerfile-base`, plus the generated `dockerfile-script-run` and `dockerfile-opt`) and pushes them to GCP Container Registry, where they are available for use by Kubeflow.

If you are creating a new image/version, update the parameters below as needed, and follow the notebook.

If you are replacing an already existing image, use the Clean Up Image Space section to remove/replace the existing image/version.

In [None]:
import socket

In [None]:
# params
HNAME = 'us.gcr.io'  # container registry address
PROJECT_ID = 'pbm-mac-lp-prod-ai'

# Image Names
BASE_NAME = 'pbm_base'
SCRIPT_RUN_NAME = 'pbm_script_run'
OPT_NAME = 'pbm_opt'

# Image Version Setup
git_hash = !git rev-parse --short HEAD
git_branch = !git rev-parse --abbrev-ref HEAD
version_iteration = '0'  # change as needed; this should follow the version of the code in Git
# Use 'PROD' and 'DEV' only for official versions. For individual tests use WIP-YOUR_NAME,
# which makes those images easy to find and delete later.
version_type = f'WIP-{socket.gethostname()}'
version = f'{git_branch[0]}-{git_hash[0]}-{version_type}-{version_iteration}'
# Docker tags cannot contain '/', so for a branch such as 'feature/foo' keep only the last path component.
version = version.split('/')[-1]

# Tags for Google Container Registry (where google stores images)
BASE_TAG = f"{HNAME}/{PROJECT_ID}/{BASE_NAME}:{version}"
SCRIPT_RUN_TAG = f"{HNAME}/{PROJECT_ID}/{SCRIPT_RUN_NAME}:{version}"
OPT_TAG = f"{HNAME}/{PROJECT_ID}/{OPT_NAME}:{version}"

In [None]:
print(version)
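
The version string ends up as a Docker tag, so it has to satisfy Docker's tag rules: at most 128 characters drawn from letters, digits, underscores, periods, and hyphens, and not starting with a period or hyphen. The sketch below approximates the composition logic in plain Python so it can be checked without a git checkout. The helper names `build_version` and `is_valid_docker_tag` and the sample branch, hash, and host values are hypothetical and not part of the notebook.

```python
import re

# Docker tag grammar: the first character is a letter, digit or underscore,
# followed by up to 127 characters from [A-Za-z0-9_.-].
_TAG_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def build_version(branch, git_hash, version_type, iteration):
    """Compose a version string in the same shape as the params cell."""
    version = f"{branch}-{git_hash}-{version_type}-{iteration}"
    # For a branch such as 'feature/foo', keep what follows the slash,
    # because '/' is not allowed in a tag.
    return version.split("/")[-1]


def is_valid_docker_tag(tag):
    """Return True if `tag` is acceptable as a Docker image tag."""
    return _TAG_RE.fullmatch(tag) is not None


# Made-up inputs for illustration only.
example = build_version("feature/pricing", "a1b2c3d", "WIP-myhost", "0")
print(example, is_valid_docker_tag(example))
```

Running a check like this before building catches an invalid tag early, rather than at `docker build` or `docker image tag` time.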

#### Clean Up Image Space (Optional)

Each image created in this notebook is pushed to GCP Container Registry. Use the next cell to list current images/versions. 

In [None]:
!gcloud container images list --repository {HNAME}/{PROJECT_ID}
!gcloud container images list-tags {HNAME}/{PROJECT_ID}/{BASE_NAME}

If an image will be replaced, it must first be deleted from GCP Container Registry. Use the cell below to delete the image or image version if needed. Uncomment and fill in the command(s) as needed.

In [None]:
# Delete a specific tag (image version)
# CAUTION! These commands delete images/versions and cannot be undone.
# Deletion can be done by version name (tag) or by DIGEST

# VERSION_DELETE is the version that one wants to delete.
# VERSION_DELETE = ''
# !gcloud container images delete --quiet --force-delete-tags {HNAME}/{PROJECT_ID}/{SCRIPT_RUN_NAME}:{VERSION_DELETE}
# !gcloud container images delete --quiet --force-delete-tags {HNAME}/{PROJECT_ID}/{OPT_NAME}:{VERSION_DELETE}
# !gcloud container images delete --quiet --force-delete-tags {HNAME}/{PROJECT_ID}/{BASE_NAME}:{VERSION_DELETE}

# VERSION_DIGEST is the digest of the version that one wants to delete.
# VERSION_DIGEST = ''
# !gcloud container images delete --quiet --force-delete-tags {HNAME}/{PROJECT_ID}/{SCRIPT_RUN_NAME}@{VERSION_DIGEST}
# !gcloud container images delete --quiet --force-delete-tags {HNAME}/{PROJECT_ID}/{OPT_NAME}@{VERSION_DIGEST}
# !gcloud container images delete --quiet --force-delete-tags {HNAME}/{PROJECT_ID}/{BASE_NAME}@{VERSION_DIGEST}
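
The two groups of commands above differ only in how the identifier is attached to the repository path: a tag follows `:` and a digest follows `@`. A minimal sketch of that rule is shown here, assuming digests always start with `sha256:`. The helper name `image_reference` is hypothetical, and the tag and digest values are placeholders.

```python
def image_reference(repository, identifier):
    """Join a repository path with a tag or a digest.

    Digests (for example 'sha256:<hex>') are attached with '@';
    anything else is treated as a tag and attached with ':'.
    """
    separator = "@" if identifier.startswith("sha256:") else ":"
    return f"{repository}{separator}{identifier}"


# Repository path built from the notebook's parameters; identifiers are placeholders.
repo = "us.gcr.io/pbm-mac-lp-prod-ai/pbm_opt"
print(image_reference(repo, "master-a1b2c3d-DEV-0"))
print(image_reference(repo, "sha256:" + "0" * 64))
```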

Images can be batch deleted with the cell below. Uncomment and fill in the date to delete the images created before it.
We filter out PROD-CPMO-* images so that we don't accidentally delete our production images built from the master branch.

In [None]:
# import datetime as dt
# delete_date = dt.datetime.strptime('01/01/2022', '%m/%d/%Y')
# imgs = !gcloud container images list-tags {HNAME}/{PROJECT_ID}/{BASE_NAME} --format="get(digest, timestamp)" --filter="-tags:PROD-CPMO-"
# # add "--filter='-tags:*'" to the above gcloud call to only delete images with empty tags
# for row in imgs:
#     row_split = row.split('\t')
#     VERSION_DIGEST = row_split[0]
#     img_ts = dict()
#     for dt_str in row_split[1].split(';'):
#         img_ts[dt_str.split('=')[0]] = dt_str.split('=')[1]
    
#     if dt.datetime(int(img_ts['year']), int(img_ts['month']), int(img_ts['day'])) < delete_date:
#         !gcloud container images delete --quiet --force-delete-tags {HNAME}/{PROJECT_ID}/{SCRIPT_RUN_NAME}@{VERSION_DIGEST}
#         !gcloud container images delete --quiet --force-delete-tags {HNAME}/{PROJECT_ID}/{OPT_NAME}@{VERSION_DIGEST}
#         !gcloud container images delete --quiet --force-delete-tags {HNAME}/{PROJECT_ID}/{BASE_NAME}@{VERSION_DIGEST}
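
Before uncommenting the delete commands, it can help to see which digests the date filter would select. The sketch below reproduces only the parsing and comparison steps of the cell above as a plain function, so it can be tried on captured output without touching the registry. The function name `digests_older_than` is hypothetical, the row layout is an assumption taken from what the parsing code expects, and the sample rows are made up.

```python
import datetime as dt


def digests_older_than(rows, cutoff):
    """Return digests whose timestamp date is strictly before `cutoff`.

    Each row is assumed to look like 'DIGEST<TAB>key=value;key=value;...'
    with at least 'year', 'month' and 'day' keys.
    """
    selected = []
    for row in rows:
        digest, _, fields = row.partition("\t")
        if not fields:
            continue  # skip blank or malformed lines
        parts = dict(item.split("=", 1) for item in fields.split(";") if "=" in item)
        created = dt.datetime(int(parts["year"]), int(parts["month"]), int(parts["day"]))
        if created < cutoff:
            selected.append(digest)
    return selected


# Made-up rows for illustration only.
rows = [
    "sha256:aaa\tyear=2021;month=12;day=31",
    "sha256:bbb\tyear=2022;month=1;day=1",
    "",
]
cutoff = dt.datetime.strptime("01/01/2022", "%m/%d/%Y")
print(digests_older_than(rows, cutoff))  # ['sha256:aaa']
```

Because the comparison is strict, an image created on the cutoff date itself is kept.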

Images are built locally before they are pushed to GCP Container Registry. Use the following commands to list, and optionally clean up (delete) local image versions. If you have worked on the same Jupyter Notebook server instance for a while, you may have several local images/versions from previous builds.

In [None]:
!docker image list

In [None]:
# Delete local image tags (set IMAGE_NAME to the version tag to remove, then uncomment and run as needed)

# !docker image rm us.gcr.io/{PROJECT_ID}/pbm_opt:{IMAGE_NAME}  #<-- IMAGE_NAME>
# !docker image rm pbm_opt:{IMAGE_NAME}  #<-- IMAGE_NAME>

# !docker image rm us.gcr.io/{PROJECT_ID}/pbm_base:{IMAGE_NAME}  #<-- IMAGE_NAME>
# !docker image rm pbm_base:{IMAGE_NAME}  #<-- IMAGE_NAME>

# !docker image rm us.gcr.io/{PROJECT_ID}/pbm_script_run:{IMAGE_NAME}  #<-- IMAGE_NAME>
# !docker image rm pbm_script_run:{IMAGE_NAME}  #<-- IMAGE_NAME>

#### Build Docker Images Locally

In [None]:
# create docker image
!docker build -f dockerfile-base -t {BASE_NAME}:{version} .
# (use --no-cache if needed)

In [None]:
!docker image tag {BASE_NAME}:{version} {BASE_TAG}

Build Script Run Image

In [None]:
dfile_prep = f"""FROM {BASE_TAG}
RUN ["pip", "install", "--upgrade", "scikit-learn==0.23.2", "scipy==1.6.2", "statsmodels==0.12.1", "--index-url", "https://nexus-ha.cvshealth.com:9443/repository/pypi-proxy/simple"]"""
# Write the Dockerfile and close the handle before docker build reads it
with open("dockerfile-script-run", "w") as f:
    print(dfile_prep, file=f)

In [None]:
!docker build -f dockerfile-script-run -t {SCRIPT_RUN_NAME}:{version} .
!docker image tag {SCRIPT_RUN_NAME}:{version} {SCRIPT_RUN_TAG}

Build Optimization Image

In [None]:
dfile_opt = f"""FROM {BASE_TAG}
RUN ["pip", "install", "PuLP==2.4", "duckdb==0.8.0", "XlsxWriter==1.3.7", "--index-url", "https://nexus-ha.cvshealth.com:9443/repository/pypi-proxy/simple"]"""
# Write the Dockerfile and close the handle before docker build reads it
with open("dockerfile-opt", "w") as f:
    print(dfile_opt, file=f)

In [None]:
!docker build -f dockerfile-opt -t {OPT_NAME}:{version} .
!docker image tag {OPT_NAME}:{version} {OPT_TAG}
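
The script-run and optimization Dockerfiles follow the same pattern: start from the base image and run one `pip install` in exec form against the internal index. As a sketch of how that pattern could be factored out, the hypothetical helper `render_dockerfile` below builds the text from a package list. It is not part of the notebook, and `pbm_base:example` is a placeholder tag.

```python
import json

# Index URL copied from the Dockerfile templates in this notebook.
INDEX_URL = "https://nexus-ha.cvshealth.com:9443/repository/pypi-proxy/simple"


def render_dockerfile(base_tag, packages, upgrade=False):
    """Return Dockerfile text that installs `packages` on top of `base_tag`."""
    command = ["pip", "install"]
    if upgrade:
        command.append("--upgrade")
    command += list(packages) + ["--index-url", INDEX_URL]
    # json.dumps yields the double-quoted JSON array that exec-form RUN requires.
    return f"FROM {base_tag}\nRUN {json.dumps(command)}\n"


print(render_dockerfile("pbm_base:example", ["PuLP==2.4", "duckdb==0.8.0", "XlsxWriter==1.3.7"]))
```

Generating the array with `json.dumps` avoids hand-written quoting mistakes, since exec-form `RUN` is parsed as JSON and needs double quotes around every element.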

#### Upload Images to GCP Container Registry

In [None]:
# upload image to container registry
!docker push {BASE_TAG}

In [None]:
!docker push {SCRIPT_RUN_TAG}

In [None]:
!docker push {OPT_TAG}
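
IPython's `!` shell escape does not raise an exception when the command exits with a non-zero status, so a failed push can go unnoticed when running all cells. One possible alternative, sketched here, runs the pushes through `subprocess.run` with `check=True`, which raises `CalledProcessError` on failure. The function name `push_images` and its `dry_run` flag are hypothetical, and the example tags are placeholders.

```python
import subprocess


def push_images(tags, dry_run=True):
    """Push each tag with `docker push`, stopping at the first failure.

    With dry_run=True the commands are only printed, not executed.
    """
    commands = [["docker", "push", tag] for tag in tags]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError on a non-zero exit status.
            subprocess.run(command, check=True)
    return commands


# Placeholder tags; in the notebook these would be BASE_TAG, SCRIPT_RUN_TAG and OPT_TAG.
planned = push_images(["us.gcr.io/example/pbm_base:v0", "us.gcr.io/example/pbm_opt:v0"])
```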