## Agents on Kubeflow

In this tutorial we will train a reinforcement learning agent from the [tensorflow/agents](https://github.com/tensorflow/agents) project on Kubernetes using [Kubeflow](https://github.com/google/kubeflow).

The agent will learn to operate a Kuka Robotics arm simulated in the OpenAI Gym Bullet Physics 'KukaBulletEnv-v0' environment. Feel free to [skip to the end](#Rendering-the-model) to see what this will look like!

### Setup

We need a Google Cloud Storage (GCS) bucket to store job logs, plus a unique subdirectory of that bucket to store the logs for this particular run. In the steps below we first create the GCS bucket, then generate the log directory path that is used in a later step.

Set the variables below to a project and a bucket suitable for your use.

In [74]:
# GCP project to use
PROJECT="kubeflow-rl"
# Bucket to use
BUCKET=PROJECT+"-kf"
# K8s cluster to use
CLUSTER="kubeflow"
ZONE="us-east1-d"
NAMESPACE="rl"
# Root directory for the kubeflow-rl repository
ROOT_DIR = "/home/jovyan/git_kubeflow-rl"
SECRET_NAME = "kubeflow-rl-gcp"

You will need GCP credentials to access the cluster and GCP resources:
  * Create a service account and download its private key
  * Use JupyterLab to upload the service account key to your pod
  * Set the path to your service account key in the cell below, then execute the cell to activate the service account

In [21]:
KEY_FILE="/home/jovyan/kubeflow-rl-23683422ae6c.json"
!gcloud auth activate-service-account --key-file={KEY_FILE}

Activated service account credentials for: [jlewi-kubeflow-rl@kubeflow-rl.iam.gserviceaccount.com]


In [22]:
!gsutil mb -p {PROJECT} gs://{BUCKET}

Creating gs://kubeflow-rl-kf/...
ServiceException: 409 Bucket kubeflow-rl-kf already exists.


Get credentials for your cluster

In [25]:
!gcloud container clusters --project={PROJECT} --zone={ZONE} get-credentials {CLUSTER}

Fetching cluster endpoint and auth data.
kubeconfig entry generated for kubeflow.


Create the namespace

In [35]:
!kubectl create namespace {NAMESPACE}

Error from server (AlreadyExists): namespaces "rl" already exists


Download and install ksonnet if needed

In [53]:
!mkdir -p ${HOME}/bin
!curl -L -o ${HOME}/bin/ks "https://github.com/ksonnet/ksonnet/releases/download/v0.8.0/ks-linux-amd64"
!chmod a+rx ${HOME}/bin/ks


If running on GCP (or possibly another cloud), you will probably need to create a Kubernetes secret holding the credentials your job will use.

In [123]:
SECRET_FILE_NAME="secret.json"
!kubectl create -n {NAMESPACE} secret generic {SECRET_NAME}  --from-file={SECRET_FILE_NAME}={KEY_FILE}

Error from server (AlreadyExists): secrets "kubeflow-rl-gcp" already exists
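
The bucket, namespace, and secret cells above each report an "already exists" error when they are re-run, which is harmless here. The following is a minimal sketch of a helper that treats that case as success and raises on any other failure. The name `create_if_missing` and its `run` parameter are hypothetical and not part of this repository; the sketch assumes the tool reports conflicts with the words "already exists", as in the outputs shown above.

```python
import subprocess


def create_if_missing(cmd, run=subprocess.run):
    """Run a create command; treat an "already exists" failure as success.

    cmd: argument list, e.g. ["kubectl", "create", "namespace", "rl"].
    run: injectable runner (defaults to subprocess.run) so the helper can be
         exercised without a cluster.
    Returns "created" or "exists"; raises RuntimeError on any other failure.
    """
    result = run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                 universal_newlines=True)
    if result.returncode == 0:
        return "created"
    # Assumption: name conflicts are reported with the words "already exists",
    # as in the gsutil and kubectl outputs shown in this notebook.
    message = (result.stdout or "") + (result.stderr or "")
    if "already exists" in message.lower():
        return "exists"
    raise RuntimeError("command failed: {0}\n{1}".format(" ".join(cmd), message))


# Example (requires kubectl and cluster credentials, so it is left commented out):
# create_if_missing(["kubectl", "create", "namespace", NAMESPACE])
```

Passing the command as an argument list avoids shell quoting issues, and the injectable `run` keeps the helper testable offline.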


### Training

The goal of the training phase is to learn model parameters that achieve a high level of performance on the task. Here we'll launch and monitor a training job.

#### Launching the TFJob

We'll use [ksonnet](https://ksonnet.io/) to parameterize and apply a TFJob configuration (i.e. run a job). Here you can change the image to be a custom job image, such as one built and deployed with build.sh, or use the one provided here if you only want to change parameters. Below we'll display the templated job YAML for reference.
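
As an illustration of the parameterization step, the sketch below builds one `ks param set <component> <key> <value>` command per entry of a dictionary, so that all parameters for a run live in one place. The helper name `ks_param_commands` and the example bucket path are hypothetical; the command form follows the `ks param set` calls used in this notebook.

```python
def ks_param_commands(component, params):
    """Return one `ks param set` argument list per parameter, sorted by key."""
    return [["ks", "param", "set", component, key, str(value)]
            for key, value in sorted(params.items())]


# Hypothetical values for illustration only.
example_params = {
    "num_cpu": 30,
    "log_dir": "gs://my-bucket/jobs/example-job",
}

for cmd in ks_param_commands("agents-ppo", example_params):
    # Each list can be passed to subprocess.check_call(cmd) where ks is installed.
    print(" ".join(cmd))
```

Keeping the parameters in a dictionary makes it easier to record exactly what a given run was launched with.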

In [67]:
# Check your cluster and see if that matches one of the existing ksonnet environments
# You want the kubernetes master server to be the same as the server listed for the ks environment
!kubectl cluster-info
!ks env list

Kubernetes master is running at https://35.196.10.29
GLBCDefaultBackend is running at https://35.196.10.29/api/v1/namespaces/kube-system/services/default-http-backend/proxy
Heapster is running at https://35.196.10.29/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://35.196.10.29/api/v1/namespaces/kube-system/services/kube-dns/proxy
kubernetes-dashboard is running at https://35.196.10.29/api/v1/namespaces/kube-system/services/kubernetes-dashboard/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
NAME     NAMESPACE SERVER
default            https://35.185.119.177
gke      rl        https://35.196.10.29
iap-test iap-test  https://35.196.10.29
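
The comparison described in the cell comments can also be done programmatically. The sketch below parses text in the `ks env list` table layout shown above and returns the environments whose server equals the Kubernetes master URL. The function name is hypothetical, and the parsing assumes whitespace-separated columns with the server in the last column, which may not hold for other ksonnet versions.

```python
def matching_ks_envs(env_list_output, master_url):
    """Return names of ksonnet environments whose SERVER equals master_url."""
    names = []
    # Skip the header row (NAME NAMESPACE SERVER).
    for line in env_list_output.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) < 2:
            continue
        # NAMESPACE may be empty, so take the first and last columns only.
        name, server = fields[0], fields[-1]
        if server.rstrip("/") == master_url.rstrip("/"):
            names.append(name)
    return names


sample = """NAME     NAMESPACE SERVER
default            https://35.185.119.177
gke      rl        https://35.196.10.29
iap-test iap-test  https://35.196.10.29
"""
print(matching_ks_envs(sample, "https://35.196.10.29"))
```

With the sample output from this notebook, the `gke` and `iap-test` environments point at the current cluster, while `default` does not.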


In [125]:
import datetime
import uuid
import os

os.chdir(os.path.join(ROOT_DIR, "rl-app"))

HPARAM_SET="pybullet-kuka-ff"

now=datetime.datetime.now()

JOB_SALT=now.strftime("%m%d-%H%M") + "-" + uuid.uuid4().hex[0:4]
JOB_NAME=HPARAM_SET + "-" + JOB_SALT
LOG_DIR="gs://{0}/jobs/{1}".format(BUCKET, JOB_NAME)

!ks param set agents-ppo gcp_project {PROJECT}
!ks param set agents-ppo num_cpu 30
!ks param set agents-ppo job_tag {JOB_SALT}
!ks param set agents-ppo log_dir {LOG_DIR}
!ks param set agents-ppo name {JOB_NAME}
!ks param set agents-ppo gcp_secret {SECRET_NAME}
!ks param set agents-ppo secret_file_name {SECRET_FILE_NAME}
!ks show default -c agents-ppo

!ks apply gke -c agents-ppo

INFO  Parameter 'gcp_project' successfully set to '"kubeflow-rl"' for component 'agents-ppo'
INFO  Parameter 'num_cpu' successfully set to '30' for component 'agents-ppo'
INFO  Parameter 'job_tag' successfully set to '"0118-2346-bac2"' for component 'agents-ppo'
INFO  Parameter 'log_dir' successfully set to '"gs://kubeflow-rl-kf/jobs/pybullet-kuka-ff-0118-2346-bac2"' for component 'agents-ppo'
INFO  Parameter 'name' successfully set to '"pybullet-kuka-ff-0118-2346-bac2"' for component 'agents-ppo'
INFO  Parameter 'gcp_secret' successfully set to '"kubeflow-rl-gcp"' for component 'agents-ppo'
INFO  Parameter 'secret_file_name' successfully set to '"secret.json"' for component 'agents-ppo'
---
apiVersion: tensorflow.org/v1alpha1
kind: TfJob
metadata:
  name: pybullet-kuka-ff-0118-2346-bac2
  namespace: rl
spec:
  replicaSpecs:
  - replicas: 1
    template:
      spec:
        containers:
        - args:
  

Now we can fetch the TFJob and confirm that it has been created.

In [142]:
!kubectl get tfjobs -n {NAMESPACE} -o yaml {JOB_NAME}

apiVersion: tensorflow.org/v1alpha1
kind: TfJob
metadata:
  clusterName: ""
  creationTimestamp: 2018-01-18T23:46:28Z
  generation: 0
  name: pybullet-kuka-ff-0118-2346-bac2
  namespace: rl
  resourceVersion: "1473597"
  selfLink: /apis/tensorflow.org/v1alpha1/namespaces/rl/tfjobs/pybullet-kuka-ff-0118-2346-bac2
  uid: cd3d7cf9-fca9-11e7-ac67-42010a8e00d6
spec:
  RuntimeId: njc7
  replicaSpecs:
  - IsDefaultPS: false
    replicas: 1
    template:
      metadata:
        creationTimestamp: null
      spec:
        containers:
        - args:
          - --logdir=gs://kubeflow-rl-kf/jobs/pybullet-kuka-ff-0118-2346-bac2
          - --config=pybullet_kuka_ff
          - --network=feed_forward_gaussian
          - --policy_layers=200,100
          - --value_layers=200,100
          - --num_agents=30
          - --steps=10000000
          - --discount=0.995
          - --kl_target=0.01
          - --kl_cutoff_factor=2
          - --kl_cutoff_coef=1000
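
To confirm that the job was created with the intended hyperparameters, the container `args` from the TFJob spec can be turned into a dictionary. This is a minimal sketch: `parse_flags` is a hypothetical helper, and it assumes every argument of interest has the `--key=value` form seen in the output above.

```python
def parse_flags(args):
    """Convert a list of "--key=value" strings into a {key: value} dict."""
    flags = {}
    for arg in args:
        # Ignore anything that is not in --key=value form.
        if not arg.startswith("--") or "=" not in arg:
            continue
        key, value = arg[2:].split("=", 1)
        flags[key] = value
    return flags


# A subset of the args from the TFJob shown above.
job_args = [
    "--config=pybullet_kuka_ff",
    "--network=feed_forward_gaussian",
    "--policy_layers=200,100",
    "--num_agents=30",
    "--steps=10000000",
    "--discount=0.995",
]
flags = parse_flags(job_args)
print(flags["num_agents"], flags["discount"])
```

Values are kept as strings; convert them explicitly (for example `int(flags["num_agents"])`) where a numeric comparison is needed.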
       

#### Monitoring training

The IDs, status, and other metadata of pods involved in the training job can be displayed using the following:

In [128]:
!kubectl get pods -n {NAMESPACE} --show-all

NAME                                                     READY     STATUS    RESTARTS   AGE
jupyter-accounts-2egoogle-2ecom-3ajlewi-40google-2ecom   1/1       Running   0          1d
pybullet-kuka-ff-0118-2340-8617-master-3q34-0-g7sx6      1/1       Running   0          6m
pybullet-kuka-ff-0118-2346-bac2-master-njc7-0-hf2pj      0/1       Pending   0          22s
tf-hub-0                                                 2/2       Running   0          7d
tf-job-dashboard-4152871061-wns5g                        1/1       Running   0          1d
tf-job-operator-4193103-0413l                            1/1       Running   0          1d
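
When waiting for a job to start, it can be more convenient to read pod phases programmatically than to scan the table above. The sketch below extracts `status.phase` for each pod from the JSON that `kubectl get pods -o json` returns. The helper names are hypothetical, and the sample JSON is a trimmed, hand-written stand-in for real output.

```python
import json


def pod_phases(pods_json):
    """Map pod name -> phase from `kubectl get pods -o json` output."""
    items = json.loads(pods_json).get("items", [])
    return {item["metadata"]["name"]: item.get("status", {}).get("phase", "Unknown")
            for item in items}


def all_running(phases):
    """True only if there is at least one pod and every pod is Running."""
    return bool(phases) and all(p == "Running" for p in phases.values())


# Hand-written sample in the same shape as the kubectl JSON output.
sample_json = json.dumps({"items": [
    {"metadata": {"name": "job-a-master-0"}, "status": {"phase": "Running"}},
    {"metadata": {"name": "job-b-master-0"}, "status": {"phase": "Pending"}},
]})
phases = pod_phases(sample_json)
print(phases, all_running(phases))
```

In the notebook, the JSON could come from `subprocess.check_output(["kubectl", "-n", NAMESPACE, "get", "pods", "-o", "json"])`, optionally with a `--selector` to restrict the result to one job.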


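The full specification and status of a specific pod can be displayed as YAML with the following (its logs are shown further below with `kubectl logs`, which can be streamed by adding the `--follow` flag):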

In [117]:
!kubectl -n {NAMESPACE} get pods -o yaml pybullet-kuka-ff-0118-2331-5481-master-l7hz-0-vjp6q

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/created-by: |
      {"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"Job","namespace":"rl","name":"pybullet-kuka-ff-0118-2331-5481-master-l7hz-0","uid":"b63a2073-fca7-11e7-ac67-42010a8e00d6","apiVersion":"batch","resourceVersion":"1471868"}}
  creationTimestamp: 2018-01-18T23:31:30Z
  generateName: pybullet-kuka-ff-0118-2331-5481-master-l7hz-0-
  labels:
    controller-uid: b63a2073-fca7-11e7-ac67-42010a8e00d6
    job-name: pybullet-kuka-ff-0118-2331-5481-master-l7hz-0
    job_type: MASTER
    runtime_id: l7hz
    task_index: "0"
    tensorflow.org: ""
    tf_job_name: pybullet-kuka-ff-0118-2331-5481
  name: pybullet-kuka-ff-0118-2331-5481-master-l7hz-0-vjp6q
  namespace: rl
  ownerReferences:
  - apiVersion: batch/v1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: pybullet-kuka-ff-0118-2331-5481-master-l7hz-0
    uid: b63a2073-fca7-11e7-ac67-42010a8

In [132]:
JOB_NAME

'pybullet-kuka-ff-0118-2346-bac2'

In [149]:
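import subprocess
# No shell is involved here, so the jsonpath expression must not be wrapped in quotes;
# strip() removes any surrounding whitespace from the pod name.
master_pod = subprocess.check_output(["kubectl", "-n", NAMESPACE, "get", "pods", "--selector=tf_job_name=" + JOB_NAME,
                                      "-o", "jsonpath={.items[*].metadata.name}"]).decode("utf-8").strip()
print(master_pod)


pybullet-kuka-ff-0118-2346-bac2-master-njc7-0-hf2pj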


In [150]:
!kubectl logs -n {NAMESPACE} {master_pod}

INFO:tensorflow:Tensorflow version: 1.3.0
INFO:tensorflow:Tensorflow git version: v1.3.0-rc2-20-g0787eee
INFO:tensorflow:Start a new run and write summaries and checkpoints to gs://kubeflow-rl-kf/jobs/pybullet-kuka-ff-0118-2346-bac2.
INFO:tensorflow:Graph contains 44607 trainable variables.
2018-01-18 23:49:04.900428: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-18 23:49:04.900509: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-01-18 23:49:04.900520: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-01-18 23:49:04.900530: W te
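
The `cpu_feature_guard` warnings above are informational and can crowd out the training messages. A minimal sketch of a filter that drops them from captured log text follows; the function name is hypothetical, and the match is a plain substring test on `cpu_feature_guard`.

```python
def drop_cpu_feature_warnings(log_text):
    """Remove TensorFlow cpu_feature_guard warning lines from a log string."""
    kept = [line for line in log_text.splitlines()
            if "cpu_feature_guard" not in line]
    return "\n".join(kept)


sample_log = "\n".join([
    "INFO:tensorflow:Tensorflow version: 1.3.0",
    "2018-01-18 23:49:04.900428: W tensorflow/core/platform/cpu_feature_guard.cc:45] "
    "The TensorFlow library wasn't compiled to use SSE4.1 instructions",
    "INFO:tensorflow:Graph contains 44607 trainable variables.",
])
print(drop_cpu_feature_warnings(sample_log))
```

The log text itself could be captured with `subprocess.check_output(["kubectl", "logs", "-n", NAMESPACE, master_pod]).decode("utf-8")`.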

#### Launching TensorBoard
You can launch TensorBoard for your job as follows:

In [152]:
!ks param set tensorboard name {JOB_NAME}
!ks param set tensorboard namespace {NAMESPACE}
!ks param set tensorboard log_dir {LOG_DIR}
!ks param set tensorboard secret {SECRET_NAME}
!ks param set tensorboard secret_file_name {SECRET_FILE_NAME}
!ks show default -c tensorboard

!ks apply gke -c tensorboard

INFO  Parameter 'name' successfully set to '"pybullet-kuka-ff-0118-2346-bac2"' for component 'tensorboard'
INFO  Parameter 'namespace' successfully set to '"rl"' for component 'tensorboard'
INFO  Parameter 'log_dir' successfully set to '"gs://kubeflow-rl-kf/jobs/pybullet-kuka-ff-0118-2346-bac2"' for component 'tensorboard'
INFO  Parameter 'secret' successfully set to '"kubeflow-rl-gcp"' for component 'tensorboard'
INFO  Parameter 'secret_file_name' successfully set to '"secret.json"' for component 'tensorboard'
---
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: pybullet-kuka-ff-0118-2346-bac2-tb
  namespace: rl
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: tensorboard
        tb-job: pybullet-kuka-ff-0118-2346-bac2
      name: pybullet-kuka-ff-0118-2346-bac2
      namespace: rl
    spec:
      containers:
      - command:
        - /usr/local/bin/tensorboard
        - --logdir=g

#### Connecting to TensorBoard
To connect to TensorBoard, run `kubectl proxy` and then open the URL printed by the next cell.

In [None]:
PROXY_PORT=8001
url = "http://127.0.0.1:{proxy_port}/api/v1/proxy/namespaces/{namespace}/services/{service_name}:80/".format(
    proxy_port=PROXY_PORT, namespace=NAMESPACE, service_name=JOB_NAME + "-tb")
print(url)
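
The cell above uses the `/api/v1/proxy/namespaces/...` path form. The `kubectl cluster-info` output earlier in this notebook shows services addressed as `/api/v1/namespaces/<namespace>/services/<name>/proxy`, and on newer Kubernetes releases that is the form to use. The sketch below builds a URL in that form. The function name is hypothetical, and the service name and port 80 are carried over from the cell above rather than verified against the TensorBoard component.

```python
def tensorboard_proxy_url(namespace, service_name, proxy_port=8001, service_port=80):
    """Build a kubectl-proxy URL using the services/<name>:<port>/proxy/ form."""
    return ("http://127.0.0.1:{proxy_port}/api/v1/namespaces/{namespace}"
            "/services/{service_name}:{service_port}/proxy/").format(
                proxy_port=proxy_port, namespace=namespace,
                service_name=service_name, service_port=service_port)


# Example using the job name from this notebook's run.
print(tensorboard_proxy_url("rl", "pybullet-kuka-ff-0118-2346-bac2-tb"))
```

If the URL from the cell above returns a 404, trying this form is a reasonable next step.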

#### Deleting jobs

In [3]:
!kubectl delete tfjobs -n {NAMESPACE} {JOB_NAME}

tfjob "pybullet-kuka-ff-0e90193e" deleted


### Rendering the model

When the job is complete there will be a subdirectory of the log dir named "render" with a number of short videos of episodes of the agent performing the grasping task. Here's an example of what one of those looks like in a well-trained model.

In [9]:
import io
import base64
from IPython.display import HTML

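# Replace with the path to the video you want to display
mp4_path = 'render.mp4'

# Read the video as bytes; the context manager closes the file afterwards
with io.open(mp4_path, 'rb') as f:
    video = f.read()
encoded = base64.b64encode(video)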
HTML(data='''<video alt="test" controls>
                <source src="data:video/mp4;base64,{0}" type="video/mp4" />
             </video>'''.format(encoded.decode('ascii')))
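
The cell above reads a local file, so a video first has to be copied out of the `render` subdirectory of the log dir. The sketch below builds the `gsutil ls` and `gsutil cp` commands for that. The helper names and the video file name `episode.mp4` are hypothetical, since the actual file names depend on the training job's output.

```python
def render_dir(log_dir):
    """Return the GCS path of the "render" subdirectory of a log dir."""
    return log_dir.rstrip("/") + "/render"


def render_list_command(log_dir):
    """Command that lists the rendered videos for a run."""
    return ["gsutil", "ls", render_dir(log_dir)]


def render_copy_command(log_dir, video_name, dest="render.mp4"):
    """Command that copies one rendered video to a local file."""
    return ["gsutil", "cp", render_dir(log_dir) + "/" + video_name, dest]


example_log_dir = "gs://kubeflow-rl-kf/jobs/pybullet-kuka-ff-0118-2346-bac2"
print(" ".join(render_list_command(example_log_dir)))
# "episode.mp4" is a placeholder; use a name returned by the ls command.
print(" ".join(render_copy_command(example_log_dir, "episode.mp4")))
```

Each list can be run with `subprocess.check_call(...)` from the notebook once the service account has been activated.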

### Great job! 🎉🎉🎉

If this is your first time working with these technologies you might be interested in some suggestions of good next steps. Here are some ideas:
- Try training with some other learning environments and tweet your results!
- Take a shot at implementing your own gym learning environment and repeat the above.