<a href="https://colab.research.google.com/gist/cwbeitel/18f01dfd62548452b344af430be84026/fx-dev.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Novel feedback experiences for enhanced mental training

Contributors: @cwbeitel

# Overview

Enhanced-feedback mental training experiences (FX) are the primary product of Project Clarify. Two categories of FX are described in the following issues, which in turn reference their component issues:

- https://github.com/projectclarify/clarify/issues/84
- https://github.com/projectclarify/clarify/issues/83

This notebook is a central place for design docs (related to https://github.com/projectclarify/clarify/issues/147), interactive prototypes, and lightweight prototype development. Prototypes that graduate from here will later be integrated into our core front-end application.


# Setup

Please run the relevant sections below, and *remember* that the notebook runtime/kernel must be restarted (via "Runtime" > "Restart Runtime" above) before newly installed Python dependencies become available.
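One way to tell whether a restart is still needed is to check which modules can currently be found. The following is a minimal sketch; the helper name `missing_modules` and the list of module names are assumptions for illustration, not part of the Clarify codebase.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot currently be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


# Hypothetical list; adjust it to the setup sections you actually ran.
required = ["tensorflow", "faced", "gtts", "cv2"]
missing = missing_modules(required)
if missing:
    print("Not importable yet (install, then restart the runtime):", missing)
else:
    print("All listed modules are importable.")
```

If a package was just installed and still shows up as missing, restarting the runtime is the first thing to try.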

##### Clarify codebase and dependencies

In [85]:
# Skip the clone if the repository is already present.
!test -d clarify || git clone https://github.com/projectclarify/clarify.git

Cloning into 'clarify'...
remote: Enumerating objects: 72, done.
remote: Counting objects: 100% (72/72), done.
remote: Compressing objects: 100% (62/62), done.
remote: Total 10244 (delta 30), reused 25 (delta 10), pack-reused 10172
Receiving objects: 100% (10244/10244), 32.79 MiB | 10.61 MiB/s, done.
Resolving deltas: 100% (2503/2503), done.


In [86]:
# Check out a fairly old version of the codebase that can be used to restore the
# image FEC model we trained in tensorflow.
!cd clarify && git checkout -b demo e5252a37fcdd4ccabdd9ef564ece0385365b577d && pip install -e .[tensorflow,tests]

Checking out files: 100% (9719/9719), done.
Switched to a new branch 'demo'
Obtaining file:///content/clarify
Collecting numpy==1.16.2
  Downloading https://files.pythonhosted.org/packages/35/d5/4f8410ac303e690144f0a0603c4b8fd3b986feb2749c435f7cdbb288f17e/numpy-1.16.2-cp36-cp36m-manylinux1_x86_64.whl (17.3MB)

In [87]:
# Dependency needed for FX prototypes that involved synthesized speech.
!pip install gtts



##### TFServing dependencies

This is only relevant for the prototypes that query a model served locally with TFServing.

In [0]:
!sh /content/clarify/tools/serving/install_tf_serving.sh

##### Model checkpoint

TODO: Currently the checkpoint must be downloaded from cloud storage after authenticating. For reasons not yet understood, referencing the GCS checkpoint location directly produces an auth error even though the location is publicly accessible.

In [0]:
# 1. Log into your GCloud account by running the following, clicking the
# link, authenticating, and providing the resulting code in the box below.
!gcloud auth login

In [8]:
# 2. Next, download the model parameters locally (instead of relying on
# tensorflow to properly manage credentials to fetch these itself).
!gsutil -m cp -r gs://clarify-public/models/fec-train-j1030-0136-3a8f/output /tmp/

Copying gs://clarify-public/models/fec-train-j1030-0136-3a8f/output/eval/events.out.tfevents.1572401042.fec-train-j1030-0136-3a8f-master-0.v2...
Copying gs://clarify-public/models/fec-train-j1030-0136-3a8f/output/eval/events.out.tfevents.1572400837.fec-train-j1030-0136-3a8f-master-0.v2...
Copying gs://clarify-public/models/fec-train-j1030-0136-3a8f/output/eval/events.out.tfevents.1572401247.fec-train-j1030-0136-3a8f-master-0.v2...
Copying gs://clarify-public/models/fec-train-j1030-0136-3a8f/output/eval/events.out.tfevents.1572400018.fec-train-j1030-0136-3a8f-master-0.v2...
Copying gs://clarify-public/models/fec-train-j1030-0136-3a8f/output/checkpoint...

##### FX-specific dependencies

In [11]:
# TODO: The package installs, but we are having trouble importing it.
!npm install -g lit-element

[K[?25h+ lit-element@2.2.1
added 2 packages from 1 contributor in 0.551s


##### Eager mode

Inference in TensorFlow eager mode is being phased out, but for now eager mode must be initialized at startup if it is going to be used.

In [0]:
%tensorflow_version 1.x
import tensorflow as tf

tfe = tf.contrib.eager
tfe.enable_eager_execution()
Modes = tf.estimator.ModeKeys


##### Shared functions for image capture

In [1]:
from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode
import base64

import io
import numpy as np

from PIL import Image

from faced import FaceDetector
from faced.utils import annotate_image
import copy

import cv2

from pcml.datasets import image_aug


def take_photo(filename='photo.jpg', quality=0.8):
  js = Javascript('''
    async function takePhoto(quality) {
      const div = document.createElement('div');

      const video = document.createElement('video');
      video.style.display = 'block';
      const stream = await navigator.mediaDevices.getUserMedia({
          video: { width: 256, height: 128 }
      });

      document.body.appendChild(div);
      div.appendChild(video);
      video.srcObject = stream;
      video.style.visibility = "hidden";
      await video.play();

      // Resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      const canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext('2d').drawImage(video, 0, 0);
      stream.getVideoTracks()[0].stop();

      const data = canvas.toDataURL('image/jpeg', quality);

      div.remove();
      canvas.remove();
      video.remove();

      return data
    }
    ''')
  display(js)
  return eval_js('takePhoto({})'.format(quality))


def load_image(img_string):
  binary = b64decode(img_string.split(',')[1])
  filename = "/tmp/stream.jpeg"
  with open(filename, 'wb') as f:
    f.write(binary)
  img = Image.open(filename)
  return np.array(img)


def _random_crop_square(image):

  x,y,c = image.shape

  x_crop_before = 0
  x_crop_after = 0
  y_crop_before = 0
  y_crop_after = 0

  if x > y:
    x_crop = x - y
    x_crop_before = np.random.randint(0, x_crop)
    x_crop_after = x_crop - x_crop_before
  elif y > x:
    y_crop = y - x
    y_crop_before = np.random.randint(0, y_crop)
    y_crop_after = y_crop - y_crop_before

  x_start = x_crop_before
  x_end = x - x_crop_after
  y_start = y_crop_before
  y_end = y - y_crop_after

  return image[x_start:x_end, y_start:y_end, :]


def _normalize_dimensions(image, target_shape):

  image = _random_crop_square(image)

  mn, mx = np.amin(image), np.amax(image)
  if mn >=0 and mx <= 255:
    image = image / 255.0

  source_shape = image.shape
  scale_x_factor = target_shape[0]/source_shape[0]
  scale_y_factor = target_shape[1]/source_shape[1]
  scale_x_first = (scale_x_factor <= scale_y_factor)

  if scale_x_first:

    new_x = target_shape[0]
    new_y = int(source_shape[1]*scale_x_factor)
    resize_dim = (new_x, new_y)
    newimg = cv2.resize(image, resize_dim)
    pad_width = target_shape[1] - new_y
    if pad_width > 0:
      # Pad in Y direction
      newimg = np.pad(newimg, [(0,pad_width),(0,0),(0,0)], mode="mean")

  else:

    new_y = target_shape[1]
    new_x = int(source_shape[0]*scale_y_factor)
    resize_dim = (new_x, new_y)
    newimg = cv2.resize(image, resize_dim)
    pad_width = target_shape[0] - new_x
    if pad_width > 0:
      # Pad in X direction
      newimg = np.pad(newimg, [(0,0),(0,pad_width),(0,0)], mode="mean")

  newimg = (newimg*255.0).astype(np.int64)

  return newimg


def detect_and_preprocess(image):

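  detector = FaceDetector()
  detect_threshold = 0.5
  predictions = detector.predict(image, detect_threshold)

  # Fail with a clear message instead of an IndexError when no face is found.
  if not predictions:
    raise ValueError("No face detected in the captured image.")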

  xcenter = predictions[0][0]
  ycenter = predictions[0][1]
  width = predictions[0][2]*1.80
  height = predictions[0][3]*1.80

  xmax = image.shape[1]
  ymax = image.shape[0]

  ystart = max(0,int(ycenter-height/2))
  yend = min(ymax,int(ycenter+height/2))
  xstart = max(0,int(xcenter-width/2))
  xend = min(xmax,int(xcenter+width/2))

  img_with_face = image[ystart:yend,xstart:xend,:]

  image_shape = (128,128,3)
  img_post = _normalize_dimensions(img_with_face, target_shape=image_shape)

  return img_post



The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
  * https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  * https://github.com/tensorflow/addons
  * https://github.com/tensorflow/io (for I/O related ops)
If you depend on functionality not listed there, please file an issue.

INFO:tensorflow:Entry Point [tensor2tensor.envs.tic_tac_toe_env:TicTacToeEnv] registered with id [T2TEnv-TicTacToeEnv-v0]

# Prototypes

##### Basic JS in colab

In [64]:
%%javascript
function dev() {
  return 1
}
function dev2() {
  return dev()
}
console.log(dev2())

<IPython.core.display.Javascript object>

##### Basic HTMLElement subclass, HTTP request

In [48]:
# Display a basic HTMLElement subclass that makes an HTTP request

import IPython
display(IPython.display.HTML('''

  <head>

  <script>

  function queryServedModel() {
    const Http = new XMLHttpRequest();
    const url='https://jsonplaceholder.typicode.com/posts';
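    Http.open("GET", url);

    // Register the handler before sending and log only the completed response.
    Http.onreadystatechange = (e) => {
      if (Http.readyState === XMLHttpRequest.DONE) {
        console.log(Http.responseText);
      }
    }
    Http.send();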
  }

  class FXComponent extends HTMLElement {

    constructor() {
      super(); // always call super() first in the constructor.
      console.log("constructed");
    }

    connectedCallback() {
      // E.g. somewhere in its lifecycle an HTTP request is made...
      queryServedModel();
      console.log("connected")
    }

  }

  customElements.define('fx-component', FXComponent);

  </script>

  </head>

  <body>
    <fx-component></fx-component>
  </body>

'''))

##### LitElement in Colab

Here are some docs to get started:
- https://lit-element.polymer-project.org/
- https://lit-element.polymer-project.org/try

In [0]:
display(IPython.display.HTML('''

  <head>

  <script>
    
    // -------------
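    // This does not work, so the rest does not work either: `import` is only
    // allowed in module scripts (type="module"), and the browser cannot
    // resolve the bare specifier 'lit-element' without a bundler or a full URL.
    //
    // import { LitElement, html } from 'lit-element';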
    // --------------

    class FXComponent extends LitElement {

      /**
       * Implement `render` to define a template for your element.
       *
       * You must provide an implementation of `render` for any element
       * that uses LitElement as a base class.
       */
      render(){
        /**
         * `render` must return a lit-html `TemplateResult`.
         *
         * To create a `TemplateResult`, tag a JavaScript template literal
         * with the `html` helper function:
         */
        return html`
          <!-- template content -->
          <p>A paragraph</p>
        `;
      }
    }

    customElements.define('fx-component', FXComponent);

  </script>

  </head>

  <body>
    <fx-component></fx-component>
  </body>

'''))
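A possible workaround, sketched here under assumptions, is to load LitElement as an ES module from a CDN inside a module script. The unpkg URL with the `?module` query, the tag name `fx-lit-component`, and the helper names `lit_component_html` and `show` are illustrative choices that have not been verified in this notebook.

```python
# Assumed CDN URL; the version matches the npm install output above.
LIT_ELEMENT_URL = "https://unpkg.com/lit-element@2.2.1/lit-element.js?module"


def lit_component_html(tag="fx-lit-component", url=LIT_ELEMENT_URL):
    """Build HTML that defines and renders a minimal LitElement component."""
    # %-formatting is used because the JavaScript body contains braces.
    return '''
<script type="module">
  import { LitElement, html } from "%s";

  class FXLitComponent extends LitElement {
    render() {
      return html`<p>A paragraph</p>`;
    }
  }

  // Guard against re-defining the element if the cell is re-run.
  if (!customElements.get("%s")) {
    customElements.define("%s", FXLitComponent);
  }
</script>
<%s></%s>
''' % (url, tag, tag, tag, tag)


def show():
    """Render the component in the notebook (requires IPython)."""
    from IPython.display import HTML, display
    display(HTML(lit_component_html()))
```

Calling `show()` in a cell should render the paragraph if the module import resolves; if it does not, the browser console is the place to look for the failing request.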

##### Polymer component in colab

For debugging, we tried Polymer < 3.0. This does not currently work.

In [0]:
import IPython
display(IPython.display.HTML('''

  <head>

  <link rel="import"  href="https://polygit.org/polymer+^1.9.1/webcomponentsjs+^0.7.0/components/polymer/polymer.html">

  <script>

  Polymer({
    is: "proto-element",

    // add a callback to the element's prototype
    ready: function() {
      this.textContent = "I'm a proto-element. Check out my prototype!"
    }
  });

  </script>

  </head>

  <body>
    <proto-element></proto-element>
  </body>

'''))


##### Call python model from JS

This is intended to take the place of a remote HTTP query, in the interest of simplicity while developing new FX components.

In [0]:
import IPython
from google.colab import output

def infer_state_from_image(input_tensor):
  # Use display.JSON to transfer a structured result.
  return IPython.display.JSON({'result': 1234})

output.register_callback('notebook.infer_state_from_image', infer_state_from_image)


In [83]:

%%javascript
(async function() {

  const result = await google.colab.kernel.invokeFunction(
    'notebook.infer_state_from_image', // The callback name.
    ['image_data'], // The arguments.
    {}); // kwargs

  const text = result.data['application/json'];

  document.querySelector("#output-area").appendChild(document.createTextNode(text.result));

})();


<IPython.core.display.Javascript object>
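The callback above receives a placeholder string. In practice the JS side would probably pass a canvas data URL, as `take_photo` returns, and the callback would need to decode it. A minimal sketch of that decoding step follows; the helper name `decode_data_url` is hypothetical.

```python
import base64


def decode_data_url(data_url):
    """Split a base64 data URL into its MIME type and decoded bytes."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or ";base64" not in header:
        raise ValueError("expected a base64-encoded data URL")
    mime = header[len("data:"):].split(";")[0]
    return mime, base64.b64decode(payload)


# Tiny stand-in payload; a real call would pass canvas.toDataURL(...) output.
mime, raw = decode_data_url("data:image/jpeg;base64,aGk=")
print(mime, len(raw))
```

The returned bytes can then be handed to an image decoder inside the registered callback.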

##### Faster js -> python image capture

The cell below shows the method used in the first notebook that was built, which was somewhat slow. It could be sped up, perhaps by not removing and recreating the media elements on every capture, or we could opt for a different way of interoperating between Python and JS.

In [0]:
from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode
import cv2
import numpy as np

import matplotlib
%matplotlib inline

from matplotlib.pyplot import imshow 


def get_frame(quality=0.8):
  """Capture one webcam frame in the browser and return it as an RGB array."""

  js = Javascript('''

    async function takePhoto(quality) {

      const div = document.createElement('div');

      const video = document.createElement('video');
      video.style.display = 'block';
      const stream = await navigator.mediaDevices.getUserMedia({video: true});

      document.body.appendChild(div);
      div.appendChild(video);
      video.srcObject = stream;
      await video.play();

      // Resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      const canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      canvas.getContext('2d').drawImage(video, 0, 0);
      stream.getVideoTracks()[0].stop();
      div.remove();
      return canvas.toDataURL('image/jpeg', quality);
    }
    ''')

  display(js)

  # Pass the requested JPEG quality (0 to 1) through to the browser.
  data = eval_js('takePhoto({})'.format(quality))

  data = b64decode(data.split(',')[1])

  img = cv2.imdecode(np.frombuffer(data, dtype=np.uint8),
                     cv2.IMREAD_COLOR)
  img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

  return img

data = get_frame()

imshow(data)
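The idea of not recreating the media elements could look roughly like the sketch below, which caches the video element and camera stream on `window` and reuses them for later frames. The names `CAPTURE_JS`, `grabFrame`, `_fxVideo`, `_fxCanvas`, and `grab_frame_data_url` are hypothetical, and whether the cached objects survive between calls depends on how Colab scopes output frames, so treat this as an untested starting point.

```python
CAPTURE_JS = '''
async function grabFrame(quality) {
  // Create the hidden video element and camera stream once, then reuse them.
  if (!window._fxVideo) {
    const video = document.createElement('video');
    video.style.visibility = 'hidden';
    document.body.appendChild(video);
    video.srcObject = await navigator.mediaDevices.getUserMedia({video: true});
    await video.play();
    window._fxVideo = video;
    window._fxCanvas = document.createElement('canvas');
  }
  const video = window._fxVideo;
  const canvas = window._fxCanvas;
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext('2d').drawImage(video, 0, 0);
  return canvas.toDataURL('image/jpeg', quality);
}
'''


def grab_frame_data_url(quality=0.8):
    """Return one frame as a JPEG data URL (only works inside Colab)."""
    from IPython.display import Javascript, display
    from google.colab.output import eval_js
    display(Javascript(CAPTURE_JS))
    return eval_js('grabFrame({})'.format(quality))
```

Because the stream is left open on purpose, the camera stays active until the output is cleared or the tracks are stopped explicitly.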


##### Eager-mode inference

Code for the previous demo, which did inference in TensorFlow eager mode. The runtime must be reset each time the model is restored from a checkpoint.

In [0]:
%tensorflow_version 1.x
import tensorflow as tf

tfe = tf.contrib.eager
tfe.enable_eager_execution()
Modes = tf.estimator.ModeKeys

from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode
import base64

import io
import numpy as np

from PIL import Image

from faced import FaceDetector
from faced.utils import annotate_image
import copy

import cv2

import matplotlib
%matplotlib inline

from matplotlib import pyplot as plt

from pcml.models import percep_similarity_emb

from tensor2tensor.utils import registry

from pcml.datasets import image_aug

from tensor2tensor.serving import serving_utils

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc
from pcml.utils.dev_utils import run_server, wait_for_server_ready

from gtts import gTTS
from IPython.display import (
    Audio, display, clear_output)
import time

problem_name = "facial_expression_correspondence"
model_name = "percep_similarity_triplet_emb"
hparams_set_name = "percep_similarity_triplet_emb"
mode = "predict"
ckpt_dir = "/tmp/output"
data_dir = ckpt_dir
export_dir = "/tmp/output/export"

hparams = registry.hparams(hparams_set_name)
hparams.data_dir = data_dir

problem_obj = registry.problem(problem_name)

p_hparams = problem_obj.get_hparams(hparams)

model_obj = registry.model(model_name)


def get_example():

  image_stats = {"mean": [0.330, 0.537, -0.242], "sd": [0.220, 0.169, 1.156]}
  shape = (64,64,3)
  mode = "eval"

  img = np.asarray((load_image(take_photo())))
  
  img = detect_and_preprocess(img)

  # Convert to int32

  example = {
    "image/a": img,
    "image/b": img,
    "image/c": img,
    "image/a/noaug": img,
    "image/b/noaug": img,
    "image/c/noaug": img,
    "triplet_code": [0],
    "type": [1],
    "targets": img
  }

  def _preproc(image):

    image = image_aug.preprocess_image(
      image, mode,
      resize_size=shape,
      normalize=True,
      image_statistics=image_stats,
      crop_area_min=1,
      contrast_lower=0.45,
      contrast_upper=0.55,
      brightness_delta_min=-0.01,
      brightness_delta_max=0.01)

    image.set_shape(shape)

    return image

  example["image/a"] = tf.expand_dims(_preproc(example["image/a"]),0)
  example["image/b"] = tf.expand_dims(_preproc(example["image/b"]),0)
  example["image/c"] = tf.expand_dims(_preproc(example["image/c"]),0)
  example["triplet_code"] = tf.expand_dims(tf.cast(example["triplet_code"], tf.int64),0)

  return example


In [0]:

def _goal_from_goals(goals):
  return goals[0]


def _tts(msg):
  filename = "/tmp/clarify-tmp.mp3"
  tts = gTTS(msg)
  tts.save(filename)
  display(Audio(filename, autoplay=True))


def play_instructions():
  message = "Goal definition will begin in twenty seconds following a tone and will "
  message += "last for ten seconds. Please assume a mental state that you want to "
  message += "be your goal. The definition period will begin in seven seconds."
  _tts(message)
  time.sleep(20)
  synth(600)


def play_beginning_session():
  _tts("Goal definition complete. Beginning session.")
  time.sleep(3)


def play_establishing_baseline():
  _tts("Establishing baseline. Please demonstrate a variety of expressions and poses.")
  time.sleep(3)


def play_baseline_complete():
  _tts("Finished establishing baseline.")
  time.sleep(3)


def synth(f):
  rate = 16000.
  duration = .25
  t = np.linspace(
    0., duration, int(rate * duration))

  x = np.sin(f * 2. * np.pi * t)
  display(Audio(x, rate=rate, autoplay=True))


def optimize():

  play_instructions()

  query_data = {}

  synth_min = 200
  synth_max = 600
  synth_range = synth_max - synth_min

  mn = None
  mx = None

  goals = []
  goal = None

  distance_threshold = 1.3

  num_sampled = 0

  baseline = True

  distances = []
  examples = []
  num_baseline_steps = 0

  with tfe.restore_variables_on_create(tf.train.latest_checkpoint(ckpt_dir)):

    model = model_obj(hparams, mode, p_hparams)

    while True:

      print("Doing step...")

      try:

        example = get_example()
        current, _ = model(example)

        num_sampled += 1

        if num_sampled <= 3:

          goals.append(current)
          distances.append(0)
          examples.append(example)

        else:

          if goal is None:
            goal = _goal_from_goals(goals)
            play_beginning_session()

          dist = np.linalg.norm(current - goal)
          print(dist)

          distances.append(dist)
          examples.append(example)

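          # Track the running min/max distance (0.0 is a valid distance).
          if mn is None or dist < mn:
            mn = dist
          if mx is None or dist > mx:
            mx = dist

          if num_sampled == 4:
            play_establishing_baseline()

          # Map the distance onto the tone range; avoid 0/0 on the first sample.
          if mx > mn:
            synth_level = synth_min + synth_range*((dist-mn)/(mx-mn))
          else:
            synth_level = synth_min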

          #if distances:
          #  distance_threshold = np.mean(distances)

          print(synth_level)
          if dist > distance_threshold:
            synth(synth_level)

          if num_sampled == 20:
            play_baseline_complete()
            baseline = False
            num_baseline_steps = num_sampled

      except (KeyboardInterrupt, SystemExit):
        break

      except Exception:
        # Re-raised for now to ease debugging; to keep the loop running
        # through errors, log the exception here instead of raising.
        raise

  return locals()

In [0]:
optimize()

Critical assessment:

* It seems useful representations of goal vs. non-goal states are being made and there is a strong technical foundation for building state-based feedback experiences.
* This particular form of feedback experience can be improved in various ways. For one, it's distracting: the tones are uncomfortable and irregular (continuous tones may be better). Also, playing a tone interrupts any music that is playing by temporarily turning down its volume. Instead, regulation of system volume could itself be the feedback cue (i.e. on-target corresponds to the desired volume).
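The volume-as-feedback idea can be sketched as a mapping from embedding distance to a playback volume, with smoothing so the level changes continuously rather than in jumps. The function names, the default volumes, and the smoothing factor below are assumptions for illustration; actually setting the system volume is left out.

```python
def distance_to_volume(dist, mn, mx, desired_volume=0.6, floor_volume=0.1):
    """Map a distance to a volume: on-target gives desired, far off gives floor."""
    if mx <= mn:
        # No usable range observed yet, so stay at the desired volume.
        return desired_volume
    frac = (dist - mn) / (mx - mn)
    frac = min(1.0, max(0.0, frac))  # clamp to [0, 1]
    return desired_volume - frac * (desired_volume - floor_volume)


def smooth(previous, target, alpha=0.3):
    """Exponential smoothing: move a fraction alpha from previous toward target."""
    return previous + alpha * (target - previous)


# Example run over a few made-up distances.
volume = 0.6
for dist in [1.0, 1.5, 2.0, 1.2]:
    volume = smooth(volume, distance_to_volume(dist, mn=1.0, mx=2.0))
    print(round(volume, 3))
```

A smaller `alpha` makes the volume drift more slowly, which may feel less intrusive than discrete tones.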


##### Query locally served model

TODO: This is temporary; it avoids having to restart the runtime for each run of the demo.

###### Run the server

Needs `load_image`, `take_photo`, and `detect_and_preprocess` from the "Shared functions for image capture" section in Setup.

In [9]:
# Make sure you have actually obtained the checkpoint
!ls /tmp/output | wc -l

68
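A file count alone does not show that a usable checkpoint is present. A slightly stronger heuristic, sketched here as an assumption, is to look for the `checkpoint` state file (visible in the download log above) plus at least one `*.index` file. The helper name and the example file name `model.ckpt-0.index` are placeholders.

```python
import os
import tempfile


def checkpoint_looks_complete(ckpt_dir):
    """Heuristic check that a directory holds a TensorFlow checkpoint."""
    if not os.path.isdir(ckpt_dir):
        return False
    names = os.listdir(ckpt_dir)
    has_state = "checkpoint" in names
    has_index = any(n.endswith(".index") for n in names)
    return has_state and has_index


# Demonstration on a temporary directory with placeholder file names.
with tempfile.TemporaryDirectory() as d:
    empty_ok = checkpoint_looks_complete(d)
    for name in ("checkpoint", "model.ckpt-0.index"):
        open(os.path.join(d, name), "w").close()
    filled_ok = checkpoint_looks_complete(d)
print(empty_ok, filled_ok)
```

In the notebook this would be called as `checkpoint_looks_complete("/tmp/output")` before starting the server.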


In [0]:

import numpy as np
import tensorflow as tf  # used below for tf.train.Example and tensor protos

from pcml.utils.dev_utils import run_server, wait_for_server_ready

from pcml.models import percep_similarity_emb
from tensor2tensor.serving import serving_utils

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

problem_name = "facial_expression_correspondence"
model_name = "percep_similarity_triplet_emb"
hparams_set_name = "percep_similarity_triplet_emb"
mode = "predict"
ckpt_dir = "/tmp/output"
data_dir = ckpt_dir
export_dir = "/tmp/output/export"

tfms_path = "/usr/bin/tensorflow_model_server"
!ls {tfms_path}

_, server, _ = run_server(
    model_name,
    export_dir,
    tfms_path
)

wait_for_server_ready(server)

def prepare_request():

  img = np.asarray((load_image(take_photo())))

  img = detect_and_preprocess(img)

  image_feature = tf.train.Feature(int64_list=tf.train.Int64List(value=img.flatten().tolist()))

  ex = tf.train.Example(features=tf.train.Features(feature={
    "image/a": image_feature,
    "image/b": image_feature,
    "image/c": image_feature,
    "triplet_code": tf.train.Feature(int64_list=tf.train.Int64List(value=[0] * 1)),
    "type": tf.train.Feature(int64_list=tf.train.Int64List(value=[0] * 1)),
    "rating/mode": tf.train.Feature(int64_list=tf.train.Int64List(value=[0] * 1)),
    "targets": tf.train.Feature(int64_list=tf.train.Int64List(value=[0] * 1))
  }))

  return ex


def single_served_model_request(ex, servable_name):

  proto = tf.make_tensor_proto([ex.SerializeToString()], shape=[1])

  request = predict_pb2.PredictRequest()
  request.model_spec.name = servable_name
  request.inputs["input"].CopyFrom(proto)

  response = stub.Predict(request, timeout_secs)
  outputs = tf.make_ndarray(response.outputs["outputs"])
  return outputs[0]
  

timeout_secs = 5
stub = serving_utils._create_stub(server)


###### Make requests

In [0]:
# Make an expression that will be the goal
goal = single_served_model_request(prepare_request(), model_name)

In [0]:
# Collect a series of on-target images
on_target = []
for i in range(20):
  on_target.append(single_served_model_request(prepare_request(), model_name))

In [0]:
# Collect a series of off-target images
off_target = []
for i in range(20):
  off_target.append(single_served_model_request(prepare_request(), model_name))

In [0]:
import numpy as np

# The distances off-target should be larger than on-target

print("on-target")
tot_on = 0
for emb in on_target:
  d = np.linalg.norm(goal-emb)
  tot_on += d
  print(d)
print("on target mean distance: {}".format(tot_on/len(on_target)))

print("\n")

print("off-target")
tot_off = 0
for emb in off_target:
  d = np.linalg.norm(goal-emb)
  tot_off += d
  print(d)
print("off target mean distance: {}".format(tot_off/len(off_target)))
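If the off-target distances do come out larger, the two sets can also be used to choose a distance threshold instead of hard-coding one. The sketch below takes the midpoint of the two mean distances and reports how many samples it separates correctly. The function names and the toy embeddings are assumptions, and the midpoint rule is only one simple choice.

```python
import math
from statistics import mean


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def calibrate_threshold(goal, on_target, off_target):
    """Return (threshold, accuracy) using the midpoint of the mean distances."""
    on_d = [l2(goal, e) for e in on_target]
    off_d = [l2(goal, e) for e in off_target]
    threshold = (mean(on_d) + mean(off_d)) / 2.0
    correct = sum(d <= threshold for d in on_d)
    correct += sum(d > threshold for d in off_d)
    return threshold, correct / (len(on_d) + len(off_d))


# Toy 2-D embeddings standing in for the model outputs collected above.
toy_goal = [0.0, 0.0]
toy_on = [[0.1, 0.0], [0.0, 0.2]]
toy_off = [[1.0, 0.0], [0.0, 2.0]]
threshold, accuracy = calibrate_threshold(toy_goal, toy_on, toy_off)
print(threshold, accuracy)
```

The same function should accept the `goal`, `on_target`, and `off_target` arrays from the cells above, since it only iterates over their elements.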
