Video summary support #39

Open

teamdandelion opened this issue Jun 16, 2017 · 28 comments

Comments

@teamdandelion
Contributor

Migrated from tensorflow/tensorflow#3936

Just wondering if there are any plans to add video summaries to TensorBoard?

@falcondai offered to implement it as a plugin. @falcondai, I think we are almost at the point where we're ready to accept a plugin like that. Give us a few more weeks to clean up the plugin API and get some good examples you can work off of.

@falcondai

Sounds good. I look forward to that 😄

@chrisranderson
Contributor

@dandelionmane @falcondai @jart I'd be willing to build it as part of #130, if @falcondai doesn't mind. I think my project should be split into two pieces: the piece that generates the frames, and the summary piece that streams frames to TensorBoard.

@falcondai

falcondai commented Jul 7, 2017

@chrisranderson I originally planned to implement the video summaries as GIFs: I have experience doing that from a related project, tensorflow/models#553. I thought about the alternative of showing the videos as full-blown HTML5-standard videos (I assume that is what you have in mind) but decided it was less desirable. My reasons:

  • currently, much relevant research (video prediction, etc.) focuses on generating very short clips, so a looping GIF should suffice
  • universal support for GIF playback
  • little load on the TensorBoard server (compiling a GIF and sending it)

That said, I can see full-blown video playback being useful (in video segmentation and long video prediction?), and the playback controls that come with video formats would help in those cases. I wonder what other people think.

It might be worth implementing both and letting end users choose whichever is more appropriate for their usage. What do you think @dandelionmane?

In any case, help is more than welcome!

@teamdandelion
Contributor Author

I have no objection to having a "gif dashboard" plugin and a "streaming video dashboard" plugin. They do seem to satisfy potentially different use cases, and the more plugins the merrier, imo.

@falcondai

falcondai commented Jul 7, 2017

Out of curiosity, how would the TensorBoard UI change to accommodate many plugins? I imagine that only showing the ones being used would be great (up to now, a few tabs on mine are always empty).

@teamdandelion
Contributor Author

teamdandelion commented Jul 8, 2017

Check out #181 from @wchargin! It makes TensorBoard display only active plugins. We'll still need to change it if people start to have many active plugins at the same time, but we haven't reached that point yet.

In the long term, I've thought about moving the plugin list to the left side and having it expand/collapse using a hamburger pullout-style menu. That way, if the number of plugins is large, you can scroll through the list.

(When the pullout menu is retracted, each plugin would show a small icon representation, so if you remember the icons, you can switch plugins without pulling out the menu, and save horizontal space.)

@falcondai

Sounds good! I look forward to these UI changes.

Sorry to digress (further) from the original issue: is mobile support on the feature timeline? I sometimes find myself using TB on my phone on the go. Honestly, it works okay as is for inspecting a chart or two, but I think just a little responsive-design CSS-fu would significantly improve it.

@teamdandelion
Contributor Author

Mobile support isn't on the feature timeline, but if you want to take a stab at CSS-fu, we'll be happy to review the pull request :)

@miriaford

Is the video summary feature still on the table? I think an important use case is reinforcement learning. For example, being able to see a video of the current Cartpole policy would be very helpful. The video is typically at most 1 minute for many Gym environments.

@chrisranderson
Contributor

@miriaford You can, as of #613. There's an example in the README of passing in an arrays parameter, which can be an image: https://github.com/chrisranderson/beholder/. There's also functionality for recording video.

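For reference, a rough usage sketch pieced together from that README; the import path and the Beholder/update signatures here are assumptions from memory and may not match the released package exactly:

import numpy as np
import tensorflow as tf
from beholder.beholder import Beholder  # assumed import path for the standalone package

LOG_DIRECTORY = '/tmp/beholder-demo'  # hypothetical log directory
session = tf.Session()
visualizer = Beholder(session=session, logdir=LOG_DIRECTORY)

# Inside the training loop: stream arbitrary arrays (e.g. video frames)
# to the Beholder tab in TensorBoard.
frame = np.random.randint(0, 255, (84, 84)).astype(np.uint8)
visualizer.update(arrays=[frame], frame=frame)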

@dabana

dabana commented Nov 27, 2017

I would like to add to @miriaford's comment. I also think it would be VERY useful for reinforcement learning (RL), just to check whether the agent is actually learning something that makes sense. Like @miriaford says, these little GIFs are really short. In my case (playing VizDoom), at most a couple hundred stacks. Most often they are also low resolution (84x84) and single channel.

@chrisranderson your tool looks awesome; I will definitely try it out some day. But it looks a little overkill for "quick and dirty" diagnosis of RL agents. I was wondering: instead of an additional plugin, wouldn't it be more straightforward to just add GIF support to the already existing and excellent Image tab? What would be the main constraints?

Many thanks to anyone working on this (these) feature(s)!

@danijar

danijar commented Jan 26, 2018

@chrisranderson This looks nice. Is there a way to write these summaries from within the graph, something like a tf.summary.video(name, tensor_of_frames)?

@alexlee-gk

alexlee-gk commented Jan 26, 2018

I'm not sure why this hasn't been mentioned before, but I just noticed that TensorBoard's image plugin has supported GIFs all along. What's missing is an out-of-the-box way to add GIF summaries with TF. Currently that isn't possible because TF lacks a GIF encoder and tf.summary.image can't take in encoded strings (it only takes image tensors). One option is to encode the tensor with a third-party library, manually construct a protobuf image summary, and then add that to the summary writer. Here is a self-contained example of how to do that:

import os
import tempfile

import moviepy.editor as mpy
import numpy as np
import tensorflow as tf

def convert_tensor_to_gif_summary(summ):
    if isinstance(summ, bytes):
        summary_proto = tf.Summary()
        summary_proto.ParseFromString(summ)
        summ = summary_proto

    summary = tf.Summary()
    for value in summ.value:
        tag = value.tag
        images_arr = tf.make_ndarray(value.tensor)

        if len(images_arr.shape) == 5:
            # concatenate batch dimension horizontally
            images_arr = np.concatenate(list(images_arr), axis=-2)
        if len(images_arr.shape) != 4:
            raise ValueError('Tensors must be 4-D or 5-D for gif summary.')
        if images_arr.shape[-1] != 3:
            raise ValueError('Tensors must have 3 channels.')

        # encode sequence of images into gif string
        clip = mpy.ImageSequenceClip(list(images_arr), fps=4)
        # NamedTemporaryFile is used only to grab a unique temporary path
        with tempfile.NamedTemporaryFile() as f:
            filename = f.name + '.gif'
        clip.write_gif(filename, verbose=False)
        with open(filename, 'rb') as f:
            encoded_image_string = f.read()
        os.remove(filename)  # clean up the temporary gif file

        image = tf.Summary.Image()
        image.height = images_arr.shape[-3]
        image.width = images_arr.shape[-2]
        image.colorspace = 3  # code for 'RGB'
        image.encoded_image_string = encoded_image_string
        summary.value.add(tag=tag, image=image)
    return summary

sess = tf.Session()
summary_writer = tf.summary.FileWriter('logs/image_summary', graph=tf.get_default_graph())

images_shape = (16, 12, 64, 64, 3)  # batch, time, height, width, channels
images = np.random.randint(256, size=images_shape).astype(np.uint8)
images = tf.convert_to_tensor(images)

tensor_summ = tf.summary.tensor_summary('images_gif', images)
tensor_value = sess.run(tensor_summ)
summary_writer.add_summary(convert_tensor_to_gif_summary(tensor_value), 0)

summ = tf.summary.image("images", images[:, 0])  # first time-step only
value = sess.run(summ)
summary_writer.add_summary(value, 0)

summary_writer.flush()

@ankush-me

Thanks @alexlee-gk! I have simplified the interface below to match the standard summary ops.
It still needs to be extended to take in a batch of GIFs -- that should be straightforward (I hope)!

import os
import tempfile

import moviepy.editor as mpy
import tensorflow as tf

from tensorflow.python.framework import constant_op
from tensorflow.python.ops import summary_op_util


def py_encode_gif(im_thwc, tag, fps=4):
  """
  Given a 4D numpy tensor of images (TxHxWxC), encodes it as a gif
  and returns a serialized image Summary protobuf.
  """
  # NamedTemporaryFile is used only to grab a unique temporary path
  with tempfile.NamedTemporaryFile() as f:
    fname = f.name + '.gif'
  clip = mpy.ImageSequenceClip(list(im_thwc), fps=fps)
  clip.write_gif(fname, verbose=False, progress_bar=False)
  with open(fname, 'rb') as f:
    enc_gif = f.read()
  os.remove(fname)
  # create a tensorflow image summary protobuf:
  thwc = im_thwc.shape
  im_summ = tf.Summary.Image()
  im_summ.height = thwc[1]
  im_summ.width = thwc[2]
  im_summ.colorspace = 3  # fix to 3 == RGB
  im_summ.encoded_image_string = enc_gif
  # create a summary obj:
  summ = tf.Summary()
  summ.value.add(tag=tag, image=im_summ)
  summ_str = summ.SerializeToString()
  return summ_str


def add_gif_summary(name, im_thwc, fps=4, collections=None, family=None):
  """
  im_thwc: 4D tensor (TxHxWxC) for which a GIF is to be generated.
  collections: collections to which the summary op is to be added.
  """
  if summary_op_util.skip_summary():
    return constant_op.constant('')
  with summary_op_util.summary_scope(name, family, values=[im_thwc]) as (tag, scope):
    gif_summ = tf.py_func(py_encode_gif, [im_thwc, tag, fps], tf.string, stateful=False)
    summary_op_util.collect(gif_summ, collections, [tf.GraphKeys.SUMMARIES])
  return gif_summ
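
A minimal usage sketch in TF 1.x graph mode (the tag 'rollout', the shapes, and the log directory are illustrative):

import numpy as np

frames = tf.placeholder(tf.uint8, shape=(12, 64, 64, 3))  # TxHxWxC
gif_summ_op = add_gif_summary('rollout', frames, fps=4)

writer = tf.summary.FileWriter('logs/gif_summary')
with tf.Session() as sess:
  feed = {frames: np.random.randint(256, size=(12, 64, 64, 3)).astype(np.uint8)}
  # add_summary accepts the serialized Summary string returned by py_encode_gif
  writer.add_summary(sess.run(gif_summ_op, feed_dict=feed), global_step=0)
writer.flush()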

@alexlee-gk

That's neat! For encoding the GIF, I'd suggest using ffmpeg directly instead of moviepy, via the encode_gif function I wrote here. It avoids writing to a temporary file by piping output directly into an encoded string. More importantly, it uses ffmpeg's palette generation, which I have found to work better (in terms of artifacts and time) than the color optimizations available in moviepy.
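
The idea, roughly, is to pipe raw frames into ffmpeg's stdin and read the encoded GIF from stdout, using the palettegen/paletteuse filters. A condensed sketch, not the exact linked function, assuming ffmpeg is on $PATH:

from subprocess import Popen, PIPE

def encode_gif_ffmpeg(frames, fps):
  """frames: sequence of HxWx3 uint8 arrays; returns encoded GIF bytes."""
  h, w, _ = frames[0].shape
  cmd = (f'ffmpeg -y -f rawvideo -vcodec rawvideo '
         f'-r {fps} -s {w}x{h} -pix_fmt rgb24 -i - -filter_complex '
         f'[0:v]split[a][b];[a]palettegen[p];[b][p]paletteuse '
         f'-r {fps} -f gif -')
  proc = Popen(cmd.split(' '), stdin=PIPE, stdout=PIPE, stderr=PIPE)
  # feed all raw frames at once and collect the encoded GIF from stdout
  out, err = proc.communicate(input=b''.join(f.tobytes() for f in frames))
  if proc.returncode:
    raise IOError(err.decode('utf8'))
  return out  # ready for Summary.Image.encoded_image_string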

@PHarshali

Still waiting for TensorBoard video support and very excited. Until then, anyone wanting to read about TensorBoard can follow this link - https://data-flair.training/blogs/tensorboard-tutorial/

@alexlee-gk

I also wrote GIF summaries for the new summary API. Here are self-contained Colabs that use GIF summaries with the original summary API and with summary API v2.

Original summary API:
https://colab.research.google.com/drive/1vgD2HML7Cea_z5c3kPBcsHUIxaEVDiIc

Summary API v2:
https://colab.research.google.com/drive/1CSOrCK8-iQCZfs3CVchLE42C52M_3Sej

@alexlee-gk

The original ffmpeg command was dropping some frames for GIFs with more than a certain number of frames. This would only happen under certain circumstances (e.g. only on some machines I tried). The dropping issue is fixed by adding [x]fifo[x] to the filtering part of the ffmpeg command. I have updated the Colabs with this fix. Thanks to @kpertsch for spotting and figuring this out!

Reference: https://superuser.com/questions/1323429/how-to-efficiently-create-a-best-palette-gif-from-a-video-portion-straight-from
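
Concretely, the fix changes the -filter_complex graph of the ffmpeg command (filter strings as used by the encode_gif variants in this thread; the "before" line is the shape of the original command):

# before: palettegen could silently drop frames on some machines/builds
filters = '[0:v]split[x][z];[z]palettegen[y];[x][y]paletteuse'

# after: an explicit fifo buffers the split branch until the palette is ready
filters = '[0:v]split[x][z];[z]palettegen[y];[x]fifo[x];[x][y]paletteuse'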

@htung0101

@alexlee-gk Is there a way to use this with TF2, which runs in eager mode?
I copy-pasted your code, but nothing shows up.
Can we add something like tf.summary at the very end to write the data out?

@PeterMitrano

PeterMitrano commented Oct 1, 2019

The summary API v2 works for me in TF 1.14 in eager mode.

@kdbanman

kdbanman commented Dec 25, 2019

This is a simple adaptation of @alexlee-gk's original code, but for TF 2.x. Demo Colab:

https://colab.research.google.com/drive/1ut0eJJ3pLjJYrgqVfz2QG69JnKA_2wQO

Code

As earlier in this thread, this uses moviepy to encode the GIF as bytes, then builds a v1 summary protobuf with it.

import tempfile
import moviepy.editor as mpy
import os
import tensorflow as tf

def encode_gif_bytes(im_thwc, fps=4):
  # NamedTemporaryFile is used only to grab a unique temporary path
  with tempfile.NamedTemporaryFile() as f:
    fname = f.name + '.gif'
  clip = mpy.ImageSequenceClip(list(im_thwc), fps=fps)
  clip.write_gif(fname, verbose=False, progress_bar=False)

  with open(fname, 'rb') as f:
    enc_gif = f.read()
  os.remove(fname)

  return enc_gif

def gif_summary(im_thwc, fps=4):
  """
  Given a 4D numpy tensor of images (TxHxWxC), encode a gif into a Summary protobuf.
  NOTE: Tensor must be in the range [0, 255] as opposed to the usual small float values.
  """
  # create a tensorflow image summary protobuf:
  thwc = im_thwc.shape
  im_summ = tf.compat.v1.Summary.Image()
  im_summ.height = thwc[1]
  im_summ.width = thwc[2]
  im_summ.colorspace = thwc[3]  # 1 == grayscale, 3 == RGB, 4 == RGBA
  im_summ.encoded_image_string = encode_gif_bytes(im_thwc, fps)

  # create a serialized summary obj:
  summ = tf.compat.v1.Summary()
  summ.value.add(image=im_summ)
  return summ.SerializeToString()

Usage

The summary object can be used with the new summary writer as follows:

# Tensorboard boilerplate
from tensorflow import summary

import numpy as np
import datetime

current_time = str(datetime.datetime.now().timestamp())
log_dir = 'logs/tensorboard/' + current_time
summary_writer = summary.create_file_writer(log_dir)

# Here I'll make T=48 frames of greyscale noise.  Any tensor of
# shape TxHxWxC with pixel values in range [0, 255] will work.
gif_tensor = np.random.random((48, 100, 100, 1))
gif_tensor = (gif_tensor * 255).astype(np.uint8)  # moviepy expects uint8 frames

# Just call gif_summary and write the result with the summary module imported above
gif = gif_summary(gif_tensor, fps=24)

with summary_writer.as_default():
  # Optionally pass step and name
  summary.experimental.write_raw_pb(gif, step=1, name='wow gifs')

@danijar

danijar commented Dec 25, 2019

I also wrote a cleaned-up TF 2 version of Alex's GIF summary a few days ago. It works for eager tensors and NumPy arrays, and, with the note below, also for graph tensors inside tf.function:

def video_summary(name, video, step=None, fps=20):
  name = tf.constant(name).numpy().decode('utf-8')
  video = np.array(video)
  if video.dtype in (np.float32, np.float64):
    video = np.clip(255 * video, 0, 255).astype(np.uint8)
  B, T, H, W, C = video.shape
  try:
    frames = video.transpose((1, 2, 0, 3, 4)).reshape((T, H, B * W, C))
    summary = tf.compat.v1.Summary()
    # each gif frame is H x (B * W) after tiling the batch horizontally
    image = tf.compat.v1.Summary.Image(height=H, width=B * W, colorspace=C)
    image.encoded_image_string = encode_gif(frames, fps)
    summary.value.add(tag=name + '/gif', image=image)
    tf.summary.experimental.write_raw_pb(summary.SerializeToString(), step)
  except (IOError, OSError) as e:
    print('GIF summaries require ffmpeg in $PATH.', e)
    frames = video.transpose((0, 2, 1, 3, 4)).reshape((1, B * H, T * W, C))
    tf.summary.image(name + '/grid', frames, step)


def encode_gif(frames, fps):
  from subprocess import Popen, PIPE
  h, w, c = frames[0].shape
  pxfmt = {1: 'gray', 3: 'rgb24'}[c]
  cmd = ' '.join([
      f'ffmpeg -y -f rawvideo -vcodec rawvideo',
      f'-r {fps:.02f} -s {w}x{h} -pix_fmt {pxfmt} -i - -filter_complex',
      f'[0:v]split[x][z];[z]palettegen[y];[x]fifo[x];[x][y]paletteuse',
      f'-r {fps:.02f} -f gif -'])
  proc = Popen(cmd.split(' '), stdin=PIPE, stdout=PIPE, stderr=PIPE)
  for image in frames:
    proc.stdin.write(image.tobytes())
  out, err = proc.communicate()
  if proc.returncode:
    raise IOError('\n'.join([cmd, err.decode('utf8')]))
  del proc
  return out

If you want to use it inside tf.function, you need to give it access to the summary writer because it may be executed in a different thread for which your summary writer is not set as default:

@tf.function
def foo():
  # ...
  robust_video_summary(writer, 'name', video)
  # ...

def robust_video_summary(writer, name, video):
  step = tf.summary.experimental.get_step()
  def inner(name, video):
    if step is not None:
      tf.summary.experimental.set_step(step)
    with writer.as_default():
      video_summary(name, video)
  return tf.py_function(inner, [name, video], [])

@alextp What do you think about adding something like this to tf.summary? The above solution falls back to a static image that shows all frames left-to-right in case ffmpeg is not available, so it wouldn't add any required dependencies to TF. I'm not sure if that's good enough.

@alextp

alextp commented Dec 26, 2019

@nfelt I think we should add this; WDYT?

@danijar

danijar commented Jan 27, 2020

Any update on this?

@wchargin
Contributor

No; we’ll post here if there’s an update.

@danijar

danijar commented Jan 27, 2020

Okay, thanks! Looking forward to this.

@rsandler00

@wchargin @danijar any updates? Would be great to have this integrated!

@PawelFaron

I'm doing reinforcement learning, and it would be great to have this and be able to see a video from the last evaluation in TensorBoard.
