Video summary support #39

Open
decentralion opened this Issue Jun 16, 2017 · 16 comments

@decentralion
Contributor

decentralion commented Jun 16, 2017

Migrated from tensorflow/tensorflow#3936

Just wondering if there are any plans to add video summaries to TensorBoard?

@falcondai offered to implement it as a plugin. @falcondai, I think we are almost at the point where we're ready to accept a plugin like that. Give us a few more weeks to clean up the plugin API and get some good examples you can work off of.

@falcondai

falcondai commented Jun 20, 2017

Sounds good. I look forward to that 😄

@chrisranderson

Contributor

chrisranderson commented Jun 29, 2017

@dandelionmane @falcondai @jart I'd be willing to build it as part of #130, if @falcondai doesn't mind. I think my project should be split into two pieces - the piece that generates the frames, and the summary piece that streams frames to TensorBoard.

@falcondai

falcondai commented Jul 7, 2017

@chrisranderson I originally planned to implement the video summaries as GIFs: I have experience doing that from a related project, tensorflow/models#553. I considered the alternative of showing the videos as full-blown HTML5 videos (I assume that is what you have in mind) but decided it is less desirable. My reasons:

  • currently, much of the relevant research (video prediction, etc.) focuses on generating very short clips, so a looping GIF should suffice
  • GIF playback is universally supported
  • there is little load on the TensorBoard server (just compiling and sending a GIF)

That said, I can see full-blown video playback being useful (in video segmentation and long video prediction?), and the playback controls associated with video formats are useful in those cases. I wonder what other people think.

It might be worth implementing both and letting end users choose whichever is more appropriate for their usage. What do you think @dandelionmane?

In any case, help is more than welcome!

@decentralion

Contributor

decentralion commented Jul 7, 2017

I have no objection to having a "gif dashboard" plugin and a "streaming video dashboard" plugin. They do seem to satisfy potentially different use cases, and the more plugins the merrier, imo.

@falcondai

falcondai commented Jul 7, 2017

Out of curiosity, how would the TensorBoard UI change to accommodate many plugins? I imagine that showing only the plugins in use would be great (up to now a few tabs of mine are always empty).

@decentralion

Contributor

decentralion commented Jul 8, 2017

Check out #181 from @wchargin! It makes TensorBoard display only the active plugins. We'll still need to change it if people start to have many active plugins at the same time, but we haven't reached that point yet.

In the long term, I've thought about moving the plugin list to the left side, and having it expand/collapse using a hamburger pullout style menu. That way, if the number of plugins is large, you can scroll in the list.

(When the pullout menu is retracted, each plugin would show a small icon representation, so if you remember the icons, you can switch plugins without pulling out the menu, and save horizontal space.)

@falcondai

falcondai commented Jul 8, 2017

Sounds good! I look forward to these UI changes.

Sorry to digress (further) from the original issue: is mobile support on the feature timeline? I sometimes find myself using TensorBoard on my phone on the go. Honestly, it works okay as is for inspecting a chart or two, but I think just a little responsive-design CSS-fu would significantly improve it.

@decentralion

Contributor

decentralion commented Jul 10, 2017

Mobile support isn't on the feature timeline, but if you want to take a stab at CSS-fu, we'll be happy to review the pull request :)

@miriaford

miriaford commented Nov 15, 2017

Is the video summary feature still on the table? I think an important use case is reinforcement learning. For example, being able to see a video of the current Cartpole policy would be very helpful. The video is typically at most 1 minute for many Gym environments.

@chrisranderson

Contributor

chrisranderson commented Nov 15, 2017

@miriaford You can, as of #613. There's an example in the README of passing in an arrays parameter, which can be an image: https://github.com/chrisranderson/beholder/. There's also support for recording video.

@dabana

dabana commented Nov 27, 2017

I would like to add to @miriaford's comment. I also think it would be VERY useful for reinforcement learning (RL), just to check whether the agent is actually learning something that makes sense. Like @miriaford says, these little GIFs are really short. In my case (playing VizDoom), at most a couple of hundred stacks. Most often they are also low resolution (84x84) and single channel.

@chrisranderson your tool looks awesome, and I will definitely try it out some day. But it looks a little overkill for the "quick and dirty" diagnosis of RL agents. I was wondering: instead of an additional plugin, wouldn't it be more straightforward to just add GIF support to the already existing and excellent Image tab? What would be the main constraints then?

Many thanks to anyone working on this (these) feature(s)!

@danijar

Member

danijar commented Jan 26, 2018

@chrisranderson This looks nice. Is there a way to write these summaries from within the graph, something like a tf.summary.video(name, tensor_of_frames)?

@alexlee-gk

alexlee-gk commented Jan 26, 2018

I'm not sure why this hasn't been mentioned before, but I just noticed that TensorBoard's image plugin has supported GIFs all along. What's missing is an out-of-the-box way to add GIF summaries with TF. Currently that's not possible because TF lacks a GIF encoder and tf.summary.image can't take in encoded strings (it only takes image tensors). One option is to encode the tensor with a third-party library, manually construct a protobuf image summary, and then add that to the summary writer. Here is a self-contained example of how to do that:

import os
import tempfile

import moviepy.editor as mpy
import numpy as np
import tensorflow as tf

def convert_tensor_to_gif_summary(summ):
    if isinstance(summ, bytes):
        summary_proto = tf.Summary()
        summary_proto.ParseFromString(summ)
        summ = summary_proto

    summary = tf.Summary()
    for value in summ.value:
        tag = value.tag
        images_arr = tf.make_ndarray(value.tensor)

        if len(images_arr.shape) == 5:
            # concatenate the batch dimension horizontally
            images_arr = np.concatenate(list(images_arr), axis=-2)
        if len(images_arr.shape) != 4:
            raise ValueError('Tensors must be 4-D or 5-D for gif summary.')
        if images_arr.shape[-1] != 3:
            raise ValueError('Tensors must have 3 channels.')

        # encode the sequence of images into a gif string
        clip = mpy.ImageSequenceClip(list(images_arr), fps=4)
        # NamedTemporaryFile is only used to reserve a unique name; the GIF
        # itself is written to that name plus '.gif' and removed afterwards
        with tempfile.NamedTemporaryFile() as f:
            filename = f.name + '.gif'
        clip.write_gif(filename, verbose=False)
        with open(filename, 'rb') as f:
            encoded_image_string = f.read()
        os.remove(filename)

        image = tf.Summary.Image()
        image.height = images_arr.shape[-3]
        image.width = images_arr.shape[-2]
        image.colorspace = 3  # code for 'RGB'
        image.encoded_image_string = encoded_image_string
        summary.value.add(tag=tag, image=image)
    return summary

sess = tf.Session()
summary_writer = tf.summary.FileWriter('logs/image_summary', graph=tf.get_default_graph())

images_shape = (16, 12, 64, 64, 3)  # batch, time, height, width, channels
images = np.random.randint(256, size=images_shape).astype(np.uint8)
images = tf.convert_to_tensor(images)

tensor_summ = tf.summary.tensor_summary('images_gif', images)
tensor_value = sess.run(tensor_summ)
summary_writer.add_summary(convert_tensor_to_gif_summary(tensor_value), 0)

summ = tf.summary.image("images", images[:, 0])  # first time-step only
value = sess.run(summ)
summary_writer.add_summary(value, 0)

summary_writer.flush()
@ankush-me

ankush-me commented Jul 6, 2018

Thanks @alexlee-gk! I have simplified the interface below to match the standard summary ops.
It still needs to be extended to take in a batch of GIFs -- that should be straightforward (I hope)!

import os
import tempfile

import moviepy.editor as mpy
import tensorflow as tf

from tensorflow.python.framework import constant_op
from tensorflow.python.ops import summary_op_util


def py_encode_gif(im_thwc, tag, fps=4):
  """Given a 4-D (T x H x W x C) numpy tensor of images, encodes it as a GIF
  and returns a serialized image summary protobuf."""
  # NamedTemporaryFile is only used to reserve a unique name; the GIF itself
  # is written to that name plus '.gif' and removed afterwards.
  with tempfile.NamedTemporaryFile() as f:
    fname = f.name + '.gif'
  clip = mpy.ImageSequenceClip(list(im_thwc), fps=fps)
  clip.write_gif(fname, verbose=False, progress_bar=False)
  with open(fname, 'rb') as f:
    enc_gif = f.read()
  os.remove(fname)
  # create a tensorflow image summary protobuf:
  thwc = im_thwc.shape
  im_summ = tf.Summary.Image()
  im_summ.height = thwc[1]
  im_summ.width = thwc[2]
  im_summ.colorspace = 3  # fixed to 3 == RGB
  im_summ.encoded_image_string = enc_gif
  # create a summary obj and serialize it:
  summ = tf.Summary()
  summ.value.add(tag=tag, image=im_summ)
  return summ.SerializeToString()


def add_gif_summary(name, im_thwc, fps=4, collections=None, family=None):
  """
  im_thwc: 4-D tensor (T x H x W x C) for which a GIF is to be generated.
  collections: collections to which the summary op is to be added.
  """
  if summary_op_util.skip_summary():
    return constant_op.constant('')
  with summary_op_util.summary_scope(name, family, values=[im_thwc]) as (tag, scope):
    gif_summ = tf.py_func(py_encode_gif, [im_thwc, tag, fps], tf.string, stateful=False)
    summary_op_util.collect(gif_summ, collections, [tf.GraphKeys.SUMMARIES])
  return gif_summ
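For the batched case, one simple option (the same trick the 5-D branch of the earlier convert_tensor_to_gif_summary example uses) is to tile the clips side by side before encoding, so the whole batch becomes a single GIF. A minimal sketch, where batch_to_thwc is a hypothetical helper name:

```python
import numpy as np

def batch_to_thwc(im_bthwc):
    """Tile a batch of clips (B x T x H x W x C) horizontally into one
    T x H x (B*W) x C clip, so the batch can be encoded as a single GIF."""
    return np.concatenate(list(im_bthwc), axis=-2)

# e.g. a batch of 2 clips, 3 frames each, 4x5 pixels, RGB:
batch = np.zeros((2, 3, 4, 5, 3), dtype=np.uint8)
print(batch_to_thwc(batch).shape)  # (3, 4, 10, 3)
```

The result can then be passed to py_encode_gif unchanged, since it is again a 4-D T x H x W x C tensor.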
@alexlee-gk

alexlee-gk commented Jul 7, 2018

That's neat! For the encoding of the GIF, I'd suggest using ffmpeg directly instead of moviepy, via the encode_gif function I wrote here. It avoids writing to a temporary file by piping the output directly into an encoded string. More importantly, it uses ffmpeg's palette generation, which I have found to work better (in terms of artifacts and time) than the color optimizations available in moviepy.
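For reference, that piping approach can be sketched roughly as below. This is not the linked implementation, just a sketch under the assumption that an ffmpeg with the palettegen/paletteuse filters is on PATH; ffmpeg_gif_cmd and encode_gif are illustrative names:

```python
import subprocess

def ffmpeg_gif_cmd(h, w, fps):
    # ffmpeg command that reads raw RGB24 frames on stdin and writes an
    # encoded GIF to stdout; split/palettegen/paletteuse is the usual
    # two-pass palette recipe done in a single invocation.
    return [
        'ffmpeg', '-y',
        '-f', 'rawvideo', '-vcodec', 'rawvideo',
        '-r', '%.02f' % fps, '-s', '%dx%d' % (w, h),
        '-pix_fmt', 'rgb24', '-i', '-',
        '-filter_complex',
        '[0:v]split[a][b];[a]palettegen[p];[b][p]paletteuse',
        '-r', '%.02f' % fps, '-f', 'gif', '-',
    ]

def encode_gif(frames, fps=4):
    # frames: list of H x W x 3 uint8 numpy arrays
    h, w, _ = frames[0].shape
    proc = subprocess.Popen(
        ffmpeg_gif_cmd(h, w, fps),
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate(b''.join(f.tobytes() for f in frames))
    if proc.returncode:
        raise IOError('ffmpeg failed: ' + err.decode('utf-8', 'ignore'))
    return out  # encoded GIF bytes, usable as Summary.Image.encoded_image_string
```

The returned bytes can be assigned to encoded_image_string in either of the snippets above, with no temporary files involved.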

@PHarshali

PHarshali commented Jul 17, 2018

Still waiting for TensorBoard video support and very excited. Until then, anyone wanting to read about TensorBoard can follow this link - https://data-flair.training/blogs/tensorboard-tutorial/
