
reproducing the video prediction model #553

Closed
falcondai opened this issue Oct 18, 2016 · 13 comments

@falcondai

models/video_prediction @cbfinn

Thank you for generously sharing the code! I have three questions about the released code:

  • Are the hyperparameters used in the paper the same as the default options in prediction_train.py? In particular, the number of training steps.
  • Can you share some figures on the expected performance of the trained model over the val/train sets? I observed a strange val_loss trend line, so I wonder if I made a mistake.
    [screenshot: val_loss curve, 2016-10-17]
  • Is there a plan to also release the evaluation/visualization script for the model? If not, I would love to contribute (and I am sure many other users would as well).
@cbfinn
Contributor

cbfinn commented Oct 18, 2016

are the hyperparameters used in the paper the same as the default options in prediction_train.py?

For the most part, yes. There are a few differences:

  • For the paper, I downsampled with PIL's antialiasing method, outside of tensorflow. In this code, the images are downsampled in tensorflow, using bicubic interpolation. This isn't ideal, as it causes the images to be a bit pixelated; a convolution-based downsampling would be a better option.
  • I use layer norm after every layer, which I didn't do in the paper. I think this only makes things more stable.
  • The train/val split is different from the one I used.
  • The PSNR calculation that is saved in a scalar summary is not quite correct. It is computed over an entire batch of images, but should be computed for each image independently and then averaged. This is pretty easy to fix (and I probably should have fixed it earlier; I've just been really busy); a per-image version is sketched below.
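Here is what a per-image version might look like (a minimal numpy sketch, not the actual fix in the released code; per_image_psnr and the [0, 1] pixel range are assumptions):

```python
import numpy as np

def per_image_psnr(true_batch, pred_batch, max_val=1.0):
    # Hypothetical helper: compute the MSE for each image separately,
    # convert each one to PSNR, then average over the batch.
    # true_batch, pred_batch: float arrays of shape [batch, height, width, channels].
    mse = np.mean((true_batch - pred_batch) ** 2, axis=(1, 2, 3))  # one MSE per image
    psnr = 10.0 * np.log10(max_val ** 2 / mse)                     # one PSNR per image
    return float(np.mean(psnr))                                    # batch average
```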

I observed a strange val_loss trend line, so I wonder if I made a mistake.

That curve is about what I would expect. It looks strange because of scheduled sampling, a curriculum which stochastically passes in ground-truth frames at some time steps during the beginning of training. The curriculum ends around 12k steps. (See citation [2] in the paper for details.) To turn off scheduled sampling, you can set --schedsamp_k=-1.
Alternatively, you could make a change to the code to set schedsamp_k=-1 for the validation model, regardless of what's used for the training model. This might be nice.
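For intuition, the inverse-sigmoid schedule from citation [2] can be sketched as follows (a rough Python sketch; ground_truth_fraction and the value k=900 are illustrative assumptions, not necessarily the exact formula in prediction_train.py):

```python
import numpy as np

def ground_truth_fraction(iter_num, k=900.0):
    # Illustrative inverse-sigmoid schedule: the fraction of ground-truth
    # frames fed to the model decays from ~1 toward 0 as training progresses.
    if k < 0:  # schedsamp_k = -1 disables the curriculum entirely
        return 0.0
    return k / (k + np.exp(iter_num / k))

# With k around 900, the fraction is ~0.999 at step 0 and ~0.001 at step 12000,
# which is roughly when the val_loss curve changes character.
```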

can you share some figures on the expected performance of the trained model over the val/train sets?

I did this work when I was an intern at Google Brain, and I no longer have access to data/code/training curves that I used for the paper.

is there a plan to also release the evaluation/visualization script for the model?

I'm not planning on doing this in the immediate future, but I would love to have something like this added to the released code. I'd be happy to help review code for this, and potentially add to it. For example, I think that tiling animated gifs is a great way to visualize the model's predictions, as seen here: https://sites.google.com/site/robotprediction/ (scroll down about halfway). I have the code for tiling predictions together and saving them as a gif, which I'd be happy to share.

It would also be really useful to visualize the gifs during training, e.g., in tensorboard (tensorflow/tensorflow#3936)

@asimshankar
Contributor

Thanks for the response @cbfinn.
@falcondai: it seems you got the answers you were seeking.

Closing this out. If you have more concerns, please do file a new issue/check with @cbfinn

@falcondai
Author

falcondai commented Oct 18, 2016

@cbfinn Thanks for the clarifications and pointers! I will follow up with more specific issues should they arise.

@tegg89

tegg89 commented Apr 18, 2017

@cbfinn
Regarding your earlier comment, how can I generate the tiled animated GIFs that visualize the model's predictions? I have tried to analyze and modify the input and training files, but I couldn't get it to work. Could I get some help with that?

@cbfinn
Contributor

cbfinn commented Apr 18, 2017

Here's an example script that loads images from the pushing dataset and exports them to gifs using the moviepy package (though it does not tile them).
grab_train_images.py.zip

It is straightforward to use moviepy to stack gifs side-by-side, to form a tiling.
http://zulko.github.io/moviepy/getting_started/compositing.html#stacking-and-concatenating-clips
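A minimal sketch of that approach, assuming two already-exported gifs (the file names are placeholders):

```python
from moviepy.editor import VideoFileClip, clips_array

# Load two gifs (placeholder names) and stack them into one row of two tiles.
gt = VideoFileClip("ground_truth.gif")
pred = VideoFileClip("prediction.gif")
tiled = clips_array([[gt, pred]])  # a 1x2 grid; add more rows/columns as needed
tiled.write_gif("tiled.gif", fps=10)
```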

@tegg89

tegg89 commented Apr 18, 2017

@cbfinn @falcondai
Thanks for your generous reply! I will work through the rest of the code starting from the included script :)

@falcondai
Author

falcondai commented Apr 18, 2017

@tegg89 I ended up using imageio to create GIFs. Its API is pretty straightforward. For an example (IPython notebook): https://gist.github.com/falcondai/1e22919e6ce8d6a8e3dd3da5a6a0ad94
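The core of it is just imageio.mimsave; a minimal sketch (the random frames are a stand-in for one denormalized predicted sequence):

```python
import imageio
import numpy as np

# Stand-in for one sequence of denormalized predictions: HxWx3 uint8 frames.
frames = [np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8) for _ in range(10)]

# Write the frames out as an animated GIF, 0.2 seconds per frame.
imageio.mimsave("prediction.gif", frames, duration=0.2)
```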

@tegg89

tegg89 commented Apr 20, 2017

@cbfinn @falcondai
When I feed data to the network for evaluation, the resulting GIF file is not sequential.
I have already disabled the shuffle option in prediction_input.py.

@cbfinn
Contributor

cbfinn commented Apr 20, 2017

@tegg89 Make sure you are only calling session.run() once for the entire sequence, rather than once for each frame. The script grab_train_images.py shows how to extract a sequence of images in order, with a single sess.run() per sequence.
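In other words, something like the following (a sketch only; it assumes sess holds a restored checkpoint and model is a Model instance from prediction_train.py whose gen_images attribute is a list of per-step prediction tensors):

```python
# Problematic: each run() call dequeues a fresh batch from the input queue,
# so frames fetched one at a time do not belong to the same sequence.
# frames = [sess.run(model.gen_images[t], feed_dict={model.iter_num: -1})
#           for t in range(len(model.gen_images))]

# Better: a single run() fetches every time step from one dequeued batch.
frames = sess.run(model.gen_images, feed_dict={model.iter_num: -1})
```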

@tegg89

tegg89 commented Apr 21, 2017

@cbfinn @falcondai
Sorry to keep bothering you with questions, but I am still having trouble visualizing test data.

Referring to grab_train_images.py, I changed the input file so that it returns sequential video frames. However, when I fed this input through the network model, gen_images did not come out in sequential order. The modified code is here. The steps I ran through are as follows:

  1. Get images, states, and actions from prediction_input.py (I already checked that the images are in sequential order).
  2. Feed the three inputs into the network via the Model class in prediction_train.py.
  3. Create a session and load the trained model. (Up to this step, I could load the model with some minimal code changes.)
  4. gen_images = sess.run([model.gen_images], feed_dict={model.iter_num: -1})
    (the learning_rate term is removed from feed_dict because it is not used)
  5. Denormalize gen_images (sketched below).
  6. Transform gen_images into a GIF.

The result still did not come out in sequential order. It seems to me that the network model is making the input non-sequential. How did you do the visualization for evaluation?
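A minimal sketch of what I mean by the denormalization in step 5, assuming gen_images is a list of float arrays in [0, 1] (the random placeholder stands in for the real sess.run output):

```python
import numpy as np

# Placeholder for the output of sess.run(model.gen_images, ...): one
# [batch, height, width, 3] float array in [0, 1] per predicted time step.
gen_images = [np.random.rand(1, 64, 64, 3) for _ in range(10)]

# Step 5: denormalize the first sequence in the batch to uint8 pixels.
frames = [np.clip(img[0] * 255.0, 0.0, 255.0).astype(np.uint8) for img in gen_images]

# Step 6: write `frames` out with imageio.mimsave, as in the earlier comment.
```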

@carsonDB

carsonDB commented Jun 9, 2017

@cbfinn Thanks for your paper and code. And sorry to bother you about a small detail.

  1. As you said above, "the train/val split is different from the one I used".
    In the paper, train_val_split is 0.95, and I see that this TF version also uses the same value as the default.
  2. In a complete training run (10K iterations), I found that the validation PSNR (which is what I use for evaluation) does not always agree with the test PSNR. I pick the best model by selecting the best validation PSNR during training, but sometimes some of the periodic checkpoints score a higher PSNR than the selected best model (by a gap of up to 0.5).

Is train_val_split == 0.95 not enough in practice?

@cbfinn
Contributor

cbfinn commented Jun 9, 2017

@carsonDB The percentage is the same, but the actual videos used for training and validation are different (as they are randomized).

@carsonDB

@cbfinn Thanks for your quick reply!
