Question: Does the model condition on future time steps? #4

Closed
llucid-97 opened this issue May 31, 2022 · 2 comments

Comments

@llucid-97

Hi, I read your paper and I found it really interesting.
I've been going through the code to understand it, port it and try it out.

I came across a section that I don't understand though:

From reading the paper, I assumed that the model should only condition on information from the past to predict the future. First of all, is this assumption correct?

If so: in the convolutional encoder, you compute encodings for each RSSM level in the Clockwork VAE stack, and then you sum them over time here:

cwvae/cnns.py, line 140 (commit 62dd505):

```python
layer = tf.reduce_sum(layer, axis=2)
```

Doesn't this mean that a prediction at time step X then has access to information from time step X+N (where N is some positive integer < "temporal abstraction factor" ** level)? Or am I misunderstanding the data flow?
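To make the data flow I'm describing concrete, here is a minimal sketch of how I read that pooling; the shapes and the window layout are my assumptions for illustration, not taken from the repo:

```python
import tensorflow as tf

# Hypothetical shapes for illustration: batch=2, 4 windows at this level,
# 3 frames per window (the temporal abstraction factor), feature dim=8.
# Axis 2 is the within-window time axis that gets summed out.
layer = tf.random.normal([2, 4, 3, 8])

# Summing over axis=2 pools every frame in a window into one encoding,
# so the pooled encoding mixes information from all frames in the window,
# including frames that come after any particular frame inside it.
pooled = tf.reduce_sum(layer, axis=2)  # shape: [2, 4, 8]
```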

@vaibhavsaxena11
Owner

Hey, thank you for your interest in my work!

To answer your question in short: no. This is a prediction model which does not condition on future timesteps at any level of the hierarchy. I elaborate on this below.

This model comprises two parts: prediction (decoder + RNN) and inference (encoder). The former is used to predict future video; the latter facilitates future video prediction when some frames have already been observed (all during testing).

During future prediction, the observation at any time t is conditioned on the state variable at time t, which in turn may be conditioned on a hierarchical variable generated at some timestep < t. This generalizes to all levels of the hierarchy and ensures that only state variables generated at timesteps <= t can propagate information into the observation variable at time t.
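To make that conditioning structure concrete, here is a rough sketch of the forward-in-time generative flow in plain Python; `prior_step`, `decode`, and `k` (the temporal abstraction factor) are stand-in names for illustration, not the repo's API:

```python
def generate(T, k, levels, prior_step, decode):
    """Sketch of hierarchical ancestral sampling, forward in time only.
    Level l ticks every k**l steps; level 0 emits a frame every step."""
    states = [[None] * T for _ in range(levels)]
    frames = []
    for t in range(T):
        for l in reversed(range(levels)):      # top level first
            if t % (k ** l) == 0:              # this level ticks at t
                parent = states[l + 1][t] if l + 1 < levels else None
                prev = states[l][t - k ** l] if t >= k ** l else None
                # The new state depends only on the previous state at this
                # level and a parent state generated at some timestep <= t.
                states[l][t] = prior_step(prev, parent)
            else:                              # held since its last tick
                states[l][t] = states[l][t - (t % (k ** l))]
        # The frame at t is decoded from the bottom-level state at t alone.
        frames.append(decode(states[0][t]))
    return frames
```

With stub callables, e.g. `prior_step = lambda prev, parent: (prev, parent)` and `decode = lambda s: s`, this runs and shows the update schedule: no state ever reads a variable from a later timestep.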

Say we observed a sequence of 10 frames and wish to generate future video. This is equivalent to requiring a posterior distribution over the observation variables at timesteps > 10, given the first 10 observations. Our (generative) model assumption dictates that to generate the observation at t=11, it is sufficient to have a posterior representation of the state at t=11, and this is exactly what we compute. The posterior of the state at t=11 can be computed using samples from the posterior at t=10 at the same level, together with the hierarchical variable at some t <= 10 (as governed by the temporal abstraction factor). Note that all information flow so far has been only forward in time.
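In pseudocode, restricted to a single level for brevity (the callables here are hypothetical stand-ins for the model's transition and decoder networks, not the repo's API):

```python
def predict_future(enc, horizon, init, posterior_step, prior_step, decode):
    """Condition on a prefix of observed-frame encodings `enc`, then roll
    the prior forward `horizon` steps. Information only flows forward."""
    state = init
    for e in enc:                         # observed prefix, e.g. frames 1..10
        state = posterior_step(state, e)  # posterior uses obs up to t only
    frames = []
    for _ in range(horizon):              # prediction, e.g. frames 11 onward
        state = prior_step(state)         # no future information enters here
        frames.append(decode(state))
    return frames
```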

Now, to compute the posteriors at timesteps <= 10, at all levels, we make use of our "inference" model. We defined the inference model so that the posterior over a state variable at any level uses all of the observations that the variable directly conditions. This makes sense because the posterior belief over a variable that conditions more than one other variable should be derived from the beliefs over all the variables it conditions. That is what the summing-over-time you referred to in the encoder accomplishes. Information does flow backward in time here, but only to compute the posterior representations of the state variables; it is of the same flavor as computing the state after observing a frame, even though the state precedes the observation during actual generation.
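In code terms, that per-level pooling looks roughly like this (shapes and window layout are assumptions for illustration; only the sum over the window's time axis mirrors the line you pointed to):

```python
import tensorflow as tf

def pool_encodings(enc, k, level):
    """Pool per-frame encodings into one posterior input per tick of a
    level that updates every k**level frames. Assumes T divides evenly."""
    b, T, f = enc.shape                 # enc: [batch, time, features]
    w = k ** level                      # frames generated per tick
    windows = tf.reshape(enc, [b, T // w, w, f])
    # Each state variable's posterior input sums the encodings of all the
    # frames that variable generated; information from within its window
    # flows back to it, but only for inference, never for generation.
    return tf.reduce_sum(windows, axis=2)  # [batch, ticks, features]
```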

I hope this clarifies your doubts about the model structure! Let me know if you have any more questions.

@llucid-97
Author

Aah ok, yes that does clarify things for me. Thank you
