If we look at this line, we see that the input sequence is padded on the right.
And in order to generate the predictive distribution, we do this, which makes sense: the last element in the sequence is not used for prediction. But in an autoregressive setting, I would then assume that element 0, a_0, predicts a_1, and so on.
In the training script, we have
```python
vertex_model_loss = -tf.reduce_sum(
    vertex_model_pred_dist.log_prob(vertex_model_batch['vertices_flat'])
    * vertex_model_batch['vertices_flat_mask'])
```
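For reference, here is a minimal numpy sketch of what that masked loss computes (the values and shapes are illustrative, not taken from the repo): each position's log-likelihood is summed, and the mask zeroes out the right-padding so padded positions contribute nothing.

```python
import numpy as np

# Illustrative per-token log-likelihoods for one sequence of length 4,
# where the last position is right-padding.
log_probs = np.array([[-0.1, -0.5, -2.0, -3.0]])

# Mask is 1 for real tokens, 0 for padding, mirroring vertices_flat_mask.
mask = np.array([[1.0, 1.0, 1.0, 0.0]])

# Masked negative log-likelihood: the padded position is excluded.
loss = -np.sum(log_probs * mask)
print(loss)
```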
This does not make sense to me: it would mean we are using the current element to predict itself, which has zero generative power, right?
@saran-t @charlienash
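To make the alignment I would expect concrete, here is a minimal sketch of the standard teacher-forcing shift in an autoregressive model (token names and values are hypothetical, not from the repo): the model conditioned on tokens up to position i is trained to predict the token at position i + 1, never itself.

```python
import numpy as np

# Hypothetical flat token sequence a_0 .. a_5.
seq = np.array([3, 1, 4, 1, 5, 9])

# Standard autoregressive alignment: drop the last element from the
# inputs and the first element from the targets, so a_i predicts a_{i+1}.
inputs = seq[:-1]   # a_0 .. a_{n-2}
targets = seq[1:]   # a_1 .. a_{n-1}

for a_i, a_next in zip(inputs, targets):
    print(f"{a_i} -> predicts -> {a_next}")
```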