Hi, thanks for bringing a new perspective to this field with this paper. I really enjoyed reading it.
While reading through the paper, I found what looks like a typo in the inline equation that describes MPC.
In the Model predictive control subsection under Preliminaries, it is stated that the globally optimal policy
\Pi_{\theta} is proportional to the expectation of the negated Q-values.
I think the negation should be removed: intuitively, the optimal policy should favor actions with higher Q-values, not lower ones.
My second question concerns the description of MPPI in Sec. 3, specifically Eq. (4), which describes the CEM.
There, the mean/var of the j-th policy is computed from the weighted/shifted \Gamma, where \Gamma is
defined in the paragraph as the sampled trajectory. I take that to mean the state-action sequence.
If so, I guess \Gamma^{\star}_i in Eq. (4) should be replaced with the action, since, as far as I understand, the mean and variance parameterize the distribution that actions are sampled from.
Maybe the code snippet here
corresponds to this equation?
I wonder if I understood it correctly.
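To make my reading of Eq. (4) concrete, here is a minimal NumPy sketch of the update as I understand it. It is not the repository's code: the function name `weighted_action_stats`, the argument names, the array shapes, and the exponential weighting with a `temperature` parameter are all my own assumptions for illustration. The point is only that the weighted mean and standard deviation are taken over the actions of the top-k samples, not over full state-action trajectories.

```python
import numpy as np

def weighted_action_stats(actions, returns, temperature=0.5):
    """Weighted mean/std over action sequences (hypothetical helper).

    actions: array of shape (k, H, A), the action sequences of the top-k samples
    returns: array of shape (k,), the estimated return of each sample
    """
    actions = np.asarray(actions, dtype=float)
    returns = np.asarray(returns, dtype=float)

    # Assumed weighting: exponentiated returns, shifted by the max for
    # numerical stability, then normalized to sum to one.
    weights = np.exp(temperature * (returns - returns.max()))
    weights = weights / weights.sum()

    # Broadcast the per-sample weights over the horizon and action dimensions.
    w = weights[:, None, None]
    mean = (w * actions).sum(axis=0)
    std = np.sqrt((w * (actions - mean) ** 2).sum(axis=0))
    return mean, std

# Two samples, horizon 1, one action dimension, equal returns.
mean, std = weighted_action_stats([[[0.0]], [[2.0]]], [1.0, 1.0])
```

With equal returns the two samples get equal weight, so the result is the plain mean and standard deviation of the two actions. If Eq. (4) is meant this way, only the action part of \Gamma enters the update.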
Thanks in advance!
Hi, glad to hear that you find our work interesting, and thank you for the feedback! You are right, I'll make sure to include these things in the next revision. Thanks!