Hi,
Thanks for some great code. I'm trying to read through it and add descriptive comments, since there's a lot of dense code and several layers of wrappers and abstractions over TF :)
My question is about the behavior of mgpr.predict_on_noisy_inputs (and, in particular, mgpr.predict_given_factorizations).
In PILCO.py's propagate function, the value returned appears to be a delta, since the line immediately after the call to predict_on_noisy_inputs is
M_x = M_dx + m_x
However, it's not clear to me where this delta is computed or predicted. The description of mgpr.predict_given_factorizations says it returns a mean and variance of X, and I don't see a line where it would return only a "change" in X. It makes sense to me that only X is returned, since both the system model and the controller seem to be of the form x(t+1)=f(x(t),u(t)) and u(t)=g(x(t)) respectively.
Thanks!
Thanks for the interest. If you spend time on writing comments, we would be very happy if you were to make a PR so as to incorporate (some of) them in the main repo. We plan on adding more documentation & examples soon, and while writing them it would be helpful to get feedback from other people.
The delta comes because the datasets X, Y you provide to PILCO should be of the form:
X[i] = [state at time i;
control action at time i]
Y[i] = (state at time i + 1) - (state at time i).
So the underlying GPs are trained to provide delta as their output.
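To make the dataset layout concrete, here is a minimal NumPy sketch of how such X and Y arrays could be built from a single recorded trajectory. The helper name build_delta_dataset and the toy numbers are only illustrative assumptions; they are not part of the PILCO repo.

```python
import numpy as np


def build_delta_dataset(states, actions):
    """Hypothetical helper (not from the PILCO repo).

    states:  array of shape (T + 1, state_dim), the visited states
    actions: array of shape (T, control_dim), the action applied at each step
    Returns X of shape (T, state_dim + control_dim) and Y of shape (T, state_dim).
    """
    states = np.asarray(states, dtype=float)
    actions = np.asarray(actions, dtype=float)
    # X[i] = [state at time i; control action at time i]
    X = np.hstack([states[:-1], actions])
    # Y[i] = (state at time i + 1) - (state at time i)
    Y = states[1:] - states[:-1]
    return X, Y


# Toy trajectory: 3 states of dimension 2, and 2 actions of dimension 1.
states = np.array([[0.0, 1.0], [0.5, 1.5], [1.5, 1.0]])
actions = np.array([[0.1], [-0.2]])
X, Y = build_delta_dataset(states, actions)

# Adding the delta back to the current state recovers the next state.
state_dim = states.shape[1]
next_states = X[:, :state_dim] + Y
```

Because the targets are differences, a model fitted to this data predicts a change in state, which is why propagate adds the current mean back in with M_x = M_dx + m_x.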