outputs from mgpr.predict_on_noisy_inputs and pilco.propagate confusion #11

Closed
mcyip opened this issue Jul 17, 2018 · 2 comments

mcyip commented Jul 17, 2018

Hi,

Thanks for the great code. I'm reading through it and adding descriptive comments, since there's a lot of dense code and several layers of wrappers and abstractions over TF :)

My question is about the behavior of mgpr.predict_on_noisy_inputs (and, underneath it, mgpr.predict_given_factorizations).

If I look at the propagate function in PILCO.py, I see that the returned value is treated as a delta, since the line immediately after the call to predict_on_noisy_inputs is
M_x = M_dx + m_x

However, it's not clear to me where this delta is computed or predicted. The description of mgpr.predict_given_factorizations says it returns a mean and variance of X, and I don't see the line where it would return only a "change" in X. It would make sense to me for X itself to be returned, since the system model and the controller seem to be of the form x(t+1)=f(x(t),u(t)) and u(t)=g(x(t)), respectively.

Thanks!

nrontsis (Owner) commented Jul 17, 2018

Hi,

Thanks for the interest. If you spend time writing comments, we would be very happy if you opened a PR to incorporate (some of) them into the main repo. We plan on adding more documentation and examples soon, and feedback from other people would be helpful while we write them.

The delta arises because the datasets X, Y you provide to PILCO should be of the form:

X[i] = [state at time i;
       control action at time i]
Y[i] = (state at time i + 1) - (state at time i).

So the underlying GPs are trained to output the delta, which is why propagate adds m_x back to the predicted mean to obtain the next state.
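
To connect this to the M_x = M_dx + m_x line, here is a minimal sketch of how a predicted delta can be turned back into next-state moments. It uses the general identity for the sum of two jointly Gaussian variables and is not a copy of the code in PILCO.py. The function name next_state_moments and the cross-covariance argument C_x_dx (the covariance between the state and its delta) are assumptions made for illustration, as is the toy linear model used to exercise it.

```python
import numpy as np

def next_state_moments(m_x, s_x, M_dx, S_dx, C_x_dx):
    """Moments of x_next = x + dx for jointly Gaussian x and dx.

    m_x, s_x   : mean and covariance of the current state x
    M_dx, S_dx : mean and covariance of the predicted delta dx
    C_x_dx     : Cov[x, dx] (hypothetical argument for this sketch)
    """
    M_x = M_dx + m_x                         # E[x + dx] = E[x] + E[dx]
    S_x = S_dx + s_x + C_x_dx + C_x_dx.T     # Var[x + dx]
    return M_x, S_x

# Toy check with a linear delta model dx = A x + noise (illustrative values).
A = np.array([[0.1, 0.0], [0.2, -0.3]])
Q = 0.01 * np.eye(2)                         # noise covariance of the delta
m_x = np.array([1.0, -1.0])
s_x = np.array([[0.5, 0.1], [0.1, 0.2]])

M_dx = A @ m_x
S_dx = A @ s_x @ A.T + Q
C_x_dx = s_x @ A.T

M_x, S_x = next_state_moments(m_x, s_x, M_dx, S_dx, C_x_dx)
```

For this linear case the result can be compared against the closed form x_next = (I + A) x + noise, which gives the same mean and covariance.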

See here for how we create X and Y in the example.
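
The X/Y layout described above can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the helper name build_dataset and the toy rollout are hypothetical, and the repo's example may assemble the arrays differently.

```python
import numpy as np

def build_dataset(states, actions):
    """Build delta-style training data from one rollout.

    states  : array of shape (T + 1, state_dim), states at times 0..T
    actions : array of shape (T, control_dim), actions at times 0..T-1
    """
    states = np.asarray(states, dtype=float)
    actions = np.asarray(actions, dtype=float)
    # X[i] = [state at time i, control action at time i]
    X = np.hstack([states[:-1], actions])
    # Y[i] = (state at time i + 1) - (state at time i)
    Y = states[1:] - states[:-1]
    return X, Y

# Toy rollout: 3 states of dimension 2, 2 actions of dimension 1.
states = np.array([[0.0, 1.0], [0.5, 1.5], [1.5, 1.0]])
actions = np.array([[0.1], [-0.2]])

X, Y = build_dataset(states, actions)
# X has shape (2, 3) and Y has shape (2, 2)
```

Because Y holds differences, a model fit on (X, Y) predicts the change in state, and the state at time i + 1 is recovered as states[i] + Y[i].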

mcyip (Author) commented Jul 17, 2018

Thanks! This is a silly mistake on my part. Thanks for the clarification. Will do a PR.
