a mathematical problem .. #27

Open
bofen97 opened this issue Oct 20, 2019 · 1 comment

Comments

@bofen97

bofen97 commented Oct 20, 2019

I differentiated Equation 12, but my result does not match Equation 13 in your paper. My derivation is missing the first term of Equation 13, and I don't know where I went wrong.
Can you help me?

@haarnoja
Owner

Thanks for the question. You mean Equation 13 in this paper? It is the total derivative of J(\phi) with respect to the policy parameters \phi. Note that both \pi_\phi and a_t = f_\phi depend on these parameters, so we need to differentiate with respect to both and use the chain rule for the latter. The direct dependence of \pi_\phi on \phi is what produces the first term. It is exactly analogous to this example.
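To make the two contributions concrete, here is a minimal numerical sketch. It assumes the per-sample objective has the form J(\phi) = \log \pi_\phi(f_\phi(\epsilon_t; s_t) | s_t) - Q(s_t, f_\phi(\epsilon_t; s_t)). The 1-D Gaussian policy, the quadratic Q function, and all function names below are hypothetical stand-ins chosen for illustration, not code from this repository. The sketch splits the gradient into the direct term (\nabla_\phi \log \pi_\phi with a_t held fixed) and the chain-rule term through a_t = f_\phi, then compares their sum with a finite-difference estimate.

```python
import math

def f(phi, eps, s):
    """Reparameterized action a = f_phi(eps; s) for a toy 1-D Gaussian policy."""
    w, log_std = phi
    return w * s + math.exp(log_std) * eps

def log_pi(phi, a, s):
    """Log-density of the toy policy N(w * s, exp(log_std)^2) evaluated at a."""
    w, log_std = phi
    std = math.exp(log_std)
    return -0.5 * ((a - w * s) / std) ** 2 - log_std - 0.5 * math.log(2 * math.pi)

def q(s, a):
    """Hypothetical Q function (an assumption, chosen only to be differentiable)."""
    return -(a - 1.0) ** 2

def objective(phi, eps, s):
    """Assumed per-sample objective: log pi_phi(f_phi | s) - Q(s, f_phi)."""
    a = f(phi, eps, s)
    return log_pi(phi, a, s) - q(s, a)

def analytic_terms(phi, eps, s):
    """Return (direct term, chain-rule term) of the total derivative w.r.t. phi."""
    w, log_std = phi
    std = math.exp(log_std)
    a = f(phi, eps, s)
    z = (a - w * s) / std
    # Direct term: grad_phi log pi_phi(a|s) with the action a held fixed.
    direct = (z * s / std, z * z - 1.0)
    # Chain-rule term: (grad_a log pi - grad_a Q) * grad_phi f_phi.
    dlogpi_da = -z / std
    dq_da = -2.0 * (a - 1.0)
    df_dphi = (s, std * eps)
    chain = tuple((dlogpi_da - dq_da) * d for d in df_dphi)
    return direct, chain

def finite_difference(phi, eps, s, h=1e-6):
    """Central-difference estimate of the total derivative of the objective."""
    grads = []
    for i in range(len(phi)):
        up, down = list(phi), list(phi)
        up[i] += h
        down[i] -= h
        grads.append((objective(up, eps, s) - objective(down, eps, s)) / (2 * h))
    return tuple(grads)

phi, eps, s = (0.3, -0.5), 0.7, 1.2
direct, chain = analytic_terms(phi, eps, s)
total = tuple(d + c for d, c in zip(direct, chain))
numeric = finite_difference(phi, eps, s)

print("direct term      :", direct)
print("chain-rule term  :", chain)
print("direct + chain   :", total)
print("finite difference:", numeric)
```

In this toy setup the chain-rule term alone does not match the finite-difference gradient; only the sum of both terms does. That is the kind of mismatch you would see if the direct dependence of \pi_\phi on \phi were left out of the derivation.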
