-
Notifications
You must be signed in to change notification settings - Fork 8.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated Calculation Order #1019
Conversation
Updated order of `x` and `theta` calculations so that they are no longer one timestamp behind `x_dot` and `theta_dot`. #1018
gym/envs/classic_control/cartpole.py
Outdated
theta_dot = theta_dot + self.tau * thetaacc | ||
theta = theta + self.tau * theta_dot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why should it be in this new order? To me it would seem that Euler-method-like step should be
x_n+1 = x_n + tau * (xdot_n)
xdot_n+1 = xdot_n + (acc_n)
etc which corresponded to the old version... Are there some stability issues?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at the permalink and I think even the original had this bug. It is the order of operations in calculus. You need to integrate acceleration to get velocity, and integrate velocity to get position. If you switch up the order and do acceleration -> position -> velocity, the position will always be one timestamp behind because it will have used the previous velocity rather than the current velocity.
So for example if I gave you a math problem where you had the acceleration given you would have to calculate the velocity first to get the position, it's the same thing here. The reason it works is because on line 57 you are reading in the previous velocity.
gym/gym/envs/classic_control/cartpole.py
Line 57 in 293eea7
force = self.force_mag if action==1 else -self.force_mag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The integration that the code does now is called semi-implicit / symplectic euler integration, and is much more stable than euler integration. Most game physics simulators use this method. This isn't a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably comment this on the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, after looking more closely, it's just normal Euler integration as Peter said, and it's correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure they are kinematic equations 1 and 2 with some simplifications. If they are, the order of the lines should be switched.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An airplane accelerates down a runway at 3.20 m/s2 for 32.8 s until is finally lifts off the ground. Determine the distance traveled before takeoff.
If you take the above example, you would have to calculate the velocity before the position using the kinematic equations. The process is the same as what is being used in those lines of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Per discussion with @joschu - both versions are correct. The old one is the vanilla Euler, and the suggested is semi-implicit Euler. While the new one is more stable, we'd rather not change the behavior of the existing environment. If you'd still like to use semi-implicit Euler, could you add a separate environment flag that turns it on (i.e. by default it is off, and a flag can turn it on)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I can do that. I have been looking at this from a physics stand point the whole time, but according to this post it appears that it depends on the system dynamic on which one you can use.
https://math.stackexchange.com/questions/2191071/confusion-about-the-semi-implicit-euler-method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add the new integration order as either as either conditioned on a flag or add an environment option (e.g. kinematics_integrator='euler'
is a default old version, and kinematics_integrator='semi-implicit-euler'
for new version.
I made the change and pushed it. Do I have to make a new pull request or anything? |
x_dot = x_dot + self.tau * xacc | ||
theta = theta + self.tau * theta_dot | ||
theta_dot = theta_dot + self.tau * thetaacc | ||
if self.kinematics_integrator == 'euler': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good to me, but I think the if and else blocks need to be swapped. In this context euler is the method when we update position first (using old velocity), then velocity, and semi-implicit euler - when we first update velocity, then position using new velocity
No, no need for a new PR, this one is getting updated when you push new commits. I like it save for the comment above. @joshu ? |
Do you want me to change the default setting to semi-implicit then as well? That way the behaviour is only changed from the previous if they change the option? |
No, I'd like default to be euler,.so that it behaves like it used to by default (first coordinate is updated , then velocity) |
Okay yea, you're right. I was getting everything mixed up in my head. You're 100% correct. |
Do you get notified when I commit? |
* Updated order Calculation Updated order of `x` and `theta` calculations so that they are no longer one timestamp behind `x_dot` and `theta_dot`. openai#1018 * Added semi-implicit euler option * Got implicit and standard euler mixed up * switched default option
Updated order of
x
andtheta
calculations so that they are no longer one timestamp behindx_dot
andtheta_dot
.See issue below:
#1018