-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements to 3DNNWalkers example #849
Improvements to 3DNNWalkers example #849
Conversation
68068f4
to
1fc36d0
Compare
Such reset will be non-deterministic, so outcomes will be slightly different. I'm planning to fix this for the shared memory API/pybullet, most robotics and machine learning / reinforcement learning projects need it.
You only have 50 walkers at one time, so you create 50 graphics shapes/instances in total? Why do you keep on creating them? And to 'delete' instances, you can move them far away and scale them to zero. Have you tried that? |
I keep on instantiating the walkers because when I reset a walker, I get these non-deterministic variations of performance of a single walker. Since the performance varies strongly (the walking patterns are not robust to slight variations), the formerly best walker is no longer the best in a second generation. Therefore I create a new walker every time I want to reset it and copy over the neural network configuration. By that I get deterministic performances, but it seems that generating new graphical objects every time (that is what m_guiHelper->autogenerateGraphicsObjects(m_dynamicsWorld); does right?) causes some issues on the OpenGL side. At least it can be prevented by not generating the graphics objects as long as they are not shown (as it is the case in headless simulation mode). Interestingly, the graphics disappear as soon as I delete the old walkers, so there seems to happen some clean up. Maybe I will have a look at your OpenGL code at some point. For now, my simulation has a toggle called REBUILD_WALKER, which deletes and reinstantiates when set to true, and uses the reset method when set to false (but with the non-deterministic effect). So by default it is set to true. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of #include time.h please use the Bullet/examples/Utils/b3Clock instead.
…lace #include time.h with b3Clock.
I extended the b3Clock to expose the system current time in milliseconds. This functionality was missing. Now the example differs every time because the random numbers differ properly, but a single performance is deterministic. |
In case you need a simple test case for the Bullet Example to test the non-determinism of resetting a position, tell me and I can commit the one I wrote based on the BasicExample in a separate Pull Request. |
There is no determinism when only resetting a few values and not the others, so adding a test is non-sense. It would be good to deprecate this example and move to pybullet+TF/gym etc. See https://github.com/matpalm/cartpoleplusplus |
[edit] That is not a good idea. You could simply re-create the entire dynamics world and remove all graphics instances: instancingRenderer->removeAllInstances(); I'm working on allowing pybullet /shared memory C-API allows deterministic save/restore features for machine learning. |
Thanks for the comments.
What other parameters do I need to reset to complete the reset? Can this only be done by recreating the dynamics world? |
Yes, you have to re-create everything at the moment. I'm working on fixing this in pybullet / shared memory API / future versions of Bullet. |
I am working again on this in parallel with the porting of the cartpole++ example. I committed some changes which are not yet enough to fix the example. However, from what I have seen before I merged in the new commits from bullet is that rebuilding the world does not work and also rebuilding the walkers together with rebuilding the world does not work. I will look into it further because there are still some issues to be resolved. |
bfe7dae
to
dad9bf4
Compare
f67ed51
to
cd153eb
Compare
I updated the application to also remove the graphics instances. Additionally I implemented the full reset of the simulation after every generation. It deletes all creatures and the whole dynamics world, however, the simulation is still non-deterministic. I have no idea what is wrong. |
examples/Utils/b3Clock.cpp
Outdated
/// Returns the time in us since the last call to reset or since | ||
/// the Clock was created. | ||
/// Gets the system time in milliseconds | ||
unsigned long int b3Clock::getSystemTimeMilliseconds() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be a duplicate of the already existing 'getTimeMilliseconds', why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it is not a copy. I is not a difference to the last time (currentTime.QuadPart - m_data->mStartTime.QuadPart) but it is the absolute system time. I use this to get a seed for randomness in the evolutionary algorithm. Do you propose another way to get a random seed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please remove any changes to b3Clock? I will add an option to 'reset' using absolute 0 time reference. See #1165
examples/Utils/b3Clock.cpp
Outdated
|
||
|
||
return msecTicks; | ||
LARGE_INTEGER currentTime, elapsedTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just a formatting/layout fix? Please undo (it makes reviewing very hard to mix layout changes with actual changes)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just made this code consistent with all other code segments using a very similar code segment here. But you are right, I should not have mixed it. I will undo it and maybe PR for it again so that this class is a bit cleaner. Ok?
Thanks for reviewing! |
If you remove the changes to b3Clock I can try it out and merge it. We can see why your sim is non-deterministic. |
Reverted the changes from b3Clock, thanks for adding the reset method. Let us see if the CI passes. |
examples/Evolution/NN3DWalkers.cpp
Outdated
@@ -160,7 +160,8 @@ class NN3DWalkersExample : public NN3DWalkersTimeWarpBase | |||
m_filterCallback(NULL) | |||
{ | |||
b3Clock clock; | |||
srand(clock.getSystemTimeMilliseconds()); // Set random milliseconds based on system time | |||
srand(clock.getTimeMilliseconds()); // Set random milliseconds based on system time | |||
clock.reset(true); // Reset clock to zero to get new random seed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you may want to call this reset before calling 'getTimeMilliseconds', otherwise you still get the relative time since the clock was created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I could basically reset it directly after using it, so that it is reset for the next call. But that is ok too.
Thanks, I'll merge it soon after trying it out. Could you provide some links/paper/thesis/description about the details of your approach? |
Do you need it for debugging or is it for some documentation or similar? My thesis for my master´s degree did something similar and this was basically just a first version for the example browser. Next up would be some kind of body dimensions evolution together with the neural network weights. From the top of my head, the following is the approach of this example: [1] https://en.wikipedia.org/wiki/Evolutionary_algorithm Let me know if you need anything more specific or any more details. |
Thanks, it help if I'm aware what the examples (try to) do :-) Do you have a link to your thesis? |
I merged it, but get crashes on Windows when increasing the simulation speed (first slider). Reverting it: #1171 |
Sounds strange, on Linux it worked. I will look into it on Windows. |
I realized that the example is not deterministic. So I wanted to fix it quickly, but it took me a longer while.
Additionally I improved the TimeSeriesCanvas to support yMin and yMax instead of only yScale. It is backwards compatible with the other demos.
I have 2 issues that I could not resolve properly:
How do I reset a btRigidBody+btTypedConstraints setup to a certain position? I used the following method to reset all hinge joints, clear forces, angular velocity and linear velocity and reset all positions using the resetPosition as a baseposition and then apply the respective relative transform.
However, this does not reset the walker properly, instead its performance varies lightly. I will make a simple example to show the issue if necessary. The problem is currently solved by recreating every creature from scratch every time it is reset. A side-effect is that you can change its morphology using the sliders and see if the same neural network can control it.
Except for those problems, everything works well and the evolution converges to a certain speed.