deepbots-panda to deepworlds #33
Conversation
Hey @KelvinYang0320! First of all, thank you for this great addition to the repository. I would also like to commend you on the quality of the code, great job! One issue I noticed, and that seems to cause a significant slowdown in the training process, is that when the arm self-collides, the simulation slows down to a crawl. I am including some screenshots below. I guess that the robot mesh becomes red when self-colliding and the simulation struggles to move forward, sometimes even showing the physics step warning. I am running the simulation at as-fast-as-possible speed; it goes up to x15 when no self-collision is happening and slows down to x1.0 or less when self-colliding. Can you confirm that this is indeed happening on your system as well? One possible solution I can imagine is detecting self-collision and adding it as a termination (done) condition with a possible negative reward. It might greatly improve the speed of the training process. I can see that the Webots docs state that the simulation might slow down a lot, but I am not exactly sure if there's a way to detect it in code. I posted a question in the Webots Discord and will notify you here when I get an answer.
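The termination-and-penalty idea could look something like the sketch below. Note that the `self_collided` flag is a hypothetical input: Webots does not currently expose a self-collision signal to controller code (that is exactly the open question above), so this only shows how such a signal would plug into the reward and done logic, with an untuned illustrative penalty.

```python
# Hypothetical sketch: self-collision as a termination condition with a
# negative reward. `self_collided` is an assumed signal, not an existing
# Webots API; the penalty value is illustrative, not tuned.
SELF_COLLISION_PENALTY = -10.0

def get_reward(distance_to_goal, self_collided):
    """Penalize self-collision heavily; otherwise reward being close to the goal."""
    if self_collided:
        return SELF_COLLISION_PENALTY
    return -distance_to_goal

def is_done(self_collided, steps_taken, steps_per_episode):
    """End the episode on self-collision or when the step budget is exhausted."""
    return self_collided or steps_taken >= steps_per_episode
```

Ending the episode early this way would also skip the slow, self-colliding physics steps entirely, which is where the expected training speedup comes from.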
Hello @tsampazk! The gif below shows the same goal-reaching task that was solved in my IKPY-based RL env; this environment is without the self-collision and boundingObject.
I opened a feature request on the Webots repository, because it seems like a great addition for RL tasks. For this PR, maybe you can try removing the bounding box as you suggest, as it is not critical. We can re-address this in the future, in light of the feedback we get from the Webots developers.
Just also note that one main cause of this robot model's low simulation performance is the boundingObjects.
@all-contributors please add @KelvinYang0320 for code, doc, example, ideas
I've put up a pull request to add @KelvinYang0320! 🎉
@tsampazk
Yes, I guess it's OK for the purposes of this example! 😄 You can go ahead and remove it, and after some more thorough reviews it should be good to go.
@tsampazk Okay, I have removed it.
Since you have observed the results many more times and have more experience with the problem than me, if you think it is better this way, it looks good to me! As a minor note, I think we would prefer to keep the solved conditions and hyperparameters in a state where it is as easy as possible for someone to train the agent out of the box. For example, in the find target problem we provide, the training procedure is problematic and we are looking into fixing it in the future.
To be honest, I tried running the example on my system and the robot controller crashes silently after 50-100 episodes. Obviously this is not happening to you, so I am wondering which Webots version you are using? Moreover, I think we are both running the latest dev version of deepbots installed from here, correct?
@KelvinYang0320 Also, you can take a look at the suggestion posted here. Maybe it is something you can look into in the future if you find yourself working on a similar problem?
Yes, we are both running the latest dev version of deepbots with Webots R2021a, and I ran this example on Ubuntu 18.04 & Ubuntu 20.04. I have listed this information in the README.md.
I can remove 6 of the 9 goals to simplify this problem.
However, since the robot controls 7 motor positions (a huge action space), it still might not be solvable within 50-100 episodes.
Thank you very much for opening this issue on Webots for me. 😄
Oops, sorry, I didn't notice that part of the README! I am having the issue on Windows 10.
RAM usage seems stable, the GPU is barely used, and the rest of the system runs fine. Even disabling the SYNCHRONIZATION flag of the controller allows the rest of the simulation to continue running. One thing I noticed is that it happened during the handshaking process. I tried to reproduce it another time, but it reached 250 episodes and the crash hasn't happened again. Really weird indeed. @ManosMagnus, could you please do a test run and see if the issue appears on your system as well when you get a chance? If my colleague Manos can't reproduce it, I think we can blame it on my system.
Of course! Then we would have a pretty spectacular RL agent on our hands! 😛 You can keep it as it is, no problem at all. The fact that the provided code solves the problem is good enough. If you can find a way to make it faster, that would be great, but it is not a must, so don't worry about it.
You are welcome! Just food for thought for the future! 😁
I also tried to reproduce the issue you mentioned on my Windows 10 machine, but it reached over 500 episodes and the crash hasn't happened. 🤔
Thank you! I think I will just keep it as it is.
Hello @KelvinYang0320, sorry for the delay. I will provide a code review in the following days.
Looks good to me. I've run the example without any problem. I added some minor comments.
# agent.load_models()  # Load the pretrained model
episodeCount = 0
episodeLimit = 50000
Could this be a universal constant or an argument of the run function?
episodeCount = 0 and episodeLimit = 50000 are set as universal constants, but we can still change these constants when doing other tasks. Similar to these lines in the cartpole_discrete example.
I meant that this can be at the top of the file (below the imports, something like EPISODE_LIMIT) in order to be visible to the end user. Indeed this is as in cartpole; maybe we should change it there as well. Of course, it is really minor; if you feel it's better as it is, feel free to resolve the comment.
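As a rough sketch of that suggestion (the names are illustrative, not the ones in the PR), the tunable could sit just below the imports and double as a default argument:

```python
# Sketch: hoisting a tunable to a module-level constant below the imports,
# so it is immediately visible to the end user. Names are illustrative.
EPISODE_LIMIT = 50000  # maximum number of training episodes

def run(episode_limit=EPISODE_LIMIT):
    # Using the constant as a default keeps it overridable per call.
    episode_count = 0
    while episode_count < episode_limit:
        # ... one training episode would run here ...
        episode_count += 1
    return episode_count
```

A caller can then use `run()` for the full training budget or `run(episode_limit=100)` for a quick smoke test, without editing the file.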
[Four resolved review threads on examples/panda/panda_goal_reaching/controllers/robot_supervisor_manager/DDPG_runner.py, two marked outdated; content collapsed.]
# self.fc1.weight.data.uniform_(-f1, f1)
# self.fc1.bias.data.uniform_(-f1, f1)
Remove those lines?
Ok
[Resolved (outdated) review thread on examples/panda/panda_goal_reaching/controllers/robot_supervisor_manager/checkConvergence.py; content collapsed.]
self.setup_motors()

# Set up misc
self.stepsPerEpisode = 300  # How many steps to run each episode (changing this messes up the solved condition)
Argument or constant?
self.stepsPerEpisode = 300 is a constant. If the user is trying to reach a farther target, they can change this constant.
Same as the previous comment. Feel free to resolve the comment if need be.
self.stepsPerEpisode = 300  # How many steps to run each episode (changing this messes up the solved condition)
self.episodeScore = 0  # Score accumulated during an episode
self.episodeScoreList = []  # A list to save all the episode scores, used to check if the task is solved
self.motorVelocity = 10
Argument or constant?
Similar to these lines in the cartpole_discrete example.
I set these as constants, and users can modify these lines if they want.
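A minimal sketch of that convention, assuming a supervisor class like the one under review (the class body here is illustrative, not the actual PR code):

```python
# Illustrative sketch: keeping the tunables as plainly visible attributes set
# in __init__, as in the cartpole_discrete example referenced above.
class PandaRobotSupervisor:
    def __init__(self):
        self.stepsPerEpisode = 300  # Changing this affects the solved condition
        self.motorVelocity = 10     # Velocity used for motor position commands
        self.episodeScore = 0       # Score accumulated during an episode
        self.episodeScoreList = []  # All episode scores, used to check if solved

supervisor = PandaRobotSupervisor()  # users edit the values above to retune
```

Users who want a farther target or a different step budget edit these assignments directly, at the cost of them being less discoverable than module-level constants.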
[Resolved (outdated) review thread on examples/panda/panda_goal_reaching/controllers/robot_supervisor_manager/checkConvergence.py; content collapsed.]
@ManosMagnus Thank you for your comments!
LGTM, just two minor comments. Could you please squash the commits so the PR can get merged?
@ManosMagnus I have set these as constants. Do you think we should create a Python file, say constant.py, for these constants?
Thank you @KelvinYang0320! That's a good idea. However, I am afraid that it could be invisible to the end user. Do you think it is better to add the constants at the top of the file?
Sorry for the delay in responding.
Yes, that is a good idea. I will modify it.
It might be because of the cross imports. We can just keep it where it is. I think we can merge the PR for now. Please remove the constants from the manager and squash the commits so the PR can be merged. Thank you!
[Squashed commit messages: update readme; update readme; update readme; del break; Update README.md; Update README.md; refactoring; refactoring; refactoring; forget to save my models!; os; os; tmp/ddpg; save models every 200 episodes; del boundingObject; del boundingObject; fix solved(); SMA100->SMA500; fix maintainer comments; del #... in ddpg.py; add const; const: capital letters; update]
Ok, I have removed them and squashed all commits into one commit.
Thank you @KelvinYang0320 for contributing such a nice example!