
deepbots-panda to deepworlds #33

Merged
merged 1 commit into from
Apr 12, 2021

Conversation

KelvinYang0320
Member

No description provided.

@eakirtas eakirtas self-requested a review March 10, 2021 20:30
@tsampazk
Member

Hey @KelvinYang0320! First of all, thank you for this great addition to the repository. I would also like to commend you on the quality of the code, great job!

One issue I noticed that seems to cause a significant slowdown in the training process: when the arm self-collides, the simulation slows down to a crawl. I am including some screenshots below.

[Screenshots: self-collision (mesh highlighted red), no collision, and the physics step warning]

I guess that the robot mesh becomes red when self-colliding and the simulation struggles to move forward, sometimes even showing the physics step warning. I am running the simulation at as-fast-as-possible speed; it goes up to x15 when no self-collision is happening and slows down to x1.0 or less when self-colliding. Can you confirm that this is indeed happening on your system as well?

I can imagine a possible solution: detect self-collision and add it as a termination (done) condition, with a possible negative reward. That might greatly improve the speed of the training process. I can see that the Webots docs state the simulation might slow down a lot, but I am not exactly sure if there's a way to detect it in code. I posted a question in the Webots Discord and will notify you here when I get an answer.
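
As a sketch, treating self-collision as a termination condition with a negative reward could plug into the environment like this. Webots did not expose a self-collision query at the time (hence the feature request below), so `self_collided` here is a hypothetical stand-in predicate, and the penalty value is an assumption, not something from this PR:

```python
COLLISION_PENALTY = -10.0  # assumed penalty value, not from the PR


class CollisionAwareEpisode:
    """Hypothetical episode logic that ends an episode on self-collision."""

    def __init__(self, self_collided):
        # `self_collided` is a callable returning True when the arm
        # currently intersects itself (a detector Webots would have to provide)
        self.self_collided = self_collided

    def get_reward(self, base_reward):
        # Penalize the step on which a self-collision occurs
        if self.self_collided():
            return base_reward + COLLISION_PENALTY
        return base_reward

    def is_done(self):
        # Terminate early instead of simulating a tangled arm at x1.0 speed
        return self.self_collided()
```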

@KelvinYang0320
Member Author

Hello @tsampazk
Yes, this warning sometimes happens on my system as well, but since it only slows down the training process, I have been ignoring it.
The complex boundingObject is for grabbing a beaker (my original training task).
To train faster, maybe we can just remove the boundingObject?
Even without detecting self-collision, I think this task should still be solvable.

The gif below is the same goal reaching task that was solved in my IKPY-based RL Env, and this environment is without the self-collision and boundingObject.

@tsampazk
Member

I opened a feature request on the Webots repository, because it seems like a great addition for RL tasks. For this PR, maybe you can try removing the bounding box as you suggest, as it is not critical. We can re-address this in the future, in light of the feedback we get from the Webots developers.

@stefaniapedrazzi

stefaniapedrazzi commented Mar 11, 2021

Just also note that one of the main problems of this robot model causing low simulation performance is the boundingObjects.
Using these complex IndexedFaceSet geometries is computationally expensive. It is strongly recommended to simplify them and, when possible, use only simple geometries (Sphere, Cylinder, Box, Capsule) in the boundingObject definition.
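
A minimal illustration of this recommendation in a Webots world file, for a hypothetical arm link (the field values are placeholders, not taken from the panda model):

```
Solid {
  # Approximate the link with a cheap primitive instead of the mesh geometry
  boundingObject Capsule {
    height 0.2
    radius 0.05
  }
}
```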

@tsampazk
Member

@all-contributors please add @KelvinYang0320 for code, doc, example, ideas

@allcontributors
Contributor

@tsampazk

I've put up a pull request to add @KelvinYang0320! 🎉

@KelvinYang0320
Member Author

For this PR maybe you can try removing the bounding box as you suggest as it is not critical.

@tsampazk
With this solution, some parts of the robot arm will pass through each other during training. 😅
Is that ok?

@tsampazk
Member

For this PR maybe you can try removing the bounding box as you suggest as it is not critical.

@tsampazk
With this solution, some parts of the robot arm will pass through each other during training. 😅
Is that ok?

Yes, I guess it's OK for the purposes of this example! 😄 You can go ahead and remove it; after some more thorough reviews it should be good to go.

@KelvinYang0320
Member Author

@tsampazk Okay, I have removed it.
Thank you for adding me as a contributor!

@KelvinYang0320
Member Author

@tsampazk After training for about 12 hours, I think I should set the env.solved() condition to a higher score, based on this image.
Are there other bugs I need to fix?
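
The kind of check being tuned here can be sketched as a moving average of recent episode scores crossing a threshold. The window and threshold values below are assumptions for illustration, not the ones used in the actual example (the later commits mention moving from SMA100 to SMA500):

```python
WINDOW = 500              # number of recent episodes averaged (assumed)
SOLVED_THRESHOLD = 120.0  # assumed score threshold, not the real one


def solved(episode_score_list):
    """Return True when the moving average of recent scores clears the bar."""
    if len(episode_score_list) < WINDOW:
        return False  # not enough episodes to judge yet
    recent = episode_score_list[-WINDOW:]
    return sum(recent) / WINDOW > SOLVED_THRESHOLD
```

Raising the threshold, as proposed above, simply makes this predicate stricter without touching the training loop.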

@tsampazk
Member

@tsampazk After training for about 12 hours, I think I should set the env.solved() condition to a higher score, based on this image.

Since you have observed the results many more times and have more experience with the problem than I do, if you think it is better this way, it looks good to me!

As a minor note, I think we would prefer to keep the solved conditions and hyperparameters in a state where it is as easy as possible for someone to train the agent out of the box. For example, in the find-target problem we provide, the training procedure is problematic and we are looking into fixing it in the future.

Are there other bugs I need to fix?

To be honest, I tried running the example on my system and the robot controller crashes silently after 50-100 episodes. Obviously this is not happening to you, so I am wondering which Webots version you are using. Moreover, I think we are both running the latest dev version of deepbots installed from here, correct?

@tsampazk
Member

@KelvinYang0320 Also you can take a look at the suggestion posted here. Maybe it is something you can look into in the future if you find yourself working on a similar problem?

@KelvinYang0320
Member Author

To be honest, I tried running the example on my system and the robot controller crashes silently after 50-100 episodes. Obviously this is not happening to you, so I am wondering which Webots version you are using. Moreover, I think we are both running the latest dev version of deepbots installed from here, correct?

Yes, we are both running the latest dev version of deepbots with Webots R2021a, and I ran this example on Ubuntu 18.04 & Ubuntu 20.04. I have listed this information in the README.md.
Could this be caused by a memory limit (RAM, GPU)?

@KelvinYang0320
Member Author

KelvinYang0320 commented Mar 12, 2021

As a minor note, I think we would prefer to keep the solved conditions and hyperparameters in a state where it is as easy as possible for someone to train the agent out of the box. For example, in the find-target problem we provide, the training procedure is problematic and we are looking into fixing it in the future.

I can remove 6 of the 9 goals to simplify this problem.

the robot controller crashes silently after 50-100 episodes.

However, since the robot controls 7 motor positions (a huge action space), it still might not be solvable within 50-100 episodes.

@KelvinYang0320 Also you can take a look at the suggestion posted here. Maybe it is something you can look into in the future if you find yourself working on a similar problem?

Thank you very much for opening this issue on Webots for me. 😄
We have done a project with IKPY and DDPG, but we didn't use IKPY to simplify the actions the agent sends.
We separated a sequence of actions into many parts and solved each part with IKPY or DDPG.
I will look into the IKFast Kinematics Solver in the future. 🤔
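
The splitting idea, handing part of a motion to an analytic solver instead of the learned policy, can be illustrated with a toy two-link planar IK based on the law of cosines. This is a generic textbook formula for illustration only, not the actual IKPY/DDPG pipeline from that project:

```python
import math


def two_link_ik(x, y, l1, l2):
    """Joint angles (q1, q2) placing a 2-link planar arm's tip at (x, y).

    l1, l2 are the link lengths; returns the elbow-down solution.
    """
    d2 = x * x + y * y  # squared distance to the target
    # Law of cosines for the elbow angle; clamp against rounding error
    cos_q2 = (d2 - l1 * l1 - l2 * l2) / (2 * l1 * l2)
    q2 = math.acos(max(-1.0, min(1.0, cos_q2)))
    # Shoulder angle: direction to target minus the wrist offset angle
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2
```

An agent could then emit only a target position and let the solver produce the joint commands, shrinking the action space the policy has to explore.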

@tsampazk
Member

I have listed this information in the README.md.

Oops, sorry, I didn't notice that part of the README! I am seeing the issue on Windows 10.

I think this is caused by memory limit (RAM, GPU)?

RAM usage seems stable, the GPU is barely used, and the rest of the system runs fine. Even disabling the SYNCHRONIZATION flag of the controller lets the rest of the simulation continue running. One thing I noticed is that it happened during the handshaking process. I tried to reproduce it another time, but it reached 250 episodes and the crash hasn't happened. Really weird indeed.

@ManosMagnus could you please do a test run and see if the issue appears on your system as well, when you get a chance? If my colleague Manos can't reproduce it, I think we can blame it on my system.

However, since the robot controls 7 motor positions (a huge action space), it still might not be solvable within 50-100 episodes.

Of course! Then we would have a pretty spectacular RL agent on our hands! 😛 You can keep it as it is, no problem at all. The fact that the provided code solves the problem is good enough. If you can find a way to make it faster, that would be great, but it is not a must, so don't worry about it.

Thank you very much for opening this issue on webots for me. 😄

You are welcome! Just food for thought for the future! 😁

@KelvinYang0320
Member Author

KelvinYang0320 commented Mar 16, 2021

I tried to reproduce it another time, but it reached 250 episodes and the crash hasn't happened. Really weird indeed.

I also tried to reproduce the issue you mentioned on my Windows 10 machine, but it reached over 500 episodes and the crash hasn't happened. 🤔

You can keep it as it is, no problem at all.

Thank you! I think I will just keep it as it is.

@eakirtas
Member

Hello @KelvinYang0320,

Sorry for the delay; I will provide a code review in the following days.

Member

@eakirtas eakirtas left a comment


Looks good to me. I've run the example without any problems. I added some minor comments.


# agent.load_models() # Load the pretrained model
episodeCount = 0
episodeLimit = 50000
Member


This might be a universal constant value or an argument of the run function?

Member Author


episodeCount = 0 and episodeLimit = 50000 are set as universal constants, but we can still change these constants when doing other tasks.
This is similar to these lines in the cartpole_discrete example.

Member


I meant that this could be at the top of the file (below the imports, as something like EPISODE_LIMIT) in order to be visible to the end user. Indeed, this is as in cartpole; maybe we should change it there as well. Of course, it is really minor, so if you feel it's better as it is, feel free to resolve the comment.
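
The suggestion might look like this in the runner script (a sketch only; the loop body is elided and the names mirror the thread):

```python
# Module-level constant just below the imports, so it is the first
# thing an end user sees and edits.
EPISODE_LIMIT = 50000  # maximum number of training episodes


def run(episode_limit=EPISODE_LIMIT):
    """Run training episodes up to the configured limit."""
    episode_count = 0
    while episode_count < episode_limit:
        # ... run one training episode here ...
        episode_count += 1
    return episode_count
```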

Comment on lines 135 to 136
#self.fc1.weight.data.uniform_(-f1, f1)
#self.fc1.bias.data.uniform_(-f1, f1)
Member


Remove those lines?

Member Author


Ok

self.setup_motors()

# Set up misc
self.stepsPerEpisode = 300 # How many steps to run each episode (changing this messes up the solved condition)
Member


Argument or constant?

Member Author

@KelvinYang0320 KelvinYang0320 Mar 29, 2021


self.stepsPerEpisode = 300 is a constant.
If the user is trying to reach a farther target, they can change this constant.

Member


Similar to the previous comment. Feel free to resolve the comment if need be.

self.stepsPerEpisode = 300 # How many steps to run each episode (changing this messes up the solved condition)
self.episodeScore = 0 # Score accumulated during an episode
self.episodeScoreList = [] # A list to save all the episode scores, used to check if task is solved
self.motorVelocity = 10
Member


Argument or constant?

Member Author


Similar to these lines in cartpole_discrete example.
I set these as constants, and users can modify these lines if they want.

@KelvinYang0320
Member Author

@ManosMagnus Thank you for your comments!
I have fixed the parts you mentioned.
If there are still some problems, please let me know. 😀

Member

@eakirtas eakirtas left a comment


LGTM, just two minor comments. Could you please squash the commits so the PR can get merged?


# agent.load_models() # Load the pretrained model
episodeCount = 0
episodeLimit = 50000
Member


I meant that this could be at the top of the file (below the imports, as something like EPISODE_LIMIT) in order to be visible to the end user. Indeed, this is as in cartpole; maybe we should change it there as well. Of course, it is really minor, so if you feel it's better as it is, feel free to resolve the comment.

self.setup_motors()

# Set up misc
self.stepsPerEpisode = 300 # How many steps to run each episode (changing this messes up the solved condition)
Member


Similar to the previous comment. Feel free to resolve the comment if need be.

@KelvinYang0320
Member Author

KelvinYang0320 commented Mar 31, 2021

@ManosMagnus I have set STEPS_PER_EPISODE and MOTOR_VELOCITY as constants in robot_supervisor_ddpg.py, and EPISODE_LIMIT and SAVE_MODELS_PERIOD as constants in DDPG_runner.py.
Thank you for your clear explanation! 😅

Do you think we should create a Python file, say constants.py, for these constants?
I need to use from robot_supervisor_ddpg import STEPS_PER_EPISODE in DDPG_runner.py, since I need this constant in a different Python file.
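
A possible constants.py along those lines; the file name and values follow the thread (steps per episode, motor velocity, episode limit, and the save period from the "save models every 200 episodes" commit), but treat the exact layout as a sketch:

```python
# constants.py -- shared between robot_supervisor_ddpg.py and DDPG_runner.py,
# avoiding a cross-import between the two controller files.
STEPS_PER_EPISODE = 300   # steps per episode (the solved condition depends on it)
MOTOR_VELOCITY = 10       # velocity applied to the arm's position-controlled motors
EPISODE_LIMIT = 50000     # maximum number of training episodes
SAVE_MODELS_PERIOD = 200  # save the agent's models every N episodes

# Each file would then import what it needs, e.g.:
# from constants import STEPS_PER_EPISODE, MOTOR_VELOCITY
```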

@eakirtas
Member

eakirtas commented Apr 2, 2021

Thank you @KelvinYang0320!

That's a good idea. However, I am afraid it might not be visible to the end user. Do you think it's better to add the constants to robot_supervisor_manager.py?

@KelvinYang0320
Member Author

Sorry for the delay in responding.

Do you think it's better to add the constants to robot_supervisor_manager.py?

Yes, that is a good idea. I will modify it.

@KelvinYang0320
Member Author

KelvinYang0320 commented Apr 6, 2021

@ManosMagnus I'm not sure why I get this error if I import these constants from robot_supervisor_manager.py:
[Screenshot: errConst]
But there is no error if I import them from Constants.py 😕
You can reproduce the error by replacing Constants with robot_supervisor_manager in this line or this line.

@eakirtas
Member

eakirtas commented Apr 9, 2021

It might be because of the cross imports. We can just keep it in constants.py.

I think we can merge the PR now. Please remove the constants from the manager and squash the commits in order to merge the PR.

Thank you!

Commits pushed during review (later squashed into one): update readme (×3), del break, Update README.md (×2), refactoring (×3), forget to save my models!, os (×2), tmp/ddpg, save models every 200 episodes, del boundingObject (×2), fix solved() SMA100->SMA500, fix maintainer comments, del #... in ddpg.py, add const, const:capital letters, update

@KelvinYang0320
Member Author

We can just keep it on constants.py.

Please remove the constants from the manager and squash the commits in order to merge the PR.

Ok, I have removed them and squashed all commits to one commit.
Thank you.

@eakirtas eakirtas merged commit 36ee34c into aidudezzz:dev Apr 12, 2021
@eakirtas
Member

Thank you @KelvinYang0320 for contributing such a nice example!

4 participants