Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added multi cartpole example #48

Open
wants to merge 20 commits into
base: dev
Choose a base branch
from
Open

Added multi cartpole example #48

wants to merge 20 commits into from

Conversation

veds12
Copy link

@veds12 veds12 commented Aug 17, 2021

Description:

  • The multi-cartpole grid example consists of 9 cartpole robots each present in one block of a 3x3 grid
  • Each cartpole robot functions independently of each other
  • The supervisor communicated with each of these 9 robots through separate emitter-receiver channels, i.e. there are 9 pairs of emitter-receiver channels
  • The robots are trained using Proximal Policy Optimisation
  • The supervisor controller can work for more / less number of robots by changing the value of the parameter num_robots while initializing it. The robot controller is agnostic to the number of robots. Hence, this world can be easily extended to different number of robots.

TO DO:

  • 9 cartpole grid env
  • Simultaneous communication using emmiter-receiver scheme working
  • Training
  • Clean the code
  • Change communication scheme
  • Move all the SolidBox nodes to a group node
  • Add reward plotting

@KelvinYang0320
Copy link
Member

@veds12 Thank you for this great and interesting example.:thumbsup:
I will do a code review as soon as possible.

Could you move all SolidBox nodes to a group node.
By doing so, I think it will be easier for end-users to understand the scene tree in this example.

@KelvinYang0320 KelvinYang0320 self-requested a review August 17, 2021 14:26
Copy link
Member

@KelvinYang0320 KelvinYang0320 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the great code quality!
I've left a few comments.
Also, please add a readme with showcase & plot to this example as we have done for each example in deepworlds.
Cheers!

@veds12
Copy link
Author

veds12 commented Aug 18, 2021

Thanks for the review @KelvinYang0320. I'll make the changes and update the PR as soon as possible

@veds12
Copy link
Author

veds12 commented Jun 26, 2022

@KelvinYang0320 I have addressed the requested changes. Please take a look

@KelvinYang0320
Copy link
Member

@veds12 Did you use wandb in multi_cartpole?
I think you added evolutionary/cartpole_evolutionary/controllers/supervisor_manager/wandb by mistake.

Also, please add a readme with showcase & plot to this example as we have done for each example in deepworlds.

Please add a readme here.

Could you also fix the coordinate system (NUE → ENU) in this PR? Thank you!

@veds12
Copy link
Author

veds12 commented Jun 27, 2022

@KelvinYang0320 Yeah I left the wandb logs by mistake. Forgot to update the gitignore. I'll make the other changes.
Thanks!

@veds12
Copy link
Author

veds12 commented Jul 3, 2022

  • I have added a readme and changed the coordinate system (in the controller code).
  • I tried training the example but for some reason it's taking too long to converge / not converging at all. Currently looking into it. I'll add the training plot when I fix this.

poleAngle = [0.0 for _ in range(self.num_robots)]

# Angular velocity x of endpoint
endpointVelocity = [normalizeToRange(self.poleEndpoint[i].getVelocity()[3], -1.5, 1.5, -1.0, 1.0, clip=True) for i in range(self.num_robots)]
Copy link
Member

@KelvinYang0320 KelvinYang0320 Jul 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be the reason why it's taking too long to converge / not converging.

Suggested change
endpointVelocity = [normalizeToRange(self.poleEndpoint[i].getVelocity()[3], -1.5, 1.5, -1.0, 1.0, clip=True) for i in range(self.num_robots)]
endpointVelocity = [normalizeToRange(self.poleEndpoint[i].getVelocity()[4], -1.5, 1.5, -1.0, 1.0, clip=True) for i in range(self.num_robots)]

Copy link
Member

@KelvinYang0320 KelvinYang0320 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@veds12 I have left a comment about the training issue.
Could you format your code with PEP8?
I will do a review again after that.
Thank you!

@tsampazk
Copy link
Member

tsampazk commented Jul 7, 2022

@veds12 I have left a comment about the training issue. Could you format your code with PEP8? I will do a review again after that. Thank you!

Hey @veds12, just a clarification because we have been discussing it with @KelvinYang0320, i think that the only differences from PEP8 are that we use a line limit of 120, and prefer to use mixedCase (sometimes also called lowerCamelCase) instead of snake_case in variable names.

@veds12
Copy link
Author

veds12 commented Jul 7, 2022

@tsampazk thanks for the clarification! I'll address the changes this weekend

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants