Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/recurrent maddpg with launchpad #121

Merged
merged 4 commits into from
May 10, 2021

Conversation

DriesSmit
Copy link
Contributor

What?

Fix minor bugs to allow for recurrent training of MA-DDPG using Launchpad.

Why?

The bugs broke training if one used MA-DDPG with a recurrent executor.

How?

There were some minor bugs inside the MA-DDPG system, especially with the signatures, which meant incorrect data was being sent through the adders.

Extra

I also updated the transition adder to conform to the signature setup of the sequence adder.

@DriesSmit DriesSmit added the bug Something isn't working label May 10, 2021
@DriesSmit DriesSmit self-assigned this May 10, 2021
@DriesSmit DriesSmit linked an issue May 10, 2021 that may be closed by this pull request
Copy link
Contributor

@KaleabTessera KaleabTessera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @DriesSmit !

@@ -92,7 +93,7 @@ def make_networks(
observation_network,
snt.Flatten(),
snt.nets.MLP(policy_networks_layer_sizes[key]),
snt.LSTM(20),
snt.LSTM(25),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason for this or just seeing if it improved performance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was used to make it easier to debug, as now it is different than the sequence length. This can change back to 20, but I find it easier to see what is going on in the adders when something breaks if the size is not the same as the sequence length.

Copy link
Collaborator

@arnupretorius arnupretorius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks @DriesSmit! 👍

@arnupretorius arnupretorius linked an issue May 10, 2021 that may be closed by this pull request
@arnupretorius arnupretorius merged commit 0ff7ecd into main May 10, 2021
@arnupretorius arnupretorius deleted the fix/recurrent-maddpg-with-launchpad branch May 10, 2021 11:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run recurrent MA-DDPG with Launchpad Adapt MADDPG trainer to use recurrent execution
4 participants