Reimplementation in RL platform (CleanRL) #53

Open
cpwan opened this issue Mar 10, 2023 · 2 comments

cpwan commented Mar 10, 2023

Hello there, my team has been implementing the Attention Model on RL platforms so that we can try out different RL algorithms. We eventually succeeded in implementing the most efficient one, using PPO in CleanRL. We are able to train the Attention Model in 3 hours for 50-node problems (it took 25 hours with the original code).

Moreover, we have broken the Attention Model down into several components, which should make it a good resource for anyone interested in learning or developing the Attention Model.

We implemented the vehicle routing problems with the OpenAI Gym interface, which may make it easier to extend to other new problems.
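For readers unfamiliar with that interface, here is a minimal sketch of what a Gym-style routing environment can look like. It is not the RLOR code: the class `TinyTSPEnv`, the observation keys, and the helper `rollout_nearest_neighbor` are hypothetical names chosen for illustration, it uses a plain TSP rather than a full vehicle routing problem, and it follows the classic 4-tuple `step` convention without importing `gym` so that it runs with the standard library only.

```python
import math
import random


class TinyTSPEnv:
    """Hypothetical Gym-style TSP environment (illustrative, not the RLOR implementation)."""

    def __init__(self, n_nodes=5, seed=0):
        if n_nodes < 2:
            raise ValueError("n_nodes must be at least 2")
        self.n_nodes = n_nodes
        self.rng = random.Random(seed)

    def reset(self):
        # Sample node coordinates uniformly in the unit square.
        self.coords = [(self.rng.random(), self.rng.random())
                       for _ in range(self.n_nodes)]
        self.current = 0  # the tour starts at node 0
        self.visited = [False] * self.n_nodes
        self.visited[0] = True
        return self._obs()

    def _obs(self):
        # The action mask marks nodes that may still be visited.
        return {
            "coords": list(self.coords),
            "current": self.current,
            "action_mask": [not v for v in self.visited],
        }

    def step(self, action):
        if self.visited[action]:
            raise ValueError(f"node {action} was already visited")
        # Reward is the negative distance travelled in this step.
        reward = -math.dist(self.coords[self.current], self.coords[action])
        self.current = action
        self.visited[action] = True
        done = all(self.visited)
        if done:
            # Close the tour by returning to the start node.
            reward -= math.dist(self.coords[action], self.coords[0])
        return self._obs(), reward, done, {}


def rollout_nearest_neighbor(env):
    """Run one episode with a greedy nearest-neighbor policy."""
    obs = env.reset()
    tour, total, done = [obs["current"]], 0.0, False
    while not done:
        here = obs["coords"][obs["current"]]
        candidates = [i for i, ok in enumerate(obs["action_mask"]) if ok]
        action = min(candidates, key=lambda i: math.dist(here, obs["coords"][i]))
        obs, reward, done, _ = env.step(action)
        total += reward
        tour.append(action)
    return tour, total
```

Because the problem lives behind `reset` and `step`, a learned policy (for example one trained with PPO) could replace the greedy rule in `rollout_nearest_neighbor` without touching the environment, and a different routing problem would only need its own state, mask, and reward logic.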

We have released the source code for our implementation in RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research. Feel free to check it out 😆 !

@wouterkool
Owner

Hi! I'm sorry I'm not watching this repo frequently, but this is great. If you create a PR, I'm happy to link to this from the README (otherwise I'll see when I find the time). Before I do that, can you confirm the results you get on the same dataset? Additionally, I would encourage you to reimplement POMO (https://arxiv.org/abs/2010.16011), which is a simple but significant improvement as well.

@wouterkool
Owner

Hi! Thanks again for your implementation. I have linked to it in the README.
