Reimplementation in RL platform (CleanRL) #53

cpwan · 2023-03-10T03:11:11Z

Hello there, my team has been trying to implement the Attention Model in RL platforms so that we can try out different RL algorithms. Eventually, we succeeded in implementing the most efficient one, PPO, in CleanRL. We are able to train the Attention Model in 3 hours for 50-node problems (it took 25 hours in the original code).

Moreover, we have broken down the Attention Model into several components, which should make it a good resource for anyone interested in learning or developing the Attention Model.

We implemented the vehicle routing problems as environments with the OpenAI Gym interface, which should make it easier to extend to new problems.
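For readers unfamiliar with the Gym interface, here is a minimal sketch of what a routing environment with `reset`/`step` could look like. It is not the RLOR code: `TinyTSPEnv` and `rollout_nearest_neighbor` are hypothetical names, the problem is a toy TSP rather than a full vehicle routing problem, the class only mirrors the classic Gym 4-tuple `step` signature without importing `gym`, and starting every tour at node 0 is an assumption made for brevity.

```python
import math
import random


class TinyTSPEnv:
    """Hypothetical toy TSP environment with a Gym-style reset/step interface."""

    def __init__(self, num_nodes=5, seed=0):
        self.num_nodes = num_nodes
        self.rng = random.Random(seed)

    def reset(self):
        # Sample random node coordinates in the unit square.
        self.coords = [
            (self.rng.random(), self.rng.random()) for _ in range(self.num_nodes)
        ]
        self.visited = [False] * self.num_nodes
        self.current = 0  # assumption: every tour starts at node 0
        self.visited[0] = True
        return self._observe()

    def _observe(self):
        # The action mask marks the nodes that may still be visited.
        return {
            "coords": list(self.coords),
            "current": self.current,
            "action_mask": [not v for v in self.visited],
        }

    def _dist(self, a, b):
        return math.dist(self.coords[a], self.coords[b])

    def step(self, action):
        if self.visited[action]:
            raise ValueError(f"node {action} was already visited")
        # Reward is the negative distance travelled in this step.
        reward = -self._dist(self.current, action)
        self.current = action
        self.visited[action] = True
        done = all(self.visited)
        if done:
            # Close the tour by returning to the start node.
            reward -= self._dist(self.current, 0)
        return self._observe(), reward, done, {}


def rollout_nearest_neighbor(env):
    """Run one episode with a greedy nearest-neighbour policy; return tour length."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        here = obs["coords"][obs["current"]]
        candidates = [i for i, ok in enumerate(obs["action_mask"]) if ok]
        action = min(candidates, key=lambda i: math.dist(here, obs["coords"][i]))
        obs, reward, done, _ = env.step(action)
        total += reward
    return -total
```

Under this sketch, a new problem would only need its own `reset`, `step`, and observation layout; a learned policy would replace the nearest-neighbour rule and read the same `action_mask`.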

We have released the source code for our implementation in RLOR: A Flexible Framework of Deep Reinforcement Learning for Operation Research. Feel free to check it out 😆!

wouterkool · 2023-05-30T13:58:18Z

Hi! I'm sorry, I don't watch this repo frequently, but this is great. If you create a PR, I'm happy to link to this from the README (otherwise I'll do it myself when I find the time). Before I do that, can you confirm the results you get on the same dataset? Additionally, I would encourage you to reimplement POMO (https://arxiv.org/abs/2010.16011), which is a simple but significant improvement as well.

wouterkool · 2024-01-09T12:51:16Z

Hi! Thanks again for your implementation. I have linked to it in the README.
