REINFORCE in DI-engine? #517

cpwan · 2022-10-19T09:54:56Z

The REINFORCE algorithm is a classical policy gradient method. It has been implemented in some other RL libraries, such as RLlib and Tianshou. I wonder if we can also have it in DI-engine.
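For readers unfamiliar with the method, here is a minimal, dependency-free sketch of the REINFORCE update on a toy two-armed bandit. It is not DI-engine, RLlib, or Tianshou code. The function names (`discounted_returns`, `reinforce_step`, `train_bandit`), the state-less softmax policy, and the bandit rewards are all assumptions made up for illustration.

```python
import math
import random


def discounted_returns(rewards, gamma=0.99):
    """Compute the reward-to-go G_t = r_t + gamma * G_{t+1} for one episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, actions, returns, lr=0.1):
    """One gradient-ascent step on mean_t G_t * log pi(a_t).

    The policy here is a state-less softmax over `logits` (an assumption
    for this toy example), so the gradient has a closed form:
    d log pi(a) / d logit_i = 1[i == a] - pi(i).
    """
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for a, g in zip(actions, returns):
        for i in range(len(logits)):
            grad[i] += g * ((1.0 if i == a else 0.0) - probs[i])
    return [w + lr * d / len(actions) for w, d in zip(logits, grad)]


def train_bandit(episodes=500, lr=0.5, seed=0):
    """Hypothetical toy task: arm 1 pays reward 1, arm 0 pays reward 0."""
    rng = random.Random(seed)
    logits = [0.0, 0.0]
    for _ in range(episodes):
        probs = softmax(logits)
        action = 0 if rng.random() < probs[0] else 1  # sample from the policy
        reward = float(action == 1)
        returns = discounted_returns([reward])  # one-step episode
        logits = reinforce_step(logits, [action], returns, lr)
    return logits


if __name__ == "__main__":
    print(softmax(train_bandit()))  # probability mass should shift toward arm 1
```

The sketch uses no baseline, so the gradient estimate has high variance. That is one plausible reason a plain version can perform poorly on harder environments.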

PaParaZz1 · 2022-10-20T03:39:15Z

Maintainer

It is not difficult to implement naive policy gradient in DI-engine, but why do you need it? We haven't added naive PG so far because it performs poorly in most environments.

2 replies

cpwan Oct 20, 2022
Author

I am doing RL on combinatorial problems. Prior work in this direction uses REINFORCE, so it would be nice if we could benchmark it.

PaParaZz1 Oct 20, 2022
Maintainer

Got it. I will implement naive PG in the next two weeks.

PaParaZz1 · 2022-11-16T06:48:58Z

Maintainer

I have implemented REINFORCE in #544.

0 replies
