Retraining code is a little different from the algorithm description. #24

zdh2292390 · 2021-04-14T13:45:46Z

In policy_agent.py, why does the retraining code run BFS teacher-guided training after the agent fails?
This is not the same as the algorithm description.
Does this mean BFS is the upper bound for the RL agent?
