CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning
Implementation of CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning.
This implementation requires the installation of the gym_flowers module, which overrides gym to enable the use of custom environments such as the one we use in this paper (Modular Multi-Goal Fetch Arm).
A video of the results can be seen here.
To run an experiment, run the training script with the flags described below.
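As a sketch, an invocation could look like the following; the script path is an assumption inferred from the save path at the bottom of this README, while the flags and values are the ones documented below:

```bash
# Hypothetical entry point, inferred from the save path below.
python baselines/her/experiment/train.py \
    --num_cpu 19 \
    --env MultiTaskFetchArm4-v5 \
    --task_selection active_competence_progress \
    --goal_selection random \
    --goal_replay her \
    --task_replay replay_task_cp_buffer \
    --structure curious \
    --trial_id 0
```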
- --num_cpu: number of CPUs. The paper uses 19 CPUs, as in the original paper presenting this HER implementation. Running the code with fewer CPUs for a longer time is NOT equivalent.
- --env: name of the gym_flowers environment. Possible choices are MultiTaskFetchArm4-v5 (4 tasks: Reach, Push, Pick and Place, Stack) and MultiTaskFetchArm8-v5 (the same 4 tasks plus 4 distracting tasks).
- --task_selection: 'active_competence_progress' uses learning progress to guide module selection; 'random' selects modules uniformly at random (a sketch of learning-progress-based selection appears after this list).
- --goal_selection: 'random' is the only option supported here.
- --goal_replay: 'her' uses Hindsight Experience Replay; 'none' disables it (see the relabeling sketch after this list).
- --task_replay: 'replay_task_cp_buffer' uses learning progress to bias sampling across module-specific replay buffers; 'replay_task_random_buffer' samples from the buffer associated with a random module (covered by the sketch after this list).
- --structure: 'curious' uses the CURIOUS algorithm; 'task_experts' uses one UVFA policy per module.
- --trial_id: trial identifier.
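For reference, here is a minimal sketch of how learning-progress-based module selection and buffer sampling could work, covering both --task_selection='active_competence_progress' and --task_replay='replay_task_cp_buffer'. The function names and the epsilon-uniform mixture are illustrative assumptions; the paper's exact competence-progress estimator may differ.

```python
import numpy as np

def lp_probabilities(lp, epsilon=0.1):
    """Turn per-module learning-progress estimates into sampling
    probabilities: proportional to |LP|, mixed with a uniform
    distribution so no module is ever starved (the epsilon mixture
    is an illustrative choice, not necessarily the paper's scheme)."""
    lp = np.abs(np.asarray(lp, dtype=np.float64))
    n = len(lp)
    if lp.sum() == 0.0:
        return np.full(n, 1.0 / n)  # no signal yet: uniform
    return epsilon / n + (1.0 - epsilon) * lp / lp.sum()

def select_module(lp, rng=np.random):
    """Active module selection ('active_competence_progress')."""
    return rng.choice(len(lp), p=lp_probabilities(lp))

def sample_module_buffer(lp, buffers, rng=np.random):
    """Pick which module-specific replay buffer to sample a batch
    from ('replay_task_cp_buffer')."""
    return buffers[rng.choice(len(buffers), p=lp_probabilities(lp))]
```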
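Similarly, a minimal sketch of the hindsight relabeling behind --goal_replay='her', using the 'future' strategy from the HER paper. The transition format, `reward_fn`, and function name are assumptions; the actual implementation operates per module on the modular goal space.

```python
import numpy as np

def her_relabel(episode, reward_fn, rng=np.random):
    """Relabel each transition with a goal achieved later in the same
    episode and recompute its reward ('future' strategy). `episode` is
    assumed to be a list of dicts with keys 'achieved_goal' and 'goal';
    `reward_fn(achieved_goal, goal)` is the environment's reward."""
    relabeled = []
    for t, transition in enumerate(episode):
        future_t = rng.randint(t, len(episode))        # a timestep >= t
        new_goal = episode[future_t]['achieved_goal']  # substituted goal
        relabeled.append({**transition,
                          'goal': new_goal,
                          'reward': reward_fn(transition['achieved_goal'],
                                              new_goal)})
    return relabeled
```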
Results are saved in: /curious/baselines/her/experiment/save/&lt;env_name&gt;/&lt;trial_id&gt;/