Added new samples
sharif1093 committed Mar 20, 2020
1 parent 7842700 commit 3c4693c
Showing 7 changed files with 20 additions and 14 deletions.
22 changes: 13 additions & 9 deletions README.md
@@ -68,21 +68,25 @@ See [usage notes](https://digideep.readthedocs.io/en/latest/notes/02%20Usage.htm

### Sample Results

- Sample results of running `SAC` on the toy environment `Pendulum-v0`:

```bash
+ # Running "SAC" on the default "Pendulum" environment:
python -m digideep.main --params digideep.params.sac_params --tensorboard
+
+ # Running "PPO" on "PongNoFrameskip-v4" environment:
+ python3 -m digideep.main --params digideep.params.atari_ppo --tensorboard
+
+ # Running `PPO` on dm_control's `DMBenchCheetahRun-v0` environment:
+ python3 -m digideep.main --params digideep.params.mujoco_ppo --cpanel '{"model_name":"DMBenchCheetahRun-v0", "from_module":"digideep.environment.dmc2gym"}' --tensorboard
+
```

- <p align="center">
- <img src="./doc/media/sac_pendulum_v0.gif" width="70%">
- </p>

- Also, the average return vs. episode graph (saved from TensorBoard):
-
- <p align="center">
- <img src="./doc/media/sac_pendulum_v0.svg" width="70%">
- </p>
+ | Learning Graph | Trained Policy |
+ :-------------------------:|:-------------------------:
+ <img src="./doc/media/sac_pendulum_v0.svg" width="40%" /> | <img src="./doc/media/sac_pendulum_v0.gif" width="40%" />
+ <img src="./doc/media/ppo_atari_pong.svg" width="40%" /> | <img src="./doc/media/ppo_atari_pong.gif" width="40%" />
+ <img src="./doc/media/ppo_dm_cheetah.svg" width="40%" /> | <img src="./doc/media/ppo_dm_cheetah.gif" width="40%" />


## Changelog
8 changes: 4 additions & 4 deletions digideep/agent/ppo/agent.py
@@ -195,10 +195,10 @@ def step(self):
monitor("/update/action_loss", action_loss.item())
monitor("/update/dist_entropy", dist_entropy.item())

- self.session.writer.add_scalar('loss/overall', Loss.item())
- self.session.writer.add_scalar('loss/value', value_loss.item())
- self.session.writer.add_scalar('loss/action', action_loss.item())
- self.session.writer.add_scalar('loss/dist_entropy', dist_entropy.item())
+ self.session.writer.add_scalar('loss/overall', Loss.item(), self.state["i_step"])
+ self.session.writer.add_scalar('loss/value', value_loss.item(), self.state["i_step"])
+ self.session.writer.add_scalar('loss/action', action_loss.item(), self.state["i_step"])
+ self.session.writer.add_scalar('loss/dist_entropy', dist_entropy.item(), self.state["i_step"])

## Candidates for monitoring
# ratio.item()
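
Note on the `add_scalar` change in this hunk: the new calls pass `self.state["i_step"]` as a third argument. Assuming `self.session.writer` follows the common `SummaryWriter.add_scalar(tag, scalar_value, global_step)` signature, that argument is the x-axis position of the logged point. The sketch below uses a hypothetical stand-in class, `RecordingWriter`, so it runs without TensorBoard installed; treating a missing step as 0 is an assumption of the stand-in, and the loss values are made up for illustration.

```python
from collections import defaultdict


class RecordingWriter:
    """Hypothetical stand-in mimicking add_scalar(tag, scalar_value, global_step)."""

    def __init__(self):
        # Maps each tag to the list of (step, value) points logged under it.
        self.points = defaultdict(list)

    def add_scalar(self, tag, scalar_value, global_step=None):
        # Assumption: a missing step is stored as 0, so all points share one x-position.
        step = 0 if global_step is None else global_step
        self.points[tag].append((step, scalar_value))


losses = [0.9, 0.5, 0.2]  # illustrative values only, not real training results

before = RecordingWriter()
after = RecordingWriter()
state = {"i_step": 0}  # mirrors self.state["i_step"] from the diff

for loss in losses:
    before.add_scalar("loss/overall", loss)                  # old call: no step
    after.add_scalar("loss/overall", loss, state["i_step"])  # new call: explicit step
    state["i_step"] += 1

steps_before = [step for step, _ in before.points["loss/overall"]]
steps_after = [step for step, _ in after.points["loss/overall"]]
print(steps_before)  # every point lands on the same step
print(steps_after)   # one point per training step
```

With an explicit step, each loss value gets its own position on the x-axis, which is what a loss curve needs in order to show progress over training.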
2 changes: 1 addition & 1 deletion digideep/params/mujoco_ppo.py
@@ -48,7 +48,7 @@
# 'HalfCheetah-v2'
# 'DMBenchHumanoidStand-v0' | 'DMBenchCheetahRun-v0' | 'Ant-v2'
cpanel["model_name"] = 'Ant-v2' # MuJoCo Env
- # cpanel["from_module"] = 'digideep.environment.dmc2gym'
+ # cpanel["from_module"] = "digideep.environment.dmc2gym"
cpanel["observation_key"] = "/agent"

# cpanel["model_name"] = 'Pendulum-v0' # Classic Control Env
Binary file added doc/media/ppo_atari_pong.gif
1 change: 1 addition & 0 deletions doc/media/ppo_atari_pong.svg
Binary file added doc/media/ppo_dm_cheetah.gif
1 change: 1 addition & 0 deletions doc/media/ppo_dm_cheetah.svg
