Soft Actor Critic in PyTorch

A relatively minimal, from-scratch implementation of Soft Actor-Critic (SAC) in PyTorch. It uses a numerically stable tanh transformation for action sampling and log-prob calculation.
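As a rough illustration of what "numerically stable" can mean here, the sketch below samples a tanh-squashed Gaussian action and computes its log-prob using scalar Python floats. It is not taken from sac.py: the function names are hypothetical, and the identity log(1 - tanh(u)^2) = 2 * (log 2 - u - softplus(-2u)) is one common stable formulation that this README does not confirm the repo uses.

```python
import math
import random


def softplus(x: float) -> float:
    # Stable log(1 + exp(x)): only exponentiates a non-positive number.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_one_minus_tanh_sq(u: float) -> float:
    # log(1 - tanh(u)^2), rewritten so it stays finite for large |u|.
    return 2.0 * (math.log(2.0) - u - softplus(-2.0 * u))


def gaussian_log_prob(u: float, mean: float, log_std: float) -> float:
    # Log-density of N(mean, exp(log_std)^2) evaluated at u.
    z = (u - mean) / math.exp(log_std)
    return -0.5 * z * z - log_std - 0.5 * math.log(2.0 * math.pi)


def sample_tanh_gaussian(mean: float, log_std: float, rng: random.Random):
    # Reparameterized sample u ~ N(mean, std^2), then squash into (-1, 1).
    u = mean + math.exp(log_std) * rng.gauss(0.0, 1.0)
    action = math.tanh(u)
    # Change of variables: subtract the log-derivative of tanh at u.
    log_prob = gaussian_log_prob(u, mean, log_std) - log_one_minus_tanh_sq(u)
    return action, log_prob
```

The naive expression math.log(1 - math.tanh(u) ** 2) fails once tanh(u) rounds to 1.0 (for example at u = 20), whereas the rewritten form stays finite. A tensor version would apply the same formula elementwise and sum over action dimensions.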

Quick Start

Simply run:

python train_agent.py

to train with the default args. The changeable args are:

--env: String, environment name (Default: HalfCheetah-v2)
--seed: Int, random seed (Default: 100)
--use_obs_filter: Boolean flag, enabled when passed (seems to degrade performance)
--update_every_n_steps: Int, number of env steps taken before optimizing the agent (Default: 1; the ratio of env steps to backprop updates is tied to 1:1)
--n_random_actions: Int, number of random steps taken to 'seed' the replay pool (Default: 10000)
--n_collect_steps: Int, number of steps collected before training starts (Default: 1000)
--n_evals: Int, number of episodes run per evaluation (Default: 1)
--save_model: Boolean flag, enabled when passed (saves the model whenever GIFs are made; loading and running a saved model is left as an exercise for the reader, or until I get around to it)
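For orientation, the argument list above could map onto an argparse parser roughly as follows. This is a sketch reconstructed from the README only, not the actual contents of train_agent.py; build_parser is a hypothetical name, and Hopper-v2 is used purely as an example of a non-default environment.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the args and defaults listed in this README.
    parser = argparse.ArgumentParser(description="Train a SAC agent")
    parser.add_argument("--env", type=str, default="HalfCheetah-v2")
    parser.add_argument("--seed", type=int, default=100)
    # store_true flags are False unless passed on the command line.
    parser.add_argument("--use_obs_filter", action="store_true")
    parser.add_argument("--update_every_n_steps", type=int, default=1)
    parser.add_argument("--n_random_actions", type=int, default=10000)
    parser.add_argument("--n_collect_steps", type=int, default=1000)
    parser.add_argument("--n_evals", type=int, default=1)
    parser.add_argument("--save_model", action="store_true")
    return parser


# Equivalent to: python train_agent.py --env Hopper-v2 --seed 0 --save_model
args = build_parser().parse_args(["--env", "Hopper-v2", "--seed", "0", "--save_model"])
```

Any argument not given on the command line keeps its listed default, so this example still uses 10000 random actions and 1000 collect steps.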

Results

Single seed runs (smoothed)

(Table of per-environment results with two columns, Graph and Gif; images not reproduced here.)


Repository: philipjball/SAC_PyTorch
