Big documentation update

alex-petrenko · Nov 25, 2022 · da78c76
1 parent 56620aa
commit da78c76
Showing 41 changed files with 1,583 additions and 709 deletions.
diff --git a/Makefile b/Makefile
@@ -66,7 +66,7 @@ test-cov-core:
 
 
 # docs
-.PHONY: docs
+.PHONY: docs-serve
 
-docs:
+docs-serve:
 	bash ./docs/cfg-params.sh && mkdocs serve
diff --git a/README.md b/README.md
@@ -29,10 +29,12 @@ High-throughput reinforcement learning codebase.
 
 ### What is Sample Factory?
 
-Sample Factory is one of the fastest RL libraries. Instead of implementing multiple algorithm families,
-we focused on very efficient synchronous and asynchronous implementations of policy gradients (PPO). 
+Sample Factory is one of the fastest RL libraries.
+We focused on very efficient synchronous and asynchronous implementations of policy gradients (PPO). 
 
-Below are ViZDoom, IsaacGym, DMLab-30, Megaverse, Mujoco, and Atari agents trained with Sample Factory:
+Sample Factory is thoroughly tested, used by a number of researchers and practitioners, and is actively maintained.
+Our implementation is known to reach SOTA performance in a variety of domains in a short amount of time.
+The clips below show ViZDoom, IsaacGym, DMLab-30, Megaverse, Mujoco, and Atari agents trained with Sample Factory:
 
 <p align="middle">
 <img src="https://github.com/alex-petrenko/sf_assets/blob/main/gifs/vizdoom.gif?raw=true" width="360" alt="VizDoom agents trained using Sample Factory 2.0">

diff --git a/docs/01-get-started/basic-usage.md b/docs/01-get-started/basic-usage.md
@@ -0,0 +1,36 @@
+# Basic Usage
+
+## Usage examples
+
+Use the command line to train an agent with one of the existing integrations, e.g. Mujoco (you may first need to run `pip install "sample-factory[mujoco]"`):
+
+```bash
+python -m sf_examples.mujoco.train_mujoco --env=mujoco_ant --experiment=Ant --train_dir=./train_dir
+```
+
+Stop the experiment when the desired performance is reached and then evaluate the agent:
+
+```bash
+python -m sf_examples.mujoco.enjoy_mujoco --env=mujoco_ant --experiment=Ant --train_dir=./train_dir
+```
+
+Do the same in a pixel-based VizDoom environment (you may first need to run `pip install "sample-factory[vizdoom]"`; see the docs for VizDoom-specific instructions):
+
+```bash
+python -m sf_examples.vizdoom.train_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=./train_dir --num_workers=16 --num_envs_per_worker=10 --train_for_env_steps=1000000
+python -m sf_examples.vizdoom.enjoy_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=./train_dir
+```
+
+## Monitoring experiments
+
+Monitor any running or completed experiment with Tensorboard:
+
+```bash
+tensorboard --logdir=./train_dir
+```
+Alternatively, see the docs for the WandB integration.
+
+## Next steps
+
+* Read more about configuring experiments in the [Configuration](../02-configuration/configuration.md) guide.
+* Follow the instructions in the [Customizing](../03-customization/custom-environments.md) guide to train an agent in your own environment.
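
Review note on `basic-usage.md`: the training and evaluation commands above pass the same `--env`, `--experiment`, and `--train_dir` values. A minimal sketch of keeping those values in one place when scripting both runs is shown below. The helper `build_command` and the `shared` dictionary are hypothetical names introduced for illustration and are not part of Sample Factory; only the module names and flags are taken from the commands in this file.

```python
import shlex


def build_command(module, **flags):
    """Return an argv list such as ['python', '-m', module, '--env=...']."""
    argv = ["python", "-m", module]
    # Render each flag in the --name=value form used in the docs above.
    argv += [f"--{name}={value}" for name, value in flags.items()]
    return argv


# Values that the docs pass identically to the train and enjoy scripts.
shared = {"env": "mujoco_ant", "experiment": "Ant", "train_dir": "./train_dir"}

train_cmd = build_command("sf_examples.mujoco.train_mujoco", **shared)
enjoy_cmd = build_command("sf_examples.mujoco.enjoy_mujoco", **shared)

# Print shell-ready command lines; nothing is executed here.
print(shlex.join(train_cmd))
print(shlex.join(enjoy_cmd))
```

The resulting lists can be handed to `subprocess.run` as they are, so the two invocations stay in sync without copying flag values by hand.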
diff --git a/docs/01-get-started/installation.md b/docs/01-get-started/installation.md
@@ -0,0 +1,32 @@
+# Installation
+
+Just install from PyPI:
+
+```pip install sample-factory```
+
+SF is known to work on Linux and macOS. There is no Windows support at this time.
+
+## Install from sources
+
+```bash
+git clone git@github.com:alex-petrenko/sample-factory.git
+cd sample-factory
+pip install -e .
+
+# or install with optional dependencies (quoted so that shells such as zsh do not expand the brackets)
+pip install -e '.[dev,mujoco,atari,vizdoom]'
+```
+
+## Environment support
+
+To run Sample Factory with one of the available environment integrations, please refer to the corresponding documentation sections: 
+
+- [Mujoco](../09-environment-integrations/mujoco.md)
+- [Atari](../09-environment-integrations/atari.md)
+- [ViZDoom](../09-environment-integrations/vizdoom.md)
+- [DeepMind Lab](../09-environment-integrations/dmlab.md)
+- [Megaverse](../09-environment-integrations/megaverse.md)
+- [Envpool](../09-environment-integrations/envpool.md)
+- [Isaac Gym](../09-environment-integrations/isaacgym.md)
+
+Sample Factory lets users easily add custom environments and models; refer to [Customizing Sample Factory](../03-customization/custom-environments.md) for more information.
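
Review note on `installation.md`: the page states that SF works on Linux and macOS and lists optional integrations. A rough sketch of a post-install sanity check is shown below. The helpers `platform_supported` and `missing_modules` are hypothetical, and the import names `sample_factory` and `sf_examples` are assumptions inferred from the package name and the `python -m sf_examples...` commands in these docs, so adjust them to whatever your installation actually provides.

```python
import importlib.util
import sys


def platform_supported(platform=sys.platform):
    """True on Linux and macOS, the platforms the installation page lists."""
    return platform.startswith("linux") or platform == "darwin"


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; they are not confirmed by the installation page.
print("platform supported:", platform_supported())
print("missing modules:", missing_modules(["sample_factory", "sf_examples"]))
```

`importlib.util.find_spec` only locates a module without importing it, so the check stays cheap even when heavy dependencies are installed.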
diff --git a/docs/01-get-started/running-experiments.md b/docs/01-get-started/running-experiments.md
@@ -0,0 +1,49 @@
+[//]: # (# Running Experiments)
+
+[//]: # ()
+[//]: # (Here we provide command lines that can be used to reproduce the experiments from the paper, which also serve as an example of how to configure large-scale RL experiments.)
+
+[//]: # ()
+[//]: # ()
+[//]: # (#### DMLab)
+
+[//]: # (##### DMLab level cache)
+
+[//]: # ()
+[//]: # (Note the `--dmlab_level_cache_path` parameter. This location will be used for the level layout cache.)
+
+[//]: # (Subsequent DMLab experiments on envs that require level generation will become faster since environment files from)
+
+[//]: # (previous runs can be reused.)
+
+[//]: # ()
+[//]: # (Generating environment levels for the first time can be really slow, especially for the full multi-task)
+
+[//]: # (benchmark like DMLab-30. On a 36-core server, generating enough environments for a 10B training session can take up to)
+
+[//]: # (a week. We provide a dataset of pre-generated levels to make training on DMLab-30 easier.)
+
+[//]: # ([Download here]&#40;https://drive.google.com/file/d/17JCp3DbuiqcfO9I_yLjbBP4a7N7Q4c2v/view?usp=sharing&#41;.)
+
+[//]: # ()
+
+
+[//]: # (### Caveats)
+
+[//]: # ()
+[//]: # (- Multiplayer VizDoom environments can sometimes freeze your console; a simple `reset` takes care of this.)
+
+[//]: # (- Sometimes VizDoom instances don't clear their internal shared memory buffers used to communicate between Python and)
+
+[//]: # (a Doom executable. The file descriptors for these buffers tend to pile up. `rm /dev/shm/ViZDoom*` will take care of this issue.)
+
+[//]: # (- It's best to use the standard `--fps=35` to visualize VizDoom results. `--fps=0` enables)
+
+[//]: # (Async execution mode for the Doom environments, although the results are not always reproducible between sync and async modes.)
+
+[//]: # (- Multiplayer VizDoom environments are significantly slower than single-player envs because actual network)
+
+[//]: # (communication between the environment instances is required which results in a lot of syscalls.)
+
+[//]: # (For prototyping and testing consider single-player environments with bots instead.)
+