Big documentation update
alex-petrenko committed Nov 25, 2022
1 parent 56620aa commit da78c76
Showing 41 changed files with 1,583 additions and 709 deletions.
4 changes: 2 additions & 2 deletions Makefile
@@ -66,7 +66,7 @@ test-cov-core:


# docs
-.PHONY: docs
+.PHONY: docs-serve

-docs:
+docs-serve:
bash ./docs/cfg-params.sh && mkdocs serve
8 changes: 5 additions & 3 deletions README.md
@@ -29,10 +29,12 @@ High-throughput reinforcement learning codebase.

### What is Sample Factory?

-Sample Factory is one of the fastest RL libraries. Instead of implementing multiple algorithm families,
-we focused on very efficient synchronous and asynchronous implementations of policy gradients (PPO).
+Sample Factory is one of the fastest RL libraries.
+We focused on very efficient synchronous and asynchronous implementations of policy gradients (PPO).

-Below are ViZDoom, IsaacGym, DMLab-30, Megaverse, Mujoco, and Atari agents trained with Sample Factory:
+Sample Factory is thoroughly tested, used by a number of researchers and practitioners, and is actively maintained.
+Our implementation is known to reach SOTA performance in a variety of domains in a short amount of time.
+Clips below demonstrate ViZDoom, IsaacGym, DMLab-30, Megaverse, Mujoco, and Atari agents trained with Sample Factory:

<p align="middle">
<img src="https://github.com/alex-petrenko/sf_assets/blob/main/gifs/vizdoom.gif?raw=true" width="360" alt="VizDoom agents trained using Sample Factory 2.0">
36 changes: 36 additions & 0 deletions docs/01-get-started/basic-usage.md
@@ -0,0 +1,36 @@
# Basic Usage

## Usage examples

Use the command line to train an agent with one of the existing integrations, e.g. Mujoco (you might need to run `pip install sample-factory[mujoco]` first):

```bash
python -m sf_examples.mujoco.train_mujoco --env=mujoco_ant --experiment=Ant --train_dir=./train_dir
```

Stop the experiment when the desired performance is reached and then evaluate the agent:

```bash
python -m sf_examples.mujoco.enjoy_mujoco --env=mujoco_ant --experiment=Ant --train_dir=./train_dir
```
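The evaluation command above reuses the `--env`, `--experiment`, and `--train_dir` values of the training command. A minimal sketch of keeping the two in sync by building both command lines from one set of values; `build_commands` is a hypothetical helper written for this illustration (it is not part of Sample Factory), and it only uses the module names and flags shown above:

```python
import shlex


def build_commands(train_module, enjoy_module, env, experiment, train_dir="./train_dir"):
    """Return (train_cmd, enjoy_cmd) strings that share the same experiment settings.

    Hypothetical helper for illustration only; not part of Sample Factory.
    """
    # Flags that must match between training and evaluation.
    shared = [f"--env={env}", f"--experiment={experiment}", f"--train_dir={train_dir}"]
    train_cmd = shlex.join(["python", "-m", train_module, *shared])
    enjoy_cmd = shlex.join(["python", "-m", enjoy_module, *shared])
    return train_cmd, enjoy_cmd


train_cmd, enjoy_cmd = build_commands(
    "sf_examples.mujoco.train_mujoco",
    "sf_examples.mujoco.enjoy_mujoco",
    env="mujoco_ant",
    experiment="Ant",
)
print(train_cmd)
print(enjoy_cmd)
```

The printed strings can be pasted into a shell, or split with `shlex.split` and passed to `subprocess.run`.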

Do the same in a pixel-based ViZDoom environment (you might need to run `pip install sample-factory[vizdoom]` first; see the docs for ViZDoom-specific instructions):

```bash
python -m sf_examples.vizdoom.train_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=./train_dir --num_workers=16 --num_envs_per_worker=10 --train_for_env_steps=1000000
python -m sf_examples.vizdoom.enjoy_vizdoom --env=doom_basic --experiment=DoomBasic --train_dir=./train_dir
```
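For a rough sense of scale in the training command above: assuming every worker process steps `--num_envs_per_worker` environments, the run samples from 16 × 10 = 160 environments in parallel. The sketch below spells out that arithmetic. The function name is illustrative only, and the per-environment share of the step budget is an approximation that assumes an even split and ignores details such as frame skipping:

```python
def total_parallel_envs(num_workers: int, num_envs_per_worker: int) -> int:
    """Number of environments sampled in parallel (assumes one env group per worker)."""
    return num_workers * num_envs_per_worker


# Values taken from the doom_basic command line above.
total_envs = total_parallel_envs(num_workers=16, num_envs_per_worker=10)

# Rough share of --train_for_env_steps per environment, assuming an even split.
steps_share = 1_000_000 // total_envs

print(total_envs)   # 160
print(steps_share)  # 6250
```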

## Monitoring experiments

Monitor any running or completed experiment with TensorBoard:

```bash
tensorboard --logdir=./train_dir
```
(or see the docs for WandB integration).

## Next steps

* Read more about configuring experiments in the [Configuration](../02-configuration/configuration.md) guide.
* Follow the instructions in the [Customizing](../03-customization/custom-environments.md) guide to train an agent in your own environment.
32 changes: 32 additions & 0 deletions docs/01-get-started/installation.md
@@ -0,0 +1,32 @@
# Installation

Just install from PyPI:

```bash
pip install sample-factory
```

SF is known to work on Linux and macOS. There is no Windows support at this time.

## Install from sources

```bash
git clone git@github.com:alex-petrenko/sample-factory.git
cd sample-factory
pip install -e .

# or install with optional dependencies
# (the quotes keep shells such as zsh from expanding the square brackets)
pip install -e ".[dev,mujoco,atari,vizdoom]"
```
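After either install route, a quick way to confirm that the packages are visible to the current Python interpreter is to look them up without importing them. This is a minimal sketch: `is_installed` is a hypothetical helper, `sf_examples` is the package name used by the example commands in these docs, and `sample_factory` is assumed to be the import name of the core package:

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


# "sample_factory" is an assumed import name; "sf_examples" appears in the docs' commands.
for name in ("sample_factory", "sf_examples"):
    print(name, "found" if is_installed(name) else "missing")
```

If either name is reported as missing, check that `pip` and `python` refer to the same environment.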

## Environment support

To run Sample Factory with one of the available environment integrations, please refer to the corresponding documentation sections:

- [Mujoco](../09-environment-integrations/mujoco.md)
- [Atari](../09-environment-integrations/atari.md)
- [ViZDoom](../09-environment-integrations/vizdoom.md)
- [DeepMind Lab](../09-environment-integrations/dmlab.md)
- [Megaverse](../09-environment-integrations/megaverse.md)
- [Envpool](../09-environment-integrations/envpool.md)
- [Isaac Gym](../09-environment-integrations/isaacgym.md)

Sample Factory allows users to easily add custom environments and models. Refer to [Customizing Sample Factory](../03-customization/custom-environments.md) for more information.
49 changes: 49 additions & 0 deletions docs/01-get-started/running-experiments.md
@@ -0,0 +1,49 @@
[//]: # (# Running Experiments)

[//]: # ()
[//]: # (Here we provide command lines that can be used to reproduce the experiments from the paper, which also serve as an example on how to configure large-scale RL experiments.)

[//]: # ()
[//]: # ()
[//]: # (#### DMLab)

[//]: # (##### DMLab level cache)

[//]: # ()
[//]: # (Note `--dmlab_level_cache_path` parameter. This location will be used for level layout cache.)

[//]: # (Subsequent DMLab experiments on envs that require level generation will become faster since environment files from)

[//]: # (previous runs can be reused.)

[//]: # ()
[//]: # (Generating environment levels for the first time can be really slow, especially for the full multi-task)

[//]: # (benchmark like DMLab-30. On 36-core server generating enough environments for a 10B training session can take up to)

[//]: # (a week. We provide a dataset of pre-generated levels to make training on DMLab-30 easier.)

[//]: # ([Download here]&#40;https://drive.google.com/file/d/17JCp3DbuiqcfO9I_yLjbBP4a7N7Q4c2v/view?usp=sharing&#41;.)

[//]: # ()


[//]: # (### Caveats)

[//]: # ()
[//]: # (- Multiplayer VizDoom environments can freeze your console sometimes, simple `reset` takes care of this)

[//]: # (- Sometimes VizDoom instances don't clear their internal shared memory buffers used to communicate between Python and)

[//]: # (a Doom executable. The file descriptors for these buffers tend to pile up. `rm /dev/shm/ViZDoom*` will take care of this issue.)

[//]: # (- It's best to use the standard `--fps=35` to visualize VizDoom results. `--fps=0` enables)

[//]: # (Async execution mode for the Doom environments, although the results are not always reproducible between sync and async modes.)

[//]: # (- Multiplayer VizDoom environments are significantly slower than single-player envs because actual network)

[//]: # (communication between the environment instances is required which results in a lot of syscalls.)

[//]: # (For prototyping and testing consider single-player environments with bots instead.)
