Highly distributed learning: Our RLlib algorithms (such as PPO or IMPALA) allow you to set the num_env_runners config parameter, such that your workloads can run on hundreds of CPUs or nodes, thus parallelizing and speeding up learning.
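
A minimal sketch of scaling out sampling this way; the builder-API method names (PPOConfig, environment(), env_runners()) are assumed from recent RLlib versions and may differ slightly in yours:

.. code-block:: python

    from ray.rllib.algorithms.ppo import PPOConfig

    # Configure PPO to sample with many parallel EnvRunner workers.
    config = (
        PPOConfig()
        .environment("CartPole-v1")
        # Spread environment sampling across many CPUs/nodes.
        .env_runners(num_env_runners=100)
    )

    algo = config.build()
    for _ in range(10):
        print(algo.train())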

Multi-agent RL (MARL): Convert your (custom) gym.Env into a multi-agent one via a few simple steps and start training your agents in any of the following fashions (a minimal config sketch follows the list):
1) Cooperative, with shared or separate policies and/or value functions.
2) Adversarial scenarios using self-play and league-based training.
3) Independent learning of neutral/co-existing agents.
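
A minimal sketch of such a multi-agent setup; "MyMultiAgentEnv", the agent IDs, and the two policy names are illustrative placeholders, and the multi_agent() builder method is assumed from recent RLlib versions:

.. code-block:: python

    from ray.rllib.algorithms.ppo import PPOConfig

    config = (
        PPOConfig()
        # Placeholder: a registered MultiAgentEnv of your own.
        .environment("MyMultiAgentEnv")
        .multi_agent(
            # Two separate policies; agents could also share a single one.
            policies={"policy_1", "policy_2"},
            # Placeholder mapping: send "agent_0" to policy_1, everyone else to policy_2.
            policy_mapping_fn=lambda agent_id, episode, **kwargs: (
                "policy_1" if agent_id == "agent_0" else "policy_2"
            ),
        )
    )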

External simulators: Don't have your simulation running as a gym.Env in Python? No problem! RLlib supports an external environment API and comes with a pluggable, off-the-shelf client/server setup that allows you to run hundreds of independent simulators on the "outside" (e.g. a Windows cloud), all connecting to a central RLlib policy server that learns and serves actions. Alternatively, actions can be computed on the client side to save on network traffic.
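
A rough client-side sketch, assuming RLlib's PolicyClient API; the server address and the my_simulator object standing in for your external (non-gym) simulator are placeholders:

.. code-block:: python

    from ray.rllib.env.policy_client import PolicyClient

    # Connect to the central RLlib policy server.
    # Switching inference_mode to "local" computes actions on the client
    # side instead, saving network traffic.
    client = PolicyClient("http://localhost:9900", inference_mode="remote")

    episode_id = client.start_episode()
    obs = my_simulator.reset()  # placeholder: your external simulator

    done = False
    while not done:
        action = client.get_action(episode_id, obs)
        obs, reward, done, info = my_simulator.step(action)  # placeholder
        client.log_returns(episode_id, reward)

    client.end_episode(episode_id, obs)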

Offline RL and imitation learning/behavior cloning: You don't have a simulator for your particular problem, but lots of historic data recorded by a legacy (maybe non-RL/ML) system? This branch of reinforcement learning is for you! RLlib comes with several offline RL algorithms (CQL, MARWIL, and DQfD), allowing you to either purely behavior-clone your existing system or learn how to further improve over it.
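
A small sketch of learning from such recorded data with MARWIL; the input path is a placeholder, and the role of beta (beta=0.0 reducing MARWIL to pure behavior cloning, beta > 0.0 weighting updates by advantages) is assumed from RLlib's documentation:

.. code-block:: python

    from ray.rllib.algorithms.marwil import MARWILConfig

    config = (
        MARWILConfig()
        .environment("CartPole-v1")
        # Placeholder path to your recorded historic data.
        .offline_data(input_="/path/to/offline/data")
        # beta=0.0 -> pure behavior cloning of the legacy system;
        # beta > 0.0 -> try to improve over it via advantage weighting.
        .training(beta=0.0)
    )

    algo = config.build()
    algo.train()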