## Multiprocessing Demo


[Vectorized Environments](https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html) are a method for stacking multiple independent environments into a single environment. Instead of training an RL agent on 1 environment per step, they let us train it on n environments per step. This provides two benefits:
* Agent experience can be collected more quickly
* The experience will contain a more diverse range of states, which usually improves exploration
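
To make the stacking idea concrete, here is a minimal pure-Python sketch. It is not how Stable-Baselines3 implements vectorized environments: `ToyEnv` and `ToyVecEnv` are hypothetical names invented for this illustration, and the random-walk dynamics are an assumption chosen only to keep the example short.

```python
import random


class ToyEnv:
    """Hypothetical 1-D random-walk environment (not part of SB3)."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.state = 0.0

    def reset(self):
        # Each copy starts from its own random state.
        self.state = self.rng.uniform(-1.0, 1.0)
        return self.state

    def step(self, action):
        # Move by the action; the reward is higher the closer we stay to 0.
        self.state += action
        reward = -abs(self.state)
        done = abs(self.state) > 2.0
        return self.state, reward, done


class ToyVecEnv:
    """Steps n independent ToyEnv copies one after another in one process."""

    def __init__(self, n_envs):
        self.envs = [ToyEnv(seed=i) for i in range(n_envs)]
        self.num_envs = n_envs

    def reset(self):
        return [env.reset() for env in self.envs]

    def step(self, actions):
        observations, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            obs, reward, done = env.step(action)
            if done:
                # Reset finished environments so the batch always stays full.
                obs = env.reset()
            observations.append(obs)
            rewards.append(reward)
            dones.append(done)
        return observations, rewards, dones


vec_env = ToyVecEnv(n_envs=4)
first_obs = vec_env.reset()
obs, rewards, dones = vec_env.step([0.1] * vec_env.num_envs)
print(len(obs), "transitions collected from one step() call")
```

A single `step()` call returns one transition per environment, and because every copy starts from a different state, the batch covers more of the state space than a single environment would.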

Stable-Baselines3 provides two types of vectorized environment:
- `SubprocVecEnv`, which runs each environment in a separate process
- `DummyVecEnv`, which runs all environments in the same process

In practice, `DummyVecEnv` is usually faster than `SubprocVecEnv` for lightweight environments, because the cost of communicating between processes outweighs the time saved by stepping the environments in parallel.

In [None]:
import time

import gymnasium as gym  # needed for gym.make() below
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env

In [None]:
env = gym.make("Pendulum-v1")
n_steps = 1024

In [None]:
start_time_one_env = time.time()
model = PPO("MlpPolicy", env, n_epochs=1, n_steps=n_steps, verbose=1).learn(int(2e4))
time_one_env = time.time() - start_time_one_env

In [None]:
print(f"Took {time_one_env:.2f}s")

Took 20.17s


In [None]:
start_time_vec_env = time.time()
# Create 16 environments
vec_env = make_vec_env("Pendulum-v1", n_envs=16)
# Each call to `vec_env.step()` collects 16 transitions (one per environment),
# so we divide n_steps by 16 to keep the rollout size the same for a fair comparison
model = PPO("MlpPolicy", vec_env, n_epochs=1, n_steps=n_steps // 16, verbose=1).learn(int(2e4))

time_vec_env = time.time() - start_time_vec_env

In [None]:
print(f"Took {time_vec_env:.2f}s")

Took 5.01s


Note: the speedup is not linear (about 4x with 16 environments in the run above), but it is already significant.
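
The arithmetic behind the fair comparison and the reported speedup can be checked with a short sketch. It assumes that PPO collects `n_steps` transitions from each environment before every update, so the rollout size is `n_steps * n_envs`; the timings are the ones printed above, and the variable names are chosen for this example only.

```python
import math

n_envs = 16
n_steps_single = 1024                   # n_steps used with one environment
n_steps_vec = n_steps_single // n_envs  # n_steps passed to PPO with 16 environments
total_timesteps = int(2e4)

# PPO gathers n_steps transitions from each environment before every update.
rollout_single = n_steps_single * 1
rollout_vec = n_steps_vec * n_envs
print(rollout_single, rollout_vec)  # → 1024 1024

# Number of rollouts needed to reach the requested number of timesteps.
n_updates = math.ceil(total_timesteps / rollout_vec)
print(n_updates)  # → 20

# Speedup measured in the two runs above.
speedup = 20.17 / 5.01
print(f"{speedup:.1f}x faster, {speedup / n_envs:.0%} of a linear speedup")
# → 4.0x faster, 25% of a linear speedup
```

Both runs therefore perform the same number of updates on rollouts of the same size; only the time spent collecting each rollout differs.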