# Vectorised Environment
## Overview
In Gym, an environment receives an action and returns the next observation and reward. This process is slow and can sometimes be the throughput bottleneck in a DRL experiment.

Tianshou provides a vectorised environment wrapper for Gym environments. This wrapper lets you use multiple CPU cores on your server to accelerate data sampling.

In [22]:
from tianshou.env import SubprocVectorEnv
import numpy as np
import gymnasium as gym
import time

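num_cpus = [1, 2, 5]
for num_cpu in num_cpus:
    # Each environment runs in its own subprocess
    env = SubprocVectorEnv([lambda: gym.make('CartPole-v0') for _ in range(num_cpu)])
    env.reset()
    sampled_steps = 0
    time_start = time.time()
    while sampled_steps < 1000:
        # Sample one random action (0 or 1) per environment
        act = np.random.choice(2, size=num_cpu)
        obs, rew, terminated, truncated, info = env.step(act)
        # An episode is over when it is either terminated or truncated
        done = np.logical_or(terminated, truncated)
        if np.any(done):
            # Reset only the environments whose episode has ended
            env.reset(np.where(done)[0])
        sampled_steps += num_cpu
    time_used = time.time() - time_start
    env.close()  # shut down the worker processes
    print("{}s used to sample 1000 steps if using {} cpus.".format(time_used, num_cpu))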



0.30324292182922363s used to sample 1000 steps if using 1 cpus.


0.14555811882019043s used to sample 1000 steps if using 2 cpus.


0.1420140266418457s used to sample 1000 steps if using 5 cpus.
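
To read the timings above as speedups, a small calculation helps. The sketch below only reuses the three numbers printed by this run; the dictionary name `timings` is introduced here for illustration, and a single run like this is noisy, so the ratios are indicative at best.

```python
# Wall-clock seconds per 1000 sampled steps, copied from the output above.
timings = {1: 0.30324292182922363, 2: 0.14555811882019043, 5: 0.1420140266418457}

baseline = timings[1]
# Speedup relative to a single worker process.
speedups = {n: baseline / t for n, t in timings.items()}

for n, s in speedups.items():
    print(f"{n} cpus: {s:.2f}x faster than 1 cpu")
```

In this run, 2 and 5 workers both land at roughly a 2.1x speedup. For an environment as cheap to step as CartPole, this is plausibly because the cost of communicating with the worker processes outweighs the simulation itself, so adding workers beyond a point brings little benefit.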



## Usages
### Initialisation
Pass in a list of functions, each of which returns an initialised environment when called.

In [23]:
from tianshou.env import DummyVectorEnv
# in gym
env = gym.make("CartPole-v0")

# in tianshou
def helper_function():
    env = gym.make("CartPole-v0")
    # other operations such as env.seed(np.random.choice(10))
    return env

envs = DummyVectorEnv([helper_function for _ in range(5)])
print(envs)

<tianshou.env.venvs.DummyVectorEnv object at 0x167cd64d0>
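
When each environment needs its own argument (a seed, for instance), building the list of functions with `lambda` inside a comprehension has a well-known Python pitfall: every lambda sees the loop variable's final value. The sketch below uses a hypothetical `make_env` factory that returns a plain dict instead of a real environment, so it runs without Gym; `functools.partial` binds the value at creation time.

```python
from functools import partial


def make_env(seed):
    """Hypothetical factory; a real one would build and seed a Gym environment."""
    return {"seed": seed}


# Pitfall: all lambdas share the same `seed` variable, which ends at 2.
late_bound = [lambda: make_env(seed) for seed in range(3)]

# Fix: partial() captures the current value of `seed` for each function.
bound = [partial(make_env, seed) for seed in range(3)]

print([fn()["seed"] for fn in late_bound])  # [2, 2, 2]
print([fn()["seed"] for fn in bound])       # [0, 1, 2]
```

A list such as `bound` has the same shape as the list of functions passed to `DummyVectorEnv` above: zero-argument callables that each return a fresh environment.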


### Environment execution and resetting
The only difference between a vectorised environment and a standard Gym environment is that the actions passed in and the rewards/observations returned are vectorised as well, with one entry per environment.

In [24]:
# In gym, env.reset() returns a single observation
print(env.reset())

# In Tianshou, envs.reset() returns stacked observations
print("========================================")
print(envs.reset())

# Pass one action per environment; len(envs) is the number of environments
obs, rew, terminated, truncated, info = envs.step(np.random.choice(2, size=len(envs)))
print(info)

(array([ 0.02075961, -0.01380869, -0.03240999,  0.01060295], dtype=float32), {})
(array([[-0.01293166,  0.02808111,  0.02603966,  0.00533931],
       [-0.04412643, -0.0465113 ,  0.03728771, -0.01175328],
       [ 0.04276416, -0.00653805, -0.04383047, -0.0284849 ],
       [-0.04929027, -0.02613566, -0.02086342,  0.00562016],
       [ 0.04259569, -0.04781166, -0.01953969,  0.03225222]],
      dtype=float32), [{}, {}, {}, {}, {}])
[{'env_id': 0} {'env_id': 1} {'env_id': 2} {'env_id': 3} {'env_id': 4}]
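
To make the vectorised interface concrete, here is a rough, pure-Python illustration of the idea: step every environment in turn and collect the results per field. `ToyEnv` and `ToyVectorEnv` are hypothetical names invented for this sketch; it is not how Tianshou implements its vectorised environments, and it returns plain lists rather than stacked NumPy arrays.

```python
class ToyEnv:
    """Hypothetical stand-in for a Gym environment: the observation is a step counter."""

    def __init__(self, horizon=3):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        terminated = self.t >= self.horizon
        # Same 5-tuple layout as a Gym step: obs, reward, terminated, truncated, info
        return self.t, 1.0, terminated, False, {}


class ToyVectorEnv:
    """Steps several environments sequentially and groups the results by field."""

    def __init__(self, env_fns):
        self.envs = [fn() for fn in env_fns]

    def __len__(self):
        return len(self.envs)

    def reset(self):
        obs, infos = zip(*(env.reset() for env in self.envs))
        return list(obs), list(infos)

    def step(self, actions):
        assert len(actions) == len(self.envs), "one action per environment"
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        obs, rew, terminated, truncated, infos = (list(x) for x in zip(*results))
        # Tag each info dict with the index of the environment it came from.
        infos = [{"env_id": i, **info} for i, info in enumerate(infos)]
        return obs, rew, terminated, truncated, infos


venv = ToyVectorEnv([ToyEnv for _ in range(5)])
print(venv.reset())
print(venv.step([0, 1, 0, 1, 0]))
```

Each returned field has one entry per environment, in the same order as the actions, which mirrors the stacked observations and per-environment `info` dicts in the output above.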


To step only a subset of the environments, use the `id` argument.

In [25]:
print(envs.step(np.random.choice(2, size=3), id=[0, 3, 1]))

(array([[-0.00791364,  0.02733513,  0.02056614,  0.02179806],
       [-0.05423203, -0.0255407 , -0.01491806, -0.00750654],
       [-0.04989961, -0.43777773,  0.04290179,  0.596592  ]],
      dtype=float32), array([1., 1., 1.]), array([False, False, False]), array([False, False, False]), array([{'env_id': 0}, {'env_id': 3}, {'env_id': 1}], dtype=object))
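
Note that the results above come back in the order the ids were given (0, 3, 1), not sorted by environment index. If you keep a per-environment buffer, one way to write partial results back is to index by the `env_id` stored in each info dict. The sketch below uses plain lists with values copied from the output above; `latest_rew` is a hypothetical buffer name.

```python
num_envs = 5

# Rewards and infos as returned by the partial step above (order: env 0, 3, 1).
rew = [1.0, 1.0, 1.0]
info = [{"env_id": 0}, {"env_id": 3}, {"env_id": 1}]

# One slot per environment; None marks environments that were not stepped.
latest_rew = [None] * num_envs
for r, i in zip(rew, info):
    latest_rew[i["env_id"]] = r

print(latest_rew)  # [1.0, 1.0, None, 1.0, None]
```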
