This is a vectorized version of the environment. How does it work? Simply put, instead of running a single environment, we run several of them as a batch (vectorization). At each time step, instead of choosing a single action $a$, we define a vector $a = [a_0, \dots, a_{n_\text{envs}-1}]$ where each entry is the action to perform in the corresponding environment.

Thus, if you run 4 simultaneous environments, your observation space becomes (4, num_observations). Because we use a step size of 10 and have a total of 9 sensors, each step yields 4 × 10 = 40 rows of 9 values, i.e. an array of shape (40, 9). The core idea is to speed up inference and training by querying several environments at once instead of a single one.
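To make the shape arithmetic concrete, here is a minimal sketch on synthetic data. It assumes the 40 rows are ordered environment by environment (the 10 steps of environment 0 first, then environment 1, and so on); that ordering is an assumption to check against the client, not something this notebook confirms.

```python
import numpy as np

num_envs, step_size, n_sensors = 4, 10, 9

# Synthetic stand-in for one batch: 4 * 10 = 40 rows of 9 sensor values.
flat = np.arange(num_envs * step_size * n_sensors, dtype=np.float32).reshape(
    num_envs * step_size, n_sensors
)

# ASSUMPTION: rows are grouped per environment, so a plain reshape
# recovers one (step_size, n_sensors) window per environment.
per_env = flat.reshape(num_envs, step_size, n_sensors)

print(flat.shape)     # (40, 9)
print(per_env.shape)  # (4, 10, 9)
```

If the rows turn out to be ordered step by step instead, a `reshape(step_size, num_envs, n_sensors)` followed by `swapaxes(0, 1)` would be the equivalent regrouping.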

In our implementation, whatever model you choose, it should take the observations as input (reshaped to your liking) and predict the action vector $a$ from your policy.
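As a rough illustration of that contract, the sketch below maps a batch of observations to one integer action per environment. The linear model, its random weights, and the name `predict_actions` are all hypothetical placeholders; the choice of 3 discrete actions is inferred from the random-action example in this notebook (`np.random.randint(0, 3, ...)`) and may not match the real action space.

```python
import numpy as np

rng = np.random.default_rng(0)
num_envs, n_sensors = 4, 9
n_actions = 3  # ASSUMPTION: actions are the integers 0, 1, 2

# Hypothetical, untrained linear policy: one score per action.
weights = rng.normal(size=(n_sensors, n_actions))

def predict_actions(observations: np.ndarray) -> np.ndarray:
    """Map a (num_envs, n_sensors) batch to one action index per environment."""
    scores = observations @ weights  # shape (num_envs, n_actions)
    return scores.argmax(axis=1)     # greedy choice, shape (num_envs,)

# Synthetic observations standing in for what the environment returns.
observations = rng.normal(size=(num_envs, n_sensors))
actions = predict_actions(observations)
print(actions.shape)  # (4,)
```

Whatever model replaces the linear layer, the output should keep this shape: one entry per environment, in the same order as the observations.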

In [1]:
import numpy as np
from student_client.student_gym_env_vectorized import create_student_gym_env_vectorized

env = create_student_gym_env_vectorized(
            num_envs=4,
            step_size=10,
            user_token='vec1'
        )

SERVER_URL=http://rlchallenge.orailix.com
USER_TOKEN=student_user
ENV_TYPE=DegradationEnv
MAX_STEPS_PER_EPISODE=1000
AUTO_RESET=True
TIMEOUT=30.0
2026-02-11 14:16:37,164 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/session/create "HTTP/1.1 200 OK"
2026-02-11 14:16:37,165 - student_client.student_gym_env_vectorized - INFO - Created new session: 1c3db4e6-19ce-4284-b5d1-ac3d0d2afc99
2026-02-11 14:16:38,070 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/create "HTTP/1.1 200 OK"
2026-02-11 14:16:38,073 - student_client.student_gym_env_vectorized - INFO - Created new episode 1/4: 8872c56e-f514-4657-8cfc-756771c158db
2026-02-11 14:16:39,093 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/create "HTTP/1.1 200 OK"
2026-02-11 14:16:39,093 - student_client.student_gym_env_vectorized - INFO - Created new episode 2/4: 31f25a3e-21d9-41c3-ab9c-cbf6026d38ce
2026-02-11 14:16:40,016 - httpx - INFO - HTTP Re

In [2]:
print(f"Environment created with {env.num_envs} parallel environments")
print(f"   Episode IDs: {env.episode_ids}")

# Reset all environments
print(f"\nüîÑ Resetting all environments...")
observations, infos = env.reset()

print(f"   Observations shape: {observations.shape}")
print(f"   First observation: {observations[0]}")

Environment created with 4 parallel environments
   Episode IDs: ['8872c56e-f514-4657-8cfc-756771c158db', '31f25a3e-21d9-41c3-ab9c-cbf6026d38ce', '0428ffa9-47fc-4aaf-bbb8-8d04d581710e', '045473e8-609d-44ca-ac99-9a4de856fe59']

🔄 Resetting all environments...


2026-02-11 14:16:44,282 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"
2026-02-11 14:16:44,284 - student_client.student_gym_env_vectorized - INFO - All 4 environments reset successfully


   Observations shape: (4, 9)
   First observation: [7.9713269e+02 1.9374654e+04 3.3586411e+02 1.1202363e+03 3.7212172e-01
 1.3696256e+06 3.9583135e+03 0.0000000e+00 9.9665842e+00]


## Training / Iterations

Here you can iterate through the vectorized environments. Note that the actions form a vector where each entry corresponds to the environment at the same index.

In A), we automatically reset the environments that have terminated, so you can keep stepping for an indefinite number of steps. Because episodes don't all have the same length, they stop at different times; resetting terminated episodes on the fly keeps every slot in the batch active.
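If your policy reads the observations (rather than acting randomly), the rows belonging to freshly reset environments probably need to be replaced before the next prediction. The helper below is a hypothetical sketch on synthetic arrays; it assumes the reset call returns one row per reset environment, in the same order as the list of terminated indices, and that the batch holds one row per environment.

```python
import numpy as np

def merge_reset_observations(observations, reset_obs, terminated_envs):
    """Return a copy of the batch with fresh rows for the reset environments."""
    merged = np.array(observations, copy=True)
    # ASSUMPTION: reset_obs[i] belongs to environment terminated_envs[i].
    for i, env_id in enumerate(terminated_envs):
        merged[env_id] = reset_obs[i]
    return merged

# Synthetic example: environments 1 and 2 terminated and were reset.
observations = np.zeros((4, 9), dtype=np.float32)
reset_obs = np.ones((2, 9), dtype=np.float32)
merged = merge_reset_observations(observations, reset_obs, [1, 2])
print(merged[:, 0])  # [0. 1. 1. 0.]
```

In the loop, this would be applied to the output of `env.reset_specific_envs(terminated_envs)` right after the reset, before computing the next actions.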

Tips:
- Each step returns step_size observations: should you feed them to your model one by one, or the full step_size=10 window at once? The choice is yours!
- There are multiple ways of exploring the dataset.
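For the first tip, here are three possible ways to turn a window of step_size observations into model inputs, sketched on synthetic data. It assumes you have already arranged the batch as (num_envs, step_size, n_sensors); none of these options is prescribed by the environment.

```python
import numpy as np

num_envs, step_size, n_sensors = 4, 10, 9
rng = np.random.default_rng(0)

# Synthetic window: 10 consecutive observations of 9 sensors per environment.
window = rng.normal(size=(num_envs, step_size, n_sensors))

# Option 1: keep only the most recent observation of each window.
last_only = window[:, -1, :]

# Option 2: flatten the whole window into one long feature vector.
flattened = window.reshape(num_envs, -1)

# Option 3: summarise the window with per-sensor mean and standard deviation.
summary = np.concatenate([window.mean(axis=1), window.std(axis=1)], axis=1)

print(last_only.shape)  # (4, 9)
print(flattened.shape)  # (4, 90)
print(summary.shape)    # (4, 18)
```

Option 1 is the cheapest but discards the trend inside the window, option 2 keeps everything at the cost of a larger input, and option 3 sits in between; a sequence model could also consume the 3-D window directly.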

In [None]:
for step in range(40):

    # A) Check if any environments terminated
    terminated_envs = env.get_terminated_env_indices()
    if terminated_envs:
        print(f"   ⚠️  Environments {terminated_envs} terminated")
        reset_obs, reset_infos = env.reset_specific_envs(terminated_envs)
        for i, env_id in enumerate(terminated_envs):
            infos[env_id] = reset_infos[i] # reset previous info dict

    # Generate random actions for all environments
    actions = np.random.randint(0, 3, size=env.num_envs)

    print(f"\n   Step {step + 1}:")
    print(f"      Actions: {actions}")

    # Take step
    observations, rewards, terminateds, truncateds, infos = env.step(actions)

    print(f"      Rewards: {rewards}")
    print(f"      Terminated: {terminateds}")
    print(f"      Active environments: {env.get_active_count()}/{env.num_envs}")

    # Show filtered info (production mode)
    for i, info in enumerate(infos):
        print(f"      Env {i} info: {info}")

env.close()


   Step 1:
      Actions: [0 2 2 1]


2026-02-11 14:16:50,876 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [ 464.49234  133.59898    0.      -522.8883 ]
      Terminated: [False  True  True False]
      Active environments: 2/4
      Env 0 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 1, 'terminated': True, 'truncated': False}
      Env 2 info: {'error': "'NoneType' object has no attribute 'astype'", 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 9, 'terminated': False, 'truncated': False}
   ⚠️  Environments [1, 2] terminated


2026-02-11 14:16:52,514 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 2:
      Actions: [0 0 2 1]


2026-02-11 14:16:56,117 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [ 460.59943  455.47632  131.81326 -527.1177 ]
      Terminated: [False False  True False]
      Active environments: 3/4
      Env 0 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 2 info: {'step': 1, 'terminated': True, 'truncated': False}
      Env 3 info: {'step': 9, 'terminated': False, 'truncated': False}
   ⚠️  Environments [2] terminated


2026-02-11 14:16:56,911 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 3:
      Actions: [1 1 1 2]


2026-02-11 14:17:00,563 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [-500.79492 -525.0327  -535.9648   128.45535]
      Terminated: [False False False  True]
      Active environments: 3/4
      Env 0 info: {'step': 19, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 2 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 11, 'terminated': True, 'truncated': False}
   ⚠️  Environments [3] terminated


2026-02-11 14:17:01,452 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 4:
      Actions: [0 0 1 1]


2026-02-11 14:17:06,434 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [ 409.44702  396.5     -523.43854 -525.85956]
      Terminated: [False False False False]
      Active environments: 4/4
      Env 0 info: {'step': 29, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 19, 'terminated': False, 'truncated': False}
      Env 2 info: {'step': 19, 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 9, 'terminated': False, 'truncated': False}

   Step 5:
      Actions: [1 2 0 0]


2026-02-11 14:17:09,954 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [-532.2233   130.52625  435.09396  427.73395]
      Terminated: [False  True False False]
      Active environments: 3/4
      Env 0 info: {'step': 19, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 11, 'terminated': True, 'truncated': False}
      Env 2 info: {'step': 28, 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 19, 'terminated': False, 'truncated': False}
   ⚠️  Environments [1] terminated


2026-02-11 14:17:10,837 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 6:
      Actions: [1 1 2 0]


2026-02-11 14:17:14,428 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [-522.4911  -507.18637  131.95181  443.30777]
      Terminated: [False False  True False]
      Active environments: 3/4
      Env 0 info: {'step': 39, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 29, 'terminated': False, 'truncated': False}
      Env 2 info: {'step': 21, 'terminated': True, 'truncated': False}
      Env 3 info: {'step': 19, 'terminated': False, 'truncated': False}
   ⚠️  Environments [2] terminated


2026-02-11 14:17:15,240 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 7:
      Actions: [0 0 1 1]


2026-02-11 14:17:19,892 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [ 431.02142  417.262   -529.46344 -535.55743]
      Terminated: [False False False False]
      Active environments: 4/4
      Env 0 info: {'step': 29, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 38, 'terminated': False, 'truncated': False}
      Env 2 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 29, 'terminated': False, 'truncated': False}

   Step 8:
      Actions: [2 2 2 0]


2026-02-11 14:17:21,167 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [  0.        0.        0.      416.90872]
      Terminated: [ True  True  True False]
      Active environments: 1/4
      Env 0 info: {'error': "'NoneType' object has no attribute 'astype'", 'terminated': False, 'truncated': False}
      Env 1 info: {'error': "'NoneType' object has no attribute 'astype'", 'terminated': False, 'truncated': False}
      Env 2 info: {'error': "'NoneType' object has no attribute 'astype'", 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 38, 'terminated': False, 'truncated': False}
   ⚠️  Environments [0, 1, 2] terminated
