This is a vectorized version of the environment. How does it work? Simply put, instead of running a single environment, we run several of them in a batch (vectorization). At each time step, instead of choosing a single action $a$, we define a vector $a=[a_1, \dots, a_{n_\text{envs}}]$ where entry $a_i$ is the action to perform in environment $i$.

Thus, if you run 4 simultaneous environments, your observation space becomes (4, num_observations). Because we use a step size of 10 and have a total of 9 sensors, each step yields observations totalling (40, 9) elements, that is 4 × 10 rows of 9 sensor values. The core idea is to speed up inference and training by querying several environments at once instead of a single one.

In our implementation, depending on the model you choose, you should take the observations as input, reshaped to your liking, and predict the actions $a$ with your policy.
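To make the batched action vector concrete, here is a minimal sketch of a policy that maps a (4, 9) batch of observations to one action per environment. The linear weights, the name select_actions, and the stand-in observations are all hypothetical. The assumption that actions are integers in {0, 1, 2} is taken from the random-action loop further down in this notebook; real observations come from env.reset() or env.step().

```python
import numpy as np

NUM_ENVS = 4      # matches num_envs used in this notebook
NUM_SENSORS = 9   # matches the (4, 9) shape printed after reset
NUM_ACTIONS = 3   # ASSUMPTION: actions are integers in {0, 1, 2}

rng = np.random.default_rng(0)

# Hypothetical linear policy: one weight column per action.
weights = rng.normal(size=(NUM_SENSORS, NUM_ACTIONS))

def select_actions(observations: np.ndarray) -> np.ndarray:
    """Return one greedy action per environment for a (num_envs, num_sensors) batch."""
    scores = observations @ weights   # shape: (num_envs, NUM_ACTIONS)
    return scores.argmax(axis=1)      # shape: (num_envs,)

# Stand-in for the array returned by env.reset(); not real sensor data.
fake_observations = rng.normal(size=(NUM_ENVS, NUM_SENSORS))
actions = select_actions(fake_observations)
print(actions.shape)  # (4,)
```

The point is only the shapes: one row of observations per environment goes in, and one action per environment comes out, so a single forward pass serves the whole batch.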

In [1]:
import numpy as np
from student_client.student_gym_env_vectorized import create_student_gym_env_vectorized

env = create_student_gym_env_vectorized(
    num_envs=4,
    step_size=10,
    user_token='vec1',
)

2026-02-11 16:15:27,625 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/session/create "HTTP/1.1 200 OK"
2026-02-11 16:15:27,626 - student_client.student_gym_env_vectorized - INFO - Created new session: 5e0c765d-88cc-4a54-8690-9fdea00c1a3b
2026-02-11 16:15:28,772 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/create "HTTP/1.1 200 OK"
2026-02-11 16:15:28,773 - student_client.student_gym_env_vectorized - INFO - Created new episode 1/4: 2d819ed1-b962-4717-b386-fe7f04a953a6
2026-02-11 16:15:29,679 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/create "HTTP/1.1 200 OK"
2026-02-11 16:15:29,681 - student_client.student_gym_env_vectorized - INFO - Created new episode 2/4: 308b9215-60c1-4695-84f8-01b40871fa4a
2026-02-11 16:15:30,508 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/create "HTTP/1.1 200 OK"
2026-02-11 16:15:30,509 - student_client.student_gym_env_vector

In [2]:
print(f"Environment created with {env.num_envs} parallel environments")
print(f"   Episode IDs: {env.episode_ids}")

# Reset all environments
print("\n🔄 Resetting all environments...")
observations, infos = env.reset()

print(f"   Observations shape: {observations.shape}")
print(f"   First observation: {observations[0]}")

Environment created with 4 parallel environments
   Episode IDs: ['2d819ed1-b962-4717-b386-fe7f04a953a6', '308b9215-60c1-4695-84f8-01b40871fa4a', '11ec19e0-3801-4317-9104-919cf74002e0', 'd0c27395-9d57-4ec7-9a78-8e014d67c297']

🔄 Resetting all environments...


2026-02-11 16:15:34,400 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"
2026-02-11 16:15:34,402 - student_client.student_gym_env_vectorized - INFO - All 4 environments reset successfully


   Observations shape: (4, 9)
   First observation: [7.9758289e+02 1.9365225e+04 3.3577109e+02 1.1195775e+03 3.7200153e-01
 1.3737528e+06 3.9579189e+03 0.0000000e+00 9.9219837e+00]


## Training / Iterations

Here you can iterate through the vectorized environments. Notice that the actions form a vector whose i-th entry is applied to the i-th environment.

In step A) of the loop below, we automatically reset the environments that have terminated, so you can keep stepping for an arbitrary number of steps. Because episodes do not all have the same length, environments stop at different times; resetting them on the fly keeps the whole batch active.
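Because environments terminate at different times, per-environment bookkeeping is easiest with masks. The sketch below shows one possible way to accumulate episode returns from the rewards and terminated arrays that a vectorized step produces. The helper update_returns and the toy arrays are hypothetical and not part of the student client.

```python
import numpy as np

def update_returns(running, rewards, terminateds, finished):
    """Add this step's rewards, then bank and clear the return of every terminated env."""
    running = running + np.asarray(rewards, dtype=np.float64)
    done = np.flatnonzero(terminateds)       # indices of environments that just ended
    finished.extend(running[done].tolist())  # store the completed episode returns
    running[done] = 0.0                      # those environments start a fresh episode
    return running

running = np.zeros(4)   # one running return per environment
finished = []           # returns of completed episodes, in order of completion

# Two toy steps with made-up rewards and termination flags.
running = update_returns(running, [1.0, 2.0, 3.0, 4.0], [False, True, False, False], finished)
running = update_returns(running, [1.0, 1.0, 1.0, 1.0], [True, False, False, True], finished)

print(running)   # [0. 1. 4. 0.]
print(finished)  # [2.0, 2.0, 5.0]
```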

Tips:
- With step_size=10, each step returns several observations: should you feed them to your model one by one, or as the full window of 10? The choice is yours!
- There are multiple ways of exploring the dataset.
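The first tip can be illustrated with plain NumPy. This sketch assumes that a step returns a (40, 9) array whose rows are grouped by environment (10 consecutive rows per environment); that ordering is an assumption to verify against the real output of env.step() before relying on it. The array below is synthetic.

```python
import numpy as np

NUM_ENVS, STEP_SIZE, NUM_SENSORS = 4, 10, 9

# Synthetic stand-in for a (40, 9) observation array.
# ASSUMPTION: rows 0-9 belong to env 0, rows 10-19 to env 1, and so on.
flat = np.arange(NUM_ENVS * STEP_SIZE * NUM_SENSORS, dtype=np.float32)
flat = flat.reshape(NUM_ENVS * STEP_SIZE, NUM_SENSORS)

# Recover one window of STEP_SIZE observations per environment.
windows = flat.reshape(NUM_ENVS, STEP_SIZE, NUM_SENSORS)

# Option 1: feed only the most recent observation of each window.
latest = windows[:, -1, :]                 # shape: (4, 9)

# Option 2: feed the full window, flattened into one feature vector per env.
stacked = windows.reshape(NUM_ENVS, -1)    # shape: (4, 90)

print(latest.shape, stacked.shape)  # (4, 9) (4, 90)
```

Option 1 keeps the input small but discards the intermediate readings; option 2 keeps the short history at the cost of a 10 times larger input. A sequence model could also consume windows directly.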

In [3]:
for step in range(40):

    # A) Check if any environments terminated
    terminated_envs = env.get_terminated_env_indices()
    if terminated_envs:
        print(f"   ⚠️  Environments {terminated_envs} terminated")
        reset_obs, reset_infos = env.reset_specific_envs(terminated_envs)
        for i, env_id in enumerate(terminated_envs):
            infos[env_id] = reset_infos[i] # reset previous info dict

    # Generate random actions for all environments
    actions = np.random.randint(0, 3, size=env.num_envs)

    print(f"\n   Step {step + 1}:")
    print(f"      Actions: {actions}")

    # Take step
    observations, rewards, terminateds, truncateds, infos = env.step(actions)

    print(f"      Rewards: {rewards}")
    print(f"      Terminated: {terminateds}")
    print(f"      Active environments: {env.get_active_count()}/{env.num_envs}")

    # Show filtered info (production mode)
    for i, info in enumerate(infos):
        print(f"      Env {i} info: {info}")

env.close()


   Step 1:
      Actions: [1 2 2 0]


2026-02-11 16:15:36,857 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [-555.4827   133.73833  128.70547  487.02316]
      Terminated: [False  True  True False]
      Active environments: 2/4
      Env 0 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 1, 'terminated': True, 'truncated': False}
      Env 2 info: {'step': 1, 'terminated': True, 'truncated': False}
      Env 3 info: {'step': 9, 'terminated': False, 'truncated': False}
   ⚠️  Environments [1, 2] terminated


2026-02-11 16:15:38,599 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 2:
      Actions: [2 0 2 2]


2026-02-11 16:15:39,931 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [  0.      457.19626   0.        0.     ]
      Terminated: [ True False  True  True]
      Active environments: 1/4
      Env 0 info: {'error': "'NoneType' object has no attribute 'astype'", 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 2 info: {'error': "'NoneType' object has no attribute 'astype'", 'terminated': False, 'truncated': False}
      Env 3 info: {'error': "'NoneType' object has no attribute 'astype'", 'terminated': False, 'truncated': False}
   ⚠️  Environments [0, 2, 3] terminated


2026-02-11 16:15:42,416 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 3:
      Actions: [0 1 0 1]


2026-02-11 16:15:47,405 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [ 455.20413 -558.6028   438.6104  -518.1828 ]
      Terminated: [False False False False]
      Active environments: 4/4
      Env 0 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 19, 'terminated': False, 'truncated': False}
      Env 2 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 9, 'terminated': False, 'truncated': False}

   Step 4:
      Actions: [0 0 0 2]


2026-02-11 16:15:51,001 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [434.07562 478.7954  472.9767    0.     ]
      Terminated: [False False False  True]
      Active environments: 3/4
      Env 0 info: {'step': 18, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 2 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 3 info: {'error': "'NoneType' object has no attribute 'astype'", 'terminated': False, 'truncated': False}
   ⚠️  Environments [3] terminated


2026-02-11 16:15:51,911 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 5:
      Actions: [0 2 0 2]


2026-02-11 16:15:54,573 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [400.09677   0.      417.41748 128.58778]
      Terminated: [False  True False  True]
      Active environments: 2/4
      Env 0 info: {'step': 19, 'terminated': False, 'truncated': False}
      Env 1 info: {'error': "'NoneType' object has no attribute 'astype'", 'terminated': False, 'truncated': False}
      Env 2 info: {'step': 18, 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 1, 'terminated': True, 'truncated': False}
   ⚠️  Environments [1, 3] terminated


2026-02-11 16:15:56,176 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 6:
      Actions: [0 1 0 2]


2026-02-11 16:15:59,984 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [ 349.99722 -522.5033   389.69565  134.27933]
      Terminated: [False False False  True]
      Active environments: 3/4
      Env 0 info: {'step': 29, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 9, 'terminated': False, 'truncated': False}
      Env 2 info: {'step': 28, 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 1, 'terminated': True, 'truncated': False}
   ⚠️  Environments [3] terminated


2026-02-11 16:16:00,891 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 7:
      Actions: [0 2 1 1]


2026-02-11 16:16:04,779 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_step "HTTP/1.1 200 OK"


      Rewards: [ 301.84586  128.5166  -502.3661  -560.706  ]
      Terminated: [False  True False False]
      Active environments: 3/4
      Env 0 info: {'step': 39, 'terminated': False, 'truncated': False}
      Env 1 info: {'step': 11, 'terminated': True, 'truncated': False}
      Env 2 info: {'step': 38, 'terminated': False, 'truncated': False}
      Env 3 info: {'step': 9, 'terminated': False, 'truncated': False}
   ⚠️  Environments [1] terminated


2026-02-11 16:16:05,631 - httpx - INFO - HTTP Request: POST http://rlchallenge.orailix.com/api/v1/episode/vectorized_reset "HTTP/1.1 200 OK"



   Step 8:
      Actions: [1 2 0 1]


2026-02-11 16:20:05,320 - student_client.student_gym_env_vectorized - ERROR - Failed to step environments: timed out
2026-02-11 16:20:05,329 - student_client.student_gym_env_vectorized - ERROR - Failed to step environments: [Errno 8] nodename nor servname provided, or not known
2026-02-11 16:20:05,332 - student_client.student_gym_env_vectorized - ERROR - Failed to step environments: [Errno 8] nodename nor servname provided, or not known
2026-02-11 16:20:05,334 - student_client.student_gym_env_vectorized - ERROR - Failed to step environments: [Errno 8] nodename nor servname provided, or not known
2026-02-11 16:20:05,336 - student_client.student_gym_env_vectorized - ERROR - Failed to step environments: [Errno 8] nodename nor servname provided, or not known
2026-02-11 16:20:05,337 - student_client.student_gym_env_vectorized - ERROR - Failed to step environments: [Errno 8] nodename nor servname provided, or not known
2026-02-11 16:20:05,339 - student_client.student_gym_env_vectorized - ERR

      Rewards: [0. 0. 0. 0.]
      Terminated: [ True  True  True  True]
      Active environments: 4/4
      Env 0 info: {'error': 'timed out'}
      Env 1 info: {'error': 'timed out'}
      Env 2 info: {'error': 'timed out'}
      Env 3 info: {'error': 'timed out'}

   Step 9:
      Actions: [0 1 2 1]
      Rewards: [0. 0. 0. 0.]
      Terminated: [ True  True  True  True]
      Active environments: 4/4
      Env 0 info: {'error': '[Errno 8] nodename nor servname provided, or not known'}
      Env 1 info: {'error': '[Errno 8] nodename nor servname provided, or not known'}
      Env 2 info: {'error': '[Errno 8] nodename nor servname provided, or not known'}
      Env 3 info: {'error': '[Errno 8] nodename nor servname provided, or not known'}

   Step 10:
      Actions: [1 0 1 2]
      Rewards: [0. 0. 0. 0.]
      Terminated: [ True  True  True  True]
      Active environments: 4/4
      Env 0 info: {'error': '[Errno 8] nodename nor servname provided, or not known'}
      Env 1 info: {