[Doc] Document EGL multi-GPU limitations in containers #3456

Merged
vmoens merged 1 commit into main from doc/egl-container-limitations
Feb 6, 2026

Conversation

vmoens (Collaborator) commented Feb 6, 2026

Summary

  • Documents the EGL multi-GPU device visibility limitation when running dm_control pixel environments inside Docker/SLURM containers
  • Explains why MUJOCO_EGL_DEVICE_ID / EGL_DEVICE_ID only allows device 0 in containers
  • Documents the lack of batched rendering support in MuJoCo
  • Links upstream issues: mujoco#572, dm_control#345, mujoco#1604
  • Lists workarounds (container config, bare metal, reducing rendering overhead)
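A minimal sketch of how the device-selection variables named above are typically set, assuming the EGL backend is selected with `MUJOCO_GL=egl` and that the variables are exported before dm_control is imported. The helper `egl_env_for_worker` and its parameters are hypothetical names used for illustration; they are not part of TorchRL or dm_control:

```python
import os


def egl_env_for_worker(rank: int, num_visible_egl_devices: int = 1) -> dict:
    """Hypothetical helper: build the rendering env vars for one worker.

    num_visible_egl_devices is an assumption supplied by the caller. Inside
    a container, the limitation described in this PR means only device 0
    may be usable, so the default is 1.
    """
    if num_visible_egl_devices < 1:
        raise ValueError("at least one EGL device must be visible")
    # Round-robin workers over the devices that EGL can actually see.
    device_id = rank % num_visible_egl_devices
    return {
        "MUJOCO_GL": "egl",
        "MUJOCO_EGL_DEVICE_ID": str(device_id),
    }


def apply_egl_env(rank: int, num_visible_egl_devices: int = 1) -> dict:
    """Export the variables; call this before importing dm_control."""
    env = egl_env_for_worker(rank, num_visible_egl_devices)
    os.environ.update(env)
    return env


if __name__ == "__main__":
    # Four workers in a container where only one EGL device is visible.
    for rank in range(4):
        print(rank, egl_env_for_worker(rank, num_visible_egl_devices=1))
```

With `num_visible_egl_devices=1`, the situation the summary describes for containers, every worker is mapped to device 0. On a host where more devices are visible, the same helper spreads workers across them.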

Test plan

  • Documentation-only change, no code modified

Made with Cursor

Add detailed documentation about EGL device visibility limitations
when running dm_control pixel environments inside Docker/SLURM
containers, and the lack of batched rendering in MuJoCo.

Links upstream issues mujoco#572 and dm_control#345.

Co-authored-by: Cursor <cursoragent@cursor.com>
pytorch-bot bot commented Feb 6, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3456

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label Feb 6, 2026
github-actions bot added the Documentation label Feb 6, 2026
vmoens merged commit ab49b59 into main Feb 6, 2026
80 of 107 checks passed
github-actions bot (Contributor) commented Feb 6, 2026

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 173. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}17$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 82.9184μs 80.8502μs 12.3686 KOps/s 12.3232 KOps/s $\color{#35bf28}+0.37\%$
test_tensor_to_bytestream_speed[torch.save] 0.1400ms 0.1398ms 7.1506 KOps/s 7.1197 KOps/s $\color{#35bf28}+0.43\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1116s 0.1114s 8.9729 Ops/s 8.6037 Ops/s $\color{#35bf28}+4.29\%$
test_tensor_to_bytestream_speed[numpy] 2.6369μs 2.6268μs 380.6855 KOps/s 392.8496 KOps/s $\color{#d91a1a}-3.10\%$
test_tensor_to_bytestream_speed[safetensors] 39.2158μs 38.4198μs 26.0282 KOps/s 25.9554 KOps/s $\color{#35bf28}+0.28\%$
test_simple 0.5543s 0.5537s 1.8060 Ops/s 1.7211 Ops/s $\color{#35bf28}+4.93\%$
test_transformed 1.1472s 1.1464s 0.8723 Ops/s 0.8531 Ops/s $\color{#35bf28}+2.25\%$
test_serial 1.7101s 1.6999s 0.5883 Ops/s 0.5770 Ops/s $\color{#35bf28}+1.96\%$
test_parallel 1.2154s 1.1373s 0.8793 Ops/s 0.7802 Ops/s $\textbf{\color{#35bf28}+12.70\%}$
test_step_mdp_speed[True-True-True-True-True] 0.1280ms 43.5732μs 22.9499 KOps/s 22.4076 KOps/s $\color{#35bf28}+2.42\%$
test_step_mdp_speed[True-True-True-True-False] 51.6610μs 24.9310μs 40.1108 KOps/s 39.4275 KOps/s $\color{#35bf28}+1.73\%$
test_step_mdp_speed[True-True-True-False-True] 79.0820μs 24.5852μs 40.6749 KOps/s 38.9499 KOps/s $\color{#35bf28}+4.43\%$
test_step_mdp_speed[True-True-True-False-False] 43.1810μs 13.9841μs 71.5099 KOps/s 70.9205 KOps/s $\color{#35bf28}+0.83\%$
test_step_mdp_speed[True-True-False-True-True] 75.7910μs 48.2246μs 20.7363 KOps/s 20.3006 KOps/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[True-True-False-True-False] 62.4620μs 28.4042μs 35.2061 KOps/s 35.7194 KOps/s $\color{#d91a1a}-1.44\%$
test_step_mdp_speed[True-True-False-False-True] 60.1310μs 27.9025μs 35.8391 KOps/s 34.9474 KOps/s $\color{#35bf28}+2.55\%$
test_step_mdp_speed[True-True-False-False-False] 44.9910μs 16.9240μs 59.0878 KOps/s 59.5768 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[True-False-True-True-True] 86.7320μs 51.4910μs 19.4209 KOps/s 19.3130 KOps/s $\color{#35bf28}+0.56\%$
test_step_mdp_speed[True-False-True-True-False] 0.1307ms 31.2179μs 32.0329 KOps/s 31.9699 KOps/s $\color{#35bf28}+0.20\%$
test_step_mdp_speed[True-False-True-False-True] 58.8610μs 27.6057μs 36.2245 KOps/s 35.1546 KOps/s $\color{#35bf28}+3.04\%$
test_step_mdp_speed[True-False-True-False-False] 45.2110μs 16.8070μs 59.4992 KOps/s 58.8386 KOps/s $\color{#35bf28}+1.12\%$
test_step_mdp_speed[True-False-False-True-True] 91.1420μs 53.4299μs 18.7161 KOps/s 18.3277 KOps/s $\color{#35bf28}+2.12\%$
test_step_mdp_speed[True-False-False-True-False] 62.3820μs 33.9311μs 29.4715 KOps/s 29.4727 KOps/s $-0.00\%$
test_step_mdp_speed[True-False-False-False-True] 72.6610μs 30.1956μs 33.1174 KOps/s 32.3009 KOps/s $\color{#35bf28}+2.53\%$
test_step_mdp_speed[True-False-False-False-False] 45.9310μs 19.5834μs 51.0637 KOps/s 51.0123 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[False-True-True-True-True] 87.3220μs 51.5691μs 19.3915 KOps/s 19.4032 KOps/s $\color{#d91a1a}-0.06\%$
test_step_mdp_speed[False-True-True-True-False] 55.2810μs 31.1763μs 32.0757 KOps/s 32.0327 KOps/s $\color{#35bf28}+0.13\%$
test_step_mdp_speed[False-True-True-False-True] 2.3071ms 31.6318μs 31.6137 KOps/s 31.0839 KOps/s $\color{#35bf28}+1.70\%$
test_step_mdp_speed[False-True-True-False-False] 55.3310μs 18.6111μs 53.7315 KOps/s 54.1286 KOps/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[False-True-False-True-True] 94.8020μs 53.6996μs 18.6221 KOps/s 18.5032 KOps/s $\color{#35bf28}+0.64\%$
test_step_mdp_speed[False-True-False-True-False] 60.0510μs 34.1055μs 29.3208 KOps/s 29.5608 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[False-True-False-False-True] 66.2810μs 33.8499μs 29.5421 KOps/s 28.2485 KOps/s $\color{#35bf28}+4.58\%$
test_step_mdp_speed[False-True-False-False-False] 54.6510μs 21.2291μs 47.1053 KOps/s 47.3837 KOps/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[False-False-True-True-True] 85.6920μs 56.9374μs 17.5631 KOps/s 17.5695 KOps/s $\color{#d91a1a}-0.04\%$
test_step_mdp_speed[False-False-True-True-False] 85.6320μs 37.1480μs 26.9193 KOps/s 27.0780 KOps/s $\color{#d91a1a}-0.59\%$
test_step_mdp_speed[False-False-True-False-True] 69.4920μs 34.6707μs 28.8428 KOps/s 28.4252 KOps/s $\color{#35bf28}+1.47\%$
test_step_mdp_speed[False-False-True-False-False] 52.3520μs 21.5369μs 46.4319 KOps/s 46.7607 KOps/s $\color{#d91a1a}-0.70\%$
test_step_mdp_speed[False-False-False-True-True] 87.1620μs 59.0475μs 16.9355 KOps/s 16.8464 KOps/s $\color{#35bf28}+0.53\%$
test_step_mdp_speed[False-False-False-True-False] 72.5910μs 39.5737μs 25.2693 KOps/s 25.4618 KOps/s $\color{#d91a1a}-0.76\%$
test_step_mdp_speed[False-False-False-False-True] 80.4120μs 37.2599μs 26.8385 KOps/s 26.8462 KOps/s $\color{#d91a1a}-0.03\%$
test_step_mdp_speed[False-False-False-False-False] 54.7820μs 24.3465μs 41.0737 KOps/s 41.9102 KOps/s $\color{#d91a1a}-2.00\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.7524s 0.7504s 1.3326 Ops/s 1.2768 Ops/s $\color{#35bf28}+4.36\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7318s 0.6372s 1.5695 Ops/s 1.5576 Ops/s $\color{#35bf28}+0.76\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7723s 1.6973s 0.5892 Ops/s 0.5878 Ops/s $\color{#35bf28}+0.24\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5435s 1.4666s 0.6819 Ops/s 0.6794 Ops/s $\color{#35bf28}+0.36\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0245s 1.9480s 0.5133 Ops/s 0.5130 Ops/s $\color{#35bf28}+0.06\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.8060s 1.7282s 0.5786 Ops/s 0.5796 Ops/s $\color{#d91a1a}-0.17\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6792s 4.6534s 0.2149 Ops/s 0.2120 Ops/s $\color{#35bf28}+1.39\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5444s 4.4937s 0.2225 Ops/s 0.2224 Ops/s $\color{#35bf28}+0.07\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.1463s 2.0555s 0.4865 Ops/s 0.5051 Ops/s $\color{#d91a1a}-3.69\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7999s 1.7139s 0.5835 Ops/s 0.5713 Ops/s $\color{#35bf28}+2.13\%$
test_values[generalized_advantage_estimate-True-True] 10.8115ms 10.5385ms 94.8903 Ops/s 95.1467 Ops/s $\color{#d91a1a}-0.27\%$
test_values[vec_generalized_advantage_estimate-True-True] 20.7558ms 17.7190ms 56.4365 Ops/s 89.6055 Ops/s $\textbf{\color{#d91a1a}-37.02\%}$
test_values[td0_return_estimate-False-False] 0.2040ms 0.1271ms 7.8693 KOps/s 7.9400 KOps/s $\color{#d91a1a}-0.89\%$
test_values[td1_return_estimate-False-False] 29.4049ms 28.7902ms 34.7341 Ops/s 34.7629 Ops/s $\color{#d91a1a}-0.08\%$
test_values[vec_td1_return_estimate-False-False] 18.5007ms 17.6599ms 56.6255 Ops/s 84.8888 Ops/s $\textbf{\color{#d91a1a}-33.29\%}$
test_values[td_lambda_return_estimate-True-False] 43.5282ms 42.8014ms 23.3637 Ops/s 23.5350 Ops/s $\color{#d91a1a}-0.73\%$
test_values[vec_td_lambda_return_estimate-True-False] 20.2252ms 18.5400ms 53.9375 Ops/s 89.3618 Ops/s $\textbf{\color{#d91a1a}-39.64\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.4934ms 9.3919ms 106.4751 Ops/s 105.8640 Ops/s $\color{#35bf28}+0.58\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.7077ms 1.5398ms 649.4310 Ops/s 652.7258 Ops/s $\color{#d91a1a}-0.50\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4729ms 0.4273ms 2.3402 KOps/s 2.3326 KOps/s $\color{#35bf28}+0.33\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 35.5541ms 34.8961ms 28.6565 Ops/s 33.6939 Ops/s $\textbf{\color{#d91a1a}-14.95\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.1541ms 1.7121ms 584.0612 Ops/s 577.2627 Ops/s $\color{#35bf28}+1.18\%$
test_dqn_speed[False-None] 1.7966ms 1.4266ms 700.9727 Ops/s 711.4394 Ops/s $\color{#d91a1a}-1.47\%$
test_dqn_speed[False-backward] 2.0031ms 1.9406ms 515.2986 Ops/s 517.4279 Ops/s $\color{#d91a1a}-0.41\%$
test_dqn_speed[True-None] 1.0710ms 0.5602ms 1.7849 KOps/s 1.7707 KOps/s $\color{#35bf28}+0.81\%$
test_dqn_speed[True-backward] 1.0739ms 1.0181ms 982.2165 Ops/s 792.3243 Ops/s $\textbf{\color{#35bf28}+23.97\%}$
test_dqn_speed[reduce-overhead-None] 1.0606ms 0.5589ms 1.7891 KOps/s 1.7493 KOps/s $\color{#35bf28}+2.28\%$
test_ddpg_speed[False-None] 3.3474ms 2.9100ms 343.6392 Ops/s 350.9014 Ops/s $\color{#d91a1a}-2.07\%$
test_ddpg_speed[False-backward] 4.2530ms 4.1247ms 242.4429 Ops/s 243.5494 Ops/s $\color{#d91a1a}-0.45\%$
test_ddpg_speed[True-None] 1.9122ms 1.4386ms 695.1042 Ops/s 698.7500 Ops/s $\color{#d91a1a}-0.52\%$
test_ddpg_speed[True-backward] 2.5091ms 2.4289ms 411.7012 Ops/s 376.2715 Ops/s $\textbf{\color{#35bf28}+9.42\%}$
test_ddpg_speed[reduce-overhead-None] 1.5804ms 1.4298ms 699.3952 Ops/s 704.6539 Ops/s $\color{#d91a1a}-0.75\%$
test_sac_speed[False-None] 8.8913ms 8.1330ms 122.9561 Ops/s 123.2975 Ops/s $\color{#d91a1a}-0.28\%$
test_sac_speed[False-backward] 11.8531ms 11.3694ms 87.9554 Ops/s 87.9132 Ops/s $\color{#35bf28}+0.05\%$
test_sac_speed[True-None] 2.6567ms 2.2065ms 453.2057 Ops/s 440.6891 Ops/s $\color{#35bf28}+2.84\%$
test_sac_speed[True-backward] 4.2050ms 4.0810ms 245.0365 Ops/s 242.0960 Ops/s $\color{#35bf28}+1.21\%$
test_sac_speed[reduce-overhead-None] 2.4104ms 2.1817ms 458.3574 Ops/s 409.3694 Ops/s $\textbf{\color{#35bf28}+11.97\%}$
test_redq_speed[False-None] 10.9485ms 10.4508ms 95.6868 Ops/s 96.7782 Ops/s $\color{#d91a1a}-1.13\%$
test_redq_speed[False-backward] 18.4492ms 17.9380ms 55.7475 Ops/s 57.1498 Ops/s $\color{#d91a1a}-2.45\%$
test_redq_speed[True-None] 4.9227ms 4.5720ms 218.7224 Ops/s 219.7396 Ops/s $\color{#d91a1a}-0.46\%$
test_redq_speed[True-backward] 10.3986ms 9.9816ms 100.1845 Ops/s 101.0381 Ops/s $\color{#d91a1a}-0.84\%$
test_redq_speed[reduce-overhead-None] 4.8297ms 4.6123ms 216.8112 Ops/s 224.1312 Ops/s $\color{#d91a1a}-3.27\%$
test_redq_deprec_speed[False-None] 12.0694ms 11.2642ms 88.7767 Ops/s 89.3833 Ops/s $\color{#d91a1a}-0.68\%$
test_redq_deprec_speed[False-backward] 16.6945ms 16.1552ms 61.8994 Ops/s 62.3957 Ops/s $\color{#d91a1a}-0.80\%$
test_redq_deprec_speed[True-None] 4.2558ms 3.7601ms 265.9499 Ops/s 264.4798 Ops/s $\color{#35bf28}+0.56\%$
test_redq_deprec_speed[True-backward] 7.9871ms 7.7927ms 128.3254 Ops/s 116.3803 Ops/s $\textbf{\color{#35bf28}+10.26\%}$
test_redq_deprec_speed[reduce-overhead-None] 3.8918ms 3.6902ms 270.9893 Ops/s 263.4668 Ops/s $\color{#35bf28}+2.86\%$
test_td3_speed[False-None] 8.2446ms 8.1360ms 122.9109 Ops/s 124.2301 Ops/s $\color{#d91a1a}-1.06\%$
test_td3_speed[False-backward] 11.6298ms 11.0118ms 90.8113 Ops/s 91.4011 Ops/s $\color{#d91a1a}-0.65\%$
test_td3_speed[True-None] 1.9353ms 1.8783ms 532.3973 Ops/s 530.8869 Ops/s $\color{#35bf28}+0.28\%$
test_td3_speed[True-backward] 3.8093ms 3.6972ms 270.4738 Ops/s 250.7914 Ops/s $\textbf{\color{#35bf28}+7.85\%}$
test_td3_speed[reduce-overhead-None] 1.9208ms 1.8525ms 539.8034 Ops/s 541.5349 Ops/s $\color{#d91a1a}-0.32\%$
test_cql_speed[False-None] 26.8395ms 26.2704ms 38.0656 Ops/s 37.5589 Ops/s $\color{#35bf28}+1.35\%$
test_cql_speed[False-backward] 36.7493ms 35.8049ms 27.9291 Ops/s 28.0926 Ops/s $\color{#d91a1a}-0.58\%$
test_cql_speed[True-None] 12.7845ms 12.4587ms 80.2655 Ops/s 80.5473 Ops/s $\color{#d91a1a}-0.35\%$
test_cql_speed[True-backward] 18.5646ms 18.1165ms 55.1982 Ops/s 56.2585 Ops/s $\color{#d91a1a}-1.88\%$
test_cql_speed[reduce-overhead-None] 12.7888ms 12.5455ms 79.7102 Ops/s 79.1332 Ops/s $\color{#35bf28}+0.73\%$
test_a2c_speed[False-None] 6.0741ms 5.4945ms 181.9995 Ops/s 179.5088 Ops/s $\color{#35bf28}+1.39\%$
test_a2c_speed[False-backward] 12.3754ms 11.9905ms 83.3995 Ops/s 83.2900 Ops/s $\color{#35bf28}+0.13\%$
test_a2c_speed[True-None] 3.8738ms 3.7244ms 268.4971 Ops/s 263.0255 Ops/s $\color{#35bf28}+2.08\%$
test_a2c_speed[True-backward] 8.8652ms 8.5736ms 116.6373 Ops/s 116.7441 Ops/s $\color{#d91a1a}-0.09\%$
test_a2c_speed[reduce-overhead-None] 4.3746ms 3.7804ms 264.5208 Ops/s 265.7532 Ops/s $\color{#d91a1a}-0.46\%$
test_ppo_speed[False-None] 6.2115ms 6.0142ms 166.2743 Ops/s 168.5232 Ops/s $\color{#d91a1a}-1.33\%$
test_ppo_speed[False-backward] 13.1921ms 12.8967ms 77.5393 Ops/s 80.8981 Ops/s $\color{#d91a1a}-4.15\%$
test_ppo_speed[True-None] 4.1831ms 3.6846ms 271.4020 Ops/s 262.4445 Ops/s $\color{#35bf28}+3.41\%$
test_ppo_speed[True-backward] 8.7254ms 8.4732ms 118.0190 Ops/s 118.9596 Ops/s $\color{#d91a1a}-0.79\%$
test_ppo_speed[reduce-overhead-None] 4.1014ms 3.6650ms 272.8510 Ops/s 269.6860 Ops/s $\color{#35bf28}+1.17\%$
test_reinforce_speed[False-None] 5.0586ms 4.6564ms 214.7602 Ops/s 217.6006 Ops/s $\color{#d91a1a}-1.31\%$
test_reinforce_speed[False-backward] 7.8795ms 7.5615ms 132.2497 Ops/s 136.4449 Ops/s $\color{#d91a1a}-3.07\%$
test_reinforce_speed[True-None] 3.0551ms 2.8920ms 345.7867 Ops/s 340.7525 Ops/s $\color{#35bf28}+1.48\%$
test_reinforce_speed[True-backward] 8.3811ms 7.8496ms 127.3950 Ops/s 126.5300 Ops/s $\color{#35bf28}+0.68\%$
test_reinforce_speed[reduce-overhead-None] 3.6065ms 2.9085ms 343.8225 Ops/s 345.0991 Ops/s $\color{#d91a1a}-0.37\%$
test_iql_speed[False-None] 21.5403ms 20.3855ms 49.0544 Ops/s 49.0986 Ops/s $\color{#d91a1a}-0.09\%$
test_iql_speed[False-backward] 31.5001ms 30.9059ms 32.3563 Ops/s 32.2090 Ops/s $\color{#35bf28}+0.46\%$
test_iql_speed[True-None] 8.9534ms 8.6387ms 115.7587 Ops/s 112.3905 Ops/s $\color{#35bf28}+3.00\%$
test_iql_speed[True-backward] 17.0822ms 16.7399ms 59.7376 Ops/s 56.8001 Ops/s $\textbf{\color{#35bf28}+5.17\%}$
test_iql_speed[reduce-overhead-None] 9.0812ms 8.6639ms 115.4209 Ops/s 113.9089 Ops/s $\color{#35bf28}+1.33\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.2804ms 6.1381ms 162.9179 Ops/s 162.6826 Ops/s $\color{#35bf28}+0.14\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.8727ms 0.3733ms 2.6789 KOps/s 2.7440 KOps/s $\color{#d91a1a}-2.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7657ms 0.3705ms 2.6993 KOps/s 2.9030 KOps/s $\textbf{\color{#d91a1a}-7.02\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4146ms 5.8274ms 171.6038 Ops/s 170.9657 Ops/s $\color{#35bf28}+0.37\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.3173ms 0.3777ms 2.6479 KOps/s 3.4695 KOps/s $\textbf{\color{#d91a1a}-23.68\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7682ms 0.3637ms 2.7499 KOps/s 3.7512 KOps/s $\textbf{\color{#d91a1a}-26.69\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6511ms 1.4744ms 678.2452 Ops/s 771.8851 Ops/s $\textbf{\color{#d91a1a}-12.13\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.7231ms 1.3936ms 717.5845 Ops/s 820.3057 Ops/s $\textbf{\color{#d91a1a}-12.52\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 10.1024ms 6.1273ms 163.2034 Ops/s 167.3553 Ops/s $\color{#d91a1a}-2.48\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.6459ms 0.5323ms 1.8786 KOps/s 2.2818 KOps/s $\textbf{\color{#d91a1a}-17.67\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6381ms 0.4490ms 2.2270 KOps/s 2.3757 KOps/s $\textbf{\color{#d91a1a}-6.26\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0324ms 5.8369ms 171.3247 Ops/s 168.5120 Ops/s $\color{#35bf28}+1.67\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.0810ms 0.3776ms 2.6486 KOps/s 3.4627 KOps/s $\textbf{\color{#d91a1a}-23.51\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6113ms 0.3603ms 2.7752 KOps/s 3.6784 KOps/s $\textbf{\color{#d91a1a}-24.55\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3708ms 5.8285ms 171.5720 Ops/s 170.4395 Ops/s $\color{#35bf28}+0.66\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7818ms 0.3682ms 2.7156 KOps/s 2.6394 KOps/s $\color{#35bf28}+2.89\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5615ms 0.3551ms 2.8161 KOps/s 2.7220 KOps/s $\color{#35bf28}+3.46\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1267ms 5.9767ms 167.3174 Ops/s 166.0664 Ops/s $\color{#35bf28}+0.75\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0044ms 0.5255ms 1.9029 KOps/s 2.0966 KOps/s $\textbf{\color{#d91a1a}-9.24\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0210ms 0.5200ms 1.9231 KOps/s 2.3667 KOps/s $\textbf{\color{#d91a1a}-18.74\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.5585ms 5.0861ms 196.6132 Ops/s 56.8576 Ops/s $\textbf{\color{#35bf28}+245.80\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 11.8616ms 2.2217ms 450.1010 Ops/s 530.1718 Ops/s $\textbf{\color{#d91a1a}-15.10\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.0287ms 0.8916ms 1.1216 KOps/s 814.8509 Ops/s $\textbf{\color{#35bf28}+37.64\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.5414s 15.8558ms 63.0685 Ops/s 195.7250 Ops/s $\textbf{\color{#d91a1a}-67.78\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.9563ms 1.8578ms 538.2584 Ops/s 552.9459 Ops/s $\color{#d91a1a}-2.66\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.0701ms 1.0823ms 923.9938 Ops/s 814.5896 Ops/s $\textbf{\color{#35bf28}+13.43\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.1830ms 5.2832ms 189.2784 Ops/s 60.1154 Ops/s $\textbf{\color{#35bf28}+214.86\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.2467ms 1.9546ms 511.6072 Ops/s 432.1190 Ops/s $\textbf{\color{#35bf28}+18.39\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.3362ms 1.0559ms 947.0614 Ops/s 948.4377 Ops/s $\color{#d91a1a}-0.15\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.3922ms 36.2088ms 27.6176 Ops/s 27.3355 Ops/s $\color{#35bf28}+1.03\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.9384ms 18.5128ms 54.0166 Ops/s 54.9482 Ops/s $\color{#d91a1a}-1.70\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.2583ms 37.5552ms 26.6274 Ops/s 26.4479 Ops/s $\color{#35bf28}+0.68\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.1990ms 18.7686ms 53.2805 Ops/s 53.5144 Ops/s $\color{#d91a1a}-0.44\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.7676ms 39.3629ms 25.4046 Ops/s 25.2634 Ops/s $\color{#35bf28}+0.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.3917ms 20.1522ms 49.6225 Ops/s 49.2402 Ops/s $\color{#35bf28}+0.78\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8235ms 0.2190ms 4.5657 KOps/s 4.3282 KOps/s $\textbf{\color{#35bf28}+5.49\%}$
test_storage_write_lazystack[100-img_shape1-atari] 1.8568ms 1.4020ms 713.2602 Ops/s 706.8296 Ops/s $\color{#35bf28}+0.91\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.7893ms 2.3343ms 428.3962 Ops/s 429.3200 Ops/s $\color{#d91a1a}-0.22\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.3655ms 2.9070ms 343.9915 Ops/s 338.7174 Ops/s $\color{#35bf28}+1.56\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2187ms 0.1412ms 7.0827 KOps/s 7.2474 KOps/s $\color{#d91a1a}-2.27\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3124ms 0.1914ms 5.2249 KOps/s 5.2439 KOps/s $\color{#d91a1a}-0.36\%$
test_storage_write_contiguous[100-img_shape2-large_img] 2.2273ms 1.7719ms 564.3734 Ops/s 549.3562 Ops/s $\color{#35bf28}+2.73\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.7586ms 1.3342ms 749.5181 Ops/s 737.8149 Ops/s $\color{#35bf28}+1.59\%$
test_collector_stack_then_write[50-img_shape0-small] 1.5712ms 1.1346ms 881.3785 Ops/s 887.6373 Ops/s $\color{#d91a1a}-0.71\%$
test_collector_stack_then_write[100-img_shape1-atari] 4.0837ms 3.5955ms 278.1241 Ops/s 284.6096 Ops/s $\color{#d91a1a}-2.28\%$
test_collector_stack_then_write[100-img_shape2-large_img] 11.0889ms 5.7117ms 175.0808 Ops/s 172.4788 Ops/s $\color{#35bf28}+1.51\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 13.7662ms 7.0100ms 142.6531 Ops/s 136.1567 Ops/s $\color{#35bf28}+4.77\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4729ms 0.2745ms 3.6432 KOps/s 3.5873 KOps/s $\color{#35bf28}+1.56\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 2.1224ms 1.5155ms 659.8694 Ops/s 661.6548 Ops/s $\color{#d91a1a}-0.27\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.8682ms 2.4397ms 409.8792 Ops/s 412.5905 Ops/s $\color{#d91a1a}-0.66\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.5954ms 3.1258ms 319.9215 Ops/s 319.2228 Ops/s $\color{#35bf28}+0.22\%$
test_collector_without_rb[100-img_shape0-atari] 35.1644ms 34.4262ms 29.0476 Ops/s 29.1589 Ops/s $\color{#d91a1a}-0.38\%$
test_collector_without_rb[200-img_shape1-large_batch] 68.9444ms 67.9276ms 14.7215 Ops/s 14.8270 Ops/s $\color{#d91a1a}-0.71\%$
test_collector_with_rb[100-img_shape0-atari] 40.0481ms 39.1735ms 25.5275 Ops/s 25.6975 Ops/s $\color{#d91a1a}-0.66\%$
test_collector_with_rb[200-img_shape1-large_batch] 77.9882ms 76.5950ms 13.0557 Ops/s 13.0949 Ops/s $\color{#d91a1a}-0.30\%$
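The Change column appears to be the relative difference between Ops and Ops on Repo HEAD, and the Improved/Worsened totals match the number of bolded rows. A small sketch of that arithmetic follows. The 5% threshold is an assumption inferred from which rows are bolded, not something the bot states, and the function names are hypothetical:

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this run's throughput relative to repo HEAD.

    Both arguments must be in the same unit (convert KOps/s to Ops/s first).
    """
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row; threshold_pct is an assumed significance cutoff."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) copied from rows of the CPU table, in Ops/s.
rows = {
    "test_tensor_to_bytestream_speed[pickle]": (12368.6, 12323.2),
    "test_values[vec_generalized_advantage_estimate-True-True]": (56.4365, 89.6055),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (196.6132, 56.8576),
    # 1.1216 KOps/s converted to Ops/s so both values share a unit.
    "test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]": (1121.6, 814.8509),
}

if __name__ == "__main__":
    for name, (ops, head) in rows.items():
        pct = relative_change(ops, head)
        print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Because the table shows rounded values, recomputed percentages can differ from the printed ones in the last digit.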

github-actions bot (Contributor) commented Feb 6, 2026

$\color{#D29922}\textsf{\Large ⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 172. Improved: $\large\color{#35bf28}11$. Worsened: $\large\color{#d91a1a}11$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 83.1751μs 81.9881μs 12.1969 KOps/s 12.5025 KOps/s $\color{#d91a1a}-2.44\%$
test_tensor_to_bytestream_speed[torch.save] 0.1395ms 0.1391ms 7.1871 KOps/s 7.1527 KOps/s $\color{#35bf28}+0.48\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1173s 0.1168s 8.5591 Ops/s 8.8759 Ops/s $\color{#d91a1a}-3.57\%$
test_tensor_to_bytestream_speed[numpy] 2.5547μs 2.5499μs 392.1738 KOps/s 379.7872 KOps/s $\color{#35bf28}+3.26\%$
test_tensor_to_bytestream_speed[safetensors] 39.4014μs 39.2525μs 25.4761 KOps/s 26.5559 KOps/s $\color{#d91a1a}-4.07\%$
test_simple 0.7975s 0.7970s 1.2546 Ops/s 1.2089 Ops/s $\color{#35bf28}+3.78\%$
test_transformed 1.5515s 1.4557s 0.6869 Ops/s 0.6830 Ops/s $\color{#35bf28}+0.58\%$
test_serial 2.4338s 2.3395s 0.4274 Ops/s 0.4252 Ops/s $\color{#35bf28}+0.54\%$
test_parallel 2.0986s 2.0034s 0.4991 Ops/s 0.5141 Ops/s $\color{#d91a1a}-2.90\%$
test_step_mdp_speed[True-True-True-True-True] 0.3480ms 45.9583μs 21.7588 KOps/s 21.7778 KOps/s $\color{#d91a1a}-0.09\%$
test_step_mdp_speed[True-True-True-True-False] 55.8420μs 25.4272μs 39.3279 KOps/s 38.4338 KOps/s $\color{#35bf28}+2.33\%$
test_step_mdp_speed[True-True-True-False-True] 58.0410μs 25.4922μs 39.2278 KOps/s 38.2669 KOps/s $\color{#35bf28}+2.51\%$
test_step_mdp_speed[True-True-True-False-False] 74.3220μs 13.8105μs 72.4088 KOps/s 70.6615 KOps/s $\color{#35bf28}+2.47\%$
test_step_mdp_speed[True-True-False-True-True] 89.7620μs 48.2941μs 20.7064 KOps/s 20.1013 KOps/s $\color{#35bf28}+3.01\%$
test_step_mdp_speed[True-True-False-True-False] 57.7120μs 27.9710μs 35.7514 KOps/s 35.4071 KOps/s $\color{#35bf28}+0.97\%$
test_step_mdp_speed[True-True-False-False-True] 63.6210μs 29.0046μs 34.4773 KOps/s 35.2299 KOps/s $\color{#d91a1a}-2.14\%$
test_step_mdp_speed[True-True-False-False-False] 47.8110μs 16.8290μs 59.4214 KOps/s 59.4623 KOps/s $\color{#d91a1a}-0.07\%$
test_step_mdp_speed[True-False-True-True-True] 85.1620μs 52.2881μs 19.1248 KOps/s 19.3943 KOps/s $\color{#d91a1a}-1.39\%$
test_step_mdp_speed[True-False-True-True-False] 64.3820μs 31.0998μs 32.1545 KOps/s 32.3505 KOps/s $\color{#d91a1a}-0.61\%$
test_step_mdp_speed[True-False-True-False-True] 55.5520μs 28.1849μs 35.4800 KOps/s 35.3391 KOps/s $\color{#35bf28}+0.40\%$
test_step_mdp_speed[True-False-True-False-False] 46.8910μs 16.8574μs 59.3212 KOps/s 59.4724 KOps/s $\color{#d91a1a}-0.25\%$
test_step_mdp_speed[True-False-False-True-True] 85.0920μs 53.5473μs 18.6751 KOps/s 18.8518 KOps/s $\color{#d91a1a}-0.94\%$
test_step_mdp_speed[True-False-False-True-False] 63.0210μs 33.7825μs 29.6011 KOps/s 29.7878 KOps/s $\color{#d91a1a}-0.63\%$
test_step_mdp_speed[True-False-False-False-True] 62.9010μs 30.3553μs 32.9431 KOps/s 32.3655 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[True-False-False-False-False] 52.5110μs 19.5078μs 51.2616 KOps/s 51.9111 KOps/s $\color{#d91a1a}-1.25\%$
test_step_mdp_speed[False-True-True-True-True] 86.5120μs 50.9328μs 19.6337 KOps/s 19.5927 KOps/s $\color{#35bf28}+0.21\%$
test_step_mdp_speed[False-True-True-True-False] 68.3620μs 30.9639μs 32.2956 KOps/s 31.6382 KOps/s $\color{#35bf28}+2.08\%$
test_step_mdp_speed[False-True-True-False-True] 2.3119ms 32.2242μs 31.0326 KOps/s 30.5473 KOps/s $\color{#35bf28}+1.59\%$
test_step_mdp_speed[False-True-True-False-False] 53.2510μs 18.4496μs 54.2018 KOps/s 54.0665 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[False-True-False-True-True] 96.2920μs 54.1675μs 18.4613 KOps/s 18.4917 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[False-True-False-True-False] 66.0320μs 33.3730μs 29.9643 KOps/s 29.3291 KOps/s $\color{#35bf28}+2.17\%$
test_step_mdp_speed[False-True-False-False-True] 69.3110μs 34.8296μs 28.7112 KOps/s 28.3299 KOps/s $\color{#35bf28}+1.35\%$
test_step_mdp_speed[False-True-False-False-False] 60.7110μs 20.9903μs 47.6412 KOps/s 46.4742 KOps/s $\color{#35bf28}+2.51\%$
test_step_mdp_speed[False-False-True-True-True] 89.9210μs 56.0175μs 17.8516 KOps/s 17.6112 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[False-False-True-True-False] 74.9820μs 36.4182μs 27.4588 KOps/s 27.2143 KOps/s $\color{#35bf28}+0.90\%$
test_step_mdp_speed[False-False-True-False-True] 70.2510μs 33.9083μs 29.4913 KOps/s 29.7966 KOps/s $\color{#d91a1a}-1.02\%$
test_step_mdp_speed[False-False-True-False-False] 55.8910μs 21.4501μs 46.6199 KOps/s 48.1475 KOps/s $\color{#d91a1a}-3.17\%$
test_step_mdp_speed[False-False-False-True-True] 94.6520μs 57.5956μs 17.3624 KOps/s 16.8999 KOps/s $\color{#35bf28}+2.74\%$
test_step_mdp_speed[False-False-False-True-False] 73.2910μs 38.9765μs 25.6565 KOps/s 25.3892 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-False-False-False-True] 76.9720μs 35.8242μs 27.9141 KOps/s 27.7268 KOps/s $\color{#35bf28}+0.68\%$
test_step_mdp_speed[False-False-False-False-False] 49.5010μs 23.7534μs 42.0992 KOps/s 42.4797 KOps/s $\color{#d91a1a}-0.90\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8690s 0.7735s 1.2929 Ops/s 1.2848 Ops/s $\color{#35bf28}+0.63\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7314s 0.6338s 1.5777 Ops/s 1.5471 Ops/s $\color{#35bf28}+1.98\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7702s 1.6887s 0.5922 Ops/s 0.5885 Ops/s $\color{#35bf28}+0.62\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5472s 1.4631s 0.6835 Ops/s 0.6823 Ops/s $\color{#35bf28}+0.18\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0267s 1.9436s 0.5145 Ops/s 0.5020 Ops/s $\color{#35bf28}+2.49\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7961s 1.7142s 0.5834 Ops/s 0.5636 Ops/s $\color{#35bf28}+3.50\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7497s 4.6790s 0.2137 Ops/s 0.2107 Ops/s $\color{#35bf28}+1.44\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.6557s 4.5436s 0.2201 Ops/s 0.2206 Ops/s $\color{#d91a1a}-0.25\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.0996s 1.9702s 0.5076 Ops/s 0.5151 Ops/s $\color{#d91a1a}-1.47\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7517s 1.6733s 0.5976 Ops/s 0.5936 Ops/s $\color{#35bf28}+0.67\%$
test_values[generalized_advantage_estimate-True-True] 21.9557ms 21.0456ms 47.5158 Ops/s 46.9181 Ops/s $\color{#35bf28}+1.27\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1374s 3.6795ms 271.7754 Ops/s 280.5650 Ops/s $\color{#d91a1a}-3.13\%$
test_values[td0_return_estimate-False-False] 0.1058ms 84.3929μs 11.8493 KOps/s 11.8554 KOps/s $\color{#d91a1a}-0.05\%$
test_values[td1_return_estimate-False-False] 52.0624ms 50.4657ms 19.8154 Ops/s 19.9685 Ops/s $\color{#d91a1a}-0.77\%$
test_values[vec_td1_return_estimate-False-False] 1.3660ms 1.1047ms 905.2062 Ops/s 903.8078 Ops/s $\color{#35bf28}+0.15\%$
test_values[td_lambda_return_estimate-True-False] 84.9228ms 81.9650ms 12.2003 Ops/s 12.2181 Ops/s $\color{#d91a1a}-0.15\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2828ms 1.0993ms 909.6913 Ops/s 904.9826 Ops/s $\color{#35bf28}+0.52\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 21.3981ms 21.2165ms 47.1330 Ops/s 46.3603 Ops/s $\color{#35bf28}+1.67\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0507ms 0.7762ms 1.2883 KOps/s 1.2947 KOps/s $\color{#d91a1a}-0.50\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7417ms 0.6897ms 1.4498 KOps/s 1.4268 KOps/s $\color{#35bf28}+1.61\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5481ms 1.5035ms 665.0948 Ops/s 663.2513 Ops/s $\color{#35bf28}+0.28\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7896ms 0.7164ms 1.3958 KOps/s 1.3957 KOps/s $+0.01\%$
test_dqn_speed[False-None] 1.7058ms 1.5571ms 642.2258 Ops/s 642.1458 Ops/s $\color{#35bf28}+0.01\%$
test_dqn_speed[False-backward] 2.4415ms 2.2168ms 451.1065 Ops/s 451.6635 Ops/s $\color{#d91a1a}-0.12\%$
test_dqn_speed[True-None] 0.7098ms 0.5766ms 1.7343 KOps/s 1.6665 KOps/s $\color{#35bf28}+4.06\%$
test_dqn_speed[True-backward] 1.2947ms 1.2293ms 813.5005 Ops/s 897.8145 Ops/s $\textbf{\color{#d91a1a}-9.39\%}$
test_dqn_speed[reduce-overhead-None] 0.7179ms 0.5983ms 1.6715 KOps/s 1.6319 KOps/s $\color{#35bf28}+2.43\%$
test_ddpg_speed[False-None] 3.2673ms 2.9505ms 338.9257 Ops/s 339.7196 Ops/s $\color{#d91a1a}-0.23\%$
test_ddpg_speed[False-backward] 4.9278ms 4.3561ms 229.5628 Ops/s 234.6394 Ops/s $\color{#d91a1a}-2.16\%$
test_ddpg_speed[True-None] 1.4949ms 1.3388ms 746.9357 Ops/s 738.4406 Ops/s $\color{#35bf28}+1.15\%$
test_ddpg_speed[True-backward] 2.6710ms 2.5743ms 388.4600 Ops/s 411.1340 Ops/s $\textbf{\color{#d91a1a}-5.52\%}$
test_ddpg_speed[reduce-overhead-None] 1.8058ms 1.3688ms 730.5536 Ops/s 721.7594 Ops/s $\color{#35bf28}+1.22\%$
test_sac_speed[False-None] 9.0300ms 8.4619ms 118.1771 Ops/s 118.6093 Ops/s $\color{#d91a1a}-0.36\%$
test_sac_speed[False-backward] 12.2523ms 11.7756ms 84.9215 Ops/s 86.8181 Ops/s $\color{#d91a1a}-2.18\%$
test_sac_speed[True-None] 1.9686ms 1.8440ms 542.2938 Ops/s 536.3456 Ops/s $\color{#35bf28}+1.11\%$
test_sac_speed[True-backward] 4.1027ms 3.6660ms 272.7764 Ops/s 271.9390 Ops/s $\color{#35bf28}+0.31\%$
test_sac_speed[reduce-overhead-None] 19.2669ms 10.9577ms 91.2603 Ops/s 81.8947 Ops/s $\textbf{\color{#35bf28}+11.44\%}$
test_redq_deprec_speed[False-None] 9.7159ms 9.3943ms 106.4475 Ops/s 104.7169 Ops/s $\color{#35bf28}+1.65\%$
test_redq_deprec_speed[False-backward] 13.3450ms 12.8519ms 77.8094 Ops/s 76.7391 Ops/s $\color{#35bf28}+1.39\%$
test_redq_deprec_speed[True-None] 2.7840ms 2.5931ms 385.6367 Ops/s 382.9259 Ops/s $\color{#35bf28}+0.71\%$
test_redq_deprec_speed[True-backward] 4.8200ms 4.3706ms 228.8016 Ops/s 234.0338 Ops/s $\color{#d91a1a}-2.24\%$
test_redq_deprec_speed[reduce-overhead-None] 16.4657ms 10.0263ms 99.7375 Ops/s 101.4489 Ops/s $\color{#d91a1a}-1.69\%$
test_td3_speed[False-None] 8.4070ms 8.2909ms 120.6144 Ops/s 119.7194 Ops/s $\color{#35bf28}+0.75\%$
test_td3_speed[False-backward] 11.5277ms 10.9282ms 91.5066 Ops/s 92.0930 Ops/s $\color{#d91a1a}-0.64\%$
test_td3_speed[True-None] 1.7125ms 1.6798ms 595.2948 Ops/s 593.4346 Ops/s $\color{#35bf28}+0.31\%$
test_td3_speed[True-backward] 3.7631ms 3.3302ms 300.2805 Ops/s 308.0537 Ops/s $\color{#d91a1a}-2.52\%$
test_td3_speed[reduce-overhead-None] 84.3696ms 25.0595ms 39.9050 Ops/s 40.2714 Ops/s $\color{#d91a1a}-0.91\%$
test_cql_speed[False-None] 17.7061ms 17.4716ms 57.2358 Ops/s 56.7200 Ops/s $\color{#35bf28}+0.91\%$
test_cql_speed[False-backward] 23.6294ms 23.1926ms 43.1171 Ops/s 42.7161 Ops/s $\color{#35bf28}+0.94\%$
test_cql_speed[True-None] 3.8635ms 3.3254ms 300.7171 Ops/s 296.6844 Ops/s $\color{#35bf28}+1.36\%$
test_cql_speed[True-backward] 6.0013ms 5.6076ms 178.3302 Ops/s 175.7524 Ops/s $\color{#35bf28}+1.47\%$
test_cql_speed[reduce-overhead-None] 19.3179ms 12.1493ms 82.3093 Ops/s 83.9997 Ops/s $\color{#d91a1a}-2.01\%$
test_a2c_speed[False-None] 4.0009ms 3.3010ms 302.9381 Ops/s 303.6860 Ops/s $\color{#d91a1a}-0.25\%$
test_a2c_speed[False-backward] 6.9252ms 6.5103ms 153.6026 Ops/s 159.1873 Ops/s $\color{#d91a1a}-3.51\%$
test_a2c_speed[True-None] 1.4441ms 1.3559ms 737.5195 Ops/s 727.7627 Ops/s $\color{#35bf28}+1.34\%$
test_a2c_speed[True-backward] 3.9548ms 3.1955ms 312.9448 Ops/s 313.9568 Ops/s $\color{#d91a1a}-0.32\%$
test_a2c_speed[reduce-overhead-None] 1.1033ms 0.9913ms 1.0087 KOps/s 1.0143 KOps/s $\color{#d91a1a}-0.55\%$
test_ppo_speed[False-None] 4.0958ms 3.9117ms 255.6420 Ops/s 253.7388 Ops/s $\color{#35bf28}+0.75\%$
test_ppo_speed[False-backward] 7.7732ms 7.3536ms 135.9881 Ops/s 135.4631 Ops/s $\color{#35bf28}+0.39\%$
test_ppo_speed[True-None] 1.4960ms 1.4466ms 691.2625 Ops/s 691.9732 Ops/s $\color{#d91a1a}-0.10\%$
test_ppo_speed[True-backward] 3.4518ms 3.3259ms 300.6738 Ops/s 308.0288 Ops/s $\color{#d91a1a}-2.39\%$
test_ppo_speed[reduce-overhead-None] 1.1361ms 1.0587ms 944.5150 Ops/s 933.3898 Ops/s $\color{#35bf28}+1.19\%$
test_reinforce_speed[False-None] 2.4587ms 2.3129ms 432.3582 Ops/s 427.7152 Ops/s $\color{#35bf28}+1.09\%$
test_reinforce_speed[False-backward] 3.9182ms 3.4635ms 288.7215 Ops/s 285.3868 Ops/s $\color{#35bf28}+1.17\%$
test_reinforce_speed[True-None] 1.3822ms 1.3098ms 763.4897 Ops/s 752.0773 Ops/s $\color{#35bf28}+1.52\%$
test_reinforce_speed[True-backward] 3.2149ms 3.0859ms 324.0506 Ops/s 318.1786 Ops/s $\color{#35bf28}+1.85\%$
test_reinforce_speed[reduce-overhead-None] 17.9226ms 9.6940ms 103.1571 Ops/s 105.3038 Ops/s $\color{#d91a1a}-2.04\%$
test_iql_speed[False-None] 9.9779ms 9.4929ms 105.3416 Ops/s 104.1025 Ops/s $\color{#35bf28}+1.19\%$
test_iql_speed[False-backward] 13.9968ms 13.6428ms 73.2988 Ops/s 72.8919 Ops/s $\color{#35bf28}+0.56\%$
test_iql_speed[True-None] 2.3487ms 2.2185ms 450.7554 Ops/s 441.1239 Ops/s $\color{#35bf28}+2.18\%$
test_iql_speed[True-backward] 5.0789ms 4.9334ms 202.7003 Ops/s 203.1307 Ops/s $\color{#d91a1a}-0.21\%$
test_iql_speed[reduce-overhead-None] 17.9978ms 10.6285ms 94.0865 Ops/s 95.8945 Ops/s $\color{#d91a1a}-1.89\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0939ms 5.9966ms 166.7617 Ops/s 168.5632 Ops/s $\color{#d91a1a}-1.07\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7040ms 0.3965ms 2.5223 KOps/s 3.5323 KOps/s $\textbf{\color{#d91a1a}-28.59\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6021ms 0.3819ms 2.6182 KOps/s 3.7704 KOps/s $\textbf{\color{#d91a1a}-30.56\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0745ms 5.8465ms 171.0419 Ops/s 177.2984 Ops/s $\color{#d91a1a}-3.53\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.0920ms 0.3347ms 2.9880 KOps/s 3.5478 KOps/s $\textbf{\color{#d91a1a}-15.78\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5196ms 0.3130ms 3.1952 KOps/s 3.8170 KOps/s $\textbf{\color{#d91a1a}-16.29\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.7981ms 1.3440ms 744.0284 Ops/s 775.0215 Ops/s $\color{#d91a1a}-4.00\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5755ms 1.2673ms 789.0975 Ops/s 823.3786 Ops/s $\color{#d91a1a}-4.16\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0439ms 5.9546ms 167.9383 Ops/s 171.1628 Ops/s $\color{#d91a1a}-1.88\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0906ms 0.4290ms 2.3312 KOps/s 1.8899 KOps/s $\textbf{\color{#35bf28}+23.35\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6500ms 0.4103ms 2.4370 KOps/s 2.0205 KOps/s $\textbf{\color{#35bf28}+20.61\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9879ms 5.8761ms 170.1804 Ops/s 169.3951 Ops/s $\color{#35bf28}+0.46\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.8882ms 0.3497ms 2.8595 KOps/s 2.6195 KOps/s $\textbf{\color{#35bf28}+9.16\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5627ms 0.3440ms 2.9071 KOps/s 2.7662 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0857ms 5.8374ms 171.3077 Ops/s 169.8164 Ops/s $\color{#35bf28}+0.88\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7650ms 0.3338ms 2.9955 KOps/s 3.0001 KOps/s $\color{#d91a1a}-0.16\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5769ms 0.2933ms 3.4094 KOps/s 2.9321 KOps/s $\textbf{\color{#35bf28}+16.28\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1475ms 6.0065ms 166.4859 Ops/s 167.1232 Ops/s $\color{#d91a1a}-0.38\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.5809s 1.3777ms 725.8555 Ops/s 2.0923 KOps/s $\textbf{\color{#d91a1a}-65.31\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6282ms 0.4546ms 2.1995 KOps/s 2.1823 KOps/s $\color{#35bf28}+0.79\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.6368ms 5.0957ms 196.2434 Ops/s 198.1387 Ops/s $\color{#d91a1a}-0.96\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 12.1783ms 1.9899ms 502.5403 Ops/s 459.4428 Ops/s $\textbf{\color{#35bf28}+9.38\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.3820ms 0.9239ms 1.0823 KOps/s 870.9736 Ops/s $\textbf{\color{#35bf28}+24.26\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.5102ms 4.9835ms 200.6620 Ops/s 50.3735 Ops/s $\textbf{\color{#35bf28}+298.35\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 5.0151ms 1.9232ms 519.9737 Ops/s 547.7837 Ops/s $\textbf{\color{#d91a1a}-5.08\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.0494ms 0.9277ms 1.0780 KOps/s 1.0967 KOps/s $\color{#d91a1a}-1.71\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5479s 16.2282ms 61.6213 Ops/s 183.9051 Ops/s $\textbf{\color{#d91a1a}-66.49\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 7.8625ms 2.0873ms 479.0838 Ops/s 467.2392 Ops/s $\color{#35bf28}+2.54\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.8922ms 1.0845ms 922.0459 Ops/s 817.6973 Ops/s $\textbf{\color{#35bf28}+12.76\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 37.7242ms 35.6061ms 28.0850 Ops/s 26.8581 Ops/s $\color{#35bf28}+4.57\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.8607ms 18.1719ms 55.0299 Ops/s 53.0675 Ops/s $\color{#35bf28}+3.70\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.5306ms 37.0366ms 27.0004 Ops/s 26.4432 Ops/s $\color{#35bf28}+2.11\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 20.4985ms 18.5634ms 53.8695 Ops/s 52.4751 Ops/s $\color{#35bf28}+2.66\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.7312ms 38.9374ms 25.6823 Ops/s 25.3744 Ops/s $\color{#35bf28}+1.21\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.6099ms 19.9882ms 50.0295 Ops/s 47.6967 Ops/s $\color{#35bf28}+4.89\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8498ms 0.2209ms 4.5261 KOps/s 4.4933 KOps/s $\color{#35bf28}+0.73\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.5646ms 1.4090ms 709.7357 Ops/s 736.3443 Ops/s $\color{#d91a1a}-3.61\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.4934ms 2.2997ms 434.8358 Ops/s 428.8456 Ops/s $\color{#35bf28}+1.40\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0869ms 2.9584ms 338.0226 Ops/s 339.7040 Ops/s $\color{#d91a1a}-0.49\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2524ms 0.1505ms 6.6441 KOps/s 6.5396 KOps/s $\color{#35bf28}+1.60\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3555ms 0.2181ms 4.5854 KOps/s 4.5049 KOps/s $\color{#35bf28}+1.79\%$
test_storage_write_contiguous[100-img_shape2-large_img] 1.8905ms 1.7971ms 556.4512 Ops/s 562.2575 Ops/s $\color{#d91a1a}-1.03\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.6028ms 1.3726ms 728.5586 Ops/s 731.7945 Ops/s $\color{#d91a1a}-0.44\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2697ms 1.1517ms 868.2693 Ops/s 878.1904 Ops/s $\color{#d91a1a}-1.13\%$
test_collector_stack_then_write[100-img_shape1-atari] 7.6024ms 3.7417ms 267.2567 Ops/s 276.7740 Ops/s $\color{#d91a1a}-3.44\%$
test_collector_stack_then_write[100-img_shape2-large_img] 11.3687ms 5.9083ms 169.2521 Ops/s 172.5112 Ops/s $\color{#d91a1a}-1.89\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.8023ms 7.2195ms 138.5135 Ops/s 142.0075 Ops/s $\color{#d91a1a}-2.46\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4155ms 0.2749ms 3.6382 KOps/s 3.6276 KOps/s $\color{#35bf28}+0.29\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.6336ms 1.5125ms 661.1595 Ops/s 702.4079 Ops/s $\textbf{\color{#d91a1a}-5.87\%}$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.5949ms 2.4230ms 412.7193 Ops/s 407.6597 Ops/s $\color{#35bf28}+1.24\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.3801ms 3.1694ms 315.5145 Ops/s 316.6403 Ops/s $\color{#d91a1a}-0.36\%$
test_collector_without_rb[100-img_shape0-atari] 34.5014ms 34.1781ms 29.2585 Ops/s 28.8141 Ops/s $\color{#35bf28}+1.54\%$
test_collector_without_rb[200-img_shape1-large_batch] 67.4607ms 66.9778ms 14.9303 Ops/s 14.7144 Ops/s $\color{#35bf28}+1.47\%$
test_collector_with_rb[100-img_shape0-atari] 38.9867ms 38.5331ms 25.9517 Ops/s 25.6749 Ops/s $\color{#35bf28}+1.08\%$
test_collector_with_rb[200-img_shape1-large_batch] 98.1284ms 77.4419ms 12.9129 Ops/s 7.5314 Ops/s $\textbf{\color{#35bf28}+71.45\%}$
test_collector_without_rb_cuda[100-img_shape0-atari] 59.1511ms 57.0760ms 17.5205 Ops/s 17.3864 Ops/s $\color{#35bf28}+0.77\%$
test_collector_without_rb_cuda[200-img_shape1-large_batch] 0.1153s 0.1136s 8.8058 Ops/s 8.7333 Ops/s $\color{#35bf28}+0.83\%$
test_collector_with_rb_cuda[100-img_shape0-atari] 0.7572s 0.1000s 9.9995 Ops/s 16.7741 Ops/s $\textbf{\color{#d91a1a}-40.39\%}$
test_collector_with_rb_cuda[200-img_shape1-large_batch] 0.1208s 0.1182s 8.4614 Ops/s 8.4598 Ops/s $\color{#35bf28}+0.02\%$
