[Performance] Add out= parameter to _StepMDP for output buffer reuse #3561
Open
vmoens wants to merge 1 commit into gh/vmoens/242/base from
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3561
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures
As of commit 25ab629 with merge base 0a1aea6.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This was referenced Mar 23, 2026
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 79.8752μs | 78.6439μs | 12.7155 KOps/s | 12.7280 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1419ms | 0.1410ms | 7.0928 KOps/s | 7.2902 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1141s | 0.1135s | 8.8134 Ops/s | 8.9491 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5025μs | 2.4899μs | 401.6302 KOps/s | 410.7562 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 38.7851μs | 38.5081μs | 25.9686 KOps/s | 27.6605 KOps/s | |
| test_simple | 0.5435s | 0.5400s | 1.8518 Ops/s | 1.7530 Ops/s | |
| test_transformed | 1.0840s | 1.0747s | 0.9305 Ops/s | 0.9127 Ops/s | |
| test_serial | 1.6791s | 1.6669s | 0.5999 Ops/s | 0.5937 Ops/s | |
| test_parallel | 1.1383s | 1.0361s | 0.9652 Ops/s | 0.9653 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.2459ms | 41.9468μs | 23.8397 KOps/s | 24.3065 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 54.8710μs | 22.8858μs | 43.6952 KOps/s | 44.2194 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 56.5600μs | 23.4365μs | 42.6685 KOps/s | 42.9172 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 41.4800μs | 12.7016μs | 78.7305 KOps/s | 79.7327 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 73.4210μs | 44.1110μs | 22.6701 KOps/s | 22.6301 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 54.7710μs | 24.9724μs | 40.0442 KOps/s | 39.6918 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 52.9800μs | 25.9803μs | 38.4908 KOps/s | 39.0207 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 39.9100μs | 15.2963μs | 65.3752 KOps/s | 65.3078 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 89.2810μs | 46.6604μs | 21.4315 KOps/s | 21.4295 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 58.6000μs | 27.4822μs | 36.3872 KOps/s | 35.9326 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 55.2610μs | 26.6663μs | 37.5006 KOps/s | 39.1142 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 42.9410μs | 15.1011μs | 66.2205 KOps/s | 65.7985 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 77.7110μs | 48.9897μs | 20.4124 KOps/s | 20.4399 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 60.6010μs | 29.8330μs | 33.5200 KOps/s | 33.0023 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 60.0710μs | 28.6909μs | 34.8543 KOps/s | 35.3569 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 43.7910μs | 17.7795μs | 56.2444 KOps/s | 57.6078 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 0.1007ms | 46.2003μs | 21.6449 KOps/s | 21.6965 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 61.8100μs | 27.5213μs | 36.3354 KOps/s | 36.8202 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.4307ms | 30.0212μs | 33.3098 KOps/s | 33.5609 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 49.4610μs | 16.8811μs | 59.2378 KOps/s | 59.7867 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 83.4510μs | 48.6854μs | 20.5400 KOps/s | 20.3904 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 59.2410μs | 30.0120μs | 33.3200 KOps/s | 32.8179 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 69.1910μs | 31.7131μs | 31.5328 KOps/s | 31.6589 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 58.7010μs | 19.3797μs | 51.6005 KOps/s | 51.5515 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 91.1620μs | 51.6341μs | 19.3670 KOps/s | 19.8596 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 65.9610μs | 33.2897μs | 30.0393 KOps/s | 30.9154 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 58.9300μs | 32.1524μs | 31.1019 KOps/s | 31.3379 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 57.1010μs | 19.2172μs | 52.0367 KOps/s | 51.3743 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 87.3010μs | 53.6802μs | 18.6288 KOps/s | 19.0522 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 65.5210μs | 36.0162μs | 27.7653 KOps/s | 28.7360 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 67.7110μs | 33.6136μs | 29.7499 KOps/s | 29.9209 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 52.9000μs | 21.5252μs | 46.4572 KOps/s | 46.7138 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7216s | 0.7129s | 1.4027 Ops/s | 1.3649 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.6969s | 0.5957s | 1.6787 Ops/s | 1.6607 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.6996s | 1.6204s | 0.6171 Ops/s | 0.6156 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4748s | 1.3882s | 0.7204 Ops/s | 0.7108 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9536s | 1.8621s | 0.5370 Ops/s | 0.5311 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7287s | 1.6447s | 0.6080 Ops/s | 0.6047 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6457s | 4.5403s | 0.2202 Ops/s | 0.2171 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.6478s | 4.4614s | 0.2241 Ops/s | 0.2261 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9151s | 1.8405s | 0.5433 Ops/s | 0.5315 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7311s | 1.6174s | 0.6183 Ops/s | 0.6369 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 11.0760ms | 10.1437ms | 98.5831 Ops/s | 101.0909 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 20.7675ms | 17.9777ms | 55.6244 Ops/s | 56.0261 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2053ms | 0.1396ms | 7.1650 KOps/s | 8.0350 KOps/s | |
| test_values[td1_return_estimate-False-False] | 27.1020ms | 26.7452ms | 37.3899 Ops/s | 37.5606 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 18.5596ms | 18.0316ms | 55.4582 Ops/s | 55.4986 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 40.3722ms | 39.6507ms | 25.2202 Ops/s | 24.9965 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 18.6595ms | 17.9621ms | 55.6729 Ops/s | 56.0655 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.9240ms | 8.7899ms | 113.7667 Ops/s | 114.0728 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.7036ms | 1.5068ms | 663.6583 Ops/s | 646.6056 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5188ms | 0.4148ms | 2.4107 KOps/s | 2.3335 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 38.1771ms | 35.6664ms | 28.0376 Ops/s | 28.9325 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 1.9103ms | 1.7766ms | 562.8725 Ops/s | 566.7262 Ops/s | |
| test_dqn_speed[False-None] | 1.4696ms | 1.3750ms | 727.2937 Ops/s | 720.9457 Ops/s | |
| test_dqn_speed[False-backward] | 1.9880ms | 1.8869ms | 529.9633 Ops/s | 520.7588 Ops/s | |
| test_dqn_speed[True-None] | 0.7064ms | 0.5542ms | 1.8045 KOps/s | 1.8333 KOps/s | |
| test_dqn_speed[True-backward] | 1.2144ms | 0.9930ms | 1.0070 KOps/s | 984.3845 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7841ms | 0.5311ms | 1.8828 KOps/s | 1.8531 KOps/s | |
| test_ddpg_speed[False-None] | 3.3503ms | 2.8296ms | 353.4102 Ops/s | 358.1395 Ops/s | |
| test_ddpg_speed[False-backward] | 4.3116ms | 4.0729ms | 245.5276 Ops/s | 247.4715 Ops/s | |
| test_ddpg_speed[True-None] | 2.0871ms | 1.4278ms | 700.3589 Ops/s | 697.7319 Ops/s | |
| test_ddpg_speed[True-backward] | 2.4168ms | 2.3752ms | 421.0200 Ops/s | 406.2668 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.7805ms | 1.3910ms | 718.9057 Ops/s | 703.0539 Ops/s | |
| test_sac_speed[False-None] | 8.6309ms | 7.9966ms | 125.0528 Ops/s | 125.2696 Ops/s | |
| test_sac_speed[False-backward] | 11.6983ms | 11.1156ms | 89.9640 Ops/s | 88.7671 Ops/s | |
| test_sac_speed[True-None] | 2.4560ms | 2.1372ms | 467.9076 Ops/s | 460.1716 Ops/s | |
| test_sac_speed[True-backward] | 4.0586ms | 3.9731ms | 251.6936 Ops/s | 213.5678 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.5742ms | 2.1248ms | 470.6312 Ops/s | 466.8412 Ops/s | |
| test_redq_speed[False-None] | 14.4316ms | 10.4524ms | 95.6720 Ops/s | 95.3862 Ops/s | |
| test_redq_speed[False-backward] | 22.6107ms | 17.9135ms | 55.8237 Ops/s | 55.6466 Ops/s | |
| test_redq_speed[True-None] | 4.6665ms | 4.3196ms | 231.5048 Ops/s | 220.7558 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 4.7185ms | 4.3252ms | 231.2048 Ops/s | 231.3630 Ops/s | |
| test_redq_deprec_speed[False-None] | 11.9154ms | 10.8736ms | 91.9659 Ops/s | 91.3037 Ops/s | |
| test_redq_deprec_speed[False-backward] | 16.2317ms | 15.5719ms | 64.2183 Ops/s | 63.9449 Ops/s | |
| test_redq_deprec_speed[True-None] | 3.9874ms | 3.4902ms | 286.5157 Ops/s | 268.7167 Ops/s | |
| test_redq_deprec_speed[True-backward] | 7.1781ms | 6.9817ms | 143.2317 Ops/s | 126.2311 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 3.6777ms | 3.4777ms | 287.5458 Ops/s | 271.0572 Ops/s | |
| test_td3_speed[False-None] | 8.0889ms | 7.9350ms | 126.0240 Ops/s | 123.8964 Ops/s | |
| test_td3_speed[False-backward] | 11.4356ms | 10.8401ms | 92.2499 Ops/s | 91.5463 Ops/s | |
| test_td3_speed[True-None] | 1.8745ms | 1.7799ms | 561.8336 Ops/s | 556.8255 Ops/s | |
| test_td3_speed[True-backward] | 3.7202ms | 3.5106ms | 284.8538 Ops/s | 246.1045 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.8615ms | 1.7664ms | 566.1169 Ops/s | 566.9728 Ops/s | |
| test_cql_speed[False-None] | 28.4744ms | 26.0053ms | 38.4538 Ops/s | 38.8467 Ops/s | |
| test_cql_speed[False-backward] | 35.8859ms | 35.0606ms | 28.5220 Ops/s | 28.6702 Ops/s | |
| test_cql_speed[True-None] | 12.4634ms | 12.1380ms | 82.3858 Ops/s | 82.3784 Ops/s | |
| test_cql_speed[True-backward] | 17.8846ms | 17.4900ms | 57.1756 Ops/s | 57.9686 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 12.5748ms | 12.0719ms | 82.8372 Ops/s | 81.6441 Ops/s | |
| test_a2c_speed[False-None] | 5.6357ms | 5.3664ms | 186.3436 Ops/s | 185.7596 Ops/s | |
| test_a2c_speed[False-backward] | 12.1337ms | 11.6380ms | 85.9254 Ops/s | 86.5593 Ops/s | |
| test_a2c_speed[True-None] | 3.9293ms | 3.7302ms | 268.0847 Ops/s | 266.5511 Ops/s | |
| test_a2c_speed[True-backward] | 8.6795ms | 8.4682ms | 118.0893 Ops/s | 116.7657 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.1514ms | 3.7209ms | 268.7546 Ops/s | 264.5665 Ops/s | |
| test_ppo_speed[False-None] | 6.3477ms | 5.8902ms | 169.7747 Ops/s | 165.8569 Ops/s | |
| test_ppo_speed[False-backward] | 12.8943ms | 12.4744ms | 80.1641 Ops/s | 79.7430 Ops/s | |
| test_ppo_speed[True-None] | 3.8156ms | 3.6452ms | 274.3364 Ops/s | 271.7299 Ops/s | |
| test_ppo_speed[True-backward] | 8.5729ms | 8.3817ms | 119.3076 Ops/s | 117.3006 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 4.1404ms | 3.6499ms | 273.9809 Ops/s | 270.7897 Ops/s | |
| test_reinforce_speed[False-None] | 4.7890ms | 4.5163ms | 221.4201 Ops/s | 217.9198 Ops/s | |
| test_reinforce_speed[False-backward] | 7.4699ms | 7.2801ms | 137.3614 Ops/s | 134.6479 Ops/s | |
| test_reinforce_speed[True-None] | 3.0252ms | 2.8708ms | 348.3384 Ops/s | 336.3573 Ops/s | |
| test_reinforce_speed[True-backward] | 7.8749ms | 7.6118ms | 131.3750 Ops/s | 128.7701 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.1436ms | 2.8457ms | 351.4055 Ops/s | 348.0734 Ops/s | |
| test_iql_speed[False-None] | 20.6643ms | 19.7728ms | 50.5746 Ops/s | 49.2549 Ops/s | |
| test_iql_speed[False-backward] | 30.5998ms | 29.9584ms | 33.3796 Ops/s | 32.5966 Ops/s | |
| test_iql_speed[True-None] | 8.9292ms | 8.3732ms | 119.4282 Ops/s | 118.4554 Ops/s | |
| test_iql_speed[True-backward] | 16.7099ms | 16.3787ms | 61.0549 Ops/s | 61.1542 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 8.8921ms | 8.4168ms | 118.8101 Ops/s | 117.3418 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0307ms | 5.8831ms | 169.9796 Ops/s | 170.6088 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.0493ms | 0.4098ms | 2.4403 KOps/s | 3.3689 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7738ms | 0.3894ms | 2.5680 KOps/s | 3.6271 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.8449ms | 5.6448ms | 177.1557 Ops/s | 176.0157 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.4995ms | 0.3959ms | 2.5257 KOps/s | 3.0575 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6002ms | 0.3812ms | 2.6231 KOps/s | 3.0626 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 7.6662ms | 1.5053ms | 664.3338 Ops/s | 781.1236 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.7018ms | 1.4072ms | 710.6497 Ops/s | 844.2770 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 8.9375ms | 5.8552ms | 170.7889 Ops/s | 171.3029 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.2297ms | 0.5518ms | 1.8123 KOps/s | 2.0994 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8661ms | 0.5323ms | 1.8788 KOps/s | 2.2872 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.0062ms | 5.6313ms | 177.5778 Ops/s | 175.2823 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.0989ms | 0.2811ms | 3.5580 KOps/s | 2.8748 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.6849ms | 0.2618ms | 3.8197 KOps/s | 3.0140 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 5.8003ms | 5.5862ms | 179.0127 Ops/s | 177.5743 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.8045ms | 0.2805ms | 3.5650 KOps/s | 3.1086 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4753ms | 0.2600ms | 3.8460 KOps/s | 3.3414 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 5.8873ms | 5.7360ms | 174.3365 Ops/s | 171.6097 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.3551ms | 0.4894ms | 2.0431 KOps/s | 1.9360 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7300ms | 0.5174ms | 1.9329 KOps/s | 2.1934 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 6.3784ms | 4.9375ms | 202.5299 Ops/s | 197.7105 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 4.0596ms | 1.9728ms | 506.9010 Ops/s | 468.7439 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 3.2636ms | 0.9077ms | 1.1016 KOps/s | 1.1043 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.6488s | 17.9017ms | 55.8607 Ops/s | 38.1170 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 12.7425ms | 1.9381ms | 515.9775 Ops/s | 547.7474 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 7.0825ms | 1.1855ms | 843.5564 Ops/s | 752.9170 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 6.7308ms | 5.1607ms | 193.7730 Ops/s | 192.6829 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 5.7648ms | 1.9222ms | 520.2285 Ops/s | 489.0429 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.1927ms | 1.0352ms | 966.0147 Ops/s | 939.5700 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 39.1698ms | 37.3343ms | 26.7850 Ops/s | 26.1420 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.9265ms | 17.7964ms | 56.1911 Ops/s | 55.6880 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 43.0282ms | 38.6966ms | 25.8421 Ops/s | 25.3296 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 19.4218ms | 18.0740ms | 55.3282 Ops/s | 32.3032 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 41.5559ms | 40.1089ms | 24.9321 Ops/s | 23.9116 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 20.4558ms | 19.4758ms | 51.3457 Ops/s | 50.4919 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8345ms | 0.2169ms | 4.6095 KOps/s | 4.3346 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.6879ms | 1.3760ms | 726.7412 Ops/s | 712.1419 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.7469ms | 2.3689ms | 422.1331 Ops/s | 432.7927 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.0503ms | 2.8551ms | 350.2447 Ops/s | 341.6514 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.4672ms | 0.1340ms | 7.4638 KOps/s | 7.4796 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3379ms | 0.1837ms | 5.4436 KOps/s | 5.2025 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 2.2013ms | 1.7595ms | 568.3362 Ops/s | 563.7643 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.4519ms | 1.2895ms | 775.4954 Ops/s | 768.0659 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.2596ms | 1.0939ms | 914.1980 Ops/s | 911.3042 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.6256ms | 3.4421ms | 290.5168 Ops/s | 287.3819 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 6.0010ms | 5.7701ms | 173.3084 Ops/s | 175.5667 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.3081ms | 7.0767ms | 141.3087 Ops/s | 138.4352 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4487ms | 0.2818ms | 3.5491 KOps/s | 3.4469 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.6734ms | 1.4695ms | 680.5260 Ops/s | 656.6624 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.7176ms | 2.4893ms | 401.7258 Ops/s | 409.3907 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.1941ms | 3.0536ms | 327.4871 Ops/s | 318.0906 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 33.1210ms | 32.1798ms | 31.0754 Ops/s | 30.7102 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 0.6458s | 0.1002s | 9.9752 Ops/s | 15.6426 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 37.7109ms | 37.1889ms | 26.8897 Ops/s | 26.8459 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 74.1472ms | 73.2302ms | 13.6556 Ops/s | 13.6888 Ops/s |
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 82.2061μs | 81.3948μs | 12.2858 KOps/s | 12.2276 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1506ms | 0.1487ms | 6.7248 KOps/s | 7.0173 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1188s | 0.1185s | 8.4361 Ops/s | 8.2795 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5125μs | 2.5084μs | 398.6571 KOps/s | 395.6212 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 37.1004μs | 36.8819μs | 27.1135 KOps/s | 27.0615 KOps/s | |
| test_simple | 0.8132s | 0.8002s | 1.2497 Ops/s | 1.2090 Ops/s | |
| test_transformed | 1.3998s | 1.3987s | 0.7149 Ops/s | 0.7080 Ops/s | |
| test_serial | 2.4574s | 2.3700s | 0.4219 Ops/s | 0.4192 Ops/s | |
| test_parallel | 1.9333s | 1.8792s | 0.5321 Ops/s | 0.5487 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.1688ms | 42.0165μs | 23.8002 KOps/s | 23.6964 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 57.8740μs | 22.9129μs | 43.6435 KOps/s | 43.0469 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 59.8130μs | 23.1203μs | 43.2520 KOps/s | 42.8717 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 42.1330μs | 12.7176μs | 78.6310 KOps/s | 78.0322 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 78.2850μs | 44.2376μs | 22.6052 KOps/s | 22.1760 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 58.6640μs | 25.5491μs | 39.1403 KOps/s | 38.8396 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 54.0530μs | 25.7454μs | 38.8419 KOps/s | 38.2686 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 52.8230μs | 15.3020μs | 65.3511 KOps/s | 64.3435 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 79.7550μs | 46.0832μs | 21.6999 KOps/s | 21.5996 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 58.7840μs | 28.2841μs | 35.3555 KOps/s | 35.6327 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 59.2930μs | 25.5939μs | 39.0717 KOps/s | 37.9379 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 47.0730μs | 15.2494μs | 65.5762 KOps/s | 65.2903 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 79.6450μs | 49.1063μs | 20.3640 KOps/s | 20.4559 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 63.1640μs | 30.2491μs | 33.0589 KOps/s | 32.1612 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 59.2040μs | 28.3910μs | 35.2225 KOps/s | 34.8156 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 47.3130μs | 17.6171μs | 56.7632 KOps/s | 55.8781 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 81.9240μs | 46.9618μs | 21.2939 KOps/s | 21.1186 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 63.3230μs | 27.8784μs | 35.8701 KOps/s | 35.2236 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.5122ms | 29.7716μs | 33.5891 KOps/s | 33.1040 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 48.4630μs | 17.1018μs | 58.4733 KOps/s | 58.7743 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 79.7940μs | 49.0506μs | 20.3871 KOps/s | 20.2514 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 64.1740μs | 30.6387μs | 32.6384 KOps/s | 33.1177 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 61.6640μs | 31.7842μs | 31.4622 KOps/s | 31.2170 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 60.3040μs | 19.3638μs | 51.6427 KOps/s | 52.2289 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 84.9650μs | 51.6157μs | 19.3739 KOps/s | 19.1052 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 64.5240μs | 33.6370μs | 29.7292 KOps/s | 30.4645 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 71.1540μs | 32.0997μs | 31.1529 KOps/s | 30.9651 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 49.8930μs | 19.3643μs | 51.6415 KOps/s | 51.8894 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 91.2550μs | 55.1574μs | 18.1299 KOps/s | 18.6104 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 78.6450μs | 36.0440μs | 27.7439 KOps/s | 28.2113 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.1047ms | 34.2862μs | 29.1662 KOps/s | 29.0991 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 50.2530μs | 22.0058μs | 45.4425 KOps/s | 45.6486 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8527s | 0.7514s | 1.3309 Ops/s | 1.3405 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7103s | 0.6084s | 1.6438 Ops/s | 1.6404 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7432s | 1.6553s | 0.6041 Ops/s | 0.6055 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5092s | 1.4264s | 0.7011 Ops/s | 0.7006 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9813s | 1.8968s | 0.5272 Ops/s | 0.5274 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7579s | 1.6787s | 0.5957 Ops/s | 0.5987 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.7145s | 4.5968s | 0.2175 Ops/s | 0.2159 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.4923s | 4.4425s | 0.2251 Ops/s | 0.2251 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9674s | 1.8967s | 0.5272 Ops/s | 0.5292 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7285s | 1.6273s | 0.6145 Ops/s | 0.6258 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 23.1400ms | 21.2675ms | 47.0200 Ops/s | 46.7034 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1337s | 3.6082ms | 277.1493 Ops/s | 280.2886 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1084ms | 84.6432μs | 11.8143 KOps/s | 11.8446 KOps/s | |
| test_values[td1_return_estimate-False-False] | 50.9713ms | 49.8219ms | 20.0715 Ops/s | 20.1528 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3829ms | 1.1152ms | 896.6628 Ops/s | 907.0700 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 84.1277ms | 81.9853ms | 12.1973 Ops/s | 12.3752 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3377ms | 1.1006ms | 908.6202 Ops/s | 909.8099 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 21.2771ms | 21.1250ms | 47.3373 Ops/s | 47.3050 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0496ms | 0.7731ms | 1.2935 KOps/s | 1.2938 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8390ms | 0.6916ms | 1.4460 KOps/s | 1.4441 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5354ms | 1.5044ms | 664.6974 Ops/s | 665.7213 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7852ms | 0.7081ms | 1.4123 KOps/s | 1.4130 KOps/s | |
| test_dqn_speed[False-None] | 1.7124ms | 1.6258ms | 615.0795 Ops/s | 621.5379 Ops/s | |
| test_dqn_speed[False-backward] | 2.6423ms | 2.2752ms | 439.5141 Ops/s | 439.6583 Ops/s | |
| test_dqn_speed[True-None] | 0.7681ms | 0.5925ms | 1.6879 KOps/s | 1.6765 KOps/s | |
| test_dqn_speed[True-backward] | 1.3881ms | 1.2438ms | 803.9649 Ops/s | 809.9802 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7677ms | 0.6365ms | 1.5711 KOps/s | 1.5894 KOps/s | |
| test_ddpg_speed[False-None] | 3.4718ms | 3.1772ms | 314.7428 Ops/s | 326.7278 Ops/s | |
| test_ddpg_speed[False-backward] | 5.0646ms | 4.6026ms | 217.2676 Ops/s | 222.2544 Ops/s | |
| test_ddpg_speed[True-None] | 1.6105ms | 1.4069ms | 710.8025 Ops/s | 732.1398 Ops/s | |
| test_ddpg_speed[True-backward] | 2.6384ms | 2.5527ms | 391.7399 Ops/s | 391.7254 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.6330ms | 1.4127ms | 707.8465 Ops/s | 714.1584 Ops/s | |
| test_sac_speed[False-None] | 8.9446ms | 8.5939ms | 116.3614 Ops/s | 116.0574 Ops/s | |
| test_sac_speed[False-backward] | 12.2541ms | 11.8827ms | 84.1561 Ops/s | 84.0896 Ops/s | |
| test_sac_speed[True-None] | 2.2441ms | 1.8948ms | 527.7499 Ops/s | 534.1525 Ops/s | |
| test_sac_speed[True-backward] | 3.8486ms | 3.7457ms | 266.9703 Ops/s | 271.9216 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 16.6397ms | 10.1162ms | 98.8511 Ops/s | 98.7586 Ops/s | |
| test_redq_deprec_speed[False-None] | 10.5189ms | 9.6618ms | 103.5005 Ops/s | 103.6336 Ops/s | |
| test_redq_deprec_speed[False-backward] | 13.6767ms | 13.1281ms | 76.1725 Ops/s | 76.4335 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.8770ms | 2.6321ms | 379.9308 Ops/s | 381.4294 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.7404ms | 4.3007ms | 232.5223 Ops/s | 229.2037 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 14.5449ms | 9.6486ms | 103.6421 Ops/s | 103.0209 Ops/s | |
| test_td3_speed[False-None] | 8.6628ms | 8.4954ms | 117.7105 Ops/s | 117.8534 Ops/s | |
| test_td3_speed[False-backward] | 12.0296ms | 11.1770ms | 89.4693 Ops/s | 89.7715 Ops/s | |
| test_td3_speed[True-None] | 1.7568ms | 1.7207ms | 581.1545 Ops/s | 607.5491 Ops/s | |
| test_td3_speed[True-backward] | 3.2956ms | 3.2006ms | 312.4406 Ops/s | 310.9313 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 98.6499ms | 25.9290ms | 38.5669 Ops/s | 38.1199 Ops/s | |
| test_cql_speed[False-None] | 18.2812ms | 17.9987ms | 55.5595 Ops/s | 55.6628 Ops/s | |
| test_cql_speed[False-backward] | 24.4449ms | 23.8655ms | 41.9015 Ops/s | 42.0936 Ops/s | |
| test_cql_speed[True-None] | 3.4891ms | 3.3650ms | 297.1789 Ops/s | 298.9559 Ops/s | |
| test_cql_speed[True-backward] | 5.8022ms | 5.6194ms | 177.9535 Ops/s | 175.4869 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 17.9892ms | 11.9073ms | 83.9819 Ops/s | 83.1174 Ops/s | |
| test_a2c_speed[False-None] | 3.5866ms | 3.4054ms | 293.6556 Ops/s | 291.2374 Ops/s | |
| test_a2c_speed[False-backward] | 7.3889ms | 6.6500ms | 150.3764 Ops/s | 149.5127 Ops/s | |
| test_a2c_speed[True-None] | 1.5435ms | 1.3825ms | 723.3134 Ops/s | 697.8725 Ops/s | |
| test_a2c_speed[True-backward] | 3.2817ms | 3.2303ms | 309.5673 Ops/s | 321.9697 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.2047ms | 1.0543ms | 948.5247 Ops/s | 944.4109 Ops/s | |
| test_ppo_speed[False-None] | 4.1914ms | 4.0769ms | 245.2830 Ops/s | 237.7262 Ops/s | |
| test_ppo_speed[False-backward] | 7.9712ms | 7.5877ms | 131.7921 Ops/s | 135.3200 Ops/s | |
| test_ppo_speed[True-None] | 1.6346ms | 1.5312ms | 653.0878 Ops/s | 653.8251 Ops/s | |
| test_ppo_speed[True-backward] | 3.4717ms | 3.4261ms | 291.8779 Ops/s | 310.5376 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.1779ms | 1.1029ms | 906.7205 Ops/s | 900.8899 Ops/s | |
| test_reinforce_speed[False-None] | 3.2088ms | 2.4615ms | 406.2490 Ops/s | 410.2081 Ops/s | |
| test_reinforce_speed[False-backward] | 3.7660ms | 3.6105ms | 276.9692 Ops/s | 276.4654 Ops/s | |
| test_reinforce_speed[True-None] | 1.5068ms | 1.3885ms | 720.2065 Ops/s | 735.6074 Ops/s | |
| test_reinforce_speed[True-backward] | 3.3597ms | 3.2113ms | 311.4044 Ops/s | 323.8252 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 15.8549ms | 8.8956ms | 112.4156 Ops/s | 112.1263 Ops/s | |
| test_iql_speed[False-None] | 10.4712ms | 9.8516ms | 101.5060 Ops/s | 101.5894 Ops/s | |
| test_iql_speed[False-backward] | 14.7354ms | 13.9830ms | 71.5155 Ops/s | 72.9793 Ops/s | |
| test_iql_speed[True-None] | 2.4591ms | 2.2963ms | 435.4738 Ops/s | 434.0339 Ops/s | |
| test_iql_speed[True-backward] | 5.4792ms | 5.0681ms | 197.3122 Ops/s | 204.2989 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 16.1719ms | 10.0347ms | 99.6542 Ops/s | 99.2410 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.3358ms | 5.9394ms | 168.3662 Ops/s | 167.4721 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7269ms | 0.3535ms | 2.8289 KOps/s | 2.7398 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5595ms | 0.3095ms | 3.2313 KOps/s | 2.8734 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1228ms | 5.7385ms | 174.2604 Ops/s | 172.4803 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9937ms | 0.3508ms | 2.8508 KOps/s | 3.2107 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6096ms | 0.3015ms | 3.3166 KOps/s | 3.1084 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5588ms | 1.3294ms | 752.2010 Ops/s | 779.6286 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.5536ms | 1.2408ms | 805.9571 Ops/s | 831.3161 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 9.2056ms | 6.0293ms | 165.8563 Ops/s | 166.6352 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.2495ms | 0.4422ms | 2.2613 KOps/s | 2.2782 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6489ms | 0.4258ms | 2.3483 KOps/s | 2.3466 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9107ms | 5.8232ms | 171.7282 Ops/s | 171.9553 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.9083ms | 0.2903ms | 3.4443 KOps/s | 2.9381 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.4747ms | 0.2762ms | 3.6210 KOps/s | 2.6644 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0220ms | 5.7361ms | 174.3338 Ops/s | 173.8428 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.6716ms | 0.3211ms | 3.1144 KOps/s | 3.3974 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5389ms | 0.3142ms | 3.1827 KOps/s | 3.3446 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 8.9060ms | 5.9217ms | 168.8695 Ops/s | 166.0937 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.7794ms | 0.4790ms | 2.0875 KOps/s | 2.2445 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8650ms | 0.4831ms | 2.0700 KOps/s | 2.3560 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.9478s | 23.9421ms | 41.7674 Ops/s | 194.8081 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 3.9898ms | 1.9329ms | 517.3542 Ops/s | 552.7010 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 7.3539ms | 1.3064ms | 765.4332 Ops/s | 805.3738 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.5465ms | 5.0412ms | 198.3665 Ops/s | 160.5633 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.9159ms | 1.8486ms | 540.9478 Ops/s | 477.7089 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.3134ms | 1.2703ms | 787.2191 Ops/s | 704.6573 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.6584s | 18.3065ms | 54.6253 Ops/s | 185.4539 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 6.2367ms | 2.1421ms | 466.8310 Ops/s | 467.6496 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 2.3919ms | 1.1848ms | 844.0400 Ops/s | 55.4431 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 41.0489ms | 38.9263ms | 25.6896 Ops/s | 25.7023 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.7881ms | 18.1774ms | 55.0135 Ops/s | 54.4410 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 44.5686ms | 40.1481ms | 24.9078 Ops/s | 24.3933 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.9574ms | 18.7929ms | 53.2117 Ops/s | 53.2255 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 44.5462ms | 42.4405ms | 23.5624 Ops/s | 23.6306 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.7735ms | 20.1644ms | 49.5923 Ops/s | 49.3862 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8743ms | 0.2301ms | 4.3461 KOps/s | 4.5899 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7055ms | 1.3671ms | 731.4710 Ops/s | 737.7719 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.7545ms | 2.3102ms | 432.8572 Ops/s | 432.4771 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.1109ms | 2.9199ms | 342.4726 Ops/s | 347.3819 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.5358ms | 0.1701ms | 5.8805 KOps/s | 6.0187 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3931ms | 0.2329ms | 4.2930 KOps/s | 4.2681 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.9500ms | 1.7779ms | 562.4643 Ops/s | 543.3703 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5857ms | 1.3812ms | 723.9959 Ops/s | 717.5430 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.3175ms | 1.1548ms | 865.9871 Ops/s | 870.0505 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.8167ms | 3.6197ms | 276.2695 Ops/s | 278.8514 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 5.9553ms | 5.6735ms | 176.2594 Ops/s | 172.6535 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.1259ms | 6.9307ms | 144.2848 Ops/s | 141.5608 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4404ms | 0.2777ms | 3.6009 KOps/s | 3.6454 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.7215ms | 1.5561ms | 642.6174 Ops/s | 656.0202 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.7135ms | 2.4281ms | 411.8408 Ops/s | 409.3859 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.3980ms | 3.1191ms | 320.6074 Ops/s | 323.8943 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.4576ms | 33.4915ms | 29.8584 Ops/s | 30.2327 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 66.7240ms | 65.6280ms | 15.2374 Ops/s | 15.3268 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 38.8018ms | 37.8771ms | 26.4012 Ops/s | 26.6543 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 76.0888ms | 75.1861ms | 13.3003 Ops/s | 13.4810 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 58.2142ms | 57.2455ms | 17.4686 Ops/s | 17.7656 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1155s | 0.1126s | 8.8824 Ops/s | 8.9986 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 59.8349ms | 58.2246ms | 17.1749 Ops/s | 17.1434 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.1195s | 0.1161s | 8.6098 Ops/s | 8.7055 Ops/s |
Stack from ghstack (oldest at bottom):
`_StepMDP.__call__` now accepts an optional `out` parameter. When provided,
the output TensorDict is reused instead of a new one being allocated on each call.
This lets callers (collectors, rollout loops) pre-allocate a buffer
and avoid per-step TensorDict creation overhead.
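
The buffer-reuse pattern can be sketched with plain Python dicts standing in for TensorDict. The function `step_mdp_sketch` and its key layout are hypothetical and only illustrate the idea; this is not torchrl's `_StepMDP` implementation.

```python
from typing import Any, Dict, Optional


def step_mdp_sketch(
    data: Dict[str, Any], out: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Promote the entries under "next" to the root of the output.

    Hypothetical stand-in for a step function: when `out` is given it is
    cleared and refilled in place, otherwise a new dict is allocated.
    """
    if out is None:
        out = {}  # allocation path: one new container per call
    else:
        out.clear()  # reuse path: drop stale entries, keep the same object
    out.update(data.get("next", {}))
    return out


# Pre-allocate once, then reuse the same buffer across steps.
buffer: Dict[str, Any] = {}
for t in range(3):
    transition = {"observation": t, "next": {"observation": t + 1, "done": False}}
    result = step_mdp_sketch(transition, out=buffer)
    assert result is buffer  # no new container was created
```

One consequence of this pattern is that each call overwrites the previous result, so a caller that needs to keep an earlier step must copy it before the next call.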
Also fixes the `_exclude` return type annotation and ensures it returns the
caller-provided `out` buffer even when no new keys are set.
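
A hypothetical, dict-based analogue of the `_exclude` behaviour described here: the name `exclude_sketch` and its signature are assumptions for illustration, not torchrl's API. The point is that the caller's `out` comes back unchanged in identity even when every key is excluded and nothing is copied.

```python
from typing import Any, Dict, Iterable, Optional


def exclude_sketch(
    data: Dict[str, Any],
    exclude_keys: Iterable[str],
    out: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Copy every entry of `data` whose key is not excluded into `out`.

    Always returns a dict: the caller's `out` if one was passed (even if
    no key was copied), otherwise a freshly allocated dict.
    """
    excluded = set(exclude_keys)
    if out is None:
        out = {}
    for key, value in data.items():
        if key not in excluded:
            out[key] = value
    return out


# Every key is excluded, yet the same buffer object is returned.
buf: Dict[str, Any] = {}
assert exclude_sketch({"reward": 1.0}, ["reward"], out=buf) is buf
```

Returning the buffer unconditionally keeps the return type a single, non-optional container, so callers do not need a separate branch for the case where nothing was set.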
Made-with: Cursor