
[Test] Add tests and benchmarks for collector throughput optimizations#3567

Open
vmoens wants to merge 1 commit into gh/vmoens/248/base from gh/vmoens/248/head

[ghstack-poisoned]
pytorch-bot bot commented Mar 24, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3567

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 1 Cancelled Job, 1 Unrelated Failure

As of commit 76204bb with merge base 0a1aea6:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Mar 24, 2026
Cover all 7 performance features: _skip_maybe_reset, _StepMDP out= reuse,
_trust_step_output, update_traj_ids, combined optimization flags,
torch.compile fullgraph, and fast-path benchmarks.

Made-with: Cursor
ghstack-source-id: ad18afe
Pull-Request: #3567
meta-cla bot added the CLA Signed label Mar 24, 2026
github-actions bot added the Tests, Benchmarks, and Collectors labels Mar 24, 2026
@github-actions
Contributor

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 174. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_tensor_to_bytestream_speed[pickle] 85.4458μs 84.6226μs 11.8172 KOps/s 12.4629 KOps/s $\textbf{\color{#d91a1a}-5.18\%}$
test_tensor_to_bytestream_speed[torch.save] 0.1429ms 0.1390ms 7.1944 KOps/s 7.1928 KOps/s $\color{#35bf28}+0.02\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1085s 0.1082s 9.2417 Ops/s 9.1717 Ops/s $\color{#35bf28}+0.76\%$
test_tensor_to_bytestream_speed[numpy] 2.5199μs 2.5129μs 397.9404 KOps/s 400.4593 KOps/s $\color{#d91a1a}-0.63\%$
test_tensor_to_bytestream_speed[safetensors] 36.9397μs 36.7779μs 27.1902 KOps/s 27.5039 KOps/s $\color{#d91a1a}-1.14\%$
test_simple 0.5466s 0.5462s 1.8307 Ops/s 1.7588 Ops/s $\color{#35bf28}+4.08\%$
test_transformed 1.0906s 1.0896s 0.9178 Ops/s 0.9045 Ops/s $\color{#35bf28}+1.47\%$
test_serial 1.6907s 1.6853s 0.5934 Ops/s 0.5908 Ops/s $\color{#35bf28}+0.43\%$
test_parallel 1.1523s 1.0509s 0.9515 Ops/s 0.9560 Ops/s $\color{#d91a1a}-0.47\%$
test_step_mdp_speed[True-True-True-True-True] 0.3355ms 41.8766μs 23.8797 KOps/s 24.4475 KOps/s $\color{#d91a1a}-2.32\%$
test_step_mdp_speed[True-True-True-True-False] 54.7710μs 22.9791μs 43.5179 KOps/s 41.6641 KOps/s $\color{#35bf28}+4.45\%$
test_step_mdp_speed[True-True-True-False-True] 53.2110μs 23.4881μs 42.5747 KOps/s 40.3036 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_step_mdp_speed[True-True-True-False-False] 40.1310μs 12.9588μs 77.1679 KOps/s 74.9761 KOps/s $\color{#35bf28}+2.92\%$
test_step_mdp_speed[True-True-False-True-True] 72.6720μs 44.5231μs 22.4602 KOps/s 22.8842 KOps/s $\color{#d91a1a}-1.85\%$
test_step_mdp_speed[True-True-False-True-False] 62.6610μs 25.8194μs 38.7306 KOps/s 38.8477 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[True-True-False-False-True] 59.9210μs 25.6964μs 38.9160 KOps/s 38.0020 KOps/s $\color{#35bf28}+2.41\%$
test_step_mdp_speed[True-True-False-False-False] 44.8210μs 15.4645μs 64.6643 KOps/s 63.8152 KOps/s $\color{#35bf28}+1.33\%$
test_step_mdp_speed[True-False-True-True-True] 80.8820μs 47.4237μs 21.0865 KOps/s 20.9319 KOps/s $\color{#35bf28}+0.74\%$
test_step_mdp_speed[True-False-True-True-False] 58.0610μs 28.0803μs 35.6121 KOps/s 34.7343 KOps/s $\color{#35bf28}+2.53\%$
test_step_mdp_speed[True-False-True-False-True] 97.5820μs 25.7231μs 38.8755 KOps/s 38.0527 KOps/s $\color{#35bf28}+2.16\%$
test_step_mdp_speed[True-False-True-False-False] 39.6010μs 15.3688μs 65.0668 KOps/s 64.3738 KOps/s $\color{#35bf28}+1.08\%$
test_step_mdp_speed[True-False-False-True-True] 81.1410μs 49.2425μs 20.3077 KOps/s 20.2822 KOps/s $\color{#35bf28}+0.13\%$
test_step_mdp_speed[True-False-False-True-False] 56.5610μs 30.7716μs 32.4975 KOps/s 32.1794 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[True-False-False-False-True] 69.5410μs 28.5347μs 35.0450 KOps/s 35.1457 KOps/s $\color{#d91a1a}-0.29\%$
test_step_mdp_speed[True-False-False-False-False] 49.8610μs 17.9839μs 55.6053 KOps/s 55.3756 KOps/s $\color{#35bf28}+0.41\%$
test_step_mdp_speed[False-True-True-True-True] 93.2920μs 47.2551μs 21.1617 KOps/s 21.2160 KOps/s $\color{#d91a1a}-0.26\%$
test_step_mdp_speed[False-True-True-True-False] 63.5610μs 28.2439μs 35.4059 KOps/s 35.6734 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[False-True-True-False-True] 2.3641ms 30.0144μs 33.3173 KOps/s 32.8605 KOps/s $\color{#35bf28}+1.39\%$
test_step_mdp_speed[False-True-True-False-False] 50.3310μs 17.4123μs 57.4305 KOps/s 57.8065 KOps/s $\color{#d91a1a}-0.65\%$
test_step_mdp_speed[False-True-False-True-True] 94.4110μs 50.1069μs 19.9573 KOps/s 20.2587 KOps/s $\color{#d91a1a}-1.49\%$
test_step_mdp_speed[False-True-False-True-False] 70.6810μs 30.2772μs 33.0281 KOps/s 32.3268 KOps/s $\color{#35bf28}+2.17\%$
test_step_mdp_speed[False-True-False-False-True] 77.0710μs 32.1890μs 31.0665 KOps/s 30.3766 KOps/s $\color{#35bf28}+2.27\%$
test_step_mdp_speed[False-True-False-False-False] 63.4910μs 19.4433μs 51.4316 KOps/s 51.0870 KOps/s $\color{#35bf28}+0.67\%$
test_step_mdp_speed[False-False-True-True-True] 0.1087ms 52.3546μs 19.1005 KOps/s 19.0490 KOps/s $\color{#35bf28}+0.27\%$
test_step_mdp_speed[False-False-True-True-False] 61.5610μs 33.4234μs 29.9192 KOps/s 30.1300 KOps/s $\color{#d91a1a}-0.70\%$
test_step_mdp_speed[False-False-True-False-True] 73.4610μs 32.1296μs 31.1239 KOps/s 31.1990 KOps/s $\color{#d91a1a}-0.24\%$
test_step_mdp_speed[False-False-True-False-False] 42.2310μs 19.5406μs 51.1756 KOps/s 50.9184 KOps/s $\color{#35bf28}+0.51\%$
test_step_mdp_speed[False-False-False-True-True] 99.7010μs 53.6243μs 18.6483 KOps/s 18.7307 KOps/s $\color{#d91a1a}-0.44\%$
test_step_mdp_speed[False-False-False-True-False] 79.7710μs 35.7719μs 27.9549 KOps/s 28.2455 KOps/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[False-False-False-False-True] 68.2410μs 34.3463μs 29.1152 KOps/s 29.2241 KOps/s $\color{#d91a1a}-0.37\%$
test_step_mdp_speed[False-False-False-False-False] 52.4510μs 22.0262μs 45.4004 KOps/s 44.7864 KOps/s $\color{#35bf28}+1.37\%$
test_step_and_maybe_reset_fast_path 90.3506ms 87.0068ms 11.4933 Ops/s 11.7721 Ops/s $\color{#d91a1a}-2.37\%$
test_step_and_maybe_reset_normal 0.1060s 0.1041s 9.6051 Ops/s 9.4427 Ops/s $\color{#35bf28}+1.72\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8499s 0.7480s 1.3369 Ops/s 1.3443 Ops/s $\color{#d91a1a}-0.55\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7134s 0.6101s 1.6390 Ops/s 1.6504 Ops/s $\color{#d91a1a}-0.69\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7388s 1.6554s 0.6041 Ops/s 0.6044 Ops/s $\color{#d91a1a}-0.06\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5009s 1.4226s 0.7029 Ops/s 0.7012 Ops/s $\color{#35bf28}+0.25\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9891s 1.8964s 0.5273 Ops/s 0.5279 Ops/s $\color{#d91a1a}-0.12\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7562s 1.6696s 0.5989 Ops/s 0.5978 Ops/s $\color{#35bf28}+0.19\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.6170s 4.5536s 0.2196 Ops/s 0.2166 Ops/s $\color{#35bf28}+1.37\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5068s 4.3789s 0.2284 Ops/s 0.2273 Ops/s $\color{#35bf28}+0.46\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 1.9762s 1.8835s 0.5309 Ops/s 0.5312 Ops/s $\color{#d91a1a}-0.06\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7041s 1.5910s 0.6285 Ops/s 0.6094 Ops/s $\color{#35bf28}+3.13\%$
test_values[generalized_advantage_estimate-True-True] 10.1376ms 10.0144ms 99.8565 Ops/s 99.8911 Ops/s $\color{#d91a1a}-0.03\%$
test_values[vec_generalized_advantage_estimate-True-True] 20.4689ms 17.9696ms 55.6496 Ops/s 56.1664 Ops/s $\color{#d91a1a}-0.92\%$
test_values[td0_return_estimate-False-False] 0.2345ms 0.1359ms 7.3565 KOps/s 7.7471 KOps/s $\textbf{\color{#d91a1a}-5.04\%}$
test_values[td1_return_estimate-False-False] 27.2077ms 26.8886ms 37.1905 Ops/s 36.7702 Ops/s $\color{#35bf28}+1.14\%$
test_values[vec_td1_return_estimate-False-False] 18.1433ms 17.6528ms 56.6483 Ops/s 56.1697 Ops/s $\color{#35bf28}+0.85\%$
test_values[td_lambda_return_estimate-True-False] 40.7172ms 40.1767ms 24.8900 Ops/s 24.8191 Ops/s $\color{#35bf28}+0.29\%$
test_values[vec_td_lambda_return_estimate-True-False] 18.7164ms 17.2132ms 58.0949 Ops/s 56.1840 Ops/s $\color{#35bf28}+3.40\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.9427ms 8.8523ms 112.9646 Ops/s 113.3599 Ops/s $\color{#d91a1a}-0.35\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.9597ms 1.5146ms 660.2543 Ops/s 679.1340 Ops/s $\color{#d91a1a}-2.78\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.5697ms 0.4204ms 2.3785 KOps/s 2.4187 KOps/s $\color{#d91a1a}-1.66\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 35.0154ms 30.4848ms 32.8033 Ops/s 29.0124 Ops/s $\textbf{\color{#35bf28}+13.07\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 1.9198ms 1.7499ms 571.4736 Ops/s 571.6817 Ops/s $\color{#d91a1a}-0.04\%$
test_dqn_speed[False-None] 1.5918ms 1.3912ms 718.8107 Ops/s 710.5712 Ops/s $\color{#35bf28}+1.16\%$
test_dqn_speed[False-backward] 2.0914ms 1.9172ms 521.5897 Ops/s 521.5074 Ops/s $\color{#35bf28}+0.02\%$
test_dqn_speed[True-None] 1.0883ms 0.5650ms 1.7700 KOps/s 1.7274 KOps/s $\color{#35bf28}+2.47\%$
test_dqn_speed[True-backward] 1.0850ms 1.0099ms 990.1968 Ops/s 842.3083 Ops/s $\textbf{\color{#35bf28}+17.56\%}$
test_dqn_speed[reduce-overhead-None] 0.6868ms 0.5458ms 1.8322 KOps/s 1.8436 KOps/s $\color{#d91a1a}-0.62\%$
test_ddpg_speed[False-None] 3.3380ms 2.8807ms 347.1436 Ops/s 354.6050 Ops/s $\color{#d91a1a}-2.10\%$
test_ddpg_speed[False-backward] 4.2283ms 4.1374ms 241.6993 Ops/s 245.7003 Ops/s $\color{#d91a1a}-1.63\%$
test_ddpg_speed[True-None] 1.8172ms 1.4193ms 704.5668 Ops/s 686.9548 Ops/s $\color{#35bf28}+2.56\%$
test_ddpg_speed[True-backward] 2.4355ms 2.3978ms 417.0476 Ops/s 360.6874 Ops/s $\textbf{\color{#35bf28}+15.63\%}$
test_ddpg_speed[reduce-overhead-None] 1.8266ms 1.4323ms 698.1712 Ops/s 707.8506 Ops/s $\color{#d91a1a}-1.37\%$
test_sac_speed[False-None] 9.0249ms 8.2924ms 120.5918 Ops/s 124.0347 Ops/s $\color{#d91a1a}-2.78\%$
test_sac_speed[False-backward] 12.0124ms 11.3466ms 88.1323 Ops/s 87.3074 Ops/s $\color{#35bf28}+0.94\%$
test_sac_speed[True-None] 2.2200ms 2.1376ms 467.8121 Ops/s 452.6544 Ops/s $\color{#35bf28}+3.35\%$
test_sac_speed[True-backward] 4.1039ms 3.9827ms 251.0886 Ops/s 241.5264 Ops/s $\color{#35bf28}+3.96\%$
test_sac_speed[reduce-overhead-None] 2.2780ms 2.1400ms 467.2810 Ops/s 456.5108 Ops/s $\color{#35bf28}+2.36\%$
test_redq_speed[False-None] 14.6602ms 10.4782ms 95.4365 Ops/s 91.9082 Ops/s $\color{#35bf28}+3.84\%$
test_redq_speed[False-backward] 18.7401ms 17.6074ms 56.7943 Ops/s 55.2552 Ops/s $\color{#35bf28}+2.79\%$
test_redq_speed[True-None] 4.8178ms 4.3559ms 229.5727 Ops/s 226.6943 Ops/s $\color{#35bf28}+1.27\%$
test_redq_speed[reduce-overhead-None] 4.5992ms 4.3747ms 228.5876 Ops/s 230.5407 Ops/s $\color{#d91a1a}-0.85\%$
test_redq_deprec_speed[False-None] 12.4420ms 11.2841ms 88.6202 Ops/s 90.1072 Ops/s $\color{#d91a1a}-1.65\%$
test_redq_deprec_speed[False-backward] 16.5836ms 16.0586ms 62.2719 Ops/s 62.7733 Ops/s $\color{#d91a1a}-0.80\%$
test_redq_deprec_speed[True-None] 4.8796ms 3.6015ms 277.6587 Ops/s 272.6644 Ops/s $\color{#35bf28}+1.83\%$
test_redq_deprec_speed[True-backward] 7.7534ms 7.3666ms 135.7486 Ops/s 137.2742 Ops/s $\color{#d91a1a}-1.11\%$
test_redq_deprec_speed[reduce-overhead-None] 3.9006ms 3.5480ms 281.8481 Ops/s 273.3360 Ops/s $\color{#35bf28}+3.11\%$
test_td3_speed[False-None] 8.8466ms 8.1440ms 122.7905 Ops/s 121.4726 Ops/s $\color{#35bf28}+1.08\%$
test_td3_speed[False-backward] 11.4604ms 10.9280ms 91.5078 Ops/s 91.0203 Ops/s $\color{#35bf28}+0.54\%$
test_td3_speed[True-None] 2.3857ms 1.8293ms 546.6605 Ops/s 552.8167 Ops/s $\color{#d91a1a}-1.11\%$
test_td3_speed[True-backward] 3.6722ms 3.4946ms 286.1518 Ops/s 243.8840 Ops/s $\textbf{\color{#35bf28}+17.33\%}$
test_td3_speed[reduce-overhead-None] 1.9296ms 1.7615ms 567.6895 Ops/s 556.5280 Ops/s $\color{#35bf28}+2.01\%$
test_cql_speed[False-None] 28.8858ms 26.1467ms 38.2457 Ops/s 38.8626 Ops/s $\color{#d91a1a}-1.59\%$
test_cql_speed[False-backward] 36.3764ms 35.3119ms 28.3190 Ops/s 28.3760 Ops/s $\color{#d91a1a}-0.20\%$
test_cql_speed[True-None] 15.1851ms 12.1669ms 82.1900 Ops/s 76.6685 Ops/s $\textbf{\color{#35bf28}+7.20\%}$
test_cql_speed[True-backward] 17.8709ms 17.5102ms 57.1097 Ops/s 56.2299 Ops/s $\color{#35bf28}+1.56\%$
test_cql_speed[reduce-overhead-None] 12.4932ms 12.1719ms 82.1563 Ops/s 79.0016 Ops/s $\color{#35bf28}+3.99\%$
test_a2c_speed[False-None] 5.7300ms 5.4456ms 183.6336 Ops/s 185.8264 Ops/s $\color{#d91a1a}-1.18\%$
test_a2c_speed[False-backward] 12.2691ms 11.9277ms 83.8386 Ops/s 84.7093 Ops/s $\color{#d91a1a}-1.03\%$
test_a2c_speed[True-None] 4.1122ms 3.7571ms 266.1659 Ops/s 266.4604 Ops/s $\color{#d91a1a}-0.11\%$
test_a2c_speed[True-backward] 8.8428ms 8.4194ms 118.7729 Ops/s 117.9193 Ops/s $\color{#35bf28}+0.72\%$
test_a2c_speed[reduce-overhead-None] 4.1040ms 3.7127ms 269.3466 Ops/s 266.3973 Ops/s $\color{#35bf28}+1.11\%$
test_ppo_speed[False-None] 6.0821ms 5.8167ms 171.9198 Ops/s 168.3921 Ops/s $\color{#35bf28}+2.09\%$
test_ppo_speed[False-backward] 12.8230ms 12.4913ms 80.0557 Ops/s 79.7680 Ops/s $\color{#35bf28}+0.36\%$
test_ppo_speed[True-None] 3.8580ms 3.6624ms 273.0432 Ops/s 260.7812 Ops/s $\color{#35bf28}+4.70\%$
test_ppo_speed[True-backward] 8.8544ms 8.4126ms 118.8689 Ops/s 104.5791 Ops/s $\textbf{\color{#35bf28}+13.66\%}$
test_ppo_speed[reduce-overhead-None] 4.0582ms 3.6225ms 276.0560 Ops/s 271.1794 Ops/s $\color{#35bf28}+1.80\%$
test_reinforce_speed[False-None] 4.7779ms 4.5460ms 219.9732 Ops/s 219.5129 Ops/s $\color{#35bf28}+0.21\%$
test_reinforce_speed[False-backward] 7.7307ms 7.3925ms 135.2716 Ops/s 135.3105 Ops/s $\color{#d91a1a}-0.03\%$
test_reinforce_speed[True-None] 3.3688ms 2.8928ms 345.6856 Ops/s 344.4812 Ops/s $\color{#35bf28}+0.35\%$
test_reinforce_speed[True-backward] 7.9289ms 7.6690ms 130.3948 Ops/s 127.8120 Ops/s $\color{#35bf28}+2.02\%$
test_reinforce_speed[reduce-overhead-None] 3.2963ms 2.8607ms 349.5620 Ops/s 335.5519 Ops/s $\color{#35bf28}+4.18\%$
test_iql_speed[False-None] 20.7046ms 19.8728ms 50.3201 Ops/s 49.2242 Ops/s $\color{#35bf28}+2.23\%$
test_iql_speed[False-backward] 30.9365ms 30.3411ms 32.9586 Ops/s 32.6428 Ops/s $\color{#35bf28}+0.97\%$
test_iql_speed[True-None] 9.0462ms 8.3961ms 119.1030 Ops/s 117.5045 Ops/s $\color{#35bf28}+1.36\%$
test_iql_speed[True-backward] 16.7600ms 16.2996ms 61.3513 Ops/s 60.9601 Ops/s $\color{#35bf28}+0.64\%$
test_iql_speed[reduce-overhead-None] 8.6970ms 8.3996ms 119.0526 Ops/s 114.2105 Ops/s $\color{#35bf28}+4.24\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2101ms 6.0651ms 164.8779 Ops/s 165.8822 Ops/s $\color{#d91a1a}-0.61\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.9932ms 0.2863ms 3.4924 KOps/s 2.7749 KOps/s $\textbf{\color{#35bf28}+25.86\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7207ms 0.2687ms 3.7220 KOps/s 2.8716 KOps/s $\textbf{\color{#35bf28}+29.62\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1060ms 5.8014ms 172.3712 Ops/s 171.5082 Ops/s $\color{#35bf28}+0.50\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8862ms 0.2798ms 3.5739 KOps/s 3.5439 KOps/s $\color{#35bf28}+0.84\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5685ms 0.2633ms 3.7980 KOps/s 3.7950 KOps/s $\color{#35bf28}+0.08\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5414ms 1.2790ms 781.8355 Ops/s 782.5232 Ops/s $\color{#d91a1a}-0.09\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.6689ms 1.2042ms 830.4465 Ops/s 843.6035 Ops/s $\color{#d91a1a}-1.56\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 10.2010ms 6.0667ms 164.8352 Ops/s 167.2155 Ops/s $\color{#d91a1a}-1.42\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0660ms 0.4439ms 2.2526 KOps/s 2.2998 KOps/s $\color{#d91a1a}-2.05\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6605ms 0.4144ms 2.4134 KOps/s 2.4059 KOps/s $\color{#35bf28}+0.31\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0887ms 5.8388ms 171.2675 Ops/s 170.6547 Ops/s $\color{#35bf28}+0.36\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.0477ms 0.3324ms 3.0083 KOps/s 3.4435 KOps/s $\textbf{\color{#d91a1a}-12.64\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6562ms 0.3416ms 2.9275 KOps/s 3.6891 KOps/s $\textbf{\color{#d91a1a}-20.64\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1111ms 5.7314ms 174.4760 Ops/s 172.6567 Ops/s $\color{#35bf28}+1.05\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.9470ms 0.3460ms 2.8904 KOps/s 2.6401 KOps/s $\textbf{\color{#35bf28}+9.48\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5417ms 0.3379ms 2.9593 KOps/s 3.0757 KOps/s $\color{#d91a1a}-3.78\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1417ms 5.9411ms 168.3181 Ops/s 166.7861 Ops/s $\color{#35bf28}+0.92\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7916ms 0.4934ms 2.0266 KOps/s 2.2952 KOps/s $\textbf{\color{#d91a1a}-11.70\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6690ms 0.4750ms 2.1055 KOps/s 2.2019 KOps/s $\color{#d91a1a}-4.38\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.6427ms 5.0808ms 196.8188 Ops/s 193.8176 Ops/s $\color{#35bf28}+1.55\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.9816ms 1.9691ms 507.8506 Ops/s 491.2153 Ops/s $\color{#35bf28}+3.39\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.0793ms 0.9024ms 1.1081 KOps/s 884.7212 Ops/s $\textbf{\color{#35bf28}+25.25\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.6555s 18.2102ms 54.9141 Ops/s 37.9305 Ops/s $\textbf{\color{#35bf28}+44.78\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 3.8854ms 1.8532ms 539.6173 Ops/s 534.1172 Ops/s $\color{#35bf28}+1.03\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.1893ms 1.0682ms 936.1587 Ops/s 877.2510 Ops/s $\textbf{\color{#35bf28}+6.72\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 8.0768ms 5.2705ms 189.7342 Ops/s 189.1159 Ops/s $\color{#35bf28}+0.33\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.7615ms 2.0076ms 498.1194 Ops/s 487.9985 Ops/s $\color{#35bf28}+2.07\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.2127ms 1.1973ms 835.1821 Ops/s 811.8309 Ops/s $\color{#35bf28}+2.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 44.0947ms 38.4434ms 26.0122 Ops/s 25.6501 Ops/s $\color{#35bf28}+1.41\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.0552ms 18.1707ms 55.0338 Ops/s 55.2390 Ops/s $\color{#d91a1a}-0.37\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 43.6724ms 39.3281ms 25.4271 Ops/s 25.2789 Ops/s $\color{#35bf28}+0.59\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.7712ms 18.2403ms 54.8237 Ops/s 54.5656 Ops/s $\color{#35bf28}+0.47\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 44.2792ms 41.9182ms 23.8560 Ops/s 24.2061 Ops/s $\color{#d91a1a}-1.45\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 20.8779ms 19.7377ms 50.6645 Ops/s 50.4879 Ops/s $\color{#35bf28}+0.35\%$
test_storage_write_lazystack[50-img_shape0-small] 0.8737ms 0.2291ms 4.3641 KOps/s 4.4809 KOps/s $\color{#d91a1a}-2.61\%$
test_storage_write_lazystack[100-img_shape1-atari] 1.8528ms 1.4096ms 709.4174 Ops/s 704.4040 Ops/s $\color{#35bf28}+0.71\%$
test_storage_write_lazystack[100-img_shape2-large_img] 2.7335ms 2.3263ms 429.8609 Ops/s 434.5768 Ops/s $\color{#d91a1a}-1.09\%$
test_storage_write_lazystack[200-img_shape3-large_batch] 3.0737ms 2.9070ms 343.9951 Ops/s 338.5918 Ops/s $\color{#35bf28}+1.60\%$
test_storage_write_contiguous[50-img_shape0-small] 0.2283ms 0.1394ms 7.1713 KOps/s 7.4760 KOps/s $\color{#d91a1a}-4.08\%$
test_storage_write_contiguous[100-img_shape1-atari] 0.3566ms 0.2015ms 4.9618 KOps/s 5.0451 KOps/s $\color{#d91a1a}-1.65\%$
test_storage_write_contiguous[100-img_shape2-large_img] 2.1408ms 1.7793ms 562.0255 Ops/s 564.5358 Ops/s $\color{#d91a1a}-0.44\%$
test_storage_write_contiguous[200-img_shape3-large_batch] 1.5468ms 1.3062ms 765.5671 Ops/s 740.0309 Ops/s $\color{#35bf28}+3.45\%$
test_collector_stack_then_write[50-img_shape0-small] 1.2614ms 1.1200ms 892.8494 Ops/s 891.4816 Ops/s $\color{#35bf28}+0.15\%$
test_collector_stack_then_write[100-img_shape1-atari] 3.7073ms 3.5482ms 281.8366 Ops/s 272.4251 Ops/s $\color{#35bf28}+3.45\%$
test_collector_stack_then_write[100-img_shape2-large_img] 11.2098ms 5.6828ms 175.9706 Ops/s 176.8504 Ops/s $\color{#d91a1a}-0.50\%$
test_collector_stack_then_write[200-img_shape3-large_batch] 7.5222ms 7.2604ms 137.7342 Ops/s 135.8812 Ops/s $\color{#35bf28}+1.36\%$
test_collector_lazystack_then_write[50-img_shape0-small] 0.4649ms 0.2850ms 3.5087 KOps/s 3.5547 KOps/s $\color{#d91a1a}-1.29\%$
test_collector_lazystack_then_write[100-img_shape1-atari] 1.7096ms 1.5073ms 663.4183 Ops/s 651.5061 Ops/s $\color{#35bf28}+1.83\%$
test_collector_lazystack_then_write[100-img_shape2-large_img] 2.7058ms 2.4540ms 407.5007 Ops/s 413.0254 Ops/s $\color{#d91a1a}-1.34\%$
test_collector_lazystack_then_write[200-img_shape3-large_batch] 3.4321ms 3.1392ms 318.5562 Ops/s 314.5309 Ops/s $\color{#35bf28}+1.28\%$
test_collector_without_rb[100-img_shape0-atari] 32.5095ms 32.0004ms 31.2496 Ops/s 31.0173 Ops/s $\color{#35bf28}+0.75\%$
test_collector_without_rb[200-img_shape1-large_batch] 62.8643ms 62.6673ms 15.9573 Ops/s 15.7139 Ops/s $\color{#35bf28}+1.55\%$
test_collector_with_rb[100-img_shape0-atari] 0.6980s 60.5620ms 16.5120 Ops/s 27.0805 Ops/s $\textbf{\color{#d91a1a}-39.03\%}$
test_collector_with_rb[200-img_shape1-large_batch] 75.7231ms 72.8268ms 13.7312 Ops/s 13.7872 Ops/s $\color{#d91a1a}-0.41\%$
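
The columns above can be cross-checked against one another. As a rough sketch (the relations are inferred from the table itself, not taken from the benchmark workflow's source): Ops looks like the reciprocal of Mean, Change looks like the relative difference between Ops and Ops on Repo HEAD, and the bolded rows, which match the 13 improved and 6 worsened in the summary line, all sit at or beyond ±5%. The helper names and the 5% threshold below are assumptions for illustration.

```python
# Sketch only: helper names and the 5% threshold are assumptions inferred
# from the table, not taken from the benchmark workflow.


def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime given in seconds."""
    return 1.0 / mean_seconds


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR's throughput relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row the way the summary line appears to count it."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) in Ops/s, copied from rows of the table above.
rows = {
    "test_tensor_to_bytestream_speed[pickle]": (11817.2, 12462.9),
    "test_simple": (1.8307, 1.7588),
    "test_collector_with_rb[100-img_shape0-atari]": (16.5120, 27.0805),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Values recomputed this way can differ from the posted ones in the last digit, since the table shows rounded throughputs.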

