[Refactor] Use td.transpose in multi-step transform #2288

Merged: vmoens merged 1 commit into main from minor-improvements-multistep on Jul 10, 2024

@vmoens (Contributor) commented Jul 10, 2024

No description provided.

pytorch-bot bot commented Jul 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2288

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Pending, 1 Unrelated Failure

As of commit 9f5ef17 with merge base d0fa836:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following job failed, but the failure was already present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label Jul 10, 2024
@vmoens added the `Refactoring` label Jul 10, 2024
$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 91. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}4$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1317s 61.5478ms 16.2475 Ops/s 17.1003 Ops/s $\color{#d91a1a}-4.99\%$
test_sync 36.7978ms 31.7547ms 31.4914 Ops/s 31.2560 Ops/s $\color{#35bf28}+0.75\%$
test_async 45.2210ms 29.8115ms 33.5442 Ops/s 34.3339 Ops/s $\color{#d91a1a}-2.30\%$
test_simple 0.4779s 0.4038s 2.4766 Ops/s 2.5879 Ops/s $\color{#d91a1a}-4.30\%$
test_transformed 0.5527s 0.5502s 1.8177 Ops/s 1.7706 Ops/s $\color{#35bf28}+2.66\%$
test_serial 1.3494s 1.2825s 0.7797 Ops/s 0.7716 Ops/s $\color{#35bf28}+1.06\%$
test_parallel 1.1929s 1.1261s 0.8881 Ops/s 0.8946 Ops/s $\color{#d91a1a}-0.73\%$
test_step_mdp_speed[True-True-True-True-True] 0.2511ms 22.1942μs 45.0567 KOps/s 44.8314 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[True-True-True-True-False] 55.5730μs 13.1641μs 75.9640 KOps/s 76.5948 KOps/s $\color{#d91a1a}-0.82\%$
test_step_mdp_speed[True-True-True-False-True] 34.7340μs 12.8751μs 77.6696 KOps/s 77.5461 KOps/s $\color{#35bf28}+0.16\%$
test_step_mdp_speed[True-True-True-False-False] 36.8890μs 7.7482μs 129.0625 KOps/s 132.2537 KOps/s $\color{#d91a1a}-2.41\%$
test_step_mdp_speed[True-True-False-True-True] 0.1119ms 23.6987μs 42.1965 KOps/s 42.0728 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[True-True-False-True-False] 54.7220μs 14.3560μs 69.6575 KOps/s 69.1493 KOps/s $\color{#35bf28}+0.73\%$
test_step_mdp_speed[True-True-False-False-True] 35.7270μs 14.2257μs 70.2956 KOps/s 70.2785 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[True-True-False-False-False] 66.1230μs 9.0013μs 111.0951 KOps/s 111.1540 KOps/s $\color{#d91a1a}-0.05\%$
test_step_mdp_speed[True-False-True-True-True] 0.2848ms 26.4046μs 37.8722 KOps/s 39.0465 KOps/s $\color{#d91a1a}-3.01\%$
test_step_mdp_speed[True-False-True-True-False] 76.8450μs 15.7678μs 63.4203 KOps/s 63.1861 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[True-False-True-False-True] 47.1780μs 14.1641μs 70.6010 KOps/s 70.7018 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[True-False-True-False-False] 39.7640μs 8.9905μs 111.2288 KOps/s 112.3546 KOps/s $\color{#d91a1a}-1.00\%$
test_step_mdp_speed[True-False-False-True-True] 86.0800μs 26.4159μs 37.8560 KOps/s 38.3711 KOps/s $\color{#d91a1a}-1.34\%$
test_step_mdp_speed[True-False-False-True-False] 47.5180μs 17.1770μs 58.2174 KOps/s 58.9513 KOps/s $\color{#d91a1a}-1.24\%$
test_step_mdp_speed[True-False-False-False-True] 65.2820μs 15.5964μs 64.1173 KOps/s 64.7471 KOps/s $\color{#d91a1a}-0.97\%$
test_step_mdp_speed[True-False-False-False-False] 33.0820μs 10.2106μs 97.9370 KOps/s 98.9787 KOps/s $\color{#d91a1a}-1.05\%$
test_step_mdp_speed[False-True-True-True-True] 75.3610μs 25.0699μs 39.8885 KOps/s 40.5759 KOps/s $\color{#d91a1a}-1.69\%$
test_step_mdp_speed[False-True-True-True-False] 43.0110μs 15.9012μs 62.8882 KOps/s 63.0619 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[False-True-True-False-True] 0.2262ms 16.4618μs 60.7467 KOps/s 60.6851 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[False-True-True-False-False] 0.1473ms 10.2578μs 97.4866 KOps/s 98.2093 KOps/s $\color{#d91a1a}-0.74\%$
test_step_mdp_speed[False-True-False-True-True] 89.0560μs 26.1507μs 38.2400 KOps/s 38.4284 KOps/s $\color{#d91a1a}-0.49\%$
test_step_mdp_speed[False-True-False-True-False] 64.4200μs 17.0371μs 58.6953 KOps/s 58.6982 KOps/s $-0.00\%$
test_step_mdp_speed[False-True-False-False-True] 96.7870μs 18.0803μs 55.3088 KOps/s 57.1657 KOps/s $\color{#d91a1a}-3.25\%$
test_step_mdp_speed[False-True-False-False-False] 48.5310μs 11.1751μs 89.4849 KOps/s 88.7544 KOps/s $\color{#35bf28}+0.82\%$
test_step_mdp_speed[False-False-True-True-True] 94.7270μs 27.1707μs 36.8044 KOps/s 36.5310 KOps/s $\color{#35bf28}+0.75\%$
test_step_mdp_speed[False-False-True-True-False] 52.3180μs 18.2265μs 54.8651 KOps/s 54.3450 KOps/s $\color{#35bf28}+0.96\%$
test_step_mdp_speed[False-False-True-False-True] 71.9840μs 17.4324μs 57.3645 KOps/s 56.8031 KOps/s $\color{#35bf28}+0.99\%$
test_step_mdp_speed[False-False-True-False-False] 54.0210μs 11.2529μs 88.8657 KOps/s 88.1662 KOps/s $\color{#35bf28}+0.79\%$
test_step_mdp_speed[False-False-False-True-True] 44.4730μs 28.7489μs 34.7840 KOps/s 34.6520 KOps/s $\color{#35bf28}+0.38\%$
test_step_mdp_speed[False-False-False-True-False] 50.8350μs 19.3826μs 51.5926 KOps/s 50.8186 KOps/s $\color{#35bf28}+1.52\%$
test_step_mdp_speed[False-False-False-False-True] 56.0850μs 18.5405μs 53.9360 KOps/s 52.7642 KOps/s $\color{#35bf28}+2.22\%$
test_step_mdp_speed[False-False-False-False-False] 42.6300μs 12.4168μs 80.5363 KOps/s 80.3722 KOps/s $\color{#35bf28}+0.20\%$
test_values[generalized_advantage_estimate-True-True] 9.7488ms 9.3970ms 106.4175 Ops/s 104.0720 Ops/s $\color{#35bf28}+2.25\%$
test_values[vec_generalized_advantage_estimate-True-True] 38.2913ms 36.2072ms 27.6188 Ops/s 29.6686 Ops/s $\textbf{\color{#d91a1a}-6.91\%}$
test_values[td0_return_estimate-False-False] 0.2353ms 0.1830ms 5.4656 KOps/s 5.2558 KOps/s $\color{#35bf28}+3.99\%$
test_values[td1_return_estimate-False-False] 25.9076ms 23.9543ms 41.7462 Ops/s 41.1161 Ops/s $\color{#35bf28}+1.53\%$
test_values[vec_td1_return_estimate-False-False] 38.4636ms 36.3650ms 27.4989 Ops/s 29.6928 Ops/s $\textbf{\color{#d91a1a}-7.39\%}$
test_values[td_lambda_return_estimate-True-False] 36.2154ms 34.6224ms 28.8830 Ops/s 27.6694 Ops/s $\color{#35bf28}+4.39\%$
test_values[vec_td_lambda_return_estimate-True-False] 39.0783ms 36.2741ms 27.5679 Ops/s 27.2497 Ops/s $\color{#35bf28}+1.17\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 12.3527ms 8.2780ms 120.8022 Ops/s 119.8623 Ops/s $\color{#35bf28}+0.78\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3743ms 2.0251ms 493.8068 Ops/s 507.5857 Ops/s $\color{#d91a1a}-2.71\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4619ms 0.3585ms 2.7890 KOps/s 2.7484 KOps/s $\color{#35bf28}+1.48\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 48.2710ms 47.1199ms 21.2224 Ops/s 20.8465 Ops/s $\color{#35bf28}+1.80\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 4.0317ms 3.0931ms 323.2954 Ops/s 323.3819 Ops/s $\color{#d91a1a}-0.03\%$
test_dqn_speed 2.3982ms 1.3573ms 736.7567 Ops/s 724.8592 Ops/s $\color{#35bf28}+1.64\%$
test_ddpg_speed 3.9109ms 2.8615ms 349.4619 Ops/s 345.8767 Ops/s $\color{#35bf28}+1.04\%$
test_sac_speed 10.1184ms 8.6543ms 115.5496 Ops/s 113.3266 Ops/s $\color{#35bf28}+1.96\%$
test_redq_speed 20.5841ms 14.0405ms 71.2226 Ops/s 63.0194 Ops/s $\textbf{\color{#35bf28}+13.02\%}$
test_redq_deprec_speed 16.0267ms 13.9977ms 71.4405 Ops/s 68.9057 Ops/s $\color{#35bf28}+3.68\%$
test_td3_speed 19.0816ms 8.7231ms 114.6380 Ops/s 114.0988 Ops/s $\color{#35bf28}+0.47\%$
test_cql_speed 38.9406ms 37.5035ms 26.6642 Ops/s 26.4477 Ops/s $\color{#35bf28}+0.82\%$
test_a2c_speed 9.5267ms 7.8614ms 127.2034 Ops/s 122.0418 Ops/s $\color{#35bf28}+4.23\%$
test_ppo_speed 8.6540ms 8.1751ms 122.3229 Ops/s 119.3349 Ops/s $\color{#35bf28}+2.50\%$
test_reinforce_speed 7.8405ms 6.8742ms 145.4719 Ops/s 144.4475 Ops/s $\color{#35bf28}+0.71\%$
test_iql_speed 35.5039ms 33.3948ms 29.9448 Ops/s 29.9020 Ops/s $\color{#35bf28}+0.14\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.0111ms 3.7650ms 265.6030 Ops/s 262.1542 Ops/s $\color{#35bf28}+1.32\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8829ms 0.5154ms 1.9401 KOps/s 1.9267 KOps/s $\color{#35bf28}+0.70\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7883ms 0.4966ms 2.0138 KOps/s 2.0194 KOps/s $\color{#d91a1a}-0.28\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 5.2660ms 3.7670ms 265.4652 Ops/s 266.2962 Ops/s $\color{#d91a1a}-0.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8090ms 0.5134ms 1.9477 KOps/s 1.9176 KOps/s $\color{#35bf28}+1.57\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 7.8623ms 0.4931ms 2.0280 KOps/s 2.0302 KOps/s $\color{#d91a1a}-0.11\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.5270ms 1.7533ms 570.3526 Ops/s 566.6980 Ops/s $\color{#35bf28}+0.64\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 8.6401ms 1.6847ms 593.5888 Ops/s 593.4633 Ops/s $\color{#35bf28}+0.02\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 4.1215ms 3.8534ms 259.5081 Ops/s 253.5270 Ops/s $\color{#35bf28}+2.36\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0291ms 0.6616ms 1.5114 KOps/s 1.4976 KOps/s $\color{#35bf28}+0.92\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 7.7458ms 0.6401ms 1.5623 KOps/s 1.5660 KOps/s $\color{#d91a1a}-0.24\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.3317ms 3.7135ms 269.2859 Ops/s 259.6843 Ops/s $\color{#35bf28}+3.70\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7937ms 0.5173ms 1.9331 KOps/s 1.9359 KOps/s $\color{#d91a1a}-0.15\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6868ms 0.4906ms 2.0381 KOps/s 1.9721 KOps/s $\color{#35bf28}+3.35\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.0866ms 3.7590ms 266.0296 Ops/s 260.0133 Ops/s $\color{#35bf28}+2.31\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1413ms 0.5201ms 1.9227 KOps/s 1.9538 KOps/s $\color{#d91a1a}-1.59\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6145ms 0.4972ms 2.0114 KOps/s 2.0507 KOps/s $\color{#d91a1a}-1.92\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.8029ms 3.9310ms 254.3854 Ops/s 249.2902 Ops/s $\color{#35bf28}+2.04\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.0609ms 0.6588ms 1.5179 KOps/s 1.4901 KOps/s $\color{#35bf28}+1.87\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 1.0009ms 0.6341ms 1.5769 KOps/s 1.5025 KOps/s $\color{#35bf28}+4.95\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1464s 6.7446ms 148.2677 Ops/s 144.8127 Ops/s $\color{#35bf28}+2.39\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 17.4439ms 13.2133ms 75.6813 Ops/s 61.8436 Ops/s $\textbf{\color{#35bf28}+22.38\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 1.6072ms 1.0355ms 965.7067 Ops/s 913.4560 Ops/s $\textbf{\color{#35bf28}+5.72\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1312s 8.6256ms 115.9333 Ops/s 153.4375 Ops/s $\textbf{\color{#d91a1a}-24.44\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 17.4067ms 13.0141ms 76.8398 Ops/s 74.9224 Ops/s $\color{#35bf28}+2.56\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.6466ms 1.0273ms 973.4386 Ops/s 941.8653 Ops/s $\color{#35bf28}+3.35\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1303s 6.4698ms 154.5639 Ops/s 153.3149 Ops/s $\color{#35bf28}+0.81\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 17.3079ms 13.0793ms 76.4565 Ops/s 70.1571 Ops/s $\textbf{\color{#35bf28}+8.98\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.0793ms 1.3008ms 768.7792 Ops/s 815.4450 Ops/s $\textbf{\color{#d91a1a}-5.72\%}$
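
For readers checking the table by hand, the `Change` column appears to be the relative difference between `Ops` (this PR) and `Ops on Repo HEAD`, and `Ops` appears to be the reciprocal of `Mean`. The sketch below reproduces a few rows under that reading. The function names are hypothetical, and the 5% cutoff for counting a benchmark as improved or worsened is an assumption inferred from which rows are shown in bold; it is not taken from the benchmark tooling itself.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumed: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption, not a documented value."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_single": (16.2475, 17.1003),
    "test_redq_speed": (71.2226, 63.0194),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": (
        115.9333,
        153.4375,
    ),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, `test_single` falls just inside the cutoff and is not counted, while the other two rows are counted as improved and worsened respectively, which is consistent with how the table marks them.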

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 94. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_single 0.1657s 0.1123s 8.9030 Ops/s 9.3750 Ops/s $\textbf{\color{#d91a1a}-5.03\%}$
test_sync 95.5687ms 94.8139ms 10.5470 Ops/s 10.5104 Ops/s $\color{#35bf28}+0.35\%$
test_async 0.1846s 92.1152ms 10.8560 Ops/s 12.9490 Ops/s $\textbf{\color{#d91a1a}-16.16\%}$
test_single_pixels 0.1183s 0.1160s 8.6187 Ops/s 8.6372 Ops/s $\color{#d91a1a}-0.22\%$
test_sync_pixels 77.1331ms 72.3407ms 13.8235 Ops/s 12.7612 Ops/s $\textbf{\color{#35bf28}+8.32\%}$
test_async_pixels 0.1449s 62.1060ms 16.1015 Ops/s 14.1288 Ops/s $\textbf{\color{#35bf28}+13.96\%}$
test_simple 0.7797s 0.7715s 1.2962 Ops/s 1.2565 Ops/s $\color{#35bf28}+3.16\%$
test_transformed 0.9983s 0.9928s 1.0072 Ops/s 0.9716 Ops/s $\color{#35bf28}+3.67\%$
test_serial 2.3471s 2.2879s 0.4371 Ops/s 0.4429 Ops/s $\color{#d91a1a}-1.32\%$
test_parallel 2.0560s 1.9811s 0.5048 Ops/s 0.5132 Ops/s $\color{#d91a1a}-1.64\%$
test_step_mdp_speed[True-True-True-True-True] 91.4020μs 33.1773μs 30.1411 KOps/s 28.4638 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_step_mdp_speed[True-True-True-True-False] 42.5310μs 19.3970μs 51.5543 KOps/s 52.5796 KOps/s $\color{#d91a1a}-1.95\%$
test_step_mdp_speed[True-True-True-False-True] 37.3810μs 18.7243μs 53.4066 KOps/s 54.7414 KOps/s $\color{#d91a1a}-2.44\%$
test_step_mdp_speed[True-True-True-False-False] 27.0010μs 11.0406μs 90.5751 KOps/s 93.9356 KOps/s $\color{#d91a1a}-3.58\%$
test_step_mdp_speed[True-True-False-True-True] 63.7610μs 34.8071μs 28.7298 KOps/s 29.4612 KOps/s $\color{#d91a1a}-2.48\%$
test_step_mdp_speed[True-True-False-True-False] 44.5910μs 21.0664μs 47.4689 KOps/s 48.6906 KOps/s $\color{#d91a1a}-2.51\%$
test_step_mdp_speed[True-True-False-False-True] 50.9410μs 20.4715μs 48.8484 KOps/s 50.5779 KOps/s $\color{#d91a1a}-3.42\%$
test_step_mdp_speed[True-True-False-False-False] 37.8410μs 13.0443μs 76.6618 KOps/s 80.2211 KOps/s $\color{#d91a1a}-4.44\%$
test_step_mdp_speed[True-False-True-True-True] 81.0720μs 37.2141μs 26.8715 KOps/s 27.5244 KOps/s $\color{#d91a1a}-2.37\%$
test_step_mdp_speed[True-False-True-True-False] 39.8410μs 22.9609μs 43.5522 KOps/s 44.3838 KOps/s $\color{#d91a1a}-1.87\%$
test_step_mdp_speed[True-False-True-False-True] 38.6010μs 20.5797μs 48.5916 KOps/s 50.5547 KOps/s $\color{#d91a1a}-3.88\%$
test_step_mdp_speed[True-False-True-False-False] 41.9910μs 12.8422μs 77.8684 KOps/s 80.0375 KOps/s $\color{#d91a1a}-2.71\%$
test_step_mdp_speed[True-False-False-True-True] 64.6910μs 38.6676μs 25.8614 KOps/s 26.7558 KOps/s $\color{#d91a1a}-3.34\%$
test_step_mdp_speed[True-False-False-True-False] 48.7510μs 25.0646μs 39.8969 KOps/s 41.0676 KOps/s $\color{#d91a1a}-2.85\%$
test_step_mdp_speed[True-False-False-False-True] 66.8920μs 22.3273μs 44.7883 KOps/s 46.4141 KOps/s $\color{#d91a1a}-3.50\%$
test_step_mdp_speed[True-False-False-False-False] 32.5210μs 14.7213μs 67.9287 KOps/s 70.5368 KOps/s $\color{#d91a1a}-3.70\%$
test_step_mdp_speed[False-True-True-True-True] 57.0210μs 37.5003μs 26.6664 KOps/s 27.9539 KOps/s $\color{#d91a1a}-4.61\%$
test_step_mdp_speed[False-True-True-True-False] 48.5210μs 23.4423μs 42.6580 KOps/s 44.4699 KOps/s $\color{#d91a1a}-4.07\%$
test_step_mdp_speed[False-True-True-False-True] 40.5310μs 24.7021μs 40.4823 KOps/s 41.8436 KOps/s $\color{#d91a1a}-3.25\%$
test_step_mdp_speed[False-True-True-False-False] 32.7000μs 14.8241μs 67.4579 KOps/s 70.1905 KOps/s $\color{#d91a1a}-3.89\%$
test_step_mdp_speed[False-True-False-True-True] 57.1110μs 38.9159μs 25.6964 KOps/s 26.8899 KOps/s $\color{#d91a1a}-4.44\%$
test_step_mdp_speed[False-True-False-True-False] 43.7210μs 24.8879μs 40.1802 KOps/s 41.4562 KOps/s $\color{#d91a1a}-3.08\%$
test_step_mdp_speed[False-True-False-False-True] 50.9710μs 26.4617μs 37.7905 KOps/s 39.6171 KOps/s $\color{#d91a1a}-4.61\%$
test_step_mdp_speed[False-True-False-False-False] 33.1510μs 16.5289μs 60.5000 KOps/s 62.9286 KOps/s $\color{#d91a1a}-3.86\%$
test_step_mdp_speed[False-False-True-True-True] 58.1610μs 40.3767μs 24.7667 KOps/s 25.3312 KOps/s $\color{#d91a1a}-2.23\%$
test_step_mdp_speed[False-False-True-True-False] 52.0210μs 27.0046μs 37.0307 KOps/s 38.0609 KOps/s $\color{#d91a1a}-2.71\%$
test_step_mdp_speed[False-False-True-False-True] 48.6910μs 26.5664μs 37.6415 KOps/s 39.8989 KOps/s $\textbf{\color{#d91a1a}-5.66\%}$
test_step_mdp_speed[False-False-True-False-False] 35.7510μs 16.5931μs 60.2660 KOps/s 63.4367 KOps/s $\color{#d91a1a}-5.00\%$
test_step_mdp_speed[False-False-False-True-True] 57.3610μs 43.0327μs 23.2382 KOps/s 23.5241 KOps/s $\color{#d91a1a}-1.22\%$
test_step_mdp_speed[False-False-False-True-False] 49.2410μs 28.9066μs 34.5942 KOps/s 35.3632 KOps/s $\color{#d91a1a}-2.17\%$
test_step_mdp_speed[False-False-False-False-True] 47.7410μs 27.8919μs 35.8526 KOps/s 37.2338 KOps/s $\color{#d91a1a}-3.71\%$
test_step_mdp_speed[False-False-False-False-False] 33.6710μs 18.2477μs 54.8015 KOps/s 56.7935 KOps/s $\color{#d91a1a}-3.51\%$
test_values[generalized_advantage_estimate-True-True] 26.3352ms 25.8232ms 38.7249 Ops/s 39.3133 Ops/s $\color{#d91a1a}-1.50\%$
test_values[vec_generalized_advantage_estimate-True-True] 90.2490ms 2.7120ms 368.7381 Ops/s 369.6240 Ops/s $\color{#d91a1a}-0.24\%$
test_values[td0_return_estimate-False-False] 92.8620μs 67.9357μs 14.7198 KOps/s 14.8593 KOps/s $\color{#d91a1a}-0.94\%$
test_values[td1_return_estimate-False-False] 59.1242ms 57.8965ms 17.2722 Ops/s 16.9612 Ops/s $\color{#35bf28}+1.83\%$
test_values[vec_td1_return_estimate-False-False] 1.3196ms 1.0994ms 909.6250 Ops/s 908.3955 Ops/s $\color{#35bf28}+0.14\%$
test_values[td_lambda_return_estimate-True-False] 92.9411ms 90.5861ms 11.0392 Ops/s 10.7170 Ops/s $\color{#35bf28}+3.01\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.4083ms 1.0969ms 911.6348 Ops/s 913.0235 Ops/s $\color{#d91a1a}-0.15\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 25.9444ms 25.8285ms 38.7170 Ops/s 37.3151 Ops/s $\color{#35bf28}+3.76\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 0.9722ms 0.7388ms 1.3536 KOps/s 1.3659 KOps/s $\color{#d91a1a}-0.90\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7700ms 0.6991ms 1.4304 KOps/s 1.4517 KOps/s $\color{#d91a1a}-1.47\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5310ms 1.4842ms 673.7827 Ops/s 671.1782 Ops/s $\color{#35bf28}+0.39\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7561ms 0.6984ms 1.4318 KOps/s 1.4020 KOps/s $\color{#35bf28}+2.13\%$
test_dqn_speed 80.1073ms 1.5929ms 627.7873 Ops/s 682.8854 Ops/s $\textbf{\color{#d91a1a}-8.07\%}$
test_ddpg_speed 3.1756ms 2.9105ms 343.5836 Ops/s 337.6786 Ops/s $\color{#35bf28}+1.75\%$
test_sac_speed 8.9261ms 8.3836ms 119.2807 Ops/s 118.2235 Ops/s $\color{#35bf28}+0.89\%$
test_redq_speed 11.6669ms 10.6746ms 93.6806 Ops/s 92.9567 Ops/s $\color{#35bf28}+0.78\%$
test_redq_deprec_speed 0.1063s 12.5117ms 79.9250 Ops/s 87.7468 Ops/s $\textbf{\color{#d91a1a}-8.91\%}$
test_td3_speed 8.4784ms 8.2969ms 120.5267 Ops/s 119.5192 Ops/s $\color{#35bf28}+0.84\%$
test_cql_speed 26.5492ms 25.8341ms 38.7085 Ops/s 38.5498 Ops/s $\color{#35bf28}+0.41\%$
test_a2c_speed 5.9192ms 5.7053ms 175.2752 Ops/s 173.9008 Ops/s $\color{#35bf28}+0.79\%$
test_ppo_speed 6.9322ms 5.9959ms 166.7803 Ops/s 164.3859 Ops/s $\color{#35bf28}+1.46\%$
test_reinforce_speed 4.8939ms 4.6123ms 216.8096 Ops/s 213.1812 Ops/s $\color{#35bf28}+1.70\%$
test_iql_speed 20.5386ms 19.6883ms 50.7916 Ops/s 50.4265 Ops/s $\color{#35bf28}+0.72\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.8093ms 4.6274ms 216.1053 Ops/s 221.2292 Ops/s $\color{#d91a1a}-2.32\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1092ms 0.5485ms 1.8231 KOps/s 1.8122 KOps/s $\color{#35bf28}+0.60\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6832ms 0.5281ms 1.8936 KOps/s 1.8819 KOps/s $\color{#35bf28}+0.62\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.8166ms 4.6018ms 217.3060 Ops/s 220.5744 Ops/s $\color{#d91a1a}-1.48\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.1383ms 0.5412ms 1.8477 KOps/s 1.8303 KOps/s $\color{#35bf28}+0.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6679ms 0.5195ms 1.9250 KOps/s 1.8956 KOps/s $\color{#35bf28}+1.55\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 4.4324ms 2.0175ms 495.6635 Ops/s 493.0627 Ops/s $\color{#35bf28}+0.53\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.0910ms 1.9196ms 520.9287 Ops/s 519.7013 Ops/s $\color{#35bf28}+0.24\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.2069ms 4.7610ms 210.0414 Ops/s 212.3195 Ops/s $\color{#d91a1a}-1.07\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8183ms 0.6942ms 1.4405 KOps/s 1.4305 KOps/s $\color{#35bf28}+0.69\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 79.0281ms 0.7502ms 1.3330 KOps/s 1.4739 KOps/s $\textbf{\color{#d91a1a}-9.56\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 4.8404ms 4.6405ms 215.4936 Ops/s 220.4899 Ops/s $\color{#d91a1a}-2.27\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8308ms 0.5472ms 1.8274 KOps/s 1.8361 KOps/s $\color{#d91a1a}-0.48\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 4.1640ms 0.5338ms 1.8733 KOps/s 1.8785 KOps/s $\color{#d91a1a}-0.28\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 4.7529ms 4.5666ms 218.9813 Ops/s 220.0355 Ops/s $\color{#d91a1a}-0.48\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.2985ms 0.5486ms 1.8229 KOps/s 1.8362 KOps/s $\color{#d91a1a}-0.72\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6770ms 0.5254ms 1.9032 KOps/s 1.9195 KOps/s $\color{#d91a1a}-0.85\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 4.8440ms 4.7610ms 210.0397 Ops/s 212.5992 Ops/s $\color{#d91a1a}-1.20\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.8143ms 0.7021ms 1.4244 KOps/s 1.4176 KOps/s $\color{#35bf28}+0.47\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 4.4697ms 0.6838ms 1.4625 KOps/s 1.4610 KOps/s $\color{#35bf28}+0.10\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.1297s 7.3532ms 135.9944 Ops/s 130.7384 Ops/s $\color{#35bf28}+4.02\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 19.1174ms 16.0118ms 62.4540 Ops/s 64.4718 Ops/s $\color{#d91a1a}-3.13\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 2.2234ms 1.1685ms 855.8184 Ops/s 754.9374 Ops/s $\textbf{\color{#35bf28}+13.36\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.1273s 9.7744ms 102.3081 Ops/s 137.4975 Ops/s $\textbf{\color{#d91a1a}-25.59\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 18.2289ms 15.9822ms 62.5694 Ops/s 64.4065 Ops/s $\color{#d91a1a}-2.85\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.2897ms 1.1674ms 856.6059 Ops/s 802.2716 Ops/s $\textbf{\color{#35bf28}+6.77\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.1263s 7.5143ms 133.0804 Ops/s 101.1527 Ops/s $\textbf{\color{#35bf28}+31.56\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 18.4814ms 16.1077ms 62.0822 Ops/s 63.9919 Ops/s $\color{#d91a1a}-2.98\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.3752ms 1.3053ms 766.0915 Ops/s 745.6658 Ops/s $\color{#35bf28}+2.74\%$

@vmoens vmoens merged commit 8e43ac8 into main Jul 10, 2024
52 of 54 checks passed
@vmoens vmoens deleted the minor-improvements-multistep branch July 10, 2024 14:54
Labels
`CLA Signed`: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
`Refactoring`: Refactoring of an existing feature
2 participants