[BugFix] Fix RLHF #1757

Merged: vmoens merged 3 commits into main from fix-rlhf on Dec 23, 2023

Conversation

@vmoens (Contributor) commented Dec 21, 2023

No description provided.

pytorch-bot bot commented Dec 21, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1757

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit c0666a8 with merge base 6d217c6:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the `CLA Signed` label Dec 21, 2023
@vmoens added the `bug` and `CI` labels Dec 21, 2023
github-actions bot commented Dec 21, 2023

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 89. Improved: 19. Worsened: 1.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 76.3254ms | 68.3304ms | 14.6348 Ops/s | 15.0790 Ops/s | -2.95% |
| test_sync | 37.2002ms | 36.0497ms | 27.7395 Ops/s | 27.7890 Ops/s | -0.18% |
| test_async | 98.9969ms | 35.4178ms | 28.2344 Ops/s | 27.9294 Ops/s | +1.09% |
| test_simple | 0.5301s | 0.4691s | 2.1315 Ops/s | 2.1386 Ops/s | -0.33% |
| test_transformed | 0.7071s | 0.6426s | 1.5562 Ops/s | 1.5848 Ops/s | -1.81% |
| test_serial | 1.4896s | 1.4419s | 0.6935 Ops/s | 0.7040 Ops/s | -1.48% |
| test_parallel | 1.4214s | 1.3567s | 0.7371 Ops/s | 0.7391 Ops/s | -0.27% |
| test_step_mdp_speed[True-True-True-True-True] | 0.1716ms | 21.4742μs | 46.5676 KOps/s | 45.2818 KOps/s | +2.84% |
| test_step_mdp_speed[True-True-True-True-False] | 47.5090μs | 13.1433μs | 76.0844 KOps/s | 75.3988 KOps/s | +0.91% |
| test_step_mdp_speed[True-True-True-False-True] | 39.2340μs | 12.8419μs | 77.8699 KOps/s | 75.9828 KOps/s | +2.48% |
| test_step_mdp_speed[True-True-True-False-False] | 41.6580μs | 7.7940μs | 128.3031 KOps/s | 127.4542 KOps/s | +0.67% |
| test_step_mdp_speed[True-True-False-True-True] | 70.5420μs | 22.8584μs | 43.7475 KOps/s | 42.8293 KOps/s | +2.14% |
| test_step_mdp_speed[True-True-False-True-False] | 41.2070μs | 14.3375μs | 69.7473 KOps/s | 68.3940 KOps/s | +1.98% |
| test_step_mdp_speed[True-True-False-False-True] | 74.2490μs | 14.0253μs | 71.2997 KOps/s | 70.7320 KOps/s | +0.80% |
| test_step_mdp_speed[True-True-False-False-False] | 73.8480μs | 9.1941μs | 108.7652 KOps/s | 109.4135 KOps/s | -0.59% |
| test_step_mdp_speed[True-False-True-True-True] | 64.0500μs | 24.5350μs | 40.7580 KOps/s | 40.3877 KOps/s | +0.92% |
| test_step_mdp_speed[True-False-True-True-False] | 0.1345ms | 15.6949μs | 63.7150 KOps/s | 62.8710 KOps/s | +1.34% |
| test_step_mdp_speed[True-False-True-False-True] | 49.8130μs | 13.9189μs | 71.8447 KOps/s | 70.1618 KOps/s | +2.40% |
| test_step_mdp_speed[True-False-True-False-False] | 30.9780μs | 8.9780μs | 111.3832 KOps/s | 109.6799 KOps/s | +1.55% |
| test_step_mdp_speed[True-False-False-True-True] | 55.3430μs | 25.6605μs | 38.9704 KOps/s | 38.3245 KOps/s | +1.69% |
| test_step_mdp_speed[True-False-False-True-False] | 47.6490μs | 17.0990μs | 58.4831 KOps/s | 58.5381 KOps/s | -0.09% |
| test_step_mdp_speed[True-False-False-False-True] | 53.4200μs | 15.3613μs | 65.0986 KOps/s | 64.1895 KOps/s | +1.42% |
| test_step_mdp_speed[True-False-False-False-False] | 38.1820μs | 10.2884μs | 97.1971 KOps/s | 96.1680 KOps/s | +1.07% |
| test_step_mdp_speed[False-True-True-True-True] | 51.7860μs | 24.5377μs | 40.7536 KOps/s | 40.3584 KOps/s | +0.98% |
| test_step_mdp_speed[False-True-True-True-False] | 56.0450μs | 15.8336μs | 63.1569 KOps/s | 61.9567 KOps/s | +1.94% |
| test_step_mdp_speed[False-True-True-False-True] | 52.5990μs | 16.2423μs | 61.5678 KOps/s | 60.6484 KOps/s | +1.52% |
| test_step_mdp_speed[False-True-True-False-False] | 33.8330μs | 10.1060μs | 98.9511 KOps/s | 96.8342 KOps/s | +2.19% |
| test_step_mdp_speed[False-True-False-True-True] | 90.2290μs | 25.7502μs | 38.8347 KOps/s | 38.5015 KOps/s | +0.87% |
| test_step_mdp_speed[False-True-False-True-False] | 48.7310μs | 16.9523μs | 58.9892 KOps/s | 57.4356 KOps/s | +2.70% |
| test_step_mdp_speed[False-True-False-False-True] | 53.8910μs | 17.4905μs | 57.1739 KOps/s | 55.7806 KOps/s | +2.50% |
| test_step_mdp_speed[False-True-False-False-False] | 37.2590μs | 11.6097μs | 86.1348 KOps/s | 85.4030 KOps/s | +0.86% |
| test_step_mdp_speed[False-False-True-True-True] | 58.0380μs | 26.8525μs | 37.2405 KOps/s | 36.3766 KOps/s | +2.37% |
| test_step_mdp_speed[False-False-True-True-False] | 52.8090μs | 18.3405μs | 54.5242 KOps/s | 53.7112 KOps/s | +1.51% |
| test_step_mdp_speed[False-False-True-False-True] | 53.5300μs | 17.3975μs | 57.4794 KOps/s | 56.0650 KOps/s | +2.52% |
| test_step_mdp_speed[False-False-True-False-False] | 45.8760μs | 11.3318μs | 88.2470 KOps/s | 86.1721 KOps/s | +2.41% |
| test_step_mdp_speed[False-False-False-True-True] | 65.2520μs | 27.9816μs | 35.7377 KOps/s | 34.7749 KOps/s | +2.77% |
| test_step_mdp_speed[False-False-False-True-False] | 49.4130μs | 19.3990μs | 51.5490 KOps/s | 50.5096 KOps/s | +2.06% |
| test_step_mdp_speed[False-False-False-False-True] | 74.1890μs | 18.4471μs | 54.2090 KOps/s | 52.6849 KOps/s | +2.89% |
| test_step_mdp_speed[False-False-False-False-False] | 45.0940μs | 12.4304μs | 80.4477 KOps/s | 78.7576 KOps/s | +2.15% |
| test_values[generalized_advantage_estimate-True-True] | 12.9572ms | 12.1499ms | 82.3053 Ops/s | 82.5357 Ops/s | -0.28% |
| test_values[vec_generalized_advantage_estimate-True-True] | 34.4158ms | 27.6479ms | 36.1690 Ops/s | 37.2761 Ops/s | -2.97% |
| test_values[td0_return_estimate-False-False] | 0.2487ms | 0.1919ms | 5.2114 KOps/s | 4.6623 KOps/s | **+11.78%** |
| test_values[td1_return_estimate-False-False] | 26.1098ms | 25.8296ms | 38.7153 Ops/s | 38.7031 Ops/s | +0.03% |
| test_values[vec_td1_return_estimate-False-False] | 36.9503ms | 28.0566ms | 35.6422 Ops/s | 36.8797 Ops/s | -3.36% |
| test_values[td_lambda_return_estimate-True-False] | 36.4518ms | 36.0271ms | 27.7569 Ops/s | 27.4762 Ops/s | +1.02% |
| test_values[vec_td_lambda_return_estimate-True-False] | 36.0163ms | 27.9808ms | 35.7388 Ops/s | 37.0279 Ops/s | -3.48% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.1279ms | 7.9221ms | 126.2286 Ops/s | 124.0031 Ops/s | +1.79% |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 10.4771ms | 1.9567ms | 511.0708 Ops/s | 453.6934 Ops/s | **+12.65%** |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.5222ms | 0.4169ms | 2.3986 KOps/s | 2.1066 KOps/s | **+13.86%** |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 48.5925ms | 39.1485ms | 25.5437 Ops/s | 25.3293 Ops/s | +0.85% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 11.7336ms | 2.6766ms | 373.6018 Ops/s | 371.2005 Ops/s | +0.65% |
| test_dqn_speed | 93.4005ms | 8.6911ms | 115.0597 Ops/s | 110.0521 Ops/s | +4.55% |
| test_ddpg_speed | 22.8814ms | 15.3958ms | 64.9530 Ops/s | 66.9355 Ops/s | -2.96% |
| test_sac_speed | 36.3402ms | 30.7165ms | 32.5558 Ops/s | 32.0915 Ops/s | +1.45% |
| test_redq_speed | 40.2084ms | 36.9975ms | 27.0289 Ops/s | 26.6387 Ops/s | +1.46% |
| test_redq_deprec_speed | 35.4623ms | 26.5793ms | 37.6232 Ops/s | 36.6032 Ops/s | +2.79% |
| test_td3_speed | 29.9130ms | 20.9345ms | 47.7680 Ops/s | 45.7223 Ops/s | +4.47% |
| test_cql_speed | 0.1281s | 94.8399ms | 10.5441 Ops/s | 10.9569 Ops/s | -3.77% |
| test_a2c_speed | 35.7263ms | 27.6287ms | 36.1943 Ops/s | 35.8407 Ops/s | +0.99% |
| test_ppo_speed | 35.9546ms | 28.0391ms | 35.6645 Ops/s | 35.7061 Ops/s | -0.12% |
| test_reinforce_speed | 27.8123ms | 26.4138ms | 37.8590 Ops/s | 37.4066 Ops/s | +1.21% |
| test_iql_speed | 64.0562ms | 63.6653ms | 15.7072 Ops/s | 15.1576 Ops/s | +3.63% |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.3559ms | 1.9220ms | 520.2844 Ops/s | 478.1667 Ops/s | **+8.81%** |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.9776ms | 2.0343ms | 491.5600 Ops/s | 399.5749 Ops/s | **+23.02%** |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.1647ms | 2.0197ms | 495.1212 Ops/s | 451.6383 Ops/s | **+9.63%** |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.9537ms | 1.9066ms | 524.4845 Ops/s | 483.3013 Ops/s | **+8.52%** |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.1055s | 2.2169ms | 451.0897 Ops/s | 404.0642 Ops/s | **+11.64%** |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.2016ms | 2.0196ms | 495.1532 Ops/s | 454.5748 Ops/s | **+8.93%** |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.4394ms | 1.9229ms | 520.0415 Ops/s | 467.8689 Ops/s | **+11.15%** |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1044s | 2.2835ms | 437.9153 Ops/s | 396.7346 Ops/s | **+10.38%** |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 3.1442ms | 2.0332ms | 491.8346 Ops/s | 443.8837 Ops/s | **+10.80%** |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.5302ms | 1.9554ms | 511.3968 Ops/s | 479.8250 Ops/s | **+6.58%** |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.1330s | 2.3979ms | 417.0242 Ops/s | 380.2642 Ops/s | **+9.67%** |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 2.7490ms | 2.0913ms | 478.1792 Ops/s | 449.2505 Ops/s | **+6.44%** |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1373ms | 2.0813ms | 480.4647 Ops/s | 470.3826 Ops/s | +2.14% |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.1255s | 2.4385ms | 410.0844 Ops/s | 389.3509 Ops/s | **+5.33%** |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.3697ms | 2.1969ms | 455.1884 Ops/s | 442.7354 Ops/s | +2.81% |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.9902ms | 2.0307ms | 492.4478 Ops/s | 473.3958 Ops/s | +4.02% |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1204s | 2.4278ms | 411.8891 Ops/s | 395.2602 Ops/s | +4.21% |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 3.1092ms | 2.1597ms | 463.0301 Ops/s | 443.5150 Ops/s | +4.40% |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1582s | 17.7407ms | 56.3676 Ops/s | 50.9596 Ops/s | **+10.61%** |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1140s | 16.7430ms | 59.7264 Ops/s | 55.4793 Ops/s | **+7.66%** |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1206s | 16.7807ms | 59.5922 Ops/s | 66.5401 Ops/s | **-10.44%** |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1158s | 16.9191ms | 59.1048 Ops/s | 56.9047 Ops/s | +3.87% |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1167s | 16.8484ms | 59.3529 Ops/s | 56.9008 Ops/s | +4.31% |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.1188s | 14.7184ms | 67.9421 Ops/s | 65.6431 Ops/s | +3.50% |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1113s | 16.5490ms | 60.4267 Ops/s | 57.5279 Ops/s | **+5.04%** |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1213s | 14.9384ms | 66.9414 Ops/s | 65.6199 Ops/s | +2.01% |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1176s | 16.7736ms | 59.6176 Ops/s | 57.0775 Ops/s | +4.45% |
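
The columns in the CPU results appear to be related in a simple way: Ops looks like the reciprocal of Mean, and Change looks like the relative difference between Ops and Ops on Repo HEAD. The sketch below checks that reading against the `test_single` row. The function names `ops_per_second` and `percent_change` are hypothetical and are not taken from the benchmark tooling.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean run time (assumed: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput versus repo HEAD, in percent
    (assumed formula: (Ops - Ops on HEAD) / Ops on HEAD * 100)."""
    return (ops - ops_head) / ops_head * 100.0


# Values copied from the test_single row of the CPU benchmark results.
mean_s = 68.3304e-3   # Mean: 68.3304ms
ops_pr = 14.6348      # Ops
ops_head = 15.0790    # Ops on Repo HEAD

print(f"ops from mean: {ops_per_second(mean_s):.4f} Ops/s")
print(f"change: {percent_change(ops_pr, ops_head):+.2f}%")
```

Under these assumptions the computed values match the reported 14.6348 Ops/s and -2.95% for that row.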

github-actions bot commented Dec 21, 2023

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 92. Improved: 12. Worsened: 5.

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| test_single | 0.1263s | 0.1224s | 8.1681 Ops/s | 7.9567 Ops/s | +2.66% |
| test_sync | 0.1804s | 0.1105s | 9.0527 Ops/s | 9.0143 Ops/s | +0.43% |
| test_async | 0.2053s | 0.1000s | 9.9993 Ops/s | 9.7850 Ops/s | +2.19% |
| test_single_pixels | 0.1326s | 0.1322s | 7.5643 Ops/s | 7.3640 Ops/s | +2.72% |
| test_sync_pixels | 99.5081ms | 96.4932ms | 10.3634 Ops/s | 10.3813 Ops/s | -0.17% |
| test_async_pixels | 0.2583s | 91.3504ms | 10.9469 Ops/s | 10.8535 Ops/s | +0.86% |
| test_simple | 0.9817s | 0.8971s | 1.1147 Ops/s | 1.0814 Ops/s | +3.08% |
| test_transformed | 1.2058s | 1.1412s | 0.8763 Ops/s | 0.8509 Ops/s | +2.98% |
| test_serial | 2.5153s | 2.5133s | 0.3979 Ops/s | 0.3837 Ops/s | +3.71% |
| test_parallel | 2.6223s | 2.5404s | 0.3936 Ops/s | 0.3766 Ops/s | +4.52% |
| test_step_mdp_speed[True-True-True-True-True] | 99.1120μs | 32.6209μs | 30.6552 KOps/s | 29.6994 KOps/s | +3.22% |
| test_step_mdp_speed[True-True-True-True-False] | 38.2800μs | 19.2303μs | 52.0014 KOps/s | 50.4304 KOps/s | +3.12% |
| test_step_mdp_speed[True-True-True-False-True] | 35.0010μs | 18.4051μs | 54.3328 KOps/s | 52.2500 KOps/s | +3.99% |
| test_step_mdp_speed[True-True-True-False-False] | 38.2100μs | 11.1245μs | 89.8916 KOps/s | 87.9658 KOps/s | +2.19% |
| test_step_mdp_speed[True-True-False-True-True] | 61.9620μs | 34.3143μs | 29.1423 KOps/s | 28.2174 KOps/s | +3.28% |
| test_step_mdp_speed[True-True-False-True-False] | 52.0510μs | 21.3019μs | 46.9441 KOps/s | 45.5497 KOps/s | +3.06% |
| test_step_mdp_speed[True-True-False-False-True] | 41.7300μs | 20.2163μs | 49.4651 KOps/s | 47.7797 KOps/s | +3.53% |
| test_step_mdp_speed[True-True-False-False-False] | 28.8500μs | 13.0650μs | 76.5403 KOps/s | 74.3074 KOps/s | +3.00% |
| test_step_mdp_speed[True-False-True-True-True] | 65.7610μs | 35.9814μs | 27.7921 KOps/s | 26.5144 KOps/s | +4.82% |
| test_step_mdp_speed[True-False-True-True-False] | 42.6800μs | 22.8622μs | 43.7402 KOps/s | 41.6874 KOps/s | +4.92% |
| test_step_mdp_speed[True-False-True-False-True] | 93.3610μs | 20.0792μs | 49.8027 KOps/s | 47.3609 KOps/s | **+5.16%** |
| test_step_mdp_speed[True-False-True-False-False] | 35.8700μs | 12.9466μs | 77.2405 KOps/s | 73.1339 KOps/s | **+5.62%** |
| test_step_mdp_speed[True-False-False-True-True] | 65.5410μs | 38.7959μs | 25.7760 KOps/s | 25.4984 KOps/s | +1.09% |
| test_step_mdp_speed[True-False-False-True-False] | 40.0020μs | 25.2284μs | 39.6379 KOps/s | 38.7109 KOps/s | +2.39% |
| test_step_mdp_speed[True-False-False-False-True] | 67.2620μs | 22.1963μs | 45.0526 KOps/s | 43.6937 KOps/s | +3.11% |
| test_step_mdp_speed[True-False-False-False-False] | 42.2310μs | 14.6630μs | 68.1990 KOps/s | 64.6603 KOps/s | **+5.47%** |
| test_step_mdp_speed[False-True-True-True-True] | 63.5100μs | 36.5776μs | 27.3391 KOps/s | 27.1742 KOps/s | +0.61% |
| test_step_mdp_speed[False-True-True-True-False] | 37.4010μs | 23.1730μs | 43.1537 KOps/s | 42.4466 KOps/s | +1.67% |
| test_step_mdp_speed[False-True-True-False-True] | 41.5320μs | 24.7918μs | 40.3358 KOps/s | 39.8822 KOps/s | +1.14% |
| test_step_mdp_speed[False-True-True-False-False] | 32.1510μs | 14.7336μs | 67.8723 KOps/s | 66.2830 KOps/s | +2.40% |
| test_step_mdp_speed[False-True-False-True-True] | 60.5010μs | 38.3750μs | 26.0586 KOps/s | 25.4092 KOps/s | +2.56% |
| test_step_mdp_speed[False-True-False-True-False] | 49.2700μs | 25.0811μs | 39.8706 KOps/s | 38.9063 KOps/s | +2.48% |
| test_step_mdp_speed[False-True-False-False-True] | 51.9710μs | 26.2187μs | 38.1408 KOps/s | 36.8621 KOps/s | +3.47% |
| test_step_mdp_speed[False-True-False-False-False] | 36.5400μs | 16.3736μs | 61.0738 KOps/s | 58.5302 KOps/s | +4.35% |
| test_step_mdp_speed[False-False-True-True-True] | 58.3200μs | 39.8303μs | 25.1065 KOps/s | 24.4588 KOps/s | +2.65% |
| test_step_mdp_speed[False-False-True-True-False] | 45.7810μs | 26.8746μs | 37.2098 KOps/s | 36.2463 KOps/s | +2.66% |
| test_step_mdp_speed[False-False-True-False-True] | 44.9300μs | 25.7164μs | 38.8856 KOps/s | 37.1414 KOps/s | +4.70% |
| test_step_mdp_speed[False-False-True-False-False] | 54.7910μs | 16.6510μs | 60.0563 KOps/s | 58.6548 KOps/s | +2.39% |
| test_step_mdp_speed[False-False-False-True-True] | 60.6510μs | 40.8521μs | 24.4786 KOps/s | 23.4964 KOps/s | +4.18% |
| test_step_mdp_speed[False-False-False-True-False] | 46.2400μs | 28.6391μs | 34.9173 KOps/s | 34.1862 KOps/s | +2.14% |
| test_step_mdp_speed[False-False-False-False-True] | 45.7010μs | 26.9214μs | 37.1451 KOps/s | 35.5648 KOps/s | +4.44% |
| test_step_mdp_speed[False-False-False-False-False] | 48.5810μs | 18.1656μs | 55.0491 KOps/s | 52.7438 KOps/s | +4.37% |
| test_values[generalized_advantage_estimate-True-True] | 25.0051ms | 24.5664ms | 40.7060 Ops/s | 37.8224 Ops/s | **+7.62%** |
| test_values[vec_generalized_advantage_estimate-True-True] | 84.9494ms | 3.2755ms | 305.2970 Ops/s | 290.4319 Ops/s | **+5.12%** |
| test_values[td0_return_estimate-False-False] | 0.1010ms | 63.6188μs | 15.7186 KOps/s | 15.0201 KOps/s | +4.65% |
| test_values[td1_return_estimate-False-False] | 53.4564ms | 53.0915ms | 18.8354 Ops/s | 17.6329 Ops/s | **+6.82%** |
| test_values[vec_td1_return_estimate-False-False] | 2.1745ms | 1.7757ms | 563.1424 Ops/s | 553.1802 Ops/s | +1.80% |
| test_values[td_lambda_return_estimate-True-False] | 86.8149ms | 84.9204ms | 11.7757 Ops/s | 10.9924 Ops/s | **+7.13%** |
| test_values[vec_td_lambda_return_estimate-True-False] | 2.1024ms | 1.7731ms | 563.9887 Ops/s | 554.5880 Ops/s | +1.70% |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 23.8222ms | 23.5228ms | 42.5120 Ops/s | 39.3192 Ops/s | **+8.12%** |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.8655ms | 0.7067ms | 1.4150 KOps/s | 1.3502 KOps/s | +4.80% |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8148ms | 0.6699ms | 1.4929 KOps/s | 1.4386 KOps/s | +3.78% |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.6267ms | 1.4622ms | 683.9042 Ops/s | 669.7730 Ops/s | +2.11% |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9801ms | 0.6886ms | 1.4523 KOps/s | 1.4087 KOps/s | +3.09% |
| test_dqn_speed | 14.3086ms | 7.4202ms | 134.7673 Ops/s | 129.5360 Ops/s | +4.04% |
| test_ddpg_speed | 15.4713ms | 14.6385ms | 68.3131 Ops/s | 66.9350 Ops/s | +2.06% |
| test_sac_speed | 30.1025ms | 29.3252ms | 34.1004 Ops/s | 32.9333 Ops/s | +3.54% |
| test_redq_speed | 35.9988ms | 35.2222ms | 28.3912 Ops/s | 27.5331 Ops/s | +3.12% |
| test_redq_deprec_speed | 25.8782ms | 24.3945ms | 40.9929 Ops/s | 40.1440 Ops/s | +2.11% |
| test_td3_speed | 29.0953ms | 19.9290ms | 50.1780 Ops/s | 48.9169 Ops/s | +2.58% |
| test_cql_speed | 85.1995ms | 84.2886ms | 11.8640 Ops/s | 11.5670 Ops/s | +2.57% |
| test_a2c_speed | 28.0852ms | 26.9953ms | 37.0435 Ops/s | 35.9246 Ops/s | +3.11% |
| test_ppo_speed | 28.0571ms | 27.1024ms | 36.8971 Ops/s | 32.3340 Ops/s | **+14.11%** |
| test_reinforce_speed | 27.1420ms | 26.1941ms | 38.1766 Ops/s | 37.1710 Ops/s | +2.71% |
| test_iql_speed | 59.0143ms | 57.9521ms | 17.2556 Ops/s | 16.8403 Ops/s | +2.47% |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.1306ms | 2.5537ms | 391.5815 Ops/s | 346.7143 Ops/s | **+12.94%** |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 3.8396ms | 2.7414ms | 364.7712 Ops/s | 357.9854 Ops/s | +1.90% |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.6673ms | 2.7305ms | 366.2360 Ops/s | 358.7241 Ops/s | +2.09% |
| test_sample_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.7617ms | 2.5416ms | 393.4503 Ops/s | 384.9851 Ops/s | +2.20% |
| test_sample_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 4.0481ms | 2.7293ms | 366.3963 Ops/s | 357.8021 Ops/s | +2.40% |
| test_sample_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.1390s | 3.1182ms | 320.6931 Ops/s | 357.1290 Ops/s | **-10.20%** |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.8855ms | 2.5472ms | 392.5823 Ops/s | 381.2620 Ops/s | +2.97% |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 4.4635ms | 2.7351ms | 365.6118 Ops/s | 357.0202 Ops/s | +2.41% |
| test_sample_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.1294s | 3.0736ms | 325.3548 Ops/s | 315.6110 Ops/s | +3.09% |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.9388ms | 2.5193ms | 396.9363 Ops/s | 382.3682 Ops/s | +3.81% |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 4.1840ms | 2.7376ms | 365.2811 Ops/s | 356.8318 Ops/s | +2.37% |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 4.1651ms | 2.7461ms | 364.1578 Ops/s | 315.3223 Ops/s | **+15.49%** |
| test_iterate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.0411ms | 2.5435ms | 393.1539 Ops/s | 384.7634 Ops/s | +2.18% |
| test_iterate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 3.8681ms | 2.7427ms | 364.6055 Ops/s | 357.5287 Ops/s | +1.98% |
| test_iterate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.9654ms | 2.7295ms | 366.3684 Ops/s | 356.2976 Ops/s | +2.83% |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.1086ms | 2.5471ms | 392.6101 Ops/s | 384.0285 Ops/s | +2.23% |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.1292s | 3.1024ms | 322.3348 Ops/s | 357.4859 Ops/s | **-9.83%** |
| test_iterate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.2018ms | 2.7309ms | 366.1763 Ops/s | 359.6746 Ops/s | +1.81% |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1852s | 18.5851ms | 53.8066 Ops/s | 53.2781 Ops/s | +0.99% |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 0.1275s | 17.3602ms | 57.6030 Ops/s | 56.8704 Ops/s | +1.29% |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 0.1296s | 17.3399ms | 57.6704 Ops/s | 56.7771 Ops/s | +1.57% |
| test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1295s | 17.2683ms | 57.9096 Ops/s | 65.4414 Ops/s | **-11.51%** |
| test_populate_rb[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 0.1267s | 15.0540ms | 66.4273 Ops/s | 56.4278 Ops/s | **+17.72%** |
| test_populate_rb[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 0.1265s | 17.2811ms | 57.8665 Ops/s | 65.2194 Ops/s | **-11.27%** |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1253s | 17.2278ms | 58.0459 Ops/s | 56.7528 Ops/s | +2.28% |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 0.1270s | 17.2716ms | 57.8986 Ops/s | 56.8173 Ops/s | +1.90% |
| test_populate_rb[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 0.1228s | 17.0898ms | 58.5145 Ops/s | 64.4418 Ops/s | **-9.20%** |
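
The Improved and Worsened totals in the GPU summary seem to count only the emphasized rows, which in these results are the ones whose change is at least about 5% in either direction. The sketch below tallies a few rows on that basis. The 5% cutoff is an assumption inferred from the data, not a documented setting, and `classify_change` is a hypothetical helper.

```python
from collections import Counter

# Assumed cutoff, inferred from which rows are emphasized in the results.
SIGNIFICANCE_THRESHOLD_PCT = 5.0


def classify_change(change_pct: float,
                    threshold: float = SIGNIFICANCE_THRESHOLD_PCT) -> str:
    """Label a benchmark by its percent change relative to repo HEAD."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# A few Change values copied from the GPU benchmark results.
gpu_changes = {
    "test_ppo_speed": 14.11,
    "test_single": 2.66,
    "test_sync_pixels": -0.17,
    "test_populate_rb[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]": -11.51,
}

tally = Counter(classify_change(pct) for pct in gpu_changes.values())
print(dict(tally))
```

Applied to all 92 GPU rows, the same rule would be consistent with the reported 12 improved and 5 worsened. Several of the large swings are in replay buffer benchmarks whose Max is far above their Mean, so they may reflect run-to-run noise rather than an effect of this change.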

@vmoens marked this pull request as ready for review December 23, 2023 06:48
@vmoens merged commit 15950d1 into main Dec 23, 2023
63 of 64 checks passed
@vmoens deleted the fix-rlhf branch December 23, 2023 06:48