[Refactor,Feature] Refactor collector shapes and stack_result in sync collector #1994
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/1994
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEVs
There are 1 currently active SEVs. If your PR is affected, please view them below:
❌ 13 New Failures
As of commit c742a43 with merge base 4bce371: NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_single | 53.1089ms | 52.4619ms | 19.0615 Ops/s | 17.8592 Ops/s | |
| test_sync | 35.5930ms | 29.7961ms | 33.5614 Ops/s | 30.5180 Ops/s | |
| test_async | 48.0734ms | 27.8519ms | 35.9041 Ops/s | 35.5460 Ops/s | |
| test_simple | 0.3818s | 0.3279s | 3.0499 Ops/s | 3.0865 Ops/s | |
| test_transformed | 0.5020s | 0.4594s | 2.1768 Ops/s | 2.1565 Ops/s | |
| test_serial | 1.1867s | 1.1606s | 0.8616 Ops/s | 0.8428 Ops/s | |
| test_parallel | 1.0493s | 1.0099s | 0.9902 Ops/s | 0.9702 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.1250ms | 21.3903μs | 46.7502 KOps/s | 46.7949 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 35.2950μs | 13.0374μs | 76.7025 KOps/s | 76.8782 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 42.2580μs | 12.3168μs | 81.1900 KOps/s | 80.0651 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 26.4190μs | 7.5499μs | 132.4519 KOps/s | 129.5662 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 74.0880μs | 22.3446μs | 44.7536 KOps/s | 44.1395 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 40.1950μs | 14.2987μs | 69.9366 KOps/s | 69.4999 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 38.1520μs | 13.6421μs | 73.3025 KOps/s | 73.1831 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 32.5910μs | 8.6569μs | 115.5152 KOps/s | 112.7608 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 57.0970μs | 23.5144μs | 42.5271 KOps/s | 41.9789 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 50.8550μs | 15.3461μs | 65.1630 KOps/s | 63.6732 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 34.3850μs | 13.5242μs | 73.9413 KOps/s | 72.0770 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 33.3020μs | 8.7365μs | 114.4623 KOps/s | 113.3148 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 52.3570μs | 24.8933μs | 40.1714 KOps/s | 39.5818 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 37.2990μs | 16.7168μs | 59.8199 KOps/s | 60.3541 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 50.5350μs | 14.5713μs | 68.6279 KOps/s | 67.2846 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 26.2490μs | 9.6708μs | 103.4043 KOps/s | 99.4530 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 53.1390μs | 23.6391μs | 42.3028 KOps/s | 41.1045 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 36.0170μs | 15.5179μs | 64.4416 KOps/s | 63.7756 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 42.8390μs | 15.5322μs | 64.3823 KOps/s | 61.4712 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 50.5540μs | 9.8581μs | 101.4390 KOps/s | 99.9243 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 36.6280μs | 25.2565μs | 39.5938 KOps/s | 38.7349 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 39.2430μs | 16.8569μs | 59.3230 KOps/s | 59.9915 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 45.7250μs | 16.8425μs | 59.3737 KOps/s | 58.6620 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 35.4870μs | 11.1355μs | 89.8032 KOps/s | 88.5966 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 54.7720μs | 26.1864μs | 38.1877 KOps/s | 37.5347 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 41.4570μs | 17.9696μs | 55.6497 KOps/s | 55.3146 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 45.6250μs | 16.6239μs | 60.1545 KOps/s | 58.4985 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 32.2400μs | 11.1159μs | 89.9610 KOps/s | 88.1230 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 54.9230μs | 27.3030μs | 36.6260 KOps/s | 36.2383 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 60.4120μs | 19.2345μs | 51.9900 KOps/s | 52.4694 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 57.1770μs | 17.6069μs | 56.7959 KOps/s | 54.4672 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 38.3810μs | 12.1714μs | 82.1599 KOps/s | 81.0185 KOps/s | |
| test_values[generalized_advantage_estimate-True-True] | 9.8043ms | 9.0875ms | 110.0414 Ops/s | 107.2400 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 37.5890ms | 35.0775ms | 28.5083 Ops/s | 28.7461 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2203ms | 0.1611ms | 6.2061 KOps/s | 6.2029 KOps/s | |
| test_values[td1_return_estimate-False-False] | 25.7561ms | 22.7775ms | 43.9030 Ops/s | 43.0699 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 38.1532ms | 35.2206ms | 28.3925 Ops/s | 28.6193 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 35.0813ms | 32.7377ms | 30.5458 Ops/s | 30.0173 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 36.8560ms | 35.2269ms | 28.3874 Ops/s | 28.6598 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 8.6559ms | 8.0781ms | 123.7914 Ops/s | 121.1603 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 2.4300ms | 1.9107ms | 523.3781 Ops/s | 529.2330 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4744ms | 0.3450ms | 2.8986 KOps/s | 2.8970 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 46.1720ms | 44.9930ms | 22.2257 Ops/s | 22.8807 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 3.6552ms | 3.0170ms | 331.4513 Ops/s | 332.6410 Ops/s | |
| test_dqn_speed | 1.5418ms | 1.3210ms | 757.0176 Ops/s | 759.5988 Ops/s | |
| test_ddpg_speed | 3.4279ms | 2.6397ms | 378.8267 Ops/s | 381.2410 Ops/s | |
| test_sac_speed | 8.3517ms | 8.0126ms | 124.8033 Ops/s | 125.0693 Ops/s | |
| test_redq_speed | 14.8748ms | 12.8585ms | 77.7696 Ops/s | 77.6354 Ops/s | |
| test_redq_deprec_speed | 13.1945ms | 12.7046ms | 78.7114 Ops/s | 78.2568 Ops/s | |
| test_td3_speed | 8.1084ms | 7.9417ms | 125.9182 Ops/s | 124.5643 Ops/s | |
| test_cql_speed | 0.1139s | 38.9495ms | 25.6743 Ops/s | 27.7055 Ops/s | |
| test_a2c_speed | 8.5951ms | 7.2642ms | 137.6606 Ops/s | 138.8972 Ops/s | |
| test_ppo_speed | 8.1655ms | 7.5537ms | 132.3863 Ops/s | 133.3126 Ops/s | |
| test_reinforce_speed | 7.3639ms | 6.4781ms | 154.3657 Ops/s | 155.0059 Ops/s | |
| test_iql_speed | 32.5968ms | 32.0920ms | 31.1604 Ops/s | 31.3865 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.1668ms | 2.0510ms | 487.5612 Ops/s | 478.3535 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6458ms | 0.4924ms | 2.0307 KOps/s | 2.0466 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7082ms | 0.4665ms | 2.1436 KOps/s | 2.1760 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 2.9805ms | 2.0553ms | 486.5561 Ops/s | 491.5167 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.8988ms | 0.4852ms | 2.0612 KOps/s | 2.0990 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6528ms | 0.4577ms | 2.1851 KOps/s | 2.1917 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5383ms | 1.2749ms | 784.3799 Ops/s | 792.6227 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6343ms | 1.2081ms | 827.7760 Ops/s | 834.7936 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.4567ms | 2.1776ms | 459.2168 Ops/s | 452.2404 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9387ms | 0.6012ms | 1.6634 KOps/s | 1.6702 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 96.2632ms | 0.6454ms | 1.5493 KOps/s | 1.7655 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.2259ms | 2.0476ms | 488.3712 Ops/s | 484.7456 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6462ms | 0.4926ms | 2.0300 KOps/s | 2.0431 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 3.7594ms | 0.4710ms | 2.1230 KOps/s | 2.1677 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.2010ms | 2.0844ms | 479.7431 Ops/s | 478.5863 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.5640ms | 0.4834ms | 2.0687 KOps/s | 2.0792 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 3.5818ms | 0.4670ms | 2.1413 KOps/s | 2.1529 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 2.2854ms | 2.1726ms | 460.2683 Ops/s | 453.1496 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.1800ms | 0.6076ms | 1.6458 KOps/s | 1.6660 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7689ms | 0.5763ms | 1.7351 KOps/s | 1.7422 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1024s | 7.3330ms | 136.3692 Ops/s | 144.2807 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 13.9910ms | 11.9698ms | 83.5438 Ops/s | 83.6562 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.5816ms | 1.0728ms | 932.1422 Ops/s | 993.7897 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 85.9577ms | 5.1925ms | 192.5868 Ops/s | 149.4518 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 93.1944ms | 13.5613ms | 73.7390 Ops/s | 83.6510 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.5437ms | 1.0427ms | 959.0321 Ops/s | 944.3283 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 88.8976ms | 5.6054ms | 178.3986 Ops/s | 183.2893 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 14.7068ms | 12.3973ms | 80.6627 Ops/s | 71.8944 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.8868ms | 1.3606ms | 734.9822 Ops/s | 717.0357 Ops/s |

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_single | 99.5076ms | 98.5673ms | 10.1454 Ops/s | 9.3467 Ops/s | |
| test_sync | 91.5732ms | 87.6182ms | 11.4131 Ops/s | 11.6097 Ops/s | |
| test_async | 0.1747s | 87.6835ms | 11.4047 Ops/s | 11.5133 Ops/s | |
| test_single_pixels | 0.1094s | 0.1090s | 9.1743 Ops/s | 9.0476 Ops/s | |
| test_sync_pixels | 68.2034ms | 65.8578ms | 15.1842 Ops/s | 15.1073 Ops/s | |
| test_async_pixels | 0.1218s | 55.3405ms | 18.0700 Ops/s | 18.0921 Ops/s | |
| test_simple | 0.6423s | 0.6417s | 1.5584 Ops/s | 1.4795 Ops/s | |
| test_transformed | 0.8439s | 0.8428s | 1.1865 Ops/s | 1.1405 Ops/s | |
| test_serial | 2.0915s | 2.0288s | 0.4929 Ops/s | 0.4724 Ops/s | |
| test_parallel | 1.8423s | 1.7811s | 0.5615 Ops/s | 0.5512 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 88.6910μs | 33.5437μs | 29.8119 KOps/s | 30.3453 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 43.9100μs | 19.5849μs | 51.0596 KOps/s | 50.1738 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 35.8900μs | 18.6420μs | 53.6424 KOps/s | 52.5745 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 42.6010μs | 11.2057μs | 89.2401 KOps/s | 87.4749 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 54.5510μs | 34.3886μs | 29.0794 KOps/s | 28.3625 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 42.0000μs | 21.3755μs | 46.7826 KOps/s | 45.7955 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 36.9910μs | 20.2928μs | 49.2786 KOps/s | 47.9571 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 37.6100μs | 13.0823μs | 76.4390 KOps/s | 74.6943 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 90.0610μs | 36.2543μs | 27.5829 KOps/s | 26.8850 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 42.2400μs | 23.3749μs | 42.7810 KOps/s | 41.6122 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 47.5500μs | 20.4923μs | 48.7989 KOps/s | 47.8742 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 30.6300μs | 13.0634μs | 76.5500 KOps/s | 74.9468 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.1067ms | 37.4070μs | 26.7330 KOps/s | 25.9027 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 91.7510μs | 25.1786μs | 39.7163 KOps/s | 38.8560 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 38.8610μs | 22.1574μs | 45.1316 KOps/s | 44.6352 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 49.2200μs | 14.9155μs | 67.0446 KOps/s | 66.6733 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 53.0900μs | 36.3837μs | 27.4849 KOps/s | 27.2041 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 66.1100μs | 23.4607μs | 42.6244 KOps/s | 41.8236 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 42.1310μs | 24.2554μs | 41.2279 KOps/s | 40.6812 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 40.0510μs | 14.9457μs | 66.9088 KOps/s | 66.1627 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 59.5210μs | 37.8985μs | 26.3863 KOps/s | 25.7345 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 41.1310μs | 25.0729μs | 39.8837 KOps/s | 38.2875 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 41.2710μs | 25.7931μs | 38.7700 KOps/s | 37.7910 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 33.3300μs | 16.6132μs | 60.1931 KOps/s | 58.6200 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 65.7110μs | 40.2592μs | 24.8390 KOps/s | 24.7378 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 53.7410μs | 26.9172μs | 37.1509 KOps/s | 36.2873 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 92.2810μs | 25.8202μs | 38.7294 KOps/s | 37.9929 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 33.8210μs | 16.5543μs | 60.4071 KOps/s | 59.1055 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 58.4100μs | 40.6822μs | 24.5808 KOps/s | 23.6408 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 51.0000μs | 28.4764μs | 35.1168 KOps/s | 33.9722 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 46.3910μs | 27.3161μs | 36.6085 KOps/s | 35.3213 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 34.2400μs | 18.5225μs | 53.9883 KOps/s | 53.4061 KOps/s | |
| test_values[generalized_advantage_estimate-True-True] | 22.9465ms | 22.6116ms | 44.2251 Ops/s | 43.0435 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 83.7874ms | 3.2079ms | 311.7277 Ops/s | 304.1930 Ops/s | |
| test_values[td0_return_estimate-False-False] | 86.2010μs | 60.7794μs | 16.4529 KOps/s | 15.7910 KOps/s | |
| test_values[td1_return_estimate-False-False] | 49.2283ms | 48.6248ms | 20.5656 Ops/s | 19.9354 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 2.0336ms | 1.7116ms | 584.2618 Ops/s | 579.7887 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 82.9479ms | 81.2637ms | 12.3056 Ops/s | 12.4587 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 2.0806ms | 1.7208ms | 581.1133 Ops/s | 578.5999 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 22.9709ms | 22.1209ms | 45.2062 Ops/s | 46.5883 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 0.8616ms | 0.6521ms | 1.5334 KOps/s | 1.4902 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.6793ms | 0.6076ms | 1.6458 KOps/s | 1.6110 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5060ms | 1.4184ms | 705.0085 Ops/s | 700.4333 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.9378ms | 0.6199ms | 1.6131 KOps/s | 1.5415 KOps/s | |
| test_dqn_speed | 8.0564ms | 1.4167ms | 705.8470 Ops/s | 710.0952 Ops/s | |
| test_ddpg_speed | 2.8392ms | 2.6670ms | 374.9578 Ops/s | 371.4999 Ops/s | |
| test_sac_speed | 8.4142ms | 7.7995ms | 128.2140 Ops/s | 126.8708 Ops/s | |
| test_redq_speed | 11.7780ms | 10.4265ms | 95.9091 Ops/s | 94.9090 Ops/s | |
| test_redq_deprec_speed | 11.7469ms | 10.9656ms | 91.1946 Ops/s | 88.6271 Ops/s | |
| test_td3_speed | 8.0616ms | 7.7992ms | 128.2178 Ops/s | 128.2228 Ops/s | |
| test_cql_speed | 26.0683ms | 25.2926ms | 39.5372 Ops/s | 39.1353 Ops/s | |
| test_a2c_speed | 6.0259ms | 5.6002ms | 178.5643 Ops/s | 181.6196 Ops/s | |
| test_ppo_speed | 6.2873ms | 5.9761ms | 167.3345 Ops/s | 170.2161 Ops/s | |
| test_reinforce_speed | 5.4095ms | 4.5331ms | 220.6016 Ops/s | 219.2982 Ops/s | |
| test_iql_speed | 20.4522ms | 19.6837ms | 50.8035 Ops/s | 50.3173 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 3.1392ms | 2.8778ms | 347.4923 Ops/s | 347.7342 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.2408ms | 0.5462ms | 1.8309 KOps/s | 1.8050 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7325ms | 0.5258ms | 1.9017 KOps/s | 1.8570 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.0943ms | 2.8991ms | 344.9310 Ops/s | 342.8397 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.3471ms | 0.5426ms | 1.8428 KOps/s | 1.8268 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.8125ms | 0.5237ms | 1.9097 KOps/s | 1.8965 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 3.9098ms | 1.4612ms | 684.3508 Ops/s | 657.9487 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6256ms | 1.4152ms | 706.6100 Ops/s | 687.4975 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.1949ms | 3.0009ms | 333.2360 Ops/s | 333.9926 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.9300ms | 0.6681ms | 1.4968 KOps/s | 1.4744 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.4668ms | 0.6539ms | 1.5292 KOps/s | 1.5445 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 2.9668ms | 2.8590ms | 349.7754 Ops/s | 346.3722 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.7978ms | 0.5430ms | 1.8415 KOps/s | 1.7937 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 4.2654ms | 0.5363ms | 1.8645 KOps/s | 1.8448 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 3.1183ms | 2.9096ms | 343.6874 Ops/s | 342.0067 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.2687ms | 0.5423ms | 1.8441 KOps/s | 1.8367 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6665ms | 0.5254ms | 1.9032 KOps/s | 1.8860 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 3.1426ms | 3.0017ms | 333.1396 Ops/s | 330.9529 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 0.8267ms | 0.6713ms | 1.4897 KOps/s | 1.4702 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 4.6274ms | 0.6588ms | 1.5178 KOps/s | 1.5137 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.1148s | 8.9723ms | 111.4541 Ops/s | 114.9524 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 16.3224ms | 14.0385ms | 71.2328 Ops/s | 69.6984 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.1576ms | 1.0645ms | 939.4067 Ops/s | 880.1653 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 0.1117s | 8.8931ms | 112.4473 Ops/s | 148.3898 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 16.2913ms | 14.0095ms | 71.3802 Ops/s | 70.0752 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.3436ms | 1.1956ms | 836.3700 Ops/s | 869.4117 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 0.1096s | 7.1626ms | 139.6148 Ops/s | 107.1246 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 16.6940ms | 14.4449ms | 69.2284 Ops/s | 68.7534 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 7.8783ms | 1.6527ms | 605.0576 Ops/s | 693.1828 Ops/s |

@albertbou92 I also took this opportunity to refactor preemption. Now you can stack (with padding) or cat (with masking) the values coming from the collectors. In a nutshell:

    # in SyncDataCollector.rollout
    def rollout(self):
        ...
        if self.interruptor is not None:
            self._collected_frames += frame_count

    # in MultiSyncDataCollector.iterator
    def iterator(self):
        if self.preemptive_threshold and ...:
            if self._collected_frames > self.frames_per_batch * self.preemptive_threshold:
                break  # or similar
Yes, the preemption was based on the idea that all workers collect a fixed number of frames and cannot communicate during collection (this was also the assumption in DDPPO). If we can have a shared global value that tracks how many frames have been collected globally without impacting speed, that would be much better. Yes, I can give it a shot!
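To make the "shared global value" idea concrete, here is a minimal sketch using threads and a lock. It is not the TorchRL implementation: `SharedFrameCounter`, `worker`, `collect` and the `chunk` parameter are hypothetical names introduced for illustration, and real collectors run in separate processes rather than threads. Each worker stops as soon as the global frame count reaches `frames_per_batch * preemptive_threshold`.

```python
import threading


class SharedFrameCounter:
    """Thread-safe count of frames collected by all workers (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._value = 0

    def add_if_below(self, n: int, limit: float) -> bool:
        """Atomically add n frames unless the global limit is already reached."""
        with self._lock:
            if self._value >= limit:
                return False
            self._value += n
            return True

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def worker(counter, frames_per_worker, chunk, limit, collected, idx):
    """Collect frames in chunks until the local quota or the global limit is hit."""
    frames = 0
    while frames < frames_per_worker:
        if not counter.add_if_below(chunk, limit):
            break  # preempted: enough frames were collected globally
        frames += chunk
    collected[idx] = frames


def collect(num_workers=4, frames_per_batch=400, preemptive_threshold=0.5, chunk=10):
    """Run all workers and return (frames per worker, global frame count)."""
    limit = frames_per_batch * preemptive_threshold
    frames_per_worker = frames_per_batch // num_workers
    counter = SharedFrameCounter()
    collected = [0] * num_workers
    threads = [
        threading.Thread(
            target=worker,
            args=(counter, frames_per_worker, chunk, limit, collected, i),
        )
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return collected, counter.value


per_worker, total = collect()
print(per_worker, total)
```

Because the check and the increment happen under the same lock, the global count never overshoots the limit by more than one chunk. How the frames end up distributed across workers depends on scheduling, which is why results collected this way would need padding or masking before being merged.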

@matteobettini This PR addresses an issue that you pointed out a long time ago regarding the stacking of results in MultiSync.
The idea is to build the collector with
Sorry it took so long to solve this!
Amazing! So this is what I wanted, yes!
docs/source/reference/collectors.rst
Outdated
+--------------------+---------------------+-------------+---------------+------------------------------+
| Single env         | [T]                 | `[B, T]`    | `[B*(T//B)`   | [T]                          |
+--------------------+---------------------+-------------+---------------+------------------------------+
| Batched env (n=P)  | [P, T]              | `[B, P, T]` | `[B * P, T]`  | [P, T]                       |
[B * P, T], this is given that cat_results=0
my personal preference, to avoid having cat_results be both int and str, is to have 2 args:
- cat_results: true or false
- collectors_dim: dimension where to cat or stack (or similar nicer name)
but then what is collector_dim=0 and cat_results=False? That should not be allowed. So we need a complicated doc that lists the available configs
If we allow them we also need to test every single combination...
Note that the goal is to get rid of this arg in v0.6 or v0.7, so adding 2 args is twice as bad as adding 1, given that we want to break things in the long run
collector_dim=0 and cat_results=False
isn't this what happens currently when cat_results="stack"?
Anyway yes how you prefer
Yes sorry, I meant collector_dim=-1 or 1 or anything that isn't 0
you can't stack along any dim (and should not be able to do so), that's what I mean. Stack should be along dim 0
I don't quite see this: in the example table we are referring to, in the row for stacking in MultiSync, I can see all of
- B, P, T (collector_dim=0 or -3, cat_results=False) (currently the only available option)
- P, B, T (collector_dim=1 or -2, cat_results=False)
- P, T, B (collector_dim=2 or -1, cat_results=False)
There is no cat_results=False, it can only be a string ("stack") or an int
cat_results="stack" => stack B collectors along 0 => [B, P, T]
cat_results=0 => cat B collectors along 0 => [B * P, T]
cat_results=-1 => cat B collectors along -1 => [P, B * T]
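The three cases above can be checked with a small shape calculator. This is only a sketch of the rule being discussed, not TorchRL code: `merged_shape` is a hypothetical helper, and it assumes every one of the B collectors returns a batch of the same shape. Concatenating B results of shape [P, T] along the last dimension gives [P, B * T], following the usual `cat` semantics.

```python
from typing import Sequence, Tuple, Union


def merged_shape(
    worker_shape: Sequence[int],
    num_workers: int,
    cat_results: Union[str, int],
) -> Tuple[int, ...]:
    """Shape obtained by merging `num_workers` results of shape `worker_shape`.

    Hypothetical helper: "stack" adds a new leading dim of size B, while an
    int concatenates along that (possibly negative) dim.
    """
    shape = tuple(worker_shape)
    if cat_results == "stack":
        return (num_workers, *shape)
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(cat_results, int) and not isinstance(cat_results, bool):
        ndim = len(shape)
        if not -ndim <= cat_results < ndim:
            raise ValueError(f"dim {cat_results} out of range for shape {shape}")
        dim = cat_results % ndim
        return shape[:dim] + (shape[dim] * num_workers,) + shape[dim + 1:]
    raise ValueError('cat_results must be "stack" or an int')


B, P, T = 3, 4, 5
print(merged_shape((P, T), B, "stack"))  # (3, 4, 5)  i.e. [B, P, T]
print(merged_shape((P, T), B, 0))        # (12, 5)    i.e. [B * P, T]
print(merged_shape((P, T), B, -1))       # (4, 15)    i.e. [P, B * T]
```

Keeping a single argument means there is no invalid combination to document or test, which is the concern raised earlier in the thread about a separate `collector_dim`.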
I think I didn't get what you were trying to say though
I know, that is the point of this thread, I am discussing why for me it makes sense to separate them.
P, B, T (collector_dim=1 or -2, cat_results=False)
P, T, B (collector_dim=2 or -1, cat_results=False)
I don't think we should support these. If you want that you can transpose your resulting tensordict. Implementing it would require a lot of expensive tests for a feature we will deprecate in 2 releases.
This PR:
TODO:
- stack_result keyword in docstrings
- stack_result raise an exception in MultiaSyncDataCollector

Check the updated doc below in collectors to learn more