[BugFix] Fix stale model reference in MultiCollector weight sync after device-cast #3587
Merged
…eepcopy

When policy_device differs from the policy's native device, _get_policy_and_device creates a deepcopy on the target device. However, the weight sync scheme's model reference was set before the deepcopy (in _make_policy_factory), so subsequent weight updates via the background thread would silently update the original (unused) object instead of the collector's actual policy. This caused one worker to never receive weight updates in MultiAsyncCollector when workers had heterogeneous devices.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
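The failure mode described above can be reproduced without any RL machinery. The sketch below is not the PR's code: `TinyPolicy`, `SyncScheme`, and the `"cuda:0"` string are hypothetical stand-ins, and the only assumption carried over from the description is that a device cast produces a deep copy while the sync scheme keeps pointing at the original object.

```python
import copy


class TinyPolicy:
    """Hypothetical stand-in for a policy: one weight list and a device tag."""

    def __init__(self, weights, device="cpu"):
        self.weights = list(weights)
        self.device = device

    def to(self, device):
        # Mimic the device-cast path: a different device yields a deep copy.
        if device == self.device:
            return self
        clone = copy.deepcopy(self)
        clone.device = device
        return clone


class SyncScheme:
    """Hypothetical weight-sync scheme that writes into a referenced model."""

    def __init__(self, model):
        self.model = model

    def apply(self, new_weights):
        # In-place update of whatever object the scheme currently references.
        self.model.weights[:] = new_weights


original = TinyPolicy([0.0, 0.0])
scheme = SyncScheme(original)          # reference captured before the cast
worker_policy = original.to("cuda:0")  # the copy the worker actually runs

# Bug: the update lands on `original`, which the worker never uses.
scheme.apply([1.0, 1.0])
stale_result = list(worker_policy.weights)

# Fix: re-point the scheme at the object the worker really uses.
if scheme.model is not worker_policy:
    scheme.model = worker_policy
scheme.apply([1.0, 1.0])
fixed_result = list(worker_policy.weights)
```

Here `stale_result` still holds the initial weights, while `fixed_result` reflects the update once the reference is corrected. An identity check (`is not`) is used because the copy compares equal in structure but is a distinct object.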
Adds logging at key points in the weight sync pipeline to diagnose why one async worker may not be receiving weight updates:

- MultiAsyncCollector: log which worker produced each batch, and weight param fingerprint (sum) when update_policy_weights_ is called
- _runner.py: log policy param fingerprint at rollout start, and whether the stale-model-reference fix fires
- _mp.py send(): log number of transports and weight fingerprint
- _mp.py _background_receive_loop(): log param fingerprint BEFORE and AFTER weight application per worker, plus model identity

All gated behind DEBUG level (torchrl_logger.isEnabledFor(10)).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
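The fingerprint-and-gate pattern in this commit can be illustrated with the standard library alone. This is a rough sketch under stated assumptions, not the code added by the PR: the logger name, `param_fingerprint`, `log_fingerprint`, and the plain-dict parameter layout are all hypothetical, and real parameters would be tensors rather than lists of floats.

```python
import logging

logger = logging.getLogger("weight_sync_debug")  # hypothetical logger name


def param_fingerprint(params):
    """Cheap fingerprint: the sum of all parameter values."""
    return sum(sum(values) for values in params.values())


def log_fingerprint(tag, params):
    # Skip the reduction entirely unless DEBUG (numeric level 10) is enabled.
    if not logger.isEnabledFor(logging.DEBUG):
        return None
    fingerprint = param_fingerprint(params)
    logger.debug("%s: param fingerprint=%.6f", tag, fingerprint)
    return fingerprint


# Hypothetical parameters, stored as plain lists for illustration.
params = {"linear.weight": [0.5, -0.25], "linear.bias": [1.0]}

logger.setLevel(logging.INFO)
skipped = log_fingerprint("before update", params)   # gate closed

logger.setLevel(logging.DEBUG)
computed = log_fingerprint("before update", params)  # gate open
```

Checking `isEnabledFor` before computing the sum keeps the diagnostic free when debugging is off. A sum is only a coarse signal: two different parameter sets can share the same value, so a changed fingerprint proves an update happened, while an unchanged one merely suggests it did not.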
Contributor

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 78.0560μs | 76.9697μs | 12.9921 KOps/s | 12.5707 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1420ms | 0.1382ms | 7.2361 KOps/s | 7.2265 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1047s | 0.1042s | 9.5985 Ops/s | 9.2060 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5526μs | 2.5322μs | 394.9093 KOps/s | 386.0641 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 38.5535μs | 38.1551μs | 26.2088 KOps/s | 28.2183 KOps/s | |
| test_simple | 0.5524s | 0.5462s | 1.8309 Ops/s | 1.7822 Ops/s | |
| test_transformed | 1.1939s | 1.1031s | 0.9065 Ops/s | 0.9177 Ops/s | |
| test_serial | 1.6573s | 1.6496s | 0.6062 Ops/s | 0.5938 Ops/s | |
| test_parallel | 0.9969s | 0.9940s | 1.0060 Ops/s | 0.9970 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.2021ms | 39.6338μs | 25.2310 KOps/s | 24.8805 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 46.9810μs | 21.9805μs | 45.4949 KOps/s | 45.0283 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 59.6210μs | 23.4043μs | 42.7273 KOps/s | 44.2039 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 37.2010μs | 12.3613μs | 80.8975 KOps/s | 81.3374 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 79.8910μs | 43.1638μs | 23.1675 KOps/s | 23.4459 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 55.0500μs | 24.5238μs | 40.7768 KOps/s | 40.5577 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 0.1053ms | 25.8015μs | 38.7575 KOps/s | 40.3099 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 51.2010μs | 15.0839μs | 66.2958 KOps/s | 67.5398 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 74.6710μs | 45.9358μs | 21.7695 KOps/s | 22.5052 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 53.3710μs | 27.2811μs | 36.6555 KOps/s | 37.2960 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 55.2610μs | 25.6570μs | 38.9757 KOps/s | 40.2623 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 44.5300μs | 14.7705μs | 67.7025 KOps/s | 67.3860 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.1154ms | 48.0401μs | 20.8159 KOps/s | 21.1825 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 69.6900μs | 29.3416μs | 34.0813 KOps/s | 34.1542 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 58.7910μs | 27.6349μs | 36.1862 KOps/s | 36.1603 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 44.6410μs | 17.0353μs | 58.7017 KOps/s | 56.9676 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 92.9410μs | 45.3877μs | 22.0324 KOps/s | 22.3972 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 95.2210μs | 27.1877μs | 36.7814 KOps/s | 37.3188 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.8978ms | 30.6969μs | 32.5766 KOps/s | 35.2272 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 45.5610μs | 16.5693μs | 60.3525 KOps/s | 61.1902 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 79.2410μs | 47.9696μs | 20.8465 KOps/s | 21.2707 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 63.1800μs | 29.5209μs | 33.8744 KOps/s | 34.1258 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 66.0510μs | 31.7170μs | 31.5289 KOps/s | 32.6788 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 90.7810μs | 19.2272μs | 52.0098 KOps/s | 54.0100 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 94.7910μs | 51.0408μs | 19.5922 KOps/s | 19.7547 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 65.9410μs | 32.6366μs | 30.6405 KOps/s | 31.1354 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 79.4910μs | 31.4523μs | 31.7942 KOps/s | 32.3424 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 45.7300μs | 18.6911μs | 53.5013 KOps/s | 53.5159 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 0.1208ms | 52.5186μs | 19.0409 KOps/s | 19.2844 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 65.9700μs | 34.5489μs | 28.9445 KOps/s | 29.3048 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.2214ms | 32.5072μs | 30.7624 KOps/s | 31.0611 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 49.7610μs | 21.1047μs | 47.3828 KOps/s | 47.7245 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.8578s | 0.7355s | 1.3597 Ops/s | 1.3906 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7213s | 0.6122s | 1.6335 Ops/s | 1.6997 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7280s | 1.6446s | 0.6081 Ops/s | 0.6245 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.4580s | 1.3786s | 0.7254 Ops/s | 0.7292 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9860s | 1.8801s | 0.5319 Ops/s | 0.5427 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7574s | 1.6760s | 0.5966 Ops/s | 0.6129 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6040s | 4.5507s | 0.2197 Ops/s | 0.2215 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.3993s | 4.2966s | 0.2327 Ops/s | 0.2331 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0059s | 1.8483s | 0.5410 Ops/s | 0.5431 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6573s | 1.5670s | 0.6382 Ops/s | 0.6377 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 9.9790ms | 9.7287ms | 102.7882 Ops/s | 96.3014 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 16.3805ms | 12.2679ms | 81.5136 Ops/s | 55.5394 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.2456ms | 0.1294ms | 7.7298 KOps/s | 7.6296 KOps/s | |
| test_values[td1_return_estimate-False-False] | 26.9315ms | 26.2447ms | 38.1029 Ops/s | 35.8893 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 18.3795ms | 12.2319ms | 81.7532 Ops/s | 55.7823 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 39.5590ms | 38.6869ms | 25.8486 Ops/s | 24.5863 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 18.2593ms | 12.2536ms | 81.6090 Ops/s | 56.3889 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 9.2568ms | 8.6472ms | 115.6440 Ops/s | 109.9454 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.8542ms | 1.5706ms | 636.6880 Ops/s | 655.2986 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.4618ms | 0.4176ms | 2.3948 KOps/s | 2.4153 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 31.7262ms | 30.7958ms | 32.4719 Ops/s | 32.1320 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 2.2832ms | 1.7976ms | 556.3041 Ops/s | 556.2903 Ops/s | |
| test_dqn_speed[False-None] | 1.8222ms | 1.3678ms | 731.1045 Ops/s | 727.5999 Ops/s | |
| test_dqn_speed[False-backward] | 1.9394ms | 1.8908ms | 528.8839 Ops/s | 524.3179 Ops/s | |
| test_dqn_speed[True-None] | 1.0153ms | 0.5633ms | 1.7752 KOps/s | 1.6780 KOps/s | |
| test_dqn_speed[True-backward] | 1.0756ms | 1.0323ms | 968.7412 Ops/s | 920.0856 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 1.0200ms | 0.5642ms | 1.7725 KOps/s | 1.7280 KOps/s | |
| test_ddpg_speed[False-None] | 3.2684ms | 2.8600ms | 349.6463 Ops/s | 348.9763 Ops/s | |
| test_ddpg_speed[False-backward] | 4.2076ms | 4.0543ms | 246.6506 Ops/s | 245.0534 Ops/s | |
| test_ddpg_speed[True-None] | 1.9172ms | 1.4729ms | 678.9198 Ops/s | 664.5766 Ops/s | |
| test_ddpg_speed[True-backward] | 2.6031ms | 2.5089ms | 398.5815 Ops/s | 389.5341 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.8913ms | 1.4459ms | 691.6318 Ops/s | 687.9280 Ops/s | |
| test_sac_speed[False-None] | 8.5723ms | 8.0413ms | 124.3579 Ops/s | 123.2364 Ops/s | |
| test_sac_speed[False-backward] | 12.1033ms | 11.3442ms | 88.1508 Ops/s | 86.9935 Ops/s | |
| test_sac_speed[True-None] | 2.7251ms | 2.2790ms | 438.7950 Ops/s | 429.5764 Ops/s | |
| test_sac_speed[True-backward] | 4.6346ms | 4.2612ms | 234.6771 Ops/s | 206.9155 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 2.8124ms | 2.2593ms | 442.6196 Ops/s | 421.9421 Ops/s | |
| test_redq_speed[False-None] | 15.1188ms | 11.0019ms | 90.8933 Ops/s | 90.7640 Ops/s | |
| test_redq_speed[False-backward] | 24.1625ms | 19.3675ms | 51.6329 Ops/s | 54.7298 Ops/s | |
| test_redq_speed[True-None] | 6.3102ms | 4.9414ms | 202.3702 Ops/s | 207.8298 Ops/s | |
| test_redq_speed[reduce-overhead-None] | 5.2163ms | 4.7525ms | 210.4148 Ops/s | 195.1319 Ops/s | |
| test_redq_deprec_speed[False-None] | 12.0362ms | 11.3966ms | 87.7458 Ops/s | 86.3507 Ops/s | |
| test_redq_deprec_speed[False-backward] | 17.0810ms | 16.3806ms | 61.0480 Ops/s | 60.0173 Ops/s | |
| test_redq_deprec_speed[True-None] | 4.1366ms | 3.7754ms | 264.8746 Ops/s | 248.8793 Ops/s | |
| test_redq_deprec_speed[True-backward] | 8.1177ms | 7.8627ms | 127.1828 Ops/s | 124.5740 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 4.1812ms | 3.7379ms | 267.5330 Ops/s | 256.7343 Ops/s | |
| test_td3_speed[False-None] | 8.1452ms | 7.9915ms | 125.1335 Ops/s | 122.1691 Ops/s | |
| test_td3_speed[False-backward] | 11.2616ms | 10.9153ms | 91.6143 Ops/s | 90.3298 Ops/s | |
| test_td3_speed[True-None] | 1.9927ms | 1.9123ms | 522.9255 Ops/s | 510.1473 Ops/s | |
| test_td3_speed[True-backward] | 3.9819ms | 3.8023ms | 262.9972 Ops/s | 264.2577 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 1.9398ms | 1.8764ms | 532.9388 Ops/s | 528.5495 Ops/s | |
| test_cql_speed[False-None] | 30.4610ms | 27.1701ms | 36.8051 Ops/s | 37.0688 Ops/s | |
| test_cql_speed[False-backward] | 36.9375ms | 36.2677ms | 27.5728 Ops/s | 26.9028 Ops/s | |
| test_cql_speed[True-None] | 13.7107ms | 13.1236ms | 76.1987 Ops/s | 76.5937 Ops/s | |
| test_cql_speed[True-backward] | 19.6710ms | 19.0399ms | 52.5212 Ops/s | 53.1036 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 13.5389ms | 12.7863ms | 78.2089 Ops/s | 74.8299 Ops/s | |
| test_a2c_speed[False-None] | 5.9817ms | 5.5557ms | 179.9950 Ops/s | 176.3455 Ops/s | |
| test_a2c_speed[False-backward] | 12.6884ms | 12.1699ms | 82.1702 Ops/s | 80.6880 Ops/s | |
| test_a2c_speed[True-None] | 4.2179ms | 4.0065ms | 249.5926 Ops/s | 249.5182 Ops/s | |
| test_a2c_speed[True-backward] | 9.2147ms | 8.9838ms | 111.3109 Ops/s | 110.2261 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 4.6544ms | 3.9369ms | 254.0070 Ops/s | 247.9305 Ops/s | |
| test_ppo_speed[False-None] | 6.4481ms | 5.9763ms | 167.3263 Ops/s | 167.8428 Ops/s | |
| test_ppo_speed[False-backward] | 13.0462ms | 12.6326ms | 79.1602 Ops/s | 77.9470 Ops/s | |
| test_ppo_speed[True-None] | 4.3136ms | 3.8667ms | 258.6177 Ops/s | 246.7814 Ops/s | |
| test_ppo_speed[True-backward] | 9.0845ms | 8.7531ms | 114.2455 Ops/s | 104.3004 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 3.9664ms | 3.8148ms | 262.1353 Ops/s | 252.2799 Ops/s | |
| test_reinforce_speed[False-None] | 5.1098ms | 4.6292ms | 216.0188 Ops/s | 212.7638 Ops/s | |
| test_reinforce_speed[False-backward] | 10.8007ms | 7.7688ms | 128.7206 Ops/s | 131.2831 Ops/s | |
| test_reinforce_speed[True-None] | 3.6644ms | 3.0984ms | 322.7432 Ops/s | 315.0846 Ops/s | |
| test_reinforce_speed[True-backward] | 8.8094ms | 8.3748ms | 119.4056 Ops/s | 120.9890 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 3.6034ms | 3.0598ms | 326.8158 Ops/s | 317.4713 Ops/s | |
| test_iql_speed[False-None] | 21.2401ms | 20.3626ms | 49.1096 Ops/s | 48.2434 Ops/s | |
| test_iql_speed[False-backward] | 32.1804ms | 31.3636ms | 31.8840 Ops/s | 31.9018 Ops/s | |
| test_iql_speed[True-None] | 9.5261ms | 8.9340ms | 111.9322 Ops/s | 110.9971 Ops/s | |
| test_iql_speed[True-backward] | 18.1869ms | 17.5181ms | 57.0838 Ops/s | 57.5661 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 9.4336ms | 9.0075ms | 111.0184 Ops/s | 110.3658 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.0760ms | 5.8404ms | 171.2208 Ops/s | 169.1350 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.7364ms | 0.3720ms | 2.6883 KOps/s | 3.2298 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7784ms | 0.3060ms | 3.2674 KOps/s | 3.5676 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.2125ms | 5.6474ms | 177.0731 Ops/s | 175.3499 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7109ms | 0.2995ms | 3.3390 KOps/s | 3.1575 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.7901ms | 0.2750ms | 3.6364 KOps/s | 3.3425 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.8134ms | 1.3247ms | 754.9013 Ops/s | 738.8817 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.5164ms | 1.2453ms | 803.0171 Ops/s | 795.2523 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 9.9148ms | 5.9587ms | 167.8223 Ops/s | 170.7736 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0222ms | 0.4573ms | 2.1869 KOps/s | 2.0666 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.9135ms | 0.4339ms | 2.3045 KOps/s | 2.1983 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.1152ms | 5.6553ms | 176.8247 Ops/s | 175.6041 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.3071ms | 0.3576ms | 2.7967 KOps/s | 3.3244 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.4935ms | 0.2772ms | 3.6077 KOps/s | 2.8124 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1537ms | 5.6252ms | 177.7717 Ops/s | 177.0100 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.1891ms | 0.2930ms | 3.4133 KOps/s | 2.7999 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.6358ms | 0.4166ms | 2.4006 KOps/s | 2.9359 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 6.1346ms | 5.7351ms | 174.3648 Ops/s | 171.4488 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0013ms | 0.4519ms | 2.2128 KOps/s | 1.8924 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7682ms | 0.4359ms | 2.2941 KOps/s | 1.9834 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.9952s | 24.8224ms | 40.2863 Ops/s | 49.3589 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 11.3688ms | 2.0122ms | 496.9686 Ops/s | 502.3853 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 7.0910ms | 1.2142ms | 823.5713 Ops/s | 804.9381 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.5553ms | 4.9509ms | 201.9848 Ops/s | 194.1783 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 12.8358ms | 1.9648ms | 508.9454 Ops/s | 489.3299 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 2.9187ms | 1.1496ms | 869.8773 Ops/s | 857.5569 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 9.0830ms | 5.1605ms | 193.7808 Ops/s | 190.9139 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 4.0159ms | 1.9184ms | 521.2801 Ops/s | 513.7577 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.3533ms | 1.0789ms | 926.8557 Ops/s | 905.7621 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 40.4908ms | 38.2294ms | 26.1579 Ops/s | 25.5181 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.3329ms | 17.6835ms | 56.5498 Ops/s | 54.7876 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 42.6817ms | 39.1820ms | 25.5219 Ops/s | 24.3539 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.0581ms | 18.3473ms | 54.5038 Ops/s | 54.2682 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 41.3236ms | 40.4809ms | 24.7030 Ops/s | 23.5990 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.0901ms | 19.5350ms | 51.1902 Ops/s | 50.3379 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8760ms | 0.2206ms | 4.5336 KOps/s | 4.5212 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.7037ms | 1.4708ms | 679.9022 Ops/s | 647.1109 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 3.3047ms | 2.5069ms | 398.9028 Ops/s | 391.8193 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.6278ms | 3.1204ms | 320.4712 Ops/s | 305.9890 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.1985ms | 0.1334ms | 7.4961 KOps/s | 7.3820 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.3445ms | 0.1872ms | 5.3415 KOps/s | 4.9100 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 2.5630ms | 1.9355ms | 516.6550 Ops/s | 520.5096 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.7171ms | 1.4275ms | 700.5160 Ops/s | 710.9646 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.5572ms | 1.1147ms | 897.1170 Ops/s | 904.6944 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 4.0506ms | 3.5630ms | 280.6598 Ops/s | 273.0624 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 6.6762ms | 5.9173ms | 168.9971 Ops/s | 166.8631 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.8508ms | 7.3569ms | 135.9273 Ops/s | 135.9802 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.7037ms | 0.2719ms | 3.6785 KOps/s | 3.5202 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.9873ms | 1.5266ms | 655.0445 Ops/s | 608.1558 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.9529ms | 2.5729ms | 388.6660 Ops/s | 380.8271 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.7905ms | 3.2511ms | 307.5873 Ops/s | 293.7758 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 32.5890ms | 32.1095ms | 31.1434 Ops/s | 30.4110 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 63.4069ms | 62.9628ms | 15.8824 Ops/s | 15.4028 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 38.4355ms | 36.5557ms | 27.3555 Ops/s | 26.6583 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 72.6072ms | 72.0978ms | 13.8701 Ops/s | 13.5890 Ops/s |
Contributor

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 81.0278μs | 80.2491μs | 12.4612 KOps/s | 11.9585 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1410ms | 0.1404ms | 7.1202 KOps/s | 7.0780 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1099s | 0.1092s | 9.1541 Ops/s | 9.5327 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.6168μs | 2.6035μs | 384.0947 KOps/s | 403.9074 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 36.7711μs | 36.5890μs | 27.3306 KOps/s | 27.1449 KOps/s | |
| test_simple | 0.9235s | 0.8226s | 1.2157 Ops/s | 1.2365 Ops/s | |
| test_transformed | 1.3840s | 1.3801s | 0.7246 Ops/s | 0.7145 Ops/s | |
| test_serial | 2.3253s | 2.3161s | 0.4318 Ops/s | 0.4326 Ops/s | |
| test_parallel | 1.9129s | 1.8068s | 0.5535 Ops/s | 0.5502 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.2337ms | 41.7979μs | 23.9247 KOps/s | 23.7834 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 61.1810μs | 23.4587μs | 42.6280 KOps/s | 43.5288 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 57.4410μs | 23.4648μs | 42.6171 KOps/s | 42.4611 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 40.0610μs | 12.9605μs | 77.1574 KOps/s | 77.8724 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 78.8010μs | 44.8350μs | 22.3040 KOps/s | 22.9979 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 58.4110μs | 25.7928μs | 38.7705 KOps/s | 39.8502 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 89.7820μs | 26.3567μs | 37.9411 KOps/s | 38.9386 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 40.9310μs | 15.6802μs | 63.7745 KOps/s | 66.1746 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 95.9530μs | 46.4156μs | 21.5445 KOps/s | 21.4923 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 60.8510μs | 27.9247μs | 35.8106 KOps/s | 36.0940 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 60.6720μs | 25.8868μs | 38.6297 KOps/s | 37.9891 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 42.6410μs | 15.5052μs | 64.4946 KOps/s | 64.3111 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.1111ms | 49.4319μs | 20.2298 KOps/s | 20.3737 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 62.6310μs | 30.9371μs | 32.3236 KOps/s | 33.1618 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 63.9810μs | 28.8396μs | 34.6745 KOps/s | 35.4076 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 46.5320μs | 17.9528μs | 55.7017 KOps/s | 55.7393 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 76.7220μs | 47.0095μs | 21.2723 KOps/s | 21.2922 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 92.1720μs | 28.3315μs | 35.2964 KOps/s | 35.7227 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.5406ms | 30.6587μs | 32.6171 KOps/s | 33.6092 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 47.1410μs | 17.5303μs | 57.0441 KOps/s | 59.8289 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 88.0420μs | 49.3152μs | 20.2777 KOps/s | 20.3972 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 59.7810μs | 30.7375μs | 32.5335 KOps/s | 32.6283 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 97.8520μs | 32.1565μs | 31.0979 KOps/s | 31.2468 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 54.1110μs | 19.6096μs | 50.9955 KOps/s | 52.4813 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 86.4920μs | 51.7592μs | 19.3202 KOps/s | 19.5614 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 64.9920μs | 33.3664μs | 29.9703 KOps/s | 30.4508 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 81.4820μs | 31.9744μs | 31.2750 KOps/s | 31.6391 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 49.0410μs | 19.9587μs | 50.1034 KOps/s | 52.7596 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 0.1181ms | 54.6081μs | 18.3123 KOps/s | 18.6815 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 67.2620μs | 35.9456μs | 27.8198 KOps/s | 28.5320 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 60.8420μs | 34.1038μs | 29.3223 KOps/s | 29.7542 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 58.3210μs | 22.0777μs | 45.2946 KOps/s | 46.2549 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7337s | 0.7204s | 1.3881 Ops/s | 1.3422 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7135s | 0.6096s | 1.6403 Ops/s | 1.6407 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7517s | 1.6498s | 0.6061 Ops/s | 0.6108 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5055s | 1.4198s | 0.7043 Ops/s | 0.7031 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9588s | 1.8789s | 0.5322 Ops/s | 0.5284 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.8011s | 1.7067s | 0.5859 Ops/s | 0.6007 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.7207s | 4.6133s | 0.2168 Ops/s | 0.2165 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5910s | 4.4683s | 0.2238 Ops/s | 0.2271 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 2.0156s | 1.8755s | 0.5332 Ops/s | 0.5376 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.6871s | 1.5955s | 0.6268 Ops/s | 0.6286 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 20.9816ms | 20.5039ms | 48.7712 Ops/s | 49.4542 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1420s | 3.7516ms | 266.5539 Ops/s | 265.0445 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1047ms | 82.1855μs | 12.1676 KOps/s | 12.1781 KOps/s | |
| test_values[td1_return_estimate-False-False] | 48.8558ms | 48.2699ms | 20.7168 Ops/s | 20.2392 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3192ms | 1.0785ms | 927.2482 Ops/s | 918.2039 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 79.2459ms | 78.5762ms | 12.7265 Ops/s | 12.5004 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.2896ms | 1.0692ms | 935.2353 Ops/s | 932.9930 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 20.8829ms | 20.5285ms | 48.7127 Ops/s | 49.0397 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0186ms | 0.7474ms | 1.3379 KOps/s | 1.3380 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.8076ms | 0.6654ms | 1.5028 KOps/s | 1.5023 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5535ms | 1.4824ms | 674.5713 Ops/s | 676.9365 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.8462ms | 0.6837ms | 1.4626 KOps/s | 1.4673 KOps/s | |
| test_dqn_speed[False-None] | 1.7342ms | 1.5638ms | 639.4656 Ops/s | 638.1585 Ops/s | |
| test_dqn_speed[False-backward] | 2.2907ms | 2.1809ms | 458.5309 Ops/s | 457.9869 Ops/s | |
| test_dqn_speed[True-None] | 0.7185ms | 0.6014ms | 1.6629 KOps/s | 1.6374 KOps/s | |
| test_dqn_speed[True-backward] | 1.2985ms | 1.2681ms | 788.5806 Ops/s | 862.9976 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7045ms | 0.6252ms | 1.5996 KOps/s | 1.5894 KOps/s | |
| test_ddpg_speed[False-None] | 3.3349ms | 2.9646ms | 337.3105 Ops/s | 333.7800 Ops/s | |
| test_ddpg_speed[False-backward] | 4.6976ms | 4.3187ms | 231.5523 Ops/s | 234.2858 Ops/s | |
| test_ddpg_speed[True-None] | 1.4774ms | 1.3855ms | 721.7868 Ops/s | 702.5776 Ops/s | |
| test_ddpg_speed[True-backward] | 2.6320ms | 2.5623ms | 390.2729 Ops/s | 404.4580 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.4881ms | 1.3882ms | 720.3487 Ops/s | 705.9544 Ops/s | |
| test_sac_speed[False-None] | 8.7626ms | 8.3558ms | 119.6769 Ops/s | 118.1789 Ops/s | |
| test_sac_speed[False-backward] | 11.8982ms | 11.4270ms | 87.5120 Ops/s | 88.4218 Ops/s | |
| test_sac_speed[True-None] | 2.1671ms | 1.9509ms | 512.5931 Ops/s | 511.3271 Ops/s | |
| test_sac_speed[True-backward] | 3.9185ms | 3.7444ms | 267.0688 Ops/s | 276.3228 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 16.7103ms | 10.1693ms | 98.3348 Ops/s | 99.0951 Ops/s | |
| test_redq_deprec_speed[False-None] | 10.1744ms | 9.3268ms | 107.2173 Ops/s | 106.3378 Ops/s | |
| test_redq_deprec_speed[False-backward] | 13.0339ms | 12.5535ms | 79.6588 Ops/s | 80.8345 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.8300ms | 2.7429ms | 364.5822 Ops/s | 361.7694 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.8308ms | 4.4493ms | 224.7536 Ops/s | 227.7992 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 14.6907ms | 9.6988ms | 103.1050 Ops/s | 103.6153 Ops/s | |
| test_td3_speed[False-None] | 8.2831ms | 8.1801ms | 122.2481 Ops/s | 121.6831 Ops/s | |
| test_td3_speed[False-backward] | 10.9511ms | 10.6470ms | 93.9231 Ops/s | 93.0029 Ops/s | |
| test_td3_speed[True-None] | 1.8056ms | 1.7210ms | 581.0548 Ops/s | 582.5595 Ops/s | |
| test_td3_speed[True-backward] | 3.6590ms | 3.2721ms | 305.6143 Ops/s | 303.1344 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 55.7214ms | 26.0284ms | 38.4196 Ops/s | 38.1557 Ops/s | |
| test_cql_speed[False-None] | 17.8139ms | 17.4441ms | 57.3259 Ops/s | 57.0850 Ops/s | |
| test_cql_speed[False-backward] | 23.2195ms | 22.7313ms | 43.9922 Ops/s | 43.6171 Ops/s | |
| test_cql_speed[True-None] | 3.6221ms | 3.4901ms | 286.5221 Ops/s | 283.7604 Ops/s | |
| test_cql_speed[True-backward] | 6.2393ms | 5.7970ms | 172.5041 Ops/s | 171.1397 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 19.2525ms | 11.9621ms | 83.5976 Ops/s | 83.0774 Ops/s | |
| test_a2c_speed[False-None] | 3.3807ms | 3.2726ms | 305.5628 Ops/s | 299.4716 Ops/s | |
| test_a2c_speed[False-backward] | 6.7989ms | 6.3269ms | 158.0564 Ops/s | 162.9348 Ops/s | |
| test_a2c_speed[True-None] | 1.5672ms | 1.4594ms | 685.2017 Ops/s | 685.6255 Ops/s | |
| test_a2c_speed[True-backward] | 3.3961ms | 3.3034ms | 302.7210 Ops/s | 317.3265 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.1877ms | 1.1128ms | 898.6612 Ops/s | 892.4345 Ops/s | |
| test_ppo_speed[False-None] | 4.0368ms | 3.9315ms | 254.3538 Ops/s | 252.5184 Ops/s | |
| test_ppo_speed[False-backward] | 7.6246ms | 7.2240ms | 138.4276 Ops/s | 142.0952 Ops/s | |
| test_ppo_speed[True-None] | 1.7227ms | 1.6099ms | 621.1565 Ops/s | 620.7946 Ops/s | |
| test_ppo_speed[True-backward] | 3.6329ms | 3.5077ms | 285.0882 Ops/s | 299.4155 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.3002ms | 1.1720ms | 853.2784 Ops/s | 839.5047 Ops/s | |
| test_reinforce_speed[False-None] | 2.4874ms | 2.3554ms | 424.5579 Ops/s | 423.2604 Ops/s | |
| test_reinforce_speed[False-backward] | 3.9267ms | 3.4676ms | 288.3830 Ops/s | 285.1405 Ops/s | |
| test_reinforce_speed[True-None] | 1.5887ms | 1.4715ms | 679.5749 Ops/s | 690.6557 Ops/s | |
| test_reinforce_speed[True-backward] | 3.3588ms | 3.3025ms | 302.8032 Ops/s | 298.9322 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 15.3608ms | 8.7946ms | 113.7064 Ops/s | 114.0211 Ops/s | |
| test_iql_speed[False-None] | 9.8757ms | 9.5551ms | 104.6566 Ops/s | 104.1396 Ops/s | |
| test_iql_speed[False-backward] | 13.9267ms | 13.4599ms | 74.2949 Ops/s | 74.5260 Ops/s | |
| test_iql_speed[True-None] | 2.5528ms | 2.3397ms | 427.4046 Ops/s | 424.7639 Ops/s | |
| test_iql_speed[True-backward] | 5.5268ms | 4.9155ms | 203.4382 Ops/s | 196.2770 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 16.4241ms | 10.0463ms | 99.5396 Ops/s | 99.9013 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 7.7638ms | 5.9940ms | 166.8337 Ops/s | 165.9973 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.6121ms | 0.3597ms | 2.7804 KOps/s | 2.5392 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.7049ms | 0.3355ms | 2.9808 KOps/s | 2.6824 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0520ms | 5.8246ms | 171.6857 Ops/s | 172.2540 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 0.7697ms | 0.2825ms | 3.5397 KOps/s | 3.3494 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.4498ms | 0.2645ms | 3.7806 KOps/s | 3.4218 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.4798ms | 1.2657ms | 790.1022 Ops/s | 777.3782 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.6118ms | 1.2172ms | 821.5385 Ops/s | 851.2271 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 9.9405ms | 6.1085ms | 163.7068 Ops/s | 169.1323 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.5269ms | 0.5316ms | 1.8810 KOps/s | 2.2978 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.8142ms | 0.5152ms | 1.9408 KOps/s | 2.3967 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9597ms | 5.8409ms | 171.2067 Ops/s | 174.0910 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 2.4458ms | 0.3979ms | 2.5134 KOps/s | 3.4631 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5508ms | 0.3706ms | 2.6986 KOps/s | 2.6186 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0624ms | 5.7622ms | 173.5439 Ops/s | 176.0249 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.9744ms | 0.2881ms | 3.4706 KOps/s | 2.5502 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5037ms | 0.2672ms | 3.7428 KOps/s | 3.1381 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 8.1462ms | 5.9595ms | 167.7981 Ops/s | 166.6892 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.1453ms | 0.4420ms | 2.2627 KOps/s | 2.2552 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6194ms | 0.4216ms | 2.3718 KOps/s | 2.3491 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.7223s | 19.4072ms | 51.5274 Ops/s | 197.9671 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 11.9144ms | 1.9918ms | 502.0672 Ops/s | 503.2511 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 8.1804ms | 1.2692ms | 787.9214 Ops/s | 748.1744 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 6.5333ms | 5.0579ms | 197.7089 Ops/s | 195.2160 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.9120ms | 1.8211ms | 549.1243 Ops/s | 553.3194 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 6.6603ms | 1.2971ms | 770.9533 Ops/s | 1.0329 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 6.6674ms | 5.2076ms | 192.0275 Ops/s | 187.0189 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 13.3225ms | 2.2642ms | 441.6645 Ops/s | 497.1345 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 1.2025ms | 1.1160ms | 896.0341 Ops/s | 852.4634 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 44.6031ms | 39.4286ms | 25.3623 Ops/s | 25.4170 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 19.7996ms | 18.1330ms | 55.1479 Ops/s | 55.4363 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 43.7873ms | 40.9133ms | 24.4419 Ops/s | 24.5962 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 20.0668ms | 18.7442ms | 53.3499 Ops/s | 54.6487 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 45.2167ms | 42.6320ms | 23.4565 Ops/s | 23.5330 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.0187ms | 20.0820ms | 49.7959 Ops/s | 50.8494 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8649ms | 0.2210ms | 4.5243 KOps/s | 4.5437 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.6872ms | 1.4312ms | 698.7385 Ops/s | 698.3963 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.8142ms | 2.3735ms | 421.3184 Ops/s | 421.5884 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.2517ms | 2.9815ms | 335.4059 Ops/s | 334.8433 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.2908ms | 0.1676ms | 5.9648 KOps/s | 6.0848 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.4406ms | 0.2576ms | 3.8815 KOps/s | 4.4575 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 2.1433ms | 1.8997ms | 526.4079 Ops/s | 536.6940 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.6500ms | 1.4396ms | 694.6334 Ops/s | 691.6212 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.3379ms | 1.1638ms | 859.2731 Ops/s | 852.1754 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 7.6873ms | 3.6843ms | 271.4246 Ops/s | 269.9490 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 11.9990ms | 6.0955ms | 164.0548 Ops/s | 164.0041 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 12.6853ms | 7.2608ms | 137.7266 Ops/s | 138.0057 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.3919ms | 0.2791ms | 3.5826 KOps/s | 3.6453 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.8794ms | 1.5490ms | 645.5703 Ops/s | 663.9785 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 3.0263ms | 2.5001ms | 399.9782 Ops/s | 401.2837 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.5455ms | 3.2458ms | 308.0944 Ops/s | 313.2913 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.3355ms | 33.3680ms | 29.9688 Ops/s | 30.8315 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 66.2379ms | 65.0223ms | 15.3793 Ops/s | 15.4402 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 38.5936ms | 37.7283ms | 26.5053 Ops/s | 26.8487 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 74.0783ms | 73.4495ms | 13.6148 Ops/s | 13.7145 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 55.0887ms | 54.8049ms | 18.2465 Ops/s | 18.1603 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1096s | 0.1091s | 9.1662 Ops/s | 9.1500 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 56.9176ms | 56.6134ms | 17.6637 Ops/s | 17.4759 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.1136s | 0.1130s | 8.8530 Ops/s | 8.8408 Ops/s |
Adds worker startup logging (INFO level) showing:
- policy id, wrapped_policy id, and whether they match
- scheme model id and whether it matches policy/wrapped_policy
- param fingerprint at end of rollout (not just start)

This will reveal whether the scheme updates the same object the collector uses for inference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ging

SharedMemWeightSyncScheme.prepare_weights() only updated the first unique weight buffer (index [0]), so when workers ran on different devices (e.g. cuda:4 and cuda:6), only the first device's shared memory buffer received new weights. The second worker stayed stale.

Fix: iterate over ALL unique weight buffers in prepare_weights(). Also removes the verbose diagnostic logging added during debugging and reformats touched files with ufmt.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
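The first-buffer-only bug described in this commit can be illustrated with a small sketch. This is not the TorchRL implementation: `push_first_only`, `push_all`, and the dict of plain Python lists standing in for per-device shared-memory buffers are all hypothetical names chosen for illustration.

```python
def _unique_buffers(buffers):
    """Deduplicate buffers by identity, keeping first-seen order."""
    return list({id(buf): buf for buf in buffers.values()}.values())


def push_first_only(buffers, new_weights):
    """Buggy pattern: only the first unique buffer (index [0]) is refreshed."""
    _unique_buffers(buffers)[0][:] = new_weights


def push_all(buffers, new_weights):
    """Fixed pattern: every unique buffer receives the new weights."""
    for buf in _unique_buffers(buffers):
        buf[:] = new_weights


# Hypothetical setup: one buffer per worker device (plain lists, no CUDA needed).
buffers = {"cuda:4": [0.0, 0.0], "cuda:6": [0.0, 0.0]}

push_first_only(buffers, [1.0, 1.0])
after_buggy = {dev: list(buf) for dev, buf in buffers.items()}

push_all(buffers, [2.0, 2.0])
after_fixed = {dev: list(buf) for dev, buf in buffers.items()}
```

With the buggy variant the second device's buffer keeps its old values, which matches the symptom of one worker staying stale; the fixed variant writes to every distinct buffer.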
Summary
- `update_policy_weights_()` silently fails to update one worker's policy in `MultiAsyncCollector`/`MultiSyncCollector` when workers use different `policy_device` values
- `_make_policy_factory` calls `scheme.init_on_receiver(model=policy)`, storing a weakref to the original policy, but `_get_policy_and_device` later deepcopies the policy to place it on the target device. The scheme's model reference becomes stale: weight updates go to the original (unused) object
- In `register_scheme_receiver`, we now check whether the scheme's model matches the collector's actual policy and fix it if they diverge

Test plan
- `test_weight_update_after_device_cast` passes (4 variants: Sync/Async × MP/SharedMem)
- `TestPolicyFactory` tests still pass

🤖 Generated with Claude Code
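The stale-reference failure mode from the Summary can be reproduced in miniature with the standard library alone. The sketch below is an illustration under stated assumptions, not the code in this PR: `TinyPolicy`, `TinyScheme`, and `rebind_if_stale` are hypothetical stand-ins, `copy.deepcopy` plays the role of the device cast, and `init_on_receiver` here is a simplified method that only borrows the name from the real scheme.

```python
import copy
import weakref


class TinyPolicy:
    """Hypothetical stand-in for a policy module holding one weight list."""

    def __init__(self, weights):
        self.weights = list(weights)


class TinyScheme:
    """Hypothetical receiver-side scheme that keeps a weakref to its model."""

    def __init__(self):
        self._model_ref = None

    def init_on_receiver(self, model):
        self._model_ref = weakref.ref(model)

    @property
    def model(self):
        return None if self._model_ref is None else self._model_ref()

    def apply(self, new_weights):
        # Writes in place into whichever object the scheme currently points at.
        self.model.weights[:] = new_weights


def rebind_if_stale(scheme, actual_policy):
    """Re-point the scheme at the policy the collector actually uses."""
    if scheme.model is not actual_policy:
        scheme.init_on_receiver(actual_policy)
        return True
    return False


original = TinyPolicy([0.0, 0.0])
scheme = TinyScheme()
scheme.init_on_receiver(original)           # reference taken before the copy

collector_policy = copy.deepcopy(original)  # stands in for the device cast

scheme.apply([1.0, 1.0])                    # lands on the unused original
stale_result = list(collector_policy.weights)

was_rebound = rebind_if_stale(scheme, collector_policy)
scheme.apply([2.0, 2.0])                    # now lands on the collector's copy
fresh_result = list(collector_policy.weights)
```

The identity check (`is not`) is the important part of the sketch: comparing by value would not detect that the scheme and the collector hold two distinct objects with equal contents.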