[LLM] Add Countdown numbers-game environment #3571
Closed
vmoens wants to merge 1 commit into gh/vmoens/252/base
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3571
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Contributor
| Prefix | Label Applied | Example |
|---|---|---|
| [BugFix] | BugFix | [BugFix] Fix memory leak in collector |
| [Feature] | Feature | [Feature] Add new optimizer |
| [Doc] or [Docs] | Documentation | [Doc] Update installation guide |
| [Refactor] | Refactoring | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Tests | [Tests] Add unit tests for buffer |
| [Environment] or [Environments] | Environments | [Environments] Add Gymnasium support |
| [Data] | Data | [Data] Fix replay buffer sampling |
| [Performance] or [Perf] | Performance | [Performance] Optimize tensor ops |
| [BC-Breaking] | bc breaking | [BC-Breaking] Remove deprecated API |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Quality] | Quality | [Quality] Fix typos and add codespell |
Note: Common variations like singular/plural are supported (e.g., [Doc] or [Docs]).
This was referenced Mar 26, 2026
vmoens added a commit that referenced this pull request Mar 26, 2026
Add CountdownEnv and CountdownRewardParser for the Countdown numbers game, a popular lightweight problem for GRPO training.

Key features:
- Procedural problem generation (no external dataset required)
- Validates arithmetic expressions: correct operators, each source number used at most once, evaluates to target
- Same 0/0.1/1.0 reward convention as GSM8K and MATH parsers
- Configurable problem difficulty (num_count, max_number, max_target)
- Includes unit tests and documentation

Made-with: Cursor
ghstack-source-id: 1cc93cb
Pull-Request: #3545
ghstack-source-id: 1cc93cb
Pull Request resolved: #3571
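For readers unfamiliar with the task, the validation rules in the commit message (allowed operators, each source number used at most once, result equals the target) can be sketched in plain Python. This is not the code from this PR: `score_expression` and `_evaluate` are hypothetical names. The reward mapping is also an assumption: 0.0 for an unparseable answer, 0.1 for a well-formed but invalid or wrong expression, 1.0 for a valid expression that hits the target.

```python
import ast
from collections import Counter
from fractions import Fraction

# Hypothetical sketch, not the CountdownRewardParser from this PR.
_ALLOWED_OPS = (ast.Add, ast.Sub, ast.Mult, ast.Div)


def _evaluate(node):
    """Evaluate an AST restricted to + - * / over integer literals.

    Returns (exact value, list of integer literals used).
    Raises ValueError on any other syntax.
    """
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if isinstance(node, ast.Constant) and type(node.value) is int:
        return Fraction(node.value), [node.value]
    if isinstance(node, ast.BinOp) and isinstance(node.op, _ALLOWED_OPS):
        left, used_left = _evaluate(node.left)
        right, used_right = _evaluate(node.right)
        if isinstance(node.op, ast.Add):
            value = left + right
        elif isinstance(node.op, ast.Sub):
            value = left - right
        elif isinstance(node.op, ast.Mult):
            value = left * right
        else:
            if right == 0:
                raise ValueError("division by zero")
            value = left / right  # exact, thanks to Fraction
        return value, used_left + used_right
    raise ValueError("unsupported syntax")


def score_expression(expression, numbers, target):
    """Assumed mapping: 0.0 unparseable, 0.1 well-formed but wrong, 1.0 correct."""
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        value, used = _evaluate(tree)
    except (SyntaxError, ValueError):
        return 0.0
    # Multiset difference: non-empty if a number is reused or not in the pool.
    if Counter(used) - Counter(numbers):
        return 0.1
    return 1.0 if value == target else 0.1
```

Walking the AST instead of calling `eval` keeps arbitrary model output from executing, and `Fraction` avoids floating-point surprises when division is involved.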
Contributor
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_tensor_to_bytestream_speed[pickle] | 83.6562μs | 82.8049μs | 12.0766 KOps/s | 11.8322 KOps/s | |
| test_tensor_to_bytestream_speed[torch.save] | 0.1438ms | 0.1428ms | 7.0039 KOps/s | 6.9985 KOps/s | |
| test_tensor_to_bytestream_speed[untyped_storage] | 0.1240s | 0.1230s | 8.1274 Ops/s | 8.0545 Ops/s | |
| test_tensor_to_bytestream_speed[numpy] | 2.5665μs | 2.5587μs | 390.8248 KOps/s | 376.1038 KOps/s | |
| test_tensor_to_bytestream_speed[safetensors] | 38.6829μs | 38.6447μs | 25.8767 KOps/s | 25.7214 KOps/s | |
| test_simple | 0.7949s | 0.7932s | 1.2607 Ops/s | 1.2223 Ops/s | |
| test_transformed | 1.3949s | 1.3930s | 0.7179 Ops/s | 0.7057 Ops/s | |
| test_serial | 2.3533s | 2.3479s | 0.4259 Ops/s | 0.4224 Ops/s | |
| test_parallel | 1.9183s | 1.8735s | 0.5338 Ops/s | 0.5430 Ops/s | |
| test_step_mdp_speed[True-True-True-True-True] | 0.3225ms | 41.1695μs | 24.2898 KOps/s | 23.7215 KOps/s | |
| test_step_mdp_speed[True-True-True-True-False] | 0.4357ms | 23.3452μs | 42.8353 KOps/s | 43.2215 KOps/s | |
| test_step_mdp_speed[True-True-True-False-True] | 0.4437ms | 24.0829μs | 41.5233 KOps/s | 42.2287 KOps/s | |
| test_step_mdp_speed[True-True-True-False-False] | 42.3110μs | 12.9362μs | 77.3026 KOps/s | 76.6090 KOps/s | |
| test_step_mdp_speed[True-True-False-True-True] | 0.4723ms | 44.9461μs | 22.2489 KOps/s | 22.4023 KOps/s | |
| test_step_mdp_speed[True-True-False-True-False] | 0.4729ms | 25.8127μs | 38.7407 KOps/s | 39.3359 KOps/s | |
| test_step_mdp_speed[True-True-False-False-True] | 60.9810μs | 26.6031μs | 37.5896 KOps/s | 38.1703 KOps/s | |
| test_step_mdp_speed[True-True-False-False-False] | 0.4636ms | 15.5426μs | 64.3395 KOps/s | 63.9217 KOps/s | |
| test_step_mdp_speed[True-False-True-True-True] | 0.4761ms | 48.0181μs | 20.8255 KOps/s | 21.1779 KOps/s | |
| test_step_mdp_speed[True-False-True-True-False] | 78.1120μs | 28.4278μs | 35.1768 KOps/s | 35.2038 KOps/s | |
| test_step_mdp_speed[True-False-True-False-True] | 59.5610μs | 26.0575μs | 38.3767 KOps/s | 37.0109 KOps/s | |
| test_step_mdp_speed[True-False-True-False-False] | 0.4372ms | 15.7731μs | 63.3989 KOps/s | 63.5602 KOps/s | |
| test_step_mdp_speed[True-False-False-True-True] | 0.4725ms | 50.7006μs | 19.7236 KOps/s | 20.1613 KOps/s | |
| test_step_mdp_speed[True-False-False-True-False] | 63.5220μs | 30.7129μs | 32.5597 KOps/s | 32.8159 KOps/s | |
| test_step_mdp_speed[True-False-False-False-True] | 0.4687ms | 28.4403μs | 35.1614 KOps/s | 34.4016 KOps/s | |
| test_step_mdp_speed[True-False-False-False-False] | 0.4501ms | 18.0264μs | 55.4741 KOps/s | 54.9423 KOps/s | |
| test_step_mdp_speed[False-True-True-True-True] | 94.7620μs | 46.9038μs | 21.3202 KOps/s | 20.9879 KOps/s | |
| test_step_mdp_speed[False-True-True-True-False] | 0.4562ms | 27.8239μs | 35.9404 KOps/s | 35.3740 KOps/s | |
| test_step_mdp_speed[False-True-True-False-True] | 2.4258ms | 30.3070μs | 32.9957 KOps/s | 33.1373 KOps/s | |
| test_step_mdp_speed[False-True-True-False-False] | 0.4388ms | 17.0604μs | 58.6151 KOps/s | 57.8886 KOps/s | |
| test_step_mdp_speed[False-True-False-True-True] | 91.9820μs | 49.9223μs | 20.0311 KOps/s | 19.7052 KOps/s | |
| test_step_mdp_speed[False-True-False-True-False] | 0.4502ms | 30.8202μs | 32.4462 KOps/s | 32.2292 KOps/s | |
| test_step_mdp_speed[False-True-False-False-True] | 0.4616ms | 32.9594μs | 30.3403 KOps/s | 31.3904 KOps/s | |
| test_step_mdp_speed[False-True-False-False-False] | 0.4381ms | 19.6921μs | 50.7819 KOps/s | 51.2957 KOps/s | |
| test_step_mdp_speed[False-False-True-True-True] | 88.5120μs | 53.3137μs | 18.7569 KOps/s | 18.9863 KOps/s | |
| test_step_mdp_speed[False-False-True-True-False] | 0.4631ms | 34.1396μs | 29.2915 KOps/s | 30.1314 KOps/s | |
| test_step_mdp_speed[False-False-True-False-True] | 0.4650ms | 32.8559μs | 30.4359 KOps/s | 30.5092 KOps/s | |
| test_step_mdp_speed[False-False-True-False-False] | 0.4517ms | 19.8526μs | 50.3713 KOps/s | 50.7474 KOps/s | |
| test_step_mdp_speed[False-False-False-True-True] | 97.5720μs | 55.1260μs | 18.1403 KOps/s | 18.1238 KOps/s | |
| test_step_mdp_speed[False-False-False-True-False] | 0.4670ms | 36.9278μs | 27.0799 KOps/s | 27.8148 KOps/s | |
| test_step_mdp_speed[False-False-False-False-True] | 0.4672ms | 34.3894μs | 29.0787 KOps/s | 29.2528 KOps/s | |
| test_step_mdp_speed[False-False-False-False-False] | 0.4493ms | 22.3141μs | 44.8148 KOps/s | 45.5645 KOps/s | |
| test_non_tensor_env_rollout_speed[1000-single-True] | 0.7388s | 0.7336s | 1.3632 Ops/s | 1.3196 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-single-False] | 0.7168s | 0.6146s | 1.6271 Ops/s | 1.6214 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] | 1.7487s | 1.6591s | 0.6027 Ops/s | 0.5986 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] | 1.5214s | 1.4409s | 0.6940 Ops/s | 0.6871 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-True] | 1.9995s | 1.9242s | 0.5197 Ops/s | 0.5159 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-serial-buffers-False] | 1.7819s | 1.7050s | 0.5865 Ops/s | 0.5878 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] | 4.6760s | 4.6085s | 0.2170 Ops/s | 0.2152 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] | 4.5613s | 4.4379s | 0.2253 Ops/s | 0.2241 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] | 1.9688s | 1.8884s | 0.5295 Ops/s | 0.5199 Ops/s | |
| test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] | 1.7054s | 1.6008s | 0.6247 Ops/s | 0.6125 Ops/s | |
| test_values[generalized_advantage_estimate-True-True] | 21.8538ms | 21.3973ms | 46.7350 Ops/s | 45.6840 Ops/s | |
| test_values[vec_generalized_advantage_estimate-True-True] | 0.1539s | 4.0058ms | 249.6355 Ops/s | 280.6252 Ops/s | |
| test_values[td0_return_estimate-False-False] | 0.1101ms | 86.0552μs | 11.6205 KOps/s | 11.6192 KOps/s | |
| test_values[td1_return_estimate-False-False] | 51.0513ms | 50.6614ms | 19.7389 Ops/s | 19.7111 Ops/s | |
| test_values[vec_td1_return_estimate-False-False] | 1.3544ms | 1.0996ms | 909.4151 Ops/s | 901.3318 Ops/s | |
| test_values[td_lambda_return_estimate-True-False] | 83.2278ms | 82.5523ms | 12.1135 Ops/s | 12.0944 Ops/s | |
| test_values[vec_td_lambda_return_estimate-True-False] | 1.3168ms | 1.0990ms | 909.8961 Ops/s | 911.0187 Ops/s | |
| test_gae_speed[generalized_advantage_estimate-False-1-512] | 23.2524ms | 23.0240ms | 43.4329 Ops/s | 45.6705 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-1-512] | 1.0997ms | 0.8124ms | 1.2309 KOps/s | 1.2909 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-1-512] | 0.7385ms | 0.6874ms | 1.4548 KOps/s | 1.4506 KOps/s | |
| test_gae_speed[vec_generalized_advantage_estimate-True-32-512] | 1.5829ms | 1.5115ms | 661.5840 Ops/s | 663.8857 Ops/s | |
| test_gae_speed[vec_generalized_advantage_estimate-False-32-512] | 0.7669ms | 0.7106ms | 1.4073 KOps/s | 1.4189 KOps/s | |
| test_dqn_speed[False-None] | 1.7250ms | 1.6100ms | 621.1069 Ops/s | 617.7673 Ops/s | |
| test_dqn_speed[False-backward] | 2.7146ms | 2.3347ms | 428.3167 Ops/s | 437.0294 Ops/s | |
| test_dqn_speed[True-None] | 1.0990ms | 0.5983ms | 1.6714 KOps/s | 1.6855 KOps/s | |
| test_dqn_speed[True-backward] | 1.3196ms | 1.2411ms | 805.7197 Ops/s | 795.4935 Ops/s | |
| test_dqn_speed[reduce-overhead-None] | 0.7305ms | 0.6247ms | 1.6007 KOps/s | 1.6101 KOps/s | |
| test_ddpg_speed[False-None] | 3.4929ms | 3.0762ms | 325.0736 Ops/s | 328.2684 Ops/s | |
| test_ddpg_speed[False-backward] | 4.8267ms | 4.5064ms | 221.9077 Ops/s | 222.3119 Ops/s | |
| test_ddpg_speed[True-None] | 1.4502ms | 1.3625ms | 733.9239 Ops/s | 720.3126 Ops/s | |
| test_ddpg_speed[True-backward] | 2.7011ms | 2.5417ms | 393.4302 Ops/s | 394.2024 Ops/s | |
| test_ddpg_speed[reduce-overhead-None] | 1.5355ms | 1.3957ms | 716.5027 Ops/s | 718.7049 Ops/s | |
| test_sac_speed[False-None] | 9.1294ms | 8.7153ms | 114.7414 Ops/s | 115.8428 Ops/s | |
| test_sac_speed[False-backward] | 12.8729ms | 12.0199ms | 83.1952 Ops/s | 83.2983 Ops/s | |
| test_sac_speed[True-None] | 2.1871ms | 1.8784ms | 532.3665 Ops/s | 534.2507 Ops/s | |
| test_sac_speed[True-backward] | 3.7552ms | 3.6637ms | 272.9504 Ops/s | 272.3681 Ops/s | |
| test_sac_speed[reduce-overhead-None] | 17.0128ms | 10.3458ms | 96.6576 Ops/s | 82.4685 Ops/s | |
| test_redq_deprec_speed[False-None] | 10.6678ms | 9.7145ms | 102.9387 Ops/s | 101.7628 Ops/s | |
| test_redq_deprec_speed[False-backward] | 14.5175ms | 13.1799ms | 75.8731 Ops/s | 75.3662 Ops/s | |
| test_redq_deprec_speed[True-None] | 2.6815ms | 2.5890ms | 386.2498 Ops/s | 377.4202 Ops/s | |
| test_redq_deprec_speed[True-backward] | 4.8543ms | 4.3111ms | 231.9586 Ops/s | 240.8377 Ops/s | |
| test_redq_deprec_speed[reduce-overhead-None] | 14.9076ms | 9.7026ms | 103.0648 Ops/s | 102.6180 Ops/s | |
| test_td3_speed[False-None] | 8.8783ms | 8.5946ms | 116.3524 Ops/s | 117.2944 Ops/s | |
| test_td3_speed[False-backward] | 11.7005ms | 11.2907ms | 88.5688 Ops/s | 90.6503 Ops/s | |
| test_td3_speed[True-None] | 1.6584ms | 1.6358ms | 611.3119 Ops/s | 610.1236 Ops/s | |
| test_td3_speed[True-backward] | 3.2095ms | 3.1744ms | 315.0160 Ops/s | 323.5701 Ops/s | |
| test_td3_speed[reduce-overhead-None] | 50.3455ms | 25.9361ms | 38.5562 Ops/s | 38.7469 Ops/s | |
| test_cql_speed[False-None] | 18.5019ms | 18.0353ms | 55.4469 Ops/s | 55.5054 Ops/s | |
| test_cql_speed[False-backward] | 24.2648ms | 23.8285ms | 41.9666 Ops/s | 42.6045 Ops/s | |
| test_cql_speed[True-None] | 3.4103ms | 3.3216ms | 301.0557 Ops/s | 303.1724 Ops/s | |
| test_cql_speed[True-backward] | 6.0370ms | 5.6085ms | 178.2994 Ops/s | 181.0487 Ops/s | |
| test_cql_speed[reduce-overhead-None] | 19.0677ms | 12.0912ms | 82.7045 Ops/s | 83.0378 Ops/s | |
| test_a2c_speed[False-None] | 3.6549ms | 3.4504ms | 289.8177 Ops/s | 292.1392 Ops/s | |
| test_a2c_speed[False-backward] | 8.0116ms | 6.7327ms | 148.5285 Ops/s | 152.2775 Ops/s | |
| test_a2c_speed[True-None] | 1.4651ms | 1.4031ms | 712.7120 Ops/s | 720.6560 Ops/s | |
| test_a2c_speed[True-backward] | 3.5765ms | 3.1676ms | 315.6918 Ops/s | 317.1351 Ops/s | |
| test_a2c_speed[reduce-overhead-None] | 1.0933ms | 1.0392ms | 962.2685 Ops/s | 944.6868 Ops/s | |
| test_ppo_speed[False-None] | 4.2032ms | 4.1045ms | 243.6334 Ops/s | 242.0179 Ops/s | |
| test_ppo_speed[False-backward] | 7.9778ms | 7.6032ms | 131.5233 Ops/s | 132.6254 Ops/s | |
| test_ppo_speed[True-None] | 1.6126ms | 1.5290ms | 654.0317 Ops/s | 656.3433 Ops/s | |
| test_ppo_speed[True-backward] | 3.4291ms | 3.3748ms | 296.3105 Ops/s | 293.7646 Ops/s | |
| test_ppo_speed[reduce-overhead-None] | 1.3009ms | 1.1087ms | 901.9679 Ops/s | 888.8316 Ops/s | |
| test_reinforce_speed[False-None] | 2.6562ms | 2.4538ms | 407.5301 Ops/s | 403.7609 Ops/s | |
| test_reinforce_speed[False-backward] | 4.0337ms | 3.6442ms | 274.4054 Ops/s | 273.1000 Ops/s | |
| test_reinforce_speed[True-None] | 1.4841ms | 1.3556ms | 737.7071 Ops/s | 730.7799 Ops/s | |
| test_reinforce_speed[True-backward] | 3.3418ms | 3.2149ms | 311.0475 Ops/s | 308.8894 Ops/s | |
| test_reinforce_speed[reduce-overhead-None] | 16.1019ms | 8.9166ms | 112.1499 Ops/s | 111.3198 Ops/s | |
| test_iql_speed[False-None] | 10.4379ms | 10.0031ms | 99.9686 Ops/s | 100.5270 Ops/s | |
| test_iql_speed[False-backward] | 14.8362ms | 14.1186ms | 70.8287 Ops/s | 71.0008 Ops/s | |
| test_iql_speed[True-None] | 2.4142ms | 2.2831ms | 438.0008 Ops/s | 435.5284 Ops/s | |
| test_iql_speed[True-backward] | 5.3709ms | 4.9963ms | 200.1465 Ops/s | 205.2995 Ops/s | |
| test_iql_speed[reduce-overhead-None] | 16.7861ms | 10.2475ms | 97.5849 Ops/s | 98.6757 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 6.4138ms | 6.0299ms | 165.8397 Ops/s | 166.1867 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 0.8274ms | 0.3017ms | 3.3145 KOps/s | 2.9706 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5590ms | 0.3043ms | 3.2857 KOps/s | 3.2502 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.1577ms | 5.8568ms | 170.7420 Ops/s | 171.5756 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 1.7730ms | 0.2882ms | 3.4694 KOps/s | 2.9736 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5952ms | 0.2938ms | 3.4039 KOps/s | 3.2124 KOps/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] | 1.5070ms | 1.2868ms | 777.1262 Ops/s | 645.8797 Ops/s | |
| test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] | 1.4282ms | 1.2008ms | 832.7599 Ops/s | 678.4482 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 9.5359ms | 6.1467ms | 162.6878 Ops/s | 166.3872 Ops/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 1.0480ms | 0.4837ms | 2.0675 KOps/s | 1.9761 KOps/s | |
| test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.7881ms | 0.4831ms | 2.0700 KOps/s | 2.3494 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] | 5.9768ms | 5.8979ms | 169.5518 Ops/s | 169.6675 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] | 1.9097ms | 0.3144ms | 3.1809 KOps/s | 3.3942 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] | 0.5785ms | 0.3304ms | 3.0269 KOps/s | 3.6774 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] | 6.0756ms | 5.8100ms | 172.1170 Ops/s | 173.0460 Ops/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] | 2.0512ms | 0.3521ms | 2.8400 KOps/s | 2.7850 KOps/s | |
| test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] | 0.5424ms | 0.3295ms | 3.0353 KOps/s | 2.9320 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] | 8.1911ms | 6.0253ms | 165.9672 Ops/s | 167.4008 Ops/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] | 2.2332ms | 0.4899ms | 2.0413 KOps/s | 2.0592 KOps/s | |
| test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] | 0.6147ms | 0.4216ms | 2.3718 KOps/s | 2.2309 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] | 0.9634s | 24.2849ms | 41.1779 Ops/s | 35.3321 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] | 11.6245ms | 2.0938ms | 477.5980 Ops/s | 536.3769 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] | 1.1246ms | 0.9774ms | 1.0231 KOps/s | 1.0246 KOps/s | |
| test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] | 10.0310ms | 5.1411ms | 194.5101 Ops/s | 192.2135 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] | 3.9080ms | 1.8240ms | 548.2481 Ops/s | 488.8971 Ops/s | |
| test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] | 1.1424ms | 0.9761ms | 1.0244 KOps/s | 1.0204 KOps/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] | 9.2799ms | 5.3137ms | 188.1932 Ops/s | 186.4900 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] | 9.4161ms | 2.1686ms | 461.1345 Ops/s | 497.1771 Ops/s | |
| test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] | 10.4125ms | 1.3399ms | 746.3201 Ops/s | 820.6258 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] | 42.2279ms | 39.6608ms | 25.2138 Ops/s | 25.2843 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] | 20.0365ms | 18.6050ms | 53.7491 Ops/s | 53.9001 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] | 43.6685ms | 40.2521ms | 24.8434 Ops/s | 24.6274 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] | 19.9495ms | 18.5922ms | 53.7860 Ops/s | 52.5462 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] | 44.5241ms | 42.1466ms | 23.7267 Ops/s | 23.4191 Ops/s | |
| test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] | 21.2175ms | 20.1079ms | 49.7318 Ops/s | 48.6039 Ops/s | |
| test_storage_write_lazystack[50-img_shape0-small] | 0.8806ms | 0.2313ms | 4.3228 KOps/s | 4.2623 KOps/s | |
| test_storage_write_lazystack[100-img_shape1-atari] | 1.6853ms | 1.3650ms | 732.5795 Ops/s | 701.4381 Ops/s | |
| test_storage_write_lazystack[100-img_shape2-large_img] | 2.7129ms | 2.2725ms | 440.0453 Ops/s | 433.1233 Ops/s | |
| test_storage_write_lazystack[200-img_shape3-large_batch] | 3.0397ms | 2.9114ms | 343.4803 Ops/s | 336.7704 Ops/s | |
| test_storage_write_contiguous[50-img_shape0-small] | 0.5495ms | 0.1663ms | 6.0116 KOps/s | 5.9930 KOps/s | |
| test_storage_write_contiguous[100-img_shape1-atari] | 0.4030ms | 0.2385ms | 4.1922 KOps/s | 4.1464 KOps/s | |
| test_storage_write_contiguous[100-img_shape2-large_img] | 1.9865ms | 1.8487ms | 540.9118 Ops/s | 586.4496 Ops/s | |
| test_storage_write_contiguous[200-img_shape3-large_batch] | 1.5401ms | 1.3720ms | 728.8592 Ops/s | 736.9873 Ops/s | |
| test_collector_stack_then_write[50-img_shape0-small] | 1.3310ms | 1.1570ms | 864.2739 Ops/s | 858.3650 Ops/s | |
| test_collector_stack_then_write[100-img_shape1-atari] | 3.9501ms | 3.5892ms | 278.6173 Ops/s | 273.4177 Ops/s | |
| test_collector_stack_then_write[100-img_shape2-large_img] | 11.2145ms | 5.8618ms | 170.5962 Ops/s | 171.8305 Ops/s | |
| test_collector_stack_then_write[200-img_shape3-large_batch] | 7.4371ms | 7.0455ms | 141.9340 Ops/s | 140.1679 Ops/s | |
| test_collector_lazystack_then_write[50-img_shape0-small] | 0.4422ms | 0.2787ms | 3.5877 KOps/s | 3.4620 KOps/s | |
| test_collector_lazystack_then_write[100-img_shape1-atari] | 1.6000ms | 1.4585ms | 685.6344 Ops/s | 650.5613 Ops/s | |
| test_collector_lazystack_then_write[100-img_shape2-large_img] | 2.8308ms | 2.4007ms | 416.5414 Ops/s | 408.8597 Ops/s | |
| test_collector_lazystack_then_write[200-img_shape3-large_batch] | 3.4449ms | 3.1296ms | 319.5323 Ops/s | 313.4996 Ops/s | |
| test_collector_without_rb[100-img_shape0-atari] | 34.7663ms | 33.4464ms | 29.8986 Ops/s | 29.1393 Ops/s | |
| test_collector_without_rb[200-img_shape1-large_batch] | 66.0463ms | 65.7630ms | 15.2061 Ops/s | 14.7296 Ops/s | |
| test_collector_with_rb[100-img_shape0-atari] | 38.7329ms | 38.0663ms | 26.2700 Ops/s | 25.7327 Ops/s | |
| test_collector_with_rb[200-img_shape1-large_batch] | 75.2361ms | 74.5165ms | 13.4198 Ops/s | 13.2452 Ops/s | |
| test_collector_without_rb_cuda[100-img_shape0-atari] | 56.3901ms | 56.2898ms | 17.7652 Ops/s | 17.1379 Ops/s | |
| test_collector_without_rb_cuda[200-img_shape1-large_batch] | 0.1161s | 0.1127s | 8.8741 Ops/s | 8.6789 Ops/s | |
| test_collector_with_rb_cuda[100-img_shape0-atari] | 58.7277ms | 58.3260ms | 17.1450 Ops/s | 16.4423 Ops/s | |
| test_collector_with_rb_cuda[200-img_shape1-large_batch] | 0.1160s | 0.1156s | 8.6540 Ops/s | 8.3931 Ops/s |
Stack from ghstack (oldest at bottom):
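Add CountdownEnv and CountdownRewardParser for the Countdown numbers game, a popular lightweight problem for GRPO training.
Key features:
- Procedural problem generation (no external dataset required)
- Validates arithmetic expressions: correct operators, each source number used at most once, evaluates to target
- Same 0/0.1/1.0 reward convention as GSM8K and MATH parsers
- Configurable problem difficulty (num_count, max_number, max_target)
- Includes unit tests and documentation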
Made-with: Cursor
Pull-Request: #3545
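
The commit message for this PR mentions procedural problem generation controlled by `num_count`, `max_number` and `max_target`. One way such a generator could work is sketched below. The parameter names come from the commit message, but their exact semantics, the default values, the `generate_problem` name and the `rng` and `max_tries` arguments are all assumptions for illustration, not the PR's implementation.

```python
import random


def generate_problem(num_count=4, max_number=25, max_target=100,
                     rng=None, max_tries=1000):
    """Hypothetical generator: returns (numbers, target) with a reachable target.

    The target is built by combining all drawn numbers with random operators,
    so at least one solution exists by construction.
    """
    rng = rng or random.Random()
    for _ in range(max_tries):
        numbers = [rng.randint(1, max_number) for _ in range(num_count)]
        pool = numbers[:]
        rng.shuffle(pool)
        value = pool[0]
        # Fold the shuffled numbers left to right with random operators.
        for n in pool[1:]:
            op = rng.choice("+-*")
            if op == "+":
                value += n
            elif op == "-":
                value -= n
            else:
                value *= n
        # Keep only targets inside the assumed range [1, max_target].
        if 1 <= value <= max_target:
            return numbers, value
    raise RuntimeError("no valid problem found; relax the constraints")
```

Generating the target from the numbers themselves is what removes the need for an external dataset: every sampled problem is solvable, and difficulty scales with how many numbers are drawn and how large they may be.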