
[Docs] Add TransformersWrapper/ChatEnv integration documentation #3377

Merged
vmoens merged 3 commits into gh/vmoens/204/base from gh/vmoens/204/head on Jan 29, 2026

Collaborator

@vmoens vmoens commented Jan 22, 2026

Stack from ghstack (oldest at bottom):

Add TRANSFORMERS_CHATENV_INTEGRATION.md documenting:

  • The data flow between TransformersWrapper and ChatEnv
  • The ChatHistory contract (prompt, response, full attributes)
  • Common issues and debugging tips
  • Related files for reference

Also update LLM_TEST_ISSUES.md to reflect the fixes:

  • Issue 10 (Gated HuggingFace Models): Now uses mock tokenizer with SmolLM
  • Issue 11 (TransformersWrapper History Output): Fixed by setting full
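
A minimal sketch of how the ChatHistory contract listed above might fit together. It assumes, without confirmation from this PR, that `full` is the concatenation of `prompt` and `response`, and that the environment feeds `full` back as the next `prompt`. `Message`, `ChatHistorySketch`, `fake_policy` and `fake_env_step` are hypothetical stand-ins written in plain Python; they are not torchrl classes or functions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    role: str
    content: str


@dataclass
class ChatHistorySketch:
    """Hypothetical stand-in for the three-attribute contract."""
    prompt: List[Message]
    response: Optional[List[Message]] = None
    full: Optional[List[Message]] = None


def fake_policy(history: ChatHistorySketch) -> ChatHistorySketch:
    """Stand-in for the wrapper: fills in `response` and `full`."""
    response = [Message("assistant", "Hello!")]
    return ChatHistorySketch(
        prompt=history.prompt,
        response=response,
        # Assumption: `full` is simply prompt followed by response.
        full=history.prompt + response,
    )


def fake_env_step(history: ChatHistorySketch) -> ChatHistorySketch:
    """Stand-in for the environment: `full` becomes the next `prompt`."""
    if history.full is None:
        # Under the assumed contract, a missing `full` breaks the loop.
        raise ValueError("policy output is missing `full`")
    return ChatHistorySketch(prompt=list(history.full))


start = ChatHistorySketch(prompt=[Message("user", "Hi")])
after_policy = fake_policy(start)
next_turn = fake_env_step(after_policy)
print([m.role for m in next_turn.prompt])
```

Under these assumptions, a policy that leaves `full` unset would stop the environment from building the next turn, which is one plausible reading of why Issue 11 was resolved by setting `full`.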

[ghstack-poisoned]

pytorch-bot Bot commented Jan 22, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3377

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

vmoens added a commit that referenced this pull request Jan 22, 2026

ghstack-source-id: e06fd77
Pull-Request: #3377
@vmoens vmoens added the "llm/" label (LLM-related PR, triggers LLM CI tests) Jan 22, 2026
@meta-cla meta-cla Bot added the "CLA Signed" label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) Jan 22, 2026
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 22, 2026

ghstack-source-id: 8ed97f7
Pull-Request: #3377
Contributor

github-actions Bot commented Jan 22, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 153. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}15$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_tensor_to_bytestream_speed[pickle] 86.6142μs 85.6737μs 11.6722 KOps/s 12.3477 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_tensor_to_bytestream_speed[torch.save] 0.1455ms 0.1448ms 6.9079 KOps/s 7.1183 KOps/s $\color{#d91a1a}-2.96\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1154s 0.1147s 8.7151 Ops/s 9.3670 Ops/s $\textbf{\color{#d91a1a}-6.96\%}$
test_tensor_to_bytestream_speed[numpy] 2.6780μs 2.6724μs 374.1956 KOps/s 394.2497 KOps/s $\textbf{\color{#d91a1a}-5.09\%}$
test_tensor_to_bytestream_speed[safetensors] 39.2378μs 38.2262μs 26.1601 KOps/s 27.0103 KOps/s $\color{#d91a1a}-3.15\%$
test_simple 0.6670s 0.5769s 1.7335 Ops/s 1.7470 Ops/s $\color{#d91a1a}-0.77\%$
test_transformed 1.2522s 1.1600s 0.8621 Ops/s 0.8681 Ops/s $\color{#d91a1a}-0.69\%$
test_serial 1.6894s 1.6828s 0.5943 Ops/s 0.5918 Ops/s $\color{#35bf28}+0.41\%$
test_parallel 1.2574s 1.1803s 0.8472 Ops/s 0.8938 Ops/s $\textbf{\color{#d91a1a}-5.21\%}$
test_step_mdp_speed[True-True-True-True-True] 0.1580ms 45.0677μs 22.1888 KOps/s 21.6822 KOps/s $\color{#35bf28}+2.34\%$
test_step_mdp_speed[True-True-True-True-False] 52.0020μs 25.1217μs 39.8062 KOps/s 39.4014 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[True-True-True-False-True] 49.4120μs 25.3065μs 39.5155 KOps/s 39.4555 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[True-True-True-False-False] 36.5810μs 13.8703μs 72.0965 KOps/s 70.7528 KOps/s $\color{#35bf28}+1.90\%$
test_step_mdp_speed[True-True-False-True-True] 85.4930μs 48.6174μs 20.5688 KOps/s 20.8954 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[True-True-False-True-False] 55.0120μs 27.6498μs 36.1666 KOps/s 35.2817 KOps/s $\color{#35bf28}+2.51\%$
test_step_mdp_speed[True-True-False-False-True] 58.5730μs 28.5042μs 35.0825 KOps/s 35.2645 KOps/s $\color{#d91a1a}-0.52\%$
test_step_mdp_speed[True-True-False-False-False] 56.8830μs 16.8329μs 59.4076 KOps/s 59.7222 KOps/s $\color{#d91a1a}-0.53\%$
test_step_mdp_speed[True-False-True-True-True] 85.6330μs 50.8301μs 19.6734 KOps/s 19.4884 KOps/s $\color{#35bf28}+0.95\%$
test_step_mdp_speed[True-False-True-True-False] 62.0230μs 30.4237μs 32.8691 KOps/s 31.8371 KOps/s $\color{#35bf28}+3.24\%$
test_step_mdp_speed[True-False-True-False-True] 63.2020μs 28.3224μs 35.3077 KOps/s 35.1916 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[True-False-True-False-False] 43.2620μs 16.6260μs 60.1466 KOps/s 59.7693 KOps/s $\color{#35bf28}+0.63\%$
test_step_mdp_speed[True-False-False-True-True] 99.7240μs 52.7847μs 18.9449 KOps/s 18.4398 KOps/s $\color{#35bf28}+2.74\%$
test_step_mdp_speed[True-False-False-True-False] 74.4230μs 32.8021μs 30.4858 KOps/s 29.4805 KOps/s $\color{#35bf28}+3.41\%$
test_step_mdp_speed[True-False-False-False-True] 71.9330μs 30.1481μs 33.1696 KOps/s 32.0483 KOps/s $\color{#35bf28}+3.50\%$
test_step_mdp_speed[True-False-False-False-False] 54.9820μs 19.0565μs 52.4754 KOps/s 51.2611 KOps/s $\color{#35bf28}+2.37\%$
test_step_mdp_speed[False-True-True-True-True] 96.3040μs 50.6871μs 19.7289 KOps/s 19.1875 KOps/s $\color{#35bf28}+2.82\%$
test_step_mdp_speed[False-True-True-True-False] 68.8130μs 30.7922μs 32.4758 KOps/s 32.2917 KOps/s $\color{#35bf28}+0.57\%$
test_step_mdp_speed[False-True-True-False-True] 81.4330μs 32.2264μs 31.0305 KOps/s 31.3110 KOps/s $\color{#d91a1a}-0.90\%$
test_step_mdp_speed[False-True-True-False-False] 44.6520μs 18.1136μs 55.2070 KOps/s 53.6026 KOps/s $\color{#35bf28}+2.99\%$
test_step_mdp_speed[False-True-False-True-True] 2.7665ms 53.7984μs 18.5879 KOps/s 18.2600 KOps/s $\color{#35bf28}+1.80\%$
test_step_mdp_speed[False-True-False-True-False] 69.4130μs 33.1388μs 30.1761 KOps/s 29.5103 KOps/s $\color{#35bf28}+2.26\%$
test_step_mdp_speed[False-True-False-False-True] 73.3040μs 34.3723μs 29.0932 KOps/s 28.2740 KOps/s $\color{#35bf28}+2.90\%$
test_step_mdp_speed[False-True-False-False-False] 70.8530μs 20.9715μs 47.6837 KOps/s 46.1710 KOps/s $\color{#35bf28}+3.28\%$
test_step_mdp_speed[False-False-True-True-True] 94.9250μs 55.6491μs 17.9697 KOps/s 17.3596 KOps/s $\color{#35bf28}+3.51\%$
test_step_mdp_speed[False-False-True-True-False] 68.2530μs 35.7649μs 27.9604 KOps/s 27.6107 KOps/s $\color{#35bf28}+1.27\%$
test_step_mdp_speed[False-False-True-False-True] 71.7730μs 34.4917μs 28.9925 KOps/s 28.5033 KOps/s $\color{#35bf28}+1.72\%$
test_step_mdp_speed[False-False-True-False-False] 62.8930μs 20.8195μs 48.0318 KOps/s 46.3499 KOps/s $\color{#35bf28}+3.63\%$
test_step_mdp_speed[False-False-False-True-True] 0.1092ms 57.5029μs 17.3904 KOps/s 16.7181 KOps/s $\color{#35bf28}+4.02\%$
test_step_mdp_speed[False-False-False-True-False] 78.3630μs 37.6476μs 26.5621 KOps/s 25.3955 KOps/s $\color{#35bf28}+4.59\%$
test_step_mdp_speed[False-False-False-False-True] 70.1130μs 36.1038μs 27.6979 KOps/s 26.3361 KOps/s $\textbf{\color{#35bf28}+5.17\%}$
test_step_mdp_speed[False-False-False-False-False] 63.7330μs 23.0724μs 43.3418 KOps/s 41.3588 KOps/s $\color{#35bf28}+4.79\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.8591s 0.7665s 1.3047 Ops/s 1.3050 Ops/s $\color{#d91a1a}-0.02\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7258s 0.6323s 1.5816 Ops/s 1.5723 Ops/s $\color{#35bf28}+0.59\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7425s 1.6675s 0.5997 Ops/s 0.5971 Ops/s $\color{#35bf28}+0.43\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5216s 1.4488s 0.6902 Ops/s 0.6920 Ops/s $\color{#d91a1a}-0.26\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 1.9967s 1.9174s 0.5215 Ops/s 0.5206 Ops/s $\color{#35bf28}+0.19\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7771s 1.6996s 0.5884 Ops/s 0.5898 Ops/s $\color{#d91a1a}-0.24\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.7514s 4.6456s 0.2153 Ops/s 0.2152 Ops/s $\color{#35bf28}+0.03\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.4425s 4.3528s 0.2297 Ops/s 0.2244 Ops/s $\color{#35bf28}+2.38\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.1654s 2.0343s 0.4916 Ops/s 0.5069 Ops/s $\color{#d91a1a}-3.03\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.7342s 1.6651s 0.6006 Ops/s 0.5983 Ops/s $\color{#35bf28}+0.37\%$
test_values[generalized_advantage_estimate-True-True] 10.5307ms 10.2403ms 97.6536 Ops/s 98.3214 Ops/s $\color{#d91a1a}-0.68\%$
test_values[vec_generalized_advantage_estimate-True-True] 15.2541ms 11.3493ms 88.1115 Ops/s 88.3077 Ops/s $\color{#d91a1a}-0.22\%$
test_values[td0_return_estimate-False-False] 0.2521ms 0.1266ms 7.8995 KOps/s 7.9124 KOps/s $\color{#d91a1a}-0.16\%$
test_values[td1_return_estimate-False-False] 27.7073ms 27.3064ms 36.6215 Ops/s 36.5607 Ops/s $\color{#35bf28}+0.17\%$
test_values[vec_td1_return_estimate-False-False] 12.4606ms 11.4420ms 87.3972 Ops/s 87.6139 Ops/s $\color{#d91a1a}-0.25\%$
test_values[td_lambda_return_estimate-True-False] 40.8908ms 40.2229ms 24.8615 Ops/s 25.0276 Ops/s $\color{#d91a1a}-0.66\%$
test_values[vec_td_lambda_return_estimate-True-False] 12.1747ms 11.3567ms 88.0538 Ops/s 86.8004 Ops/s $\color{#35bf28}+1.44\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 9.2381ms 9.1183ms 109.6698 Ops/s 110.4405 Ops/s $\color{#d91a1a}-0.70\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.8690ms 1.5432ms 648.0129 Ops/s 689.3843 Ops/s $\textbf{\color{#d91a1a}-6.00\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.4703ms 0.4197ms 2.3826 KOps/s 2.4174 KOps/s $\color{#d91a1a}-1.44\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 35.6280ms 28.3083ms 35.3254 Ops/s 51.7381 Ops/s $\textbf{\color{#d91a1a}-31.72\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 2.0722ms 1.7298ms 578.0882 Ops/s 576.6770 Ops/s $\color{#35bf28}+0.24\%$
test_dqn_speed[False-None] 1.7214ms 1.3932ms 717.7827 Ops/s 710.5847 Ops/s $\color{#35bf28}+1.01\%$
test_dqn_speed[False-backward] 2.0007ms 1.9237ms 519.8261 Ops/s 526.8453 Ops/s $\color{#d91a1a}-1.33\%$
test_dqn_speed[True-None] 0.7584ms 0.5314ms 1.8819 KOps/s 1.7594 KOps/s $\textbf{\color{#35bf28}+6.97\%}$
test_dqn_speed[True-backward] 1.0733ms 0.9831ms 1.0172 KOps/s 852.1163 Ops/s $\textbf{\color{#35bf28}+19.37\%}$
test_dqn_speed[reduce-overhead-None] 0.9040ms 0.5184ms 1.9288 KOps/s 1.8309 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_ddpg_speed[False-None] 0.1994s 3.3925ms 294.7706 Ops/s 351.2526 Ops/s $\textbf{\color{#d91a1a}-16.08\%}$
test_ddpg_speed[False-backward] 4.1029ms 4.0097ms 249.3930 Ops/s 247.7575 Ops/s $\color{#35bf28}+0.66\%$
test_ddpg_speed[True-None] 1.5858ms 1.3895ms 719.6635 Ops/s 717.2487 Ops/s $\color{#35bf28}+0.34\%$
test_ddpg_speed[True-backward] 2.4911ms 2.3824ms 419.7379 Ops/s 370.6213 Ops/s $\textbf{\color{#35bf28}+13.25\%}$
test_ddpg_speed[reduce-overhead-None] 2.2204ms 1.4187ms 704.8651 Ops/s 711.3368 Ops/s $\color{#d91a1a}-0.91\%$
test_sac_speed[False-None] 8.5251ms 7.9381ms 125.9754 Ops/s 125.2789 Ops/s $\color{#35bf28}+0.56\%$
test_sac_speed[False-backward] 11.6496ms 11.1032ms 90.0640 Ops/s 89.8860 Ops/s $\color{#35bf28}+0.20\%$
test_sac_speed[True-None] 2.5244ms 2.1609ms 462.7744 Ops/s 455.1299 Ops/s $\color{#35bf28}+1.68\%$
test_sac_speed[True-backward] 4.1211ms 4.0130ms 249.1871 Ops/s 222.1270 Ops/s $\textbf{\color{#35bf28}+12.18\%}$
test_sac_speed[reduce-overhead-None] 2.5708ms 2.1425ms 466.7552 Ops/s 466.8918 Ops/s $\color{#d91a1a}-0.03\%$
test_redq_speed[False-None] 12.5346ms 10.5328ms 94.9412 Ops/s 97.5028 Ops/s $\color{#d91a1a}-2.63\%$
test_redq_speed[False-backward] 21.0190ms 17.7527ms 56.3294 Ops/s 57.4374 Ops/s $\color{#d91a1a}-1.93\%$
test_redq_speed[True-None] 5.1750ms 4.4177ms 226.3629 Ops/s 221.9222 Ops/s $\color{#35bf28}+2.00\%$
test_redq_speed[True-backward] 10.1826ms 9.5063ms 105.1939 Ops/s 99.3191 Ops/s $\textbf{\color{#35bf28}+5.92\%}$
test_redq_speed[reduce-overhead-None] 4.5380ms 4.3712ms 228.7718 Ops/s 218.8096 Ops/s $\color{#35bf28}+4.55\%$
test_redq_deprec_speed[False-None] 12.8023ms 11.4072ms 87.6638 Ops/s 91.9340 Ops/s $\color{#d91a1a}-4.64\%$
test_redq_deprec_speed[False-backward] 16.6049ms 16.2466ms 61.5514 Ops/s 64.6227 Ops/s $\color{#d91a1a}-4.75\%$
test_redq_deprec_speed[True-None] 4.2249ms 3.8925ms 256.9058 Ops/s 258.9435 Ops/s $\color{#d91a1a}-0.79\%$
test_redq_deprec_speed[True-backward] 8.3386ms 8.0633ms 124.0180 Ops/s 131.7714 Ops/s $\textbf{\color{#d91a1a}-5.88\%}$
test_redq_deprec_speed[reduce-overhead-None] 4.0323ms 3.7919ms 263.7220 Ops/s 279.1831 Ops/s $\textbf{\color{#d91a1a}-5.54\%}$
test_td3_speed[False-None] 8.7289ms 7.9388ms 125.9644 Ops/s 124.2301 Ops/s $\color{#35bf28}+1.40\%$
test_td3_speed[False-backward] 11.1945ms 10.7258ms 93.2331 Ops/s 91.0172 Ops/s $\color{#35bf28}+2.43\%$
test_td3_speed[True-None] 1.8739ms 1.8322ms 545.7970 Ops/s 543.5783 Ops/s $\color{#35bf28}+0.41\%$
test_td3_speed[True-backward] 3.8347ms 3.6174ms 276.4449 Ops/s 270.0462 Ops/s $\color{#35bf28}+2.37\%$
test_td3_speed[reduce-overhead-None] 1.8329ms 1.7945ms 557.2650 Ops/s 550.2624 Ops/s $\color{#35bf28}+1.27\%$
test_cql_speed[False-None] 29.1217ms 25.8454ms 38.6916 Ops/s 38.5035 Ops/s $\color{#35bf28}+0.49\%$
test_cql_speed[False-backward] 35.5871ms 34.7833ms 28.7494 Ops/s 28.1506 Ops/s $\color{#35bf28}+2.13\%$
test_cql_speed[True-None] 13.3526ms 12.4732ms 80.1718 Ops/s 81.3306 Ops/s $\color{#d91a1a}-1.42\%$
test_cql_speed[True-backward] 18.2832ms 17.8360ms 56.0664 Ops/s 56.9869 Ops/s $\color{#d91a1a}-1.62\%$
test_cql_speed[reduce-overhead-None] 13.1560ms 12.5330ms 79.7892 Ops/s 81.5535 Ops/s $\color{#d91a1a}-2.16\%$
test_a2c_speed[False-None] 5.8478ms 5.4596ms 183.1621 Ops/s 186.3073 Ops/s $\color{#d91a1a}-1.69\%$
test_a2c_speed[False-backward] 12.2899ms 11.9730ms 83.5212 Ops/s 85.0392 Ops/s $\color{#d91a1a}-1.79\%$
test_a2c_speed[True-None] 4.0461ms 3.7842ms 264.2591 Ops/s 270.7309 Ops/s $\color{#d91a1a}-2.39\%$
test_a2c_speed[True-backward] 8.9682ms 8.7536ms 114.2389 Ops/s 109.8839 Ops/s $\color{#35bf28}+3.96\%$
test_a2c_speed[reduce-overhead-None] 4.1998ms 3.7777ms 264.7137 Ops/s 267.7614 Ops/s $\color{#d91a1a}-1.14\%$
test_ppo_speed[False-None] 6.3402ms 5.8792ms 170.0907 Ops/s 172.3808 Ops/s $\color{#d91a1a}-1.33\%$
test_ppo_speed[False-backward] 12.8135ms 12.4792ms 80.1333 Ops/s 81.6856 Ops/s $\color{#d91a1a}-1.90\%$
test_ppo_speed[True-None] 4.1623ms 3.7165ms 269.0693 Ops/s 273.4025 Ops/s $\color{#d91a1a}-1.58\%$
test_ppo_speed[True-backward] 8.9084ms 8.4294ms 118.6330 Ops/s 117.1392 Ops/s $\color{#35bf28}+1.28\%$
test_ppo_speed[reduce-overhead-None] 4.0531ms 3.6700ms 272.4775 Ops/s 278.0590 Ops/s $\color{#d91a1a}-2.01\%$
test_reinforce_speed[False-None] 5.0333ms 4.5743ms 218.6135 Ops/s 222.3787 Ops/s $\color{#d91a1a}-1.69\%$
test_reinforce_speed[False-backward] 7.9504ms 7.4148ms 134.8652 Ops/s 138.0275 Ops/s $\color{#d91a1a}-2.29\%$
test_reinforce_speed[True-None] 3.1986ms 2.9751ms 336.1265 Ops/s 347.9234 Ops/s $\color{#d91a1a}-3.39\%$
test_reinforce_speed[True-backward] 8.1236ms 7.8818ms 126.8752 Ops/s 119.2107 Ops/s $\textbf{\color{#35bf28}+6.43\%}$
test_reinforce_speed[reduce-overhead-None] 3.2563ms 2.9210ms 342.3541 Ops/s 343.7292 Ops/s $\color{#d91a1a}-0.40\%$
test_iql_speed[False-None] 20.4672ms 19.8371ms 50.4105 Ops/s 50.2046 Ops/s $\color{#35bf28}+0.41\%$
test_iql_speed[False-backward] 30.9672ms 30.2321ms 33.0775 Ops/s 32.6883 Ops/s $\color{#35bf28}+1.19\%$
test_iql_speed[True-None] 9.0350ms 8.5220ms 117.3434 Ops/s 113.8014 Ops/s $\color{#35bf28}+3.11\%$
test_iql_speed[True-backward] 17.0823ms 16.6910ms 59.9124 Ops/s 57.9148 Ops/s $\color{#35bf28}+3.45\%$
test_iql_speed[reduce-overhead-None] 8.8883ms 8.6370ms 115.7816 Ops/s 115.8813 Ops/s $\color{#d91a1a}-0.09\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.1739ms 6.0098ms 166.3948 Ops/s 165.1590 Ops/s $\color{#35bf28}+0.75\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.2137ms 0.3198ms 3.1274 KOps/s 3.1110 KOps/s $\color{#35bf28}+0.53\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6515ms 0.3370ms 2.9671 KOps/s 3.2604 KOps/s $\textbf{\color{#d91a1a}-9.00\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0637ms 5.8084ms 172.1654 Ops/s 171.6905 Ops/s $\color{#35bf28}+0.28\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.8344ms 0.3258ms 3.0697 KOps/s 3.1083 KOps/s $\color{#d91a1a}-1.24\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5547ms 0.2914ms 3.4317 KOps/s 3.3017 KOps/s $\color{#35bf28}+3.94\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.4938ms 1.2586ms 794.5050 Ops/s 805.0533 Ops/s $\color{#d91a1a}-1.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4020ms 1.1811ms 846.6594 Ops/s 847.6896 Ops/s $\color{#d91a1a}-0.12\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 10.1867ms 6.1243ms 163.2846 Ops/s 166.5205 Ops/s $\color{#d91a1a}-1.94\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.9565ms 0.4697ms 2.1290 KOps/s 2.2646 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8343ms 0.4185ms 2.3894 KOps/s 2.2605 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.0253ms 5.8689ms 170.3888 Ops/s 170.3368 Ops/s $\color{#35bf28}+0.03\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9364ms 0.3701ms 2.7019 KOps/s 3.2036 KOps/s $\textbf{\color{#d91a1a}-15.66\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5320ms 0.3565ms 2.8052 KOps/s 3.4254 KOps/s $\textbf{\color{#d91a1a}-18.11\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0079ms 5.7908ms 172.6864 Ops/s 172.1165 Ops/s $\color{#35bf28}+0.33\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.6342ms 0.2998ms 3.3351 KOps/s 2.9981 KOps/s $\textbf{\color{#35bf28}+11.24\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 7.0294ms 0.3202ms 3.1233 KOps/s 3.1690 KOps/s $\color{#d91a1a}-1.44\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.3311ms 5.9446ms 168.2203 Ops/s 166.0073 Ops/s $\color{#35bf28}+1.33\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.0890ms 0.5405ms 1.8501 KOps/s 609.2241 Ops/s $\textbf{\color{#35bf28}+203.68\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7461ms 0.4822ms 2.0737 KOps/s 2.1894 KOps/s $\textbf{\color{#d91a1a}-5.29\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5844s 17.1159ms 58.4253 Ops/s 195.7149 Ops/s $\textbf{\color{#d91a1a}-70.15\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.8987ms 1.8170ms 550.3485 Ops/s 451.6590 Ops/s $\textbf{\color{#35bf28}+21.85\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.8991ms 1.1845ms 844.2378 Ops/s 864.7359 Ops/s $\color{#d91a1a}-2.37\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 6.6102ms 5.0149ms 199.4041 Ops/s 191.2833 Ops/s $\color{#35bf28}+4.25\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 11.1934ms 1.9069ms 524.4038 Ops/s 520.2098 Ops/s $\color{#35bf28}+0.81\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 8.0868ms 1.2442ms 803.7458 Ops/s 797.1520 Ops/s $\color{#35bf28}+0.83\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5126s 15.3960ms 64.9518 Ops/s 60.6672 Ops/s $\textbf{\color{#35bf28}+7.06\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 4.1502ms 1.9208ms 520.6214 Ops/s 474.5988 Ops/s $\textbf{\color{#35bf28}+9.70\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 1.2791ms 1.0226ms 977.9179 Ops/s 960.7646 Ops/s $\color{#35bf28}+1.79\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.4295ms 35.8157ms 27.9207 Ops/s 27.6541 Ops/s $\color{#35bf28}+0.96\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.6936ms 18.0196ms 55.4952 Ops/s 54.9070 Ops/s $\color{#35bf28}+1.07\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.6610ms 37.1449ms 26.9216 Ops/s 26.5881 Ops/s $\color{#35bf28}+1.25\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.9862ms 18.3906ms 54.3758 Ops/s 53.2653 Ops/s $\color{#35bf28}+2.08\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.6178ms 38.7254ms 25.8229 Ops/s 25.6929 Ops/s $\color{#35bf28}+0.51\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.6138ms 19.9125ms 50.2198 Ops/s 49.1885 Ops/s $\color{#35bf28}+2.10\%$
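
The Change column above appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, and the bold entries appear to be those beyond roughly ±5%, which is consistent with the Improved and Worsened counts in the summary line. Both the formula and the 5% threshold are inferred from the table values, not stated by the benchmark bot. A minimal sketch under those assumptions, with hypothetical helper names:

```python
# Units seen in the table; values are normalised to operations per second.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1_000.0}


def to_ops_per_second(value: float, unit: str) -> float:
    return value * UNIT_SCALE[unit]


def relative_change_pct(ops: float, head_ops: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops / head_ops - 1.0) * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    # Assumption: the 5% cut-off is inferred from which rows are bold.
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# Three rows copied from the CPU table: (name, Ops, Ops on Repo HEAD).
rows = [
    ("test_tensor_to_bytestream_speed[pickle]", (11.6722, "KOps/s"), (12.3477, "KOps/s")),
    ("test_tensor_to_bytestream_speed[torch.save]", (6.9079, "KOps/s"), (7.1183, "KOps/s")),
    ("test_dqn_speed[True-backward]", (1.0172, "KOps/s"), (852.1163, "Ops/s")),
]

results = {}
for name, (ops, ops_unit), (head, head_unit) in rows:
    change = relative_change_pct(
        to_ops_per_second(ops, ops_unit), to_ops_per_second(head, head_unit)
    )
    results[name] = (change, classify(change))
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

The third row mixes KOps/s and Ops/s, which is why the values are normalised before dividing.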

Contributor

github-actions Bot commented Jan 22, 2026

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 148. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}12$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_tensor_to_bytestream_speed[pickle] 83.6858μs 82.1728μs 12.1695 KOps/s 12.4674 KOps/s $\color{#d91a1a}-2.39\%$
test_tensor_to_bytestream_speed[torch.save] 0.1409ms 0.1402ms 7.1350 KOps/s 7.2209 KOps/s $\color{#d91a1a}-1.19\%$
test_tensor_to_bytestream_speed[untyped_storage] 0.1179s 0.1177s 8.4965 Ops/s 8.5331 Ops/s $\color{#d91a1a}-0.43\%$
test_tensor_to_bytestream_speed[numpy] 2.5760μs 2.5534μs 391.6280 KOps/s 397.8715 KOps/s $\color{#d91a1a}-1.57\%$
test_tensor_to_bytestream_speed[safetensors] 39.1144μs 38.8705μs 25.7265 KOps/s 26.6113 KOps/s $\color{#d91a1a}-3.33\%$
test_simple 0.9047s 0.8140s 1.2285 Ops/s 1.2301 Ops/s $\color{#d91a1a}-0.13\%$
test_transformed 1.5383s 1.4415s 0.6937 Ops/s 0.6941 Ops/s $\color{#d91a1a}-0.06\%$
test_serial 2.3876s 2.2923s 0.4362 Ops/s 0.4393 Ops/s $\color{#d91a1a}-0.69\%$
test_parallel 1.9966s 1.9432s 0.5146 Ops/s 0.5217 Ops/s $\color{#d91a1a}-1.37\%$
test_step_mdp_speed[True-True-True-True-True] 0.2662ms 45.7890μs 21.8393 KOps/s 21.5204 KOps/s $\color{#35bf28}+1.48\%$
test_step_mdp_speed[True-True-True-True-False] 57.9310μs 25.2203μs 39.6506 KOps/s 38.8400 KOps/s $\color{#35bf28}+2.09\%$
test_step_mdp_speed[True-True-True-False-True] 84.6020μs 26.2498μs 38.0955 KOps/s 38.8183 KOps/s $\color{#d91a1a}-1.86\%$
test_step_mdp_speed[True-True-True-False-False] 42.1110μs 13.9658μs 71.6035 KOps/s 70.2436 KOps/s $\color{#35bf28}+1.94\%$
test_step_mdp_speed[True-True-False-True-True] 0.1358ms 49.0494μs 20.3876 KOps/s 20.2476 KOps/s $\color{#35bf28}+0.69\%$
test_step_mdp_speed[True-True-False-True-False] 63.1810μs 27.4611μs 36.4151 KOps/s 34.6250 KOps/s $\textbf{\color{#35bf28}+5.17\%}$
test_step_mdp_speed[True-True-False-False-True] 71.2210μs 28.9161μs 34.5828 KOps/s 34.5134 KOps/s $\color{#35bf28}+0.20\%$
test_step_mdp_speed[True-True-False-False-False] 47.1710μs 16.8612μs 59.3077 KOps/s 57.3546 KOps/s $\color{#35bf28}+3.41\%$
test_step_mdp_speed[True-False-True-True-True] 87.7710μs 52.3505μs 19.1020 KOps/s 19.1114 KOps/s $\color{#d91a1a}-0.05\%$
test_step_mdp_speed[True-False-True-True-False] 63.5910μs 30.7576μs 32.5123 KOps/s 31.3385 KOps/s $\color{#35bf28}+3.75\%$
test_step_mdp_speed[True-False-True-False-True] 64.5910μs 29.2824μs 34.1502 KOps/s 34.5199 KOps/s $\color{#d91a1a}-1.07\%$
test_step_mdp_speed[True-False-True-False-False] 78.2410μs 16.4727μs 60.7065 KOps/s 57.8053 KOps/s $\textbf{\color{#35bf28}+5.02\%}$
test_step_mdp_speed[True-False-False-True-True] 87.3820μs 53.3365μs 18.7489 KOps/s 18.1012 KOps/s $\color{#35bf28}+3.58\%$
test_step_mdp_speed[True-False-False-True-False] 70.5110μs 33.5281μs 29.8258 KOps/s 28.8818 KOps/s $\color{#35bf28}+3.27\%$
test_step_mdp_speed[True-False-False-False-True] 63.3310μs 30.9743μs 32.2849 KOps/s 31.7112 KOps/s $\color{#35bf28}+1.81\%$
test_step_mdp_speed[True-False-False-False-False] 42.6110μs 19.3325μs 51.7265 KOps/s 49.7078 KOps/s $\color{#35bf28}+4.06\%$
test_step_mdp_speed[False-True-True-True-True] 87.4510μs 51.6256μs 19.3702 KOps/s 19.2177 KOps/s $\color{#35bf28}+0.79\%$
test_step_mdp_speed[False-True-True-True-False] 70.6110μs 31.1362μs 32.1169 KOps/s 31.4180 KOps/s $\color{#35bf28}+2.22\%$
test_step_mdp_speed[False-True-True-False-True] 58.9310μs 32.4671μs 30.8004 KOps/s 30.2293 KOps/s $\color{#35bf28}+1.89\%$
test_step_mdp_speed[False-True-True-False-False] 56.0610μs 18.4629μs 54.1628 KOps/s 53.2705 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[False-True-False-True-True] 2.7101ms 54.1945μs 18.4520 KOps/s 18.0447 KOps/s $\color{#35bf28}+2.26\%$
test_step_mdp_speed[False-True-False-True-False] 79.2210μs 33.8076μs 29.5791 KOps/s 28.7450 KOps/s $\color{#35bf28}+2.90\%$
test_step_mdp_speed[False-True-False-False-True] 62.9410μs 34.5733μs 28.9241 KOps/s 28.3928 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[False-True-False-False-False] 52.4610μs 21.4638μs 46.5901 KOps/s 45.4106 KOps/s $\color{#35bf28}+2.60\%$
test_step_mdp_speed[False-False-True-True-True] 94.0710μs 57.6477μs 17.3467 KOps/s 17.2773 KOps/s $\color{#35bf28}+0.40\%$
test_step_mdp_speed[False-False-True-True-False] 6.1476ms 36.4647μs 27.4238 KOps/s 26.7716 KOps/s $\color{#35bf28}+2.44\%$
test_step_mdp_speed[False-False-True-False-True] 73.2610μs 34.3501μs 29.1120 KOps/s 28.3214 KOps/s $\color{#35bf28}+2.79\%$
test_step_mdp_speed[False-False-True-False-False] 58.6810μs 20.9642μs 47.7004 KOps/s 45.8078 KOps/s $\color{#35bf28}+4.13\%$
test_step_mdp_speed[False-False-False-True-True] 0.1019ms 58.2392μs 17.1706 KOps/s 16.8264 KOps/s $\color{#35bf28}+2.05\%$
test_step_mdp_speed[False-False-False-True-False] 88.8320μs 38.4433μs 26.0123 KOps/s 24.9453 KOps/s $\color{#35bf28}+4.28\%$
test_step_mdp_speed[False-False-False-False-True] 74.3820μs 36.6961μs 27.2508 KOps/s 26.8263 KOps/s $\color{#35bf28}+1.58\%$
test_step_mdp_speed[False-False-False-False-False] 56.7610μs 23.4866μs 42.5775 KOps/s 41.1346 KOps/s $\color{#35bf28}+3.51\%$
test_non_tensor_env_rollout_speed[1000-single-True] 0.7664s 0.7599s 1.3160 Ops/s 1.3028 Ops/s $\color{#35bf28}+1.01\%$
test_non_tensor_env_rollout_speed[1000-single-False] 0.7337s 0.6434s 1.5543 Ops/s 1.5680 Ops/s $\color{#d91a1a}-0.87\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-True] 1.7721s 1.6958s 0.5897 Ops/s 0.5957 Ops/s $\color{#d91a1a}-1.00\%$
test_non_tensor_env_rollout_speed[1000-serial-no-buffers-False] 1.5461s 1.4661s 0.6821 Ops/s 0.6898 Ops/s $\color{#d91a1a}-1.12\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-True] 2.0335s 1.9446s 0.5143 Ops/s 0.5194 Ops/s $\color{#d91a1a}-0.99\%$
test_non_tensor_env_rollout_speed[1000-serial-buffers-False] 1.7867s 1.7119s 0.5841 Ops/s 0.5886 Ops/s $\color{#d91a1a}-0.75\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-True] 4.8115s 4.6598s 0.2146 Ops/s 0.2149 Ops/s $\color{#d91a1a}-0.15\%$
test_non_tensor_env_rollout_speed[1000-parallel-no-buffers-False] 4.5287s 4.4647s 0.2240 Ops/s 0.2241 Ops/s $\color{#d91a1a}-0.03\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-True] 2.1497s 2.0319s 0.4922 Ops/s 0.5119 Ops/s $\color{#d91a1a}-3.86\%$
test_non_tensor_env_rollout_speed[1000-parallel-buffers-False] 1.9125s 1.7427s 0.5738 Ops/s 0.6022 Ops/s $\color{#d91a1a}-4.72\%$
test_values[generalized_advantage_estimate-True-True] 21.6823ms 19.8167ms 50.4625 Ops/s 50.1806 Ops/s $\color{#35bf28}+0.56\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1452s 3.8123ms 262.3068 Ops/s 262.5178 Ops/s $\color{#d91a1a}-0.08\%$
test_values[td0_return_estimate-False-False] 0.1070ms 81.5266μs 12.2659 KOps/s 12.3031 KOps/s $\color{#d91a1a}-0.30\%$
test_values[td1_return_estimate-False-False] 51.8293ms 48.2578ms 20.7221 Ops/s 21.1333 Ops/s $\color{#d91a1a}-1.95\%$
test_values[vec_td1_return_estimate-False-False] 1.3177ms 1.0731ms 931.8788 Ops/s 931.1326 Ops/s $\color{#35bf28}+0.08\%$
test_values[td_lambda_return_estimate-True-False] 83.6223ms 77.4441ms 12.9125 Ops/s 12.8137 Ops/s $\color{#35bf28}+0.77\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3125ms 1.0767ms 928.7817 Ops/s 932.0854 Ops/s $\color{#d91a1a}-0.35\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 22.2994ms 21.6027ms 46.2905 Ops/s 49.8897 Ops/s $\textbf{\color{#d91a1a}-7.21\%}$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0306ms 0.7485ms 1.3360 KOps/s 1.3343 KOps/s $\color{#35bf28}+0.12\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7301ms 0.6624ms 1.5096 KOps/s 1.4998 KOps/s $\color{#35bf28}+0.66\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5407ms 1.4832ms 674.2164 Ops/s 673.2087 Ops/s $\color{#35bf28}+0.15\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.8648ms 0.7150ms 1.3986 KOps/s 1.4453 KOps/s $\color{#d91a1a}-3.23\%$
test_dqn_speed[False-None] 1.7356ms 1.5368ms 650.6979 Ops/s 654.3375 Ops/s $\color{#d91a1a}-0.56\%$
test_dqn_speed[False-backward] 2.3348ms 2.1835ms 457.9845 Ops/s 455.0008 Ops/s $\color{#35bf28}+0.66\%$
test_dqn_speed[True-None] 0.9921ms 0.5348ms 1.8698 KOps/s 1.8490 KOps/s $\color{#35bf28}+1.12\%$
test_dqn_speed[True-backward] 1.2026ms 1.1658ms 857.8032 Ops/s 938.2426 Ops/s $\textbf{\color{#d91a1a}-8.57\%}$
test_dqn_speed[reduce-overhead-None] 0.6925ms 0.5566ms 1.7967 KOps/s 1.7319 KOps/s $\color{#35bf28}+3.74\%$
test_ddpg_speed[False-None] 3.2640ms 2.9115ms 343.4603 Ops/s 342.4569 Ops/s $\color{#35bf28}+0.29\%$
test_ddpg_speed[False-backward] 4.8008ms 4.3376ms 230.5420 Ops/s 239.0292 Ops/s $\color{#d91a1a}-3.55\%$
test_ddpg_speed[True-None] 1.3281ms 1.2742ms 784.8033 Ops/s 779.0345 Ops/s $\color{#35bf28}+0.74\%$
test_ddpg_speed[True-backward] 2.5086ms 2.4456ms 408.8928 Ops/s 431.0413 Ops/s $\textbf{\color{#d91a1a}-5.14\%}$
test_ddpg_speed[reduce-overhead-None] 1.4205ms 1.3275ms 753.3021 Ops/s 762.1871 Ops/s $\color{#d91a1a}-1.17\%$
test_sac_speed[False-None] 8.7258ms 8.2558ms 121.1266 Ops/s 119.4393 Ops/s $\color{#35bf28}+1.41\%$
test_sac_speed[False-backward] 12.2263ms 11.5121ms 86.8648 Ops/s 87.8313 Ops/s $\color{#d91a1a}-1.10\%$
test_sac_speed[True-None] 1.8849ms 1.7629ms 567.2611 Ops/s 563.4436 Ops/s $\color{#35bf28}+0.68\%$
test_sac_speed[True-backward] 3.5753ms 3.5150ms 284.4954 Ops/s 295.3953 Ops/s $\color{#d91a1a}-3.69\%$
test_sac_speed[reduce-overhead-None] 18.5986ms 10.7146ms 93.3307 Ops/s 92.5817 Ops/s $\color{#35bf28}+0.81\%$
test_redq_deprec_speed[False-None] 10.0113ms 9.3443ms 107.0174 Ops/s 106.8710 Ops/s $\color{#35bf28}+0.14\%$
test_redq_deprec_speed[False-backward] 13.2530ms 12.6334ms 79.1551 Ops/s 80.5683 Ops/s $\color{#d91a1a}-1.75\%$
test_redq_deprec_speed[True-None] 2.7768ms 2.4898ms 401.6450 Ops/s 393.8100 Ops/s $\color{#35bf28}+1.99\%$
test_redq_deprec_speed[True-backward] 4.6529ms 4.2355ms 236.0982 Ops/s 237.9083 Ops/s $\color{#d91a1a}-0.76\%$
test_redq_deprec_speed[reduce-overhead-None] 15.6563ms 9.5902ms 104.2727 Ops/s 104.3248 Ops/s $\color{#d91a1a}-0.05\%$
test_td3_speed[False-None] 8.3088ms 8.2230ms 121.6101 Ops/s 121.6255 Ops/s $\color{#d91a1a}-0.01\%$
test_td3_speed[False-backward] 11.3430ms 10.8329ms 92.3117 Ops/s 93.9880 Ops/s $\color{#d91a1a}-1.78\%$
test_td3_speed[True-None] 1.6542ms 1.6020ms 624.2292 Ops/s 633.0147 Ops/s $\color{#d91a1a}-1.39\%$
test_td3_speed[True-backward] 3.2148ms 3.1622ms 316.2342 Ops/s 329.4750 Ops/s $\color{#d91a1a}-4.02\%$
test_td3_speed[reduce-overhead-None] 79.4448ms 23.5129ms 42.5298 Ops/s 42.9071 Ops/s $\color{#d91a1a}-0.88\%$
test_cql_speed[False-None] 17.5070ms 17.2234ms 58.0604 Ops/s 57.8258 Ops/s $\color{#35bf28}+0.41\%$
test_cql_speed[False-backward] 23.3630ms 22.8047ms 43.8506 Ops/s 44.3129 Ops/s $\color{#d91a1a}-1.04\%$
test_cql_speed[True-None] 3.2928ms 3.1526ms 317.1948 Ops/s 315.3389 Ops/s $\color{#35bf28}+0.59\%$
test_cql_speed[True-backward] 6.2062ms 5.3363ms 187.3963 Ops/s 186.7292 Ops/s $\color{#35bf28}+0.36\%$
test_cql_speed[reduce-overhead-None] 19.1742ms 11.8850ms 84.1398 Ops/s 85.7631 Ops/s $\color{#d91a1a}-1.89\%$
test_a2c_speed[False-None] 4.5617ms 3.2275ms 309.8352 Ops/s 307.5202 Ops/s $\color{#35bf28}+0.75\%$
test_a2c_speed[False-backward] 6.7630ms 6.3656ms 157.0948 Ops/s 160.8246 Ops/s $\color{#d91a1a}-2.32\%$
test_a2c_speed[True-None] 1.4586ms 1.3134ms 761.3878 Ops/s 757.0615 Ops/s $\color{#35bf28}+0.57\%$
test_a2c_speed[True-backward] 3.1032ms 3.0436ms 328.5550 Ops/s 326.7547 Ops/s $\color{#35bf28}+0.55\%$
test_a2c_speed[reduce-overhead-None] 1.0472ms 0.9729ms 1.0279 KOps/s 1.0294 KOps/s $\color{#d91a1a}-0.15\%$
test_ppo_speed[False-None] 3.9654ms 3.8481ms 259.8664 Ops/s 261.5320 Ops/s $\color{#d91a1a}-0.64\%$
test_ppo_speed[False-backward] 7.5831ms 7.1478ms 139.9022 Ops/s 139.6001 Ops/s $\color{#35bf28}+0.22\%$
test_ppo_speed[True-None] 1.4826ms 1.3899ms 719.4798 Ops/s 711.6575 Ops/s $\color{#35bf28}+1.10\%$
test_ppo_speed[True-backward] 3.0508ms 2.9973ms 333.6335 Ops/s 331.0741 Ops/s $\color{#35bf28}+0.77\%$
test_ppo_speed[reduce-overhead-None] 1.0954ms 1.0163ms 983.9491 Ops/s 944.8936 Ops/s $\color{#35bf28}+4.13\%$
test_reinforce_speed[False-None] 2.4538ms 2.2976ms 435.2352 Ops/s 431.8742 Ops/s $\color{#35bf28}+0.78\%$
test_reinforce_speed[False-backward] 3.8281ms 3.3385ms 299.5321 Ops/s 303.4212 Ops/s $\color{#d91a1a}-1.28\%$
test_reinforce_speed[True-None] 1.3209ms 1.2310ms 812.3346 Ops/s 795.1588 Ops/s $\color{#35bf28}+2.16\%$
test_reinforce_speed[True-backward] 3.0228ms 2.8802ms 347.1952 Ops/s 346.2114 Ops/s $\color{#35bf28}+0.28\%$
test_reinforce_speed[reduce-overhead-None] 0.4613s 10.1237ms 98.7783 Ops/s 97.7680 Ops/s $\color{#35bf28}+1.03\%$
test_iql_speed[False-None] 10.1919ms 9.4697ms 105.5997 Ops/s 104.3982 Ops/s $\color{#35bf28}+1.15\%$
test_iql_speed[False-backward] 13.4239ms 13.1732ms 75.9118 Ops/s 73.3983 Ops/s $\color{#35bf28}+3.42\%$
test_iql_speed[True-None] 2.2312ms 2.1210ms 471.4656 Ops/s 458.6984 Ops/s $\color{#35bf28}+2.78\%$
test_iql_speed[True-backward] 4.7433ms 4.5837ms 218.1654 Ops/s 215.2045 Ops/s $\color{#35bf28}+1.38\%$
test_iql_speed[reduce-overhead-None] 17.8605ms 10.2708ms 97.3638 Ops/s 76.4865 Ops/s $\textbf{\color{#35bf28}+27.30\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3239ms 5.9735ms 167.4060 Ops/s 166.9398 Ops/s $\color{#35bf28}+0.28\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1888ms 0.3967ms 2.5208 KOps/s 3.2311 KOps/s $\textbf{\color{#d91a1a}-21.98\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6553ms 0.3574ms 2.7982 KOps/s 3.3848 KOps/s $\textbf{\color{#d91a1a}-17.33\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1656ms 5.8123ms 172.0479 Ops/s 171.2738 Ops/s $\color{#35bf28}+0.45\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.7792ms 0.3155ms 3.1698 KOps/s 3.5289 KOps/s $\textbf{\color{#d91a1a}-10.18\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6166ms 0.2954ms 3.3851 KOps/s 3.6351 KOps/s $\textbf{\color{#d91a1a}-6.88\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5469ms 1.2800ms 781.2392 Ops/s 785.3928 Ops/s $\color{#d91a1a}-0.53\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.4167ms 1.1959ms 836.1749 Ops/s 839.2546 Ops/s $\color{#d91a1a}-0.37\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1748ms 5.9191ms 168.9456 Ops/s 165.6512 Ops/s $\color{#35bf28}+1.99\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 0.7822ms 0.4812ms 2.0782 KOps/s 2.0926 KOps/s $\color{#d91a1a}-0.69\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7370ms 0.5145ms 1.9437 KOps/s 2.0831 KOps/s $\textbf{\color{#d91a1a}-6.70\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9559ms 5.7967ms 172.5114 Ops/s 169.7078 Ops/s $\color{#35bf28}+1.65\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.8314ms 0.3528ms 2.8342 KOps/s 3.3105 KOps/s $\textbf{\color{#d91a1a}-14.39\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5663ms 0.3155ms 3.1691 KOps/s 2.9783 KOps/s $\textbf{\color{#35bf28}+6.41\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.1297ms 5.7970ms 172.5026 Ops/s 170.8056 Ops/s $\color{#35bf28}+0.99\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.7990ms 0.3583ms 2.7911 KOps/s 3.5115 KOps/s $\textbf{\color{#d91a1a}-20.52\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6206ms 0.3420ms 2.9237 KOps/s 3.6831 KOps/s $\textbf{\color{#d91a1a}-20.62\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0675ms 5.8999ms 169.4957 Ops/s 165.3444 Ops/s $\color{#35bf28}+2.51\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.8580ms 0.4476ms 2.2341 KOps/s 1.9525 KOps/s $\textbf{\color{#35bf28}+14.42\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7407ms 0.4480ms 2.2320 KOps/s 2.2950 KOps/s $\color{#d91a1a}-2.75\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.5884s 16.7395ms 59.7389 Ops/s 51.2374 Ops/s $\textbf{\color{#35bf28}+16.59\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.4541ms 1.9473ms 513.5204 Ops/s 565.9924 Ops/s $\textbf{\color{#d91a1a}-9.27\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 11.5242ms 1.3782ms 725.5893 Ops/s 746.5899 Ops/s $\color{#d91a1a}-2.81\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.6304ms 5.0789ms 196.8933 Ops/s 192.9492 Ops/s $\color{#35bf28}+2.04\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.5508ms 1.9373ms 516.1874 Ops/s 507.7256 Ops/s $\color{#35bf28}+1.67\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 1.1115ms 0.9295ms 1.0758 KOps/s 778.1276 Ops/s $\textbf{\color{#35bf28}+38.25\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.5619s 16.5011ms 60.6019 Ops/s 50.8772 Ops/s $\textbf{\color{#35bf28}+19.11\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.0992ms 2.0847ms 479.6869 Ops/s 466.8665 Ops/s $\color{#35bf28}+2.75\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.0147ms 1.1085ms 902.1015 Ops/s 926.0147 Ops/s $\color{#d91a1a}-2.58\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 38.9685ms 36.4341ms 27.4468 Ops/s 27.9082 Ops/s $\color{#d91a1a}-1.65\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 20.7762ms 18.4521ms 54.1944 Ops/s 54.6444 Ops/s $\color{#d91a1a}-0.82\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 40.4169ms 36.9024ms 27.0985 Ops/s 26.8876 Ops/s $\color{#35bf28}+0.78\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 19.7501ms 18.4029ms 54.3393 Ops/s 53.0388 Ops/s $\color{#35bf28}+2.45\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 40.2737ms 38.5144ms 25.9643 Ops/s 25.8845 Ops/s $\color{#35bf28}+0.31\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 21.6446ms 19.9589ms 50.1029 Ops/s 50.4622 Ops/s $\color{#d91a1a}-0.71\%$
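
For readers checking the numbers: the last column appears to be the relative difference between the two throughput columns. This sketch assumes the first throughput value is measured on this PR and the second on the base; the column header is not shown in this part of the table. The helper names `to_ops_per_second` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
def to_ops_per_second(cell: str) -> float:
    """Parse a throughput cell such as '1.0758 KOps/s' or '778.1276 Ops/s'."""
    value, unit = cell.split()
    # Only the two units that appear in the rows above are handled.
    scale = {"Ops/s": 1.0, "KOps/s": 1_000.0}[unit]
    return float(value) * scale


def relative_change(pr_cell: str, base_cell: str) -> float:
    """Percent change of the PR throughput relative to the base throughput."""
    pr_ops = to_ops_per_second(pr_cell)
    base_ops = to_ops_per_second(base_cell)
    return 100.0 * (pr_ops - base_ops) / base_ops


# A few rows copied from the table above: (PR throughput, base throughput).
rows = {
    "test_gae_speed[vec_generalized_advantage_estimate-False-32-512]": (
        "1.3986 KOps/s", "1.4453 KOps/s"),
    "test_iql_speed[reduce-overhead-None]": (
        "97.3638 Ops/s", "76.4865 Ops/s"),
    "test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]": (
        "1.0758 KOps/s", "778.1276 Ops/s"),
}

for name, (pr_cell, base_cell) in rows.items():
    print(f"{name}: {relative_change(pr_cell, base_cell):+.2f}%")
```

Under that assumption the computed values agree with the reported changes for these rows to within rounding. In the rows above, bold entries coincide with changes larger than 5% in magnitude, though the exact threshold used by the bot is not stated here.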

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Jan 29, 2026
Add TRANSFORMERS_CHATENV_INTEGRATION.md documenting:
- The data flow between TransformersWrapper and ChatEnv
- The ChatHistory contract (prompt, response, full attributes)
- Common issues and debugging tips
- Related files for reference

Also update LLM_TEST_ISSUES.md to reflect the fixes:
- Issue 10 (Gated HuggingFace Models): Now uses mock tokenizer with SmolLM
- Issue 11 (TransformersWrapper History Output): Fixed by setting full

ghstack-source-id: 8c9722b
Pull-Request: #3377
@github-actions github-actions Bot added Modules Documentation Improvements or additions to documentation labels Jan 29, 2026
@vmoens vmoens merged commit d9d66b6 into gh/vmoens/204/base Jan 29, 2026
57 of 82 checks passed
@vmoens vmoens deleted the gh/vmoens/204/head branch January 29, 2026 10:46

Labels

- CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
- Documentation: Improvements or additions to documentation
- llm/: LLM-related PR, triggers LLM CI tests
- Modules
