@vmoens
Collaborator

@vmoens vmoens commented Apr 11, 2025

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Apr 11, 2025
ghstack-source-id: 329e30d
Pull Request resolved: #1285
@facebook-github-bot added the CLA Signed label Apr 11, 2025
@vmoens
Collaborator Author

vmoens commented Apr 11, 2025

Restrictions:

  • vmap within exported code that uses functional calls does not seem to be supported yet.
  • Using strict=True with functional calls (without vmap) also fails, because the exported code does not see that the params were swapped.

cc @matteobettini @jeguzzi

@vmoens added the bug label Apr 11, 2025
@github-actions

github-actions bot commented Apr 11, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 221. Improved: $\large\color{#35bf28}16$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 53.7510μs 18.7584μs 53.3093 KOps/s 53.4233 KOps/s $\color{#d91a1a}-0.21\%$
test_plain_set_stack_nested 49.2120μs 18.5615μs 53.8751 KOps/s 52.8518 KOps/s $\color{#35bf28}+1.94\%$
test_plain_set_nested_inplace 74.4600μs 20.2527μs 49.3762 KOps/s 48.9441 KOps/s $\color{#35bf28}+0.88\%$
test_plain_set_stack_nested_inplace 79.6890μs 20.2229μs 49.4490 KOps/s 49.1102 KOps/s $\color{#35bf28}+0.69\%$
test_items 24.9170μs 4.1968μs 238.2791 KOps/s 235.5518 KOps/s $\color{#35bf28}+1.16\%$
test_items_nested 0.5130ms 0.3974ms 2.5163 KOps/s 2.4700 KOps/s $\color{#35bf28}+1.87\%$
test_items_nested_locked 0.8052ms 0.3965ms 2.5219 KOps/s 2.4657 KOps/s $\color{#35bf28}+2.28\%$
test_items_nested_leaf 0.1664ms 77.3143μs 12.9342 KOps/s 12.9393 KOps/s $\color{#d91a1a}-0.04\%$
test_items_stack_nested 0.6598ms 0.4003ms 2.4981 KOps/s 2.4499 KOps/s $\color{#35bf28}+1.97\%$
test_items_stack_nested_leaf 0.1674ms 77.2972μs 12.9371 KOps/s 12.8542 KOps/s $\color{#35bf28}+0.64\%$
test_items_stack_nested_locked 0.5740ms 0.4009ms 2.4944 KOps/s 2.4225 KOps/s $\color{#35bf28}+2.97\%$
test_keys 29.6460μs 3.4854μs 286.9107 KOps/s 288.5859 KOps/s $\color{#d91a1a}-0.58\%$
test_keys_nested 0.2622ms 0.1622ms 6.1655 KOps/s 5.9996 KOps/s $\color{#35bf28}+2.76\%$
test_keys_nested_locked 0.7879ms 0.1693ms 5.9069 KOps/s 5.8034 KOps/s $\color{#35bf28}+1.78\%$
test_keys_nested_leaf 0.2346ms 0.1427ms 7.0060 KOps/s 6.8715 KOps/s $\color{#35bf28}+1.96\%$
test_keys_stack_nested 0.2616ms 0.1629ms 6.1382 KOps/s 5.9834 KOps/s $\color{#35bf28}+2.59\%$
test_keys_stack_nested_leaf 0.3055ms 0.1434ms 6.9721 KOps/s 6.8435 KOps/s $\color{#35bf28}+1.88\%$
test_keys_stack_nested_locked 0.2696ms 0.1686ms 5.9317 KOps/s 5.7545 KOps/s $\color{#35bf28}+3.08\%$
test_values 9.4036μs 1.0430μs 958.7339 KOps/s 961.2134 KOps/s $\color{#d91a1a}-0.26\%$
test_values_nested 0.1170ms 62.4315μs 16.0176 KOps/s 15.7180 KOps/s $\color{#35bf28}+1.91\%$
test_values_nested_locked 0.1172ms 61.9947μs 16.1304 KOps/s 15.6301 KOps/s $\color{#35bf28}+3.20\%$
test_values_nested_leaf 0.1214ms 70.5636μs 14.1716 KOps/s 13.8342 KOps/s $\color{#35bf28}+2.44\%$
test_values_stack_nested 0.1190ms 61.8957μs 16.1562 KOps/s 15.8257 KOps/s $\color{#35bf28}+2.09\%$
test_values_stack_nested_leaf 0.1423ms 69.9389μs 14.2982 KOps/s 13.8107 KOps/s $\color{#35bf28}+3.53\%$
test_values_stack_nested_locked 0.1769ms 61.6540μs 16.2195 KOps/s 15.6663 KOps/s $\color{#35bf28}+3.53\%$
test_membership 2.5788μs 0.7087μs 1.4110 MOps/s 1.4155 MOps/s $\color{#d91a1a}-0.32\%$
test_membership_nested 36.6790μs 2.9313μs 341.1410 KOps/s 348.6970 KOps/s $\color{#d91a1a}-2.17\%$
test_membership_nested_leaf 28.9540μs 2.9361μs 340.5921 KOps/s 350.6064 KOps/s $\color{#d91a1a}-2.86\%$
test_membership_stacked_nested 40.8970μs 2.8457μs 351.4030 KOps/s 350.3304 KOps/s $\color{#35bf28}+0.31\%$
test_membership_stacked_nested_leaf 18.2650μs 2.8961μs 345.2912 KOps/s 347.3774 KOps/s $\color{#d91a1a}-0.60\%$
test_membership_nested_last 40.4360μs 4.3120μs 231.9122 KOps/s 231.6489 KOps/s $\color{#35bf28}+0.11\%$
test_membership_nested_leaf_last 30.8880μs 4.3352μs 230.6724 KOps/s 226.7691 KOps/s $\color{#35bf28}+1.72\%$
test_membership_stacked_nested_last 44.4230μs 4.2463μs 235.5001 KOps/s 231.1860 KOps/s $\color{#35bf28}+1.87\%$
test_membership_stacked_nested_leaf_last 22.2320μs 4.2652μs 234.4558 KOps/s 231.8247 KOps/s $\color{#35bf28}+1.13\%$
test_nested_getleaf 84.4690μs 16.4800μs 60.6797 KOps/s 60.4826 KOps/s $\color{#35bf28}+0.33\%$
test_nested_get 45.5450μs 15.7992μs 63.2944 KOps/s 64.1656 KOps/s $\color{#d91a1a}-1.36\%$
test_stacked_getleaf 48.4910μs 16.4833μs 60.6673 KOps/s 61.0884 KOps/s $\color{#d91a1a}-0.69\%$
test_stacked_get 40.4660μs 15.6690μs 63.8204 KOps/s 63.5681 KOps/s $\color{#35bf28}+0.40\%$
test_nested_getitemleaf 45.6760μs 16.9265μs 59.0789 KOps/s 57.9222 KOps/s $\color{#35bf28}+2.00\%$
test_nested_getitem 48.7210μs 16.1940μs 61.7513 KOps/s 61.4763 KOps/s $\color{#35bf28}+0.45\%$
test_stacked_getitemleaf 44.8540μs 17.1902μs 58.1726 KOps/s 58.0340 KOps/s $\color{#35bf28}+0.24\%$
test_stacked_getitem 49.3020μs 16.1908μs 61.7635 KOps/s 61.7899 KOps/s $\color{#d91a1a}-0.04\%$
test_lock_nested 6.2249ms 0.4278ms 2.3373 KOps/s 2.3123 KOps/s $\color{#35bf28}+1.08\%$
test_lock_stack_nested 0.7444ms 0.4181ms 2.3919 KOps/s 2.3658 KOps/s $\color{#35bf28}+1.10\%$
test_unlock_nested 0.6165ms 0.3528ms 2.8343 KOps/s 2.8635 KOps/s $\color{#d91a1a}-1.02\%$
test_unlock_stack_nested 0.5594ms 0.3414ms 2.9289 KOps/s 2.9130 KOps/s $\color{#35bf28}+0.55\%$
test_flatten_speed 0.2355ms 99.9870μs 10.0013 KOps/s 9.7180 KOps/s $\color{#35bf28}+2.92\%$
test_unflatten_speed 0.7948ms 0.5829ms 1.7157 KOps/s 1.6826 KOps/s $\color{#35bf28}+1.97\%$
test_common_ops 6.7094ms 0.8287ms 1.2067 KOps/s 1.2022 KOps/s $\color{#35bf28}+0.37\%$
test_creation 61.0140μs 2.4432μs 409.3066 KOps/s 399.8454 KOps/s $\color{#35bf28}+2.37\%$
test_creation_empty 0.5261ms 10.0329μs 99.6718 KOps/s 97.6401 KOps/s $\color{#35bf28}+2.08\%$
test_creation_nested_1 0.1252ms 14.7282μs 67.8970 KOps/s 68.0223 KOps/s $\color{#d91a1a}-0.18\%$
test_creation_nested_2 0.1737ms 19.5558μs 51.1358 KOps/s 49.3647 KOps/s $\color{#35bf28}+3.59\%$
test_clone 0.1295ms 13.9260μs 71.8084 KOps/s 74.1713 KOps/s $\color{#d91a1a}-3.19\%$
test_getitem[int] 0.2033ms 12.5776μs 79.5066 KOps/s 77.7156 KOps/s $\color{#35bf28}+2.30\%$
test_getitem[slice_int] 0.1715ms 24.2132μs 41.2998 KOps/s 38.9345 KOps/s $\textbf{\color{#35bf28}+6.07\%}$
test_getitem[range] 0.2809ms 51.1616μs 19.5459 KOps/s 20.3065 KOps/s $\color{#d91a1a}-3.75\%$
test_getitem[tuple] 0.1504ms 19.9929μs 50.0178 KOps/s 48.5223 KOps/s $\color{#35bf28}+3.08\%$
test_getitem[list] 0.3464ms 46.3442μs 21.5777 KOps/s 22.1910 KOps/s $\color{#d91a1a}-2.76\%$
test_setitem_dim[int] 79.3190μs 26.7402μs 37.3969 KOps/s 38.8294 KOps/s $\color{#d91a1a}-3.69\%$
test_setitem_dim[slice_int] 0.1011ms 51.6552μs 19.3591 KOps/s 19.5078 KOps/s $\color{#d91a1a}-0.76\%$
test_setitem_dim[range] 0.1269ms 74.7934μs 13.3702 KOps/s 13.2195 KOps/s $\color{#35bf28}+1.14\%$
test_setitem_dim[tuple] 91.4820μs 40.9618μs 24.4130 KOps/s 24.5721 KOps/s $\color{#d91a1a}-0.65\%$
test_setitem 0.3268ms 20.2022μs 49.4995 KOps/s 48.7914 KOps/s $\color{#35bf28}+1.45\%$
test_set 0.2328ms 19.4677μs 51.3672 KOps/s 51.0015 KOps/s $\color{#35bf28}+0.72\%$
test_set_shared 0.4536ms 0.1818ms 5.5008 KOps/s 5.5904 KOps/s $\color{#d91a1a}-1.60\%$
test_update 0.2765ms 24.0104μs 41.6486 KOps/s 41.0209 KOps/s $\color{#35bf28}+1.53\%$
test_update_nested 0.1546ms 40.8062μs 24.5061 KOps/s 23.6577 KOps/s $\color{#35bf28}+3.59\%$
test_update__nested 0.1365ms 34.0902μs 29.3339 KOps/s 29.2286 KOps/s $\color{#35bf28}+0.36\%$
test_set_nested 0.1906ms 22.6582μs 44.1341 KOps/s 46.1124 KOps/s $\color{#d91a1a}-4.29\%$
test_set_nested_new 0.1625ms 27.9816μs 35.7378 KOps/s 36.4974 KOps/s $\color{#d91a1a}-2.08\%$
test_select 0.1846ms 43.5284μs 22.9735 KOps/s 22.7721 KOps/s $\color{#35bf28}+0.88\%$
test_select_nested 0.1374ms 62.8505μs 15.9108 KOps/s 16.0065 KOps/s $\color{#d91a1a}-0.60\%$
test_exclude_nested 0.1532ms 79.7859μs 12.5335 KOps/s 12.2607 KOps/s $\color{#35bf28}+2.23\%$
test_empty[True] 0.7929ms 0.4081ms 2.4503 KOps/s 2.4407 KOps/s $\color{#35bf28}+0.39\%$
test_empty[False] 11.0933μs 1.3536μs 738.7560 KOps/s 732.4198 KOps/s $\color{#35bf28}+0.87\%$
test_unbind_speed 0.4402ms 0.2724ms 3.6711 KOps/s 3.6901 KOps/s $\color{#d91a1a}-0.51\%$
test_unbind_speed_stack0 0.5815ms 0.2686ms 3.7228 KOps/s 3.7318 KOps/s $\color{#d91a1a}-0.24\%$
test_unbind_speed_stack1 0.1089s 0.7480ms 1.3369 KOps/s 1.2192 KOps/s $\textbf{\color{#35bf28}+9.66\%}$
test_split 0.1169s 1.9656ms 508.7476 Ops/s 560.8031 Ops/s $\textbf{\color{#d91a1a}-9.28\%}$
test_chunk 1.8011ms 1.5863ms 630.3833 Ops/s 558.5050 Ops/s $\textbf{\color{#35bf28}+12.87\%}$
test_consolidate_njt[False-None] 10.9715ms 8.2768ms 120.8193 Ops/s 122.2954 Ops/s $\color{#d91a1a}-1.21\%$
test_creation[device0] 0.2681ms 91.0966μs 10.9774 KOps/s 11.0178 KOps/s $\color{#d91a1a}-0.37\%$
test_creation_from_tensor 3.5451ms 95.0505μs 10.5207 KOps/s 10.8722 KOps/s $\color{#d91a1a}-3.23\%$
test_add_one[memmap_tensor0] 0.1206ms 5.0705μs 197.2192 KOps/s 205.8425 KOps/s $\color{#d91a1a}-4.19\%$
test_contiguous[memmap_tensor0] 33.3520μs 0.5180μs 1.9306 MOps/s 1.9847 MOps/s $\color{#d91a1a}-2.72\%$
test_stack[memmap_tensor0] 34.9060μs 3.5383μs 282.6202 KOps/s 299.7375 KOps/s $\textbf{\color{#d91a1a}-5.71\%}$
test_memmaptd_index 1.8326ms 0.2332ms 4.2876 KOps/s 4.3365 KOps/s $\color{#d91a1a}-1.13\%$
test_memmaptd_index_astensor 0.1222s 0.3677ms 2.7198 KOps/s 3.1252 KOps/s $\textbf{\color{#d91a1a}-12.97\%}$
test_memmaptd_index_op 1.1041ms 0.5570ms 1.7952 KOps/s 1.8243 KOps/s $\color{#d91a1a}-1.60\%$
test_serialize_model 0.1264s 0.1183s 8.4544 Ops/s 8.6580 Ops/s $\color{#d91a1a}-2.35\%$
test_serialize_model_pickle 0.4667s 0.3996s 2.5022 Ops/s 2.5768 Ops/s $\color{#d91a1a}-2.89\%$
test_serialize_weights 0.1255s 0.1154s 8.6651 Ops/s 7.4548 Ops/s $\textbf{\color{#35bf28}+16.24\%}$
test_serialize_weights_returnearly 0.1858s 0.1627s 6.1478 Ops/s 6.4980 Ops/s $\textbf{\color{#d91a1a}-5.39\%}$
test_serialize_weights_pickle 0.5811s 0.4347s 2.3006 Ops/s 2.3294 Ops/s $\color{#d91a1a}-1.24\%$
test_serialize_weights_filesystem 0.1471s 0.1423s 7.0282 Ops/s 7.0918 Ops/s $\color{#d91a1a}-0.90\%$
test_serialize_model_filesystem 0.1597s 0.1530s 6.5373 Ops/s 5.8819 Ops/s $\textbf{\color{#35bf28}+11.14\%}$
test_reshape_pytree 56.6860μs 26.1821μs 38.1941 KOps/s 37.9049 KOps/s $\color{#35bf28}+0.76\%$
test_reshape_td 89.1970μs 32.5745μs 30.6989 KOps/s 29.2427 KOps/s $\color{#35bf28}+4.98\%$
test_view_pytree 70.4020μs 26.1834μs 38.1922 KOps/s 38.0318 KOps/s $\color{#35bf28}+0.42\%$
test_view_td 98.9760μs 39.8910μs 25.0683 KOps/s 23.1661 KOps/s $\textbf{\color{#35bf28}+8.21\%}$
test_unbind_pytree 68.7690μs 29.5170μs 33.8787 KOps/s 33.9790 KOps/s $\color{#d91a1a}-0.30\%$
test_unbind_td 0.3654ms 39.8692μs 25.0820 KOps/s 24.7074 KOps/s $\color{#35bf28}+1.52\%$
test_split_pytree 85.1990μs 29.1017μs 34.3623 KOps/s 34.3865 KOps/s $\color{#d91a1a}-0.07\%$
test_split_td 0.5684ms 44.8728μs 22.2852 KOps/s 22.1088 KOps/s $\color{#35bf28}+0.80\%$
test_add_pytree 0.1259ms 36.3647μs 27.4992 KOps/s 28.7305 KOps/s $\color{#d91a1a}-4.29\%$
test_add_td 0.3148ms 57.9110μs 17.2679 KOps/s 17.1670 KOps/s $\color{#35bf28}+0.59\%$
test_compile_add_one_nested[tensordict-compile] 0.1448ms 66.0931μs 15.1302 KOps/s 15.0613 KOps/s $\color{#35bf28}+0.46\%$
test_compile_add_one_nested[tensordict-eager] 0.4083ms 0.1814ms 5.5140 KOps/s 5.5412 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_add_one_nested[pytree-compile] 0.1249ms 46.1757μs 21.6564 KOps/s 22.2242 KOps/s $\color{#d91a1a}-2.55\%$
test_compile_add_one_nested[pytree-eager] 0.2576ms 0.1180ms 8.4721 KOps/s 8.4536 KOps/s $\color{#35bf28}+0.22\%$
test_compile_copy_nested[tensordict-compile] 74.9000μs 28.3206μs 35.3100 KOps/s 36.3848 KOps/s $\color{#d91a1a}-2.95\%$
test_compile_copy_nested[tensordict-eager] 0.1199ms 63.4464μs 15.7613 KOps/s 15.6677 KOps/s $\color{#35bf28}+0.60\%$
test_compile_copy_nested[pytree-compile] 0.1517ms 79.4403μs 12.5881 KOps/s 12.6492 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_copy_nested[pytree-eager] 0.1521ms 65.9757μs 15.1571 KOps/s 14.9252 KOps/s $\color{#35bf28}+1.55\%$
test_compile_add_one_flat[tensordict-compile] 0.2808ms 0.1079ms 9.2666 KOps/s 9.4932 KOps/s $\color{#d91a1a}-2.39\%$
test_compile_add_one_flat[tensordict-eager] 0.4272ms 0.2143ms 4.6671 KOps/s 4.6058 KOps/s $\color{#35bf28}+1.33\%$
test_compile_add_one_flat[tensorclass-compile] 0.1443ms 46.4051μs 21.5493 KOps/s 22.2372 KOps/s $\color{#d91a1a}-3.09\%$
test_compile_add_one_flat[tensorclass-eager] 0.1749ms 68.5046μs 14.5976 KOps/s 14.5288 KOps/s $\color{#35bf28}+0.47\%$
test_compile_add_one_flat[pytree-compile] 0.2420ms 0.1021ms 9.7953 KOps/s 9.9422 KOps/s $\color{#d91a1a}-1.48\%$
test_compile_add_one_flat[pytree-eager] 0.4231ms 0.2037ms 4.9087 KOps/s 5.0270 KOps/s $\color{#d91a1a}-2.35\%$
test_compile_add_self_flat[tensordict-eager] 0.4375ms 0.2304ms 4.3408 KOps/s 4.2568 KOps/s $\color{#35bf28}+1.97\%$
test_compile_add_self_flat[tensordict-compile] 0.2285ms 0.1090ms 9.1703 KOps/s 9.2889 KOps/s $\color{#d91a1a}-1.28\%$
test_compile_add_self_flat[tensorclass-eager] 0.1677ms 67.0278μs 14.9192 KOps/s 15.4568 KOps/s $\color{#d91a1a}-3.48\%$
test_compile_add_self_flat[tensorclass-compile] 0.1306ms 47.9958μs 20.8352 KOps/s 21.3268 KOps/s $\color{#d91a1a}-2.31\%$
test_compile_add_self_flat[pytree-eager] 0.2907ms 0.1578ms 6.3365 KOps/s 6.1178 KOps/s $\color{#35bf28}+3.58\%$
test_compile_add_self_flat[pytree-compile] 0.1944ms 0.1010ms 9.9048 KOps/s 10.1056 KOps/s $\color{#d91a1a}-1.99\%$
test_compile_copy_flat[tensordict-compile] 55.9450μs 21.6582μs 46.1720 KOps/s 47.2233 KOps/s $\color{#d91a1a}-2.23\%$
test_compile_copy_flat[tensordict-eager] 0.1571ms 66.9326μs 14.9404 KOps/s 14.7057 KOps/s $\color{#35bf28}+1.60\%$
test_compile_copy_flat[pytree-compile] 0.1549ms 82.0622μs 12.1859 KOps/s 11.9258 KOps/s $\color{#35bf28}+2.18\%$
test_compile_copy_flat[pytree-eager] 0.1263ms 66.6234μs 15.0097 KOps/s 14.6621 KOps/s $\color{#35bf28}+2.37\%$
test_compile_assign_and_add[tensordict-compile] 0.3283ms 0.2185ms 4.5761 KOps/s 4.6308 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_assign_and_add[tensordict-eager] 1.8389ms 1.4738ms 678.4986 Ops/s 678.8403 Ops/s $\color{#d91a1a}-0.05\%$
test_compile_assign_and_add[pytree-compile] 0.4332ms 0.2129ms 4.6969 KOps/s 4.8061 KOps/s $\color{#d91a1a}-2.27\%$
test_compile_assign_and_add[pytree-eager] 1.1966ms 0.8350ms 1.1975 KOps/s 1.2175 KOps/s $\color{#d91a1a}-1.64\%$
test_compile_assign_and_add_stack[compile] 0.6561ms 0.4693ms 2.1309 KOps/s 2.1576 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_assign_and_add_stack[eager] 2.9121ms 2.5526ms 391.7580 Ops/s 395.7689 Ops/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[tensor-tensordict-compile] 0.1064ms 36.6476μs 27.2869 KOps/s 28.1825 KOps/s $\color{#d91a1a}-3.18\%$
test_compile_indexing[tensor-tensordict-eager] 0.6831ms 35.3867μs 28.2592 KOps/s 28.9418 KOps/s $\color{#d91a1a}-2.36\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1085ms 32.3354μs 30.9259 KOps/s 33.4017 KOps/s $\textbf{\color{#d91a1a}-7.41\%}$
test_compile_indexing[tensor-tensorclass-eager] 97.9540μs 23.1567μs 43.1840 KOps/s 44.6259 KOps/s $\color{#d91a1a}-3.23\%$
test_compile_indexing[tensor-pytree-compile] 0.1049ms 32.2168μs 31.0397 KOps/s 32.5701 KOps/s $\color{#d91a1a}-4.70\%$
test_compile_indexing[tensor-pytree-eager] 0.1079ms 23.5763μs 42.4155 KOps/s 44.2721 KOps/s $\color{#d91a1a}-4.19\%$
test_compile_indexing[slice-tensordict-compile] 0.1579ms 51.4637μs 19.4312 KOps/s 19.7107 KOps/s $\color{#d91a1a}-1.42\%$
test_compile_indexing[slice-tensordict-eager] 0.5191ms 20.8782μs 47.8967 KOps/s 44.9268 KOps/s $\textbf{\color{#35bf28}+6.61\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1045ms 45.2710μs 22.0892 KOps/s 21.7643 KOps/s $\color{#35bf28}+1.49\%$
test_compile_indexing[slice-tensorclass-eager] 83.4260μs 18.8746μs 52.9813 KOps/s 53.6119 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_indexing[slice-pytree-compile] 0.3156ms 47.1691μs 21.2003 KOps/s 21.7061 KOps/s $\color{#d91a1a}-2.33\%$
test_compile_indexing[slice-pytree-eager] 58.1690μs 18.7115μs 53.4430 KOps/s 53.3550 KOps/s $\color{#35bf28}+0.17\%$
test_compile_indexing[int-tensordict-compile] 0.1422ms 51.9654μs 19.2436 KOps/s 19.1781 KOps/s $\color{#35bf28}+0.34\%$
test_compile_indexing[int-tensordict-eager] 1.0307ms 20.8273μs 48.0138 KOps/s 46.0833 KOps/s $\color{#35bf28}+4.19\%$
test_compile_indexing[int-tensorclass-compile] 0.2874ms 45.2600μs 22.0946 KOps/s 21.7040 KOps/s $\color{#35bf28}+1.80\%$
test_compile_indexing[int-tensorclass-eager] 0.3005ms 18.8868μs 52.9470 KOps/s 53.7839 KOps/s $\color{#d91a1a}-1.56\%$
test_compile_indexing[int-pytree-compile] 0.1163ms 45.9537μs 21.7610 KOps/s 21.6668 KOps/s $\color{#35bf28}+0.44\%$
test_compile_indexing[int-pytree-eager] 61.4250μs 18.7217μs 53.4141 KOps/s 53.8509 KOps/s $\color{#d91a1a}-0.81\%$
test_mod_add[eager] 86.2010μs 33.4031μs 29.9373 KOps/s 28.4976 KOps/s $\textbf{\color{#35bf28}+5.05\%}$
test_mod_add[compile] 0.1240ms 64.3675μs 15.5358 KOps/s 15.6642 KOps/s $\color{#d91a1a}-0.82\%$
test_mod_add[compile-overhead] 0.1306ms 62.8325μs 15.9153 KOps/s 15.3391 KOps/s $\color{#35bf28}+3.76\%$
test_mod_wrap[eager] 0.3432ms 0.2207ms 4.5310 KOps/s 4.5787 KOps/s $\color{#d91a1a}-1.04\%$
test_mod_wrap[compile] 2.2946ms 0.2263ms 4.4198 KOps/s 4.3633 KOps/s $\color{#35bf28}+1.29\%$
test_mod_wrap[compile-overhead] 0.3167ms 0.2239ms 4.4659 KOps/s 4.4526 KOps/s $\color{#35bf28}+0.30\%$
test_mod_wrap_and_backward[eager] 16.6486ms 12.8783ms 77.6500 Ops/s 89.3243 Ops/s $\textbf{\color{#d91a1a}-13.07\%}$
test_mod_wrap_and_backward[compile] 13.6778ms 11.3718ms 87.9367 Ops/s 92.9158 Ops/s $\textbf{\color{#d91a1a}-5.36\%}$
test_mod_wrap_and_backward[compile-overhead] 13.7407ms 11.3076ms 88.4361 Ops/s 85.2653 Ops/s $\color{#35bf28}+3.72\%$
test_seq_add[eager] 0.3160ms 0.1273ms 7.8550 KOps/s 7.7573 KOps/s $\color{#35bf28}+1.26\%$
test_seq_add[compile] 0.1512ms 77.7730μs 12.8579 KOps/s 12.6215 KOps/s $\color{#35bf28}+1.87\%$
test_seq_add[compile-overhead] 0.1700ms 75.5421μs 13.2376 KOps/s 13.0175 KOps/s $\color{#35bf28}+1.69\%$
test_seq_wrap[eager] 1.0397ms 0.4418ms 2.2633 KOps/s 2.2625 KOps/s $\color{#35bf28}+0.04\%$
test_seq_wrap[compile] 0.4384ms 0.2407ms 4.1546 KOps/s 4.0868 KOps/s $\color{#35bf28}+1.66\%$
test_seq_wrap[compile-overhead] 0.3849ms 0.2452ms 4.0778 KOps/s 4.0763 KOps/s $\color{#35bf28}+0.04\%$
test_func_call_runtime[False-eager] 0.9426ms 0.5377ms 1.8599 KOps/s 1.7980 KOps/s $\color{#35bf28}+3.45\%$
test_func_call_runtime[False-compile] 0.9073ms 0.4485ms 2.2298 KOps/s 2.2219 KOps/s $\color{#35bf28}+0.35\%$
test_func_call_runtime[False-compile-overhead] 0.8476ms 0.4470ms 2.2373 KOps/s 2.2482 KOps/s $\color{#d91a1a}-0.49\%$
test_func_call_runtime[True-eager] 1.2309ms 0.7619ms 1.3125 KOps/s 1.3138 KOps/s $\color{#d91a1a}-0.10\%$
test_func_call_runtime[True-compile] 0.7329ms 0.4641ms 2.1549 KOps/s 2.1500 KOps/s $\color{#35bf28}+0.23\%$
test_func_call_runtime[True-compile-overhead] 0.7796ms 0.4633ms 2.1583 KOps/s 2.0960 KOps/s $\color{#35bf28}+2.97\%$
test_func_call_cm_runtime[False-eager] 0.9362ms 0.5419ms 1.8454 KOps/s 1.8094 KOps/s $\color{#35bf28}+1.99\%$
test_func_call_cm_runtime[False-compile] 0.9506ms 0.4455ms 2.2449 KOps/s 2.2554 KOps/s $\color{#d91a1a}-0.47\%$
test_func_call_cm_runtime[False-compile-overhead] 0.7322ms 0.4464ms 2.2404 KOps/s 2.2709 KOps/s $\color{#d91a1a}-1.35\%$
test_func_call_cm_runtime[True-eager] 1.4567ms 0.9016ms 1.1091 KOps/s 1.0939 KOps/s $\color{#35bf28}+1.39\%$
test_func_call_cm_runtime[True-compile] 1.2034ms 0.7984ms 1.2524 KOps/s 1.2451 KOps/s $\color{#35bf28}+0.59\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9864ms 0.7997ms 1.2505 KOps/s 1.2115 KOps/s $\color{#35bf28}+3.22\%$
test_vmap_func_call_cm_runtime[eager] 2.6147ms 1.8944ms 527.8780 Ops/s 511.5821 Ops/s $\color{#35bf28}+3.19\%$
test_vmap_func_call_cm_runtime[compile] 0.8011ms 0.5336ms 1.8739 KOps/s 1.8578 KOps/s $\color{#35bf28}+0.87\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8870ms 0.5369ms 1.8624 KOps/s 1.8676 KOps/s $\color{#d91a1a}-0.28\%$
test_distributed 0.4026ms 0.1228ms 8.1409 KOps/s 7.8695 KOps/s $\color{#35bf28}+3.45\%$
test_tdmodule 46.2470μs 26.1300μs 38.2701 KOps/s 37.5679 KOps/s $\color{#35bf28}+1.87\%$
test_tdmodule_dispatch 0.1769ms 50.5977μs 19.7637 KOps/s 19.3058 KOps/s $\color{#35bf28}+2.37\%$
test_tdseq 48.4810μs 28.1364μs 35.5412 KOps/s 35.2877 KOps/s $\color{#35bf28}+0.72\%$
test_tdseq_dispatch 88.4150μs 55.0771μs 18.1564 KOps/s 18.0178 KOps/s $\color{#35bf28}+0.77\%$
test_instantiation_functorch 2.1082ms 1.5352ms 651.3959 Ops/s 650.7719 Ops/s $\color{#35bf28}+0.10\%$
test_exec_functorch 0.3669ms 0.1783ms 5.6087 KOps/s 5.4433 KOps/s $\color{#35bf28}+3.04\%$
test_exec_functional_call 0.3298ms 0.1755ms 5.6966 KOps/s 5.8034 KOps/s $\color{#d91a1a}-1.84\%$
test_exec_td_decorator 0.4599ms 0.2367ms 4.2254 KOps/s 4.3641 KOps/s $\color{#d91a1a}-3.18\%$
test_vmap_mlp_speed_decorator[True-True] 0.9836ms 0.6526ms 1.5323 KOps/s 1.4744 KOps/s $\color{#35bf28}+3.93\%$
test_vmap_mlp_speed_decorator[True-False] 0.9906ms 0.6531ms 1.5312 KOps/s 1.4904 KOps/s $\color{#35bf28}+2.74\%$
test_vmap_mlp_speed_decorator[False-True] 0.8179ms 0.5301ms 1.8864 KOps/s 1.8518 KOps/s $\color{#35bf28}+1.87\%$
test_vmap_mlp_speed_decorator[False-False] 0.6913ms 0.5254ms 1.9034 KOps/s 1.8601 KOps/s $\color{#35bf28}+2.33\%$
test_to_module_speed[True] 1.6969ms 1.3162ms 759.7776 Ops/s 734.0869 Ops/s $\color{#35bf28}+3.50\%$
test_to_module_speed[False] 1.7925ms 1.2902ms 775.0498 Ops/s 771.7309 Ops/s $\color{#35bf28}+0.43\%$
test_tc_init 0.1733ms 45.2080μs 22.1200 KOps/s 20.6682 KOps/s $\textbf{\color{#35bf28}+7.02\%}$
test_tc_init_tensor_only 0.1321ms 14.3191μs 69.8368 KOps/s 66.2803 KOps/s $\textbf{\color{#35bf28}+5.37\%}$
test_tc_init_nested 0.2152ms 89.7849μs 11.1377 KOps/s 10.4441 KOps/s $\textbf{\color{#35bf28}+6.64\%}$
test_tc_first_layer_tensor 42.3890μs 1.6137μs 619.6996 KOps/s 615.1788 KOps/s $\color{#35bf28}+0.73\%$
test_tc_first_layer_tensor_only 4.6237μs 0.8975μs 1.1142 MOps/s 1.1169 MOps/s $\color{#d91a1a}-0.24\%$
test_tc_first_layer_tensor_set 19.8270μs 4.1044μs 243.6399 KOps/s 240.5011 KOps/s $\color{#35bf28}+1.31\%$
test_tc_first_layer_tensor_only_set 47.3780μs 2.5706μs 389.0083 KOps/s 365.3466 KOps/s $\textbf{\color{#35bf28}+6.48\%}$
test_tc_first_layer_nontensor 18.0840μs 4.7513μs 210.4695 KOps/s 212.8869 KOps/s $\color{#d91a1a}-1.14\%$
test_tc_second_layer_tensor 51.9470μs 3.0230μs 330.7933 KOps/s 324.7929 KOps/s $\color{#35bf28}+1.85\%$
test_tc_second_layer_nontensor 32.7620μs 6.2324μs 160.4521 KOps/s 160.0896 KOps/s $\color{#35bf28}+0.23\%$
test_unbind 0.2571s 14.1796ms 70.5240 Ops/s 66.2939 Ops/s $\textbf{\color{#35bf28}+6.38\%}$
test_full_like 12.9100ms 5.1770ms 193.1629 Ops/s 179.9378 Ops/s $\textbf{\color{#35bf28}+7.35\%}$
test_zeros_like 5.7705ms 2.7136ms 368.5175 Ops/s 346.0094 Ops/s $\textbf{\color{#35bf28}+6.51\%}$
test_ones_like 5.8972ms 3.4961ms 286.0300 Ops/s 194.8314 Ops/s $\textbf{\color{#35bf28}+46.81\%}$
test_clone 6.8965ms 5.2917ms 188.9766 Ops/s 180.0756 Ops/s $\color{#35bf28}+4.94\%$
test_squeeze 74.7400μs 12.3524μs 80.9557 KOps/s 79.6850 KOps/s $\color{#35bf28}+1.59\%$
test_unsqueeze 0.1489ms 92.1952μs 10.8466 KOps/s 10.3835 KOps/s $\color{#35bf28}+4.46\%$
test_split 0.5101ms 0.1963ms 5.0951 KOps/s 5.1214 KOps/s $\color{#d91a1a}-0.51\%$
test_permute 0.3572ms 0.2004ms 4.9904 KOps/s 4.7649 KOps/s $\color{#35bf28}+4.73\%$
test_stack 32.3805ms 24.8446ms 40.2502 Ops/s 39.2995 Ops/s $\color{#35bf28}+2.42\%$
test_cat 30.2717ms 25.4031ms 39.3652 Ops/s 39.7212 Ops/s $\color{#d91a1a}-0.90\%$
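
The Change column appears to be the relative difference between the Ops value for this PR and the Ops value on repo HEAD; the rows above are consistent with that reading. A minimal sketch under that assumption (the helper name `relative_change` is hypothetical and not part of the benchmark tooling), using values copied from three rows of the table:

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change in throughput relative to repo HEAD.

    Assumes both values are expressed in the same unit (e.g. both KOps/s).
    """
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
rows = {
    "test_plain_set_stack_nested": (53.8751, 52.8518),  # reported +1.94%
    "test_chunk": (630.3833, 558.5050),                 # reported +12.87%
    "test_split": (508.7476, 560.8031),                 # reported -9.28%
}

for name, (ops, ops_head) in rows.items():
    print(f"{name}: {relative_change(ops, ops_head):+.2f}%")
```

Because each figure comes from a single benchmark run, changes of a few percent may sit within measurement noise.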

@github-actions

github-actions bot commented Apr 11, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 233. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}13$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 44.7810μs 11.4134μs 87.6166 KOps/s 88.1207 KOps/s $\color{#d91a1a}-0.57\%$
test_plain_set_stack_nested 40.8610μs 11.4714μs 87.1731 KOps/s 87.4010 KOps/s $\color{#d91a1a}-0.26\%$
test_plain_set_nested_inplace 69.0810μs 12.5113μs 79.9276 KOps/s 79.7257 KOps/s $\color{#35bf28}+0.25\%$
test_plain_set_stack_nested_inplace 41.5300μs 12.4580μs 80.2694 KOps/s 80.3263 KOps/s $\color{#d91a1a}-0.07\%$
test_items 26.0810μs 2.8348μs 352.7524 KOps/s 341.0829 KOps/s $\color{#35bf28}+3.42\%$
test_items_nested 0.4154ms 0.3572ms 2.7993 KOps/s 2.7253 KOps/s $\color{#35bf28}+2.71\%$
test_items_nested_locked 0.4682ms 0.3653ms 2.7376 KOps/s 2.7288 KOps/s $\color{#35bf28}+0.32\%$
test_items_nested_leaf 0.1064ms 60.0974μs 16.6397 KOps/s 16.5186 KOps/s $\color{#35bf28}+0.73\%$
test_items_stack_nested 0.4010ms 0.3646ms 2.7425 KOps/s 2.7074 KOps/s $\color{#35bf28}+1.30\%$
test_items_stack_nested_leaf 97.1010μs 60.9512μs 16.4066 KOps/s 16.5735 KOps/s $\color{#d91a1a}-1.01\%$
test_items_stack_nested_locked 0.4221ms 0.3631ms 2.7537 KOps/s 2.7140 KOps/s $\color{#35bf28}+1.46\%$
test_keys 29.1910μs 3.3988μs 294.2212 KOps/s 280.4995 KOps/s $\color{#35bf28}+4.89\%$
test_keys_nested 0.1194ms 88.3710μs 11.3159 KOps/s 11.1845 KOps/s $\color{#35bf28}+1.18\%$
test_keys_nested_locked 0.8183ms 94.2908μs 10.6055 KOps/s 10.3728 KOps/s $\color{#35bf28}+2.24\%$
test_keys_nested_leaf 0.1429ms 78.5513μs 12.7305 KOps/s 12.4693 KOps/s $\color{#35bf28}+2.09\%$
test_keys_stack_nested 0.1516ms 87.8941μs 11.3773 KOps/s 11.2432 KOps/s $\color{#35bf28}+1.19\%$
test_keys_stack_nested_leaf 0.1237ms 78.3322μs 12.7661 KOps/s 12.4238 KOps/s $\color{#35bf28}+2.76\%$
test_keys_stack_nested_locked 0.1498ms 93.4432μs 10.7017 KOps/s 10.4908 KOps/s $\color{#35bf28}+2.01\%$
test_values 8.9683μs 0.8490μs 1.1778 MOps/s 1.1805 MOps/s $\color{#d91a1a}-0.22\%$
test_values_nested 70.8310μs 37.1764μs 26.8988 KOps/s 26.4024 KOps/s $\color{#35bf28}+1.88\%$
test_values_nested_locked 66.8210μs 39.0489μs 25.6089 KOps/s 24.9941 KOps/s $\color{#35bf28}+2.46\%$
test_values_nested_leaf 68.3410μs 42.2830μs 23.6502 KOps/s 23.3649 KOps/s $\color{#35bf28}+1.22\%$
test_values_stack_nested 79.4420μs 37.4029μs 26.7359 KOps/s 26.3888 KOps/s $\color{#35bf28}+1.32\%$
test_values_stack_nested_leaf 70.7010μs 42.3088μs 23.6357 KOps/s 23.3602 KOps/s $\color{#35bf28}+1.18\%$
test_values_stack_nested_locked 74.3510μs 39.1739μs 25.5272 KOps/s 24.7046 KOps/s $\color{#35bf28}+3.33\%$
test_membership 1.7535μs 0.4974μs 2.0104 MOps/s 2.0115 MOps/s $\color{#d91a1a}-0.05\%$
test_membership_nested 22.3800μs 2.0582μs 485.8546 KOps/s 505.8997 KOps/s $\color{#d91a1a}-3.96\%$
test_membership_nested_leaf 17.3800μs 1.9629μs 509.4620 KOps/s 507.3890 KOps/s $\color{#35bf28}+0.41\%$
test_membership_stacked_nested 31.8610μs 2.0475μs 488.3941 KOps/s 484.3282 KOps/s $\color{#35bf28}+0.84\%$
test_membership_stacked_nested_leaf 21.7300μs 2.0517μs 487.3967 KOps/s 482.2032 KOps/s $\color{#35bf28}+1.08\%$
test_membership_nested_last 26.4810μs 2.9857μs 334.9324 KOps/s 329.7962 KOps/s $\color{#35bf28}+1.56\%$
test_membership_nested_leaf_last 49.7910μs 2.9502μs 338.9599 KOps/s 328.1380 KOps/s $\color{#35bf28}+3.30\%$
test_membership_stacked_nested_last 29.9300μs 3.0067μs 332.5888 KOps/s 327.7434 KOps/s $\color{#35bf28}+1.48\%$
test_membership_stacked_nested_leaf_last 50.1910μs 2.9891μs 334.5467 KOps/s 328.1775 KOps/s $\color{#35bf28}+1.94\%$
test_nested_getleaf 44.8310μs 13.1156μs 76.2448 KOps/s 77.6158 KOps/s $\color{#d91a1a}-1.77\%$
test_nested_get 48.4510μs 12.4648μs 80.2260 KOps/s 81.0625 KOps/s $\color{#d91a1a}-1.03\%$
test_stacked_getleaf 45.8910μs 13.0003μs 76.9212 KOps/s 76.6961 KOps/s $\color{#35bf28}+0.29\%$
test_stacked_get 32.2510μs 12.2715μs 81.4896 KOps/s 81.1475 KOps/s $\color{#35bf28}+0.42\%$
test_nested_getitemleaf 76.5310μs 13.4883μs 74.1386 KOps/s 74.1182 KOps/s $\color{#35bf28}+0.03\%$
test_nested_getitem 38.7000μs 12.7028μs 78.7228 KOps/s 78.4649 KOps/s $\color{#35bf28}+0.33\%$
test_stacked_getitemleaf 44.7210μs 13.3663μs 74.8152 KOps/s 74.2509 KOps/s $\color{#35bf28}+0.76\%$
test_stacked_getitem 63.7010μs 12.6525μs 79.0359 KOps/s 79.3248 KOps/s $\color{#d91a1a}-0.36\%$
test_lock_nested 0.8231ms 0.3541ms 2.8241 KOps/s 2.8758 KOps/s $\color{#d91a1a}-1.80\%$
test_lock_stack_nested 0.3948ms 0.3485ms 2.8692 KOps/s 2.9060 KOps/s $\color{#d91a1a}-1.27\%$
test_unlock_nested 0.5251ms 0.3005ms 3.3282 KOps/s 3.4522 KOps/s $\color{#d91a1a}-3.59\%$
test_unlock_stack_nested 0.3267ms 0.2865ms 3.4903 KOps/s 3.5512 KOps/s $\color{#d91a1a}-1.71\%$
test_flatten_speed 0.1167ms 77.1152μs 12.9676 KOps/s 12.9671 KOps/s $+0.00\%$
test_unflatten_speed 0.5108ms 0.4002ms 2.4988 KOps/s 2.4674 KOps/s $\color{#35bf28}+1.27\%$
test_common_ops 0.8956ms 0.6363ms 1.5717 KOps/s 1.6082 KOps/s $\color{#d91a1a}-2.27\%$
test_creation 0.1216ms 1.7185μs 581.8907 KOps/s 569.9992 KOps/s $\color{#35bf28}+2.09\%$
test_creation_empty 0.6263ms 7.2296μs 138.3201 KOps/s 139.3044 KOps/s $\color{#d91a1a}-0.71\%$
test_creation_nested_1 0.1160ms 10.2143μs 97.9023 KOps/s 99.3141 KOps/s $\color{#d91a1a}-1.42\%$
test_creation_nested_2 0.1306ms 12.9522μs 77.2067 KOps/s 77.3626 KOps/s $\color{#d91a1a}-0.20\%$
test_clone 44.8900μs 10.0980μs 99.0296 KOps/s 100.0848 KOps/s $\color{#d91a1a}-1.05\%$
test_getitem[int] 0.1676ms 10.6351μs 94.0281 KOps/s 94.1461 KOps/s $\color{#d91a1a}-0.13\%$
test_getitem[slice_int] 0.1173ms 20.8026μs 48.0710 KOps/s 49.1620 KOps/s $\color{#d91a1a}-2.22\%$
test_getitem[range] 0.1317ms 37.1829μs 26.8940 KOps/s 27.3204 KOps/s $\color{#d91a1a}-1.56\%$
test_getitem[tuple] 0.1084ms 18.1574μs 55.0741 KOps/s 56.1367 KOps/s $\color{#d91a1a}-1.89\%$
test_getitem[list] 0.1365ms 33.5442μs 29.8114 KOps/s 31.0131 KOps/s $\color{#d91a1a}-3.87\%$
test_setitem_dim[int] 42.3210μs 18.8480μs 53.0560 KOps/s 54.3815 KOps/s $\color{#d91a1a}-2.44\%$
test_setitem_dim[slice_int] 72.8010μs 39.0408μs 25.6142 KOps/s 27.5270 KOps/s $\textbf{\color{#d91a1a}-6.95\%}$
test_setitem_dim[range] 79.1610μs 54.6553μs 18.2965 KOps/s 18.9845 KOps/s $\color{#d91a1a}-3.62\%$
test_setitem_dim[tuple] 54.4910μs 32.0870μs 31.1653 KOps/s 31.9462 KOps/s $\color{#d91a1a}-2.44\%$
test_setitem 0.2172ms 14.9930μs 66.6978 KOps/s 67.7408 KOps/s $\color{#d91a1a}-1.54\%$
test_set 0.2230ms 14.3371μs 69.7493 KOps/s 70.6999 KOps/s $\color{#d91a1a}-1.34\%$
test_set_shared 0.5126ms 0.1611ms 6.2074 KOps/s 6.3466 KOps/s $\color{#d91a1a}-2.19\%$
test_update 0.2251ms 18.3337μs 54.5445 KOps/s 58.0372 KOps/s $\textbf{\color{#d91a1a}-6.02\%}$
test_update_nested 0.1187ms 28.5236μs 35.0586 KOps/s 36.2873 KOps/s $\color{#d91a1a}-3.39\%$
test_update__nested 76.8010μs 24.3933μs 40.9949 KOps/s 42.0341 KOps/s $\color{#d91a1a}-2.47\%$
test_set_nested 86.6010μs 15.7334μs 63.5589 KOps/s 67.0138 KOps/s $\textbf{\color{#d91a1a}-5.16\%}$
test_set_nested_new 0.1071ms 18.7922μs 53.2137 KOps/s 52.6818 KOps/s $\color{#35bf28}+1.01\%$
test_select 0.1264ms 30.3268μs 32.9741 KOps/s 32.4427 KOps/s $\color{#35bf28}+1.64\%$
test_select_nested 73.4510μs 42.9531μs 23.2812 KOps/s 23.0483 KOps/s $\color{#35bf28}+1.01\%$
test_exclude_nested 99.8110μs 62.0766μs 16.1091 KOps/s 15.5883 KOps/s $\color{#35bf28}+3.34\%$
test_empty[True] 0.3385ms 0.2907ms 3.4396 KOps/s 3.3425 KOps/s $\color{#35bf28}+2.91\%$
test_empty[False] 4.5681μs 0.8202μs 1.2193 MOps/s 1.1998 MOps/s $\color{#35bf28}+1.62\%$
test_to 0.8127ms 0.1706ms 5.8630 KOps/s 16.4483 KOps/s $\textbf{\color{#d91a1a}-64.36\%}$
test_to_nonblocking 0.1046ms 52.2275μs 19.1470 KOps/s 19.9130 KOps/s $\color{#d91a1a}-3.85\%$
test_unbind_speed 0.2976ms 0.2446ms 4.0885 KOps/s 4.1644 KOps/s $\color{#d91a1a}-1.82\%$
test_unbind_speed_stack0 0.2985ms 0.2454ms 4.0748 KOps/s 4.2119 KOps/s $\color{#d91a1a}-3.26\%$
test_unbind_speed_stack1 93.0539ms 0.7490ms 1.3352 KOps/s 1.3626 KOps/s $\color{#d91a1a}-2.01\%$
test_split 94.3766ms 1.5977ms 625.8842 Ops/s 640.0389 Ops/s $\color{#d91a1a}-2.21\%$
test_chunk 94.1168ms 1.5950ms 626.9447 Ops/s 633.1993 Ops/s $\color{#d91a1a}-0.99\%$
test_consolidate[False-None] 96.6195ms 3.0995ms 322.6360 Ops/s 324.4685 Ops/s $\color{#d91a1a}-0.56\%$
test_consolidate[default-None] 2.1502ms 1.7522ms 570.7228 Ops/s 577.4989 Ops/s $\color{#d91a1a}-1.17\%$
test_consolidate[reduce-overhead-None] 1.8577ms 1.7859ms 559.9392 Ops/s 570.5081 Ops/s $\color{#d91a1a}-1.85\%$
test_consolidate_njt[False-None] 6.8315ms 6.4796ms 154.3297 Ops/s 151.6020 Ops/s $\color{#35bf28}+1.80\%$
test_to[False-False-None] 1.9147ms 1.7580ms 568.8174 Ops/s 572.9520 Ops/s $\color{#d91a1a}-0.72\%$
test_to[True-False-None] 2.1266ms 1.5075ms 663.3390 Ops/s 701.5633 Ops/s $\textbf{\color{#d91a1a}-5.45\%}$
test_to[within-False-None] 4.5879ms 4.4158ms 226.4579 Ops/s 228.7987 Ops/s $\color{#d91a1a}-1.02\%$
test_to[True-default-None] 5.7838ms 5.4729ms 182.7182 Ops/s 187.7395 Ops/s $\color{#d91a1a}-2.67\%$
test_to_njt[False-False-None] 7.4767ms 6.9814ms 143.2372 Ops/s 145.2479 Ops/s $\color{#d91a1a}-1.38\%$
test_to_njt[True-False-None] 5.8126ms 5.4813ms 182.4370 Ops/s 180.3533 Ops/s $\color{#35bf28}+1.16\%$
test_to_njt[within-False-None] 12.8837ms 12.2609ms 81.5601 Ops/s 81.2528 Ops/s $\color{#35bf28}+0.38\%$
test_creation[device0] 0.4543ms 78.7493μs 12.6985 KOps/s 12.4828 KOps/s $\color{#35bf28}+1.73\%$
test_creation_from_tensor 0.6372ms 83.6357μs 11.9566 KOps/s 11.8189 KOps/s $\color{#35bf28}+1.17\%$
test_add_one[memmap_tensor0] 0.4323ms 6.5458μs 152.7687 KOps/s 162.6173 KOps/s $\textbf{\color{#d91a1a}-6.06\%}$
test_contiguous[memmap_tensor0] 1.8300μs 0.4253μs 2.3514 MOps/s 2.3559 MOps/s $\color{#d91a1a}-0.19\%$
test_stack[memmap_tensor0] 37.1510μs 4.8729μs 205.2166 KOps/s 224.7681 KOps/s $\textbf{\color{#d91a1a}-8.70\%}$
test_memmaptd_index 1.4191ms 0.2503ms 3.9945 KOps/s 4.1005 KOps/s $\color{#d91a1a}-2.59\%$
test_memmaptd_index_astensor 0.4808ms 0.3126ms 3.1988 KOps/s 3.2080 KOps/s $\color{#d91a1a}-0.29\%$
test_memmaptd_index_op 1.1882ms 0.5513ms 1.8138 KOps/s 1.8604 KOps/s $\color{#d91a1a}-2.50\%$
test_serialize_model 0.1329s 0.1318s 7.5846 Ops/s 7.5525 Ops/s $\color{#35bf28}+0.43\%$
test_serialize_model_pickle 1.3495s 1.2170s 0.8217 Ops/s 0.8210 Ops/s $\color{#35bf28}+0.08\%$
test_serialize_weights 0.1332s 0.1323s 7.5570 Ops/s 7.5805 Ops/s $\color{#d91a1a}-0.31\%$
test_serialize_weights_returnearly 0.3306s 53.2027ms 18.7960 Ops/s 16.1292 Ops/s $\textbf{\color{#35bf28}+16.53\%}$
test_serialize_weights_pickle 1.3463s 1.1850s 0.8439 Ops/s 0.8351 Ops/s $\color{#35bf28}+1.06\%$
test_reshape_pytree 65.4510μs 22.0552μs 45.3407 KOps/s 44.4315 KOps/s $\color{#35bf28}+2.05\%$
test_reshape_td 0.4095ms 26.9519μs 37.1032 KOps/s 36.7834 KOps/s $\color{#35bf28}+0.87\%$
test_view_pytree 47.0310μs 22.1391μs 45.1690 KOps/s 45.5786 KOps/s $\color{#d91a1a}-0.90\%$
test_view_td 59.6610μs 32.6775μs 30.6021 KOps/s 30.2218 KOps/s $\color{#35bf28}+1.26\%$
test_unbind_pytree 54.5110μs 28.5031μs 35.0839 KOps/s 35.6468 KOps/s $\color{#d91a1a}-1.58\%$
test_unbind_td 0.7509ms 37.4501μs 26.7022 KOps/s 26.7763 KOps/s $\color{#d91a1a}-0.28\%$
test_split_pytree 0.4222ms 30.1347μs 33.1843 KOps/s 32.7935 KOps/s $\color{#35bf28}+1.19\%$
test_split_td 0.9336ms 38.3971μs 26.0436 KOps/s 25.4370 KOps/s $\color{#35bf28}+2.39\%$
test_add_pytree 0.4142ms 33.1106μs 30.2018 KOps/s 30.8890 KOps/s $\color{#d91a1a}-2.22\%$
test_add_td 0.2855ms 46.5865μs 21.4654 KOps/s 21.8412 KOps/s $\color{#d91a1a}-1.72\%$
test_compile_add_one_nested[tensordict-compile] 0.1765ms 0.1239ms 8.0721 KOps/s 7.8525 KOps/s $\color{#35bf28}+2.80\%$
test_compile_add_one_nested[tensordict-eager] 0.2380ms 0.1426ms 7.0147 KOps/s 6.8608 KOps/s $\color{#35bf28}+2.24\%$
test_compile_add_one_nested[pytree-compile] 0.2026ms 95.7806μs 10.4405 KOps/s 10.0908 KOps/s $\color{#35bf28}+3.47\%$
test_compile_add_one_nested[pytree-eager] 0.2049ms 0.1576ms 6.3468 KOps/s 6.6606 KOps/s $\color{#d91a1a}-4.71\%$
test_compile_copy_nested[tensordict-compile] 65.1510μs 24.5883μs 40.6697 KOps/s 36.2276 KOps/s $\textbf{\color{#35bf28}+12.26\%}$
test_compile_copy_nested[tensordict-eager] 71.2110μs 34.8064μs 28.7304 KOps/s 28.0883 KOps/s $\color{#35bf28}+2.29\%$
test_compile_copy_nested[pytree-compile] 0.4912ms 63.6273μs 15.7165 KOps/s 15.4386 KOps/s $\color{#35bf28}+1.80\%$
test_compile_copy_nested[pytree-eager] 89.4010μs 48.6985μs 20.5345 KOps/s 20.3444 KOps/s $\color{#35bf28}+0.93\%$
test_compile_add_one_flat[tensordict-compile] 0.1918ms 0.1450ms 6.8964 KOps/s 6.8964 KOps/s $+0.00\%$
test_compile_add_one_flat[tensordict-eager] 0.3409ms 0.2201ms 4.5424 KOps/s 4.5056 KOps/s $\color{#35bf28}+0.82\%$
test_compile_add_one_flat[tensorclass-compile] 0.1478ms 0.1008ms 9.9188 KOps/s 9.8157 KOps/s $\color{#35bf28}+1.05\%$
test_compile_add_one_flat[tensorclass-eager] 0.1269ms 60.6374μs 16.4915 KOps/s 17.1053 KOps/s $\color{#d91a1a}-3.59\%$
test_compile_add_one_flat[pytree-compile] 0.1811ms 0.1392ms 7.1859 KOps/s 7.1894 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_add_one_flat[pytree-eager] 0.5475ms 0.4779ms 2.0925 KOps/s 2.0753 KOps/s $\color{#35bf28}+0.83\%$
test_compile_add_self_flat[tensordict-eager] 0.4331ms 0.2629ms 3.8042 KOps/s 3.7506 KOps/s $\color{#35bf28}+1.43\%$
test_compile_add_self_flat[tensordict-compile] 0.1849ms 0.1451ms 6.8931 KOps/s 6.7665 KOps/s $\color{#35bf28}+1.87\%$
test_compile_add_self_flat[tensorclass-eager] 0.1577ms 70.6031μs 14.1637 KOps/s 14.0273 KOps/s $\color{#35bf28}+0.97\%$
test_compile_add_self_flat[tensorclass-compile] 0.1423ms 98.8816μs 10.1131 KOps/s 10.1343 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_add_self_flat[pytree-eager] 0.4778ms 0.4110ms 2.4331 KOps/s 2.4353 KOps/s $\color{#d91a1a}-0.09\%$
test_compile_add_self_flat[pytree-compile] 0.1847ms 0.1372ms 7.2888 KOps/s 7.3760 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_copy_flat[tensordict-compile] 0.1041ms 19.7361μs 50.6687 KOps/s 52.2963 KOps/s $\color{#d91a1a}-3.11\%$
test_compile_copy_flat[tensordict-eager] 0.2032ms 32.0348μs 31.2161 KOps/s 31.1529 KOps/s $\color{#35bf28}+0.20\%$
test_compile_copy_flat[pytree-compile] 0.1119ms 70.0311μs 14.2794 KOps/s 14.2985 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_copy_flat[pytree-eager] 86.4220μs 51.9322μs 19.2559 KOps/s 19.2029 KOps/s $\color{#35bf28}+0.28\%$
test_compile_assign_and_add[tensordict-compile] 1.6375ms 0.3948ms 2.5330 KOps/s 2.1775 KOps/s $\textbf{\color{#35bf28}+16.33\%}$
test_compile_assign_and_add[tensordict-eager] 2.9642ms 2.7947ms 357.8152 Ops/s 369.1014 Ops/s $\color{#d91a1a}-3.06\%$
test_compile_assign_and_add[pytree-compile] 1.5876ms 0.4337ms 2.3057 KOps/s 2.2456 KOps/s $\color{#35bf28}+2.68\%$
test_compile_assign_and_add[pytree-eager] 2.7544ms 2.6305ms 380.1511 Ops/s 379.5296 Ops/s $\color{#35bf28}+0.16\%$
test_compile_indexing[tensor-tensordict-compile] 0.5209ms 0.1142ms 8.7559 KOps/s 8.7564 KOps/s $-0.01\%$
test_compile_indexing[tensor-tensordict-eager] 0.5782ms 85.6780μs 11.6716 KOps/s 11.6224 KOps/s $\color{#35bf28}+0.42\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5141ms 0.1107ms 9.0349 KOps/s 9.0287 KOps/s $\color{#35bf28}+0.07\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1608ms 69.6120μs 14.3653 KOps/s 14.1232 KOps/s $\color{#35bf28}+1.71\%$
test_compile_indexing[tensor-pytree-compile] 0.1682ms 0.1106ms 9.0389 KOps/s 8.9593 KOps/s $\color{#35bf28}+0.89\%$
test_compile_indexing[tensor-pytree-eager] 0.1310ms 66.8983μs 14.9481 KOps/s 14.2966 KOps/s $\color{#35bf28}+4.56\%$
test_compile_indexing[slice-tensordict-compile] 0.1330ms 99.6022μs 10.0399 KOps/s 10.0176 KOps/s $\color{#35bf28}+0.22\%$
test_compile_indexing[slice-tensordict-eager] 0.1484ms 18.9538μs 52.7598 KOps/s 52.8674 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[slice-tensorclass-compile] 0.1472ms 96.7584μs 10.3350 KOps/s 10.3322 KOps/s $\color{#35bf28}+0.03\%$
test_compile_indexing[slice-tensorclass-eager] 52.3210μs 15.8957μs 62.9102 KOps/s 62.6267 KOps/s $\color{#35bf28}+0.45\%$
test_compile_indexing[slice-pytree-compile] 0.1443ms 96.8885μs 10.3211 KOps/s 10.0155 KOps/s $\color{#35bf28}+3.05\%$
test_compile_indexing[slice-pytree-eager] 51.3210μs 15.9436μs 62.7211 KOps/s 63.2513 KOps/s $\color{#d91a1a}-0.84\%$
test_compile_indexing[int-tensordict-compile] 0.1507ms 0.1006ms 9.9369 KOps/s 9.6110 KOps/s $\color{#35bf28}+3.39\%$
test_compile_indexing[int-tensordict-eager] 0.7000ms 18.9776μs 52.6938 KOps/s 49.1527 KOps/s $\textbf{\color{#35bf28}+7.20\%}$
test_compile_indexing[int-tensorclass-compile] 0.1495ms 96.9206μs 10.3177 KOps/s 9.7544 KOps/s $\textbf{\color{#35bf28}+5.78\%}$
test_compile_indexing[int-tensorclass-eager] 0.1111ms 21.2898μs 46.9708 KOps/s 63.7330 KOps/s $\textbf{\color{#d91a1a}-26.30\%}$
test_compile_indexing[int-pytree-compile] 0.1557ms 99.2963μs 10.0709 KOps/s 10.0124 KOps/s $\color{#35bf28}+0.58\%$
test_compile_indexing[int-pytree-eager] 53.5100μs 15.9524μs 62.6866 KOps/s 63.3243 KOps/s $\color{#d91a1a}-1.01\%$
test_mod_add[eager] 90.1610μs 38.3888μs 26.0493 KOps/s 26.1613 KOps/s $\color{#d91a1a}-0.43\%$
test_mod_add[compile] 0.4449ms 84.5941μs 11.8211 KOps/s 11.9409 KOps/s $\color{#d91a1a}-1.00\%$
test_mod_add[compile-overhead] 0.3324ms 0.1849ms 5.4071 KOps/s 5.4757 KOps/s $\color{#d91a1a}-1.25\%$
test_mod_wrap[eager] 0.3224ms 0.2456ms 4.0717 KOps/s 3.9982 KOps/s $\color{#35bf28}+1.84\%$
test_mod_wrap[compile] 0.3572ms 0.3006ms 3.3265 KOps/s 3.4531 KOps/s $\color{#d91a1a}-3.66\%$
test_mod_wrap[compile-overhead] 7.2877ms 3.8296ms 261.1248 Ops/s 260.0677 Ops/s $\color{#35bf28}+0.41\%$
test_mod_wrap_and_backward[eager] 1.5285ms 1.3325ms 750.4638 Ops/s 696.9990 Ops/s $\textbf{\color{#35bf28}+7.67\%}$
test_mod_wrap_and_backward[compile] 1.3951ms 1.2832ms 779.3064 Ops/s 724.0034 Ops/s $\textbf{\color{#35bf28}+7.64\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4078ms 0.9351ms 1.0694 KOps/s 972.7106 Ops/s $\textbf{\color{#35bf28}+9.94\%}$
test_seq_add[eager] 0.3243ms 0.1307ms 7.6536 KOps/s 7.7587 KOps/s $\color{#d91a1a}-1.35\%$
test_seq_add[compile] 0.1344ms 91.0073μs 10.9881 KOps/s 10.7948 KOps/s $\color{#35bf28}+1.79\%$
test_seq_add[compile-overhead] 0.1749ms 0.1311ms 7.6279 KOps/s 7.4965 KOps/s $\color{#35bf28}+1.75\%$
test_seq_wrap[eager] 1.0573ms 0.4305ms 2.3227 KOps/s 2.2826 KOps/s $\color{#35bf28}+1.76\%$
test_seq_wrap[compile] 1.1460ms 0.3058ms 3.2706 KOps/s 3.2293 KOps/s $\color{#35bf28}+1.28\%$
test_seq_wrap[compile-overhead] 0.3359ms 0.2290ms 4.3661 KOps/s 4.3552 KOps/s $\color{#35bf28}+0.25\%$
test_func_call_runtime[False-eager] 0.8998ms 0.7447ms 1.3428 KOps/s 1.3547 KOps/s $\color{#d91a1a}-0.88\%$
test_func_call_runtime[False-compile] 0.9494ms 0.7585ms 1.3184 KOps/s 1.3096 KOps/s $\color{#35bf28}+0.67\%$
test_func_call_runtime[False-compile-overhead] 0.4117ms 0.3692ms 2.7086 KOps/s 2.7216 KOps/s $\color{#d91a1a}-0.48\%$
test_func_call_runtime[True-eager] 1.1459ms 0.9093ms 1.0997 KOps/s 1.1096 KOps/s $\color{#d91a1a}-0.89\%$
test_func_call_runtime[True-compile] 0.9849ms 0.8165ms 1.2247 KOps/s 1.2832 KOps/s $\color{#d91a1a}-4.56\%$
test_func_call_runtime[True-compile-overhead] 0.4520ms 0.3943ms 2.5363 KOps/s 2.5433 KOps/s $\color{#d91a1a}-0.28\%$
test_func_call_cm_runtime[False-eager] 0.8881ms 0.7628ms 1.3109 KOps/s 1.3564 KOps/s $\color{#d91a1a}-3.35\%$
test_func_call_cm_runtime[False-compile] 0.8368ms 0.7617ms 1.3128 KOps/s 1.2773 KOps/s $\color{#35bf28}+2.79\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4141ms 0.3701ms 2.7017 KOps/s 2.7076 KOps/s $\color{#d91a1a}-0.22\%$
test_func_call_cm_runtime[True-eager] 1.2097ms 0.9921ms 1.0079 KOps/s 997.0734 Ops/s $\color{#35bf28}+1.09\%$
test_func_call_cm_runtime[True-compile] 1.0365ms 0.9766ms 1.0240 KOps/s 999.8656 Ops/s $\color{#35bf28}+2.41\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0693ms 0.9812ms 1.0192 KOps/s 1.0020 KOps/s $\color{#35bf28}+1.72\%$
test_vmap_func_call_cm_runtime[eager] 2.4622ms 2.0331ms 491.8566 Ops/s 492.7165 Ops/s $\color{#d91a1a}-0.17\%$
test_vmap_func_call_cm_runtime[compile] 0.9634ms 0.8280ms 1.2077 KOps/s 1.2022 KOps/s $\color{#35bf28}+0.45\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4740ms 0.4207ms 2.3767 KOps/s 2.3623 KOps/s $\color{#35bf28}+0.61\%$
test_distributed 3.0017ms 0.1788ms 5.5927 KOps/s 8.4224 KOps/s $\textbf{\color{#d91a1a}-33.60\%}$
test_tdmodule 0.5417ms 20.7828μs 48.1167 KOps/s 51.3220 KOps/s $\textbf{\color{#d91a1a}-6.25\%}$
test_tdmodule_dispatch 78.4610μs 37.7528μs 26.4881 KOps/s 26.5326 KOps/s $\color{#d91a1a}-0.17\%$
test_tdseq 42.5800μs 20.7102μs 48.2854 KOps/s 49.5081 KOps/s $\color{#d91a1a}-2.47\%$
test_tdseq_dispatch 76.2510μs 40.2951μs 24.8169 KOps/s 25.3015 KOps/s $\color{#d91a1a}-1.92\%$
test_instantiation_functorch 1.6449ms 1.5624ms 640.0355 Ops/s 639.9498 Ops/s $\color{#35bf28}+0.01\%$
test_exec_functorch 0.2214ms 0.1412ms 7.0800 KOps/s 7.0743 KOps/s $\color{#35bf28}+0.08\%$
test_exec_functional_call 0.1866ms 0.1326ms 7.5433 KOps/s 7.6530 KOps/s $\color{#d91a1a}-1.43\%$
test_exec_td_decorator 0.3747ms 0.1828ms 5.4712 KOps/s 5.5123 KOps/s $\color{#d91a1a}-0.74\%$
test_vmap_mlp_speed_decorator[True-True] 0.8728ms 0.6750ms 1.4816 KOps/s 1.4801 KOps/s $\color{#35bf28}+0.10\%$
test_vmap_mlp_speed_decorator[True-False] 0.8924ms 0.6791ms 1.4725 KOps/s 1.4742 KOps/s $\color{#d91a1a}-0.11\%$
test_vmap_mlp_speed_decorator[False-True] 0.7957ms 0.5843ms 1.7115 KOps/s 1.7188 KOps/s $\color{#d91a1a}-0.43\%$
test_vmap_mlp_speed_decorator[False-False] 0.7027ms 0.5913ms 1.6913 KOps/s 1.7126 KOps/s $\color{#d91a1a}-1.24\%$
test_vmap_transformer_speed_decorator[True-True] 18.8937ms 18.7615ms 53.3007 Ops/s 53.5335 Ops/s $\color{#d91a1a}-0.43\%$
test_vmap_transformer_speed_decorator[True-False] 19.5851ms 18.7767ms 53.2575 Ops/s 53.5944 Ops/s $\color{#d91a1a}-0.63\%$
test_vmap_transformer_speed_decorator[False-True] 18.7259ms 18.6338ms 53.6660 Ops/s 54.2264 Ops/s $\color{#d91a1a}-1.03\%$
test_vmap_transformer_speed_decorator[False-False] 19.3549ms 18.6763ms 53.5438 Ops/s 53.8801 Ops/s $\color{#d91a1a}-0.62\%$
test_to_module_speed[True] 1.5207ms 0.9846ms 1.0156 KOps/s 1.0351 KOps/s $\color{#d91a1a}-1.88\%$
test_to_module_speed[False] 1.4186ms 0.9747ms 1.0259 KOps/s 1.0434 KOps/s $\color{#d91a1a}-1.67\%$
test_tc_init 0.1514ms 34.4076μs 29.0634 KOps/s 28.7499 KOps/s $\color{#35bf28}+1.09\%$
test_tc_init_tensor_only 0.1026ms 10.8068μs 92.5344 KOps/s 92.8869 KOps/s $\color{#d91a1a}-0.38\%$
test_tc_init_nested 0.1959ms 67.9664μs 14.7132 KOps/s 14.4471 KOps/s $\color{#35bf28}+1.84\%$
test_tc_first_layer_tensor 6.5368μs 0.8043μs 1.2433 MOps/s 1.0915 MOps/s $\textbf{\color{#35bf28}+13.90\%}$
test_tc_first_layer_tensor_only 5.3280μs 0.4295μs 2.3285 MOps/s 2.3619 MOps/s $\color{#d91a1a}-1.41\%$
test_tc_first_layer_tensor_set 23.1200μs 2.9312μs 341.1556 KOps/s 346.6446 KOps/s $\color{#d91a1a}-1.58\%$
test_tc_first_layer_tensor_only_set 16.8803μs 1.7595μs 568.3577 KOps/s 547.5824 KOps/s $\color{#35bf28}+3.79\%$
test_tc_first_layer_nontensor 16.1110μs 2.3612μs 423.5198 KOps/s 423.9959 KOps/s $\color{#d91a1a}-0.11\%$
test_tc_second_layer_tensor 39.0810μs 1.7562μs 569.3965 KOps/s 571.4271 KOps/s $\color{#d91a1a}-0.36\%$
test_tc_second_layer_nontensor 23.1110μs 3.1434μs 318.1290 KOps/s 312.3788 KOps/s $\color{#35bf28}+1.84\%$
test_unbind 0.2322s 12.9106ms 77.4558 Ops/s 145.9440 Ops/s $\textbf{\color{#d91a1a}-46.93\%}$
test_full_like 9.2116ms 5.0008ms 199.9684 Ops/s 112.8641 Ops/s $\textbf{\color{#35bf28}+77.18\%}$
test_zeros_like 5.3595ms 4.3196ms 231.5011 Ops/s 231.4273 Ops/s $\color{#35bf28}+0.03\%$
test_ones_like 4.9592ms 4.3146ms 231.7685 Ops/s 230.7537 Ops/s $\color{#35bf28}+0.44\%$
test_clone 6.4669ms 6.3388ms 157.7584 Ops/s 158.0672 Ops/s $\color{#d91a1a}-0.20\%$
test_squeeze 80.2610μs 10.2500μs 97.5608 KOps/s 100.7318 KOps/s $\color{#d91a1a}-3.15\%$
test_unsqueeze 0.1345ms 76.2858μs 13.1086 KOps/s 13.0666 KOps/s $\color{#35bf28}+0.32\%$
test_split 0.4252ms 0.1685ms 5.9340 KOps/s 6.0180 KOps/s $\color{#d91a1a}-1.40\%$
test_permute 0.2803ms 0.1907ms 5.2425 KOps/s 5.2137 KOps/s $\color{#35bf28}+0.55\%$
test_stack 51.4773ms 50.4660ms 19.8153 Ops/s 56.6795 Ops/s $\textbf{\color{#d91a1a}-65.04\%}$
test_cat 50.6625ms 50.3181ms 19.8736 Ops/s 38.4678 Ops/s $\textbf{\color{#d91a1a}-48.34\%}$
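
For readers checking the table by hand, here is a minimal sketch of how the columns appear to relate. It assumes that Ops is the reciprocal of Mean and that Change is the PR throughput relative to repo HEAD, in percent. The 5% bolding threshold is only inferred from the rows above (for example, -4.71% is plain while -5.16% is bold). The helper names are hypothetical and not part of the benchmark tooling. Because the inputs are the rounded values shown in the table, results can differ from the reported Change in the last digit.

```python
# Illustrative helpers; names are hypothetical and not part of the benchmark tooling.

def ops_from_mean(mean_seconds: float) -> float:
    """Throughput in operations per second, assuming Ops = 1 / Mean."""
    return 1.0 / mean_seconds


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def is_highlighted(change_percent: float, threshold: float = 5.0) -> bool:
    """Assumed rule: rows whose absolute change exceeds the threshold are bolded."""
    return abs(change_percent) > threshold


# Values copied from the test_to row: Mean 0.1706 ms, 5.8630 vs 16.4483 KOps/s.
to_change = relative_change(5.8630, 16.4483)
# Values copied from the test_full_like row: 199.9684 vs 112.8641 Ops/s.
full_like_change = relative_change(199.9684, 112.8641)

print(f"test_to: {to_change:+.1f}% (highlighted: {is_highlighted(to_change)})")
print(f"test_full_like: {full_like_change:+.1f}% (highlighted: {is_highlighted(full_like_change)})")
print(f"test_to ops from mean: {ops_from_mean(0.1706e-3) / 1e3:.2f} KOps/s")
```

Large swings on single rows such as test_to or test_stack may reflect runner noise rather than the change under review, so rerunning before drawing conclusions is probably worthwhile.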

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Apr 11, 2025
ghstack-source-id: 81c298f
Pull Request resolved: #1285
@vmoens vmoens merged commit 2656142 into gh/vmoens/50/base Apr 11, 2025
20 of 34 checks passed
@vmoens vmoens deleted the gh/vmoens/50/head branch April 11, 2025 12:01