@vmoens
Collaborator

@vmoens vmoens commented Jan 13, 2025

I propose a global flag, composite_lp_aggregate, to control how log-probs are aggregated.

So far, we have handled this with kwargs everywhere (in ProbabilisticTDModule, ProbabilisticTDSequential, CompositeDistribution and subclasses). The hierarchy of these classes, and what to do when their arguments conflict, isn't easy to handle. It's also confusing for users, who I suspect will usually want to work with either collapsed or non-collapsed log-probs.

A global flag (True by default for now, False in the future) will make things easier to handle.
Overall, the v0.6.2 behaviour will not change. Users who rely on it will see a warning about the upcoming change, telling them to set the global flag to False to prepare for it. If they set it to True, nothing changes for them, but bugs in that code path will not be fixed (we won't maintain the True behaviour).

When composite_lp_aggregate() == True, we'll have aggregate_probabilities=True, include_sum=True and inplace=True by default everywhere. When composite_lp_aggregate() == False, all of these will be set to False, meaning that any call to whatever.log_prob(tensordict) will return another tensordict containing the leaf log-probs.

@facebook-github-bot added the CLA Signed label Jan 13, 2025
@vmoens added the Refactor and Deprecation labels Jan 13, 2025
@github-actions

github-actions bot commented Jan 14, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 47.6490μs 19.1288μs 52.2773 KOps/s 50.9612 KOps/s $\color{#35bf28}+2.58\%$
test_plain_set_stack_nested 50.6140μs 19.4405μs 51.4391 KOps/s 50.1858 KOps/s $\color{#35bf28}+2.50\%$
test_plain_set_nested_inplace 72.2750μs 21.0282μs 47.5551 KOps/s 46.3551 KOps/s $\color{#35bf28}+2.59\%$
test_plain_set_stack_nested_inplace 62.2160μs 21.0214μs 47.5706 KOps/s 46.5642 KOps/s $\color{#35bf28}+2.16\%$
test_items 39.4940μs 4.1608μs 240.3359 KOps/s 231.1232 KOps/s $\color{#35bf28}+3.99\%$
test_items_nested 0.4813ms 0.3958ms 2.5263 KOps/s 2.4399 KOps/s $\color{#35bf28}+3.54\%$
test_items_nested_locked 0.7288ms 0.3963ms 2.5233 KOps/s 2.4552 KOps/s $\color{#35bf28}+2.77\%$
test_items_nested_leaf 0.1209ms 77.6177μs 12.8837 KOps/s 12.6881 KOps/s $\color{#35bf28}+1.54\%$
test_items_stack_nested 0.5459ms 0.3998ms 2.5014 KOps/s 2.4422 KOps/s $\color{#35bf28}+2.42\%$
test_items_stack_nested_leaf 0.1518ms 78.9272μs 12.6699 KOps/s 12.2968 KOps/s $\color{#35bf28}+3.03\%$
test_items_stack_nested_locked 0.6043ms 0.3990ms 2.5064 KOps/s 2.4353 KOps/s $\color{#35bf28}+2.92\%$
test_keys 28.3730μs 3.4943μs 286.1811 KOps/s 287.2819 KOps/s $\color{#d91a1a}-0.38\%$
test_keys_nested 0.2928ms 0.1650ms 6.0602 KOps/s 5.9116 KOps/s $\color{#35bf28}+2.51\%$
test_keys_nested_locked 0.6882ms 0.1696ms 5.8965 KOps/s 5.6655 KOps/s $\color{#35bf28}+4.08\%$
test_keys_nested_leaf 0.2618ms 0.1431ms 6.9906 KOps/s 6.7656 KOps/s $\color{#35bf28}+3.32\%$
test_keys_stack_nested 0.2618ms 0.1658ms 6.0307 KOps/s 6.0366 KOps/s $\color{#d91a1a}-0.10\%$
test_keys_stack_nested_leaf 0.2190ms 0.1431ms 6.9857 KOps/s 7.0435 KOps/s $\color{#d91a1a}-0.82\%$
test_keys_stack_nested_locked 0.2737ms 0.1709ms 5.8503 KOps/s 5.8339 KOps/s $\color{#35bf28}+0.28\%$
test_values 9.9846μs 1.0267μs 974.0270 KOps/s 950.8590 KOps/s $\color{#35bf28}+2.44\%$
test_values_nested 0.1225ms 62.4896μs 16.0027 KOps/s 15.8927 KOps/s $\color{#35bf28}+0.69\%$
test_values_nested_locked 0.1135ms 62.4497μs 16.0129 KOps/s 15.7265 KOps/s $\color{#35bf28}+1.82\%$
test_values_nested_leaf 0.1239ms 71.1562μs 14.0536 KOps/s 13.7923 KOps/s $\color{#35bf28}+1.89\%$
test_values_stack_nested 0.1184ms 62.6220μs 15.9688 KOps/s 15.3605 KOps/s $\color{#35bf28}+3.96\%$
test_values_stack_nested_leaf 0.1250ms 71.7296μs 13.9412 KOps/s 14.0569 KOps/s $\color{#d91a1a}-0.82\%$
test_values_stack_nested_locked 0.1157ms 63.0234μs 15.8671 KOps/s 15.3180 KOps/s $\color{#35bf28}+3.59\%$
test_membership 5.2881μs 0.6915μs 1.4461 MOps/s 1.1304 MOps/s $\textbf{\color{#35bf28}+27.93\%}$
test_membership_nested 24.1050μs 2.9025μs 344.5357 KOps/s 340.9579 KOps/s $\color{#35bf28}+1.05\%$
test_membership_nested_leaf 43.4110μs 2.9194μs 342.5396 KOps/s 342.3902 KOps/s $\color{#35bf28}+0.04\%$
test_membership_stacked_nested 16.5110μs 2.8670μs 348.8025 KOps/s 339.4935 KOps/s $\color{#35bf28}+2.74\%$
test_membership_stacked_nested_leaf 39.6540μs 2.8821μs 346.9653 KOps/s 340.6511 KOps/s $\color{#35bf28}+1.85\%$
test_membership_nested_last 26.3990μs 4.3044μs 232.3202 KOps/s 226.6151 KOps/s $\color{#35bf28}+2.52\%$
test_membership_nested_leaf_last 48.6010μs 4.3479μs 229.9974 KOps/s 225.6689 KOps/s $\color{#35bf28}+1.92\%$
test_membership_stacked_nested_last 42.2480μs 4.2741μs 233.9692 KOps/s 73.8938 KOps/s $\textbf{\color{#35bf28}+216.63\%}$
test_membership_stacked_nested_leaf_last 23.8750μs 4.3086μs 232.0963 KOps/s 74.5746 KOps/s $\textbf{\color{#35bf28}+211.23\%}$
test_nested_getleaf 57.4670μs 10.5724μs 94.5856 KOps/s 92.2604 KOps/s $\color{#35bf28}+2.52\%$
test_nested_get 54.4820μs 10.0525μs 99.4779 KOps/s 97.0814 KOps/s $\color{#35bf28}+2.47\%$
test_stacked_getleaf 37.3100μs 10.5969μs 94.3674 KOps/s 93.4605 KOps/s $\color{#35bf28}+0.97\%$
test_stacked_get 46.1570μs 10.0403μs 99.5984 KOps/s 98.2228 KOps/s $\color{#35bf28}+1.40\%$
test_nested_getitemleaf 54.9830μs 11.2386μs 88.9791 KOps/s 87.8680 KOps/s $\color{#35bf28}+1.26\%$
test_nested_getitem 46.6170μs 10.6394μs 93.9905 KOps/s 92.3678 KOps/s $\color{#35bf28}+1.76\%$
test_stacked_getitemleaf 56.4360μs 11.2025μs 89.2661 KOps/s 88.2837 KOps/s $\color{#35bf28}+1.11\%$
test_stacked_getitem 38.4020μs 10.6039μs 94.3052 KOps/s 92.0940 KOps/s $\color{#35bf28}+2.40\%$
test_lock_nested 1.1256ms 0.4571ms 2.1875 KOps/s 1.7835 KOps/s $\textbf{\color{#35bf28}+22.66\%}$
test_lock_stack_nested 0.5093ms 0.4237ms 2.3603 KOps/s 2.3935 KOps/s $\color{#d91a1a}-1.39\%$
test_unlock_nested 0.9275ms 0.3756ms 2.6625 KOps/s 2.6387 KOps/s $\color{#35bf28}+0.90\%$
test_unlock_stack_nested 0.5263ms 0.3444ms 2.9033 KOps/s 3.0112 KOps/s $\color{#d91a1a}-3.58\%$
test_flatten_speed 0.1866ms 0.1011ms 9.8908 KOps/s 9.7920 KOps/s $\color{#35bf28}+1.01\%$
test_unflatten_speed 0.7095ms 0.5126ms 1.9509 KOps/s 1.9054 KOps/s $\color{#35bf28}+2.39\%$
test_common_ops 1.7894ms 0.7483ms 1.3364 KOps/s 1.2984 KOps/s $\color{#35bf28}+2.93\%$
test_creation 45.2950μs 2.4877μs 401.9721 KOps/s 376.0471 KOps/s $\textbf{\color{#35bf28}+6.89\%}$
test_creation_empty 32.8920μs 9.5150μs 105.0974 KOps/s 103.1581 KOps/s $\color{#35bf28}+1.88\%$
test_creation_nested_1 38.3410μs 12.4624μs 80.2413 KOps/s 80.1200 KOps/s $\color{#35bf28}+0.15\%$
test_creation_nested_2 37.7710μs 17.0233μs 58.7431 KOps/s 57.5631 KOps/s $\color{#35bf28}+2.05\%$
test_clone 74.5700μs 13.4062μs 74.5924 KOps/s 73.3338 KOps/s $\color{#35bf28}+1.72\%$
test_getitem[int] 1.2554ms 12.9086μs 77.4679 KOps/s 76.8288 KOps/s $\color{#35bf28}+0.83\%$
test_getitem[slice_int] 0.1387ms 24.3230μs 41.1133 KOps/s 40.1741 KOps/s $\color{#35bf28}+2.34\%$
test_getitem[range] 0.1724ms 48.8903μs 20.4540 KOps/s 20.7079 KOps/s $\color{#d91a1a}-1.23\%$
test_getitem[tuple] 0.1335ms 19.9061μs 50.2358 KOps/s 47.6578 KOps/s $\textbf{\color{#35bf28}+5.41\%}$
test_getitem[list] 0.1960ms 43.2736μs 23.1088 KOps/s 22.7659 KOps/s $\color{#35bf28}+1.51\%$
test_setitem_dim[int] 57.5170μs 26.6233μs 37.5611 KOps/s 37.0612 KOps/s $\color{#35bf28}+1.35\%$
test_setitem_dim[slice_int] 77.7250μs 52.0196μs 19.2235 KOps/s 18.9459 KOps/s $\color{#35bf28}+1.47\%$
test_setitem_dim[range] 0.1231ms 75.4533μs 13.2532 KOps/s 13.2333 KOps/s $\color{#35bf28}+0.15\%$
test_setitem_dim[tuple] 71.2730μs 40.6808μs 24.5816 KOps/s 23.5941 KOps/s $\color{#35bf28}+4.19\%$
test_setitem 0.1217ms 19.2944μs 51.8285 KOps/s 51.0088 KOps/s $\color{#35bf28}+1.61\%$
test_set 86.3020μs 18.2843μs 54.6916 KOps/s 53.7685 KOps/s $\color{#35bf28}+1.72\%$
test_set_shared 3.6403ms 0.1728ms 5.7883 KOps/s 5.8163 KOps/s $\color{#d91a1a}-0.48\%$
test_update 0.1489ms 20.1705μs 49.5773 KOps/s 48.6470 KOps/s $\color{#35bf28}+1.91\%$
test_update_nested 99.6770μs 30.1542μs 33.1629 KOps/s 32.0276 KOps/s $\color{#35bf28}+3.54\%$
test_update__nested 0.6457ms 33.4576μs 29.8885 KOps/s 28.5921 KOps/s $\color{#35bf28}+4.53\%$
test_set_nested 89.0560μs 20.5440μs 48.6761 KOps/s 47.0010 KOps/s $\color{#35bf28}+3.56\%$
test_set_nested_new 78.0660μs 25.1113μs 39.8227 KOps/s 38.4947 KOps/s $\color{#35bf28}+3.45\%$
test_select 0.1525ms 41.1227μs 24.3175 KOps/s 23.6860 KOps/s $\color{#35bf28}+2.67\%$
test_select_nested 0.1199ms 62.8898μs 15.9008 KOps/s 15.2421 KOps/s $\color{#35bf28}+4.32\%$
test_exclude_nested 0.1781ms 81.3378μs 12.2944 KOps/s 11.8701 KOps/s $\color{#35bf28}+3.57\%$
test_empty[True] 0.6171ms 0.4059ms 2.4634 KOps/s 2.3691 KOps/s $\color{#35bf28}+3.98\%$
test_empty[False] 9.6980μs 1.3357μs 748.6649 KOps/s 720.4520 KOps/s $\color{#35bf28}+3.92\%$
test_unbind_speed 1.0582ms 0.2793ms 3.5809 KOps/s 3.6639 KOps/s $\color{#d91a1a}-2.26\%$
test_unbind_speed_stack0 0.4076ms 0.2694ms 3.7125 KOps/s 3.8906 KOps/s $\color{#d91a1a}-4.58\%$
test_unbind_speed_stack1 0.1055s 0.7937ms 1.2599 KOps/s 1.4065 KOps/s $\textbf{\color{#d91a1a}-10.42\%}$
test_split 1.7572ms 1.5753ms 634.7836 Ops/s 559.5763 Ops/s $\textbf{\color{#35bf28}+13.44\%}$
test_chunk 0.1040s 1.8954ms 527.5966 Ops/s 560.5367 Ops/s $\textbf{\color{#d91a1a}-5.88\%}$
test_consolidate_njt[False-None] 8.7183ms 8.2635ms 121.0135 Ops/s 120.4005 Ops/s $\color{#35bf28}+0.51\%$
test_creation[device0] 0.4692ms 91.5584μs 10.9220 KOps/s 11.0487 KOps/s $\color{#d91a1a}-1.15\%$
test_creation_from_tensor 3.4285ms 94.7100μs 10.5585 KOps/s 10.5379 KOps/s $\color{#35bf28}+0.20\%$
test_add_one[memmap_tensor0] 0.1386ms 4.9219μs 203.1727 KOps/s 202.9119 KOps/s $\color{#35bf28}+0.13\%$
test_contiguous[memmap_tensor0] 10.8910μs 0.5101μs 1.9605 MOps/s 1.9437 MOps/s $\color{#35bf28}+0.87\%$
test_stack[memmap_tensor0] 39.9150μs 3.4439μs 290.3700 KOps/s 280.6249 KOps/s $\color{#35bf28}+3.47\%$
test_memmaptd_index 0.8969ms 0.2332ms 4.2883 KOps/s 4.2834 KOps/s $\color{#35bf28}+0.11\%$
test_memmaptd_index_astensor 0.5782ms 0.3210ms 3.1157 KOps/s 3.1266 KOps/s $\color{#d91a1a}-0.35\%$
test_memmaptd_index_op 1.1240ms 0.5352ms 1.8685 KOps/s 1.8270 KOps/s $\color{#35bf28}+2.27\%$
test_serialize_model 0.1309s 0.1183s 8.4560 Ops/s 8.5155 Ops/s $\color{#d91a1a}-0.70\%$
test_serialize_model_pickle 0.4495s 0.4005s 2.4969 Ops/s 2.5922 Ops/s $\color{#d91a1a}-3.68\%$
test_serialize_weights 0.1221s 0.1140s 8.7687 Ops/s 8.7251 Ops/s $\color{#35bf28}+0.50\%$
test_serialize_weights_returnearly 0.2541s 0.1750s 5.7138 Ops/s 6.2655 Ops/s $\textbf{\color{#d91a1a}-8.81\%}$
test_serialize_weights_pickle 0.6038s 0.4360s 2.2936 Ops/s 2.4338 Ops/s $\textbf{\color{#d91a1a}-5.76\%}$
test_serialize_weights_filesystem 0.1498s 0.1399s 7.1483 Ops/s 7.0174 Ops/s $\color{#35bf28}+1.86\%$
test_serialize_model_filesystem 0.1620s 0.1444s 6.9258 Ops/s 6.2658 Ops/s $\textbf{\color{#35bf28}+10.53\%}$
test_reshape_pytree 64.8910μs 26.1525μs 38.2372 KOps/s 37.2711 KOps/s $\color{#35bf28}+2.59\%$
test_reshape_td 0.1133ms 33.5616μs 29.7960 KOps/s 30.4184 KOps/s $\color{#d91a1a}-2.05\%$
test_view_pytree 58.9500μs 26.0084μs 38.4491 KOps/s 37.2420 KOps/s $\color{#35bf28}+3.24\%$
test_view_td 79.1170μs 38.7501μs 25.8064 KOps/s 25.3651 KOps/s $\color{#35bf28}+1.74\%$
test_unbind_pytree 69.7810μs 29.2032μs 34.2428 KOps/s 33.5226 KOps/s $\color{#35bf28}+2.15\%$
test_unbind_td 0.2845ms 39.7299μs 25.1700 KOps/s 24.9380 KOps/s $\color{#35bf28}+0.93\%$
test_split_pytree 71.8950μs 28.9900μs 34.4947 KOps/s 32.8617 KOps/s $\color{#35bf28}+4.97\%$
test_split_td 0.5508ms 44.3035μs 22.5716 KOps/s 21.9001 KOps/s $\color{#35bf28}+3.07\%$
test_add_pytree 77.5950μs 34.8440μs 28.6993 KOps/s 27.8110 KOps/s $\color{#35bf28}+3.19\%$
test_add_td 0.1162ms 49.2040μs 20.3236 KOps/s 19.2069 KOps/s $\textbf{\color{#35bf28}+5.81\%}$
test_compile_add_one_nested[tensordict-compile] 0.1474ms 62.3955μs 16.0268 KOps/s 15.3222 KOps/s $\color{#35bf28}+4.60\%$
test_compile_add_one_nested[tensordict-eager] 0.3970ms 0.1751ms 5.7105 KOps/s 5.6996 KOps/s $\color{#35bf28}+0.19\%$
test_compile_add_one_nested[pytree-compile] 0.1434ms 46.0805μs 21.7011 KOps/s 21.1912 KOps/s $\color{#35bf28}+2.41\%$
test_compile_add_one_nested[pytree-eager] 0.2623ms 0.1196ms 8.3620 KOps/s 8.2963 KOps/s $\color{#35bf28}+0.79\%$
test_compile_copy_nested[tensordict-compile] 0.1123ms 27.1519μs 36.8298 KOps/s 35.7910 KOps/s $\color{#35bf28}+2.90\%$
test_compile_copy_nested[tensordict-eager] 0.1210ms 58.7892μs 17.0099 KOps/s 16.8959 KOps/s $\color{#35bf28}+0.67\%$
test_compile_copy_nested[pytree-compile] 0.1504ms 79.4527μs 12.5861 KOps/s 12.4355 KOps/s $\color{#35bf28}+1.21\%$
test_compile_copy_nested[pytree-eager] 0.1403ms 67.7769μs 14.7543 KOps/s 14.5783 KOps/s $\color{#35bf28}+1.21\%$
test_compile_add_one_flat[tensordict-compile] 0.1833ms 0.1048ms 9.5459 KOps/s 9.3733 KOps/s $\color{#35bf28}+1.84\%$
test_compile_add_one_flat[tensordict-eager] 0.4750ms 0.2139ms 4.6748 KOps/s 4.6081 KOps/s $\color{#35bf28}+1.45\%$
test_compile_add_one_flat[tensorclass-compile] 0.1361ms 45.2505μs 22.0992 KOps/s 21.6125 KOps/s $\color{#35bf28}+2.25\%$
test_compile_add_one_flat[tensorclass-eager] 0.5069ms 67.5564μs 14.8025 KOps/s 14.8204 KOps/s $\color{#d91a1a}-0.12\%$
test_compile_add_one_flat[pytree-compile] 0.2002ms 0.1023ms 9.7715 KOps/s 9.6031 KOps/s $\color{#35bf28}+1.75\%$
test_compile_add_one_flat[pytree-eager] 0.3221ms 0.2016ms 4.9602 KOps/s 5.0305 KOps/s $\color{#d91a1a}-1.40\%$
test_compile_add_self_flat[tensordict-eager] 0.4692ms 0.2334ms 4.2838 KOps/s 4.2502 KOps/s $\color{#35bf28}+0.79\%$
test_compile_add_self_flat[tensordict-compile] 0.1910ms 0.1050ms 9.5280 KOps/s 9.4442 KOps/s $\color{#35bf28}+0.89\%$
test_compile_add_self_flat[tensorclass-eager] 0.1627ms 63.5928μs 15.7250 KOps/s 15.5060 KOps/s $\color{#35bf28}+1.41\%$
test_compile_add_self_flat[tensorclass-compile] 0.1076ms 47.9943μs 20.8358 KOps/s 20.8957 KOps/s $\color{#d91a1a}-0.29\%$
test_compile_add_self_flat[pytree-eager] 0.3140ms 0.1617ms 6.1841 KOps/s 6.3367 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_add_self_flat[pytree-compile] 0.2279ms 0.1032ms 9.6925 KOps/s 9.7482 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_copy_flat[tensordict-compile] 68.3280μs 21.8203μs 45.8290 KOps/s 41.7296 KOps/s $\textbf{\color{#35bf28}+9.82\%}$
test_compile_copy_flat[tensordict-eager] 0.1484ms 67.2744μs 14.8645 KOps/s 15.0009 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_copy_flat[pytree-compile] 0.1972ms 79.7585μs 12.5378 KOps/s 12.5338 KOps/s $\color{#35bf28}+0.03\%$
test_compile_copy_flat[pytree-eager] 0.1574ms 68.0473μs 14.6957 KOps/s 14.7513 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_assign_and_add[tensordict-compile] 0.3417ms 0.2091ms 4.7831 KOps/s 4.8609 KOps/s $\color{#d91a1a}-1.60\%$
test_compile_assign_and_add[tensordict-eager] 2.4661ms 1.3445ms 743.7948 Ops/s 759.1004 Ops/s $\color{#d91a1a}-2.02\%$
test_compile_assign_and_add[pytree-compile] 0.3110ms 0.2075ms 4.8188 KOps/s 4.9939 KOps/s $\color{#d91a1a}-3.51\%$
test_compile_assign_and_add[pytree-eager] 1.3878ms 0.7796ms 1.2828 KOps/s 1.2755 KOps/s $\color{#35bf28}+0.57\%$
test_compile_assign_and_add_stack[compile] 0.5299ms 0.4503ms 2.2209 KOps/s 2.2230 KOps/s $\color{#d91a1a}-0.09\%$
test_compile_assign_and_add_stack[eager] 3.6112ms 2.5841ms 386.9759 Ops/s 391.7029 Ops/s $\color{#d91a1a}-1.21\%$
test_compile_indexing[tensor-tensordict-compile] 90.9100μs 36.0378μs 27.7487 KOps/s 27.0949 KOps/s $\color{#35bf28}+2.41\%$
test_compile_indexing[tensor-tensordict-eager] 0.5484ms 33.3181μs 30.0138 KOps/s 29.4987 KOps/s $\color{#35bf28}+1.75\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1011ms 29.0904μs 34.3757 KOps/s 32.0626 KOps/s $\textbf{\color{#35bf28}+7.21\%}$
test_compile_indexing[tensor-tensorclass-eager] 69.7910μs 23.4624μs 42.6213 KOps/s 42.1796 KOps/s $\color{#35bf28}+1.05\%$
test_compile_indexing[tensor-pytree-compile] 99.3960μs 30.3871μs 32.9087 KOps/s 32.1964 KOps/s $\color{#35bf28}+2.21\%$
test_compile_indexing[tensor-pytree-eager] 91.5920μs 23.7806μs 42.0510 KOps/s 42.1952 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_indexing[slice-tensordict-compile] 0.1052ms 52.1408μs 19.1788 KOps/s 19.2238 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_indexing[slice-tensordict-eager] 0.5660ms 20.2335μs 49.4230 KOps/s 48.9278 KOps/s $\color{#35bf28}+1.01\%$
test_compile_indexing[slice-tensorclass-compile] 0.1134ms 44.6351μs 22.4039 KOps/s 21.7814 KOps/s $\color{#35bf28}+2.86\%$
test_compile_indexing[slice-tensorclass-eager] 83.4850μs 18.6986μs 53.4799 KOps/s 52.9760 KOps/s $\color{#35bf28}+0.95\%$
test_compile_indexing[slice-pytree-compile] 0.1049ms 45.7825μs 21.8424 KOps/s 21.6798 KOps/s $\color{#35bf28}+0.75\%$
test_compile_indexing[slice-pytree-eager] 72.6360μs 18.6654μs 53.5751 KOps/s 52.9815 KOps/s $\color{#35bf28}+1.12\%$
test_compile_indexing[int-tensordict-compile] 0.1097ms 53.3576μs 18.7415 KOps/s 18.5446 KOps/s $\color{#35bf28}+1.06\%$
test_compile_indexing[int-tensordict-eager] 0.9487ms 19.8450μs 50.3905 KOps/s 49.5054 KOps/s $\color{#35bf28}+1.79\%$
test_compile_indexing[int-tensorclass-compile] 0.1556ms 45.2566μs 22.0962 KOps/s 21.4267 KOps/s $\color{#35bf28}+3.12\%$
test_compile_indexing[int-tensorclass-eager] 78.2360μs 18.4711μs 54.1386 KOps/s 52.9385 KOps/s $\color{#35bf28}+2.27\%$
test_compile_indexing[int-pytree-compile] 0.1253ms 44.8537μs 22.2947 KOps/s 21.3760 KOps/s $\color{#35bf28}+4.30\%$
test_compile_indexing[int-pytree-eager] 0.8249ms 18.4554μs 54.1848 KOps/s 53.3213 KOps/s $\color{#35bf28}+1.62\%$
test_mod_add[eager] 99.3060μs 33.5517μs 29.8047 KOps/s 29.8641 KOps/s $\color{#d91a1a}-0.20\%$
test_mod_add[compile] 0.1234ms 48.5003μs 20.6184 KOps/s 20.6306 KOps/s $\color{#d91a1a}-0.06\%$
test_mod_add[compile-overhead] 0.1203ms 49.2114μs 20.3205 KOps/s 20.7275 KOps/s $\color{#d91a1a}-1.96\%$
test_mod_wrap[eager] 0.3500ms 0.2207ms 4.5311 KOps/s 4.4463 KOps/s $\color{#35bf28}+1.91\%$
test_mod_wrap[compile] 0.3997ms 0.2111ms 4.7381 KOps/s 4.5772 KOps/s $\color{#35bf28}+3.52\%$
test_mod_wrap[compile-overhead] 0.3817ms 0.2070ms 4.8315 KOps/s 4.7781 KOps/s $\color{#35bf28}+1.12\%$
test_mod_wrap_and_backward[eager] 20.7550ms 12.6583ms 78.9995 Ops/s 84.2877 Ops/s $\textbf{\color{#d91a1a}-6.27\%}$
test_mod_wrap_and_backward[compile] 17.8635ms 12.6071ms 79.3203 Ops/s 87.4856 Ops/s $\textbf{\color{#d91a1a}-9.33\%}$
test_mod_wrap_and_backward[compile-overhead] 16.8940ms 13.3364ms 74.9827 Ops/s 73.4562 Ops/s $\color{#35bf28}+2.08\%$
test_seq_add[eager] 0.2425ms 0.1105ms 9.0523 KOps/s 8.7521 KOps/s $\color{#35bf28}+3.43\%$
test_seq_add[compile] 0.1410ms 63.6004μs 15.7232 KOps/s 15.5645 KOps/s $\color{#35bf28}+1.02\%$
test_seq_add[compile-overhead] 0.1279ms 60.6360μs 16.4919 KOps/s 16.0288 KOps/s $\color{#35bf28}+2.89\%$
test_seq_wrap[eager] 0.5690ms 0.4356ms 2.2957 KOps/s 2.2328 KOps/s $\color{#35bf28}+2.82\%$
test_seq_wrap[compile] 0.4583ms 0.2299ms 4.3496 KOps/s 4.2251 KOps/s $\color{#35bf28}+2.95\%$
test_seq_wrap[compile-overhead] 0.4277ms 0.2270ms 4.4062 KOps/s 4.2615 KOps/s $\color{#35bf28}+3.40\%$
test_func_call_runtime[False-eager] 0.8923ms 0.5482ms 1.8241 KOps/s 1.7428 KOps/s $\color{#35bf28}+4.66\%$
test_func_call_runtime[False-compile] 0.5259ms 0.4222ms 2.3687 KOps/s 2.3252 KOps/s $\color{#35bf28}+1.87\%$
test_func_call_runtime[False-compile-overhead] 0.8112ms 0.4288ms 2.3320 KOps/s 2.3469 KOps/s $\color{#d91a1a}-0.64\%$
test_func_call_runtime[True-eager] 1.0889ms 0.7721ms 1.2952 KOps/s 1.2662 KOps/s $\color{#35bf28}+2.29\%$
test_func_call_runtime[True-compile] 0.6059ms 0.4583ms 2.1821 KOps/s 2.1316 KOps/s $\color{#35bf28}+2.37\%$
test_func_call_runtime[True-compile-overhead] 0.6477ms 0.4583ms 2.1822 KOps/s 2.1326 KOps/s $\color{#35bf28}+2.33\%$
test_func_call_cm_runtime[False-eager] 1.2497ms 0.5431ms 1.8413 KOps/s 1.7516 KOps/s $\textbf{\color{#35bf28}+5.13\%}$
test_func_call_cm_runtime[False-compile] 0.5414ms 0.4204ms 2.3786 KOps/s 2.3409 KOps/s $\color{#35bf28}+1.61\%$
test_func_call_cm_runtime[False-compile-overhead] 0.5443ms 0.4209ms 2.3758 KOps/s 2.3275 KOps/s $\color{#35bf28}+2.07\%$
test_func_call_cm_runtime[True-eager] 1.0508ms 0.9152ms 1.0927 KOps/s 1.0742 KOps/s $\color{#35bf28}+1.72\%$
test_func_call_cm_runtime[True-compile] 0.5865ms 0.4867ms 2.0546 KOps/s 1.9897 KOps/s $\color{#35bf28}+3.26\%$
test_func_call_cm_runtime[True-compile-overhead] 0.6536ms 0.4843ms 2.0650 KOps/s 1.9971 KOps/s $\color{#35bf28}+3.40\%$
test_vmap_func_call_cm_runtime[eager] 2.4372ms 1.9095ms 523.6965 Ops/s 508.5175 Ops/s $\color{#35bf28}+2.98\%$
test_vmap_func_call_cm_runtime[compile] 0.9127ms 0.5228ms 1.9129 KOps/s 1.9040 KOps/s $\color{#35bf28}+0.47\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.8897ms 0.5294ms 1.8889 KOps/s 1.9032 KOps/s $\color{#d91a1a}-0.75\%$
test_distributed 0.2703ms 0.1247ms 8.0221 KOps/s 7.7828 KOps/s $\color{#35bf28}+3.07\%$
test_tdmodule 69.3200μs 24.9635μs 40.0585 KOps/s 39.2775 KOps/s $\color{#35bf28}+1.99\%$
test_tdmodule_dispatch 77.9350μs 46.1653μs 21.6613 KOps/s 21.4891 KOps/s $\color{#35bf28}+0.80\%$
test_tdseq 45.1050μs 27.1566μs 36.8235 KOps/s 35.1758 KOps/s $\color{#35bf28}+4.68\%$
test_tdseq_dispatch 76.8740μs 49.8152μs 20.0742 KOps/s 18.8677 KOps/s $\textbf{\color{#35bf28}+6.39\%}$
test_instantiation_functorch 2.6785ms 1.5525ms 644.1021 Ops/s 644.1201 Ops/s $-0.00\%$
test_exec_functorch 0.3061ms 0.1820ms 5.4939 KOps/s 5.4136 KOps/s $\color{#35bf28}+1.48\%$
test_exec_functional_call 0.3117ms 0.1701ms 5.8781 KOps/s 5.5181 KOps/s $\textbf{\color{#35bf28}+6.52\%}$
test_exec_td_decorator 0.4731ms 0.2299ms 4.3505 KOps/s 4.2465 KOps/s $\color{#35bf28}+2.45\%$
test_vmap_mlp_speed_decorator[True-True] 0.9658ms 0.6533ms 1.5307 KOps/s 1.5454 KOps/s $\color{#d91a1a}-0.95\%$
test_vmap_mlp_speed_decorator[True-False] 1.0666ms 0.6541ms 1.5288 KOps/s 1.5341 KOps/s $\color{#d91a1a}-0.35\%$
test_vmap_mlp_speed_decorator[False-True] 0.8438ms 0.5383ms 1.8576 KOps/s 1.8715 KOps/s $\color{#d91a1a}-0.75\%$
test_vmap_mlp_speed_decorator[False-False] 0.9680ms 0.5359ms 1.8660 KOps/s 1.8709 KOps/s $\color{#d91a1a}-0.26\%$
test_to_module_speed[True] 1.6913ms 1.3340ms 749.6008 Ops/s 729.9320 Ops/s $\color{#35bf28}+2.69\%$
test_to_module_speed[False] 1.8258ms 1.2992ms 769.6971 Ops/s 754.4421 Ops/s $\color{#35bf28}+2.02\%$
test_tc_init 96.4900μs 45.8129μs 21.8279 KOps/s 22.4815 KOps/s $\color{#d91a1a}-2.91\%$
test_tc_init_nested 0.1501ms 91.6613μs 10.9097 KOps/s 11.2932 KOps/s $\color{#d91a1a}-3.40\%$
test_tc_first_layer_tensor 38.5010μs 1.5800μs 632.9303 KOps/s 630.9741 KOps/s $\color{#35bf28}+0.31\%$
test_tc_first_layer_nontensor 27.0910μs 5.1402μs 194.5434 KOps/s 212.4574 KOps/s $\textbf{\color{#d91a1a}-8.43\%}$
test_tc_second_layer_tensor 24.3560μs 2.9292μs 341.3922 KOps/s 347.6049 KOps/s $\color{#d91a1a}-1.79\%$
test_tc_second_layer_nontensor 46.3770μs 6.2856μs 159.0948 KOps/s 166.5600 KOps/s $\color{#d91a1a}-4.48\%$
test_unbind 0.2363s 13.4628ms 74.2787 Ops/s 77.5577 Ops/s $\color{#d91a1a}-4.23\%$
test_full_like 18.0313ms 11.8828ms 84.1552 Ops/s 82.1803 Ops/s $\color{#35bf28}+2.40\%$
test_zeros_like 11.2453ms 7.1377ms 140.1009 Ops/s 126.1127 Ops/s $\textbf{\color{#35bf28}+11.09\%}$
test_ones_like 12.1094ms 7.9260ms 126.1663 Ops/s 130.3149 Ops/s $\color{#d91a1a}-3.18\%$
test_clone 13.2835ms 9.4340ms 105.9993 Ops/s 107.5229 Ops/s $\color{#d91a1a}-1.42\%$
test_squeeze 58.4590μs 11.8234μs 84.5777 KOps/s 81.5354 KOps/s $\color{#35bf28}+3.73\%$
test_unsqueeze 0.2969ms 91.6152μs 10.9152 KOps/s 11.0472 KOps/s $\color{#d91a1a}-1.19\%$
test_split 0.3974ms 0.1999ms 5.0030 KOps/s 5.0856 KOps/s $\color{#d91a1a}-1.62\%$
test_permute 0.2795ms 0.2010ms 4.9754 KOps/s 5.0038 KOps/s $\color{#d91a1a}-0.57\%$
test_stack 29.9623ms 24.9111ms 40.1427 Ops/s 40.3167 Ops/s $\color{#d91a1a}-0.43\%$
test_cat 29.3022ms 24.5323ms 40.7626 Ops/s 40.9552 Ops/s $\color{#d91a1a}-0.47\%$

@github-actions

github-actions bot commented Jan 14, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}33$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 36.7700μs 12.1604μs 82.2343 KOps/s 77.2928 KOps/s $\textbf{\color{#35bf28}+6.39\%}$
test_plain_set_stack_nested 35.1810μs 12.3944μs 80.6819 KOps/s 75.6935 KOps/s $\textbf{\color{#35bf28}+6.59\%}$
test_plain_set_nested_inplace 42.2810μs 13.2189μs 75.6493 KOps/s 70.9914 KOps/s $\textbf{\color{#35bf28}+6.56\%}$
test_plain_set_stack_nested_inplace 50.1610μs 13.3467μs 74.9249 KOps/s 70.7601 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_items 30.2510μs 2.8966μs 345.2309 KOps/s 342.5848 KOps/s $\color{#35bf28}+0.77\%$
test_items_nested 0.4402ms 0.3627ms 2.7567 KOps/s 2.7295 KOps/s $\color{#35bf28}+1.00\%$
test_items_nested_locked 0.4509ms 0.3656ms 2.7355 KOps/s 2.7308 KOps/s $\color{#35bf28}+0.17\%$
test_items_nested_leaf 87.2210μs 59.1260μs 16.9130 KOps/s 17.0656 KOps/s $\color{#d91a1a}-0.89\%$
test_items_stack_nested 0.3990ms 0.3694ms 2.7070 KOps/s 2.6723 KOps/s $\color{#35bf28}+1.30\%$
test_items_stack_nested_leaf 0.1242ms 60.7118μs 16.4713 KOps/s 16.4726 KOps/s $-0.01\%$
test_items_stack_nested_locked 0.4415ms 0.3658ms 2.7334 KOps/s 2.7083 KOps/s $\color{#35bf28}+0.93\%$
test_keys 41.8100μs 3.7139μs 269.2615 KOps/s 282.7091 KOps/s $\color{#d91a1a}-4.76\%$
test_keys_nested 0.1269ms 87.9966μs 11.3641 KOps/s 11.2808 KOps/s $\color{#35bf28}+0.74\%$
test_keys_nested_locked 0.8254ms 94.0174μs 10.6363 KOps/s 10.5535 KOps/s $\color{#35bf28}+0.78\%$
test_keys_nested_leaf 0.1627ms 79.0679μs 12.6474 KOps/s 12.5033 KOps/s $\color{#35bf28}+1.15\%$
test_keys_stack_nested 0.1137ms 89.6862μs 11.1500 KOps/s 11.0618 KOps/s $\color{#35bf28}+0.80\%$
test_keys_stack_nested_leaf 0.1237ms 80.5663μs 12.4121 KOps/s 12.1894 KOps/s $\color{#35bf28}+1.83\%$
test_keys_stack_nested_locked 0.1319ms 94.7110μs 10.5584 KOps/s 10.3595 KOps/s $\color{#35bf28}+1.92\%$
test_values 5.5235μs 0.8525μs 1.1730 MOps/s 1.1747 MOps/s $\color{#d91a1a}-0.14\%$
test_values_nested 0.1193ms 37.4863μs 26.6764 KOps/s 26.5271 KOps/s $\color{#35bf28}+0.56\%$
test_values_nested_locked 74.8110μs 38.7042μs 25.8370 KOps/s 25.3997 KOps/s $\color{#35bf28}+1.72\%$
test_values_nested_leaf 64.8510μs 41.6562μs 24.0060 KOps/s 23.7377 KOps/s $\color{#35bf28}+1.13\%$
test_values_stack_nested 61.7910μs 38.0347μs 26.2918 KOps/s 26.0260 KOps/s $\color{#35bf28}+1.02\%$
test_values_stack_nested_leaf 0.1061ms 42.6442μs 23.4499 KOps/s 23.3623 KOps/s $\color{#35bf28}+0.37\%$
test_values_stack_nested_locked 68.9620μs 39.2769μs 25.4603 KOps/s 25.2499 KOps/s $\color{#35bf28}+0.83\%$
test_membership 1.5646μs 0.5119μs 1.9534 MOps/s 1.9019 MOps/s $\color{#35bf28}+2.71\%$
test_membership_nested 20.5805μs 2.0277μs 493.1772 KOps/s 497.2547 KOps/s $\color{#d91a1a}-0.82\%$
test_membership_nested_leaf 21.8205μs 2.0286μs 492.9475 KOps/s 493.7240 KOps/s $\color{#d91a1a}-0.16\%$
test_membership_stacked_nested 35.5300μs 2.0607μs 485.2644 KOps/s 480.2739 KOps/s $\color{#35bf28}+1.04\%$
test_membership_stacked_nested_leaf 64.3910μs 2.0875μs 479.0488 KOps/s 483.7824 KOps/s $\color{#d91a1a}-0.98\%$
test_membership_nested_last 35.2310μs 3.1296μs 319.5277 KOps/s 322.5093 KOps/s $\color{#d91a1a}-0.92\%$
test_membership_nested_leaf_last 44.5310μs 3.1521μs 317.2441 KOps/s 328.3025 KOps/s $\color{#d91a1a}-3.37\%$
test_membership_stacked_nested_last 60.4210μs 3.6421μs 274.5690 KOps/s 241.0404 KOps/s $\textbf{\color{#35bf28}+13.91\%}$
test_membership_stacked_nested_leaf_last 34.9610μs 3.6383μs 274.8567 KOps/s 238.1124 KOps/s $\textbf{\color{#35bf28}+15.43\%}$
test_nested_getleaf 34.1910μs 6.1620μs 162.2851 KOps/s 163.6503 KOps/s $\color{#d91a1a}-0.83\%$
test_nested_get 41.5600μs 5.8230μs 171.7315 KOps/s 172.0856 KOps/s $\color{#d91a1a}-0.21\%$
test_stacked_getleaf 32.1300μs 6.1384μs 162.9089 KOps/s 163.6275 KOps/s $\color{#d91a1a}-0.44\%$
test_stacked_get 64.9610μs 5.8244μs 171.6925 KOps/s 172.3025 KOps/s $\color{#d91a1a}-0.35\%$
test_nested_getitemleaf 38.0310μs 6.4414μs 155.2463 KOps/s 156.9865 KOps/s $\color{#d91a1a}-1.11\%$
test_nested_getitem 26.9210μs 6.2150μs 160.9007 KOps/s 164.6507 KOps/s $\color{#d91a1a}-2.28\%$
test_stacked_getitemleaf 29.0500μs 6.4562μs 154.8896 KOps/s 155.5357 KOps/s $\color{#d91a1a}-0.42\%$
test_stacked_getitem 67.1820μs 6.1159μs 163.5095 KOps/s 164.1392 KOps/s $\color{#d91a1a}-0.38\%$
test_lock_nested 0.7300ms 0.3727ms 2.6834 KOps/s 2.6181 KOps/s $\color{#35bf28}+2.50\%$
test_lock_stack_nested 0.3839ms 0.3461ms 2.8897 KOps/s 2.8986 KOps/s $\color{#d91a1a}-0.31\%$
test_unlock_nested 0.6389ms 0.3183ms 3.1419 KOps/s 3.1908 KOps/s $\color{#d91a1a}-1.53\%$
test_unlock_stack_nested 0.3136ms 0.2847ms 3.5128 KOps/s 3.5395 KOps/s $\color{#d91a1a}-0.76\%$
test_flatten_speed 0.1529ms 76.6164μs 13.0520 KOps/s 13.1478 KOps/s $\color{#d91a1a}-0.73\%$
test_unflatten_speed 0.4321ms 0.3163ms 3.1611 KOps/s 3.0934 KOps/s $\color{#35bf28}+2.19\%$
test_common_ops 1.6097ms 0.5993ms 1.6685 KOps/s 1.5647 KOps/s $\textbf{\color{#35bf28}+6.63\%}$
test_creation 0.1061ms 1.7548μs 569.8675 KOps/s 564.8825 KOps/s $\color{#35bf28}+0.88\%$
test_creation_empty 1.4018ms 8.4065μs 118.9549 KOps/s 100.5220 KOps/s $\textbf{\color{#35bf28}+18.34\%}$
test_creation_nested_1 34.5010μs 10.0227μs 99.7735 KOps/s 85.5179 KOps/s $\textbf{\color{#35bf28}+16.67\%}$
test_creation_nested_2 35.4000μs 12.8992μs 77.5240 KOps/s 70.4418 KOps/s $\textbf{\color{#35bf28}+10.05\%}$
test_clone 0.1185ms 10.2422μs 97.6357 KOps/s 98.7538 KOps/s $\color{#d91a1a}-1.13\%$
test_getitem[int] 1.1996ms 11.0342μs 90.6274 KOps/s 93.2100 KOps/s $\color{#d91a1a}-2.77\%$
test_getitem[slice_int] 0.1138ms 21.2509μs 47.0568 KOps/s 47.7115 KOps/s $\color{#d91a1a}-1.37\%$
test_getitem[range] 0.1308ms 36.9925μs 27.0325 KOps/s 26.7071 KOps/s $\color{#35bf28}+1.22\%$
test_getitem[tuple] 0.1098ms 18.4658μs 54.1542 KOps/s 55.0269 KOps/s $\color{#d91a1a}-1.59\%$
test_getitem[list] 0.3401ms 32.6327μs 30.6441 KOps/s 30.3330 KOps/s $\color{#35bf28}+1.03\%$
test_setitem_dim[int] 39.7710μs 18.9138μs 52.8714 KOps/s 54.3547 KOps/s $\color{#d91a1a}-2.73\%$
test_setitem_dim[slice_int] 60.4510μs 37.5831μs 26.6077 KOps/s 26.3350 KOps/s $\color{#35bf28}+1.04\%$
test_setitem_dim[range] 81.2110μs 52.2510μs 19.1384 KOps/s 18.8594 KOps/s $\color{#35bf28}+1.48\%$
test_setitem_dim[tuple] 53.1210μs 31.0818μs 32.1732 KOps/s 30.6896 KOps/s $\color{#35bf28}+4.83\%$
test_setitem 0.1020ms 14.2883μs 69.9871 KOps/s 64.4163 KOps/s $\textbf{\color{#35bf28}+8.65\%}$
test_set 0.1126ms 13.9216μs 71.8309 KOps/s 66.6573 KOps/s $\textbf{\color{#35bf28}+7.76\%}$
test_set_shared 1.7834ms 0.1509ms 6.6277 KOps/s 6.5484 KOps/s $\color{#35bf28}+1.21\%$
test_update 0.3057ms 17.1600μs 58.2750 KOps/s 52.9868 KOps/s $\textbf{\color{#35bf28}+9.98\%}$
test_update_nested 0.1084ms 22.3174μs 44.8081 KOps/s 41.2323 KOps/s $\textbf{\color{#35bf28}+8.67\%}$
test_update__nested 0.4617ms 24.6946μs 40.4947 KOps/s 40.6581 KOps/s $\color{#d91a1a}-0.40\%$
test_set_nested 97.2510μs 15.3936μs 64.9621 KOps/s 61.2502 KOps/s $\textbf{\color{#35bf28}+6.06\%}$
test_set_nested_new 0.1166ms 17.7337μs 56.3900 KOps/s 53.1964 KOps/s $\textbf{\color{#35bf28}+6.00\%}$
test_select 0.2180ms 29.9161μs 33.4268 KOps/s 32.5586 KOps/s $\color{#35bf28}+2.67\%$
test_select_nested 75.8010μs 43.4362μs 23.0223 KOps/s 22.7952 KOps/s $\color{#35bf28}+1.00\%$
test_exclude_nested 0.1377ms 61.9355μs 16.1458 KOps/s 15.6175 KOps/s $\color{#35bf28}+3.38\%$
test_empty[True] 0.3482ms 0.2935ms 3.4069 KOps/s 3.4094 KOps/s $\color{#d91a1a}-0.08\%$
test_empty[False] 3.2061μs 0.8175μs 1.2232 MOps/s 1.2009 MOps/s $\color{#35bf28}+1.86\%$
test_to 86.3410μs 56.7723μs 17.6142 KOps/s 17.9021 KOps/s $\color{#d91a1a}-1.61\%$
test_to_nonblocking 83.1310μs 48.0174μs 20.8258 KOps/s 20.8741 KOps/s $\color{#d91a1a}-0.23\%$
test_unbind_speed 1.5898ms 0.2401ms 4.1641 KOps/s 4.2174 KOps/s $\color{#d91a1a}-1.26\%$
test_unbind_speed_stack0 0.3007ms 0.2384ms 4.1952 KOps/s 4.1174 KOps/s $\color{#35bf28}+1.89\%$
test_unbind_speed_stack1 95.4108ms 0.6747ms 1.4822 KOps/s 1.4895 KOps/s $\color{#d91a1a}-0.49\%$
test_split 95.8192ms 1.6130ms 619.9460 Ops/s 628.2366 Ops/s $\color{#d91a1a}-1.32\%$
test_chunk 97.7415ms 1.6082ms 621.8095 Ops/s 633.5333 Ops/s $\color{#d91a1a}-1.85\%$
test_consolidate[False-None] 0.1003s 2.9850ms 335.0127 Ops/s 335.7523 Ops/s $\color{#d91a1a}-0.22\%$
test_consolidate[default-None] 2.6830ms 1.7078ms 585.5547 Ops/s 585.5819 Ops/s $-0.00\%$
test_consolidate[reduce-overhead-None] 1.8348ms 1.7604ms 568.0394 Ops/s 571.1071 Ops/s $\color{#d91a1a}-0.54\%$
test_consolidate_njt[False-None] 7.0287ms 6.5763ms 152.0604 Ops/s 150.4412 Ops/s $\color{#35bf28}+1.08\%$
test_to[False-False-None] 1.7675ms 1.6942ms 590.2374 Ops/s 584.9811 Ops/s $\color{#35bf28}+0.90\%$
test_to[True-False-None] 1.5924ms 1.3354ms 748.8619 Ops/s 735.2609 Ops/s $\color{#35bf28}+1.85\%$
test_to[within-False-None] 4.4522ms 4.1478ms 241.0901 Ops/s 238.5639 Ops/s $\color{#35bf28}+1.06\%$
test_to[True-default-None] 5.6983ms 5.3044ms 188.5217 Ops/s 186.6576 Ops/s $\color{#35bf28}+1.00\%$
test_to_njt[False-False-None] 7.1147ms 6.9354ms 144.1885 Ops/s 143.1087 Ops/s $\color{#35bf28}+0.75\%$
test_to_njt[True-False-None] 5.7031ms 5.5007ms 181.7962 Ops/s 181.4035 Ops/s $\color{#35bf28}+0.22\%$
test_to_njt[within-False-None] 12.4025ms 12.2847ms 81.4022 Ops/s 80.2353 Ops/s $\color{#35bf28}+1.45\%$
test_creation[device0] 0.4664ms 80.5937μs 12.4079 KOps/s 12.3514 KOps/s $\color{#35bf28}+0.46\%$
test_creation_from_tensor 0.6318ms 83.5315μs 11.9715 KOps/s 11.8743 KOps/s $\color{#35bf28}+0.82\%$
test_add_one[memmap_tensor0] 0.4221ms 6.3898μs 156.4995 KOps/s 156.4370 KOps/s $\color{#35bf28}+0.04\%$
test_contiguous[memmap_tensor0] 1.7776μs 0.4309μs 2.3209 MOps/s 2.3168 MOps/s $\color{#35bf28}+0.18\%$
test_stack[memmap_tensor0] 20.3800μs 4.6408μs 215.4812 KOps/s 222.7217 KOps/s $\color{#d91a1a}-3.25\%$
test_memmaptd_index 1.3680ms 0.2604ms 3.8398 KOps/s 3.9264 KOps/s $\color{#d91a1a}-2.21\%$
test_memmaptd_index_astensor 0.5887ms 0.3243ms 3.0831 KOps/s 3.1749 KOps/s $\color{#d91a1a}-2.89\%$
test_memmaptd_index_op 1.0458ms 0.5847ms 1.7103 KOps/s 1.6701 KOps/s $\color{#35bf28}+2.41\%$
test_serialize_model 0.1326s 0.1317s 7.5927 Ops/s 7.6200 Ops/s $\color{#d91a1a}-0.36\%$
test_serialize_model_pickle 1.3492s 1.2106s 0.8260 Ops/s 0.8259 Ops/s $\color{#35bf28}+0.02\%$
test_serialize_weights 0.1321s 0.1304s 7.6689 Ops/s 7.6726 Ops/s $\color{#d91a1a}-0.05\%$
test_serialize_weights_returnearly 0.3428s 53.8079ms 18.5846 Ops/s 14.4342 Ops/s $\textbf{\color{#35bf28}+28.75\%}$
test_serialize_weights_pickle 1.3458s 1.1860s 0.8432 Ops/s 0.7953 Ops/s $\textbf{\color{#35bf28}+6.03\%}$
test_reshape_pytree 62.1910μs 22.9149μs 43.6398 KOps/s 44.2175 KOps/s $\color{#d91a1a}-1.31\%$
test_reshape_td 63.6010μs 26.5714μs 37.6345 KOps/s 36.0599 KOps/s $\color{#35bf28}+4.37\%$
test_view_pytree 0.1207ms 22.1478μs 45.1513 KOps/s 45.1227 KOps/s $\color{#35bf28}+0.06\%$
test_view_td 59.6910μs 31.2538μs 31.9961 KOps/s 30.8098 KOps/s $\color{#35bf28}+3.85\%$
test_unbind_pytree 0.1522ms 28.3553μs 35.2667 KOps/s 35.2425 KOps/s $\color{#35bf28}+0.07\%$
test_unbind_td 0.7515ms 37.0397μs 26.9981 KOps/s 27.3447 KOps/s $\color{#d91a1a}-1.27\%$
test_split_pytree 57.0210μs 30.6438μs 32.6330 KOps/s 31.8864 KOps/s $\color{#35bf28}+2.34\%$
test_split_td 1.0053ms 39.7555μs 25.1537 KOps/s 24.9813 KOps/s $\color{#35bf28}+0.69\%$
test_add_pytree 61.8510μs 33.6762μs 29.6946 KOps/s 29.9606 KOps/s $\color{#d91a1a}-0.89\%$
test_add_td 0.1802ms 47.6207μs 20.9993 KOps/s 19.3273 KOps/s $\textbf{\color{#35bf28}+8.65\%}$
test_compile_add_one_nested[tensordict-compile] 0.1757ms 0.1218ms 8.2123 KOps/s 7.9854 KOps/s $\color{#35bf28}+2.84\%$
test_compile_add_one_nested[tensordict-eager] 0.2843ms 0.1325ms 7.5452 KOps/s 7.3881 KOps/s $\color{#35bf28}+2.13\%$
test_compile_add_one_nested[pytree-compile] 0.1472ms 98.6278μs 10.1391 KOps/s 10.3441 KOps/s $\color{#d91a1a}-1.98\%$
test_compile_add_one_nested[pytree-eager] 1.6285ms 0.1505ms 6.6465 KOps/s 6.8249 KOps/s $\color{#d91a1a}-2.61\%$
test_compile_copy_nested[tensordict-compile] 96.7610μs 22.3815μs 44.6798 KOps/s 44.6272 KOps/s $\color{#35bf28}+0.12\%$
test_compile_copy_nested[tensordict-eager] 82.9920μs 29.6450μs 33.7325 KOps/s 33.0964 KOps/s $\color{#35bf28}+1.92\%$
test_compile_copy_nested[pytree-compile] 0.4085ms 65.1945μs 15.3387 KOps/s 15.0402 KOps/s $\color{#35bf28}+1.98\%$
test_compile_copy_nested[pytree-eager] 83.6310μs 49.7392μs 20.1049 KOps/s 19.8329 KOps/s $\color{#35bf28}+1.37\%$
test_compile_add_one_flat[tensordict-compile] 0.1824ms 0.1420ms 7.0419 KOps/s 6.9144 KOps/s $\color{#35bf28}+1.84\%$
test_compile_add_one_flat[tensordict-eager] 0.3106ms 0.2165ms 4.6191 KOps/s 4.6053 KOps/s $\color{#35bf28}+0.30\%$
test_compile_add_one_flat[tensorclass-compile] 0.1854ms 99.1465μs 10.0861 KOps/s 10.1742 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_add_one_flat[tensorclass-eager] 0.1150ms 56.3841μs 17.7355 KOps/s 17.8347 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_add_one_flat[pytree-compile] 0.1738ms 0.1354ms 7.3844 KOps/s 7.3848 KOps/s $-0.01\%$
test_compile_add_one_flat[pytree-eager] 0.5729ms 0.4789ms 2.0880 KOps/s 2.1430 KOps/s $\color{#d91a1a}-2.57\%$
test_compile_add_self_flat[tensordict-eager] 0.3816ms 0.2608ms 3.8348 KOps/s 3.8593 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_add_self_flat[tensordict-compile] 0.2022ms 0.1497ms 6.6784 KOps/s 7.0389 KOps/s $\textbf{\color{#d91a1a}-5.12\%}$
test_compile_add_self_flat[tensorclass-eager] 0.1696ms 70.0694μs 14.2716 KOps/s 14.5317 KOps/s $\color{#d91a1a}-1.79\%$
test_compile_add_self_flat[tensorclass-compile] 0.1510ms 0.1064ms 9.4023 KOps/s 10.1703 KOps/s $\textbf{\color{#d91a1a}-7.55\%}$
test_compile_add_self_flat[pytree-eager] 0.5538ms 0.4089ms 2.4455 KOps/s 2.4698 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_add_self_flat[pytree-compile] 0.1756ms 0.1369ms 7.3058 KOps/s 7.4310 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_copy_flat[tensordict-compile] 79.6710μs 19.2872μs 51.8479 KOps/s 55.0637 KOps/s $\textbf{\color{#d91a1a}-5.84\%}$
test_compile_copy_flat[tensordict-eager] 96.5020μs 30.8022μs 32.4652 KOps/s 31.6246 KOps/s $\color{#35bf28}+2.66\%$
test_compile_copy_flat[pytree-compile] 0.1053ms 70.4879μs 14.1868 KOps/s 13.9578 KOps/s $\color{#35bf28}+1.64\%$
test_compile_copy_flat[pytree-eager] 0.1373ms 52.1730μs 19.1670 KOps/s 19.2130 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_assign_and_add[tensordict-compile] 1.6349ms 0.3935ms 2.5414 KOps/s 2.1919 KOps/s $\textbf{\color{#35bf28}+15.94\%}$
test_compile_assign_and_add[tensordict-eager] 2.6887ms 2.6134ms 382.6433 Ops/s 383.4553 Ops/s $\color{#d91a1a}-0.21\%$
test_compile_assign_and_add[pytree-compile] 1.5948ms 0.4353ms 2.2975 KOps/s 2.1218 KOps/s $\textbf{\color{#35bf28}+8.28\%}$
test_compile_assign_and_add[pytree-eager] 2.7371ms 2.6327ms 379.8401 Ops/s 381.5021 Ops/s $\color{#d91a1a}-0.44\%$
test_compile_indexing[tensor-tensordict-compile] 0.1635ms 0.1117ms 8.9493 KOps/s 8.4632 KOps/s $\textbf{\color{#35bf28}+5.74\%}$
test_compile_indexing[tensor-tensordict-eager] 0.5511ms 77.9309μs 12.8319 KOps/s 12.3661 KOps/s $\color{#35bf28}+3.77\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1705ms 0.1046ms 9.5589 KOps/s 9.5692 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1106ms 67.2966μs 14.8596 KOps/s 14.4228 KOps/s $\color{#35bf28}+3.03\%$
test_compile_indexing[tensor-pytree-compile] 0.1521ms 0.1056ms 9.4674 KOps/s 9.5142 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_indexing[tensor-pytree-eager] 0.1162ms 66.9258μs 14.9419 KOps/s 14.8972 KOps/s $\color{#35bf28}+0.30\%$
test_compile_indexing[slice-tensordict-compile] 0.2335ms 0.1031ms 9.7005 KOps/s 9.9924 KOps/s $\color{#d91a1a}-2.92\%$
test_compile_indexing[slice-tensordict-eager] 0.1677ms 17.4747μs 57.2255 KOps/s 57.5534 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_indexing[slice-tensorclass-compile] 0.1355ms 96.0504μs 10.4112 KOps/s 10.2783 KOps/s $\color{#35bf28}+1.29\%$
test_compile_indexing[slice-tensorclass-eager] 57.0610μs 16.0511μs 62.3011 KOps/s 63.4290 KOps/s $\color{#d91a1a}-1.78\%$
test_compile_indexing[slice-pytree-compile] 0.1443ms 96.5170μs 10.3609 KOps/s 10.0928 KOps/s $\color{#35bf28}+2.66\%$
test_compile_indexing[slice-pytree-eager] 0.1150ms 15.8779μs 62.9808 KOps/s 63.5312 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_indexing[int-tensordict-compile] 0.1440ms 0.1001ms 9.9858 KOps/s 9.4654 KOps/s $\textbf{\color{#35bf28}+5.50\%}$
test_compile_indexing[int-tensordict-eager] 0.5718ms 17.2797μs 57.8714 KOps/s 59.1036 KOps/s $\color{#d91a1a}-2.08\%$
test_compile_indexing[int-tensorclass-compile] 0.1798ms 96.8586μs 10.3243 KOps/s 10.2614 KOps/s $\color{#35bf28}+0.61\%$
test_compile_indexing[int-tensorclass-eager] 43.7710μs 15.8422μs 63.1225 KOps/s 64.0035 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_indexing[int-pytree-compile] 0.1395ms 96.6823μs 10.3432 KOps/s 10.2593 KOps/s $\color{#35bf28}+0.82\%$
test_compile_indexing[int-pytree-eager] 47.9610μs 15.7370μs 63.5445 KOps/s 63.1236 KOps/s $\color{#35bf28}+0.67\%$
test_mod_add[eager] 0.1300ms 38.4616μs 25.9999 KOps/s 25.6554 KOps/s $\color{#35bf28}+1.34\%$
test_mod_add[compile] 0.1268ms 82.4111μs 12.1343 KOps/s 11.7979 KOps/s $\color{#35bf28}+2.85\%$
test_mod_add[compile-overhead] 0.3182ms 0.1674ms 5.9742 KOps/s 5.6789 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_mod_wrap[eager] 0.3306ms 0.2521ms 3.9660 KOps/s 3.8181 KOps/s $\color{#35bf28}+3.87\%$
test_mod_wrap[compile] 0.3374ms 0.2843ms 3.5177 KOps/s 3.4913 KOps/s $\color{#35bf28}+0.75\%$
test_mod_wrap[compile-overhead] 7.0129ms 3.7234ms 268.5737 Ops/s 271.5757 Ops/s $\color{#d91a1a}-1.11\%$
test_mod_wrap_and_backward[eager] 1.4668ms 1.3280ms 753.0186 Ops/s 709.5403 Ops/s $\textbf{\color{#35bf28}+6.13\%}$
test_mod_wrap_and_backward[compile] 1.3594ms 1.2742ms 784.7818 Ops/s 730.8724 Ops/s $\textbf{\color{#35bf28}+7.38\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3589ms 0.9206ms 1.0862 KOps/s 907.8000 Ops/s $\textbf{\color{#35bf28}+19.65\%}$
test_seq_add[eager] 0.2486ms 0.1165ms 8.5874 KOps/s 8.4640 KOps/s $\color{#35bf28}+1.46\%$
test_seq_add[compile] 0.1356ms 88.8900μs 11.2499 KOps/s 11.1793 KOps/s $\color{#35bf28}+0.63\%$
test_seq_add[compile-overhead] 0.1760ms 0.1298ms 7.7049 KOps/s 7.7251 KOps/s $\color{#d91a1a}-0.26\%$
test_seq_wrap[eager] 0.4769ms 0.4117ms 2.4291 KOps/s 2.3596 KOps/s $\color{#35bf28}+2.95\%$
test_seq_wrap[compile] 0.3514ms 0.3019ms 3.3120 KOps/s 3.3144 KOps/s $\color{#d91a1a}-0.07\%$
test_seq_wrap[compile-overhead] 0.3075ms 0.2245ms 4.4539 KOps/s 4.4037 KOps/s $\color{#35bf28}+1.14\%$
test_func_call_runtime[False-eager] 0.7886ms 0.7146ms 1.3995 KOps/s 1.3596 KOps/s $\color{#35bf28}+2.93\%$
test_func_call_runtime[False-compile] 0.8074ms 0.7540ms 1.3262 KOps/s 1.3451 KOps/s $\color{#d91a1a}-1.40\%$
test_func_call_runtime[False-compile-overhead] 0.4240ms 0.3661ms 2.7314 KOps/s 2.7109 KOps/s $\color{#35bf28}+0.75\%$
test_func_call_runtime[True-eager] 0.9428ms 0.8697ms 1.1498 KOps/s 1.1456 KOps/s $\color{#35bf28}+0.37\%$
test_func_call_runtime[True-compile] 0.8289ms 0.7747ms 1.2908 KOps/s 1.3083 KOps/s $\color{#d91a1a}-1.34\%$
test_func_call_runtime[True-compile-overhead] 0.4478ms 0.3870ms 2.5843 KOps/s 2.5818 KOps/s $\color{#35bf28}+0.10\%$
test_func_call_cm_runtime[False-eager] 0.8085ms 0.7127ms 1.4030 KOps/s 1.4136 KOps/s $\color{#d91a1a}-0.75\%$
test_func_call_cm_runtime[False-compile] 0.8052ms 0.7570ms 1.3210 KOps/s 1.3252 KOps/s $\color{#d91a1a}-0.32\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4376ms 0.3676ms 2.7206 KOps/s 2.7130 KOps/s $\color{#35bf28}+0.28\%$
test_func_call_cm_runtime[True-eager] 1.0664ms 0.9818ms 1.0185 KOps/s 999.9982 Ops/s $\color{#35bf28}+1.85\%$
test_func_call_cm_runtime[True-compile] 0.9481ms 0.8013ms 1.2479 KOps/s 1.2568 KOps/s $\color{#d91a1a}-0.71\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5597ms 0.4115ms 2.4303 KOps/s 2.4129 KOps/s $\color{#35bf28}+0.72\%$
test_vmap_func_call_cm_runtime[eager] 2.4839ms 2.0239ms 494.0973 Ops/s 492.1497 Ops/s $\color{#35bf28}+0.40\%$
test_vmap_func_call_cm_runtime[compile] 0.9538ms 0.8186ms 1.2216 KOps/s 1.2297 KOps/s $\color{#d91a1a}-0.66\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5218ms 0.4140ms 2.4153 KOps/s 2.4009 KOps/s $\color{#35bf28}+0.60\%$
test_distributed 0.6819ms 0.1624ms 6.1583 KOps/s 8.5343 KOps/s $\textbf{\color{#d91a1a}-27.84\%}$
test_tdmodule 0.1675ms 20.7283μs 48.2433 KOps/s 47.4886 KOps/s $\color{#35bf28}+1.59\%$
test_tdmodule_dispatch 70.0910μs 36.3810μs 27.4869 KOps/s 26.4004 KOps/s $\color{#35bf28}+4.12\%$
test_tdseq 32.6110μs 20.8580μs 47.9433 KOps/s 45.4092 KOps/s $\textbf{\color{#35bf28}+5.58\%}$
test_tdseq_dispatch 59.7210μs 38.7125μs 25.8315 KOps/s 24.2046 KOps/s $\textbf{\color{#35bf28}+6.72\%}$
test_instantiation_functorch 1.6062ms 1.5420ms 648.5210 Ops/s 639.9389 Ops/s $\color{#35bf28}+1.34\%$
test_exec_functorch 0.2084ms 0.1408ms 7.1024 KOps/s 7.0790 KOps/s $\color{#35bf28}+0.33\%$
test_exec_functional_call 0.2012ms 0.1314ms 7.6075 KOps/s 7.4408 KOps/s $\color{#35bf28}+2.24\%$
test_exec_td_decorator 0.3685ms 0.1787ms 5.5945 KOps/s 5.4593 KOps/s $\color{#35bf28}+2.48\%$
test_vmap_mlp_speed_decorator[True-True] 0.8233ms 0.6674ms 1.4983 KOps/s 1.4942 KOps/s $\color{#35bf28}+0.27\%$
test_vmap_mlp_speed_decorator[True-False] 0.8346ms 0.6701ms 1.4922 KOps/s 1.5016 KOps/s $\color{#d91a1a}-0.62\%$
test_vmap_mlp_speed_decorator[False-True] 0.7217ms 0.5821ms 1.7180 KOps/s 1.7452 KOps/s $\color{#d91a1a}-1.55\%$
test_vmap_mlp_speed_decorator[False-False] 0.7486ms 0.5881ms 1.7005 KOps/s 1.7470 KOps/s $\color{#d91a1a}-2.66\%$
test_vmap_transformer_speed_decorator[True-True] 19.2999ms 18.6852ms 53.5182 Ops/s 53.9437 Ops/s $\color{#d91a1a}-0.79\%$
test_vmap_transformer_speed_decorator[True-False] 19.3566ms 18.7070ms 53.4558 Ops/s 53.9514 Ops/s $\color{#d91a1a}-0.92\%$
test_vmap_transformer_speed_decorator[False-True] 18.8087ms 18.6279ms 53.6830 Ops/s 54.5233 Ops/s $\color{#d91a1a}-1.54\%$
test_vmap_transformer_speed_decorator[False-False] 18.7740ms 18.6011ms 53.7602 Ops/s 54.3385 Ops/s $\color{#d91a1a}-1.06\%$
test_to_module_speed[True] 1.0653ms 0.9736ms 1.0272 KOps/s 1.0364 KOps/s $\color{#d91a1a}-0.89\%$
test_to_module_speed[False] 1.3530ms 0.9653ms 1.0359 KOps/s 1.0441 KOps/s $\color{#d91a1a}-0.78\%$
test_tc_init 76.8120μs 35.2068μs 28.4036 KOps/s 26.6669 KOps/s $\textbf{\color{#35bf28}+6.51\%}$
test_tc_init_nested 0.2397ms 69.5196μs 14.3844 KOps/s 13.2001 KOps/s $\textbf{\color{#35bf28}+8.97\%}$
test_tc_first_layer_tensor 13.4973μs 0.7143μs 1.4000 MOps/s 1.2071 MOps/s $\textbf{\color{#35bf28}+15.98\%}$
test_tc_first_layer_nontensor 67.1910μs 2.3070μs 433.4695 KOps/s 434.9652 KOps/s $\color{#d91a1a}-0.34\%$
test_tc_second_layer_tensor 47.7973μs 1.4370μs 695.9178 KOps/s 694.3641 KOps/s $\color{#35bf28}+0.22\%$
test_tc_second_layer_nontensor 36.2310μs 3.0287μs 330.1780 KOps/s 331.9329 KOps/s $\color{#d91a1a}-0.53\%$
test_unbind 0.2300s 10.2275ms 97.7758 Ops/s 140.4178 Ops/s $\textbf{\color{#d91a1a}-30.37\%}$
test_full_like 10.9462ms 10.0337ms 99.6638 Ops/s 103.9180 Ops/s $\color{#d91a1a}-4.09\%$
test_zeros_like 9.3303ms 7.3134ms 136.7358 Ops/s 138.9037 Ops/s $\color{#d91a1a}-1.56\%$
test_ones_like 5.0610ms 4.3868ms 227.9588 Ops/s 227.4273 Ops/s $\color{#35bf28}+0.23\%$
test_clone 12.1982ms 9.4718ms 105.5766 Ops/s 147.4653 Ops/s $\textbf{\color{#d91a1a}-28.41\%}$
test_squeeze 54.5110μs 9.8328μs 101.7006 KOps/s 103.3615 KOps/s $\color{#d91a1a}-1.61\%$
test_unsqueeze 0.1935ms 73.2570μs 13.6506 KOps/s 13.3855 KOps/s $\color{#35bf28}+1.98\%$
test_split 0.3924ms 0.1598ms 6.2561 KOps/s 5.8446 KOps/s $\textbf{\color{#35bf28}+7.04\%}$
test_permute 0.3209ms 0.1802ms 5.5504 KOps/s 5.5609 KOps/s $\color{#d91a1a}-0.19\%$
test_stack 52.0102ms 50.5134ms 19.7967 Ops/s 19.5183 Ops/s $\color{#35bf28}+1.43\%$
test_cat 51.8713ms 51.3347ms 19.4800 Ops/s 19.5330 Ops/s $\color{#d91a1a}-0.27\%$

@vmoens vmoens force-pushed the composite_lp_aggregate branch from 8c1335e to 04e1c1d Compare January 14, 2025 15:24
@vmoens vmoens merged commit 0013e38 into main Jan 15, 2025
45 of 54 checks passed