@vmoens
Collaborator

@vmoens vmoens commented Sep 2, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Sep 2, 2024
ghstack-source-id: ea5b148
Pull Request resolved: #977
@facebook-github-bot added the "CLA Signed" label (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.) Sep 2, 2024
@vmoens added the "bug" label (Something isn't working) Sep 2, 2024
@github-actions

github-actions bot commented Sep 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}27$. Worsened: $\large\color{#d91a1a}12$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 46.8180μs 19.5126μs 51.2490 KOps/s 50.0081 KOps/s $\color{#35bf28}+2.48\%$
test_plain_set_stack_nested 54.0510μs 19.7253μs 50.6963 KOps/s 49.8430 KOps/s $\color{#35bf28}+1.71\%$
test_plain_set_nested_inplace 60.6830μs 21.2721μs 47.0099 KOps/s 46.5922 KOps/s $\color{#35bf28}+0.90\%$
test_plain_set_stack_nested_inplace 86.5040μs 21.0852μs 47.4265 KOps/s 46.6928 KOps/s $\color{#35bf28}+1.57\%$
test_items 21.0100μs 4.1907μs 238.6227 KOps/s 239.6746 KOps/s $\color{#d91a1a}-0.44\%$
test_items_nested 0.5652ms 0.3293ms 3.0369 KOps/s 3.0697 KOps/s $\color{#d91a1a}-1.07\%$
test_items_nested_locked 0.4029ms 0.3247ms 3.0799 KOps/s 3.0585 KOps/s $\color{#35bf28}+0.70\%$
test_items_nested_leaf 0.1328ms 83.5312μs 11.9716 KOps/s 11.6238 KOps/s $\color{#35bf28}+2.99\%$
test_items_stack_nested 0.5990ms 0.3296ms 3.0343 KOps/s 2.9945 KOps/s $\color{#35bf28}+1.33\%$
test_items_stack_nested_leaf 0.1513ms 83.8549μs 11.9254 KOps/s 12.1050 KOps/s $\color{#d91a1a}-1.48\%$
test_items_stack_nested_locked 0.5043ms 0.3306ms 3.0252 KOps/s 3.0230 KOps/s $\color{#35bf28}+0.07\%$
test_keys 21.4610μs 3.5083μs 285.0424 KOps/s 281.6120 KOps/s $\color{#35bf28}+1.22\%$
test_keys_nested 0.1366ms 97.0465μs 10.3043 KOps/s 10.0766 KOps/s $\color{#35bf28}+2.26\%$
test_keys_nested_locked 1.7037ms 0.1017ms 9.8343 KOps/s 9.7956 KOps/s $\color{#35bf28}+0.40\%$
test_keys_nested_leaf 0.1280ms 79.3001μs 12.6103 KOps/s 12.0996 KOps/s $\color{#35bf28}+4.22\%$
test_keys_stack_nested 0.1458ms 95.6665μs 10.4530 KOps/s 10.5433 KOps/s $\color{#d91a1a}-0.86\%$
test_keys_stack_nested_leaf 0.1376ms 79.8808μs 12.5187 KOps/s 12.6677 KOps/s $\color{#d91a1a}-1.18\%$
test_keys_stack_nested_locked 0.1449ms 0.1002ms 9.9788 KOps/s 10.0174 KOps/s $\color{#d91a1a}-0.39\%$
test_values 4.9314μs 1.0635μs 940.3109 KOps/s 879.4138 KOps/s $\textbf{\color{#35bf28}+6.92\%}$
test_values_nested 96.6310μs 48.0694μs 20.8032 KOps/s 20.9666 KOps/s $\color{#d91a1a}-0.78\%$
test_values_nested_locked 99.0050μs 48.4938μs 20.6212 KOps/s 20.8103 KOps/s $\color{#d91a1a}-0.91\%$
test_values_nested_leaf 0.1324ms 42.7224μs 23.4069 KOps/s 23.4363 KOps/s $\color{#d91a1a}-0.13\%$
test_values_stack_nested 0.1013ms 48.4717μs 20.6306 KOps/s 20.8185 KOps/s $\color{#d91a1a}-0.90\%$
test_values_stack_nested_leaf 78.8680μs 42.0250μs 23.7954 KOps/s 24.3658 KOps/s $\color{#d91a1a}-2.34\%$
test_values_stack_nested_locked 96.1100μs 48.7045μs 20.5320 KOps/s 20.7483 KOps/s $\color{#d91a1a}-1.04\%$
test_membership 15.1890μs 0.8417μs 1.1881 MOps/s 1.4358 MOps/s $\textbf{\color{#d91a1a}-17.25\%}$
test_membership_nested 22.7920μs 2.6507μs 377.2632 KOps/s 382.8921 KOps/s $\color{#d91a1a}-1.47\%$
test_membership_nested_leaf 38.1810μs 2.6517μs 377.1107 KOps/s 365.1150 KOps/s $\color{#35bf28}+3.29\%$
test_membership_stacked_nested 27.1610μs 2.6149μs 382.4220 KOps/s 388.7672 KOps/s $\color{#d91a1a}-1.63\%$
test_membership_stacked_nested_leaf 22.1410μs 2.6384μs 379.0122 KOps/s 381.8130 KOps/s $\color{#d91a1a}-0.73\%$
test_membership_nested_last 55.8070μs 3.8107μs 262.4193 KOps/s 264.7241 KOps/s $\color{#d91a1a}-0.87\%$
test_membership_nested_leaf_last 34.5850μs 3.8078μs 262.6155 KOps/s 264.8392 KOps/s $\color{#d91a1a}-0.84\%$
test_membership_stacked_nested_last 29.1440μs 6.7154μs 148.9122 KOps/s 79.2394 KOps/s $\textbf{\color{#35bf28}+87.93\%}$
test_membership_stacked_nested_leaf_last 0.1723ms 6.9236μs 144.4342 KOps/s 78.4126 KOps/s $\textbf{\color{#35bf28}+84.20\%}$
test_nested_getleaf 73.0270μs 10.7685μs 92.8632 KOps/s 93.9749 KOps/s $\color{#d91a1a}-1.18\%$
test_nested_get 49.7030μs 10.2448μs 97.6108 KOps/s 98.5291 KOps/s $\color{#d91a1a}-0.93\%$
test_stacked_getleaf 60.3740μs 10.6997μs 93.4607 KOps/s 94.1032 KOps/s $\color{#d91a1a}-0.68\%$
test_stacked_get 39.4940μs 10.2358μs 97.6959 KOps/s 98.4103 KOps/s $\color{#d91a1a}-0.73\%$
test_nested_getitemleaf 61.3050μs 11.1760μs 89.4773 KOps/s 90.8439 KOps/s $\color{#d91a1a}-1.50\%$
test_nested_getitem 53.6700μs 10.4026μs 96.1295 KOps/s 98.7217 KOps/s $\color{#d91a1a}-2.63\%$
test_stacked_getitemleaf 44.1020μs 11.2120μs 89.1903 KOps/s 91.3998 KOps/s $\color{#d91a1a}-2.42\%$
test_stacked_getitem 0.1005ms 10.3761μs 96.3758 KOps/s 97.9898 KOps/s $\color{#d91a1a}-1.65\%$
test_lock_nested 93.2349ms 0.5773ms 1.7322 KOps/s 2.0744 KOps/s $\textbf{\color{#d91a1a}-16.50\%}$
test_lock_stack_nested 0.5429ms 0.4406ms 2.2698 KOps/s 2.3207 KOps/s $\color{#d91a1a}-2.20\%$
test_unlock_nested 98.8029ms 0.5154ms 1.9404 KOps/s 2.5163 KOps/s $\textbf{\color{#d91a1a}-22.89\%}$
test_unlock_stack_nested 0.6449ms 0.3617ms 2.7650 KOps/s 2.8612 KOps/s $\color{#d91a1a}-3.36\%$
test_flatten_speed 0.2640ms 0.1044ms 9.5766 KOps/s 9.4551 KOps/s $\color{#35bf28}+1.29\%$
test_unflatten_speed 0.6802ms 0.4487ms 2.2286 KOps/s 2.1933 KOps/s $\color{#35bf28}+1.61\%$
test_common_ops 4.8325ms 1.0652ms 938.7958 Ops/s 917.9170 Ops/s $\color{#35bf28}+2.27\%$
test_creation 32.5910μs 2.0962μs 477.0482 KOps/s 496.3805 KOps/s $\color{#d91a1a}-3.89\%$
test_creation_empty 50.3740μs 15.9171μs 62.8255 KOps/s 57.7981 KOps/s $\textbf{\color{#35bf28}+8.70\%}$
test_creation_nested_1 65.4530μs 18.7647μs 53.2916 KOps/s 48.5273 KOps/s $\textbf{\color{#35bf28}+9.82\%}$
test_creation_nested_2 54.7120μs 23.2858μs 42.9446 KOps/s 40.1984 KOps/s $\textbf{\color{#35bf28}+6.83\%}$
test_clone 94.7270μs 16.9209μs 59.0984 KOps/s 56.4749 KOps/s $\color{#35bf28}+4.65\%$
test_getitem[int] 1.3044ms 16.3263μs 61.2508 KOps/s 60.8120 KOps/s $\color{#35bf28}+0.72\%$
test_getitem[slice_int] 0.1372ms 29.7323μs 33.6334 KOps/s 32.7145 KOps/s $\color{#35bf28}+2.81\%$
test_getitem[range] 0.2758ms 58.1557μs 17.1952 KOps/s 16.9918 KOps/s $\color{#35bf28}+1.20\%$
test_getitem[tuple] 0.1555ms 23.8428μs 41.9414 KOps/s 39.2397 KOps/s $\textbf{\color{#35bf28}+6.89\%}$
test_getitem[list] 0.4427ms 53.7182μs 18.6157 KOps/s 18.9021 KOps/s $\color{#d91a1a}-1.52\%$
test_setitem_dim[int] 77.9160μs 38.6777μs 25.8547 KOps/s 25.2442 KOps/s $\color{#35bf28}+2.42\%$
test_setitem_dim[slice_int] 0.1183ms 67.5067μs 14.8134 KOps/s 14.4368 KOps/s $\color{#35bf28}+2.61\%$
test_setitem_dim[range] 0.1406ms 92.4198μs 10.8202 KOps/s 10.6831 KOps/s $\color{#35bf28}+1.28\%$
test_setitem_dim[tuple] 98.1030μs 54.7283μs 18.2721 KOps/s 17.1683 KOps/s $\textbf{\color{#35bf28}+6.43\%}$
test_setitem 0.1092ms 28.4844μs 35.1069 KOps/s 33.2592 KOps/s $\textbf{\color{#35bf28}+5.56\%}$
test_set 0.1094ms 27.7032μs 36.0970 KOps/s 34.2535 KOps/s $\textbf{\color{#35bf28}+5.38\%}$
test_set_shared 1.1529ms 0.2115ms 4.7282 KOps/s 4.7730 KOps/s $\color{#d91a1a}-0.94\%$
test_update 0.1599ms 32.7962μs 30.4913 KOps/s 28.6546 KOps/s $\textbf{\color{#35bf28}+6.41\%}$
test_update_nested 0.1506ms 42.8298μs 23.3482 KOps/s 21.7587 KOps/s $\textbf{\color{#35bf28}+7.31\%}$
test_update__nested 0.1729ms 34.1829μs 29.2544 KOps/s 27.9345 KOps/s $\color{#35bf28}+4.72\%$
test_set_nested 0.1473ms 29.5976μs 33.7865 KOps/s 31.3898 KOps/s $\textbf{\color{#35bf28}+7.64\%}$
test_set_nested_new 0.1077ms 34.7005μs 28.8180 KOps/s 27.1509 KOps/s $\textbf{\color{#35bf28}+6.14\%}$
test_select 0.1458ms 51.5715μs 19.3905 KOps/s 18.6059 KOps/s $\color{#35bf28}+4.22\%$
test_select_nested 0.1281ms 58.3039μs 17.1515 KOps/s 16.9463 KOps/s $\color{#35bf28}+1.21\%$
test_exclude_nested 0.1483ms 73.6886μs 13.5706 KOps/s 13.2871 KOps/s $\color{#35bf28}+2.13\%$
test_empty[True] 0.3879ms 0.3098ms 3.2275 KOps/s 3.1944 KOps/s $\color{#35bf28}+1.04\%$
test_empty[False] 8.4678μs 1.1953μs 836.5987 KOps/s 874.3811 KOps/s $\color{#d91a1a}-4.32\%$
test_unbind_speed 0.4969ms 0.2996ms 3.3376 KOps/s 3.3684 KOps/s $\color{#d91a1a}-0.92\%$
test_unbind_speed_stack0 0.6349ms 0.2929ms 3.4144 KOps/s 3.5372 KOps/s $\color{#d91a1a}-3.47\%$
test_unbind_speed_stack1 91.6880ms 0.7714ms 1.2964 KOps/s 1.4408 KOps/s $\textbf{\color{#d91a1a}-10.03\%}$
test_split 88.0102ms 2.1849ms 457.6966 Ops/s 459.6636 Ops/s $\color{#d91a1a}-0.43\%$
test_chunk 2.2207ms 2.0080ms 498.0064 Ops/s 458.5946 Ops/s $\textbf{\color{#35bf28}+8.59\%}$
test_creation[device0] 3.7848ms 0.1205ms 8.2982 KOps/s 8.0839 KOps/s $\color{#35bf28}+2.65\%$
test_creation_from_tensor 0.2306ms 0.1194ms 8.3751 KOps/s 8.6093 KOps/s $\color{#d91a1a}-2.72\%$
test_add_one[memmap_tensor0] 77.5860μs 7.7328μs 129.3184 KOps/s 132.9677 KOps/s $\color{#d91a1a}-2.74\%$
test_contiguous[memmap_tensor0] 18.0940μs 1.9316μs 517.7132 KOps/s 516.4499 KOps/s $\color{#35bf28}+0.24\%$
test_stack[memmap_tensor0] 50.9460μs 5.5369μs 180.6080 KOps/s 174.4702 KOps/s $\color{#35bf28}+3.52\%$
test_memmaptd_index 0.6258ms 0.4060ms 2.4633 KOps/s 2.4817 KOps/s $\color{#d91a1a}-0.74\%$
test_memmaptd_index_astensor 0.9741ms 0.4840ms 2.0661 KOps/s 2.0734 KOps/s $\color{#d91a1a}-0.35\%$
test_memmaptd_index_op 1.4594ms 1.0033ms 996.6945 Ops/s 992.2216 Ops/s $\color{#35bf28}+0.45\%$
test_serialize_model 0.1323s 0.1205s 8.2985 Ops/s 8.6407 Ops/s $\color{#d91a1a}-3.96\%$
test_serialize_model_pickle 0.4351s 0.3869s 2.5844 Ops/s 2.5183 Ops/s $\color{#35bf28}+2.63\%$
test_serialize_weights 0.1233s 0.1166s 8.5784 Ops/s 7.7978 Ops/s $\textbf{\color{#35bf28}+10.01\%}$
test_serialize_weights_returnearly 0.2821s 0.1792s 5.5802 Ops/s 6.3436 Ops/s $\textbf{\color{#d91a1a}-12.03\%}$
test_serialize_weights_pickle 1.1998s 0.7009s 1.4268 Ops/s 2.5316 Ops/s $\textbf{\color{#d91a1a}-43.64\%}$
test_serialize_weights_filesystem 0.1467s 0.1410s 7.0912 Ops/s 7.1494 Ops/s $\color{#d91a1a}-0.81\%$
test_serialize_model_filesystem 0.1522s 0.1448s 6.9042 Ops/s 6.2257 Ops/s $\textbf{\color{#35bf28}+10.90\%}$
test_reshape_pytree 85.7310μs 38.3306μs 26.0888 KOps/s 25.8597 KOps/s $\color{#35bf28}+0.89\%$
test_reshape_td 0.1020ms 45.5552μs 21.9514 KOps/s 22.0107 KOps/s $\color{#d91a1a}-0.27\%$
test_view_pytree 80.5810μs 38.2219μs 26.1630 KOps/s 25.9802 KOps/s $\color{#35bf28}+0.70\%$
test_view_td 0.1457ms 51.2901μs 19.4969 KOps/s 19.3217 KOps/s $\color{#35bf28}+0.91\%$
test_unbind_pytree 81.1220μs 35.8789μs 27.8715 KOps/s 27.2561 KOps/s $\color{#35bf28}+2.26\%$
test_unbind_td 0.2983ms 44.1046μs 22.6734 KOps/s 22.8127 KOps/s $\color{#d91a1a}-0.61\%$
test_split_pytree 89.0070μs 38.1080μs 26.2412 KOps/s 26.2148 KOps/s $\color{#35bf28}+0.10\%$
test_split_td 0.1939ms 56.3641μs 17.7418 KOps/s 17.4204 KOps/s $\color{#35bf28}+1.84\%$
test_add_pytree 0.1005ms 45.7732μs 21.8469 KOps/s 22.2028 KOps/s $\color{#d91a1a}-1.60\%$
test_add_td 0.1613ms 79.2641μs 12.6161 KOps/s 12.2817 KOps/s $\color{#35bf28}+2.72\%$
test_compile_add_one_nested[tensordict-compile] 0.1258ms 55.6687μs 17.9634 KOps/s 17.4316 KOps/s $\color{#35bf28}+3.05\%$
test_compile_add_one_nested[tensordict-eager] 0.3944ms 0.1828ms 5.4705 KOps/s 5.4175 KOps/s $\color{#35bf28}+0.98\%$
test_compile_add_one_nested[pytree-compile] 0.1164ms 55.4760μs 18.0258 KOps/s 17.8383 KOps/s $\color{#35bf28}+1.05\%$
test_compile_add_one_nested[pytree-eager] 0.2205ms 0.1420ms 7.0437 KOps/s 7.0654 KOps/s $\color{#d91a1a}-0.31\%$
test_compile_copy_nested[tensordict-compile] 80.6910μs 19.9561μs 50.1101 KOps/s 47.4344 KOps/s $\textbf{\color{#35bf28}+5.64\%}$
test_compile_copy_nested[tensordict-eager] 0.1246ms 65.6841μs 15.2244 KOps/s 14.1231 KOps/s $\textbf{\color{#35bf28}+7.80\%}$
test_compile_copy_nested[pytree-compile] 0.1404ms 74.3738μs 13.4456 KOps/s 12.9609 KOps/s $\color{#35bf28}+3.74\%$
test_compile_copy_nested[pytree-eager] 0.1699ms 67.5125μs 14.8121 KOps/s 14.4777 KOps/s $\color{#35bf28}+2.31\%$
test_compile_add_one_flat[tensordict-compile] 0.2842ms 0.1726ms 5.7928 KOps/s 5.7968 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_add_one_flat[tensordict-eager] 0.3488ms 0.1877ms 5.3281 KOps/s 5.3504 KOps/s $\color{#d91a1a}-0.42\%$
test_compile_add_one_flat[tensorclass-compile] 0.1093ms 39.6594μs 25.2147 KOps/s 24.1922 KOps/s $\color{#35bf28}+4.23\%$
test_compile_add_one_flat[tensorclass-eager] 0.6488ms 68.2809μs 14.6454 KOps/s 14.8193 KOps/s $\color{#d91a1a}-1.17\%$
test_compile_add_one_flat[pytree-compile] 0.2688ms 0.1722ms 5.8056 KOps/s 5.7574 KOps/s $\color{#35bf28}+0.84\%$
test_compile_add_one_flat[pytree-eager] 0.4134ms 0.2942ms 3.3989 KOps/s 3.4438 KOps/s $\color{#d91a1a}-1.31\%$
test_compile_add_self_flat[tensordict-eager] 0.4290ms 0.2010ms 4.9743 KOps/s 5.0248 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_add_self_flat[tensordict-compile] 0.3325ms 0.1729ms 5.7850 KOps/s 5.7855 KOps/s $-0.01\%$
test_compile_add_self_flat[tensorclass-eager] 0.3068ms 63.2129μs 15.8196 KOps/s 16.2190 KOps/s $\color{#d91a1a}-2.46\%$
test_compile_add_self_flat[tensorclass-compile] 0.1121ms 41.0346μs 24.3697 KOps/s 23.9878 KOps/s $\color{#35bf28}+1.59\%$
test_compile_add_self_flat[pytree-eager] 0.3977ms 0.2396ms 4.1729 KOps/s 4.2682 KOps/s $\color{#d91a1a}-2.23\%$
test_compile_add_self_flat[pytree-compile] 0.3802ms 0.1760ms 5.6832 KOps/s 5.6487 KOps/s $\color{#35bf28}+0.61\%$
test_compile_copy_flat[tensordict-compile] 0.1837ms 0.1025ms 9.7540 KOps/s 9.8053 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_copy_flat[tensordict-eager] 0.2609ms 57.5087μs 17.3887 KOps/s 15.9772 KOps/s $\textbf{\color{#35bf28}+8.83\%}$
test_compile_copy_flat[pytree-compile] 0.2059ms 75.7920μs 13.1940 KOps/s 12.9599 KOps/s $\color{#35bf28}+1.81\%$
test_compile_copy_flat[pytree-eager] 0.1438ms 68.5137μs 14.5956 KOps/s 14.6636 KOps/s $\color{#d91a1a}-0.46\%$
test_compile_assign_and_add[tensordict-compile] 0.2755ms 0.1929ms 5.1851 KOps/s 5.1502 KOps/s $\color{#35bf28}+0.68\%$
test_compile_assign_and_add[tensordict-eager] 1.9201ms 1.6260ms 615.0193 Ops/s 592.7391 Ops/s $\color{#35bf28}+3.76\%$
test_compile_assign_and_add[pytree-compile] 0.2909ms 0.1890ms 5.2922 KOps/s 5.2542 KOps/s $\color{#35bf28}+0.72\%$
test_compile_assign_and_add[pytree-eager] 1.3066ms 1.1103ms 900.6239 Ops/s 908.3696 Ops/s $\color{#d91a1a}-0.85\%$
test_compile_assign_and_add_stack[compile] 0.5039ms 0.4139ms 2.4161 KOps/s 2.3728 KOps/s $\color{#35bf28}+1.83\%$
test_compile_assign_and_add_stack[eager] 4.0223ms 3.6586ms 273.3278 Ops/s 266.4569 Ops/s $\color{#35bf28}+2.58\%$
test_compile_indexing[tensor-tensordict-compile] 92.3240μs 33.6993μs 29.6742 KOps/s 28.6320 KOps/s $\color{#35bf28}+3.64\%$
test_compile_indexing[tensor-tensordict-eager] 1.2357ms 47.8259μs 20.9092 KOps/s 20.2763 KOps/s $\color{#35bf28}+3.12\%$
test_compile_indexing[tensor-tensorclass-compile] 92.0730μs 28.8471μs 34.6655 KOps/s 33.2141 KOps/s $\color{#35bf28}+4.37\%$
test_compile_indexing[tensor-tensorclass-eager] 74.1300μs 28.5838μs 34.9848 KOps/s 35.5489 KOps/s $\color{#d91a1a}-1.59\%$
test_compile_indexing[tensor-pytree-compile] 71.9650μs 29.2651μs 34.1704 KOps/s 32.9588 KOps/s $\color{#35bf28}+3.68\%$
test_compile_indexing[tensor-pytree-eager] 92.6840μs 28.2542μs 35.3930 KOps/s 35.1289 KOps/s $\color{#35bf28}+0.75\%$
test_compile_indexing[slice-tensordict-compile] 0.1389ms 72.3404μs 13.8235 KOps/s 13.5107 KOps/s $\color{#35bf28}+2.32\%$
test_compile_indexing[slice-tensordict-eager] 0.4104ms 26.7788μs 37.3430 KOps/s 35.0935 KOps/s $\textbf{\color{#35bf28}+6.41\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1354ms 67.0458μs 14.9152 KOps/s 14.6473 KOps/s $\color{#35bf28}+1.83\%$
test_compile_indexing[slice-tensorclass-eager] 82.8250μs 23.0123μs 43.4551 KOps/s 43.3217 KOps/s $\color{#35bf28}+0.31\%$
test_compile_indexing[slice-pytree-compile] 0.1224ms 66.8188μs 14.9658 KOps/s 14.7669 KOps/s $\color{#35bf28}+1.35\%$
test_compile_indexing[slice-pytree-eager] 88.5360μs 22.8146μs 43.8316 KOps/s 44.0113 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_indexing[int-tensordict-compile] 0.1850ms 72.4477μs 13.8031 KOps/s 13.6826 KOps/s $\color{#35bf28}+0.88\%$
test_compile_indexing[int-tensordict-eager] 1.0045ms 26.6489μs 37.5249 KOps/s 35.7836 KOps/s $\color{#35bf28}+4.87\%$
test_compile_indexing[int-tensorclass-compile] 0.1548ms 66.3956μs 15.0612 KOps/s 14.7100 KOps/s $\color{#35bf28}+2.39\%$
test_compile_indexing[int-tensorclass-eager] 87.3330μs 22.5568μs 44.3325 KOps/s 44.1537 KOps/s $\color{#35bf28}+0.41\%$
test_compile_indexing[int-pytree-compile] 0.1547ms 66.9700μs 14.9321 KOps/s 14.7454 KOps/s $\color{#35bf28}+1.27\%$
test_compile_indexing[int-pytree-eager] 62.5670μs 22.4202μs 44.6025 KOps/s 43.8763 KOps/s $\color{#35bf28}+1.66\%$
test_mod_add[eager] 82.4950μs 23.7004μs 42.1934 KOps/s 41.8200 KOps/s $\color{#35bf28}+0.89\%$
test_mod_add[compile] 91.0500μs 37.8036μs 26.4525 KOps/s 21.8785 KOps/s $\textbf{\color{#35bf28}+20.91\%}$
test_mod_add[compile-overhead] 81.3930μs 38.0419μs 26.2868 KOps/s 25.0288 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_mod_wrap[eager] 0.3313ms 0.2017ms 4.9568 KOps/s 4.8521 KOps/s $\color{#35bf28}+2.16\%$
test_mod_wrap[compile] 0.4342ms 0.2326ms 4.2991 KOps/s 4.2848 KOps/s $\color{#35bf28}+0.33\%$
test_mod_wrap[compile-overhead] 0.4272ms 0.2308ms 4.3326 KOps/s 4.2635 KOps/s $\color{#35bf28}+1.62\%$
test_mod_wrap_and_backward[eager] 12.0623ms 10.7898ms 92.6800 Ops/s 92.6225 Ops/s $\color{#35bf28}+0.06\%$
test_mod_wrap_and_backward[compile] 12.6230ms 10.7969ms 92.6195 Ops/s 91.3831 Ops/s $\color{#35bf28}+1.35\%$
test_mod_wrap_and_backward[compile-overhead] 12.5616ms 10.7869ms 92.7052 Ops/s 86.6204 Ops/s $\textbf{\color{#35bf28}+7.02\%}$
test_seq_add[eager] 0.1807ms 87.6677μs 11.4067 KOps/s 10.8708 KOps/s $\color{#35bf28}+4.93\%$
test_seq_add[compile] 0.1375ms 63.3167μs 15.7936 KOps/s 15.5491 KOps/s $\color{#35bf28}+1.57\%$
test_seq_add[compile-overhead] 0.1383ms 62.3772μs 16.0315 KOps/s 15.3846 KOps/s $\color{#35bf28}+4.21\%$
test_seq_wrap[eager] 0.6370ms 0.3755ms 2.6629 KOps/s 2.6386 KOps/s $\color{#35bf28}+0.92\%$
test_seq_wrap[compile] 0.4974ms 0.2718ms 3.6786 KOps/s 3.6642 KOps/s $\color{#35bf28}+0.39\%$
test_seq_wrap[compile-overhead] 0.4541ms 0.2734ms 3.6570 KOps/s 3.7340 KOps/s $\color{#d91a1a}-2.06\%$
test_func_call_runtime[False-eager] 0.8329ms 0.5317ms 1.8809 KOps/s 1.9253 KOps/s $\color{#d91a1a}-2.31\%$
test_func_call_runtime[False-compile] 0.7309ms 0.5106ms 1.9587 KOps/s 1.9882 KOps/s $\color{#d91a1a}-1.49\%$
test_func_call_runtime[False-compile-overhead] 0.7036ms 0.5093ms 1.9633 KOps/s 2.0046 KOps/s $\color{#d91a1a}-2.06\%$
test_func_call_runtime[True-eager] 1.1079ms 0.7478ms 1.3373 KOps/s 1.3449 KOps/s $\color{#d91a1a}-0.56\%$
test_func_call_runtime[True-compile] 0.6217ms 0.5170ms 1.9342 KOps/s 1.9257 KOps/s $\color{#35bf28}+0.44\%$
test_func_call_runtime[True-compile-overhead] 0.7103ms 0.5214ms 1.9180 KOps/s 1.8703 KOps/s $\color{#35bf28}+2.55\%$
test_func_call_cm_runtime[False-eager] 0.9273ms 0.5378ms 1.8594 KOps/s 1.9506 KOps/s $\color{#d91a1a}-4.68\%$
test_func_call_cm_runtime[False-compile] 0.6812ms 0.5068ms 1.9732 KOps/s 2.0008 KOps/s $\color{#d91a1a}-1.38\%$
test_func_call_cm_runtime[False-compile-overhead] 0.8527ms 0.5090ms 1.9646 KOps/s 1.9726 KOps/s $\color{#d91a1a}-0.40\%$
test_func_call_cm_runtime[True-eager] 2.3545ms 0.8944ms 1.1181 KOps/s 1.1686 KOps/s $\color{#d91a1a}-4.32\%$
test_func_call_cm_runtime[True-compile] 1.1783ms 0.7443ms 1.3435 KOps/s 1.3391 KOps/s $\color{#35bf28}+0.33\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9208ms 0.7388ms 1.3535 KOps/s 1.3349 KOps/s $\color{#35bf28}+1.39\%$
test_vmap_func_call_cm_runtime[eager] 3.4245ms 1.8621ms 537.0384 Ops/s 542.0192 Ops/s $\color{#d91a1a}-0.92\%$
test_vmap_func_call_cm_runtime[compile] 2.5111ms 1.9116ms 523.1226 Ops/s 526.3621 Ops/s $\color{#d91a1a}-0.62\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.8621ms 1.9255ms 519.3580 Ops/s 525.4971 Ops/s $\color{#d91a1a}-1.17\%$
test_distributed 0.3194ms 0.1251ms 7.9956 KOps/s 7.9164 KOps/s $\color{#35bf28}+1.00\%$
test_tdmodule 0.1164ms 16.3828μs 61.0397 KOps/s 57.4899 KOps/s $\textbf{\color{#35bf28}+6.17\%}$
test_tdmodule_dispatch 62.5270μs 33.4281μs 29.9150 KOps/s 28.5977 KOps/s $\color{#35bf28}+4.61\%$
test_tdseq 42.2200μs 19.0246μs 52.5634 KOps/s 50.5983 KOps/s $\color{#35bf28}+3.88\%$
test_tdseq_dispatch 72.4560μs 37.2254μs 26.8634 KOps/s 25.8488 KOps/s $\color{#35bf28}+3.92\%$
test_instantiation_functorch 1.8789ms 1.5683ms 637.6459 Ops/s 627.9691 Ops/s $\color{#35bf28}+1.54\%$
test_instantiation_td 2.0211ms 1.1719ms 853.3279 Ops/s 848.3730 Ops/s $\color{#35bf28}+0.58\%$
test_exec_functorch 0.3444ms 0.1898ms 5.2677 KOps/s 5.2271 KOps/s $\color{#35bf28}+0.78\%$
test_exec_functional_call 0.3575ms 0.1766ms 5.6630 KOps/s 5.6453 KOps/s $\color{#35bf28}+0.31\%$
test_exec_td 0.3490ms 0.1734ms 5.7685 KOps/s 5.9275 KOps/s $\color{#d91a1a}-2.68\%$
test_exec_td_decorator 1.1379ms 0.2262ms 4.4212 KOps/s 4.4048 KOps/s $\color{#35bf28}+0.37\%$
test_vmap_mlp_speed[True-True] 0.7964ms 0.6395ms 1.5637 KOps/s 1.5631 KOps/s $\color{#35bf28}+0.04\%$
test_vmap_mlp_speed[True-False] 1.1968ms 0.6478ms 1.5437 KOps/s 1.5858 KOps/s $\color{#d91a1a}-2.66\%$
test_vmap_mlp_speed[False-True] 0.7760ms 0.4976ms 2.0096 KOps/s 2.0373 KOps/s $\color{#d91a1a}-1.36\%$
test_vmap_mlp_speed[False-False] 0.7792ms 0.4995ms 2.0022 KOps/s 2.0147 KOps/s $\color{#d91a1a}-0.62\%$
test_vmap_mlp_speed_decorator[True-True] 1.5390ms 0.6164ms 1.6222 KOps/s 1.6343 KOps/s $\color{#d91a1a}-0.74\%$
test_vmap_mlp_speed_decorator[True-False] 1.0228ms 0.6205ms 1.6116 KOps/s 1.6298 KOps/s $\color{#d91a1a}-1.12\%$
test_vmap_mlp_speed_decorator[False-True] 0.7693ms 0.5100ms 1.9608 KOps/s 1.9592 KOps/s $\color{#35bf28}+0.08\%$
test_vmap_mlp_speed_decorator[False-False] 0.8224ms 0.5098ms 1.9614 KOps/s 1.9065 KOps/s $\color{#35bf28}+2.88\%$
test_to_module_speed[True] 2.1054ms 1.2850ms 778.1979 Ops/s 781.4100 Ops/s $\color{#d91a1a}-0.41\%$
test_to_module_speed[False] 1.6431ms 1.2468ms 802.0502 Ops/s 789.6473 Ops/s $\color{#35bf28}+1.57\%$
test_tc_init 0.1038ms 40.0367μs 24.9771 KOps/s 22.2412 KOps/s $\textbf{\color{#35bf28}+12.30\%}$
test_tc_init_nested 0.1638ms 81.2099μs 12.3138 KOps/s 11.2804 KOps/s $\textbf{\color{#35bf28}+9.16\%}$
test_tc_first_layer_tensor 34.3440μs 1.5137μs 660.6178 KOps/s 634.4053 KOps/s $\color{#35bf28}+4.13\%$
test_tc_first_layer_nontensor 50.1440μs 4.7353μs 211.1804 KOps/s 205.2453 KOps/s $\color{#35bf28}+2.89\%$
test_tc_second_layer_tensor 36.4890μs 2.8280μs 353.6126 KOps/s 339.9946 KOps/s $\color{#35bf28}+4.01\%$
test_tc_second_layer_nontensor 43.7420μs 5.9874μs 167.0178 KOps/s 161.5413 KOps/s $\color{#35bf28}+3.39\%$
test_unbind 0.4796s 13.2253ms 75.6127 Ops/s 76.6072 Ops/s $\color{#d91a1a}-1.30\%$
test_full_like 8.9400ms 7.5060ms 133.2266 Ops/s 143.5089 Ops/s $\textbf{\color{#d91a1a}-7.16\%}$
test_zeros_like 4.2624ms 2.9142ms 343.1420 Ops/s 367.2393 Ops/s $\textbf{\color{#d91a1a}-6.56\%}$
test_ones_like 4.0463ms 3.4268ms 291.8198 Ops/s 320.0478 Ops/s $\textbf{\color{#d91a1a}-8.82\%}$
test_clone 6.0611ms 5.4310ms 184.1265 Ops/s 205.0522 Ops/s $\textbf{\color{#d91a1a}-10.21\%}$
test_squeeze 66.8750μs 12.3926μs 80.6930 KOps/s 77.7881 KOps/s $\color{#35bf28}+3.73\%$
test_unsqueeze 0.3440ms 89.4472μs 11.1798 KOps/s 11.1574 KOps/s $\color{#35bf28}+0.20\%$
test_split 0.3499ms 0.1896ms 5.2749 KOps/s 5.1139 KOps/s $\color{#35bf28}+3.15\%$
test_permute 0.3508ms 0.2126ms 4.7042 KOps/s 4.6087 KOps/s $\color{#35bf28}+2.07\%$
test_stack 34.5231ms 26.2665ms 38.0713 Ops/s 40.4343 Ops/s $\textbf{\color{#d91a1a}-5.84\%}$
test_cat 34.3235ms 26.7592ms 37.3703 Ops/s 41.0109 Ops/s $\textbf{\color{#d91a1a}-8.88\%}$
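
For readers checking the numbers: the Change column appears to be the relative difference between Ops (this PR) and Ops on Repo HEAD, and the Improved/Worsened totals match the rows shown in bold, which all exceed roughly 5% in magnitude. The sketch below reproduces that arithmetic for three rows of the CPU table. The function names and the 5% threshold are assumptions inferred from the table, not taken from the benchmark workflow itself.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change versus the baseline, in percent.

    Both values must be in the same unit (e.g. both KOps/s).
    """
    return (ops / ops_head - 1.0) * 100.0


def classify(change: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is inferred from the bold rows."""
    if change > threshold:
        return "improved"
    if change < -threshold:
        return "worsened"
    return "within noise"


# (Ops, Ops on Repo HEAD) copied from the CPU table above.
rows = {
    "test_plain_set_nested": (51.2490, 50.0081),                 # KOps/s
    "test_membership": (1.1881, 1.4358),                         # MOps/s
    "test_membership_stacked_nested_last": (148.9122, 79.2394),  # KOps/s
}

for name, (ops, ops_head) in rows.items():
    change = percent_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table values are already rounded, a recomputed percentage can differ from the reported one in the last digit.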

@github-actions

github-actions bot commented Sep 2, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}30$. Worsened: $\large\color{#d91a1a}9$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 49.8310μs 14.1279μs 70.7817 KOps/s 67.0274 KOps/s $\textbf{\color{#35bf28}+5.60\%}$
test_plain_set_stack_nested 40.2800μs 14.3898μs 69.4937 KOps/s 65.1098 KOps/s $\textbf{\color{#35bf28}+6.73\%}$
test_plain_set_nested_inplace 64.2600μs 15.1740μs 65.9023 KOps/s 62.8681 KOps/s $\color{#35bf28}+4.83\%$
test_plain_set_stack_nested_inplace 43.1510μs 15.1283μs 66.1012 KOps/s 64.0889 KOps/s $\color{#35bf28}+3.14\%$
test_items 34.0300μs 2.8214μs 354.4297 KOps/s 348.2108 KOps/s $\color{#35bf28}+1.79\%$
test_items_nested 0.4017ms 0.3124ms 3.2007 KOps/s 3.2034 KOps/s $\color{#d91a1a}-0.08\%$
test_items_nested_locked 0.3734ms 0.3144ms 3.1802 KOps/s 3.1597 KOps/s $\color{#35bf28}+0.65\%$
test_items_nested_leaf 91.5710μs 62.8435μs 15.9125 KOps/s 15.9366 KOps/s $\color{#d91a1a}-0.15\%$
test_items_stack_nested 0.4952ms 0.3175ms 3.1498 KOps/s 3.1795 KOps/s $\color{#d91a1a}-0.93\%$
test_items_stack_nested_leaf 89.1710μs 64.4093μs 15.5257 KOps/s 15.4150 KOps/s $\color{#35bf28}+0.72\%$
test_items_stack_nested_locked 0.3697ms 0.3143ms 3.1821 KOps/s 3.1668 KOps/s $\color{#35bf28}+0.48\%$
test_keys 47.7910μs 3.4007μs 294.0548 KOps/s 294.4142 KOps/s $\color{#d91a1a}-0.12\%$
test_keys_nested 90.7920μs 54.8614μs 18.2278 KOps/s 18.2013 KOps/s $\color{#35bf28}+0.15\%$
test_keys_nested_locked 0.9669ms 59.0808μs 16.9260 KOps/s 16.6640 KOps/s $\color{#35bf28}+1.57\%$
test_keys_nested_leaf 81.2410μs 46.0920μs 21.6957 KOps/s 21.5691 KOps/s $\color{#35bf28}+0.59\%$
test_keys_stack_nested 88.9410μs 54.5803μs 18.3216 KOps/s 18.3651 KOps/s $\color{#d91a1a}-0.24\%$
test_keys_stack_nested_leaf 71.4310μs 46.2266μs 21.6326 KOps/s 21.3613 KOps/s $\color{#35bf28}+1.27\%$
test_keys_stack_nested_locked 0.1144ms 58.7098μs 17.0329 KOps/s 16.7919 KOps/s $\color{#35bf28}+1.44\%$
test_values 4.3617μs 0.8186μs 1.2216 MOps/s 1.2302 MOps/s $\color{#d91a1a}-0.70\%$
test_values_nested 68.0810μs 27.8886μs 35.8570 KOps/s 36.4017 KOps/s $\color{#d91a1a}-1.50\%$
test_values_nested_locked 66.1810μs 29.6581μs 33.7176 KOps/s 34.3039 KOps/s $\color{#d91a1a}-1.71\%$
test_values_nested_leaf 51.8800μs 24.4145μs 40.9593 KOps/s 41.4420 KOps/s $\color{#d91a1a}-1.16\%$
test_values_stack_nested 61.8710μs 28.7080μs 34.8335 KOps/s 35.1827 KOps/s $\color{#d91a1a}-0.99\%$
test_values_stack_nested_leaf 52.2910μs 25.2940μs 39.5351 KOps/s 40.2976 KOps/s $\color{#d91a1a}-1.89\%$
test_values_stack_nested_locked 54.6810μs 30.4180μs 32.8753 KOps/s 33.3723 KOps/s $\color{#d91a1a}-1.49\%$
test_membership 1.7486μs 0.4754μs 2.1033 MOps/s 2.1033 MOps/s $+0.00\%$
test_membership_nested 23.6410μs 1.8030μs 554.6379 KOps/s 568.8771 KOps/s $\color{#d91a1a}-2.50\%$
test_membership_nested_leaf 11.4303μs 1.7052μs 586.4560 KOps/s 574.1437 KOps/s $\color{#35bf28}+2.14\%$
test_membership_stacked_nested 37.1010μs 1.7946μs 557.2331 KOps/s 558.5835 KOps/s $\color{#d91a1a}-0.24\%$
test_membership_stacked_nested_leaf 29.7300μs 1.7721μs 564.3165 KOps/s 571.5720 KOps/s $\color{#d91a1a}-1.27\%$
test_membership_nested_last 24.6710μs 2.5596μs 390.6792 KOps/s 391.9110 KOps/s $\color{#d91a1a}-0.31\%$
test_membership_nested_leaf_last 34.5110μs 2.6031μs 384.1502 KOps/s 389.2199 KOps/s $\color{#d91a1a}-1.30\%$
test_membership_stacked_nested_last 22.2700μs 2.5884μs 386.3383 KOps/s 391.4675 KOps/s $\color{#d91a1a}-1.31\%$
test_membership_stacked_nested_leaf_last 30.4500μs 2.5539μs 391.5637 KOps/s 387.5732 KOps/s $\color{#35bf28}+1.03\%$
test_nested_getleaf 42.5200μs 6.1300μs 163.1324 KOps/s 165.7219 KOps/s $\color{#d91a1a}-1.56\%$
test_nested_get 38.3300μs 5.7685μs 173.3550 KOps/s 175.2181 KOps/s $\color{#d91a1a}-1.06\%$
test_stacked_getleaf 29.3900μs 5.9642μs 167.6683 KOps/s 165.1664 KOps/s $\color{#35bf28}+1.51\%$
test_stacked_get 43.1010μs 5.6868μs 175.8448 KOps/s 179.1023 KOps/s $\color{#d91a1a}-1.82\%$
test_nested_getitemleaf 35.2510μs 6.0794μs 164.4910 KOps/s 163.4696 KOps/s $\color{#35bf28}+0.62\%$
test_nested_getitem 37.3400μs 5.7546μs 173.7740 KOps/s 174.9197 KOps/s $\color{#d91a1a}-0.65\%$
test_stacked_getitemleaf 33.1000μs 6.0250μs 165.9762 KOps/s 165.4705 KOps/s $\color{#35bf28}+0.31\%$
test_stacked_getitem 35.0100μs 5.6915μs 175.7021 KOps/s 175.8427 KOps/s $\color{#d91a1a}-0.08\%$
test_lock_nested 0.8664ms 0.4117ms 2.4288 KOps/s 2.3655 KOps/s $\color{#35bf28}+2.68\%$
test_lock_stack_nested 0.4297ms 0.3762ms 2.6584 KOps/s 2.6095 KOps/s $\color{#35bf28}+1.87\%$
test_unlock_nested 0.7349ms 0.3545ms 2.8205 KOps/s 2.7800 KOps/s $\color{#35bf28}+1.46\%$
test_unlock_stack_nested 0.3918ms 0.3153ms 3.1711 KOps/s 3.1020 KOps/s $\color{#35bf28}+2.23\%$
test_flatten_speed 0.1650ms 78.1577μs 12.7946 KOps/s 12.7053 KOps/s $\color{#35bf28}+0.70\%$
test_unflatten_speed 0.3501ms 0.2805ms 3.5653 KOps/s 3.5795 KOps/s $\color{#d91a1a}-0.40\%$
test_common_ops 1.5020ms 1.2408ms 805.9343 Ops/s 796.9804 Ops/s $\color{#35bf28}+1.12\%$
test_creation 28.7500μs 1.4653μs 682.4639 KOps/s 694.0438 KOps/s $\color{#d91a1a}-1.67\%$
test_creation_empty 46.8710μs 16.0523μs 62.2962 KOps/s 57.2901 KOps/s $\textbf{\color{#35bf28}+8.74\%}$
test_creation_nested_1 50.3510μs 17.8794μs 55.9302 KOps/s 52.4979 KOps/s $\textbf{\color{#35bf28}+6.54\%}$
test_creation_nested_2 58.9710μs 20.4781μs 48.8326 KOps/s 44.8760 KOps/s $\textbf{\color{#35bf28}+8.82\%}$
test_clone 71.6010μs 27.8889μs 35.8565 KOps/s 34.3223 KOps/s $\color{#35bf28}+4.47\%$
test_getitem[int] 1.3990ms 15.7930μs 63.3191 KOps/s 62.9666 KOps/s $\color{#35bf28}+0.56\%$
test_getitem[slice_int] 0.1304ms 26.7883μs 37.3297 KOps/s 36.3539 KOps/s $\color{#35bf28}+2.68\%$
test_getitem[range] 0.2814ms 0.1096ms 9.1227 KOps/s 9.0328 KOps/s $\color{#35bf28}+1.00\%$
test_getitem[tuple] 0.1300ms 23.5281μs 42.5024 KOps/s 42.1914 KOps/s $\color{#35bf28}+0.74\%$
test_getitem[list] 0.2117ms 96.7538μs 10.3355 KOps/s 10.1689 KOps/s $\color{#35bf28}+1.64\%$
test_setitem_dim[int] 74.0310μs 51.0295μs 19.5965 KOps/s 18.8886 KOps/s $\color{#35bf28}+3.75\%$
test_setitem_dim[slice_int] 0.1203ms 74.7658μs 13.3751 KOps/s 13.2951 KOps/s $\color{#35bf28}+0.60\%$
test_setitem_dim[range] 0.1798ms 0.1340ms 7.4609 KOps/s 7.3590 KOps/s $\color{#35bf28}+1.39\%$
test_setitem_dim[tuple] 0.1615ms 67.3857μs 14.8399 KOps/s 14.4629 KOps/s $\color{#35bf28}+2.61\%$
test_setitem 86.1710μs 40.5006μs 24.6910 KOps/s 24.1036 KOps/s $\color{#35bf28}+2.44\%$
test_set 83.7910μs 39.8541μs 25.0915 KOps/s 24.3670 KOps/s $\color{#35bf28}+2.97\%$
test_set_shared 0.3597ms 48.9264μs 20.4389 KOps/s 20.0584 KOps/s $\color{#35bf28}+1.90\%$
test_update 90.1410μs 47.9597μs 20.8508 KOps/s 19.7210 KOps/s $\textbf{\color{#35bf28}+5.73\%}$
test_update_nested 0.1100ms 54.8003μs 18.2481 KOps/s 17.4369 KOps/s $\color{#35bf28}+4.65\%$
test_update__nested 98.8720μs 57.5613μs 17.3728 KOps/s 17.0547 KOps/s $\color{#35bf28}+1.87\%$
test_set_nested 78.0520μs 41.0636μs 24.3525 KOps/s 22.9076 KOps/s $\textbf{\color{#35bf28}+6.31\%}$
test_set_nested_new 81.8610μs 44.9469μs 22.2485 KOps/s 21.1401 KOps/s $\textbf{\color{#35bf28}+5.24\%}$
test_select 0.1137ms 58.1041μs 17.2105 KOps/s 16.7742 KOps/s $\color{#35bf28}+2.60\%$
test_select_nested 0.5347ms 42.5512μs 23.5011 KOps/s 24.5273 KOps/s $\color{#d91a1a}-4.18\%$
test_exclude_nested 0.1019ms 57.9208μs 17.2649 KOps/s 17.9578 KOps/s $\color{#d91a1a}-3.86\%$
test_empty[True] 0.3137ms 0.2395ms 4.1759 KOps/s 4.1494 KOps/s $\color{#35bf28}+0.64\%$
test_empty[False] 4.0611μs 0.7156μs 1.3974 MOps/s 1.3952 MOps/s $\color{#35bf28}+0.16\%$
test_to 53.3710μs 24.7349μs 40.4288 KOps/s 42.3109 KOps/s $\color{#d91a1a}-4.45\%$
test_to_nonblocking 77.5810μs 23.5083μs 42.5382 KOps/s 43.2105 KOps/s $\color{#d91a1a}-1.56\%$
test_unbind_speed 1.5583ms 0.2687ms 3.7217 KOps/s 3.5495 KOps/s $\color{#35bf28}+4.85\%$
test_unbind_speed_stack0 0.3539ms 0.2667ms 3.7496 KOps/s 3.5688 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_unbind_speed_stack1 92.4662ms 0.6881ms 1.4534 KOps/s 1.4122 KOps/s $\color{#35bf28}+2.92\%$
test_split 93.7477ms 2.1394ms 467.4141 Ops/s 463.3647 Ops/s $\color{#35bf28}+0.87\%$
test_chunk 95.9326ms 2.1414ms 466.9744 Ops/s 461.3848 Ops/s $\color{#35bf28}+1.21\%$
test_creation[device0] 0.3540ms 0.1260ms 7.9346 KOps/s 7.9005 KOps/s $\color{#35bf28}+0.43\%$
test_creation_from_tensor 0.3844ms 0.1301ms 7.6878 KOps/s 7.7116 KOps/s $\color{#d91a1a}-0.31\%$
test_add_one[memmap_tensor0] 0.2541ms 8.9339μs 111.9337 KOps/s 107.6294 KOps/s $\color{#35bf28}+4.00\%$
test_contiguous[memmap_tensor0] 23.7300μs 2.2186μs 450.7382 KOps/s 451.3990 KOps/s $\color{#d91a1a}-0.15\%$
test_stack[memmap_tensor0] 38.0500μs 6.5120μs 153.5620 KOps/s 146.3060 KOps/s $\color{#35bf28}+4.96\%$
test_memmaptd_index 1.1164ms 0.4266ms 2.3444 KOps/s 2.2878 KOps/s $\color{#35bf28}+2.47\%$
test_memmaptd_index_astensor 0.7333ms 0.4814ms 2.0774 KOps/s 2.0168 KOps/s $\color{#35bf28}+3.01\%$
test_memmaptd_index_op 1.4856ms 1.0209ms 979.5577 Ops/s 934.1192 Ops/s $\color{#35bf28}+4.86\%$
test_serialize_model 0.1318s 0.1290s 7.7541 Ops/s 7.7440 Ops/s $\color{#35bf28}+0.13\%$
test_serialize_model_pickle 1.3750s 1.2175s 0.8214 Ops/s 0.8246 Ops/s $\color{#d91a1a}-0.39\%$
test_serialize_weights 0.1286s 0.1279s 7.8197 Ops/s 7.0286 Ops/s $\textbf{\color{#35bf28}+11.26\%}$
test_serialize_weights_returnearly 0.2431s 62.1710ms 16.0847 Ops/s 18.0854 Ops/s $\textbf{\color{#d91a1a}-11.06\%}$
test_serialize_weights_pickle 1.3485s 1.2180s 0.8210 Ops/s 0.8212 Ops/s $\color{#d91a1a}-0.02\%$
test_reshape_pytree 67.4510μs 35.7132μs 28.0009 KOps/s 27.9899 KOps/s $\color{#35bf28}+0.04\%$
test_reshape_td 83.7710μs 45.0576μs 22.1938 KOps/s 24.4745 KOps/s $\textbf{\color{#d91a1a}-9.32\%}$
test_view_pytree 77.7110μs 35.2105μs 28.4007 KOps/s 28.3964 KOps/s $\color{#35bf28}+0.02\%$
test_view_td 0.1122ms 48.5248μs 20.6080 KOps/s 21.9729 KOps/s $\textbf{\color{#d91a1a}-6.21\%}$
test_unbind_pytree 60.7910μs 33.8077μs 29.5790 KOps/s 28.2401 KOps/s $\color{#35bf28}+4.74\%$
test_unbind_td 0.3845ms 41.4163μs 24.1451 KOps/s 23.3622 KOps/s $\color{#35bf28}+3.35\%$
test_split_pytree 87.0710μs 45.4727μs 21.9912 KOps/s 21.7573 KOps/s $\color{#35bf28}+1.08\%$
test_split_td 0.4444ms 53.5178μs 18.6854 KOps/s 18.2702 KOps/s $\color{#35bf28}+2.27\%$
test_add_pytree 0.1126ms 58.2594μs 17.1646 KOps/s 17.7606 KOps/s $\color{#d91a1a}-3.36\%$
test_add_td 0.1405ms 95.0571μs 10.5200 KOps/s 10.9826 KOps/s $\color{#d91a1a}-4.21\%$
test_compile_add_one_nested[tensordict-compile] 0.4129ms 0.2189ms 4.5679 KOps/s 4.6599 KOps/s $\color{#d91a1a}-1.97\%$
test_compile_add_one_nested[tensordict-eager] 0.2522ms 0.1564ms 6.3935 KOps/s 6.4068 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_add_one_nested[pytree-compile] 0.2330ms 0.1536ms 6.5109 KOps/s 6.8640 KOps/s $\textbf{\color{#d91a1a}-5.15\%}$
test_compile_add_one_nested[pytree-eager] 0.2611ms 0.1833ms 5.4556 KOps/s 5.4237 KOps/s $\color{#35bf28}+0.59\%$
test_compile_copy_nested[tensordict-compile] 68.5710μs 20.4691μs 48.8542 KOps/s 56.4282 KOps/s $\textbf{\color{#d91a1a}-13.42\%}$
test_compile_copy_nested[tensordict-eager] 70.6710μs 42.4034μs 23.5830 KOps/s 23.7147 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_copy_nested[pytree-compile] 0.2215ms 63.1137μs 15.8444 KOps/s 15.4661 KOps/s $\color{#35bf28}+2.45\%$
test_compile_copy_nested[pytree-eager] 80.7610μs 49.5032μs 20.2007 KOps/s 20.0610 KOps/s $\color{#35bf28}+0.70\%$
test_compile_add_one_flat[tensordict-compile] 0.4538ms 0.3230ms 3.0963 KOps/s 3.1025 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_add_one_flat[tensordict-eager] 0.2559ms 0.2077ms 4.8135 KOps/s 4.7026 KOps/s $\color{#35bf28}+2.36\%$
test_compile_add_one_flat[tensorclass-compile] 0.1749ms 0.1300ms 7.6916 KOps/s 7.6842 KOps/s $\color{#35bf28}+0.10\%$
test_compile_add_one_flat[tensorclass-eager] 0.1897ms 62.6691μs 15.9568 KOps/s 15.7882 KOps/s $\color{#35bf28}+1.07\%$
test_compile_add_one_flat[pytree-compile] 0.4585ms 0.3327ms 3.0061 KOps/s 3.1068 KOps/s $\color{#d91a1a}-3.24\%$
test_compile_add_one_flat[pytree-eager] 0.8450ms 0.6491ms 1.5407 KOps/s 1.5679 KOps/s $\color{#d91a1a}-1.74\%$
test_compile_add_self_flat[tensordict-eager] 0.3467ms 0.2515ms 3.9768 KOps/s 4.0281 KOps/s $\color{#d91a1a}-1.27\%$
test_compile_add_self_flat[tensordict-compile] 0.4487ms 0.3327ms 3.0059 KOps/s 3.0737 KOps/s $\color{#d91a1a}-2.21\%$
test_compile_add_self_flat[tensorclass-eager] 0.2010ms 75.2800μs 13.2837 KOps/s 13.6432 KOps/s $\color{#d91a1a}-2.63\%$
test_compile_add_self_flat[tensorclass-compile] 0.1950ms 0.1405ms 7.1192 KOps/s 7.6359 KOps/s $\textbf{\color{#d91a1a}-6.77\%}$
test_compile_add_self_flat[pytree-eager] 0.6880ms 0.5294ms 1.8890 KOps/s 1.8541 KOps/s $\color{#35bf28}+1.88\%$
test_compile_add_self_flat[pytree-compile] 0.4042ms 0.3315ms 3.0168 KOps/s 3.0940 KOps/s $\color{#d91a1a}-2.49\%$
test_compile_copy_flat[tensordict-compile] 0.1193ms 18.1523μs 55.0894 KOps/s 58.1984 KOps/s $\textbf{\color{#d91a1a}-5.34\%}$
test_compile_copy_flat[tensordict-eager] 0.1207ms 27.5226μs 36.3338 KOps/s 36.6168 KOps/s $\color{#d91a1a}-0.77\%$
test_compile_copy_flat[pytree-compile] 0.1711ms 70.4360μs 14.1973 KOps/s 13.5790 KOps/s $\color{#35bf28}+4.55\%$
test_compile_copy_flat[pytree-eager] 0.1578ms 51.1520μs 19.5496 KOps/s 18.1312 KOps/s $\textbf{\color{#35bf28}+7.82\%}$
test_compile_assign_and_add[tensordict-compile] 2.3630ms 0.8247ms 1.2126 KOps/s 1.1314 KOps/s $\textbf{\color{#35bf28}+7.17\%}$
test_compile_assign_and_add[tensordict-eager] 3.3213ms 3.0711ms 325.6130 Ops/s 316.6428 Ops/s $\color{#35bf28}+2.83\%$
test_compile_assign_and_add[pytree-compile] 2.3227ms 0.8083ms 1.2371 KOps/s 1.1262 KOps/s $\textbf{\color{#35bf28}+9.85\%}$
test_compile_assign_and_add[pytree-eager] 3.5663ms 3.1549ms 316.9686 Ops/s 307.4597 Ops/s $\color{#35bf28}+3.09\%$
test_compile_indexing[tensor-tensordict-compile] 0.1512ms 0.1121ms 8.9219 KOps/s 8.9615 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_indexing[tensor-tensordict-eager] 0.4599ms 58.0482μs 17.2271 KOps/s 16.5241 KOps/s $\color{#35bf28}+4.25\%$
test_compile_indexing[tensor-tensorclass-compile] 0.5154ms 0.1052ms 9.5015 KOps/s 9.4955 KOps/s $\color{#35bf28}+0.06\%$
test_compile_indexing[tensor-tensorclass-eager] 0.4388ms 41.6452μs 24.0124 KOps/s 23.0029 KOps/s $\color{#35bf28}+4.39\%$
test_compile_indexing[tensor-pytree-compile] 0.5007ms 0.1057ms 9.4610 KOps/s 9.5216 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_indexing[tensor-pytree-eager] 96.5810μs 41.7126μs 23.9736 KOps/s 22.8841 KOps/s $\color{#35bf28}+4.76\%$
test_compile_indexing[slice-tensordict-compile] 0.1934ms 0.1393ms 7.1809 KOps/s 7.1942 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_indexing[slice-tensordict-eager] 0.4174ms 23.5758μs 42.4163 KOps/s 41.1048 KOps/s $\color{#35bf28}+3.19\%$
test_compile_indexing[slice-tensorclass-compile] 0.2081ms 0.1333ms 7.5037 KOps/s 7.4762 KOps/s $\color{#35bf28}+0.37\%$
test_compile_indexing[slice-tensorclass-eager] 0.1137ms 20.1976μs 49.5108 KOps/s 48.2756 KOps/s $\color{#35bf28}+2.56\%$
test_compile_indexing[slice-pytree-compile] 0.5163ms 0.1334ms 7.4948 KOps/s 7.5121 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_indexing[slice-pytree-eager] 0.3949ms 20.5237μs 48.7241 KOps/s 48.0195 KOps/s $\color{#35bf28}+1.47\%$
test_compile_indexing[int-tensordict-compile] 0.5192ms 0.1396ms 7.1658 KOps/s 7.2068 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_indexing[int-tensordict-eager] 0.5290ms 23.6516μs 42.2805 KOps/s 41.2618 KOps/s $\color{#35bf28}+2.47\%$
test_compile_indexing[int-tensorclass-compile] 0.5331ms 0.1334ms 7.4960 KOps/s 7.4797 KOps/s $\color{#35bf28}+0.22\%$
test_compile_indexing[int-tensorclass-eager] 0.4034ms 20.5576μs 48.6437 KOps/s 48.0962 KOps/s $\color{#35bf28}+1.14\%$
test_compile_indexing[int-pytree-compile] 0.5202ms 0.1331ms 7.5160 KOps/s 7.4543 KOps/s $\color{#35bf28}+0.83\%$
test_compile_indexing[int-pytree-eager] 58.3900μs 20.3152μs 49.2243 KOps/s 47.9046 KOps/s $\color{#35bf28}+2.75\%$
test_mod_add[eager] 79.1010μs 30.6939μs 32.5798 KOps/s 30.8243 KOps/s $\textbf{\color{#35bf28}+5.70\%}$
test_mod_add[compile] 0.4692ms 70.3020μs 14.2243 KOps/s 14.2420 KOps/s $\color{#d91a1a}-0.12\%$
test_mod_add[compile-overhead] 0.2651ms 0.1380ms 7.2482 KOps/s 6.5650 KOps/s $\textbf{\color{#35bf28}+10.41\%}$
test_mod_wrap[eager] 0.6455ms 0.2405ms 4.1583 KOps/s 4.0210 KOps/s $\color{#35bf28}+3.42\%$
test_mod_wrap[compile] 1.1540ms 0.2897ms 3.4521 KOps/s 3.4345 KOps/s $\color{#35bf28}+0.51\%$
test_mod_wrap[compile-overhead] 8.2532ms 4.1313ms 242.0567 Ops/s 243.2956 Ops/s $\color{#d91a1a}-0.51\%$
test_mod_wrap_and_backward[eager] 1.4894ms 1.3284ms 752.7840 Ops/s 690.6236 Ops/s $\textbf{\color{#35bf28}+9.00\%}$
test_mod_wrap_and_backward[compile] 2.3738ms 1.2949ms 772.2671 Ops/s 706.3107 Ops/s $\textbf{\color{#35bf28}+9.34\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3298ms 0.8942ms 1.1183 KOps/s 985.6006 Ops/s $\textbf{\color{#35bf28}+13.47\%}$
test_seq_add[eager] 0.1511ms 92.0766μs 10.8605 KOps/s 9.9126 KOps/s $\textbf{\color{#35bf28}+9.56\%}$
test_seq_add[compile] 0.6969ms 80.5282μs 12.4180 KOps/s 12.2773 KOps/s $\color{#35bf28}+1.15\%$
test_seq_add[compile-overhead] 0.1680ms 0.1183ms 8.4509 KOps/s 8.3505 KOps/s $\color{#35bf28}+1.20\%$
test_seq_wrap[eager] 0.4663ms 0.3949ms 2.5326 KOps/s 2.5522 KOps/s $\color{#d91a1a}-0.77\%$
test_seq_wrap[compile] 1.1701ms 0.3067ms 3.2604 KOps/s 3.1551 KOps/s $\color{#35bf28}+3.34\%$
test_seq_wrap[compile-overhead] 0.2673ms 0.2197ms 4.5523 KOps/s 4.3933 KOps/s $\color{#35bf28}+3.62\%$
test_func_call_runtime[False-eager] 0.8302ms 0.7657ms 1.3059 KOps/s 1.3016 KOps/s $\color{#35bf28}+0.33\%$
test_func_call_runtime[False-compile] 1.0297ms 0.7863ms 1.2717 KOps/s 1.2606 KOps/s $\color{#35bf28}+0.88\%$
test_func_call_runtime[False-compile-overhead] 0.4141ms 0.3626ms 2.7582 KOps/s 2.7436 KOps/s $\color{#35bf28}+0.53\%$
test_func_call_runtime[True-eager] 0.9563ms 0.8878ms 1.1264 KOps/s 1.0841 KOps/s $\color{#35bf28}+3.90\%$
test_func_call_runtime[True-compile] 0.8564ms 0.8016ms 1.2474 KOps/s 1.2262 KOps/s $\color{#35bf28}+1.73\%$
test_func_call_runtime[True-compile-overhead] 0.5202ms 0.3982ms 2.5111 KOps/s 2.4964 KOps/s $\color{#35bf28}+0.59\%$
test_func_call_cm_runtime[False-eager] 0.8354ms 0.7512ms 1.3313 KOps/s 1.2846 KOps/s $\color{#35bf28}+3.63\%$
test_func_call_cm_runtime[False-compile] 0.8297ms 0.7675ms 1.3029 KOps/s 1.2601 KOps/s $\color{#35bf28}+3.40\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4122ms 0.3626ms 2.7578 KOps/s 2.7559 KOps/s $\color{#35bf28}+0.07\%$
test_func_call_cm_runtime[True-eager] 1.0711ms 0.9802ms 1.0202 KOps/s 980.9801 Ops/s $\color{#35bf28}+4.00\%$
test_func_call_cm_runtime[True-compile] 0.8922ms 0.8278ms 1.2080 KOps/s 1.1825 KOps/s $\color{#35bf28}+2.16\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4994ms 0.4190ms 2.3866 KOps/s 2.3630 KOps/s $\color{#35bf28}+1.00\%$
test_vmap_func_call_cm_runtime[eager] 2.5220ms 2.0280ms 493.1046 Ops/s 475.4725 Ops/s $\color{#35bf28}+3.71\%$
test_vmap_func_call_cm_runtime[compile] 1.0010ms 0.8935ms 1.1191 KOps/s 1.1542 KOps/s $\color{#d91a1a}-3.04\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.4945ms 0.4269ms 2.3425 KOps/s 2.3290 KOps/s $\color{#35bf28}+0.58\%$
test_distributed 1.8398ms 0.2094ms 4.7747 KOps/s 8.4801 KOps/s $\textbf{\color{#d91a1a}-43.70\%}$
test_tdmodule 0.1142ms 14.5481μs 68.7375 KOps/s 63.3620 KOps/s $\textbf{\color{#35bf28}+8.48\%}$
test_tdmodule_dispatch 65.2210μs 29.4196μs 33.9910 KOps/s 30.9493 KOps/s $\textbf{\color{#35bf28}+9.83\%}$
test_tdseq 36.7710μs 15.2907μs 65.3991 KOps/s 60.2803 KOps/s $\textbf{\color{#35bf28}+8.49\%}$
test_tdseq_dispatch 53.2410μs 31.6703μs 31.5753 KOps/s 29.7768 KOps/s $\textbf{\color{#35bf28}+6.04\%}$
test_instantiation_functorch 1.9193ms 1.8060ms 553.7124 Ops/s 535.8853 Ops/s $\color{#35bf28}+3.33\%$
test_instantiation_td 1.7252ms 1.1664ms 857.3693 Ops/s 825.9390 Ops/s $\color{#35bf28}+3.81\%$
test_exec_functorch 0.2530ms 0.2016ms 4.9604 KOps/s 4.7140 KOps/s $\textbf{\color{#35bf28}+5.23\%}$
test_exec_functional_call 0.2984ms 0.2088ms 4.7904 KOps/s 4.6752 KOps/s $\color{#35bf28}+2.46\%$
test_exec_td 0.2663ms 0.2118ms 4.7212 KOps/s 4.4924 KOps/s $\textbf{\color{#35bf28}+5.09\%}$
test_exec_td_decorator 0.9292ms 0.2528ms 3.9550 KOps/s 3.7808 KOps/s $\color{#35bf28}+4.61\%$
test_vmap_mlp_speed[True-True] 0.7821ms 0.6757ms 1.4800 KOps/s 1.4423 KOps/s $\color{#35bf28}+2.62\%$
test_vmap_mlp_speed[True-False] 0.7470ms 0.6747ms 1.4820 KOps/s 1.4401 KOps/s $\color{#35bf28}+2.91\%$
test_vmap_mlp_speed[False-True] 0.6936ms 0.5675ms 1.7621 KOps/s 1.7325 KOps/s $\color{#35bf28}+1.71\%$
test_vmap_mlp_speed[False-False] 0.6401ms 0.5657ms 1.7676 KOps/s 1.7145 KOps/s $\color{#35bf28}+3.10\%$
test_vmap_mlp_speed_decorator[True-True] 0.7713ms 0.6613ms 1.5121 KOps/s 1.4227 KOps/s $\textbf{\color{#35bf28}+6.28\%}$
test_vmap_mlp_speed_decorator[True-False] 0.9036ms 0.6618ms 1.5110 KOps/s 1.4101 KOps/s $\textbf{\color{#35bf28}+7.16\%}$
test_vmap_mlp_speed_decorator[False-True] 0.6943ms 0.5792ms 1.7265 KOps/s 1.6184 KOps/s $\textbf{\color{#35bf28}+6.68\%}$
test_vmap_mlp_speed_decorator[False-False] 0.6881ms 0.5793ms 1.7261 KOps/s 1.6148 KOps/s $\textbf{\color{#35bf28}+6.89\%}$
test_vmap_transformer_speed[True-True] 8.4019ms 8.2341ms 121.4457 Ops/s 119.0106 Ops/s $\color{#35bf28}+2.05\%$
test_vmap_transformer_speed[True-False] 8.2676ms 8.1718ms 122.3716 Ops/s 119.6144 Ops/s $\color{#35bf28}+2.31\%$
test_vmap_transformer_speed[False-True] 8.1668ms 7.9792ms 125.3265 Ops/s 122.3623 Ops/s $\color{#35bf28}+2.42\%$
test_vmap_transformer_speed[False-False] 8.1071ms 8.0091ms 124.8576 Ops/s 122.3310 Ops/s $\color{#35bf28}+2.07\%$
test_vmap_transformer_speed_decorator[True-True] 19.2146ms 19.0688ms 52.4417 Ops/s 51.1112 Ops/s $\color{#35bf28}+2.60\%$
test_vmap_transformer_speed_decorator[True-False] 19.2118ms 19.0924ms 52.3768 Ops/s 50.8040 Ops/s $\color{#35bf28}+3.10\%$
test_vmap_transformer_speed_decorator[False-True] 19.1233ms 18.9614ms 52.7388 Ops/s 51.6003 Ops/s $\color{#35bf28}+2.21\%$
test_vmap_transformer_speed_decorator[False-False] 19.1880ms 18.9886ms 52.6632 Ops/s 51.3229 Ops/s $\color{#35bf28}+2.61\%$
test_to_module_speed[True] 1.4144ms 0.9046ms 1.1054 KOps/s 1.0878 KOps/s $\color{#35bf28}+1.62\%$
test_to_module_speed[False] 1.2620ms 0.8939ms 1.1186 KOps/s 1.1174 KOps/s $\color{#35bf28}+0.11\%$
test_tc_init 61.9210μs 32.8263μs 30.4633 KOps/s 29.5196 KOps/s $\color{#35bf28}+3.20\%$
test_tc_init_nested 99.2310μs 65.7935μs 15.1991 KOps/s 13.9561 KOps/s $\textbf{\color{#35bf28}+8.91\%}$
test_tc_first_layer_tensor 4.8530μs 0.6779μs 1.4751 MOps/s 1.4740 MOps/s $\color{#35bf28}+0.07\%$
test_tc_first_layer_nontensor 19.2410μs 2.2398μs 446.4624 KOps/s 445.4382 KOps/s $\color{#35bf28}+0.23\%$
test_tc_second_layer_tensor 9.2000μs 1.3717μs 729.0239 KOps/s 731.3226 KOps/s $\color{#d91a1a}-0.31\%$
test_tc_second_layer_nontensor 31.4910μs 2.9585μs 338.0108 KOps/s 339.4418 KOps/s $\color{#d91a1a}-0.42\%$
test_unbind 0.1910s 11.6270ms 86.0070 Ops/s 91.8951 Ops/s $\textbf{\color{#d91a1a}-6.41\%}$
test_full_like 0.6487ms 0.5740ms 1.7422 KOps/s 1.7399 KOps/s $\color{#35bf28}+0.13\%$
test_zeros_like 0.2665ms 0.1980ms 5.0494 KOps/s 5.0537 KOps/s $\color{#d91a1a}-0.09\%$
test_ones_like 0.2338ms 0.1978ms 5.0550 KOps/s 5.0592 KOps/s $\color{#d91a1a}-0.08\%$
test_clone 0.4521ms 0.4135ms 2.4186 KOps/s 2.4152 KOps/s $\color{#35bf28}+0.14\%$
test_squeeze 40.9010μs 9.2325μs 108.3126 KOps/s 105.0634 KOps/s $\color{#35bf28}+3.09\%$
test_unsqueeze 0.2620ms 68.8007μs 14.5347 KOps/s 14.1704 KOps/s $\color{#35bf28}+2.57\%$
test_split 0.2736ms 0.1523ms 6.5650 KOps/s 6.4329 KOps/s $\color{#35bf28}+2.05\%$
test_permute 0.2602ms 0.1703ms 5.8703 KOps/s 5.8018 KOps/s $\color{#35bf28}+1.18\%$
test_stack 1.2559ms 0.8606ms 1.1620 KOps/s 1.1281 KOps/s $\color{#35bf28}+3.01\%$
test_cat 1.2518ms 1.2317ms 811.9156 Ops/s 811.9925 Ops/s $-0.01\%$
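
For anyone reading the table above, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below assumes that relationship (it is inferred from the rows, not taken from the benchmark bot's source), and the helper names `to_ops` and `relative_change` are made up for illustration. Units differ between columns in some rows, so values are normalised to Ops/s first.

```python
# Assumed relationship: change = (ops_pr / ops_head - 1) * 100.
# Both helpers are hypothetical and are not part of the benchmark tooling.

UNIT_FACTORS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops(cell: str) -> float:
    """Convert a table cell such as '4.7747 KOps/s' to plain Ops/s."""
    value, unit = cell.split()
    return float(value) * UNIT_FACTORS[unit]


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    return (ops_pr / ops_head - 1.0) * 100.0


# test_distributed row: the largest regression reported in the table.
distributed = relative_change(to_ops("4.7747 KOps/s"), to_ops("8.4801 KOps/s"))

# test_mod_wrap_and_backward[compile-overhead] row: mixed units (KOps/s vs Ops/s).
mixed_units = relative_change(to_ops("1.1183 KOps/s"), to_ops("985.6006 Ops/s"))

print(f"test_distributed: {distributed:+.2f}%")
print(f"test_mod_wrap_and_backward[compile-overhead]: {mixed_units:+.2f}%")
```

Because the table shows rounded throughput values, the recomputed percentages can differ from the reported ones in the last digit.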

@vmoens vmoens merged commit a20623e into gh/vmoens/17/base Sep 2, 2024
vmoens pushed a commit that referenced this pull request Sep 2, 2024
ghstack-source-id: ea5b148
Pull Request resolved: #977
@vmoens vmoens deleted the gh/vmoens/17/head branch September 2, 2024 13:36
