@vmoens (Collaborator) commented Sep 16, 2024

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Sep 16, 2024
ghstack-source-id: f7350bb
Pull Request resolved: #995
@facebook-github-bot added the CLA Signed label Sep 16, 2024
@vmoens merged commit d8e1e76 into gh/vmoens/22/base Sep 16, 2024
@vmoens deleted the gh/vmoens/22/head branch September 16, 2024 23:45
@github-actions

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}29$. Worsened: $\large\color{#d91a1a}6$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 78.0170μs 20.3336μs 49.1796 KOps/s 48.3402 KOps/s $\color{#35bf28}+1.74\%$
test_plain_set_stack_nested 55.4940μs 20.3697μs 49.0924 KOps/s 47.9260 KOps/s $\color{#35bf28}+2.43\%$
test_plain_set_nested_inplace 58.8200μs 21.7666μs 45.9419 KOps/s 44.8279 KOps/s $\color{#35bf28}+2.49\%$
test_plain_set_stack_nested_inplace 53.1400μs 21.5183μs 46.4720 KOps/s 44.8773 KOps/s $\color{#35bf28}+3.55\%$
test_items 28.7630μs 4.1424μs 241.4075 KOps/s 241.4749 KOps/s $\color{#d91a1a}-0.03\%$
test_items_nested 0.9385ms 0.3640ms 2.7470 KOps/s 2.7954 KOps/s $\color{#d91a1a}-1.73\%$
test_items_nested_locked 0.7312ms 0.3664ms 2.7293 KOps/s 2.7818 KOps/s $\color{#d91a1a}-1.89\%$
test_items_nested_leaf 0.1406ms 69.5794μs 14.3721 KOps/s 14.4136 KOps/s $\color{#d91a1a}-0.29\%$
test_items_stack_nested 0.4989ms 0.3633ms 2.7525 KOps/s 2.7521 KOps/s $\color{#35bf28}+0.02\%$
test_items_stack_nested_leaf 0.1810ms 70.5845μs 14.1674 KOps/s 13.9892 KOps/s $\color{#35bf28}+1.27\%$
test_items_stack_nested_locked 0.7628ms 0.3644ms 2.7445 KOps/s 2.7924 KOps/s $\color{#d91a1a}-1.72\%$
test_keys 41.2670μs 3.5809μs 279.2571 KOps/s 284.1841 KOps/s $\color{#d91a1a}-1.73\%$
test_keys_nested 0.1982ms 0.1001ms 9.9935 KOps/s 10.1975 KOps/s $\color{#d91a1a}-2.00\%$
test_keys_nested_locked 1.2806ms 0.1076ms 9.2966 KOps/s 9.5200 KOps/s $\color{#d91a1a}-2.35\%$
test_keys_nested_leaf 0.1626ms 82.7191μs 12.0891 KOps/s 11.9169 KOps/s $\color{#35bf28}+1.45\%$
test_keys_stack_nested 0.1846ms 99.4972μs 10.0505 KOps/s 10.1641 KOps/s $\color{#d91a1a}-1.12\%$
test_keys_stack_nested_leaf 0.1795ms 83.7338μs 11.9426 KOps/s 12.3497 KOps/s $\color{#d91a1a}-3.30\%$
test_keys_stack_nested_locked 0.1824ms 0.1044ms 9.5745 KOps/s 9.5993 KOps/s $\color{#d91a1a}-0.26\%$
test_values 9.2132μs 1.1180μs 894.4625 KOps/s 894.8477 KOps/s $\color{#d91a1a}-0.04\%$
test_values_nested 0.1427ms 72.1889μs 13.8526 KOps/s 13.9305 KOps/s $\color{#d91a1a}-0.56\%$
test_values_nested_locked 0.1274ms 71.9373μs 13.9010 KOps/s 13.9598 KOps/s $\color{#d91a1a}-0.42\%$
test_values_nested_leaf 0.1234ms 61.3006μs 16.3131 KOps/s 16.1808 KOps/s $\color{#35bf28}+0.82\%$
test_values_stack_nested 0.1424ms 72.6802μs 13.7589 KOps/s 13.7071 KOps/s $\color{#35bf28}+0.38\%$
test_values_stack_nested_leaf 0.1224ms 61.6436μs 16.2223 KOps/s 15.3265 KOps/s $\textbf{\color{#35bf28}+5.84\%}$
test_values_stack_nested_locked 0.1302ms 72.2718μs 13.8367 KOps/s 13.8731 KOps/s $\color{#d91a1a}-0.26\%$
test_membership 6.0929μs 0.7212μs 1.3865 MOps/s 1.1425 MOps/s $\textbf{\color{#35bf28}+21.36\%}$
test_membership_nested 20.4280μs 2.7658μs 361.5641 KOps/s 369.8564 KOps/s $\color{#d91a1a}-2.24\%$
test_membership_nested_leaf 20.9390μs 2.7823μs 359.4111 KOps/s 367.7906 KOps/s $\color{#d91a1a}-2.28\%$
test_membership_stacked_nested 18.4140μs 2.8256μs 353.9064 KOps/s 366.7315 KOps/s $\color{#d91a1a}-3.50\%$
test_membership_stacked_nested_leaf 28.1430μs 2.7935μs 357.9714 KOps/s 362.5658 KOps/s $\color{#d91a1a}-1.27\%$
test_membership_nested_last 39.0330μs 3.8801μs 257.7246 KOps/s 260.3833 KOps/s $\color{#d91a1a}-1.02\%$
test_membership_nested_leaf_last 35.2160μs 3.9073μs 255.9294 KOps/s 256.6957 KOps/s $\color{#d91a1a}-0.30\%$
test_membership_stacked_nested_last 43.0610μs 3.8895μs 257.1045 KOps/s 224.8366 KOps/s $\textbf{\color{#35bf28}+14.35\%}$
test_membership_stacked_nested_leaf_last 26.9610μs 3.9194μs 255.1398 KOps/s 223.8742 KOps/s $\textbf{\color{#35bf28}+13.97\%}$
test_nested_getleaf 49.6630μs 10.7343μs 93.1589 KOps/s 94.3210 KOps/s $\color{#d91a1a}-1.23\%$
test_nested_get 50.5650μs 10.2007μs 98.0328 KOps/s 99.1078 KOps/s $\color{#d91a1a}-1.08\%$
test_stacked_getleaf 34.3550μs 10.7925μs 92.6567 KOps/s 92.9277 KOps/s $\color{#d91a1a}-0.29\%$
test_stacked_get 47.5590μs 10.2198μs 97.8489 KOps/s 98.0313 KOps/s $\color{#d91a1a}-0.19\%$
test_nested_getitemleaf 36.2080μs 11.1455μs 89.7226 KOps/s 89.8546 KOps/s $\color{#d91a1a}-0.15\%$
test_nested_getitem 50.1040μs 10.2506μs 97.5555 KOps/s 97.8302 KOps/s $\color{#d91a1a}-0.28\%$
test_stacked_getitemleaf 36.3080μs 10.9904μs 90.9882 KOps/s 90.4254 KOps/s $\color{#35bf28}+0.62\%$
test_stacked_getitem 71.9950μs 10.3692μs 96.4395 KOps/s 96.8992 KOps/s $\color{#d91a1a}-0.47\%$
test_lock_nested 81.6548ms 0.5666ms 1.7649 KOps/s 2.1249 KOps/s $\textbf{\color{#d91a1a}-16.94\%}$
test_lock_stack_nested 0.7133ms 0.4525ms 2.2100 KOps/s 2.2678 KOps/s $\color{#d91a1a}-2.55\%$
test_unlock_nested 85.7191ms 0.4927ms 2.0298 KOps/s 2.5273 KOps/s $\textbf{\color{#d91a1a}-19.69\%}$
test_unlock_stack_nested 0.7774ms 0.3711ms 2.6945 KOps/s 2.7453 KOps/s $\color{#d91a1a}-1.85\%$
test_flatten_speed 0.1463ms 87.4011μs 11.4415 KOps/s 11.3668 KOps/s $\color{#35bf28}+0.66\%$
test_unflatten_speed 0.8264ms 0.4625ms 2.1620 KOps/s 2.2157 KOps/s $\color{#d91a1a}-2.43\%$
test_common_ops 3.9498ms 1.1280ms 886.5206 Ops/s 890.7756 Ops/s $\color{#d91a1a}-0.48\%$
test_creation 18.3140μs 2.0972μs 476.8301 KOps/s 482.2576 KOps/s $\color{#d91a1a}-1.13\%$
test_creation_empty 54.1420μs 16.8548μs 59.3301 KOps/s 53.0300 KOps/s $\textbf{\color{#35bf28}+11.88\%}$
test_creation_nested_1 1.1789ms 20.1712μs 49.5757 KOps/s 45.7995 KOps/s $\textbf{\color{#35bf28}+8.24\%}$
test_creation_nested_2 57.9890μs 24.8012μs 40.3206 KOps/s 38.3620 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_clone 0.1056ms 16.5944μs 60.2613 KOps/s 58.7624 KOps/s $\color{#35bf28}+2.55\%$
test_getitem[int] 0.9477ms 16.2231μs 61.6407 KOps/s 59.2035 KOps/s $\color{#35bf28}+4.12\%$
test_getitem[slice_int] 0.1565ms 29.1119μs 34.3502 KOps/s 32.4608 KOps/s $\textbf{\color{#35bf28}+5.82\%}$
test_getitem[range] 0.1792ms 56.7913μs 17.6083 KOps/s 16.7442 KOps/s $\textbf{\color{#35bf28}+5.16\%}$
test_getitem[tuple] 0.1274ms 24.3517μs 41.0649 KOps/s 40.0365 KOps/s $\color{#35bf28}+2.57\%$
test_getitem[list] 0.2241ms 51.8442μs 19.2885 KOps/s 18.7138 KOps/s $\color{#35bf28}+3.07\%$
test_setitem_dim[int] 68.9900μs 32.1529μs 31.1014 KOps/s 30.7600 KOps/s $\color{#35bf28}+1.11\%$
test_setitem_dim[slice_int] 99.1860μs 59.7141μs 16.7465 KOps/s 16.3320 KOps/s $\color{#35bf28}+2.54\%$
test_setitem_dim[range] 0.1572ms 82.8595μs 12.0686 KOps/s 12.0150 KOps/s $\color{#35bf28}+0.45\%$
test_setitem_dim[tuple] 83.7870μs 49.2085μs 20.3217 KOps/s 20.5088 KOps/s $\color{#d91a1a}-0.91\%$
test_setitem 77.6160μs 29.4172μs 33.9937 KOps/s 33.7230 KOps/s $\color{#35bf28}+0.80\%$
test_set 98.0040μs 28.5170μs 35.0668 KOps/s 34.1502 KOps/s $\color{#35bf28}+2.68\%$
test_set_shared 1.3028ms 0.2073ms 4.8234 KOps/s 4.7335 KOps/s $\color{#35bf28}+1.90\%$
test_update 0.1410ms 34.7255μs 28.7973 KOps/s 27.1323 KOps/s $\textbf{\color{#35bf28}+6.14\%}$
test_update_nested 0.1156ms 45.8762μs 21.7978 KOps/s 21.5776 KOps/s $\color{#35bf28}+1.02\%$
test_update__nested 83.4660μs 34.6994μs 28.8190 KOps/s 29.2905 KOps/s $\color{#d91a1a}-1.61\%$
test_set_nested 76.0420μs 30.7106μs 32.5620 KOps/s 31.9401 KOps/s $\color{#35bf28}+1.95\%$
test_set_nested_new 0.1211ms 35.5484μs 28.1307 KOps/s 27.4796 KOps/s $\color{#35bf28}+2.37\%$
test_select 0.1228ms 52.9848μs 18.8733 KOps/s 18.4130 KOps/s $\color{#35bf28}+2.50\%$
test_select_nested 0.9264ms 58.6298μs 17.0562 KOps/s 16.9629 KOps/s $\color{#35bf28}+0.55\%$
test_exclude_nested 0.1582ms 74.7285μs 13.3818 KOps/s 13.4933 KOps/s $\color{#d91a1a}-0.83\%$
test_empty[True] 0.5133ms 0.3181ms 3.1435 KOps/s 3.1725 KOps/s $\color{#d91a1a}-0.91\%$
test_empty[False] 6.7348μs 1.1894μs 840.7434 KOps/s 854.4228 KOps/s $\color{#d91a1a}-1.60\%$
test_unbind_speed 0.4676ms 0.2999ms 3.3344 KOps/s 3.3446 KOps/s $\color{#d91a1a}-0.31\%$
test_unbind_speed_stack0 0.6736ms 0.2974ms 3.3630 KOps/s 3.4328 KOps/s $\color{#d91a1a}-2.03\%$
test_unbind_speed_stack1 85.1312ms 0.8000ms 1.2500 KOps/s 1.3958 KOps/s $\textbf{\color{#d91a1a}-10.45\%}$
test_split 2.7852ms 1.9482ms 513.2833 Ops/s 453.4294 Ops/s $\textbf{\color{#35bf28}+13.20\%}$
test_chunk 91.4264ms 2.3019ms 434.4228 Ops/s 450.4782 Ops/s $\color{#d91a1a}-3.56\%$
test_creation[device0] 3.4376ms 0.1180ms 8.4713 KOps/s 8.5440 KOps/s $\color{#d91a1a}-0.85\%$
test_creation_from_tensor 0.2251ms 0.1151ms 8.6873 KOps/s 8.5847 KOps/s $\color{#35bf28}+1.20\%$
test_add_one[memmap_tensor0] 0.1956ms 7.0636μs 141.5718 KOps/s 135.3712 KOps/s $\color{#35bf28}+4.58\%$
test_contiguous[memmap_tensor0] 21.3300μs 1.9135μs 522.5939 KOps/s 528.0327 KOps/s $\color{#d91a1a}-1.03\%$
test_stack[memmap_tensor0] 47.2180μs 5.5257μs 180.9736 KOps/s 178.4770 KOps/s $\color{#35bf28}+1.40\%$
test_memmaptd_index 1.0192ms 0.3952ms 2.5307 KOps/s 2.5053 KOps/s $\color{#35bf28}+1.01\%$
test_memmaptd_index_astensor 0.8135ms 0.4694ms 2.1302 KOps/s 2.0955 KOps/s $\color{#35bf28}+1.66\%$
test_memmaptd_index_op 1.3420ms 0.9754ms 1.0252 KOps/s 979.4189 Ops/s $\color{#35bf28}+4.67\%$
test_serialize_model 0.1232s 0.1169s 8.5532 Ops/s 8.4345 Ops/s $\color{#35bf28}+1.41\%$
test_serialize_model_pickle 0.4299s 0.3855s 2.5942 Ops/s 2.4874 Ops/s $\color{#35bf28}+4.29\%$
test_serialize_weights 0.1207s 0.1159s 8.6264 Ops/s 7.8352 Ops/s $\textbf{\color{#35bf28}+10.10\%}$
test_serialize_weights_returnearly 0.1715s 0.1560s 6.4106 Ops/s 6.3719 Ops/s $\color{#35bf28}+0.61\%$
test_serialize_weights_pickle 0.4837s 0.4062s 2.4621 Ops/s 2.5505 Ops/s $\color{#d91a1a}-3.47\%$
test_serialize_weights_filesystem 0.1453s 0.1397s 7.1607 Ops/s 7.0861 Ops/s $\color{#35bf28}+1.05\%$
test_serialize_model_filesystem 0.1612s 0.1495s 6.6897 Ops/s 6.1527 Ops/s $\textbf{\color{#35bf28}+8.73\%}$
test_reshape_pytree 84.0080μs 38.1680μs 26.2000 KOps/s 25.8435 KOps/s $\color{#35bf28}+1.38\%$
test_reshape_td 0.1016ms 45.4435μs 22.0054 KOps/s 20.9510 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_view_pytree 88.4660μs 38.2144μs 26.1681 KOps/s 25.8603 KOps/s $\color{#35bf28}+1.19\%$
test_view_td 0.1153ms 50.7449μs 19.7064 KOps/s 18.8163 KOps/s $\color{#35bf28}+4.73\%$
test_unbind_pytree 82.9660μs 34.5388μs 28.9529 KOps/s 27.4554 KOps/s $\textbf{\color{#35bf28}+5.45\%}$
test_unbind_td 0.3723ms 44.0045μs 22.7249 KOps/s 22.8318 KOps/s $\color{#d91a1a}-0.47\%$
test_split_pytree 87.0030μs 36.7725μs 27.1942 KOps/s 26.4968 KOps/s $\color{#35bf28}+2.63\%$
test_split_td 0.1942ms 55.6765μs 17.9609 KOps/s 17.3392 KOps/s $\color{#35bf28}+3.59\%$
test_add_pytree 0.1132ms 43.7355μs 22.8647 KOps/s 22.2221 KOps/s $\color{#35bf28}+2.89\%$
test_add_td 0.1758ms 79.4716μs 12.5831 KOps/s 12.6215 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_add_one_nested[tensordict-compile] 0.1295ms 57.5877μs 17.3648 KOps/s 17.8192 KOps/s $\color{#d91a1a}-2.55\%$
test_compile_add_one_nested[tensordict-eager] 0.4089ms 0.1784ms 5.6047 KOps/s 5.6755 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_add_one_nested[pytree-compile] 0.1153ms 56.5794μs 17.6743 KOps/s 17.5510 KOps/s $\color{#35bf28}+0.70\%$
test_compile_add_one_nested[pytree-eager] 0.3048ms 0.1393ms 7.1763 KOps/s 7.0537 KOps/s $\color{#35bf28}+1.74\%$
test_compile_copy_nested[tensordict-compile] 50.7260μs 20.8937μs 47.8613 KOps/s 48.7582 KOps/s $\color{#d91a1a}-1.84\%$
test_compile_copy_nested[tensordict-eager] 0.1462ms 67.6606μs 14.7797 KOps/s 15.1104 KOps/s $\color{#d91a1a}-2.19\%$
test_compile_copy_nested[pytree-compile] 0.1521ms 74.4233μs 13.4366 KOps/s 13.4430 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_copy_nested[pytree-eager] 0.1079ms 67.5417μs 14.8057 KOps/s 14.8777 KOps/s $\color{#d91a1a}-0.48\%$
test_compile_add_one_flat[tensordict-compile] 0.2745ms 0.1706ms 5.8625 KOps/s 5.8023 KOps/s $\color{#35bf28}+1.04\%$
test_compile_add_one_flat[tensordict-eager] 0.3078ms 0.1857ms 5.3856 KOps/s 5.3295 KOps/s $\color{#35bf28}+1.05\%$
test_compile_add_one_flat[tensorclass-compile] 0.1034ms 47.1053μs 21.2291 KOps/s 21.1907 KOps/s $\color{#35bf28}+0.18\%$
test_compile_add_one_flat[tensorclass-eager] 0.1421ms 67.4787μs 14.8195 KOps/s 14.4478 KOps/s $\color{#35bf28}+2.57\%$
test_compile_add_one_flat[pytree-compile] 0.2632ms 0.1723ms 5.8041 KOps/s 5.7735 KOps/s $\color{#35bf28}+0.53\%$
test_compile_add_one_flat[pytree-eager] 0.4197ms 0.2828ms 3.5360 KOps/s 3.3416 KOps/s $\textbf{\color{#35bf28}+5.82\%}$
test_compile_add_self_flat[tensordict-eager] 0.3249ms 0.1980ms 5.0497 KOps/s 4.9404 KOps/s $\color{#35bf28}+2.21\%$
test_compile_add_self_flat[tensordict-compile] 0.3368ms 0.1747ms 5.7248 KOps/s 5.7284 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_add_self_flat[tensorclass-eager] 0.1605ms 61.9517μs 16.1416 KOps/s 15.8832 KOps/s $\color{#35bf28}+1.63\%$
test_compile_add_self_flat[tensorclass-compile] 0.1099ms 47.7657μs 20.9355 KOps/s 20.8216 KOps/s $\color{#35bf28}+0.55\%$
test_compile_add_self_flat[pytree-eager] 0.3894ms 0.2311ms 4.3275 KOps/s 4.1172 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_compile_add_self_flat[pytree-compile] 0.2658ms 0.1768ms 5.6566 KOps/s 5.6904 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_copy_flat[tensordict-compile] 0.1636ms 0.1010ms 9.9049 KOps/s 9.8791 KOps/s $\color{#35bf28}+0.26\%$
test_compile_copy_flat[tensordict-eager] 0.1138ms 56.8130μs 17.6016 KOps/s 17.3504 KOps/s $\color{#35bf28}+1.45\%$
test_compile_copy_flat[pytree-compile] 0.1474ms 77.1137μs 12.9679 KOps/s 13.2123 KOps/s $\color{#d91a1a}-1.85\%$
test_compile_copy_flat[pytree-eager] 0.1353ms 68.6147μs 14.5741 KOps/s 14.5968 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_assign_and_add[tensordict-compile] 0.2730ms 0.1930ms 5.1811 KOps/s 4.9766 KOps/s $\color{#35bf28}+4.11\%$
test_compile_assign_and_add[tensordict-eager] 1.8362ms 1.6309ms 613.1674 Ops/s 614.4994 Ops/s $\color{#d91a1a}-0.22\%$
test_compile_assign_and_add[pytree-compile] 0.2700ms 0.1916ms 5.2202 KOps/s 5.0489 KOps/s $\color{#35bf28}+3.39\%$
test_compile_assign_and_add[pytree-eager] 1.3268ms 1.0768ms 928.7025 Ops/s 884.3323 Ops/s $\textbf{\color{#35bf28}+5.02\%}$
test_compile_assign_and_add_stack[compile] 0.5204ms 0.4123ms 2.4257 KOps/s 2.3018 KOps/s $\textbf{\color{#35bf28}+5.38\%}$
test_compile_assign_and_add_stack[eager] 5.3431ms 3.6064ms 277.2878 Ops/s 263.6137 Ops/s $\textbf{\color{#35bf28}+5.19\%}$
test_compile_indexing[tensor-tensordict-compile] 95.2990μs 35.1302μs 28.4655 KOps/s 29.0317 KOps/s $\color{#d91a1a}-1.95\%$
test_compile_indexing[tensor-tensordict-eager] 1.0750ms 48.0860μs 20.7961 KOps/s 20.8685 KOps/s $\color{#d91a1a}-0.35\%$
test_compile_indexing[tensor-tensorclass-compile] 75.2210μs 30.3922μs 32.9032 KOps/s 34.3657 KOps/s $\color{#d91a1a}-4.26\%$
test_compile_indexing[tensor-tensorclass-eager] 88.3290μs 27.4823μs 36.3871 KOps/s 35.0925 KOps/s $\color{#35bf28}+3.69\%$
test_compile_indexing[tensor-pytree-compile] 86.5020μs 29.8090μs 33.5469 KOps/s 33.9006 KOps/s $\color{#d91a1a}-1.04\%$
test_compile_indexing[tensor-pytree-eager] 72.8270μs 27.3087μs 36.6183 KOps/s 34.9965 KOps/s $\color{#35bf28}+4.63\%$
test_compile_indexing[slice-tensordict-compile] 0.1765ms 73.9521μs 13.5223 KOps/s 13.6714 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_indexing[slice-tensordict-eager] 0.6162ms 26.9884μs 37.0530 KOps/s 34.4821 KOps/s $\textbf{\color{#35bf28}+7.46\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1331ms 68.8198μs 14.5307 KOps/s 14.6177 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_indexing[slice-tensorclass-eager] 56.3160μs 22.1487μs 45.1493 KOps/s 42.7167 KOps/s $\textbf{\color{#35bf28}+5.69\%}$
test_compile_indexing[slice-pytree-compile] 0.1347ms 67.9656μs 14.7133 KOps/s 15.0790 KOps/s $\color{#d91a1a}-2.42\%$
test_compile_indexing[slice-pytree-eager] 83.5970μs 22.1802μs 45.0853 KOps/s 43.3041 KOps/s $\color{#35bf28}+4.11\%$
test_compile_indexing[int-tensordict-compile] 0.1493ms 74.5625μs 13.4116 KOps/s 14.0545 KOps/s $\color{#d91a1a}-4.57\%$
test_compile_indexing[int-tensordict-eager] 0.8595ms 26.7064μs 37.4442 KOps/s 36.2961 KOps/s $\color{#35bf28}+3.16\%$
test_compile_indexing[int-tensorclass-compile] 0.1489ms 67.9891μs 14.7082 KOps/s 14.8468 KOps/s $\color{#d91a1a}-0.93\%$
test_compile_indexing[int-tensorclass-eager] 81.1320μs 22.1375μs 45.1723 KOps/s 44.3508 KOps/s $\color{#35bf28}+1.85\%$
test_compile_indexing[int-pytree-compile] 0.1740ms 68.1713μs 14.6689 KOps/s 14.8845 KOps/s $\color{#d91a1a}-1.45\%$
test_compile_indexing[int-pytree-eager] 88.0850μs 21.9358μs 45.5875 KOps/s 44.0272 KOps/s $\color{#35bf28}+3.54\%$
test_mod_add[eager] 78.8880μs 23.2991μs 42.9201 KOps/s 40.5260 KOps/s $\textbf{\color{#35bf28}+5.91\%}$
test_mod_add[compile] 84.3590μs 38.0321μs 26.2936 KOps/s 26.0006 KOps/s $\color{#35bf28}+1.13\%$
test_mod_add[compile-overhead] 0.1381ms 40.7411μs 24.5452 KOps/s 25.6813 KOps/s $\color{#d91a1a}-4.42\%$
test_mod_wrap[eager] 0.3974ms 0.2036ms 4.9114 KOps/s 4.7656 KOps/s $\color{#35bf28}+3.06\%$
test_mod_wrap[compile] 0.4763ms 0.2311ms 4.3271 KOps/s 4.3216 KOps/s $\color{#35bf28}+0.13\%$
test_mod_wrap[compile-overhead] 0.4152ms 0.2277ms 4.3921 KOps/s 4.3591 KOps/s $\color{#35bf28}+0.76\%$
test_mod_wrap_and_backward[eager] 12.0983ms 10.5864ms 94.4608 Ops/s 86.0882 Ops/s $\textbf{\color{#35bf28}+9.73\%}$
test_mod_wrap_and_backward[compile] 12.5993ms 10.5604ms 94.6933 Ops/s 82.5807 Ops/s $\textbf{\color{#35bf28}+14.67\%}$
test_mod_wrap_and_backward[compile-overhead] 12.0709ms 10.5150ms 95.1025 Ops/s 88.9588 Ops/s $\textbf{\color{#35bf28}+6.91\%}$
test_seq_add[eager] 0.1783ms 88.9804μs 11.2384 KOps/s 11.2223 KOps/s $\color{#35bf28}+0.14\%$
test_seq_add[compile] 0.1217ms 64.2575μs 15.5624 KOps/s 15.8822 KOps/s $\color{#d91a1a}-2.01\%$
test_seq_add[compile-overhead] 0.1514ms 62.7979μs 15.9241 KOps/s 16.3507 KOps/s $\color{#d91a1a}-2.61\%$
test_seq_wrap[eager] 0.5481ms 0.3737ms 2.6763 KOps/s 2.6118 KOps/s $\color{#35bf28}+2.47\%$
test_seq_wrap[compile] 1.3803ms 0.2633ms 3.7983 KOps/s 3.7796 KOps/s $\color{#35bf28}+0.49\%$
test_seq_wrap[compile-overhead] 1.3589ms 0.2619ms 3.8188 KOps/s 3.7719 KOps/s $\color{#35bf28}+1.24\%$
test_func_call_runtime[False-eager] 0.9556ms 0.5142ms 1.9447 KOps/s 1.9895 KOps/s $\color{#d91a1a}-2.25\%$
test_func_call_runtime[False-compile] 0.9126ms 0.4946ms 2.0220 KOps/s 2.0021 KOps/s $\color{#35bf28}+0.99\%$
test_func_call_runtime[False-compile-overhead] 0.5972ms 0.4885ms 2.0471 KOps/s 2.0110 KOps/s $\color{#35bf28}+1.80\%$
test_func_call_runtime[True-eager] 1.1716ms 0.7267ms 1.3761 KOps/s 1.3900 KOps/s $\color{#d91a1a}-1.00\%$
test_func_call_runtime[True-compile] 0.8300ms 0.5042ms 1.9832 KOps/s 1.9572 KOps/s $\color{#35bf28}+1.32\%$
test_func_call_runtime[True-compile-overhead] 1.0473ms 0.5033ms 1.9868 KOps/s 1.9459 KOps/s $\color{#35bf28}+2.10\%$
test_func_call_cm_runtime[False-eager] 0.7523ms 0.5031ms 1.9878 KOps/s 2.0125 KOps/s $\color{#d91a1a}-1.23\%$
test_func_call_cm_runtime[False-compile] 0.8659ms 0.4947ms 2.0216 KOps/s 1.9928 KOps/s $\color{#35bf28}+1.45\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6775ms 0.4924ms 2.0310 KOps/s 1.9823 KOps/s $\color{#35bf28}+2.46\%$
test_func_call_cm_runtime[True-eager] 1.1334ms 0.8562ms 1.1680 KOps/s 1.1813 KOps/s $\color{#d91a1a}-1.12\%$
test_func_call_cm_runtime[True-compile] 1.1357ms 0.7222ms 1.3846 KOps/s 1.3723 KOps/s $\color{#35bf28}+0.90\%$
test_func_call_cm_runtime[True-compile-overhead] 0.9196ms 0.7235ms 1.3822 KOps/s 1.3846 KOps/s $\color{#d91a1a}-0.17\%$
test_vmap_func_call_cm_runtime[eager] 2.5491ms 1.8134ms 551.4567 Ops/s 534.6187 Ops/s $\color{#35bf28}+3.15\%$
test_vmap_func_call_cm_runtime[compile] 2.5111ms 1.8499ms 540.5776 Ops/s 519.5829 Ops/s $\color{#35bf28}+4.04\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.5886ms 1.8500ms 540.5268 Ops/s 522.3719 Ops/s $\color{#35bf28}+3.48\%$
test_distributed 0.2673ms 0.1220ms 8.1969 KOps/s 7.9748 KOps/s $\color{#35bf28}+2.79\%$
test_tdmodule 31.9300μs 17.3201μs 57.7364 KOps/s 54.6265 KOps/s $\textbf{\color{#35bf28}+5.69\%}$
test_tdmodule_dispatch 63.8700μs 35.3030μs 28.3262 KOps/s 26.7716 KOps/s $\textbf{\color{#35bf28}+5.81\%}$
test_tdseq 38.7830μs 20.3847μs 49.0563 KOps/s 46.1517 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_tdseq_dispatch 71.6440μs 41.0041μs 24.3878 KOps/s 23.2666 KOps/s $\color{#35bf28}+4.82\%$
test_instantiation_functorch 2.4515ms 1.5706ms 636.7133 Ops/s 632.7718 Ops/s $\color{#35bf28}+0.62\%$
test_instantiation_td 1.9716ms 1.1673ms 856.6761 Ops/s 859.9198 Ops/s $\color{#d91a1a}-0.38\%$
test_exec_functorch 0.3607ms 0.1783ms 5.6091 KOps/s 5.4236 KOps/s $\color{#35bf28}+3.42\%$
test_exec_functional_call 0.4061ms 0.1693ms 5.9081 KOps/s 5.8105 KOps/s $\color{#35bf28}+1.68\%$
test_exec_td 0.2832ms 0.1639ms 6.1030 KOps/s 6.0919 KOps/s $\color{#35bf28}+0.18\%$
test_exec_td_decorator 0.9706ms 0.2175ms 4.5988 KOps/s 4.5836 KOps/s $\color{#35bf28}+0.33\%$
test_vmap_mlp_speed[True-True] 0.9611ms 0.6257ms 1.5981 KOps/s 1.5466 KOps/s $\color{#35bf28}+3.33\%$
test_vmap_mlp_speed[True-False] 0.9706ms 0.6316ms 1.5834 KOps/s 1.5445 KOps/s $\color{#35bf28}+2.52\%$
test_vmap_mlp_speed[False-True] 0.5915ms 0.4808ms 2.0800 KOps/s 2.0078 KOps/s $\color{#35bf28}+3.59\%$
test_vmap_mlp_speed[False-False] 0.7343ms 0.4812ms 2.0783 KOps/s 1.9995 KOps/s $\color{#35bf28}+3.94\%$
test_vmap_mlp_speed_decorator[True-True] 1.4779ms 0.6061ms 1.6500 KOps/s 1.5935 KOps/s $\color{#35bf28}+3.54\%$
test_vmap_mlp_speed_decorator[True-False] 0.7938ms 0.6043ms 1.6549 KOps/s 1.5922 KOps/s $\color{#35bf28}+3.94\%$
test_vmap_mlp_speed_decorator[False-True] 0.7783ms 0.4970ms 2.0121 KOps/s 1.9547 KOps/s $\color{#35bf28}+2.94\%$
test_vmap_mlp_speed_decorator[False-False] 0.6713ms 0.4954ms 2.0185 KOps/s 1.9532 KOps/s $\color{#35bf28}+3.34\%$
test_to_module_speed[True] 2.0556ms 1.2720ms 786.1846 Ops/s 787.8979 Ops/s $\color{#d91a1a}-0.22\%$
test_to_module_speed[False] 1.7388ms 1.2350ms 809.6910 Ops/s 803.0600 Ops/s $\color{#35bf28}+0.83\%$
test_tc_init 85.0300μs 41.8414μs 23.8998 KOps/s 23.3086 KOps/s $\color{#35bf28}+2.54\%$
test_tc_init_nested 0.1543ms 84.3027μs 11.8620 KOps/s 11.5337 KOps/s $\color{#35bf28}+2.85\%$
test_tc_first_layer_tensor 18.6350μs 1.5053μs 664.3308 KOps/s 659.9741 KOps/s $\color{#35bf28}+0.66\%$
test_tc_first_layer_nontensor 23.8350μs 4.7279μs 211.5099 KOps/s 216.3164 KOps/s $\color{#d91a1a}-2.22\%$
test_tc_second_layer_tensor 22.6120μs 2.7769μs 360.1116 KOps/s 356.3001 KOps/s $\color{#35bf28}+1.07\%$
test_tc_second_layer_nontensor 32.5610μs 5.9736μs 167.4023 KOps/s 168.5578 KOps/s $\color{#d91a1a}-0.69\%$
test_unbind 0.4679s 12.9519ms 77.2086 Ops/s 77.8041 Ops/s $\color{#d91a1a}-0.77\%$
test_full_like 8.1354ms 7.2237ms 138.4327 Ops/s 141.9568 Ops/s $\color{#d91a1a}-2.48\%$
test_zeros_like 15.7120ms 6.1581ms 162.3887 Ops/s 371.9503 Ops/s $\textbf{\color{#d91a1a}-56.34\%}$
test_ones_like 14.9519ms 7.5116ms 133.1283 Ops/s 306.8231 Ops/s $\textbf{\color{#d91a1a}-56.61\%}$
test_clone 14.7242ms 9.1366ms 109.4499 Ops/s 200.1913 Ops/s $\textbf{\color{#d91a1a}-45.33\%}$
test_squeeze 68.3990μs 12.2793μs 81.4376 KOps/s 80.6323 KOps/s $\color{#35bf28}+1.00\%$
test_unsqueeze 0.1666ms 93.3485μs 10.7125 KOps/s 10.8304 KOps/s $\color{#d91a1a}-1.09\%$
test_split 0.5176ms 0.1968ms 5.0820 KOps/s 5.0547 KOps/s $\color{#35bf28}+0.54\%$
test_permute 0.4448ms 0.2216ms 4.5133 KOps/s 4.5695 KOps/s $\color{#d91a1a}-1.23\%$
test_stack 28.7991ms 24.5203ms 40.7826 Ops/s 41.6967 Ops/s $\color{#d91a1a}-2.19\%$
test_cat 34.9726ms 24.5554ms 40.7243 Ops/s 42.0183 Ops/s $\color{#d91a1a}-3.08\%$
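
For readers checking the numbers, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, after converting both to the same unit. The sketch below reproduces that arithmetic. The helper names (`parse_ops`, `percent_change`, `classify`) are hypothetical and are not taken from the benchmark workflow. The 5% cutoff for counting a benchmark as improved or worsened is an assumption inferred from which rows are bold in the table (for example, +5.03% is bold while +4.82% is not); it is not a documented setting.

```python
# Hypothetical helpers for re-deriving the "Change" column of the table.
# Nothing here is taken from the actual CI workflow.

# Throughput units as they appear in the table, expressed in Ops/s.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '49.1796 KOps/s' to plain Ops/s."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bold rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


if __name__ == "__main__":
    # Row test_plain_set_nested from the CPU table above.
    change = percent_change(parse_ops("49.1796 KOps/s"), parse_ops("48.3402 KOps/s"))
    print(f"{change:+.2f}% -> {classify(change)}")
```

Under these assumptions, the `test_plain_set_nested` row evaluates to +1.74%, matching the table, and rows with mixed units such as `test_memmaptd_index_op` (KOps/s against Ops/s) also match once both values are converted to Ops/s.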

@github-actions

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}15$. Worsened: $\large\color{#d91a1a}13$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.6621ms 14.2653μs 70.1002 KOps/s 67.0169 KOps/s $\color{#35bf28}+4.60\%$
test_plain_set_stack_nested 42.5810μs 14.4702μs 69.1078 KOps/s 66.6279 KOps/s $\color{#35bf28}+3.72\%$
test_plain_set_nested_inplace 44.6710μs 15.4107μs 64.8900 KOps/s 61.9730 KOps/s $\color{#35bf28}+4.71\%$
test_plain_set_stack_nested_inplace 45.2610μs 15.3041μs 65.3422 KOps/s 63.9392 KOps/s $\color{#35bf28}+2.19\%$
test_items 28.1500μs 2.9138μs 343.1964 KOps/s 340.6000 KOps/s $\color{#35bf28}+0.76\%$
test_items_nested 0.3766ms 0.3284ms 3.0451 KOps/s 3.0514 KOps/s $\color{#d91a1a}-0.21\%$
test_items_nested_locked 0.3981ms 0.3312ms 3.0197 KOps/s 3.0332 KOps/s $\color{#d91a1a}-0.44\%$
test_items_nested_leaf 86.3020μs 55.9451μs 17.8747 KOps/s 17.8477 KOps/s $\color{#35bf28}+0.15\%$
test_items_stack_nested 0.3810ms 0.3319ms 3.0133 KOps/s 3.0254 KOps/s $\color{#d91a1a}-0.40\%$
test_items_stack_nested_leaf 90.0320μs 57.8946μs 17.2728 KOps/s 17.6547 KOps/s $\color{#d91a1a}-2.16\%$
test_items_stack_nested_locked 0.3854ms 0.3338ms 2.9961 KOps/s 3.0066 KOps/s $\color{#d91a1a}-0.35\%$
test_keys 34.4810μs 3.4737μs 287.8786 KOps/s 273.1982 KOps/s $\textbf{\color{#35bf28}+5.37\%}$
test_keys_nested 87.3420μs 57.2662μs 17.4623 KOps/s 17.5807 KOps/s $\color{#d91a1a}-0.67\%$
test_keys_nested_locked 2.7301ms 63.0062μs 15.8715 KOps/s 15.9384 KOps/s $\color{#d91a1a}-0.42\%$
test_keys_nested_leaf 80.6220μs 46.8004μs 21.3674 KOps/s 20.6130 KOps/s $\color{#35bf28}+3.66\%$
test_keys_stack_nested 0.1190ms 57.0852μs 17.5177 KOps/s 17.4524 KOps/s $\color{#35bf28}+0.37\%$
test_keys_stack_nested_leaf 87.2320μs 49.4740μs 20.2126 KOps/s 20.3446 KOps/s $\color{#d91a1a}-0.65\%$
test_keys_stack_nested_locked 0.1108ms 62.7964μs 15.9245 KOps/s 15.9923 KOps/s $\color{#d91a1a}-0.42\%$
test_values 5.6918μs 0.8568μs 1.1671 MOps/s 1.1411 MOps/s $\color{#35bf28}+2.27\%$
test_values_nested 78.4220μs 41.0346μs 24.3697 KOps/s 24.4426 KOps/s $\color{#d91a1a}-0.30\%$
test_values_nested_locked 73.3220μs 42.9953μs 23.2584 KOps/s 23.1261 KOps/s $\color{#35bf28}+0.57\%$
test_values_nested_leaf 75.1020μs 35.6191μs 28.0748 KOps/s 27.9944 KOps/s $\color{#35bf28}+0.29\%$
test_values_stack_nested 70.0820μs 42.1099μs 23.7474 KOps/s 23.9372 KOps/s $\color{#d91a1a}-0.79\%$
test_values_stack_nested_leaf 67.0520μs 36.4842μs 27.4092 KOps/s 27.6908 KOps/s $\color{#d91a1a}-1.02\%$
test_values_stack_nested_locked 82.9610μs 43.8298μs 22.8155 KOps/s 22.6108 KOps/s $\color{#35bf28}+0.91\%$
test_membership 2.1266μs 0.5134μs 1.9477 MOps/s 1.9322 MOps/s $\color{#35bf28}+0.80\%$
test_membership_nested 13.8755μs 1.9648μs 508.9542 KOps/s 503.2344 KOps/s $\color{#35bf28}+1.14\%$
test_membership_nested_leaf 16.8255μs 1.9740μs 506.5780 KOps/s 500.5607 KOps/s $\color{#35bf28}+1.20\%$
test_membership_stacked_nested 25.6410μs 2.0303μs 492.5386 KOps/s 490.8985 KOps/s $\color{#35bf28}+0.33\%$
test_membership_stacked_nested_leaf 20.2610μs 2.0212μs 494.7482 KOps/s 485.7125 KOps/s $\color{#35bf28}+1.86\%$
test_membership_nested_last 32.0310μs 2.8687μs 348.5894 KOps/s 342.4500 KOps/s $\color{#35bf28}+1.79\%$
test_membership_nested_leaf_last 29.8500μs 2.8736μs 347.9923 KOps/s 348.4622 KOps/s $\color{#d91a1a}-0.13\%$
test_membership_stacked_nested_last 39.2910μs 4.6543μs 214.8571 KOps/s 303.0632 KOps/s $\textbf{\color{#d91a1a}-29.10\%}$
test_membership_stacked_nested_leaf_last 26.2600μs 4.5350μs 220.5084 KOps/s 302.0429 KOps/s $\textbf{\color{#d91a1a}-26.99\%}$
test_nested_getleaf 30.2110μs 6.0912μs 164.1703 KOps/s 162.5755 KOps/s $\color{#35bf28}+0.98\%$
test_nested_get 31.8400μs 5.7629μs 173.5228 KOps/s 174.7769 KOps/s $\color{#d91a1a}-0.72\%$
test_stacked_getleaf 35.6210μs 6.0514μs 165.2515 KOps/s 165.0858 KOps/s $\color{#35bf28}+0.10\%$
test_stacked_get 25.8010μs 5.6595μs 176.6948 KOps/s 174.1589 KOps/s $\color{#35bf28}+1.46\%$
test_nested_getitemleaf 30.3410μs 6.2064μs 161.1242 KOps/s 163.5879 KOps/s $\color{#d91a1a}-1.51\%$
test_nested_getitem 32.0810μs 5.7794μs 173.0284 KOps/s 173.3029 KOps/s $\color{#d91a1a}-0.16\%$
test_stacked_getitemleaf 36.8710μs 6.0668μs 164.8305 KOps/s 163.3960 KOps/s $\color{#35bf28}+0.88\%$
test_stacked_getitem 27.9600μs 5.7010μs 175.4081 KOps/s 174.1746 KOps/s $\color{#35bf28}+0.71\%$
test_lock_nested 3.3419ms 0.4333ms 2.3079 KOps/s 2.3141 KOps/s $\color{#d91a1a}-0.27\%$
test_lock_stack_nested 0.4513ms 0.3927ms 2.5464 KOps/s 2.5521 KOps/s $\color{#d91a1a}-0.22\%$
test_unlock_nested 0.7880ms 0.3711ms 2.6946 KOps/s 2.7112 KOps/s $\color{#d91a1a}-0.61\%$
test_unlock_stack_nested 0.3693ms 0.3337ms 2.9968 KOps/s 3.0046 KOps/s $\color{#d91a1a}-0.26\%$
test_flatten_speed 97.4520μs 68.9045μs 14.5128 KOps/s 14.5027 KOps/s $\color{#35bf28}+0.07\%$
test_unflatten_speed 0.3640ms 0.2852ms 3.5068 KOps/s 3.4912 KOps/s $\color{#35bf28}+0.45\%$
test_common_ops 1.5311ms 1.2739ms 784.9650 Ops/s 766.9250 Ops/s $\color{#35bf28}+2.35\%$
test_creation 29.4810μs 1.5461μs 646.8093 KOps/s 640.9160 KOps/s $\color{#35bf28}+0.92\%$
test_creation_empty 47.1310μs 16.2562μs 61.5151 KOps/s 57.0157 KOps/s $\textbf{\color{#35bf28}+7.89\%}$
test_creation_nested_1 51.8620μs 17.7646μs 56.2917 KOps/s 51.5264 KOps/s $\textbf{\color{#35bf28}+9.25\%}$
test_creation_nested_2 51.5410μs 20.5246μs 48.7221 KOps/s 45.4305 KOps/s $\textbf{\color{#35bf28}+7.25\%}$
test_clone 1.3810ms 29.7078μs 33.6612 KOps/s 32.9211 KOps/s $\color{#35bf28}+2.25\%$
test_getitem[int] 97.9319ms 24.5714μs 40.6977 KOps/s 60.1168 KOps/s $\textbf{\color{#d91a1a}-32.30\%}$
test_getitem[slice_int] 0.1201ms 28.2573μs 35.3891 KOps/s 34.7576 KOps/s $\color{#35bf28}+1.82\%$
test_getitem[range] 0.1782ms 0.1108ms 9.0289 KOps/s 9.2676 KOps/s $\color{#d91a1a}-2.58\%$
test_getitem[tuple] 0.1192ms 24.6055μs 40.6413 KOps/s 40.1946 KOps/s $\color{#35bf28}+1.11\%$
test_getitem[list] 0.2031ms 0.1003ms 9.9732 KOps/s 10.2012 KOps/s $\color{#d91a1a}-2.24\%$
test_setitem_dim[int] 69.2110μs 45.9789μs 21.7491 KOps/s 22.2476 KOps/s $\color{#d91a1a}-2.24\%$
test_setitem_dim[slice_int] 0.2089ms 69.2527μs 14.4399 KOps/s 13.9240 KOps/s $\color{#35bf28}+3.70\%$
test_setitem_dim[range] 0.1803ms 0.1295ms 7.7245 KOps/s 7.8178 KOps/s $\color{#d91a1a}-1.19\%$
test_setitem_dim[tuple] 86.8410μs 61.9252μs 16.1485 KOps/s 16.4028 KOps/s $\color{#d91a1a}-1.55\%$
test_setitem 81.8720μs 43.4697μs 23.0045 KOps/s 23.3755 KOps/s $\color{#d91a1a}-1.59\%$
test_set 73.7820μs 41.9573μs 23.8337 KOps/s 23.8133 KOps/s $\color{#35bf28}+0.09\%$
test_set_shared 0.3487ms 52.4075μs 19.0812 KOps/s 19.3182 KOps/s $\color{#d91a1a}-1.23\%$
test_update 0.1955ms 50.9331μs 19.6336 KOps/s 19.1274 KOps/s $\color{#35bf28}+2.65\%$
test_update_nested 0.1126ms 57.9027μs 17.2703 KOps/s 17.0773 KOps/s $\color{#35bf28}+1.13\%$
test_update__nested 0.3985ms 60.8326μs 16.4385 KOps/s 17.1966 KOps/s $\color{#d91a1a}-4.41\%$
test_set_nested 83.4210μs 45.2686μs 22.0903 KOps/s 22.2054 KOps/s $\color{#d91a1a}-0.52\%$
test_set_nested_new 88.0920μs 48.7204μs 20.5253 KOps/s 20.6702 KOps/s $\color{#d91a1a}-0.70\%$
test_select 0.1284ms 63.0865μs 15.8513 KOps/s 15.5229 KOps/s $\color{#35bf28}+2.12\%$
test_select_nested 0.4213ms 44.6336μs 22.4046 KOps/s 22.4387 KOps/s $\color{#d91a1a}-0.15\%$
test_exclude_nested 94.7720μs 62.5076μs 15.9980 KOps/s 16.2172 KOps/s $\color{#d91a1a}-1.35\%$
test_empty[True] 0.3125ms 0.2461ms 4.0634 KOps/s 3.9719 KOps/s $\color{#35bf28}+2.30\%$
test_empty[False] 2.6340μs 0.7569μs 1.3211 MOps/s 1.3117 MOps/s $\color{#35bf28}+0.72\%$
test_to 46.5610μs 25.6545μs 38.9795 KOps/s 39.8941 KOps/s $\color{#d91a1a}-2.29\%$
test_to_nonblocking 69.1020μs 24.4025μs 40.9795 KOps/s 40.8184 KOps/s $\color{#35bf28}+0.39\%$
test_unbind_speed 0.9788ms 0.2878ms 3.4746 KOps/s 3.4482 KOps/s $\color{#35bf28}+0.76\%$
test_unbind_speed_stack0 0.3357ms 0.2855ms 3.5028 KOps/s 3.4882 KOps/s $\color{#35bf28}+0.42\%$
test_unbind_speed_stack1 95.9197ms 0.7322ms 1.3657 KOps/s 1.4842 KOps/s $\textbf{\color{#d91a1a}-7.98\%}$
test_split 99.7232ms 2.2107ms 452.3432 Ops/s 435.4593 Ops/s $\color{#35bf28}+3.88\%$
test_chunk 0.1009s 2.2354ms 447.3493 Ops/s 433.7096 Ops/s $\color{#35bf28}+3.14\%$
test_creation[device0] 0.3492ms 0.1267ms 7.8929 KOps/s 7.8426 KOps/s $\color{#35bf28}+0.64\%$
test_creation_from_tensor 0.4360ms 0.1310ms 7.6341 KOps/s 7.7815 KOps/s $\color{#d91a1a}-1.89\%$
test_add_one[memmap_tensor0] 0.1345ms 8.9056μs 112.2887 KOps/s 111.3477 KOps/s $\color{#35bf28}+0.85\%$
test_contiguous[memmap_tensor0] 49.8310μs 2.2245μs 449.5466 KOps/s 451.2022 KOps/s $\color{#d91a1a}-0.37\%$
test_stack[memmap_tensor0] 26.8610μs 6.9888μs 143.0859 KOps/s 148.3671 KOps/s $\color{#d91a1a}-3.56\%$
test_memmaptd_index 1.1158ms 0.4274ms 2.3396 KOps/s 2.2951 KOps/s $\color{#35bf28}+1.94\%$
test_memmaptd_index_astensor 0.7847ms 0.4807ms 2.0802 KOps/s 2.0122 KOps/s $\color{#35bf28}+3.38\%$
test_memmaptd_index_op 1.4468ms 1.0487ms 953.6051 Ops/s 935.5564 Ops/s $\color{#35bf28}+1.93\%$
test_serialize_model 0.1302s 0.1293s 7.7328 Ops/s 7.6928 Ops/s $\color{#35bf28}+0.52\%$
test_serialize_model_pickle 1.3498s 1.2119s 0.8251 Ops/s 0.8251 Ops/s $+0.01\%$
test_serialize_weights 0.1302s 0.1291s 7.7486 Ops/s 7.7374 Ops/s $\color{#35bf28}+0.15\%$
test_serialize_weights_returnearly 0.2736s 63.8486ms 15.6620 Ops/s 18.0751 Ops/s $\textbf{\color{#d91a1a}-13.35\%}$
test_serialize_weights_pickle 1.3482s 1.2119s 0.8252 Ops/s 0.8248 Ops/s $\color{#35bf28}+0.04\%$
test_reshape_pytree 76.2520μs 35.4557μs 28.2042 KOps/s 26.9242 KOps/s $\color{#35bf28}+4.75\%$
test_reshape_td 79.1720μs 42.0986μs 23.7538 KOps/s 22.8503 KOps/s $\color{#35bf28}+3.95\%$
test_view_pytree 72.2820μs 35.6997μs 28.0115 KOps/s 27.0140 KOps/s $\color{#35bf28}+3.69\%$
test_view_td 92.0620μs 45.9630μs 21.7566 KOps/s 20.2348 KOps/s $\textbf{\color{#35bf28}+7.52\%}$
test_unbind_pytree 72.3810μs 34.7772μs 28.7544 KOps/s 28.0198 KOps/s $\color{#35bf28}+2.62\%$
test_unbind_td 0.5114ms 44.9381μs 22.2528 KOps/s 22.4404 KOps/s $\color{#d91a1a}-0.84\%$
test_split_pytree 0.1721ms 46.6180μs 21.4510 KOps/s 20.8963 KOps/s $\color{#35bf28}+2.65\%$
test_split_td 0.6333ms 57.4372μs 17.4103 KOps/s 16.7924 KOps/s $\color{#35bf28}+3.68\%$
test_add_pytree 0.1070ms 60.4240μs 16.5497 KOps/s 16.4669 KOps/s $\color{#35bf28}+0.50\%$
test_add_td 0.2734ms 98.7635μs 10.1252 KOps/s 9.9739 KOps/s $\color{#35bf28}+1.52\%$
test_compile_add_one_nested[tensordict-compile] 0.4221ms 0.2112ms 4.7357 KOps/s 4.7809 KOps/s $\color{#d91a1a}-0.95\%$
test_compile_add_one_nested[tensordict-eager] 0.2936ms 0.1515ms 6.6019 KOps/s 6.5617 KOps/s $\color{#35bf28}+0.61\%$
test_compile_add_one_nested[pytree-compile] 0.1916ms 0.1534ms 6.5183 KOps/s 6.8673 KOps/s $\textbf{\color{#d91a1a}-5.08\%}$
test_compile_add_one_nested[pytree-eager] 0.2499ms 0.1867ms 5.3563 KOps/s 5.4494 KOps/s $\color{#d91a1a}-1.71\%$
test_compile_copy_nested[tensordict-compile] 55.8910μs 21.4378μs 46.6466 KOps/s 47.5016 KOps/s $\color{#d91a1a}-1.80\%$
test_compile_copy_nested[tensordict-eager] 97.4720μs 44.7781μs 22.3324 KOps/s 22.5193 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_copy_nested[pytree-compile] 0.2564ms 63.6437μs 15.7125 KOps/s 15.7891 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_copy_nested[pytree-eager] 85.8220μs 49.3412μs 20.2670 KOps/s 20.3030 KOps/s $\color{#d91a1a}-0.18\%$
test_compile_add_one_flat[tensordict-compile] 0.4595ms 0.3242ms 3.0848 KOps/s 3.1216 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_add_one_flat[tensordict-eager] 0.2528ms 0.2109ms 4.7427 KOps/s 4.7364 KOps/s $\color{#35bf28}+0.13\%$
test_compile_add_one_flat[tensorclass-compile] 0.1987ms 0.1358ms 7.3626 KOps/s 7.7900 KOps/s $\textbf{\color{#d91a1a}-5.49\%}$
test_compile_add_one_flat[tensorclass-eager] 0.2017ms 63.2583μs 15.8082 KOps/s 16.6177 KOps/s $\color{#d91a1a}-4.87\%$
test_compile_add_one_flat[pytree-compile] 0.4241ms 0.3223ms 3.1026 KOps/s 3.1180 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_add_one_flat[pytree-eager] 0.7818ms 0.6337ms 1.5781 KOps/s 1.5001 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_compile_add_self_flat[tensordict-eager] 0.3171ms 0.2520ms 3.9686 KOps/s 4.0071 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_add_self_flat[tensordict-compile] 0.4542ms 0.3266ms 3.0623 KOps/s 3.0903 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_add_self_flat[tensorclass-eager] 0.2279ms 75.0856μs 13.3181 KOps/s 14.1384 KOps/s $\textbf{\color{#d91a1a}-5.80\%}$
test_compile_add_self_flat[tensorclass-compile] 0.2878ms 0.1332ms 7.5077 KOps/s 7.7805 KOps/s $\color{#d91a1a}-3.51\%$
test_compile_add_self_flat[pytree-eager] 0.6610ms 0.5680ms 1.7607 KOps/s 1.9352 KOps/s $\textbf{\color{#d91a1a}-9.02\%}$
test_compile_add_self_flat[pytree-compile] 0.3804ms 0.3215ms 3.1103 KOps/s 3.1256 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_copy_flat[tensordict-compile] 69.8420μs 18.0519μs 55.3960 KOps/s 55.1256 KOps/s $\color{#35bf28}+0.49\%$
test_compile_copy_flat[tensordict-eager] 0.1315ms 28.7059μs 34.8361 KOps/s 35.0251 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_copy_flat[pytree-compile] 0.1396ms 69.0485μs 14.4826 KOps/s 14.2962 KOps/s $\color{#35bf28}+1.30\%$
test_compile_copy_flat[pytree-eager] 0.1002ms 51.0843μs 19.5755 KOps/s 19.4494 KOps/s $\color{#35bf28}+0.65\%$
test_compile_assign_and_add[tensordict-compile] 2.3190ms 0.8069ms 1.2393 KOps/s 1.1369 KOps/s $\textbf{\color{#35bf28}+9.00\%}$
test_compile_assign_and_add[tensordict-eager] 3.4570ms 3.1857ms 313.9020 Ops/s 321.3613 Ops/s $\color{#d91a1a}-2.32\%$
test_compile_assign_and_add[pytree-compile] 2.3167ms 0.8049ms 1.2423 KOps/s 1.1371 KOps/s $\textbf{\color{#35bf28}+9.25\%}$
test_compile_assign_and_add[pytree-eager] 3.3352ms 3.1948ms 313.0116 Ops/s 317.9783 Ops/s $\color{#d91a1a}-1.56\%$
test_compile_indexing[tensor-tensordict-compile] 0.1801ms 0.1107ms 9.0306 KOps/s 9.2343 KOps/s $\color{#d91a1a}-2.21\%$
test_compile_indexing[tensor-tensordict-eager] 0.1915ms 61.9510μs 16.1418 KOps/s 15.3298 KOps/s $\textbf{\color{#35bf28}+5.30\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.1467ms 0.1057ms 9.4616 KOps/s 9.5687 KOps/s $\color{#d91a1a}-1.12\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1618ms 46.2396μs 21.6265 KOps/s 22.4959 KOps/s $\color{#d91a1a}-3.86\%$
test_compile_indexing[tensor-pytree-compile] 0.2079ms 0.1059ms 9.4446 KOps/s 9.5409 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[tensor-pytree-eager] 0.1719ms 44.2567μs 22.5955 KOps/s 22.6777 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_indexing[slice-tensordict-compile] 0.2131ms 0.1423ms 7.0298 KOps/s 7.0977 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_indexing[slice-tensordict-eager] 0.1879ms 25.9134μs 38.5900 KOps/s 37.8911 KOps/s $\color{#35bf28}+1.84\%$
test_compile_indexing[slice-tensorclass-compile] 0.1786ms 0.1335ms 7.4879 KOps/s 7.5152 KOps/s $\color{#d91a1a}-0.36\%$
test_compile_indexing[slice-tensorclass-eager] 67.6020μs 20.7397μs 48.2167 KOps/s 46.3251 KOps/s $\color{#35bf28}+4.08\%$
test_compile_indexing[slice-pytree-compile] 0.2008ms 0.1399ms 7.1465 KOps/s 7.4692 KOps/s $\color{#d91a1a}-4.32\%$
test_compile_indexing[slice-pytree-eager] 57.4810μs 20.6896μs 48.3334 KOps/s 46.5294 KOps/s $\color{#35bf28}+3.88\%$
test_compile_indexing[int-tensordict-compile] 0.2089ms 0.1413ms 7.0796 KOps/s 6.8930 KOps/s $\color{#35bf28}+2.71\%$
test_compile_indexing[int-tensordict-eager] 0.5086ms 25.2702μs 39.5723 KOps/s 37.6539 KOps/s $\textbf{\color{#35bf28}+5.09\%}$
test_compile_indexing[int-tensorclass-compile] 0.2219ms 0.1345ms 7.4364 KOps/s 7.5213 KOps/s $\color{#d91a1a}-1.13\%$
test_compile_indexing[int-tensorclass-eager] 0.1172ms 23.9456μs 41.7613 KOps/s 47.3439 KOps/s $\textbf{\color{#d91a1a}-11.79\%}$
test_compile_indexing[int-pytree-compile] 0.2186ms 0.1344ms 7.4415 KOps/s 7.5030 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_indexing[int-pytree-eager] 60.7610μs 20.6894μs 48.3340 KOps/s 46.8023 KOps/s $\color{#35bf28}+3.27\%$
test_mod_add[eager] 0.1623ms 33.1945μs 30.1255 KOps/s 30.0385 KOps/s $\color{#35bf28}+0.29\%$
test_mod_add[compile] 0.1237ms 68.6185μs 14.5733 KOps/s 14.0008 KOps/s $\color{#35bf28}+4.09\%$
test_mod_add[compile-overhead] 0.2750ms 0.1438ms 6.9522 KOps/s 6.7773 KOps/s $\color{#35bf28}+2.58\%$
test_mod_wrap[eager] 0.3411ms 0.2431ms 4.1136 KOps/s 4.1179 KOps/s $\color{#d91a1a}-0.11\%$
test_mod_wrap[compile] 0.3420ms 0.2970ms 3.3667 KOps/s 3.3395 KOps/s $\color{#35bf28}+0.81\%$
test_mod_wrap[compile-overhead] 7.4942ms 4.0671ms 245.8728 Ops/s 261.5511 Ops/s $\textbf{\color{#d91a1a}-5.99\%}$
test_mod_wrap_and_backward[eager] 1.4736ms 1.3472ms 742.2741 Ops/s 691.1980 Ops/s $\textbf{\color{#35bf28}+7.39\%}$
test_mod_wrap_and_backward[compile] 1.5454ms 1.3232ms 755.7420 Ops/s 699.4220 Ops/s $\textbf{\color{#35bf28}+8.05\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3655ms 0.9121ms 1.0963 KOps/s 932.3213 Ops/s $\textbf{\color{#35bf28}+17.59\%}$
test_seq_add[eager] 0.1533ms 99.7189μs 10.0282 KOps/s 9.9574 KOps/s $\color{#35bf28}+0.71\%$
test_seq_add[compile] 0.1486ms 78.9423μs 12.6675 KOps/s 12.2870 KOps/s $\color{#35bf28}+3.10\%$
test_seq_add[compile-overhead] 0.1939ms 0.1158ms 8.6354 KOps/s 8.7276 KOps/s $\color{#d91a1a}-1.06\%$
test_seq_wrap[eager] 0.4846ms 0.3839ms 2.6050 KOps/s 2.5398 KOps/s $\color{#35bf28}+2.57\%$
test_seq_wrap[compile] 0.3881ms 0.3153ms 3.1716 KOps/s 3.1427 KOps/s $\color{#35bf28}+0.92\%$
test_seq_wrap[compile-overhead] 0.2780ms 0.2217ms 4.5114 KOps/s 4.5179 KOps/s $\color{#d91a1a}-0.14\%$
test_func_call_runtime[False-eager] 0.8508ms 0.7362ms 1.3583 KOps/s 1.3437 KOps/s $\color{#35bf28}+1.09\%$
test_func_call_runtime[False-compile] 1.1980ms 0.7878ms 1.2694 KOps/s 1.2196 KOps/s $\color{#35bf28}+4.08\%$
test_func_call_runtime[False-compile-overhead] 0.4123ms 0.3666ms 2.7275 KOps/s 2.7446 KOps/s $\color{#d91a1a}-0.62\%$
test_func_call_runtime[True-eager] 1.0328ms 0.9018ms 1.1089 KOps/s 1.0847 KOps/s $\color{#35bf28}+2.23\%$
test_func_call_runtime[True-compile] 0.8872ms 0.8318ms 1.2023 KOps/s 1.1949 KOps/s $\color{#35bf28}+0.61\%$
test_func_call_runtime[True-compile-overhead] 0.5211ms 0.4008ms 2.4949 KOps/s 2.5118 KOps/s $\color{#d91a1a}-0.67\%$
test_func_call_cm_runtime[False-eager] 0.8179ms 0.7361ms 1.3585 KOps/s 1.3374 KOps/s $\color{#35bf28}+1.58\%$
test_func_call_cm_runtime[False-compile] 0.8855ms 0.7995ms 1.2508 KOps/s 1.2445 KOps/s $\color{#35bf28}+0.50\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4486ms 0.3693ms 2.7076 KOps/s 2.7308 KOps/s $\color{#d91a1a}-0.85\%$
test_func_call_cm_runtime[True-eager] 1.1211ms 1.0030ms 997.0341 Ops/s 988.7661 Ops/s $\color{#35bf28}+0.84\%$
test_func_call_cm_runtime[True-compile] 0.9462ms 0.8590ms 1.1642 KOps/s 1.1582 KOps/s $\color{#35bf28}+0.52\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5800ms 0.4269ms 2.3427 KOps/s 2.3412 KOps/s $\color{#35bf28}+0.06\%$
test_vmap_func_call_cm_runtime[eager] 2.5868ms 2.0961ms 477.0712 Ops/s 480.8546 Ops/s $\color{#d91a1a}-0.79\%$
test_vmap_func_call_cm_runtime[compile] 1.0100ms 0.8843ms 1.1308 KOps/s 1.1356 KOps/s $\color{#d91a1a}-0.42\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5322ms 0.4316ms 2.3172 KOps/s 2.3200 KOps/s $\color{#d91a1a}-0.12\%$
test_distributed 2.4964ms 0.1747ms 5.7238 KOps/s 8.8498 KOps/s $\textbf{\color{#d91a1a}-35.32\%}$
test_tdmodule 32.0510μs 15.1165μs 66.1528 KOps/s 63.0783 KOps/s $\color{#35bf28}+4.87\%$
test_tdmodule_dispatch 49.3510μs 29.9765μs 33.3594 KOps/s 31.9306 KOps/s $\color{#35bf28}+4.47\%$
test_tdseq 45.9710μs 15.9962μs 62.5149 KOps/s 60.4829 KOps/s $\color{#35bf28}+3.36\%$
test_tdseq_dispatch 52.6010μs 32.4108μs 30.8539 KOps/s 29.1184 KOps/s $\textbf{\color{#35bf28}+5.96\%}$
test_instantiation_functorch 2.0005ms 1.8987ms 526.6701 Ops/s 526.7254 Ops/s $\color{#d91a1a}-0.01\%$
test_instantiation_td 1.8345ms 1.2229ms 817.7250 Ops/s 819.0269 Ops/s $\color{#d91a1a}-0.16\%$
test_exec_functorch 0.2645ms 0.2111ms 4.7372 KOps/s 4.7682 KOps/s $\color{#d91a1a}-0.65\%$
test_exec_functional_call 0.2866ms 0.2099ms 4.7641 KOps/s 4.8080 KOps/s $\color{#d91a1a}-0.91\%$
test_exec_td 0.2750ms 0.2163ms 4.6230 KOps/s 4.6737 KOps/s $\color{#d91a1a}-1.09\%$
test_exec_td_decorator 0.8699ms 0.2572ms 3.8887 KOps/s 3.8826 KOps/s $\color{#35bf28}+0.16\%$
test_vmap_mlp_speed[True-True] 0.8222ms 0.6933ms 1.4424 KOps/s 1.4391 KOps/s $\color{#35bf28}+0.23\%$
test_vmap_mlp_speed[True-False] 0.8240ms 0.6882ms 1.4531 KOps/s 1.4486 KOps/s $\color{#35bf28}+0.31\%$
test_vmap_mlp_speed[False-True] 0.7105ms 0.5764ms 1.7349 KOps/s 1.7272 KOps/s $\color{#35bf28}+0.45\%$
test_vmap_mlp_speed[False-False] 0.6794ms 0.5783ms 1.7291 KOps/s 1.7220 KOps/s $\color{#35bf28}+0.41\%$
test_vmap_mlp_speed_decorator[True-True] 1.3922ms 0.6970ms 1.4347 KOps/s 1.4832 KOps/s $\color{#d91a1a}-3.27\%$
test_vmap_mlp_speed_decorator[True-False] 0.8524ms 0.6973ms 1.4341 KOps/s 1.4823 KOps/s $\color{#d91a1a}-3.25\%$
test_vmap_mlp_speed_decorator[False-True] 0.7553ms 0.6155ms 1.6247 KOps/s 1.6970 KOps/s $\color{#d91a1a}-4.26\%$
test_vmap_mlp_speed_decorator[False-False] 0.7608ms 0.6185ms 1.6167 KOps/s 1.6886 KOps/s $\color{#d91a1a}-4.26\%$
test_vmap_transformer_speed[True-True] 8.8793ms 8.4979ms 117.6757 Ops/s 118.5955 Ops/s $\color{#d91a1a}-0.78\%$
test_vmap_transformer_speed[True-False] 8.8003ms 8.4584ms 118.2252 Ops/s 118.6708 Ops/s $\color{#d91a1a}-0.38\%$
test_vmap_transformer_speed[False-True] 8.6575ms 8.2909ms 120.6141 Ops/s 121.9819 Ops/s $\color{#d91a1a}-1.12\%$
test_vmap_transformer_speed[False-False] 8.5119ms 8.2264ms 121.5598 Ops/s 121.8291 Ops/s $\color{#d91a1a}-0.22\%$
test_vmap_transformer_speed_decorator[True-True] 20.6722ms 19.7820ms 50.5509 Ops/s 51.1626 Ops/s $\color{#d91a1a}-1.20\%$
test_vmap_transformer_speed_decorator[True-False] 20.7381ms 19.7861ms 50.5404 Ops/s 51.1260 Ops/s $\color{#d91a1a}-1.15\%$
test_vmap_transformer_speed_decorator[False-True] 19.7713ms 19.5571ms 51.1324 Ops/s 51.6168 Ops/s $\color{#d91a1a}-0.94\%$
test_vmap_transformer_speed_decorator[False-False] 19.7392ms 19.5856ms 51.0579 Ops/s 51.5111 Ops/s $\color{#d91a1a}-0.88\%$
test_to_module_speed[True] 1.4052ms 0.9697ms 1.0312 KOps/s 1.0164 KOps/s $\color{#35bf28}+1.45\%$
test_to_module_speed[False] 1.3279ms 0.9318ms 1.0732 KOps/s 1.0507 KOps/s $\color{#35bf28}+2.14\%$
test_tc_init 62.3010μs 34.0951μs 29.3297 KOps/s 27.9092 KOps/s $\textbf{\color{#35bf28}+5.09\%}$
test_tc_init_nested 0.1242ms 69.0644μs 14.4792 KOps/s 13.8115 KOps/s $\color{#35bf28}+4.83\%$
test_tc_first_layer_tensor 7.4946μs 0.6747μs 1.4821 MOps/s 1.4644 MOps/s $\color{#35bf28}+1.20\%$
test_tc_first_layer_nontensor 23.4510μs 2.2464μs 445.1604 KOps/s 446.0304 KOps/s $\color{#d91a1a}-0.20\%$
test_tc_second_layer_tensor 23.3680μs 1.3606μs 734.9494 KOps/s 727.2340 KOps/s $\color{#35bf28}+1.06\%$
test_tc_second_layer_nontensor 79.9320μs 2.9535μs 338.5795 KOps/s 347.3318 KOps/s $\color{#d91a1a}-2.52\%$
test_unbind 0.2020s 12.7860ms 78.2104 Ops/s 88.1482 Ops/s $\textbf{\color{#d91a1a}-11.27\%}$
test_full_like 0.7054ms 0.5729ms 1.7456 KOps/s 1.7387 KOps/s $\color{#35bf28}+0.39\%$
test_zeros_like 0.2943ms 0.1979ms 5.0527 KOps/s 5.0543 KOps/s $\color{#d91a1a}-0.03\%$
test_ones_like 0.2560ms 0.1978ms 5.0565 KOps/s 5.0574 KOps/s $\color{#d91a1a}-0.02\%$
test_clone 0.5137ms 0.4145ms 2.4124 KOps/s 2.4132 KOps/s $\color{#d91a1a}-0.03\%$
test_squeeze 30.1800μs 10.0438μs 99.5639 KOps/s 100.8323 KOps/s $\color{#d91a1a}-1.26\%$
test_unsqueeze 0.2290ms 74.8904μs 13.3529 KOps/s 13.1028 KOps/s $\color{#35bf28}+1.91\%$
test_split 0.4114ms 0.1612ms 6.2030 KOps/s 6.2827 KOps/s $\color{#d91a1a}-1.27\%$
test_permute 0.2224ms 0.1787ms 5.5946 KOps/s 5.3395 KOps/s $\color{#35bf28}+4.78\%$
test_stack 1.2495ms 0.8759ms 1.1416 KOps/s 1.1554 KOps/s $\color{#d91a1a}-1.19\%$
test_cat 1.2528ms 1.2315ms 812.0000 Ops/s 812.3224 Ops/s $\color{#d91a1a}-0.04\%$
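
For readers checking individual rows: the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column, after converting both to the same unit. The sketch below is an assumption inferred from the numbers in the table, not the benchmark bot's actual code; the helper names `to_ops_per_second` and `percent_change` are hypothetical.

```python
# Unit multipliers, assumed from the suffixes used in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(value: float, unit: str) -> float:
    """Convert a throughput reading such as (5.7238, 'KOps/s') to plain Ops/s."""
    return value * UNITS[unit]


def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of the PR throughput versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


# test_distributed: both readings are in KOps/s.
distributed = percent_change(
    to_ops_per_second(5.7238, "KOps/s"),
    to_ops_per_second(8.8498, "KOps/s"),
)

# test_mod_wrap_and_backward[compile-overhead]: the two readings use different units.
mixed_units = percent_change(
    to_ops_per_second(1.0963, "KOps/s"),
    to_ops_per_second(932.3213, "Ops/s"),
)

print(f"test_distributed: {distributed:+.2f}%")
print(f"test_mod_wrap_and_backward[compile-overhead]: {mixed_units:+.2f}%")
```

Because the table prints rounded throughputs, recomputed values can differ from the reported Change in the last digit.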

