vmoens (Collaborator) commented Sep 26, 2024

[ghstack-poisoned]
vmoens pushed a commit that referenced this pull request Sep 26, 2024
ghstack-source-id: fba1451
Pull Request resolved: #1010
facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Sep 26, 2024

github-actions bot commented Sep 26, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}46$. Worsened: $\large\color{#d91a1a}39$.
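
The Improved and Worsened totals appear to count only the bolded rows of the detailed table: tallying the CPU rows gives 46 bold green and 39 bold red entries, the smallest bolded change is 5.08% and the largest unbolded one is 4.57%. The sketch below reproduces that bucketing under the assumption of a 5% cutoff; the names `THRESHOLD_PCT` and `classify` are hypothetical and not taken from the benchmark workflow.

```python
from collections import Counter

# Assumed cutoff, inferred from which rows are bolded in the table;
# the bot's real threshold is not stated in this comment.
THRESHOLD_PCT = 5.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Bucket a relative change (in percent) the way the summary line seems to."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# A few Change values copied from the CPU table.
sample = {
    "test_plain_set_nested": -7.70,
    "test_items": -4.55,
    "test_clone": 5.53,
    "test_compile_assign_and_add[pytree-compile]": 4.57,
    "test_exec_functional_call": 5.09,
    "test_ones_like": -47.90,
}

counts = Counter(classify(change) for change in sample.values())
print(dict(counts))
```

Under this reading, rows such as `test_items` (-4.55%) are reported in red but do not contribute to the Worsened total.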

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 44.3130μs 21.4363μs 46.6499 KOps/s 50.5395 KOps/s $\textbf{\color{#d91a1a}-7.70\%}$
test_plain_set_stack_nested 52.0980μs 22.1748μs 45.0963 KOps/s 50.3384 KOps/s $\textbf{\color{#d91a1a}-10.41\%}$
test_plain_set_nested_inplace 73.7380μs 23.7829μs 42.0470 KOps/s 47.1037 KOps/s $\textbf{\color{#d91a1a}-10.74\%}$
test_plain_set_stack_nested_inplace 57.2770μs 23.7994μs 42.0179 KOps/s 46.7057 KOps/s $\textbf{\color{#d91a1a}-10.04\%}$
test_items 22.8320μs 4.2248μs 236.7003 KOps/s 247.9907 KOps/s $\color{#d91a1a}-4.55\%$
test_items_nested 0.4747ms 0.3636ms 2.7499 KOps/s 2.7517 KOps/s $\color{#d91a1a}-0.06\%$
test_items_nested_locked 0.9009ms 0.3675ms 2.7212 KOps/s 2.7362 KOps/s $\color{#d91a1a}-0.55\%$
test_items_nested_leaf 0.1256ms 69.7525μs 14.3364 KOps/s 14.7878 KOps/s $\color{#d91a1a}-3.05\%$
test_items_stack_nested 0.4640ms 0.3707ms 2.6979 KOps/s 2.7421 KOps/s $\color{#d91a1a}-1.61\%$
test_items_stack_nested_leaf 0.1568ms 71.6523μs 13.9563 KOps/s 13.5695 KOps/s $\color{#35bf28}+2.85\%$
test_items_stack_nested_locked 0.9797ms 0.4781ms 2.0918 KOps/s 2.6973 KOps/s $\textbf{\color{#d91a1a}-22.45\%}$
test_keys 42.3600μs 4.0253μs 248.4290 KOps/s 278.8630 KOps/s $\textbf{\color{#d91a1a}-10.91\%}$
test_keys_nested 0.2991ms 0.1120ms 8.9306 KOps/s 9.9469 KOps/s $\textbf{\color{#d91a1a}-10.22\%}$
test_keys_nested_locked 1.6327ms 0.1177ms 8.4995 KOps/s 9.4559 KOps/s $\textbf{\color{#d91a1a}-10.11\%}$
test_keys_nested_leaf 0.2551ms 94.7757μs 10.5512 KOps/s 12.0032 KOps/s $\textbf{\color{#d91a1a}-12.10\%}$
test_keys_stack_nested 0.1822ms 0.1010ms 9.9017 KOps/s 9.8418 KOps/s $\color{#35bf28}+0.61\%$
test_keys_stack_nested_leaf 0.1688ms 83.4255μs 11.9867 KOps/s 11.8356 KOps/s $\color{#35bf28}+1.28\%$
test_keys_stack_nested_locked 0.1962ms 0.1052ms 9.5042 KOps/s 9.4387 KOps/s $\color{#35bf28}+0.69\%$
test_values 11.3572μs 1.0482μs 954.0309 KOps/s 943.4178 KOps/s $\color{#35bf28}+1.12\%$
test_values_nested 0.1432ms 73.9116μs 13.5297 KOps/s 13.0624 KOps/s $\color{#35bf28}+3.58\%$
test_values_nested_locked 0.1443ms 72.9312μs 13.7116 KOps/s 13.2868 KOps/s $\color{#35bf28}+3.20\%$
test_values_nested_leaf 0.1306ms 61.5322μs 16.2517 KOps/s 15.9078 KOps/s $\color{#35bf28}+2.16\%$
test_values_stack_nested 0.1541ms 74.4470μs 13.4324 KOps/s 13.5041 KOps/s $\color{#d91a1a}-0.53\%$
test_values_stack_nested_leaf 0.1181ms 61.4980μs 16.2607 KOps/s 16.3509 KOps/s $\color{#d91a1a}-0.55\%$
test_values_stack_nested_locked 0.1423ms 74.6268μs 13.4000 KOps/s 13.6149 KOps/s $\color{#d91a1a}-1.58\%$
test_membership 3.9831μs 0.7206μs 1.3877 MOps/s 1.3752 MOps/s $\color{#35bf28}+0.91\%$
test_membership_nested 21.9310μs 2.7890μs 358.5549 KOps/s 358.4124 KOps/s $\color{#35bf28}+0.04\%$
test_membership_nested_leaf 22.3730μs 2.7979μs 357.4127 KOps/s 358.1231 KOps/s $\color{#d91a1a}-0.20\%$
test_membership_stacked_nested 24.1050μs 2.7959μs 357.6612 KOps/s 360.7745 KOps/s $\color{#d91a1a}-0.86\%$
test_membership_stacked_nested_leaf 30.9780μs 2.7687μs 361.1738 KOps/s 356.6135 KOps/s $\color{#35bf28}+1.28\%$
test_membership_nested_last 22.3520μs 4.0250μs 248.4462 KOps/s 249.8532 KOps/s $\color{#d91a1a}-0.56\%$
test_membership_nested_leaf_last 49.8740μs 4.0094μs 249.4144 KOps/s 247.7781 KOps/s $\color{#35bf28}+0.66\%$
test_membership_stacked_nested_last 23.5850μs 4.5467μs 219.9405 KOps/s 169.6517 KOps/s $\textbf{\color{#35bf28}+29.64\%}$
test_membership_stacked_nested_leaf_last 48.8740μs 4.5779μs 218.4397 KOps/s 167.9384 KOps/s $\textbf{\color{#35bf28}+30.07\%}$
test_nested_getleaf 35.1060μs 10.8432μs 92.2234 KOps/s 94.5291 KOps/s $\color{#d91a1a}-2.44\%$
test_nested_get 38.9930μs 10.2902μs 97.1799 KOps/s 83.9280 KOps/s $\textbf{\color{#35bf28}+15.79\%}$
test_stacked_getleaf 68.6590μs 10.6957μs 93.4959 KOps/s 73.3471 KOps/s $\textbf{\color{#35bf28}+27.47\%}$
test_stacked_get 43.0600μs 10.1818μs 98.2145 KOps/s 82.4284 KOps/s $\textbf{\color{#35bf28}+19.15\%}$
test_nested_getitemleaf 63.2590μs 11.0311μs 90.6529 KOps/s 82.5964 KOps/s $\textbf{\color{#35bf28}+9.75\%}$
test_nested_getitem 30.9990μs 10.4068μs 96.0908 KOps/s 88.6168 KOps/s $\textbf{\color{#35bf28}+8.43\%}$
test_stacked_getitemleaf 63.2960μs 10.9339μs 91.4585 KOps/s 80.3606 KOps/s $\textbf{\color{#35bf28}+13.81\%}$
test_stacked_getitem 44.8140μs 10.4258μs 95.9155 KOps/s 87.5394 KOps/s $\textbf{\color{#35bf28}+9.57\%}$
test_lock_nested 83.9395ms 0.5747ms 1.7399 KOps/s 1.9433 KOps/s $\textbf{\color{#d91a1a}-10.46\%}$
test_lock_stack_nested 0.7025ms 0.4539ms 2.2032 KOps/s 2.1884 KOps/s $\color{#35bf28}+0.68\%$
test_unlock_nested 82.3425ms 0.4890ms 2.0449 KOps/s 2.4544 KOps/s $\textbf{\color{#d91a1a}-16.68\%}$
test_unlock_stack_nested 0.7333ms 0.3719ms 2.6892 KOps/s 2.6481 KOps/s $\color{#35bf28}+1.55\%$
test_flatten_speed 0.1808ms 87.8620μs 11.3815 KOps/s 9.3829 KOps/s $\textbf{\color{#35bf28}+21.30\%}$
test_unflatten_speed 0.6489ms 0.4631ms 2.1594 KOps/s 1.8769 KOps/s $\textbf{\color{#35bf28}+15.05\%}$
test_common_ops 4.8491ms 1.1491ms 870.2625 Ops/s 772.5172 Ops/s $\textbf{\color{#35bf28}+12.65\%}$
test_creation 30.8880μs 2.0984μs 476.5433 KOps/s 472.8116 KOps/s $\color{#35bf28}+0.79\%$
test_creation_empty 51.7070μs 20.1048μs 49.7393 KOps/s 45.6777 KOps/s $\textbf{\color{#35bf28}+8.89\%}$
test_creation_nested_1 64.9810μs 23.2287μs 43.0501 KOps/s 49.7397 KOps/s $\textbf{\color{#d91a1a}-13.45\%}$
test_creation_nested_2 76.9050μs 27.9362μs 35.7959 KOps/s 34.3380 KOps/s $\color{#35bf28}+4.25\%$
test_clone 75.9230μs 16.7699μs 59.6308 KOps/s 56.5050 KOps/s $\textbf{\color{#35bf28}+5.53\%}$
test_getitem[int] 1.0798ms 16.5717μs 60.3437 KOps/s 58.2997 KOps/s $\color{#35bf28}+3.51\%$
test_getitem[slice_int] 0.1364ms 29.5335μs 33.8599 KOps/s 31.6424 KOps/s $\textbf{\color{#35bf28}+7.01\%}$
test_getitem[range] 0.1853ms 55.9905μs 17.8602 KOps/s 16.9710 KOps/s $\textbf{\color{#35bf28}+5.24\%}$
test_getitem[tuple] 0.1328ms 24.3723μs 41.0302 KOps/s 38.7474 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_getitem[list] 0.1779ms 51.2354μs 19.5178 KOps/s 18.2591 KOps/s $\textbf{\color{#35bf28}+6.89\%}$
test_setitem_dim[int] 99.4160μs 33.9431μs 29.4610 KOps/s 29.8793 KOps/s $\color{#d91a1a}-1.40\%$
test_setitem_dim[slice_int] 0.1034ms 61.4205μs 16.2812 KOps/s 15.8627 KOps/s $\color{#35bf28}+2.64\%$
test_setitem_dim[range] 0.1233ms 82.0889μs 12.1819 KOps/s 11.6666 KOps/s $\color{#35bf28}+4.42\%$
test_setitem_dim[tuple] 0.1194ms 50.2572μs 19.8977 KOps/s 19.6980 KOps/s $\color{#35bf28}+1.01\%$
test_setitem 91.4420μs 31.4623μs 31.7841 KOps/s 32.9777 KOps/s $\color{#d91a1a}-3.62\%$
test_set 85.2800μs 30.4454μs 32.8457 KOps/s 33.4991 KOps/s $\color{#d91a1a}-1.95\%$
test_set_shared 1.9860ms 0.2138ms 4.6766 KOps/s 4.5945 KOps/s $\color{#35bf28}+1.79\%$
test_update 0.1402ms 38.5018μs 25.9728 KOps/s 27.4220 KOps/s $\textbf{\color{#d91a1a}-5.28\%}$
test_update_nested 0.1598ms 47.8007μs 20.9202 KOps/s 20.9647 KOps/s $\color{#d91a1a}-0.21\%$
test_update__nested 88.2550μs 34.8766μs 28.6725 KOps/s 27.9213 KOps/s $\color{#35bf28}+2.69\%$
test_set_nested 98.4450μs 32.8774μs 30.4160 KOps/s 31.5034 KOps/s $\color{#d91a1a}-3.45\%$
test_set_nested_new 0.1082ms 37.7506μs 26.4897 KOps/s 27.0555 KOps/s $\color{#d91a1a}-2.09\%$
test_select 0.1169ms 54.3558μs 18.3973 KOps/s 18.3752 KOps/s $\color{#35bf28}+0.12\%$
test_select_nested 0.1372ms 59.1150μs 16.9162 KOps/s 17.1730 KOps/s $\color{#d91a1a}-1.50\%$
test_exclude_nested 0.1657ms 75.7260μs 13.2055 KOps/s 13.7017 KOps/s $\color{#d91a1a}-3.62\%$
test_empty[True] 0.5025ms 0.3117ms 3.2077 KOps/s 3.1991 KOps/s $\color{#35bf28}+0.27\%$
test_empty[False] 10.2443μs 1.2561μs 796.1114 KOps/s 842.3419 KOps/s $\textbf{\color{#d91a1a}-5.49\%}$
test_unbind_speed 0.6114ms 0.4057ms 2.4651 KOps/s 3.2963 KOps/s $\textbf{\color{#d91a1a}-25.22\%}$
test_unbind_speed_stack0 0.6845ms 0.3565ms 2.8053 KOps/s 3.3780 KOps/s $\textbf{\color{#d91a1a}-16.95\%}$
test_unbind_speed_stack1 0.1141s 0.9485ms 1.0543 KOps/s 1.3738 KOps/s $\textbf{\color{#d91a1a}-23.26\%}$
test_split 86.8470ms 2.3396ms 427.4262 Ops/s 453.9179 Ops/s $\textbf{\color{#d91a1a}-5.84\%}$
test_chunk 3.4056ms 2.1369ms 467.9657 Ops/s 447.6919 Ops/s $\color{#35bf28}+4.53\%$
test_creation[device0] 0.2242ms 0.1149ms 8.7042 KOps/s 8.4284 KOps/s $\color{#35bf28}+3.27\%$
test_creation_from_tensor 5.1334ms 0.1223ms 8.1761 KOps/s 8.3662 KOps/s $\color{#d91a1a}-2.27\%$
test_add_one[memmap_tensor0] 0.2144ms 7.2817μs 137.3302 KOps/s 128.4618 KOps/s $\textbf{\color{#35bf28}+6.90\%}$
test_contiguous[memmap_tensor0] 13.3450μs 1.9203μs 520.7414 KOps/s 512.6073 KOps/s $\color{#35bf28}+1.59\%$
test_stack[memmap_tensor0] 32.1910μs 5.7487μs 173.9515 KOps/s 172.1151 KOps/s $\color{#35bf28}+1.07\%$
test_memmaptd_index 0.7411ms 0.4037ms 2.4768 KOps/s 2.4617 KOps/s $\color{#35bf28}+0.61\%$
test_memmaptd_index_astensor 0.7642ms 0.4830ms 2.0704 KOps/s 2.0484 KOps/s $\color{#35bf28}+1.07\%$
test_memmaptd_index_op 1.6729ms 1.0546ms 948.2705 Ops/s 981.7813 Ops/s $\color{#d91a1a}-3.41\%$
test_serialize_model 0.2062s 0.1305s 7.6648 Ops/s 8.6328 Ops/s $\textbf{\color{#d91a1a}-11.21\%}$
test_serialize_model_pickle 0.4677s 0.3993s 2.5041 Ops/s 2.5661 Ops/s $\color{#d91a1a}-2.42\%$
test_serialize_weights 0.1227s 0.1176s 8.5027 Ops/s 8.6556 Ops/s $\color{#d91a1a}-1.77\%$
test_serialize_weights_returnearly 0.2105s 0.1898s 5.2695 Ops/s 6.2607 Ops/s $\textbf{\color{#d91a1a}-15.83\%}$
test_serialize_weights_pickle 0.5567s 0.4405s 2.2703 Ops/s 2.5170 Ops/s $\textbf{\color{#d91a1a}-9.80\%}$
test_serialize_weights_filesystem 0.2201s 0.1516s 6.5955 Ops/s 6.9735 Ops/s $\textbf{\color{#d91a1a}-5.42\%}$
test_serialize_model_filesystem 0.1615s 0.1474s 6.7826 Ops/s 6.9781 Ops/s $\color{#d91a1a}-2.80\%$
test_reshape_pytree 79.5690μs 38.4837μs 25.9850 KOps/s 24.9799 KOps/s $\color{#35bf28}+4.02\%$
test_reshape_td 96.0900μs 46.4689μs 21.5198 KOps/s 20.7855 KOps/s $\color{#35bf28}+3.53\%$
test_view_pytree 0.1179ms 38.0882μs 26.2548 KOps/s 25.3091 KOps/s $\color{#35bf28}+3.74\%$
test_view_td 0.1293ms 51.2868μs 19.4982 KOps/s 18.2789 KOps/s $\textbf{\color{#35bf28}+6.67\%}$
test_unbind_pytree 79.6400μs 35.4489μs 28.2096 KOps/s 27.3592 KOps/s $\color{#35bf28}+3.11\%$
test_unbind_td 0.3070ms 45.0914μs 22.1772 KOps/s 21.5545 KOps/s $\color{#35bf28}+2.89\%$
test_split_pytree 75.0710μs 37.5558μs 26.6271 KOps/s 25.8551 KOps/s $\color{#35bf28}+2.99\%$
test_split_td 0.5059ms 55.8917μs 17.8917 KOps/s 17.3705 KOps/s $\color{#35bf28}+3.00\%$
test_add_pytree 0.1308ms 44.6468μs 22.3980 KOps/s 21.6775 KOps/s $\color{#35bf28}+3.32\%$
test_add_td 0.2988ms 85.8300μs 11.6509 KOps/s 12.5795 KOps/s $\textbf{\color{#d91a1a}-7.38\%}$
test_compile_add_one_nested[tensordict-compile] 0.1185ms 57.8613μs 17.2827 KOps/s 15.9241 KOps/s $\textbf{\color{#35bf28}+8.53\%}$
test_compile_add_one_nested[tensordict-eager] 0.2645ms 0.1781ms 5.6158 KOps/s 5.0154 KOps/s $\textbf{\color{#35bf28}+11.97\%}$
test_compile_add_one_nested[pytree-compile] 0.1441ms 57.8683μs 17.2806 KOps/s 15.9920 KOps/s $\textbf{\color{#35bf28}+8.06\%}$
test_compile_add_one_nested[pytree-eager] 0.2530ms 0.1384ms 7.2229 KOps/s 6.6722 KOps/s $\textbf{\color{#35bf28}+8.25\%}$
test_compile_copy_nested[tensordict-compile] 61.5550μs 21.3230μs 46.8978 KOps/s 48.0556 KOps/s $\color{#d91a1a}-2.41\%$
test_compile_copy_nested[tensordict-eager] 0.1569ms 68.4555μs 14.6080 KOps/s 14.6095 KOps/s $-0.01\%$
test_compile_copy_nested[pytree-compile] 0.1619ms 74.6586μs 13.3943 KOps/s 13.2687 KOps/s $\color{#35bf28}+0.95\%$
test_compile_copy_nested[pytree-eager] 0.1567ms 68.8357μs 14.5273 KOps/s 14.4625 KOps/s $\color{#35bf28}+0.45\%$
test_compile_add_one_flat[tensordict-compile] 0.2925ms 0.1735ms 5.7637 KOps/s 5.6563 KOps/s $\color{#35bf28}+1.90\%$
test_compile_add_one_flat[tensordict-eager] 0.3992ms 0.1970ms 5.0772 KOps/s 5.1969 KOps/s $\color{#d91a1a}-2.30\%$
test_compile_add_one_flat[tensorclass-compile] 96.4310μs 46.0723μs 21.7050 KOps/s 21.2242 KOps/s $\color{#35bf28}+2.27\%$
test_compile_add_one_flat[tensorclass-eager] 0.1723ms 68.4291μs 14.6137 KOps/s 13.3597 KOps/s $\textbf{\color{#35bf28}+9.39\%}$
test_compile_add_one_flat[pytree-compile] 0.3691ms 0.1795ms 5.5697 KOps/s 5.1553 KOps/s $\textbf{\color{#35bf28}+8.04\%}$
test_compile_add_one_flat[pytree-eager] 0.6184ms 0.2860ms 3.4968 KOps/s 3.1149 KOps/s $\textbf{\color{#35bf28}+12.26\%}$
test_compile_add_self_flat[tensordict-eager] 0.3763ms 0.2072ms 4.8252 KOps/s 4.9234 KOps/s $\color{#d91a1a}-1.99\%$
test_compile_add_self_flat[tensordict-compile] 0.4294ms 0.1778ms 5.6234 KOps/s 5.7167 KOps/s $\color{#d91a1a}-1.63\%$
test_compile_add_self_flat[tensorclass-eager] 0.1206ms 61.2634μs 16.3230 KOps/s 15.8772 KOps/s $\color{#35bf28}+2.81\%$
test_compile_add_self_flat[tensorclass-compile] 0.1104ms 46.6363μs 21.4425 KOps/s 21.2740 KOps/s $\color{#35bf28}+0.79\%$
test_compile_add_self_flat[pytree-eager] 0.4082ms 0.2321ms 4.3086 KOps/s 4.2064 KOps/s $\color{#35bf28}+2.43\%$
test_compile_add_self_flat[pytree-compile] 0.2752ms 0.1762ms 5.6753 KOps/s 5.6036 KOps/s $\color{#35bf28}+1.28\%$
test_compile_copy_flat[tensordict-compile] 0.1903ms 0.1039ms 9.6208 KOps/s 9.4875 KOps/s $\color{#35bf28}+1.41\%$
test_compile_copy_flat[tensordict-eager] 0.1362ms 58.2443μs 17.1691 KOps/s 17.1406 KOps/s $\color{#35bf28}+0.17\%$
test_compile_copy_flat[pytree-compile] 0.1753ms 77.8832μs 12.8397 KOps/s 12.7644 KOps/s $\color{#35bf28}+0.59\%$
test_compile_copy_flat[pytree-eager] 0.1353ms 69.2336μs 14.4438 KOps/s 14.4815 KOps/s $\color{#d91a1a}-0.26\%$
test_compile_assign_and_add[tensordict-compile] 0.4984ms 0.2319ms 4.3115 KOps/s 4.9605 KOps/s $\textbf{\color{#d91a1a}-13.08\%}$
test_compile_assign_and_add[tensordict-eager] 3.2491ms 1.9750ms 506.3215 Ops/s 607.8096 Ops/s $\textbf{\color{#d91a1a}-16.70\%}$
test_compile_assign_and_add[pytree-compile] 0.2935ms 0.1950ms 5.1288 KOps/s 4.9048 KOps/s $\color{#35bf28}+4.57\%$
test_compile_assign_and_add[pytree-eager] 1.8538ms 1.1009ms 908.3140 Ops/s 898.1401 Ops/s $\color{#35bf28}+1.13\%$
test_compile_assign_and_add_stack[compile] 0.5367ms 0.4267ms 2.3438 KOps/s 2.1862 KOps/s $\textbf{\color{#35bf28}+7.21\%}$
test_compile_assign_and_add_stack[eager] 6.1418ms 3.9538ms 252.9212 Ops/s 263.3389 Ops/s $\color{#d91a1a}-3.96\%$
test_compile_indexing[tensor-tensordict-compile] 95.5600μs 34.3930μs 29.0756 KOps/s 26.6013 KOps/s $\textbf{\color{#35bf28}+9.30\%}$
test_compile_indexing[tensor-tensordict-eager] 1.0826ms 47.5016μs 21.0519 KOps/s 20.0337 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.1071ms 29.7825μs 33.5767 KOps/s 32.7048 KOps/s $\color{#35bf28}+2.67\%$
test_compile_indexing[tensor-tensorclass-eager] 69.8400μs 27.9961μs 35.7193 KOps/s 33.5615 KOps/s $\textbf{\color{#35bf28}+6.43\%}$
test_compile_indexing[tensor-pytree-compile] 0.1037ms 30.3641μs 32.9336 KOps/s 32.8119 KOps/s $\color{#35bf28}+0.37\%$
test_compile_indexing[tensor-pytree-eager] 0.1080ms 27.8343μs 35.9269 KOps/s 34.4605 KOps/s $\color{#35bf28}+4.26\%$
test_compile_indexing[slice-tensordict-compile] 0.1525ms 74.8995μs 13.3512 KOps/s 12.8487 KOps/s $\color{#35bf28}+3.91\%$
test_compile_indexing[slice-tensordict-eager] 0.5362ms 26.5341μs 37.6874 KOps/s 35.1544 KOps/s $\textbf{\color{#35bf28}+7.21\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1643ms 68.7645μs 14.5424 KOps/s 13.9402 KOps/s $\color{#35bf28}+4.32\%$
test_compile_indexing[slice-tensorclass-eager] 97.9830μs 22.7241μs 44.0062 KOps/s 36.9017 KOps/s $\textbf{\color{#35bf28}+19.25\%}$
test_compile_indexing[slice-pytree-compile] 0.1844ms 67.6443μs 14.7832 KOps/s 11.6915 KOps/s $\textbf{\color{#35bf28}+26.44\%}$
test_compile_indexing[slice-pytree-eager] 63.6490μs 22.8397μs 43.7835 KOps/s 35.7917 KOps/s $\textbf{\color{#35bf28}+22.33\%}$
test_compile_indexing[int-tensordict-compile] 0.1578ms 73.3223μs 13.6384 KOps/s 11.4425 KOps/s $\textbf{\color{#35bf28}+19.19\%}$
test_compile_indexing[int-tensordict-eager] 0.9490ms 26.4064μs 37.8696 KOps/s 31.5860 KOps/s $\textbf{\color{#35bf28}+19.89\%}$
test_compile_indexing[int-tensorclass-compile] 0.1537ms 68.1420μs 14.6752 KOps/s 13.2602 KOps/s $\textbf{\color{#35bf28}+10.67\%}$
test_compile_indexing[int-tensorclass-eager] 63.8700μs 22.8524μs 43.7591 KOps/s 41.5358 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_compile_indexing[int-pytree-compile] 0.1487ms 67.7100μs 14.7689 KOps/s 14.1500 KOps/s $\color{#35bf28}+4.37\%$
test_compile_indexing[int-pytree-eager] 92.7040μs 22.9731μs 43.5293 KOps/s 42.0989 KOps/s $\color{#35bf28}+3.40\%$
test_mod_add[eager] 89.5790μs 27.0626μs 36.9514 KOps/s 40.3183 KOps/s $\textbf{\color{#d91a1a}-8.35\%}$
test_mod_add[compile] 0.1406ms 39.6661μs 25.2104 KOps/s 25.8498 KOps/s $\color{#d91a1a}-2.47\%$
test_mod_add[compile-overhead] 86.4520μs 38.8741μs 25.7241 KOps/s 25.1170 KOps/s $\color{#35bf28}+2.42\%$
test_mod_wrap[eager] 0.3202ms 0.2092ms 4.7804 KOps/s 4.7674 KOps/s $\color{#35bf28}+0.27\%$
test_mod_wrap[compile] 0.3873ms 0.2320ms 4.3111 KOps/s 4.2520 KOps/s $\color{#35bf28}+1.39\%$
test_mod_wrap[compile-overhead] 0.3627ms 0.2310ms 4.3299 KOps/s 4.2879 KOps/s $\color{#35bf28}+0.98\%$
test_mod_wrap_and_backward[eager] 14.2373ms 11.4446ms 87.3774 Ops/s 86.5955 Ops/s $\color{#35bf28}+0.90\%$
test_mod_wrap_and_backward[compile] 13.9040ms 11.7950ms 84.7819 Ops/s 77.5904 Ops/s $\textbf{\color{#35bf28}+9.27\%}$
test_mod_wrap_and_backward[compile-overhead] 14.1129ms 12.3865ms 80.7331 Ops/s 84.3387 Ops/s $\color{#d91a1a}-4.28\%$
test_seq_add[eager] 0.2411ms 93.8521μs 10.6551 KOps/s 10.8211 KOps/s $\color{#d91a1a}-1.53\%$
test_seq_add[compile] 0.1334ms 64.9264μs 15.4021 KOps/s 15.4690 KOps/s $\color{#d91a1a}-0.43\%$
test_seq_add[compile-overhead] 0.1659ms 63.4129μs 15.7697 KOps/s 15.4418 KOps/s $\color{#35bf28}+2.12\%$
test_seq_wrap[eager] 0.5044ms 0.3928ms 2.5458 KOps/s 2.6479 KOps/s $\color{#d91a1a}-3.86\%$
test_seq_wrap[compile] 1.2061ms 0.2758ms 3.6253 KOps/s 3.6631 KOps/s $\color{#d91a1a}-1.03\%$
test_seq_wrap[compile-overhead] 1.9764ms 0.3087ms 3.2390 KOps/s 3.6957 KOps/s $\textbf{\color{#d91a1a}-12.36\%}$
test_func_call_runtime[False-eager] 1.6400ms 0.5866ms 1.7048 KOps/s 1.8431 KOps/s $\textbf{\color{#d91a1a}-7.50\%}$
test_func_call_runtime[False-compile] 1.0058ms 0.5365ms 1.8640 KOps/s 1.9669 KOps/s $\textbf{\color{#d91a1a}-5.23\%}$
test_func_call_runtime[False-compile-overhead] 0.6746ms 0.5007ms 1.9970 KOps/s 1.9628 KOps/s $\color{#35bf28}+1.74\%$
test_func_call_runtime[True-eager] 0.8620ms 0.7449ms 1.3425 KOps/s 1.3195 KOps/s $\color{#35bf28}+1.74\%$
test_func_call_runtime[True-compile] 0.6764ms 0.5151ms 1.9412 KOps/s 1.7085 KOps/s $\textbf{\color{#35bf28}+13.62\%}$
test_func_call_runtime[True-compile-overhead] 0.6309ms 0.5114ms 1.9554 KOps/s 1.9159 KOps/s $\color{#35bf28}+2.06\%$
test_func_call_cm_runtime[False-eager] 1.4874ms 0.6272ms 1.5945 KOps/s 1.8681 KOps/s $\textbf{\color{#d91a1a}-14.65\%}$
test_func_call_cm_runtime[False-compile] 1.1813ms 0.5233ms 1.9109 KOps/s 1.9500 KOps/s $\color{#d91a1a}-2.01\%$
test_func_call_cm_runtime[False-compile-overhead] 0.9501ms 0.5072ms 1.9715 KOps/s 1.9592 KOps/s $\color{#35bf28}+0.63\%$
test_func_call_cm_runtime[True-eager] 1.0119ms 0.8664ms 1.1542 KOps/s 1.1217 KOps/s $\color{#35bf28}+2.90\%$
test_func_call_cm_runtime[True-compile] 1.0509ms 0.7300ms 1.3699 KOps/s 1.1478 KOps/s $\textbf{\color{#35bf28}+19.35\%}$
test_func_call_cm_runtime[True-compile-overhead] 1.1779ms 0.7360ms 1.3587 KOps/s 1.3190 KOps/s $\color{#35bf28}+3.01\%$
test_vmap_func_call_cm_runtime[eager] 2.6206ms 1.8649ms 536.2259 Ops/s 530.0275 Ops/s $\color{#35bf28}+1.17\%$
test_vmap_func_call_cm_runtime[compile] 2.4900ms 1.8917ms 528.6301 Ops/s 514.2960 Ops/s $\color{#35bf28}+2.79\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.7097ms 1.9254ms 519.3708 Ops/s 508.7570 Ops/s $\color{#35bf28}+2.09\%$
test_distributed 0.2521ms 0.1259ms 7.9403 KOps/s 7.6638 KOps/s $\color{#35bf28}+3.61\%$
test_tdmodule 46.8680μs 20.2493μs 49.3845 KOps/s 57.8097 KOps/s $\textbf{\color{#d91a1a}-14.57\%}$
test_tdmodule_dispatch 72.4250μs 39.3421μs 25.4181 KOps/s 29.2806 KOps/s $\textbf{\color{#d91a1a}-13.19\%}$
test_tdseq 42.4200μs 22.8410μs 43.7809 KOps/s 51.2625 KOps/s $\textbf{\color{#d91a1a}-14.59\%}$
test_tdseq_dispatch 68.6090μs 44.7773μs 22.3327 KOps/s 25.3506 KOps/s $\textbf{\color{#d91a1a}-11.90\%}$
test_instantiation_functorch 2.5220ms 1.5832ms 631.6355 Ops/s 618.5979 Ops/s $\color{#35bf28}+2.11\%$
test_instantiation_td 1.9181ms 1.1624ms 860.2716 Ops/s 847.2131 Ops/s $\color{#35bf28}+1.54\%$
test_exec_functorch 0.3502ms 0.1862ms 5.3705 KOps/s 5.4202 KOps/s $\color{#d91a1a}-0.92\%$
test_exec_functional_call 0.3433ms 0.1703ms 5.8732 KOps/s 5.5886 KOps/s $\textbf{\color{#35bf28}+5.09\%}$
test_exec_td 0.3052ms 0.1689ms 5.9208 KOps/s 5.8982 KOps/s $\color{#35bf28}+0.38\%$
test_exec_td_decorator 0.3462ms 0.2223ms 4.4986 KOps/s 4.4003 KOps/s $\color{#35bf28}+2.23\%$
test_vmap_mlp_speed[True-True] 0.8025ms 0.6509ms 1.5364 KOps/s 1.4939 KOps/s $\color{#35bf28}+2.85\%$
test_vmap_mlp_speed[True-False] 0.8657ms 0.6497ms 1.5391 KOps/s 1.5251 KOps/s $\color{#35bf28}+0.92\%$
test_vmap_mlp_speed[False-True] 0.8067ms 0.4952ms 2.0194 KOps/s 1.9964 KOps/s $\color{#35bf28}+1.15\%$
test_vmap_mlp_speed[False-False] 0.6713ms 0.4944ms 2.0225 KOps/s 1.9879 KOps/s $\color{#35bf28}+1.74\%$
test_vmap_mlp_speed_decorator[True-True] 1.3925ms 0.6225ms 1.6065 KOps/s 1.5919 KOps/s $\color{#35bf28}+0.92\%$
test_vmap_mlp_speed_decorator[True-False] 0.9574ms 0.6232ms 1.6047 KOps/s 1.5784 KOps/s $\color{#35bf28}+1.67\%$
test_vmap_mlp_speed_decorator[False-True] 0.7795ms 0.5092ms 1.9637 KOps/s 1.9100 KOps/s $\color{#35bf28}+2.81\%$
test_vmap_mlp_speed_decorator[False-False] 0.7310ms 0.5082ms 1.9679 KOps/s 1.9270 KOps/s $\color{#35bf28}+2.12\%$
test_to_module_speed[True] 2.0503ms 1.3158ms 759.9883 Ops/s 765.3669 Ops/s $\color{#d91a1a}-0.70\%$
test_to_module_speed[False] 1.9691ms 1.2763ms 783.4966 Ops/s 790.3830 Ops/s $\color{#d91a1a}-0.87\%$
test_tc_init 0.1076ms 47.5272μs 21.0406 KOps/s 23.5330 KOps/s $\textbf{\color{#d91a1a}-10.59\%}$
test_tc_init_nested 0.1678ms 94.2758μs 10.6072 KOps/s 11.6543 KOps/s $\textbf{\color{#d91a1a}-8.99\%}$
test_tc_first_layer_tensor 16.1900μs 1.5548μs 643.1893 KOps/s 651.5511 KOps/s $\color{#d91a1a}-1.28\%$
test_tc_first_layer_nontensor 28.9640μs 4.7885μs 208.8345 KOps/s 216.6973 KOps/s $\color{#d91a1a}-3.63\%$
test_tc_second_layer_tensor 38.6330μs 2.8415μs 351.9266 KOps/s 359.1640 KOps/s $\color{#d91a1a}-2.02\%$
test_tc_second_layer_nontensor 43.7920μs 6.1898μs 161.5550 KOps/s 167.1201 KOps/s $\color{#d91a1a}-3.33\%$
test_unbind 0.4589s 14.9766ms 66.7708 Ops/s 77.5994 Ops/s $\textbf{\color{#d91a1a}-13.95\%}$
test_full_like 7.7109ms 6.8349ms 146.3078 Ops/s 142.0332 Ops/s $\color{#35bf28}+3.01\%$
test_zeros_like 2.9933ms 2.6335ms 379.7248 Ops/s 372.2304 Ops/s $\color{#35bf28}+2.01\%$
test_ones_like 10.0016ms 5.8865ms 169.8805 Ops/s 326.0933 Ops/s $\textbf{\color{#d91a1a}-47.90\%}$
test_clone 13.9636ms 7.7088ms 129.7223 Ops/s 207.5325 Ops/s $\textbf{\color{#d91a1a}-37.49\%}$
test_squeeze 75.4920μs 13.0917μs 76.3840 KOps/s 79.2947 KOps/s $\color{#d91a1a}-3.67\%$
test_unsqueeze 0.1694ms 90.0702μs 11.1025 KOps/s 10.4247 KOps/s $\textbf{\color{#35bf28}+6.50\%}$
test_split 0.5553ms 0.1893ms 5.2828 KOps/s 4.9494 KOps/s $\textbf{\color{#35bf28}+6.74\%}$
test_permute 0.3666ms 0.2171ms 4.6071 KOps/s 4.3818 KOps/s $\textbf{\color{#35bf28}+5.14\%}$
test_stack 26.1090ms 24.1343ms 41.4349 Ops/s 41.8202 Ops/s $\color{#d91a1a}-0.92\%$
test_cat 26.2814ms 23.8884ms 41.8613 Ops/s 41.8339 Ops/s $\color{#35bf28}+0.07\%$
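
The Change column is consistent with the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column. A minimal sketch of that arithmetic follows, assuming the formula `(ops - ops_head) / ops_head`; the helper names `parse_ops` and `relative_change` are hypothetical and only illustrate how the reported percentages can be recomputed from the two throughput columns.

```python
# Unit suffixes as they appear in the table, mapped to operations per second.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text: str) -> float:
    """Convert a cell such as '46.6499 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(ops: float, ops_head: float) -> float:
    """Relative throughput change of the PR versus repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the CPU table.
rows = {
    "test_plain_set_nested": ("46.6499 KOps/s", "50.5395 KOps/s"),
    "test_membership_stacked_nested_last": ("219.9405 KOps/s", "169.6517 KOps/s"),
    "test_ones_like": ("169.8805 Ops/s", "326.0933 Ops/s"),
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(parse_ops(ops), parse_ops(ops_head))
    print(f"{name}: {change:+.2f}%")
```

For these three rows the recomputed values agree with the table (-7.70%, +29.64%, -47.90%). A positive change means higher throughput than HEAD, which is why positive entries are coloured green.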

github-actions bot commented Sep 26, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}16$.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 0.1423ms 13.4781μs 74.1944 KOps/s 73.2826 KOps/s $\color{#35bf28}+1.24\%$
test_plain_set_stack_nested 36.0900μs 13.6054μs 73.5004 KOps/s 73.0380 KOps/s $\color{#35bf28}+0.63\%$
test_plain_set_nested_inplace 64.2610μs 14.5842μs 68.5676 KOps/s 67.6228 KOps/s $\color{#35bf28}+1.40\%$
test_plain_set_stack_nested_inplace 49.4900μs 14.6887μs 68.0794 KOps/s 67.9551 KOps/s $\color{#35bf28}+0.18\%$
test_items 36.9110μs 2.9916μs 334.2701 KOps/s 348.2568 KOps/s $\color{#d91a1a}-4.02\%$
test_items_nested 0.3791ms 0.3251ms 3.0762 KOps/s 3.0654 KOps/s $\color{#35bf28}+0.35\%$
test_items_nested_locked 0.3684ms 0.3235ms 3.0911 KOps/s 3.0427 KOps/s $\color{#35bf28}+1.59\%$
test_items_nested_leaf 79.2720μs 55.4711μs 18.0274 KOps/s 17.8418 KOps/s $\color{#35bf28}+1.04\%$
test_items_stack_nested 0.3734ms 0.3241ms 3.0859 KOps/s 3.0214 KOps/s $\color{#35bf28}+2.14\%$
test_items_stack_nested_leaf 83.2910μs 55.9829μs 17.8626 KOps/s 17.4104 KOps/s $\color{#35bf28}+2.60\%$
test_items_stack_nested_locked 0.3757ms 0.3246ms 3.0804 KOps/s 2.9910 KOps/s $\color{#35bf28}+2.99\%$
test_keys 39.0510μs 3.4001μs 294.1129 KOps/s 294.5316 KOps/s $\color{#d91a1a}-0.14\%$
test_keys_nested 87.0010μs 56.6905μs 17.6396 KOps/s 17.9659 KOps/s $\color{#d91a1a}-1.82\%$
test_keys_nested_locked 2.5389ms 62.6206μs 15.9692 KOps/s 15.9527 KOps/s $\color{#35bf28}+0.10\%$
test_keys_nested_leaf 74.1610μs 46.0415μs 21.7195 KOps/s 21.0003 KOps/s $\color{#35bf28}+3.43\%$
test_keys_stack_nested 89.3110μs 56.4150μs 17.7258 KOps/s 17.5387 KOps/s $\color{#35bf28}+1.07\%$
test_keys_stack_nested_leaf 79.6110μs 47.5032μs 21.0512 KOps/s 21.1425 KOps/s $\color{#d91a1a}-0.43\%$
test_keys_stack_nested_locked 90.1420μs 62.1923μs 16.0792 KOps/s 15.9650 KOps/s $\color{#35bf28}+0.71\%$
test_values 3.5830μs 0.8362μs 1.1958 MOps/s 1.1847 MOps/s $\color{#35bf28}+0.94\%$
test_values_nested 70.4610μs 41.2954μs 24.2158 KOps/s 24.1881 KOps/s $\color{#35bf28}+0.11\%$
test_values_nested_locked 77.5310μs 43.2008μs 23.1477 KOps/s 23.2035 KOps/s $\color{#d91a1a}-0.24\%$
test_values_nested_leaf 64.2810μs 35.8045μs 27.9295 KOps/s 27.8958 KOps/s $\color{#35bf28}+0.12\%$
test_values_stack_nested 76.2720μs 41.4895μs 24.1025 KOps/s 23.6376 KOps/s $\color{#35bf28}+1.97\%$
test_values_stack_nested_leaf 69.0610μs 36.3362μs 27.5207 KOps/s 27.4096 KOps/s $\color{#35bf28}+0.41\%$
test_values_stack_nested_locked 76.3910μs 43.3493μs 23.0684 KOps/s 22.7081 KOps/s $\color{#35bf28}+1.59\%$
test_membership 1.9535μs 0.5007μs 1.9973 MOps/s 1.9903 MOps/s $\color{#35bf28}+0.35\%$
test_membership_nested 15.2805μs 1.8690μs 535.0445 KOps/s 514.6768 KOps/s $\color{#35bf28}+3.96\%$
test_membership_nested_leaf 11.7800μs 1.8409μs 543.2007 KOps/s 515.2955 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_membership_stacked_nested 48.2700μs 1.9137μs 522.5501 KOps/s 513.5835 KOps/s $\color{#35bf28}+1.75\%$
test_membership_stacked_nested_leaf 27.0300μs 1.9453μs 514.0492 KOps/s 514.4970 KOps/s $\color{#d91a1a}-0.09\%$
test_membership_nested_last 34.0000μs 2.7752μs 360.3346 KOps/s 358.3612 KOps/s $\color{#35bf28}+0.55\%$
test_membership_nested_leaf_last 36.4400μs 2.7970μs 357.5280 KOps/s 359.3850 KOps/s $\color{#d91a1a}-0.52\%$
test_membership_stacked_nested_last 33.6210μs 2.7900μs 358.4272 KOps/s 305.3557 KOps/s $\textbf{\color{#35bf28}+17.38\%}$
test_membership_stacked_nested_leaf_last 44.4010μs 2.8015μs 356.9518 KOps/s 316.6232 KOps/s $\textbf{\color{#35bf28}+12.74\%}$
test_nested_getleaf 53.5610μs 6.0785μs 164.5154 KOps/s 164.8446 KOps/s $\color{#d91a1a}-0.20\%$
test_nested_get 30.1400μs 5.6843μs 175.9231 KOps/s 175.7009 KOps/s $\color{#35bf28}+0.13\%$
test_stacked_getleaf 49.0810μs 6.0303μs 165.8291 KOps/s 165.0051 KOps/s $\color{#35bf28}+0.50\%$
test_stacked_get 39.1100μs 5.6837μs 175.9431 KOps/s 177.7379 KOps/s $\color{#d91a1a}-1.01\%$
test_nested_getitemleaf 30.5700μs 6.1464μs 162.6968 KOps/s 163.3697 KOps/s $\color{#d91a1a}-0.41\%$
test_nested_getitem 36.0600μs 5.7376μs 174.2878 KOps/s 175.3553 KOps/s $\color{#d91a1a}-0.61\%$
test_stacked_getitemleaf 47.1010μs 6.1323μs 163.0696 KOps/s 164.5605 KOps/s $\color{#d91a1a}-0.91\%$
test_stacked_getitem 37.8700μs 5.7496μs 173.9244 KOps/s 176.9446 KOps/s $\color{#d91a1a}-1.71\%$
test_lock_nested 4.6849ms 0.4198ms 2.3822 KOps/s 2.3853 KOps/s $\color{#d91a1a}-0.13\%$
test_lock_stack_nested 0.4349ms 0.3783ms 2.6434 KOps/s 2.6540 KOps/s $\color{#d91a1a}-0.40\%$
test_unlock_nested 0.7371ms 0.3542ms 2.8236 KOps/s 2.8306 KOps/s $\color{#d91a1a}-0.25\%$
test_unlock_stack_nested 0.3784ms 0.3186ms 3.1383 KOps/s 3.1490 KOps/s $\color{#d91a1a}-0.34\%$
test_flatten_speed 0.1185ms 68.8184μs 14.5310 KOps/s 14.5185 KOps/s $\color{#35bf28}+0.09\%$
test_unflatten_speed 0.3390ms 0.2869ms 3.4851 KOps/s 3.4913 KOps/s $\color{#d91a1a}-0.18\%$
test_common_ops 1.4942ms 1.2040ms 830.5443 Ops/s 815.1284 Ops/s $\color{#35bf28}+1.89\%$
test_creation 29.7910μs 1.4717μs 679.5032 KOps/s 694.4175 KOps/s $\color{#d91a1a}-2.15\%$
test_creation_empty 51.3410μs 14.7177μs 67.9454 KOps/s 68.6038 KOps/s $\color{#d91a1a}-0.96\%$
test_creation_nested_1 49.1710μs 16.5256μs 60.5121 KOps/s 60.2004 KOps/s $\color{#35bf28}+0.52\%$
test_creation_nested_2 51.5700μs 18.9357μs 52.8104 KOps/s 52.1240 KOps/s $\color{#35bf28}+1.32\%$
test_clone 61.1110μs 27.9360μs 35.7961 KOps/s 33.1360 KOps/s $\textbf{\color{#35bf28}+8.03\%}$
test_getitem[int] 91.6347ms 22.8879μs 43.6913 KOps/s 64.0822 KOps/s $\textbf{\color{#d91a1a}-31.82\%}$
test_getitem[slice_int] 0.1158ms 27.2425μs 36.7073 KOps/s 36.7136 KOps/s $\color{#d91a1a}-0.02\%$
test_getitem[range] 0.2442ms 0.1077ms 9.2888 KOps/s 9.2332 KOps/s $\color{#35bf28}+0.60\%$
test_getitem[tuple] 0.1230ms 23.3292μs 42.8647 KOps/s 42.6426 KOps/s $\color{#35bf28}+0.52\%$
test_getitem[list] 0.1894ms 96.7958μs 10.3310 KOps/s 10.2136 KOps/s $\color{#35bf28}+1.15\%$
test_setitem_dim[int] 67.9210μs 43.5903μs 22.9409 KOps/s 22.5965 KOps/s $\color{#35bf28}+1.52\%$
test_setitem_dim[slice_int] 92.6110μs 66.5557μs 15.0250 KOps/s 15.1524 KOps/s $\color{#d91a1a}-0.84\%$
test_setitem_dim[range] 0.1698ms 0.1246ms 8.0238 KOps/s 7.9956 KOps/s $\color{#35bf28}+0.35\%$
test_setitem_dim[tuple] 95.7410μs 59.9529μs 16.6798 KOps/s 16.5007 KOps/s $\color{#35bf28}+1.09\%$
test_setitem 78.6520μs 40.1610μs 24.8998 KOps/s 24.3346 KOps/s $\color{#35bf28}+2.32\%$
test_set 70.6310μs 39.2072μs 25.5055 KOps/s 24.0163 KOps/s $\textbf{\color{#35bf28}+6.20\%}$
test_set_shared 0.3532ms 49.0214μs 20.3992 KOps/s 19.6759 KOps/s $\color{#35bf28}+3.68\%$
test_update 86.9520μs 46.8548μs 21.3425 KOps/s 20.7506 KOps/s $\color{#35bf28}+2.85\%$
test_update_nested 81.2620μs 54.0440μs 18.5035 KOps/s 18.1390 KOps/s $\color{#35bf28}+2.01\%$
test_update__nested 95.8710μs 56.7743μs 17.6136 KOps/s 17.3217 KOps/s $\color{#35bf28}+1.69\%$
test_set_nested 72.6110μs 41.4062μs 24.1510 KOps/s 23.0828 KOps/s $\color{#35bf28}+4.63\%$
test_set_nested_new 83.5210μs 44.8928μs 22.2753 KOps/s 21.5610 KOps/s $\color{#35bf28}+3.31\%$
test_select 0.5408ms 58.4989μs 17.0943 KOps/s 16.8132 KOps/s $\color{#35bf28}+1.67\%$
test_select_nested 79.8220μs 41.7820μs 23.9337 KOps/s 23.7951 KOps/s $\color{#35bf28}+0.58\%$
test_exclude_nested 84.0010μs 58.5599μs 17.0765 KOps/s 17.3399 KOps/s $\color{#d91a1a}-1.52\%$
test_empty[True] 0.2837ms 0.2428ms 4.1182 KOps/s 4.0786 KOps/s $\color{#35bf28}+0.97\%$
test_empty[False] 3.4431μs 0.7447μs 1.3428 MOps/s 1.3458 MOps/s $\color{#d91a1a}-0.22\%$
test_to 56.1210μs 25.6997μs 38.9109 KOps/s 38.4271 KOps/s $\color{#35bf28}+1.26\%$
test_to_nonblocking 60.7810μs 24.3804μs 41.0166 KOps/s 39.9256 KOps/s $\color{#35bf28}+2.73\%$
test_unbind_speed 0.3062ms 0.2779ms 3.5981 KOps/s 3.6549 KOps/s $\color{#d91a1a}-1.55\%$
test_unbind_speed_stack0 0.3674ms 0.2788ms 3.5865 KOps/s 3.6690 KOps/s $\color{#d91a1a}-2.25\%$
test_unbind_speed_stack1 90.6798ms 0.7130ms 1.4026 KOps/s 1.4024 KOps/s $\color{#35bf28}+0.01\%$
test_split 93.3989ms 2.1838ms 457.9092 Ops/s 469.7142 Ops/s $\color{#d91a1a}-2.51\%$
test_chunk 93.5502ms 2.1884ms 456.9505 Ops/s 469.1022 Ops/s $\color{#d91a1a}-2.59\%$
test_creation[device0] 0.3892ms 0.1255ms 7.9687 KOps/s 8.0338 KOps/s $\color{#d91a1a}-0.81\%$
test_creation_from_tensor 0.4277ms 0.1282ms 7.7985 KOps/s 7.5976 KOps/s $\color{#35bf28}+2.64\%$
test_add_one[memmap_tensor0] 0.2908ms 8.5004μs 117.6422 KOps/s 116.0175 KOps/s $\color{#35bf28}+1.40\%$
test_contiguous[memmap_tensor0] 31.4810μs 2.1005μs 476.0829 KOps/s 474.1897 KOps/s $\color{#35bf28}+0.40\%$
test_stack[memmap_tensor0] 37.3400μs 6.6454μs 150.4806 KOps/s 156.7164 KOps/s $\color{#d91a1a}-3.98\%$
test_memmaptd_index 1.1140ms 0.4074ms 2.4549 KOps/s 2.4488 KOps/s $\color{#35bf28}+0.25\%$
test_memmaptd_index_astensor 0.9738ms 0.4675ms 2.1389 KOps/s 2.1507 KOps/s $\color{#d91a1a}-0.55\%$
test_memmaptd_index_op 1.3989ms 0.9843ms 1.0159 KOps/s 1.0195 KOps/s $\color{#d91a1a}-0.35\%$
test_serialize_model 0.1300s 0.1291s 7.7456 Ops/s 7.6917 Ops/s $\color{#35bf28}+0.70\%$
test_serialize_model_pickle 1.3765s 1.2175s 0.8213 Ops/s 0.8241 Ops/s $\color{#d91a1a}-0.33\%$
test_serialize_weights 0.2173s 0.1416s 7.0630 Ops/s 7.0187 Ops/s $\color{#35bf28}+0.63\%$
test_serialize_weights_returnearly 0.2168s 54.3828ms 18.3882 Ops/s 17.9311 Ops/s $\color{#35bf28}+2.55\%$
test_serialize_weights_pickle 1.3767s 1.2172s 0.8216 Ops/s 0.8254 Ops/s $\color{#d91a1a}-0.46\%$
test_reshape_pytree 67.9810μs 35.2877μs 28.3384 KOps/s 28.9935 KOps/s $\color{#d91a1a}-2.26\%$
test_reshape_td 69.7510μs 41.9600μs 23.8322 KOps/s 23.6955 KOps/s $\color{#35bf28}+0.58\%$
test_view_pytree 0.4254ms 34.9150μs 28.6410 KOps/s 28.9056 KOps/s $\color{#d91a1a}-0.92\%$
test_view_td 81.4410μs 47.6700μs 20.9776 KOps/s 21.3273 KOps/s $\color{#d91a1a}-1.64\%$
test_unbind_pytree 0.4226ms 33.3162μs 30.0154 KOps/s 29.5940 KOps/s $\color{#35bf28}+1.42\%$
test_unbind_td 0.5281ms 42.0003μs 23.8093 KOps/s 23.5115 KOps/s $\color{#35bf28}+1.27\%$
test_split_pytree 0.4154ms 46.3190μs 21.5894 KOps/s 22.3112 KOps/s $\color{#d91a1a}-3.24\%$
test_split_td 96.7141ms 68.4361μs 14.6122 KOps/s 18.2464 KOps/s $\textbf{\color{#d91a1a}-19.92\%}$
test_add_pytree 92.4020μs 53.0035μs 18.8667 KOps/s 18.1045 KOps/s $\color{#35bf28}+4.21\%$
test_add_td 0.4709ms 86.0537μs 11.6207 KOps/s 11.0315 KOps/s $\textbf{\color{#35bf28}+5.34\%}$
test_compile_add_one_nested[tensordict-compile] 0.4039ms 0.2057ms 4.8609 KOps/s 4.7091 KOps/s $\color{#35bf28}+3.22\%$
test_compile_add_one_nested[tensordict-eager] 0.2255ms 0.1505ms 6.6427 KOps/s 6.6105 KOps/s $\color{#35bf28}+0.49\%$
test_compile_add_one_nested[pytree-compile] 0.2029ms 0.1414ms 7.0717 KOps/s 7.0729 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_add_one_nested[pytree-eager] 0.2379ms 0.1776ms 5.6309 KOps/s 5.5503 KOps/s $\color{#35bf28}+1.45\%$
test_compile_copy_nested[tensordict-compile] 57.3210μs 21.4823μs 46.5500 KOps/s 47.0058 KOps/s $\color{#d91a1a}-0.97\%$
test_compile_copy_nested[tensordict-eager] 80.6010μs 43.6085μs 22.9313 KOps/s 22.8701 KOps/s $\color{#35bf28}+0.27\%$
test_compile_copy_nested[pytree-compile] 0.2058ms 63.5024μs 15.7474 KOps/s 15.6644 KOps/s $\color{#35bf28}+0.53\%$
test_compile_copy_nested[pytree-eager] 98.7120μs 49.1645μs 20.3399 KOps/s 20.5237 KOps/s $\color{#d91a1a}-0.90\%$
test_compile_add_one_flat[tensordict-compile] 0.3993ms 0.3088ms 3.2382 KOps/s 3.2327 KOps/s $\color{#35bf28}+0.17\%$
test_compile_add_one_flat[tensordict-eager] 0.2695ms 0.2082ms 4.8027 KOps/s 4.7757 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_one_flat[tensorclass-compile] 0.1654ms 0.1246ms 8.0266 KOps/s 7.6707 KOps/s $\color{#35bf28}+4.64\%$
test_compile_add_one_flat[tensorclass-eager] 0.1099ms 59.7617μs 16.7331 KOps/s 16.1501 KOps/s $\color{#35bf28}+3.61\%$
test_compile_add_one_flat[pytree-compile] 0.4120ms 0.3069ms 3.2585 KOps/s 3.2076 KOps/s $\color{#35bf28}+1.59\%$
test_compile_add_one_flat[pytree-eager] 0.6422ms 0.5982ms 1.6716 KOps/s 1.6327 KOps/s $\color{#35bf28}+2.38\%$
test_compile_add_self_flat[tensordict-eager] 0.3228ms 0.2478ms 4.0355 KOps/s 4.0178 KOps/s $\color{#35bf28}+0.44\%$
test_compile_add_self_flat[tensordict-compile] 0.3732ms 0.3094ms 3.2317 KOps/s 3.2326 KOps/s $\color{#d91a1a}-0.03\%$
test_compile_add_self_flat[tensorclass-eager] 0.1279ms 69.6745μs 14.3525 KOps/s 14.2795 KOps/s $\color{#35bf28}+0.51\%$
test_compile_add_self_flat[tensorclass-compile] 0.1802ms 0.1251ms 7.9936 KOps/s 7.7907 KOps/s $\color{#35bf28}+2.60\%$
test_compile_add_self_flat[pytree-eager] 0.6173ms 0.5167ms 1.9353 KOps/s 1.8992 KOps/s $\color{#35bf28}+1.90\%$
test_compile_add_self_flat[pytree-compile] 0.3728ms 0.3071ms 3.2560 KOps/s 3.2115 KOps/s $\color{#35bf28}+1.39\%$
test_compile_copy_flat[tensordict-compile] 75.6610μs 17.6717μs 56.5875 KOps/s 50.9541 KOps/s $\textbf{\color{#35bf28}+11.06\%}$
test_compile_copy_flat[tensordict-eager] 65.5700μs 27.1981μs 36.7672 KOps/s 36.6189 KOps/s $\color{#35bf28}+0.41\%$
test_compile_copy_flat[pytree-compile] 98.6310μs 68.2908μs 14.6433 KOps/s 14.6635 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_copy_flat[pytree-eager] 84.6010μs 51.7496μs 19.3238 KOps/s 19.4489 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_assign_and_add[tensordict-compile] 2.3322ms 0.8269ms 1.2094 KOps/s 1.1420 KOps/s $\textbf{\color{#35bf28}+5.90\%}$
test_compile_assign_and_add[tensordict-eager] 3.3595ms 3.1376ms 318.7135 Ops/s 317.2517 Ops/s $\color{#35bf28}+0.46\%$
test_compile_assign_and_add[pytree-compile] 2.3156ms 0.8071ms 1.2389 KOps/s 1.1363 KOps/s $\textbf{\color{#35bf28}+9.04\%}$
test_compile_assign_and_add[pytree-eager] 3.3668ms 3.1833ms 314.1382 Ops/s 308.4233 Ops/s $\color{#35bf28}+1.85\%$
test_compile_indexing[tensor-tensordict-compile] 0.1627ms 0.1124ms 8.8947 KOps/s 9.4667 KOps/s $\textbf{\color{#d91a1a}-6.04\%}$
test_compile_indexing[tensor-tensordict-eager] 0.2040ms 66.0920μs 15.1304 KOps/s 16.6217 KOps/s $\textbf{\color{#d91a1a}-8.97\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.1535ms 0.1005ms 9.9527 KOps/s 9.9968 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1212ms 44.7222μs 22.3602 KOps/s 23.6388 KOps/s $\textbf{\color{#d91a1a}-5.41\%}$
test_compile_indexing[tensor-pytree-compile] 0.1465ms 0.1072ms 9.3321 KOps/s 9.4625 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_indexing[tensor-pytree-eager] 93.1620μs 45.3779μs 22.0372 KOps/s 22.1496 KOps/s $\color{#d91a1a}-0.51\%$
test_compile_indexing[slice-tensordict-compile] 0.2134ms 0.1374ms 7.2798 KOps/s 7.4917 KOps/s $\color{#d91a1a}-2.83\%$
test_compile_indexing[slice-tensordict-eager] 0.2048ms 25.8398μs 38.7000 KOps/s 40.2366 KOps/s $\color{#d91a1a}-3.82\%$
test_compile_indexing[slice-tensorclass-compile] 0.1936ms 0.1355ms 7.3815 KOps/s 7.8344 KOps/s $\textbf{\color{#d91a1a}-5.78\%}$
test_compile_indexing[slice-tensorclass-eager] 65.0010μs 21.9263μs 45.6073 KOps/s 49.5386 KOps/s $\textbf{\color{#d91a1a}-7.94\%}$
test_compile_indexing[slice-pytree-compile] 0.2656ms 0.1307ms 7.6498 KOps/s 7.8626 KOps/s $\color{#d91a1a}-2.71\%$
test_compile_indexing[slice-pytree-eager] 61.6500μs 20.7714μs 48.1431 KOps/s 49.3627 KOps/s $\color{#d91a1a}-2.47\%$
test_compile_indexing[int-tensordict-compile] 0.1968ms 0.1357ms 7.3690 KOps/s 7.4501 KOps/s $\color{#d91a1a}-1.09\%$
test_compile_indexing[int-tensordict-eager] 0.4767ms 25.2125μs 39.6628 KOps/s 40.0491 KOps/s $\color{#d91a1a}-0.96\%$
test_compile_indexing[int-tensorclass-compile] 0.2586ms 0.1360ms 7.3519 KOps/s 7.8455 KOps/s $\textbf{\color{#d91a1a}-6.29\%}$
test_compile_indexing[int-tensorclass-eager] 0.2149ms 27.5826μs 36.2547 KOps/s 50.1291 KOps/s $\textbf{\color{#d91a1a}-27.68\%}$
test_compile_indexing[int-pytree-compile] 0.1839ms 0.1368ms 7.3087 KOps/s 7.7988 KOps/s $\textbf{\color{#d91a1a}-6.29\%}$
test_compile_indexing[int-pytree-eager] 55.0310μs 22.0881μs 45.2733 KOps/s 50.7824 KOps/s $\textbf{\color{#d91a1a}-10.85\%}$
test_mod_add[eager] 67.2910μs 33.0604μs 30.2477 KOps/s 32.3842 KOps/s $\textbf{\color{#d91a1a}-6.60\%}$
test_mod_add[compile] 0.3469ms 69.3055μs 14.4289 KOps/s 14.6400 KOps/s $\color{#d91a1a}-1.44\%$
test_mod_add[compile-overhead] 0.2637ms 0.1347ms 7.4265 KOps/s 6.8346 KOps/s $\textbf{\color{#35bf28}+8.66\%}$
test_mod_wrap[eager] 0.3451ms 0.2356ms 4.2448 KOps/s 4.1297 KOps/s $\color{#35bf28}+2.79\%$
test_mod_wrap[compile] 0.3576ms 0.2866ms 3.4886 KOps/s 3.4189 KOps/s $\color{#35bf28}+2.04\%$
test_mod_wrap[compile-overhead] 7.7496ms 4.0984ms 244.0001 Ops/s 246.0708 Ops/s $\color{#d91a1a}-0.84\%$
test_mod_wrap_and_backward[eager] 1.4370ms 1.3399ms 746.3274 Ops/s 700.4672 Ops/s $\textbf{\color{#35bf28}+6.55\%}$
test_mod_wrap_and_backward[compile] 1.5196ms 1.2918ms 774.1379 Ops/s 708.7194 Ops/s $\textbf{\color{#35bf28}+9.23\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3396ms 0.8930ms 1.1198 KOps/s 998.3503 Ops/s $\textbf{\color{#35bf28}+12.17\%}$
test_seq_add[eager] 0.2208ms 98.0737μs 10.1964 KOps/s 10.2953 KOps/s $\color{#d91a1a}-0.96\%$
test_seq_add[compile] 0.1812ms 81.4467μs 12.2780 KOps/s 12.5827 KOps/s $\color{#d91a1a}-2.42\%$
test_seq_add[compile-overhead] 0.1708ms 0.1145ms 8.7346 KOps/s 8.8979 KOps/s $\color{#d91a1a}-1.84\%$
test_seq_wrap[eager] 0.5127ms 0.3908ms 2.5592 KOps/s 2.5929 KOps/s $\color{#d91a1a}-1.30\%$
test_seq_wrap[compile] 0.7132ms 0.3049ms 3.2797 KOps/s 3.2296 KOps/s $\color{#35bf28}+1.55\%$
test_seq_wrap[compile-overhead] 0.2642ms 0.2157ms 4.6364 KOps/s 4.6080 KOps/s $\color{#35bf28}+0.62\%$
test_func_call_runtime[False-eager] 1.1382ms 0.7265ms 1.3764 KOps/s 1.3753 KOps/s $\color{#35bf28}+0.08\%$
test_func_call_runtime[False-compile] 1.1739ms 0.7582ms 1.3190 KOps/s 1.3025 KOps/s $\color{#35bf28}+1.27\%$
test_func_call_runtime[False-compile-overhead] 0.7536ms 0.3552ms 2.8153 KOps/s 2.8393 KOps/s $\color{#d91a1a}-0.84\%$
test_func_call_runtime[True-eager] 1.3272ms 0.8939ms 1.1187 KOps/s 1.1260 KOps/s $\color{#d91a1a}-0.65\%$
test_func_call_runtime[True-compile] 1.1878ms 0.7839ms 1.2757 KOps/s 1.2685 KOps/s $\color{#35bf28}+0.57\%$
test_func_call_runtime[True-compile-overhead] 0.4801ms 0.3760ms 2.6597 KOps/s 2.6663 KOps/s $\color{#d91a1a}-0.25\%$
test_func_call_cm_runtime[False-eager] 0.8648ms 0.7202ms 1.3885 KOps/s 1.3694 KOps/s $\color{#35bf28}+1.40\%$
test_func_call_cm_runtime[False-compile] 0.9046ms 0.7613ms 1.3135 KOps/s 1.2941 KOps/s $\color{#35bf28}+1.50\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4481ms 0.3541ms 2.8243 KOps/s 2.8142 KOps/s $\color{#35bf28}+0.36\%$
test_func_call_cm_runtime[True-eager] 1.1103ms 0.9794ms 1.0211 KOps/s 1.0053 KOps/s $\color{#35bf28}+1.57\%$
test_func_call_cm_runtime[True-compile] 1.2193ms 0.8125ms 1.2307 KOps/s 1.2202 KOps/s $\color{#35bf28}+0.86\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5348ms 0.3978ms 2.5141 KOps/s 2.4910 KOps/s $\color{#35bf28}+0.93\%$
test_vmap_func_call_cm_runtime[eager] 2.5006ms 2.0566ms 486.2302 Ops/s 484.5449 Ops/s $\color{#35bf28}+0.35\%$
test_vmap_func_call_cm_runtime[compile] 0.9187ms 0.8241ms 1.2135 KOps/s 1.1996 KOps/s $\color{#35bf28}+1.16\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5394ms 0.4023ms 2.4857 KOps/s 2.4795 KOps/s $\color{#35bf28}+0.25\%$
test_distributed 3.2547ms 0.1730ms 5.7802 KOps/s 8.9335 KOps/s $\textbf{\color{#d91a1a}-35.30\%}$
test_tdmodule 24.5300μs 14.2490μs 70.1805 KOps/s 66.9259 KOps/s $\color{#35bf28}+4.86\%$
test_tdmodule_dispatch 49.3610μs 27.8747μs 35.8749 KOps/s 35.6520 KOps/s $\color{#35bf28}+0.63\%$
test_tdseq 43.8900μs 15.2888μs 65.4076 KOps/s 65.0597 KOps/s $\color{#35bf28}+0.53\%$
test_tdseq_dispatch 53.0710μs 30.4529μs 32.8376 KOps/s 32.8196 KOps/s $\color{#35bf28}+0.05\%$
test_instantiation_functorch 1.9859ms 1.8259ms 547.6637 Ops/s 543.9815 Ops/s $\color{#35bf28}+0.68\%$
test_instantiation_td 1.8240ms 1.1813ms 846.5333 Ops/s 843.9704 Ops/s $\color{#35bf28}+0.30\%$
test_exec_functorch 0.2716ms 0.2131ms 4.6929 KOps/s 4.8045 KOps/s $\color{#d91a1a}-2.32\%$
test_exec_functional_call 0.2947ms 0.2191ms 4.5639 KOps/s 4.8396 KOps/s $\textbf{\color{#d91a1a}-5.70\%}$
test_exec_td 0.2745ms 0.2282ms 4.3820 KOps/s 4.4992 KOps/s $\color{#d91a1a}-2.60\%$
test_exec_td_decorator 0.7420ms 0.2690ms 3.7171 KOps/s 3.7538 KOps/s $\color{#d91a1a}-0.98\%$
test_vmap_mlp_speed[True-True] 0.7893ms 0.6726ms 1.4867 KOps/s 1.4454 KOps/s $\color{#35bf28}+2.86\%$
test_vmap_mlp_speed[True-False] 0.8065ms 0.6702ms 1.4920 KOps/s 1.4535 KOps/s $\color{#35bf28}+2.65\%$
test_vmap_mlp_speed[False-True] 0.6852ms 0.5638ms 1.7736 KOps/s 1.6896 KOps/s $\color{#35bf28}+4.97\%$
test_vmap_mlp_speed[False-False] 0.6811ms 0.5622ms 1.7788 KOps/s 1.6746 KOps/s $\textbf{\color{#35bf28}+6.22\%}$
test_vmap_mlp_speed_decorator[True-True] 1.1304ms 0.6855ms 1.4588 KOps/s 1.4663 KOps/s $\color{#d91a1a}-0.51\%$
test_vmap_mlp_speed_decorator[True-False] 0.8242ms 0.6941ms 1.4407 KOps/s 1.4759 KOps/s $\color{#d91a1a}-2.38\%$
test_vmap_mlp_speed_decorator[False-True] 0.7708ms 0.6081ms 1.6444 KOps/s 1.6413 KOps/s $\color{#35bf28}+0.19\%$
test_vmap_mlp_speed_decorator[False-False] 0.7134ms 0.6089ms 1.6423 KOps/s 1.6762 KOps/s $\color{#d91a1a}-2.03\%$
test_vmap_transformer_speed[True-True] 8.3545ms 8.2208ms 121.6427 Ops/s 120.6177 Ops/s $\color{#35bf28}+0.85\%$
test_vmap_transformer_speed[True-False] 8.5564ms 8.2000ms 121.9518 Ops/s 120.4110 Ops/s $\color{#35bf28}+1.28\%$
test_vmap_transformer_speed[False-True] 8.1616ms 8.0031ms 124.9516 Ops/s 123.7449 Ops/s $\color{#35bf28}+0.98\%$
test_vmap_transformer_speed[False-False] 8.1863ms 8.0438ms 124.3200 Ops/s 123.4753 Ops/s $\color{#35bf28}+0.68\%$
test_vmap_transformer_speed_decorator[True-True] 19.4271ms 19.2599ms 51.9213 Ops/s 51.7660 Ops/s $\color{#35bf28}+0.30\%$
test_vmap_transformer_speed_decorator[True-False] 20.2922ms 19.2877ms 51.8465 Ops/s 51.7339 Ops/s $\color{#35bf28}+0.22\%$
test_vmap_transformer_speed_decorator[False-True] 20.0554ms 19.7287ms 50.6874 Ops/s 52.1247 Ops/s $\color{#d91a1a}-2.76\%$
test_vmap_transformer_speed_decorator[False-False] 20.1971ms 19.4122ms 51.5141 Ops/s 52.1079 Ops/s $\color{#d91a1a}-1.14\%$
test_to_module_speed[True] 2.0420ms 0.9491ms 1.0536 KOps/s 1.0726 KOps/s $\color{#d91a1a}-1.77\%$
test_to_module_speed[False] 1.1293ms 0.9207ms 1.0861 KOps/s 1.0841 KOps/s $\color{#35bf28}+0.19\%$
test_tc_init 61.9210μs 33.3983μs 29.9417 KOps/s 31.5204 KOps/s $\textbf{\color{#d91a1a}-5.01\%}$
test_tc_init_nested 0.1102ms 66.9339μs 14.9401 KOps/s 15.1638 KOps/s $\color{#d91a1a}-1.48\%$
test_tc_first_layer_tensor 14.6246μs 0.6621μs 1.5104 MOps/s 1.5164 MOps/s $\color{#d91a1a}-0.40\%$
test_tc_first_layer_nontensor 25.9300μs 2.2092μs 452.6463 KOps/s 454.0020 KOps/s $\color{#d91a1a}-0.30\%$
test_tc_second_layer_tensor 10.2275μs 1.3684μs 730.7956 KOps/s 742.5225 KOps/s $\color{#d91a1a}-1.58\%$
test_tc_second_layer_nontensor 0.1051ms 2.8973μs 345.1443 KOps/s 349.0280 KOps/s $\color{#d91a1a}-1.11\%$
test_unbind 0.1944s 12.2077ms 81.9153 Ops/s 92.7359 Ops/s $\textbf{\color{#d91a1a}-11.67\%}$
test_full_like 0.6736ms 0.5761ms 1.7357 KOps/s 1.7400 KOps/s $\color{#d91a1a}-0.25\%$
test_zeros_like 0.2610ms 0.1979ms 5.0532 KOps/s 5.0519 KOps/s $\color{#35bf28}+0.03\%$
test_ones_like 0.2403ms 0.1980ms 5.0517 KOps/s 5.0559 KOps/s $\color{#d91a1a}-0.08\%$
test_clone 0.4445ms 0.4146ms 2.4121 KOps/s 2.4117 KOps/s $\color{#35bf28}+0.01\%$
test_squeeze 33.3410μs 9.9399μs 100.6051 KOps/s 98.7886 KOps/s $\color{#35bf28}+1.84\%$
test_unsqueeze 0.2212ms 76.3118μs 13.1041 KOps/s 13.4892 KOps/s $\color{#d91a1a}-2.85\%$
test_split 0.4340ms 0.1582ms 6.3223 KOps/s 6.3652 KOps/s $\color{#d91a1a}-0.67\%$
test_permute 0.2225ms 0.1767ms 5.6588 KOps/s 5.6490 KOps/s $\color{#35bf28}+0.17\%$
test_stack 1.2550ms 0.8700ms 1.1495 KOps/s 1.1713 KOps/s $\color{#d91a1a}-1.86\%$
test_cat 1.2521ms 1.2320ms 811.6905 Ops/s 811.9882 Ops/s $\color{#d91a1a}-0.04\%$

vmoens pushed a commit that referenced this pull request Sep 26, 2024
ghstack-source-id: fba1451
Pull Request resolved: #1010
vmoens pushed a commit that referenced this pull request Sep 26, 2024
ghstack-source-id: be809b4
Pull Request resolved: #1010
@vmoens vmoens added the bug Something isn't working label Sep 26, 2024
@vmoens vmoens merged commit 3f446c6 into gh/vmoens/19/base Sep 26, 2024
@vmoens vmoens deleted the gh/vmoens/19/head branch September 26, 2024 15:53