@vmoens (Collaborator) commented Jun 4, 2024

No description provided.

@facebook-github-bot added the CLA Signed label Jun 4, 2024
@vmoens added the bug and versioning labels Jun 4, 2024
@vmoens merged commit efda1a7 into main Jun 4, 2024
@vmoens deleted the torch-2.0-filename branch June 4, 2024 11:48
@github-actions bot commented Jun 4, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}18$. Worsened: $\large\color{#d91a1a}3$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 44.3720μs 17.1692μs 58.2438 KOps/s 58.8694 KOps/s $\color{#d91a1a}-1.06\%$
test_plain_set_stack_nested 58.1590μs 17.4045μs 57.4565 KOps/s 57.5401 KOps/s $\color{#d91a1a}-0.15\%$
test_plain_set_nested_inplace 79.9990μs 19.2652μs 51.9070 KOps/s 51.5647 KOps/s $\color{#35bf28}+0.66\%$
test_plain_set_stack_nested_inplace 55.1930μs 19.3218μs 51.7549 KOps/s 51.7393 KOps/s $\color{#35bf28}+0.03\%$
test_items 40.9360μs 2.5246μs 396.1059 KOps/s 396.6390 KOps/s $\color{#d91a1a}-0.13\%$
test_items_nested 0.4745ms 0.2669ms 3.7473 KOps/s 3.7328 KOps/s $\color{#35bf28}+0.39\%$
test_items_nested_locked 1.1438ms 0.2693ms 3.7129 KOps/s 3.7069 KOps/s $\color{#35bf28}+0.16\%$
test_items_nested_leaf 0.1580ms 76.8254μs 13.0165 KOps/s 12.7362 KOps/s $\color{#35bf28}+2.20\%$
test_items_stack_nested 0.4082ms 0.2701ms 3.7026 KOps/s 3.6847 KOps/s $\color{#35bf28}+0.49\%$
test_items_stack_nested_leaf 0.1602ms 77.3717μs 12.9246 KOps/s 12.5424 KOps/s $\color{#35bf28}+3.05\%$
test_items_stack_nested_locked 0.8833ms 0.2764ms 3.6186 KOps/s 3.6898 KOps/s $\color{#d91a1a}-1.93\%$
test_keys 44.4520μs 3.8452μs 260.0614 KOps/s 260.3823 KOps/s $\color{#d91a1a}-0.12\%$
test_keys_nested 0.2370ms 0.1388ms 7.2065 KOps/s 6.8611 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_keys_nested_locked 0.7351ms 0.1440ms 6.9432 KOps/s 6.8239 KOps/s $\color{#35bf28}+1.75\%$
test_keys_nested_leaf 0.2044ms 0.1189ms 8.4107 KOps/s 8.2551 KOps/s $\color{#35bf28}+1.88\%$
test_keys_stack_nested 0.4988ms 0.1410ms 7.0940 KOps/s 7.0752 KOps/s $\color{#35bf28}+0.27\%$
test_keys_stack_nested_leaf 0.1948ms 0.1171ms 8.5431 KOps/s 8.3861 KOps/s $\color{#35bf28}+1.87\%$
test_keys_stack_nested_locked 0.2668ms 0.1451ms 6.8905 KOps/s 6.8992 KOps/s $\color{#d91a1a}-0.12\%$
test_values 20.1450μs 1.1682μs 855.9886 KOps/s 849.9054 KOps/s $\color{#35bf28}+0.72\%$
test_values_nested 0.1022ms 50.4438μs 19.8241 KOps/s 19.1987 KOps/s $\color{#35bf28}+3.26\%$
test_values_nested_locked 0.1009ms 50.6406μs 19.7470 KOps/s 19.6208 KOps/s $\color{#35bf28}+0.64\%$
test_values_nested_leaf 0.1768ms 45.6157μs 21.9223 KOps/s 21.5979 KOps/s $\color{#35bf28}+1.50\%$
test_values_stack_nested 0.1027ms 52.8183μs 18.9328 KOps/s 19.2623 KOps/s $\color{#d91a1a}-1.71\%$
test_values_stack_nested_leaf 81.3720μs 45.2157μs 22.1162 KOps/s 21.8471 KOps/s $\color{#35bf28}+1.23\%$
test_values_stack_nested_locked 0.1015ms 50.9377μs 19.6318 KOps/s 19.2595 KOps/s $\color{#35bf28}+1.93\%$
test_membership 39.4940μs 1.3241μs 755.2264 KOps/s 735.8894 KOps/s $\color{#35bf28}+2.63\%$
test_membership_nested 49.0510μs 3.3584μs 297.7586 KOps/s 292.5428 KOps/s $\color{#35bf28}+1.78\%$
test_membership_nested_leaf 45.9550μs 3.3645μs 297.2247 KOps/s 292.1932 KOps/s $\color{#35bf28}+1.72\%$
test_membership_stacked_nested 20.2270μs 3.3556μs 298.0084 KOps/s 294.2292 KOps/s $\color{#35bf28}+1.28\%$
test_membership_stacked_nested_leaf 55.2430μs 3.3730μs 296.4705 KOps/s 290.7881 KOps/s $\color{#35bf28}+1.95\%$
test_membership_nested_last 33.2420μs 4.0911μs 244.4342 KOps/s 239.1444 KOps/s $\color{#35bf28}+2.21\%$
test_membership_nested_leaf_last 54.0910μs 4.1503μs 240.9465 KOps/s 240.6917 KOps/s $\color{#35bf28}+0.11\%$
test_membership_stacked_nested_last 25.6080μs 4.0714μs 245.6153 KOps/s 240.6716 KOps/s $\color{#35bf28}+2.05\%$
test_membership_stacked_nested_leaf_last 45.9150μs 4.0801μs 245.0936 KOps/s 217.9650 KOps/s $\textbf{\color{#35bf28}+12.45\%}$
test_nested_getleaf 51.6660μs 10.4860μs 95.3654 KOps/s 95.7580 KOps/s $\color{#d91a1a}-0.41\%$
test_nested_get 49.9340μs 9.9529μs 100.4732 KOps/s 100.3964 KOps/s $\color{#35bf28}+0.08\%$
test_stacked_getleaf 33.5530μs 10.5219μs 95.0395 KOps/s 95.5337 KOps/s $\color{#d91a1a}-0.52\%$
test_stacked_get 37.4400μs 9.9748μs 100.2523 KOps/s 101.0203 KOps/s $\color{#d91a1a}-0.76\%$
test_nested_getitemleaf 57.4770μs 11.1713μs 89.5154 KOps/s 89.7715 KOps/s $\color{#d91a1a}-0.29\%$
test_nested_getitem 41.6680μs 10.1829μs 98.2041 KOps/s 96.5764 KOps/s $\color{#35bf28}+1.69\%$
test_stacked_getitemleaf 61.3250μs 11.0463μs 90.5280 KOps/s 90.6147 KOps/s $\color{#d91a1a}-0.10\%$
test_stacked_getitem 73.1470μs 10.1569μs 98.4549 KOps/s 97.9951 KOps/s $\color{#35bf28}+0.47\%$
test_lock_nested 0.8037ms 0.3528ms 2.8343 KOps/s 2.4960 KOps/s $\textbf{\color{#35bf28}+13.55\%}$
test_lock_stack_nested 0.4662ms 0.3095ms 3.2314 KOps/s 3.2737 KOps/s $\color{#d91a1a}-1.29\%$
test_unlock_nested 0.8079ms 0.3519ms 2.8419 KOps/s 2.4720 KOps/s $\textbf{\color{#35bf28}+14.96\%}$
test_unlock_stack_nested 0.5830ms 0.3163ms 3.1616 KOps/s 3.2017 KOps/s $\color{#d91a1a}-1.25\%$
test_flatten_speed 0.5709ms 96.1681μs 10.3985 KOps/s 10.3329 KOps/s $\color{#35bf28}+0.63\%$
test_unflatten_speed 0.5546ms 0.4040ms 2.4753 KOps/s 2.4268 KOps/s $\color{#35bf28}+2.00\%$
test_common_ops 1.3216ms 0.7078ms 1.4128 KOps/s 1.4097 KOps/s $\color{#35bf28}+0.22\%$
test_creation 21.0190μs 1.8934μs 528.1641 KOps/s 525.0693 KOps/s $\color{#35bf28}+0.59\%$
test_creation_empty 37.5400μs 10.6702μs 93.7186 KOps/s 92.8713 KOps/s $\color{#35bf28}+0.91\%$
test_creation_nested_1 43.4810μs 13.5384μs 73.8641 KOps/s 72.9606 KOps/s $\color{#35bf28}+1.24\%$
test_creation_nested_2 67.4460μs 16.7261μs 59.7867 KOps/s 59.0823 KOps/s $\color{#35bf28}+1.19\%$
test_clone 77.8340μs 13.4685μs 74.2473 KOps/s 75.3041 KOps/s $\color{#d91a1a}-1.40\%$
test_getitem[int] 38.2510μs 11.3161μs 88.3700 KOps/s 85.8445 KOps/s $\color{#35bf28}+2.94\%$
test_getitem[slice_int] 56.2760μs 22.9779μs 43.5202 KOps/s 44.7888 KOps/s $\color{#d91a1a}-2.83\%$
test_getitem[range] 82.8840μs 60.2562μs 16.5958 KOps/s 14.9722 KOps/s $\textbf{\color{#35bf28}+10.84\%}$
test_getitem[tuple] 49.4320μs 18.6654μs 53.5750 KOps/s 53.1339 KOps/s $\color{#35bf28}+0.83\%$
test_getitem[list] 0.1068ms 40.2781μs 24.8274 KOps/s 24.6623 KOps/s $\color{#35bf28}+0.67\%$
test_setitem_dim[int] 70.4720μs 34.3608μs 29.1030 KOps/s 27.5847 KOps/s $\textbf{\color{#35bf28}+5.50\%}$
test_setitem_dim[slice_int] 0.1067ms 61.0075μs 16.3914 KOps/s 16.1216 KOps/s $\color{#35bf28}+1.67\%$
test_setitem_dim[range] 0.1403ms 85.0680μs 11.7553 KOps/s 11.4916 KOps/s $\color{#35bf28}+2.29\%$
test_setitem_dim[tuple] 91.9620μs 49.8923μs 20.0432 KOps/s 19.8749 KOps/s $\color{#35bf28}+0.85\%$
test_setitem 75.1900μs 20.3976μs 49.0254 KOps/s 50.2522 KOps/s $\color{#d91a1a}-2.44\%$
test_set 83.6460μs 19.7997μs 50.5059 KOps/s 51.9715 KOps/s $\color{#d91a1a}-2.82\%$
test_set_shared 3.4076ms 0.1404ms 7.1239 KOps/s 6.8783 KOps/s $\color{#35bf28}+3.57\%$
test_update 99.8370μs 21.9401μs 45.5787 KOps/s 45.8493 KOps/s $\color{#d91a1a}-0.59\%$
test_update_nested 94.7260μs 30.2386μs 33.0703 KOps/s 31.6602 KOps/s $\color{#35bf28}+4.45\%$
test_update__nested 69.2990μs 25.1059μs 39.8313 KOps/s 40.2237 KOps/s $\color{#d91a1a}-0.98\%$
test_set_nested 0.2965ms 22.5618μs 44.3226 KOps/s 45.8489 KOps/s $\color{#d91a1a}-3.33\%$
test_set_nested_new 63.0670μs 25.6977μs 38.9139 KOps/s 38.1149 KOps/s $\color{#35bf28}+2.10\%$
test_select 0.1206ms 41.5813μs 24.0493 KOps/s 23.8232 KOps/s $\color{#35bf28}+0.95\%$
test_select_nested 0.1136ms 59.1117μs 16.9171 KOps/s 16.5709 KOps/s $\color{#35bf28}+2.09\%$
test_exclude_nested 0.2160ms 0.1171ms 8.5380 KOps/s 8.2794 KOps/s $\color{#35bf28}+3.12\%$
test_empty[True] 0.6899ms 0.3857ms 2.5930 KOps/s 2.5111 KOps/s $\color{#35bf28}+3.26\%$
test_empty[False] 44.3000μs 1.1707μs 854.1648 KOps/s 819.9655 KOps/s $\color{#35bf28}+4.17\%$
test_unbind_speed 1.6107ms 0.2539ms 3.9379 KOps/s 3.8228 KOps/s $\color{#35bf28}+3.01\%$
test_unbind_speed_stack0 0.4678ms 0.2568ms 3.8938 KOps/s 3.9930 KOps/s $\color{#d91a1a}-2.48\%$
test_unbind_speed_stack1 69.3052ms 0.7271ms 1.3754 KOps/s 1.2872 KOps/s $\textbf{\color{#35bf28}+6.85\%}$
test_split 68.3547ms 1.6685ms 599.3445 Ops/s 613.9878 Ops/s $\color{#d91a1a}-2.38\%$
test_chunk 67.7892ms 1.6275ms 614.4246 Ops/s 611.5122 Ops/s $\color{#35bf28}+0.48\%$
test_creation[device0] 0.1940ms 83.3921μs 11.9915 KOps/s 11.9307 KOps/s $\color{#35bf28}+0.51\%$
test_creation_from_tensor 0.2318ms 84.8480μs 11.7858 KOps/s 11.3051 KOps/s $\color{#35bf28}+4.25\%$
test_add_one[memmap_tensor0] 85.3290μs 5.3758μs 186.0194 KOps/s 189.0907 KOps/s $\color{#d91a1a}-1.62\%$
test_contiguous[memmap_tensor0] 11.9530μs 0.6289μs 1.5901 MOps/s 1.5374 MOps/s $\color{#35bf28}+3.43\%$
test_stack[memmap_tensor0] 25.3280μs 3.6083μs 277.1420 KOps/s 281.2610 KOps/s $\color{#d91a1a}-1.46\%$
test_memmaptd_index 0.9188ms 0.2525ms 3.9608 KOps/s 3.9281 KOps/s $\color{#35bf28}+0.83\%$
test_memmaptd_index_astensor 0.7590ms 0.3261ms 3.0668 KOps/s 2.9929 KOps/s $\color{#35bf28}+2.47\%$
test_memmaptd_index_op 0.9842ms 0.6115ms 1.6352 KOps/s 1.6344 KOps/s $\color{#35bf28}+0.05\%$
test_serialize_model 0.1892s 0.1148s 8.7082 Ops/s 8.1075 Ops/s $\textbf{\color{#35bf28}+7.41\%}$
test_serialize_model_pickle 0.4510s 0.3540s 2.8252 Ops/s 2.5845 Ops/s $\textbf{\color{#35bf28}+9.31\%}$
test_serialize_weights 0.1115s 0.1042s 9.5943 Ops/s 8.5576 Ops/s $\textbf{\color{#35bf28}+12.11\%}$
test_serialize_weights_returnearly 0.1361s 0.1306s 7.6546 Ops/s 7.6955 Ops/s $\color{#d91a1a}-0.53\%$
test_serialize_weights_pickle 0.9509s 0.5657s 1.7678 Ops/s 1.5616 Ops/s $\textbf{\color{#35bf28}+13.20\%}$
test_serialize_weights_filesystem 0.1046s 95.1016ms 10.5151 Ops/s 10.7098 Ops/s $\color{#d91a1a}-1.82\%$
test_serialize_model_filesystem 0.1674s 0.1047s 9.5527 Ops/s 10.3378 Ops/s $\textbf{\color{#d91a1a}-7.59\%}$
test_reshape_pytree 51.9370μs 25.0837μs 39.8665 KOps/s 39.7905 KOps/s $\color{#35bf28}+0.19\%$
test_reshape_td 70.8120μs 33.0161μs 30.2882 KOps/s 29.7688 KOps/s $\color{#35bf28}+1.75\%$
test_view_pytree 65.8630μs 25.1901μs 39.6982 KOps/s 39.8000 KOps/s $\color{#d91a1a}-0.26\%$
test_view_td 0.1004ms 37.0588μs 26.9842 KOps/s 26.8930 KOps/s $\color{#35bf28}+0.34\%$
test_unbind_pytree 63.4280μs 29.2206μs 34.2224 KOps/s 34.3896 KOps/s $\color{#d91a1a}-0.49\%$
test_unbind_td 0.3683ms 37.4543μs 26.6992 KOps/s 26.0016 KOps/s $\color{#35bf28}+2.68\%$
test_split_pytree 66.3440μs 29.0060μs 34.4756 KOps/s 33.9120 KOps/s $\color{#35bf28}+1.66\%$
test_split_td 0.1258ms 41.1165μs 24.3211 KOps/s 24.4061 KOps/s $\color{#d91a1a}-0.35\%$
test_add_pytree 97.2720μs 35.4743μs 28.1894 KOps/s 28.8716 KOps/s $\color{#d91a1a}-2.36\%$
test_add_td 0.1644ms 56.9149μs 17.5701 KOps/s 18.2703 KOps/s $\color{#d91a1a}-3.83\%$
test_distributed 0.2113ms 0.1011ms 9.8903 KOps/s 9.7378 KOps/s $\color{#35bf28}+1.57\%$
test_tdmodule 29.4550μs 17.0460μs 58.6648 KOps/s 55.7264 KOps/s $\textbf{\color{#35bf28}+5.27\%}$
test_tdmodule_dispatch 51.9470μs 33.7213μs 29.6549 KOps/s 28.5858 KOps/s $\color{#35bf28}+3.74\%$
test_tdseq 47.3380μs 20.0826μs 49.7943 KOps/s 49.4885 KOps/s $\color{#35bf28}+0.62\%$
test_tdseq_dispatch 58.3890μs 39.4534μs 25.3464 KOps/s 25.0965 KOps/s $\color{#35bf28}+1.00\%$
test_instantiation_functorch 2.0620ms 1.3001ms 769.1771 Ops/s 777.4273 Ops/s $\color{#d91a1a}-1.06\%$
test_instantiation_td 1.5153ms 1.0023ms 997.6970 Ops/s 904.3322 Ops/s $\textbf{\color{#35bf28}+10.32\%}$
test_exec_functorch 4.0267ms 0.1630ms 6.1338 KOps/s 6.4194 KOps/s $\color{#d91a1a}-4.45\%$
test_exec_functional_call 0.2730ms 0.1497ms 6.6820 KOps/s 6.9692 KOps/s $\color{#d91a1a}-4.12\%$
test_exec_td 0.3623ms 0.1428ms 7.0016 KOps/s 7.0890 KOps/s $\color{#d91a1a}-1.23\%$
test_exec_td_decorator 0.9414ms 0.2164ms 4.6213 KOps/s 4.6000 KOps/s $\color{#35bf28}+0.46\%$
test_vmap_mlp_speed[True-True] 0.6880ms 0.4894ms 2.0433 KOps/s 2.0681 KOps/s $\color{#d91a1a}-1.20\%$
test_vmap_mlp_speed[True-False] 0.7399ms 0.4828ms 2.0714 KOps/s 2.0613 KOps/s $\color{#35bf28}+0.49\%$
test_vmap_mlp_speed[False-True] 0.7865ms 0.3931ms 2.5437 KOps/s 2.5504 KOps/s $\color{#d91a1a}-0.26\%$
test_vmap_mlp_speed[False-False] 0.6413ms 0.3923ms 2.5489 KOps/s 2.5533 KOps/s $\color{#d91a1a}-0.18\%$
test_vmap_mlp_speed_decorator[True-True] 1.2122ms 0.5559ms 1.7989 KOps/s 1.7891 KOps/s $\color{#35bf28}+0.54\%$
test_vmap_mlp_speed_decorator[True-False] 0.7772ms 0.5541ms 1.8048 KOps/s 1.8067 KOps/s $\color{#d91a1a}-0.10\%$
test_vmap_mlp_speed_decorator[False-True] 0.8558ms 0.4568ms 2.1889 KOps/s 2.1987 KOps/s $\color{#d91a1a}-0.45\%$
test_vmap_mlp_speed_decorator[False-False] 0.6888ms 0.4569ms 2.1887 KOps/s 2.1824 KOps/s $\color{#35bf28}+0.29\%$
test_to_module_speed[True] 2.3325ms 1.6714ms 598.3089 Ops/s 577.8363 Ops/s $\color{#35bf28}+3.54\%$
test_to_module_speed[False] 1.9816ms 1.6489ms 606.4624 Ops/s 593.2635 Ops/s $\color{#35bf28}+2.22\%$
test_tc_init 64.9610μs 28.0809μs 35.6114 KOps/s 33.3896 KOps/s $\textbf{\color{#35bf28}+6.65\%}$
test_tc_init_nested 0.1181ms 59.2538μs 16.8765 KOps/s 16.0496 KOps/s $\textbf{\color{#35bf28}+5.15\%}$
test_tc_first_layer_tensor 4.6601μs 0.7943μs 1.2589 MOps/s 1.1755 MOps/s $\textbf{\color{#35bf28}+7.09\%}$
test_tc_first_layer_nontensor 3.2792μs 0.7711μs 1.2968 MOps/s 1.3554 MOps/s $\color{#d91a1a}-4.32\%$
test_tc_second_layer_tensor 19.2360μs 2.1643μs 462.0365 KOps/s 532.9451 KOps/s $\textbf{\color{#d91a1a}-13.31\%}$
test_tc_second_layer_nontensor 17.8130μs 1.9280μs 518.6611 KOps/s 592.2435 KOps/s $\textbf{\color{#d91a1a}-12.42\%}$
test_unbind 87.8614ms 6.4532ms 154.9630 Ops/s 135.6833 Ops/s $\textbf{\color{#35bf28}+14.21\%}$
test_full_like 17.3094ms 11.9819ms 83.4595 Ops/s 87.0209 Ops/s $\color{#d91a1a}-4.09\%$
test_zeros_like 10.9616ms 5.9491ms 168.0939 Ops/s 160.2641 Ops/s $\color{#35bf28}+4.89\%$
test_ones_like 12.2748ms 6.6056ms 151.3876 Ops/s 145.9216 Ops/s $\color{#35bf28}+3.75\%$
test_clone 16.3968ms 8.1704ms 122.3926 Ops/s 119.1915 Ops/s $\color{#35bf28}+2.69\%$
test_squeeze 61.4850μs 13.6568μs 73.2234 KOps/s 73.1707 KOps/s $\color{#35bf28}+0.07\%$
test_unsqueeze 0.1327ms 59.2038μs 16.8908 KOps/s 16.6623 KOps/s $\color{#35bf28}+1.37\%$
test_split 0.2056ms 0.1118ms 8.9474 KOps/s 8.9152 KOps/s $\color{#35bf28}+0.36\%$
test_permute 0.2760ms 0.1241ms 8.0603 KOps/s 8.0329 KOps/s $\color{#35bf28}+0.34\%$
test_stack 28.3104ms 23.0501ms 43.3838 Ops/s 40.4245 Ops/s $\textbf{\color{#35bf28}+7.32\%}$
test_cat 28.7690ms 23.0134ms 43.4529 Ops/s 43.2200 Ops/s $\color{#35bf28}+0.54\%$
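
For anyone checking the Change column by hand: the figures above are consistent with the relative difference between Ops and Ops on Repo HEAD. The sketch below reproduces three rows of the CPU table. The function names are hypothetical, and the 5% cut-off is an assumption inferred from which rows are bolded and counted as Improved or Worsened; the bot output does not state it.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row. The 5% threshold is an assumption inferred from the bolded rows."""
    if change_pct >= threshold_pct:
        return "improved"
    if change_pct <= -threshold_pct:
        return "worsened"
    return "unchanged"


# (name, Ops, Ops on Repo HEAD) copied from the CPU table, all in KOps/s.
rows = [
    ("test_plain_set_nested", 58.2438, 58.8694),
    ("test_keys_nested", 7.2065, 6.8611),
    ("test_tc_second_layer_tensor", 462.0365, 532.9451),
]

for name, ops, head in rows:
    change = relative_change(ops, head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Both values in a row must be in the same unit (KOps/s with KOps/s, and so on) before they are compared.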

@github-actions bot commented Jun 4, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 152. Improved: $\large\color{#35bf28}8$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.4177ms 13.4644μs 74.2697 KOps/s 74.8727 KOps/s $\color{#d91a1a}-0.81\%$
test_plain_set_stack_nested 45.0910μs 13.5486μs 73.8085 KOps/s 74.1366 KOps/s $\color{#d91a1a}-0.44\%$
test_plain_set_nested_inplace 0.2060ms 14.7464μs 67.8132 KOps/s 68.2952 KOps/s $\color{#d91a1a}-0.71\%$
test_plain_set_stack_nested_inplace 0.2166ms 14.7993μs 67.5710 KOps/s 67.1798 KOps/s $\color{#35bf28}+0.58\%$
test_items 20.0800μs 4.6986μs 212.8302 KOps/s 213.1737 KOps/s $\color{#d91a1a}-0.16\%$
test_items_nested 0.5261ms 0.3407ms 2.9348 KOps/s 2.9370 KOps/s $\color{#d91a1a}-0.07\%$
test_items_nested_locked 0.5364ms 0.3511ms 2.8484 KOps/s 2.8647 KOps/s $\color{#d91a1a}-0.57\%$
test_items_nested_leaf 0.2633ms 82.7510μs 12.0845 KOps/s 11.9465 KOps/s $\color{#35bf28}+1.15\%$
test_items_stack_nested 0.5314ms 0.3496ms 2.8607 KOps/s 2.9113 KOps/s $\color{#d91a1a}-1.74\%$
test_items_stack_nested_leaf 0.2631ms 84.9831μs 11.7670 KOps/s 11.9619 KOps/s $\color{#d91a1a}-1.63\%$
test_items_stack_nested_locked 0.4414ms 0.3482ms 2.8719 KOps/s 2.8631 KOps/s $\color{#35bf28}+0.31\%$
test_keys 24.4010μs 4.3412μs 230.3502 KOps/s 229.5017 KOps/s $\color{#35bf28}+0.37\%$
test_keys_nested 0.2501ms 67.7357μs 14.7633 KOps/s 14.7443 KOps/s $\color{#35bf28}+0.13\%$
test_keys_nested_locked 0.7452ms 71.4954μs 13.9869 KOps/s 13.7285 KOps/s $\color{#35bf28}+1.88\%$
test_keys_nested_leaf 0.2392ms 57.7471μs 17.3169 KOps/s 17.1169 KOps/s $\color{#35bf28}+1.17\%$
test_keys_stack_nested 95.4410μs 67.4002μs 14.8367 KOps/s 14.8170 KOps/s $\color{#35bf28}+0.13\%$
test_keys_stack_nested_leaf 0.2451ms 58.2271μs 17.1741 KOps/s 17.2318 KOps/s $\color{#d91a1a}-0.33\%$
test_keys_stack_nested_locked 0.2554ms 72.7155μs 13.7522 KOps/s 13.8581 KOps/s $\color{#d91a1a}-0.76\%$
test_values 9.5433μs 1.8311μs 546.1255 KOps/s 549.4466 KOps/s $\color{#d91a1a}-0.60\%$
test_values_nested 0.2173ms 34.9828μs 28.5855 KOps/s 28.3167 KOps/s $\color{#35bf28}+0.95\%$
test_values_nested_locked 0.2148ms 36.8302μs 27.1517 KOps/s 26.6993 KOps/s $\color{#35bf28}+1.69\%$
test_values_nested_leaf 0.2097ms 31.2997μs 31.9492 KOps/s 31.7344 KOps/s $\color{#35bf28}+0.68\%$
test_values_stack_nested 62.8610μs 36.1698μs 27.6474 KOps/s 27.7669 KOps/s $\color{#d91a1a}-0.43\%$
test_values_stack_nested_leaf 0.2175ms 32.2636μs 30.9947 KOps/s 31.1036 KOps/s $\color{#d91a1a}-0.35\%$
test_values_stack_nested_locked 0.2188ms 37.8250μs 26.4376 KOps/s 26.2488 KOps/s $\color{#35bf28}+0.72\%$
test_membership 26.4476μs 0.7089μs 1.4106 MOps/s 1.4400 MOps/s $\color{#d91a1a}-2.04\%$
test_membership_nested 0.1881ms 2.5878μs 386.4217 KOps/s 385.0410 KOps/s $\color{#35bf28}+0.36\%$
test_membership_nested_leaf 32.5700μs 2.5536μs 391.6066 KOps/s 388.0177 KOps/s $\color{#35bf28}+0.92\%$
test_membership_stacked_nested 14.2900μs 2.5742μs 388.4709 KOps/s 388.5280 KOps/s $\color{#d91a1a}-0.01\%$
test_membership_stacked_nested_leaf 0.1906ms 2.5872μs 386.5153 KOps/s 390.0435 KOps/s $\color{#d91a1a}-0.90\%$
test_membership_nested_last 33.6900μs 3.1182μs 320.7021 KOps/s 321.4054 KOps/s $\color{#d91a1a}-0.22\%$
test_membership_nested_leaf_last 0.1927ms 3.0986μs 322.7246 KOps/s 320.4408 KOps/s $\color{#35bf28}+0.71\%$
test_membership_stacked_nested_last 32.0100μs 3.6013μs 277.6739 KOps/s 134.9267 KOps/s $\textbf{\color{#35bf28}+105.80\%}$
test_membership_stacked_nested_leaf_last 0.1938ms 3.5994μs 277.8237 KOps/s 135.5131 KOps/s $\textbf{\color{#35bf28}+105.02\%}$
test_nested_getleaf 38.9110μs 8.3673μs 119.5134 KOps/s 119.7762 KOps/s $\color{#d91a1a}-0.22\%$
test_nested_get 0.1952ms 7.8375μs 127.5925 KOps/s 126.7455 KOps/s $\color{#35bf28}+0.67\%$
test_stacked_getleaf 0.2000ms 8.4237μs 118.7127 KOps/s 118.5783 KOps/s $\color{#35bf28}+0.11\%$
test_stacked_get 38.7400μs 7.8496μs 127.3953 KOps/s 126.9802 KOps/s $\color{#35bf28}+0.33\%$
test_nested_getitemleaf 0.1929ms 8.5591μs 116.8347 KOps/s 116.9156 KOps/s $\color{#d91a1a}-0.07\%$
test_nested_getitem 0.1918ms 8.0299μs 124.5338 KOps/s 124.1413 KOps/s $\color{#35bf28}+0.32\%$
test_stacked_getitemleaf 29.1510μs 8.5509μs 116.9470 KOps/s 116.2920 KOps/s $\color{#35bf28}+0.56\%$
test_stacked_getitem 22.4000μs 8.0593μs 124.0801 KOps/s 123.1827 KOps/s $\color{#35bf28}+0.73\%$
test_lock_nested 59.0149ms 0.4156ms 2.4061 KOps/s 2.3644 KOps/s $\color{#35bf28}+1.76\%$
test_lock_stack_nested 0.3413ms 0.3068ms 3.2599 KOps/s 3.2595 KOps/s $\color{#35bf28}+0.01\%$
test_unlock_nested 0.8549ms 0.3541ms 2.8241 KOps/s 2.7994 KOps/s $\color{#35bf28}+0.88\%$
test_unlock_stack_nested 0.5083ms 0.3132ms 3.1930 KOps/s 3.1838 KOps/s $\color{#35bf28}+0.29\%$
test_flatten_speed 0.1858ms 0.1013ms 9.8750 KOps/s 9.8064 KOps/s $\color{#35bf28}+0.70\%$
test_unflatten_speed 0.4802ms 0.2931ms 3.4121 KOps/s 3.4562 KOps/s $\color{#d91a1a}-1.28\%$
test_common_ops 1.1937ms 0.5959ms 1.6780 KOps/s 1.6639 KOps/s $\color{#35bf28}+0.85\%$
test_creation 15.4200μs 1.6289μs 613.9033 KOps/s 607.6955 KOps/s $\color{#35bf28}+1.02\%$
test_creation_empty 0.1988ms 9.8068μs 101.9702 KOps/s 103.4237 KOps/s $\color{#d91a1a}-1.41\%$
test_creation_nested_1 31.8310μs 11.7777μs 84.9065 KOps/s 85.3849 KOps/s $\color{#d91a1a}-0.56\%$
test_creation_nested_2 0.2069ms 13.8434μs 72.2368 KOps/s 72.7819 KOps/s $\color{#d91a1a}-0.75\%$
test_clone 73.7610μs 11.3722μs 87.9335 KOps/s 85.3959 KOps/s $\color{#35bf28}+2.97\%$
test_getitem[int] 27.5810μs 10.9705μs 91.1533 KOps/s 90.6941 KOps/s $\color{#35bf28}+0.51\%$
test_getitem[slice_int] 38.4500μs 20.5728μs 48.6079 KOps/s 48.8054 KOps/s $\color{#d91a1a}-0.40\%$
test_getitem[range] 65.5510μs 46.8226μs 21.3572 KOps/s 19.0793 KOps/s $\textbf{\color{#35bf28}+11.94\%}$
test_getitem[tuple] 38.6410μs 18.8622μs 53.0160 KOps/s 53.4282 KOps/s $\color{#d91a1a}-0.77\%$
test_getitem[list] 0.2413ms 33.2357μs 30.0881 KOps/s 29.4917 KOps/s $\color{#35bf28}+2.02\%$
test_setitem_dim[int] 47.3700μs 30.6939μs 32.5797 KOps/s 33.4834 KOps/s $\color{#d91a1a}-2.70\%$
test_setitem_dim[slice_int] 91.7610μs 51.1115μs 19.5651 KOps/s 20.0736 KOps/s $\color{#d91a1a}-2.53\%$
test_setitem_dim[range] 86.1620μs 67.0593μs 14.9122 KOps/s 14.9729 KOps/s $\color{#d91a1a}-0.41\%$
test_setitem_dim[tuple] 63.4210μs 45.3835μs 22.0344 KOps/s 22.3699 KOps/s $\color{#d91a1a}-1.50\%$
test_setitem 47.0810μs 16.9644μs 58.9469 KOps/s 58.3725 KOps/s $\color{#35bf28}+0.98\%$
test_set 0.2321ms 16.6196μs 60.1701 KOps/s 60.6869 KOps/s $\color{#d91a1a}-0.85\%$
test_set_shared 1.1018ms 97.6465μs 10.2410 KOps/s 9.8257 KOps/s $\color{#35bf28}+4.23\%$
test_update 79.0110μs 19.3850μs 51.5864 KOps/s 52.1526 KOps/s $\color{#d91a1a}-1.09\%$
test_update_nested 0.2155ms 24.3850μs 41.0089 KOps/s 41.1097 KOps/s $\color{#d91a1a}-0.25\%$
test_update__nested 62.4010μs 21.8503μs 45.7659 KOps/s 44.0087 KOps/s $\color{#35bf28}+3.99\%$
test_set_nested 0.2088ms 17.4201μs 57.4050 KOps/s 57.7177 KOps/s $\color{#d91a1a}-0.54\%$
test_set_nested_new 54.3810μs 20.3262μs 49.1975 KOps/s 49.3880 KOps/s $\color{#d91a1a}-0.39\%$
test_select 77.8010μs 34.1829μs 29.2544 KOps/s 28.9645 KOps/s $\color{#35bf28}+1.00\%$
test_select_nested 0.4656ms 55.5380μs 18.0057 KOps/s 18.2154 KOps/s $\color{#d91a1a}-1.15\%$
test_exclude_nested 0.2971ms 0.1123ms 8.9051 KOps/s 9.0451 KOps/s $\color{#d91a1a}-1.55\%$
test_empty[True] 0.5316ms 0.3480ms 2.8737 KOps/s 2.8949 KOps/s $\color{#d91a1a}-0.74\%$
test_empty[False] 31.3238μs 0.9385μs 1.0655 MOps/s 1.0812 MOps/s $\color{#d91a1a}-1.45\%$
test_to 0.1012ms 76.5635μs 13.0611 KOps/s 12.9369 KOps/s $\color{#35bf28}+0.96\%$
test_to_nonblocking 0.2641ms 61.3889μs 16.2896 KOps/s 16.2244 KOps/s $\color{#35bf28}+0.40\%$
test_unbind_speed 1.7247ms 0.2701ms 3.7023 KOps/s 3.7101 KOps/s $\color{#d91a1a}-0.21\%$
test_unbind_speed_stack0 0.4647ms 0.2725ms 3.6696 KOps/s 3.7055 KOps/s $\color{#d91a1a}-0.97\%$
test_unbind_speed_stack1 76.5444ms 0.7963ms 1.2558 KOps/s 1.2703 KOps/s $\color{#d91a1a}-1.15\%$
test_split 76.6784ms 1.6894ms 591.9109 Ops/s 594.3589 Ops/s $\color{#d91a1a}-0.41\%$
test_chunk 77.0373ms 1.6846ms 593.6264 Ops/s 592.9319 Ops/s $\color{#35bf28}+0.12\%$
test_creation[device0] 0.1393ms 57.3741μs 17.4295 KOps/s 17.4730 KOps/s $\color{#d91a1a}-0.25\%$
test_creation_from_tensor 0.2531ms 54.5787μs 18.3222 KOps/s 18.7588 KOps/s $\color{#d91a1a}-2.33\%$
test_add_one[memmap_tensor0] 86.8320μs 6.5645μs 152.3334 KOps/s 152.0071 KOps/s $\color{#35bf28}+0.21\%$
test_contiguous[memmap_tensor0] 12.1410μs 0.6760μs 1.4792 MOps/s 1.4806 MOps/s $\color{#d91a1a}-0.09\%$
test_stack[memmap_tensor0] 28.8910μs 4.6598μs 214.5992 KOps/s 214.3240 KOps/s $\color{#35bf28}+0.13\%$
test_memmaptd_index 1.3030ms 0.2923ms 3.4212 KOps/s 3.4345 KOps/s $\color{#d91a1a}-0.39\%$
test_memmaptd_index_astensor 0.6228ms 0.3635ms 2.7512 KOps/s 2.7403 KOps/s $\color{#35bf28}+0.40\%$
test_memmaptd_index_op 0.9588ms 0.6743ms 1.4829 KOps/s 1.4962 KOps/s $\color{#d91a1a}-0.89\%$
test_serialize_model 0.1848s 0.1125s 8.8909 Ops/s 8.6500 Ops/s $\color{#35bf28}+2.78\%$
test_serialize_model_pickle 1.3488s 1.2360s 0.8091 Ops/s 0.8083 Ops/s $\color{#35bf28}+0.10\%$
test_serialize_weights 0.1812s 0.1093s 9.1464 Ops/s 8.7270 Ops/s $\color{#35bf28}+4.81\%$
test_serialize_weights_returnearly 0.2085s 99.5337ms 10.0469 Ops/s 10.4654 Ops/s $\color{#d91a1a}-4.00\%$
test_serialize_weights_pickle 1.3856s 1.2521s 0.7987 Ops/s 0.8013 Ops/s $\color{#d91a1a}-0.33\%$
test_reshape_pytree 58.2810μs 25.9583μs 38.5234 KOps/s 38.7395 KOps/s $\color{#d91a1a}-0.56\%$
test_reshape_td 77.8410μs 31.0661μs 32.1895 KOps/s 29.6747 KOps/s $\textbf{\color{#35bf28}+8.47\%}$
test_view_pytree 56.2410μs 25.7584μs 38.8223 KOps/s 37.1621 KOps/s $\color{#35bf28}+4.47\%$
test_view_td 79.6210μs 35.6678μs 28.0365 KOps/s 25.7454 KOps/s $\textbf{\color{#35bf28}+8.90\%}$
test_unbind_pytree 57.7700μs 31.8462μs 31.4010 KOps/s 29.8505 KOps/s $\textbf{\color{#35bf28}+5.19\%}$
test_unbind_td 0.4512ms 41.0984μs 24.3318 KOps/s 22.1977 KOps/s $\textbf{\color{#35bf28}+9.61\%}$
test_split_pytree 70.0810μs 35.4994μs 28.1695 KOps/s 28.9928 KOps/s $\color{#d91a1a}-2.84\%$
test_split_td 99.6310μs 40.3281μs 24.7966 KOps/s 25.1560 KOps/s $\color{#d91a1a}-1.43\%$
test_add_pytree 71.6110μs 37.2630μs 26.8363 KOps/s 27.1984 KOps/s $\color{#d91a1a}-1.33\%$
test_add_td 77.4710μs 50.8168μs 19.6785 KOps/s 20.0635 KOps/s $\color{#d91a1a}-1.92\%$
test_distributed 1.6799ms 70.4710μs 14.1902 KOps/s 15.0624 KOps/s $\textbf{\color{#d91a1a}-5.79\%}$
test_tdmodule 0.1262ms 15.5809μs 64.1812 KOps/s 64.3482 KOps/s $\color{#d91a1a}-0.26\%$
test_tdmodule_dispatch 48.1100μs 31.0265μs 32.2305 KOps/s 33.0797 KOps/s $\color{#d91a1a}-2.57\%$
test_tdseq 33.0600μs 17.2198μs 58.0727 KOps/s 56.5203 KOps/s $\color{#35bf28}+2.75\%$
test_tdseq_dispatch 50.2710μs 33.6267μs 29.7383 KOps/s 29.8867 KOps/s $\color{#d91a1a}-0.50\%$
test_instantiation_functorch 1.6362ms 1.5400ms 649.3510 Ops/s 659.8100 Ops/s $\color{#d91a1a}-1.59\%$
test_instantiation_td 78.6174ms 1.1413ms 876.1859 Ops/s 948.6022 Ops/s $\textbf{\color{#d91a1a}-7.63\%}$
test_exec_functorch 0.2099ms 0.1469ms 6.8080 KOps/s 6.7505 KOps/s $\color{#35bf28}+0.85\%$
test_exec_functional_call 0.1822ms 0.1327ms 7.5365 KOps/s 7.3948 KOps/s $\color{#35bf28}+1.92\%$
test_exec_td 0.1760ms 0.1302ms 7.6830 KOps/s 7.4702 KOps/s $\color{#35bf28}+2.85\%$
test_exec_td_decorator 0.6058ms 0.2064ms 4.8443 KOps/s 4.8841 KOps/s $\color{#d91a1a}-0.82\%$
test_vmap_mlp_speed[True-True] 0.6357ms 0.5811ms 1.7210 KOps/s 1.7432 KOps/s $\color{#d91a1a}-1.28\%$
test_vmap_mlp_speed[True-False] 0.6481ms 0.5775ms 1.7317 KOps/s 1.7526 KOps/s $\color{#d91a1a}-1.19\%$
test_vmap_mlp_speed[False-True] 0.5783ms 0.5079ms 1.9690 KOps/s 1.9711 KOps/s $\color{#d91a1a}-0.11\%$
test_vmap_mlp_speed[False-False] 0.6316ms 0.5161ms 1.9376 KOps/s 1.9839 KOps/s $\color{#d91a1a}-2.33\%$
test_vmap_mlp_speed_decorator[True-True] 0.7490ms 0.6413ms 1.5594 KOps/s 1.5719 KOps/s $\color{#d91a1a}-0.79\%$
test_vmap_mlp_speed_decorator[True-False] 0.9465ms 0.6391ms 1.5646 KOps/s 1.5791 KOps/s $\color{#d91a1a}-0.92\%$
test_vmap_mlp_speed_decorator[False-True] 0.6869ms 0.5608ms 1.7832 KOps/s 1.7851 KOps/s $\color{#d91a1a}-0.11\%$
test_vmap_mlp_speed_decorator[False-False] 0.6884ms 0.5627ms 1.7771 KOps/s 1.7887 KOps/s $\color{#d91a1a}-0.65\%$
test_vmap_transformer_speed[True-True] 7.7332ms 7.4518ms 134.1960 Ops/s 134.5440 Ops/s $\color{#d91a1a}-0.26\%$
test_vmap_transformer_speed[True-False] 8.0562ms 7.5122ms 133.1162 Ops/s 135.1804 Ops/s $\color{#d91a1a}-1.53\%$
test_vmap_transformer_speed[False-True] 7.7047ms 7.4244ms 134.6918 Ops/s 135.9167 Ops/s $\color{#d91a1a}-0.90\%$
test_vmap_transformer_speed[False-False] 7.7204ms 7.3970ms 135.1896 Ops/s 135.4546 Ops/s $\color{#d91a1a}-0.20\%$
test_vmap_transformer_speed_decorator[True-True] 18.7966ms 18.1080ms 55.2241 Ops/s 55.4446 Ops/s $\color{#d91a1a}-0.40\%$
test_vmap_transformer_speed_decorator[True-False] 18.7187ms 18.1216ms 55.1827 Ops/s 55.6612 Ops/s $\color{#d91a1a}-0.86\%$
test_vmap_transformer_speed_decorator[False-True] 18.5168ms 17.9738ms 55.6365 Ops/s 55.8500 Ops/s $\color{#d91a1a}-0.38\%$
test_vmap_transformer_speed_decorator[False-False] 18.1599ms 17.9616ms 55.6744 Ops/s 55.8698 Ops/s $\color{#d91a1a}-0.35\%$
test_to_module_speed[True] 1.6589ms 1.5293ms 653.8739 Ops/s 653.6537 Ops/s $\color{#35bf28}+0.03\%$
test_to_module_speed[False] 1.6604ms 1.4968ms 668.0723 Ops/s 662.2644 Ops/s $\color{#35bf28}+0.88\%$
test_tc_init 43.9400μs 26.6290μs 37.5531 KOps/s 36.9489 KOps/s $\color{#35bf28}+1.64\%$
test_tc_init_nested 85.2210μs 57.7374μs 17.3198 KOps/s 18.8114 KOps/s $\textbf{\color{#d91a1a}-7.93\%}$
test_tc_first_layer_tensor 0.7485μs 0.3557μs 2.8112 MOps/s 2.8045 MOps/s $\color{#35bf28}+0.24\%$
test_tc_first_layer_nontensor 1.6638μs 0.3855μs 2.5943 MOps/s 2.5577 MOps/s $\color{#35bf28}+1.43\%$
test_tc_second_layer_tensor 14.9910μs 1.0579μs 945.2416 KOps/s 1.0453 MOps/s $\textbf{\color{#d91a1a}-9.57\%}$
test_tc_second_layer_nontensor 3.6002μs 0.8171μs 1.2238 MOps/s 1.2509 MOps/s $\color{#d91a1a}-2.16\%$
test_unbind 95.9511ms 6.7185ms 148.8429 Ops/s 133.7322 Ops/s $\textbf{\color{#35bf28}+11.30\%}$
test_full_like 11.9684ms 11.0465ms 90.5263 Ops/s 88.5561 Ops/s $\color{#35bf28}+2.22\%$
test_zeros_like 7.9939ms 7.7249ms 129.4512 Ops/s 128.6225 Ops/s $\color{#35bf28}+0.64\%$
test_ones_like 8.0680ms 7.7682ms 128.7295 Ops/s 128.2758 Ops/s $\color{#35bf28}+0.35\%$
test_clone 9.3329ms 9.2012ms 108.6816 Ops/s 107.4327 Ops/s $\color{#35bf28}+1.16\%$
test_squeeze 65.7910μs 12.3513μs 80.9632 KOps/s 91.9053 KOps/s $\textbf{\color{#d91a1a}-11.91\%}$
test_unsqueeze 0.1133ms 53.8534μs 18.5689 KOps/s 19.3730 KOps/s $\color{#d91a1a}-4.15\%$
test_split 0.1465ms 99.6609μs 10.0340 KOps/s 10.2156 KOps/s $\color{#d91a1a}-1.78\%$
test_permute 0.1661ms 0.1157ms 8.6446 KOps/s 8.6153 KOps/s $\color{#35bf28}+0.34\%$
test_stack 26.8877ms 26.7130ms 37.4349 Ops/s 37.0167 Ops/s $\color{#35bf28}+1.13\%$
test_cat 26.8211ms 26.6413ms 37.5357 Ops/s 37.1072 Ops/s $\color{#35bf28}+1.15\%$
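
The Ops column in these tables appears to be the reciprocal of the Mean column, which is handy when comparing rows reported in different units (μs, ms, s against Ops/s, KOps/s, MOps/s). A minimal sketch, assuming that relationship holds; the helper names are hypothetical and not part of the benchmark tooling.

```python
# Seconds per unit for the time suffixes used in the tables.
# Both the Greek mu and the micro sign are accepted, since either may appear.
UNIT_SECONDS = {"ms": 1e-3, "μs": 1e-6, "µs": 1e-6, "s": 1.0}


def parse_seconds(text: str) -> float:
    """Convert a table entry such as '13.4644μs' or '1.2360s' to seconds."""
    # Try two-character suffixes before the bare 's' so 'ms' is not misread.
    for suffix in ("ms", "μs", "µs", "s"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * UNIT_SECONDS[suffix]
    raise ValueError(f"unrecognised time unit in {text!r}")


def ops_per_second(mean: str) -> float:
    """Throughput implied by a mean run time, in operations per second."""
    return 1.0 / parse_seconds(mean)


# Mean values copied from the GPU table above.
for name, mean in [
    ("test_plain_set_nested", "13.4644μs"),
    ("test_vmap_transformer_speed[True-True]", "7.4518ms"),
    ("test_serialize_model_pickle", "1.2360s"),
]:
    print(f"{name}: {ops_per_second(mean):.4f} Ops/s")
```

Small differences in the last digit against the table are expected, because the Mean values shown are already rounded.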
