-
Notifications
You must be signed in to change notification settings - Fork 109
[Benchmark] Fix recursion and cache errors in benchmarks #900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…o fix-recursion-error-compile
|
| Name | Max | Mean | Ops | Ops on Repo HEAD
|
Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 41.2670μs | 22.3327μs | 44.7775 KOps/s | 47.2705 KOps/s | |
| test_plain_set_stack_nested | 54.2420μs | 22.7727μs | 43.9123 KOps/s | 46.2512 KOps/s | |
| test_plain_set_nested_inplace | 59.3310μs | 24.5552μs | 40.7245 KOps/s | 42.8104 KOps/s | |
| test_plain_set_stack_nested_inplace | 68.1480μs | 24.5523μs | 40.7294 KOps/s | 43.0467 KOps/s | |
| test_items | 20.4090μs | 2.6164μs | 382.2037 KOps/s | 372.8860 KOps/s | |
| test_items_nested | 0.6352ms | 0.3751ms | 2.6657 KOps/s | 2.7320 KOps/s | |
| test_items_nested_locked | 0.7902ms | 0.3743ms | 2.6718 KOps/s | 2.6903 KOps/s | |
| test_items_nested_leaf | 0.1700ms | 85.9521μs | 11.6344 KOps/s | 11.5076 KOps/s | |
| test_items_stack_nested | 0.5892ms | 0.3756ms | 2.6624 KOps/s | 2.6872 KOps/s | |
| test_items_stack_nested_leaf | 0.1484ms | 88.5361μs | 11.2948 KOps/s | 11.6748 KOps/s | |
| test_items_stack_nested_locked | 0.4859ms | 0.3779ms | 2.6459 KOps/s | 2.7008 KOps/s | |
| test_keys | 22.7820μs | 3.9652μs | 252.1950 KOps/s | 263.8008 KOps/s | |
| test_keys_nested | 0.2794ms | 0.1451ms | 6.8910 KOps/s | 6.8613 KOps/s | |
| test_keys_nested_locked | 1.6184ms | 0.1511ms | 6.6178 KOps/s | 6.6606 KOps/s | |
| test_keys_nested_leaf | 0.2407ms | 0.1251ms | 7.9949 KOps/s | 8.1068 KOps/s | |
| test_keys_stack_nested | 0.2622ms | 0.1458ms | 6.8602 KOps/s | 6.8795 KOps/s | |
| test_keys_stack_nested_leaf | 0.2288ms | 0.1253ms | 7.9820 KOps/s | 7.9948 KOps/s | |
| test_keys_stack_nested_locked | 0.2893ms | 0.1505ms | 6.6450 KOps/s | 6.6886 KOps/s | |
| test_values | 10.6725μs | 1.2037μs | 830.7718 KOps/s | 875.2848 KOps/s | |
| test_values_nested | 0.1030ms | 50.3576μs | 19.8580 KOps/s | 19.8722 KOps/s | |
| test_values_nested_locked | 0.1770ms | 49.7373μs | 20.1056 KOps/s | 19.8393 KOps/s | |
| test_values_nested_leaf | 87.9840μs | 45.5609μs | 21.9486 KOps/s | 22.0497 KOps/s | |
| test_values_stack_nested | 99.3850μs | 50.9628μs | 19.6221 KOps/s | 19.3357 KOps/s | |
| test_values_stack_nested_leaf | 0.1063ms | 45.5034μs | 21.9764 KOps/s | 20.6632 KOps/s | |
| test_values_stack_nested_locked | 0.1043ms | 50.5534μs | 19.7811 KOps/s | 19.1629 KOps/s | |
| test_membership | 3.2135μs | 0.7112μs | 1.4061 MOps/s | 1.1081 MOps/s | |
| test_membership_nested | 48.0900μs | 2.7289μs | 366.4462 KOps/s | 377.2467 KOps/s | |
| test_membership_nested_leaf | 36.2280μs | 2.7952μs | 357.7532 KOps/s | 372.4986 KOps/s | |
| test_membership_stacked_nested | 51.7770μs | 2.7433μs | 364.5292 KOps/s | 373.3364 KOps/s | |
| test_membership_stacked_nested_leaf | 31.1690μs | 2.7492μs | 363.7399 KOps/s | 371.4185 KOps/s | |
| test_membership_nested_last | 56.5660μs | 4.0768μs | 245.2934 KOps/s | 253.0430 KOps/s | |
| test_membership_nested_leaf_last | 44.4340μs | 4.0943μs | 244.2418 KOps/s | 255.0172 KOps/s | |
| test_membership_stacked_nested_last | 27.9120μs | 5.1562μs | 193.9407 KOps/s | 78.6652 KOps/s | |
| test_membership_stacked_nested_leaf_last | 36.8390μs | 5.1627μs | 193.6968 KOps/s | 77.9643 KOps/s | |
| test_nested_getleaf | 63.0680μs | 10.8609μs | 92.0734 KOps/s | 90.6370 KOps/s | |
| test_nested_get | 43.9620μs | 10.3972μs | 96.1799 KOps/s | 96.6965 KOps/s | |
| test_stacked_getleaf | 65.0620μs | 10.4712μs | 95.5003 KOps/s | 91.9871 KOps/s | |
| test_stacked_get | 55.7750μs | 10.2647μs | 97.4212 KOps/s | 100.5363 KOps/s | |
| test_nested_getitemleaf | 40.6760μs | 11.3255μs | 88.2961 KOps/s | 87.5949 KOps/s | |
| test_nested_getitem | 59.2310μs | 10.4746μs | 95.4688 KOps/s | 96.4757 KOps/s | |
| test_stacked_getitemleaf | 61.7860μs | 11.2892μs | 88.5802 KOps/s | 87.7158 KOps/s | |
| test_stacked_getitem | 55.9450μs | 10.4089μs | 96.0718 KOps/s | 95.8771 KOps/s | |
| test_lock_nested | 2.1975ms | 0.5217ms | 1.9169 KOps/s | 1.6271 KOps/s | |
| test_lock_stack_nested | 1.0682ms | 0.4870ms | 2.0533 KOps/s | 2.1447 KOps/s | |
| test_unlock_nested | 0.9227ms | 0.4339ms | 2.3047 KOps/s | 1.9242 KOps/s | |
| test_unlock_stack_nested | 0.5992ms | 0.3997ms | 2.5019 KOps/s | 2.6452 KOps/s | |
| test_flatten_speed | 0.2355ms | 0.1060ms | 9.4320 KOps/s | 9.1870 KOps/s | |
| test_unflatten_speed | 0.9553ms | 0.4445ms | 2.2496 KOps/s | 2.2782 KOps/s | |
| test_common_ops | 5.1704ms | 1.1786ms | 848.4909 Ops/s | 906.1683 Ops/s | |
| test_creation | 49.9240μs | 2.4577μs | 406.8869 KOps/s | 394.0119 KOps/s | |
| test_creation_empty | 51.5360μs | 19.7085μs | 50.7396 KOps/s | 56.0719 KOps/s | |
| test_creation_nested_1 | 59.5010μs | 23.3456μs | 42.8346 KOps/s | 46.3229 KOps/s | |
| test_creation_nested_2 | 73.1370μs | 27.1621μs | 36.8160 KOps/s | 39.6547 KOps/s | |
| test_clone | 0.1198ms | 17.4142μs | 57.4245 KOps/s | 56.8936 KOps/s | |
| test_getitem[int] | 1.0193ms | 13.0495μs | 76.6314 KOps/s | 79.5205 KOps/s | |
| test_getitem[slice_int] | 0.1368ms | 32.4720μs | 30.7958 KOps/s | 30.7712 KOps/s | |
| test_getitem[range] | 0.2295ms | 58.4195μs | 17.1176 KOps/s | 17.4586 KOps/s | |
| test_getitem[tuple] | 0.1267ms | 26.7037μs | 37.4479 KOps/s | 38.5783 KOps/s | |
| test_getitem[list] | 0.3692ms | 53.4265μs | 18.7173 KOps/s | 19.3139 KOps/s | |
| test_setitem_dim[int] | 78.2970μs | 35.0458μs | 28.5341 KOps/s | 32.0081 KOps/s | |
| test_setitem_dim[slice_int] | 0.1502ms | 74.2317μs | 13.4713 KOps/s | 14.5400 KOps/s | |
| test_setitem_dim[range] | 0.1749ms | 94.6450μs | 10.5658 KOps/s | 11.1054 KOps/s | |
| test_setitem_dim[tuple] | 0.1107ms | 61.1951μs | 16.3412 KOps/s | 17.8738 KOps/s | |
| test_setitem | 0.1572ms | 31.3353μs | 31.9129 KOps/s | 34.0949 KOps/s | |
| test_set | 0.1920ms | 30.1939μs | 33.1192 KOps/s | 34.9456 KOps/s | |
| test_set_shared | 1.4086ms | 0.2160ms | 4.6300 KOps/s | 4.5868 KOps/s | |
| test_update | 0.1601ms | 37.5484μs | 26.6323 KOps/s | 28.3464 KOps/s | |
| test_update_nested | 0.1579ms | 47.8228μs | 20.9105 KOps/s | 21.8993 KOps/s | |
| test_update__nested | 0.1804ms | 35.1588μs | 28.4424 KOps/s | 29.0287 KOps/s | |
| test_set_nested | 0.1489ms | 32.9493μs | 30.3496 KOps/s | 32.3335 KOps/s | |
| test_set_nested_new | 0.1438ms | 38.8985μs | 25.7079 KOps/s | 27.4840 KOps/s | |
| test_select | 1.1193ms | 57.4133μs | 17.4176 KOps/s | 18.5922 KOps/s | |
| test_select_nested | 0.1221ms | 60.0285μs | 16.6588 KOps/s | 16.4910 KOps/s | |
| test_exclude_nested | 0.1476ms | 80.5500μs | 12.4146 KOps/s | 12.4628 KOps/s | |
| test_empty[True] | 0.9470ms | 0.3435ms | 2.9110 KOps/s | 2.9176 KOps/s | |
| test_empty[False] | 10.9930μs | 1.2343μs | 810.1875 KOps/s | 808.8310 KOps/s | |
| test_unbind_speed | 0.4811ms | 0.3256ms | 3.0716 KOps/s | 3.0591 KOps/s | |
| test_unbind_speed_stack0 | 0.7317ms | 0.3222ms | 3.1037 KOps/s | 3.2264 KOps/s | |
| test_unbind_speed_stack1 | 82.4024ms | 0.8295ms | 1.2056 KOps/s | 1.3603 KOps/s | |
| test_split | 84.6888ms | 2.2808ms | 438.4507 Ops/s | 438.1662 Ops/s | |
| test_chunk | 88.4839ms | 2.2955ms | 435.6347 Ops/s | 436.1372 Ops/s | |
| test_creation[device0] | 0.3047ms | 0.1224ms | 8.1667 KOps/s | 8.4004 KOps/s | |
| test_creation_from_tensor | 4.1154ms | 0.1239ms | 8.0741 KOps/s | 8.2803 KOps/s | |
| test_add_one[memmap_tensor0] | 0.2328ms | 8.1489μs | 122.7153 KOps/s | 129.9311 KOps/s | |
| test_contiguous[memmap_tensor0] | 30.2470μs | 2.2302μs | 448.3908 KOps/s | 458.2103 KOps/s | |
| test_stack[memmap_tensor0] | 80.6110μs | 6.0135μs | 166.2922 KOps/s | 169.2746 KOps/s | |
| test_memmaptd_index | 1.0794ms | 0.4529ms | 2.2082 KOps/s | 2.2854 KOps/s | |
| test_memmaptd_index_astensor | 1.3037ms | 0.5430ms | 1.8417 KOps/s | 1.9308 KOps/s | |
| test_memmaptd_index_op | 1.5440ms | 1.1093ms | 901.4740 Ops/s | 959.0837 Ops/s | |
| test_serialize_model | 0.2171s | 0.1426s | 7.0118 Ops/s | 7.1754 Ops/s | |
| test_serialize_model_pickle | 0.4469s | 0.3965s | 2.5220 Ops/s | 2.5297 Ops/s | |
| test_serialize_weights | 0.1309s | 0.1269s | 7.8820 Ops/s | 8.1250 Ops/s | |
| test_serialize_weights_returnearly | 0.1853s | 0.1669s | 5.9917 Ops/s | 5.6774 Ops/s | |
| test_serialize_weights_pickle | 0.6087s | 0.4566s | 2.1900 Ops/s | 2.4105 Ops/s | |
| test_serialize_weights_filesystem | 0.1481s | 0.1437s | 6.9583 Ops/s | 7.1232 Ops/s | |
| test_serialize_model_filesystem | 0.1713s | 0.1495s | 6.6909 Ops/s | 6.6329 Ops/s | |
| test_reshape_pytree | 0.1021ms | 39.4145μs | 25.3714 KOps/s | 26.1432 KOps/s | |
| test_reshape_td | 0.1076ms | 50.7443μs | 19.7066 KOps/s | 20.0686 KOps/s | |
| test_view_pytree | 90.2490μs | 39.1116μs | 25.5679 KOps/s | 26.3620 KOps/s | |
| test_view_td | 0.1496ms | 56.2542μs | 17.7765 KOps/s | 18.1518 KOps/s | |
| test_unbind_pytree | 95.4890μs | 36.2241μs | 27.6059 KOps/s | 27.4883 KOps/s | |
| test_unbind_td | 0.3484ms | 48.6959μs | 20.5356 KOps/s | 20.5432 KOps/s | |
| test_split_pytree | 96.5000μs | 38.1404μs | 26.2189 KOps/s | 25.8132 KOps/s | |
| test_split_td | 0.5239ms | 60.2857μs | 16.5877 KOps/s | 16.2975 KOps/s | |
| test_add_pytree | 0.1377ms | 44.9285μs | 22.2576 KOps/s | 23.3308 KOps/s | |
| test_add_td | 0.1716ms | 87.5172μs | 11.4263 KOps/s | 12.3397 KOps/s | |
| test_distributed | 0.2913ms | 0.1302ms | 7.6819 KOps/s | 7.6349 KOps/s | |
| test_tdmodule | 36.3580μs | 17.5388μs | 57.0164 KOps/s | 61.4635 KOps/s | |
| test_tdmodule_dispatch | 62.2460μs | 36.7649μs | 27.1999 KOps/s | 29.2958 KOps/s | |
| test_tdseq | 44.9240μs | 19.4919μs | 51.3035 KOps/s | 54.8414 KOps/s | |
| test_tdseq_dispatch | 64.4610μs | 41.8155μs | 23.9146 KOps/s | 26.1242 KOps/s | |
| test_instantiation_functorch | 2.5133ms | 1.6209ms | 616.9437 Ops/s | 631.5782 Ops/s | |
| test_instantiation_td | 1.8071ms | 1.1629ms | 859.9118 Ops/s | 781.5795 Ops/s | |
| test_exec_functorch | 0.2921ms | 0.1879ms | 5.3232 KOps/s | 5.4640 KOps/s | |
| test_exec_functional_call | 0.5209ms | 0.1801ms | 5.5533 KOps/s | 5.6595 KOps/s | |
| test_exec_td | 0.3162ms | 0.1798ms | 5.5633 KOps/s | 5.8318 KOps/s | |
| test_exec_td_decorator | 0.6039ms | 0.2594ms | 3.8546 KOps/s | 3.9272 KOps/s | |
| test_vmap_mlp_speed[True-True] | 0.9316ms | 0.6301ms | 1.5871 KOps/s | 1.6939 KOps/s | |
| test_vmap_mlp_speed[True-False] | 0.8243ms | 0.6112ms | 1.6361 KOps/s | 1.7053 KOps/s | |
| test_vmap_mlp_speed[False-True] | 0.8420ms | 0.5050ms | 1.9801 KOps/s | 2.0541 KOps/s | |
| test_vmap_mlp_speed[False-False] | 0.7360ms | 0.5069ms | 1.9729 KOps/s | 2.0591 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 1.3213ms | 0.7118ms | 1.4049 KOps/s | 1.4569 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 1.1727ms | 0.7084ms | 1.4116 KOps/s | 1.4616 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.8884ms | 0.5908ms | 1.6925 KOps/s | 1.7600 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 1.0691ms | 0.5929ms | 1.6866 KOps/s | 1.7624 KOps/s | |
| test_to_module_speed[True] | 89.6009ms | 2.0605ms | 485.3098 Ops/s | 550.3214 Ops/s | |
| test_to_module_speed[False] | 2.4419ms | 1.7757ms | 563.1674 Ops/s | 556.3118 Ops/s | |
| test_tc_init | 98.2330μs | 45.6403μs | 21.9104 KOps/s | 23.8864 KOps/s | |
| test_tc_init_nested | 0.1584ms | 93.3428μs | 10.7132 KOps/s | 11.4032 KOps/s | |
| test_tc_first_layer_tensor | 41.8480μs | 9.0932μs | 109.9719 KOps/s | 106.7136 KOps/s | |
| test_tc_first_layer_nontensor | 38.0810μs | 9.0644μs | 110.3217 KOps/s | 107.0521 KOps/s | |
| test_tc_second_layer_tensor | 28.9340μs | 2.8170μs | 354.9869 KOps/s | 337.3531 KOps/s | |
| test_tc_second_layer_nontensor | 57.9580μs | 10.1290μs | 98.7267 KOps/s | 96.1739 KOps/s | |
| test_unbind | 99.7243ms | 13.1491ms | 76.0509 Ops/s | 76.2895 Ops/s | |
| test_full_like | 9.9851ms | 7.7908ms | 128.3560 Ops/s | 146.9490 Ops/s | |
| test_zeros_like | 14.7553ms | 7.3622ms | 135.8290 Ops/s | 161.8183 Ops/s | |
| test_ones_like | 14.1131ms | 7.7282ms | 129.3957 Ops/s | 133.8703 Ops/s | |
| test_clone | 18.4513ms | 9.0467ms | 110.5378 Ops/s | 116.1391 Ops/s | |
| test_squeeze | 71.1130μs | 14.4248μs | 69.3249 KOps/s | 70.3237 KOps/s | |
| test_unsqueeze | 0.2836ms | 98.2902μs | 10.1740 KOps/s | 10.1829 KOps/s | |
| test_split | 0.3801ms | 0.2069ms | 4.8337 KOps/s | 4.8461 KOps/s | |
| test_permute | 0.3698ms | 0.2267ms | 4.4120 KOps/s | 4.3654 KOps/s | |
| test_stack | 29.5012ms | 24.8108ms | 40.3051 Ops/s | 42.5380 Ops/s | |
| test_cat | 30.4265ms | 24.8243ms | 40.2831 Ops/s | 42.6001 Ops/s |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Benchmarks
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.