[Refactor] composite_lp_aggregate to handle log-probs aggregates globally
#1181
Merged
| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 47.6490μs | 19.1288μs | 52.2773 KOps/s | 50.9612 KOps/s | |
| test_plain_set_stack_nested | 50.6140μs | 19.4405μs | 51.4391 KOps/s | 50.1858 KOps/s | |
| test_plain_set_nested_inplace | 72.2750μs | 21.0282μs | 47.5551 KOps/s | 46.3551 KOps/s | |
| test_plain_set_stack_nested_inplace | 62.2160μs | 21.0214μs | 47.5706 KOps/s | 46.5642 KOps/s | |
| test_items | 39.4940μs | 4.1608μs | 240.3359 KOps/s | 231.1232 KOps/s | |
| test_items_nested | 0.4813ms | 0.3958ms | 2.5263 KOps/s | 2.4399 KOps/s | |
| test_items_nested_locked | 0.7288ms | 0.3963ms | 2.5233 KOps/s | 2.4552 KOps/s | |
| test_items_nested_leaf | 0.1209ms | 77.6177μs | 12.8837 KOps/s | 12.6881 KOps/s | |
| test_items_stack_nested | 0.5459ms | 0.3998ms | 2.5014 KOps/s | 2.4422 KOps/s | |
| test_items_stack_nested_leaf | 0.1518ms | 78.9272μs | 12.6699 KOps/s | 12.2968 KOps/s | |
| test_items_stack_nested_locked | 0.6043ms | 0.3990ms | 2.5064 KOps/s | 2.4353 KOps/s | |
| test_keys | 28.3730μs | 3.4943μs | 286.1811 KOps/s | 287.2819 KOps/s | |
| test_keys_nested | 0.2928ms | 0.1650ms | 6.0602 KOps/s | 5.9116 KOps/s | |
| test_keys_nested_locked | 0.6882ms | 0.1696ms | 5.8965 KOps/s | 5.6655 KOps/s | |
| test_keys_nested_leaf | 0.2618ms | 0.1431ms | 6.9906 KOps/s | 6.7656 KOps/s | |
| test_keys_stack_nested | 0.2618ms | 0.1658ms | 6.0307 KOps/s | 6.0366 KOps/s | |
| test_keys_stack_nested_leaf | 0.2190ms | 0.1431ms | 6.9857 KOps/s | 7.0435 KOps/s | |
| test_keys_stack_nested_locked | 0.2737ms | 0.1709ms | 5.8503 KOps/s | 5.8339 KOps/s | |
| test_values | 9.9846μs | 1.0267μs | 974.0270 KOps/s | 950.8590 KOps/s | |
| test_values_nested | 0.1225ms | 62.4896μs | 16.0027 KOps/s | 15.8927 KOps/s | |
| test_values_nested_locked | 0.1135ms | 62.4497μs | 16.0129 KOps/s | 15.7265 KOps/s | |
| test_values_nested_leaf | 0.1239ms | 71.1562μs | 14.0536 KOps/s | 13.7923 KOps/s | |
| test_values_stack_nested | 0.1184ms | 62.6220μs | 15.9688 KOps/s | 15.3605 KOps/s | |
| test_values_stack_nested_leaf | 0.1250ms | 71.7296μs | 13.9412 KOps/s | 14.0569 KOps/s | |
| test_values_stack_nested_locked | 0.1157ms | 63.0234μs | 15.8671 KOps/s | 15.3180 KOps/s | |
| test_membership | 5.2881μs | 0.6915μs | 1.4461 MOps/s | 1.1304 MOps/s | |
| test_membership_nested | 24.1050μs | 2.9025μs | 344.5357 KOps/s | 340.9579 KOps/s | |
| test_membership_nested_leaf | 43.4110μs | 2.9194μs | 342.5396 KOps/s | 342.3902 KOps/s | |
| test_membership_stacked_nested | 16.5110μs | 2.8670μs | 348.8025 KOps/s | 339.4935 KOps/s | |
| test_membership_stacked_nested_leaf | 39.6540μs | 2.8821μs | 346.9653 KOps/s | 340.6511 KOps/s | |
| test_membership_nested_last | 26.3990μs | 4.3044μs | 232.3202 KOps/s | 226.6151 KOps/s | |
| test_membership_nested_leaf_last | 48.6010μs | 4.3479μs | 229.9974 KOps/s | 225.6689 KOps/s | |
| test_membership_stacked_nested_last | 42.2480μs | 4.2741μs | 233.9692 KOps/s | 73.8938 KOps/s | |
| test_membership_stacked_nested_leaf_last | 23.8750μs | 4.3086μs | 232.0963 KOps/s | 74.5746 KOps/s | |
| test_nested_getleaf | 57.4670μs | 10.5724μs | 94.5856 KOps/s | 92.2604 KOps/s | |
| test_nested_get | 54.4820μs | 10.0525μs | 99.4779 KOps/s | 97.0814 KOps/s | |
| test_stacked_getleaf | 37.3100μs | 10.5969μs | 94.3674 KOps/s | 93.4605 KOps/s | |
| test_stacked_get | 46.1570μs | 10.0403μs | 99.5984 KOps/s | 98.2228 KOps/s | |
| test_nested_getitemleaf | 54.9830μs | 11.2386μs | 88.9791 KOps/s | 87.8680 KOps/s | |
| test_nested_getitem | 46.6170μs | 10.6394μs | 93.9905 KOps/s | 92.3678 KOps/s | |
| test_stacked_getitemleaf | 56.4360μs | 11.2025μs | 89.2661 KOps/s | 88.2837 KOps/s | |
| test_stacked_getitem | 38.4020μs | 10.6039μs | 94.3052 KOps/s | 92.0940 KOps/s | |
| test_lock_nested | 1.1256ms | 0.4571ms | 2.1875 KOps/s | 1.7835 KOps/s | |
| test_lock_stack_nested | 0.5093ms | 0.4237ms | 2.3603 KOps/s | 2.3935 KOps/s | |
| test_unlock_nested | 0.9275ms | 0.3756ms | 2.6625 KOps/s | 2.6387 KOps/s | |
| test_unlock_stack_nested | 0.5263ms | 0.3444ms | 2.9033 KOps/s | 3.0112 KOps/s | |
| test_flatten_speed | 0.1866ms | 0.1011ms | 9.8908 KOps/s | 9.7920 KOps/s | |
| test_unflatten_speed | 0.7095ms | 0.5126ms | 1.9509 KOps/s | 1.9054 KOps/s | |
| test_common_ops | 1.7894ms | 0.7483ms | 1.3364 KOps/s | 1.2984 KOps/s | |
| test_creation | 45.2950μs | 2.4877μs | 401.9721 KOps/s | 376.0471 KOps/s | |
| test_creation_empty | 32.8920μs | 9.5150μs | 105.0974 KOps/s | 103.1581 KOps/s | |
| test_creation_nested_1 | 38.3410μs | 12.4624μs | 80.2413 KOps/s | 80.1200 KOps/s | |
| test_creation_nested_2 | 37.7710μs | 17.0233μs | 58.7431 KOps/s | 57.5631 KOps/s | |
| test_clone | 74.5700μs | 13.4062μs | 74.5924 KOps/s | 73.3338 KOps/s | |
| test_getitem[int] | 1.2554ms | 12.9086μs | 77.4679 KOps/s | 76.8288 KOps/s | |
| test_getitem[slice_int] | 0.1387ms | 24.3230μs | 41.1133 KOps/s | 40.1741 KOps/s | |
| test_getitem[range] | 0.1724ms | 48.8903μs | 20.4540 KOps/s | 20.7079 KOps/s | |
| test_getitem[tuple] | 0.1335ms | 19.9061μs | 50.2358 KOps/s | 47.6578 KOps/s | |
| test_getitem[list] | 0.1960ms | 43.2736μs | 23.1088 KOps/s | 22.7659 KOps/s | |
| test_setitem_dim[int] | 57.5170μs | 26.6233μs | 37.5611 KOps/s | 37.0612 KOps/s | |
| test_setitem_dim[slice_int] | 77.7250μs | 52.0196μs | 19.2235 KOps/s | 18.9459 KOps/s | |
| test_setitem_dim[range] | 0.1231ms | 75.4533μs | 13.2532 KOps/s | 13.2333 KOps/s | |
| test_setitem_dim[tuple] | 71.2730μs | 40.6808μs | 24.5816 KOps/s | 23.5941 KOps/s | |
| test_setitem | 0.1217ms | 19.2944μs | 51.8285 KOps/s | 51.0088 KOps/s | |
| test_set | 86.3020μs | 18.2843μs | 54.6916 KOps/s | 53.7685 KOps/s | |
| test_set_shared | 3.6403ms | 0.1728ms | 5.7883 KOps/s | 5.8163 KOps/s | |
| test_update | 0.1489ms | 20.1705μs | 49.5773 KOps/s | 48.6470 KOps/s | |
| test_update_nested | 99.6770μs | 30.1542μs | 33.1629 KOps/s | 32.0276 KOps/s | |
| test_update__nested | 0.6457ms | 33.4576μs | 29.8885 KOps/s | 28.5921 KOps/s | |
| test_set_nested | 89.0560μs | 20.5440μs | 48.6761 KOps/s | 47.0010 KOps/s | |
| test_set_nested_new | 78.0660μs | 25.1113μs | 39.8227 KOps/s | 38.4947 KOps/s | |
| test_select | 0.1525ms | 41.1227μs | 24.3175 KOps/s | 23.6860 KOps/s | |
| test_select_nested | 0.1199ms | 62.8898μs | 15.9008 KOps/s | 15.2421 KOps/s | |
| test_exclude_nested | 0.1781ms | 81.3378μs | 12.2944 KOps/s | 11.8701 KOps/s | |
| test_empty[True] | 0.6171ms | 0.4059ms | 2.4634 KOps/s | 2.3691 KOps/s | |
| test_empty[False] | 9.6980μs | 1.3357μs | 748.6649 KOps/s | 720.4520 KOps/s | |
| test_unbind_speed | 1.0582ms | 0.2793ms | 3.5809 KOps/s | 3.6639 KOps/s | |
| test_unbind_speed_stack0 | 0.4076ms | 0.2694ms | 3.7125 KOps/s | 3.8906 KOps/s | |
| test_unbind_speed_stack1 | 0.1055s | 0.7937ms | 1.2599 KOps/s | 1.4065 KOps/s | |
| test_split | 1.7572ms | 1.5753ms | 634.7836 Ops/s | 559.5763 Ops/s | |
| test_chunk | 0.1040s | 1.8954ms | 527.5966 Ops/s | 560.5367 Ops/s | |
| test_consolidate_njt[False-None] | 8.7183ms | 8.2635ms | 121.0135 Ops/s | 120.4005 Ops/s | |
| test_creation[device0] | 0.4692ms | 91.5584μs | 10.9220 KOps/s | 11.0487 KOps/s | |
| test_creation_from_tensor | 3.4285ms | 94.7100μs | 10.5585 KOps/s | 10.5379 KOps/s | |
| test_add_one[memmap_tensor0] | 0.1386ms | 4.9219μs | 203.1727 KOps/s | 202.9119 KOps/s | |
| test_contiguous[memmap_tensor0] | 10.8910μs | 0.5101μs | 1.9605 MOps/s | 1.9437 MOps/s | |
| test_stack[memmap_tensor0] | 39.9150μs | 3.4439μs | 290.3700 KOps/s | 280.6249 KOps/s | |
| test_memmaptd_index | 0.8969ms | 0.2332ms | 4.2883 KOps/s | 4.2834 KOps/s | |
| test_memmaptd_index_astensor | 0.5782ms | 0.3210ms | 3.1157 KOps/s | 3.1266 KOps/s | |
| test_memmaptd_index_op | 1.1240ms | 0.5352ms | 1.8685 KOps/s | 1.8270 KOps/s | |
| test_serialize_model | 0.1309s | 0.1183s | 8.4560 Ops/s | 8.5155 Ops/s | |
| test_serialize_model_pickle | 0.4495s | 0.4005s | 2.4969 Ops/s | 2.5922 Ops/s | |
| test_serialize_weights | 0.1221s | 0.1140s | 8.7687 Ops/s | 8.7251 Ops/s | |
| test_serialize_weights_returnearly | 0.2541s | 0.1750s | 5.7138 Ops/s | 6.2655 Ops/s | |
| test_serialize_weights_pickle | 0.6038s | 0.4360s | 2.2936 Ops/s | 2.4338 Ops/s | |
| test_serialize_weights_filesystem | 0.1498s | 0.1399s | 7.1483 Ops/s | 7.0174 Ops/s | |
| test_serialize_model_filesystem | 0.1620s | 0.1444s | 6.9258 Ops/s | 6.2658 Ops/s | |
| test_reshape_pytree | 64.8910μs | 26.1525μs | 38.2372 KOps/s | 37.2711 KOps/s | |
| test_reshape_td | 0.1133ms | 33.5616μs | 29.7960 KOps/s | 30.4184 KOps/s | |
| test_view_pytree | 58.9500μs | 26.0084μs | 38.4491 KOps/s | 37.2420 KOps/s | |
| test_view_td | 79.1170μs | 38.7501μs | 25.8064 KOps/s | 25.3651 KOps/s | |
| test_unbind_pytree | 69.7810μs | 29.2032μs | 34.2428 KOps/s | 33.5226 KOps/s | |
| test_unbind_td | 0.2845ms | 39.7299μs | 25.1700 KOps/s | 24.9380 KOps/s | |
| test_split_pytree | 71.8950μs | 28.9900μs | 34.4947 KOps/s | 32.8617 KOps/s | |
| test_split_td | 0.5508ms | 44.3035μs | 22.5716 KOps/s | 21.9001 KOps/s | |
| test_add_pytree | 77.5950μs | 34.8440μs | 28.6993 KOps/s | 27.8110 KOps/s | |
| test_add_td | 0.1162ms | 49.2040μs | 20.3236 KOps/s | 19.2069 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.1474ms | 62.3955μs | 16.0268 KOps/s | 15.3222 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.3970ms | 0.1751ms | 5.7105 KOps/s | 5.6996 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1434ms | 46.0805μs | 21.7011 KOps/s | 21.1912 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 0.2623ms | 0.1196ms | 8.3620 KOps/s | 8.2963 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 0.1123ms | 27.1519μs | 36.8298 KOps/s | 35.7910 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 0.1210ms | 58.7892μs | 17.0099 KOps/s | 16.8959 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 0.1504ms | 79.4527μs | 12.5861 KOps/s | 12.4355 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 0.1403ms | 67.7769μs | 14.7543 KOps/s | 14.5783 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.1833ms | 0.1048ms | 9.5459 KOps/s | 9.3733 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.4750ms | 0.2139ms | 4.6748 KOps/s | 4.6081 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.1361ms | 45.2505μs | 22.0992 KOps/s | 21.6125 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.5069ms | 67.5564μs | 14.8025 KOps/s | 14.8204 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.2002ms | 0.1023ms | 9.7715 KOps/s | 9.6031 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.3221ms | 0.2016ms | 4.9602 KOps/s | 5.0305 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.4692ms | 0.2334ms | 4.2838 KOps/s | 4.2502 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.1910ms | 0.1050ms | 9.5280 KOps/s | 9.4442 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1627ms | 63.5928μs | 15.7250 KOps/s | 15.5060 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.1076ms | 47.9943μs | 20.8358 KOps/s | 20.8957 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.3140ms | 0.1617ms | 6.1841 KOps/s | 6.3367 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.2279ms | 0.1032ms | 9.6925 KOps/s | 9.7482 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 68.3280μs | 21.8203μs | 45.8290 KOps/s | 41.7296 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 0.1484ms | 67.2744μs | 14.8645 KOps/s | 15.0009 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 0.1972ms | 79.7585μs | 12.5378 KOps/s | 12.5338 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.1574ms | 68.0473μs | 14.6957 KOps/s | 14.7513 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 0.3417ms | 0.2091ms | 4.7831 KOps/s | 4.8609 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 2.4661ms | 1.3445ms | 743.7948 Ops/s | 759.1004 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 0.3110ms | 0.2075ms | 4.8188 KOps/s | 4.9939 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 1.3878ms | 0.7796ms | 1.2828 KOps/s | 1.2755 KOps/s | |
| test_compile_assign_and_add_stack[compile] | 0.5299ms | 0.4503ms | 2.2209 KOps/s | 2.2230 KOps/s | |
| test_compile_assign_and_add_stack[eager] | 3.6112ms | 2.5841ms | 386.9759 Ops/s | 391.7029 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 90.9100μs | 36.0378μs | 27.7487 KOps/s | 27.0949 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.5484ms | 33.3181μs | 30.0138 KOps/s | 29.4987 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.1011ms | 29.0904μs | 34.3757 KOps/s | 32.0626 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 69.7910μs | 23.4624μs | 42.6213 KOps/s | 42.1796 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 99.3960μs | 30.3871μs | 32.9087 KOps/s | 32.1964 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 91.5920μs | 23.7806μs | 42.0510 KOps/s | 42.1952 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.1052ms | 52.1408μs | 19.1788 KOps/s | 19.2238 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.5660ms | 20.2335μs | 49.4230 KOps/s | 48.9278 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1134ms | 44.6351μs | 22.4039 KOps/s | 21.7814 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 83.4850μs | 18.6986μs | 53.4799 KOps/s | 52.9760 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 0.1049ms | 45.7825μs | 21.8424 KOps/s | 21.6798 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 72.6360μs | 18.6654μs | 53.5751 KOps/s | 52.9815 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 0.1097ms | 53.3576μs | 18.7415 KOps/s | 18.5446 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.9487ms | 19.8450μs | 50.3905 KOps/s | 49.5054 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 0.1556ms | 45.2566μs | 22.0962 KOps/s | 21.4267 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 78.2360μs | 18.4711μs | 54.1386 KOps/s | 52.9385 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 0.1253ms | 44.8537μs | 22.2947 KOps/s | 21.3760 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 0.8249ms | 18.4554μs | 54.1848 KOps/s | 53.3213 KOps/s | |
| test_mod_add[eager] | 99.3060μs | 33.5517μs | 29.8047 KOps/s | 29.8641 KOps/s | |
| test_mod_add[compile] | 0.1234ms | 48.5003μs | 20.6184 KOps/s | 20.6306 KOps/s | |
| test_mod_add[compile-overhead] | 0.1203ms | 49.2114μs | 20.3205 KOps/s | 20.7275 KOps/s | |
| test_mod_wrap[eager] | 0.3500ms | 0.2207ms | 4.5311 KOps/s | 4.4463 KOps/s | |
| test_mod_wrap[compile] | 0.3997ms | 0.2111ms | 4.7381 KOps/s | 4.5772 KOps/s | |
| test_mod_wrap[compile-overhead] | 0.3817ms | 0.2070ms | 4.8315 KOps/s | 4.7781 KOps/s | |
| test_mod_wrap_and_backward[eager] | 20.7550ms | 12.6583ms | 78.9995 Ops/s | 84.2877 Ops/s | |
| test_mod_wrap_and_backward[compile] | 17.8635ms | 12.6071ms | 79.3203 Ops/s | 87.4856 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 16.8940ms | 13.3364ms | 74.9827 Ops/s | 73.4562 Ops/s | |
| test_seq_add[eager] | 0.2425ms | 0.1105ms | 9.0523 KOps/s | 8.7521 KOps/s | |
| test_seq_add[compile] | 0.1410ms | 63.6004μs | 15.7232 KOps/s | 15.5645 KOps/s | |
| test_seq_add[compile-overhead] | 0.1279ms | 60.6360μs | 16.4919 KOps/s | 16.0288 KOps/s | |
| test_seq_wrap[eager] | 0.5690ms | 0.4356ms | 2.2957 KOps/s | 2.2328 KOps/s | |
| test_seq_wrap[compile] | 0.4583ms | 0.2299ms | 4.3496 KOps/s | 4.2251 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.4277ms | 0.2270ms | 4.4062 KOps/s | 4.2615 KOps/s | |
| test_func_call_runtime[False-eager] | 0.8923ms | 0.5482ms | 1.8241 KOps/s | 1.7428 KOps/s | |
| test_func_call_runtime[False-compile] | 0.5259ms | 0.4222ms | 2.3687 KOps/s | 2.3252 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.8112ms | 0.4288ms | 2.3320 KOps/s | 2.3469 KOps/s | |
| test_func_call_runtime[True-eager] | 1.0889ms | 0.7721ms | 1.2952 KOps/s | 1.2662 KOps/s | |
| test_func_call_runtime[True-compile] | 0.6059ms | 0.4583ms | 2.1821 KOps/s | 2.1316 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.6477ms | 0.4583ms | 2.1822 KOps/s | 2.1326 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 1.2497ms | 0.5431ms | 1.8413 KOps/s | 1.7516 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 0.5414ms | 0.4204ms | 2.3786 KOps/s | 2.3409 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.5443ms | 0.4209ms | 2.3758 KOps/s | 2.3275 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.0508ms | 0.9152ms | 1.0927 KOps/s | 1.0742 KOps/s | |
| test_func_call_cm_runtime[True-compile] | 0.5865ms | 0.4867ms | 2.0546 KOps/s | 1.9897 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.6536ms | 0.4843ms | 2.0650 KOps/s | 1.9971 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.4372ms | 1.9095ms | 523.6965 Ops/s | 508.5175 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 0.9127ms | 0.5228ms | 1.9129 KOps/s | 1.9040 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.8897ms | 0.5294ms | 1.8889 KOps/s | 1.9032 KOps/s | |
| test_distributed | 0.2703ms | 0.1247ms | 8.0221 KOps/s | 7.7828 KOps/s | |
| test_tdmodule | 69.3200μs | 24.9635μs | 40.0585 KOps/s | 39.2775 KOps/s | |
| test_tdmodule_dispatch | 77.9350μs | 46.1653μs | 21.6613 KOps/s | 21.4891 KOps/s | |
| test_tdseq | 45.1050μs | 27.1566μs | 36.8235 KOps/s | 35.1758 KOps/s | |
| test_tdseq_dispatch | 76.8740μs | 49.8152μs | 20.0742 KOps/s | 18.8677 KOps/s | |
| test_instantiation_functorch | 2.6785ms | 1.5525ms | 644.1021 Ops/s | 644.1201 Ops/s | |
| test_exec_functorch | 0.3061ms | 0.1820ms | 5.4939 KOps/s | 5.4136 KOps/s | |
| test_exec_functional_call | 0.3117ms | 0.1701ms | 5.8781 KOps/s | 5.5181 KOps/s | |
| test_exec_td_decorator | 0.4731ms | 0.2299ms | 4.3505 KOps/s | 4.2465 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 0.9658ms | 0.6533ms | 1.5307 KOps/s | 1.5454 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 1.0666ms | 0.6541ms | 1.5288 KOps/s | 1.5341 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.8438ms | 0.5383ms | 1.8576 KOps/s | 1.8715 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.9680ms | 0.5359ms | 1.8660 KOps/s | 1.8709 KOps/s | |
| test_to_module_speed[True] | 1.6913ms | 1.3340ms | 749.6008 Ops/s | 729.9320 Ops/s | |
| test_to_module_speed[False] | 1.8258ms | 1.2992ms | 769.6971 Ops/s | 754.4421 Ops/s | |
| test_tc_init | 96.4900μs | 45.8129μs | 21.8279 KOps/s | 22.4815 KOps/s | |
| test_tc_init_nested | 0.1501ms | 91.6613μs | 10.9097 KOps/s | 11.2932 KOps/s | |
| test_tc_first_layer_tensor | 38.5010μs | 1.5800μs | 632.9303 KOps/s | 630.9741 KOps/s | |
| test_tc_first_layer_nontensor | 27.0910μs | 5.1402μs | 194.5434 KOps/s | 212.4574 KOps/s | |
| test_tc_second_layer_tensor | 24.3560μs | 2.9292μs | 341.3922 KOps/s | 347.6049 KOps/s | |
| test_tc_second_layer_nontensor | 46.3770μs | 6.2856μs | 159.0948 KOps/s | 166.5600 KOps/s | |
| test_unbind | 0.2363s | 13.4628ms | 74.2787 Ops/s | 77.5577 Ops/s | |
| test_full_like | 18.0313ms | 11.8828ms | 84.1552 Ops/s | 82.1803 Ops/s | |
| test_zeros_like | 11.2453ms | 7.1377ms | 140.1009 Ops/s | 126.1127 Ops/s | |
| test_ones_like | 12.1094ms | 7.9260ms | 126.1663 Ops/s | 130.3149 Ops/s | |
| test_clone | 13.2835ms | 9.4340ms | 105.9993 Ops/s | 107.5229 Ops/s | |
| test_squeeze | 58.4590μs | 11.8234μs | 84.5777 KOps/s | 81.5354 KOps/s | |
| test_unsqueeze | 0.2969ms | 91.6152μs | 10.9152 KOps/s | 11.0472 KOps/s | |
| test_split | 0.3974ms | 0.1999ms | 5.0030 KOps/s | 5.0856 KOps/s | |
| test_permute | 0.2795ms | 0.2010ms | 4.9754 KOps/s | 5.0038 KOps/s | |
| test_stack | 29.9623ms | 24.9111ms | 40.1427 Ops/s | 40.3167 Ops/s | |
| test_cat | 29.3022ms | 24.5323ms | 40.7626 Ops/s | 40.9552 Ops/s |
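
The Change column is blank in the tables as captured here. The sketch below shows one way a relative change could be derived from the `Ops` and `Ops on Repo HEAD` columns. It assumes that `Ops` is the throughput measured on this PR, that `Ops on Repo HEAD` is the baseline, and that change is defined as `(Ops - HEAD) / HEAD`; none of this is confirmed by the page. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling.

```python
# Multipliers for the throughput units that appear in the table cells.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '52.2773 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Fractional change of `ops` relative to `ops_head` (assumed definition)."""
    new, old = parse_ops(ops), parse_ops(ops_head)
    return (new - old) / old


# A few (Ops, Ops on Repo HEAD) pairs copied from the first table.
rows = {
    "test_plain_set_nested": ("52.2773 KOps/s", "50.9612 KOps/s"),
    "test_membership": ("1.4461 MOps/s", "1.1304 MOps/s"),
    "test_membership_stacked_nested_last": ("233.9692 KOps/s", "73.8938 KOps/s"),
    "test_chunk": ("527.5966 Ops/s", "560.5367 Ops/s"),
}

for name, (ops, head) in rows.items():
    # Positive values mean higher throughput than the baseline.
    print(f"{name}: {relative_change(ops, head):+.1%}")
```

Most rows differ from the baseline by a few percent, which may be within run-to-run noise, so a single-run difference of that size should be read with caution.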

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
|---|---|---|---|---|---|
| test_plain_set_nested | 36.7700μs | 12.1604μs | 82.2343 KOps/s | 77.2928 KOps/s | |
| test_plain_set_stack_nested | 35.1810μs | 12.3944μs | 80.6819 KOps/s | 75.6935 KOps/s | |
| test_plain_set_nested_inplace | 42.2810μs | 13.2189μs | 75.6493 KOps/s | 70.9914 KOps/s | |
| test_plain_set_stack_nested_inplace | 50.1610μs | 13.3467μs | 74.9249 KOps/s | 70.7601 KOps/s | |
| test_items | 30.2510μs | 2.8966μs | 345.2309 KOps/s | 342.5848 KOps/s | |
| test_items_nested | 0.4402ms | 0.3627ms | 2.7567 KOps/s | 2.7295 KOps/s | |
| test_items_nested_locked | 0.4509ms | 0.3656ms | 2.7355 KOps/s | 2.7308 KOps/s | |
| test_items_nested_leaf | 87.2210μs | 59.1260μs | 16.9130 KOps/s | 17.0656 KOps/s | |
| test_items_stack_nested | 0.3990ms | 0.3694ms | 2.7070 KOps/s | 2.6723 KOps/s | |
| test_items_stack_nested_leaf | 0.1242ms | 60.7118μs | 16.4713 KOps/s | 16.4726 KOps/s | |
| test_items_stack_nested_locked | 0.4415ms | 0.3658ms | 2.7334 KOps/s | 2.7083 KOps/s | |
| test_keys | 41.8100μs | 3.7139μs | 269.2615 KOps/s | 282.7091 KOps/s | |
| test_keys_nested | 0.1269ms | 87.9966μs | 11.3641 KOps/s | 11.2808 KOps/s | |
| test_keys_nested_locked | 0.8254ms | 94.0174μs | 10.6363 KOps/s | 10.5535 KOps/s | |
| test_keys_nested_leaf | 0.1627ms | 79.0679μs | 12.6474 KOps/s | 12.5033 KOps/s | |
| test_keys_stack_nested | 0.1137ms | 89.6862μs | 11.1500 KOps/s | 11.0618 KOps/s | |
| test_keys_stack_nested_leaf | 0.1237ms | 80.5663μs | 12.4121 KOps/s | 12.1894 KOps/s | |
| test_keys_stack_nested_locked | 0.1319ms | 94.7110μs | 10.5584 KOps/s | 10.3595 KOps/s | |
| test_values | 5.5235μs | 0.8525μs | 1.1730 MOps/s | 1.1747 MOps/s | |
| test_values_nested | 0.1193ms | 37.4863μs | 26.6764 KOps/s | 26.5271 KOps/s | |
| test_values_nested_locked | 74.8110μs | 38.7042μs | 25.8370 KOps/s | 25.3997 KOps/s | |
| test_values_nested_leaf | 64.8510μs | 41.6562μs | 24.0060 KOps/s | 23.7377 KOps/s | |
| test_values_stack_nested | 61.7910μs | 38.0347μs | 26.2918 KOps/s | 26.0260 KOps/s | |
| test_values_stack_nested_leaf | 0.1061ms | 42.6442μs | 23.4499 KOps/s | 23.3623 KOps/s | |
| test_values_stack_nested_locked | 68.9620μs | 39.2769μs | 25.4603 KOps/s | 25.2499 KOps/s | |
| test_membership | 1.5646μs | 0.5119μs | 1.9534 MOps/s | 1.9019 MOps/s | |
| test_membership_nested | 20.5805μs | 2.0277μs | 493.1772 KOps/s | 497.2547 KOps/s | |
| test_membership_nested_leaf | 21.8205μs | 2.0286μs | 492.9475 KOps/s | 493.7240 KOps/s | |
| test_membership_stacked_nested | 35.5300μs | 2.0607μs | 485.2644 KOps/s | 480.2739 KOps/s | |
| test_membership_stacked_nested_leaf | 64.3910μs | 2.0875μs | 479.0488 KOps/s | 483.7824 KOps/s | |
| test_membership_nested_last | 35.2310μs | 3.1296μs | 319.5277 KOps/s | 322.5093 KOps/s | |
| test_membership_nested_leaf_last | 44.5310μs | 3.1521μs | 317.2441 KOps/s | 328.3025 KOps/s | |
| test_membership_stacked_nested_last | 60.4210μs | 3.6421μs | 274.5690 KOps/s | 241.0404 KOps/s | |
| test_membership_stacked_nested_leaf_last | 34.9610μs | 3.6383μs | 274.8567 KOps/s | 238.1124 KOps/s | |
| test_nested_getleaf | 34.1910μs | 6.1620μs | 162.2851 KOps/s | 163.6503 KOps/s | |
| test_nested_get | 41.5600μs | 5.8230μs | 171.7315 KOps/s | 172.0856 KOps/s | |
| test_stacked_getleaf | 32.1300μs | 6.1384μs | 162.9089 KOps/s | 163.6275 KOps/s | |
| test_stacked_get | 64.9610μs | 5.8244μs | 171.6925 KOps/s | 172.3025 KOps/s | |
| test_nested_getitemleaf | 38.0310μs | 6.4414μs | 155.2463 KOps/s | 156.9865 KOps/s | |
| test_nested_getitem | 26.9210μs | 6.2150μs | 160.9007 KOps/s | 164.6507 KOps/s | |
| test_stacked_getitemleaf | 29.0500μs | 6.4562μs | 154.8896 KOps/s | 155.5357 KOps/s | |
| test_stacked_getitem | 67.1820μs | 6.1159μs | 163.5095 KOps/s | 164.1392 KOps/s | |
| test_lock_nested | 0.7300ms | 0.3727ms | 2.6834 KOps/s | 2.6181 KOps/s | |
| test_lock_stack_nested | 0.3839ms | 0.3461ms | 2.8897 KOps/s | 2.8986 KOps/s | |
| test_unlock_nested | 0.6389ms | 0.3183ms | 3.1419 KOps/s | 3.1908 KOps/s | |
| test_unlock_stack_nested | 0.3136ms | 0.2847ms | 3.5128 KOps/s | 3.5395 KOps/s | |
| test_flatten_speed | 0.1529ms | 76.6164μs | 13.0520 KOps/s | 13.1478 KOps/s | |
| test_unflatten_speed | 0.4321ms | 0.3163ms | 3.1611 KOps/s | 3.0934 KOps/s | |
| test_common_ops | 1.6097ms | 0.5993ms | 1.6685 KOps/s | 1.5647 KOps/s | |
| test_creation | 0.1061ms | 1.7548μs | 569.8675 KOps/s | 564.8825 KOps/s | |
| test_creation_empty | 1.4018ms | 8.4065μs | 118.9549 KOps/s | 100.5220 KOps/s | |
| test_creation_nested_1 | 34.5010μs | 10.0227μs | 99.7735 KOps/s | 85.5179 KOps/s | |
| test_creation_nested_2 | 35.4000μs | 12.8992μs | 77.5240 KOps/s | 70.4418 KOps/s | |
| test_clone | 0.1185ms | 10.2422μs | 97.6357 KOps/s | 98.7538 KOps/s | |
| test_getitem[int] | 1.1996ms | 11.0342μs | 90.6274 KOps/s | 93.2100 KOps/s | |
| test_getitem[slice_int] | 0.1138ms | 21.2509μs | 47.0568 KOps/s | 47.7115 KOps/s | |
| test_getitem[range] | 0.1308ms | 36.9925μs | 27.0325 KOps/s | 26.7071 KOps/s | |
| test_getitem[tuple] | 0.1098ms | 18.4658μs | 54.1542 KOps/s | 55.0269 KOps/s | |
| test_getitem[list] | 0.3401ms | 32.6327μs | 30.6441 KOps/s | 30.3330 KOps/s | |
| test_setitem_dim[int] | 39.7710μs | 18.9138μs | 52.8714 KOps/s | 54.3547 KOps/s | |
| test_setitem_dim[slice_int] | 60.4510μs | 37.5831μs | 26.6077 KOps/s | 26.3350 KOps/s | |
| test_setitem_dim[range] | 81.2110μs | 52.2510μs | 19.1384 KOps/s | 18.8594 KOps/s | |
| test_setitem_dim[tuple] | 53.1210μs | 31.0818μs | 32.1732 KOps/s | 30.6896 KOps/s | |
| test_setitem | 0.1020ms | 14.2883μs | 69.9871 KOps/s | 64.4163 KOps/s | |
| test_set | 0.1126ms | 13.9216μs | 71.8309 KOps/s | 66.6573 KOps/s | |
| test_set_shared | 1.7834ms | 0.1509ms | 6.6277 KOps/s | 6.5484 KOps/s | |
| test_update | 0.3057ms | 17.1600μs | 58.2750 KOps/s | 52.9868 KOps/s | |
| test_update_nested | 0.1084ms | 22.3174μs | 44.8081 KOps/s | 41.2323 KOps/s | |
| test_update__nested | 0.4617ms | 24.6946μs | 40.4947 KOps/s | 40.6581 KOps/s | |
| test_set_nested | 97.2510μs | 15.3936μs | 64.9621 KOps/s | 61.2502 KOps/s | |
| test_set_nested_new | 0.1166ms | 17.7337μs | 56.3900 KOps/s | 53.1964 KOps/s | |
| test_select | 0.2180ms | 29.9161μs | 33.4268 KOps/s | 32.5586 KOps/s | |
| test_select_nested | 75.8010μs | 43.4362μs | 23.0223 KOps/s | 22.7952 KOps/s | |
| test_exclude_nested | 0.1377ms | 61.9355μs | 16.1458 KOps/s | 15.6175 KOps/s | |
| test_empty[True] | 0.3482ms | 0.2935ms | 3.4069 KOps/s | 3.4094 KOps/s | |
| test_empty[False] | 3.2061μs | 0.8175μs | 1.2232 MOps/s | 1.2009 MOps/s | |
| test_to | 86.3410μs | 56.7723μs | 17.6142 KOps/s | 17.9021 KOps/s | |
| test_to_nonblocking | 83.1310μs | 48.0174μs | 20.8258 KOps/s | 20.8741 KOps/s | |
| test_unbind_speed | 1.5898ms | 0.2401ms | 4.1641 KOps/s | 4.2174 KOps/s | |
| test_unbind_speed_stack0 | 0.3007ms | 0.2384ms | 4.1952 KOps/s | 4.1174 KOps/s | |
| test_unbind_speed_stack1 | 95.4108ms | 0.6747ms | 1.4822 KOps/s | 1.4895 KOps/s | |
| test_split | 95.8192ms | 1.6130ms | 619.9460 Ops/s | 628.2366 Ops/s | |
| test_chunk | 97.7415ms | 1.6082ms | 621.8095 Ops/s | 633.5333 Ops/s | |
| test_consolidate[False-None] | 0.1003s | 2.9850ms | 335.0127 Ops/s | 335.7523 Ops/s | |
| test_consolidate[default-None] | 2.6830ms | 1.7078ms | 585.5547 Ops/s | 585.5819 Ops/s | |
| test_consolidate[reduce-overhead-None] | 1.8348ms | 1.7604ms | 568.0394 Ops/s | 571.1071 Ops/s | |
| test_consolidate_njt[False-None] | 7.0287ms | 6.5763ms | 152.0604 Ops/s | 150.4412 Ops/s | |
| test_to[False-False-None] | 1.7675ms | 1.6942ms | 590.2374 Ops/s | 584.9811 Ops/s | |
| test_to[True-False-None] | 1.5924ms | 1.3354ms | 748.8619 Ops/s | 735.2609 Ops/s | |
| test_to[within-False-None] | 4.4522ms | 4.1478ms | 241.0901 Ops/s | 238.5639 Ops/s | |
| test_to[True-default-None] | 5.6983ms | 5.3044ms | 188.5217 Ops/s | 186.6576 Ops/s | |
| test_to_njt[False-False-None] | 7.1147ms | 6.9354ms | 144.1885 Ops/s | 143.1087 Ops/s | |
| test_to_njt[True-False-None] | 5.7031ms | 5.5007ms | 181.7962 Ops/s | 181.4035 Ops/s | |
| test_to_njt[within-False-None] | 12.4025ms | 12.2847ms | 81.4022 Ops/s | 80.2353 Ops/s | |
| test_creation[device0] | 0.4664ms | 80.5937μs | 12.4079 KOps/s | 12.3514 KOps/s | |
| test_creation_from_tensor | 0.6318ms | 83.5315μs | 11.9715 KOps/s | 11.8743 KOps/s | |
| test_add_one[memmap_tensor0] | 0.4221ms | 6.3898μs | 156.4995 KOps/s | 156.4370 KOps/s | |
| test_contiguous[memmap_tensor0] | 1.7776μs | 0.4309μs | 2.3209 MOps/s | 2.3168 MOps/s | |
| test_stack[memmap_tensor0] | 20.3800μs | 4.6408μs | 215.4812 KOps/s | 222.7217 KOps/s | |
| test_memmaptd_index | 1.3680ms | 0.2604ms | 3.8398 KOps/s | 3.9264 KOps/s | |
| test_memmaptd_index_astensor | 0.5887ms | 0.3243ms | 3.0831 KOps/s | 3.1749 KOps/s | |
| test_memmaptd_index_op | 1.0458ms | 0.5847ms | 1.7103 KOps/s | 1.6701 KOps/s | |
| test_serialize_model | 0.1326s | 0.1317s | 7.5927 Ops/s | 7.6200 Ops/s | |
| test_serialize_model_pickle | 1.3492s | 1.2106s | 0.8260 Ops/s | 0.8259 Ops/s | |
| test_serialize_weights | 0.1321s | 0.1304s | 7.6689 Ops/s | 7.6726 Ops/s | |
| test_serialize_weights_returnearly | 0.3428s | 53.8079ms | 18.5846 Ops/s | 14.4342 Ops/s | |
| test_serialize_weights_pickle | 1.3458s | 1.1860s | 0.8432 Ops/s | 0.7953 Ops/s | |
| test_reshape_pytree | 62.1910μs | 22.9149μs | 43.6398 KOps/s | 44.2175 KOps/s | |
| test_reshape_td | 63.6010μs | 26.5714μs | 37.6345 KOps/s | 36.0599 KOps/s | |
| test_view_pytree | 0.1207ms | 22.1478μs | 45.1513 KOps/s | 45.1227 KOps/s | |
| test_view_td | 59.6910μs | 31.2538μs | 31.9961 KOps/s | 30.8098 KOps/s | |
| test_unbind_pytree | 0.1522ms | 28.3553μs | 35.2667 KOps/s | 35.2425 KOps/s | |
| test_unbind_td | 0.7515ms | 37.0397μs | 26.9981 KOps/s | 27.3447 KOps/s | |
| test_split_pytree | 57.0210μs | 30.6438μs | 32.6330 KOps/s | 31.8864 KOps/s | |
| test_split_td | 1.0053ms | 39.7555μs | 25.1537 KOps/s | 24.9813 KOps/s | |
| test_add_pytree | 61.8510μs | 33.6762μs | 29.6946 KOps/s | 29.9606 KOps/s | |
| test_add_td | 0.1802ms | 47.6207μs | 20.9993 KOps/s | 19.3273 KOps/s | |
| test_compile_add_one_nested[tensordict-compile] | 0.1757ms | 0.1218ms | 8.2123 KOps/s | 7.9854 KOps/s | |
| test_compile_add_one_nested[tensordict-eager] | 0.2843ms | 0.1325ms | 7.5452 KOps/s | 7.3881 KOps/s | |
| test_compile_add_one_nested[pytree-compile] | 0.1472ms | 98.6278μs | 10.1391 KOps/s | 10.3441 KOps/s | |
| test_compile_add_one_nested[pytree-eager] | 1.6285ms | 0.1505ms | 6.6465 KOps/s | 6.8249 KOps/s | |
| test_compile_copy_nested[tensordict-compile] | 96.7610μs | 22.3815μs | 44.6798 KOps/s | 44.6272 KOps/s | |
| test_compile_copy_nested[tensordict-eager] | 82.9920μs | 29.6450μs | 33.7325 KOps/s | 33.0964 KOps/s | |
| test_compile_copy_nested[pytree-compile] | 0.4085ms | 65.1945μs | 15.3387 KOps/s | 15.0402 KOps/s | |
| test_compile_copy_nested[pytree-eager] | 83.6310μs | 49.7392μs | 20.1049 KOps/s | 19.8329 KOps/s | |
| test_compile_add_one_flat[tensordict-compile] | 0.1824ms | 0.1420ms | 7.0419 KOps/s | 6.9144 KOps/s | |
| test_compile_add_one_flat[tensordict-eager] | 0.3106ms | 0.2165ms | 4.6191 KOps/s | 4.6053 KOps/s | |
| test_compile_add_one_flat[tensorclass-compile] | 0.1854ms | 99.1465μs | 10.0861 KOps/s | 10.1742 KOps/s | |
| test_compile_add_one_flat[tensorclass-eager] | 0.1150ms | 56.3841μs | 17.7355 KOps/s | 17.8347 KOps/s | |
| test_compile_add_one_flat[pytree-compile] | 0.1738ms | 0.1354ms | 7.3844 KOps/s | 7.3848 KOps/s | |
| test_compile_add_one_flat[pytree-eager] | 0.5729ms | 0.4789ms | 2.0880 KOps/s | 2.1430 KOps/s | |
| test_compile_add_self_flat[tensordict-eager] | 0.3816ms | 0.2608ms | 3.8348 KOps/s | 3.8593 KOps/s | |
| test_compile_add_self_flat[tensordict-compile] | 0.2022ms | 0.1497ms | 6.6784 KOps/s | 7.0389 KOps/s | |
| test_compile_add_self_flat[tensorclass-eager] | 0.1696ms | 70.0694μs | 14.2716 KOps/s | 14.5317 KOps/s | |
| test_compile_add_self_flat[tensorclass-compile] | 0.1510ms | 0.1064ms | 9.4023 KOps/s | 10.1703 KOps/s | |
| test_compile_add_self_flat[pytree-eager] | 0.5538ms | 0.4089ms | 2.4455 KOps/s | 2.4698 KOps/s | |
| test_compile_add_self_flat[pytree-compile] | 0.1756ms | 0.1369ms | 7.3058 KOps/s | 7.4310 KOps/s | |
| test_compile_copy_flat[tensordict-compile] | 79.6710μs | 19.2872μs | 51.8479 KOps/s | 55.0637 KOps/s | |
| test_compile_copy_flat[tensordict-eager] | 96.5020μs | 30.8022μs | 32.4652 KOps/s | 31.6246 KOps/s | |
| test_compile_copy_flat[pytree-compile] | 0.1053ms | 70.4879μs | 14.1868 KOps/s | 13.9578 KOps/s | |
| test_compile_copy_flat[pytree-eager] | 0.1373ms | 52.1730μs | 19.1670 KOps/s | 19.2130 KOps/s | |
| test_compile_assign_and_add[tensordict-compile] | 1.6349ms | 0.3935ms | 2.5414 KOps/s | 2.1919 KOps/s | |
| test_compile_assign_and_add[tensordict-eager] | 2.6887ms | 2.6134ms | 382.6433 Ops/s | 383.4553 Ops/s | |
| test_compile_assign_and_add[pytree-compile] | 1.5948ms | 0.4353ms | 2.2975 KOps/s | 2.1218 KOps/s | |
| test_compile_assign_and_add[pytree-eager] | 2.7371ms | 2.6327ms | 379.8401 Ops/s | 381.5021 Ops/s | |
| test_compile_indexing[tensor-tensordict-compile] | 0.1635ms | 0.1117ms | 8.9493 KOps/s | 8.4632 KOps/s | |
| test_compile_indexing[tensor-tensordict-eager] | 0.5511ms | 77.9309μs | 12.8319 KOps/s | 12.3661 KOps/s | |
| test_compile_indexing[tensor-tensorclass-compile] | 0.1705ms | 0.1046ms | 9.5589 KOps/s | 9.5692 KOps/s | |
| test_compile_indexing[tensor-tensorclass-eager] | 0.1106ms | 67.2966μs | 14.8596 KOps/s | 14.4228 KOps/s | |
| test_compile_indexing[tensor-pytree-compile] | 0.1521ms | 0.1056ms | 9.4674 KOps/s | 9.5142 KOps/s | |
| test_compile_indexing[tensor-pytree-eager] | 0.1162ms | 66.9258μs | 14.9419 KOps/s | 14.8972 KOps/s | |
| test_compile_indexing[slice-tensordict-compile] | 0.2335ms | 0.1031ms | 9.7005 KOps/s | 9.9924 KOps/s | |
| test_compile_indexing[slice-tensordict-eager] | 0.1677ms | 17.4747μs | 57.2255 KOps/s | 57.5534 KOps/s | |
| test_compile_indexing[slice-tensorclass-compile] | 0.1355ms | 96.0504μs | 10.4112 KOps/s | 10.2783 KOps/s | |
| test_compile_indexing[slice-tensorclass-eager] | 57.0610μs | 16.0511μs | 62.3011 KOps/s | 63.4290 KOps/s | |
| test_compile_indexing[slice-pytree-compile] | 0.1443ms | 96.5170μs | 10.3609 KOps/s | 10.0928 KOps/s | |
| test_compile_indexing[slice-pytree-eager] | 0.1150ms | 15.8779μs | 62.9808 KOps/s | 63.5312 KOps/s | |
| test_compile_indexing[int-tensordict-compile] | 0.1440ms | 0.1001ms | 9.9858 KOps/s | 9.4654 KOps/s | |
| test_compile_indexing[int-tensordict-eager] | 0.5718ms | 17.2797μs | 57.8714 KOps/s | 59.1036 KOps/s | |
| test_compile_indexing[int-tensorclass-compile] | 0.1798ms | 96.8586μs | 10.3243 KOps/s | 10.2614 KOps/s | |
| test_compile_indexing[int-tensorclass-eager] | 43.7710μs | 15.8422μs | 63.1225 KOps/s | 64.0035 KOps/s | |
| test_compile_indexing[int-pytree-compile] | 0.1395ms | 96.6823μs | 10.3432 KOps/s | 10.2593 KOps/s | |
| test_compile_indexing[int-pytree-eager] | 47.9610μs | 15.7370μs | 63.5445 KOps/s | 63.1236 KOps/s | |
| test_mod_add[eager] | 0.1300ms | 38.4616μs | 25.9999 KOps/s | 25.6554 KOps/s | |
| test_mod_add[compile] | 0.1268ms | 82.4111μs | 12.1343 KOps/s | 11.7979 KOps/s | |
| test_mod_add[compile-overhead] | 0.3182ms | 0.1674ms | 5.9742 KOps/s | 5.6789 KOps/s | |
| test_mod_wrap[eager] | 0.3306ms | 0.2521ms | 3.9660 KOps/s | 3.8181 KOps/s | |
| test_mod_wrap[compile] | 0.3374ms | 0.2843ms | 3.5177 KOps/s | 3.4913 KOps/s | |
| test_mod_wrap[compile-overhead] | 7.0129ms | 3.7234ms | 268.5737 Ops/s | 271.5757 Ops/s | |
| test_mod_wrap_and_backward[eager] | 1.4668ms | 1.3280ms | 753.0186 Ops/s | 709.5403 Ops/s | |
| test_mod_wrap_and_backward[compile] | 1.3594ms | 1.2742ms | 784.7818 Ops/s | 730.8724 Ops/s | |
| test_mod_wrap_and_backward[compile-overhead] | 1.3589ms | 0.9206ms | 1.0862 KOps/s | 907.8000 Ops/s | |
| test_seq_add[eager] | 0.2486ms | 0.1165ms | 8.5874 KOps/s | 8.4640 KOps/s | |
| test_seq_add[compile] | 0.1356ms | 88.8900μs | 11.2499 KOps/s | 11.1793 KOps/s | |
| test_seq_add[compile-overhead] | 0.1760ms | 0.1298ms | 7.7049 KOps/s | 7.7251 KOps/s | |
| test_seq_wrap[eager] | 0.4769ms | 0.4117ms | 2.4291 KOps/s | 2.3596 KOps/s | |
| test_seq_wrap[compile] | 0.3514ms | 0.3019ms | 3.3120 KOps/s | 3.3144 KOps/s | |
| test_seq_wrap[compile-overhead] | 0.3075ms | 0.2245ms | 4.4539 KOps/s | 4.4037 KOps/s | |
| test_func_call_runtime[False-eager] | 0.7886ms | 0.7146ms | 1.3995 KOps/s | 1.3596 KOps/s | |
| test_func_call_runtime[False-compile] | 0.8074ms | 0.7540ms | 1.3262 KOps/s | 1.3451 KOps/s | |
| test_func_call_runtime[False-compile-overhead] | 0.4240ms | 0.3661ms | 2.7314 KOps/s | 2.7109 KOps/s | |
| test_func_call_runtime[True-eager] | 0.9428ms | 0.8697ms | 1.1498 KOps/s | 1.1456 KOps/s | |
| test_func_call_runtime[True-compile] | 0.8289ms | 0.7747ms | 1.2908 KOps/s | 1.3083 KOps/s | |
| test_func_call_runtime[True-compile-overhead] | 0.4478ms | 0.3870ms | 2.5843 KOps/s | 2.5818 KOps/s | |
| test_func_call_cm_runtime[False-eager] | 0.8085ms | 0.7127ms | 1.4030 KOps/s | 1.4136 KOps/s | |
| test_func_call_cm_runtime[False-compile] | 0.8052ms | 0.7570ms | 1.3210 KOps/s | 1.3252 KOps/s | |
| test_func_call_cm_runtime[False-compile-overhead] | 0.4376ms | 0.3676ms | 2.7206 KOps/s | 2.7130 KOps/s | |
| test_func_call_cm_runtime[True-eager] | 1.0664ms | 0.9818ms | 1.0185 KOps/s | 999.9982 Ops/s | |
| test_func_call_cm_runtime[True-compile] | 0.9481ms | 0.8013ms | 1.2479 KOps/s | 1.2568 KOps/s | |
| test_func_call_cm_runtime[True-compile-overhead] | 0.5597ms | 0.4115ms | 2.4303 KOps/s | 2.4129 KOps/s | |
| test_vmap_func_call_cm_runtime[eager] | 2.4839ms | 2.0239ms | 494.0973 Ops/s | 492.1497 Ops/s | |
| test_vmap_func_call_cm_runtime[compile] | 0.9538ms | 0.8186ms | 1.2216 KOps/s | 1.2297 KOps/s | |
| test_vmap_func_call_cm_runtime[compile-overhead] | 0.5218ms | 0.4140ms | 2.4153 KOps/s | 2.4009 KOps/s | |
| test_distributed | 0.6819ms | 0.1624ms | 6.1583 KOps/s | 8.5343 KOps/s | |
| test_tdmodule | 0.1675ms | 20.7283μs | 48.2433 KOps/s | 47.4886 KOps/s | |
| test_tdmodule_dispatch | 70.0910μs | 36.3810μs | 27.4869 KOps/s | 26.4004 KOps/s | |
| test_tdseq | 32.6110μs | 20.8580μs | 47.9433 KOps/s | 45.4092 KOps/s | |
| test_tdseq_dispatch | 59.7210μs | 38.7125μs | 25.8315 KOps/s | 24.2046 KOps/s | |
| test_instantiation_functorch | 1.6062ms | 1.5420ms | 648.5210 Ops/s | 639.9389 Ops/s | |
| test_exec_functorch | 0.2084ms | 0.1408ms | 7.1024 KOps/s | 7.0790 KOps/s | |
| test_exec_functional_call | 0.2012ms | 0.1314ms | 7.6075 KOps/s | 7.4408 KOps/s | |
| test_exec_td_decorator | 0.3685ms | 0.1787ms | 5.5945 KOps/s | 5.4593 KOps/s | |
| test_vmap_mlp_speed_decorator[True-True] | 0.8233ms | 0.6674ms | 1.4983 KOps/s | 1.4942 KOps/s | |
| test_vmap_mlp_speed_decorator[True-False] | 0.8346ms | 0.6701ms | 1.4922 KOps/s | 1.5016 KOps/s | |
| test_vmap_mlp_speed_decorator[False-True] | 0.7217ms | 0.5821ms | 1.7180 KOps/s | 1.7452 KOps/s | |
| test_vmap_mlp_speed_decorator[False-False] | 0.7486ms | 0.5881ms | 1.7005 KOps/s | 1.7470 KOps/s | |
| test_vmap_transformer_speed_decorator[True-True] | 19.2999ms | 18.6852ms | 53.5182 Ops/s | 53.9437 Ops/s | |
| test_vmap_transformer_speed_decorator[True-False] | 19.3566ms | 18.7070ms | 53.4558 Ops/s | 53.9514 Ops/s | |
| test_vmap_transformer_speed_decorator[False-True] | 18.8087ms | 18.6279ms | 53.6830 Ops/s | 54.5233 Ops/s | |
| test_vmap_transformer_speed_decorator[False-False] | 18.7740ms | 18.6011ms | 53.7602 Ops/s | 54.3385 Ops/s | |
| test_to_module_speed[True] | 1.0653ms | 0.9736ms | 1.0272 KOps/s | 1.0364 KOps/s | |
| test_to_module_speed[False] | 1.3530ms | 0.9653ms | 1.0359 KOps/s | 1.0441 KOps/s | |
| test_tc_init | 76.8120μs | 35.2068μs | 28.4036 KOps/s | 26.6669 KOps/s | |
| test_tc_init_nested | 0.2397ms | 69.5196μs | 14.3844 KOps/s | 13.2001 KOps/s | |
| test_tc_first_layer_tensor | 13.4973μs | 0.7143μs | 1.4000 MOps/s | 1.2071 MOps/s | |
| test_tc_first_layer_nontensor | 67.1910μs | 2.3070μs | 433.4695 KOps/s | 434.9652 KOps/s | |
| test_tc_second_layer_tensor | 47.7973μs | 1.4370μs | 695.9178 KOps/s | 694.3641 KOps/s | |
| test_tc_second_layer_nontensor | 36.2310μs | 3.0287μs | 330.1780 KOps/s | 331.9329 KOps/s | |
| test_unbind | 0.2300s | 10.2275ms | 97.7758 Ops/s | 140.4178 Ops/s | |
| test_full_like | 10.9462ms | 10.0337ms | 99.6638 Ops/s | 103.9180 Ops/s | |
| test_zeros_like | 9.3303ms | 7.3134ms | 136.7358 Ops/s | 138.9037 Ops/s | |
| test_ones_like | 5.0610ms | 4.3868ms | 227.9588 Ops/s | 227.4273 Ops/s | |
| test_clone | 12.1982ms | 9.4718ms | 105.5766 Ops/s | 147.4653 Ops/s | |
| test_squeeze | 54.5110μs | 9.8328μs | 101.7006 KOps/s | 103.3615 KOps/s | |
| test_unsqueeze | 0.1935ms | 73.2570μs | 13.6506 KOps/s | 13.3855 KOps/s | |
| test_split | 0.3924ms | 0.1598ms | 6.2561 KOps/s | 5.8446 KOps/s | |
| test_permute | 0.3209ms | 0.1802ms | 5.5504 KOps/s | 5.5609 KOps/s | |
| test_stack | 52.0102ms | 50.5134ms | 19.7967 Ops/s | 19.5183 Ops/s | |
| test_cat | 51.8713ms | 51.3347ms | 19.4800 Ops/s | 19.5330 Ops/s |
Labels
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Deprecation
Announces or enacts a deprecation
Refactor
Refactoring code - not a new feature
I propose a global flag, `composite_lp_aggregate`, to handle the aggregation of log-probs.

So far, we have dealt with this using kwargs everywhere (in `ProbabilisticTDModule`, `ProbabilisticTDSequential`, `CompositeDistribution` and subclasses). The hierarchy of these classes, and what to do when args conflict, isn't easy to handle. It's also confusing for users, who I suspect will usually want to work with either collapsed or non-collapsed log-probs.

A global flag (set to `True` for now and `False` in the future) will make things easier to handle.

Globally, the v0.6.2 behaviour will not change: all users who rely on it will be informed about the upcoming change through a warning telling them to set the global var to `False` to accommodate upcoming changes. If they set it to `True`, nothing will change for them, but that also means that bugs will not be solved (we won't maintain the `True` behaviour).

When `composite_lp_aggregate() == True`, we'll have `aggregate_probabilities=True`, `include_sum=True` and `inplace=True` by default everywhere. When `composite_lp_aggregate() == False`, all of these will be set to `False`, meaning that any call to `whatever.log_prob(tensordict)` will return another tensordict containing the leaf log-probs.
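To make the intended semantics concrete, here is a minimal, standard-library-only sketch of how a global flag could drive the default of a per-call kwarg. It is not the implementation in this PR: `set_lp_aggregate`, `composite_log_prob`, the unit-Gaussian leaves and the `<name>_log_prob` key naming are all hypothetical, and a plain `dict` stands in for a tensordict.

```python
import math
from contextlib import contextmanager

# Toy module-level flag (assumption: this only mimics the idea of the PR,
# it is not tensordict's actual implementation).
_COMPOSITE_LP_AGGREGATE = True


def composite_lp_aggregate() -> bool:
    """Return the current value of the toy global flag."""
    return _COMPOSITE_LP_AGGREGATE


@contextmanager
def set_lp_aggregate(mode: bool):
    """Temporarily override the flag (hypothetical helper)."""
    global _COMPOSITE_LP_AGGREGATE
    previous = _COMPOSITE_LP_AGGREGATE
    _COMPOSITE_LP_AGGREGATE = mode
    try:
        yield
    finally:
        # Always restore the previous value, even if the body raised.
        _COMPOSITE_LP_AGGREGATE = previous


def gaussian_log_prob(x: float, mean: float = 0.0, std: float = 1.0) -> float:
    """Log-density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2 * math.pi)


def composite_log_prob(sample: dict, aggregate=None):
    """Log-prob of independent leaves.

    aggregate=None defers to the global flag; an explicit bool overrides it.
    """
    if aggregate is None:
        aggregate = composite_lp_aggregate()
    # Hypothetical key naming for the per-leaf entries.
    leaves = {f"{name}_log_prob": gaussian_log_prob(x) for name, x in sample.items()}
    if aggregate:
        # Collapsed behaviour: a single summed log-prob.
        return sum(leaves.values())
    # Non-collapsed behaviour: one entry per leaf.
    return leaves


sample = {"a": 0.0, "b": 1.0}
aggregated = composite_log_prob(sample)      # flag is True: one number
with set_lp_aggregate(False):
    per_leaf = composite_log_prob(sample)    # flag is False: dict of leaves
```

With the flag on, callers get a single aggregated value; with it off, they get the per-leaf entries and can sum them themselves if needed. Resolving the kwarg against the flag in one place is what removes the need to reconcile conflicting kwargs across nested modules.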