@vmoens (Collaborator) commented May 15, 2024

Unfortunately, I couldn't find a way to make the error more explicit when a file that isn't writable is written to (currently this causes a segfault).

Also, the workaround I found doesn't give you a storage with an associated filename, so it's a bit hacky from that point of view. MemoryMappedTensor will still have its filename attribute set, however. See the tests for more context.
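For comparison, here is a minimal sketch of the kind of explicit error that would be preferable to a segfault. It uses only Python's standard `mmap` module, not the tensordict or PyTorch code touched by this PR, and `open_mapping` and `checked_write` are hypothetical helper names introduced for illustration. It relies on the fact that the standard library raises `TypeError` when a read-only mapping is written to.

```python
import mmap
import os
import tempfile


def open_mapping(path, writable):
    """Map an existing, non-empty file into memory (hypothetical helper)."""
    mode = "r+b" if writable else "rb"
    access = mmap.ACCESS_WRITE if writable else mmap.ACCESS_READ
    with open(path, mode) as f:
        # The mapping stays valid after the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=access)


def checked_write(mapping, offset, data):
    """Write bytes into the mapping, raising a clear error if it is read-only."""
    try:
        mapping[offset:offset + len(data)] = data
    except TypeError as err:
        # The stdlib mmap raises TypeError on writes to an ACCESS_READ mapping.
        raise PermissionError("cannot write to a read-only memory map") from err


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x00" * 8)
    read_only = open_mapping(tmp.name, writable=False)
    try:
        checked_write(read_only, 0, b"abcd")
    except PermissionError as err:
        print(f"explicit error instead of a crash: {err}")
    finally:
        read_only.close()
        os.remove(tmp.name)
```

Whether an equivalent check can be placed in front of the storage used here is exactly the open question raised above; the sketch only shows the behaviour one would want.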

cc @albanD, @MateuszGuzek @mikaylagawarecki

Vincent Moens added 3 commits May 15, 2024 14:38
@facebook-github-bot added the CLA Signed label May 15, 2024
@vmoens added the bug label May 15, 2024
github-actions bot commented May 15, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 127. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}11$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 37.2300μs 17.8328μs 56.0765 KOps/s 58.6068 KOps/s $\color{#d91a1a}-4.32\%$
test_plain_set_stack_nested 76.7240μs 17.6627μs 56.6163 KOps/s 59.2569 KOps/s $\color{#d91a1a}-4.46\%$
test_plain_set_nested_inplace 65.0620μs 19.8925μs 50.2702 KOps/s 53.0902 KOps/s $\textbf{\color{#d91a1a}-5.31\%}$
test_plain_set_stack_nested_inplace 86.2620μs 19.8630μs 50.3448 KOps/s 53.3432 KOps/s $\textbf{\color{#d91a1a}-5.62\%}$
test_items 23.9960μs 2.5433μs 393.1967 KOps/s 391.0210 KOps/s $\color{#35bf28}+0.56\%$
test_items_nested 0.8629ms 0.2657ms 3.7634 KOps/s 3.7183 KOps/s $\color{#35bf28}+1.21\%$
test_items_nested_locked 0.5132ms 0.2713ms 3.6866 KOps/s 3.6970 KOps/s $\color{#d91a1a}-0.28\%$
test_items_nested_leaf 0.1437ms 75.9310μs 13.1699 KOps/s 12.7789 KOps/s $\color{#35bf28}+3.06\%$
test_items_stack_nested 0.4838ms 0.2700ms 3.7033 KOps/s 3.7162 KOps/s $\color{#d91a1a}-0.35\%$
test_items_stack_nested_leaf 0.1561ms 76.0209μs 13.1543 KOps/s 12.7718 KOps/s $\color{#35bf28}+2.99\%$
test_items_stack_nested_locked 0.9298ms 0.2697ms 3.7075 KOps/s 3.7092 KOps/s $\color{#d91a1a}-0.05\%$
test_keys 29.3450μs 3.9244μs 254.8185 KOps/s 261.4348 KOps/s $\color{#d91a1a}-2.53\%$
test_keys_nested 0.2350ms 0.1418ms 7.0528 KOps/s 7.0450 KOps/s $\color{#35bf28}+0.11\%$
test_keys_nested_locked 2.0682ms 0.1479ms 6.7592 KOps/s 6.7625 KOps/s $\color{#d91a1a}-0.05\%$
test_keys_nested_leaf 0.2429ms 0.1202ms 8.3171 KOps/s 8.3813 KOps/s $\color{#d91a1a}-0.77\%$
test_keys_stack_nested 0.2266ms 0.1412ms 7.0821 KOps/s 7.0204 KOps/s $\color{#35bf28}+0.88\%$
test_keys_stack_nested_leaf 0.2785ms 0.1203ms 8.3132 KOps/s 8.3585 KOps/s $\color{#d91a1a}-0.54\%$
test_keys_stack_nested_locked 0.2007ms 0.1425ms 7.0196 KOps/s 6.7920 KOps/s $\color{#35bf28}+3.35\%$
test_values 9.8285μs 1.1652μs 858.2310 KOps/s 855.4459 KOps/s $\color{#35bf28}+0.33\%$
test_values_nested 0.1063ms 51.0348μs 19.5945 KOps/s 19.7098 KOps/s $\color{#d91a1a}-0.59\%$
test_values_nested_locked 0.1289ms 51.4090μs 19.4518 KOps/s 19.4785 KOps/s $\color{#d91a1a}-0.14\%$
test_values_nested_leaf 0.1009ms 46.9936μs 21.2795 KOps/s 21.9245 KOps/s $\color{#d91a1a}-2.94\%$
test_values_stack_nested 0.1387ms 52.6875μs 18.9798 KOps/s 19.6585 KOps/s $\color{#d91a1a}-3.45\%$
test_values_stack_nested_leaf 96.7710μs 46.5528μs 21.4810 KOps/s 21.7988 KOps/s $\color{#d91a1a}-1.46\%$
test_values_stack_nested_locked 0.1081ms 52.7810μs 18.9462 KOps/s 19.8508 KOps/s $\color{#d91a1a}-4.56\%$
test_membership 28.7140μs 1.3842μs 722.4627 KOps/s 740.6599 KOps/s $\color{#d91a1a}-2.46\%$
test_membership_nested 37.5400μs 3.5297μs 283.3070 KOps/s 288.8658 KOps/s $\color{#d91a1a}-1.92\%$
test_membership_nested_leaf 33.5730μs 3.5544μs 281.3425 KOps/s 256.3752 KOps/s $\textbf{\color{#35bf28}+9.74\%}$
test_membership_stacked_nested 37.1600μs 3.5083μs 285.0387 KOps/s 292.0845 KOps/s $\color{#d91a1a}-2.41\%$
test_membership_stacked_nested_leaf 22.0220μs 3.4911μs 286.4420 KOps/s 288.7756 KOps/s $\color{#d91a1a}-0.81\%$
test_membership_nested_last 24.2550μs 4.2179μs 237.0876 KOps/s 235.9028 KOps/s $\color{#35bf28}+0.50\%$
test_membership_nested_leaf_last 42.0190μs 4.2663μs 234.3937 KOps/s 235.2277 KOps/s $\color{#d91a1a}-0.35\%$
test_membership_stacked_nested_last 74.8110μs 12.9855μs 77.0091 KOps/s 232.7743 KOps/s $\textbf{\color{#d91a1a}-66.92\%}$
test_membership_stacked_nested_leaf_last 51.5970μs 12.8803μs 77.6377 KOps/s 236.4966 KOps/s $\textbf{\color{#d91a1a}-67.17\%}$
test_nested_getleaf 88.1350μs 11.0012μs 90.8990 KOps/s 92.9817 KOps/s $\color{#d91a1a}-2.24\%$
test_nested_get 48.7710μs 10.4153μs 96.0130 KOps/s 99.3829 KOps/s $\color{#d91a1a}-3.39\%$
test_stacked_getleaf 53.8810μs 10.9360μs 91.4413 KOps/s 94.2439 KOps/s $\color{#d91a1a}-2.97\%$
test_stacked_get 44.7040μs 10.2242μs 97.8068 KOps/s 100.5809 KOps/s $\color{#d91a1a}-2.76\%$
test_nested_getitemleaf 67.1460μs 11.4091μs 87.6491 KOps/s 89.0585 KOps/s $\color{#d91a1a}-1.58\%$
test_nested_getitem 45.6960μs 10.2375μs 97.6804 KOps/s 98.2836 KOps/s $\color{#d91a1a}-0.61\%$
test_stacked_getitemleaf 51.2660μs 11.4113μs 87.6322 KOps/s 90.7866 KOps/s $\color{#d91a1a}-3.47\%$
test_stacked_getitem 46.2970μs 10.3138μs 96.9571 KOps/s 97.6002 KOps/s $\color{#d91a1a}-0.66\%$
test_lock_nested 52.0416ms 0.4055ms 2.4659 KOps/s 2.7941 KOps/s $\textbf{\color{#d91a1a}-11.75\%}$
test_lock_stack_nested 0.8690ms 0.3055ms 3.2734 KOps/s 3.1926 KOps/s $\color{#35bf28}+2.53\%$
test_unlock_nested 0.8160ms 0.3569ms 2.8022 KOps/s 2.3891 KOps/s $\textbf{\color{#35bf28}+17.29\%}$
test_unlock_stack_nested 0.4777ms 0.3105ms 3.2206 KOps/s 3.1093 KOps/s $\color{#35bf28}+3.58\%$
test_flatten_speed 0.2180ms 96.2799μs 10.3864 KOps/s 10.3396 KOps/s $\color{#35bf28}+0.45\%$
test_unflatten_speed 0.6951ms 0.4119ms 2.4280 KOps/s 2.4317 KOps/s $\color{#d91a1a}-0.15\%$
test_common_ops 1.6461ms 0.7378ms 1.3554 KOps/s 1.3976 KOps/s $\color{#d91a1a}-3.02\%$
test_creation 23.3140μs 1.9227μs 520.0980 KOps/s 513.2244 KOps/s $\color{#35bf28}+1.34\%$
test_creation_empty 35.0260μs 11.8888μs 84.1126 KOps/s 98.1154 KOps/s $\textbf{\color{#d91a1a}-14.27\%}$
test_creation_nested_1 45.5550μs 14.3388μs 69.7409 KOps/s 76.8071 KOps/s $\textbf{\color{#d91a1a}-9.20\%}$
test_creation_nested_2 47.4990μs 17.7822μs 56.2361 KOps/s 61.7305 KOps/s $\textbf{\color{#d91a1a}-8.90\%}$
test_clone 0.1261ms 13.4002μs 74.6257 KOps/s 72.4048 KOps/s $\color{#35bf28}+3.07\%$
test_getitem[int] 36.4090μs 11.7445μs 85.1461 KOps/s 84.5062 KOps/s $\color{#35bf28}+0.76\%$
test_getitem[slice_int] 63.0480μs 23.4116μs 42.7138 KOps/s 43.2964 KOps/s $\color{#d91a1a}-1.35\%$
test_getitem[range] 81.7240μs 61.5136μs 16.2566 KOps/s 15.5694 KOps/s $\color{#35bf28}+4.41\%$
test_getitem[tuple] 54.7830μs 19.4037μs 51.5365 KOps/s 51.5100 KOps/s $\color{#35bf28}+0.05\%$
test_getitem[list] 0.2104ms 41.7842μs 23.9325 KOps/s 24.2767 KOps/s $\color{#d91a1a}-1.42\%$
test_setitem_dim[int] 74.7010μs 35.3827μs 28.2624 KOps/s 28.8296 KOps/s $\color{#d91a1a}-1.97\%$
test_setitem_dim[slice_int] 0.1098ms 61.7210μs 16.2019 KOps/s 16.1941 KOps/s $\color{#35bf28}+0.05\%$
test_setitem_dim[range] 0.2092ms 85.6517μs 11.6752 KOps/s 11.4189 KOps/s $\color{#35bf28}+2.24\%$
test_setitem_dim[tuple] 88.4360μs 49.5824μs 20.1685 KOps/s 19.9402 KOps/s $\color{#35bf28}+1.14\%$
test_setitem 56.7560μs 20.4429μs 48.9168 KOps/s 48.3813 KOps/s $\color{#35bf28}+1.11\%$
test_set 57.5170μs 19.6619μs 50.8598 KOps/s 49.6155 KOps/s $\color{#35bf28}+2.51\%$
test_set_shared 2.9637ms 0.1481ms 6.7526 KOps/s 6.9362 KOps/s $\color{#d91a1a}-2.65\%$
test_update 0.2350ms 22.3237μs 44.7955 KOps/s 46.4449 KOps/s $\color{#d91a1a}-3.55\%$
test_update_nested 93.1050μs 31.3679μs 31.8797 KOps/s 33.1526 KOps/s $\color{#d91a1a}-3.84\%$
test_update__nested 74.9710μs 24.9403μs 40.0957 KOps/s 39.3952 KOps/s $\color{#35bf28}+1.78\%$
test_set_nested 88.8470μs 21.9076μs 45.6463 KOps/s 46.1832 KOps/s $\color{#d91a1a}-1.16\%$
test_set_nested_new 89.6590μs 25.7769μs 38.7945 KOps/s 39.5010 KOps/s $\color{#d91a1a}-1.79\%$
test_select 0.1068ms 41.3093μs 24.2076 KOps/s 24.2425 KOps/s $\color{#d91a1a}-0.14\%$
test_select_nested 0.1344ms 61.4908μs 16.2626 KOps/s 16.3946 KOps/s $\color{#d91a1a}-0.81\%$
test_exclude_nested 0.3024ms 0.1238ms 8.0784 KOps/s 8.1062 KOps/s $\color{#d91a1a}-0.34\%$
test_empty[True] 0.6603ms 0.4084ms 2.4489 KOps/s 2.4375 KOps/s $\color{#35bf28}+0.47\%$
test_empty[False] 6.2136μs 1.0941μs 913.9827 KOps/s 914.4619 KOps/s $\color{#d91a1a}-0.05\%$
test_unbind_speed 1.9461ms 0.2640ms 3.7885 KOps/s 3.7639 KOps/s $\color{#35bf28}+0.65\%$
test_unbind_speed_stack0 0.4790ms 0.2491ms 4.0139 KOps/s 3.8337 KOps/s $\color{#35bf28}+4.70\%$
test_unbind_speed_stack1 75.5524ms 0.7219ms 1.3852 KOps/s 1.2174 KOps/s $\textbf{\color{#35bf28}+13.79\%}$
test_split 77.8431ms 1.6688ms 599.2342 Ops/s 600.5056 Ops/s $\color{#d91a1a}-0.21\%$
test_chunk 74.9934ms 1.6656ms 600.3724 Ops/s 651.8272 Ops/s $\textbf{\color{#d91a1a}-7.89\%}$
test_creation[device0] 0.2176ms 85.4402μs 11.7041 KOps/s 11.5072 KOps/s $\color{#35bf28}+1.71\%$
test_creation_from_tensor 7.2233ms 87.8966μs 11.3770 KOps/s 11.6533 KOps/s $\color{#d91a1a}-2.37\%$
test_add_one[memmap_tensor0] 0.1330ms 5.4486μs 183.5335 KOps/s 178.7304 KOps/s $\color{#35bf28}+2.69\%$
test_contiguous[memmap_tensor0] 25.6680μs 0.6446μs 1.5513 MOps/s 1.5444 MOps/s $\color{#35bf28}+0.45\%$
test_stack[memmap_tensor0] 62.9580μs 3.5548μs 281.3134 KOps/s 278.3062 KOps/s $\color{#35bf28}+1.08\%$
test_memmaptd_index 1.0288ms 0.2599ms 3.8479 KOps/s 3.2351 KOps/s $\textbf{\color{#35bf28}+18.94\%}$
test_memmaptd_index_astensor 0.5973ms 0.3354ms 2.9814 KOps/s 3.0226 KOps/s $\color{#d91a1a}-1.37\%$
test_memmaptd_index_op 1.2195ms 0.6480ms 1.5433 KOps/s 1.5839 KOps/s $\color{#d91a1a}-2.56\%$
test_serialize_model 0.1869s 0.1194s 8.3732 Ops/s 8.4313 Ops/s $\color{#d91a1a}-0.69\%$
test_serialize_model_pickle 0.4515s 0.3785s 2.6422 Ops/s 2.6611 Ops/s $\color{#d91a1a}-0.71\%$
test_serialize_weights 0.1837s 0.1179s 8.4789 Ops/s 8.6232 Ops/s $\color{#d91a1a}-1.67\%$
test_serialize_weights_returnearly 0.1376s 0.1320s 7.5780 Ops/s 6.8839 Ops/s $\textbf{\color{#35bf28}+10.08\%}$
test_serialize_weights_pickle 1.0879s 0.7642s 1.3086 Ops/s 1.3720 Ops/s $\color{#d91a1a}-4.62\%$
test_serialize_weights_filesystem 0.1646s 0.1017s 9.8341 Ops/s 9.6639 Ops/s $\color{#35bf28}+1.76\%$
test_serialize_model_filesystem 0.1053s 97.4778ms 10.2587 Ops/s 10.2525 Ops/s $\color{#35bf28}+0.06\%$
test_reshape_pytree 73.4780μs 26.0624μs 38.3695 KOps/s 37.2903 KOps/s $\color{#35bf28}+2.89\%$
test_reshape_td 80.6320μs 34.8791μs 28.6705 KOps/s 29.2106 KOps/s $\color{#d91a1a}-1.85\%$
test_view_pytree 81.3730μs 25.9662μs 38.5117 KOps/s 39.0547 KOps/s $\color{#d91a1a}-1.39\%$
test_view_td 78.8680μs 38.4607μs 26.0006 KOps/s 25.7011 KOps/s $\color{#35bf28}+1.17\%$
test_unbind_pytree 78.4480μs 29.3995μs 34.0142 KOps/s 34.2178 KOps/s $\color{#d91a1a}-0.60\%$
test_unbind_td 0.3734ms 38.4517μs 26.0066 KOps/s 26.0365 KOps/s $\color{#d91a1a}-0.11\%$
test_split_pytree 92.3830μs 29.5245μs 33.8702 KOps/s 34.1390 KOps/s $\color{#d91a1a}-0.79\%$
test_split_td 0.1206ms 41.4200μs 24.1430 KOps/s 24.0776 KOps/s $\color{#35bf28}+0.27\%$
test_add_pytree 85.1300μs 35.3108μs 28.3200 KOps/s 27.9962 KOps/s $\color{#35bf28}+1.16\%$
test_add_td 0.1245ms 57.5727μs 17.3694 KOps/s 18.5274 KOps/s $\textbf{\color{#d91a1a}-6.25\%}$
test_distributed 0.2514ms 0.1036ms 9.6536 KOps/s 9.5422 KOps/s $\color{#35bf28}+1.17\%$
test_tdmodule 34.4450μs 18.6737μs 53.5512 KOps/s 56.9033 KOps/s $\textbf{\color{#d91a1a}-5.89\%}$
test_tdmodule_dispatch 77.8150μs 36.4958μs 27.4004 KOps/s 28.2611 KOps/s $\color{#d91a1a}-3.05\%$
test_tdseq 62.0260μs 21.7660μs 45.9433 KOps/s 47.9837 KOps/s $\color{#d91a1a}-4.25\%$
test_tdseq_dispatch 64.9220μs 41.8903μs 23.8719 KOps/s 24.8662 KOps/s $\color{#d91a1a}-4.00\%$
test_instantiation_functorch 2.2223ms 1.3170ms 759.3075 Ops/s 745.2970 Ops/s $\color{#35bf28}+1.88\%$
test_instantiation_td 1.7840ms 1.0275ms 973.2069 Ops/s 949.0038 Ops/s $\color{#35bf28}+2.55\%$
test_exec_functorch 0.3043ms 0.1644ms 6.0826 KOps/s 5.7207 KOps/s $\textbf{\color{#35bf28}+6.33\%}$
test_exec_functional_call 0.3464ms 0.1553ms 6.4382 KOps/s 6.4000 KOps/s $\color{#35bf28}+0.60\%$
test_exec_td 0.3702ms 0.1476ms 6.7738 KOps/s 6.4859 KOps/s $\color{#35bf28}+4.44\%$
test_exec_td_decorator 0.9214ms 0.2245ms 4.4548 KOps/s 4.3993 KOps/s $\color{#35bf28}+1.26\%$
test_vmap_mlp_speed[True-True] 0.8012ms 0.5028ms 1.9889 KOps/s 2.0297 KOps/s $\color{#d91a1a}-2.01\%$
test_vmap_mlp_speed[True-False] 0.9311ms 0.5021ms 1.9915 KOps/s 2.0504 KOps/s $\color{#d91a1a}-2.88\%$
test_vmap_mlp_speed[False-True] 0.5101ms 0.4066ms 2.4596 KOps/s 2.5019 KOps/s $\color{#d91a1a}-1.69\%$
test_vmap_mlp_speed[False-False] 0.6510ms 0.4079ms 2.4515 KOps/s 2.4916 KOps/s $\color{#d91a1a}-1.61\%$
test_vmap_mlp_speed_decorator[True-True] 1.0401ms 0.5738ms 1.7429 KOps/s 1.7783 KOps/s $\color{#d91a1a}-1.99\%$
test_vmap_mlp_speed_decorator[True-False] 1.4447ms 0.5899ms 1.6952 KOps/s 1.6428 KOps/s $\color{#35bf28}+3.19\%$
test_vmap_mlp_speed_decorator[False-True] 0.6756ms 0.4718ms 2.1195 KOps/s 2.1623 KOps/s $\color{#d91a1a}-1.98\%$
test_vmap_mlp_speed_decorator[False-False] 0.6882ms 0.4724ms 2.1167 KOps/s 2.1821 KOps/s $\color{#d91a1a}-3.00\%$
test_to_module_speed[True] 2.5492ms 1.7266ms 579.1741 Ops/s 596.2517 Ops/s $\color{#d91a1a}-2.86\%$
test_to_module_speed[False] 2.6736ms 1.6742ms 597.3104 Ops/s 587.7338 Ops/s $\color{#35bf28}+1.63\%$
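The Change column in the table above is consistent with the relative difference between Ops and Ops on Repo HEAD, and the bolded entries (which match the Improved and Worsened counts) are consistent with a cutoff of roughly 5%. Both observations are inferred from the numbers, not taken from the benchmark workflow's source, so the sketch below is an assumption about how the report is derived; `relative_change` and `classify` are hypothetical names.

```python
def relative_change(ops, ops_head):
    """Percentage change in throughput relative to the repo HEAD baseline."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct, threshold=5.0):
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "within noise"


# Values copied from the test_plain_set_nested row of the CPU table.
change = relative_change(56.0765, 58.6068)
print(f"{change:+.2f}% -> {classify(change)}")
```

Read this way, most rows fall inside the assumed noise band, and only the bolded rows count toward the summary totals.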

github-actions bot commented May 15, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 135. Improved: $\large\color{#35bf28}5$. Worsened: $\large\color{#d91a1a}2$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 61.4920μs 12.8124μs 78.0492 KOps/s 76.3652 KOps/s $\color{#35bf28}+2.21\%$
test_plain_set_stack_nested 35.2210μs 12.9310μs 77.3336 KOps/s 76.6519 KOps/s $\color{#35bf28}+0.89\%$
test_plain_set_nested_inplace 31.2910μs 14.2044μs 70.4006 KOps/s 69.8260 KOps/s $\color{#35bf28}+0.82\%$
test_plain_set_stack_nested_inplace 33.0310μs 14.1825μs 70.5093 KOps/s 70.2689 KOps/s $\color{#35bf28}+0.34\%$
test_items 23.0810μs 4.7415μs 210.9058 KOps/s 210.6723 KOps/s $\color{#35bf28}+0.11\%$
test_items_nested 0.3933ms 0.3401ms 2.9401 KOps/s 3.0122 KOps/s $\color{#d91a1a}-2.39\%$
test_items_nested_locked 0.3749ms 0.3400ms 2.9410 KOps/s 2.9544 KOps/s $\color{#d91a1a}-0.45\%$
test_items_nested_leaf 0.1124ms 82.2884μs 12.1524 KOps/s 12.0983 KOps/s $\color{#35bf28}+0.45\%$
test_items_stack_nested 0.3936ms 0.3401ms 2.9401 KOps/s 2.9641 KOps/s $\color{#d91a1a}-0.81\%$
test_items_stack_nested_leaf 0.1146ms 84.7251μs 11.8029 KOps/s 12.0715 KOps/s $\color{#d91a1a}-2.23\%$
test_items_stack_nested_locked 0.3701ms 0.3434ms 2.9124 KOps/s 2.9493 KOps/s $\color{#d91a1a}-1.25\%$
test_keys 22.9810μs 4.7035μs 212.6060 KOps/s 210.9931 KOps/s $\color{#35bf28}+0.76\%$
test_keys_nested 92.8720μs 68.3657μs 14.6272 KOps/s 14.4890 KOps/s $\color{#35bf28}+0.95\%$
test_keys_nested_locked 0.6692ms 73.3098μs 13.6407 KOps/s 13.5198 KOps/s $\color{#35bf28}+0.89\%$
test_keys_nested_leaf 81.1020μs 58.4546μs 17.1073 KOps/s 16.8688 KOps/s $\color{#35bf28}+1.41\%$
test_keys_stack_nested 90.8020μs 67.1996μs 14.8810 KOps/s 14.6009 KOps/s $\color{#35bf28}+1.92\%$
test_keys_stack_nested_leaf 78.5620μs 58.1832μs 17.1871 KOps/s 16.8913 KOps/s $\color{#35bf28}+1.75\%$
test_keys_stack_nested_locked 97.4020μs 72.5413μs 13.7852 KOps/s 13.7307 KOps/s $\color{#35bf28}+0.40\%$
test_values 7.1500μs 1.8164μs 550.5477 KOps/s 546.3679 KOps/s $\color{#35bf28}+0.77\%$
test_values_nested 60.1020μs 35.6867μs 28.0216 KOps/s 28.0062 KOps/s $\color{#35bf28}+0.06\%$
test_values_nested_locked 52.4610μs 38.2695μs 26.1305 KOps/s 26.5325 KOps/s $\color{#d91a1a}-1.52\%$
test_values_nested_leaf 46.3910μs 31.8232μs 31.4236 KOps/s 31.3875 KOps/s $\color{#35bf28}+0.12\%$
test_values_stack_nested 59.3120μs 36.2752μs 27.5670 KOps/s 27.4077 KOps/s $\color{#35bf28}+0.58\%$
test_values_stack_nested_leaf 51.1610μs 32.6518μs 30.6262 KOps/s 30.6511 KOps/s $\color{#d91a1a}-0.08\%$
test_values_stack_nested_locked 60.1510μs 38.8458μs 25.7428 KOps/s 25.9860 KOps/s $\color{#d91a1a}-0.94\%$
test_membership 1.3465μs 0.7000μs 1.4286 MOps/s 1.3832 MOps/s $\color{#35bf28}+3.28\%$
test_membership_nested 15.0400μs 2.4950μs 400.7959 KOps/s 391.4485 KOps/s $\color{#35bf28}+2.39\%$
test_membership_nested_leaf 16.4800μs 2.4876μs 401.9936 KOps/s 390.4335 KOps/s $\color{#35bf28}+2.96\%$
test_membership_stacked_nested 27.0210μs 2.5091μs 398.5417 KOps/s 391.0967 KOps/s $\color{#35bf28}+1.90\%$
test_membership_stacked_nested_leaf 20.2810μs 2.4897μs 401.6581 KOps/s 393.8437 KOps/s $\color{#35bf28}+1.98\%$
test_membership_nested_last 17.3800μs 3.0487μs 328.0139 KOps/s 324.8281 KOps/s $\color{#35bf28}+0.98\%$
test_membership_nested_leaf_last 24.6600μs 3.0434μs 328.5754 KOps/s 325.4699 KOps/s $\color{#35bf28}+0.95\%$
test_membership_stacked_nested_last 22.4800μs 3.0522μs 327.6274 KOps/s 237.0520 KOps/s $\textbf{\color{#35bf28}+38.21\%}$
test_membership_stacked_nested_leaf_last 16.5300μs 3.0334μs 329.6663 KOps/s 237.6900 KOps/s $\textbf{\color{#35bf28}+38.70\%}$
test_nested_getleaf 29.1110μs 8.4428μs 118.4444 KOps/s 118.8568 KOps/s $\color{#d91a1a}-0.35\%$
test_nested_get 28.8410μs 7.9654μs 125.5431 KOps/s 126.4472 KOps/s $\color{#d91a1a}-0.72\%$
test_stacked_getleaf 28.6100μs 8.4083μs 118.9305 KOps/s 118.6985 KOps/s $\color{#35bf28}+0.20\%$
test_stacked_get 24.4300μs 7.9167μs 126.3153 KOps/s 127.4697 KOps/s $\color{#d91a1a}-0.91\%$
test_nested_getitemleaf 29.3710μs 8.5856μs 116.4735 KOps/s 116.8615 KOps/s $\color{#d91a1a}-0.33\%$
test_nested_getitem 23.3210μs 8.1050μs 123.3808 KOps/s 123.8724 KOps/s $\color{#d91a1a}-0.40\%$
test_stacked_getitemleaf 24.6210μs 8.5665μs 116.7336 KOps/s 116.6433 KOps/s $\color{#35bf28}+0.08\%$
test_stacked_getitem 26.0610μs 8.0531μs 124.1751 KOps/s 123.8829 KOps/s $\color{#35bf28}+0.24\%$
test_lock_nested 55.4446ms 0.4171ms 2.3974 KOps/s 2.3566 KOps/s $\color{#35bf28}+1.73\%$
test_lock_stack_nested 0.3548ms 0.3146ms 3.1783 KOps/s 3.2247 KOps/s $\color{#d91a1a}-1.44\%$
test_unlock_nested 0.7583ms 0.3607ms 2.7723 KOps/s 2.7695 KOps/s $\color{#35bf28}+0.10\%$
test_unlock_stack_nested 0.3491ms 0.3227ms 3.0991 KOps/s 3.1467 KOps/s $\color{#d91a1a}-1.51\%$
test_flatten_speed 0.4760ms 0.1036ms 9.6515 KOps/s 9.8370 KOps/s $\color{#d91a1a}-1.89\%$
test_unflatten_speed 0.3446ms 0.2903ms 3.4449 KOps/s 3.4354 KOps/s $\color{#35bf28}+0.28\%$
test_common_ops 1.0384ms 0.5902ms 1.6944 KOps/s 1.6369 KOps/s $\color{#35bf28}+3.52\%$
test_creation 35.4510μs 1.6531μs 604.9194 KOps/s 607.9347 KOps/s $\color{#d91a1a}-0.50\%$
test_creation_empty 32.6710μs 8.4916μs 117.7637 KOps/s 113.2835 KOps/s $\color{#35bf28}+3.95\%$
test_creation_nested_1 26.4000μs 10.3313μs 96.7934 KOps/s 94.7624 KOps/s $\color{#35bf28}+2.14\%$
test_creation_nested_2 33.2610μs 12.6544μs 79.0242 KOps/s 78.4378 KOps/s $\color{#35bf28}+0.75\%$
test_clone 91.3920μs 12.1646μs 82.2059 KOps/s 81.5191 KOps/s $\color{#35bf28}+0.84\%$
test_getitem[int] 30.6600μs 10.9877μs 91.0110 KOps/s 90.5109 KOps/s $\color{#35bf28}+0.55\%$
test_getitem[slice_int] 46.4310μs 21.8252μs 45.8187 KOps/s 46.2548 KOps/s $\color{#d91a1a}-0.94\%$
test_getitem[range] 67.8010μs 48.1130μs 20.7844 KOps/s 20.7087 KOps/s $\color{#35bf28}+0.37\%$
test_getitem[tuple] 42.9310μs 19.5648μs 51.1121 KOps/s 50.5928 KOps/s $\color{#35bf28}+1.03\%$
test_getitem[list] 0.1327ms 35.6342μs 28.0629 KOps/s 28.4419 KOps/s $\color{#d91a1a}-1.33\%$
test_setitem_dim[int] 44.9610μs 28.8105μs 34.7096 KOps/s 33.3173 KOps/s $\color{#35bf28}+4.18\%$
test_setitem_dim[slice_int] 75.0020μs 50.9103μs 19.6424 KOps/s 19.6281 KOps/s $\color{#35bf28}+0.07\%$
test_setitem_dim[range] 96.9620μs 67.6645μs 14.7788 KOps/s 14.6209 KOps/s $\color{#35bf28}+1.08\%$
test_setitem_dim[tuple] 62.2010μs 43.7589μs 22.8525 KOps/s 22.2995 KOps/s $\color{#35bf28}+2.48\%$
test_setitem 42.3000μs 16.9815μs 58.8878 KOps/s 57.7126 KOps/s $\color{#35bf28}+2.04\%$
test_set 51.8310μs 16.4974μs 60.6156 KOps/s 59.0736 KOps/s $\color{#35bf28}+2.61\%$
test_set_shared 69.9313ms 0.1134ms 8.8214 KOps/s 9.8632 KOps/s $\textbf{\color{#d91a1a}-10.56\%}$
test_update 0.1132ms 19.6079μs 51.0000 KOps/s 52.5282 KOps/s $\color{#d91a1a}-2.91\%$
test_update_nested 72.3420μs 23.8593μs 41.9124 KOps/s 40.4761 KOps/s $\color{#35bf28}+3.55\%$
test_update__nested 67.7420μs 23.4245μs 42.6903 KOps/s 42.8482 KOps/s $\color{#d91a1a}-0.37\%$
test_set_nested 54.9110μs 17.7570μs 56.3159 KOps/s 55.8582 KOps/s $\color{#35bf28}+0.82\%$
test_set_nested_new 63.5310μs 20.1082μs 49.7310 KOps/s 48.2609 KOps/s $\color{#35bf28}+3.05\%$
test_select 70.9610μs 33.8151μs 29.5726 KOps/s 29.4347 KOps/s $\color{#35bf28}+0.47\%$
test_select_nested 0.8938ms 55.8336μs 17.9104 KOps/s 18.2825 KOps/s $\color{#d91a1a}-2.04\%$
test_exclude_nested 0.1314ms 0.1121ms 8.9224 KOps/s 9.0631 KOps/s $\color{#d91a1a}-1.55\%$
test_empty[True] 0.4089ms 0.3482ms 2.8723 KOps/s 2.8903 KOps/s $\color{#d91a1a}-0.62\%$
test_empty[False] 2.9561μs 0.8791μs 1.1375 MOps/s 1.1343 MOps/s $\color{#35bf28}+0.28\%$
test_to 0.1068ms 79.1949μs 12.6271 KOps/s 12.7249 KOps/s $\color{#d91a1a}-0.77\%$
test_to_nonblocking 95.0610μs 64.4088μs 15.5258 KOps/s 15.5601 KOps/s $\color{#d91a1a}-0.22\%$
test_unbind_speed 0.3264ms 0.2798ms 3.5740 KOps/s 3.6801 KOps/s $\color{#d91a1a}-2.88\%$
test_unbind_speed_stack0 0.3421ms 0.2805ms 3.5648 KOps/s 3.6673 KOps/s $\color{#d91a1a}-2.79\%$
test_unbind_speed_stack1 74.1470ms 0.8325ms 1.2011 KOps/s 1.2354 KOps/s $\color{#d91a1a}-2.77\%$
test_split 1.7099ms 1.6123ms 620.2511 Ops/s 635.4181 Ops/s $\color{#d91a1a}-2.39\%$
test_chunk 72.5356ms 1.7165ms 582.5649 Ops/s 592.4406 Ops/s $\color{#d91a1a}-1.67\%$
test_creation[device0] 0.1109ms 58.0814μs 17.2172 KOps/s 17.1298 KOps/s $\color{#35bf28}+0.51\%$
test_creation_from_tensor 0.1323ms 57.1640μs 17.4935 KOps/s 18.1581 KOps/s $\color{#d91a1a}-3.66\%$
test_add_one[memmap_tensor0] 81.8120μs 7.2663μs 137.6212 KOps/s 134.0293 KOps/s $\color{#35bf28}+2.68\%$
test_contiguous[memmap_tensor0] 24.8010μs 0.6688μs 1.4953 MOps/s 1.4938 MOps/s $\color{#35bf28}+0.10\%$
test_stack[memmap_tensor0] 30.4000μs 5.0194μs 199.2283 KOps/s 198.5644 KOps/s $\color{#35bf28}+0.33\%$
test_memmaptd_index 1.1831ms 0.2991ms 3.3431 KOps/s 3.4451 KOps/s $\color{#d91a1a}-2.96\%$
test_memmaptd_index_astensor 0.6350ms 0.3738ms 2.6753 KOps/s 2.7393 KOps/s $\color{#d91a1a}-2.34\%$
test_memmaptd_index_op 1.0895ms 0.6766ms 1.4780 KOps/s 1.4529 KOps/s $\color{#35bf28}+1.73\%$
test_serialize_model 0.1804s 0.1125s 8.8871 Ops/s 8.7878 Ops/s $\color{#35bf28}+1.13\%$
test_serialize_model_pickle 1.3509s 1.2359s 0.8091 Ops/s 0.8077 Ops/s $\color{#35bf28}+0.18\%$
test_serialize_weights 0.1053s 0.1016s 9.8449 Ops/s 9.8062 Ops/s $\color{#35bf28}+0.40\%$
test_serialize_weights_returnearly 0.2091s 95.5424ms 10.4666 Ops/s 11.1326 Ops/s $\textbf{\color{#d91a1a}-5.98\%}$
test_serialize_weights_pickle 1.3968s 1.2433s 0.8043 Ops/s 0.8007 Ops/s $\color{#35bf28}+0.45\%$
test_reshape_pytree 57.6610μs 23.8681μs 41.8969 KOps/s 41.9448 KOps/s $\color{#d91a1a}-0.11\%$
test_reshape_td 48.9510μs 32.1657μs 31.0890 KOps/s 31.6345 KOps/s $\color{#d91a1a}-1.72\%$
test_view_pytree 51.4700μs 23.8030μs 42.0115 KOps/s 40.5803 KOps/s $\color{#35bf28}+3.53\%$
test_view_td 68.1710μs 35.8908μs 27.8623 KOps/s 27.4556 KOps/s $\color{#35bf28}+1.48\%$
test_unbind_pytree 46.6810μs 29.6199μs 33.7611 KOps/s 33.7426 KOps/s $\color{#35bf28}+0.05\%$
test_unbind_td 0.4200ms 43.9334μs 22.7617 KOps/s 23.7929 KOps/s $\color{#d91a1a}-4.33\%$
test_split_pytree 0.1613ms 33.3299μs 30.0031 KOps/s 30.4525 KOps/s $\color{#d91a1a}-1.48\%$
test_split_td 0.5034ms 41.6291μs 24.0217 KOps/s 25.0954 KOps/s $\color{#d91a1a}-4.28\%$
test_add_pytree 55.3110μs 35.5235μs 28.1504 KOps/s 24.8327 KOps/s $\textbf{\color{#35bf28}+13.36\%}$
test_add_td 83.8120μs 49.3270μs 20.2729 KOps/s 18.3231 KOps/s $\textbf{\color{#35bf28}+10.64\%}$
test_distributed 0.1428ms 66.6071μs 15.0134 KOps/s 14.3450 KOps/s $\color{#35bf28}+4.66\%$
test_tdmodule 35.4500μs 14.8983μs 67.1216 KOps/s 63.2948 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_tdmodule_dispatch 44.3710μs 29.0734μs 34.3957 KOps/s 33.5262 KOps/s $\color{#35bf28}+2.59\%$
test_tdseq 32.2700μs 16.6118μs 60.1981 KOps/s 58.5950 KOps/s $\color{#35bf28}+2.74\%$
test_tdseq_dispatch 48.8510μs 32.6214μs 30.6547 KOps/s 29.7273 KOps/s $\color{#35bf28}+3.12\%$
test_instantiation_functorch 1.8452ms 1.5373ms 650.4705 Ops/s 658.8586 Ops/s $\color{#d91a1a}-1.27\%$
test_instantiation_td 75.6019ms 1.1683ms 855.9489 Ops/s 853.5232 Ops/s $\color{#35bf28}+0.28\%$
test_exec_functorch 0.1891ms 0.1537ms 6.5053 KOps/s 6.3209 KOps/s $\color{#35bf28}+2.92\%$
test_exec_functional_call 0.1753ms 0.1427ms 7.0091 KOps/s 6.9795 KOps/s $\color{#35bf28}+0.42\%$
test_exec_td 0.1859ms 0.1424ms 7.0225 KOps/s 6.9480 KOps/s $\color{#35bf28}+1.07\%$
test_exec_td_decorator 0.3103ms 0.2170ms 4.6090 KOps/s 4.5628 KOps/s $\color{#35bf28}+1.01\%$
test_vmap_mlp_speed[True-True] 0.7492ms 0.6196ms 1.6139 KOps/s 1.6202 KOps/s $\color{#d91a1a}-0.39\%$
test_vmap_mlp_speed[True-False] 0.6498ms 0.6169ms 1.6209 KOps/s 1.6338 KOps/s $\color{#d91a1a}-0.79\%$
test_vmap_mlp_speed[False-True] 0.5975ms 0.5462ms 1.8308 KOps/s 1.7978 KOps/s $\color{#35bf28}+1.84\%$
test_vmap_mlp_speed[False-False] 0.5905ms 0.5452ms 1.8343 KOps/s 1.7733 KOps/s $\color{#35bf28}+3.44\%$
test_vmap_mlp_speed_decorator[True-True] 1.4972ms 0.6808ms 1.4689 KOps/s 1.4477 KOps/s $\color{#35bf28}+1.47\%$
test_vmap_mlp_speed_decorator[True-False] 0.7967ms 0.6794ms 1.4720 KOps/s 1.4717 KOps/s $\color{#35bf28}+0.02\%$
test_vmap_mlp_speed_decorator[False-True] 0.7325ms 0.6039ms 1.6558 KOps/s 1.6468 KOps/s $\color{#35bf28}+0.54\%$
test_vmap_mlp_speed_decorator[False-False] 0.7428ms 0.6029ms 1.6586 KOps/s 1.6648 KOps/s $\color{#d91a1a}-0.37\%$
test_vmap_transformer_speed[True-True] 8.3046ms 8.1781ms 122.2783 Ops/s 122.0643 Ops/s $\color{#35bf28}+0.18\%$
test_vmap_transformer_speed[True-False] 8.7796ms 8.3612ms 119.6000 Ops/s 122.7525 Ops/s $\color{#d91a1a}-2.57\%$
test_vmap_transformer_speed[False-True] 8.6831ms 8.3457ms 119.8220 Ops/s 123.1338 Ops/s $\color{#d91a1a}-2.69\%$
test_vmap_transformer_speed[False-False] 8.8026ms 8.2931ms 120.5819 Ops/s 123.2504 Ops/s $\color{#d91a1a}-2.17\%$
test_vmap_transformer_speed_decorator[True-True] 20.9969ms 20.2842ms 49.2994 Ops/s 50.6019 Ops/s $\color{#d91a1a}-2.57\%$
test_vmap_transformer_speed_decorator[True-False] 20.6694ms 20.3133ms 49.2288 Ops/s 50.3893 Ops/s $\color{#d91a1a}-2.30\%$
test_vmap_transformer_speed_decorator[False-True] 20.9412ms 20.2414ms 49.4036 Ops/s 50.4768 Ops/s $\color{#d91a1a}-2.13\%$
test_vmap_transformer_speed_decorator[False-False] 20.8692ms 20.2392ms 49.4091 Ops/s 50.8435 Ops/s $\color{#d91a1a}-2.82\%$
test_to_module_speed[True] 1.8091ms 1.5698ms 637.0206 Ops/s 644.0178 Ops/s $\color{#d91a1a}-1.09\%$
test_to_module_speed[False] 1.7497ms 1.5694ms 637.1756 Ops/s 653.2855 Ops/s $\color{#d91a1a}-2.47\%$

@vmoens vmoens merged commit a088b87 into main May 16, 2024
@vmoens vmoens deleted the fix-unbind-memmap branch May 16, 2024 09:00

Labels: bug, CLA Signed


3 participants