You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
A lot has changed in the last years. Go's sort and overall language implementation has improved. So I've decided to rerun a few tests and do research in order to decide if the included Timsort's implementation is worth the maintenance and additional memory allocation in GoDS:
github.com/psilva261/timsort/# go test -test.bench=.*
(Go 1.6, Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz)
Speed, as per given datasets, is still in favor of Timsort, albeit these xor-datasets are biased towards Timsort's adaptivness from minrun, exactly the purpose for which Timsort was crafted with assumption that real-world datasets show similar patterns. Xor-dataset generation : 0xff & (i ^ 0xab), starting xor-dataset sequence:
However, filtering out only random datasets, Go's sort expectedly has better performance, because Timsort's attempt to adapt to a random dataset only harms it. Go's simple two phase approach (quicksort first for large, then insertion sort for small datasets) performs better.
Modifying the test parameters to show memory allocation : go test -bench . -benchmem -benchtime 1s (allocated & allocation):
The dataset item in question (int is 64-bit on my machine):
typerecordstruct {
key, orderint
}
Go's sort as per documentation in source code does not make any memory allocations. This favors Go's sort.
Conclusion:
Timsort has better performance for dataset resembling (questionable?) real-world data with intrinsic patterns.
Go's sort is memory friendlier.
Timsort is stable unlike Go's sort, albeit Go's sort provides also a stable sort not tested here.
Using Go's sort in this project would mean less maintenance.
Using Go's sort would offload any optimizations to the algorithm to the Go's authors in future.
Even after this "quick" analysis, I am not sure if a switch to Go's sort is a good or bad idea or something in-between (most-probably). There is nothing like an average dataset and the choice depends on the nature of the dataset. When using GoDS in writing and solving complex systems, sorting is least likely to be the bottleneck, so analyzing this further is irrelevant and topic to academic discussions.
Timsort might be replaced in GoDS by Go's native sorting simply for keeping the code base of this library as small as possible and arguments given above.
A lot has changed in the last years. Go's sort and overall language implementation has improved. So I've decided to rerun a few tests and do research in order to decide if the included Timsort's implementation is worth the maintenance and additional memory allocation in GoDS:
github.com/psilva261/timsort/# go test -test.bench=.*
(Go 1.6, Intel(R) Core(TM) i5-4690K CPU @ 3.50GHz)
Speed, as per given datasets, is still in favor of Timsort, albeit these xor-datasets are biased towards Timsort's adaptivness from minrun, exactly the purpose for which Timsort was crafted with assumption that real-world datasets show similar patterns. Xor-dataset generation :
0xff & (i ^ 0xab)
, starting xor-dataset sequence:171, 170, 169, 168, 175, 174, 173, 172, 163, 162, 161, 160, 167, 166, 165, 164, 187, 186, 185, 184, 191, 190, 189, 188, 179, 178, 177, 176, 183, 182, 181, 180, 139, 138, 137, 136, 143, 142, 141, 140, 131, 130, 129, 128, 135, 134, 133, 132, 155, 154, 153, 152, 159, 158, 157, 156, 147, 146, 145, 144, 151, 150, 149, 148, 235, 234, 233, 232, 239, 238, 237, 236, 227, 226, 225, 224, 231, 230, 229, 228, 251, 250, 249, 248, 255, 254, 253, 252, 243, 242, 241, 240, 247, 246, 245, 244, 203, 202, 201, 200, 207, 206, 205, 204, 195, 194, 193, 192, 199, 198, 197, 196, 219, 218, 217, 216, 223, 222, 221, 220, 211, 210, 209, 208, 215, 214, 213, 212, 43, 42, 41, 40, 47, 46, 45, 44, 35, 34, 33, 32, 39, 38, 37, 36, 59, 58, 57, 56, 63, 62, 61, 60, 51, 50, 49, 48, 55, 54, 53, 52, 11, 10, 9, 8, 15, 14, 13, 12, 3, 2, 1, 0, 7, 6, 5, 4, 27, 26, 25, 24, 31, 30, 29, 28, 19, 18, 17, 16, 23, 22, 21, 20, 107, 106, 105, 104, 111, 110, 109, 108, 99, 98, 97, 96, 103, 102, 101, 100, 123, 122, 121, 120, 127, 126, 125, 124, 115, 114, 113, 112, 119, 118, 117, 116, 75, 74, 73, 72, 79, 78, 77, 76, 67, 66, 65, 64, 71, 70, 69, 68, 91, 90, 89, 88, 95, 94, 93, 92, 83, 82, 81, 80, 87, 86, 85, 84, 171, 170, 169, 168, 175, 174, 173, 172, 163, 162, 161, 160, 167, 166, 165, 164, 187, 186, 185, 184, 191, 190, 189, 188, 179, 178, 177,176, 183, 182, 181, 180, 139, 138, 137, 136, 143, 142, 141, 140, 131, 130, 129, 128, 135, 134, 133, 132, 155, 154, 153, ...
However, filtering out only random datasets, Go's sort expectedly has better performance, because Timsort's attempt to adapt to a random dataset only harms it. Go's simple two phase approach (quicksort first for large, then insertion sort for small datasets) performs better.
Modifying the test parameters to show memory allocation :
go test -bench . -benchmem -benchtime 1s
(allocated & allocation):The dataset item in question (int is 64-bit on my machine):
Go's sort as per documentation in source code does not make any memory allocations. This favors Go's sort.
Conclusion:
Even after this "quick" analysis, I am not sure if a switch to Go's sort is a good or bad idea or something in-between (most-probably). There is nothing like an average dataset and the choice depends on the nature of the dataset. When using GoDS in writing and solving complex systems, sorting is least likely to be the bottleneck, so analyzing this further is irrelevant and topic to academic discussions.
Timsort might be replaced in GoDS by Go's native sorting simply for keeping the code base of this library as small as possible and arguments given above.
References:
The text was updated successfully, but these errors were encountered: