Skip to content
This repository has been archived by the owner on Apr 24, 2023. It is now read-only.

Don't mark it obsolete, speed tests for sortedcontainers are mistaken #1

Closed
netch80 opened this issue Feb 1, 2017 · 1 comment
Closed

Comments

@netch80
Copy link

netch80 commented Feb 1, 2017

Bintrees are substantially faster in case of inexact key and value search is used. The performance tests (reported at http://www.grantjenks.com/docs/sortedcontainers/performance.html) mainly ignore key ordering in containers. Only iter() test has something common with ordering; others can be faster satisfied on any non-sorted store like hash map. Moreover, the API of SortedDict with grounding on absolute index and additional operations to retrieve real data is conceptually misdesigned.

With inexact search of ceiling key or key-value pair, bintrees performance is much better. My results on 100000 key set (FreeBSD 10.3/amd64, bintrees with Cython, Python 2.7.13):

$ ./bench
Time for AVLTree (key only): 3.88829088211
Time for RBTree (key only): 3.60424399376
Time for SortedDict (key only): 5.31183195114
Time for AVLTree (KVP): 3.37197518349
Time for RBTree (KVP): 3.34165000916
Time for SortedDict (KVP): 5.43365883827

So, bintrees are almost twice faster for this task class.

The test code follows.


N = 100000*10

if name == "main":
avld = bintrees.avltree.AVLTree()
rbd = bintrees.rbtree.RBTree()
sd = sortedcontainers.SortedDict()
for k in range(0, N, 10):
avld[k] = rbd[k] = sd[k] = k * 11
t0 = time.time()
for k in range(N):
try:
t = avld.ceiling_key(k)
except KeyError:
pass
t1 = time.time()
print "Time for AVLTree (key only):", t1 - t0
t0 = time.time()
for k in range(N):
try:
t = rbd.ceiling_key(k)
except KeyError:
pass
t1 = time.time()
print "Time for RBTree (key only):", t1 - t0
t0 = time.time()
for k in range(N):
try:
t = sd.iloc[sd.bisect_left(k)]
except (KeyError, IndexError):
pass
t1 = time.time()
print "Time for SortedDict (key only):", t1 - t0
t0 = time.time()
for k in range(N):
try:
t = avld.ceiling_item(k)
except KeyError:
pass
t1 = time.time()
print "Time for AVLTree (KVP):", t1 - t0
t0 = time.time()
for k in range(N):
try:
t = rbd.ceiling_item(k)
except KeyError:
pass
t1 = time.time()
print "Time for RBTree (KVP):", t1 - t0
t0 = time.time()
for k in range(N):
try:
t = sd.iloc[sd.bisect_left(k)]
t2 = sd[t]
except (KeyError, IndexError):
pass
t1 = time.time()
print "Time for SortedDict (KVP):", t1 - t0

@mozman
Copy link
Owner

mozman commented Feb 2, 2017

bintrees wont't go away, but no further development is planned, just bugfixes and check for compatibility with newer python releases.

@mozman mozman closed this as completed Feb 2, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants