Hopscotch Queues: Skipping and Jumping Around Tagged, Ordered Data
hopscotch::Queue<K, V> is a first-in-first-out queue where each
in the queue has
- a value of type
- an immutable tag of type
- a unique
u64which is assigned in sequential order of insertion starting at 0 when the queue is created.
In addition to supporting the ordinary
get methods of a
FIFO queue, a hopscotch queue also supports the special, optimized methods
These methods find the next (or previous)
ItemMut) in the
queue whose tag is equal to any of a given set of tags. For instance, the
pub fn after<'a, Tags>(&self, index: u64, tags: Tags) -> Option<Item<K, V>> where Tags: IntoIterator<Item = &'a K>, K: 'a,
If we use
Tags, this signature simplifies to:
pub fn after(&self, index: u64, tags: &[K]) -> Option<Item<K, V>>
These methods are the real benefit of using a hopscotch queue over another data structure. Their asymptotic time complexity is:
- linear relative to the number of tags queried,
- logarithmic relative to the total number of distinct tags in the queue,
- constant relative to the length of the queue, and
- constant relative to the distance between successive items with the same tag.
Hopscotch queues also provide flexible iterators
Iter(Mut) to efficiently
traverse portions of their contents, sliced by index using
Iter(Mut)::after, and filtered by set of desired tags using
When might I use this?
This queue performs well when:
- the set of total tags in the queue is small
- the number of query operations is greater than the number of insertion/deletion operations
- you can afford a little bit of extra memory for bookkeeping
One use-case for such a structure is implementing a bounded pub/sub event buffer where clients interested in a particular subset of events repeatedly query for the next event matching their desired subset. This scenario plays to the strengths of a hopscotch queue, because each query can be performed very quickly, regardless of the contents or size of the buffer.
pop operations are slower by a considerable constant factor
than their equivalents for
VecDeque<(K, T)>, but for this price, you gain the
ability to skip and hop between sets of tags in almost-constant time.
In scenarios with a few hundred tags,
pop operations on my machine
(recent-ish MacBook Pro) take on the order of low single-digits of microseconds
(hundreds of thousands of operations per second), and
take on the order of tens of nanoseconds (tens of millions of operations per
second), depending on the number of tags present in the queue. More detailed and
scientific benchmarks are to come, and please feel free to contribute!
If this is similar to your needs, this data structure might be for you!
When should I not use this?
- If your access pattern to the data in the queue does not frequently make
queries looking for the next/previous/first/last with a given tag, this queue
will underperform a simpler
- If you do need such query methods, but the distribution of tags is such that
tags are unlikely to be frequently repeated in the queue (number of tags is
close to the max size of the queue), this queue will underperform a data
structure built from a pair of a
VecDeque<T>of values and a mapping from tags to sorted vectors of of indices.
- If most tags will be close to others of their same ilk, and you can tolerate
an occasional long-lasting linear scan of the queue when this is not the case,
VecDeque<(K, T)>may also suffice.
- If you need to change the tag of an item which is already in the queue, this data structure will not permit that operation. Allowing this possibility would degrade the performance of all other operations on the queue by a constant factor, but it is possible that future releases will include this as an optional flag.
- If you need to remove an item from the "in" end of the queue, add an item to the "out" end of the queue, or insert an item in the middle of the queue, these operations would be linear-time or worse for a hopscotch queue, and while we may eventually add them as methods, they would not be efficient.