An OrderedSet is a custom MutableSet that remembers its order, so that every entry has an index that can be looked up.
An OrderedSet is created and used like a set:
>>> from ordered_set import OrderedSet >>> letters = OrderedSet('abracadabra') >>> letters OrderedSet(['a', 'b', 'r', 'c', 'd']) >>> 'r' in letters True
It is efficient to find the index of an entry in an OrderedSet, or find an
entry by its index. To help with this use case, the
.add() method returns
the index of the added item, whether it was already in the set or not.
>>> letters.index('r') 2 >>> letters 'r' >>> letters.add('r') 2 >>> letters.add('x') 5
OrderedSets implement the union (
|), intersection (
&), and difference (
operators like sets do.
>>> letters |= OrderedSet('shazam') >>> letters OrderedSet(['a', 'b', 'r', 'c', 'd', 'x', 's', 'h', 'z', 'm']) >>> letters & set('aeiou') OrderedSet(['a']) >>> letters -= 'abcd' >>> letters OrderedSet(['r', 'x', 's', 'h', 'z', 'm'])
index() methods have been extended to accept any
iterable except a string, to perform NumPy-like "fancy indexing".
>>> letters = OrderedSet('abracadabra') >>> letters[[0, 2, 3]] OrderedSet(['a', 'r', 'c']) >>> letters.index(['a', 'r', 'c']) [0, 2, 3]
This combination of features makes OrderedSet a simple implementation of many
of the things that
pandas.Index is used for. An OrderedSet can be used as a
bi-directional mapping between a sparse vocabulary and dense index numbers.
__setstate__ so it can be pickled,
and implements the abstract base classes
OrderedSet was implemented by Rob Speer. Jon Crall contributed changes and tests to make it fit the Python set API.
The original implementation of OrderedSet was a recipe posted to ActiveState Recipes by Raymond Hettiger, released under the MIT license.
Hettiger's implementation kept its content in a doubly-linked list referenced by a dict. As a result, looking up an item by its index was an O(N) operation, while deletion was O(1).
This version makes different trade-offs for the sake of efficient lookups. Its content is a standard Python list instead of a doubly-linked list. This provides O(1) lookups by index at the expense of O(N) deletion, as well as slightly faster iteration.
If you were to use a Python
dict as an OrderedSet by ignoring its values, its
lookups and deletions would be similar to Hettiger's implementation, with
iteration speed similar to this implementation.
OrderedSet is automatically tested on Python 2.7, 3.3, 3.4, 3.5, 3.6, and 3.7. We've checked more informally that it works on PyPy and PyPy3.