# collections — Container Data Types

The collections module includes container data types beyond the built-in types list, dict, and tuple.

* ChainMap — Search Multiple Dictionaries
* Counter — Count Hashable Objects
* defaultdict — Missing Keys Return a Default Value
* deque — Double-Ended Queue
* namedtuple — Tuple Subclass with Named Fields
* OrderedDict — Remember the Order Keys are Added to a Dictionary
* collections.abc — Abstract Base Classes for Containers


### ChainMap — Search Multiple Dictionaries

The ChainMap class manages a sequence of dictionaries, and searches through them in the order they are given to find values associated with keys. A ChainMap makes a good “context” container, since it can be treated as a stack for which changes happen as the stack grows, with these changes being discarded again as the stack shrinks.

#### Accessing Values
The ChainMap supports the same API as a regular dictionary for accessing existing values.

In [6]:
import collections

a = {'a': 'A', 'c': 'C'}
b = {'b': 'B', 'c': 'D'}

m = collections.ChainMap(a, b)

print('Individual Values')
print('a = {}'.format(m['a']))
print('b = {}'.format(m['b']))
print('c = {}'.format(m['c']))
print()

print('Keys = {}'.format(list(m.keys())))
print('Values = {}'.format(list(m.values())))
print()

print('Items:')
for k, v in m.items():
    print('{} = {}'.format(k, v))
print()

print('"d" in m: {}'.format(('d' in m)))

Individual Values
a = A
b = B
c = C

Keys = ['a', 'c', 'b']
Values = ['A', 'C', 'B']

Items:
a = A
c = C
b = B

"d" in m: False


-----
The child mappings are searched in the order they are passed to the constructor, so the value reported for the key 'c' comes from the a dictionary.

#### Reordering
The ChainMap stores the mappings it searches in a list, exposed through its maps attribute. This list is mutable, so it is possible to add new mappings directly or to change the order of the elements to control lookup and update behavior.
When the list of mappings is reversed, the value associated with 'c' changes.

In [8]:
import collections

a = {'a': 'A', 'c': 'C'}
b = {'b': 'B', 'c': 'D'}

m = collections.ChainMap(a, b)

print(m.maps)
print('c = {}\n'.format(m['c']))

# reverse the list
m.maps = list(reversed(m.maps))

print(m.maps)
print('c = {}'.format(m['c']))

[{'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'}]
c = C

[{'b': 'B', 'c': 'D'}, {'a': 'A', 'c': 'C'}]
c = D
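Because maps is an ordinary list, a mapping can also be appended to it directly. The sketch below assumes a hypothetical extra dictionary named fallback; since it is added at the end, it is searched last and only supplies keys the earlier mappings lack.

```python
import collections

a = {'a': 'A', 'c': 'C'}
b = {'b': 'B', 'c': 'D'}

m = collections.ChainMap(a, b)

# 'fallback' is a hypothetical mapping used only for illustration.
fallback = {'d': 'Z', 'c': 'ignored'}
m.maps.append(fallback)   # appended last, so it is searched last

print(m['d'])   # found only in fallback
print(m['c'])   # still resolved by the first mapping, a
```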


#### Updating Values
A ChainMap does not cache the values in the child mappings. Thus, if their contents are modified, the results are reflected when the ChainMap is accessed.

Changing the values associated with existing keys and adding new elements works the same way.

In [9]:
import collections

a = {'a': 'A', 'c': 'C'}
b = {'b': 'B', 'c': 'D'}

m = collections.ChainMap(a, b)
print('Before: {}'.format(m['c']))
a['c'] = 'E'
print('After : {}'.format(m['c']))

Before: C
After : E


It is also possible to set values through the ChainMap directly, although only the first mapping in the chain is actually modified.

When the new value is stored using m, the a mapping is updated.

In [11]:
import collections

a = {'a': 'A', 'c': 'C'}
b = {'b': 'B', 'c': 'D'}

m = collections.ChainMap(a, b)
print('Before:', m)
m['c'] = 'E'
print('After :', m)
print('a:', a)

Before: ChainMap({'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'})
After : ChainMap({'a': 'A', 'c': 'E'}, {'b': 'B', 'c': 'D'})
a: {'a': 'A', 'c': 'E'}
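The same rule applies to keys that exist only in a later mapping, and to deletions. The following sketch reuses the a and b dictionaries from the example above: writing 'b' through the ChainMap creates a new entry in a, and deleting through the ChainMap can only remove entries from a.

```python
import collections

a = {'a': 'A', 'c': 'C'}
b = {'b': 'B', 'c': 'D'}
m = collections.ChainMap(a, b)

m['b'] = 'X'          # stored in a; b is left untouched
print('a:', a)
print('b:', b)
print('m["b"] =', m['b'])   # the new entry in a shadows the one in b

del m['b']            # removes the entry from a only
print('m["b"] =', m['b'])   # the original value in b is visible again

try:
    del m['b']        # 'b' now exists only in the second mapping
except KeyError as err:
    print('KeyError:', err)
```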


ChainMap provides a convenience method, new_child(), for creating a new instance with one extra mapping at the front of the maps list, which makes it easy to avoid modifying the existing underlying data structures.

This stacking behavior is what makes it convenient to use ChainMap instances as template or application contexts. Specifically, it is easy to add or update values in one iteration, then discard the changes for the next iteration.

In [12]:
import collections

a = {'a': 'A', 'c': 'C'}
b = {'b': 'B', 'c': 'D'}

m1 = collections.ChainMap(a, b)
m2 = m1.new_child()

print('m1 before:', m1)
print('m2 before:', m2)

m2['c'] = 'E'

print('m1 after:', m1)
print('m2 after:', m2)

m1 before: ChainMap({'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'})
m2 before: ChainMap({}, {'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'})
m1 after: ChainMap({'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'})
m2 after: ChainMap({'c': 'E'}, {'a': 'A', 'c': 'C'}, {'b': 'B', 'c': 'D'})
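The "add values for one iteration, then discard them" pattern might look like the sketch below. The base context and the per-iteration overrides are hypothetical values chosen for illustration; the parents property returns a ChainMap without the first mapping, which is one way to get back to the outer context.

```python
import collections

base = collections.ChainMap({'user': 'anonymous', 'lang': 'en'})

# 'overrides' holds hypothetical per-iteration values.
overrides = [{'user': 'alice'}, {'lang': 'fr'}]

seen = []
for extra in overrides:
    ctx = base.new_child(extra)        # push a new layer for this iteration
    seen.append((ctx['user'], ctx['lang']))
    # the layer is simply dropped when ctx is rebound; base is never modified

print(seen)
print(base)
print(ctx.parents == base)             # parents skips the first mapping
```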


For situations where the new context is known or built in advance, it is also possible to pass a mapping to new_child().

In [14]:
import collections

a = {'a': 'A', 'c': 'C'}
b = {'b': 'B', 'c': 'D'}
c = {'c': 'E'}

m1 = collections.ChainMap(a, b)
m2 = m1.new_child(c) # This is the equivalent of m2 = collections.ChainMap(c, *m1.maps) 

print('m1["c"] = {}'.format(m1['c']))
print('m2["c"] = {}'.format(m2['c']))

m1["c"] = C
m2["c"] = E


----

### Counter — Count Hashable Objects
A Counter is a container that keeps track of how many times equivalent values are added. It can be used to implement the same algorithms for which other languages commonly use bag or multiset data structures.

#### Initializing
Counter supports three forms of initialization. Its constructor can be called with a sequence of items, a dictionary containing keys and counts, or using keyword arguments that map string names to counts.

The results of all three forms of initialization are the same.

In [15]:
import collections

print(collections.Counter(['a', 'b', 'c', 'a', 'b', 'b']))
print(collections.Counter({'a': 2, 'b': 3, 'c': 1}))
print(collections.Counter(a=2, b=3, c=1))

Counter({'b': 3, 'a': 2, 'c': 1})
Counter({'b': 3, 'a': 2, 'c': 1})
Counter({'b': 3, 'a': 2, 'c': 1})


An empty Counter can be constructed with no arguments and populated via the update() method.

The count values are increased based on the new data, rather than replaced. In this example, the count for a goes from 3 to 4.

In [16]:
import collections

c = collections.Counter()
print('Initial :', c)

c.update('abcdaab')
print('Sequence:', c)

c.update({'a': 1, 'd': 5})
print('Dict    :', c)

Initial : Counter()
Sequence: Counter({'a': 3, 'b': 2, 'c': 1, 'd': 1})
Dict    : Counter({'d': 6, 'a': 4, 'b': 2, 'c': 1})


#### Accessing Counts
Once a Counter is populated, its values can be retrieved using the dictionary API.

Counter does not raise KeyError for unknown items. If a value has not been seen in the input (as with e in this example), its count is 0.

In [17]:
import collections

c = collections.Counter('abcdaab')

for letter in 'abcde':
    print('{} : {}'.format(letter, c[letter]))

a : 3
b : 2
c : 1
d : 1
e : 0
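A detail worth checking is that looking up an unknown item returns 0 without adding the item to the Counter. A minimal sketch, using the same input string:

```python
import collections

c = collections.Counter('abcdaab')

print(c['e'])            # 0, and no KeyError
print('e' in c)          # False: the lookup did not create an entry
c['e'] += 1              # assignment does create the entry
print(c['e'], 'e' in c)
```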


The elements() method returns an iterator that produces all of the items known to the Counter.

The order of elements is not guaranteed, and items with counts less than or equal to zero are not included.

In [18]:
import collections

c = collections.Counter('extremely')
c['z'] = 0
print(c)
print(list(c.elements()))

Counter({'e': 3, 'x': 1, 't': 1, 'r': 1, 'm': 1, 'l': 1, 'y': 1, 'z': 0})
['e', 'e', 'e', 'x', 't', 'r', 'm', 'l', 'y']


Use most_common() to produce a sequence of the n most frequently encountered input values and their respective counts.

This example counts the letters appearing in all of the words in the system dictionary to produce a frequency distribution, then prints the three most common letters. Leaving out the argument to most_common() produces a list of all the items, in order of frequency.

In [19]:
import collections

c = collections.Counter()
with open('/usr/share/dict/words', 'rt') as f:
    for line in f:
        c.update(line.rstrip().lower())

print('Most common:')
for letter, count in c.most_common(3):
    print('{}: {:>7}'.format(letter, count))

Most common:
e:  235331
i:  201032
a:  199554
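The word list path above is system-specific, so here is a smaller sketch on an in-memory string that shows both call forms: most_common(n) and most_common() with no argument.

```python
import collections

c = collections.Counter('abcdaab')

print(c.most_common(2))   # the two most frequent letters with their counts
print(c.most_common())    # every letter, from most to least frequent
```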


#### Arithmetic
Counter instances support arithmetic and set operations for aggregating results. This example shows the standard operators for creating new Counter instances, but the in-place operators +=, -=, &=, and |= are also supported.

Each time a new Counter is produced through an operation, any items with zero or negative counts are discarded. The count for a is the same in c1 and c2, so subtraction leaves it at zero.

In [20]:
import collections

c1 = collections.Counter(['a', 'b', 'c', 'a', 'b', 'b'])
c2 = collections.Counter('alphabet')

print('C1:', c1)
print('C2:', c2)

print('\nCombined counts:')
print(c1 + c2)

print('\nSubtraction:')
print(c1 - c2)

print('\nIntersection (taking positive minimums):')
print(c1 & c2)

print('\nUnion (taking maximums):')
print(c1 | c2)

C1: Counter({'b': 3, 'a': 2, 'c': 1})
C2: Counter({'a': 2, 'l': 1, 'p': 1, 'h': 1, 'b': 1, 'e': 1, 't': 1})

Combined counts:
Counter({'a': 4, 'b': 4, 'c': 1, 'l': 1, 'p': 1, 'h': 1, 'e': 1, 't': 1})

Subtraction:
Counter({'b': 2, 'c': 1})

Intersection (taking positive minimums):
Counter({'a': 2, 'b': 1})

Union (taking maximums):
Counter({'b': 3, 'a': 2, 'c': 1, 'l': 1, 'p': 1, 'h': 1, 'e': 1, 't': 1})
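The in-place operators follow the same rule about discarding non-positive counts. For comparison, the sketch below also uses the subtract() method, which updates the Counter in place but keeps zero and negative counts. The counts are made up for illustration.

```python
import collections

c1 = collections.Counter(a=2, b=3, c=1)
c2 = collections.Counter(a=2, b=1, d=1)

in_place = c1.copy()
in_place -= c2            # like c1 - c2: zero and negative counts are dropped
print(in_place)

kept = c1.copy()
kept.subtract(c2)         # subtract() keeps zero and negative counts
print(kept)
```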


---

### defaultdict — Missing Keys Return a Default Value
The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default up front when the container is initialized.

In [1]:
import collections


def default_factory():
    return 'default value'


d = collections.defaultdict(default_factory, foo='bar')
print('d:', d)
print('foo =>', d['foo'])
print('bar =>', d['bar'])

d: defaultdict(<function default_factory at 0x10db92b70>, {'foo': 'bar'})
foo => bar
bar => default value
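One consequence of this design is that the factory runs only for indexing with square brackets, and the value it returns is stored in the dictionary. The get() method and membership tests leave the container unchanged, as this short sketch shows:

```python
import collections

d = collections.defaultdict(list)

print(d.get('missing'))   # None: get() does not call the factory
print('missing' in d)     # False: neither does a membership test
print(d['missing'])       # []: indexing calls the factory and stores the result
print('missing' in d)     # True
```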


This method works well as long as it is appropriate for all keys to have the same default. It can be especially useful if the default is a type used for aggregating or accumulating values, such as a list, set, or even int.

Using list as the default_factory, it is easy to group a sequence of key-value pairs into a dictionary of lists:

In [4]:
import collections

s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
d = collections.defaultdict(list)

for k, v in s:
    d[k].append(v)

sorted(d.items())


[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the default_factory function which returns an empty list. The list.append() operation then attaches the value to the new list. When keys are encountered again, the look-up proceeds normally (returning the list for that key) and the list.append() operation adds another value to the list. This technique is simpler and faster than an equivalent technique using dict.setdefault():

In [5]:
d = {}
for k, v in s:
    d.setdefault(k, []).append(v)

sorted(d.items())

[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]

Setting the default_factory to int makes the defaultdict useful for counting (like a bag or multiset in other languages):

In [7]:
import collections

s = 'mississippi'
d = collections.defaultdict(int)

for k in s:
    d[k] += 1

sorted(d.items())

[('i', 4), ('m', 1), ('p', 2), ('s', 4)]

When a letter is first encountered, it is missing from the mapping, so the default_factory function calls int() to supply a default count of zero. The increment operation then builds up the count for each letter.

The function int() which always returns zero is just a special case of constant functions. A faster and more flexible way to create constant functions is to use a lambda function which can supply any constant value (not just zero):

In [17]:
import collections

def constant_factory(value):
    return lambda: value

d = collections.defaultdict(constant_factory('<missing>'))
d.update(name='John', action='ran')

f"{d['name']} {d['action']} to {d['object']}"

'John ran to <missing>'

Setting the default_factory to set makes the defaultdict useful for building a dictionary of sets:

In [18]:
import collections

s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)]
d = collections.defaultdict(set)

for k, v in s:
    d[k].add(v)

sorted(d.items())

[('blue', {2, 4}), ('red', {1, 3})]

---

### deque — Double-Ended Queue
A double-ended queue, or deque, supports adding and removing elements from either end of the queue. The more commonly used stacks and queues are degenerate forms of deques, where the inputs and outputs are restricted to a single end.

Since deques are a type of sequence container, they support some of the same operations as list, such as examining the contents with __getitem__(), determining length, and removing elements from the middle of the queue by value with remove().

In [19]:
import collections

d = collections.deque('abcdefg')
print('Deque:', d)
print('Length:', len(d))
print('Left end:', d[0])
print('Right end:', d[-1])

d.remove('c')
print('remove(c):', d)

Deque: deque(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
Length: 7
Left end: a
Right end: g
remove(c): deque(['a', 'b', 'd', 'e', 'f', 'g'])


#### Populating
A deque can be populated from either end, termed “left” and “right” in the Python implementation.

The extendleft() function iterates over its input and performs the equivalent of an appendleft() for each item. The end result is that the deque contains the input sequence in reverse order.

In [20]:
import collections

# Add to the right
d1 = collections.deque()
d1.extend('abcdefg')
print('extend    :', d1)
d1.append('h')
print('append    :', d1)

# Add to the left
d2 = collections.deque()
d2.extendleft(range(6))
print('extendleft:', d2)
d2.appendleft(6)
print('appendleft:', d2)

extend    : deque(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
append    : deque(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
extendleft: deque([5, 4, 3, 2, 1, 0])
appendleft: deque([6, 5, 4, 3, 2, 1, 0])


#### Consuming
Similarly, the elements of the deque can be consumed from both ends or either end, depending on the algorithm being applied.

Use pop() to remove an item from the “right” end of the deque and popleft() to take an item from the “left” end.

In [21]:
import collections

print('From the right:')
d = collections.deque('abcdefg')
while True:
    try:
        print(d.pop(), end='')
    except IndexError:
        break
print()

print('\nFrom the left:')
d = collections.deque(range(6))
while True:
    try:
        print(d.popleft(), end='')
    except IndexError:
        break
print()

From the right:
gfedcba

From the left:
012345



Since deques are thread-safe, the contents can even be consumed from both ends at the same time from separate threads.

In [24]:
import collections
import threading
import time

candle = collections.deque(range(5))


def burn(direction, next_source):
    # Keep taking items from one end until the deque is empty.
    while True:
        try:
            item = next_source()
        except IndexError:
            break
        else:
            print('{:>8}: {}'.format(direction, item))
            time.sleep(0.1)
    print('{:>8} done'.format(direction))


left = threading.Thread(target=burn,
                        args=('Left', candle.popleft))
right = threading.Thread(target=burn,
                         args=('Right', candle.pop))

left.start()
right.start()

left.join()
right.join()

    Left: 0
   Right: 4
    Left: 1
   Right: 3
    Left: 2
   Right done
    Left done


#### Rotating
Another useful aspect of the deque is the ability to rotate it in either direction, so as to skip over some items.

Rotating the deque to the right (using a positive rotation) takes items from the right end and moves them to the left end. Rotating to the left (with a negative value) takes items from the left end and moves them to the right end. It may help to visualize the items in the deque as being engraved along the edge of a dial.

In [25]:
import collections

d = collections.deque(range(10))
print('Normal        :', d)

d = collections.deque(range(10))
d.rotate(2)
print('Right rotation:', d)

d = collections.deque(range(10))
d.rotate(-2)
print('Left rotation :', d)

Normal        : deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Right rotation: deque([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
Left rotation : deque([2, 3, 4, 5, 6, 7, 8, 9, 0, 1])
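One way to read the rotation rules is that rotating right by one step is equivalent to popping from the right end and pushing the item onto the left end, and rotating left by one step does the reverse. A small sketch that checks both equivalences:

```python
import collections

rotated = collections.deque(range(5))
rotated.rotate(1)

manual = collections.deque(range(5))
manual.appendleft(manual.pop())       # same effect as rotate(1)

print(rotated, manual)

rotated.rotate(-1)                    # undo the right rotation
manual.append(manual.popleft())       # same effect as rotate(-1)

print(rotated, manual)
```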


#### Constraining the Queue Size
A deque instance can be configured with a maximum length so that it never grows beyond that size. When the queue reaches the specified length, existing items are discarded as new items are added. This behavior is useful for finding the last n items in a stream of undetermined length.

The deque length is maintained regardless of which end the items are added to.

In [26]:
import collections
import random

# Set the random seed so we see the same output each time
# the script is run.
random.seed(1)

d1 = collections.deque(maxlen=3)
d2 = collections.deque(maxlen=3)

for i in range(5):
    n = random.randint(0, 100)
    print('n =', n)
    d1.append(n)
    d2.appendleft(n)
    print('D1:', d1)
    print('D2:', d2)

n = 17
D1: deque([17], maxlen=3)
D2: deque([17], maxlen=3)
n = 72
D1: deque([17, 72], maxlen=3)
D2: deque([72, 17], maxlen=3)
n = 97
D1: deque([17, 72, 97], maxlen=3)
D2: deque([97, 72, 17], maxlen=3)
n = 8
D1: deque([72, 97, 8], maxlen=3)
D2: deque([8, 97, 72], maxlen=3)
n = 32
D1: deque([97, 8, 32], maxlen=3)
D2: deque([32, 8, 97], maxlen=3)
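The "last n items" use case can be sketched by passing the stream straight to the deque constructor. The tail() helper and the sample stream below are hypothetical names used for illustration only.

```python
import collections


def tail(iterable, n=3):
    """Return the last n items of iterable as a list (hypothetical helper)."""
    # The bounded deque discards older items as newer ones arrive.
    return list(collections.deque(iterable, maxlen=n))


stream = (i * i for i in range(10))   # a generator, so its length is not known up front
print(tail(stream))
```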


---

### namedtuple — Tuple Subclass with Named Fields
The standard tuple uses numerical indexes to access its members.

This makes tuples convenient containers for simple uses.

bob = ('Bob', 30, 'male')
print('Representation:', bob)

jane = ('Jane', 29, 'female')
print('\nField by index:', jane[0])

print('\nFields by index:')
for p in [bob, jane]:
    print('{} is a {} year old {}'.format(*p))


In contrast, remembering which index should be used for each value can lead to errors, especially if the tuple has a lot of fields and is constructed far from where it is used. A namedtuple assigns names, as well as the numerical index, to each member.

#### Defining
namedtuple instances are just as memory efficient as regular tuples because they do not have per-instance dictionaries. Each kind of namedtuple is represented by its own class, which is created by using the namedtuple() factory function. The arguments are the name of the new class and a string containing the names of the elements.

As the example illustrates, it is possible to access the fields of the namedtuple by name using dotted notation (obj.attr) as well as by using the positional indexes of standard tuples.

In [2]:
import collections

Person = collections.namedtuple('Person', 'name age')

bob = Person(name='Bob', age=30)
print('\nRepresentation:', bob)

jane = Person(name='Jane', age=29)
print('\nField by name:', jane.name)

print('\nFields by index:')
for p in [bob, jane]:
    print('{} is {} years old'.format(*p))


Representation: Person(name='Bob', age=30)

Field by name: Jane

Fields by index:
Bob is 30 years old
Jane is 29 years old


Just like a regular tuple, a namedtuple is immutable. This restriction allows tuple instances to have a consistent hash value, which makes it possible to use them as keys in dictionaries and to be included in sets.

Trying to change a value through its named attribute results in an AttributeError.

In [3]:
import collections

Person = collections.namedtuple('Person', 'name age')

pat = Person(name='Pat', age=12)
print('\nRepresentation:', pat)

pat.age = 21


Representation: Person(name='Pat', age=12)


AttributeError: can't set attribute
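The hashing property mentioned above can be sketched as follows. The Point type and the labels dictionary are hypothetical; because a namedtuple compares and hashes like a plain tuple with the same values, either form finds the same entry.

```python
import collections

Point = collections.namedtuple('Point', 'x y')

# 'labels' is a hypothetical lookup table keyed by Point instances.
labels = {Point(0, 0): 'origin', Point(1, 0): 'unit x'}

print(labels[Point(0, 0)])   # an equal instance finds the same entry
print(labels[(1, 0)])        # so does a plain tuple with the same values
print(len({Point(2, 3), Point(2, 3), (2, 3)}))   # all three are one set member
```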

#### Invalid Field Names
Field names are invalid if they are repeated or conflict with Python keywords.

As the field names are parsed, invalid values cause ValueError exceptions.

In [4]:
import collections

try:
    collections.namedtuple('Person', 'name class age')
except ValueError as err:
    print(err)

try:
    collections.namedtuple('Person', 'name age age')
except ValueError as err:
    print(err)

Type names and field names cannot be a keyword: 'class'
Encountered duplicate field name: 'age'



In situations where a namedtuple is created based on values outside the control of the program (such as to represent the rows returned by a database query, where the schema is not known in advance), the rename option should be set to True so the invalid fields are renamed.

The new names for renamed fields depend on their index in the tuple, so the field with name class becomes _1 and the duplicate age field is changed to _2.

In [5]:
import collections

with_class = collections.namedtuple(
    'Person', 'name class age',
    rename=True)
print(with_class._fields)

two_ages = collections.namedtuple(
    'Person', 'name age age',
    rename=True)
print(two_ages._fields)

('name', '_1', 'age')
('name', 'age', '_2')


#### Special Attributes
namedtuple provides several useful attributes and methods for working with subclasses and instances. All of these built-in properties have names prefixed with an underscore (_), which by convention in most Python programs indicates a private attribute. For namedtuple, however, the prefix is intended to protect the name from collision with user-provided attribute names.

The names of the fields passed to namedtuple to define the new class are saved in the _fields attribute.

Although the argument is a single space-separated string, the stored value is the sequence of individual names.

In [6]:
import collections

Person = collections.namedtuple('Person', 'name age')

bob = Person(name='Bob', age=30)
print('Representation:', bob)
print('Fields:', bob._fields)

Representation: Person(name='Bob', age=30)
Fields: ('name', 'age')



namedtuple instances can be converted to OrderedDict instances using _asdict(). Starting with Python 3.8, _asdict() returns a regular dict instead.

The keys of the OrderedDict are in the same order as the fields for the namedtuple.

In [7]:
import collections

Person = collections.namedtuple('Person', 'name age')

bob = Person(name='Bob', age=30)
print('Representation:', bob)
print('As Dictionary:', bob._asdict())

Representation: Person(name='Bob', age=30)
As Dictionary: OrderedDict([('name', 'Bob'), ('age', 30)])


The _replace() method builds a new instance, replacing the values of some fields in the process.

Although the name implies it is modifying the existing object, because namedtuple instances are immutable the method actually returns a new object.

In [8]:
import collections

Person = collections.namedtuple('Person', 'name age')

bob = Person(name='Bob', age=30)
print('\nBefore:', bob)
bob2 = bob._replace(name='Robert')
print('After:', bob2)
print('Same?:', bob is bob2)


Before: Person(name='Bob', age=30)
After: Person(name='Robert', age=30)
Same?: False


#### Some more examples

In [12]:
import collections

# Basic example
Point = collections.namedtuple('Point', ['x', 'y'])
p = Point(11, y=22)     # instantiate with positional or keyword arguments
print('indexable like the plain tuple: ', p[0] + p[1])             # indexable like the plain tuple (11, 22)

x, y = p                # unpack like a regular tuple
print('unpack like a regular tuple:', x, y)

p.x + p.y               # fields also accessible by name

p                       # readable __repr__ with a name=value style

indexable like the plain tuple:  33
unpack like a regular tuple: 11 22


Point(x=11, y=22)

Named tuples are especially useful for assigning field names to result tuples returned by the csv or sqlite3 modules.
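As a rough sketch of the csv case, each row produced by csv.reader can be turned into a named tuple with _make(). The EmployeeRecord type and the in-memory data are invented for illustration; a real program would read from a file instead of io.StringIO.

```python
import collections
import csv
import io

# Hypothetical record type and sample data, used only for illustration.
EmployeeRecord = collections.namedtuple('EmployeeRecord', 'name age title')
data = io.StringIO('Alice,34,Engineer\nBob,41,Manager\n')

records = [EmployeeRecord._make(row) for row in csv.reader(data)]

for rec in records:
    # csv yields strings, so age is still text at this point
    print(rec.name, rec.age, rec.title)
```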

#### Methods and attributes

##### classmethod somenamedtuple._make(iterable)

Class method that makes a new instance from an existing sequence or iterable.

In [13]:
t = [11, 22]
Point._make(t)

Point(x=11, y=22)

##### somenamedtuple._asdict()

Return a new OrderedDict which maps field names to their corresponding values:

In [14]:
p = Point(x=11, y=22)
p._asdict()

OrderedDict([('x', 11), ('y', 22)])


##### somenamedtuple._replace(**kwargs)

Return a new instance of the named tuple replacing specified fields with new values:

In [15]:
p = Point(x=11, y=22)
p._replace(x=33)

Point(x=33, y=22)

##### somenamedtuple._fields

Tuple of strings listing the field names. Useful for introspection and for creating new named tuple types from existing named tuples.

In [18]:
import collections

print('fields: ', p._fields)            # view the field names

Color = collections.namedtuple('Color', 'red green blue')
Pixel = collections.namedtuple('Pixel', Point._fields + Color._fields)
Pixel(11, 22, 128, 255, 0)

fields:  ('x', 'y')


Pixel(x=11, y=22, red=128, green=255, blue=0)

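##### somenamedtuple._field_defaults

Dictionary mapping field names to default values. The documented name is _field_defaults; the _fields_defaults spelling comes from early Python 3.7 releases.

Note: the defaults argument requires Python 3.7 or later. The output below was captured on an older interpreter, which rejects the argument.

In [19]:
import collections

Account = collections.namedtuple('Account', ['type', 'balance'], defaults=[0])
print('Defaults: ', Account._field_defaults)

Account('premium')

TypeError: namedtuple() got an unexpected keyword argument 'defaults'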
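On an interpreter that does support the defaults argument (assumed here to be a recent Python 3 release, where the attribute is spelled _field_defaults), the same definition behaves as sketched below. Defaults are applied to the rightmost fields, so a single default covers balance.

```python
import collections

Account = collections.namedtuple('Account', ['type', 'balance'], defaults=[0])

print('Defaults:', Account._field_defaults)
print(Account('premium'))        # balance falls back to its default
print(Account('basic', 25))      # an explicit value overrides the default
```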


To convert a dictionary to a named tuple, use the double-star-operator (as described in Unpacking Argument Lists):


In [20]:
d = {'x': 11, 'y': 22}
Point(**d)

Point(x=11, y=22)


Since a named tuple is a regular Python class, it is easy to add or change functionality with a subclass. Here is how to add a calculated field and a fixed-width print format:

In [21]:
import collections

class Point(collections.namedtuple('Point', ['x', 'y'])):
    __slots__ = ()
   
    @property
    def hypot(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5

    def __str__(self):
        return 'Point: x=%6.3f  y=%6.3f  hypot=%6.3f' % (self.x, self.y, self.hypot)

for p in Point(3, 4), Point(14, 5/7):
    print(p)

Point: x= 3.000  y= 4.000  hypot= 5.000
Point: x=14.000  y= 0.714  hypot=14.018



The subclass shown above sets __slots__ to an empty tuple. This helps keep memory requirements low by preventing the creation of instance dictionaries.

Subclassing is not useful for adding new, stored fields. Instead, simply create a new named tuple type from the _fields attribute:

In [24]:
import collections

Point3D = collections.namedtuple('Point3D', Point._fields + ('z',))


Default values can be implemented by using _replace() to customize a prototype instance:

In [26]:
import collections

Account = collections.namedtuple('Account', 'owner balance transaction_count')
default_account = Account('<owner name>', 0.0, 0)
johns_account = default_account._replace(owner='John')
janes_account = default_account._replace(owner='Jane')

---

### OrderedDict — Remember the Order Keys are Added to a Dictionary
An OrderedDict is a dictionary subclass that remembers the order in which its contents are added.

Before Python 3.6 a regular dict did not track the insertion order, and iterating over it produced the values in order based on how the keys are stored in the hash table, which is in turn influenced by a random value to reduce collisions. In an OrderedDict, by contrast, the order in which the items are inserted is remembered and used when creating an iterator.

In CPython 3.6, the built-in dict began tracking insertion order as a side effect of an implementation change. Starting with Python 3.7, that ordering is guaranteed by the language, so it can be relied on.

In [36]:
import collections

print('Regular dictionary:')
d = {}
d['a'] = 'A'
d['b'] = 'B'
d['c'] = 'C'

for k, v in d.items():
    print(k, v)

print('\nOrderedDict:')
d = collections.OrderedDict()
d['a'] = 'A'
d['b'] = 'B'
d['c'] = 'C'

for k, v in d.items():
    print(k, v)

Regular dictionary:
a A
b B
c C

OrderedDict:
a A
b B
c C



#### Equality
A regular dict looks at its contents when testing for equality. An OrderedDict also considers the order in which the items were added.

In this case, since the two ordered dictionaries are created from values in a different order, they are considered to be different.

In [38]:
import collections

print('dict       :', end=' ')
d1 = {}
d1['a'] = 'A'
d1['b'] = 'B'
d1['c'] = 'C'

d2 = {}
d2['c'] = 'C'
d2['b'] = 'B'
d2['a'] = 'A'

print(d1 == d2)

print('OrderedDict:', end=' ')

d1 = collections.OrderedDict()
d1['a'] = 'A'
d1['b'] = 'B'
d1['c'] = 'C'

d2 = collections.OrderedDict()
d2['c'] = 'C'
d2['b'] = 'B'
d2['a'] = 'A'

print(d1 == d2)

dict       : True
OrderedDict: False
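The order-sensitive comparison only applies when both operands are OrderedDict instances. Comparing an OrderedDict with a regular dict ignores order, as this short sketch shows:

```python
import collections

od = collections.OrderedDict([('a', 'A'), ('b', 'B')])
reversed_od = collections.OrderedDict([('b', 'B'), ('a', 'A')])
plain = {'b': 'B', 'a': 'A'}

print(od == reversed_od)      # order matters between two OrderedDicts
print(od == plain)            # order is ignored against a regular dict
print(reversed_od == plain)
```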


#### Reordering
It is possible to change the order of the keys in an OrderedDict by moving them to either the beginning or the end of the sequence using move_to_end().

The last argument tells move_to_end() whether to move the item to be the last item in the key sequence (when True) or the first (when False).

In [39]:
import collections

d = collections.OrderedDict(
    [('a', 'A'), ('b', 'B'), ('c', 'C')]
)

print('Before:')
for k, v in d.items():
    print(k, v)

d.move_to_end('b')

print('\nmove_to_end():')
for k, v in d.items():
    print(k, v)

d.move_to_end('b', last=False)

print('\nmove_to_end(last=False):')
for k, v in d.items():
    print(k, v)

Before:
a A
b B
c C

move_to_end():
a A
c C
b B

move_to_end(last=False):
b B
a A
c C
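A common use for move_to_end() is tracking how recently each key was used. The LRUCache class below is a hypothetical, minimal sketch and not part of the collections module: it moves a key to the end on every access and evicts from the front with popitem(last=False).

```python
import collections


class LRUCache:
    """Hypothetical minimal cache; not part of the collections module."""

    def __init__(self, maxsize=2):
        self.maxsize = maxsize
        self._data = collections.OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)          # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.maxsize:
            self._data.popitem(last=False)   # evict the least recently used


cache = LRUCache(maxsize=2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')        # 'a' becomes the most recent entry
cache.put('c', 3)     # the cache is full, so 'b' is evicted
print(list(cache._data))
```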
