# collections

* [Counter](#Counter)
* [deque](#deque)
* [defaultdict](#defaultdict)
* [namedtuple](#namedtuple)
* [OrderedDict](#OrderedDict)

In [2]:
import collections

## Counter

A counter: a dict subclass for counting hashable objects.

In [26]:
cnt = collections.Counter()
for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
    cnt[word] += 1 # no need to check whether the key exists; a missing key returns 0

In [27]:
print cnt

Counter({'blue': 3, 'red': 2, 'green': 1})


In [28]:
d = {}
for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
    d[word] += 1 # a plain dict raises KeyError for a missing key

KeyError: 'red'
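
A plain dict can still do the counting if the missing key is handled explicitly. The sketch below reuses the word list from the cells above and relies on `dict.get` with a default of 0; the names `words` and `counts` are introduced here for illustration only.

```python
words = ['red', 'blue', 'red', 'green', 'blue', 'blue']

counts = {}
for word in words:
    # dict.get returns 0 when the key is missing, so no KeyError is raised
    counts[word] = counts.get(word, 0) + 1

print(counts == {'red': 2, 'blue': 3, 'green': 1})  # True
```

Counter removes the need for this bookkeeping, which is the point of the comparison above.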

To find the two most common words, call most_common(2). Counter() accepts an iterable or a mapping, such as a list or a dict.

class collections.Counter([iterable-or-mapping])

In [29]:
import re

In [30]:
collections.Counter(['red', 'blue', 'red', 'green', 'blue', 'blue']).most_common(2)

[('blue', 3), ('red', 2)]

In [31]:
collections.Counter({'red':3, 'blue':2, 'black':1}).most_common(2)

[('red', 3), ('blue', 2)]
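
The same call works on words extracted from text. This is a minimal sketch, assuming a made-up sample sentence and a simple `\w+` pattern for splitting words; real text may need more careful tokenizing.

```python
import collections
import re

# A made-up sample sentence, used only for illustration.
text = 'the quick brown fox jumps over the lazy dog and the other fox'

# \w+ matches runs of word characters, so findall splits the text into words.
words = re.findall(r'\w+', text.lower())

top_two = collections.Counter(words).most_common(2)
print(top_two)  # [('the', 3), ('fox', 2)]
```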

In [32]:
c = collections.Counter() # a new, empty counter
c = collections.Counter('gallahad') # a new counter from an iterable

In [33]:
print c

Counter({'a': 3, 'l': 2, 'h': 1, 'g': 1, 'd': 1})


In [34]:
c = collections.Counter(cats=4, dogs=8)

In [35]:
c

Counter({'cats': 4, 'dogs': 8})

Unlike a dict, a Counter returns 0 for a missing key instead of raising KeyError.

In [36]:
c = collections.Counter(['egg', 'ham'])
c['bacon']

0

In [37]:
c['sausage'] = 0 # setting the value to 0 stores 0; it does not delete the key, even though a missing key also reads as 0

In [38]:
del c['sausage']

In [39]:
c

Counter({'egg': 1, 'ham': 1})

In [40]:
c['bacon']

0

In [41]:
c

Counter({'egg': 1, 'ham': 1})

In [42]:
c['sausage'] = 0

In [43]:
c

Counter({'egg': 1, 'ham': 1, 'sausage': 0})

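Counter methods:
* elements(): returns an iterator that repeats each element as many times as its count; elements with a count below 1 are skipped
* most_common(n): returns a list of the n most common elements with their counts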

In [44]:
c = collections.Counter(a=4, b=2, c=0, d=-2)
list(c.elements())

['a', 'a', 'a', 'a', 'b', 'b']

In [45]:
collections.Counter('abracadabra').most_common(3)

[('a', 5), ('r', 2), ('b', 2)]

In [46]:
c

Counter({'a': 4, 'b': 2, 'c': 0, 'd': -2})

In [47]:
list(c)

['a', 'c', 'b', 'd']

In [48]:
c.clear()
c

Counter()

## deque

A deque works as both a stack and a queue.

You can append and pop at both the left and the right end.

In [51]:
from collections import deque

In [52]:
d = deque('ghi')
for elem in d:
    print elem.upper()

G
H
I


In [53]:
d.append('j')

In [54]:
d

deque(['g', 'h', 'i', 'j'])

In [55]:
d.appendleft('f')

In [56]:
d

deque(['f', 'g', 'h', 'i', 'j'])

In [57]:
d.pop()

'j'

In [58]:
d

deque(['f', 'g', 'h', 'i'])

In [60]:
d.popleft()

'f'

In [61]:
d

deque(['g', 'h', 'i'])

In [62]:
list(d)

['g', 'h', 'i']

In [63]:
d[0]

'g'
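
The stack and queue roles mentioned at the top of this section can be sketched as follows. The variable names are illustrative; only `append`, `pop`, and `popleft` from the cells above are used.

```python
from collections import deque

# Stack (last in, first out): push and pop at the same end.
stack = deque()
stack.append('a')
stack.append('b')
stack.append('c')
last_in = stack.pop()        # the most recently added item

# Queue (first in, first out): add on the right, remove from the left.
queue = deque()
queue.append('a')
queue.append('b')
queue.append('c')
first_in = queue.popleft()   # the oldest item

print(last_in, list(stack))   # c ['a', 'b']
print(first_in, list(queue))  # a ['b', 'c']
```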

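## defaultdict

defaultdict is a subclass of dict. It differs from dict in that it is constructed with a factory function (the default_factory argument). That factory determines the type of the values created for missing keys, and it also supplies their default value.

What is a factory function? Although these look like functions, they are actually classes. Calling one creates an instance of that type, much as a factory produces goods. Examples are list(), dict(), and object().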

dict subclass that calls a factory function to supply missing values

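Here defaultdict(default_factory) builds a dictionary-like object. You assign the keys yourself, but the value for a missing key is an instance produced by default_factory, so every key has a default. For example, defaultdict(int) creates a dictionary-like object whose missing values are int instances: even for a key that does not exist, d[key] returns a default, namely int(), which is 0.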
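
A minimal sketch of that behaviour, written for Python 3. The key names and the `'n/a'` default are made up for illustration. It also shows two details worth knowing: looking up a missing key stores the default, and the factory can be any zero-argument callable.

```python
from collections import defaultdict

d = defaultdict(int)

# Reading a missing key calls int(), stores the result, and returns it.
value = d['missing']
print(value)           # 0
print('missing' in d)  # True: the lookup inserted the key

# The factory only supplies defaults; other value types can still be assigned.
d['name'] = 'text'

# Any zero-argument callable can serve as the factory, not only a type.
e = defaultdict(lambda: 'n/a')  # hypothetical default chosen for illustration
print(e['anything'])   # n/a

# With no factory, a missing key raises KeyError, as in a plain dict.
f = defaultdict()
try:
    f['x']
    raised = False
except KeyError:
    raised = True
print(raised)          # True
```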



In [64]:
from collections import defaultdict

In [67]:
s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
d = defaultdict(list) # list is the factory for missing values
for k, v in s:
    d[k].append(v)

In [68]:
d

defaultdict(list, {'blue': [2, 4], 'red': [1], 'yellow': [1, 3]})

In [69]:
s = 'mississippi'
d = defaultdict(int)
for k in s:
    d[k] += 1
d.items()

[('i', 4), ('p', 2), ('s', 4), ('m', 1)]

When a letter is first encountered, it is missing from the mapping, so the default_factory function calls int() to supply a default count of zero.

## namedtuple

Named tuples assign meaning to each position in a tuple and allow for more readable, self-documenting code. They can be used wherever regular tuples are used, and they add the ability to access fields by name instead of position index.

In effect, each position in the tuple gets a field name, so a tuple behaves more like a database row: values can be accessed by field name instead of only by index.

collections.namedtuple(typename, field_names[, verbose=False][, rename=False])


Returns a new tuple subclass named **typename**

In [107]:
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'], verbose=True) # typename = Point


class Point(tuple):
    'Point(x, y)'

    __slots__ = ()

    _fields = ('x', 'y')

    def __new__(_cls, x, y):
        'Create new instance of Point(x, y)'
        return _tuple.__new__(_cls, (x, y))

    @classmethod
    def _make(cls, iterable, new=tuple.__new__, len=len):
        'Make a new Point object from a sequence or iterable'
        result = new(cls, iterable)
        if len(result) != 2:
            raise TypeError('Expected 2 arguments, got %d' % len(result))
        return result

    def __repr__(self):
        'Return a nicely formatted representation string'
        return 'Point(x=%r, y=%r)' % self

    def _asdict(self):
        'Return a new OrderedDict which maps field names to their values'
        return OrderedDict(zip(self._fields, self))

    def _replace(_self, **kwds):
        'Return a new Point object replacing specified fields with new values'
        result = _self._make(map(kwds.pop, ('x', 'y'), _self))
        if kwds:
            raise ValueError('

In [108]:
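p = Point(11, y=22) # instantiate with positional or keyword arguments
p[0] + p[1] # indexable like the plain tuple (11, 22)
x, y = p # unpack like a regular tuple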

In [109]:
print (x, y)

(11, 22)


In [110]:
p.x + p.y

33

In [111]:
p[0] + p[1]

33

In [112]:
p

Point(x=11, y=22)
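
The generated class listed above also defines `_make`, `_asdict`, and `_replace`. The following is a short sketch of how they behave, written for Python 3 (so without the `verbose` argument); the names `q`, `r`, and `as_dict` are illustrative.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
p = Point(11, y=22)

# _make builds a Point from any iterable with the right number of items.
q = Point._make([3, 4])

# _asdict maps field names to values; dict() gives a plain dict on any version.
as_dict = dict(p._asdict())

# _replace returns a new Point; tuples are immutable, so p itself is unchanged.
r = p._replace(x=100)

print(q)                              # Point(x=3, y=4)
print(as_dict == {'x': 11, 'y': 22})  # True
print(r, p)                           # Point(x=100, y=22) Point(x=11, y=22)
```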

## OrderedDict

Ordered dictionaries are just like regular dictionaries but they remember the order that items were inserted. Items are ordered by the insertion order of their keys, not sorted by value.
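
A minimal sketch of the insertion-order behaviour, using made-up keys and values. It sticks to plain item assignment, `del`, and `popitem`.

```python
from collections import OrderedDict

od = OrderedDict()
od['banana'] = 3
od['apple'] = 4
od['pear'] = 1

# Keys come back in insertion order, not sorted by key or by value.
print(list(od.keys()))    # ['banana', 'apple', 'pear']

# Updating an existing key keeps its original position.
od['banana'] = 10
print(list(od.items()))   # [('banana', 10), ('apple', 4), ('pear', 1)]

# Deleting and re-inserting a key moves it to the end.
del od['apple']
od['apple'] = 4
print(list(od.keys()))    # ['banana', 'pear', 'apple']

# popitem(last=False) removes the oldest entry; the default removes the newest.
oldest = od.popitem(last=False)
print(oldest)             # ('banana', 10)
```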