# Collections

## defaultdict

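>Unlike a plain `dict`, a `defaultdict` does not require you to check whether a key is present: a missing key is created on first access by calling the factory passed to the constructor (here, `list`). So we can do: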

In [13]:
from collections import defaultdict

colours = (
    ('Yasoob', 'Yellow'),
    ('Ali', 'Blue'),
    ('Arham', 'Green'),
    ('Ali', 'Black'),
    ('Yasoob', 'Red'),
    ('Ahmed', 'Silver'),
)

favourite_color = defaultdict(list)

for name, color in colours:
    favourite_color[name].append(color)

print(favourite_color)
print(type(favourite_color['Yasoob']))



defaultdict(<class 'list'>, {'Yasoob': ['Yellow', 'Red'], 'Ali': ['Blue', 'Black'], 'Arham': ['Green'], 'Ahmed': ['Silver']})
<class 'list'>
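For comparison, here is a minimal sketch of the same grouping done with a plain `dict`, where the missing-key case has to be handled explicitly through `setdefault`. The names `grouped_plain` and `grouped_default` and the shortened data are illustrative only.

```python
from collections import defaultdict

colours = (
    ('Yasoob', 'Yellow'),
    ('Ali', 'Blue'),
    ('Ali', 'Black'),
)

# Plain dict: setdefault inserts an empty list the first time a name is seen.
grouped_plain = {}
for name, colour in colours:
    grouped_plain.setdefault(name, []).append(colour)

# defaultdict: list() is called automatically for a missing key.
grouped_default = defaultdict(list)
for name, colour in colours:
    grouped_default[name].append(colour)

print(grouped_plain == dict(grouped_default))  # True
```

Both approaches build the same mapping; `defaultdict` just moves the "create if missing" step into the container.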


>A plain dict raises a `KeyError` when you assign to a nested key whose parent key does not exist yet:

In [14]:
some_dict = {}
some_dict['colours']['favourite'] = "yellow"

KeyError: 'colours'

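>With `defaultdict` this can be solved by a factory that returns another `defaultdict`, so every missing level is created on demand: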

In [29]:
tree = lambda: defaultdict(tree)

some_dict = tree()

some_dict['colours']['favourite'] = "yellow"

import json

json_str = json.dumps(some_dict)  # a separate name avoids shadowing the built-in str

print(json_str)

{"colours": {"favourite": "yellow"}}
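One consequence of the recursive factory is worth seeing in a short sketch: merely reading a missing key inserts it, while membership tests and `.get()` do not. The `config` name and its keys are made up for illustration.

```python
from collections import defaultdict

def tree():
    # Each missing key yields another tree, so any depth can be assigned.
    return defaultdict(tree)

config = tree()
config['colours']['favourite'] = "yellow"

# Merely reading a missing key inserts an empty subtree.
_ = config['sizes']['default']
print('sizes' in config)       # True

# Membership tests and .get() do not create keys.
print('shapes' in config)      # False
print(config.get('shapes'))    # None
print(sorted(config))          # ['colours', 'sizes']
```

If accidental key creation is a concern, checking with `in` or `.get()` before indexing avoids it.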


## OrderedDict

>OrderedDict keeps its entries in the order in which they were first inserted. Overwriting the value of an existing key doesn’t change the position of that key. However, deleting and reinserting an entry moves the key to the end of the dictionary.


In [37]:
from collections import OrderedDict

colours = OrderedDict([("Red", 150), ("Green", 170), ("Blue", 160)])

for k,v in colours.items():
    print(k, v)

Red 150
Green 170
Blue 160
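The ordering rules described above can be checked with a small sketch. It reuses the `colours` data from this section; `move_to_end` is an `OrderedDict` method that reorders a key explicitly.

```python
from collections import OrderedDict

colours = OrderedDict([("Red", 150), ("Green", 170), ("Blue", 160)])

colours["Red"] = 198           # overwrite: position unchanged
print(list(colours))           # ['Red', 'Green', 'Blue']

value = colours.pop("Green")   # delete ...
colours["Green"] = value       # ... and reinsert: key moves to the end
print(list(colours))           # ['Red', 'Blue', 'Green']

colours.move_to_end("Red")     # explicit reordering
print(list(colours))           # ['Blue', 'Green', 'Red']
```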


## Counter
>Counter lets us count the occurrences of each item. For instance, it can be used to count how many favourite colours each person has listed:

In [39]:
from collections import Counter

In [40]:
colours = (
    ('Yasoob', 'Yellow'),
    ('Ali', 'Blue'),
    ('Arham', 'Green'),
    ('Ali', 'Black'),
    ('Yasoob', 'Red'),
    ('Ahmed', 'Silver'),
)

In [42]:
c = Counter(k for k, v in colours)

In [43]:
print(c)

Counter({'Yasoob': 2, 'Ali': 2, 'Arham': 1, 'Ahmed': 1})


>We can also use it to count how often each line occurs in a file. For example:
```python
with open('filename', 'rb') as f:
    line_count = Counter(f)
print(line_count)
```
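To pull out only the most frequent lines, `Counter.most_common(n)` can be used. The sketch below runs on an in-memory list instead of a real file; the `lines` content is invented for illustration.

```python
from collections import Counter

# Stand-in for the lines of a file; the content is invented for illustration.
lines = [
    "GET /index\n",
    "GET /about\n",
    "GET /index\n",
    "POST /login\n",
    "GET /about\n",
    "GET /index\n",
]

# Strip the trailing newline so identical lines compare equal.
line_count = Counter(line.strip() for line in lines)

print(line_count.most_common(2))
# [('GET /index', 3), ('GET /about', 2)]
```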

## deque
>deque provides a double-ended queue, which means that you can append and delete elements from either end of the queue. First of all, import the `deque` class from the `collections` module:

In [62]:
from collections import deque

In [63]:
d = deque(range(5))
print(len(d))
# Output: 5

5


In [64]:
d.popleft()

0

In [65]:
d.pop()

4

In [66]:
print(d)

deque([1, 2, 3])


In [67]:
d.extendleft('left')
d.extend('right')

In [68]:
print(d)

deque(['t', 'f', 'e', 'l', 1, 2, 3, 'r', 'i', 'g', 'h', 't'])


In [69]:
d.appendleft('left')
d.append('right')

In [70]:
print(d)

deque(['left', 't', 'f', 'e', 'l', 1, 2, 3, 'r', 'i', 'g', 'h', 't', 'right'])


In [71]:
print(len(d))

print(d.maxlen)

14
None


>We can also limit the number of items a deque can hold. Once the deque reaches its maximum length, adding a new item simply discards an item from the opposite end. It is easier to explain with an example, so here you go:

In [72]:
d = deque(range(5), maxlen = 10)

In [73]:
print(d.maxlen)

10


In [74]:
d.extend('56789')

In [75]:
print(d)

deque([0, 1, 2, 3, 4, '5', '6', '7', '8', '9'], maxlen=10)


In [76]:
d.extend('x')

In [77]:
print(d)

deque([1, 2, 3, 4, '5', '6', '7', '8', '9', 'x'], maxlen=10)
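A common use of a bounded deque is keeping only the most recent items. The following is a minimal sketch with made-up readings; it also shows that `appendleft` on a full deque discards from the right-hand end.

```python
from collections import deque

# Keep only the three most recent readings (values are illustrative).
recent = deque(maxlen=3)
for reading in [10, 20, 30, 40, 50]:
    recent.append(reading)
print(recent)          # deque([30, 40, 50], maxlen=3)

# Adding on the left of a full deque drops the rightmost item.
recent.appendleft(0)
print(recent)          # deque([0, 30, 40], maxlen=3)
```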


## namedtuple
>You might already be acquainted with tuples. A tuple is basically an immutable list that stores a sequence of values separated by commas. Tuples are just like lists but have a few key differences. The major one is that, unlike lists, you cannot reassign an item in a tuple. To access a value in a tuple you use integer indexes, like this:

In [78]:
man = ('Ali', 30)
print(man[0])
# Output: Ali

Ali


>Well, so now what are namedtuples? They turn tuples into convenient containers for simple tasks. With namedtuples you don’t have to use integer indexes for accessing members of a tuple. You can think of namedtuples like dictionaries but unlike dictionaries they are immutable.

In [79]:
from collections import namedtuple

Animal = namedtuple('Animal', 'name age type')
perry = Animal(name="perry", age=31, type="cat")

print(perry)
# Output: Animal(name='perry', age=31, type='cat')

print(perry.name)
# Output: 'perry'

Animal(name='perry', age=31, type='cat')
perry


>You can now see that we can access members of a tuple by name using a `.`. Let’s dissect it a little more. `namedtuple` takes two required arguments: the tuple name and the tuple field_names. In the above example our tuple name was ‘Animal’ and the field_names were ‘name’, ‘age’ and ‘type’.

>Named tuples make your code self-documenting: you can understand what is going on with a quick glance. Because you are not bound to integer indexes to access members, the code is also easier to maintain. Moreover, as `namedtuple` instances do not have per-instance dictionaries, they are lightweight and require no more memory than regular tuples, which makes them more memory-efficient than dictionaries. However, remember that, as with tuples, attributes in namedtuples are immutable. This means the following would not work:

In [80]:
from collections import namedtuple

Animal = namedtuple('Animal', 'name age type')
perry = Animal(name="perry", age=31, type="cat")
perry.age = 42

# Output: Traceback (most recent call last):
#            File "", line 1, in
#         AttributeError: can't set attribute

AttributeError: can't set attribute
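When a changed value is needed, one option is `_replace`, which returns a new namedtuple and leaves the original untouched. A short sketch reusing the `Animal` example (the `older_perry` name is illustrative):

```python
from collections import namedtuple

Animal = namedtuple('Animal', 'name age type')
perry = Animal(name="perry", age=31, type="cat")

# _replace builds a new tuple with the given fields changed.
older_perry = perry._replace(age=42)

print(older_perry)  # Animal(name='perry', age=42, type='cat')
print(perry)        # Animal(name='perry', age=31, type='cat')
```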

>You should use named tuples to make your code self-documenting. They are backwards compatible with normal tuples. It means that you can use integer indexes with namedtuples as well:

In [81]:
from collections import namedtuple

Animal = namedtuple('Animal', 'name age type')
perry = Animal(name="perry", age=31, type="cat")
print(perry[0])
# Output: perry


perry
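Because a namedtuple is still a tuple, unpacking, `len()` and comparison with plain tuples also work. The sketch below passes field_names as a list of strings, which `namedtuple` accepts as well as a space-separated string; the variable `kind` is just an illustrative name.

```python
from collections import namedtuple

# field_names may also be given as a list of strings.
Animal = namedtuple('Animal', ['name', 'age', 'type'])
perry = Animal("perry", 31, "cat")

# Ordinary tuple unpacking works on a namedtuple.
name, age, kind = perry

print(Animal._fields)                 # ('name', 'age', 'type')
print(len(perry))                     # 3
print(perry == ("perry", 31, "cat"))  # True
```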


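>Last but not least, you can convert a namedtuple to a dictionary. The output below comes from an older Python version; since Python 3.8, `_asdict()` returns a regular `dict` instead of an `OrderedDict`. Like this: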

In [82]:
from collections import namedtuple

Animal = namedtuple('Animal', 'name age type')
perry = Animal(name="Perry", age=31, type="cat")
print(perry._asdict())
# Output: OrderedDict([('name', 'Perry'), ('age', 31), ...

OrderedDict([('name', 'Perry'), ('age', 31), ('type', 'cat')])


## enum.Enum (Python 3.4+)

>Another useful collection is the enum object. It is available in the enum module, in Python 3.4 and up (also available as a backport on PyPI named enum34). An enum (enumerated type) defines a fixed set of named constant values.

>Let’s consider the Animal namedtuple from the last example. It had a type field. The problem is, the type was a string. This poses some problems for us. What if the user types in Cat because they held the Shift key? Or CAT? Or kitten?

>Enumerations can help us avoid this problem, by not using strings. Consider this example:



In [83]:
from collections import namedtuple
from enum import Enum

class Species(Enum):
    cat = 1
    dog = 2
    horse = 3
    aardvark = 4
    butterfly = 5
    owl = 6
    platypus = 7
    dragon = 8
    unicorn = 9
    # The list goes on and on...

    # But we don't really care about age, so we can use an alias.
    kitten = 1
    puppy = 2

Animal = namedtuple('Animal', 'name age type')
perry = Animal(name="Perry", age=31, type=Species.cat)
drogon = Animal(name="Drogon", age=4, type=Species.dragon)
tom = Animal(name="Tom", age=75, type=Species.cat)
charlie = Animal(name="Charlie", age=2, type=Species.kitten)


In [84]:
charlie.type == tom.type

True

In [85]:
charlie.type

<Species.cat: 1>

>This is much less error-prone. We have to be specific, and we should use only the enumeration to name types.

>There are three ways to access enumeration members: by value, by name, and as an attribute. For example, all three of the following return the `Species.cat` member:

In [86]:
Species(1)
Species['cat']
Species.cat

<Species.cat: 1>
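Name lookup is also a handy way to deal with the "Cat / CAT / kitten" input problem mentioned earlier. The sketch below uses a trimmed-down `Species` enum and a hypothetical helper, `parse_species`, which is not part of the `enum` module; it normalises the text and looks the member up by name.

```python
from enum import Enum

class Species(Enum):
    cat = 1
    dog = 2
    kitten = 1   # alias for cat

def parse_species(text):
    """Hypothetical helper: map free-form user input to a Species member."""
    try:
        return Species[text.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown species: {text!r}") from None

print(parse_species("Cat") is Species.cat)     # True
print(parse_species("KITTEN") is Species.cat)  # True

# Iteration skips aliases; __members__ includes them.
print([member.name for member in Species])     # ['cat', 'dog']
print(list(Species.__members__))               # ['cat', 'dog', 'kitten']
```

Unknown input such as `"unicorn"` raises a `ValueError` here, so bad values are rejected at the boundary instead of spreading through the program.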