###  Collections

The collections module in Python implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple.
The following tools exist:

    namedtuple : factory function for creating tuple subclasses with named fields
    OrderedDict : dict subclass that remembers the order entries were added
    Counter : dict subclass for counting hashable objects
    defaultdict : dict subclass that calls a factory function to supply missing values
    deque : list-like container with fast appends and pops on either end

In Python 3, the module also provides a few more classes (ChainMap, UserDict, UserList, UserString). See https://docs.python.org/3/library/collections.html for further reference.


### The Counter
Counter is a subclass of dict.
The Counter() constructor in the collections module takes an iterable or a mapping as its argument and
returns a Counter object, which behaves like a dictionary.
In this dictionary, each key is an element of the iterable or mapping, and each value is the number of times that element occurs in it.

A counter is a container that stores elements as dictionary keys, and their counts are stored as dictionary values.
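As a minimal sketch of the two ways to build a Counter described above (the sample data and variable names are made up for illustration):

```python
from collections import Counter

# Count items from an iterable (a list of words).
words = ["red", "blue", "red", "green", "blue", "red"]
word_counts = Counter(words)

# Build a Counter directly from a mapping of element -> count.
stock = Counter({"apples": 4, "pears": 2})

print(word_counts["red"])     # 3
print(stock["apples"])        # 4
# A missing key reports a count of 0 instead of raising KeyError.
print(word_counts["purple"])  # 0
```

Looking up a missing key does not add it to the Counter.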

https://stackabuse.com/introduction-to-pythons-collections-module/#thenamedtuple

https://github.com/python-engineer/python-engineer-notebooks/blob/master/advanced-python/06-Collections.ipynb

https://www.machinelearningplus.com/python-collections-guide/

In [2]:
from collections import Counter

In [3]:
a = 'aaabbbbbccccddddddd'
my_counter = Counter(a)
my_counter

Counter({'a': 3, 'b': 5, 'c': 4, 'd': 7})

In [4]:
my_counter.items()

dict_items([('a', 3), ('b', 5), ('c', 4), ('d', 7)])

In [5]:
my_counter.keys()

dict_keys(['a', 'b', 'c', 'd'])

#### Apart from that, Counter has additional methods, including:

- elements()
- most_common([n])

- The most_common() Function

A Counter keeps its keys in the order they were first encountered (on Python 3.7+), not in order of count. To get the elements sorted by count, from most to least common, use the most_common() method of the Counter object.

In [7]:
my_counter.most_common()

[('d', 7), ('b', 5), ('c', 4), ('a', 3)]

You can see that the most_common() method returns a list of (element, count) tuples, sorted by count. 'd' has a count of seven, so it is the first element of the list.

In [9]:
# the single most common item
my_counter.most_common(1)

[('d', 7)]

In [10]:
my_counter.most_common(2)

[('d', 7), ('b', 5)]
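A typical use of most_common() is a quick word-frequency count. The sketch below uses a made-up sentence and compares the result with sorting the items by count manually:

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog the fox"
word_counts = Counter(text.split())

# The two most frequent words, highest count first.
top_two = word_counts.most_common(2)
print(top_two)  # [('the', 3), ('fox', 2)]

# Roughly the same result by sorting the items on their count.
by_count = sorted(word_counts.items(), key=lambda item: item[1], reverse=True)
print(by_count[:2])  # [('the', 3), ('fox', 2)]
```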

- The elements() Function

You can get the items of a Counter object with the elements() method. It returns an iterator that yields each element as many times as its count; wrap it in list() to see the values.



In [11]:
my_counter.elements()

<itertools.chain at 0x1f9f1eb9c08>

In [14]:
print(list(my_counter.elements()))

['a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'c', 'c', 'c', 'c', 'd', 'd', 'd', 'd', 'd', 'd', 'd']
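The following sketch (with an illustrative `inventory` Counter) shows that elements() is lazy and that keys with a count below one are skipped:

```python
from collections import Counter

inventory = Counter({"pen": 2, "ink": 0, "pad": 1})

# elements() is lazy: it yields each key as many times as its count.
items = inventory.elements()
print(next(items))  # pen

# Keys whose count is less than one are skipped.
print(sorted(inventory.elements()))  # ['pad', 'pen', 'pen']
```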


### The namedtuple()

namedtuples are easy to create, lightweight object types. They assign meaning to each position in a tuple and allow for more readable, self-documenting code. They can be used wherever regular tuples are used, and they add the ability to access fields by name instead of position index.

In [15]:
from collections import namedtuple

In [16]:
Point = namedtuple('point', 'x,y')
p1 = Point(1, 3)
p1

point(x=1, y=3)

In [17]:
p1.x

1

In [18]:
p1.y

3

In [19]:
stud = namedtuple('student', 'fname, lname, age')
s1 = stud('Amar', 'Roy', 18)
s1

student(fname='Amar', lname='Roy', age=18)

In [21]:
type(s1)

__main__.student

In [22]:
type(s1.fname)

str

In [23]:
s1.fname

'Amar'

In [24]:
s1.lname

'Roy'

In [26]:
s2 = stud(fname='Ajay', lname='Gupta', age=19)
s2

student(fname='Ajay', lname='Gupta', age=19)

In [27]:
s2.fname

'Ajay'
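Because a namedtuple is still a tuple, indexing and unpacking keep working alongside access by name. A short sketch (the `Point` type here is redefined only to keep the example self-contained):

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(1, 3)

# Works like a regular tuple: indexing and unpacking.
x, y = p
print(p[0], y)            # 1 3

# Also readable by field name.
print(p.x + p.y)          # 4

# Instances are immutable; _replace returns a modified copy.
p2 = p._replace(x=10)
print(p2)                 # Point(x=10, y=3)

# _asdict exposes the fields as a mapping.
print(dict(p._asdict()))  # {'x': 1, 'y': 3}
```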

### The OrderedDict
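OrderedDict is a dictionary that remembers the order in which keys were inserted. Changing the value of an existing key later does not change that key's position. Since Python 3.7, regular dicts also preserve insertion order, but OrderedDict still offers extra order-related methods.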



In [2]:
from collections import OrderedDict

In [3]:
orderdict = OrderedDict()
orderdict['a'] = 1
orderdict['b'] = 2
orderdict['c'] = 3
orderdict['d'] = 4

In [4]:
orderdict

OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])

In [5]:
type(orderdict)

collections.OrderedDict

In [6]:
orderdict.items()

odict_items([('a', 1), ('b', 2), ('c', 3), ('d', 4)])

In [10]:
d = {1:2, 3:4}
d = OrderedDict(d)
type(d)

collections.OrderedDict
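The ordering behaviour can be checked with a small sketch. The `scores` name is illustrative; move_to_end() and popitem(last=False) are OrderedDict methods:

```python
from collections import OrderedDict

scores = OrderedDict()
scores["a"] = 1
scores["b"] = 2
scores["c"] = 3

# Updating an existing key keeps its original position.
scores["a"] = 100
print(list(scores))  # ['a', 'b', 'c']

# move_to_end is specific to OrderedDict and reorders explicitly.
scores.move_to_end("a")
print(list(scores))  # ['b', 'c', 'a']

# popitem(last=False) removes the oldest remaining entry.
print(scores.popitem(last=False))  # ('b', 2)
```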

### defaultdict

The defaultdict is a container similar to the usual dict. The difference is that looking up a key that has not been set yet does not raise a KeyError; instead, the factory function passed to defaultdict is called to create a default value, which is stored under that key and returned. Without a defaultdict, you'd have to check whether the key exists and, if it doesn't, set it to what you want.


In [33]:
from collections import defaultdict

In [47]:
d = defaultdict(int)  # initialize with a default integer value, i.e. 0
d['a'] = 1
d['b'] = 2
d

defaultdict(int, {'a': 1, 'b': 2})

In [35]:
d['c']

0

In [38]:
# Key is not present in the dict that's why it is returning default value
d['d']

0

In [39]:
d['a']

1

In [40]:
type(d)

collections.defaultdict

In [48]:
d1 = defaultdict(list)  # initialize with a default list value, i.e. an empty list
d1['a'] = 1
d1['b'] = 2
d1

defaultdict(list, {'a': 1, 'b': 2})

In [42]:
d1['c']

[]

In [43]:
d2 = defaultdict(tuple)  # initialize with a default tuple value, i.e. an empty tuple
d2['a'] = 1
d2['b'] = 2
d2

defaultdict(tuple, {'a': 1, 'b': 2})

In [44]:
d2['c']

()

In [45]:
d3 = defaultdict(str)  # initialize with a default string value, i.e. an empty string
d3['a'] = 1
d3['b'] = 2
d3

defaultdict(str, {'a': 1, 'b': 2})

In [46]:
d3['c']

''

In [49]:
d4 = defaultdict(dict)  # initialize with a default dictionary value, i.e. an empty dictionary
d4['a'] = 1
d4['b'] = 2
d4

defaultdict(dict, {'a': 1, 'b': 2})

In [50]:
d4['c']  

{}
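The benefit is easiest to see when grouping values. The sketch below, with a made-up word list, groups words by first letter twice: once with a plain dict and an explicit key check, and once with defaultdict(list):

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# With a plain dict, the key has to be checked (and created) first.
plain = {}
for word in words:
    if word[0] not in plain:
        plain[word[0]] = []
    plain[word[0]].append(word)

# With defaultdict(list), a missing key starts as an empty list.
grouped = defaultdict(list)
for word in words:
    grouped[word[0]].append(word)

print(dict(grouped))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
print(grouped == plain)  # True
```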

### The deque
The deque is a list-like container optimized for inserting and removing items at both ends, similar to a queue data structure.

A deque is a double-ended queue. It can be used to add or remove elements from both ends. Deques support thread-safe, memory-efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction. The more commonly used stacks and queues are degenerate forms of deques, where the inputs and outputs are restricted to a single end.


In [51]:
from collections import deque

In [53]:
de = deque()

In [55]:
de.append(1)
de.append(5)
de.append(3)
de.append(6)
de

deque([1, 5, 3, 6])

In [56]:
de.appendleft(9)  # add to the left end
de

deque([9, 1, 5, 3, 6])

In [57]:
de.pop()  # remove from the right end
de

deque([9, 1, 5, 3])

In [58]:
de.popleft()  # remove from the left end
de

deque([1, 5, 3])

In [60]:
de.extend([10,11,12])
de

deque([1, 5, 3, 10, 11, 12])

In [61]:
de.extendleft([21,22,23])  # items are added one at a time, so their order is reversed
de

deque([23, 22, 21, 1, 5, 3, 10, 11, 12])

In [62]:
de.rotate(1)  # rotate one step to the right
de

deque([12, 23, 22, 21, 1, 5, 3, 10, 11])

In [63]:
de.rotate(2)
de

deque([10, 11, 12, 23, 22, 21, 1, 5, 3])

In [64]:
de.rotate(-1)  # a negative value rotates to the left
de

deque([11, 12, 23, 22, 21, 1, 5, 3, 10])
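Since stacks and queues are special cases of a deque, the same class can serve as either. A minimal sketch, with illustrative names and values:

```python
from collections import deque

# Queue (FIFO): add on the right, remove from the left.
queue = deque()
queue.append("first")
queue.append("second")
queue.append("third")
print(queue.popleft())  # first

# Stack (LIFO): add and remove on the same end.
stack = deque()
stack.append("first")
stack.append("second")
stack.append("third")
print(stack.pop())      # third
```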

### The ChainMap
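ChainMap is used to combine several dictionaries or mappings into a single view. It returns a ChainMap object that keeps the underlying dictionaries in a list; when a key appears in more than one dictionary, the value from the first one is used.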

In [66]:
from collections import ChainMap

In [68]:
dict1 = { 'a' : 1, 'b' : 2 }
dict2 = { 'c' : 3, 'b' : 4 }

In [69]:
d= ChainMap(dict1, dict2)
d

ChainMap({'a': 1, 'b': 2}, {'c': 3, 'b': 4})

In [70]:
list(d)

['c', 'b', 'a']

In [71]:
type(d)

collections.ChainMap

In [72]:
d.new_child()

ChainMap({}, {'a': 1, 'b': 2}, {'c': 3, 'b': 4})

##### Adding a New Dictionary to ChainMap
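If you want to add a new dictionary to an existing ChainMap, use the new_child() method. It returns a new ChainMap with the given dictionary placed in front of the existing ones; the original ChainMap is left unchanged. Called without an argument, it adds an empty dictionary.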

In [73]:
dict3 = {'e' : 5, 'f' : 6}
d.new_child(dict3)

ChainMap({'e': 5, 'f': 6}, {'a': 1, 'b': 2}, {'c': 3, 'b': 4})

In [74]:
d

ChainMap({'a': 1, 'b': 2}, {'c': 3, 'b': 4})

In [75]:
new_chain_map = d.new_child(dict3)
new_chain_map

ChainMap({'e': 5, 'f': 6}, {'a': 1, 'b': 2}, {'c': 3, 'b': 4})

In [76]:
len(new_chain_map)

5

In [77]:
d.keys()

KeysView(ChainMap({'a': 1, 'b': 2}, {'c': 3, 'b': 4}))

In [78]:
list(d.keys())

['c', 'b', 'a']

In [79]:
list(d.values())

[3, 2, 1]

In [81]:
d.parents

ChainMap({'c': 3, 'b': 4})

In [82]:
new_chain_map.parents

ChainMap({'a': 1, 'b': 2}, {'c': 3, 'b': 4})

In [85]:
d.maps  # The underlying dictionaries are available as a list through the .maps attribute.

[{'a': 1, 'b': 2}, {'c': 3, 'b': 4}]

In [84]:
d

ChainMap({'a': 1, 'b': 2}, {'c': 3, 'b': 4})

In [89]:
list(reversed(d.maps))  # We are reversing the order of dictionaries using reversed() function

[{'c': 3, 'b': 4}, {'a': 1, 'b': 2}]
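A common use of ChainMap is layering overrides on top of defaults. The sketch below uses made-up `defaults` and `overrides` dictionaries to show the lookup order and where writes end up:

```python
from collections import ChainMap

defaults = {"color": "blue", "size": "M"}
overrides = {"color": "red"}

settings = ChainMap(overrides, defaults)

# Lookups search the mappings left to right; the first match wins.
print(settings["color"])  # red
print(settings["size"])   # M

# Writes only affect the first mapping.
settings["size"] = "L"
print(overrides)          # {'color': 'red', 'size': 'L'}
print(defaults)           # {'color': 'blue', 'size': 'M'}
```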

### UserList
This section assumes you are familiar with Python lists.

A UserList is a list-like container datatype that acts as a wrapper class around lists.

Syntax: collections.UserList([list])

You pass a normal list as an argument to UserList. The list's contents are stored in the data attribute and can be accessed through UserList.data.



In [90]:
from collections import UserList

In [91]:
my_list=[11,22,33,44]

# Wrap a regular list in a UserList
user_list = UserList(my_list)
user_list

[11, 22, 33, 44]

In [92]:
type(user_list)

collections.UserList

In [93]:
user_list.data

[11, 22, 33, 44]

###### What is the use of UserLists
Suppose you want to double all the elements in certain lists. Or maybe you want to ensure that no element can be deleted from a given list.

In such cases, we need to add a certain ‘behavior’ to our lists, which can be done by subclassing UserList.

For example, let me show you how UserList can be used to override the functionality of a built-in method. The code below prevents appending a new value to a list.



In [94]:
# Creating a userlist where adding new elements is not allowed.
from collections import UserList


class user_list(UserList):
    def append(self, s=None):
        raise RuntimeError("Authority denied for new insertion")


# trying to insert new element
my_list = user_list([10, 11, 12])
my_list.append(13)


RuntimeError: Authority denied for new insertion

The above code raises a RuntimeError and does not allow appending. This can be helpful if, for example, you want to make sure nobody can add their name after a particular deadline. So UserList is useful in real-world situations like this.
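The other case mentioned earlier, a list from which nothing can be deleted, can be sketched in the same way. `NoDeleteList` and `_refuse` are hypothetical names, and the sketch is not exhaustive (slice assignment, for instance, is not blocked):

```python
from collections import UserList


class NoDeleteList(UserList):
    """Hypothetical list wrapper that refuses to drop elements."""

    def _refuse(self, *args, **kwargs):
        raise RuntimeError("Deletion is not allowed")

    # Route every removing operation to the same refusal.
    __delitem__ = _refuse
    pop = _refuse
    remove = _refuse
    clear = _refuse


names = NoDeleteList(["Ann", "Bob"])
names.append("Cy")  # adding is still allowed

try:
    names.pop()
except RuntimeError as err:
    print(err)      # Deletion is not allowed

print(names)        # ['Ann', 'Bob', 'Cy']
```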



### UserString
Just as UserList is a wrapper class for lists, UserString is a wrapper class for strings.

It allows you to add certain functionality/behavior to the string. You can pass any string convertible argument to this class and can access the string using the data attribute of the class.

In [95]:
from collections import UserString

In [96]:
num = 899
user_str = UserString(num)
user_str

'899'

In [97]:
type(user_str)

collections.UserString

In [98]:
print(user_str.data)

899


In [99]:
type(user_str.data)

str

As you can see in the above example, the number 899 was converted into the string ‘899’, which can be accessed through the UserString.data attribute.
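To add behavior to a string, you can subclass UserString. The following is only a sketch; the `Slug` class and its `slugify` method are hypothetical and not part of the collections module:

```python
from collections import UserString


class Slug(UserString):
    """Hypothetical string wrapper with one extra method."""

    def slugify(self):
        # self.data holds the underlying built-in str.
        return "-".join(self.data.lower().split())


title = Slug("Python Collections Module")
print(title.slugify())  # python-collections-module

# Regular string methods still work and return the subclass type.
print(type(title.upper()).__name__)  # Slug
```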

### UserDict
It is a wrapper class for dictionaries. Its syntax and methods are similar to those of UserList and UserString.

Syntax: collections.UserDict([data])

We pass a dictionary as the argument; its contents are stored in the data attribute of UserDict.



In [101]:
from collections import UserDict

In [102]:
d = {1:'a', 2:'d'}
user_dict = UserDict(d)

In [104]:
type(user_dict)

collections.UserDict

In [105]:
user_dict.data

{1: 'a', 2: 'd'}

In [106]:
type(user_dict.data)

dict

##### How UserDict can be used
UserDict allows you to create a dictionary modified to your needs. Let’s see an example of how UserDict can be used to override the functionality of a built-in method. The code below prevents a key-value pair from being removed with pop().



In [108]:
class user_dict1(UserDict):
    def pop(self, s=None):
        raise RuntimeError("Authority denied for deletion")
        
my_dict = user_dict1({1:'a', 2:'d'})
my_dict.pop(1)
        

RuntimeError: Authority denied for deletion

You will receive a RuntimeError. This helps if you don’t want to lose data.
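Another way to modify a dictionary to your needs is to override `__setitem__`, since UserDict routes initialization and update() through it. The `LowerKeyDict` class below is a hypothetical sketch that stores string keys in lowercase:

```python
from collections import UserDict


class LowerKeyDict(UserDict):
    """Hypothetical dictionary that stores string keys in lowercase."""

    def __setitem__(self, key, value):
        if isinstance(key, str):
            key = key.lower()
        super().__setitem__(key, value)


headers = LowerKeyDict({"Content-Type": "text/html"})
headers["ACCEPT"] = "*/*"
headers.update({"Host": "example.org"})

print(headers.data)
# {'content-type': 'text/html', 'accept': '*/*', 'host': 'example.org'}
```

Note that lookups are not normalized in this sketch, so `headers["Host"]` would still raise a KeyError.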

These are the main container datatypes from the collections module. Used where they fit, they can make code shorter and clearer, and in some cases faster (for example, deque for appends and pops at both ends).
