# Collections Module

The collections module is a built-in module that implements specialized container data types, providing alternatives to Python’s general-purpose built-in containers. We've already gone over the basics: dict, list, set, and tuple.

Now we'll learn about the alternatives that the collections module provides.

## Counter

*Counter* is a *dict* subclass for counting hashable objects. Elements are stored as dictionary keys, and their counts are stored as the corresponding values.

Let's see how it can be used:

In [1]:
# Counter is used for counting
from collections import Counter

**Counter() with lists**

In [2]:
# example using a list
lst = [1,2,2,3,4,1,1,1,5,5,6,6,6,2,2,3,3,4,8]
Counter(lst)

Counter({1: 4, 2: 4, 3: 3, 4: 2, 5: 2, 6: 3, 8: 1})

In [3]:
# Counter tallies every unique value
# and presents the result in dictionary form

In [4]:
x = ['a','a',1,2,3,3]
Counter(x)

Counter({'a': 2, 1: 1, 2: 1, 3: 2})
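As a rough illustration of what Counter is doing in the cells above, the same tally can be built by hand with a plain dict. The helper name `manual_count` is made up for this sketch and is not part of the collections module.

```python
from collections import Counter

def manual_count(items):
    """Tally hashable items with a plain dict (hypothetical helper)."""
    counts = {}
    for item in items:
        # .get() supplies 0 the first time an item is seen
        counts[item] = counts.get(item, 0) + 1
    return counts

x = ['a', 'a', 1, 2, 3, 3]
print(manual_count(x))
print(dict(Counter(x)) == manual_count(x))  # both approaches give the same counts
```

Counter saves writing this loop and adds helper methods on top of the resulting dictionary.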

**Counter with strings**

In [5]:
Counter('aadsfasaaasdfjoajsjfjsfsffsafew')

Counter({'a': 8, 'd': 2, 's': 7, 'f': 7, 'j': 4, 'o': 1, 'e': 1, 'w': 1})

**Counter with words in a sentence**

In [7]:
# Counter can also count the words in a sentence
s = 'How many times does each word show up in this sentence word times each each word'
words = s.split()

Counter(words)

Counter({'How': 1,
         'many': 1,
         'times': 2,
         'does': 1,
         'each': 3,
         'word': 3,
         'show': 1,
         'up': 1,
         'in': 1,
         'this': 1,
         'sentence': 1})

In [8]:
# Methods with Counter()
# Counter also comes with many methods
# for example, most_common() shows the most frequent elements
c = Counter(words)
c.most_common(2)

[('each', 3), ('word', 3)]

In [9]:
c.most_common(4)

[('each', 3), ('word', 3), ('times', 2), ('How', 1)]

**Other common patterns with Counter objects**

    sum(c.values())                 # total of all counts
    c.clear()                       # reset all counts
    list(c)                         # list unique elements
    set(c)                          # convert to a set
    dict(c)                         # convert to a regular dictionary
    c.items()                       # view of (elem, cnt) pairs
    Counter(dict(list_of_pairs))    # convert from a list of (elem, cnt) pairs
    c.most_common()[:-n-1:-1]       # n least common elements
    c += Counter()                  # remove zero and negative counts
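The patterns listed above can be tried on a small Counter. This is a minimal sketch; the sample data and variable names are chosen only for illustration.

```python
from collections import Counter

c = Counter(['a', 'b', 'a', 'c', 'a', 'b'])

total = sum(c.values())             # total of all counts
unique = list(c)                    # unique elements, in first-seen order
pairs = list(c.items())             # (elem, cnt) pairs as a list
rebuilt = Counter(dict(pairs))      # back from pairs to a Counter

n = 2
least = c.most_common()[:-n-1:-1]   # n least common elements, rarest first

c['a'] -= 5                         # push one count below zero
c += Counter()                      # drops zero and negative counts

print(total, unique, least)
print(c)
```

Note that `c.clear()` is left out of the sketch because it would empty the Counter before the other lines could be shown.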

In [10]:
# let's try the list() pattern
# to list the unique elements
c

Counter({'How': 1,
         'many': 1,
         'times': 2,
         'does': 1,
         'each': 3,
         'word': 3,
         'show': 1,
         'up': 1,
         'in': 1,
         'this': 1,
         'sentence': 1})

In [12]:
list(c) # put all the unique elements into a list

['How',
 'many',
 'times',
 'does',
 'each',
 'word',
 'show',
 'up',
 'in',
 'this',
 'sentence']
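Because Counter is a dict subclass, individual counts can also be read by key. A short sketch with made-up sample words, showing that a missing element reports a count of zero rather than raising an error:

```python
from collections import Counter

c = Counter(['each', 'word', 'each'])

print(c['each'])        # count of an element that was seen
print(c['missing'])     # 0 for an element that was never seen, no KeyError
print('missing' in c)   # False: reading the count did not add the key

c.update(['word', 'new'])   # add more observations to the existing counts
print(c)
```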

## defaultdict

defaultdict is a dictionary-like object that provides all the methods of a regular dictionary, but takes a first argument (default_factory): a callable that produces the default value for any missing key. Using defaultdict is faster than doing the same with the dict.setdefault method.

**With a default_factory set, looking up a missing key with d[key] does not raise a KeyError. The key is added with the value returned by the default factory.**

The idea is simple: if we look up a key that is wrong, or that does not exist at all, a regular dictionary raises a KeyError. A defaultdict instead lets the code keep running without an error by giving the missing key a default value.

In [13]:
from collections import defaultdict

In [15]:
# for example, create an empty dictionary
d = {}

In [16]:
# look up a key that does not exist
d['key']

KeyError: 'key'

In [17]:
# now make d a defaultdict; pass a lambda to choose the default value
d = defaultdict(lambda: 0)

In [18]:
d['one']

0

In [27]:
x = [f'{key}:{d[key]}' for key in d]

In [28]:
x

['one:0']
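The output above contains 'one' because the earlier lookup `d['one']` inserted that key. A minimal sketch of the same behaviour, using `int` as the factory (calling `int()` returns 0, the same default as `lambda: 0`); the sample string is arbitrary:

```python
from collections import defaultdict

counts = defaultdict(int)   # int() returns 0
for ch in 'banana':
    counts[ch] += 1         # a missing key starts at 0, then is incremented

counts['z']                 # a plain lookup with [] also inserts the key
print(dict(counts))

# .get() and `in` do not call the factory and do not insert anything
print(counts.get('q'))
print('q' in counts)
```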

In [29]:
# a type such as object can also be passed directly as the default factory
d = defaultdict(object)

In [30]:
d['one']

<object at 0x7fee88c034b0>

In [31]:
d

defaultdict(object, {'one': <object at 0x7fee88c034b0>})
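A container type such as `list` is a more typical factory than `object`. The sketch below groups some made-up (kind, name) pairs twice, once with dict.setdefault and once with defaultdict, to show that the two approaches build the same mapping.

```python
from collections import defaultdict

pairs = [('fruit', 'apple'), ('veg', 'kale'), ('fruit', 'pear')]

# Plain dict: setdefault creates the list on first sight of each key
by_setdefault = {}
for kind, name in pairs:
    by_setdefault.setdefault(kind, []).append(name)

# defaultdict: list() is called automatically for each missing key
by_default = defaultdict(list)
for kind, name in pairs:
    by_default[kind].append(name)

print(by_setdefault)
print(dict(by_default))
```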

## namedtuple

The standard tuple uses numerical indexes to access its members. With a very large tuple it is hard to remember the index of every value. A namedtuple works much like a class in OOP, so each value can be accessed by index or by name. Indexing a standard tuple looks like this:

In [33]:
t = (1,2,3)

In [34]:
# only 3 values, so indexing is easy
t[0]

1

For simple use cases, this is usually enough. On the other hand, remembering which index should be used for each value can lead to errors, especially if the tuple has a lot of fields and is constructed far from where it is used. A namedtuple assigns names, as well as the numerical index, to each member. 

Each kind of namedtuple is represented by its own class, created by using the namedtuple() factory function. The arguments are the name of the new class and the names of the elements, given either as a sequence of strings or as a single string with the names separated by spaces or commas.

You can basically think of namedtuples as a very quick way of creating a new object/class type with some attribute fields.
For example:

In [35]:
from collections import namedtuple

In [36]:
Dog = namedtuple('Dog',['age','breed','name'])

sam = Dog(age=2,breed='Lab',name='Sammy')

frank = Dog(age=2,breed='Shepherd',name='Frankie')

We construct the namedtuple by first passing the object type name (Dog) and then the field names, here as a list of strings (a single string with spaces between the field names also works). We can then call on the various attributes:

In [37]:
# fields can be accessed by name
sam

Dog(age=2, breed='Lab', name='Sammy')

In [38]:
sam.age

2

In [39]:
sam.breed

'Lab'

In [40]:
# indexing still works too
sam[0]

2
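A few more things a namedtuple can do, shown as a short sketch. It rebuilds the Dog class from a space-separated string of field names to show that this spelling is equivalent; `older` is a name introduced only for this example.

```python
from collections import namedtuple

# Field names given as one space-separated string instead of a list
Dog = namedtuple('Dog', 'age breed name')
sam = Dog(age=2, breed='Lab', name='Sammy')

age, breed, name = sam          # unpacks like a regular tuple
print(Dog._fields)              # the field names, in order
print(sam._asdict())            # mapping of field name to value
older = sam._replace(age=3)     # returns a new Dog; sam itself is unchanged
print(older)

# Like regular tuples, namedtuples are immutable
try:
    sam.age = 5
except AttributeError as exc:
    print('cannot assign to a field:', exc)
```

The underscore-prefixed names (`_fields`, `_asdict`, `_replace`) are part of the public namedtuple interface; the underscore is there to avoid clashing with user-chosen field names.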