# `collections` module
---

**Table of Contents**<a id='toc0_'></a>    
- [`Counter`](#toc1_)    
  - [Counter Methods](#toc1_1_)    
- [`defaultdict`](#toc2_)    
- [`OrderedDict`](#toc3_)    
  - [Equality With `OrderedDict`](#toc3_1_)    
- [`namedtuple`](#toc4_)    


---

- Standard-library module that implements specialized container data types
- These are alternatives to Python’s general-purpose built-in containers (`dict`, `list`, `set`, `tuple`)

## <a id='toc1_'></a>`Counter` [&#8593;](#toc0_)

- A `dict` subclass for counting hashable objects
- Elements are stored as dictionary keys and their counts as the values

In [1]:
from collections import Counter

# From list to Counter
ls: list[int] = [1, 2, 2, 2, 2, 3, 3, 3, 1, 2, 1, 12, 3, 2, 32, 1, 21, 1, 223, 1]
print(Counter(ls))

Counter({1: 6, 2: 6, 3: 4, 12: 1, 32: 1, 21: 1, 223: 1})


In [2]:
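# From string to Counter: a string is iterated character by character,
# so this counts characters (including spaces and punctuation), not words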
sentence: str = "Hello world! This is a simple test for Counter with strings!"
words_in_sentence = Counter(sentence)
print(words_in_sentence)

Counter({' ': 10, 's': 6, 'i': 5, 't': 5, 'e': 4, 'l': 4, 'o': 4, 'r': 4, 'w': 2, '!': 2, 'h': 2, 'n': 2, 'H': 1, 'd': 1, 'T': 1, 'a': 1, 'm': 1, 'p': 1, 'f': 1, 'C': 1, 'u': 1, 'g': 1})


In [3]:
# Iterating over a Counter directly yields its keys in first-seen order
# most_common() yields (element, count) pairs sorted from highest to lowest count
for k, v in words_in_sentence.most_common():
    print(f"{k}: {v}", end=" | ")

 : 10 | s: 6 | i: 5 | t: 5 | e: 4 | l: 4 | o: 4 | r: 4 | w: 2 | !: 2 | h: 2 | n: 2 | H: 1 | d: 1 | T: 1 | a: 1 | m: 1 | p: 1 | f: 1 | C: 1 | u: 1 | g: 1 | 

In [4]:
# `Counter` with words in a sentence
sentence = "How many times does each word show up in this sentence word times each each word"
list_words_in_sentence: list[str] = sentence.split() # Produces a list
print(Counter(list_words_in_sentence))

Counter({'each': 3, 'word': 3, 'times': 2, 'How': 1, 'many': 1, 'does': 1, 'show': 1, 'up': 1, 'in': 1, 'this': 1, 'sentence': 1})
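
Two behaviours follow from `Counter` storing counts as dictionary values: a missing element has a count of 0, and counts can be added to after construction. The snippet below is a small sketch of both; the sentence and the variable name `word_counts` are made up for illustration.

```python
from collections import Counter

word_counts = Counter("to be or not to be".split())

# A missing key reports a count of 0 instead of raising KeyError,
# and looking it up does not add the key to the Counter.
print(word_counts["question"])        # 0
print("question" in word_counts)      # False

# update() adds to the existing counts (it does not replace them).
word_counts.update(["to", "question"])
print(word_counts["to"])              # 3
print(word_counts["question"])        # 1
```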


### <a id='toc1_1_'></a>Counter Methods [&#8593;](#toc0_)

In [5]:
nums: list[int] = [1, 2, 2, 2, 2, 3, 3, 3, 1, 2, 1, 12, 3, 2, 32, 1, 21, 1, 223, 1]
c = Counter(nums)

print("sum(c.values()):", sum(c.values()))      # total of all counts
c.clear()                                       # remove all elements and their counts

c = Counter(nums)
print("list(c):", list(c))                      # unique elements as a list, in first-seen order
print("set(c):", set(c))                        # same unique elements as list(c), but unordered
print("dict(c):", dict(c))                      # convert to a regular dictionary: {k: v}
print("c.items():", c.items())                  # view of (elem, cnt) pairs; wrap in list() for a list

sum(c.values()): 20
list(c): [1, 2, 3, 12, 32, 21, 223]
set(c): {32, 1, 2, 3, 12, 21, 223}
dict(c): {1: 6, 2: 6, 3: 4, 12: 1, 32: 1, 21: 1, 223: 1}
c.items(): dict_items([(1, 6), (2, 6), (3, 4), (12, 1), (32, 1), (21, 1), (223, 1)])
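
Beyond the conversions above, `Counter` also offers `most_common(n)`, `elements()`, and arithmetic between counters. The following is a minimal sketch of these; the `inventory` and `sold` data are invented for the example.

```python
from collections import Counter

inventory = Counter(apples=3, pears=1, plums=2)
sold = Counter(apples=1, plums=2)

# most_common(n) returns the n highest counts as (element, count) pairs.
top_two = inventory.most_common(2)
print(top_two)                        # [('apples', 3), ('plums', 2)]

# elements() repeats each element as many times as its count.
print(sorted(inventory.elements()))

# Subtraction keeps only positive counts, so 'plums' (2 - 2 = 0) is dropped.
remaining = inventory - sold
print(remaining)                      # Counter({'apples': 2, 'pears': 1})

# Addition sums the counts of matching elements.
restocked = remaining + Counter(plums=5)
print(restocked["plums"])             # 5
```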


---

## <a id='toc2_'></a>`defaultdict` [&#8593;](#toc0_)

- A `dict` subclass that supports all the usual dictionary methods
- Its first argument (`default_factory`) is a zero-argument callable that produces the value for a missing key
- Using `defaultdict` is usually simpler and faster than doing the same with the `dict.setdefault` method
- Indexing a missing key (`dd[key]`) does not raise a `KeyError`: the key is created with the value returned by the default factory (if `default_factory` is `None`, a `KeyError` is still raised)


In [6]:
from collections import defaultdict
from typing import Any

d: dict[str, str] = {}
# d["one"]  # => KeyError: a plain dict has no key "one"

# default_factory is any zero-argument callable; it is called to build the value for a missing key
dd: defaultdict[str, str] = defaultdict(lambda: "Default value")
print(dd["one"])  # No error: the default value is returned and stored under "one"
dd["two"] = "Hello"

for key in dd:
    print(f"{key} - {dd[key]}")

# Built-in types also work as factories: int() returns 0
dd2: defaultdict[Any, int] = defaultdict(int)
print(dd2["one"])

Default value
one - Default value
two - Hello
0
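
A common use of `defaultdict` is grouping items without first checking whether a key exists. The sketch below groups words by first letter and compares the result with the `dict.setdefault` approach; the word list and variable names are illustrative only.

```python
from collections import defaultdict

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# Group words by first letter; list() supplies an empty list for new keys.
by_letter: defaultdict[str, list[str]] = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# The same grouping with a plain dict and dict.setdefault().
by_letter_plain: dict[str, list[str]] = {}
for word in words:
    by_letter_plain.setdefault(word[0], []).append(word)

print(dict(by_letter))
print(by_letter == by_letter_plain)   # True

# Only indexing (dd[key]) calls the factory; get() and `in` do not,
# so they never insert the missing key.
print(by_letter.get("z"))             # None
print("z" in by_letter)               # False
```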


---

## <a id='toc3_'></a>`OrderedDict` [&#8593;](#toc0_)

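- A dictionary subclass that remembers the order in which its items were added
- Since Python 3.7, a regular `dict` also preserves insertion order; `OrderedDict` still differs in its order-sensitive equality and in reordering methods such as `move_to_end()`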

In [7]:
# Normal dictionary (preserves insertion order since Python 3.7)
print("Normal dictionary:")
di: dict[str, str] = {}
di["a"] = "A"
di["c"] = "C"
di["b"] = "B"
di["e"] = "E"
di["d"] = "D"

for k1, v1 in di.items():
    print(k1, v1)

Normal dictionary:
a A
c C
b B
e E
d D


In [8]:
# An Ordered Dictionary
from collections import OrderedDict

print("OrderedDict:")
od: OrderedDict[str, str] = OrderedDict()
od["a"] = "A"
od["b"] = "B"
od["c"] = "C"
od["d"] = "D"
od["e"] = "E"

for k1, v1 in od.items():
    print(k1, v1)

OrderedDict:
a A
b B
c C
d D
e E
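
Because an `OrderedDict` tracks order explicitly, it also has methods for changing that order. A minimal sketch, using `OrderedDict.fromkeys` to build a small example with `None` values:

```python
from collections import OrderedDict

od = OrderedDict.fromkeys("abcde")

od.move_to_end("b")              # move "b" to the end
print("".join(od))               # acdeb

od.move_to_end("e", last=False)  # move "e" to the front
print("".join(od))               # eacdb

first = od.popitem(last=False)   # remove and return the first (key, value) pair
print(first)                     # ('e', None)
```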


### <a id='toc3_1_'></a>Equality With `OrderedDict` [&#8593;](#toc0_)

- A regular `dict` compares only its contents (keys and values) when testing for equality
- Two `OrderedDict` objects must also have their items in the same order to be equal

In [9]:
# A normal Dictionary
print("Dictionaries are equal? ")

d1: dict[str, str] = {}
d1["a"] = "A"
d1["b"] = "B"

d2: dict[str, str] = {}
d2["b"] = "B"
d2["a"] = "A"

print(d1 == d2)

Dictionaries are equal? 
True


In [10]:
# An Ordered Dictionary:
print("Ordered Dictionaries are equal? ")

od1: OrderedDict[str, str] = OrderedDict()
od1["a"] = "A"
od1["b"] = "B"

od2: OrderedDict[str, str] = OrderedDict()
od2["b"] = "B"
od2["a"] = "A"

print(od1 == od2)

Ordered Dictionaries are equal? 
False
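
The order-sensitive comparison only applies when both operands are `OrderedDict` objects. As a short sketch (variable names are illustrative), comparing an `OrderedDict` with a plain `dict` ignores order:

```python
from collections import OrderedDict

od_ab = OrderedDict([("a", "A"), ("b", "B")])
od_ba = OrderedDict([("b", "B"), ("a", "A")])
plain_ba = {"b": "B", "a": "A"}

print(od_ab == od_ba)      # False: both are OrderedDict, so order matters
print(od_ab == plain_ba)   # True: against a plain dict, order is ignored
```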


---

## <a id='toc4_'></a>`namedtuple` [&#8593;](#toc0_)

- The standard tuple uses numerical indexes to access its members
- For simple use cases, this is usually enough
- Trying to remember which index holds which value can lead to errors
- A `namedtuple` assigns names, as well as the numerical index, to each member
- Each kind of `namedtuple` is represented by its own class
  - Created by using the `namedtuple()` factory function
  - The arguments are the name of the new class and the field names, given either as a space- or comma-separated string or as a sequence of strings
- Think of `namedtuple` as a very quick way of creating a new object/class type with some attribute fields

In [11]:
from collections import namedtuple

# Construction: namedtuple("TypeName", "field1 field2 field3 ...")
# namedtuple() is a factory function, not a type, so it is not used as an annotation
Dog = namedtuple("Dog", "age breed name")

sam: Dog = Dog(age=2, breed="Lab", name="Sammy")
frank: Dog = Dog(age=3, breed="Shepherd", name="Frankie")

print(sam)
print(sam.age)
print(sam.breed)
print(sam[0])

Dog(age=2, breed='Lab', name='Sammy')
2
Lab
2
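
A `namedtuple` class also comes with a few helpers, and its instances behave like ordinary immutable tuples. The sketch below reuses the `Dog` example, this time passing the field names as a list; `older_sam` is a name introduced here for illustration.

```python
from collections import namedtuple

Dog = namedtuple("Dog", ["age", "breed", "name"])
sam = Dog(age=2, breed="Lab", name="Sammy")

age, breed, name = sam            # unpacks like a regular tuple
print(Dog._fields)                # ('age', 'breed', 'name')
print(sam._asdict())              # field names mapped to values

# Instances are immutable; _replace() returns a new instance instead.
older_sam = sam._replace(age=3)
print(older_sam)                  # Dog(age=3, breed='Lab', name='Sammy')

try:
    sam.age = 5                   # attributes cannot be reassigned
except AttributeError as exc:
    print("cannot assign:", exc)
```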
