# Table of Contents
* [Learning Objectives:](#Learning-Objectives:)
* [Data Types from the Python Standard Library](#Data-Types-from-the-Python-Standard-Library)
	* [`collections` module](#collections-module)
		* [`collections.namedtuple`](#collections.namedtuple)
		* [`collections.OrderedDict`](#collections.OrderedDict)
	* [`collections.Counter`](#collections.Counter)


# Learning Objectives:

After completion of this module, learners should be able to:

* import the collections module
* understand and use the `namedtuple`, `OrderedDict`, and `Counter` collection types

# Data Types from the Python Standard Library

Python and its standard library offer many more data types and classes than we can study in detail here (e.g., `bytes`, `bytearray`, iterators, etc.). This module focuses on one part of the standard library:
* `collections`: a module extending the standard built-in data collections (`list`, `dict`, etc.)

## `collections` module

The `collections` module extends the base Python data collection types with a few useful variations. Three particular extensions are:

* `collections.namedtuple`: a function for creating `tuple`-type objects with named fields
* `collections.OrderedDict`: a subclass of `dict` objects with ordered keys
* `collections.Counter`: a subclass of `dict` that works like a "bag" or "multiset" (as the name indicates, it is good for counting things)

More details are available from the [`collections` module documentation](https://docs.python.org/3/library/collections.html).

Other useful collection types include `collections.deque` and `collections.defaultdict`, as well as `queue.Queue`, `queue.LifoQueue`, and `queue.PriorityQueue` from the separate `queue` module.
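
As a brief illustration of two of the types just mentioned, here is a minimal sketch of `defaultdict` and `deque`. The word list and the variable names (`words_by_letter`, `recent`) are made up for this example.

```python
from collections import defaultdict, deque

# defaultdict: a missing key is created on first access using the
# factory passed to the constructor (here, list() gives an empty list)
words_by_letter = defaultdict(list)
for word in ["apple", "avocado", "banana", "cherry", "blueberry"]:
    words_by_letter[word[0]].append(word)
print(dict(words_by_letter))

# deque: a double-ended queue with fast appends and pops at both ends.
# With maxlen set, adding to a full deque discards an item from the other end.
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)
print(recent)            # only the last three items remain: 2, 3, 4

recent.appendleft(99)    # the item on the right end (4) is discarded
print(list(recent))      # [99, 2, 3]
```

A `defaultdict` avoids the "check whether the key exists, then create it" pattern, and a bounded `deque` is a convenient way to keep only the most recent items.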

### `collections.namedtuple`

A `namedtuple` is often worth considering when you want clean, self-documenting code. Instances carry no per-instance dictionary, so they need no more memory than a regular `tuple`, yet each index position in the tuple has a *name*. This documents our intent better than bare positional indexing does.
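
The properties described above can be checked directly. The following is a small sketch using a hypothetical `Point` type (not used elsewhere in this module): it shows that a `namedtuple` instance is a genuine `tuple`, that the generated class declares empty `__slots__` (so there is no per-instance `__dict__`), and that instances are immutable.

```python
from collections import namedtuple

# "Point" is a hypothetical type used only for this illustration
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

print(isinstance(p, tuple))  # True: a namedtuple instance is a real tuple
print(p == (1, 2))           # True: it compares equal to a plain tuple
print(Point.__slots__)       # (): empty __slots__, so no per-instance __dict__
print(Point._fields)         # ('x', 'y'): field names in positional order

# Like any tuple, a namedtuple is immutable: assignment to a field fails
try:
    p.x = 10
except AttributeError as err:
    print("cannot assign:", err)

# _replace returns a *new* instance with some fields changed
moved = p._replace(x=10)
print(p, moved)

# _asdict converts the instance to a mapping of field name -> value
print(moved._asdict())
```

Because instances are immutable, "updating" a record means building a new one with `_replace`; the original is left untouched.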

In [None]:
from collections import namedtuple

In [None]:
# We declare a new data type with the identifier "Account"
# As well as indexing tuple entries by position, we can use the labels
# "accountID", "firstname", & "lastname" to retrieve entries from a namedtuple
account_fields = ["accountID", "firstname", "lastname"]
Account = namedtuple('Account', account_fields)
newton = Account('123456789', 'Isaac', 'Newton')
leibnitz = Account('987654321', 'Gottfried', 'Leibnitz')
print(newton)
print(leibnitz)

In [None]:
print(leibnitz[1])
print(leibnitz.firstname) # Same as above
print(newton.accountID)

In [None]:
import datetime as dt # We'll use datetime to represent dates

# We declare a new data type with the identifier "Stock"
# As well as indexing tuple entries by position, we can use the labels
# "symbol", "shares", "price", & "acquired" to retrieve entries from a namedtuple
# Space separated field names
Stock = namedtuple("Stock", "symbol shares price acquired") 

In [None]:
# Having defined the namedtuple data-type Stock, we create a value of type Stock
goog = Stock('GOOG', 100, 538.22, dt.date(2015, 1, 15))
print(goog)
print(goog[2])     # We can extract values from the namedtuple using tuple position
print(goog.price)  # ... or we can use an attribute with the appropriate name.

In [None]:
ibm = Stock('IBM', 500, 172.68, dt.date(1952, 6, 1))
aapl = Stock('AAPL', 250, 127.62, dt.date(1999, 3, 14))
print(ibm)
print(aapl)
print(ibm.symbol)

In [None]:
mystocks = [goog, ibm, aapl] # Construct a list of the stocks
mystocks

In [None]:
# Compute the total asset value with an explicit, readable loop
asset_value = 0
for stock in mystocks:
    asset_value += stock.shares * stock.price
print(asset_value)

In [None]:
# The same total as in the loop, written as a single generator expression
sum(stock.shares * stock.price for stock in mystocks)

### `collections.OrderedDict`

Before Python 3.7, generic `dict` objects did not guarantee any particular key order; the ordering was implementation-dependent and could vary. Since Python 3.7, the language guarantees that a plain `dict` preserves insertion order. The `OrderedDict` from the `collections` module is a `dict` subclass that has always retained the insertion order of its keys. It remains useful when the ordering is part of the meaning of the data (for example, when looping over the keys): equality comparison between two `OrderedDict` objects is order-sensitive, and methods such as `move_to_end` allow the keys to be rearranged.
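
To make the role of ordering concrete, here is a minimal sketch of behavior specific to `OrderedDict`. The keys and values are arbitrary placeholders chosen for this example.

```python
from collections import OrderedDict

# Same key-value pairs, inserted in a different order
a = OrderedDict([("x", 1), ("y", 2)])
b = OrderedDict([("y", 2), ("x", 1)])

# Plain dicts ignore order when compared; OrderedDicts do not
same_as_dicts = dict(a) == dict(b)
same_as_ordered = a == b
print(same_as_dicts)     # True
print(same_as_ordered)   # False

# move_to_end moves an existing key to the end of the ordering
a.move_to_end("x")
print(list(a.keys()))    # ['y', 'x']

# popitem(last=False) removes and returns the *first* item
first = a.popitem(last=False)
print(first)             # ('y', 2)
print(list(a.items()))   # [('x', 1)]
```

If only insertion order matters and the code runs on Python 3.7 or later, a plain `dict` is usually sufficient; `OrderedDict` is the better fit when order must affect equality or be changed after insertion.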

In [None]:
from collections import OrderedDict

# Define a few key-value pairs as a list of tuples
key_value_pairs = [('broker','Roberto Cruz'), 
                   ('price',521.78),
                   ('shares',100), 
                   ('symbol','GOOG')]
plain_dict   = dict(key_value_pairs)
ordered_dict = OrderedDict(key_value_pairs)

In [None]:
print(list(plain_dict.keys()))  # Insertion order on Python 3.7+; no guarantee on older versions
print(plain_dict)

In [None]:
print(list(ordered_dict.keys()))   # Keys in specific order of insertion
print(ordered_dict)

In [None]:
ordered_dict['symbol']

In [None]:
ordered_dict['location'] = "Mountain View"

In [None]:
ordered_dict

In [None]:
del ordered_dict['broker']
ordered_dict

## `collections.Counter`

In [None]:
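from collections import Counter
# Passing an iterable (here, a string) counts how often each element occurs
c = Counter('abracadabra')
# most_common() lists (element, count) pairs from most to least frequent
c.most_common()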

In [None]:
sorted(c)

In [None]:
c['r']

In [None]:
c['r'] += 4
c['r']

In [None]:
c['x']  # A missing key returns a count of 0 instead of raising KeyError

In [None]:
c.update("abracadabra")
c

In [None]:
from random import randint
nums = [randint(1,9) for _ in range(100)]
numcount = Counter(nums)
numcount.most_common()

In [None]:
numcount

In [None]:
numcount.most_common(3)

In [None]:
numcount.subtract([7,7,7,7])  # Lower the count of 7 by four; counts may reach zero or go negative
numcount.most_common()

In [None]:
numcount[8] -= 2
numcount.most_common()
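
Since a `Counter` behaves like a multiset, it also supports multiset arithmetic through the `+`, `-`, `&`, and `|` operators. The sketch below uses two small, made-up counters (`left` and `right`) to show how these differ from the in-place `subtract` method used above.

```python
from collections import Counter

left = Counter("aab")     # a: 2, b: 1
right = Counter("abbc")   # a: 1, b: 2, c: 1

total = left + right      # add counts:        a: 3, b: 3, c: 1
diff = left - right       # subtract counts and keep only positive results: a: 1
common = left & right     # minimum of counts: a: 1, b: 1
union = left | right      # maximum of counts: a: 2, b: 2, c: 1
print(total, diff, common, union, sep="\n")

# In contrast to the - operator, subtract() works in place and
# keeps zero and negative counts
kept = Counter("aab")
kept.subtract(right)
print(kept["b"], kept["c"])      # -1 -1

# elements() repeats each element as many times as its (positive) count
print(sorted(left.elements()))   # ['a', 'a', 'b']
```

The operators always return a new `Counter` and drop entries whose count is zero or negative, which makes them convenient for set-like comparisons of two tallies.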