### Day 4 is centered around the [Collections module](https://docs.python.org/2/library/collections.html), a set of high-performance container data types



## 1. namedtuple

A namedtuple lets you create a tuple whose fields are accessed by name rather than by index, which makes the code much more readable.

In [34]:
from collections import defaultdict, namedtuple, Counter, deque
import random

In [3]:
User = namedtuple('User', 'name role')

---
Let's create a new User whose name is Bob and role is coder.

In [4]:
newUser = User(name='bob', role='coder')

---
Now we can access the fields by name. If we come back to this code down the road, we don't have to scan everything looking for index numbers!

In [5]:
newUser.name

'bob'

In [6]:
newUser.role

'coder'

In [7]:
f'{newUser.name} is a {newUser.role}!!!'

'bob is a coder!!!'
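A namedtuple is still a regular tuple underneath, so indexing and unpacking keep working. Here is a minimal sketch that reuses the User type from above; _replace and _asdict are standard namedtuple methods, and the 'lead' role is just an illustrative value.

```python
from collections import namedtuple

User = namedtuple('User', 'name role')
newUser = User(name='bob', role='coder')

# Still a tuple: index access and unpacking work as usual
first_field = newUser[0]
name, role = newUser

# Tuples are immutable, so _replace returns a new User instead of changing newUser
promoted = newUser._replace(role='lead')

# _asdict maps each field name to its value
as_dict = newUser._asdict()
```

This is handy when a function already expects a plain tuple: you can pass a namedtuple in without changing the caller.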

## 2. defaultdict

Let's start with a simple dictionary.

In [8]:
user = {'bob' : 'coder'}

In [9]:
user['bob']

'coder'

---
No issue with the key there...when we search for bob, we get coder as expected.  

What if we search for a key that doesn't exist? 

In [12]:
user['Julian']

KeyError: 'Julian'

---
The workaround to avoid the error is to use the get method, which returns None when the key is missing.

In [13]:
user.get('bob')

'coder'

In [14]:
user.get('julian')

In [15]:
user.get('julian') is None

True
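The get method also accepts a second argument that is returned instead of None when the key is missing. A small sketch, where 'unknown' is just a placeholder default chosen for illustration:

```python
user = {'bob': 'coder'}

# The second argument is the fallback returned for a missing key
role = user.get('julian', 'unknown')

# get never adds the missing key to the dictionary
still_missing = 'julian' not in user
```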

In [16]:
challenges_done = [('mike', 10), ('julian', 7), ('bob', 5),
                  ('mike', 11), ('julian', 8), ('bob', 6)]

In [18]:
challenges_done

[('mike', 10),
 ('julian', 7),
 ('bob', 5),
 ('mike', 11),
 ('julian', 8),
 ('bob', 6)]

---
When we use a regular dictionary to group the challenges by name, we get a KeyError: the key does not exist yet the first time we try to append to it.

In [19]:
challenges = {}

In [20]:
type(challenges)

dict

In [21]:
for name, challenge in challenges_done:
    challenges[name].append(challenge)

KeyError: 'mike'

---
Let's try that with a defaultdict object instead! 

In [22]:
challenges = defaultdict(list)
for name, challenge in challenges_done:
    challenges[name].append(challenge)

In [24]:
type(challenges)

collections.defaultdict

In [23]:
challenges

defaultdict(list, {'bob': [5, 6], 'julian': [7, 8], 'mike': [10, 11]})

The values are appended to each key without issue, because defaultdict creates an empty list the first time a key is accessed!
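To see what defaultdict is doing for us, it helps to compare it with the plain-dict equivalent using setdefault. The sketch below reuses the challenges_done data from above; the 'nobody' key is a made-up name that shows one caveat: merely reading a missing key from a defaultdict creates it.

```python
from collections import defaultdict

challenges_done = [('mike', 10), ('julian', 7), ('bob', 5),
                   ('mike', 11), ('julian', 8), ('bob', 6)]

# Plain dict: setdefault inserts an empty list the first time a name appears
plain = {}
for name, challenge in challenges_done:
    plain.setdefault(name, []).append(challenge)

# defaultdict: the factory (list) is called automatically for a missing key
challenges = defaultdict(list)
for name, challenge in challenges_done:
    challenges[name].append(challenge)

# Caveat: a simple lookup of a missing key also inserts it
_ = challenges['nobody']
key_was_created = 'nobody' in challenges
```

If you only want to check for a key without creating it, use the in operator or get, both of which leave the defaultdict unchanged.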

---
## 3. Counter

Need to perform some basic text analysis?  Counter to the rescue! 

In [25]:
words = """Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been 
the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and 
scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into 
electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of
Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus
PageMaker including versions of Lorem Ipsum""".split()

---
Now let's check the first 5 words.

In [29]:
words[:5]

['Lorem', 'Ipsum', 'is', 'simply', 'dummy']

In [30]:
Counter(words).most_common(5)

[('the', 6), ('Lorem', 4), ('Ipsum', 4), ('of', 4), ('and', 3)]

Seriously?!?! 1 line of code?  Nice, Python...nice!
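One thing to keep in mind: Counter treats 'The' and 'the', or 'text' and 'text,', as different words. A minimal sketch of normalizing the words first, using a short made-up sentence rather than the full passage above:

```python
from collections import Counter
import string

# Hypothetical sample text, chosen to show case and punctuation differences
text = "The industry standard. the industry's text, THE text"

# Lowercase each word and strip punctuation from both ends
cleaned = [w.strip(string.punctuation).lower() for w in text.split()]

counts = Counter(cleaned)
top = counts.most_common(1)

# A Counter returns 0 for a word it has never seen, instead of raising KeyError
missing = counts['python']
```

Note that strip only removes punctuation at the ends of a word, so the apostrophe inside "industry's" is kept.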

---
## 4. deque

When you have a lot of inserts and deletes near the ends of a list, use a deque instead for superior performance!

Let's create a list and a deque with 10 million integers and compare the performance of inserts and deletes near the front.

In [31]:
lst = list(range(10000000))
deq = deque(range(10000000))

In [32]:
def insert_and_delete(ds):
    # Remove and re-insert 10 random values near the front of the sequence
    for _ in range(10):
        index = random.randrange(100)
        ds.remove(index)
        ds.insert(index, index)

In [35]:
%timeit insert_and_delete(lst)

113 ms ± 1.73 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [36]:
%timeit insert_and_delete(deq)

21.1 µs ± 199 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


---
Even on a simple example such as this, the performance gain is mind-blowing: 113 ms for the list versus 21.1 µs for the deque, roughly 5,000 times faster!
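The speedup comes from deque being optimized for adding and removing items at either end, whereas a list has to shift every following element when you insert or delete near the front. Access in the middle of a deque is slower, so it is not a drop-in replacement for every list. A short sketch of the end-oriented methods, plus the optional maxlen argument (the values here are arbitrary):

```python
from collections import deque

d = deque([1, 2, 3])
d.appendleft(0)     # add on the left end
d.append(4)         # add on the right end
left = d.popleft()  # remove from the left end
right = d.pop()     # remove from the right end

# With maxlen set, appending to a full deque drops items from the opposite end
recent = deque(maxlen=3)
for i in range(5):
    recent.append(i)
```

A bounded deque like recent is a simple way to keep only the last few items of a stream.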