#### Walkthrough of the IPython notebook on GitHub

In [1]:
from collections import Counter, defaultdict, namedtuple, deque
import csv
import random
from urllib.request import urlretrieve

From the Python docs:

> collections.namedtuple(typename, field_names, *, rename=False, defaults=None, module=None)

>Returns a new tuple subclass named typename. The new subclass is used to create tuple-like objects that have fields accessible by attribute lookup as well as being indexable and iterable. Instances of the subclass also have a helpful docstring (with typename and field_names) and a helpful __repr__() method which lists the tuple contents in a name=value format.

>The field_names are a sequence of strings such as ['x', 'y']. Alternatively, field_names can be a single string with each fieldname separated by whitespace and/or commas, for example 'x y' or 'x, y'.
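A minimal sketch of the `field_names` forms described in the docs quote above. `Point` and the `PointA` to `PointD` variable names are hypothetical and only here for illustration; `defaults` needs Python 3.7 or later.

```python
from collections import namedtuple

# Three equivalent ways to spell the field names
PointA = namedtuple('Point', ['x', 'y'])
PointB = namedtuple('Point', 'x y')
PointC = namedtuple('Point', 'x, y')

print(PointA._fields, PointB._fields, PointC._fields)

# defaults are applied to the rightmost fields, so only y gets one here
PointD = namedtuple('Point', 'x y', defaults=[0])
print(PointD(1))  # Point(x=1, y=0)
```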

regular tuple:

In [3]:
user = ('bob', 'coder')
f'user {user[0]} is a {user[1]}'

'user bob is a coder'

named tuple:

In [6]:
User = namedtuple('User', 'name role')
User

__main__.User

In [7]:
user = User(name = 'bob', role = 'coder')
user

User(name='bob', role='coder')

In [8]:
f'user {user.name} is a {user.role}'

'user bob is a coder'
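A named tuple is still a tuple, so the positional style keeps working alongside attribute access. A short sketch reusing the `User` type from above (`promoted` is just an illustrative name):

```python
from collections import namedtuple

User = namedtuple('User', 'name role')
user = User(name='bob', role='coder')

# Indexing and unpacking work exactly as with a regular tuple
name, role = user
print(user[0], role)  # bob coder

# Fields cannot be reassigned; _replace returns a modified copy instead
promoted = user._replace(role='lead')
print(promoted)       # User(name='bob', role='lead')

# _asdict maps field names to values
print(user._asdict())
```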

Avoiding a `KeyError` with `defaultdict`

In [9]:
users = {'bob':'coder'}

In [10]:
users['bob']

'coder'

In [11]:
users['julian']

KeyError: 'julian'

Two ways around the `KeyError`: `get` and `setdefault`. `setdefault` adds the missing key to the dict, `get` does not.

In [12]:
users.setdefault('julian', None)

In [13]:
users

{'bob': 'coder', 'julian': None}

In [14]:
users.get('bob')

'coder'

In [15]:
users.get('mike')

In [16]:
users

{'bob': 'coder', 'julian': None}
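Both methods also accept a fallback value, which the cells above leave at `None`. A small sketch of the difference, starting from a fresh `users` dict (the fallback strings are made up for illustration):

```python
users = {'bob': 'coder'}

# get returns the fallback without touching the dict
role = users.get('mike', 'unknown')
print(role, 'mike' in users)                 # unknown False

# setdefault returns the existing value if the key is present...
print(users.setdefault('bob', 'tester'))     # coder
# ...otherwise it stores the fallback and returns it
print(users.setdefault('julian', 'writer'))  # writer
print(users)  # {'bob': 'coder', 'julian': 'writer'}
```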

In [17]:
challenges_done = [('mike', 10), ('julian', 7), ('bob', 5),
                   ('mike', 11), ('julian', 8), ('bob', 6)]
challenges_done

[('mike', 10),
 ('julian', 7),
 ('bob', 5),
 ('mike', 11),
 ('julian', 8),
 ('bob', 6)]

This won't work, because `challenges[name]` does not exist yet the first time each name is seen:

In [18]:
challenges = {}
for name, challenge in challenges_done:
    challenges[name].append(challenge)

KeyError: 'mike'

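Try this instead: `setdefault` creates the empty list on first use and returns it, so the `append` always has something to work on.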

In [22]:
challenges = {}
for name, challenge in challenges_done:
    challenges.setdefault(name, []).append(challenge)

In [23]:
challenges

{'mike': [10, 11], 'julian': [7, 8], 'bob': [5, 6]}

Or use `defaultdict`, which calls `list()` to create the value for any missing key:

In [24]:
challenges_dd = defaultdict(list)
for name, challenge in challenges_done:
    challenges_dd[name].append(challenge)

challenges_dd

defaultdict(list, {'mike': [10, 11], 'julian': [7, 8], 'bob': [5, 6]})
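One thing worth knowing about `defaultdict`: merely indexing a missing key inserts it. A short sketch (the name `'sara'` is hypothetical):

```python
from collections import defaultdict

challenges_dd = defaultdict(list)
challenges_dd['mike'].append(10)

# Membership tests and get do not create entries
print('sara' in challenges_dd)    # False
print(challenges_dd.get('sara'))  # None

# Indexing a missing key does: it calls list() and stores the result
challenges_dd['sara']
print(dict(challenges_dd))        # {'mike': [10], 'sara': []}
```

If that side effect is unwanted, `in` or `get` are the safer ways to check for a key.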

Checking out `Counter`

In [25]:
words = """Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been 
the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and 
scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into 
electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of
Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus
PageMaker including versions of Lorem Ipsum""".split()
words[:5]

['Lorem', 'Ipsum', 'is', 'simply', 'dummy']

In [31]:
common_words = {}
for word in words:
    if word not in common_words:
        common_words[word]  = 1
    else:
        common_words[word] += 1

# iterating over .items() yields (key, value) tuples
# sorted() returns a list, which is why you can slice it with [:5]
for k, v in sorted(common_words.items(), key=lambda x: x[1], reverse=True)[:5]:
    print(k, v)

the 6
Lorem 4
Ipsum 4
of 4
and 3


In [32]:
Counter(words).most_common(5)

[('the', 6), ('Lorem', 4), ('Ipsum', 4), ('of', 4), ('and', 3)]
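`Counter` is a `dict` subclass, so it can be indexed and updated like the hand-rolled `common_words` dict, minus the `KeyError` handling. A minimal sketch on a made-up sentence:

```python
from collections import Counter

words = 'the cat and the hat and the bat'.split()
counts = Counter(words)

print(counts['the'])    # 3
print(counts['dog'])    # 0, missing keys count as zero...
print('dog' in counts)  # False, ...and are not added by the lookup

# update adds to the existing counts instead of replacing them
counts.update(['and', 'and'])
print(counts.most_common(2))  # [('and', 4), ('the', 3)]
```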

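The `deque` example with `%timeit` is interesting. Deques are supposed to be fast for appends and pops at both ends, but I thought operations in the middle were still slow, so I was surprised by how much faster the deque was. Is it just because the example only picks indexes at the beginning (the first 100 positions out of 10,000,000)? Note that `ds.remove(index)` removes by value, not by position; it lines up here only because both containers hold `range(10000000)`, so the value `i` sits at index `i`.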

In [41]:
lst = list(range(10000000))
deq = deque(range(10000000))

In [35]:
def insert_and_delete(ds):
    for _ in range(10):
        index = random.choice(range(100))
        ds.remove(index)
        ds.insert(index, index)

%timeit insert_and_delete(lst)

%timeit insert_and_delete(deq)

165 ms ± 5.37 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
25.3 µs ± 259 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [40]:
indexes = [random.choice(range(100000)) for _ in range(10)]
indexes

[30114, 82961, 48537, 51564, 29720, 95811, 73864, 90753, 80297, 57272]

In [42]:
def diff_insert_and_delete(ds):
    for index in indexes:
        ds.remove(index)
        ds.insert(index, index)
        

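%timeit diff_insert_and_delete(lst)

%timeit diff_insert_and_delete(deq)

When I ran this cell it called `insert_and_delete` instead of `diff_insert_and_delete` by mistake, so the timings I recorded just repeat the first test:

160 ms ± 1.64 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
25 µs ± 141 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)


So the deque is still faster here, but these numbers don't settle the question. The indexes also come from `range(100000)`, which is only the first 1% of the 10,000,000 elements rather than the whole range, so even the corrected call stays close to the front, where the deque is expected to do well.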
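To compare a position near the front with one in the middle, here is a rough sketch using the standard `timeit` module rather than `%timeit`, so it also runs outside a notebook. The helper name `remove_and_reinsert`, the size `N`, and the two positions are my own choices for illustration. The `deque` docs say indexed access is O(1) at both ends but slows to O(n) in the middle, so I would expect the deque's lead to shrink or disappear for the middle position, but the actual numbers depend on the machine and Python version.

```python
from collections import deque
from timeit import timeit

N = 200_000  # much smaller than the 10,000,000 used above, so this runs quickly

def remove_and_reinsert(ds, position):
    # ds holds range(N), so the value stored at `position` is `position` itself;
    # remove() searches by value, insert() takes an index
    ds.remove(position)
    ds.insert(position, position)

for position in (50, N // 2):
    lst = list(range(N))
    deq = deque(range(N))
    t_list = timeit(lambda: remove_and_reinsert(lst, position), number=20)
    t_deque = timeit(lambda: remove_and_reinsert(deq, position), number=20)
    print(f'position {position}: list {t_list:.5f}s, deque {t_deque:.5f}s')
```

Each call puts the removed value straight back, so both containers still hold `range(N)` after timing.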