# Introduction

This document covers Python's `tuple`, `namedtuple`, `dict`, `defaultdict` and `Counter`.

It is meant as a quick reminder of how to code in Python.

## Tuples

Python has a data structure for simple data points such as coordinates, called a `tuple`.

The following is a simple list of tuples:

In [1]:
letter = [(0, 'a'), (1, 'b'), (2, 'c')]
print(letter)

[(0, 'a'), (1, 'b'), (2, 'c')]
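
Since the elements of a tuple have fixed positions, they can be read by index or unpacked into separate names. The snippet below is a small sketch of both; the names `point`, `x_by_index`, `index` and `char` are made up for illustration.

```python
# A tuple groups a fixed number of values; here an (x, y) coordinate.
point = (3, 4)

# Read an element by position...
x_by_index = point[0]

# ...or unpack all elements into names in one go.
x, y = point
print(x, y)

# Unpacking also works in a for loop over a list of tuples.
letters = [(0, 'a'), (1, 'b'), (2, 'c')]
for index, char in letters:
    print(index, char)
```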


Now for something slightly more involved: we'll pair every letter of the alphabet with its index using `enumerate`.

`enumerate` loops over its argument and yields a tuple `(index, value)` for each item. In the next example it receives the string `"abcdefghijklmnopqrstuvwxyz"` and loops through every letter.

In [2]:
for index, letter in enumerate("abcdefghijklmnopqrstuvwxyz"):
    print(index, letter)

0 a
1 b
2 c
3 d
4 e
5 f
6 g
7 h
8 i
9 j
10 k
11 l
12 m
13 n
14 o
15 p
16 q
17 r
18 s
19 t
20 u
21 v
22 w
23 x
24 y
25 z
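
To see the tuples themselves rather than their unpacked parts, the result of `enumerate` can be collected into a list. This is a minimal sketch on a shorter string; `pairs` and `numbered` are illustrative names, and `start` is an optional argument of `enumerate`.

```python
# enumerate is lazy; list() collects the (index, value) tuples it yields.
pairs = list(enumerate("abc"))
print(pairs)      # [(0, 'a'), (1, 'b'), (2, 'c')]

# The optional start argument changes the first index (1 instead of 0).
numbered = list(enumerate("abc", start=1))
print(numbered)   # [(1, 'a'), (2, 'b'), (3, 'c')]
```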


**Notes:**
Tuples are immutable: once created, they cannot be modified, either by overwriting elements or by adding to them.


```
t = (1,2,3)
print(t[0]) # reading works
t[0] = 4    # this line fails, because you'd be writing to it
t.append(4) # this line fails, you cannot add to tuples after creating
```
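
The snippet above stops at the first failing line. A runnable variant, sketched here with `try`/`except` so both failures are visible, also shows the usual workaround of building a new tuple (`longer` is an illustrative name).

```python
t = (1, 2, 3)

try:
    t[0] = 4            # item assignment is not supported on tuples
except TypeError as e:
    print("TypeError:", e)

try:
    t.append(4)         # tuples have no append method
except AttributeError as e:
    print("AttributeError:", e)

# "Changing" a tuple means building a new one; t itself is untouched.
longer = t + (4,)
print(t, longer)
```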

You'll also often want to sort one thing based on another. Say you're dealing with persons and want to sort them by income; tuples come in quite handy here as well.

In [3]:
person_names =   ['Albert', 'Heijn', 'Stein']
person_incomes = [     20 ,     49 ,     10 ]

# combining the two lists can be done with the zip function
persons = zip(person_names, person_incomes) # yields ('Albert', 20), ... , ('Stein', 10)

# we want to sort based on the second element, the income
def give_second(person):
    return person[1]

sorted(persons, key=give_second, reverse=True) # reverse is to sort high to low, rather than the other way around

[('Heijn', 49), ('Albert', 20), ('Stein', 10)]
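
The key function does not need to be a named `def`. As a sketch of two common alternatives, the same sort can be written with a `lambda` or with `operator.itemgetter`; the variable names `by_income_lambda` and `by_income_getter` are illustrative.

```python
from operator import itemgetter

person_names = ['Albert', 'Heijn', 'Stein']
person_incomes = [20, 49, 10]

# zip returns an iterator that can be consumed only once,
# so store the pairs in a list before sorting them twice.
persons = list(zip(person_names, person_incomes))

# key as an inline function that picks the second element
by_income_lambda = sorted(persons, key=lambda person: person[1], reverse=True)

# itemgetter(1) builds the same "give me element 1" function
by_income_getter = sorted(persons, key=itemgetter(1), reverse=True)

print(by_income_lambda)
print(by_income_getter)
```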

## Named tuples

Accessing tuple elements by index (`t[0]`) is considered difficult to read, so `namedtuple` was created to help out (you need to import it from the `collections` module).

In essence it creates a class for you, based on a tuple.

In [4]:
from collections import namedtuple

# Create a new sort of tuple called a coordinate, consisting of an x, and a y
Coordinate = namedtuple('coordinate', ['x', 'y'])

# create a coordinate with a raw tuple
raw_coordinate = (1, 2)
proper_coordinate = Coordinate(1, 2)

# difficult to tell what argument is the x and what argument is the y
print(raw_coordinate)
print(raw_coordinate[0])   # this is the x
print(raw_coordinate[1])   # this is the y

# easier to tell the x and y apart
print(proper_coordinate)
print(proper_coordinate.x) # this is the x
print(proper_coordinate.y) # this is the y

(1, 2)
1
2
coordinate(x=1, y=2)
1
2


You could imagine that the following line:

    Coordinate = namedtuple('coordinate', ['x', 'y'])

translates roughly into the following code:
    
    class Coordinate:
        def __init__(self, x, y):
            self.x = x
            self.y = y


The main difference is that you cannot alter `x` or `y` after creating a `Coordinate` with a `namedtuple`. It also still behaves like a regular tuple, so indexing and unpacking keep working.
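
A short sketch of that behaviour, assuming the same `Coordinate` definition as above: assigning to a field fails, while `_replace` and `_asdict` (both part of the `namedtuple` API) give a modified copy and a dict view. The names `point` and `moved` are illustrative.

```python
from collections import namedtuple

Coordinate = namedtuple('coordinate', ['x', 'y'])
point = Coordinate(1, 2)

# Still a tuple: indexing, unpacking and comparing with plain tuples work.
x, y = point
print(point[0], x, point == (1, 2))

# Fields are read-only; assigning to one raises AttributeError.
try:
    point.x = 10
except AttributeError as e:
    print("AttributeError:", e)

# _replace returns a modified copy, _asdict converts the fields to a mapping.
moved = point._replace(x=10)
print(moved)
print(point._asdict())
```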

## `dict`

A very basic data type in Python is the `dict`, short for dictionary.

The idea behind it is to map one object to another. A simple example is an actual dictionary, which maps a word to its definition.

In Python terminology, a dict consists of keys (the words) and values (the descriptions). Values may be of any type; keys may be of any hashable type, such as strings, numbers or tuples.
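
One caveat about keys: they have to be hashable, which in practice rules out mutable containers such as lists. The sketch below uses a hypothetical `grid` dict to show a tuple key working and a list key failing.

```python
grid = {}

# Tuples of immutable values are hashable, so they work as keys.
grid[(0, 0)] = 'origin'
grid[(1, 2)] = 'somewhere else'
print(grid[(0, 0)])

# Lists are mutable and therefore unhashable; using one as a key fails.
try:
    grid[[0, 0]] = 'origin'
except TypeError as e:
    print("TypeError:", e)

# Values have no such restriction: a list is fine as a value.
grid[(5, 5)] = ['a', 'list', 'as', 'value']
```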

Going back to the dictionary example, imagine we're trying to build an Oxford `dict`. If we look up the word `"python"` in the Oxford dictionary, we get these entries:

- A large heavy-bodied non-venomous snake occurring throughout the Old World tropics, killing prey by constriction and asphyxiation.
- Computing [mass noun] A high-level general-purpose programming language.

Let's find out how we can make this, starting with an empty dictionary.

In [5]:
oxford = dict()  # long  form
oxford = {}      # short form

### Filling the `dict`

As mentioned before, a `dict` consists of keys and values.

You're mapping one thing to another.

Let's see an example of this:

In [6]:
oxford['python'] = 'A snake'

That's all there is to it. Now let's set some more entries:

In [7]:
# needs to be created first, if it didn't exist already
oxford = dict() 

# filling the oxford with
# 'a' -> 'Letter 1 of alphabet'
# ...
# 'z' -> 'Letter 26 of alphabet'
for i, l in enumerate("abcdefghijklmnopqrstuvwxyz"):
    oxford[l] = 'Letter %d of the alphabet' % (i + 1)

Since Python tries to be as short and readable as possible, you can also set key-value pairs while creating a dictionary.

In [8]:
# python's simple way of creating initialised dicts (stores 2 key,value pairs)
oxford = {
    'a' : 'First letter of alphabet',
    'z' : 'Last letter of alphabet'
}

# more complicated comprehension form, allows for 26 key value pairs to be created here
oxford = { l : 'Letter %d of the alphabet' % (i + 1) for i, l in enumerate("abcdefghijklmnopqrstuvwxyz") }

print(oxford['a'])

Letter 1 of the alphabet
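
When the keys and values already live in two separate lists, `dict` combined with `zip` is another way to build the mapping. This is a small sketch with made-up data; `words`, `definitions`, `oxford_from_lists` and `oxford_from_comprehension` are illustrative names.

```python
words = ['html', 'python', 'c']
definitions = ['A markup language', 'A snake', 'A letter']

# zip pairs the items up; dict turns the (key, value) tuples into a dict
oxford_from_lists = dict(zip(words, definitions))
print(oxford_from_lists['python'])

# the equivalent dict comprehension
oxford_from_comprehension = {w: d for w, d in zip(words, definitions)}
print(oxford_from_lists == oxford_from_comprehension)
```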


### Our oxford `dict`

Looking back, we saw that `"python"` can actually mean two things. Since we can only store one value in `oxford['python']`, we need to work around this.

We can do this by letting `oxford['python']` be a list rather than just a string:

In [9]:
oxford           = dict()
oxford['python'] = list()

# oxford['python'] is a list, so we can use the append function on it
oxford['python'].append('A large heavy-bodied non-venomous snake occurring throughout the Old World tropics.')
oxford['python'].append('Computing [mass noun] A high-level general-purpose programming language.')

# Let's see the definitions of python:
print("python:")
for definition in oxford['python']:
    print(" - ", definition)

python:
 -  A large heavy-bodied non-venomous snake occurring throughout the Old World tropics.
 -  Computing [mass noun] A high-level general-purpose programming language.


Looking up a dictionary entry that does not exist results in a `KeyError`:

In [10]:
# Wrapped in try ... except to show that an error is printed
try:
    oxford["key that wasn't added"]
except KeyError as e:
    print(repr(e))

KeyError("key that wasn't added",)
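
If a missing key is an expected situation rather than an error, `dict.get` avoids the exception altogether. A minimal sketch, using a small stand-in `oxford` dict:

```python
oxford = {'python': ['A snake']}

# get returns None for a missing key instead of raising KeyError
missing = oxford.get("key that wasn't added")
print(missing)

# a second argument is used as the fallback value
fallback = oxford.get("key that wasn't added", [])
print(fallback)

# for existing keys, get behaves like oxford[key]
print(oxford.get('python'))
```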


### Advanced features of dicts

At times we want to know what is in a dict, loop over its contents, and so on.

We'll discuss those operations here.

In [11]:
# create a larger dictionary:

oxford = {
    'html' : [
        'Hypertext Markup Language, system for tagging text files to display World Wide Web pages.'
    ],
    'python' : [
        'A large heavy-bodied non-venomous snake occurring throughout the Old World tropics.',
        'A high-level general-purpose programming language.'
    ],
    'c' : [
        'The third letter of the alphabet.',
        'The Roman numeral for 100.',
        'A computer programming language.'
    ]
}

In [12]:
# Check whether a key exists in a dict
print("key that wasn't added" in oxford)
print("python" in oxford)

False
True


In [13]:
# Loop over keys
for k in oxford.keys():
    print("-", k)

- python
- c
- html


In [14]:
# Looping over just the dictionary loops over the keys as well
for k in oxford:
    print("-", k)

- python
- c
- html


In [15]:
# Looping over values (note that you cannot easily get back to keys from here)
for value in oxford.values():
    print("- entry:")
    for description in value:
        print("    -", description)
    print("") # leave some space

- entry:
    - A large heavy-bodied non-venomous snake occurring throughout the Old World tropics.
    - A high-level general-purpose programming language.

- entry:
    - The third letter of the alphabet.
    - The Roman numeral for 100.
    - A computer programming language.

- entry:
    - Hypertext Markup Language, system for tagging text files to display World Wide Web pages.



In [16]:
# Looping over key value pairs
for key, value in oxford.items():
    print("- entry", key)
    for description in value:
        print("    -", description)
    print("") # leave some space

- entry python
    - A large heavy-bodied non-venomous snake occurring throughout the Old World tropics.
    - A high-level general-purpose programming language.

- entry c
    - The third letter of the alphabet.
    - The Roman numeral for 100.
    - A computer programming language.

- entry html
    - Hypertext Markup Language, system for tagging text files to display World Wide Web pages.



## `defaultdict` and `Counter`

**Note: these objects need to be imported from `collections`.**

At some point there was a need for two extra objects that work largely like a `dict`, with some modifications.

The `defaultdict` calls a default factory to create values for missing keys, and a `Counter`... well, it counts stuff.

We'll see some examples.

In our previous example with the oxford `dict`, every value was supposed to be a list. The disadvantage is that the list needs to exist before you can add a definition to it:

    # assumes we already made a list at oxford['Ruby']
    oxford['Ruby'].append('precious stone') 
    
    # creates a new list, possibly overwriting a previous one
    oxford['Ruby'] = [ 'precious stone' ]
    
    # overwrites a possible previous one and doesn't allow multiple definitions of 'Ruby'
    oxford['Ruby'] = 'precious stone'       
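
With a plain `dict`, the usual workarounds are to check for the key first or to use `dict.setdefault`. The sketch below shows both on a fresh `oxford` dict; the extra definitions are made up for illustration.

```python
oxford = {}

# Option 1: check for the key before appending
if 'Ruby' not in oxford:
    oxford['Ruby'] = []
oxford['Ruby'].append('precious stone')

# Option 2: setdefault returns the existing value,
# or stores the given default and returns that
oxford.setdefault('Ruby', []).append('a programming language')
oxford.setdefault('Perl', []).append('another programming language')

print(oxford)
```

Both work, but they have to be repeated at every place where a definition is added.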

To make this easier for Python developers, the `defaultdict` was created.
It lets you specify a default object for keys that are missing. For example:

In [17]:
from collections import defaultdict

# Every item that is requested but not yet set will be an empty list
oxford = defaultdict(list)

# add as if it was already in the dictionary
oxford['item that didnt yet exist'].append('a new item')

print(oxford['item that didnt yet exist'])
print(oxford['another item that didnt yet exist'])

['a new item']
[]


Essentially, it changes what happens when you read a key that isn't in the dictionary yet.

Usually that would result in an exception, as seen before, but a `defaultdict` calls the given function, stores the result under that key and returns it.

In [18]:
# This also works with other objects/functions etc.
have_i_visited = defaultdict(bool) # bool() returns False
print(have_i_visited['grandma'])   # not even, you monster

have_i_visited['grandma'] = True
print(have_i_visited['grandma'])   # better

False
True
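
One side effect worth knowing: indexing a missing key stores the default, while `in` and `get` do not. The sketch below uses a hypothetical `visits` dict with `int` as the factory, which also makes simple counting possible.

```python
from collections import defaultdict

visits = defaultdict(int)   # int() returns 0

# Membership tests and get() do not create entries
print('grandma' in visits, visits.get('grandma'), len(visits))

# Indexing a missing key calls int(), stores the result and returns it
first_read = visits['grandma']
print(first_read, len(visits))

# That makes counting a single line per item
for name in ['grandma', 'grandpa', 'grandma']:
    visits[name] += 1
print(dict(visits))
```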


### `Counter`

It takes... well, a number of objects and counts them.

The object then works as a dictionary, with the objects it received as keys and the number of times each was found as values:

In [19]:
from collections import Counter

to_count = ['a', 'a', 'b', 'a', 'a', 'a', 'b']
counter = Counter(to_count)

print("Letter - Frequency")
for key, value in counter.items():
    print(key,"      ",value)

Letter - Frequency
b        2
a        5
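
A `Counter` also differs from a plain `dict` when a key is missing: it reports a count of 0 instead of raising `KeyError`. A minimal sketch, counting the characters of a string with the same letters as above:

```python
from collections import Counter

counter = Counter("aabaaab")     # a string is counted character by character
print(counter['a'], counter['b'])

# Missing keys report a count of 0 instead of raising KeyError...
print(counter['z'])
# ...and, unlike with defaultdict, looking one up does not store it.
print('z' in counter)

# Counts can also be adjusted by hand
counter['b'] += 1
print(counter['b'])
```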


Additionally, it offers the following useful methods:

In [20]:
# elements()
# returns the input elements, each repeated as many times as it occurred
for word in counter.elements():
    print(word, end=' ')

b b a a a a a 

In [21]:
# most_common(n)
# returns the n most common elements, in order of most common to least.
for word, freq in counter.most_common(2):
    print(word, " - ", freq)

a  -  5
b  -  2


In [22]:
# update()
# takes another input and adds its counts to the words counted so far

# helper that yields all words in the alice.txt file
def words_alice():
    with open("alice.txt") as f:
        for line in f:
            for word in line.split():
                yield word

# updating our counter with all the words in Alice in Wonderland
counter.update(words_alice())


for word, freq in counter.most_common(10):
    # don't pay attention to the formatting below, it's just for alignment
    print("%-4s : %5d" % (word, freq))

the  :  1505
and  :   714
to   :   703
a    :   611
of   :   490
she  :   484
said :   416
it   :   346
in   :   344
was  :   328
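
Because `words_alice` splits on whitespace only, words like `The`, `the` and `the,` are counted separately. The sketch below shows one possible way to normalise words before counting. It uses a short made-up string instead of `alice.txt`, and `normalised_words` and `word_counts` are hypothetical names.

```python
from collections import Counter
import string

# Stand-in text; the example above reads alice.txt instead.
text = "The cat saw the dog. The dog saw the cat, and the cat ran."

def normalised_words(text):
    """Yield lowercased words with surrounding punctuation stripped."""
    for word in text.split():
        word = word.strip(string.punctuation).lower()
        if word:
            yield word

word_counts = Counter(normalised_words(text))
for word, freq in word_counts.most_common(3):
    print("%-4s : %5d" % (word, freq))
```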
