# Introduction to Python - Day 04 (22 Sept 2017)

### Recap

+ Introduction to composite data types, a.k.a data structures (lists)
+ Iteration (repetitive execution) - another form of program control flow (for-loop patterns)
+ Running programs in debug mode (for debugging and exploring the dynamic code execution environment)

---
# Dictionaries (a.k.a. HashMap/HashTable)

+ Consist of a **set of mappings** between _**<font color='blue'>unique</font>**_ **keys** and their **values**.

#### Basic syntax:
{key1: value1, key2: value2, ...}
   
```python
# Example:
genetic_code = {'uuu': 'phe', 'uua': 'leu', 'aug': 'met', 'uaa': 'stop'}
```

**Comparison with Lists**
+ Lists are ordered: the order in which elements are added is the order in which they are stored
    + Access by position/index
        + Ex. letters = ['a', 'b', 'c', 'd', 'e', 'f']
        + letters[0] is 'a' etc.
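+ Dictionaries have no positional index (since Python 3.7 they do remember insertion order, but items are still looked up by key, never by position)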
    + Access by key
        + Ex. dict_ = {'key1': 'a', 'key2': 'b', 'key3': 'c'}
        + dict_['key1'] is 'a'
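
A small sketch of the two access styles, reusing the examples from the comparison above. The `try`/`except` part is an added illustration of what happens when a dictionary is indexed with something that is not one of its keys.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']
dict_ = {'key1': 'a', 'key2': 'b', 'key3': 'c'}

print(letters[0])        # lists are indexed by position
print(dict_['key1'])     # dictionaries are indexed by key

# dict_[0] does not mean "the first item": it looks for a key equal to 0
try:
    dict_[0]
except KeyError as err:
    print('KeyError:', err)
```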

The association between a **key** and a **value** is often referred to as a **key**-**value** pair or sometimes an **item**.


#### Keys

+ must be hashable, which in practice means immutable (string, integer, float, or a tuple whose elements are themselves hashable)
+ must be unique


#### Values
+ Can be of any type, mutable or immutable, simple or composite (arbitrarily complex, heterogeneous)
    + simple values (one-character string: 'a', integer: 0, float: 3.4)
    + collection types (string: 'asd', list: [0, 1, 2], tuple: (0, 1, 2), another dictionary: {'key': 'value'})
    + user-defined types (discussed later) (functions, classes, objects etc.)


### Some Real World Examples

+ {**&lt;gene_id&gt;**: **&lt;**gene sequence**&gt;**, ...}
+ {**&lt;email&gt;**: **&lt;**user data**&gt;**, ...}
+ {**&lt;soc security&gt;**: **&lt;**individual**&gt;**, ...}
+ {**&lt;emp id&gt;**: **&lt;**emp data**&gt;**, ...}

## Operations

```python
help(dict)
```

+ Create
+ Access keys, values or (key, value) pairs / items
+ Modify items
+ Check membership of a key
+ Traverse through the dictionary and do something
+ Make it bigger / smaller (add and remove items)
+ …


## Creating a dictionary
There are several ways to create a dictionary:

```python
dict_x = {'a': 1, 'b': 2}      # initialize by assignment
dict_y = dict(a=1, b=2)        # use dict built-in function
print(dict_x, dict_y)
```
+ **keys** = 'a', 'b'
+ **values** = 1, 2
+ **items** = ('a', 1), ('b', 2)

+ Access by key:
```python
print("The value for key '{}' is {}".format('a', dict_x['a']))
```

#### Dictionaries can also be built incrementally - see example later

In [2]:
dict_x = {'a': 1, 'b': 2}      # initialize by assignment
dict_y = dict(a=1, b=2)        # use dict built-in function
print(dict_x, dict_y)
print("The value for key '{}' is {}".format('a', dict_x['a']))

{'a': 1, 'b': 2} {'a': 1, 'b': 2}
The value for key 'a' is 1
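
Two further ways to build the same dictionary, sketched here as an addition to the cell above: `dict()` also accepts a sequence of (key, value) pairs, and `zip` can pair up two parallel lists. The variable names are hypothetical.

```python
pairs = [('a', 1), ('b', 2)]
dict_from_pairs = dict(pairs)            # from a list of (key, value) tuples

keys = ['a', 'b']
values = [1, 2]
dict_from_zip = dict(zip(keys, values))  # from two parallel lists

empty = {}                               # empty dictionary, ready to be filled incrementally
print(dict_from_pairs, dict_from_zip, empty)
```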


### Example - Lookup Table

```python
elements = {'H': 'hydrogen',   'He': 'helium', 
            'Li': 'lithium',  'C': 'carbon', 
            'O': 'oxygen',  'N': 'nitrogen'}
complement = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
print('H', '->', elements['H'])
print('A', '->', complement['A'])
```

### Example - Database Records
```python
person = {'name': 'John', 
          'surname': 'Grisham', 
          'contact': 
              {
              'phone': {'office': '123-456-7890',
                        'cell': '456-789-0123'
                       },
              'email': ['johnny@gmail.com', 'john.grisham@writers.com']
              }
          }
print(person['name'])
print(person['contact'])
print(person['contact']['phone'])
print(person['contact']['email'])
```

In [None]:
{"1": {fn: , ln: ..}, "2": {fn: , ln: }}
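
The scratch cell above hints at a table of records keyed by an ID. A hedged sketch of that idea follows; it assumes `fn` and `ln` stand for first name and last name, and the records themselves are invented for illustration.

```python
# Hypothetical records keyed by an ID string; 'fn' and 'ln' are assumed to
# mean first name and last name.
people = {
    '1': {'fn': 'John', 'ln': 'Grisham'},
    '2': {'fn': 'Jane', 'ln': 'Doe'},
}

print(people['1']['fn'])            # one field of one record

for person_id in people:            # looping over a dict yields its keys
    record = people[person_id]
    print(person_id, record['fn'], record['ln'])
```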

# Iterating over a dictionary

### Pattern 1: &lt;dict&gt;.keys()
<font color='blue'>**Note:**</font> &lt;dict&gt; is a placeholder for a dictionary object

```python
my_dict = {0: 'a', 1:'b', 2: 'c', 3: 'd'}
print('dict_keys lazy obj: ', my_dict.keys())                       # view object (no copy of the keys is made)
print('dict_keys unpacked: ', list(my_dict.keys()))                 # explicit conversion to a list
print('Inside for loop:')
for key in my_dict.keys():               # the for loop iterates over the view directly
    print('key: ', key)
```

In [12]:
my_dict = {0: 'a', 1:'b', 2: 'c', 3: 'd'}
print(my_dict)
print('dict_keys lazy obj: ', my_dict.keys())
print('dict_keys unpacked: ', list(my_dict.keys()))

print('Inside for loop:')
for key in my_dict.keys():               # for loop unpacks the lazy object internally
    print('key: ', key)

{0: 'a', 1: 'b', 2: 'c', 3: 'd'}
dict_keys lazy obj:  dict_keys([0, 1, 2, 3])
dict_keys unpacked:  [0, 1, 2, 3]
Inside for loop:
key:  0
key:  1
key:  2
key:  3
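
The object returned by `.keys()` is a view onto the dictionary, so it reflects later changes, while `list(...)` takes a snapshot. A short sketch of the difference, using a smaller dictionary than the one above:

```python
my_dict = {0: 'a', 1: 'b'}
keys_view = my_dict.keys()             # view, tied to my_dict
keys_snapshot = list(my_dict.keys())   # independent copy

my_dict[2] = 'c'                       # add a new item afterwards

print(list(keys_view))       # [0, 1, 2]
print(keys_snapshot)         # [0, 1]
```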


### Pattern 2: &lt;dict&gt;.values()


```python
my_dict = {0: 'a', 1:'b', 2: 'c', 3: 'd'}
print('dict_values lazy obj: ', my_dict.values())                    # lazy object
print('dict_values unpacked: ', list(my_dict.values()))
print('Inside for loop')
for value in my_dict.values():
    print('value: ', value)
```

In [13]:
my_dict = {0: 'a', 1:'b', 2: 'c', 3: 'd'}
print('dict_values lazy obj: ', my_dict.values())                    # lazy object
print('dict_values unpacked: ', list(my_dict.values()))
print('Inside for loop')
for value in my_dict.values():
    print('value: ', value)

dict_values lazy obj:  dict_values(['a', 'b', 'c', 'd'])
dict_values unpacked:  ['a', 'b', 'c', 'd']
Inside for loop
value:  a
value:  b
value:  c
value:  d


### Pattern 3: &lt;dict&gt;.items()

```python
my_dict = {0: 'a', 1:'b', 2: 'c', 3: 'd'}
print('dict_items lazy obj: ', my_dict.items())                     # lazy object
print('dict_items unpacked: ', list(my_dict.items()))
print('Inside for loop: ')
for item in my_dict.items():
    print(item)
    # print('item: {}, key: {}, value: {}'.format(item, item[0], item[1]))
```

In [15]:
my_dict = {0: 'a', 1:'b', 2: 'c', 3: 'd'}
print('dict_items lazy obj: ', my_dict.items())                     # lazy object
print('dict_items unpacked: ', list(my_dict.items()))
print('Inside for loop: ')
for item in my_dict.items():
    print('item: {}, key: {}, value: {}'.format(item, item[0], item[1]))

dict_items lazy obj:  dict_items([(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')])
dict_items unpacked:  [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')]
Inside for loop: 
item: (0, 'a'), key: 0, value: a
item: (1, 'b'), key: 1, value: b
item: (2, 'c'), key: 2, value: c
item: (3, 'd'), key: 3, value: d


#### <font color='blue' size=3>Sidebar: List/Tuple unpacking</font> 
If the number of variables matches the number of elements in a sequence, Python assigns each element to the corresponding variable
```python
val1, val2 = [1, 2]
print(val1, val2)
```
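If there are more or fewer elements than variables, Python raises a ValueError
```python
val1, val2 = [1]          # ValueError: not enough values to unpack (expected 2, got 1)
val1, val2 = [1, 2, 3]    # ValueError: too many values to unpack (expected 2)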
```

In [18]:
val1, val2 = [1]
val1, val2 = [1, 2, 3]

ValueError: not enough values to unpack (expected 2, got 1)
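
Only the first error shows up in the cell above because execution stops at the first failing line. A sketch that tries all three cases in turn, catching the error so the loop can continue (the `outcomes` list is a hypothetical helper for recording what happened):

```python
outcomes = []
for values in ([1, 2], [1], [1, 2, 3]):
    try:
        val1, val2 = values              # succeeds only for exactly two elements
        outcomes.append('ok')
        print(values, '->', val1, val2)
    except ValueError as err:
        outcomes.append('ValueError')
        print(values, '-> ValueError:', err)

print(outcomes)
```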

### Pattern 4: Item split into key/value
```python
my_dict = {0: 'a', 1:'b', 2: 'c', 3: 'd'}
print('dict_items lazy obj:', my_dict.items())
print('Inside for loop:')
for key, value in my_dict.items():
    print("key: {}, value: {}".format(key, value))
```

In [22]:
my_dict = {0: 'a', 1:'b', 2: 'c', 3: 'd'}
print('dict_items lazy obj:', my_dict.items())
print('Inside for loop:')
for key, value in my_dict.items():
    print("key: {}, value: {}".format(key, value))

dict_items lazy obj: dict_items([(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd')])
Inside for loop:
key: 0, value: a
key: 1, value: b
key: 2, value: c
key: 3, value: d


### Membership - check if a key exists in a dictionary

```python
some_dict = {'a': 0, 'b': 1, 'c': 2}
print("our dict: ", some_dict)
print("a in our dict: ", 'a' in some_dict)
```

In [23]:
some_dict = {'a': 0, 'b': 1, 'c': 2}
print("our dict: ", some_dict)
print("a in our dict: ", 'a' in some_dict)

our dict:  {'a': 0, 'b': 1, 'c': 2}
a in our dict:  True
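
A short sketch of two details worth knowing: `in` tests the keys only, and checking membership first is one way to avoid a KeyError on lookup.

```python
some_dict = {'a': 0, 'b': 1, 'c': 2}

print('a' in some_dict)              # True: 'a' is a key
print(0 in some_dict)                # False: 0 is a value, not a key
print(0 in some_dict.values())       # True: search the values explicitly
print('z' not in some_dict)          # True

# Guard a lookup so that a missing key does not raise a KeyError
if 'z' in some_dict:
    print(some_dict['z'])
else:
    print("'z' is not a key")
```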



### Modifications

+ Changing the value for a key

```python
some_dict['a'] = 10
print("our dict (now): ", some_dict)
```

+ Adding an individual key-value pair to a dictionary

```python
some_dict['d'] = 3
print("our dict (now): ", some_dict)
```

+ Updating a dictionary with another dictionary (updates existing values; adds new key-value pairs)
```python
some_other_dict = {'a': 99, 'e': 999}
some_dict.update(some_other_dict)
print("our dict (now): ", some_dict)
```

In [26]:
some_dict['a'] = 10
print("our dict (now): ", some_dict)
some_dict['d'] = 3
print("our dict (now): ", some_dict)

some_other_dict = {'a': 99, 'e': 999}
some_dict.update(some_other_dict)
print("our dict (now): ", some_dict)

our dict (now):  {'a': 10, 'b': 1, 'c': 2}
our dict (now):  {'a': 10, 'b': 1, 'c': 2, 'd': 3}
our dict (now):  {'a': 99, 'b': 1, 'c': 2, 'd': 3, 'e': 999}
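
The operations list earlier also mentions making a dictionary smaller. A minimal sketch of removing items, starting from the final state shown above:

```python
some_dict = {'a': 99, 'b': 1, 'c': 2, 'd': 3, 'e': 999}

del some_dict['e']                     # remove an item; KeyError if the key is missing
removed = some_dict.pop('d')           # remove an item and return its value
missing = some_dict.pop('zzz', None)   # a default value avoids the KeyError

print(removed, missing)
print("our dict (now): ", some_dict)
```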


### Extra - Pretty Printing

+ Complicated dictionaries do not print nicely.
+ pprint is a module that prints dictionaries in a more structured manner
    + it is part of the Python standard library, so nothing extra needs to be installed
    + it still needs to be imported before use
+ To configure the output, create a PrettyPrinter object first and print through it (otherwise the default configuration is used)

```python
import pprint
dict_ = {'name': 'Joe', 'Surname': 'van Niekerk', 'email': 'jvn@c.m', 
        'friends': [{'name': 'Sally'}, {'name': 'Dave'}, {'name': 'Rick'}, {'name': 'James'}]}
print('\n' +'-'*50)
print("No pretty printing")
print('-'*50)
print(dict_)
print('\n' + '-'*50)
print("Default pretty printing")
print('-'*50)
pprint.pprint(dict_)
print('\n' + '-'*50)
print("Custom pretty printing")
print('-'*50)
pp = pprint.PrettyPrinter(indent=4)   # create a pprint object with desired attributes (more on this later)
pp.pprint(dict_)
```

In [29]:
import pprint
dict_ = {'name': 'Joe', 'Surname': 'van Niekerk', 'email': 'jvn@c.m', 
        'friends': [{'name': 'Sally'}, {'name': 'Dave'}, {'name': 'Rick'}, {'name': 'James'}]}
print('\n' +'-'*50)
print("No pretty printing")
print('-'*50)
print(dict_)

print('\n' + '-'*50)
print("Default pretty printing")
print('-'*50)
pprint.pprint(dict_)

print('\n' + '-'*50)
print("Custom pretty printing")
print('-'*50)
pp = pprint.PrettyPrinter(indent=4)   # create a pprint object with desired attributes (more on this later)
pp.pprint(dict_)


--------------------------------------------------
No pretty printing
--------------------------------------------------
{'name': 'Joe', 'Surname': 'van Niekerk', 'email': 'jvn@c.m', 'friends': [{'name': 'Sally'}, {'name': 'Dave'}, {'name': 'Rick'}, {'name': 'James'}]}

--------------------------------------------------
Default pretty printing
--------------------------------------------------
{'Surname': 'van Niekerk',
 'email': 'jvn@c.m',
 'friends': [{'name': 'Sally'},
             {'name': 'Dave'},
             {'name': 'Rick'},
             {'name': 'James'}],
 'name': 'Joe'}

--------------------------------------------------
Custom pretty printing
--------------------------------------------------
{   'Surname': 'van Niekerk',
    'email': 'jvn@c.m',
    'friends': [   {'name': 'Sally'},
                   {'name': 'Dave'},
                   {'name': 'Rick'},
                   {'name': 'James'}],
    'name': 'Joe'}


## Example (PyCharm)
Common pattern: **Aggregate** and **summarize**
+ Say, we have a long list of numbers
+ [0,1,1,3,1,3,6,1,8,2,8,7,5,0,2,2,1,5,4,7,0,0,3,1,2,9,9,4,3,2,5,3,1,2,1,3,3,2,2,4,5,1,6,7,9,8,1,4,2,5,6,8,0,0,0,1,1,2,6,1,3,2,4,2,5,7,3,1,3,4,6]
+ Count the number of times each number appears (and maybe rank-order them according to frequency)
+ Thought Process?

In [32]:
import pprint
numbers = [0,1,1,3,1,3,6,1,8,2,8,7,5,0,2,2,1,5,4,7,0,0,3,1,2,9,9,4,
           3,2,5,3,1,2,1,3,3,2,2,4,5,1,6,7,9,8,1,4,2,5,6,8,0,0,0,1,
           1,2,6,1,3,2,4,2,5,7,3,1,3,4,6]
counts = {}
for num in numbers:
    if num in counts:
        counts[num] = counts[num] + 1
    else:
        counts[num] = 1
pp = pprint.PrettyPrinter(width=10)
pp.pprint(counts)

{0: 7,
 1: 14,
 2: 12,
 3: 10,
 4: 6,
 5: 6,
 6: 5,
 7: 4,
 8: 4,
 9: 3}


In [34]:
import pprint
numbers = [0,1,1,3,1,3,6,1,8,2,8,7,5,0,2,2,1,5,4,7,0,0,3,1,2,9,9,4,
           3,2,5,3,1,2,1,3,3,2,2,4,5,1,6,7,9,8,1,4,2,5,6,8,0,0,0,1,
           1,2,6,1,3,2,4,2,5,7,3,1,3,4,6]
counts = {}
for num in numbers:
    counts[num] = counts.get(num, 0) + 1
#     if num in counts:
#         counts[num] = counts[num] + 1
#     else:
#         counts[num] = 1
pp = pprint.PrettyPrinter(width=10)
pp.pprint(counts)

# help(dict.get)

{0: 7,
 1: 14,
 2: 12,
 3: 10,
 4: 6,
 5: 6,
 6: 5,
 7: 4,
 8: 4,
 9: 3}
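
The rank-ordering step is not shown in the cells above. One possible sketch, using only lists, tuples and loops: turn each item into a (count, number) tuple and sort the tuples. A short sample list is used here instead of the full list.

```python
numbers = [0, 1, 1, 3, 1, 3, 6, 1, 8, 2]   # short sample, not the full list
counts = {}
for num in numbers:
    counts[num] = counts.get(num, 0) + 1

# Put the count first so that sorting the tuples sorts by frequency
pairs = []
for num, count in counts.items():
    pairs.append((count, num))
pairs.sort(reverse=True)                   # highest count first

for count, num in pairs:
    print("number: {}, count: {}".format(num, count))
```

Ties in the count are broken by the number itself, because tuples compare element by element.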


## Example (PyCharm)
Common pattern: **Aggregate**
+ The genetic code maps nucleotide triplets (64 in number) to amino acids (20 in number), so it is redundant: several triplets code for the same amino acid.
```python
import pprint
gen_code = {'uuu': 'Phe', 'uuc': 'Phe', 'uua': 'Leu', 'uug': 'Leu', 
            'ucu': 'Ser', 'ucc': 'Ser', 'uca': 'Ser', 'ucg': 'Ser', 
            'uau': 'Tyr', 'uac': 'Tyr', 'uaa': 'Stop', 'uag': 'Stop',
            'ugu': 'Cys', 'ugc': 'Cys', 'uga': 'Stop', 'ugg': 'Trp',
            'cuu': 'Leu', 'cuc': 'Leu', 'cua': 'Leu', 'cug': 'Leu', 
            'ccu': 'Pro', 'ccc': 'Pro', 'cca': 'Pro', 'ccg': 'Pro',
            'cau': 'His', 'cac': 'His', 'caa': 'Gln', 'cag': 'Gln', 
            'cgu': 'Arg', 'cgc': 'Arg', 'cga': 'Arg', 'cgg': 'Arg'}
pprint.pprint(gen_code)
```

+ What if we want to know which triplets code for Phe?
    - one solution: iterate through genetic code dict for each query and assemble results
    - better solution: pre-compute the desired results (mapping from amino acid to nucleotide triplets)
+ Thought Process?
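
As a point of comparison, the first solution (scan the whole dictionary for each query) could look like the sketch below. It uses a small excerpt of the genetic code, and the names `query` and `codons` are hypothetical.

```python
gen_code = {'uuu': 'Phe', 'uuc': 'Phe', 'uua': 'Leu', 'uug': 'Leu'}   # small excerpt

query = 'Phe'
codons = []
for codon, amino_acid in gen_code.items():
    if amino_acid == query:          # keep only the triplets that map to the query
        codons.append(codon)

print(query, '->', codons)
```

Every new query repeats the full scan, which is why pre-computing the reverse mapping once is the better option when there are many queries.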

In [39]:
import pprint
gen_code = {'uuu': 'Phe', 'uuc': 'Phe', 'uua': 'Leu', 'uug': 'Leu', 
          'ucu': 'Ser', 'ucc': 'Ser', 'uca': 'Ser', 'ucg': 'Ser', 
          'uau': 'Tyr', 'uac': 'Tyr', 'uaa': 'Stop', 'uag': 'Stop',
          'ugu': 'Cys', 'ugc': 'Cys', 'uga': 'Stop', 'ugg': 'Trp',
          'cuu': 'Leu', 'cuc': 'Leu', 'cua': 'Leu', 'cug': 'Leu', 
          'ccu': 'Pro', 'ccc': 'Pro', 'cca': 'Pro', 'ccg': 'Pro',
          'cau': 'His', 'cac': 'His', 'caa': 'Gln', 'cag': 'Gln', 
          'cgu': 'Arg', 'cgc': 'Arg', 'cga': 'Arg', 'cgg': 'Arg'}
# pprint.pprint(gen_code)

# aa_codon_map = {}
# for key, value in gen_code.items():
#     if value in aa_codon_map:
#         aa_codon_map[value].append(key)
#     else:
#         aa_codon_map[value] = [key]

aa_codon_map = {}
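for key, value in gen_code.items():
    # Pitfall: list.append() returns None, so None is stored here and the
    # next codon for the same amino acid fails with the AttributeError below
    aa_codon_map[value] = aa_codon_map.get(value, []).append(key)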

pprint.pprint(aa_codon_map)

AttributeError: 'NoneType' object has no attribute 'append'
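
The one-liner fails because `list.append` changes the list in place and returns None. One possible repair, sketched here with `dict.setdefault` (an assumption about the intended fix; the commented-out if/else version in the cell above also works). A small excerpt of the genetic code is used.

```python
gen_code = {'uuu': 'Phe', 'uuc': 'Phe', 'uua': 'Leu', 'uug': 'Leu',
            'uaa': 'Stop', 'uag': 'Stop', 'uga': 'Stop', 'ugg': 'Trp'}   # excerpt

aa_codon_map = {}
for key, value in gen_code.items():
    # setdefault stores [] under value if it is missing and returns the
    # stored list, so append acts on the list that lives in the dictionary
    aa_codon_map.setdefault(value, []).append(key)

print(aa_codon_map)
```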

## Control Flow (Demonstrate in PyCharm - debug mode)
### *<font color='blue'>break</font>* keyword
+ to **stop** the current loop from running any further (**break out of the loop**)

```python
print('\nbreak statement')
numbers = [1, 3, 5, 6, 9, 10]
for number in numbers:
    if number % 2 == 0:
        break
    print(number)
```

### *<font color='blue'>continue</font>* keyword
+ to **skip** over the current iteration in current loop (**continue without completing current iteration**)

```python
print('\ncontinue statement')
numbers = [1, 3, 5, 6, 9, 10]
for number in numbers:
    if number % 2 == 0:
        continue
    print(number)
```
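
To compare the two keywords side by side, the sketch below runs both loops on the same list and collects the numbers that get through (the list names are hypothetical; the loops mirror the two examples above).

```python
numbers = [1, 3, 5, 6, 9, 10]

seen_with_break = []
for number in numbers:
    if number % 2 == 0:
        break                    # leaves the loop at 6; 9 and 10 are never visited
    seen_with_break.append(number)

seen_with_continue = []
for number in numbers:
    if number % 2 == 0:
        continue                 # skips 6 and 10 but keeps looping
    seen_with_continue.append(number)

print(seen_with_break)       # [1, 3, 5]
print(seen_with_continue)    # [1, 3, 5, 9]
```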

### *<font color='blue'>break</font>* and *<font color='blue'>continue</font>* statements operate on the innermost enclosing loop

```python
print('\nbreak statement')
letters = ['a', 'b', 'c']
numbers = [5, 7, 9, 10, 15]
for char in letters:                         # outer loop
   for number in numbers:                    # inner loop
       if number % 2 == 0:
           break
       print(char, number)

###########################

print('\ncontinue statement')
letters = ['a', 'b', 'c']
numbers = [5, 7, 9, 10, 15]
for char in letters:
   for number in numbers:
       if number % 2 == 0:
           continue
       print(char, number)
```