In [1]:
#Please execute this cell
import sys;
sys.path.append('../../'); 
import jupman;


# Dictionaries solutions

## [Download exercises zip](../../_static/dictionaries-exercises.zip)

[Browse files online](https://github.com/DavidLeoni/datasciprolab/tree/master/exercises/dictionaries)






## What to do

- unzip exercises in a folder; you should get something like this:

```

-jupman.py
-my_lib.py
-other stuff ...
-exercises
     |- dictionaries
         |- dictionaries-exercise.ipynb     
         |- dictionaries-solution.ipynb
         |- other stuff ..
```

<div class="alert alert-warning">

**WARNING**: to view the notebook correctly, it MUST be in an unzipped folder!
</div>


- open Jupyter Notebook from that folder. Two things should open: first a console and then a browser. The browser should show a file list: navigate the list and open the notebook `exercises/dictionaries/dictionaries-exercise.ipynb`

<div class="alert alert-warning">

**WARNING 2**: DO NOT use the _Upload_ button in Jupyter; instead, navigate in the Jupyter file browser to the unzipped folder!
</div>


- Go on reading that notebook, and follow the instructions inside.


Shortcut keys:

- to execute Python code inside a Jupyter cell, press `Control + Enter`
- to execute Python code inside a Jupyter cell AND select the next cell, press `Shift + Enter`
- to execute Python code inside a Jupyter cell AND create a new cell afterwards, press `Alt + Enter`
- if the notebook looks stuck, try selecting `Kernel -> Restart`





## Introduction

We will review dictionaries, discuss ordering issues for keys, and finally deal with sets.

## Dict

### Dict introduction

First let's review Python dictionaries:

Dictionaries map keys to values. Keys must be of immutable types such as numbers, strings or tuples (so, for example, lists are not allowed as keys), while values can be anything. In the following example, we create a dictionary `d` that initially maps strings to numbers:

In [2]:
# create empty dict:

d = dict()
d

{}

In [3]:
type( dict() )

dict

Alternatively, to create a dictionary you can type `{}`:

In [4]:
{}

{}

In [5]:
type( {} )

dict

In [6]:
# associate string "some key" to number 4
d['some key'] = 4
d

{'some key': 4}

To access a value corresponding to a key, write this: 

In [7]:
d['some key']

4

You can't use mutable objects like lists as keys:

```python
d[ ['a', 'mutable', 'list', 'as key']  ] = 3
```

```bash
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-204-fb9d60c4e88a> in <module>()
----> 1 d[ ['a', 'mutable', 'list', 'as key']  ] = 3

TypeError: unhashable type: 'list'
```

But you can use tuples as keys:

In [8]:
d[ ('an', 'immutable', 'tuple', 'as key')  ] = 3
d

{('an', 'immutable', 'tuple', 'as key'): 3, 'some key': 4}
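If the data you want to use as a key is stored in a list, a common workaround is to convert it to a tuple first. The following is only a minimal sketch: the names `coords` and `labels` are made up for illustration and do not appear in the exercises.

```python
# 'coords' and 'labels' are hypothetical names used only for this example
coords = [4, 7]            # a list: mutable, so it cannot be used as a key
labels = {}

try:
    labels[coords] = 'point A'
except TypeError as err:
    print('Cannot use a list as key:', err)

# tuple(coords) builds an immutable copy of the list, which is a valid key
labels[tuple(coords)] = 'point A'
print(labels)
```

Note the tuple is a copy: changing `coords` afterwards does not affect the key stored in the dictionary.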

In [9]:
# associate string "some other key" to number 7
d['some other key'] = 7
d

{('an', 'immutable', 'tuple', 'as key'): 3, 'some key': 4, 'some other key': 7}

In [10]:
# Dictionaries are mutable, so you can reassign a key to a different value:
d['some key'] = 5
d

{('an', 'immutable', 'tuple', 'as key'): 3, 'some key': 5, 'some other key': 7}

In [11]:
# Dictionaries are heterogeneous, so values can be of different types:

d['yet another key'] = 'now a string!'
d

{('an', 'immutable', 'tuple', 'as key'): 3,
 'some key': 5,
 'some other key': 7,
 'yet another key': 'now a string!'}

In [12]:
# Keys can also be of heterogeneous types, but they *must* be of immutable types:

In [13]:
d[123] = 'hello'
d

{('an', 'immutable', 'tuple', 'as key'): 3,
 123: 'hello',
 'some key': 5,
 'some other key': 7,
 'yet another key': 'now a string!'}

In [14]:

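# To iterate through keys, use a 'for in' construct:

# ****  WARNING: before Python 3.7, iteration order is NOT guaranteed to match insertion order!!  ****
# (from Python 3.7 on, regular dictionaries preserve insertion order;
#  the outputs shown here appear to come from an older version)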

In [15]:
for k in d:
    print(k)

123
some key
some other key
('an', 'immutable', 'tuple', 'as key')
yet another key


In [16]:
# get all keys:

In [17]:
d.keys()

dict_keys([123, 'some key', 'some other key', ('an', 'immutable', 'tuple', 'as key'), 'yet another key'])

In [18]:
# get all values:

In [19]:
d.values()

dict_values(['hello', 5, 7, 3, 'now a string!'])
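Besides `.keys()` and `.values()`, dictionaries also offer `.items()`, which gives key/value pairs together. A small sketch follows; `prices` and `pairs` are hypothetical names chosen only for this example.

```python
# 'prices' and 'pairs' are hypothetical names used only for this example
prices = {'apple': 3, 'pear': 5}

pairs = []
# .items() yields (key, value) tuples, unpacked here into two variables
for key, value in prices.items():
    print(key, '->', value)
    pairs.append((key, value))
```

This avoids looking up `prices[key]` inside the loop.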

In [20]:
# delete a key:

del d['some key']
d

{('an', 'immutable', 'tuple', 'as key'): 3,
 123: 'hello',
 'some other key': 7,
 'yet another key': 'now a string!'}

### histogram

Difficulty: ✪✪

In [21]:

def histogram(string):
    """

    RETURN a dictionary that, for each character in string, contains the number of occurrences.
    The keys are the characters and the values are the number of occurrences
    """
    
    #jupman-raise
    ret = dict()
    for c in string:
        if c in ret:
            ret[c] += 1
        else:
            ret[c] = 1
    return ret
    #/jupman-raise
    
assert histogram("babbo") == {'b': 3, 'a':1, 'o':1}
assert histogram("") == {}
assert histogram("cc") == {'c': 2}
assert histogram("aacc") == {'a': 2, 'c':2}
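As a possible variation on the solution above, the `if`/`else` can be replaced by `dict.get`, which returns a default when the key is missing. This is just an alternative sketch, and the name `histogram_get` is invented here to avoid clashing with the exercise function.

```python
def histogram_get(string):
    """ Hypothetical variant of histogram that uses dict.get with a default of 0 """
    ret = {}
    for c in string:
        # ret.get(c, 0) is the current count, or 0 if c was never seen
        ret[c] = ret.get(c, 0) + 1
    return ret

print(histogram_get("babbo"))
```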

### listify

Difficulty: ✪✪

In [22]:
def listify(d, order):
    """
    Takes a dictionary d as input and RETURN a list with only the values from the dict (so no keys).
    To have a predictable order, the function also takes as input a list 'order' containing
    the keys of the dictionary, ordered as we would like the values to appear in the resulting list
    """
    #jupman-raise
    ret = list()
    for element in order:
        ret.append(d[element])
    return ret
    #/jupman-raise


assert listify({}, []) == []
assert listify({'ciao':123}, ['ciao']) == [123]
assert listify({'a':'x','b':'y'}, ['a','b']) == ['x','y']
assert listify({'a':'x','b':'y'}, ['b','a']) == ['y','x']
assert listify({'a':'x','b':'y','c':'x'}, ['c','a','b']) == ['x','x','y']
assert listify({'a':'x','b':'y','c':'x'}, ['b','c','a']) == ['y','x','x']
assert listify({'a':5,'b':2,'c':9}, ['b','c','a']) == [2,9,5]
assert listify({6:'x',8:'y',3:'x'}, [6,3,8]) == ['x','x','y']
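The loop in the solution can also be written as a list comprehension. The sketch below uses the made-up name `listify_comp` and assumes, like the original, that every key in `order` is present in `d`.

```python
def listify_comp(d, order):
    """ Hypothetical variant of listify written as a list comprehension.
        Assumes every key in 'order' exists in d (otherwise KeyError is raised).
    """
    return [d[key] for key in order]

print(listify_comp({'a': 5, 'b': 2, 'c': 9}, ['b', 'c', 'a']))
```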

### tcounts

Difficulty: ✪✪

In [23]:
def tcounts(lst):
    """

    Takes a list of tuples. Each tuple has two values: the first is an immutable object and
    the second one is an integer number (the count of that object).
    RETURN a dictionary that, for each immutable object found in the tuples,
    associates the total count found for it.
    
    See asserts for examples
    """
    ret = {}
    for c in lst:
        if c[0] in ret:
            ret[c[0]] += c[1]
        else:
            ret[c[0]] = c[1]
    return ret

assert tcounts([]) == {}
assert tcounts([('a',3)]) == {'a':3}
assert tcounts([('a',3),('a',4)]) == {'a':7}
assert tcounts([('a',3),('b',8), ('a',4)]) == {'a':7, 'b':8}
assert tcounts([('a',5), ('c',8), ('b',7), ('a',2), ('a',1), ('c',4)]) == {'a':5+2+1, 'b':7, 'c': 8 + 4}
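The "check if the key exists, otherwise initialize it" pattern used above is common enough that the standard library offers `collections.defaultdict` for it. Here is one possible sketch; the name `tcounts_dd` is invented for this example.

```python
from collections import defaultdict

def tcounts_dd(lst):
    """ Hypothetical variant of tcounts based on collections.defaultdict """
    # missing keys are created on first access with value int(), that is 0
    ret = defaultdict(int)
    for obj, count in lst:
        ret[obj] += count
    # convert back to a plain dictionary before returning
    return dict(ret)

print(tcounts_dd([('a', 3), ('b', 8), ('a', 4)]))
```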


## OrderedDict

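As we said before, when you scan the keys of a dictionary in Python versions before 3.7, the order is **not** guaranteed to be the same as the insertion order (from Python 3.7 on, regular dictionaries do preserve it). To have a predictable order on any version, you can use an `OrderedDict`: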

In [24]:
# first you need to import it from collections module
from collections import OrderedDict

od = OrderedDict()

# An OrderedDict looks and feels exactly like a regular dictionary. Here we reproduce the previous example:

od['some key'] = 5

od['some other key'] = 7
od[('an', 'immutable', 'tuple','as key')] = 3
od['yet another key'] = 'now a string!'
od[123] = 'hello'
od

OrderedDict([('some key', 5),
             ('some other key', 7),
             (('an', 'immutable', 'tuple', 'as key'), 3),
             ('yet another key', 'now a string!'),
             (123, 'hello')])

Now you will see that if you iterate with the `for in` construct, you get exactly the same insertion sequence:

In [25]:
for key in od:
    print("%s  :  %s" %(key, od[key]))


some key  :  5
some other key  :  7
('an', 'immutable', 'tuple', 'as key')  :  3
yet another key  :  now a string!
123  :  hello


To create it all at once, since you want to be sure of the order, you can pass a list of tuples representing key/value pairs. Here we reproduce the previous example: 


In [26]:


od = OrderedDict(
        [
            ('some key', 5),
            ('some other key', 7),
            (('an', 'immutable', 'tuple','as key'), 3),
            ('yet another key', 'now a string!'),
            (123, 'hello')
        ]
)

od

OrderedDict([('some key', 5),
             ('some other key', 7),
             (('an', 'immutable', 'tuple', 'as key'), 3),
             ('yet another key', 'now a string!'),
             (123, 'hello')])

Again you will see that if you iterate with the `for in` construct, you get exactly the same insertion sequence:

In [27]:
for key in od:
    print("%s  :  %s" % (key, od[key]))

some key  :  5
some other key  :  7
('an', 'immutable', 'tuple', 'as key')  :  3
yet another key  :  now a string!
123  :  hello
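One practical difference from regular dictionaries is that comparing two `OrderedDict` objects also takes the order into account, and that an `OrderedDict` can reorder its keys with `move_to_end`. A short sketch, where `first` and `second` are names made up for the example:

```python
from collections import OrderedDict

# 'first' and 'second' are hypothetical names used only for this example
first = OrderedDict([('a', 1), ('b', 2)])
second = OrderedDict([('b', 2), ('a', 1)])

# same pairs but different order: two OrderedDict compare as different
print(first == second)

# plain dictionaries ignore order when compared
print(dict(first) == dict(second))

# move key 'a' to the last position, so now the order matches
first.move_to_end('a')
print(list(first.keys()))
print(first == second)
```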


## Set


### Set introduction
A set is an _unordered_ collection of _distinct_ elements, so no duplicates are allowed.

In Python you can create a set with a call to `set()`

In [28]:
s = set()

In [29]:
s

set()

To add elements, use the `.add()` method:

In [30]:
s.add('hello')
s.add('world')

Notice Python represents a set with curly brackets but, differently from a dictionary, you won't see colons `:` nor key/value pairs:

In [31]:
s

{'hello', 'world'}

### Empty sets

<div class="alert alert-warning">

**WARNING**: `{}` means empty dictionary, not empty set
</div>

Since the printed representation of a set starts and ends with curly brackets, just like a dictionary, when you see `{}` you might wonder whether it is the empty set or the empty dictionary.

The empty dictionary is represented as a pair of curly brackets:


In [32]:
d = {}

In [33]:
d

{}

In [34]:
type(d)

dict

The empty set is represented instead with `set()`

In [35]:
s = set()

In [36]:
s

set()

In [37]:
type(s)

set
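The ambiguity only concerns the empty case: curly brackets with elements inside (and no colons) do create a set. The sketch below puts the cases side by side; all variable names are made up for illustration.

```python
# all names here are hypothetical, chosen only for this example
empty_dict = {}                  # curly brackets alone: a dictionary
empty_set = set()                # the only way to write an empty set
letters = {'a', 'b', 'a'}        # non-empty literal: a set, the duplicate 'a' collapses
from_list = set([1, 2, 2, 3])    # a set can also be built from a list

print(type(empty_dict))
print(type(empty_set))
print(type(letters), len(letters))
print(len(from_list))
```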

You can iterate in a set with the `for in` construct:

In [38]:
for el in s:
    print(el)

Sets, like dictionary keys before Python 3.7, are not necessarily iterated in the same order as the insertion one (here `s` is still empty, so the loop prints nothing). This also means they do not support access by index:

```python
s[0]
```

```bash
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-61-f8bb2b116405> in <module>()
----> 1 s[0]

TypeError: 'set' object does not support indexing


```
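If you do need positional access to the elements of a set, one option is to convert it to a list first, for example with `sorted`, which also fixes the order. A minimal sketch, with `numbers` and `ordered` as made-up names:

```python
# 'numbers' and 'ordered' are hypothetical names used only for this example
numbers = {30, 10, 20}

# sorted() accepts any iterable and always returns a new list
ordered = sorted(numbers)

print(ordered)
print(ordered[0])   # the list supports indexing, the set does not
```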

Since sets must contain distinct elements, if we add the same element twice the set remains unmodified, with no complaints from Python:

In [39]:
s.add('hello')

In [40]:
s

{'hello'}

In [41]:
s.add('world')

In [42]:
s

{'hello', 'world'}

In a set we can add heterogeneous elements, like a number here:

In [43]:
s.add(7)

In [44]:
s

{7, 'hello', 'world'}

To remove an element, use the `.remove()` method:

In [45]:
s.remove('world')

In [46]:
s

{7, 'hello'}
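Keep in mind that `.remove()` raises a `KeyError` if the element is not in the set, while the related `.discard()` method silently does nothing in that case. A small sketch, using a made-up set called `colors`:

```python
# 'colors' is a hypothetical set used only for this example
colors = {'red', 'green'}

try:
    colors.remove('blue')        # 'blue' is missing: KeyError
except KeyError:
    print("'blue' is not in the set")

colors.discard('blue')           # missing element: no error, nothing happens
colors.discard('red')            # present element: it gets removed
print(colors)
```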

### inter

Difficulty: ✪✪

In [47]:


def inter(d1, d2):
    """
    RETURN a set of keys for which the pair <key, value> is the same in both dictionaries

    Example: a = {key1: 1, key2: 2, key3: 3}  b = {key1: 1, key2: 3, key3: 3}

       return {key1, key3}
    """    
    #jupman-raise    
    res = set()
    for key in d1:
        if key in d2:
            if d1[key] == d2[key]:
                res.add(key)
    return res
    #/jupman-raise

    
assert inter({'key1': 1, 'key2': 2 , 'key3': 3}, {'key1':1 ,'key2':3 , 'key3':3}) == {'key1', 'key3'}
assert inter(dict(), {'key1':1 ,'key2':3 , 'key3':3}) == set()
assert inter({'key1':1 ,'key2':3 , 'key3':3}, dict()) == set()
assert inter(dict(),dict()) == set()
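Since this section is about sets, it may be worth noting that the views returned by `.items()` support set operations such as `&` (intersection). The sketch below relies on that and assumes all values are hashable (true for the numbers used in the asserts); the name `inter_items` is invented for this example.

```python
def inter_items(d1, d2):
    """ Hypothetical variant of inter based on the intersection of items views.
        Assumes all values in both dictionaries are hashable.
    """
    # (key, value) pairs that appear in both dictionaries
    common = d1.items() & d2.items()
    # keep only the keys of those pairs
    return {key for key, value in common}

print(inter_items({'key1': 1, 'key2': 2, 'key3': 3}, {'key1': 1, 'key2': 3, 'key3': 3}))
```

If values could be unhashable (lists, for example), the explicit loop of the original solution is the safer choice.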

### unique_vals

Difficulty: ✪✪

In [48]:
def unique_vals(d):
    """
    RETURN a list of unique values from the dictionary. The list MUST be ordered alphanumerically
    Question: We need it ordered for testing purposes. Why?
    
    Ex: {'a':'y','b':'x','c':'x'}
    
    must return  ['x','y']
    
    - to order the list, use method  .sort()
    """
    #jupman-raise
    s = set(d.values())
    ret = list(s)  # we can only sort lists (sets have no order)
    ret.sort()
    return ret
    #/jupman-raise
    
assert unique_vals({}) == []
assert unique_vals({'a':'y','b':'x','c':'x'}) == ['x','y']
assert unique_vals({'a':4,'b':6,'c':4,'d':8}) == [4,6,8]
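The three steps of the solution (build a set, convert it to a list, sort it) can also be compressed with the built-in `sorted`, which accepts any iterable and returns a new list. This is only an alternative sketch, with the made-up name `unique_vals_sorted`; like the original, it assumes the values are hashable and comparable with each other.

```python
def unique_vals_sorted(d):
    """ Hypothetical variant of unique_vals using the built-in sorted().
        Assumes values are hashable and mutually comparable.
    """
    # set() drops duplicates, sorted() returns an ordered list
    return sorted(set(d.values()))

print(unique_vals_sorted({'a': 'y', 'b': 'x', 'c': 'x'}))
```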