In [1]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

# PYTHON 3

## Dictionaries, sets, collections


### Eugene Baulin


MIPT 2019

### Creating and deleting

 
- Two mutable objects created separately are guaranteed to be distinct objects. For immutable objects this is not always true: the interpreter may reuse an existing object.
- You do not need to delete objects manually; the interpreter reclaims them for you.
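A small sketch of both points, assuming CPython; the variable names are illustrative. Two separately created lists are always distinct objects, while whether two equal immutable values share one object is an implementation detail that should not be relied on.

```python
# Two mutable objects created separately are always distinct objects.
list_a = []
list_b = []
same_object = list_a is list_b
print(list_a == list_b)  # equal contents
print(same_object)       # but never the same object

# For immutable objects the interpreter is free to reuse an existing object,
# so identity may or may not hold; compare values with ==, not with `is`.
tuple_a = (1, 2)
tuple_b = (1, 2)
print(tuple_a == tuple_b)

# `del` removes the name only; the object itself is reclaimed automatically
# once nothing refers to it any more.
del list_b
```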

## Dictionaries

A dictionary is a universal tool for expressing relationships between objects, for counting, and for grouping.

Dictionaries are also called **associative arrays**; in Python they are implemented as **hash tables**.

In [None]:
a = {'Key1' : 'Value1', 'Key2' : 'Value2'}
a

In [None]:
b = dict([(1, 1), (2, 4), (3, 9)])
b

- Any hashable object can be a key. (Built-in mutable containers are not hashable.)

The definition of hashable from the Python documentation:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.

Hashability makes an object usable as a dictionary key and a set member, because these data structures use the hash value internally.

Most of Python’s immutable built-in objects are hashable; mutable containers (such as lists or dictionaries) are not; immutable containers (such as tuples and frozensets) are only hashable if their elements are hashable. Objects which are instances of user-defined classes are hashable by default. They all compare unequal (except with themselves), and their hash value is derived from their id().

- In short, the object must have the `__hash__()` method defined correctly.
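As a rough illustration of what "defined correctly" means, the sketch below uses a hypothetical `Point` class (not part of the original material) whose `__eq__` and `__hash__` are built from the same fields, so that equal objects always have equal hashes.

```python
class Point:
    """Hypothetical example class: a 2D point that is immutable by convention."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash exactly the fields that __eq__ compares.
        return hash((self.x, self.y))


visits = {Point(1, 2): 'first'}
# A different but equal object finds the same entry.
print(visits[Point(1, 2)])
```

If the fields were changed after the object had been used as a key, its hash would change and the entry could no longer be found, which is why keys should be treated as immutable.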

For small non-negative integers in CPython, `hash(n) == n`.

In [None]:
hash(343)
hash(True)
hash('hello')

In [None]:
hash(6.5) # floats are hashable, but computed floats carry rounding error,
          # so two "equal" results may hash differently; avoid float keys where you can
hash(round(6.50443, 2)) # or at least round() the value before hashing
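A short sketch of why float keys need care, using illustrative names. Numbers that compare equal hash equally across `int`, `float` and `bool`, so they collapse into a single key, while a computed float such as `0.1 + 0.2` is a different key from the literal `0.3`.

```python
# Equal numbers of different types share one hash and therefore one key.
print(hash(1) == hash(1.0) == hash(True))
merged = {1: 'int', 1.0: 'float', True: 'bool'}
print(merged)  # a single entry survives

# Rounding error makes a computed float differ from the literal.
prices = {0.3: 'literal'}
print((0.1 + 0.2) in prices)
print(round(0.1 + 0.2, 2) in prices)
```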

In [None]:
print(hash('aaa'))
print(hash('aab'))

#### Note: after restarting the interpreter, some objects (for example, strings) will have a different hash value, because string hashing is randomized per process
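One way to observe this without restarting the notebook is to ask a fresh interpreter for the hash. The sketch below is only an illustration: the helper name is hypothetical, and it relies on the `PYTHONHASHSEED` environment variable, which fixes the seed when set to an integer.

```python
import os
import subprocess
import sys


def string_hash_in_new_interpreter(seed):
    """Return hash('aaa') as computed by a fresh interpreter with the given seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, '-c', "print(hash('aaa'))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


# The same seed reproduces the same hash in every new process.
print(string_hash_in_new_interpreter(1) == string_hash_in_new_interpreter(1))
# Integer hashes do not depend on the seed.
print(hash(343))
```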

Lists in Python are not hashable.

In [None]:
[1].__hash__ is None  # list sets __hash__ to None, which marks it as unhashable

Can a dict be used as a key in another dict?

In [None]:
d1 = {1: 'b'}
d2 = {d1: 'abc'}  # raises TypeError: unhashable type: 'dict'

In [None]:
{1: 'b'}.__hash__ is None  # dict is also not hashable
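If a dict really has to serve as a key, one common workaround is to freeze it into a hashable snapshot first. A minimal sketch, assuming all keys and values of the inner dict are themselves hashable; the names are illustrative.

```python
d1 = {1: 'b'}

try:
    d2 = {d1: 'abc'}
except TypeError as error:
    print('cannot use a dict as a key:', error)

# A frozenset of (key, value) pairs is hashable and ignores ordering.
frozen_key = frozenset(d1.items())
d2 = {frozen_key: 'abc'}
print(d2[frozenset({1: 'b'}.items())])
```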

You can iterate over a dictionary by key, by value, or by key-value pair.

In [None]:
# iterating
dictionary = {'a': 1, 'b': 2, 'c': 3}
   
for k in dictionary.keys():
    print(k)
    
print()
    
for k in dictionary:  # equivalent to iterating over .keys(); the Zen of Python says explicit is better than implicit,
    print(k)          # so adding ".keys()" can make the intent clearer
                      # overly readable code has never hurt anyone
        
# always keep order in mind: since Python 3.7 a dict iterates in insertion order

In [None]:
for v in dictionary.values(): # iterating by values
    print(v)

In [None]:
for pair in dictionary.items(): # iterating by key-value pairs
    print(pair)
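Each pair yielded by `.items()` is a tuple, so it can be unpacked straight into two loop variables. A short sketch with the same sample dictionary as in the cells above:

```python
dictionary = {'a': 1, 'b': 2, 'c': 3}

for key, value in dictionary.items():  # unpack each (key, value) tuple
    print(key, '->', value)

# The same unpacking works inside a list comprehension.
pairs_as_text = [f'{key}={value}' for key, value in dictionary.items()]
print(pairs_as_text)
```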

In [None]:
# constructors:
a = dict(a=1, b=2, c=3)
a
keys = ["Petya", "Vasya", "Masha"]
values = [20, 21, 22]

dictionary = dict(zip(keys, values)) # probably the most convenient way to create a dict from two lists 
                                     # we will talk about zip() function later
dictionary

In [None]:
print(list(a.keys()))
print(list(a.values()))
print(list(a.items()))

In [None]:
del dictionary['Vasya']
dictionary

In [None]:
a.update(dictionary)  # merges dictionary into a in place; values of existing keys are overwritten
a
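`update()` changes the dictionary in place. When the originals must stay untouched, a new dict can be built by unpacking both. A small sketch with illustrative sample data (the `**` form needs Python 3.5+):

```python
ages = {'Petya': 20, 'Masha': 22}
corrections = {'Masha': 23, 'Vasya': 21}

# Build a new dict; on duplicate keys the right-hand value wins.
merged = {**ages, **corrections}
print(merged)
print(ages)  # unchanged so far

# update() gives the same result but modifies `ages` itself.
ages.update(corrections)
print(ages == merged)
```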

In [None]:
a[('Composite', 'Key')] = [1, 2, 3]   # only hashable objects can be dict keys; a tuple of strings qualifies
a

## 5-minute task
### Use what you have learned about dictionary iteration to invert a dictionary, i.e. create a dictionary of inverse pairs (value: key). Assume that the values in the original dictionary are also hashable.

### Remember list comprehensions? There are also dict comprehensions!

In [None]:
dct = {i : i ** 3 for i in range(5)}
dct
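A dict comprehension also gives one possible answer to the inversion task above. This is only a sketch, assuming the values are hashable and unique; if two keys share a value, the later one silently wins.

```python
original = {'a': 1, 'b': 2, 'c': 3}

# Swap each (key, value) pair.
inverted = {value: key for key, value in original.items()}
print(inverted)
```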

### Sets
Sets are also based on hash tables.

In [None]:
a = {1, 2, 3}
b = set([2, 3, 4])

a.add(5)
b.update({5, 6}) # in-place union: adds every element of the argument to b
a, b

In [None]:
s, t, x = {1, 2}, {1, 2, 3}, 1  # example values
x in s
x not in s
s.issubset(t)    # equivalent to s <= t
s.issuperset(t)  # equivalent to s >= t

In [None]:
print(a - b)
print(b - a)
print(a | b) # union
print(a & b) # intersection
print(a ^ b) # symmetric difference (XOR)
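Each operator has a method counterpart. One practical difference, sketched below with the values that `a` and `b` hold at this point, is that the methods accept any iterable while the operators require both operands to be sets.

```python
a = {1, 2, 3, 5}
b = {2, 3, 4, 5, 6}

print(a.difference(b) == a - b)
print(a.union(b) == a | b)
print(a.intersection(b) == a & b)
print(a.symmetric_difference(b) == a ^ b)

# Methods accept any iterable, e.g. a list; `a | [7, 8]` would raise TypeError.
print(a.union([7, 8]))
```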

There are also set comprehensions

In [None]:
st = {i for i in range(10) if not i % 3}
st

In [None]:
d = {st: 1} # raises TypeError: sets are also not hashable

In [None]:
d = {frozenset(st): 6}  # but there is a type frozenset that you actually can hash (because it's immutable!)
d

# When to use dict and set?

### Establishing a correspondence between two sets of objects (e.g. a dictionary is a convenient way to implement translation from one language to another)

### Counting unique elements

### Fast membership checks (looking up a key in a dict or set takes O(1) on average: the hash of the object is computed and used to locate a matching entry in the container)


In [None]:
2 in a     # O(1)
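The three use cases can be sketched in a few lines. The word list and the tiny translation table are made-up sample data.

```python
words = ['cat', 'dog', 'cat', 'bird', 'dog', 'cat']

# Mapping: a dict as a translation table (sample data).
english_to_french = {'cat': 'chat', 'dog': 'chien', 'bird': 'oiseau'}
print([english_to_french[word] for word in words[:2]])

# Counting: a set for the number of unique elements, a dict for occurrences.
unique_words = set(words)
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1
print(len(unique_words), counts)

# Membership: average O(1) for a set, O(n) for a list.
print('dog' in unique_words)
```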

### "Interview tasks"

Given two sorted lists of numbers (not necessarily of the same length), print all the numbers that are in the first list but not in the second.

In [23]:
lst1 = [1, 2, 8]
lst2 = [2, 6]

#### method 1: using set

In [None]:
# method 1
set(lst1) - set(lst2) 

Formally this runs in O(n) time (building a set takes O(n), though with a considerable constant factor), but it requires additional memory and does not use the fact that the lists are sorted.
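Note also that a set difference does not keep the elements in sorted order. A variation, sketched here as an alternative rather than as the intended solution, builds a set only from the second list and filters the first, which preserves the order of `lst1`:

```python
lst1 = [1, 2, 8]
lst2 = [2, 6]

seen_in_second = set(lst2)  # O(len(lst2)) extra memory
only_in_first = [x for x in lst1 if x not in seen_in_second]
print(only_in_first)
```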

#### method 2: let's think about how to do it in O(n) time without additional memory

In [18]:
lst1 = [1, 2, 8]
lst2 = [2, 6]

In [None]:
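i, j = 0, 0

while i < len(lst1):
    if j >= len(lst2) or lst1[i] < lst2[j]:
        # lst2 is exhausted or its current element is already larger:
        # lst1[i] cannot occur in lst2
        print(lst1[i])
        i += 1
    elif lst1[i] == lst2[j]:
        # present in both lists: skip it
        i += 1
    else:
        # lst2[j] is smaller: advance the second pointer
        j += 1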
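The same two-pointer walk can be wrapped in a function that returns the result instead of printing it, which makes it easier to check. `sorted_difference` is a hypothetical name, and the output list is extra memory that the printing version avoids.

```python
def sorted_difference(first, second):
    """Return elements of sorted list `first` that are absent from sorted list `second`."""
    result = []
    i, j = 0, 0
    while i < len(first):
        if j >= len(second) or first[i] < second[j]:
            result.append(first[i])
            i += 1
        elif first[i] == second[j]:
            i += 1
        else:
            j += 1
    return result


print(sorted_difference([1, 2, 8], [2, 6]))
print(sorted_difference([1, 1, 3], [1]))
```

Each iteration advances `i` or `j`, so the loop runs at most `len(first) + len(second)` times.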

### collections

The `collections` module provides dictionaries adapted to different needs, along with some other convenient data structures.

A good overview of the `collections` module is available [here](https://pythonworld.ru/moduli/modul-collections.html) (in Russian).

In [None]:
from collections import defaultdict
dct = defaultdict(float)

print(dct[2]) # if the key is missing, defaultdict inserts {key: default value} instead of raising KeyError
print(dct)
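Because the factory is called for every missing key, `defaultdict` suits the counting and grouping use cases mentioned earlier. A minimal sketch with made-up sample data:

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Grouping: list() supplies an empty list for each new first letter.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)
print(dict(by_letter))

# Counting: int() supplies 0 for each new key.
letter_counts = defaultdict(int)
for word in words:
    letter_counts[word[0]] += 1
print(dict(letter_counts))
```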

In [None]:
from collections import deque
q = deque()

for i in range(10):
    q.append(i)

while len(q) > 5: 
    print(q.pop(), q) # O(1)

print()
    
while len(q):  # while deque is not empty
    print(q.popleft(), q) # O(1)
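A `deque` can also grow from the left and can be given a fixed capacity. A brief sketch; `maxlen` is a real constructor argument, while the variable names are illustrative.

```python
from collections import deque

q = deque([1, 2, 3])
q.appendleft(0)  # O(1), unlike list.insert(0, ...)
print(q)

# With maxlen, appending to a full deque discards items from the opposite end.
last_three = deque(maxlen=3)
for i in range(10):
    last_three.append(i)
print(last_three)
```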

In [None]:
from collections import OrderedDict # remembers the order in which keys were inserted (a plain dict does too since Python 3.7)

data = [(1, 'a'), (3, 'c'), (2, 'b')]

print(dict(data))
print(OrderedDict(data))
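A sketch of where `OrderedDict` still differs from a plain `dict` on Python 3.7+: its equality check is order-sensitive, and it can move existing keys around. The `reordered` list is illustrative sample data.

```python
from collections import OrderedDict

data = [(1, 'a'), (3, 'c'), (2, 'b')]
reordered = [(1, 'a'), (2, 'b'), (3, 'c')]

# Plain dicts ignore order when compared; OrderedDicts do not.
print(dict(data) == dict(reordered))
print(OrderedDict(data) == OrderedDict(reordered))

# OrderedDict can move an existing key to the end.
od = OrderedDict(data)
od.move_to_end(1)
print(list(od))
```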