# Lecture 4 Sets & Dictionaries 

### Learning Objectives:
* Understand the characteristics and properties of sets and dictionaries.
* Effectively perform common operations on sets and dictionaries.
* Comprehend how hash tables are used to implement sets and dictionaries.
* Understand why lookup, insertion, and deletion in sets and dictionaries take O(1) time on average.


### Commonalities of sets & dictionaries 

* No positional indexing
    * Sets: unordered collection of elements
    * Dicts: collection of key-value pairs accessed by key, not by position (insertion order is preserved since Python 3.7)
* Unique elements (no duplicates)
    * Sets: unique elements
    * Dicts: unique keys
* Hash-based
    * Sets and dict keys use hashing to allow **O(1)** average
        * lookup
        * insertion
        * deletion
* Heterogeneous 
* Mutable
* Iterable
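
The shared properties above can be checked directly. This is a minimal sketch with made-up variable names (`letters`, `ages`); the dict ordering it prints assumes Python 3.7 or later.

```python
letters = {'b', 'a', 'b', 'c'}     # duplicate 'b' collapses into one element
ages = {'Bob': 30, 'Alice': 25}
ages['Bob'] = 31                   # existing key: value is overwritten, no second entry

print(len(letters))                # 3
print(len(ages))                   # 2

# Neither type supports positional indexing
try:
    letters[0]
except TypeError as err:
    print('set:', err)

try:
    ages[0]                        # 0 is treated as a key, not a position
except KeyError as err:
    print('dict: no key', err)

# Dicts keep insertion order (Python 3.7+); sets make no ordering promise
print(list(ages))                  # ['Bob', 'Alice']
```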

## Sets

In [None]:
# {element1, element2, ...} => sets
s = {1, 2, 3}
s

In [None]:
type(s)

In [None]:
# careful: {} creates an empty dict, not an empty set
type({})

In [None]:
# empty set
s = set()

In [None]:
type(s)

In [None]:
# set([iterable])
set([1, 2, 3])

In [None]:
set((1, 2, 3))

In [None]:
set('abc')

In [None]:
# Unordered/unindexed/not subscriptable
s = {1, 2, 3}
s[0]

In [None]:
# Unique elements (no duplicates)
{1, 1, 1, 2, 2, 3}

In [None]:
# heterogeneous
{1, 0.5, 'abc', (1, 2, 3)}

Sets can only contain **hashable** objects
* a hashable object has a `__hash__()` method that maps its value to an integer
    * the built-in `hash()` function calls this method
* this hash value must stay **consistent** during the object's lifetime
    * which is why the built-in hashable types are **immutable**
* examples:
    * `int`
    * `float`
    * `str`
    * `tuple` if all elements in the tuple are hashable
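
A small sketch of what hashability means in practice, using only built-ins. It illustrates two points: `hash()` delegates to the object's `__hash__()` method, and values that compare equal must hash equally.

```python
# hash() calls the object's __hash__() method
print(hash('abc') == 'abc'.__hash__())   # True

# Objects that compare equal must have equal hashes
print(1 == 1.0, hash(1) == hash(1.0))    # True True

# ...so a set treats them as the same element
numbers = {1, 1.0}
print(len(numbers))                      # 1

# A tuple is hashable only if every element is hashable
for candidate in [(1, 2), (1, [2])]:
    try:
        hash(candidate)
        print(candidate, 'is hashable')
    except TypeError:
        print(candidate, 'is not hashable')
```

String hashes are randomized per interpreter run by default, so the value of `hash('abc')` is only stable within one Python process.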

In [None]:
hash(1)

In [None]:
hash(500)

In [None]:
hash(0.1)

In [None]:
hash(0.5)

In [None]:
hash('a')

In [None]:
hash('abc')

In [None]:
hash('hello, world')

In [None]:
hash((1, 0.5, 'abc'))

In [None]:
# lists are mutable, so unhashable!
hash([1, 2, 3])

In [None]:
# sets themselves are mutable, so unhashable
hash({1, 2, 3})

In [None]:
# all elements in the tuple need to be hashable to make the tuple hashable
hash((1, 2, [1, 2]))

Hashing allows **O(1)** average complexity for
* lookup
* insertion
* deletion

We'll cover the reasons in the slides later.
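
As a preview of the idea, here is a toy hash set built on a list of buckets. It is a simplified sketch, not how CPython actually implements `set`: the names `ToyHashSet` and `n_buckets` are made up for illustration, and the table never resizes.

```python
class ToyHashSet:
    """Toy hash set using separate chaining (illustrative only)."""

    def __init__(self, n_buckets=8):
        # One list ("bucket") per slot; items with the same slot share a bucket
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, item):
        # The hash value picks the bucket directly, with no scan of other buckets
        return self.buckets[hash(item) % len(self.buckets)]

    def add(self, item):
        bucket = self._bucket(item)
        if item not in bucket:        # keep elements unique
            bucket.append(item)

    def __contains__(self, item):
        return item in self._bucket(item)

    def discard(self, item):
        bucket = self._bucket(item)
        if item in bucket:
            bucket.remove(item)


toy = ToyHashSet()
for ch in 'abc':
    toy.add(ch)

print('a' in toy, 'd' in toy)   # True False
toy.discard('a')
print('a' in toy)               # False
```

As long as each bucket holds only a few items, every operation inspects one short bucket no matter how many items are stored, which is where the O(1) average comes from. Real implementations also grow the table as it fills so that buckets stay short.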

In [None]:
s = set('abc')
s

In [None]:
# lookup
# O(1)
'a' in s

In [None]:
'd' in s

In [None]:
'd' not in s
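
To see the effect of hashing on lookup, the following rough sketch times the same membership test on a list and on a set. The size `n` and the repetition count are arbitrary choices, and the absolute numbers depend on the machine, so no expected output is shown.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1   # last list element: worst case for a linear scan

# Run each membership test 100 times and report the total time
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f'list: {list_time:.6f} s')
print(f'set:  {set_time:.6f} s')
```

The list scan is O(n), so its time grows with `n`; the set lookup should stay roughly flat as `n` grows.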

In [None]:
# O(1)
len(s)

In [None]:
# insertion
# O(1)
s.add('d')
s

In [None]:
# unique elements 
s.add('d')
s

In [None]:
# deletion 
s.remove('b')
s

In [None]:
# remove() raises KeyError if the element is not in the set
# (set elements are sometimes called keys)
s.remove('b')

In [None]:
s.discard('a')
s

In [None]:
# discard() won't raise KeyError
s.discard('a')

In [None]:
# iterable
for i in s: 
    print(i)
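
Because a set is unordered, the order in which a loop visits its elements should not be relied on. A minimal sketch, with a made-up set `fruits`, of getting a predictable order with `sorted()`:

```python
fruits = {'pear', 'apple', 'fig'}

# sorted() returns a new list; the set itself is unchanged
for fruit in sorted(fruits):
    print(fruit)

ordered = sorted(fruits)
print(ordered)   # ['apple', 'fig', 'pear']
```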

Common set operations
* union
* intersection
* difference

In [None]:
s1 = set(range(5))
s2 = set(range(3, 8))
s1, s2

In [None]:
# union combines all unique elements from both sets
s1.union(s2)

In [None]:
# same as s1.union(s2)
s1 | s2

In [None]:
# intersection returns only the elements common to both sets
s1.intersection(s2)

In [None]:
# same as s1.intersection(s2)
s1 & s2

In [None]:
# s1.difference(s2) returns elements that are in s1 but not in s2
s1.difference(s2)

In [None]:
s2.difference(s1)

In [None]:
# same as s1.difference(s2)
s1 - s2

more [set operations](https://docs.python.org/3/library/stdtypes.html#set)

In [None]:
# s1.symmetric_difference(s2) returns elements in either s1 or s2 but not both
s1.symmetric_difference(s2)

In [None]:
# same as s1.symmetric_difference(s2)
s1 ^ s2

In [None]:
s1 = set(range(5))
s2 = set(range(3))
s1, s2

In [None]:
# whether every element in s2 is in s1
s1.issuperset(s2)

In [None]:
# same as s1.issuperset(s2)
s1 >= s2

In [None]:
s2.issuperset(s1)

In [None]:
# whether every element in s2 is in s1
s2.issubset(s1)

In [None]:
# same as s2.issubset(s1)
s2 <= s1

In [None]:
s1.issubset(s2)
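
The operations above combine naturally when comparing two collections. A short sketch on hypothetical attendance lists (the names and data are invented for illustration):

```python
# Hypothetical attendance lists for two class meetings
monday = ['Alice', 'Bob', 'Carol', 'Bob']   # 'Bob' was recorded twice
wednesday = ['Carol', 'Dan', 'Alice']

mon, wed = set(monday), set(wednesday)      # converting removes duplicates

both = mon & wed           # attended both meetings
either = mon | wed         # attended at least one meeting
only_monday = mon - wed    # attended Monday only

print(sorted(both))          # ['Alice', 'Carol']
print(sorted(either))        # ['Alice', 'Bob', 'Carol', 'Dan']
print(sorted(only_monday))   # ['Bob']
print(both <= mon)           # True: an intersection is a subset of each operand
```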

## Dictionaries 

In [None]:
# key value pairs 
# literal {key1: value1, key2: value2, ...}
phonebook = {'Alice': '555-1234', 'Bob': '555-5678'}
phonebook

In [None]:
type(phonebook)

In [None]:
# empty dict 
s = {}
s

In [None]:
type(s)

In [None]:
# create dict using a list of key-value tuples
dict([('Alice', '555-1234'), ('Bob', '555-5678')])

In [None]:
# create dict using zip([keys], [values])
dict(zip(['Alice', 'Bob'], ['555-1234', '555-5678']))

In [None]:
# Unique keys (no duplicates)
# if a key is repeated, the last value wins
{'Alice': '555-1234', 'Alice': '555-5678'}

In [None]:
# different keys can share the same value
{'Alice': '555-1234', 'Bob': '555-1234'}

In [None]:
# O(1)
len(phonebook)

In [None]:
# heterogeneous
person = {
    'name': 'Alice',
    'phone': '555-1234',
    1: 'one',
    (2, 3): 1
}
person

All keys need to be **hashable**, like set elements

In [None]:
{[1, 2, 3]: 1}
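
A common workaround, sketched below with made-up names, is to convert the list to a tuple before using it as a key. This works only when every element of the list is itself hashable.

```python
point = [1, 2, 3]

try:
    lookup = {point: 'label'}        # a list cannot be a key
except TypeError as err:
    print('TypeError:', err)

lookup = {tuple(point): 'label'}     # a tuple of ints is hashable
print(lookup[(1, 2, 3)])             # label
```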

Similarly, dicts have **O(1)** average complexity for
* lookup
* insertion
* deletion
* update

In [None]:
# lookup
'Alice' in phonebook

In [None]:
'Charles' in phonebook

In [None]:
'Charles' not in phonebook

In [None]:
# accessing values 
phonebook['Alice']

In [None]:
# raises KeyError if the key is not in the dict
phonebook['Charles']

In [None]:
phonebook.get('Alice')

In [None]:
# get(key, default) returns the default when the key is missing
phonebook.get('Charles', 'Name not found')

In [None]:
# update
phonebook['Alice'] = '555-0000'
phonebook

In [None]:
# insertion
phonebook['David'] = '555-5555'
phonebook

In [None]:
# batch update with update() method
phonebook.update({
    'Alice': '777-1234', 
    'Bob': '222-5678',
    'Elaine': '999-9876'
})
phonebook

In [None]:
# deletion
del phonebook['David']
phonebook

In [None]:
# raises KeyError: 'David' has already been deleted
del phonebook['David']

In [None]:
# it is better to check before deleting to avoid KeyError
if 'David' in phonebook:
    del phonebook['David']
phonebook
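
Besides checking membership first, the built-in `dict.pop()` method accepts a default and then does not raise `KeyError` for a missing key. A minimal sketch on a separate, made-up dict `phonebook_demo`:

```python
phonebook_demo = {'Alice': '555-1234', 'David': '555-5555'}

# pop(key, default) removes the key if present and returns its value;
# if the key is missing, it returns the default instead of raising KeyError
print(phonebook_demo.pop('David', None))   # 555-5555
print(phonebook_demo.pop('David', None))   # None
print(phonebook_demo)                      # {'Alice': '555-1234'}
```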

In [None]:
phonebook.keys()

In [None]:
phonebook.values()

In [None]:
phonebook.items()

In [None]:
for key in phonebook: # same as for key in phonebook.keys()
    print(key)

In [None]:
for value in phonebook.values():
    print(value)

In [None]:
# each item is a (key, value) tuple, unpacked here into two variables
for key, value in phonebook.items():
    print(key, value)

In [None]:
# dicts can be nested
users = {
    'Alice': {'age': 25, 'phone': '555-1234'}, 
    'Bob': {'age': 30, 'phone': '777-5678'}
}

In [None]:
users['Alice']

In [None]:
users['Alice']['age']

In [None]:
users['Alice']['age'] = 27
users

In [None]:
users['Alice']['address'] = '1000 x st, Madison, WI'
users
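
Chained indexing on a nested dict raises `KeyError` as soon as one level is missing. One way to read nested values safely, sketched here on a separate copy of the data (`users_demo`, with a made-up missing user `'Carol'`), is to chain `get()` calls with an empty dict as the default:

```python
users_demo = {
    'Alice': {'age': 25, 'phone': '555-1234'},
    'Bob': {'age': 30, 'phone': '777-5678'},
}

try:
    users_demo['Carol']['age']
except KeyError as err:
    print('missing key:', err)                  # missing key: 'Carol'

# The {} default lets the second get() run even when the user is absent
print(users_demo.get('Carol', {}).get('age'))   # None
print(users_demo.get('Alice', {}).get('age'))   # 25
```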

One use case: counting occurrences

In [None]:
nums = [6, 2, 9, 6, 6, 9, 8, 3, 7, 7, 5, 8, 4, 5, 3, 5, 4, 5, 7, 4, 5, 6, 8, 8, 3, 9, 2, 1, 4, 4]
counts = {}
for n in nums:
    counts[n] = counts.get(n, 0) + 1   # start at 0 the first time n is seen
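
The standard library's `collections.Counter` is a `dict` subclass built for this use case. Below is a short sketch on a smaller, made-up list (`sample`), compared against a manual loop that uses `get()`:

```python
from collections import Counter

sample = [6, 2, 9, 6, 6, 9]   # made-up data for illustration

# Manual counting: one O(1) average dict update per element
manual = {}
for n in sample:
    manual[n] = manual.get(n, 0) + 1

counted = Counter(sample)

print(manual)                    # {6: 3, 2: 1, 9: 2}
print(dict(counted) == manual)   # True
print(counted.most_common(1))    # [(6, 3)]
```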