# Lecture 4 Sets & Dictionaries 

### Learning Objectives:
* Understand the characteristics and properties of sets and dictionaries.
* Effectively perform common operations on sets and dictionaries.
* Comprehend how hash tables are used to implement sets and dictionaries.
* Understand why lookup, insertion, and deletion in sets and dictionaries have O(1) average time complexity.


### Commonalities of sets & dictionaries 

* Not indexable by position
    * Sets: unordered collection of elements
    * Dicts: collection of key-value pairs accessed by key, not by position (insertion order is preserved since Python 3.7)
* Unique elements (no duplicates)
    * Sets: unique elements
    * Dicts: unique keys
* Hash-based
    * Sets and dict keys use hashing to allow **O(1)** average
        * lookup
        * insertion
        * deletion
* Heterogeneous 
* Mutable
* Iterable

## Sets

In [1]:
# {element1, element2, ...} => sets
s = {1, 2, 3}
s

{1, 2, 3}

In [2]:
type(s)

set

In [3]:
type({})

dict

In [4]:
# empty set
s = set()

In [5]:
type(s)

set

In [6]:
# set([iterable])
set([1, 2, 3])

{1, 2, 3}

In [7]:
set((1, 2, 3))

{1, 2, 3}

In [8]:
set('abc')

{'a', 'b', 'c'}

In [9]:
# Unordered/unindexed/not subscriptable
s = {1, 2, 3}
s[0]

TypeError: 'set' object is not subscriptable

In [10]:
# Unique elements (no duplicates)
{1, 1, 1, 2, 2, 3}

{1, 2, 3}

In [11]:
# heterogeneous
{1, 0.5, 'abc', (1, 2, 3)}

{(1, 2, 3), 0.5, 1, 'abc'}

Sets can only contain **hashable** objects. An object is hashable if
* it has a function that converts the object's value to an integer
    * implemented in the `__hash__()` method, which the built-in `hash()` calls
* its hash value remains **consistent** during its lifetime
    * in practice, this means **immutable** objects
* examples:
    * `int`
    * `float`
    * `str`
    * `tuple` if all elements in the tuple are hashable
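
To make the idea of a hash function concrete, here is a minimal sketch of a user-defined hashable type. The `Point` class is hypothetical and exists only for illustration; it assumes its coordinates are never changed after the object is created, since changing them would change the hash value.

```python
class Point:
    """A hypothetical 2-D point, used only to illustrate hashing."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Two points are equal when their coordinates match.
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Reuse the tuple hash so that equal points get equal hash values.
        return hash((self.x, self.y))


p1 = Point(1, 2)
p2 = Point(1, 2)

same_hash = hash(p1) == hash(p2)   # equal objects must hash equally
unique_points = {p1, p2}           # the set keeps only one of the two
```

Because `p1 == p2` and both hash to the same integer, the set treats them as one element.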

In [12]:
hash(1)

1

In [13]:
hash(500)

500

In [14]:
hash(0.1)

230584300921369408

In [15]:
hash(0.5)

1152921504606846976

In [16]:
# str hashes are randomized per Python process, so your values will differ
hash('a')

-2853043819228629119

In [17]:
hash('abc')

-2188954152372417382

In [18]:
hash('hello, world')

-5259906774421471684

In [19]:
hash((1, 0.5, 'abc'))

-1314201354667449936

In [20]:
# lists are mutable, so unhashable!
hash([1, 2, 3])

TypeError: unhashable type: 'list'

In [21]:
# sets themselves are mutable, so unhashable
hash({1, 2, 3})

TypeError: unhashable type: 'set'

In [22]:
# all elements in the tuple need to be hashable to make the tuple hashable
hash((1, 2, [1, 2]))

TypeError: unhashable type: 'list'
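
Since regular sets are unhashable, they cannot be elements of another set. One option is the built-in `frozenset`, an immutable and therefore hashable version of a set. The variable names below are chosen only for illustration.

```python
# frozenset is immutable, so it is hashable
inner = frozenset({1, 2, 3})

# a set of frozensets is allowed
nested = {inner, frozenset({4, 5})}

# equality (and hashing) ignores element order
contains_inner = frozenset([3, 2, 1]) in nested

# a frozenset can also serve as a dict key
by_group = {inner: 'small numbers'}
```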

Hashing allows **O(1)** average complexity for
* lookup
* insertion
* deletion

We'll cover the reasons in the slides later.
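
As a rough preview of the idea, the sketch below builds a toy hash set. `ToyHashSet` and `num_buckets` are hypothetical names, and this is a simplification: it uses a fixed number of buckets with a short list in each one, whereas CPython's real sets use a different collision strategy (open addressing) and resize the table as it grows.

```python
class ToyHashSet:
    """A simplified hash set with one short list per bucket (illustration only)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, item):
        # hash() gives an integer; the modulo maps it to a bucket index.
        return self.buckets[hash(item) % len(self.buckets)]

    def add(self, item):
        bucket = self._bucket(item)
        if item not in bucket:       # only this one bucket is scanned
            bucket.append(item)

    def __contains__(self, item):
        return item in self._bucket(item)

    def discard(self, item):
        bucket = self._bucket(item)
        if item in bucket:
            bucket.remove(item)


toy = ToyHashSet()
for letter in ['a', 'b', 'c', 'a']:
    toy.add(letter)

found = 'a' in toy       # jumps straight to one bucket
not_found = 'z' in toy
```

Each operation computes one hash and inspects one bucket. As long as the buckets stay short, the cost does not grow with the number of stored elements, which is where the O(1) average comes from.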

In [23]:
s = set('abc')
s

{'a', 'b', 'c'}

In [24]:
# lookup
# O(1)
'a' in s

True

In [25]:
'd' in s

False

In [26]:
'd' not in s

True

In [27]:
# O(1)
len(s)

3

In [28]:
# insertion
# O(1)
s.add('d')
s

{'a', 'b', 'c', 'd'}

In [29]:
# unique elements 
s.add('d')
s

{'a', 'b', 'c', 'd'}

In [30]:
# deletion 
s.remove('b')
s

{'a', 'c', 'd'}

In [31]:
# remove() raises KeyError if the element is not in the set
# (set elements can also be called keys)
s.remove('b')

KeyError: 'b'

In [32]:
s.discard('a')
s

{'c', 'd'}

In [33]:
# discard() won't raise KeyError
s.discard('a')

In [34]:
# iterable
for i in s: 
    print(i)

c
d
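
One place where fast membership tests pay off is removing duplicates while keeping the original order. The helper `unique_in_order` below is a hypothetical name, and this is only a sketch of the approach.

```python
def unique_in_order(items):
    """Return the distinct items in order of first appearance."""
    seen = set()       # O(1) average membership checks
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


letters = unique_in_order('mississippi')
```

Calling `set('mississippi')` would also drop the duplicates, but it would not keep the letters in their original order.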


Common set operations
* union
* intersection
* difference

In [35]:
s1 = set(range(5))
s2 = set(range(3, 8))
s1, s2

({0, 1, 2, 3, 4}, {3, 4, 5, 6, 7})

In [36]:
# union combines all unique elements from both sets
s1.union(s2)

{0, 1, 2, 3, 4, 5, 6, 7}

In [37]:
# same as s1.union(s2)
s1 | s2

{0, 1, 2, 3, 4, 5, 6, 7}

In [38]:
# intersection returns only the elements common to both sets
s1.intersection(s2)

{3, 4}

In [39]:
# same as s1.intersection(s2)
s1 & s2

{3, 4}

In [40]:
# s1.difference(s2) returns elements that are in s1 but not in s2
s1.difference(s2)

{0, 1, 2}

In [41]:
s2.difference(s1)

{5, 6, 7}

In [42]:
# same as s1.difference(s2)
s1 - s2

{0, 1, 2}

More [set operations](https://docs.python.org/3/library/stdtypes.html#set)

In [43]:
# s1.symmetric_difference(s2) returns elements in either s1 or s2 but not both
s1.symmetric_difference(s2)

{0, 1, 2, 5, 6, 7}

In [44]:
# same as s1.symmetric_difference(s2)
s1 ^ s2

{0, 1, 2, 5, 6, 7}

In [45]:
s1 = set(range(5))
s2 = set(range(3))
s1, s2

({0, 1, 2, 3, 4}, {0, 1, 2})

In [46]:
# whether every element in s2 is in s1
s1.issuperset(s2)

True

In [47]:
# same as s1.issuperset(s2)
s1 >= s2

True

In [48]:
s2.issuperset(s1)

False

In [49]:
# whether every element in s2 is in s1
s2.issubset(s1)

True

In [50]:
# same as s2.issubset(s1)
s2 <= s1

True

In [51]:
s1.issubset(s2)

False
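
The method and operator forms above are not fully interchangeable. The sketch below, with variable names chosen for illustration, shows that the methods accept any iterable while the operators require sets, and that each operator has an in-place variant.

```python
evens = {0, 2, 4}

# The method form accepts any iterable, such as a list or a range.
merged = evens.union([1, 3], range(5, 7))

# The operator form requires both operands to be sets.
try:
    evens | [1, 3]
    operator_failed = False
except TypeError:
    operator_failed = True

# In-place variants modify the set instead of building a new one.
evens |= {6}       # same as evens.update({6})
```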

## Dictionaries 

In [52]:
# key value pairs 
# literal {key1: value1, key2: value2, ...}
phonebook = {'Alice': '555-1234', 'Bob': '555-5678'}
phonebook

{'Alice': '555-1234', 'Bob': '555-5678'}

In [53]:
type(phonebook)

dict

In [54]:
# empty dict 
s = {}
s

{}

In [55]:
type(s)

dict

In [56]:
# create dict using a list of key-value tuples
dict([('Alice', '555-1234'), ('Bob', '555-5678')])

{'Alice': '555-1234', 'Bob': '555-5678'}

In [57]:
# create dict using zip([keys], [values])
dict(zip(['Alice', 'Bob'], ['555-1234', '555-5678']))

{'Alice': '555-1234', 'Bob': '555-5678'}

In [58]:
# Unique keys (no duplicates)
# if a key is repeated, the last value wins
{'Alice': '555-1234', 'Alice': '555-5678'}

{'Alice': '555-5678'}

In [59]:
# different keys can share the same value
{'Alice': '555-1234', 'Bob': '555-1234'}

{'Alice': '555-1234', 'Bob': '555-1234'}

In [60]:
# O(1)
len(phonebook)

2

In [61]:
# heterogeneous
person = {
    'name': 'Alice',
    'phone': '555-1234',
    1: 'one',
    (2, 3): 1
}
person

{'name': 'Alice', 'phone': '555-1234', 1: 'one', (2, 3): 1}

All keys need to be **hashable**, like set elements

In [62]:
{[1, 2, 3]: 1}

TypeError: unhashable type: 'list'
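
When the natural key is a list, a common workaround is to convert it to a tuple first. The `grid` and `position` names below are made up for this sketch.

```python
# tuples of hashable values work as keys
grid = {(0, 0): 'start', (2, 3): 'goal'}

position = [2, 3]                 # a list cannot be used as a key directly
label = grid[tuple(position)]     # convert it to a tuple for the lookup
```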

Similarly, dicts have **O(1)** average complexity for
* lookup
* insertion
* deletion
* update

In [63]:
# lookup
'Alice' in phonebook

True

In [64]:
'Charles' in phonebook

False

In [65]:
'Charles' not in phonebook

True

In [66]:
# accessing values 
phonebook['Alice']

'555-1234'

In [67]:
# raises KeyError if the key is not in the dict
phonebook['Charles']

KeyError: 'Charles'

In [68]:
phonebook.get('Alice')

'555-1234'

In [69]:
# get(key, default) returns default when the key is missing (the dict is not modified)
phonebook.get('Charles', 'Name not found')

'Name not found'

In [70]:
# update
phonebook['Alice'] = '555-0000'
phonebook

{'Alice': '555-0000', 'Bob': '555-5678'}

In [71]:
# insertion
phonebook['David'] = '555-5555'
phonebook

{'Alice': '555-0000', 'Bob': '555-5678', 'David': '555-5555'}

In [72]:
# batch update with update() method
phonebook.update({
    'Alice': '777-1234', 
    'Bob': '222-5678',
    'Elaine': '999-9876'
})
phonebook

{'Alice': '777-1234',
 'Bob': '222-5678',
 'David': '555-5555',
 'Elaine': '999-9876'}

In [73]:
# deletion
del phonebook['David']
phonebook

{'Alice': '777-1234', 'Bob': '222-5678', 'Elaine': '999-9876'}

In [74]:
del phonebook['David']

KeyError: 'David'

In [75]:
# it is better to check before deleting to avoid a KeyError
if 'David' in phonebook:
    del phonebook['David']
phonebook

{'Alice': '777-1234', 'Bob': '222-5678', 'Elaine': '999-9876'}
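
An alternative to checking first is `dict.pop(key, default)`, which removes the key if present and returns the default otherwise. The `contacts` dict below is a separate example so the `phonebook` above stays unchanged.

```python
contacts = {'Alice': '777-1234', 'Bob': '222-5678'}

# pop() removes the key and returns its value
removed = contacts.pop('Bob')

# with a default, pop() does not raise KeyError for a missing key
missing = contacts.pop('David', None)
```

Without the second argument, `pop()` raises `KeyError` for a missing key, just like `del`.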

In [76]:
phonebook.keys()

dict_keys(['Alice', 'Bob', 'Elaine'])

In [77]:
phonebook.values()

dict_values(['777-1234', '222-5678', '999-9876'])

In [78]:
phonebook.items()

dict_items([('Alice', '777-1234'), ('Bob', '222-5678'), ('Elaine', '999-9876')])

In [79]:
for key in phonebook: # same as for key in phonebook.keys()
    print(key)

Alice
Bob
Elaine


In [80]:
for value in phonebook.values():
    print(value)

777-1234
222-5678
999-9876


In [81]:
# each item is a (key, value) tuple, unpacked here into two variables
for key, value in phonebook.items():
    print(key, value)

Alice 777-1234
Bob 222-5678
Elaine 999-9876


In [82]:
# can be nested
users = {
    'Alice': {'age': 25, 'phone': '555-1234'}, 
    'Bob': {'age': 30, 'phone': '777-5678'}
}

In [83]:
users['Alice']

{'age': 25, 'phone': '555-1234'}

In [84]:
users['Alice']['age']

25

In [85]:
users['Alice']['age'] = 27
users

{'Alice': {'age': 27, 'phone': '555-1234'},
 'Bob': {'age': 30, 'phone': '777-5678'}}

In [86]:
users['Alice']['address'] = '1000 x st, Madison, WI'
users

{'Alice': {'age': 27,
  'phone': '555-1234',
  'address': '1000 x st, Madison, WI'},
 'Bob': {'age': 30, 'phone': '777-5678'}}
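
Nested dicts often have inner keys that exist for some entries but not others, as with `'address'` above. A small sketch of reading such data safely with `get()`, using a separate hypothetical `people` dict:

```python
people = {
    'Alice': {'age': 27, 'phone': '555-1234'},
    'Bob': {'age': 30},                      # no phone recorded
}

# collect every phone number, with a fallback for missing ones
phones = {}
for name, info in people.items():
    phones[name] = info.get('phone', 'unknown')

# chaining get() with an empty-dict default avoids KeyError
# when the outer key is missing as well
carol_age = people.get('Carol', {}).get('age')
```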

One use case: counting occurrences

In [87]:
nums = [6, 2, 9, 6, 6, 9, 8, 3, 7, 7, 5, 8, 4, 5, 3, 5, 4, 5, 7, 4, 5, 6, 8, 8, 3, 9, 2, 1, 4, 4]
counts = {}
for i in nums: 
    if i in counts:
        counts[i] += 1
    else: 
        counts[i] = 1
counts

{6: 4, 2: 2, 9: 3, 8: 4, 3: 3, 7: 3, 5: 5, 4: 5, 1: 1}
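
The same counting pattern can be written more compactly. The sketch below shows two options on a shorter, made-up list: `get()` with a default of 0, and `collections.Counter` from the standard library.

```python
from collections import Counter

sample = [6, 2, 9, 6, 6, 9]

# get(n, 0) returns 0 the first time a number is seen
tally = {}
for n in sample:
    tally[n] = tally.get(n, 0) + 1

# Counter does the counting in one call
counter = Counter(sample)
top = counter.most_common(1)   # list of (value, count) pairs, highest count first
```

`Counter` is a `dict` subclass, so lookups such as `counter[6]` work the same way as on a regular dict.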