# Lesson 6: Sets and Dictionaries
**Teaching**: 15min<br>
**Exercises**: 5min

## Use a set to store unique values
* Create with `{...}`
* But must use `set()` to create an empty set, because `{}` creates an empty dictionary

In [1]:
primes = {2, 3, 5, 7}
print('is 3 prime?', 3 in primes)
print('is 9 prime?', 9 in primes)

is 3 prime? True
is 9 prime? False
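
The empty-set rule is easy to trip over, so here is a minimal sketch of it. The names `empty_set`, `empty_dict`, and `letters` are made up for illustration. The last lines also show that building a set from a list keeps only one copy of each value.

```python
# `{}` is an empty dictionary, so an empty set needs set().
empty_set = set()
empty_dict = {}
print(type(empty_set).__name__)
print(type(empty_dict).__name__)

# Converting a list to a set keeps one copy of each value.
letters = set(['a', 'b', 'a', 'c', 'b'])
print(len(letters))
```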


* Sets support intersection (`&`), union (`|`), etc.

In [2]:
odds = {3, 5, 7, 9}
print('intersection', odds & primes)
print('union', odds | primes)

intersection {3, 5, 7}
union {2, 3, 5, 7, 9}
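
The "etc." covers a few more operators. As a brief sketch (the two sets are redefined so the block runs on its own, and `small_primes` is a made-up name), difference, symmetric difference, and subset tests might look like this:

```python
primes = {2, 3, 5, 7}
odds = {3, 5, 7, 9}

# Difference: values in primes that are not in odds.
only_primes = primes - odds
# Symmetric difference: values in exactly one of the two sets.
in_one_only = primes ^ odds
# Subset test: is every value of small_primes also in primes?
small_primes = {2, 3}
is_subset = small_primes <= primes

print('difference', only_primes)
print('symmetric difference', in_one_only)
print('subset?', is_subset)
```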


## Sets are mutable
* But only store *unique* values

In [3]:
primes.add(11)
print('primes becomes', primes)
primes.discard(7)
print('after removal', primes)
primes.add(11)
print('after adding 11 again', primes)

primes becomes {2, 3, 5, 7, 11}
after removal {2, 3, 5, 11}
after adding 11 again {2, 3, 5, 11}
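
The cell above uses `discard`, which does nothing if the value is absent. Sets also have a `remove` method, which raises `KeyError` in that case. A small sketch of the difference, with `outcome` as a made-up variable name:

```python
primes = {2, 3, 5, 11}

# discard() is silent when the value is absent.
primes.discard(7)

# remove() raises KeyError when the value is absent.
try:
    primes.remove(7)
    outcome = 'removed'
except KeyError:
    outcome = 'not present'

print(outcome)
print(primes)
```

Which one to use depends on whether a missing value should count as an error in your program.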


## Sets are unordered
* Values are stored by *hashing*, so the order in which a loop visits them is arbitrary and, for strings, can differ from one run of Python to the next

In [4]:
names = {'Hopper', 'Cori', 'Kohn'}
for n in names:
    print(n)

Cori
Kohn
Hopper
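
When a predictable order matters, one option is to sort the values before looping. This is only a sketch of that idea; `ordered` is a made-up name.

```python
names = {'Hopper', 'Cori', 'Kohn'}

# sorted() returns a new list; the set itself stays unordered.
ordered = sorted(names)
for n in ordered:
    print(n)
```

This prints the names alphabetically (Cori, Hopper, Kohn) on every run.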


## Use a dictionary to store key/value pairs
Equivalently, store extra information with elements of a set.

In [5]:
birthdays = {'Hopper': 1906, 'Cori': 1896}
print(birthdays['Hopper'])
birthdays['Kohn'] = 1823 # oops
birthdays['Kohn'] = 1923 # that's better
print(birthdays)

1906
{'Hopper': 1906, 'Cori': 1896, 'Kohn': 1923}
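
Looking up a key that is not in the dictionary with `[...]` raises `KeyError`. The sketch below shows two common ways to avoid that; the missing key `'Turing'` and the default `'unknown'` are made up for illustration.

```python
birthdays = {'Hopper': 1906, 'Cori': 1896, 'Kohn': 1923}

# `in` tests keys, not values.
has_cori = 'Cori' in birthdays
has_1906 = 1906 in birthdays

# get() returns a default instead of raising KeyError.
turing = birthdays.get('Turing', 'unknown')

print(has_cori, has_1906, turing)
```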


* Dictionaries keep keys in the order they were inserted (guaranteed since Python 3.7)
* Like sets, dictionaries locate keys by hashing, so you look values up by key, not by position

## Set values and dictionary keys must be immutable
* Changing a value after insertion would change its hash, leaving the data stored in the wrong place
* Use a `tuple` for multi-valued keys
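
To see the rule in action, the sketch below tries a list as a key and then a tuple with the same contents. The flag `list_key_ok` is a made-up name, and the exact wording of the error message varies between Python versions, so it is not shown.

```python
# A list is mutable, so Python refuses to use it as a key.
try:
    people = {['Grace', 'Hopper']: 1906}
    list_key_ok = True
except TypeError:
    list_key_ok = False

# A tuple with the same contents is fine.
people = {('Grace', 'Hopper'): 1906}

print(list_key_ok)
print(people[('Grace', 'Hopper')])
```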

In [6]:
people = {('Grace', 'Hopper'): 1906, ('Gerty', 'Cori'): 1896, ('Walter', 'Kohn'): 1923}

You can *destructure* a tuple in the heading of a for loop:

In [7]:
for (first, last) in people:
    print(first, 'was born in', people[(first, last)])

Grace was born in 1906
Gerty was born in 1896
Walter was born in 1923
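
Destructuring is not specific to `for` loops: it is ordinary tuple unpacking, which also works in a plain assignment. A short sketch, where `name`, `pairs`, and `initials` are made-up names:

```python
name = ('Grace', 'Hopper')

# One assignment unpacks both parts of the tuple.
first, last = name
print(last + ',', first)

# The same pattern works in a loop over a list of tuples.
pairs = [('Gerty', 'Cori'), ('Walter', 'Kohn')]
initials = []
for (first, last) in pairs:
    initials.append(first[0] + last[0])
print(initials)
```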


## Example: create a histogram

In [8]:
numbers = [1, 0, 1, 2, 0, 0, 1, 2, 1, 3, 1, 0, 2]
count = {}
for n in numbers:
    if n not in count:
        count[n] = 1
    else:
        count[n] = count[n] + 1
print(count)

{1: 5, 0: 4, 2: 3, 3: 1}
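
The `if`/`else` in the loop can be folded into one line with the dictionary's `get` method, which returns a default when the key is missing. This is one possible variation on the cell above, not a replacement for it:

```python
numbers = [1, 0, 1, 2, 0, 0, 1, 2, 1, 3, 1, 0, 2]
count = {}
for n in numbers:
    # get() supplies 0 the first time a value is seen.
    count[n] = count.get(n, 0) + 1
print(count)
```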


Reminder: Python has many useful libraries, especially the "standard library" that comes with Python. Its `collections.Counter` class does this counting for you:

In [9]:
from collections import Counter

print(Counter(numbers))
print(dict(Counter(numbers)))

Counter({1: 5, 0: 4, 2: 3, 3: 1})
{1: 5, 0: 4, 2: 3, 3: 1}
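
A `Counter` offers a little more than a plain dictionary. As a sketch (the names `counts` and `top_two` are made up), `most_common` returns the most frequent values first, and looking up a value that was never seen gives zero rather than an error:

```python
from collections import Counter

numbers = [1, 0, 1, 2, 0, 0, 1, 2, 1, 3, 1, 0, 2]
counts = Counter(numbers)

# The two most frequent values, as (value, count) pairs.
top_two = counts.most_common(2)
print(top_two)

# A value that never appeared has a count of 0.
print(counts[42])
```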


## Keys are often strings

In [10]:
atomic_numbers = {'H': 1, 'He': 2, 'Li': 3, 'Be': 4, 'B': 5}
print('atomic number of lithium:', atomic_numbers['Li'])

atomic number of lithium: 3


In [11]:
# Load the workshop's larger table; this replaces the five-element dictionary above.
from mp_workshop.data import atomic_numbers

for element in ('H', 'C', 'O'):
    print('atomic number of', element, 'is', atomic_numbers[element])

atomic number of H is 1
atomic number of C is 6
atomic number of O is 8


You can iterate over the keys of a dictionary:

In [12]:
# Use a counter so we only print the first five keys.
n = 0
for element in atomic_numbers:
    if n < 5:
        print(element)
    n = n + 1

H
He
Li
Be
B


You can also iterate over (key, value) tuples of a dictionary using the `items` method:

In [13]:
n = 0
for (element, atomic_number) in atomic_numbers.items():
    if n < 5:
        print(element, atomic_number)
    n = n + 1

H 1
He 2
Li 3
Be 4
B 5
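
The manual counter still loops over every entry in the dictionary. One alternative is `itertools.islice` from the standard library, which stops after the first few items. In this sketch a small stand-in dictionary replaces the workshop's table so the block runs on its own, and `first_three` is a made-up name.

```python
from itertools import islice

# Stand-in for the workshop's table so this block runs on its own.
atomic_numbers = {'H': 1, 'He': 2, 'Li': 3, 'Be': 4, 'B': 5, 'C': 6, 'N': 7}

# Take only the first three (key, value) pairs.
first_three = list(islice(atomic_numbers.items(), 3))
for (element, atomic_number) in first_three:
    print(element, atomic_number)
```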


## Exercise: How heavy is this molecule?
You are given two things:

1. a dictionary mapping atomic symbols to atomic weights (`mp_workshop.data.atomic_weights`), and
2. a list of (atomic_symbol, count) pairs for a molecule.

```python
# Example molecules:
methane = [('C', 1), ('H', 4)]
aminothiazole = [('C', 3), ('H', 4), ('N', 2), ('S', 1)]
```

Print that molecule's molecular weight.


In [14]:
from mp_workshop.data import atomic_weights

# 1. Example molecules
# molecular weight is 16.0423
methane = [('C', 1), ('H', 4)]
# molecular weight is 100.1421
aminothiazole = [('C', 3), ('H', 4), ('N', 2), ('S', 1)]

# 2. Pick a molecule to test
molecule = methane

# 3. Do stuff to calculate `mol_weight`

# ...

#print(mol_weight)
