In [1]:
%load_ext autoreload
%autoreload 2

import sys, os
sys.path.insert(0, '../../src/')


<img src="https://raw.githubusercontent.com/abchapman93/delphi-python-2025-dev/refs/heads/main/media/DELPHI-long.png">
</br>

<h1 valign="center" align="center"><font size="+150">Introduction to Python</br>December 2025</font></h1>

In [2]:
from uu_delphi_python_dec25.helpers import *
from uu_delphi_python_dec25.quizzes.module1_quizzes import *

# Data Structures Part 1
In the previous notebook we learned about some of the essential data types in Python. We'll now dive deeper into Python's built-in **data structures**. A data structure is a type of object which helps us organize and access data. Each data structure is used for a slightly different purpose and has different methods associated with it.


The four data structures we'll look at in the next two notebooks are:
- I. `list`
- II. `tuple`
- III. `dict` (dictionary)
- IV. `set`

## Lists
The following are all lists in Python: 

In [3]:
x = [1, 2, 3]
y = ["a", "b", "c"]
z = [1, "b", x]

#### TODO
What are the data types of the three elements of `z` in the previous cell?

In [4]:
# RUN CELL TO SEE QUIZ
quiz_data_types_z

VBox(children=(HTML(value='What are the data types of the three elements of `z`?'), RadioButtons(layout=Layout…



Here are some things we might want to do with lists:
1. Access specific elements
2. Add or remove elements
3. Combine two different lists
4. Sorting and finding the min/max values

### 1. Accessing specific elements
Lists and the other data structures in this notebook are also called **containers** because they **contain** other objects. As such, one of the main purposes of a list is to store an object in it and access it later.

The main way to do this with lists is **indexing**. Each element in a list has a numeric index - that is, its *ordered* position in the list. To access an element, put square brackets after the name of the list with the numeric index inside them. For example, let's say we want to access the first element of `x`. We would do this as:

```python
x[0]
```

This might look a little funny to you - the *zeroth* index of x? The reason that we use `x[0]` instead of `x[1]` for the first element is that Python uses **zero indexing**, meaning that the positions start at 0 and end at `len(x) - 1`. This is different from R and many other statistical software packages. You can think of this as saying: **Give me the element of x which is `0` positions away from the beginning**. 
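
As a small illustration of zero indexing, the sketch below uses a made-up list called `colors` (it is not part of this notebook's exercises) and reads each index as "this many positions away from the beginning".

```python
# Hypothetical list used only for illustration
colors = ["red", "green", "blue", "purple"]

first = colors[0]                # 0 positions away from the beginning
third = colors[2]                # 2 positions away from the beginning
last = colors[len(colors) - 1]   # the final valid index is len(colors) - 1

print(first, third, last)
# red blue purple
```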

#### TODO
What code would give you the second element of `x`?

In [6]:
# RUN CELL TO SEE QUIZ
quiz_second_element_x

VBox(children=(HTML(value='What code would give you the second element of `x`?'), Textarea(value='', placehold…



#### TODO
What value would you get if you ran the following line of code?
```python
x[3]
```

In [5]:
# RUN CELL TO SEE QUIZ
quiz_x3

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('None', 'An error would be r…



To get the length of a list, we can use the built-in `len` function. 

In [8]:
len(x)

3

#### TODO
What is the largest value of `idx` we could use in the code `x[idx]`?

In [9]:
# RUN CODE TO SEE QUIZ
quiz_largest_idx_x

VBox(children=(HTML(value=''), Textarea(value='', placeholder='Type something'), Button(description='Submit', …



#### `k` steps back
When we pass in 0 or a positive index `k` for list `x`, we go `k` positions from the beginning. But if we pass in a negative number, we go backwards from the end of the list. So one way to get the last element of `x` is:

In [10]:
x[-1]

3

In [11]:
x

[1, 2, 3]

#### TODO
Which of the following lines of code would give you the second-to-last element of x?
- a) `x[-2]`
- b) `x[1]`
- c) `x[len(x)-2]`

In [12]:
# RUN CELL TO SEE QUIZ
quiz_second_to_last_x

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('a)', 'b)', 'c)', 'All of th…



Note that in our example lists, the last element of `z` is the list `x`. 

In [13]:
z

[1, 'b', [1, 2, 3]]

#### TODO
Which of the following lines of code would give the value `2`?
- a) `x[1]`
- b) `x[-2]`
- c) `z[3]`
- d) `z[-1][-2]`

In [14]:
# RUN CELL TO SEE QUIZ
quiz_values_of_2

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('a), b), and d)', 'a) and c)…



If we want a subset of a list (that is, a smaller list containing some of the elements of the larger list), we use the following notation:

```python
x[start:end]
```

This will give us a list containing `[x[start], x[start+1], ..., x[end-1]]`
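
To make the slice notation concrete, here is a minimal sketch with a made-up list `temps` (a hypothetical name, not one of the notebook's variables). It compares a slice against the same elements picked out one at a time.

```python
# Hypothetical list used only for illustration
temps = [98.6, 99.1, 100.4, 101.2, 99.8]

start, end = 1, 4
sliced = temps[start:end]
by_hand = [temps[1], temps[2], temps[3]]   # indices start through end-1

print(sliced)                        # [99.1, 100.4, 101.2]
print(sliced == by_hand)             # True
print(len(sliced) == end - start)    # True
```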

**Question**: Why will the sublist end at `x[end-1]`?

#### TODO
Create a smaller list containing only the second and third elements of `x`. Name it `x_sub1`.

In [15]:
[x[1], x[2]]

[2, 3]

In [16]:
x_sub1 = x[1:3]
x_sub1

[2, 3]

In [17]:
# RUN CELL TO TEST VALUE
test_x_sub1.test(x_sub1)

That is correct!


If we leave out the `start` index, then the sublist will contain all of the elements of `x` from the beginning until `end-1`:

In [18]:
x[:2]

[1, 2]

Similarly, if we leave out the `end` index, the sublist will contain all elements from `start` through the end of the list:

In [19]:
x[1:]

[2, 3]

#### TODO
What list would be created with the code `x[:]`? Why?

In [20]:
# RUN CELL TO SEE QUIZ
quiz_x_colon

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('An error will be raised.', …



### 2. Adding or removing elements
Often the containers we're using aren't static. We might update them by adding new objects to them or removing some. Next we'll look at a few methods for doing this.

**Methods** are like functions but are associated with a particular object. Calling a method looks similar to calling a function but comes from the object:

```python
obj.method_name(args)
```
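
As a rough side-by-side of the two calling styles, the sketch below uses a made-up list `letters`. The `count` and `index` methods are real list methods, but they are not otherwise covered in this notebook and are used here only as examples.

```python
# Hypothetical list used only for illustration
letters = ["a", "b", "b", "c"]

n_items = len(letters)          # function: the list is passed in as an argument
n_b = letters.count("b")        # method: called from the list itself
where_c = letters.index("c")    # method: position of the first "c"

print(n_items, n_b, where_c)
# 4 2 3
```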

For example, lists have a method called `append` that adds an element to the end of the list. This method takes one argument, which is the object we want to add to the list. 

For example, let's say we had a list of names for patients in the emergency department. We'll call this list `waiting_list`. When a new patient shows up, we need to **append** their name to the list.

In [21]:
waiting_list = ["Jim", "Mary", "Rachel"]
waiting_list.append("Laura")

In [22]:
type(waiting_list)

list

In [23]:
waiting_list

['Jim', 'Mary', 'Rachel', 'Laura']

Now when we look at this object, we can see that "Laura" is also in our queue.

In [24]:
waiting_list

['Jim', 'Mary', 'Rachel', 'Laura']

But let's say that someone named **"Chloe"** comes in who is much sicker than the other patients and needs to be seen with higher priority. We can add an element to a specific position in a list by using the `insert` method. This takes two arguments: the index of the list where you want to put the new object, and the object itself.

So, for example, the line below will add `"Chloe"` to the beginning of the list.

In [25]:
waiting_list.insert(0, "Chloe")

In [26]:
waiting_list

['Chloe', 'Jim', 'Mary', 'Rachel', 'Laura']

Eventually, the understaffed doctors in the emergency room are ready to see a patient. To remove an object from a specific position we can use the `pop` method. This takes one argument, the index of the element to remove. Since our queue is already in the order we want to see the patients, we pass in `0` as the index.

This also returns the removed object, so we can save that as a variable to see who the next patient is.

In [27]:
next_patient = waiting_list.pop(0)
print(f"Next up: {next_patient}")

Next up: Chloe


In [28]:
waiting_list

['Jim', 'Mary', 'Rachel', 'Laura']

#### TODO
What would be the value of `next_patient` if you ran the previous cell again?

In [29]:
# RUN CELL TO SEE QUIZ
quiz_next_patient2

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('Jim', 'Mary', 'Rachel', 'Ch…



#### TODO
Let's say someone named **"Jacob"** comes into the ER. At first they don't seem too sick, so we put them at the end of the queue. Write the code below to add `"Jacob"` to the end of `waiting_list`.

In [30]:
waiting_list.append("Jacob")

In [31]:
waiting_list

['Jim', 'Mary', 'Rachel', 'Laura', 'Jacob']

In [32]:
# RUN CELL TO TEST VALUE
test_waiting_list_jacob.test(waiting_list)

That is correct!


#### TODO
After being added to the end of the queue, Jacob's condition suddenly worsens and he needs to be seen immediately. Remove Jacob's name from the list and save it as `next_patient`. 



In [33]:
next_patient = waiting_list.pop()

In [34]:
help(waiting_list.pop)

Help on built-in function pop:

pop(index=-1, /) method of builtins.list instance
    Remove and return item at index (default last).
    
    Raises IndexError if list is empty or index is out of range.



In [35]:
next_patient

'Jacob'

In [37]:
# RUN CELL TO TEST VALUE
test_next_patient3.test(next_patient)

That is correct!


After running the code above, what is the value of `len(waiting_list)`?

In [38]:
# RUN CELL TO SEE QUIZ
test_len_waiting_list

VBox(children=(HTML(value=''), Textarea(value='', placeholder='Type something'), Button(description='Submit', …



### 3. Combining two different lists
In addition to adding individual elements, we can also take two different lists and combine them. This is called **concatenation**. 

There are two ways to do this: First by using the addition operator `+`. For example, to combine the elements of `x` and `y` into one new list, we could do:

In [39]:
x + y

[1, 2, 3, 'a', 'b', 'c']

This creates a *new list* containing the elements of `x` followed by the elements of `y`. But the original lists are not altered: 

In [40]:
x

[1, 2, 3]

The second option is the method `extend` which modifies the list directly. So if you called:

```python
x.extend(y)
print(x)
```

You would see the new, longer list: `[1, 2, 3, 'a', 'b', 'c']`

I prefer not altering original objects whenever possible, so I personally recommend using the `+` operator. If you do want to update `x` itself, you can use `+=`. Note that for lists, `+=` modifies `x` in place, just like `extend`:

```python
x += y
print(x)
# [1, 2, 3, 'a', 'b', 'c']
```
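
One detail worth checking for yourself is which of these options builds a new list and which changes an existing one. The sketch below uses made-up names (`nums`, `alias`, `more`) and a second variable pointing at the same list so that in-place changes become visible.

```python
# Hypothetical lists used only for illustration
nums = [1, 2, 3]
alias = nums              # a second name for the *same* list object
more = ["a", "b"]

combined = nums + more    # builds a new list; nums is untouched
print(nums)               # [1, 2, 3]
print(combined)           # [1, 2, 3, 'a', 'b']

nums += more              # for lists, += changes the existing object in place
print(alias)              # [1, 2, 3, 'a', 'b']  (alias sees the change)

nums = nums + ["z"]       # rebinding: nums now points to a brand new list
print(alias)              # [1, 2, 3, 'a', 'b']  (alias is unchanged)
print(nums)               # [1, 2, 3, 'a', 'b', 'z']
```

If keeping the original list intact matters, assigning the result of `+` to a new name is the safest of the three.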

#### TODO
Below are three lists: `a`, `b`, and `c`. Write code to do the following:

1. Create a new list `z` which contains the elements of `b` followed by the elements of `a`
2. Modify `c` directly to include the elements in `a` and `b` (in that order)

In [41]:
a = [1, 2, 3]
b = ["a", "b", "c"]
c = [2, 4, "6"]

In [42]:
z = b + a
z

['a', 'b', 'c', 1, 2, 3]

In [43]:
# RUN CELL TO TEST VALUE
test_list_a_added_to_b.test(z)

That is correct!


In [44]:
c += a # Add elements of a to c
c += b # Add elements of b to c

In [45]:
c

[2, 4, '6', 1, 2, 3, 'a', 'b', 'c']

In [46]:
# RUN CELL TO TEST VALUE
test_list_a_b_added_to_c.test(c)

That is correct!


### 4. Sorting lists
We may initially get data in completely arbitrary order. But it's often useful to sort them in some way.

Again, there are two ways we can do this. First, the built-in function `sorted` takes a list and returns a new list containing the elements in ascending order:

In [47]:
my_list = [5, 2, 9]

In [48]:
sorted(my_list)

[2, 5, 9]

In [49]:
print(my_list)

[5, 2, 9]


We can also sort in descending order using the `reverse` argument:

In [50]:
sorted(my_list, reverse=True)

[9, 5, 2]
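
The text above mentions two ways to sort but demonstrates only `sorted`. The second is presumably the list method `sort`, which reorders the list in place rather than returning a new one. A minimal sketch, using a made-up list `scores`:

```python
# Hypothetical list used only for illustration
scores = [5, 2, 9]

new_scores = sorted(scores)   # function: returns a new list
print(scores)                 # [5, 2, 9]  (original order kept)
print(new_scores)             # [2, 5, 9]

result = scores.sort()        # method: sorts the list in place
print(scores)                 # [2, 5, 9]
print(result)                 # None  (sort does not return the list)

scores.sort(reverse=True)     # reverse works the same way as in sorted
print(scores)                 # [9, 5, 2]
```

Because `sort` returns `None`, writing `scores = scores.sort()` would lose the list. Using `sorted` avoids that pitfall.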

Once a list is sorted, we can use that to identify the n smallest/largest numbers.


#### TODO
Use the `sorted` function to find the three smallest and three largest values of `random_list`. Save the values as lists with length 3 called `random_list_smallest` and `random_list_largest`, both in ascending order.

In [51]:
random_list = [87, 69, 56, 17, 80, 30, 29, 10, 88, 12, 88, 97, 89, 74, 97, 26, 13,
       88, 41, 22, 92, 26, 49, 46, 73]

In [None]:
random_list_smallest = ____
random_list_largest =  ____

In [53]:
# Solution
random_list_smallest = sorted(random_list)[:3]
random_list_largest =  sorted(random_list)[-3:]

In [54]:
random_list_smallest

[10, 12, 13]

In [55]:
# RUN CELL TO TEST VALUE
test_random_list_smallest.test(random_list_smallest)

That is correct!


In [56]:
random_list_largest

[92, 97, 97]

In [57]:
# RUN CELL TO TEST VALUE
test_random_list_largest.test(random_list_largest)

That is correct!


#### `min` and `max` functions
If you only need the absolute smallest/largest values, you can use the `min` and `max` functions:

In [58]:
min(random_list)

10

In [59]:
max(random_list)

97

#### TODO
Consider this list of strings:

```python
waiting_list2 = ['Jim', 'Chloe',  'Mary', 'Rachel', 'Laura', 'Jacob']
```

What values would we get if we called `min(waiting_list2)` and `max(waiting_list2)`?

In [61]:
min(waiting_list)

'Jim'

In [62]:
# RUN CELL TO SEE QUIZ
quiz_min_max_waiting_list

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=("'Chloe'; 'Rachel'", "'Jim';…



## Tuples
Now we'll move onto a second type of data structure. We declare a **tuple** by separating values by commas within parentheses (as opposed to the square brackets used for lists).

In [64]:
x_tup = (1, 2, 3)
x_tup

(1, 2, 3)

In [65]:
type(x_tup)

tuple

Tuples are very similar to lists, except that they are **immutable**. That means once a tuple is declared, we can't modify it or its contents. This is useful if we know a collection of elements are fairly permanent and don't plan on altering them.

So, for example, while with lists we could append an element to the end using `list.append(item)` and remove elements with `list.pop(i)`, with tuples we need to create a brand new object.

#### TODO
What will happen when we run the code below?
```python
x_tup.append(4)
```

In [66]:
x_tup.append(4)

AttributeError: 'tuple' object has no attribute 'append'

In [67]:
# RUN CELL TO SEE QUIZ
quiz_tup_append

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('An error will be raised.', …



#### TODO
What will happen when we run the code below?
```python
x_tup[1] = "a"
```

In [68]:
# RUN CODE TO SEE QUIZ
quiz_set_tup_index

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('A new tuple will be returne…



#### TODO
Which line of code would cause the value of the variable `x_tup` to include the elements `4` and `5`?

- a) `x_tup.extend((4,5))`
- b) `x_tup += (4, 5)`
- c) `x_tup.append(4); x_tup.append(5)`

In [69]:
(1,2,3) + ("a", )

(1, 2, 3, 'a')

In [70]:
quiz_x_tup_4_5 

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('a', 'b', 'c', 'a and b', 'A…
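
Since tuples cannot be changed, any "update" really means building a new tuple and pointing a name at it. The sketch below illustrates this with a made-up tuple `coords`. All names are hypothetical.

```python
# Hypothetical tuple used only for illustration
coords = (10, 20)
original = coords             # second name for the same tuple

coords = coords + (30,)       # builds a new tuple and rebinds the name
print(coords)                 # (10, 20, 30)
print(original)               # (10, 20)  (the old tuple never changed)

# "Replacing" an element also means building a new tuple from slices
replaced = coords[:1] + (99,) + coords[2:]
print(replaced)               # (10, 99, 30)
```

Note the trailing comma in `(30,)`: without it, Python reads `(30)` as just the number 30.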



Other than changing their contents, we can do a lot of the same things with tuples that we did with lists.

We index them the same way:

In [71]:
x_tup[0]

1

In [72]:
x_tup[-1]

3

In [73]:
x_tup[1:]

(2, 3)

And we can get the min and max and sort them using the appropriate functions.

In [74]:
max(x_tup)

3

In [75]:
min(x_tup)

1

In [76]:
sorted(x_tup)

[1, 2, 3]

In [77]:
sorted(x_tup, reverse=True)

[3, 2, 1]

In [78]:
x_tuple2 = tuple(sorted(x_tup))
x_tuple2

(1, 2, 3)

#### TODO
What data type is returned by `sorted(x_tup)`?

In [79]:
# RUN CODE TO SEE QUIZ
quiz_type_sorted_tup

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('tuple', 'list', 'other'), v…



We can create tuples from lists and vice-versa:

In [80]:
list(x_tup)

[1, 2, 3]

#### TODO
Create a list called `x_list` from `x_tup`. What code would test whether `x_list` is equal to `x`? Are they equal? Does `x_list` equal `x_tup`?

In [None]:
____ = ____

In [None]:
# RUN CELL TO TEST VALUE
test_x_list.test(x_list)

That is correct!


In [None]:
# Solution
x_list = list(x_tup)

In [None]:
# RUN CELL TO SEE QUIZ
quiz_test_x_list_equals_x

VBox(children=(HTML(value='What code tests whether x_list equals x?'), Textarea(value='', placeholder='Type so…



In [88]:
# Solution
x_list == x

True

In [89]:
# Solution
x_list == x_tup

False

In [92]:
# RUN CELL TO SEE QUIZ
quiz_x_equals_x_list

VBox(children=(HTML(value='Are x and x_list equal?'), RadioButtons(layout=Layout(width='auto'), options=('Yes'…



In the next notebook, we'll learn about *sets* and *dictionaries*.

## Sets
Sets, like lists and tuples, are collections of Python objects. Two of the key characteristics of lists and tuples are:
1. **Ordering and indexing**: The order matters in lists and tuples. Two lists which have the same elements but in different orders are not considered the same (can you show this using Python code?). Because the elements are ordered, you can access individual elements using their positional index.
2. **There can be duplicate values**: Because lists and tuples are defined by both the elements in them and their order, it's perfectly valid for a list to have the same element more than once. For example, `["a", "c", "c"]` or `[1, 1, 2]`

**Sets** differ in both these qualities. Sets in Python are closer to the idea of [mathematical sets](https://en.wikipedia.org/wiki/Set_(mathematics)) than lists and tuples, in that they are a collection of objects and what matters most is what elements are **members** of the set.

We declare a set in Python similar to lists and tuples, but with curly brackets:

```
x = {x1, x2, ...}
```

We can also take another collection and create a set out of it:

```python
set([1, 2, 3])
```
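
A quick sketch of how sets treat order and duplicates, using made-up values. It also shows one thing to watch for: empty curly brackets create a dictionary, so an empty set has to be written as `set()`.

```python
# Hypothetical values used only for illustration
same_members = {1, 2, 3} == {3, 2, 1}
print(same_members)                # True: order does not matter

dupes = {"a", "c", "c"}            # the repeated "c" is stored once
print(len(dupes))                  # 2

empty_set = set()                  # {} on its own creates an empty dict
print(type(empty_set) is set, type({}) is dict)
# True True
```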

#### TODO
Declare three sets below: 
1. `evens`, which contains the elements `[2, 4, 6, 8, 10]`
2. `odds`, which contains the elements `[1, 3, 5, 7, 9]`
3. `primes`, which contains `[2, 3, 5, 7]`

In [None]:
evens = ____
____ = {1, 3, 5, 7, 9}
____ = ____

In [None]:
# Solution
evens = {2, 4, 6, 8, 10}
odds = {1, 3, 5, 7, 9}
primes = set([2, 3, 5, 7])

To check if an object is a **member** of a set, we can use the `in` keyword, which returns `True` if the element is in the set and `False` otherwise.

```python
# Example:
x_set = {1, 2, 3}
1 in x_set  # True
0 in x_set  # False
```

#### TODO
What code would check whether `1` is in `primes`? What value would this code return?

In [94]:
# RUN CELL TO SEE QUIZ
quiz_code_1_in_primes

VBox(children=(HTML(value='What code would check whether 1 is in primes?'), Textarea(value='', placeholder='Ty…



In [95]:
# RUN CELL TO SEE QUIZ
quiz_1_in_primes

VBox(children=(HTML(value='What is the value of the code from the answer above?'), RadioButtons(layout=Layout(…



#### TODO
What value would we get if we executed `evens[0]`? Why?

In [96]:
# RUN CELL TO SEE QUIZ
quiz_evens0 

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=('An error would be raised.',…



#### Comparing sets
We often want to compare two sets with each other. For example, if we consider all [natural numbers](https://en.wikipedia.org/wiki/Natural_number) from 1-10, this is a set of 10 numbers:

```python
naturals_10 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
```

We could create subsets such as:
- All **odd** numbers in the first 10 natural numbers
- All **even** numbers
- All **prime** numbers (i.e., numbers greater than 1 that are only divisible by one and themselves)

That's exactly what we did when we declared our three sets above.

Let's say we want to ask: Which numbers are both **odd** and **prime**?


This can be visually compared using [Venn diagrams](https://en.wikipedia.org/wiki/Venn_diagram). 

One way we could do this is check each number and see whether it's in both `odds` and `primes`:

In [98]:
(1 in odds) & (1 in primes)

False

In [99]:
(2 in odds) & (2 in primes)

False

In [100]:
(3 in odds) & (3 in primes)

True

But this gets old pretty quickly. A faster way to do this is using the method `set.intersection()`:

In [101]:
odds.intersection(primes)

{3, 5, 7}

#### TODO
Create a set called `even_primes` of all numbers between 1 and 10 which are both even and prime. Don't manually write out the set, but instead use existing sets and methods.

In [None]:
____

In [102]:
# Solution
even_primes = evens.intersection(primes)

In [103]:
# RUN CELL TO TEST VALUE
test_even_primes.test(even_primes)

That is correct!


#### TODO
What is the length of `odds.intersection(evens)`?

In [104]:
# RUN CELL TO SEE QUIZ
quiz_len_odds_evens

VBox(children=(HTML(value=''), Textarea(value='', placeholder='Type something'), Button(description='Submit', …



If we think of a Venn diagram, the intersection is the part in the middle of two circles which overlap (i.e., **"intersect"** ).

The other parts of a Venn diagram are:

**The difference**: These are the elements that are in one set but not the other.

In [105]:
evens.difference(primes)

{4, 6, 8, 10}

#### TODO
Create a set called `primes_not_odd` which is the set of all prime numbers that are not odd.

In [None]:
____

In [106]:
# Solution
primes_not_odd = primes.difference(odds)

In [107]:
primes_not_odd

{2}

In [108]:
# RUN CELL TO TEST VALUE
test_primes_not_odd.test(primes_not_odd)

That is correct!


#### TODO
What is the value of `len(primes_not_odd.difference(even_primes))`?

In [109]:
# RUN CELL TO SEE QUIZ
test_len_pno_ep

VBox(children=(HTML(value=''), Textarea(value='', placeholder='Type something'), Button(description='Submit', …



**The union:** These are all elements that are in either set (i.e., all of the Venn diagram)

In [233]:
odds.union(primes)

{1, 2, 3, 5, 7, 9}
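
Python also offers operator shorthands for these three set methods: `&` for intersection, `|` for union, and `-` for difference. The sketch below checks that they agree, using two made-up sets `left` and `right`.

```python
# Hypothetical sets used only for illustration
left = {1, 2, 3, 4}
right = {3, 4, 5}

print(left.intersection(right) == (left & right))   # True
print(left.union(right) == (left | right))          # True
print(left.difference(right) == (left - right))     # True

# Difference depends on which set comes first
print(left - right)    # {1, 2}
print(right - left)    # {5}
```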

#### TODO
Create a new set called `naturals_10` which contains all of the whole numbers 1-10. Don't manually write out the set as `{1, 2, ...}`, but instead use existing sets to create a **superset**.

In [None]:
naturals_10 = ____

In [None]:
# Solution
naturals_10 = odds.union(evens)

In [236]:
# RUN CELL TO TEST VALUE
test_naturals_10.test(naturals_10)

That is correct!


#### Deduplicating with sets
Another common way of using sets is **deduplicating collections**. We said earlier that lists and tuples can have duplicate values, while sets contain only unique values. That means that if we want to know what all the unique values in a list with duplicates are, we can do this by turning a list into a set.

For example, let's say we had a class with a lot of people whose names start with **"A"** and put their names in a list. To get the unique list of first names, we could do the following:

In [238]:
first_names = ["Alex", "Alec", "Alex", "Aaron", "Alex", "Alek", "Alexis"]
first_names_unq = set(first_names)
first_names_unq

{'Aaron', 'Alec', 'Alek', 'Alex', 'Alexis'}

#### TODO
Let's say we had a list of the cities where a group of patients live. We want to know how many unique cities our patient population resides in. Create an object called `unq_cities` and count how many values are in it. Save that number as `num_unq_cities`.

In [111]:
pt_cities = [
    "Salt Lake City",
    "Ogden",
    "Evanston",
    "Salt Lake City",
    "Salt Lake Cit, UT",
    "Provo",
    "Provo"
]

In [None]:
unq_cities = ____
num_unq_cities = ____

In [112]:
# Solution
unq_cities = set(pt_cities)
num_unq_cities = len(unq_cities)
num_unq_cities

5

In [113]:
# RUN CELL TO TEST VALUE
test_num_unq_cities.test(num_unq_cities)

That is correct!


#### Discussion
What is an issue with our count above?

## Dictionaries
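Like sets, **dictionaries** contain unique entries (each key can appear only once) and are not indexed by position, although since Python 3.7 they do keep their insertion order. However, while sets are collections of individual elements, dictionaries are collections of **key/value pairs**. 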

A key/value pair is a unique mapping from one item (a key) to another (a value). An example of this in real life is a mapping from states to their capitals:

- Utah --> Salt Lake City
- Pennsylvania --> Harrisburg
- New York --> Albany

The states are the keys and the capitals are values. Let's see what this would look like in Python.

Dictionaries are declared using curly brackets (just like sets). But we signify the key/value mapping using a colon **":"**.

In [117]:
state_capitals = {
    "Utah": "Salt Lake City",
    "Pennsylvania": "Harrisburg",
    "New York": "Albany"
}

In [118]:
state_capitals

{'Utah': 'Salt Lake City', 'Pennsylvania': 'Harrisburg', 'New York': 'Albany'}

Let's say we want to know the capital of a particular state. We can get this by using the key (i.e., state) as the index:

In [119]:
state_capitals["Utah"]

'Salt Lake City'

#### TODO
What code would we use to get the capital of Pennsylvania?

In [114]:
# RUN CODE TO SEE QUIZ
test_capital_pa

VBox(children=(HTML(value=''), Textarea(value='', placeholder='Type something'), Button(description='Submit', …



In [152]:
quiz_check_len_dict

VBox(children=(HTML(value='What value would be returned by the following code?\n\n<p style="font-family:courie…



If you try to index using a key that isn't in the dictionary, you get an error:

In [153]:
state_capitals["California"]

KeyError: 'California'
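
If a missing key should not stop your code, one option is the dictionary method `get`, which returns `None` (or a default you choose) instead of raising a `KeyError`. This method is not otherwise used in this notebook. The sketch below uses a made-up dictionary `bed_numbers`.

```python
# Hypothetical dictionary used only for illustration
bed_numbers = {"Jim": 4, "Mary": 7}

print(bed_numbers.get("Mary"))               # 7
print(bed_numbers.get("Laura"))              # None (no error raised)
print(bed_numbers.get("Laura", "unknown"))   # unknown
```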

You can add a key/value pair to a dictionary like this:

In [154]:
state_capitals["California"] = "Sacramento"

In [155]:
state_capitals

{'Utah': 'Salt Lake City',
 'Pennsylvania': 'Harrisburg',
 'New York': 'Albany',
 'California': 'Sacramento'}

And you can remove one using the `.pop(key)` method (similar to lists):

In [156]:
state_capitals.pop("California")

'Sacramento'

#### TODO
Add `"Idaho"` and its capital city to our dictionary.

In [None]:
____

In [None]:
# Solution
state_capitals["Idaho"] = "Boise"

In [258]:
# RUN CODE TO TEST VALUE
test_capital_idaho.test(state_capitals["Idaho"])

That is correct!


#### Uniqueness
Just like the elements of a set, the keys of a dictionary are unique, so you can only map a key to a single value. This makes sense in our case: a state can't have more than one capital!

#### TODO

What value would be printed out by the code below?

```python
state_capitals2 = {
    "California": "Sacramento",
    "Utah": "Salt Lake City",
    "Pennsylvania": "Harrisburg",
    "New York": "Albany"
}


state_capitals2["New York"] = "New York City"
print(state_capitals2["New York"])
```

In [157]:
# RUN CELL TO SEE QUIZ
quiz_state_capitals2_ny 

VBox(children=(HTML(value=''), RadioButtons(layout=Layout(width='auto'), options=("'Albany'", "'New York City'…





#### Discussion
Let's say we were mapping states to **all** cities in that state, not just the capital. How could we do this with a Python dictionary? Implement it below using the following cities:

- **New York**: Albany, Buffalo, New York City 
- **California**: Los Angeles, Sacramento, San Francisco, San Diego
- **Pennsylvania**: Harrisburg, Pittsburgh, Philadelphia
- **Utah**: Ogden, Provo, Salt Lake City

### Emergency room wait times

Let's think back to our example of emergency room patients. Earlier, all we had done is record patient names. But let's say we started to record some additional information as well:

| name | arrival_time | age | severity |
|------|--------------|-----|----------|
|   Jim   |      6:00        |  40   |     40     |
|   Mary   |       6:30       | 31   |     10     |
|   Rachel   |        7:00      |   27  |    20      |
|    Laura  |        7:30      |   38  |     15     |
|   Chloe   |         8:00     |   25  |     50     |

We can use dictionaries to map each name to the respective value. Let's start by just mapping names to arrival times:

In [160]:
pt_arrivals = {
    "Jim": "6:00",
    "Mary": "6:30",
    "Rachel": "7:00",
    "Laura": "7:30",
    "Chloe": "8:00"
}

In [161]:
pt_arrivals

{'Jim': '6:00',
 'Mary': '6:30',
 'Rachel': '7:00',
 'Laura': '7:30',
 'Chloe': '8:00'}

#### TODO

In [162]:
# RUN CELL TO SEE QUIZ
quiz_code_rachel_arrival 

VBox(children=(HTML(value='What code would give you the arrival time for Rachel?'), Textarea(value='', placeho…



We can separate out the keys and values using the `dict.keys()` and `dict.values()` methods:

In [163]:
pt_arrivals.keys()

dict_keys(['Jim', 'Mary', 'Rachel', 'Laura', 'Chloe'])

In [164]:
pt_arrivals.values()

dict_values(['6:00', '6:30', '7:00', '7:30', '8:00'])
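
The `dict_keys` and `dict_values` objects shown above cannot be indexed by position. If you need list behavior, one approach is to wrap them in `list` or `sorted`. A minimal sketch with a made-up dictionary `room_of`:

```python
# Hypothetical dictionary used only for illustration
room_of = {"Ana": 12, "Ben": 7, "Cy": 12}

names = list(room_of.keys())       # ['Ana', 'Ben', 'Cy']
rooms = list(room_of.values())     # [12, 7, 12]

print(names[0])                    # Ana
print(sorted(rooms))               # [7, 12, 12]
print(len(set(rooms)))             # 2 (number of distinct rooms in use)
```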

This can be useful if we just want to check if there is a patient by a certain name in our records, or if anyone arrived at a certain time.

#### TODO

In [165]:
# RUN CODE TO SEE QUIZ
test_check_pt_name

VBox(children=(HTML(value="What code would check if there is someone named 'Jacob' in our arrivals dictionary?…



In [166]:
# RUN CELL TO SEE QUIZ
quiz_jacob_in_dict

VBox(children=(HTML(value='What would the answer to the code above be?'), RadioButtons(layout=Layout(width='au…



In [167]:
# RUN CELL TO SEE QUIZ
test_check_pt_time

VBox(children=(HTML(value='What code would check if someone arrived at 7:00?'), Textarea(value='', placeholder…



All of the examples we've seen so far have had strings as both the keys and values. But there are lots of other options for what data we put in a dictionary. Without getting too much into the details, the keys can be any data type that is **hashable** - so that includes ints, floats, strings, and tuples (as long as the tuple's elements are hashable too). The values can be any data type.
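
To see the difference between what is allowed as a key and what is allowed as a value, the sketch below builds a made-up dictionary `visits` and then tries to use a list as a key. It uses `try`/`except`, which has not been introduced yet, only so that the error does not stop the cell.

```python
# Hypothetical dictionary used only for illustration
visits = {
    ("Jim", "6:00"): "triage",      # tuple key: allowed
    42: "bed number",               # int key: allowed
    "Mary": ["x-ray", "labs"],      # list as a *value*: allowed
}
print(visits[("Jim", "6:00")])      # triage

try:
    visits[["Jim", "6:00"]] = "triage"   # list key: lists are not hashable
except TypeError as err:
    print("TypeError:", err)
```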

So in our earlier example where we wanted to map states to multiple names of cities, we could have done that as:

```python
{
    "Utah": {"Salt Lake City", "Provo", "Ogden", "Park City"},
    "Pennsylvania": {"Pittsburgh", "Philadelphia", "Erie", "Harrisburg"},
    "California": {"San Francisco", "San Diego", "Los Angeles", "Sacramento"}
}
```

In [168]:
# RUN CELL TO SEE QUIZ
quiz_dict_data_types

VBox(children=(HTML(value='What are the data types of the keys and values in the code above?'), RadioButtons(l…



One especially useful way of structuring data is **nested dictionaries**, where the keys are some basic data type, such as strings, and the values are other dictionaries. This is a common way of mapping keys to multiple attributes.

For example, if we wanted to map state names to several different facts about them, we could do that as:

In [169]:
states = {
    "Utah": {
        "cities": {"Salt Lake City", "Provo", "Ogden", "Park City"},
        "capital": "Salt Lake City",
        "population": 3.15 # 3.15 million
    },
    "Pennsylvania": {
        "cities": {"Pittsburgh", "Philadelphia", "Erie", "Harrisburg"},
        "capital": "Harrisburg",
        "population": 12.79
    },
    "California": {
        "cities": {"San Francisco", "San Diego", "Los Angeles", "Sacramento"},
        "capital": "Sacramento",
        "population": 39.35
    }
}

In [170]:
# RUN CELL TO SEE QUIZ
quiz_type_states_utah

VBox(children=(HTML(value='With the dictionary above, what is the data type of: <p style="font-family:courier"…



Once we access the inner dictionary in a nested dictionary, we can then use the keys in that dictionary to look up the properties of the state:

In [None]:
states["Utah"]["capital"]

'Salt Lake City'
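
The same two-step lookup works for the emergency room records from earlier. The sketch below nests the first two rows of that table in a dictionary. The variable name `patients` and the updated severity value are hypothetical.

```python
# Built from the first two rows of the emergency room table above
patients = {
    "Jim": {"arrival_time": "6:00", "age": 40, "severity": 40},
    "Mary": {"arrival_time": "6:30", "age": 31, "severity": 10},
}

print(patients["Jim"]["age"])              # 40
print(patients["Mary"]["arrival_time"])    # 6:30

# Updating a nested value uses the same two-step lookup
patients["Mary"]["severity"] = 25          # hypothetical new value
print(patients["Mary"])
# {'arrival_time': '6:30', 'age': 31, 'severity': 25}
```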

#### TODO

In [172]:
# RUN CELL TO SEE QUIZ
quiz_ca_population

VBox(children=(HTML(value='What code would give you the population of California?'), Textarea(value='', placeh…



#### TODO
Let's say we wanted to get the set of all cities in either Utah or Pennsylvania. Using `states`, create a new variable called `cities_ut_pa` which contains all the corresponding elements.

In [None]:
____ = ____
cities_ut_pa

In [174]:
# Solution
cities_ut_pa = states["Utah"]["cities"].union(states["Pennsylvania"]["cities"])
cities_ut_pa

{'Erie',
 'Harrisburg',
 'Ogden',
 'Park City',
 'Philadelphia',
 'Pittsburgh',
 'Provo',
 'Salt Lake City'}

In [175]:
# RUN CELL TO TEST VALUE
test_cities_ut_pa.test(cities_ut_pa)

That is correct!
