
<img src="https://raw.githubusercontent.com/abchapman93/delphi-python-2025-dev/refs/heads/main/media/DELPHI-long.png">
</br>

<h1 valign="center" align="center"><font size="+150">Introduction to Python</br>December 2025</font></h1>

In [None]:
!pip install https://github.com/abchapman93/delphi-python-2025/releases/download/dev/uu_delphi_python_dec25-0.1.tar.gz

In [None]:
from uu_delphi_python_dec25 import *
from uu_delphi_python_dec25.quizzes.module1_quizzes import *
from uu_delphi_python_dec25.helpers import *

# Data Structures Part 1
In the previous notebook we learned about some of the essential data types in Python. We'll now dive deeper into Python's built-in **data structures**. A data structure is a type of object which helps us organize and access data. Each data structure is used for a slightly different purpose and has different methods associated with it.


The four data structures we'll look at in the next two notebooks are:
- I. `list`
- II. `tuple`
- III. `dict` (dictionary)
- IV. `set`

## I. Lists
The following are all lists in Python: 

In [None]:
x = [1, 2, 3]
y = ["a", "b", "c"]
z = [1, "b", x]

#### TODO
What are the data types of the three elements of `z` in the previous cell?

In [None]:
# RUN CELL TO SEE QUIZ
quiz_data_types_z

Here are some things we might want to do with lists:
1. Access specific elements
2. Add or remove elements
3. Combine two different lists
4. Sort and find the min/max values

### 1. Accessing specific elements
Lists and the other data structures in this notebook are also called **containers** because they **contain** other objects. As such, one of the main purposes of a list is to store an object in it and access it later.

The main way to do this with lists is **indexing**. Each element in a list has a numeric index, that is, its *ordered* position in the list. To access an element, put square brackets containing the index after the name of the list. For example, let's say we want to access the first element of `x`. We would do this as:

```python
x[0]
```

This might look a little funny to you - the *zeroth* index of x? The reason that we use `x[0]` instead of `x[1]` for the first element is that Python uses **zero indexing**, meaning that the positions start at 0 and end at `len(x) - 1`. This is different from R and many other statistical software packages, which start counting at 1. You can think of `x[0]` as saying: **Give me the element of x which is `0` positions away from the beginning**.

#### TODO
What code would give you the second element of `x`?

In [None]:
# RUN CELL TO SEE QUIZ
quiz_second_element_x

#### TODO
What value would you get if you ran the following line of code?
```python
x[3]
```

In [None]:
# RUN CELL TO SEE QUIZ
quiz_x3

To get the length of a list, we can use the built-in `len` function. 

In [None]:
len(x)

#### TODO
What is the largest value of `idx` we could use in the code `x[idx]`?

In [None]:
# RUN CODE TO SEE QUIZ
quiz_largest_idx_x
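
To tie zero indexing and `len` together, here is a minimal sketch using a hypothetical list called `colors` (it is not one of this notebook's variables). It shows the first and last valid indices and what Python does when an index falls past the end of the list:

```python
# Hypothetical example list (not one of the notebook's variables)
colors = ["red", "green", "blue", "yellow"]

first = colors[0]                # 0 positions away from the beginning
last = colors[len(colors) - 1]   # the largest valid index is len(colors) - 1

print(first)   # red
print(last)    # yellow

# Asking for an index past the end raises an IndexError
try:
    colors[len(colors)]
except IndexError as err:
    error_message = str(err)
    print(f"IndexError: {error_message}")
```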

#### `k` steps back
When we pass in 0 or a positive index `k` for list `x`, we go `k` positions from the beginning. But if we pass in a negative number, we go backwards from the end of the list. So one way to get the last element of `x` is:

In [None]:
x[-1]

In [None]:
x

#### TODO
Which of the following lines of code would give you the second-to-last element of x?
- a) `x[-2]`
- b) `x[1]`
- c) `x[len(x)-2]`

In [None]:
# RUN CELL TO SEE QUIZ
quiz_second_to_last_x

Note that in our example lists, the last element of `z` is the list `x`. 

In [None]:
z

#### TODO
Which of the following lines of code would give the value `2`?
- a) `x[1]`
- b) `x[-2]`
- c) `z[3]`
- d) `z[-1][-2]`

In [None]:
# RUN CELL TO SEE QUIZ
quiz_values_of_2
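
As a short sketch of how negative and nested indexing fit together, the example below uses two hypothetical lists, `scores` and `nested` (stand-ins, not the notebook's `x` and `z`). Counting `k` steps back from the end lands on the same element as index `len(scores) - k`, and indexing twice first picks the inner list and then an element inside it:

```python
# Hypothetical example lists (not the notebook's x and z)
scores = [10, 20, 30, 40]
nested = ["a", "b", scores]

# Counting k steps back from the end is the same as index len(scores) - k
k = 3
from_end = scores[-k]
from_start = scores[len(scores) - k]
print(from_end, from_start)   # 20 20

# Indexing twice: first pick the inner list, then an element of it
inner = nested[-1]
value = nested[-1][0]
print(inner)   # [10, 20, 30, 40]
print(value)   # 10
```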

If we want a subset of a list (that is, a smaller list containing some of the elements of the larger list), we use the following notation:

```python
x[start:end]
```

This will give us a list containing `[x[start], x[start+1], ..., x[end-1]]`

**Question**: Why will the sublist end at `x[end-1]`?

#### TODO
Create a smaller list containing only the second and third elements of `x`. Name it `x_sub1`.

In [None]:
[x[1], x[2]]

In [None]:
x_sub1 = x[1:3]
x_sub1

In [None]:
# RUN CELL TO TEST VALUE
test_x_sub1.test(x_sub1)

If we leave out the `start` index, then the sublist will contain all of the elements of `x` from the beginning until `end-1`:

In [None]:
x[:2]

Similarly, if we leave out the `end` index, the sublist will contain all elements from `start` through the end of the list:

In [None]:
x[1:]
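
One way to see why it is convenient for a slice to stop just before `end` is sketched below, using a hypothetical list called `letters`. With this convention the length of `letters[start:end]` is simply `end - start`, and cutting a list at any index `k` gives two pieces that do not overlap:

```python
# Hypothetical example list (not one of the notebook's variables)
letters = ["a", "b", "c", "d", "e"]

start, end = 1, 4
middle = letters[start:end]
print(middle)                       # ['b', 'c', 'd']
print(len(middle) == end - start)   # True

# Cutting at index k: no element appears in both pieces
k = 2
head = letters[:k]
tail = letters[k:]
print(head)                                      # ['a', 'b']
print(tail)                                      # ['c', 'd', 'e']
print(len(head) + len(tail) == len(letters))     # True
```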

#### TODO
What list would be created with the code `x[:]`? Why?

In [None]:
# RUN CELL TO SEE QUIZ
quiz_x_colon

### 2. Adding or removing elements
Often the containers we're using aren't static. We might update them by adding new objects to them or removing some. Next we'll look at a few methods for doing this.

**Methods** are like functions but are associated with a particular object. Calling a method looks similar to calling a function but comes from the object:

```python
obj.method_name(args)
```

For example, lists have a method called `append` that adds an element to the end of the list. This method takes one argument, which is the object we want to add to the list. 

For example, let's say we had a list of names for patients in the emergency department. We'll call this list `waiting_list`. When a new patient shows up, we need to **append** their name to the list.

In [None]:
waiting_list = ["Jim", "Mary", "Rachel"]
waiting_list.append("Laura")

In [None]:
type(waiting_list)

In [None]:
waiting_list

Now when we look at this object, we can see that "Laura" is also in our queue.

In [None]:
waiting_list

But let's say that someone named **"Chloe"** comes in who is much sicker than the other patients and needs to be seen with higher priority. We can add an element to a specific position in a list by using the `insert` method. This takes two arguments: the index of the list where you want to put the new object, and the object itself.

So, for example, the line below will add `"Chloe"` to the beginning of the list.

In [None]:
waiting_list.insert(0, "Chloe")

In [None]:
waiting_list

Eventually, the understaffed doctors in the emergency room are ready to see a patient. To remove an object from a specific position we can use the `pop` method. This takes one argument, the index of the element to remove. Since our queue is already in the order we want to see the patients, we pass in `0` as the index.

This also returns the removed object, so we can save that as a variable to see who the next patient is.

In [None]:
next_patient = waiting_list.pop(0)
print(f"Next up: {next_patient}")

In [None]:
waiting_list
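
The three methods above can be sketched side by side on a separate, hypothetical list called `samples`, so that `waiting_list` is left untouched. One detail worth noticing is that `append` and `insert` change the list in place and return `None`, while `pop` returns the element it removed:

```python
# Hypothetical queue of lab samples (not the notebook's waiting_list)
samples = ["S1", "S2"]

samples.append("S3")          # add to the end
samples.insert(0, "URGENT")   # add at index 0, shifting everything else right
print(samples)                # ['URGENT', 'S1', 'S2', 'S3']

# append changes the list in place and returns None,
# so writing `samples = samples.append(...)` would lose the list
result = samples.append("S4")
print(result)                 # None

# pop both removes the element and returns it
removed = samples.pop(0)
print(removed)                # URGENT
print(samples)                # ['S1', 'S2', 'S3', 'S4']
```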

#### TODO
What would be the value of `next_patient` if you ran the previous cell again?

In [None]:
# RUN CELL TO SEE QUIZ
quiz_next_patient2

#### TODO
Let's say someone named **"Jacob"** comes into the ER. At first they don't seem too sick, so we put them at the end of the queue. Write the code below to add `"Jacob"` to the end of `waiting_list`.

In [None]:
____ 

In [None]:
# RUN CELL TO TEST VALUE
test_waiting_list_jacob.test(waiting_list)

#### TODO
After being added to the end of the queue, Jacob's condition suddenly worsens and he needs to be seen immediately. Remove Jacob's name from the list and save it as `next_patient`. 



In [None]:
____ = ____

In [None]:
next_patient

In [None]:
# RUN CELL TO TEST VALUE
test_next_patient3.test(next_patient)

After running the code above, what is the value of `len(waiting_list)`?

In [None]:
# RUN CELL TO SEE QUIZ
test_len_waiting_list

### 3. Combining two different lists
In addition to adding individual elements, we can also take two different lists and combine them. This is called **concatenation**. 

There are two ways to do this: First by using the addition operator `+`. For example, to combine the elements of `x` and `y` into one new list, we could do:

In [None]:
x + y

This creates a *new list* containing the elements of `x` followed by the elements of `y`. But the original lists are not altered: 

In [None]:
x

The second option is the method `extend` which modifies the list directly. So if you called:

```python
x.extend(y)
print(x)
```

You would see the new, longer list: `[1, 2, 3, 'a', 'b', 'c']`

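I prefer not to alter original objects whenever possible, so I personally recommend using the `+` operator and assigning the result to a new variable. If you do want to update `x` itself, you can write `x += y`, which for lists modifies `x` in place just like `extend`: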

```python
x += y
print(x)
# [1, 2, 3, 'a', 'b', 'c']
```
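
The difference between creating a new list and modifying an existing one is easiest to see when a second name points at the same list. The sketch below uses hypothetical lists `first` and `second` plus an extra name `alias` (none of them are notebook variables): `+` leaves `first` alone, while `extend` and `+=` both change the list that `alias` also refers to.

```python
# Hypothetical example lists (not the notebook's x and y)
first = [1, 2, 3]
second = ["a", "b", "c"]
alias = first              # a second name for the same list object

combined = first + second  # builds a brand new list
print(combined)            # [1, 2, 3, 'a', 'b', 'c']
print(first)               # [1, 2, 3]
print(combined is first)   # False

first.extend(second)       # modifies the existing list in place
print(alias)               # [1, 2, 3, 'a', 'b', 'c']

first += [4]               # for lists, += also modifies in place
print(alias)               # [1, 2, 3, 'a', 'b', 'c', 4]
print(alias is first)      # True
```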

#### TODO
Below are three lists: `a`, `b`, and `c`. Write code to do the following:

1. Create a new list `z` which contains the elements of `b` followed by the elements of `a`
2. Modify `c` directly to include the elements in `a` and `b` (in that order)

In [None]:
a = [1, 2, 3]
b = ["a", "b", "c"]
c = [2, 4, "6"]

In [None]:
____ = ____

In [None]:
#### Solution
z = b + a
z

In [None]:
# RUN CELL TO TEST VALUE
test_list_a_added_to_b.test(z)

In [None]:
____ 

In [None]:
#### Solution
c += a # Add elements of a to c
c += b # Add elements of b to c

In [None]:
c

In [None]:
# RUN CELL TO TEST VALUE
test_list_a_b_added_to_c.test(c)

### 4. Sorting lists
We may initially get data in completely arbitrary order. But it's often useful to sort them in some way.

Again, there are two ways we can do this. First, the built-in function `sorted` takes a list and returns a new list containing the elements in ascending order:

In [None]:
my_list = [5, 2, 9]

In [None]:
sorted(my_list)

In [None]:
print(my_list)

We can also sort in descending order using the `reverse` argument:

In [None]:
sorted(my_list, reverse=True)
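
The second way is most likely the list method `sort` (an assumption, since only `sorted` is demonstrated in this section). As a minimal sketch with a hypothetical list called `values`, `sorted` leaves the original list alone, whereas `sort` reorders the list in place, much like `extend` did for concatenation:

```python
# Hypothetical example list (not the notebook's my_list)
values = [5, 2, 9]

new_values = sorted(values)   # returns a new list
print(new_values)             # [2, 5, 9]
print(values)                 # [5, 2, 9]

values.sort()                 # reorders the existing list in place
print(values)                 # [2, 5, 9]

values.sort(reverse=True)     # the reverse argument works here too
print(values)                 # [9, 5, 2]
```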

Once a list is sorted, we can use it to identify the n smallest/largest numbers.


#### TODO
Use the `sorted` function to find the three smallest and three largest values of `random_list`. Save the values as lists with length 3 called `random_list_smallest` and `random_list_largest`, both in ascending order.

In [None]:
random_list = [87, 69, 56, 17, 80, 30, 29, 10, 88, 12, 88, 97, 89, 74, 97, 26, 13,
       88, 41, 22, 92, 26, 49, 46, 73]

In [None]:
random_list_smallest = ____
random_list_largest =  ____

In [None]:
random_list_smallest

In [None]:
# RUN CELL TO TEST VALUE
test_random_list_smallest.test(random_list_smallest)

In [None]:
random_list_largest

In [None]:
# RUN CELL TO TEST VALUE
test_random_list_largest.test(random_list_largest)

#### `min` and `max` functions
If you only need the single smallest or largest value, you can use the `min` and `max` functions:

In [None]:
min(random_list)

In [None]:
max(random_list)

#### TODO
Consider this list of strings:

```python
waiting_list2 = ['Jim', 'Chloe',  'Mary', 'Rachel', 'Laura', 'Jacob']
```

What values would we get if we called `min(waiting_list2)` and `max(waiting_list2)`?

In [None]:
# RUN CELL TO SEE QUIZ
quiz_min_max_waiting_list
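
For strings, `min`, `max`, and `sorted` compare character by character, so the result is roughly alphabetical. A small sketch with hypothetical word lists (not `waiting_list2`) shows this, along with one wrinkle: uppercase letters are ordered before lowercase ones.

```python
# Hypothetical example lists (not the notebook's waiting_list2)
words = ["pear", "apple", "fig"]
print(min(words))      # apple
print(max(words))      # pear
print(sorted(words))   # ['apple', 'fig', 'pear']

# Uppercase letters come before lowercase letters in the ordering
mixed = ["banana", "Cherry", "apple"]
print(sorted(mixed))   # ['Cherry', 'apple', 'banana']
print(min(mixed))      # Cherry
```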

## II. Tuples
Now we'll move on to a second type of data structure. We declare a **tuple** by separating values with commas within parentheses (as opposed to the square brackets used for lists).

In [None]:
x_tup = (1, 2, 3)
x_tup

In [None]:
type(x_tup)

Tuples are very similar to lists, except that they are **immutable**. That means once a tuple is declared, we can't modify it or its contents. This is useful if we know a collection of elements is fairly permanent and we don't plan on altering it.

So, for example, while with lists we could append an element to the end using `list.append(item)` and remove elements with `list.pop(i)`, with tuples we need to create a brand new object.

#### TODO
What will happen when we run the code below?
```python
x_tup.append(4)
```

In [None]:
# RUN CELL TO SEE QUIZ
quiz_tup_append

#### TODO
What will happen when we run the code below?
```python
x_tup[1] = "a"
```

In [None]:
# RUN CODE TO SEE QUIZ
quiz_set_tup_index

#### TODO
Which line of code would cause the value of the variable `x_tup` to include the elements `4` and `5`?

- a) `x_tup.extend((4,5))`
- b) `x_tup += (4, 5)`
- c) `x_tup.append(4); x_tup.append(5)`

In [None]:
quiz_x_tup_4_5 
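
What "creating a brand new object" means in practice can be sketched with a hypothetical tuple called `point` and a second name `original` for the same object (neither is a notebook variable). Assigning to an index fails, and "adding" elements builds a new tuple while the old one stays exactly as it was:

```python
# Hypothetical example tuple (not the notebook's x_tup)
point = (1, 2, 3)
original = point           # a second name for the same tuple object

# Item assignment is not allowed on a tuple
assignment_failed = False
try:
    point[0] = 99
except TypeError as err:
    assignment_failed = True
    print(f"TypeError: {err}")

# "Adding" to a tuple builds a new tuple and rebinds the name point
point = point + (4, 5)
print(point)               # (1, 2, 3, 4, 5)
print(original)            # (1, 2, 3)
print(point is original)   # False
```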

Other than changing their contents, we can do a lot of the same things with tuples that we did with lists.

We index them the same way:

In [None]:
x_tup[0]

In [None]:
x_tup[-1]

In [None]:
x_tup[1:]

And we can get the min and max and sort them using the appropriate functions.

In [None]:
max(x_tup)

In [None]:
min(x_tup)

In [None]:
sorted(x_tup)

In [None]:
sorted(x_tup, reverse=True)

In [None]:
x_tuple2 = tuple(sorted(x_tup))
x_tuple2

#### TODO
What data type is returned by `sorted(x_tup)`?

In [None]:
# RUN CODE TO SEE QUIZ
quiz_type_sorted_tup

We can create tuples from lists and vice-versa:

In [None]:
list(x_tup)

#### TODO
Create a list called `x_list` from `x_tup`. What code would test whether `x_list` is equal to `x`? Are they equal? Does `x_list` equal `x_tup`?

In [None]:
____ = ____

In [None]:
# RUN CELL TO TEST VALUE
test_x_list.test(x_list)

In [None]:
# RUN CELL TO SEE QUIZ
quiz_x_equals_x_list
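
To recap conversion and equality, here is a brief sketch with hypothetical variables `as_tuple` and `as_list`. Converting back and forth keeps the elements and their order, but a list and a tuple are never equal to each other, even when their contents match:

```python
# Hypothetical example values (not the notebook's x, x_list, or x_tup)
as_tuple = (3, 1, 2)
as_list = list(as_tuple)
back_to_tuple = tuple(as_list)

print(as_list)                     # [3, 1, 2]
print(back_to_tuple == as_tuple)   # True
print(as_list == [3, 1, 2])        # True
print(as_list == as_tuple)         # False
print(type(sorted(as_tuple)))      # <class 'list'>
```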

In the next notebook, we'll learn about *sets* and *dictionaries*.