# Compound data structures

In addition to simple numeric and string data, Python provides several compound data structures.
They appear in almost every real program, so it is worth knowing them well.

The most important built-in compound data structures are:

- **list** – ordered, mutable sequence
- **set** – unordered collection of unique elements
- **dict** – key–value mapping
- **tuple** – ordered, immutable sequence

All of these are *iterable*, meaning we can loop over their elements one by one.
Because each of them can be built from any iterable, we can usually convert one into another (a dict additionally needs key–value pairs).
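As a rough illustration of iterating and converting, here is a minimal sketch. The variable names and sample values are made up for this example and are not part of the original material.

```python
# Hypothetical sample data, chosen only for illustration
letters = ["b", "a", "b", "c"]

as_tuple = tuple(letters)              # same order, now immutable
as_set = set(letters)                  # duplicates collapse, order is not guaranteed
as_dict = dict([("x", 1), ("y", 2)])   # a dict needs (key, value) pairs

# Iterating works the same way for each of them
for item in as_tuple:
    print(item)

print(as_tuple)
print(sorted(as_set))   # sorted() returns a list, so the printed order is predictable
print(as_dict)
```

Each structure is described in detail in its own section.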

## Lists (`list`)

A list is an ordered, mutable data structure.
Order matters, elements can be changed, and duplicates are allowed.
Lists are indexed similarly to strings.

In [None]:
# List literal using square brackets
values = [2, 8.4, "text"]

# Order matters when comparing lists
[2, 4, 1] == [2, 1, 4]

In [None]:
# Lists are indexed starting from zero
numbers = [10, 20, 30, 50, 60]
numbers[0]

In [None]:
# Negative index counts from the end
numbers[-1]

In [None]:
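# Slicing returns a new list (a notebook cell displays only its last expression)
numbers[0:2]   # the first two elements
numbers[:]     # a copy of the whole list
numbers[-2:]   # the last two elements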

In [None]:
# Reversed slice bounds result in an empty list
numbers[2:1]

In [None]:
# Converting other iterables to a list
list("text")
list(("a", 5))
list({9, 8, 3, 1})
list(range(10, 20, 2))

In [None]:
# List concatenation
[1, "Peter", 8] + ["Lajos", 42]

In [None]:
# Lists are mutable
numbers = [10, 20, 30]
numbers[0] = 100
numbers

In [None]:
# Appending an element
numbers.append("something")
numbers

In [None]:
# Removing elements using del
del numbers[1:3]
numbers

Question:

What happens if you execute the previous cell multiple times? Why?

Try to guess what the result of the next line will be before you run it!

In [None]:
# if a list contains another list, we can index it twice
# (the first indexing gives the inner list, which we can then index further)
list_of_lists = [[1, 2, 3], ["a", "b", "c"]]  # example data (assumed for illustration)
list_of_lists[1][1]

In [None]:
# a list has a length
len([1,2,3])

In [None]:
# and we can check whether something is contained in it or not:
"cockatoo" in ["quail", "pheasant", "cockatoo"]

In [None]:
# or we can sort the data in it (alphabetically or by magnitude)
sorted(["cockatoo", "quail", "pheasant"])
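The sorting cell above can be extended a little. The following sketch, with a hypothetical `sizes` list added for illustration, suggests that `sorted` builds a new list and leaves the original untouched.

```python
birds = ["cockatoo", "quail", "pheasant"]   # same words as in the cell above
sizes = [30, 4, 100]                        # hypothetical numbers

alphabetical = sorted(birds)                # strings are compared alphabetically
by_magnitude = sorted(sizes)                # numbers are compared by value
descending = sorted(sizes, reverse=True)    # reverse=True flips the order

print(alphabetical)
print(by_magnitude)
print(descending)
print(birds)   # the original list is unchanged
```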

## Tuples (`tuple`)

A tuple is very similar to a list, but unlike a list it cannot be modified. It can contain different data types and can be of any length. However, once it is created, it stays that way. At most, we can create a new one from it.

In return, it usually uses a little less memory and can be slightly faster. Because a tuple is immutable (it never changes), it is also *hashable* as long as all of its elements are hashable, so it can be used as a key (see below). Use it when this matters, but above all remember that values separated by commas (often inside parentheses) form a tuple, because this notation is used very frequently in libraries.

When to use it:
- If you explicitly want it not to change.
- If you want to use it as a key in a dict or as an element in a set.
- If you are storing a logically related "package of values" (e.g. coordinates).

Advantages:
- Usually slightly faster and more memory-efficient than a list.
- Safe, because it cannot be modified.

Disadvantages:
- It cannot be edited later; a new one must be created.
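The points above can be tried out with a small sketch. The coordinates and the `city_at` name are hypothetical, and the memory figures are a CPython implementation detail that may differ between versions.

```python
import sys

# Hypothetical coordinates used as dictionary keys
city_at = {(47.5, 19.0): "Budapest", (48.2, 16.4): "Vienna"}
print(city_at[(47.5, 19.0)])

# A list cannot be a key because it is mutable and therefore not hashable
try:
    city_at[[47.5, 19.0]] = "list key"
except TypeError as error:
    print("TypeError:", error)

# A tuple that contains a list is not hashable either
try:
    hash((1, [2, 3]))
except TypeError as error:
    print("TypeError:", error)

# Size of the containers themselves in bytes (exact numbers vary)
print(sys.getsizeof((1, 2, 3)), sys.getsizeof([1, 2, 3]))
```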

In [None]:
data_pair = ("Ákos", 22)
data_pair

('Ákos', 22)

In [None]:
# order also matters in tuples:
("Peter", "Kinga") == ("Kinga", "Peter") # these are not equal

In [None]:
# we can index them just like lists or strings:
data_pair[0] # first
data_pair[-1] # last

In [None]:
# we can convert other things into a fixed-length tuple:
tuple("text") # from string to tuple
tuple([3,4,5]) # from list to tuple

We often see parentheses around tuples. Since parentheses are also used to define the order of operations, this can be confusing. For example, what is (-4)? Is it the number -4 put in parentheses just to make it clear that it is negative, or is it a single-element tuple?

From Python’s point of view, the essence of a tuple is not the parentheses (they are often not even required), but the comma. Something is only considered a tuple if there is a comma in it. If there is no comma, it is not a tuple. Parentheses are usually added only to avoid confusion with other constructs, such as listing function parameters.

In [None]:
# this is a single-element tuple
(-5,) # we do not write anything after the comma; it is only there to make it a tuple

In [None]:
# you could also write it like this; it is still a single-element tuple, just a bit more confusing
-5,

In [None]:
# this, however, is the number -5 in parentheses. This is not a tuple.
(-5)

A tuple can store all kinds of data. Even data that itself is mutable. The tuple itself is still immutable!

In [None]:
# A tuple can contain many different kinds of data:
(("another","tuple"), ["list",9], 99, "peter")

In [None]:
# the fact that a tuple is immutable does not mean
# that the data stored inside it cannot be changed:

data = ([], [9]) # a pair consisting of an empty list and a single-element list
print(data)
data[0].append(100) # add an element to the first list
data[1][0] = 99 # overwrite the first (and only) element of the second list
print(data)

In [None]:
# the tuple itself is what cannot be modified!
# remove the comment and you will get a nice big error!

# data[0] = [2,3,4] # overwrite the first list with another one

Tuples are often used to assign values to multiple variables at once. If there are multiple labels listed with commas on the left-hand side of the assignment, Python tries to "unpack" the compound structure on the right-hand side and assign its elements one by one.

In [None]:
a, b, c = (1,2,3) # assign the three numbers at once!

b # so b gets the value 2

In [None]:
# the essence of a tuple is the comma, not the parentheses; on the right-hand side
# of an assignment it is recognized even without parentheses, so you can safely omit them:
a, b, c = 1, 2, 3 # you can also write it this way

In [None]:
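# if Python cannot unpack it because the number of elements does not match, we get a ValueError
# (run the two lines separately: the first error stops the cell before the second line is reached)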
a, b = 1, 2, 3 # cannot have more... <---- Error!
a, b, c = 1, 2 # and cannot have fewer either!  <---- Error!
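To see both mismatches in a single run, one option is to catch the exceptions. This is only a sketch; the swap at the end is an extra illustration of unpacking with matching counts.

```python
# Too many values on the right-hand side
try:
    a, b = 1, 2, 3
except ValueError as error:
    print("ValueError:", error)

# Too few values on the right-hand side
try:
    a, b, c = 1, 2
except ValueError as error:
    print("ValueError:", error)

# With matching counts unpacking works, which makes swapping two variables a one-liner
x, y = 10, 20
x, y = y, x
print(x, y)
```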

## Sets (`set`)

A set is written as comma-separated values enclosed in curly braces. Like the previous structures, it can contain values of different types, as long as they are hashable. Note that empty braces `{}` create an empty dict; an empty set is written `set()`.

The key idea of a set is that it does not care about order or multiplicity, only about whether something is a member or not. So if you add the same element multiple times, it is kept only once, and the order of insertion is not preserved.
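A minimal sketch of this behaviour follows. The names and values are hypothetical and chosen only for illustration.

```python
# Hypothetical sample values
visitors = {"anna", "bela", "anna", "cili"}   # the duplicate "anna" is kept only once
print(len(visitors))

print("bela" in visitors)     # membership is the main question a set answers

visitors.add("dani")          # sets are mutable: elements can be added...
visitors.discard("anna")      # ...and removed

# Order and multiplicity do not affect equality
print({1, 2, 3} == {3, 2, 1, 1})

# Converting a list to a set removes duplicates (the original order may be lost)
unique_numbers = set([3, 1, 3, 2, 1])
print(sorted(unique_numbers))

empty = set()                 # {} would create an empty dict, not a set
print(type(empty))
```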

## Dictionaries (`dict`)

A dictionary stores values associated with keys.
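Keys must be unique and hashable (in practice, immutable values such as strings, numbers, or tuples); values can be anything.
Since Python 3.7, dictionaries preserve insertion order, but order carries no logical meaning: two dictionaries with the same key–value pairs are equal regardless of order.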

In [None]:
# Dictionary literal
person = {
    "name": "Alice",
    "age": 23,
    "city": "Budapest"
}

person["name"]
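The rules about keys and order can be checked with a short sketch. The dictionaries below are hypothetical examples, not part of the original material.

```python
# Hypothetical sample data
first = {"name": "Alice", "age": 23}
second = {"age": 23, "name": "Alice"}   # same pairs, different insertion order

print(first == second)   # order does not affect equality
print(list(first))       # keys come back in insertion order (Python 3.7+)

# Keys are unique: repeating a key in a literal keeps only the last value
repeated = {"age": 23, "age": 24}
print(repeated)

# Looking up a missing key with [] raises KeyError; "in" and .get() avoid that
print("city" in first)
print(first.get("city", "unknown"))
```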

In [None]:
# Modifying dictionary values
person["age"] = 24
person

In [None]:
# Adding a new key–value pair
person["job"] = "engineer"
person

We can iterate over a dictionary's keys, values, or key–value pairs.

In [None]:
for key in person:
    print(key, person[key])
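The loop above walks over the keys. As a sketch of the other two options, the `keys()`, `values()`, and `items()` methods can be used like this (the dictionary is redefined here so that the example is self-contained):

```python
# Hypothetical dictionary mirroring the one used above
person = {"name": "Alice", "age": 24, "city": "Budapest"}

for key in person.keys():          # same as iterating over the dict directly
    print(key)

for value in person.values():
    print(value)

for key, value in person.items():  # each item is a (key, value) tuple, unpacked here
    print(key, "->", value)

pairs = list(person.items())
print(pairs)
```

Note that `items()` yields tuples, so the tuple unpacking described earlier is what makes `for key, value in ...` work.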

This concludes the introduction to lists, tuples, sets, and dictionaries.
These data structures are fundamental building blocks of Python programs.