Lecture: AI I - Basics 

Previous:
[**Chapter 2.1: Python Basics**](../02_python/01_basics.ipynb)

---

# Chapter 2.2: Data Structures

- [Sequence types](#Sequence-types)
- [Set types](#Set-types)
- [Mapping types](#Mapping-types)
- [Types for time and date](#Types-for-time-and-date)
- [Additional data structures](#Additional-data-structures)

## Sequence types 

In Python, sequences are ordered collections whose items can be accessed by integer index. The most common sequence types are lists, tuples, and ranges; strings are sequences as well.

### Ranges

Ranges represent immutable sequences of integers. They are typically used to drive loops, and they compute their values on demand instead of storing them all in memory. The `range()` function takes one, two, or three arguments, giving you control over the generated sequence:
- the stop value alone
- a start and stop value
- a start, stop, and step value

In [1]:
type(range(10))

range

In [2]:
range(10)  # repr of a range object

range(0, 10)

In [3]:
list(range(10))  # stop value alone

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [4]:
list(range(1, 10))  # start and stop value

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [5]:
list(range(1, 10, 2))  # start, stop, step

[1, 3, 5, 7, 9]
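
As a rough sketch of the memory point above: a range object only stores its start, stop, and step, so its size does not depend on how many numbers it represents. The names `small` and `large` are illustrative, and the absolute byte counts reported by `sys.getsizeof` depend on the Python build, so only their equality is checked here.

```python
import sys

small = range(10)
large = range(10_000_000)

# Both objects store only start, stop, and step, so they have the same size.
print(sys.getsizeof(small) == sys.getsizeof(large))  # True

# Ranges support len(), indexing, slicing, and membership tests
# without building a list first.
print(len(large))          # 10000000
print(large[-1])           # 9999999
print(large[2:10:3])       # range(2, 10, 3)
print(9_999_999 in large)  # True

# A negative step counts downwards; the stop value is still exclusive.
print(list(range(10, 0, -3)))  # [10, 7, 4, 1]
```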

### Lists

Lists are mutable, allowing you to add, remove, or change elements, making them ideal for general-purpose data storage.

In [44]:
l1 = list()
l2 = []

type(l1), type(l2)

(list, list)

#### Common Sequence Operations

Python supports [common sequence operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations) like indexing, slicing, concatenation, repetition, and checking membership, which work consistently across lists, tuples, and _strings_.

| Operation | Description |
|-----------|-------------|
| `x in s` | Check if `x` is in `s` |
| `x not in s` | Check if `x` is not in `s` |
| `s + t` | Concatenate sequences `s` and `t` |
| `s * n` or `n * s` | Repeat sequence `s` `n` times |
| `s[i]` | Access the `i`-th element of sequence `s` |
| `s[i:j]` | Slice sequence `s` from index `i` to `j` |
| `s[i:j:k]` | Slice sequence `s` from index `i` to `j`, stepping by `k` |
| `len(s)` | Get the length of sequence `s` |
| `min(s)` | Get the minimum value in sequence `s` |
| `max(s)` | Get the maximum value in sequence `s` |
| `s.index(x[, start[, end]])` | Get the index of the first occurrence of `x` in `s`, optionally within a specified range |
| `s.count(x)` | Count the occurrences of `x` in sequence `s` |

In [6]:
nums = list(range(10))
nums

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The `in` and `not in` operators check whether an item exists within a sequence. Prefer `x not in s` over the equivalent but harder to read `not x in s`.

In [7]:
print(0 in nums)
print(10 not in nums)

True
True


In Python, the `+` operator concatenates sequences, joining them together, while the `*` operator repeats a sequence a specified number of times.

In [8]:
print(nums + nums)
print(nums * 3)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


The `min()` and `max()` functions return the smallest and largest items in a sequence, while `len()` returns the number of elements in the sequence.

In [9]:
print(min(nums))
print(max(nums))
print(len(nums))

0
9
10


The `key` keyword argument of `min()` and `max()` takes a function that is applied to each item. The results of that function are compared instead of the items themselves, which lets you define custom criteria for the smallest or largest item.

In [10]:
names = ["Alice", "Bob", "Charlie"]

print(min(names))  # min by alphabetical order
print(max(names))  # max by alphabetical order

Alice
Charlie


In [11]:
print(min(names, key=len))  # min by length of name
print(max(names, key=len))  # max by length of name

Bob
Charlie


The `.index()` method returns the position of the first occurrence of a value in a sequence, while the `.count()` method returns the number of times a value appears in the sequence.

In [12]:
print(nums.index(5))
print(nums.count(5))

5
1
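
The optional `start` argument from the table above, and the behaviour for missing values, can be sketched as follows. The list `letters` is made up for this illustration.

```python
letters = list("banana")  # ['b', 'a', 'n', 'a', 'n', 'a']

print(letters.index("a"))     # 1, the first occurrence
print(letters.index("a", 2))  # 3, the first occurrence at or after index 2
print(letters.count("a"))     # 3

# .index() raises ValueError when the value is missing,
# so test membership first or catch the exception.
try:
    letters.index("z")
except ValueError as err:
    print("not found:", err)
```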


#### Indexing and Slicing

In Python, indexing allows you to access individual elements in a sequence using their position, with indices starting at 0 for the first item. Negative indices can be used to access elements from the end of the sequence. Slicing lets you extract a range of elements by specifying a start, stop, and optional step, creating a new subsequence without modifying the original.

In [13]:
print(nums[1]) # second element
print(nums[-1]) # last element

1
9


The following are a few slicing examples. Slice the list from index 2 to the end:

In [14]:
nums[2:]

[2, 3, 4, 5, 6, 7, 8, 9]

Slice the list from the start to index 2 (exclusive):

In [15]:
nums[:2]

[0, 1]

Slice the list from the second last element to the end:

In [16]:
nums[-2:]

[8, 9]

Slice the list from index 2 to 5 (exclusive):

In [17]:
nums[2:5]

[2, 3, 4]

Slice the list every second element:

In [18]:
nums[::2]

[0, 2, 4, 6, 8]

Slice the list every second element starting from index 1:

In [19]:
nums[1::2]

[1, 3, 5, 7, 9]

Slice the list from index 1 to 7 (exclusive), every second element:

In [20]:
nums[1:7:2]

[1, 3, 5]

#### Mutable Sequence Operations

Lists support various [operations](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types) for modifying their contents, such as adding, removing, and changing elements. These operations include:

| Operation | Description |
|-----------|-------------|
| `s.append(x)` | Add `x` to the end of sequence `s` |
| `s.extend(t)` or `s += t` | Extend sequence `s` in place by appending the elements of `t` (plain `s + t` creates a new list instead) |
| `s.insert(i, x)` | Insert `x` at index `i` in sequence `s` |
| `s.remove(x)` | Remove the first occurrence of `x` from sequence `s` |
| `s.pop([i])` | Remove and return the item at index `i` from sequence `s`, or the last item if `i` is not specified |
| `s.clear()` or `del s[:]` | Remove all items from sequence `s` |
| `s.sort(key=None, reverse=False)` | Sort the items of sequence `s` in ascending order, optionally using a custom key function and reversing the order |
| `s.reverse()` | Reverse the order of items in sequence `s` |
| `s.copy()` or `s[:]` | Create a shallow copy of sequence `s` |
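
What "shallow copy" in the last table row means can be sketched with a nested list. The names below are illustrative, and `copy.deepcopy` from the standard library is shown only for contrast.

```python
import copy

matrix = [[1, 2], [3, 4]]

alias = matrix                # same list object, no copy at all
shallow = matrix.copy()       # new outer list, inner lists are shared
deep = copy.deepcopy(matrix)  # new outer list and new inner lists

matrix.append([5, 6])  # changes the outer list only
matrix[0][0] = 99      # changes an inner list that the shallow copy shares

print(alias)    # [[99, 2], [3, 4], [5, 6]]
print(shallow)  # [[99, 2], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```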

You can add elements to a list using `.append()` to add a single item at the end, `.extend()` to add multiple items from another iterable, and `.insert()` to add a single item at a specific position.

In [21]:
nums.append(10)
nums

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [22]:
nums.extend([11, 12, 13])
nums

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

In [23]:
nums.insert(10, "test")
nums

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 'test', 10, 11, 12, 13]

You can use `.remove()` to delete the first occurrence of a specific value in a list, while `.pop()` removes and returns an item at a given index (or the last item by default).

In [24]:
nums.remove("test")
nums

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]

In [25]:
nums.pop(0)
nums.pop()
nums

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

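You can use `.reverse()` to reverse the order of elements in a list in place, and `.sort()` to arrange the list items in ascending order by default or with a custom key for specific sorting needs. The built-in `reversed()` and `sorted()` functions leave the list unchanged: `reversed()` returns an iterator over the items in reverse order, and `sorted()` returns a new sorted list.

**Hint**: A shorter way to get a reversed list ([code golf](https://codegolf.stackexchange.com/)) is the slice `s[::-1]`. Unlike `.reverse()`, it returns a new list and leaves `s` unchanged.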

In [26]:
nums.reverse()
nums

[12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

In [42]:
list(reversed(nums))

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

In [29]:
nums.sort()  # sort in place
nums

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

In [30]:
sorted(nums)  # sorted returns a new sorted list

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

When sorting, the `key` parameter lets you specify a function whose return value is used for ordering, and the `reverse` parameter can be set to `True` to sort in descending order.

In [31]:
print(names)
print(sorted(names))  # sorted returns a new sorted list
print(sorted(names, key=len))  # sorted by length of name
print(sorted(names, reverse=True))  # sorted in reverse order
print(sorted(names, key=len, reverse=True))  # sorted by length of name in reverse

['Alice', 'Bob', 'Charlie']
['Alice', 'Bob', 'Charlie']
['Bob', 'Alice', 'Charlie']
['Charlie', 'Bob', 'Alice']
['Charlie', 'Alice', 'Bob']


#### Indexing and Slicing Operations

You can change elements in a list by assigning a new value to an index or slice, and you can delete elements using `del` with an index or slice to remove specific items or ranges from the list.

In [32]:
nums[0] = -1  # change first element
nums

[-1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

In [33]:
nums[0:2] = [0, 1]  # change first two elements
nums

[0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

In [34]:
nums[1:6:2] = [100, 200, 300]  # change every second element from index 1 to 6 (exclusive)
nums

[0, 100, 3, 200, 5, 300, 7, 8, 9, 10, 11, 12]

In [35]:
del nums[1:3]  # delete elements from index 1 to 3 (exclusive)
nums

[0, 200, 5, 300, 7, 8, 9, 10, 11, 12]
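
One detail of slice assignment is easy to miss: a plain slice can be replaced by a sequence of any length, while a slice with a step needs exactly as many new items as it selects. A minimal sketch, using a made-up list `values`:

```python
values = [0, 1, 2, 3, 4, 5]

# A plain slice may be replaced by a sequence of a different length.
values[1:3] = ["a", "b", "c", "d"]
print(values)  # [0, 'a', 'b', 'c', 'd', 3, 4, 5]

# Assigning an empty list removes the slice, like del values[1:5].
values[1:5] = []
print(values)  # [0, 3, 4, 5]

# A slice with a step selects two items here, so three new items do not fit.
try:
    values[::2] = [10, 20, 30]
except ValueError as err:
    print("ValueError:", err)

print(values)  # [0, 3, 4, 5], unchanged after the failed assignment
```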

### Tuples

Tuples are immutable sequences, meaning their contents cannot be changed after creation, which is useful for fixed collections of items or when you want to ensure data integrity.

In [36]:
t1 = ()
t2 = 1, # or t2 = (1,)
t3 = 1, 2  # or t3 = (1, 2)  
t4 = tuple([1, 2, 3])

type(t1), type(t2), type(t3), type(t4)

(tuple, tuple, tuple, tuple)
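
The immutability described above can be sketched in three steps: item assignment fails, tuples of hashable items can act as dictionary keys, and a mutable object stored inside a tuple can still change. All names are illustrative.

```python
point = (3, 4)

# Item assignment is not supported on tuples.
try:
    point[0] = 10
except TypeError as err:
    print("TypeError:", err)

# Tuples of hashable items are hashable, so they can serve as dictionary keys.
distances = {(0, 0): 0.0, (3, 4): 5.0}
print(distances[point])  # 5.0

# Immutability is shallow: a list stored inside a tuple can still be modified.
record = ("Alice", [90, 85])
record[1].append(70)
print(record)  # ('Alice', [90, 85, 70])
```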

In Python, unpacking allows you to assign elements from a sequence to multiple variables in a single statement, making your code cleaner and more readable. An example is the built-in `divmod()` function, which returns both the quotient and the remainder of a division as a tuple:

In [37]:
result = divmod(10, 3)
print(type(result), result)

<class 'tuple'> (3, 1)


In [38]:
quotient, remainder = divmod(10, 3)
print(quotient, remainder)

3 1


In Python, you can unpack arbitrarily nested sequences, meaning you can assign values from nested structures like in the following example:

In [39]:
a, (b, c) = 1, (2, 3)
print(a, b, c)

1 2 3


You can also prefix one target with `*` to collect any number of remaining elements into a list, which is particularly useful when only the first or last few elements matter:

In [40]:
a, b, *c, d = 1, 2, 3, 4, 5
print(a, b, c, d)

1 2 [3, 4] 5


### Binary Sequence Types 

In addition to lists, tuples, and ranges, there are also the binary sequence types `bytes` and `bytearray` for handling binary data in Python.
- [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes-objects): Immutable sequences of bytes for binary data.
- [`bytearray`](https://docs.python.org/3/library/stdtypes.html#bytearray-objects): Mutable sequences of bytes, allowing modification of binary data.
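
A minimal sketch of the difference between the two types, with illustrative variable names. The text is written with a `\u` escape only to keep the example ASCII-safe.

```python
data = bytes([72, 105])  # immutable; each item is an int in range(256)
print(data)     # b'Hi'
print(data[0])  # 72

buffer = bytearray(data)  # mutable copy of the same bytes
buffer[0] = ord("h")      # in-place change works on a bytearray
buffer.extend(b"!")
print(buffer)   # bytearray(b'hi!')

# Converting between str and bytes requires an explicit encoding.
text = "caf\u00e9"
encoded = text.encode("utf-8")
print(len(text), len(encoded))          # 4 5
print(encoded.decode("utf-8") == text)  # True
```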

## Set types

In Python, a [set](https://docs.python.org/3/library/stdtypes.html#set-types-set-frozenset) is an unordered collection of unique, hashable elements. Sets are useful for removing duplicates, for fast membership tests, and for operations such as union, intersection, and difference. Sets are mutable, so you can add or remove items after creation. A `frozenset` is the immutable counterpart: its contents cannot change once it is created, which makes it hashable and therefore usable as a dictionary key or as an element of another set.
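
The operations named above can be sketched as follows. The data is made up, and no printed output is given for the raw sets because their display order is not guaranteed.

```python
tags = ["ai", "python", "ai", "data", "python"]

unique_tags = set(tags)     # duplicates are dropped
print(sorted(unique_tags))  # ['ai', 'data', 'python']

a = {1, 2, 3, 4}
b = {3, 4, 5}
print(a | b)  # union
print(a & b)  # intersection
print(a - b)  # difference
print(a ^ b)  # symmetric difference

a.add(10)
a.discard(1)  # like remove(), but no error if the item is missing

empty = set()  # note: {} creates an empty dict, not an empty set

frozen = frozenset(b)
lookup = {frozen: "b as a dict key"}  # a plain set here would raise TypeError
print(lookup[frozenset({5, 4, 3})])   # b as a dict key
```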

## Mapping types
dict
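
This section is only outlined so far. As a minimal sketch of what `dict` offers, assuming nothing beyond the built-in type (the names and values are invented for illustration):

```python
ages = {"Alice": 30, "Bob": 25}

ages["Charlie"] = 35     # add a new key or overwrite an existing one
print(ages["Alice"])     # 30; a missing key raises KeyError
print(ages.get("Dave"))  # None; .get() avoids the KeyError
print("Bob" in ages)     # True; membership tests check the keys

removed = ages.pop("Bob")  # remove a key and return its value
print(removed)             # 25

# Dictionaries keep insertion order (guaranteed since Python 3.7).
for name, age in ages.items():
    print(name, age)

squares = {n: n * n for n in range(4)}  # dict comprehension
print(squares)  # {0: 0, 1: 1, 2: 4, 3: 9}
```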

## Types for time and date 
datetime, date, time, timedelta, tzinfo, timezone, zoneinfo, calendar
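
This section is only outlined so far. A small sketch of how some of the listed types from the standard `datetime` module fit together; the dates are arbitrary, and `zoneinfo` and `calendar` are left out to keep the example short.

```python
from datetime import date, datetime, timedelta, timezone

start = datetime(2024, 1, 31, 9, 30, tzinfo=timezone.utc)
end = start + timedelta(days=1, hours=2)  # date arithmetic uses timedelta

print(end.isoformat())  # 2024-02-01T11:30:00+00:00
print(end - start)      # 1 day, 2:00:00
print(end.date() == date(2024, 2, 1))  # True

# Parse an ISO 8601 string and format with explicit format codes.
parsed = datetime.fromisoformat("2024-02-01T11:30:00+00:00")
print(parsed == end)                   # True
print(end.strftime("%d.%m.%Y %H:%M"))  # 01.02.2024 11:30
```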

## Additional data structures

- [collections.abc](https://docs.python.org/3/library/collections.abc.html)
- [heapq](https://docs.python.org/3/library/heapq.html)
- [bisect](https://docs.python.org/3/library/bisect.html)
- [array](https://docs.python.org/3/library/array.html)
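
To give a first impression of two of the linked modules, here is a brief sketch of `heapq` and `bisect`, both of which operate on ordinary lists. The numbers are arbitrary.

```python
import bisect
import heapq

# heapq keeps the smallest item at index 0 of a plain list.
tasks = [5, 1, 4]
heapq.heapify(tasks)
heapq.heappush(tasks, 2)
print(heapq.heappop(tasks))       # 1
print(heapq.nsmallest(2, tasks))  # [2, 4]

# bisect finds insertion points in an already sorted list by binary search.
scores = [10, 20, 30, 40]
print(bisect.bisect_left(scores, 30))  # 2
bisect.insort(scores, 25)  # insert while keeping the list sorted
print(scores)              # [10, 20, 25, 30, 40]
```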

---

Next: [**Chapter 2.3: Control Flow**](../02_python/03_control_flow.ipynb)