Lecture: AI I - Basics 

Previous:
[**Chapter 2.1: Python Basics**](../02_python/01_basics.ipynb)

---

# Chapter 2.2: Data Structures

- [Sequence types](#Sequence-types)
- [Set types](#Set-types)
- [Mapping types](#Mapping-types)
- [Types for time and date](#Types-for-time-and-date)
- [Additional data structures](#Additional-data-structures)

## Sequence types 

In Python, sequences are ordered collections of items that can be accessed by their integer index. The most common sequence types are lists, tuples, and ranges. Lists are mutable, while tuples and ranges are immutable.

### Ranges

Ranges represent immutable sequences of numbers, typically used for generating number sequences in loops efficiently without storing all values in memory at once. The `range()` function can take one, two, or three parameters, giving you control over the generated number sequence: 
- the stop value alone
- a start and stop value
- a start, stop, and step value

In [16]:
type(range(10))

range

In [None]:
range(10)  # repr of a range object

range(0, 10)

In [None]:
list(range(10))  # stop value alone

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [None]:
list(range(1, 10))  # start and stop value

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [10]:
list(range(1, 10, 2))  # start, stop, step

[1, 3, 5, 7, 9]
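Because a range only stores its start, stop, and step values, it supports `len()`, membership tests, and indexing without ever building the full list. The following is a minimal sketch of that behaviour; the variable names `big` and `countdown` are purely illustrative.

```python
# A range keeps only start, stop, and step, not the individual numbers.
big = range(0, 1_000_000, 5)

print(len(big))        # number of values in the range
print(999_995 in big)  # membership test without creating a list
print(big[3])          # indexing works as on other sequences

# A negative step counts downwards; the stop value is still exclusive.
countdown = list(range(10, 0, -2))
print(countdown)
```

Converting to a list with `list()` is only needed when you actually want all values at once, for example for printing.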

### Lists

Lists are mutable, allowing you to add, remove, or change elements, making them ideal for general-purpose data storage.
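As a small sketch of what mutability means in practice, the example below adds, changes, and removes elements of a list in place. The list `fruits` and its contents are made up for illustration.

```python
fruits = ["apple", "banana"]  # illustrative example data

fruits.append("cherry")       # add an element at the end
fruits.insert(0, "apricot")   # add an element at a given index
fruits[1] = "avocado"         # change an element in place
fruits.remove("banana")       # remove the first matching element
last = fruits.pop()           # remove and return the last element

print(fruits)
print(last)
```

All of these operations modify the existing list object instead of creating a new one.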

#### Common Sequence Operations

Python supports common sequence operations like indexing, slicing, concatenation, repetition, and checking membership, which work consistently across lists, tuples, and _strings_.

| Operation | Description |
|-----------|-------------|
| `x in s` | Check if `x` is in `s` |
| `x not in s` | Check if `x` is not in `s` |
| `s + t` | Concatenate sequences `s` and `t` |
| `s * n` or `n * s` | Repeat sequence `s` `n` times |
| `s[i]` | Access the `i`-th element of sequence `s` |
| `s[i:j]` | Slice sequence `s` from index `i` to `j` (exclusive) |
| `s[i:j:k]` | Slice sequence `s` from index `i` to `j` (exclusive), stepping by `k` |
| `len(s)` | Get the length of sequence `s` |
| `min(s)` | Get the minimum value in sequence `s` |
| `max(s)` | Get the maximum value in sequence `s` |
| `s.index(x[, start[, end]])` | Get the index of the first occurrence of `x` in `s`, optionally within a specified range |
| `s.count(x)` | Count the occurrences of `x` in sequence `s` |

In [45]:
nums = list(range(10))
nums

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The `in` and `not in` operators check whether an item exists within a sequence. Prefer `x not in s` over `not x in s`: both give the same result, but the former is easier to read.

In [46]:
print(0 in nums)
print(10 not in nums)

True
True


In Python, the `+` operator concatenates sequences, joining them together, while the `*` operator repeats a sequence a specified number of times.

In [47]:
print(nums + nums)
print(nums * 3)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


The `min()` and `max()` functions return the smallest and largest items in a sequence, while `len()` returns the number of elements in the sequence.

In [48]:
print(min(nums))
print(max(nums))
print(len(nums))

0
9
10


The `key` keyword argument of `max()` and `min()` lets you pass a function that is applied to each item before comparison. The items are then compared by the function's return values, which enables custom comparison criteria when finding the largest or smallest item.

In [49]:
names = ["Alice", "Bob", "Charlie"]

print(min(names))  # min by alphabetical order
print(max(names))  # max by alphabetical order

Alice
Charlie


In [50]:
print(min(names, key=len))  # min by length of name
print(max(names, key=len))  # max by length of name

Bob
Charlie
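Any callable that takes one item can serve as `key`, including `lambda` expressions. The sketch below uses a made-up list `words` to contrast the default comparison with two custom ones.

```python
words = ["banana", "Cherry", "apple"]  # illustrative data

# Default string comparison: uppercase letters sort before lowercase ones.
smallest_default = min(words)

# Compare case-insensitively by lowercasing each item first.
smallest_ignoring_case = min(words, key=str.lower)

# Compare by the last letter of each word.
largest_by_last_letter = max(words, key=lambda w: w[-1])

print(smallest_default, smallest_ignoring_case, largest_by_last_letter)
```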


The `.index()` method returns the position of the first occurrence of a value in a sequence, while the `.count()` method returns the number of times a value appears in the sequence.

In [51]:
print(nums.index(5))
print(nums.count(5))

5
1
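The optional `start` argument of `.index()` is handy for finding later occurrences, and it is worth knowing that `.index()` raises a `ValueError` when the value is missing. A minimal sketch, using an illustrative list `letters`:

```python
letters = ["a", "b", "a", "c", "a"]  # illustrative data

first = letters.index("a")      # search the whole list
second = letters.index("a", 1)  # start searching at index 1
print(first, second)

# .index() raises ValueError for missing values,
# so check membership first (or catch the exception).
if "z" in letters:
    print(letters.index("z"))
else:
    print("'z' not found")

total = letters.count("a")      # .count() simply returns 0 for missing values
print(total)
```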


#### Indexing and Slicing

In Python, indexing allows you to access individual elements in a sequence using their position, with indices starting at 0 for the first item. Negative indices can be used to access elements from the end of the sequence. Slicing lets you extract a range of elements by specifying a start, stop, and optional step, creating a new subsequence without modifying the original.

In [52]:
print(nums[1]) # second element
print(nums[-1]) # last element

1
9


The following are a few slicing examples:

In [53]:
nums[2:]  # slice from index 2 to the end

[2, 3, 4, 5, 6, 7, 8, 9]

In [None]:
nums[:2]  # slice from the start to index 2 (exclusive)

[0, 1]

In [61]:
nums[-2:]  # slice from the second last element to the end

[8, 9]

In [None]:
nums[2:5] # slice from index 2 to 5 (exclusive)

[2, 3, 4]

In [56]:
nums[::2]  # slice every second element

[0, 2, 4, 6, 8]

In [None]:
nums[1::2]  # slice every second element starting from index 1

[1, 3, 5, 7, 9]

In [58]:
nums[1:7:2]  # slice from index 1 to 7 (exclusive), every second element

[1, 3, 5]

In [None]:
nums[::-1]  # a step of -1 returns a reversed copy of the sequence

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
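Two details of slicing are easy to miss: a slice is a new object, and slice bounds that fall outside the sequence are clipped, whereas an out-of-range index raises an error. The following sketch demonstrates both; `part` is an illustrative name.

```python
nums = list(range(10))

part = nums[2:5]    # slicing creates a new list
part[0] = 99        # changing the copy leaves the original untouched
print(part)
print(nums[:6])

print(nums[8:100])  # out-of-range slice bounds are clipped silently

try:
    nums[100]       # an out-of-range index raises IndexError
except IndexError as exc:
    print("IndexError:", exc)
```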

### Tuples

Tuples are immutable sequences, meaning their contents cannot be changed after creation, which is useful for fixed collections of items or when you want to ensure data integrity.

In [18]:
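t1 = ()  # empty tuple
t2 = 1,  # single-element tuple, the trailing comma is required; same as (1,)
t3 = 1, 2  # parentheses are optional; same as (1, 2)
t4 = tuple([1, 2, 3])  # build a tuple from any iterable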

type(t1), type(t2), type(t3), type(t4)

(tuple, tuple, tuple, tuple)
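As a brief sketch of what immutability implies, the example below shows that item assignment on a tuple fails, and that tuples of hashable elements can be used as dictionary keys. The names `point` and `distances` are illustrative.

```python
point = (1, 2)  # illustrative tuple

try:
    point[0] = 10  # tuples do not support item assignment
except TypeError as exc:
    print("TypeError:", exc)

# A tuple is hashable if all its elements are, so it can be a dictionary key.
distances = {(0, 0): 0.0, (3, 4): 5.0}
print(distances[(3, 4)])
```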

In Python, unpacking allows you to assign the elements of a sequence to multiple variables in a single statement, making your code cleaner and more readable. An example is the built-in `divmod()` function, which returns both the quotient and the remainder of an integer division as a tuple:

In [21]:
result = divmod(10, 3)
print(type(result), result)

<class 'tuple'> (3, 1)


In [20]:
quotient, remainder = divmod(10, 3)
print(quotient, remainder)

3 1


In Python, you can unpack arbitrarily nested sequences, meaning you can assign values from nested structures like in the following example:

In [23]:
a, (b, c) = 1, (2, 3)
print(a, b, c)

1 2 3


You can also prefix one target with `*` to collect any number of elements into a list, which is particularly useful when you want to capture the remaining elements of a sequence:

In [27]:
a, b, *c, d = 1, 2, 3, 4, 5
print(a, b, c, d)

1 2 [3, 4] 5
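A few more unpacking patterns, sketched under the assumption that the variable names are free to choose: splitting a sequence into head and tail, unpacking a string, and swapping two variables.

```python
first, *rest = [1, 2, 3, 4]  # head and tail of a list
print(first, rest)

*init, last = "abc"          # works on any iterable, e.g. a string
print(init, last)

x, y = 1, 2
x, y = y, x                  # swap two variables without a temporary
print(x, y)
```

The starred target always receives a list, even if it captures zero or one element.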


### Binary Sequence Types 

In addition to lists, tuples, and ranges, there are also the binary sequence types bytes and bytearray for handling binary data in Python.
* [Bytes](https://docs.python.org/3/library/stdtypes.html#bytes-objects): Immutable sequences of bytes for binary data.
* [Bytearray](https://docs.python.org/3/library/stdtypes.html#bytearray-objects): Mutable sequences of bytes, allowing modification of binary data.
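The difference between the two binary types can be sketched as follows. The variable names and the sample text are illustrative; the non-ASCII character is written as an escape so that the snippet does not depend on the file encoding.

```python
data = b"hello"        # bytes literal (immutable)
print(data[0])         # indexing yields an int in range 0..255
print(data[1:3])       # slicing yields a new bytes object

buf = bytearray(data)  # mutable copy of the same bytes
buf[0] = ord("j")      # replace the first byte in place
buf.extend(b"!")       # append more bytes
print(buf)

# Text is converted to and from bytes via an explicit encoding.
word = "caf\u00e9"
encoded = word.encode("utf-8")
print(len(word), len(encoded))  # characters vs. bytes
print(encoded.decode("utf-8") == word)
```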

## Set types
Python's built-in set types are `set` (mutable) and `frozenset` (immutable).
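As a minimal, illustrative sketch of the two set types named above (the variable names are assumptions, not part of the lecture material):

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b          # elements in either set
intersection = a & b   # elements in both sets
difference = a - b     # elements only in a
print(union, intersection, difference)

a.add(3)               # adding an existing element has no effect
print(len(a))

# A frozenset is immutable and hashable, so it can be a dictionary key.
frozen = frozenset(a)
labels = {frozen: "small numbers"}
print(labels[frozenset({1, 2, 3})])
```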

## Mapping types
Python's built-in mapping type is `dict`.
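A short, illustrative sketch of basic `dict` usage follows; the dictionary `ages` and its contents are made up for the example.

```python
ages = {"Alice": 30, "Bob": 25}  # illustrative data

ages["Charlie"] = 35             # add a new key
ages["Bob"] += 1                 # update an existing value
print(ages["Alice"])             # look up by key
print(ages.get("Dave", "unknown"))  # .get() returns a default for missing keys

for name, age in ages.items():   # iterate over key/value pairs
    print(name, age)

del ages["Alice"]                # remove a key
print("Alice" in ages)           # membership tests check the keys
```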

## Types for time and date 
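The `datetime` module provides the classes `datetime`, `date`, `time`, `timedelta`, `tzinfo`, and `timezone`. Related standard-library modules are `zoneinfo` and `calendar`.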
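The following is a small sketch of how some of these types interact, limited to `date`, `datetime`, `timedelta`, and `timezone`. All dates and variable names are illustrative.

```python
from datetime import date, datetime, timedelta, timezone

start = date(2024, 1, 31)
later = start + timedelta(days=30)  # date arithmetic with timedelta
print(later)

# A timezone-aware datetime in UTC.
meeting = datetime(2024, 5, 1, 14, 30, tzinfo=timezone.utc)
print(meeting.isoformat())

# Subtracting two datetimes yields a timedelta.
end = datetime(2024, 5, 2, 16, 0, tzinfo=timezone.utc)
duration = end - meeting
print(duration, duration.total_seconds())
```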

## Additional data structures

* [collections.abc](https://docs.python.org/3/library/collections.abc.html)
* [heapq](https://docs.python.org/3/library/heapq.html)
* [bisect](https://docs.python.org/3/library/bisect.html)
* [array](https://docs.python.org/3/library/array.html)

---

Next: [**Chapter 2.3: Control Flow**](../02_python/03_control_flow.ipynb)