# Introduction to Python for Data Science
### Tomasz Rodak
## Lab I

2024/2025, winter semester

---

## Assessment

In-class written assessment, last class.

## Literature


* [The Python Tutorial](https://docs.python.org/3/tutorial/index.html)
* [Dive Into Python 3](https://diveintopython3.net/index.html)
* [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
* [Python 3 documentation](https://docs.python.org/3/index.html)



## Thonny IDE

Thonny is a beginner-friendly Integrated Development Environment (IDE) for Python. Its window is divided into three panes: the editor, the shell, and the debugger. The most important are the editor and the shell.

### Editor

The editor is where you write your code. It has a number of features that help you write code, such as syntax highlighting, code completion, and error highlighting. 

The editor is the upper pane in the Thonny window.

### Shell

The shell is where you can interact with the Python interpreter in the so-called Read-Eval-Print Loop (REPL) mode. REPL is a simple interactive programming environment that takes single user inputs, evaluates them, and returns the result to the user. The shell is useful for testing out small pieces of code or for experimenting with the Python language.

The shell is the lower pane in the Thonny window.

Here's a simple example of using Python's REPL:

```python
>>> 2 + 2
4
>>> print("Hello, World!")
Hello, World!
>>> x = 10
>>> x * 5
50
```

Each line beginning with `>>>` represents user input. The line(s) following show the output of that input.
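
The same statements can also be typed into the editor and run as a script. The sketch below illustrates the one practical difference, assuming the file is run with Thonny's Run command: the shell echoes the value of a bare expression, while a script displays only what is passed to `print()`. The variable name `result` is illustrative.

```python
# In a script, a bare expression is evaluated but nothing is displayed.
2 + 2

# To see a value when running a script, print it explicitly.
print(2 + 2)      # displays 4

x = 10
result = x * 5
print(result)     # displays 50
```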

## Basic arithmetic operations

Python supports the following basic arithmetic operations:

* addition: `+`
* subtraction: `-`
* multiplication: `*`
* division: `/`
* integer division: `//`
* remainder: `%`
* exponentiation: `**`

The order of operations is the same as in mathematics. Round brackets can be used to change the order of operations.
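
A minimal sketch of how precedence plays out, written as a script for the editor. The numbers and variable names are illustrative and are not taken from the exercises. Two details are worth noticing: `**` binds tighter than a unary minus on its left, and a chain of `**` is evaluated from right to left.

```python
# Multiplication is performed before addition unless brackets say otherwise.
a = 1 + 2 * 5        # 1 + 10
b = (1 + 2) * 5      # 3 * 5

# Exponentiation binds tighter than the unary minus on its left ...
c = -3 ** 2          # -(3 ** 2)
# ... and chained exponentiation is evaluated from right to left.
d = 2 ** 3 ** 2      # 2 ** (3 ** 2) = 2 ** 9

# Integer division and remainder fit together: n == (n // m) * m + n % m
n = 17
m = 5
q = n // m           # quotient
r = n % m            # remainder

print(a, b, c, d)             # 11 15 -9 512
print(q, r, q * m + r == n)   # 3 2 True
```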

### Exercise 1.1

Calculate the following expressions. Use the shell in Thonny.

* $2 + 2$
* $3 \times 4$
* $5^2$
* $7 \div 3$
* $7 \div 3$ (integer division)
* $7 \mod 3$

---

### Exercise 1.2

More complex expressions:

* $2 + 3 \times 4$
* $(2 + 3) \times 4$
* $2^{10}$
* $2^{10} \mod 3$

---

## Variables, assignment, and data types

In Python, a variable is a name that refers to a value stored in memory. 

Variables are created directly through assignment, using the `=` operator, or indirectly through other operations that create objects.

Examples of variable assignment:
```python
>>> x = 10
>>> x = x + 1
>>> x
11
>>> y = 3.14
>>> 2 * y
6.28
>>> z = "Hello, World!"
>>> z + " " + z
'Hello, World! Hello, World!'
```

Variables can be reassigned to different values, including values of a different type, during program execution or in the shell:
    
```python
>>> x = 10
>>> x = 20
>>> x
20
>>> x = "Hello, World!"
>>> x
'Hello, World!'
```

Every piece of data in Python is an object with a specific type. Variables do not have a type, but the values (objects) they refer to do. The type of an object determines what operations can be performed on it and how it behaves.

You can check the type of an object using the `type()` function:

```python
>>> x = 10
>>> type(x) # type of object referred to by x, not the type of x
<class 'int'>
>>> type(3.14)
<class 'float'>
>>> type("Hello, World!")
<class 'str'>
```
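
To illustrate how the type of an object determines what an operator does, here is a small sketch in script form. The variable names are illustrative; `int()` and `str()` are the built-in conversion functions.

```python
# The same operator does different things for different types.
number_sum = 3 + 4        # integer addition
text_sum = "3" + "4"      # string concatenation
print(number_sum)         # 7
print(text_sum)           # 34

# Mixing the two types is an error rather than a silent guess:
# "3" + 4 would raise a TypeError.

# Converting explicitly makes the intent clear.
as_number = int("3") + 4
as_text = "3" + str(4)
print(as_number)          # 7
print(as_text)            # 34
```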

## Numeric types

Python supports three built-in numeric types:

* integers (`int`)
* floating-point numbers (`float`)
* complex numbers (`complex`)
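
A short sketch creating one value of each numeric type. The names `n`, `x`, and `z` are illustrative. Note that Python writes the imaginary unit as `j`.

```python
n = 42          # int: a whole number
x = 0.5         # float: a number with a fractional part
z = 3 + 4j      # complex: real part 3, imaginary part 4

print(type(n), type(x), type(z))
print(z.real, z.imag)   # 3.0 4.0
print(abs(z))           # 5.0, the modulus sqrt(3**2 + 4**2)
print(z * z)            # (-7+24j)
```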

### Exercise 1.3

Integers in Python have an interesting property: they can be arbitrarily large. Calculate the following expressions:

* $2^{1000}$
* $2^{10000}$
* $2^{2^{10}}$

---

### Exercise 1.4

The type of the object returned by an operation may depend on the types of the operands. Experiment with the following operations in Python and determine the type of the result in each case:

* `5 + 3`
* `5 + 3.0`
* `5 / 2`
* `5 // 2`
* `5 * 2`
* `5 * 2.0`
* `100 ** 0.5`
* `4 ** (-1/2)`

For each operation:

1. Predict the result and its type before running the code.
2. Use the Python interpreter to perform the operation.
3. Use the `type()` function to check the actual type of the result.
4. Explain why the result has that particular type.

---

### Exercise 1.5

Previously, we saw that Python can compute powers of $2$ as large as $2^{10000}$. This is possible because Python uses arbitrary-precision integer arithmetic. However, the same is not true for floating-point numbers. What happens if you try to compute `2.0**10000`? You'll notice that this calculation raises an `OverflowError` exception. The reason is that floating-point numbers in Python are represented using a fixed number of bits (64 on typical platforms), which limits the range of numbers that can be represented.

* What is the biggest integer power of `2.0` that you can compute without getting an `OverflowError`? 
* What happens if you try to compute powers of `2.0` with large negative exponents, e.g., `2.0**(-10000)`?
* What happens if you try to divide very large numbers, e.g., `10**10000 / 10**9999` or `10**10000 / 10**1000`?

---

### Exercise 1.6

Let $T$ be a triangle with sides of lengths $a=3$, $b=4$, and $c=5$. The area of $T$ can be calculated using Heron's formula:

$$
\text{area}(T) = \sqrt{p(p-a)(p-b)(p-c)},
$$

where $p$ is the semiperimeter of $T$:

$$
p = \frac{a + b + c}{2}.
$$

Calculate the area of $T$ using Python. Define variables `a`, `b`, and `c` with the given values, then define `p` and finally calculate the area of $T$ (answer: $6$).

Do the same for triangles with sides:
* $a=5$, $b=12$, $c=13$, (answer: $30$),
* $a=1$, $b=1$, $c=1$, (answer: $\sqrt{3}/4$).
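
The square root in Heron's formula can be computed in more than one way. The sketch below shows two options on a value unrelated to the exercise; `math` is a module from Python's standard library, and the variable names are illustrative.

```python
import math

value = 16

root_a = value ** 0.5        # raising to the power 1/2
root_b = math.sqrt(value)    # square-root function from the math module

print(root_a)   # 4.0
print(root_b)   # 4.0
```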

---

## Booleans

Python has a built-in type for Boolean values: `bool`. There are two Boolean values: `True` and `False`.

Boolean values are used to represent the truth values of logical expressions. Python supports the following logical operators:

* `and`
* `or`
* `not`

For Boolean operands, the `and` operator returns `True` if both operands are `True`, otherwise it returns `False`. The `or` operator returns `True` if at least one of the operands is `True`, otherwise it returns `False`. The `not` operator returns `True` if the operand is `False`, and vice versa.
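
The rules for `and`, `or`, and `not` can be checked directly on the two Boolean values. A minimal sketch with illustrative variable names:

```python
# and: True only when both operands are True
and_tt = True and True      # True
and_tf = True and False     # False
and_ff = False and False    # False

# or: False only when both operands are False
or_tf = True or False       # True
or_ff = False or False      # False

# not: flips the value
not_t = not True            # False

print(and_tt, and_tf, and_ff)   # True False False
print(or_tf, or_ff)             # True False
print(not_t)                    # False
```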

Python supports the following comparison operators:

* `==` (equal)
* `!=` (not equal)
* `>` (greater than)
* `<` (less than)
* `>=` (greater than or equal)
* `<=` (less than or equal)

Comparison operators return Boolean values.
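
Because a comparison produces an ordinary Boolean value, its result can be stored in a variable like any other. A small sketch, with illustrative names and values:

```python
x = 7

is_small = x < 10            # the result of a comparison is a value
print(is_small)              # True
print(type(is_small))        # <class 'bool'>

# = assigns, == compares.
print(x == 7, x != 7)        # True False

# Integers and floats are compared by numeric value.
same_value = 1 == 1.0
print(same_value)            # True

# Strings are compared character by character.
comes_first = "apple" < "banana"
print(comes_first)           # True
```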

Comparison and logical operators can be combined to form more complex logical expressions:

```python
>>> x = 10
>>> y = 20
>>> x > 5 and y < 30
True
>>> x < 5 or y > 30
False
>>> not (x > 5)
False
```

### Exercise 1.7

Determine the truth value of the following logical expressions:

* `(2 + 2 == 4) and (3 * 4 == 12)`
* `(2 + 2 > 4) or (3 * 4 < 12)`
* `not (2 + 2 == 4)`
* `(2 + 2 == 4) and not (3 * 4 == 12)`
* `(2 + 2 == 4) and (3 * 4 == 12) and (5 ** 2 == 25)`
* `(2 + 2 == 4) and (3 * 4 == 12) or (5 ** 2 == 24)`
* `(2 + 2 == 4) and (3 * 4 == 12) or (5 ** 2 == 25)`

Try to predict the result before running the code.

---

### Exercise 1.8

Given two Boolean variables `p` and `q`:
* write the expression for the XOR operation (exclusive OR) using only `p`, `q`, and the logical operators `and`, `or`, and `not`. The XOR operation returns `True` if exactly one of the operands is `True`, otherwise it returns `False`.
* write the expression for the implication operation using only `p`, `q`, and the logical operators `and`, `or`, and `not`. The implication operation returns `False` if the first operand is `True` and the second operand is `False`, otherwise it returns `True`.

---

The conditional expression `a if condition else b`, often called the ternary operator, evaluates `condition` and returns `a` if it is true and `b` otherwise:

```python
>>> x = 10
>>> "positive" if x > 0 else "non-positive"
'positive'
>>> x = -10
>>> "positive" if x > 0 else "non-positive"
'non-positive'
```

This is similar to the ternary operator in C, Java, and other languages.
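
A few more uses of the conditional expression, as a sketch with illustrative names. The last example relies on the fact that only the selected branch is evaluated.

```python
a = 3
b = 8

# Pick the larger of two values.
larger = a if a > b else b
print(larger)          # 8

# Absolute value written by hand.
x = -2.5
magnitude = x if x >= 0 else -x
print(magnitude)       # 2.5

# Only the selected branch is evaluated, so no division by zero occurs here.
d = 0
ratio = 1 / d if d != 0 else 0.0
print(ratio)           # 0.0
```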

### Exercise 1.9

Write a Python expression that returns the string `"even"` if a given integer `x` is even, and `"odd"` otherwise.

---

## Strings

Strings are sequences of characters. In Python, string literals are written in single quotes (`'`) or double quotes (`"`). Strings can be concatenated using the `+` operator:

```python
>>> s1 = "Hello"
>>> s2 = "World"
>>> s1 + ", " + s2 + "!"
'Hello, World!'
```

Multiplying a string by an integer repeats the string that many times:

```python
>>> s = "Hello"
>>> s * 3
'HelloHelloHello'
```

If you want to write a string that spans multiple lines, you can use triple quotes (`'''` or `"""`):

```python
>>> s = """This is a string
... that spans multiple lines."""
>>> s
'This is a string\nthat spans multiple lines.'
```

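Like other sequences in Python, strings can be indexed and sliced. Positive indices count from the left, starting at `0`; negative indices count from the right, starting at `-1`.

For example, the string `"Python"` has length `6`, so the indices are `0`, `1`, `2`, `3`, `4`, and `5` for positive indexing, and `-1`, `-2`, `-3`, `-4`, `-5`, and `-6` for negative indexing. In the diagram, the numbers mark positions between characters, which is convenient for slicing; the character with index `i` lies just to the right of mark `i`: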

```
 +---+---+---+---+---+---+
 | P | y | t | h | o | n |
 +---+---+---+---+---+---+
 0   1   2   3   4   5   6
-6  -5  -4  -3  -2  -1
```

Square brackets are used to access individual characters or slices of a string:

```python
>>> s = "Python"
>>> s[0]
'P'
>>> s[-1]
'n'
>>> s[1:4]
'yth'
>>> s[:4]
'Pyth'
>>> s[4:]
'on'
```
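
The relationship between positive and negative indices can be checked with the built-in `len()` function, which returns the number of characters. A short sketch; the variable name `n` is illustrative.

```python
s = "Python"
n = len(s)                 # number of characters
print(n)                   # 6

# A negative index -k refers to the same character as index len(s) - k.
print(s[-1], s[n - 1])     # n n
print(s[-6], s[0])         # P P
print(s[-2] == s[n - 2])   # True

# A slice of a string is again a string, even if it has one character.
one_char = s[2:3]
print(one_char)            # t
print(type(one_char))      # <class 'str'>
```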

Slices are specified by two indices separated by a colon. The first index is the start of the slice, and the second index is the end of the slice. The slice includes the start index but excludes the end index. If the start index is omitted, it defaults to `0`. If the end index is omitted, it defaults to the length of the sequence. You can also specify a step as an optional third value; with a negative step the slice runs from right to left, and omitted indices default to the right and left ends of the sequence, respectively:

```python
>>> s = "Python"
>>> s[::2]
'Pto'
>>> s[::-1]
'nohtyP'
```
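
Start, end, and step can be combined freely. The sketch below shows a few combinations on the same string, with illustrative variable names.

```python
s = "Python"

every_other = s[1:5:2]     # indices 1 and 3
tail = s[-3:]              # last three characters
backwards = s[4:1:-1]      # indices 4, 3, 2 (index 1 is excluded)

print(every_other)         # yh
print(tail)                # hon
print(backwards)           # oht

# Splitting at any position and gluing the parts back gives the original.
rejoined = s[:2] + s[2:]
print(rejoined == s)       # True
```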

### Exercise 1.10

Given a string `s`, write a Python expression that returns the string consisting of the first and last characters of `s`. For example, if `s = "Python"`, the expression should return `"Pn"`.

Assume that the string `s` has at least two characters. Then check your expression with the following strings:

* `"Python"`
* `"Hello, World!"`
* `"1234567890"`

What happens if you try to use your expression with a string that has only one character or is empty?

---

## Lists

Lists are ordered collections of objects. Lists can contain objects of different types, including other lists. Lists are created using square brackets `[]`:

```python
>>> lst = [1, 2, 3, 4, 5]
>>> lst
[1, 2, 3, 4, 5]
>>> lst = [1, 2, "Hello", 3.14, [1, 2, 3]]
>>> lst
[1, 2, 'Hello', 3.14, [1, 2, 3]]
```

Lists can be indexed and sliced in the same way as strings:

```python
>>> lst = [1, 2, 3, 4, 5]
>>> lst[0]
1
>>> lst[-1]
5
>>> lst[1:4]
[2, 3, 4]
>>> lst[:4]
[1, 2, 3, 4]
>>> lst[4:]
[5]
>>> lst[::2]
[1, 3, 5]
>>> lst[::-1]
[5, 4, 3, 2, 1]
```
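
Other operations known from strings carry over to lists as well. A sketch with an illustrative list: `len()` counts elements, indexing can be chained to reach into nested lists, and `+` and `*` concatenate and repeat.

```python
lst = [1, 2, "Hello", 3.14, [10, 20, 30]]

print(len(lst))        # 5: the inner list counts as one element
inner = lst[4]         # the element at index 4 is itself a list
print(inner)           # [10, 20, 30]
print(lst[4][1])       # 20: index the outer list, then the inner one
print(lst[2][0])       # H: the same works for a string inside a list

# As with strings, + concatenates and * repeats.
joined = [1, 2] + [3]
zeros = [0] * 4
print(joined)          # [1, 2, 3]
print(zeros)           # [0, 0, 0, 0]
```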


### Exercise 1.11

Given a list `lst`, write a Python expression that returns the list consisting of the first and last elements of `lst`. For example, if `lst = [1, 2, 3, 4, 5]`, the expression should return `[1, 5]`.

Assume that the list `lst` has at least two elements. Then check your expression with the following lists:

* `[1, 2, 3, 4, 5]`
* `["Hello", "World"]`
* `[[1, 2], [3, 4], [5, 6]]`

What happens if you try to use your expression with a list that has only one element or is empty?

---

Lists can be modified by assigning new values to elements or slices:

```python
>>> lst = [1, 2, 3, 4, 5]
>>> lst[0] = 10
>>> lst
[10, 2, 3, 4, 5]
>>> lst[1:4] = [20, 30, 40]
>>> lst
[10, 20, 30, 40, 5]
```

Assigning to an element is called item assignment, and assigning to a slice is called slice assignment. Under these operations, the list is **modified in place**, i.e., the original list is changed, but the identity of the list remains the same (it is the same object). Data types that support item assignment are called **mutable**.
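
A sketch of what "the same object" means in practice. It uses the built-in `id()` function and the `is` operator, neither of which has been introduced above, to observe object identity; the variable names are illustrative.

```python
lst = [1, 2, 3]
alias = lst                  # a second name for the same list object
identity_before = id(lst)

lst[0] = 10                  # item assignment changes the object in place
print(alias)                 # [10, 2, 3]: the change is visible through both names
print(id(lst) == identity_before)   # True: still the same object
print(alias is lst)          # True

# Slice assignment can also change the length of the list.
lst[1:3] = [20]
print(lst)                   # [10, 20]

# Building a new list with + creates a different object instead.
bigger = lst + [30]
print(bigger is lst)         # False
print(lst)                   # [10, 20]: the original is untouched
```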

### Exercise 1.12

Strings are **immutable**, i.e., they cannot be modified in place. Try to assign a new value to an element of a string. What happens?

---

## `range()` function

The `range()` function generates a sequence of numbers. It can be called with one, two, or three arguments:

* `range(stop)`: generates a sequence of numbers from `0` to `stop - 1`
* `range(start, stop)`: generates a sequence of numbers from `start` to `stop - 1`
* `range(start, stop, step)`: generates a sequence of numbers starting at `start` and changing by `step` each time, stopping before `stop` is reached (`step` may be negative)

The `range()` function returns an object of the `range` type:

```python
>>> r = range(5)
>>> r
range(0, 5)
```

The numbers in the range are not stored in memory, but generated on the fly when needed. You can convert a `range` object to a list using the `list()` function:

```python
>>> list(range(5))
[0, 1, 2, 3, 4]
>>> list(range(1, 5))
[1, 2, 3, 4]
>>> list(range(1, 10, 2))
[1, 3, 5, 7, 9]
>>> list(range(10, 1, -2))
[10, 8, 6, 4, 2]
```
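
Because the numbers are computed on demand, even a very long range is cheap to create and query. A sketch with an illustrative size of one billion; indexing, slicing, `len()`, and the membership test `in` all work without building a list.

```python
r = range(10**9)         # a billion numbers, created instantly

print(len(r))            # 1000000000
print(r[0], r[-1])       # 0 999999999
print(r[10:20])          # range(10, 20): slicing a range gives another range
print(999 in r)          # True
print(10**9 in r)        # False: stop is excluded

# With a positive step, a range whose start is not below stop is empty.
empty = list(range(5, 1))
print(empty)             # []
```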

### Exercise 1.13

Use the `range()` function to generate the following sequences of numbers:

* $0, 1, 2, 3, 4, 5$
* $1, 2, 3, 4, 5$
* all even numbers from $0$ to $10$
* all odd numbers from $1$ to $9$
* all numbers from $10$ to $1$ in descending order
* all numbers divisible by $3$ from $0$ to $30$

---

## List comprehensions

List comprehensions are a concise way to create lists. They consist of an expression followed by a `for` clause, then zero or more `for` or `if` clauses. The result will be a new list resulting from evaluating the expression in the context of the `for` and `if` clauses that follow it.

For example, the following list comprehension creates a list of squares of numbers from `0` to `9`:

```python
>>> squares = [x ** 2 for x in range(10)]
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

List comprehensions can be used to filter elements of a list:

```python
>>> even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
>>> even_squares
[0, 4, 16, 36, 64]
```
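
The description above also allows more than one `for` clause and any expression in front. A sketch with illustrative data, not taken from the exercises:

```python
# Two for clauses: the second one runs fully for every value of the first.
pairs = [a + b for a in "ab" for b in "xyz"]
print(pairs)      # ['ax', 'ay', 'az', 'bx', 'by', 'bz']

# The leading expression can be any expression, including a conditional one.
labels = ["big" if x > 2 else "small" for x in range(5)]
print(labels)     # ['small', 'small', 'small', 'big', 'big']

# Comprehensions work over any sequence, not only range().
lengths = [len(word) for word in ["data", "science", "lab"]]
print(lengths)    # [4, 7, 3]
```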

### Exercise 1.14

Use a list comprehension to generate the following lists:

* all numbers from $0$ to $100$ which are divisible by $3$ or $5$
* all squares not greater than $10000$

---

### Exercise 1.15

Use a list comprehension to generate the values from the multiplication table:

```
1    2    3    4    5    6    7    8    9   10
2    4    6    8   10   12   14   16   18   20
3    6    9   12   15   18   21   24   27   30
4    8   12   16   20   24   28   32   36   40
5   10   15   20   25   30   35   40   45   50
6   12   18   24   30   36   42   48   54   60
7   14   21   28   35   42   49   56   63   70
8   16   24   32   40   48   56   64   72   80
9   18   27   36   45   54   63   72   81   90
10  20   30   40   50   60   70   80   90  100
```

Do this in two ways: 
* using a list comprehension with a nested `for` loop. This should result in a list of the form 
    ```
    [1, 2, 3, ..., 10, 2, 4, 6, ..., 20, 3, 6, 9, ..., 30, ..., 100]
    ```
* using a list comprehension of list comprehensions. This should result in a list of the form 
    ```
    [[1, 2, 3, ..., 10], [2, 4, 6, ..., 20], [3, 6, 9, ..., 30], ..., [10, 20, 30, ..., 100]]
    ```

---