# Introduction to Python for Data Science
### Tomasz Rodak
## Lab X

2024/2025, winter semester

---

## Literature


* [The Python Tutorial](https://docs.python.org/3/tutorial/index.html)
* [Dive Into Python 3](https://diveintopython3.net/index.html)
* [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
* [Python 3 documentation](https://docs.python.org/3/index.html)



## OOP - special methods

### Special methods

Special methods, also known as **magic methods** or **dunder methods**, are a set of predefined methods in Python that allow you to customize the behavior of your classes in various ways. They are recognized by their double underscores (`__`) at the beginning and end of their names.

These methods are invoked by the Python interpreter in specific situations, such as when you use operators like `+`, `-`, `*`, or when you call built-in functions like `len()`, `str()`, or `repr()`.

From the Python [documentation](https://docs.python.org/3/reference/datamodel.html#special-method-names):
> A class can implement certain operations that are invoked by special syntax (such as arithmetic operations or subscripting and slicing) by defining methods with special names. This is Python’s approach to **operator overloading**, allowing classes to define their own behavior with respect to language operators. For instance, if a class defines a method named `__getitem__()`, and `x` is an instance of this class, then `x[i]` is roughly equivalent to `type(x).__getitem__(x, i)`. Except where mentioned, attempts to execute an operation raise an exception when no appropriate method is defined (typically `AttributeError` or `TypeError`).

We have already met the `__init__()` method, which is called to initialize a newly created object. Here are some other special methods:

* `__str__(self)` - called by the `str()` built-in function and by the `print()` function to compute the "informal" or nicely printable string representation of an object.
* `__repr__(self)` - called by the `repr()` built-in function to compute the "official" string representation of an object.
* `__len__(self)` - called by the `len()` built-in function to compute the length of a sequence.
* `__add__(self, other)` - called by the `+` operator to compute the sum of two objects.
* `__call__(self, *args, **kwargs)` - called when an instance of the class is called as a function.
* and many more... 

See also Appendix B of [Dive Into Python 3](https://diveintopython3.net/special-method-names.html).
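
As a rough illustration of how these hooks are triggered, here is a minimal sketch built around a hypothetical `Playlist` class (the class and its `titles` attribute are invented for this example). It also checks the equivalence mentioned in the quoted documentation, here for `len(c)` and `type(c).__len__(c)`, and shows the `TypeError` raised when no matching method is defined.

```python
class Playlist:
    """A hypothetical container of song titles, used only for illustration."""

    def __init__(self, titles):
        self.titles = list(titles)

    def __len__(self):
        # Invoked by len(playlist).
        return len(self.titles)

    def __add__(self, other):
        # Invoked by playlist + other; returns a new object.
        if not isinstance(other, Playlist):
            return NotImplemented
        return Playlist(self.titles + other.titles)

    def __str__(self):
        # Invoked by str(playlist) and print(playlist).
        return ', '.join(self.titles)

    def __repr__(self):
        # Invoked by repr(playlist) and by the interactive prompt.
        return f'Playlist({self.titles!r})'


a = Playlist(['Intro', 'Theme'])
b = Playlist(['Outro'])
c = a + b                              # calls Playlist.__add__(a, b)

print(len(c))                          # 3
print(len(c) == type(c).__len__(c))    # True
print(c)                               # Intro, Theme, Outro
print(repr(c))                         # Playlist(['Intro', 'Theme', 'Outro'])

try:
    c - a                              # Playlist defines no __sub__
except TypeError as exc:
    print('TypeError:', exc)
```

Returning `NotImplemented` from `__add__()` for an unsupported operand lets Python try the other operand's method before it raises `TypeError`.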

### `datetime` example

Let's see how the `datetime` module uses the `__str__()` and `__repr__()` methods.

The two main classes in the `datetime` module are `datetime` and `timedelta`. A `datetime` object represents a date and time, while a `timedelta` object represents a duration, that is, the difference between two dates and times. The following code snippet shows how to create a `datetime` representing the current date and time:

```python
>>> from datetime import datetime
>>> today = datetime.today()
>>> today
datetime.datetime(2024, 12, 8, 21, 38, 45, 324204)
```

Notice that the default representation of an instance of a custom class is different and not very informative:

```python
>>> class MyClass:
...     pass
...
>>> obj = MyClass()
>>> obj
<__main__.MyClass object at 0x7f8b3c3b3b50>
```

The `datetime` class defines both special methods responsible for the string representation of an object. As described earlier, `__str__()` is called by `str()` and `print()`, while `__repr__()` is called by `repr()`. The interactive prompt also uses `__repr__()` when it displays a value, which is why `today` was shown as `datetime.datetime(...)` above. For `datetime`, `__str__()` returns a nicely formatted string, and `__repr__()` returns a string that can be used to recreate the object.

```python
>>> today.__str__()
'2024-12-08 21:38:45.324204'
>>> today.__repr__()
'datetime.datetime(2024, 12, 8, 21, 38, 45, 324204)'
```
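
To check the claim that the `repr()` string can recreate the object, the sketch below evaluates it with `eval()`. The module is imported as `datetime` because the `repr()` string refers to `datetime.datetime`; the name `moment` and the second date are made up for the example. The same pair of methods is then shown for a `timedelta`.

```python
import datetime

moment = datetime.datetime(2024, 12, 8, 21, 38, 45, 324204)

# str() gives the readable form, repr() gives a constructor-like expression.
print(str(moment))    # 2024-12-08 21:38:45.324204
print(repr(moment))   # datetime.datetime(2024, 12, 8, 21, 38, 45, 324204)

# Evaluating the repr() string builds an equal object.
# (eval() is used here only as a demonstration.)
rebuilt = eval(repr(moment))
print(rebuilt == moment)  # True

# Subtracting two datetime objects yields a timedelta,
# which also defines both __str__() and __repr__().
delta = datetime.datetime(2024, 12, 24) - moment
print(str(delta))     # 15 days, 2:21:14.675796
print(repr(delta))    # datetime.timedelta(days=15, seconds=8474, microseconds=675796)
```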


### Exercise 10.1

Finish the implementation of the `Point` class. Objects of this class represent points in a 2D space. The class should have:
* two attributes: `x` and `y`, representing the coordinates of the point,
* a method `__init__(self, x, y)` that initializes the point with the given coordinates,
* a method `__str__(self)` that returns a string representation of the point in the form `(x, y)`,
* a method `__repr__(self)` that returns a string representation of the point in the form `Point(x, y)`.

```python
class Point:
    """A class representing a point in a 2D space.
    
    Attributes:
    x (float): the x-coordinate of the point.
    y (float): the y-coordinate of the point.

    Examples:
    >>> p = Point(3, 4)
    >>> p
    Point(3, 4)
    >>> print(p)
    (3, 4)
    """
    pass
```
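
The `Examples:` sections in the exercise docstrings are written in the format understood by the standard `doctest` module, so they can serve as automatic tests. Below is one possible way to run them, sketched on a hypothetical `midpoint()` function so that it does not give away the exercise; the same calls should work for a class such as `Point`.

```python
import doctest


def midpoint(a, b):
    """Return the number halfway between a and b (hypothetical helper).

    Examples:
    >>> midpoint(0, 10)
    5.0
    >>> midpoint(-1, 1)
    0.0
    """
    return (a + b) / 2


# Collect the examples from the docstring and run them.
finder = doctest.DocTestFinder(recurse=False)
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(midpoint, name='midpoint', module=False,
                        globs={'midpoint': midpoint}):
    runner.run(test)

results = runner.summarize(verbose=False)
print(results.failed, results.attempted)  # 0 2
```

For a whole file, running `python -m doctest -v` followed by the file name from the command line is usually the shortest route.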

---

### Exercise 10.2

Finish the implementation of the `Array` class. Objects of this class represent one-dimensional arrays of n elements. The class should have the following methods:

* `__init__(self, iterable)` - initializes the array with the given iterable,
* `__len__(self)` - returns the number of elements in the array,
* `__getitem__(self, index)` - returns the element at the given index,
* `__setitem__(self, index, value)` - sets the element at the given index to the given value,
* `__eq__(self, other)` - checks if two arrays are equal,
* `__add__(self, other)` - adds two arrays element-wise,
* `__neg__(self)` - negates the array,
* `__sub__(self, other)` - subtracts two arrays element-wise,
* `__mul__(self, other)` - multiplies two arrays element-wise,
* `__str__(self)` - returns a string representation of the array: if the array has fewer than 10 elements, all of them are listed, as in `[0, 1, 2, 3, 4]`; otherwise only the first two elements and the last one are shown, separated by a literal `...`, as in `[0, 1, ..., 199]`,
* `__repr__(self)` - returns a string representation of the array following the same rule, wrapped in `Array(...)`: `Array([0, 1, 2, 3, 4])` for fewer than 10 elements, otherwise the abbreviated form, as in `Array([0, 1, ..., 199])`.


```python
class Array:
    """A class representing an array.

    Examples:
    >>> a = Array(range(5))
    >>> a
    Array([0, 1, 2, 3, 4])
    >>> print(a)
    [0, 1, 2, 3, 4]
    >>> len(a)
    5
    >>> a[2]
    2
    >>> a[2] = 10
    >>> a
    Array([0, 1, 10, 3, 4])
    >>> b = Array(range(5))
    >>> b
    Array([0, 1, 2, 3, 4])
    >>> a == b
    False
    >>> c = Array(range(5))
    >>> b == c
    True
    >>> a + b
    Array([0, 2, 12, 6, 8])
    >>> -a
    Array([0, -1, -10, -3, -4])
    >>> a - b
    Array([0, 0, 8, 0, 0])
    >>> a * b
    Array([0, 1, 20, 9, 16])
    >>> d = Array(range(200))
    >>> d
    Array([0, 1, ..., 199])
    >>> print(d)
    [0, 1, ..., 199]
    """
    pass
```
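
The operator methods listed above follow one pattern: check the other operand, compute, and return a new object. Here is a minimal sketch of that pattern on a hypothetical `Money` class (the name and the `cents` attribute are assumptions made for this example, not part of the exercise).

```python
class Money:
    """An amount of money stored as whole cents (hypothetical example class)."""

    def __init__(self, cents):
        self.cents = cents

    def __eq__(self, other):
        # Invoked by a == b.
        if not isinstance(other, Money):
            return NotImplemented
        return self.cents == other.cents

    def __add__(self, other):
        # Invoked by a + b; builds a new object instead of modifying self.
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self.cents + other.cents)

    def __neg__(self):
        # Invoked by the unary minus, -a.
        return Money(-self.cents)

    def __sub__(self, other):
        # Invoked by a - b; reuses __add__ and __neg__.
        if not isinstance(other, Money):
            return NotImplemented
        return self + (-other)

    def __repr__(self):
        return f'Money({self.cents})'


price = Money(1250)
discount = Money(250)

print(price + discount)                   # Money(1500)
print(price - discount)                   # Money(1000)
print(-discount)                          # Money(-250)
print(price - discount == Money(1000))    # True
print(price == 1250)                      # False
```

Expressing `__sub__()` through `__add__()` and `__neg__()` keeps the arithmetic in one place, which may also be convenient for the element-wise case.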

---

### Exercise 10.3

Finish the implementation of the `Linspace` class. Objects of this class represent a sequence of equally spaced numbers over a specified interval. The class should have the following methods:

* `__init__(self, start, stop, num=50)` - initializes the sequence with the given start, stop, and number of elements,
* `__len__(self)` - returns the number of elements in the sequence,
* `__getitem__(self, index)` - returns the element at the given non-negative index,
* `__str__(self)` - returns a string representation of the sequence in the form `[x1, x2, ..., xn]`,
* `__repr__(self)` - returns a string representation of the sequence in the form `Linspace(start, stop, num)`.

A proper implementation of the `__getitem__()` method is crucial for this class.
* The method should raise an `IndexError` if the index is negative or greater than or equal to the number of elements in the sequence.
* The method should compute the sequence values on the fly, when they are needed; there is no need to store them in memory.

As a result:
* the object is **lazy** - it does not compute the sequence values until they are needed;
* the object is **iterable** - it can be used in a `for` loop or with the `list()` function.


```python
class Linspace:
    """A class representing a sequence of equally spaced numbers over a specified interval.

    Examples:
    >>> from math import isclose
    >>> l = Linspace(0, 1, 5)
    >>> len(l)
    5
    >>> l[0]
    0.0
    >>> l[4]
    1.0
    >>> l[5] # doctest: +ELLIPSIS
    Traceback (most recent call last):
    ...
    IndexError: index out of range
    >>> isclose(l[1], 0.25)
    True
    >>> print(l) 
    [0.0, 0.25, 0.5, 0.75, 1.0]
    >>> l
    Linspace(0, 1, 5)
    >>> for x in l:
    ...     print(x, end='_')
    0.0_0.25_0.5_0.75_1.0_
    >>> l = Linspace(0, 1, 10**12) 
    >>> len(l)
    1000000000000
    >>> l[0], l[10**12 - 1]
    (0.0, 1.0)
    """
    pass
```
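
Why does a class with only `__getitem__()` work in a `for` loop? When an object has no `__iter__()` method, Python falls back to calling `__getitem__()` with the indices 0, 1, 2, ... until an `IndexError` is raised. The sketch below demonstrates this on a hypothetical `Squares` class, chosen so that it does not solve the exercise; the class name and its `count` attribute are assumptions of this example.

```python
class Squares:
    """The squares of 0, 1, ..., count - 1 (hypothetical example class)."""

    def __init__(self, count):
        self.count = count

    def __len__(self):
        return self.count

    def __getitem__(self, index):
        # Nothing is stored: the value is computed from the index on request.
        if not 0 <= index < self.count:
            raise IndexError('index out of range')
        return index * index


squares = Squares(5)
print(squares[3])       # 9
print(list(squares))    # [0, 1, 4, 9, 16]

# Iteration stops when __getitem__ raises IndexError.
for value in squares:
    print(value, end=' ')
print()

# A huge object costs no more memory than a small one.
huge = Squares(10**12)
print(len(huge))        # 1000000000000
print(huge[10**6])      # 1000000000000
```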

---

### Non-public names

By convention, names prefixed with a single underscore (`_`) are considered non-public in Python: they are not meant to be used outside of the class that defines them. Python does not enforce this convention, but following it helps to avoid accidental misuse of non-public names.

Example:

```python
class LinearFunction:
    def __init__(self, a, b):
        self._a = a
        self._b = b

    def __call__(self, x):
        return self._a * x + self._b

    def __str__(self):
        sign = '+' if self._b >= 0 else '-'
        return f'{self._a}x {sign} {abs(self._b)}'
    
    def __repr__(self):
        return f'LinearFunction({self._a}, {self._b})'
```

In the `LinearFunction` class, the `_a` and `_b` attributes are non-public. They are meant to be used only within the class. The `__str__()`, `__repr__()`, and `__call__()` methods use these attributes to compute the string representation of the object and to evaluate the function at a given point. However, the user of the class should not access these attributes directly.
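
A short usage sketch follows. The class is repeated so that the snippet runs on its own; the variable name `f` and the chosen coefficients are arbitrary. The last line shows that the underscore is only a convention: the attribute is still reachable from outside.

```python
class LinearFunction:
    # Same class as above, repeated so the snippet is self-contained.
    def __init__(self, a, b):
        self._a = a
        self._b = b

    def __call__(self, x):
        return self._a * x + self._b

    def __str__(self):
        sign = '+' if self._b >= 0 else '-'
        return f'{self._a}x {sign} {abs(self._b)}'

    def __repr__(self):
        return f'LinearFunction({self._a}, {self._b})'


f = LinearFunction(2, -3)

print(f(4))       # 5, because f(4) calls f.__call__(4)
print(f)          # 2x - 3
print(repr(f))    # LinearFunction(2, -3)

# Possible, but discouraged: Python does not block access to _a.
print(f._a)       # 2
```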

### Exercise 10.4

A FIFO (first in, first out) queue is a data structure that allows adding elements to the end of the queue and removing elements from the front of the queue. Finish the implementation of the `Queue` class. The class should have the following methods:

* `__init__(self, capacity)` - initializes the queue with the given capacity,
* `enqueue(self, item)` - adds an item to the end of the queue,
* `dequeue(self)` - removes and returns the item from the front of the queue,
* `is_empty(self)` - returns `True` if the queue is empty, `False` otherwise,
* `first(self)` - returns the item at the front of the queue without removing it,
* `__len__(self)` - returns the number of items in the queue,
* `__repr__(self)` - returns a string representation of the queue in the form `Queue(capacity)`.

Use the following non-public attributes:
* `_data` - a list of constant size that stores the items in the queue,
* `_front` - an integer representing the index of the front of the queue,
* `_size` - an integer representing the number of items in the queue.

At the beginning, the queue is empty, so `_front` should be set to 0, `_size` to 0, and `_data` should be a list of `None` values of the given capacity. When adding an item to the queue, the item should be placed at the index `_front + _size` (modulo the capacity). When removing an item from the queue, the item should be taken from the index `_front`, that slot should be reset to `None`, and `_front` should advance by one (modulo the capacity). The `_size` attribute should be updated accordingly. The `enqueue()` method should raise an `IndexError` if the queue is full. The `dequeue()` and `first()` methods should raise an `IndexError` if the queue is empty.

```python
class Queue:
    """A class representing a FIFO queue.

    Examples:
    >>> q = Queue(3)
    >>> q
    Queue(3)
    >>> q._data
    [None, None, None]
    >>> q.is_empty()
    True
    >>> q.first() # doctest: +ELLIPSIS
    Traceback (most recent call last):
    ...
    IndexError: queue is empty
    >>> q.enqueue(10)
    >>> q.enqueue(20)
    >>> q._data
    [10, 20, None]
    >>> q.is_empty()
    False
    >>> q.first()
    10
    >>> q.dequeue()
    10
    >>> q._data
    [None, 20, None]
    >>> q.enqueue(30)
    >>> q._data
    [None, 20, 30]
    >>> q.first()
    20
    >>> q.enqueue(40)
    >>> q._data
    [40, 20, 30]
    >>> q.first()
    20
    >>> q.enqueue(50) # doctest: +ELLIPSIS
    Traceback (most recent call last):
    ...
    IndexError: queue is full
    >>> q.dequeue()
    20
    >>> q._data
    [40, None, 30]
    >>> q._size
    2
    >>> len(q)
    2
    >>> q._front
    2
    """
    pass
```
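
The wrap-around index arithmetic is the part that most often goes wrong, so here it is in isolation. This is only a sketch of the two formulas as they can be read from the description and the docstring; the helper names `next_free_index()` and `advance_front()` are hypothetical and are not part of the required interface.

```python
def next_free_index(front, size, capacity):
    """Index of the slot that receives the next enqueued item."""
    return (front + size) % capacity


def advance_front(front, capacity):
    """Index of the new front after one item has been dequeued."""
    return (front + 1) % capacity


capacity = 3

# State from the docstring when _data == [None, 20, 30]:
# the front is at index 1 and two items are stored.
front, size = 1, 2
print(next_free_index(front, size, capacity))   # 0, the slot where 40 is stored

# Dequeueing 20 moves the front from index 1 to index 2.
print(advance_front(front, capacity))           # 2

# From index 2 the front wraps around to index 0.
print(advance_front(2, capacity))               # 0
```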

---

### Exercise 10.5

Design a class to represent a singly linked list. Start by defining the public interface of the class, which includes the methods it should provide. Include detailed docstrings for each method to describe their purpose and usage. After defining the interface, implement the class and its methods.
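
As an illustration of the "interface first" approach, the sketch below outlines a hypothetical `Stack` class, deliberately a different data structure from the one in the exercise. Every method has a docstring, but its body only raises `NotImplementedError` until the logic is written.

```python
class Stack:
    """A LIFO stack (hypothetical example of an interface-first design)."""

    def push(self, item):
        """Add `item` to the top of the stack."""
        raise NotImplementedError

    def pop(self):
        """Remove and return the item from the top of the stack.

        Raises:
            IndexError: if the stack is empty.
        """
        raise NotImplementedError

    def __len__(self):
        """Return the number of items on the stack."""
        raise NotImplementedError


# The interface can be read and reviewed before any logic exists.
print(Stack.pop.__doc__)
```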

---