# Lab 2-1: Classes and objects in Python.

---

## Quick introduction to the Python random module

The Python standard library includes a module called `random` that provides functions for generating pseudorandom numbers. To complete the exercises in this notebook, you may want to learn the following functions:

In [1]:
import random

random.random() # returns a random float in the half-open interval [0.0, 1.0)
print('The output of random.random() is:', random.random())

random.randint(0, 100) # returns a random integer between 0 and 100 (both endpoints included)
print('The output of random.randint(0, 100) is:', random.randint(0, 100))

The output of random.random() is: 0.19940122720460107
The output of random.randint(0, 100) is: 74


In [2]:
movies = [
    'Taxi Driver',
    'Cars 3',
    'How High',
    'The Seventh Seal',
    'Mean Girls',
]

random_movie = random.choices(movies, k=1) # returns a list of k elements drawn at random from the list (with replacement)
print('Tonight we will watch', random_movie)

Tonight we will watch ['Mean Girls']


In [3]:
movie_preferences = {
    'Taxi Driver': 0.1,
    'Cars 3': 0.4,
    'How High': 0.2,
    'The Seventh Seal': 0,
    'Mean Girls': 0.3 
}

movies = list(movie_preferences.keys())
probabilities = list(movie_preferences.values())

random.choices(movies, probabilities, k=1) # returns k random elements from the list based on the probabilities given in a separate list
print('Eww, I would rather watch', random.choices(movies, probabilities))

Eww, I would rather watch ['Cars 3']
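
To see how the weights passed to `random.choices` shape the outcome, one can draw many samples and compare the observed frequencies with the weights. The sketch below is only an illustration: the seed, the sample size `n_draws` and the use of `collections.Counter` are assumptions made here, not part of the lab.

```python
import random
from collections import Counter

movie_preferences = {
    'Taxi Driver': 0.1,
    'Cars 3': 0.4,
    'How High': 0.2,
    'The Seventh Seal': 0,
    'Mean Girls': 0.3,
}
movies = list(movie_preferences.keys())
probabilities = list(movie_preferences.values())

rng = random.Random(42)  # seeded generator, used only to make the run repeatable
n_draws = 10_000         # arbitrary sample size chosen for this sketch

# Draw n_draws movies at once and count how often each one appears
draws = rng.choices(movies, weights=probabilities, k=n_draws)
counts = Counter(draws)

for movie in movies:
    frequency = counts[movie] / n_draws
    print(f'{movie}: observed {frequency:.3f}, weight {movie_preferences[movie]}')
```

The observed frequencies should land close to the weights, and a movie with weight 0 is never drawn.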


## Exercise 1: Fair dice (1 point)

1. Implement a class called `Die` that represents a fair six-sided die. The class should have a method called `roll_n` that simulates rolling $n$ identical dice and returns a list of the results.

In [4]:
class Die:
    def __init__(self):
        ...
    def roll_n(self, n):
        ...

## Exercise 2: Unfair dice (1 point)

1. Implement a class called `UnfairDie` that represents a **weighted** six-sided die. The class will be very similar to the `Die` class from the previous exercise, so you can use it as a starting point.

     The class should have a method called `roll_n` that simulates rolling the unfair die $n$ times and returns a list of results. The probabilities of rolling each face should be set by the user when creating a die object by passing a parameter `probs`, a list of six non-negative floats summing to one.
2. What is the probability that the sum is more than 15 and less than 25 when rolling 5 identical unfair dice?  
The probabilities of rolling the faces 1-6 are given by the list [0.1, 0.1, 0.1, 0.25, 0.15, 0.3]. Conduct a simulation to estimate the probability.
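
"Conducting a simulation" here means repeating a random experiment many times and reporting the fraction of runs in which the event of interest occurred. The sketch below shows that general pattern on a different, simpler event (two fair dice summing to 7), whose exact probability of 1/6 is known. The helper names `estimate_probability` and `two_dice_sum_to_seven`, the seed and the number of trials are all hypothetical choices made for this illustration.

```python
import random

def estimate_probability(event, n_trials, rng):
    """Return the fraction of n_trials in which event(rng) was True."""
    hits = sum(event(rng) for _ in range(n_trials))
    return hits / n_trials

def two_dice_sum_to_seven(rng):
    """One trial: roll two fair dice and check whether they sum to 7."""
    return rng.randint(1, 6) + rng.randint(1, 6) == 7

rng = random.Random(0)  # seeded only so the run is repeatable
estimate = estimate_probability(two_dice_sum_to_seven, 100_000, rng)
print(f'Estimated: {estimate:.4f}, exact: {1 / 6:.4f}')
```

The more trials are run, the closer the estimate tends to get to the exact value.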

In [5]:
class UnfairDie:
    ...

## Exercise 3: Entropy (1 point)

Entropy is a measure of **uncertainty** or **randomness** in a random variable. 
You may have already heard about entropy in the context of thermodynamics; if so, you may notice how the two concepts are related!
The entropy of a discrete random variable $X$ with probability distribution $p(x)$ is defined as:
$$H(X) = -\sum_{x} p(x) \log_2 p(x)$$
Entropy is a **weighted average of the information content** of each possible value of $X$, where the weight is the **probability** of that value.
The formula can also be written in a way which conveys the intuition behind it more clearly:
$$H(X) = \sum_{x} p(x) \log_2 \frac{1}{p(x)}$$

### Let's break it down:

The term $\log_2 \frac{1}{p(x)}$ is the **information content** or **surprisal** of the value $x$. For **high probability events** (such as the sun rising tomorrow) the surprisal is **low** (we get no new information about the system by observing the exact same sunrise for the 100th time), with $\log_2 \frac{1}{p(x)} = 0$ when $p(x) = 1$. For **low probability events**, such as your homework assignment being published in Nature, the surprisal is **high**, and $\log_2 \frac{1}{p(x)}$ goes to infinity as $p(x)\to 0$. The graph below illustrates how the information content of an event $x$ changes for different values of $p(x)$.

<center>
<img src="imgs/surprisal.png" width=400>
</center>
<br/><br/>
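
The same relationship can be checked numerically. The following is a minimal sketch, assuming a helper named `surprisal` (a name introduced here for illustration) that is only defined for $0 < p \le 1$:

```python
from math import log2

def surprisal(p):
    """Information content, in bits, of an event with probability p (0 < p <= 1)."""
    return log2(1 / p)

# Surprisal grows as the event becomes less likely
for p in (1.0, 0.5, 0.25, 0.01):
    print(f'p = {p}: {surprisal(p):.2f} bits')
```

A certain event ($p = 1$) carries 0 bits, an event with $p = 0.5$ carries 1 bit, and halving the probability again adds one more bit.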

Entropy is the **average surprisal** of the random variable: the surprisal of each value $x$ is weighted by the probability of that value, which is why it is multiplied by $p(x)$ in the formula. 

If you do not immediately see how this relates to the concept of a weighted average, note that $0 \le p(x) \le 1$ and $\sum_{x} p(x) = 1$, so the probabilities act as weights that sum to one. 

Hopefully, you can see that the formula for entropy satisfies some of our expectations when it comes to measuring surprisal of a random event.
  
Let's take a look at an example. The entropy of a fair coin, where $p(\text{heads})=0.5$ and $p(\text{tails})=0.5$, is 1 bit:
$$H(\text{coin}) = p(\text{heads}) \log_2\frac{1}{p(\text{heads})} + p(\text{tails}) \log_2\frac{1}{p(\text{tails})}$$
$$= 0.5 \cdot \log_2\frac{1}{0.5} + 0.5 \cdot \log_2\frac{1}{0.5} = 1$$
while a coin that always lands heads has an entropy of 0 bits:
$$H(\text{coin}) = 1 \cdot \log_2\frac{1}{1} = 0$$
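
As one more illustration, consider a biased coin with $p(\text{heads})=0.25$ and $p(\text{tails})=0.75$ (values chosen here purely as an example). Its entropy falls between the two cases above:
$$H(\text{coin}) = 0.25 \cdot \log_2\frac{1}{0.25} + 0.75 \cdot \log_2\frac{1}{0.75} \approx 0.25 \cdot 2 + 0.75 \cdot 0.415 \approx 0.811$$
The coin is less predictable than one that always lands heads, but more predictable than a fair coin, so its entropy lies between 0 and 1 bit.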
<br/>
***

1. Add a method called `entropy`, which calculates and returns the entropy of a die, to the `Die` class from the previous exercise. How does the entropy of a fair die compare to the entropy of an unfair die with the probabilities $[0.1, 0.1, 0.1, 0.25, 0.15, 0.3]$? Be sure to return a valid entropy value even if some of the probabilities are zero.

In [None]:
from math import log2

...

## Exercise 3: Plotting averages of dice rolls (2 points)

1. Create a function called `get_average`. The function should take a Die (or UnfairDie) object and a number of dice to roll $n$ as arguments. The function should return the mean result of rolling $n$ dice.

2. Create a function called `get_n_averages`. The function should take a Die (or UnfairDie) object, a number of dice to roll $n$, and the number of times to repeat the experiment $k$. The function should roll $n$ dice $k$ times and return a list of $k$ mean results.

3. Plot a histogram of the results of rolling $n = 10$ fair dice $k = 50$, $1000$ and $10\,000$ times. The function `plot_hist`, which plots a histogram from a list of numbers, is already implemented and can be used out of the box. **You will learn how to prepare plots like this with seaborn and pyplot during the next labs**. What shape does this distribution converge to as the number of trials increases? Conduct the same experiment for unfair dice with probabilities $[0.1, 0.1, 0.1, 0.25, 0.15, 0.3]$ and try to draw some conclusions.

In [3]:
def get_average(die, n):
    ...

In [4]:
def get_n_averages(die, n, k):
    ...

In [None]:
from src.helpers import plot_hist
import warnings # This library is used to ignore warnings, don't worry about it for now
warnings.filterwarnings('ignore')

results = get_n_averages(...)
plot_hist(results)

## Exercise 4: Pachinko (2 points)

Pachinko is a Japanese gambling game played on a vertical board. The board has pegs protruding from the surface and the player has to drop a ball from the top. The ball bounces off the pegs and can land in one of specially designated pockets. The pockets have different values and the prize is determined by the pocket in which the ball lands.

<center>
    <p float="left">
        <img src="imgs/pachinko1.jpg" width=350>
        <img src="imgs/pachinko.png" width=500>
    </p>
</center>

The figure above shows an actual pachinko machine (left) and a simplified version of a pachinko board (right). Assume the ball has an equal chance of bouncing either left or right off each peg. 

One can simulate the results of such a game in many ways. One example is by assigning a value of 0 to each left bounce and 1 to each right bounce. As the ball falls through $k$ rows, its final position (bin index) is the sum of these values over all rows.
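
The path of a single ball under this scheme can be sketched as follows. This is only one possible approach; the function name `drop_one_ball`, its parameters and the seed are assumptions made for this illustration.

```python
import random

def drop_one_ball(n_rows, rng):
    """Return the bin index of one ball: the number of right bounces over n_rows rows."""
    bounces = [rng.randint(0, 1) for _ in range(n_rows)]  # 0 = left, 1 = right
    return sum(bounces)

rng = random.Random(1)  # seeded only so the run is repeatable
positions = [drop_one_ball(5, rng) for _ in range(10)]
print(positions)
```

Since the sum of `n_rows` zeros and ones lies between 0 and `n_rows`, a board with `n_rows` rows has `n_rows + 1` possible bins.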

1. Create a class called `Pachinko` that represents a simplified pachinko board of $k$ rows. The class should have a method called `drop_balls` that simulates dropping $n$ balls through the board and returns the list of $n$ integers, corresponding to the final position (bin index) of each ball.
2. Create a function called `plot_pachinko`. It should take a Pachinko object and the number of balls $n$ as arguments. The function should simulate dropping $n$ balls through the board and return a histogram of the distribution of balls in all bins (you can use the `plot_hist` function from the previous exercise to return the histogram).
3. Plot the histogram of the results of dropping 1000 balls through a pachinko board with $k=5, 10, 20$ rows. What shape does this distribution converge to as the number of rows increases? **Extra**: What exactly does this experiment have in common with means of multiple dice rolls?
4. You encounter a 10-row pachinko machine. One game (equivalent of dropping one ball) costs you 10 yen. If the ball lands in either the first or last bin, you win 2500 yen. You can play as many times as you wish. Will this machine make the casino go bankrupt? Conduct a simulation to verify your prediction.

In [60]:
class Pachinko:
    def __init__(self):
        ...
    
    def drop_balls(self, n):
        ...

In [61]:
from src.helpers import plot_hist

def plot_pachinko(pachinko, n):
    ...
    return plot_hist(...)

## *Python dunder methods

We have already encountered some special methods in Python, such as `__init__(self, args)`, which is called once when an object is created. These methods are called dunder methods (short for *double underscore*). They define how objects of a class behave when they are used with built-in Python functions and operators. 

For example, the `__str__(self)` method is called when an object is passed to the `print` function. If you want to define how your object should be represented as a string, you can implement this method in your class. Here is an example:

In [26]:
# Example of using __str__()

class Cow:
    def __init__(self, name):
        self.name = name # Setting the name of the cow

    def __str__(self):
        cow_art = (
        rf"""
        This is a cow named {self.name}:
        ^__^
        (oo)\_______
        (__)\       )\/\
            ||----w |
            ||     ||
        """
        )
        return cow_art.strip() # The strip() method is for aesthetic purposes
    
cow = Cow('Angelica')

# The print function calls the __str__() method!
print(cow)

This is a cow named Angelica:
        ^__^
        (oo)\_______
        (__)\       )\/\
            ||----w |
            ||     ||


Dunder methods are a powerful tool in Python that allows you to use your custom classes in any code that was prepared to work with built-in Python objects. You just have to implement the right dunder methods in your class.

Other useful dunder methods:
- `__call__(self)`: makes an object callable. It is executed when the object is called as a function.
- `__add__(self, other)`: called by the `+` operator. It should return the sum of two objects (*whatever that means for your particular class*).
- `__mul__(self, other)`: called by the `*` operator. It should return the product of two objects.
- `__len__(self)`: called by the `len` function. It should return the length of the object.
- `__eq__(self, other)`: called by the `==` operator. It should return `True` if two objects are equal.
- `__getitem__(self, key)`: called to get an item from the object using square brackets. It should return the item at the given key (*useful if your class is data-related*).
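
A few of the methods listed above can be seen working together in a small sketch. The `Playlist` class below is a hypothetical example invented for this illustration; what "sum" and "equality" mean for it is a design choice, not a rule.

```python
class Playlist:
    def __init__(self, titles):
        self.titles = list(titles)  # Store a copy of the titles

    def __len__(self):
        # Called by len(playlist)
        return len(self.titles)

    def __getitem__(self, key):
        # Called by playlist[key]
        return self.titles[key]

    def __eq__(self, other):
        # Called by playlist == other
        if not isinstance(other, Playlist):
            return NotImplemented
        return self.titles == other.titles

    def __add__(self, other):
        # Called by playlist + other; here the "sum" is a concatenation
        if not isinstance(other, Playlist):
            return NotImplemented
        return Playlist(self.titles + other.titles)


a = Playlist(['Taxi Driver', 'Cars 3'])
b = Playlist(['Mean Girls'])
combined = a + b

print(len(combined))  # 3
print(combined[0])    # Taxi Driver
print(combined == Playlist(['Taxi Driver', 'Cars 3', 'Mean Girls']))  # True
```

Returning `NotImplemented` for unsupported operand types lets Python fall back to its default behavior instead of failing inside the method.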

There are many other dunder methods that you can implement in your classes. You can find a list of them [here](https://docs.python.org/3/reference/datamodel.html#special-method-names).