# Environment setup

- [The book's GitHub repository](https://github.com/TikhonJelvis/RL-book)

- First, move to the directory with the codebase:

   ```cd rl-book```

- Then, create and activate a Python virtual environment, using either `venv` (the first two commands) or `conda` (the last two):

   ```python3 -m venv .venv```

   ```source .venv/bin/activate```
   
   ```conda create -n {env_name}```

   ```conda activate {env_name}```

- Once the environment is activated, you can install the right versions of each Python dependency.

   ```pip install -r requirements.txt```

- Once the environment is set up, you can confirm that it works by running the framework's automated tests.

   ```python -m unittest discover```



## Classes and interfaces

- There are always two parts to answering the question of how to model something in code:

    - Understanding the domain concept that you are modeling.

    - Figuring out how to express that concept with features and patterns provided by your programming language.

- One approach would be to keep probability implicit. Whenever we have a random variable, we could call a function and get a random result.

In [1]:
from random import randint

def six_sided():
    return randint(1, 6)

def roll_dice():
    return six_sided() + six_sided()

- This works, but it's pretty limited. We can't do anything except get one outcome at a time. This captures only a slice of how we think about probability: there's randomness, but we never even mention probability distributions.
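- To illustrate the limitation, here is a minimal sketch of what it takes to recover an approximate distribution when all we have is a sampling function: we have to call it many times and count the results. The helper name `estimate_distribution`, the sample count, and the seed are hypothetical choices made for illustration.

```python
from collections import Counter
from random import randint, seed


def six_sided():
    return randint(1, 6)


def roll_dice():
    return six_sided() + six_sided()


def estimate_distribution(sampler, n=10_000):
    """Approximate outcome probabilities by calling `sampler` n times (hypothetical helper)."""
    counts = Counter(sampler() for _ in range(n))
    return {outcome: count / n for outcome, count in sorted(counts.items())}


seed(0)  # fixed seed so the run is repeatable
estimate = estimate_distribution(roll_dice)
print(estimate)
```

- The result is only an approximation: the function-based approach gives us no way to ask for exact probabilities.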

### A distribution interface

- Let's define an abstraction for probability distributions. What we can do with a distribution depends on what kind of distribution we're working with.

    - If we know something about the structure of a distribution (perhaps it's a Poisson distribution where $\lambda=5$, perhaps it's an empirical distribution with set probabilities for each outcome), we could produce an exact Probability Distribution Function (PDF) or Cumulative Distribution Function (CDF), calculate expectations, and do various operations efficiently.

    - What if the distribution comes from a complicated simulation? At the extreme, we might not be able to do anything except draw samples from the distribution.

- Sampling is the least common denominator. Any abstraction we start with for a probability distribution needs to cover sampling, and any abstraction that requires more than just sampling will not let us handle all the distributions we care about.

In [4]:
from abc import ABC, abstractmethod

class Distribution(ABC):
    @abstractmethod
    def sample(self):
        pass

- This class defines an interface: a definition of what we require for something to qualify as a distribution. Any kind of distribution we implement in the future will be able to generate samples; when we write functions that sample distributions, they can require their inputs to inherit from `Distribution`.

- We've made `Distribution` an abstract base class (ABC), with `sample` as an abstract method. Abstract classes and abstract methods are features that Python provides to help us define interfaces for abstractions. We can define the `Distribution` class to structure the rest of our probability distribution code before we define any specific distributions.
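- As a small sketch of what the abstract method buys us: Python refuses to instantiate `Distribution` itself, or any subclass that has not implemented `sample`. The `Incomplete` class below is a hypothetical example introduced only to show this.

```python
from abc import ABC, abstractmethod


class Distribution(ABC):
    @abstractmethod
    def sample(self):
        pass


class Incomplete(Distribution):
    """Hypothetical subclass that forgets to implement `sample`."""


errors = {}
for cls in (Distribution, Incomplete):
    try:
        cls()  # instantiating a class with unimplemented abstract methods fails
    except TypeError as error:
        errors[cls.__name__] = str(error)
        print(f"{cls.__name__}: {error}")
```

- Both attempts raise a `TypeError`, so a missing `sample` is caught when the object is created rather than later, when it is first sampled.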

### A concrete distribution

- An interface can be approached from two sides:

    - Something that requires the interface. This will be code that uses operations specified in the interface and works with any value that satisfies those requirements.

    - Something that provides the interface. This will be some value that supports the operations specified in the interface.
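- The two sides can be sketched as follows. `Coin` and `sample_n` are hypothetical names used only for illustration: `Coin` provides the interface, while `sample_n` requires it and relies on nothing beyond `sample`.

```python
import random
from abc import ABC, abstractmethod


class Distribution(ABC):
    @abstractmethod
    def sample(self):
        pass


# Provides the interface: a hypothetical fair coin.
class Coin(Distribution):
    def sample(self):
        return random.choice(["heads", "tails"])


# Requires the interface: uses only `sample`, so it accepts any Distribution.
def sample_n(distribution, n):
    return [distribution.sample() for _ in range(n)]


flips = sample_n(Coin(), 5)
print(flips)
```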

- To use our `Distribution` class, we can start by providing a concrete class that implements the interface. Let's model dice.

In [5]:
import random

class Die(Distribution):
    def __init__(self, sides):
        self.sides = sides
    def sample(self):
        return random.randint(1, self.sides)
    
six_sided = Die(6)
def roll_dice():
    return six_sided.sample() + six_sided.sample()

In [6]:
print(six_sided)

<__main__.Die object at 0x10501b3d0>


- The default output shows only the class name and a memory address, which is not very useful. With a class we can fix this. To change how the class is printed, we can override `__repr__`:

In [7]:
class Die(Distribution):
    def __init__(self, sides):
        self.sides = sides
    def sample(self):
        return random.randint(1, self.sides)
    def __repr__(self) -> str:
        return f"Die(sides={self.sides})"

In [8]:
print(Die(6))

Die(sides=6)


#### Dataclasses

- Two `Die` objects with the same number of sides have the same behavior and represent the same probability distribution, but with the default version of `__eq__`, two `Die` objects declared separately will never be equal:

In [9]:
six_sided = Die(6)
six_sided == six_sided

True

In [10]:
six_sided == Die(6)

False

In [11]:
Die(6) == Die(6)

False

In [12]:
class Die(Distribution):
    def __init__(self, sides):
        self.sides = sides
    def sample(self):
        return random.randint(1, self.sides)
    def __repr__(self) -> str:
        return f"Die(sides={self.sides})"
    def __eq__(self, other):
        if isinstance(other, Die):
            return self.sides == other.sides
        return False

In [13]:
Die(6) == Die(6)

True

In [14]:
Die(6) == None

False

- Python 3.7 introduced a feature that fixes all of these problems: `dataclasses`. The `dataclasses` module provides a decorator that lets us write a class that behaves like `Die` without needing to manually implement `__init__`, `__repr__`, or `__eq__`.

In [15]:
from dataclasses import dataclass

@dataclass
class Die(Distribution):
    sides: int
    def sample(self):
        return random.randint(1, self.sides)

In [16]:
Die(6) == Die(6)

True
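- A short sketch of the three generated methods in action. To keep the block self-contained, this version of `Die` leaves out the `Distribution` base class; that omission is a simplification for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass
class Die:
    sides: int

    def sample(self):
        return random.randint(1, self.sides)


d = Die(sides=6)     # generated __init__ accepts the field by position or keyword
print(d)             # generated __repr__ prints Die(sides=6)
print(d == Die(6))   # generated __eq__ compares fields: True
print(d == Die(20))  # different field values: False
```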

#### Immutability

- Changing state can create invisible connections between seemingly separate parts of the codebase, which become hard to track mentally.

- It is better to have the language prevent us from doing the wrong thing than to rely on pure convention. Normal Python classes don't have a convenient way to stop attributes from changing, but luckily dataclasses do:

    - With `frozen=True`, attempting to change `sides` will raise an exception.

In [17]:
from dataclasses import dataclass

@dataclass(frozen=True)
class Die(Distribution):
    
    sides: int
    def sample(self):
        return random.randint(1, self.sides)

In [19]:
d = Die(6)
# uncommenting the next line raises dataclasses.FrozenInstanceError
# d.sides = 10
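- To see the exception without stopping the notebook, the assignment can be wrapped in `try`/`except`. This is a minimal sketch; the `Distribution` base class is left out so the block runs on its own, and the `rejected` flag is a hypothetical name used only to record the outcome.

```python
import dataclasses
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Die:
    sides: int

    def sample(self):
        return random.randint(1, self.sides)


d = Die(6)
rejected = False
try:
    d.sides = 10  # frozen dataclasses reject attribute assignment
except dataclasses.FrozenInstanceError as error:
    rejected = True
    print(f"assignment rejected: {error}")

print(d.sides)  # still 6
```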

- An object that we cannot change is called immutable. Instead of changing the object in place, we can return a fresh copy with the attribute changed; `dataclasses` provides a `replace` function that makes this easy:

In [20]:
import dataclasses

d6 = Die(6)
d20 = dataclasses.replace(d6, sides=20)
d20

Die(sides=20)

- `frozen=True` has an important bonus: we can use immutable objects as dictionary keys and set elements. Without `frozen=True`, we would get a `TypeError`, because a non-frozen dataclass that generates `__eq__` (the default) has its `__hash__` set to `None`:

In [21]:
from dataclasses import dataclass

@dataclass
class Die(Distribution):
    
    sides: int
    def sample(self):
        return random.randint(1, self.sides)

In [23]:
d = Die(6)
# uncommenting the next line raises TypeError: unhashable type: 'Die'
# {d: 'abc'}

In [28]:
from dataclasses import dataclass

@dataclass(frozen=True)
class Die(Distribution):
    
    sides: int
    def sample(self):
        return random.randint(1, self.sides)

In [26]:
d = Die(6)
{d: 'abc'}

{Die(sides=6): 'abc'}
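- Because the generated `__hash__` is based on the fields, equal dice also hash equally. A small sketch of what this enables, again without the `Distribution` base class so that the block is self-contained:

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Die:
    sides: int

    def sample(self):
        return random.randint(1, self.sides)


# Equal dice hash equally, so a set keeps only one copy of each.
unique_dice = {Die(6), Die(6), Die(20)}
print(len(unique_dice))  # 2

# Dice can also key a dictionary, for example to count how often each appears.
usage = Counter([Die(6), Die(20), Die(6)])
print(usage[Die(6)])  # 2
```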

### Type variables

- The `Distribution` class defines an interface for any distribution.

- Different distributions produce different types of outcomes from `sample`. To express this in type annotations, we need type variables. Type variables are also known as "generics" because they let us write classes that work generically for any type.

- To add annotations to the abstract `Distribution` class, we need to define a type variable for the outcomes of the distribution, then tell Python that `Distribution` is "generic" in that type:

In [40]:
from typing import Generic, TypeVar

# A type variable named "A"
A = TypeVar("A")

# Distribution is "generic in A"
class Distribution(ABC, Generic[A]):
    # sampling must produce a value of type A
    @abstractmethod
    def sample(self) -> A:
        pass

- In this code, we defined a type variable `A` and specified that `Distribution` uses `A` by inheriting from `Generic[A]`. We can now write type annotations for distributions with specific types of outcomes:

In [37]:
from dataclasses import dataclass

@dataclass(frozen=True)
class Die(Distribution[int]):
    
    sides: int
    def sample(self) -> int:
        return random.randint(1, self.sides)

- This lets us write specialized functions that only work with certain kinds of distributions. Let's say we wanted to write a function that approximates the expected value of a distribution by sampling repeatedly and calculating the mean. This function works for distributions that have numeric outcomes (`float` or `int`) but not for other kinds of distributions. We can annotate this explicitly by using `Distribution[float]`:

In [41]:
import statistics

def expected_value(d: Distribution[float], n: int = 100) -> float:
    return statistics.mean(d.sample() for _ in range(n))
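- As a usage sketch, the function can be applied to a six-sided die, whose exact expected value is 3.5. The class definitions are repeated so that the block runs on its own; the seed and the sample size are arbitrary choices made for illustration.

```python
import random
import statistics
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, TypeVar

A = TypeVar("A")


class Distribution(ABC, Generic[A]):
    @abstractmethod
    def sample(self) -> A:
        pass


@dataclass(frozen=True)
class Die(Distribution[int]):
    sides: int

    def sample(self) -> int:
        return random.randint(1, self.sides)


def expected_value(d: Distribution[float], n: int = 100) -> float:
    return statistics.mean(d.sample() for _ in range(n))


random.seed(0)  # fixed seed so the run is repeatable
estimate = expected_value(Die(6), n=10_000)
print(estimate)  # a sampled approximation of the exact mean, 3.5
```

- Larger values of `n` should bring the estimate closer to 3.5, at the cost of drawing more samples.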

### Functionality

