# Chapter 7 - Exercises

## Set Up

### Packages

In [3]:
import os

import arviz as az
import graphviz as gr
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc as pm
import seaborn as sns
from scipy import stats
from scipy.interpolate import BSpline
from sklearn.preprocessing import StandardScaler



### Defaults

In [4]:
# seaborn defaults
sns.set(
    style="whitegrid",
    font_scale=1.2,
    rc={
        "axes.edgecolor": "0",
        "axes.grid.which": "both",
        "axes.labelcolor": "0",
        "axes.spines.right": False,
        "axes.spines.top": False,
        "xtick.bottom": True,
        "ytick.left": True,
    },
)

colors = sns.color_palette()

### Constants

In [5]:
DATA_DIR = "../data"
HOWELL_FILE = "howell.csv"
CHERRY_BLOSSOMS_FILE = "cherry_blossoms.csv"
WAFFLE_DIVORCE_FILE = "waffle_divorce.csv"
MILK_FILE = "milk.csv"
LDS_FILE = "lds_by_state.csv"

RANDOM_SEED = 42

In [4]:
def load_data(file_name, data_dir=DATA_DIR, **kwargs):
    """Read a CSV file from `data_dir`, passing extra keyword arguments to `pd.read_csv`."""
    path = os.path.join(data_dir, file_name)
    return pd.read_csv(path, **kwargs)

## Easy

### 7E1

State the three motivating criteria that define information entropy.
Try to express each in your own words.

---

Information entropy is a function of probabilities $p=(p_i)$ such that the following hold.
1. It is continuous in the $p_i$.
2. Increasing the number of possible outcomes should increase the entropy. That is, if a possible outcome with probability $p_i$ is replaced by two outcomes whose probabilities sum to $p_i$, then the entropy should increase (strictly, if both new outcomes have non-zero probability).
3. It is additive, in the sense that if we add a new *independent* random variable then the entropy of the joint distribution is the sum of the entropies of the marginal distributions.
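
The three criteria can be checked numerically for the usual formula $H(p) = -\sum_i p_i \log p_i$. The following is only a minimal sketch: the helper name `entropy_nats` and the example distributions are assumptions chosen for illustration, and a numerical check on a few values is not a proof.

```python
import numpy as np


def entropy_nats(prob):
    # Hypothetical helper for this sketch: Shannon entropy using the natural log.
    prob = np.asarray(prob, dtype=float)
    prob = prob[prob > 0]  # zero-probability outcomes contribute nothing
    return -np.sum(prob * np.log(prob))


# Criterion 1 (continuity): a tiny change in p gives a tiny change in entropy.
tiny_change = abs(entropy_nats([0.3 + 1e-6, 0.7 - 1e-6]) - entropy_nats([0.3, 0.7]))

# Criterion 2: splitting the second outcome (0.5) into 0.3 + 0.2 raises the entropy.
before_split = entropy_nats([0.5, 0.5])
after_split = entropy_nats([0.5, 0.3, 0.2])

# Criterion 3 (additivity): for independent variables the joint probabilities
# are the products of the marginals, and the entropies add.
p = np.array([0.3, 0.7])
q = np.array([0.2, 0.25, 0.25, 0.3])
joint = np.outer(p, q).ravel()
additive = np.isclose(entropy_nats(joint), entropy_nats(p) + entropy_nats(q))

print(tiny_change < 1e-5, after_split > before_split, additive)
```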

### 7E2

Suppose a coin is weighted such that, when it is tossed and lands on a table, it comes up heads 70% of the time.
What is the entropy of the coin?

---

In [18]:
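def information_entropy(prob):
    """Return the Shannon entropy, in nats, of the probabilities in `prob`."""
    prob = np.asarray(prob, dtype=float)
    prob = prob[prob > 0]  # drop zero-probability events: p * log(p) -> 0 as p -> 0
    return -np.sum(prob * np.log(prob))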

In [19]:
information_entropy([0.3, 0.7])

0.6108643020548935
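
This value is in nats, because the entropy is computed with the natural logarithm. As a rough sketch of the same calculation in bits (base-2 logarithm), assuming a hypothetical helper `entropy_bits` that is not part of the original notebook:

```python
import numpy as np


def entropy_bits(prob):
    # Hypothetical helper for this sketch: Shannon entropy using log base 2.
    # Assumes every probability is strictly positive.
    prob = np.asarray(prob, dtype=float)
    return -np.sum(prob * np.log2(prob))


coin_bits = entropy_bits([0.3, 0.7])  # the weighted coin
fair_bits = entropy_bits([0.5, 0.5])  # a fair coin, for reference
coin_nats = coin_bits * np.log(2)     # convert bits back to nats

print(coin_bits, fair_bits, coin_nats)
```

A fair coin carries exactly one bit of entropy, so the weighted coin, being more predictable, comes in below that.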

### 7E3

Suppose a four-sided die is loaded such that, when tossed onto a table, it shows "1" 20%, "2" 25%, "3" 25%, and "4" 30% of the time.
What is the entropy of the die?

---

In [20]:
information_entropy([0.2, 0.25, 0.25, 0.3])

1.3762266043445461
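
For context, the entropy of a four-outcome distribution is largest when all outcomes are equally likely, where it equals $\log 4$. The sketch below compares the loaded die with a fair one; `entropy_nats` is a hypothetical helper introduced only for this comparison.

```python
import numpy as np


def entropy_nats(prob):
    # Hypothetical helper for this sketch: Shannon entropy using the natural log.
    # Assumes every probability is strictly positive.
    prob = np.asarray(prob, dtype=float)
    return -np.sum(prob * np.log(prob))


loaded = entropy_nats([0.2, 0.25, 0.25, 0.3])  # the loaded die from the exercise
uniform = entropy_nats([0.25, 0.25, 0.25, 0.25])  # a fair four-sided die

print(loaded, uniform, np.log(4))
```

Because the loading is mild, the die's entropy sits only slightly below the fair-die maximum.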

### 7E4

Suppose another four-sided die is loaded such that it never shows "4".
The other three sides show equally often.
What is the entropy of this die?

---

Events with probability zero contribute nothing to the entropy, since $p \log p \to 0$ as $p \to 0$. Only the three remaining sides matter, so the entropy is:

In [21]:
information_entropy([1/3, 1/3, 1/3])

1.0986122886681096
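
The result equals $\log 3$, the entropy of three equally likely outcomes. The sketch below illustrates why the zero-probability side has to be handled explicitly if it is passed in: a direct evaluation of $-\sum_i p_i \log p_i$ hits $0 \cdot \log 0$, which NumPy evaluates to `nan`. The names `naive_entropy` and `safe_entropy` are hypothetical, introduced only for this illustration.

```python
import numpy as np


def naive_entropy(prob):
    # Direct formula; 0 * log(0) evaluates to nan in floating point.
    prob = np.asarray(prob, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        return -np.sum(prob * np.log(prob))


def safe_entropy(prob):
    # Drop zero-probability outcomes first, using the convention 0 * log(0) = 0.
    prob = np.asarray(prob, dtype=float)
    prob = prob[prob > 0]
    return -np.sum(prob * np.log(prob))


die = [1 / 3, 1 / 3, 1 / 3, 0]  # side "4" never shows

naive = naive_entropy(die)
safe = safe_entropy(die)

print(naive, safe, np.log(3))
```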

## Medium

## Hard