## Practice Calculating Minimal and Maximal Uncertainty

<p style="text-align: left; border-radius: 5px; background-color: #e3f3fb; color: #2a8bc6; padding: 10px 10px; border: 1px solid #00008B;" role="alert"><strong>Note:</strong> 
For optimal viewing, it is recommended that you click the fullscreen button in the lower-right corner of your coding window. General information on how to work with Jupyter Notebooks is contained in a collapsible panel to the right of every Jupyter Notebook. </p>


In the previous video, you explored how uncertainty relates to probability distributions using the example of fair and unfair dice. Intuitively, the more evenly probabilities are spread, the more uncertain we are about the outcome. This uncertainty can be measured using **entropy**, a key concept in information theory and language modeling.

In this activity, you’ll calculate the entropy of different discrete distributions, interpret what it means, and relate entropy to the concept of the “effective number” of likely outcomes.

## Your Objective

To practice working with entropy, you will need to complete the following tasks.

* [Task 1 of 7: Define a Function for Entropy](#Task-1-of-7:-Define-a-Function-for-Entropy)

* [Task 2 of 7: Calculate Entropy for a Fair Die](#Task-2-of-7:-Calculate-Entropy-for-a-Fair-Die)

* [Task 3 of 7: Calculate Entropy for an Unfair Die](#Task-3-of-7:-Calculate-Entropy-for-an-Unfair-Die)

* [Task 4 of 7: Calculate Entropy for a Somewhat Unfair Die](#Task-4-of-7:-Calculate-Entropy-for-a-Somewhat-Unfair-Die)

* [Task 5 of 7: Find the Equivalent Number of Outcomes](#Task-5-of-7:-Find-the-Equivalent-Number-of-Outcomes)

* [Task 6 of 7: Experiment With Uncertainty](#Task-6-of-7:-Experiment-With-Uncertainty)

* [Task 7 of 7: Save and Download This Notebook](#Task-7-of-7:-Save-and-Download-This-Notebook)

## Task 1 of 7: Define a Function for Entropy

**Entropy** quantifies uncertainty in a probability distribution. The entropy of a discrete probability distribution $p_1, p_2, \ldots, p_n$ is defined as follows, where $\log$ is the natural logarithm throughout this notebook:
$$
H = -\sum_{i=1}^{n} p_i \log p_i
$$
The higher the entropy, the more unpredictable the outcome. For each possible event, you multiply its probability by the negative log of that probability and sum the terms. If an event is impossible (probability is zero), it doesn't contribute to the sum, because $0 \times \log 0$ is treated as zero.
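
To make the term-by-term recipe concrete, here is a minimal sketch that prints each outcome's contribution before summing. The helper name `entropy_terms` and the three-outcome example are assumptions made for illustration; they are not part of the notebook's own code.

```python
import math

def entropy_terms(probs):
    """Return each outcome's contribution -p * log(p), treating 0 * log(0) as 0."""
    return [-p * math.log(p) if p > 0 else 0.0 for p in probs]

# Hypothetical example: two possible outcomes plus one impossible outcome.
probs = [0.75, 0.25, 0.0]
terms = entropy_terms(probs)

for p, term in zip(probs, terms):
    print(f"p = {p:.2f} contributes {term:.4f}")

# The entropy is the sum of the individual contributions.
print("Entropy (sum of terms):", sum(terms))
```

The impossible outcome contributes exactly zero, so it leaves the total unchanged.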

Fun fact: Entropy is usually denoted $H$, a convention that traces back through a series of references to the 19th century, although the original reason for choosing that letter is unclear.

The function in the code cell below calculates the entropy of a probability distribution `probs`.

In [None]:
import numpy as np

def entropy(probs):
    """
    Calculate the entropy of a discrete probability distribution.
    probs: array-like of probabilities (must sum to 1)
    Returns: entropy (using natural log)
    """
    probs = np.asarray(probs)
    
    # Remove zero probabilities to avoid log(0)
    nonzero_probs = probs[probs > 0]
    
    # Adding 0.0 turns the -0.0 produced for a certain outcome into 0.0
    return -np.sum(nonzero_probs * np.log(nonzero_probs)) + 0.0

## Task 2 of 7: Calculate Entropy for a Fair Die

Here you model a fair 6-sided die. Each face is equally likely, so the probability of each is set to 1/6. This is the situation of **maximal uncertainty**: you have no idea which side will come up next.

You can use algebra to rearrange the entropy function and show that the entropy of any uniform distribution is the logarithm of the number of possible outcomes. You can also use calculus to show that the uniform distribution has the highest possible entropy of any distribution over the same number of outcomes.
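
The algebra can be sketched as follows, assuming $n$ equally likely outcomes so that every $p_i = 1/n$:

$$
\begin{aligned}
H &= -\sum_{i=1}^{n} \frac{1}{n} \log \frac{1}{n} \\
  &= -n \cdot \frac{1}{n} \log \frac{1}{n} \\
  &= -\log \frac{1}{n} \\
  &= \log n
\end{aligned}
$$

With $n = 6$, this gives $H = \log 6 \approx 1.79$.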

Use the code below to show that the entropy of a fair 6-sided die is $\log 6$. This value represents the highest possible uncertainty for six outcomes.

In [None]:
# Fair six-sided die: each side has probability 1/6
fair_probs = np.ones(6) / 6
print("Probabilities for a fair die:", fair_probs)

H_fair = entropy(fair_probs)
print("Entropy for fair die:", H_fair)
print("log(6):", np.log(6))
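
The claim that the uniform distribution has the highest entropy can also be spot-checked numerically. The sketch below is not a proof: it draws random six-outcome distributions and confirms that none exceeds $\log 6$. The helper `entropy_of` is a stand-in that mirrors the Task 1 function so the cell runs on its own, and sampling from a Dirichlet distribution is an assumption chosen only because it is a convenient way to generate valid probability vectors.

```python
import numpy as np

def entropy_of(probs):
    """Entropy with the natural log; zero probabilities contribute nothing."""
    probs = np.asarray(probs, dtype=float)
    nonzero = probs[probs > 0]
    return -np.sum(nonzero * np.log(nonzero))

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

# Draw 1,000 random six-outcome distributions (each row sums to 1).
random_dice = rng.dirichlet(np.ones(6), size=1000)
entropies = np.array([entropy_of(row) for row in random_dice])

print("Largest entropy among random dice:", entropies.max())
print("log(6):", np.log(6))
print("All at or below log(6)?", bool(np.all(entropies <= np.log(6) + 1e-12)))
```

The small tolerance `1e-12` only guards against floating-point rounding.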

## Task 3 of 7: Calculate Entropy for an Unfair Die

Now consider the opposite scenario: a die that always lands on side 6. The probability is 1 for side 6 and 0 for all others. There is zero uncertainty in this case. You know the outcome with complete confidence. What do you expect entropy to be?

In [None]:
# Unfair die: always lands on side 6
unfair_probs = np.array([0, 0, 0, 0, 0, 1])
print("Probabilities for a completely unfair die:", unfair_probs)

H_unfair = entropy(unfair_probs)
print("Entropy for unfair die:", H_unfair)

## Task 4 of 7: Calculate Entropy for a Somewhat Unfair Die

Here’s a situation in between: the die is biased so that side 6 is twice as likely as any other side (probability 2/7 for side 6 and 1/7 for each of the others). You have some information about the likely result, but not complete certainty. The entropy falls between 0 and $\log 6$, reflecting this partial uncertainty.
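
As a check on the code's output, the value can also be worked out by hand from the entropy definition:

$$
\begin{aligned}
H &= -\left(5 \cdot \frac{1}{7} \log \frac{1}{7} + \frac{2}{7} \log \frac{2}{7}\right) \\
  &= \frac{5}{7} \log 7 + \frac{2}{7}\left(\log 7 - \log 2\right) \\
  &= \log 7 - \frac{2}{7} \log 2 \\
  &\approx 1.748
\end{aligned}
$$

Since $\log 6 \approx 1.792$, the result sits just below the fair-die maximum.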

In [None]:
# Unfair die: side 6 is twice as likely as each other side 
# (probabilities: 1/7 for sides 1–5, 2/7 for side 6)
somewhat_unfair_probs = np.array([1/7, 1/7, 1/7, 1/7, 1/7, 2/7])
print("Probabilities for a somewhat unfair die:", somewhat_unfair_probs)

H_somewhat = entropy(somewhat_unfair_probs)
print("Entropy for somewhat unfair die:", H_somewhat)

## Task 5 of 7: Find the Equivalent Number of Outcomes

While entropy is a powerful measure of uncertainty, its value can sometimes feel abstract. To make entropy more intuitive, we can convert it back into a simple, countable quantity: the **equivalent number of outcomes**. **Note:** This is sometimes called the "effective number of outcomes."

The equivalent number of outcomes tells you, "If all outcomes were equally likely, how many would it take to produce the current uncertainty?" In other words, it answers the question: "How many sides would a fair die need in order to have the same entropy as the distribution you are measuring?"

Mathematically, the equivalent number is defined as the exponential of the entropy:
$$
\text{Equivalent Number} = \exp(H)
$$
where $H$ is the entropy (using the natural logarithm). It reflects the *spread* or *uncertainty* of the whole distribution.

**Why Is This Useful?**
- The equivalent number provides an easy way to compare distributions of different shapes and sizes.
- It allows you to describe the degree of uncertainty in a way that is immediately interpretable and intuitive.
- In language modeling and information theory, it helps us understand and communicate how "broad" or "narrow" a model’s predictions are.

In [None]:
def equivalent_number(H):
    return np.exp(H)

print("Equivalent number for fair die with 6 sides:", equivalent_number(H_fair))
print("Equivalent number for unfair die that always lands on side 6:", equivalent_number(H_unfair))
print("Equivalent number for somewhat unfair die:", equivalent_number(H_somewhat))

**Why was the equivalent number of outcomes for a fair 6-sided die ~6?**  
For a fair 6-sided die, the entropy is $\log 6$. The equivalent number is: $\exp(\log 6) = 6$  
So there are six equally likely outcomes.  

**For a certain outcome, such as an unfair die that always lands on one side**, the entropy is 0. Therefore the equivalent number is: $\exp(0) = 1$  
This means there is only one possible outcome, and no uncertainty. 

**The entropy for the somewhat unfair die was ~1.75**. If some outcomes are more likely than others, the equivalent number will be somewhere between 1 and 6. In the example above, where one side is more likely than the rest, the **equivalent number is just a little less than 6 (5.74 in this case)**. With a more biased die, where one or two outcomes are significantly more likely, you would see a lower equivalent number. The equivalent number decreases when most of the probability is concentrated in a small number of outcomes.

## Task 6 of 7: Experiment With Uncertainty

Now, play with different probability distributions. Try making the die even more unfair or splitting the probability between two sides. Watch how entropy and the equivalent number change. Do these numbers agree with your gut sense of uncertainty?
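
As one possible starting point, the sketch below reports the entropy and the equivalent number for two hand-picked distributions. The helper `describe` and both example dice are hypothetical and are included only for illustration; replace the probabilities with your own, making sure they sum to 1.

```python
import numpy as np

def describe(label, probs):
    """Print and return the entropy (natural log) and exp(entropy) of a distribution."""
    probs = np.asarray(probs, dtype=float)
    nonzero = probs[probs > 0]  # drop zeros to avoid log(0)
    H = -np.sum(nonzero * np.log(nonzero)) + 0.0  # + 0.0 avoids printing -0.0
    n_equiv = np.exp(H)
    print(f"{label}: entropy = {H:.3f}, equivalent number = {n_equiv:.2f}")
    return H, n_equiv

# Hypothetical starting points; swap in your own probabilities.
two_sided = describe("Split between two sides", [0.5, 0.5, 0, 0, 0, 0])
heavy_six = describe("Side 6 at 90%", [0.02, 0.02, 0.02, 0.02, 0.02, 0.9])
```

A die split evenly between two sides behaves like a fair coin, so its equivalent number is 2, while the heavily loaded die falls between 1 and 2.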

## Task 7 of 7: Save and Download This Notebook

You are encouraged to save and download a copy of this notebook.

To save the notebook, select *File > Save All* in the top-left corner of this JupyterLab environment.

To download the notebook to your own computer, you can either:
- Select *File > Download* to download the notebook in its native .ipynb file format.
- Select *File > Save and Export Notebook As* to save the notebook in a file format of your choosing.