<h2> Simulation </h2>

One of the most powerful tools we have for understanding probability is to *simulate* an experiment many times; this gives us a lot of information about the long-term behavior of a model. Frequently, the best way to do this is by programming it -- we run thousands or millions of trials and gather a large amount of data about the typical outcomes, the average behavior or properties of a system, and more.

In this notebook, we'll explore how we can simulate dice rolls using Python and generate and analyze random data for this purpose. The code blocks below will simulate throwing multiple dice, recording their outcomes, and doing computations with them.

<h3> Simulating dice rolls: the code </h3>

Python's basic random number generator produces what is called a *uniform random variable*: it generates a number between $0$ and $1$ (possibly $0$, but never exactly $1$) without any bias towards one part of the interval. No one number is more likely as an outcome than any other number.

We can use this to do other random experiments too! If we generate a random number between $0$ and $1$, we can interpret it as a die roll as follows: if the number is between $0$ and $1/6$, it's a $1$; if the number is between $1/6$ and $2/6$, it's a $2$; and so on. An equivalent way to do this is to take our random number (e.g. $0.42$), multiply by $6$ and add $1$ ($0.42 \cdot 6 + 1 = 3.52$), and round down (so we rolled a $3$). This will generate any number from $1$ to $6$ with equal probability -- just like rolling a die!
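To double-check that these two descriptions really agree, here's a minimal sketch that computes a roll both ways for a few hand-picked numbers. The function names `roll_from_intervals` and `roll_from_formula` and the sample values are illustrative choices only, not part of the notebook's code.

```python
def roll_from_intervals(r, num_sides=6):
    # Walk through the intervals [0, 1/6), [1/6, 2/6), ...
    # and return the face whose interval contains r.
    for face in range(1, num_sides + 1):
        if r < face / num_sides:
            return face
    # Not reached when 0 <= r < 1; kept as a safe fallback.
    return num_sides

def roll_from_formula(r, num_sides=6):
    # Multiply by the number of sides, add 1, and round down.
    # For positive numbers, int(...) rounds down.
    return int(num_sides * r + 1)

# A few hand-picked numbers between 0 and 1 (assumed for illustration)
samples = [0.0, 0.1, 0.42, 0.5, 0.75, 0.999]

for r in samples:
    print(f'r = {r}: intervals give {roll_from_intervals(r)}, '
          f'formula gives {roll_from_formula(r)}')
```

Both columns should match on every line; for example, $0.42$ lands in the third interval and the formula gives $3$ as well.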

Here's code that rolls a die this way. It's a bit more adaptable -- we set a variable `num_sides` that can be changed, rather than fixing $6$ sides. The logic is the same, though; let's roll a die ten times:

In [1]:
# Import the random number generator package 
from random import random

# This code rolls the die once. The def keyword
# tells Python that we're defining a function,
# and then we return the roll. Here, num_sides
# is the input to the function.

def roll_die(num_sides):
    # Get the random number
    r = random()

    # Do the computation. The int(...) function
    # is what rounds down.
    roll = int(num_sides*r + 1)

    # Return this number.
    return roll

# We're now going to roll the die ten times.
num_trials = 10

# We want to store the numbers in a list; if you
# want to learn more about this, go to
# https://www.w3schools.com/python/python_lists.asp
# This just sets up an empty list.
results = []

# Roll the 6-sided die however many times we want:
for n in range(num_trials):
    # One roll
    roll = roll_die(6)

    # Add (or: append) to the list of outcomes
    results.append(roll)
    
# Print out the results
print(results)

[6, 5, 1, 1, 1, 2, 3, 1, 5, 1]


So we now have ten rolls! The first time that I ran this code, my outcome was $[6, 5, 1, 1, 1, 2, 3, 1, 5, 1]$. It's a little unusual that I didn't get a $4$, but not that unlikely.
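If you're curious just how unusual a missing $4$ is, a quick calculation helps. Here's a small sketch, assuming the ten rolls are independent and each face has probability $1/6$: each roll avoids a $4$ with probability $5/6$, so all ten rolls avoid it with probability $(5/6)^{10}$. The variable names are made up for illustration.

```python
# Probability that a single fair six-sided roll is NOT a 4
p_not_four = 5 / 6

# Number of independent rolls (ten, as in the cell above)
num_rolls = 10

# All ten rolls must avoid a 4, so multiply the probabilities
p_no_fours = p_not_four ** num_rolls

print(f'Probability of no 4 in {num_rolls} rolls = {p_no_fours:.3f}')
```

This comes out to roughly $16\%$, so a run of ten rolls with no $4$ is uncommon but far from surprising.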

Let's make this a little bit more involved. Suppose we want to roll two dice, add their rolls, and count how many times their sum is eight. Let's also do $100,000$ experiments -- more than we can do by hand:

In [2]:
# Do 100000 trials of rolling two dice
num_trials = 10**5

# We'll call a trial a success if the sum
# is equal to 8. Start with no successes.
success = 0

# Now run the trial however many times we want
# to, and add to our count of successes.
for n in range(num_trials):
    # Roll each die:
    r1 = roll_die(6)
    r2 = roll_die(6)
    
    # Test if the sum was 8. We use == to see
    # if the left and right are equal; a single
    # = means we're assigning a variable.
    
    if r1 + r2 == 8:
        # This is short for success = success + 1.
        # We just count one more:
        success += 1
        
# Now print off how many successes there were:
print(f'The dice added to 8 in {success} trials out of {num_trials}.')

The dice added to 8 in 13651 trials out of 100000.


Running $100,000$ trials, I had $13651$ successes -- meaning that the estimated probability of the two dice adding to $8$ is about $0.137$. On the other hand, using the ideas from class, there are $5$ possible dice rolls out of the $36$ total which have a sum of $8$ (2-6, 3-5, 4-4, 5-3, and 6-2). This leads to a probability of $5/36 \approx 0.139$, so our simulation was only about $2\%$ off from the truth in relative terms. This is pretty great!
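The count of $5$ favorable rolls out of $36$ can also be checked by brute force. The sketch below lists every ordered pair of faces and keeps the ones that add to $8$; it assumes both dice are fair, so all $36$ pairs are equally likely. It uses `itertools.product` and `fractions.Fraction` from Python's standard library, and the simulated figure is simply copied from my run above (yours will differ).

```python
from fractions import Fraction
from itertools import product

# Every ordered pair (first die, second die) for two 6-sided dice
faces = range(1, 7)
outcomes = list(product(faces, repeat=2))

# Keep only the pairs whose sum is 8
favorable = [(a, b) for (a, b) in outcomes if a + b == 8]

# Exact probability as a fraction: favorable / total
exact = Fraction(len(favorable), len(outcomes))

# Simulated estimate, copied from the run above
simulated = 13651 / 100000

# Relative error of the simulation compared to the exact value
relative_error = abs(simulated - float(exact)) / float(exact)

print(f'Favorable rolls: {favorable}')
print(f'Exact probability: {exact} = {float(exact):.4f}')
print(f'Relative error of the simulation: {relative_error:.1%}')
```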

Let's make one last modification: we also want to store the outcomes of each experiment. We'll use a [Python dictionary](https://www.w3schools.com/python/python_dictionaries.asp) to do this; it's something that pairs a key (the outcome) with a value (the count for that outcome). The syntax is `key:value`.
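As a warm-up for the `key:value` idea (this kind of structure is a Python dictionary, written with curly braces), here's a minimal sketch that tallies the ten single-die rolls from earlier in this notebook. The names `rolls` and `tally` are just illustrative.

```python
# The ten rolls from the first run earlier in this notebook
rolls = [6, 5, 1, 1, 1, 2, 3, 1, 5, 1]

# One key per face (1 through 6), each starting with a count of 0
tally = {face: 0 for face in range(1, 7)}

# Look up each roll's key and add one to its count
for roll in rolls:
    tally[roll] += 1

print(tally)       # {1: 5, 2: 1, 3: 1, 4: 0, 5: 2, 6: 1}
print(tally[4])    # 0 -- no 4 was rolled
```

Looking up `tally[4]` returns the value stored under the key `4`; the simulation of two-dice sums works the same way, with the sums as keys.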

In [3]:
# Do 1000 trials this time
num_trials = 1000

# Make the dictionary which counts each sum. For each possible
# sum between 2 and 12, we set the count to zero; we'll then
# add one every time that's the observed sum. Remember that
# range stops just before its second number, so we use 13.
counts = {r:0 for r in range(2, 13)} 

# Now we do the experiment:
for _ in range(num_trials):

    # Roll two dice, compute the sum
    r1 = roll_die(6)
    r2 = roll_die(6)
    s = r1 + r2

    # Update the count for this sum. Remember that
    # += 1 updates a variable to add 1 to it.
    counts[s] += 1

# Now print the results in a bit of a table. We're
# counting good outcomes out of the total number of trials.
for r in range(2, 13):
    print(f'Probability of summing to {r} = {counts[r] / num_trials}')

Probability of summing to 2 = 0.024
Probability of summing to 3 = 0.061
Probability of summing to 4 = 0.103
Probability of summing to 5 = 0.118
Probability of summing to 6 = 0.132
Probability of summing to 7 = 0.155
Probability of summing to 8 = 0.13
Probability of summing to 9 = 0.099
Probability of summing to 10 = 0.109
Probability of summing to 11 = 0.045
Probability of summing to 12 = 0.024


Based on this, the most common sums in this run were $6$, $7$, and $8$. We would need to do more trials to get more certainty -- $1000$ trials is a pretty small simulation!
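To see how the number of trials affects the estimate, here's a hedged sketch that repeats the two-dice experiment at a few sample sizes and compares the observed frequency of a sum of $7$ with its exact value of $6/36$ (six of the $36$ equally likely rolls add to $7$). The helper names `roll_die_seeded` and `estimate_sum_probability` are hypothetical, and the seed `2024` is an arbitrary choice made only so the run is repeatable.

```python
from random import Random

# A separate, seeded generator so the results are repeatable.
# The seed value is arbitrary (an assumption for this sketch).
rng = Random(2024)

def roll_die_seeded(num_sides):
    # Same logic as roll_die, but using the seeded generator
    return int(num_sides * rng.random() + 1)

def estimate_sum_probability(target, num_trials):
    # Fraction of trials in which two 6-sided dice add to target
    hits = 0
    for _ in range(num_trials):
        if roll_die_seeded(6) + roll_die_seeded(6) == target:
            hits += 1
    return hits / num_trials

# Exact probability of a sum of 7: six favorable rolls out of 36
exact_probability = 6 / 36

# Estimate the same probability with more and more trials
estimates = {}
for num_trials in (1_000, 10_000, 100_000):
    estimates[num_trials] = estimate_sum_probability(7, num_trials)
    gap = abs(estimates[num_trials] - exact_probability)
    print(f'{num_trials} trials: estimate = {estimates[num_trials]:.4f}, '
          f'distance from 6/36 = {gap:.4f}')
```

The distance from the exact value typically shrinks as the number of trials grows, although any single run can buck the trend.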

<h3> Questions </h3>

For the following questions, add or modify code blocks as necessary. Record your conclusions in the answer block below the questions; you can copy-paste any results into that cell.

1) If you roll three dice instead, what is/are the most common sum(s)? Make sure to update the range -- the sums won't run from $2$ to $12$ anymore!

2) Let's change the experiment: roll two dice and multiply their values instead of adding. What is the most likely outcome, experimentally?

<h2><i>Put your answers to the questions here!</i></h2>


<h3> Submitting this to Gradescope </h3>

Once you've finished modifying your notebook and answering the questions, you'll need to submit it to Gradescope along with your other homework. To do this, generate a PDF file by clicking `File -> Save and Export Notebook as... -> PDF`. Then upload that PDF to Gradescope and submit it to the assignment `Jupyter 2 - Simulation`. As always, if you have any questions or run into any issues, you can
* ask during discussion,
* email your TA or instructor,
* or bring them to student hours!