# Getting Started with Symbulate

## Section 1. Probability Spaces

<[Contents](index.ipynb) | [Random variables](gs_rv.ipynb)>

**Every time you start Symbulate**, you must first run (SHIFT-ENTER) the following commands.

In [None]:
from symbulate import *
%matplotlib inline

This section provides an introduction to the basic Symbulate commands for simulating outcomes of a random process (like flipping a coin or rolling a die) and summarizing simulation output.

### Example 1.1: Flipping a coin

The following commands simulate a single flip of a fair coin.

In [None]:
flip = BoxModel(["Heads", "Tails"])
flip.draw()

A [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) is a simple example of a [probability space](https://dlsun.github.io/symbulate/probspace.html).  The first line above creates a box with two tickets, "Heads" and "Tails". The second line draws a ticket from this box at random and returns the result of the draw.  (By default each ticket is equally likely to be selected.  This can be changed with the [`probs`](https://dlsun.github.io/symbulate/probspace.html#BoxModel-options) argument.)

Many simple probability models can be thought of as drawing tickets from a box, as in the following exercise.

### Exercise 1.2: Rolling a die

Use [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) and [`.draw()`](https://dlsun.github.io/symbulate/probspace.html#Draw) to simulate a single roll of a fair six-sided die.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

<a id='e1.2'></a>
[Solution](#sol1.2)

### Example 1.3: A sequence of five coin flips

The following commands simulate five flips of a fair coin.  Note that it is often more convenient to represent Heads as 1 and Tails as 0.

In [None]:
flips = BoxModel([1, 0], size=5)
flips.draw()

The first argument of [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) creates a box with two tickets, 0 and 1, and the second argument, [`size`](https://dlsun.github.io/symbulate/probspace.html#BoxModel-options), determines the number of tickets to draw from the box.  (By default, each ticket is replaced before the next draw; this can be changed with the [`replace`](https://dlsun.github.io/symbulate/probspace.html#BoxModel-options) argument.)
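
To make the difference between drawing with and without replacement concrete, here is a minimal sketch in plain Python using only the standard `random` module. It is an illustration of the idea, not Symbulate's implementation; the variable names and the seed are assumptions chosen for this example.

```python
import random

box = [1, 0]            # tickets: 1 = Heads, 0 = Tails
rng = random.Random(0)  # seeded only so the sketch is reproducible

# Five draws WITH replacement: the drawn ticket goes back in the box
# before the next draw, so the same ticket can appear many times.
with_replacement = rng.choices(box, k=5)

# Draws WITHOUT replacement: each ticket can be drawn at most once,
# so no more than len(box) tickets can be drawn in total.
without_replacement = rng.sample(box, k=2)

print(with_replacement)
print(without_replacement)
```

With only two tickets in the box, five draws are possible only when tickets are replaced, which is why replacement is the natural default for coin flips.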

<a id='a1.3'></a>

### Exercise 1.4: Rolling a pair of six-sided dice

Use [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) with the [`size`](https://dlsun.github.io/symbulate/probspace.html#BoxModel-options) argument and [`.draw()`](https://dlsun.github.io/symbulate/probspace.html#Draw) to simulate rolling two fair six-sided dice.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

<a id='e1.4'></a>
[Solution](#sol1.4)

### Example 1.5: Simulating multiple draws with `.sim()`

In [Example 1.3](#e1.2) one draw consisted of a sequence of five coin flips. The following commands simulate 10000 such draws using [`.sim()`](https://dlsun.github.io/symbulate/sim.html#Simulate).

In [None]:
flips = BoxModel([1, 0], size=5)
flips.sim(10000)

Note that every time [`.sim()`](https://dlsun.github.io/symbulate/sim.html#Simulate) is called, new values are simulated. To perform several operations on the same set of simulated values, store them in a variable first.

In [None]:
flips = BoxModel([1, 0], size=5)
sims = flips.sim(10000)
sims

Be careful not to confuse the "sample size" of a single outcome (e.g. `size=5` for 5 flips) with the number of outcomes to simulate (`.sim(10000)` for 10000 simulated outcomes).
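
The distinction can be pictured as a nested structure: the number of simulations sets how many outcomes there are, and `size` sets how long each outcome is. The sketch below mimics this in plain Python; `draw_outcome` and `simulate` are hypothetical helper names for this illustration, not Symbulate functions.

```python
import random

rng = random.Random(1)  # seeded only so the sketch is reproducible

def draw_outcome(box, size):
    """One outcome: `size` tickets drawn with replacement from `box`."""
    return tuple(rng.choices(box, k=size))

def simulate(box, size, n_sims):
    """`n_sims` independent outcomes, each a tuple of length `size`."""
    return [draw_outcome(box, size) for _ in range(n_sims)]

# Two simulated outcomes, each a sequence of four coin flips.
sims = simulate([1, 0], size=4, n_sims=2)
print(len(sims))     # number of outcomes
print(len(sims[0]))  # number of flips within one outcome
```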

### Exercise 1.6: Simulating sets of coin flips

1) Simulate 3 sequences of 5 coin flips each.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

2) Simulate 5 sequences of 3 coin flips each.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

<a id='e1.6'></a>
[Solution](#sol1.6)

### Example 1.7: Counting the number of Heads in a sequence of coin flips

With 1 representing Heads and 0 Tails, the number of Heads in a sequence of coin flips can be counted by summing the 0/1 values.  Calling [`.draw()`](https://dlsun.github.io/symbulate/probspace.html#Draw) below simulates a single draw from `P`, a sequence of 0s and 1s representing five coin flips, and then these values are summed to return the number of Heads.  

In [None]:
P = BoxModel([1, 0], size=5)
sum(P.draw())

The above code simulates the value of the number of Heads in a single sequence of five coin flips.  Many such values can be simulated using [`.sim()`](https://dlsun.github.io/symbulate/sim.html#sim) followed by the [`.apply()`](https://dlsun.github.io/symbulate/sim.html#apply) method to apply the `sum` function to each simulated outcome.

In [None]:
P = BoxModel([1, 0], size=5)
P.sim(10000).apply(sum)

The number of Heads in a sequence of five coin flips is an example of a *random variable* ([`RV`](https://dlsun.github.io/symbulate/rv.html)).  [Section 2](gs_rv.ipynb) of the tutorial covers random variables in more detail. 
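
Conceptually, applying `sum` to simulated outcomes means evaluating the function once per outcome. A rough plain-Python analogue, assuming each outcome is stored as a tuple of 0/1 flips (an assumption for this sketch, not a statement about Symbulate's internals), looks like this:

```python
import random

rng = random.Random(2)  # seeded only so the sketch is reproducible

# 10000 simulated outcomes, each a tuple of five 0/1 flips.
outcomes = [tuple(rng.choices([1, 0], k=5)) for _ in range(10000)]

# Applying `sum` to each outcome turns a sequence of flips
# into a single number: the count of Heads in that sequence.
heads_counts = [sum(outcome) for outcome in outcomes]

print(outcomes[:3])
print(heads_counts[:3])
```

Any function that accepts one outcome could be used in place of `sum`, for example `max` or `len`.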

<a id='e1.7'></a>

### Exercise 1.8: Sum of two dice rolls

Use [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel), [`.sim()`](https://dlsun.github.io/symbulate/sim.html#sim), and [`.apply()`](https://dlsun.github.io/symbulate/sim.html#apply) to simulate 10000 values of the sum of two six-sided dice rolls.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

<a id='e1.8'></a>
[Solution](#sol1.8)

### Example 1.9: Summarizing simulation results and estimating probabilities

In [Example 1.7](#e1.6) we simulated 10000 values of the number of Heads in five flips of a coin.  The simulated values can be summarized using [`.tabulate()`](https://dlsun.github.io/symbulate/sim.html#tabulate).  

In [None]:
P = BoxModel([1, 0], size=5)
sims = P.sim(10000).apply(sum)
sims.tabulate()

We can see, for example, that it is much more likely to obtain 3 Heads than 0 Heads in 5 flips of a coin.  By default, [`.tabulate()`](https://dlsun.github.io/symbulate/sim.html#tabulate) displays frequencies (counts).  Use [`.tabulate(normalize=True)`](https://dlsun.github.io/symbulate/sim.html#tabulate) to display relative frequencies (proportions).  The probability of a random event can be estimated by its simulated relative frequency.

In [None]:
sims.tabulate(normalize=True)

There are several other [tools](https://dlsun.github.io/symbulate/sim.html) for summarizing simulations, like the [`count`](https://dlsun.github.io/symbulate/sim.html#count) functions below.  (Displaying output in a [plot](https://dlsun.github.io/symbulate/rv.html#plot) will be covered in [Section 2](gs_rv.ipynb) of the tutorial.)

In [None]:
sims.count_eq(2) / 10000

In [None]:
sims.count_leq(2) / 10000
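
For comparison with the simulated estimates, the exact probabilities can be computed directly, since the number of Heads in `n` independent flips with Heads probability `p` has probability `C(n, k) * p**k * (1 - p)**(n - k)` of equaling `k`. The helper name `binomial_pmf` is an assumption for this sketch; it uses only the standard library.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Exact probability of k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 5, 0.5
pmf = {k: binomial_pmf(k, n, p) for k in range(n + 1)}

for k, prob in pmf.items():
    print(k, prob)

p_eq_2 = pmf[2]                           # exact P(exactly 2 Heads)
p_leq_2 = sum(pmf[k] for k in range(3))   # exact P(at most 2 Heads)
print(p_eq_2, p_leq_2)
```

The exact values are 0.3125 for exactly 2 Heads and 0.5 for at most 2 Heads, so the simulated relative frequencies should land near those numbers, with some variation from run to run.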

### Exercise 1.10: Summarizing simulation results for dice rolls

In [Exercise 1.8](#e1.7) you simulated 10000 values of the sum of two six-sided dice rolls.

1) Create a table of relative frequencies of the simulated values.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

2) Use a [count](https://dlsun.github.io/symbulate/sim.html#count) function to estimate the probability that the sum of two dice rolls is greater than nine.  (Bonus: try doing this several ways.)

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

<a id='e1.10'></a>
[Solution](#sol1.10)

### Example 1.11: Other common probability models  - Binomial

The second table in [Example 1.9](#e1.8) displays the approximate [distribution](https://dlsun.github.io/symbulate/rv.html#distribution) of the number of Heads in five flips of a fair coin. This distribution is called the [`Binomial`](https://dlsun.github.io/symbulate/common_discrete.html#binomial) distribution with `n=5` flips and a probability that each flip lands on Heads equal to `p=0.5`. The number of Heads can be simulated from this Binomial distribution directly, without first simulating the individual coin flips.

In [None]:
Binomial(n=5, p=0.5).sim(10000).tabulate()

In addition to [`Binomial`](https://dlsun.github.io/symbulate/common_discrete.html#binomial), many other [commonly used probability spaces](https://dlsun.github.io/symbulate/common_discrete.html) are built into Symbulate.

### Exercise 1.12: Simulating from a Binomial model

1) Use [`Binomial`](https://dlsun.github.io/symbulate/common_discrete.html#binomial) and [`.sim()`](https://dlsun.github.io/symbulate/sim.html#Simulate) to simulate 10000 values of the number of Heads in a sequence of 10 coin flips and tabulate the results.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

2) Use a [count](https://dlsun.github.io/symbulate/sim.html#count) function to estimate the probability of getting 7 or more heads out of 10 coin flips.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

<a id='e1.12'></a>
[Solution](#sol1.12)

### Example 1.13: Rolling different dice

In [Exercise 1.4](#a1.3) we simulated a pair of rolls of a six-sided die using [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) with [`size=2`](https://dlsun.github.io/symbulate/probspace.html#BoxModel-options).  Now suppose we want to roll a six-sided die and a four-sided die.  This can be accomplished by setting up a [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) for each die and combining them with an asterisk [`*`](https://dlsun.github.io/symbulate/probspace.html#indep).

In [None]:
rolls = BoxModel([1, 2, 3, 4, 5, 6]) * BoxModel([1, 2, 3, 4])
rolls.draw()

Note that the outcome of the draw is a pair of values.  The product [`*`](https://dlsun.github.io/symbulate/probspace.html#indep) notation indicates that the two rolls are [independent](https://dlsun.github.io/symbulate/probspace.html#Independent-probability-spaces).
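
One way to picture the product of two boxes is to list every possible pair. Under independence, the probability of a pair is the product of the probabilities of its two components. The sketch below enumerates the pairs with the standard library; it illustrates the sample space only and is not how Symbulate combines spaces internally.

```python
from fractions import Fraction
from itertools import product

six_sided = [1, 2, 3, 4, 5, 6]
four_sided = [1, 2, 3, 4]

# Every possible (six-sided roll, four-sided roll) pair.
pairs = list(product(six_sided, four_sided))

# Independence: P(a, b) = P(a) * P(b) for fair dice.
prob_each_pair = Fraction(1, len(six_sided)) * Fraction(1, len(four_sided))

print(len(pairs))
print(prob_each_pair)
```

There are 6 × 4 = 24 pairs, each with probability 1/24.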

### Exercise 1.14: Independent random numbers

Draw two random whole numbers, first an odd number and then an even number, between 1 and 10 (inclusive).  Hint: Use two [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel)s, one for odd and one for even, and combine them with [`*`](https://dlsun.github.io/symbulate/probspace.html#indep).

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

<a id='e1.14'></a>
[Solution](#sol1.14)

### Example 1.15: BoxModel with non-equally likely tickets

By default [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) assumes that each ticket is equally likely, but there are several ways to handle situations where outcomes are not equally likely.  As an example, suppose a special six-sided die has one face with a 1, two faces with a 2, two faces with a 3, and one face with a 4. We can create a [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) with one ticket labeled 1, two tickets labeled 2, two labeled 3, and one labeled 4.

In [None]:
die = BoxModel([1, 2, 2, 3, 3, 4])
die.sim(10000).tabulate(normalize=True)

Rather than repeating values, we can specify a [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) by each of the possible values along with the number of tickets with that value, as in the following.

In [None]:
die = BoxModel({1: 1, 2: 2, 3: 2, 4: 1})
die.sim(10000).tabulate(normalize=True)

A non-equally likely [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) can also be defined using the [`probs`](https://dlsun.github.io/symbulate/probspace.html#BoxModel-options) argument, by specifying a probability value for each ticket. 

In [None]:
die = BoxModel([1, 2, 3, 4], probs=[1/6, 2/6, 2/6, 1/6])
die.sim(10000).tabulate(normalize=True)

Notice that the relative frequencies are close to the specified probabilities.
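
The three specifications above describe the same die. A short standard-library sketch shows how the list of repeated tickets reduces to the ticket counts and then to the probabilities; the variable names are chosen for this illustration only.

```python
from collections import Counter
from fractions import Fraction

tickets = [1, 2, 2, 3, 3, 4]

# Repeated tickets -> number of tickets per value.
counts = Counter(tickets)

# Ticket counts -> probability of each value.
total = sum(counts.values())
probs = {value: Fraction(count, total) for value, count in sorted(counts.items())}

print(dict(counts))
print(probs)
```

The resulting probabilities are 1/6, 1/3, 1/3, and 1/6, which match the values `1/6, 2/6, 2/6, 1/6` passed to `probs`.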

### Exercise 1.16: Flipping a weighted coin
<a id='ex1.16'></a>

A certain weighted coin lands on heads 75% of the time and tails 25% of the time. Use [`BoxModel`](https://dlsun.github.io/symbulate/probspace.html#boxmodel) with the [`probs`](https://dlsun.github.io/symbulate/probspace.html#BoxModel-options) argument to simulate and summarize many flips of the weighted coin.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

<a id='e1.16'></a>
[Solution](#sol1.16)

## Additional Exercises

<a id='ex1.17'></a>
### Exercise 1.17: Sum of two different dice

Use simulation to approximate the distribution of the sum of a six-sided die and a four-sided die.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

[Hint](#h1.17)

<a id='e1.17'></a>
[Solution](#sol1.17)

### Exercise 1.18: Estimating probabilities of the sum of different dice

Suppose we have a special six-sided die that has one face with a 1, one face with a 2, two faces with a 3, and two faces with a 4. We also have a fair four-sided die. Use simulation to estimate the probability that the sum of these two dice is at least 6.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

[Hint](#h1.18)

<a id='e1.18'></a>
[Solution](#sol1.18)

### Exercise 1.19: Summarizing multiple draws of weighted coin flips

Consider a certain coin that lands on heads 60% of the time and tails 40% of the time.  

1) Approximate the distribution of the number of heads in four flips of this coin by first simulating the individual flips.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

2) Approximate the distribution of the number of heads in four flips of this coin by simulating directly from a [`Binomial`](https://dlsun.github.io/symbulate/common_discrete.html#binomial) distribution.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

[Hint](#h1.19)

<a id='e1.19'></a>
[Solution](#sol1.19)  

### Exercise 1.20: Max of two dice rolls

Approximate the distribution of the larger of two rolls of a fair six-sided die,  and estimate the probability that the larger of the two rolls is at least 5.

In [None]:
### Type your commands in this cell and then run using SHIFT-ENTER.

[Hint](#h1.20)

<a id='e1.20'></a>
[Solution](#sol1.20)

[Back to Contents](#contents)

## Hints for Additional Exercises

<a id='h1.17'></a>
### Exercise 1.17: Hint

In [Example 1.7](#e1.6) we simulated the number of Heads in sequences of 1/0 coin flips by using [`.apply()`](https://dlsun.github.io/symbulate/sim.html#apply) with the `sum` function. In [Example 1.13](#e1.12) we simulated rolling a six-sided die and a four-sided die.

<a id='h1.18'></a>
[Back](#e1.17)  

### Exercise 1.18: Hint

In [Example 1.9](#e1.8) we estimated the probability of a random event using a `count` function. In [Exercise 1.17](#ex1.17) we found the sum of a fair six-sided die and a fair four-sided die. In [Example 1.15](#e1.14) we created a `BoxModel` for tickets that are not equally likely.

<a id='h1.19'></a>
[Back](#e1.18)  

### Exercise 1.19: Hint

1) In [Example 1.9](#e1.8) we approximated the distribution of the number of Heads in five flips of a fair coin. In [Exercise 1.16](#ex1.16) we simulated flips of a coin that landed on heads 75% of the time and tails 25% of the time.

2) In [Example 1.11](#e1.10) we used [`Binomial`](https://dlsun.github.io/symbulate/common_discrete.html#binomial) with `n=5` and `p=0.5` to simulate the number of Heads in five flips of a coin that lands on Heads with probability 0.5.

<a id='h1.20'></a>
[Back](#e1.19)  

### Exercise 1.20: Hint

In [Exercise 1.8](#e1.7) we simulated the `sum` of two rolls.  To find the larger of the two rolls use [`.apply()`](https://dlsun.github.io/symbulate/sim.html#apply) with the `max` function. Use [`.tabulate()`](https://dlsun.github.io/symbulate/sim.html#tabulate) to summarize the approximate distribution and a [`.count`](https://dlsun.github.io/symbulate/sim.html#count) function to approximate the probability.

<a id='h1.21'></a>
[Back](#e1.20)  

[Back to Contents](#contents)
<a id='solutions'></a>
<a id='sol1.2'></a>

## Solutions to Exercises

### Exercise 1.2: Solution

In [None]:
die = BoxModel([1, 2, 3, 4, 5, 6])
die.draw()

[Back](#e1.2)
<a id='sol1.4'></a>

### Exercise 1.4: Solution

In [None]:
dice = BoxModel([1, 2, 3, 4, 5, 6], size=2)
dice.draw()

[Back](#e1.4)
<a id='sol1.6'></a>

### Exercise 1.6: Solution

1) Simulate 3 sequences of 5 coin flips each.

In [None]:
BoxModel([1, 0], size=5).sim(3) 

2) Simulate 5 sequences of 3 coin flips each.

In [None]:
BoxModel([1, 0], size=3).sim(5) 

[Back](#e1.6)
<a id='sol1.8'></a>

### Exercise 1.8: Solution

In [None]:
dice = BoxModel([1, 2, 3, 4, 5, 6], size=2)
dice.sim(10000).apply(sum)

[Back](#e1.8)
<a id='sol1.10'></a>

### Exercise 1.10: Solution

1) Create a table of relative frequencies of the simulated values.

In [None]:
dice = BoxModel([1, 2, 3, 4, 5, 6], size=2)
sims = dice.sim(10000).apply(sum)
sims.tabulate(normalize=True)

2) Use a [count](https://dlsun.github.io/symbulate/sim.html#count) function to estimate the probability that the sum of two dice rolls is greater than nine. (Bonus: try doing this several ways.)

In [None]:
sims.count_gt(9) / 10000

In [None]:
sims.count_geq(10) / 10000

In [None]:
1 - sims.count_leq(9) / 10000

In [None]:
1 - sims.count_lt(10) / 10000
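
As a check on the four estimates, the exact probability can be found by enumerating all 36 equally likely pairs of rolls. This is a standard-library sketch for comparison and is not part of the Symbulate solution.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# All 36 equally likely (first roll, second roll) pairs.
outcomes = list(product(faces, repeat=2))

# Pairs whose sum is greater than nine.
favorable = [pair for pair in outcomes if sum(pair) > 9]

exact = Fraction(len(favorable), len(outcomes))
print(exact, float(exact))
```

Six of the 36 pairs have a sum greater than nine, so the exact probability is 1/6, or about 0.167. Each simulated estimate should be close to this value.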

[Back](#e1.10)
<a id='sol1.12'></a>

### Exercise 1.12: Solution

1) Use [`Binomial`](https://dlsun.github.io/symbulate/common_discrete.html#binomial) and [`.sim()`](https://dlsun.github.io/symbulate/sim.html#Simulate) to simulate 10000 values of the number of Heads in a sequence of 10 coin flips and tabulate the results.

In [None]:
sims = Binomial(n=10, p=0.5).sim(10000)
sims.tabulate()

2) Use a [count](https://dlsun.github.io/symbulate/sim.html#count) function to estimate the probability of getting 7 or more heads out of 10 coin flips.

In [None]:
sims.count_geq(7) / 10000

[Back](#e1.12)
<a id='sol1.14'></a>

### Exercise 1.14: Solution

In [None]:
P = BoxModel([1, 3, 5, 7, 9]) * BoxModel([2, 4, 6, 8, 10])
P.draw()

[Back](#e1.14)
<a id='sol1.16'></a>

### Exercise 1.16: Solution

In [None]:
coin = BoxModel(["Heads", "Tails"], probs=[0.75, 0.25])
coin.sim(10000).tabulate()

[Back](#e1.16)  
<a id='sol1.17'></a>

### Exercise 1.17: Solution

In [None]:
rolls = BoxModel([1, 2, 3, 4, 5, 6]) * BoxModel([1, 2, 3, 4])
sums = rolls.sim(10000).apply(sum)
sums.tabulate(normalize=True)

[Back](#e1.17)  
<a id='sol1.18'></a>

### Exercise 1.18: Solution

In [None]:
rolls = BoxModel({1: 1, 2: 1, 3: 2, 4: 2}) * BoxModel([1, 2, 3, 4])
sims = rolls.sim(10000).apply(sum)
sims.count_geq(6) / 10000
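
The estimate can be compared with the exact probability. Because the special die has unequal face counts, the sketch below weights each pair by the number of faces showing that value; the dictionary mirrors the ticket counts used in the solution, and the remaining names are assumptions for this illustration.

```python
from fractions import Fraction
from itertools import product

special_die = {1: 1, 2: 1, 3: 2, 4: 2}  # value -> number of faces
fair_die = [1, 2, 3, 4]

total_faces = sum(special_die.values())

exact = Fraction(0)
for (value, n_faces), roll in product(special_die.items(), fair_die):
    if value + roll >= 6:
        # P(special die shows value) * P(fair die shows roll)
        exact += Fraction(n_faces, total_faces) * Fraction(1, len(fair_die))

print(exact, float(exact))
```

The exact probability is 11/24, or about 0.458.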

[Back](#e1.18)  
<a id='sol1.19'></a>

### Exercise 1.19: Solution

1) Use simulation to display the relative frequencies of the number of heads in 4 flips of the special coin using an apply() function.

In [None]:
coins = BoxModel([1, 0], probs=[0.6, 0.4], size=4)
coins.sim(10000).apply(sum).tabulate(normalize=True)

2) Use simulation to display the relative frequencies of the number of heads in 4 flips of the special coin using the Binomial distribution.

In [None]:
Binomial(n=4, p=0.6).sim(10000).tabulate(normalize=True)

[Back](#e1.19)  
<a id='sol1.20'></a>

### Exercise 1.20: Solution

In [None]:
dice = BoxModel([1, 2, 3, 4, 5, 6], size=2)
sims = dice.sim(10000).apply(max)
sims.tabulate(normalize=True)

In [None]:
sims.count_geq(5) / 10000
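
For reference, the exact distribution of the larger roll can be obtained by enumeration. This standard-library sketch is only a check on the simulation above and does not use Symbulate.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely pairs of rolls of a fair six-sided die.
outcomes = list(product(range(1, 7), repeat=2))

# Number of pairs whose larger roll equals each value.
max_counts = Counter(max(pair) for pair in outcomes)

distribution = {k: Fraction(max_counts[k], len(outcomes)) for k in range(1, 7)}
p_at_least_5 = distribution[5] + distribution[6]

for k, prob in distribution.items():
    print(k, prob)
print(p_at_least_5, float(p_at_least_5))
```

The larger roll equals `k` in `2k - 1` of the 36 pairs, so the probability that it is at least 5 is (9 + 11)/36 = 5/9, or about 0.556.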

[Back](#e1.20)  
<a id='sol1.21'></a>

<[Contents](index.ipynb) | [Random variables](gs_rv.ipynb)>