# Problem 2: Writing functions to simulate coin tosses

In the second half of this course, we'll spend a lot of time doing statistics by simulation. In this problem you'll apply Python syntax for functions and loops to run two simple simulations of coin tosses. In these simulations we consider a fair coin (probability of heads = 0.5). In the language of probability, each toss is a 'trial' and we consider 'heads' a success.

You'll simulate the distributions that answer two questions:

1. How many heads do you expect with n coin tosses?

2. How many coin tosses does it take before seeing heads for the first time?

If you've taken a probability course, you may know how to answer these questions mathematically. In this problem, we're just going to simulate the results and look at how the two distributions differ.

## 2.1: A function to simulate n coin tosses

In the cell below, write a function called `total_successes()` to simulate n coin tosses. 

1. The function should take one argument, the number of tosses (which you can call n). 
2. Give n a default value of 10.
3. The function should return the total number of heads obtained.

Hint: Use a for loop to execute the coin-toss code n number of times and keep track of each iteration that results in heads.
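
The counting pattern in the hint can be sketched as follows. This is only an illustration of the pattern, not the solution: the list `outcomes` is made-up data, and your function will generate its outcomes randomly instead.

```python
# Hypothetical, hard-coded outcomes used only to illustrate the counter pattern
outcomes = ['H', 'T', 'T', 'H', 'H']

heads = 0                    # start the counter at zero
for outcome in outcomes:     # visit each toss once
    if outcome == 'H':       # only count the tosses that came up heads
        heads += 1

print(heads)  # → 3
```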

Once you define your function, test it out a few times to make sure it's giving you reasonable answers.

### Simulating the coin toss:
Obviously there is no actual coin involved here. Use the random number generator function `random.random()` to generate a random float in the range [0, 1), that is, from 0 up to but not including 1. (We discussed the `random` module in the lecture.) If the random number is greater than 0.5, consider the outcome heads.
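
A single simulated toss might look like the sketch below. The variable names `draw` and `is_heads` are just illustrative, and the call to `random.seed()` is an assumption added so the sketch gives the same result each time you run it; you don't need it in your function.

```python
import random

random.seed(0)  # assumption: fixed seed so this sketch is repeatable

draw = random.random()   # a float in the range [0.0, 1.0)
is_heads = draw > 0.5    # True means this toss counts as heads

print(draw, is_heads)
```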

In [None]:
import random

# Write the function below
# YOUR ANSWER HERE

In [None]:
assert total_successes(10) < 11

In [None]:
# Try out your function here


## 2.2: A function to simulate the number of tosses until first success

Now write a function called `first_success()` to simulate the number of coin tosses it takes before seeing heads for the first time. Use the same random number approach to simulate coin tosses.

This function needs to keep tossing the coin until heads shows up. One way to do this is to use a different type of loop control, called a `while` loop. A `while` loop is a mix of `if` and `for`: `while` evaluates whether a statement is true or false, and runs the next loop iteration as long as the condition is true.

Here's a comparison of `if` and `while`:

```python
x = 8

if x < 10:
    # execute the code in this block once
    print("if:", x)

while x < 10:
    # execute the code, repeat the loop if the condition is still true
    print("while:", x)
    x += 1
```

Another way to think about `while` is this: `for` runs a loop for a fixed number of times, and `while` runs a loop for an unspecified number of times - it just keeps going until a truth condition fails.
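
To make the contrast concrete, here is a small sketch that puts the two loop types side by side. The halving example and the names `total`, `value`, and `steps` are made up for illustration and have nothing to do with coin tosses.

```python
# for: the number of iterations is fixed in advance (here, 5)
total = 0
for i in range(5):
    total += i

# while: keep halving until the value drops below 1;
# the loop itself doesn't say how many steps that will take
value = 100.0
steps = 0
while value >= 1:
    value /= 2
    steps += 1

print(total, steps)  # → 10 7
```

Note that the `while` loop must change something its condition depends on (here, `value`); otherwise the condition never becomes false and the loop runs forever.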

The cell below shows an example of the syntax:

In [None]:
# Count to 10
x = 0
while x < 10:  # loop will keep running as long as this boolean condition is True
    x += 1
    print(x)

In the cell below, write the function `first_success()` to simulate tossing a coin until the first success.

1. The function takes no arguments (parentheses are empty). (Make sure you understand why no arguments are needed.)


2. The function should return the number of tosses it takes to first see heads. (This total should include the toss that came up heads.)


3. Use `random.random()` to simulate the coin toss, as before.


4. Use a `while` loop to keep tossing until a toss comes up heads.

Test out your function and make sure it gives you reasonable answers.

In [None]:
# Write your function below

# YOUR ANSWER HERE

In [None]:
assert first_success() > 0

In [None]:
# Test out your function here

## 2.3: Use functions to simulate coin toss distributions

The two functions each simulate one set of trials. Your first function tells you how many heads were in one set of n coin tosses. But what is the distribution of outcomes if you repeated the n tosses many times?

To simulate probability distributions using our functions, you will run `for` loops that call each function 100,000 times. As the loop runs, you keep track of the outcome of each iteration by appending it to a list. Then you can plot the distribution of the 100,000 outcomes stored in the list.

In the cell below, do the following:

1. Write a `for` loop that calls `total_successes()` 100,000 times, using 10 for the argument n. Each time through the loop, append the result to the list `totals`.

2. Write a `for` loop that calls `first_success()` 100,000 times. Each time through the loop, append the result to the list `firsts`.
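
The call-and-append pattern described in the steps above can be sketched like this. The function `one_outcome()` and the list `results` are hypothetical stand-ins (a die roll, repeated only 1,000 times) so that the sketch doesn't depend on your own functions.

```python
import random

def one_outcome():
    # Hypothetical stand-in for one of your simulation functions
    return random.randint(1, 6)

results = []                         # empty list to collect the outcomes
for _ in range(1000):                # `_` signals the loop variable is unused
    results.append(one_outcome())    # store the outcome of this repetition

print(len(results), min(results), max(results))
```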

In [None]:
# Create lists to hold the outcomes
totals = []
firsts = []

# Call total_successes 100,000 times
# YOUR ANSWER HERE

# Call first_success 100,000 times
# YOUR ANSWER HERE

print(max(totals), min(totals))  # sanity-check the range of outcomes

In [None]:
assert max(totals) > 6
assert min(totals) < 2
assert len(totals) == 100000
assert len(firsts) == 100000
assert max(firsts) < 35
assert max(firsts) > 8
assert min(firsts) == 1

## Visualizing the distributions
Run the cell below to plot the two distributions. (We'll cover plot syntax in a later lecture.)  Note their different shapes. If you've taken a probability course, you might recognize that we've simulated the binomial (number of successes in n trials) and the geometric (number of trials to first success) distributions.
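
If you're curious how the simulation compares with theory, the sketch below computes the textbook probabilities for both distributions for a fair coin. It's optional and not part of the assignment; the names `binomial_pmf` and `geometric_pmf` are illustrative, and `math.comb` assumes Python 3.8 or later.

```python
from math import comb

p = 0.5   # probability of heads for a fair coin
n = 10    # number of tosses in each set

# Binomial: probability of exactly k heads in n tosses, for k = 0..n
binomial_pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Geometric: probability that the first heads appears on toss k, for k = 1..10
geometric_pmf = [(1 - p)**(k - 1) * p for k in range(1, 11)]

print(binomial_pmf[5])    # most likely outcome: 5 heads out of 10
print(geometric_pmf[0])   # chance that the very first toss is heads
```

With `density=True`, the heights of the unit-width histogram bars should land close to these values.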

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = [12, 4]

# Unit-width bins up to 10, then wider bins for the long tail of `firsts`
bins = [0,1,2,3,4,5,6,7,8,9,10,12,14,16]
hist_data = plt.hist(firsts, bins=bins, density=True, label='Tosses to first success')
# Reuse the same bin edges so the two histograms line up
bin_edges = hist_data[1]
hist_data = plt.hist(totals, bins=bin_edges, density=True, alpha=0.5, label='Total successes')
legend = plt.legend()