Chance is a constant presence in our lives that we often take for granted. We look at weather forecasts and assume that what we see is what we'll get, only to be disappointed when a sunny day turns out to be overcast. What we don't appreciate is that these forecasts are educated guesses rather than certainties. One familiar place where we encounter and interact with chance is the casino, where we hope that Lady Luck will be on our side at the blackjack table. Some intrepid gamblers have found ways to tame luck and maximize their potential profits. These few are the ones who have a solid grasp of dealing with uncertainty.

Mathematicians have developed a framework for studying chance and uncertainty, called **probability**. Understanding probability is crucial to understanding its more advanced applications, such as spam filters and weather prediction.

A coin flip is the simplest example of a **random experiment**. A random experiment is not like a scientific experiment; it is better thought of as a basic action. It's called "random" because the action can produce different results that we cannot predict before the experiment is done. A coin has two sides, which we will refer to as "heads" and "tails". We cannot know whether a flipped coin will land on heads or tails before it hits the ground, so a coin flip qualifies as a random experiment.

The result of a random experiment is called an **outcome**, so both heads and tails are possible outcomes of a coin flip. With a random experiment, it is useful to know all the outcomes that can come out of it. For example, we know that a coin can only produce heads or tails. The collection of all the outcomes of a random experiment has a special name: the **sample space**. We use a coin flip as an example because its sample space has only two outcomes.

It's easy to get lost in all the jargon, so we've summarized the terms in the diagram below:

![image.png](attachment:image.png)

### Theoretical probabilities

A probability that we calculate from the sample space alone, without running the experiment, is called a **theoretical probability**. For a coin flip, the sample space has two outcomes, so the theoretical probability of heads is 1 out of 2. We say these probabilities are theoretical because they rest on an important assumption: that each outcome in the sample space is equally likely. In practice, this assumption is often too strong, or unrealistic, to make. Consider picking a random fruit at the grocery store. Most people will take some time to examine the fruit to make sure it isn't rotten in any way. This means that some fruit will be more likely to be picked than others, so the assumption of equal likelihood doesn't hold here.
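
The calculation for a single outcome can be sketched in R. This is a minimal illustration that assumes equally likely outcomes; the names `sample_space` and `theoretical_probability()` are hypothetical and not part of the lesson.

```r
# Sample space of a coin flip: every possible outcome, listed once.
sample_space <- c("heads", "tails")

# Hypothetical helper: probability of any single outcome, assuming
# all outcomes in the sample space are equally likely.
theoretical_probability <- function(space) {
  1 / length(space)
}

p_heads <- theoretical_probability(sample_space)
print(p_heads)  # 0.5
```

If the outcomes are not equally likely, as in the fruit example, this helper would give a misleading answer.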

**Task**

1. Find the theoretical probability of getting a 5 when rolling a six-sided die. 
2. Consider a lottery where 350,000 tickets have been sold and each ticket is equally likely to be the single winner. What is the theoretical probability that a given ticket wins?

**Answer**

`p_5 <- 1 / 6`

`p_lottery <- 1 / 350000`

So far, we have been careful in our use of the word "outcome". In probability terms, an outcome is a single result in the sample space. We saw that calculating the probability of an outcome is simple: all we need to know is the size of the sample space. To calculate the probability of several outcomes at once, we need the concept of an **event**.

An event can refer to either a single outcome or a collection of outcomes. Events are more general than outcomes and allow us to calculate more complex probabilities.

### Examples of events

* In a six-sided die roll, the event that we'll roll an odd number includes the outcomes 1, 3, and 5.
* In a deck of playing cards, the event that we'll draw an Ace is composed of 4 outcomes: the Ace of Spades, Hearts, Clubs and Diamonds.

We can also define events that include outcomes outside a particular sample space. For example, suppose we define our event to be "rolling a 7" when our sample space is the outcomes of a six-sided die roll. No matter how many times we roll the die, we'll never satisfy the event, so intuitively the probability of this event is 0.
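
The die examples above can be checked with a short sketch. It assumes equally likely outcomes, so the probability of an event is the share of the sample space that belongs to it; the helper `event_probability()` is a hypothetical name introduced for illustration.

```r
# Sample space of a six-sided die roll.
die <- 1:6

# Hypothetical helper: share of equally likely outcomes in the
# sample space that belong to the event.
event_probability <- function(event, space) {
  sum(space %in% event) / length(space)
}

p_odd   <- event_probability(c(1, 3, 5), die)  # 3 of the 6 outcomes
p_seven <- event_probability(7, die)           # 7 is not in the sample space

print(p_odd)    # 0.5
print(p_seven)  # 0
```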

**Task**

* A new deck of playing cards contains 52 cards. The deck contains 4 suits, or distinct symbols, and each suit has 13 playing cards. These 13 cards include: the ace, the numbers 2 through 10 and the three face cards: Jack, Queen and King.

* Using this knowledge, calculate the probabilities of the event that:

    - we draw a card with the Hearts suit.
    - we draw a card that shows a 2 or a 3.
    
**Answer**

`p_heart <- 13 / 52`

`p_2_or_3 <- 8 / 52`

Consider a curious question: if we can calculate a theoretical probability, how can we confirm that our calculation is correct? That is to say, is there a way to estimate the theoretical probability of a given event?

Our solution is the random experiment! A single experiment only results in a single outcome, but we can repeat the experiment multiple times. To estimate the probability of an event, we can repeat an experiment a certain number of times and count how many times we observe the event of interest.

As a simple example, let's say we're interested in estimating the probability of a coin landing on heads. To estimate the probability, we can take the following steps:

* Toss the coin many times, repeating the random experiment.
* Count the number of times the coin landed on heads.
* Divide the number of heads by the total number of times we tossed the coin.

We call this probability an **empirical probability**. Empirical refers to the idea that we concretely observe this probability, as opposed to merely calculating it theoretically. We can interpret an empirical probability as a relative frequency, the fraction of repetitions in which the event occurred, much as a theoretical probability is the fraction of equally likely outcomes that make up the event.

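For example, suppose we toss a coin 100 times and it lands on heads 56 times, giving an empirical probability of 56/100, or 56%. It's important to note that the number 56 is arbitrary. We could just as easily have gotten 40 heads, 47 heads or even 60 heads. Each run of the experiment can yield a different empirical probability, but only one result, `50 heads`, matches our theoretical probability of 50% exactly.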
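
The three steps for estimating a probability can be sketched in R on a recorded sequence of tosses. The `tosses` vector below is made-up illustration data, not a result from this lesson, and the variable names are hypothetical.

```r
# Made-up record of 10 coin tosses (illustration only).
tosses <- c("H", "T", "H", "H", "T", "H", "T", "H", "T", "H")

n_tosses <- length(tosses)               # step 1: how many times we tossed
n_heads  <- sum(tosses == "H")           # step 2: how many tosses were heads
p_heads_empirical <- n_heads / n_tosses  # step 3: relative frequency

print(p_heads_empirical)  # 0.6
```

A different record of tosses would generally give a different value, which is exactly why the estimate is called empirical.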

**Task**

We rolled a six-sided die 100 times and got the number three 18 times. Calculate the empirical probability of rolling a three.

**Answer**

`p_three <- 18 / 100`

Instead of flipping a physical coin many times, we can simulate coin flips using R. Writing simulations is an essential tool in a data scientist's toolkit.

We'll start by writing a function named `coin_toss()` to simulate a single coin toss:

`set.seed(1)`

`coin_toss <- function() {
    toss <- runif(1)
    if (toss <= 0.5) {
        return("HEADS")
    } else {
        return("TAILS")
    }
}`

We use the `runif()` function to emulate a coin flip: `runif(1)` creates one random number between 0 and 1, with every value in that range equally likely. We've assumed that the theoretical probability of heads is 50%, so if the random number is less than or equal to 0.5, the function returns "HEADS". Otherwise, it returns "TAILS".
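
To see how the threshold turns random numbers into coin sides, here is a small sketch that applies the same rule as `coin_toss()` to several draws at once. The variable names `draws` and `sides` are illustrative, and `ifelse()` stands in for the `if`/`else` inside the function.

```r
set.seed(1)  # fix the seed so the same draws appear on every run

draws <- runif(5)  # five random numbers between 0 and 1
sides <- ifelse(draws <= 0.5, "HEADS", "TAILS")  # same threshold as coin_toss()

# Show each draw next to the side it maps to.
print(data.frame(draw = round(draws, 3), side = sides))
```

Roughly half of the interval from 0 to 1 lies at or below 0.5, which is why each side is expected to appear about half the time.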

**Task**

* Write a for-loop to repeat the coin toss 10 times.
    - If the toss returns "HEADS", increment a counter of heads.
* Use this loop of "coin flips" to calculate the empirical probability of getting heads.
* Using a separate variable, perform another set of 100 coin tosses and calculate another empirical probability of getting heads.
    - The only variable that changes here is the number of coin tosses. 
    
**Answer**

`heads <- 0`

`n_experiments <- 10`

`for (i in 1:n_experiments) {
    toss <- coin_toss()
    if (toss == "HEADS") {
        heads <- heads + 1
    }
}`

`experiment_one <- heads / n_experiments`


`n_experiments <- 100`

`heads_2 <- 0`

`for (i in 1:n_experiments) {
    toss <- coin_toss()
    if (toss == "HEADS") {
        heads_2 <- heads_2 + 1
    }
}`

`experiment_two <- heads_2 / n_experiments`

We will usually find that the probability in `experiment_two` is closer to the theoretical probability than the one in `experiment_one`, although any single run can buck the trend.

The more times we repeat the experiment, the closer the empirical probability of getting heads tends to get to the theoretical probability. This result comes from one of the most important laws in probability: the **Law of Large Numbers**. The Law of Large Numbers is what links our empirical probabilities to our theoretical probabilities. We won't go into its technical details, but this law is what makes simulations and programming so powerful.

Notice that for a small number of tosses, the empirical probability can differ a lot from the theoretical probability. As the number of coin tosses increases, this deviation from the theoretical probability tends to shrink.

Often, there will be cases where we can't calculate the theoretical probability to check against an empirical probability. In these cases, we can take some solace in the fact that running a properly set up simulation many, many times will produce a good estimate of the theoretical probability.

**Task**

The Law of Large Numbers can take effect quickly. This means that we won't need to repeat our experiment an extreme number of times before we start approximating the theoretical probability well. We can demonstrate this using simulation.

* Write a for-loop to calculate the empirical probability of getting a heads using 10 coin flips. Subtract this probability from the theoretical probability.
* Repeat the task above using 100 coin flips. Subtract this probability from the theoretical probability.
* Finally, repeat the task above using 1000 coin flips. Subtract this probability from the theoretical probability.
    - Do we notice any trends between the differences and the number of coin flips used?
    
**Answer**

`heads <- 0`

`n_experiments <- 10`

`for (i in 1:n_experiments) {
    toss <- coin_toss()
    if (toss == "HEADS") {
        heads <- heads + 1
    }
}`

`experiment_diff_one <- 0.5 - (heads / n_experiments)`


`n_experiments <- 100`

`heads_2 <- 0`

`for (i in 1:n_experiments) {
    toss <- coin_toss()
    if (toss == "HEADS") {
        heads_2 <- heads_2 + 1
    }
}`

`experiment_diff_two <- 0.5 - (heads_2 / n_experiments)`

`n_experiments <- 1000`

`heads_3 <- 0`

`for (i in 1:n_experiments) {
    toss <- coin_toss()
    if (toss == "HEADS") {
        heads_3 <- heads_3 + 1
    }
}`

`experiment_diff_three <- 0.5 - (heads_3 / n_experiments)`
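
The trend across the three differences can also be read off a single simulation. The sketch below is an alternative to the loops above, not the lesson's own answer: it uses `cumsum()` to track the running proportion of heads, and the names `is_heads`, `running_prob` and `checkpoints` are illustrative.

```r
set.seed(1)  # fix the seed so the run is reproducible

n_tosses <- 10000
# TRUE marks a head; each toss is a head with probability 0.5.
is_heads <- runif(n_tosses) <= 0.5

# Empirical probability of heads after 1, 2, ..., n_tosses tosses.
running_prob <- cumsum(is_heads) / seq_len(n_tosses)

# Inspect the estimate and its distance from 0.5 at a few checkpoints.
checkpoints <- c(10, 100, 1000, 10000)
print(data.frame(
  tosses    = checkpoints,
  empirical = running_prob[checkpoints],
  abs_diff  = abs(running_prob[checkpoints] - 0.5)
))
```

The absolute difference is not guaranteed to shrink at every checkpoint in a given run, but over many tosses it tends toward zero, as the Law of Large Numbers suggests.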