# Introduction to Probability

This notebook accompanies the Introduction to Probability lecture. It contains simulations and visualizations for several examples of probability theory in practice, with the aim of building intuition for how probability works.

In [None]:
import random
import numpy as np

# Auto-setup for Colab
import os

if 'google.colab' in str(get_ipython()):
    if os.path.exists('stats_for_cs'):
        !rm -rf stats_for_cs
    !git clone https://github.com/uio-bmi/stats_for_cs.git
    %cd stats_for_cs

from util import plot_event_probabilities

## Example 1: a coin toss

In [None]:
def coin():
    return random.choice(['head', 'tail'])

coin()

In [None]:
# random.random?

## Example 2: rolling a die

What if instead of a coin, we had a die? What would the simulation look like then?

In [2]:
# TODO: write the same simulation here, but instead of tossing a coin, the simulation should roll a 6-sided die
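
One possible way to approach this exercise, shown only as a sketch: `random.randint(1, 6)` returns an integer from 1 to 6 inclusive, each with equal probability. The function name `roll_die` and its `sides` parameter are illustrative choices, not part of the lecture code.

```python
import random

def roll_die(sides=6):
    """Return one roll of a fair die with the given number of sides."""
    # randint includes both endpoints, so this yields 1..sides
    return random.randint(1, sides)

# Roll many times and check that each face shows up about 1/6 of the time
rolls = [roll_die() for _ in range(60_000)]
face_frequencies = {face: rolls.count(face) / len(rolls) for face in range(1, 7)}
print(face_frequencies)
```

With this many rolls, every frequency should land close to 1/6 ≈ 0.167.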



## Example 3: weighted coin

Going back to the coin example: what if we somehow tamper with the coin, so that heads and tails are no longer equally likely? We make a weighted coin and look into its behavior.

In [None]:
# TODO: write a function to simulate a weighted coin with probability of getting head 0.3

# hint: random() function

def weighted_coin(p_head=0.3):
    pass
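
A minimal sketch of one way to follow the hint, assuming the coin is parameterized by a single probability `p_head`. The name `weighted_coin_sketch` is hypothetical and chosen so that it does not shadow the exercise function.

```python
import random

def weighted_coin_sketch(p_head=0.3):
    """Return 'head' with probability p_head, otherwise 'tail'."""
    # random.random() is uniform on [0, 1), so it falls below p_head
    # with probability p_head
    return 'head' if random.random() < p_head else 'tail'

# Empirical check: the fraction of heads should be close to p_head
tosses = [weighted_coin_sketch(0.3) for _ in range(50_000)]
head_frequency = tosses.count('head') / len(tosses)
print(head_frequency)
```

Calling the same function with `p_head=0.5` should give a fair coin again.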

In [None]:
# what if p_head = 0.5? - repeat the simulation above to see

## Example 4: multiple coin tosses 

If we toss a coin 3 times, what is the probability that we get exactly 2 heads?

What are the possible outcomes here, what are the events?
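
For a fair coin, the outcomes can be listed explicitly, and the event "exactly 2 heads" is a subset of them. The sketch below enumerates the sample space with `itertools.product`; it assumes all outcomes are equally likely, which only holds for a fair coin.

```python
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely outcomes of three fair tosses
outcomes = list(product('HT', repeat=3))

# The event "exactly 2 heads" is the set of outcomes containing two 'H'
exactly_two_heads = [outcome for outcome in outcomes if outcome.count('H') == 2]

probability = Fraction(len(exactly_two_heads), len(outcomes))
print(exactly_two_heads)
print(probability)  # 3/8
```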

In [None]:
def toss_coin_n_times(n):
    outputs = None
    
    # TODO: write code here
    
    return outputs

number_of_trials = None 

toss_coin_n_times(number_of_trials)

How can we now estimate the probabilities of such events? What would the simulation look like?

In [None]:
def simulate_event_probabilities(p_head, n, num_experiments):
    event_probabilities = None
    
    # TODO: write code here to estimate how often we get each of the outcomes
    
    return event_probabilities

Estimating a probability distribution from repeated random experiments like this is called Monte Carlo simulation.

We can also show the results of the simulation graphically:

In [None]:
p_head = None
n = None
num_experiments = None

event_probabilities = simulate_event_probabilities(p_head, n, num_experiments)

plot_event_probabilities(event_probabilities)

## Example 5: going back to the betting example

If we throw a fair coin 10 times, what is the probability that we get 5 heads and 5 tails? Write the code to simulate this experiment and empirically estimate its probability.

Some thinking points:
- what is the event in this case?
- what are possible outcomes?

In [None]:
# TODO: Write the code here and estimate the probability of obtaining exactly 5 heads and 5 tails in 10 coin tosses
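
One way this estimate could be set up, as a sketch rather than the intended solution: count how often exactly 5 of the 10 tosses are heads. The function name and its parameters are assumptions made for illustration. The exact value for a fair coin follows from counting: 252 of the 1024 equally likely toss sequences contain exactly 5 heads.

```python
import math
import random

def estimate_exact_heads(num_heads, n, num_experiments, p_head=0.5):
    """Estimate P(exactly num_heads heads in n tosses) by simulation."""
    hits = 0
    for _ in range(num_experiments):
        # Each comparison is True (counted as 1) with probability p_head
        heads = sum(random.random() < p_head for _ in range(n))
        if heads == num_heads:
            hits += 1
    return hits / num_experiments

estimate = estimate_exact_heads(num_heads=5, n=10, num_experiments=50_000)
exact = math.comb(10, 5) / 2**10  # 252 / 1024
print(estimate, exact)
```

The estimate should be close to 0.246, which is noticeably lower than the 50% one might guess intuitively.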



# Introduction to Probability: Part 2

This part of the notebook accompanies the second part of the Introduction to Probability lecture and includes examples, different probability distributions, and an introduction to the concept of conditional probability.

In [None]:
import random
import numpy as np
import re
import plotly.express as px

## Example 1

You roll a die three times. What is the probability that the sum of the three results is less than 12?

In [None]:
def estimate_probability_of_sum(target_sum, num_experiments):
    
    # TODO: estimate the probability
    pass

estimate_probability_of_sum(12, 100)
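
A possible sketch of this estimate, assuming "less than 12" means a strict inequality. The function name `estimate_sum_below` and the `num_dice` parameter are hypothetical. Because three dice have only 216 equally likely outcomes, the exact value can also be obtained by enumeration and used as a check.

```python
import random
from itertools import product

def estimate_sum_below(target_sum, num_experiments, num_dice=3):
    """Estimate P(sum of num_dice fair dice < target_sum) by simulation."""
    hits = 0
    for _ in range(num_experiments):
        total = sum(random.randint(1, 6) for _ in range(num_dice))
        if total < target_sum:
            hits += 1
    return hits / num_experiments

# Exact value by enumerating all 6**3 = 216 equally likely outcomes
all_rolls = list(product(range(1, 7), repeat=3))
exact = sum(1 for roll in all_rolls if sum(roll) < 12) / len(all_rolls)

print(estimate_sum_below(12, 50_000), exact)
```

With only 100 experiments, as in the call above, the estimate can easily be off by several percentage points; more experiments bring it closer to the exact value of 135/216 = 0.625.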

## Example 2

Three card players play a series of matches. The probability that player 1 will win any game is 30%, the probability that player 2 will win is 50% and the probability that the third player wins is 20%. If they play 6 games, what is the probability that player 1 wins at least 2 games?


In [None]:
def estimate_winner_probability(player1_p, player2_p, player3_p, num_games, num_experiments):
    
    # TODO: estimate the probability
    pass

estimate_winner_probability(0.3, 0.5, 0.2, 6, 10000)
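
One way to sketch this simulation is to draw the winner of each game with `random.choices`, which accepts per-item weights. The function name and signature below are assumptions and differ from the exercise stub. Since only player 1's wins matter, the exact value follows from the complement of "0 or 1 wins" with success probability 0.3.

```python
import random

def estimate_player1_at_least(min_wins, win_probs, num_games, num_experiments):
    """Estimate P(player 1 wins at least min_wins of num_games games)."""
    players = list(range(1, len(win_probs) + 1))
    hits = 0
    for _ in range(num_experiments):
        # Draw the winner of every game in one series
        winners = random.choices(players, weights=win_probs, k=num_games)
        if winners.count(1) >= min_wins:
            hits += 1
    return hits / num_experiments

estimate = estimate_player1_at_least(2, [0.3, 0.5, 0.2], num_games=6, num_experiments=50_000)

# Exact value: 1 - P(0 wins) - P(1 win) for player 1
exact = 1 - 0.7**6 - 6 * 0.3 * 0.7**5
print(estimate, exact)
```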
    

# Different distributions

In Session 1, we simulated tossing a coin 3 times to get the probability of getting different numbers of heads. This was plotted as a histogram, a way to connect the number of heads (X) with the frequency of obtaining them (Y).


In [None]:
def toss_coin_n_times(p_head, n):    
    return ['H' if random.random() < p_head else 'T' for _ in range(n)]

def simulate_event_probabilities(p_head, n, num_experiments):
    event_counts = {i: 0 for i in range(n+1)}
    
    for experiment in range(num_experiments):
        outcome = toss_coin_n_times(p_head, n)
        event_counts[outcome.count('H')] += 1
        
    event_probabilities = {f'H{event}': count / num_experiments for event, count in event_counts.items()}
    
    return event_probabilities

def plot_event_probabilities(event_probabilities):
    fig = px.bar(x=list(event_probabilities.keys()), y=list(event_probabilities.values()), labels={'x': 'event', 'y': 'probability'})
    
    fig.show()

event_probabilities = simulate_event_probabilities(p_head=0.5, n=3, num_experiments=1000)
print(event_probabilities)

plot_event_probabilities(event_probabilities)

This can also be computed with the binomial distribution formula, which gives the probability of getting k successes in n independent trials when the probability of success in each trial is p:

In [None]:
import math

def head_count_prob(p_head, n, head_count):
    # math.comb(n, k) is the binomial coefficient n! / (k! * (n - k)!)
    return math.comb(n, head_count) * p_head**head_count * (1 - p_head)**(n - head_count)

event_probabilities_from_formula = {
    f'H{head_count}': head_count_prob(p_head=0.5, n=3, head_count=head_count) for head_count in range(4)
}

print(event_probabilities_from_formula)
plot_event_probabilities(event_probabilities_from_formula)

Or using a library function that implements the formula:


In [None]:
from scipy.stats import binom

def compute_event_probabilities_from_formula(p_head, n):
    return {
        f'H{head_count}': round(binom.pmf(k=head_count, n=n, p=p_head), 4) for head_count in range(n+1)
    }

event_probabilities_from_formula = compute_event_probabilities_from_formula(p_head=0.5, n=3)

print(event_probabilities_from_formula)
plot_event_probabilities(event_probabilities_from_formula)

And if we combine the plots:


In [None]:
import plotly.graph_objects as go

def plot_probability_comparison(event_probabilities_simulation, event_probabilities_formula):
    fig = go.Figure(data=[
        go.Bar(name='formula', x=list(event_probabilities_formula.keys()), y=list(event_probabilities_formula.values())),
        go.Bar(name='simulation', x=list(event_probabilities_simulation.keys()), y=list(event_probabilities_simulation.values()))])
    fig.update_layout(barmode='group')
        
    fig.show()
    
p_head, n, num_experiments = 0.5, 3, 1000
event_probabilities_sim = simulate_event_probabilities(p_head, n, num_experiments)
event_probabilities_formula = compute_event_probabilities_from_formula(p_head, n)

plot_probability_comparison(event_probabilities_sim, event_probabilities_formula)

## What happens if we toss a coin many times: approaching continuous distributions


In [None]:
p_head, n, num_experiments = 0.5, 100, 1000
event_probabilities_sim = simulate_event_probabilities(p_head, n, num_experiments)
plot_event_probabilities(event_probabilities_sim)
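
As the number of tosses grows, the bar chart starts to resemble a bell curve. The sketch below compares the exact binomial probabilities for n = 100 with a normal density that has the same mean `n * p` and standard deviation `sqrt(n * p * (1 - p))`. The helper names are hypothetical, and the comparison is meant as an illustration of the approximation, not a proof.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

n, p = 100, 0.5
mean = n * p                      # 50
std = math.sqrt(n * p * (1 - p))  # 5

# Largest absolute difference between the two curves over all head counts
max_gap = max(abs(binomial_pmf(k, n, p) - normal_pdf(k, mean, std)) for k in range(n + 1))
print(max_gap)
```

The largest difference is small compared with the peak probability of roughly 0.08, which is why the discrete distribution looks almost continuous in the plot.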

## Example problem: Monopoly


In the game of Monopoly, one moves one's marker around a board containing 40 fields, by throwing a pair of dice every turn. What is the probability of finishing your first round on your fifth turn?

<img src="https://images.unsplash.com/photo-1640461470346-c8b56497850a?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=1674&q=80" style="height:280px; float: left; margin-top: 10px; margin-right: 10px" />

Alternatively:

What is the probability that a running total passes 40 for the first time on exactly the fifth draw, when each draw adds the sum of a pair of (2) randint(1, 6) calls?


Write code to simulate this and estimate the probability.


In [None]:
from random import randint

# TODO: write the code here
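
A sketch of one possible simulation, under simplifying assumptions that are not stated in the problem: a round counts as finished once the running total reaches 40 or more, and all other Monopoly rules (extra turns for doubles, jail, cards) are ignored. The function name and parameters are hypothetical.

```python
from random import randint

def estimate_round_finished_on_turn(turn, num_experiments, board_size=40):
    """Estimate P(the running total first reaches board_size on the given turn)."""
    hits = 0
    for _ in range(num_experiments):
        position = 0
        finished_on = None
        for current_turn in range(1, turn + 1):
            position += randint(1, 6) + randint(1, 6)  # one throw of two dice
            if position >= board_size:
                finished_on = current_turn
                break
        # Count only games where the round ends on exactly this turn,
        # not earlier
        if finished_on == turn:
            hits += 1
    return hits / num_experiments

print(estimate_round_finished_on_turn(turn=5, num_experiments=50_000))
```

Stopping at the first turn that reaches the threshold matters here, because four throws can already total 40 or more.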

## Example problem: computing the probability of a letter in a text


In [None]:
original_text = """To be, or not to be, that is the question, Whether 'tis nobler in the mind to suffer The slings and arrows of outrageous fortune,Or to take arms against a sea of troubles,And by opposing end them? To die: to sleep;No more; and by a sleep to say we endThe heart-ache and the thousand natural shocksThat flesh is heir to, 'tis a consummationDevoutly to be wish'd. To die, to sleep;To sleep: perchance to dream: ay, there's the rub;For in that sleep of death what dreams may comeWhen we have shuffled off this mortal coil,Must give us pause: there's the respectThat makes calamity of so long life;For who would bear the whips and scorns of time,The oppressor's wrong, the proud man's contumely,The pangs of despised love, the law's delay,The insolence of office and the spurnsThat patient merit of the unworthy takes,When he himself might his quietus makeWith a bare bodkin? who would fardels bear,To grunt and sweat under a weary life,But that the dread of something after death,The undiscover'd country from whose bournNo traveller returns, puzzles the willAnd makes us rather bear those ills we haveThan fly to others that we know not of?Thus conscience does make cowards of us all;And thus the native hue of resolutionIs sicklied o'er with the pale cast of thought,And enterprises of great pith and momentWith this regard their currents turn awry,And lose the name of action.--Soft you now!The fair Ophelia! Nymph, in thy orisonsBe all my sins remember'd."""
# Strip punctuation, spaces and newlines, then lowercase, leaving only letters
text = re.sub(r"[.,:' ;\n\-?!]+", "", original_text).lower()

# print(original_text)
# print(text)

def compute_marginal_probability(letter, text):
    probability = None
    
    # TODO: write code here to estimate P(letter) in the given text
        
    return probability

def compute_conditional_probability(letter, previous_letter, text):
    cond_probability = None
    
    # TODO: write code here to estimate P(letter | previous_letter) in the text
        
    return cond_probability

print(compute_marginal_probability("a", text))

print(compute_conditional_probability("a", "h", text))
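
The two quantities can be estimated by counting. The sketch below assumes that P(letter) is the relative frequency of the letter, and that P(letter | previous_letter) is the fraction of occurrences of `previous_letter` that are immediately followed by `letter`. The function names and the toy text are hypothetical and stand in for the exercise functions and the Hamlet passage.

```python
def marginal_probability_sketch(letter, text):
    """Relative frequency of letter in a non-empty text."""
    return text.count(letter) / len(text)

def conditional_probability_sketch(letter, previous_letter, text):
    """Fraction of previous_letter occurrences directly followed by letter."""
    # Every adjacent pair (previous, current) in the text
    pairs = zip(text, text[1:])
    followers = [current for previous, current in pairs if previous == previous_letter]
    if not followers:
        return None  # previous_letter never has a successor in this text
    return followers.count(letter) / len(followers)

sample = "abracadabra"  # toy text for illustration only
print(marginal_probability_sketch("a", sample))          # 5 of 11 letters are 'a'
print(conditional_probability_sketch("b", "a", sample))  # 0.5
```

Pairing each character with its successor avoids undercounting overlapping pairs such as "aa" in "aaa", which `str.count` would miss.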

## Example 3

We toss a fair coin 3 times. What is the probability that more heads than tails come up, given that the first toss is a head?

In [None]:
def estimate_more_heads_probability(num_experiments):

    # TODO: simulate and estimate the probability of getting more heads given that first toss is a head
    pass

estimate_more_heads_probability(100)
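
A conditional probability can be estimated by keeping only the experiments in which the condition holds and then measuring how often the event occurs among them. The following is a sketch under that approach; the function name is hypothetical. For a fair coin the exact answer is 3/4, because after a first head at least one of the two remaining tosses must be a head.

```python
import random

def estimate_more_heads_given_first_head(num_experiments, n=3):
    """Estimate P(more heads than tails | first toss is a head)."""
    conditioning_count = 0  # experiments where the first toss is a head
    joint_count = 0         # of those, experiments with more heads than tails
    for _ in range(num_experiments):
        tosses = [random.choice('HT') for _ in range(n)]
        if tosses[0] != 'H':
            continue  # condition not met, discard this experiment
        conditioning_count += 1
        heads = tosses.count('H')
        if heads > n - heads:
            joint_count += 1
    # Assumes at least one experiment started with a head
    return joint_count / conditioning_count

print(estimate_more_heads_given_first_head(50_000))
```

Only about half of the experiments satisfy the condition, so the effective sample size is roughly half of `num_experiments`.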
    