# Probability

## Three definitions

**1. The classical interpretation of probability**: arose from games of chance (for example, a fair coin has a 50% chance of landing heads).
- Each possible distinct result is called an **outcome**. An **event** is a collection of outcomes.
- When all outcomes are equally likely, the probability of an **event** $E$ under the classical interpretation is the ratio of the number of outcomes $N_e$ favorable to $E$ to the total number of possible outcomes $N$:
 

$P(E) = \frac{N_e}{N}$
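
As a minimal sketch of the classical definition, the snippet below enumerates the four equally likely outcomes of tossing two fair coins and counts those with exactly one head. The choice of event and the names `outcomes`, `favorable` and `p_classical` are illustrative assumptions, not part of the definition itself.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of tossing two fair coins
outcomes = [''.join(pair) for pair in product('HT', repeat=2)]

# Event E: exactly one head
favorable = [o for o in outcomes if o.count('H') == 1]

# Classical probability: P(E) = N_e / N
p_classical = Fraction(len(favorable), len(outcomes))
print(outcomes, favorable, p_classical)
```

Using `Fraction` keeps the ratio exact, which makes it easy to compare with a simulated relative frequency later.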

**2. Relative frequency concept of probability**: an empirical approach to probability.
- If an experiment is repeated a large number of times $n$ and event $E$ occurs on $n_e$ of these trials, then the probability of event $E$ is approximately:

$P(E) \cong \frac{n_e}{n}$
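
To illustrate why "a large number of times" matters, here is a small sketch that estimates the probability of exactly one head in a two-coin toss for increasing $n$. The helper `relative_frequency`, the seed, and the chosen sample sizes are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen arbitrarily, for reproducibility

def relative_frequency(n):
    """Toss two fair coins n times; return the share of trials with exactly one head."""
    tosses = rng.integers(0, 2, size=(n, 2))      # 1 = head, 0 = tail
    n_e = int((tosses.sum(axis=1) == 1).sum())    # trials where exactly one coin shows a head
    return n_e / n

estimates = {n: relative_frequency(n) for n in (10, 100, 10_000, 1_000_000)}
for n, estimate in estimates.items():
    print(n, estimate)
```

For small $n$ the estimate can be far from 0.5; it tends to settle near 0.5 as $n$ grows.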

**3. Personal or subjective probability**: used for one-off statements, where the experiment cannot be repeated.
- Problem: such probabilities vary from person to person and cannot be checked or verified.

In [1]:
# Draw a sample: 500 pairs of random digits, where an even digit stands for heads (H)
# and an odd digit for tails (T).
# There are five even digits (0, 2, 4, 6, 8) and five odd digits (1, 3, 5, 7, 9),
# so the probability of drawing an even digit is 50%.
# The 500 pairs therefore represent 500 tosses of two fair coins.

import pandas as pd
import numpy as np

y = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # 0 must be included, otherwise even digits are under-represented
x1 = np.random.choice(y, size=500, replace=True)  # first coin
x2 = np.random.choice(y, size=500, replace=True)  # second coin
x = np.c_[x1, x2]  # stack the two samples into a 500 x 2 array

In [2]:
# Now make a dataframe from this array:

df = pd.DataFrame(x, columns=['first coin', 'second coin'])
df.head()

| | first coin | second coin |
|---|---|---|
| 0 | 9 | 6 |
| 1 | 4 | 3 |
| 2 | 6 | 3 |
| 3 | 5 | 5 |
| 4 | 8 | 6 |


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 500 entries, 0 to 499
Data columns (total 2 columns):
first coin     500 non-null int64
second coin    500 non-null int64
dtypes: int64(2)
memory usage: 7.9 KB


In [4]:
# Now that the dataframe is set, label each digit of the first coin: even = 'H', odd = 'T'.
# The summary table with the counts of TT, TH, HT and HH follows further down.

df['first toss'] = np.where(df['first coin'] % 2 == 0, 'H', 'T')
df.head()

| | first coin | second coin | first toss |
|---|---|---|---|
| 0 | 9 | 6 | T |
| 1 | 4 | 3 | H |
| 2 | 6 | 3 | H |
| 3 | 5 | 5 | T |
| 4 | 8 | 6 | H |


In [5]:
# Same labelling for the second coin: even = 'H', odd = 'T'
df['second toss'] = np.where(df['second coin'] % 2 == 0, 'H', 'T')

In [6]:
df.head()

| | first coin | second coin | first toss | second toss |
|---|---|---|---|---|
| 0 | 9 | 6 | T | H |
| 1 | 4 | 3 | H | T |
| 2 | 6 | 3 | H | T |
| 3 | 5 | 5 | T | T |
| 4 | 8 | 6 | H | H |


In [7]:
# Combine the two tosses into one outcome label per row (TT, TH, HT or HH)
df['out'] = df['first toss'] + df['second toss']
df.head(10)

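| | first coin | second coin | first toss | second toss | out |
|---|---|---|---|---|---|
| 0 | 9 | 6 | T | H | TH |
| 1 | 4 | 3 | H | T | HT |
| 2 | 6 | 3 | H | T | HT |
| 3 | 5 | 5 | T | T | TT |
| 4 | 8 | 6 | H | H | HH |
| 5 | 1 | 5 | T | T | TT |
| 6 | 8 | 2 | H | H | HH |
| 7 | 4 | 7 | H | T | HT |
| 8 | 1 | 1 | T | T | TT |
| 9 | 4 | 2 | H | H | HH |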


In [8]:
#Now let's make a dataframe with the frequency per toss-type:
dfnew = pd.DataFrame(df['out'].value_counts()).reset_index()
dfnew

| | index | out |
|---|---|---|
| 0 | TT | 160 |
| 1 | TH | 128 |
| 2 | HT | 125 |
| 3 | HH | 87 |


In [9]:
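# Name the columns by position: newer pandas versions label the value_counts() output
# 'out' and 'count' instead of 'index' and 'out', so a rename by old name would miss.
dfnew.columns = ['type', 'frequency']
dfnew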

| | type | frequency |
|---|---|---|
| 0 | TT | 160 |
| 1 | TH | 128 |
| 2 | HT | 125 |
| 3 | HH | 87 |


In [10]:
# Add a relative frequency column: each frequency divided by the total number of tosses (500)
dfnew['relative frequency'] = dfnew['frequency'] / dfnew['frequency'].sum()
dfnew

| | type | frequency | relative frequency |
|---|---|---|---|
| 0 | TT | 160 | 0.32 |
| 1 | TH | 128 | 0.256 |
| 2 | HT | 125 | 0.25 |
| 3 | HH | 87 | 0.174 |
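
As an alternative sketch, pandas can produce relative frequencies in one step with `value_counts(normalize=True)`. The example below uses only the ten outcomes displayed in the In [7] output, so its numbers differ from the 500-row table; the variable names `out` and `relative` are chosen for illustration.

```python
import pandas as pd

# The ten outcomes shown in the In [7] output
out = pd.Series(['TH', 'HT', 'HT', 'TT', 'HH', 'TT', 'HH', 'HT', 'TT', 'HH'], name='out')

# normalize=True divides each count by the total number of rows
relative = out.value_counts(normalize=True)
print(relative)
```

This avoids hard-coding the sample size, since the denominator is taken from the data itself.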


## Assignment: 
Estimate the probability of observing exactly one head when two coins are tossed, using the 500 simulated tosses and the second definition of probability:
$P(E) \cong \frac{n_e}{n}$


where $n_e$ is the sum of the frequencies of 'HT' and 'TH', and $n = 500$.
Let's compute this.

In [11]:
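# Select the rows by label rather than by position: the row order of value_counts()
# depends on the counts, so positions 1 and 2 are not guaranteed to be 'TH' and 'HT'.
n_e = dfnew.loc[dfnew['type'].isin(['HT', 'TH']), 'frequency'].sum()
P = n_e / 500
print('Estimated probability of exactly one head when tossing two coins: ' + str(P))

Estimated probability of exactly one head when tossing two coins: 0.506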


## Conclusion:
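The estimated probability of 0.506 is close to 50%, the value the classical definition gives: two of the four equally likely outcomes (HT and TH) contain exactly one head.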
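
One hedged way to put a number on "close" is to compare the gap with the standard error of a relative frequency, $\sqrt{p(1-p)/n}$. This sketch assumes fair, independent coins, so that the number of tosses with exactly one head follows a binomial distribution with $p = 0.5$; the variable names are illustrative.

```python
import math

n = 500            # number of two-coin tosses in the experiment
p = 0.5            # classical probability of exactly one head (assumes fair coins)
observed = 0.506   # relative frequency reported in the output of In [11]

# Standard error of a relative frequency under the binomial assumption
standard_error = math.sqrt(p * (1 - p) / n)

# Gap between observation and theory, measured in standard errors
z = (observed - p) / standard_error
print(round(standard_error, 4), round(z, 2))
```

A gap of less than one standard error is well within what chance alone would produce in 500 tosses.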