# Probability Review

http://onlinestatbook.com/2/probability/probability_intro.html

In many situations the outcome is not definite, so it cannot be predicted with total certainty. What we can say is how likely each outcome is, using the idea of probability. Inferential statistics is built on the foundation of probability theory and has been remarkably successful in guiding opinion about the conclusions to be drawn from data.

One conception of probability is drawn from the idea of symmetrical outcomes. For example, the 2 possible outcomes of tossing a fair coin seem not to be distinguishable in any way that affects which side will land up or down. Therefore, the probability of heads is taken to be 1/2 (same as tails). In general, if there are N symmetrical outcomes, the probability of any given one of them occurring is taken to be 1/N. Thus, if a six-sided die is rolled, the probability of any one of the six sides coming up is 1/6. 
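The 1/N rule can be checked numerically. The following is a minimal sketch in base R, assuming a fair six-sided die; the variable names (`n_sides`, `p_each`, `rolls`, `freq`) and the seed are illustrative choices, not part of the original text.

```r
# Symmetry view: N symmetrical outcomes each get probability 1/N.
n_sides <- 6
p_each <- rep(1 / n_sides, n_sides)   # one probability per face
names(p_each) <- 1:n_sides
sum(p_each)                           # the probabilities add up to 1

# Simulated check: with many fair rolls, each face should come up
# in roughly 1/6 of the rolls.
set.seed(1)                           # arbitrary seed, only for reproducibility
rolls <- sample(1:n_sides, size = 60000, replace = TRUE)
freq <- table(rolls) / length(rolls)
round(freq, 3)
```

The simulated proportions will not equal 1/6 exactly; they only approach it as the number of rolls grows.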

Another conception of probability is based on relative frequencies. If oil traded at about $100 a barrel on 70% of the days in the last year, one might conclude that the probability of it trading around $100 next year is 70%. This is a common conclusion, but it could be unreasonable if more data is available for deciding whether oil will trade at $100 a barrel tomorrow. For example, if supply increased in the last day, the price would likely fall, so we should consider only the days in the last year on which oil production matched today's. Even this information is not enough, since oil prices also depend on consumption (prices tend to fall when consumption is low). So we should consider only the prior occurrences with matching oil production and similar consumption. As we keep adding factors that affect the outcome, the sample of matching prior cases shrinks and will soon be reduced to the empty set.
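The shrinking-sample problem can be illustrated with a small sketch. All of the data below is synthetic: the 250 trading days, the three-level `supply` and `consumption` factors, and the assumed "today" conditions are hypothetical and only serve to show how each added condition leaves fewer matching days.

```r
# Synthetic data only: 250 hypothetical trading days with made-up factors.
set.seed(42)                          # arbitrary seed, only for reproducibility
n_days <- 250
levels3 <- c("low", "mid", "high")
days <- data.frame(
  near_100    = runif(n_days) < 0.7,  # TRUE on roughly 70% of days
  supply      = sample(levels3, n_days, replace = TRUE),
  consumption = sample(levels3, n_days, replace = TRUE)
)

# Relative frequency over all days
mean(days$near_100)

# Keep only days matching today's (assumed) conditions, one factor at a time
match_supply <- days[days$supply == "high", ]
match_both   <- match_supply[match_supply$consumption == "low", ]

# Number of prior cases left after each restriction
c(all = nrow(days),
  supply = nrow(match_supply),
  supply_and_consumption = nrow(match_both))
```

With only two factors the sample is already a fraction of its original size; every further factor cuts it again.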

In some cases, probability should be thought of as subjective. Questions such as "What is the probability that Starbucks coffee is better than McDonald's coffee?" can't be answered using the symmetry or frequency approaches to probability. Assigning a probability of 0.9 to this event, for example, reflects the speaker's personal opinion. Such an approach to probability, however, seems to lose the objective content of the idea of chance; probability becomes mere opinion.

### Sample Spaces

For a random experiment E, the set of all possible outcomes of E is called the sample space and is denoted by the letter S. 
For a coin-toss experiment, S consists of the outcomes “Head” and “Tail”, which we may write as $S = \{H, T\}$. Formally, the performance of a random experiment is the unpredictable selection of an outcome in S.

The R package `prob` provides functions for setting up sample spaces and finding the probabilities of basic events.
A sample space is (usually) represented by a data frame.
Each row of the data frame corresponds to an outcome of the experiment.

Consider the random experiment of tossing a coin.
The outcomes are H and T. 
We can set up the sample space quickly with the `tosscoin` function:

```r
library(prob)
```

```
Loading required package: combinat

Attaching package: 'combinat'

The following object is masked from 'package:utils':

    combn

Attaching package: 'prob'

The following objects are masked from 'package:base':

    intersect, setdiff, union
```



```r
tosscoin(1)
```

| toss1 |
|-------|
| H     |
| T     |


The argument `1` tells `tosscoin` that we want to toss the coin only once. We could toss it more times, for example `tosscoin(3)`, which gives the output below.

```r
tosscoin(3)
```

| toss1 | toss2 | toss3 |
|-------|-------|-------|
| H     | H     | H     |
| T     | H     | H     |
| H     | T     | H     |
| T     | T     | H     |
| H     | H     | T     |
| T     | H     | T     |
| H     | T     | T     |
| T     | T     | T     |
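
The three-toss sample space above has one row per outcome, so it contains 2 × 2 × 2 = 8 rows. As a rough cross-check that does not depend on `prob`, the same table can be sketched in base R with `expand.grid`; the names `sides` and `space3` are illustrative and do not come from the package.

```r
# Base-R sketch of the three-toss sample space (no prob package needed).
sides <- c("H", "T")
space3 <- expand.grid(toss1 = sides, toss2 = sides, toss3 = sides,
                      stringsAsFactors = FALSE)

space3              # one row per outcome, first column varying fastest
nrow(space3)        # number of outcomes: 2^3

# If the outcomes are treated as symmetrical, each one gets probability 1/N.
1 / nrow(space3)
```

Each additional toss doubles the number of rows, so the sample space for n tosses has 2^n outcomes.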
