# What is probability?


Probability is the chance that something will happen.
- When all outcomes are equally likely, the probability of an event E is $P(E) = n(E) / n(T)$: the number of ways the event can happen, n(E), divided by the total number of possible outcomes, n(T).
- Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1, where 0 indicates impossibility and 1 indicates certainty.
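
As a small illustration of the counting formula, the sketch below computes the probability of rolling an even number with a fair six-sided die. The die example and the helper name `classical_probability` are assumptions made for illustration, and the formula only applies when every outcome is equally likely.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(E) = n(E) / n(T), assuming every outcome is equally likely."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

sample_space = {1, 2, 3, 4, 5, 6}   # all faces of a fair die
even = {2, 4, 6}                    # the event "roll is even"

p_even = classical_probability(even, sample_space)
print(p_even, float(p_even))        # 1/2 0.5
```

Using `Fraction` keeps the result exact, which makes the link to the counting formula easy to see.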

# Why is probability important?

- Uncertainty and randomness occur in many aspects of our daily life and having a good knowledge of probability helps us make sense of these uncertainties. 
- Learning about probability helps us make informed judgments on what is likely to happen, based on a pattern of data collected previously or an estimate.

# How is probability used in data science?
- Data science often uses statistical inference to predict or analyze trends from data, and statistical inference relies on probability distributions of the data. Hence, knowing probability and its applications is important for working effectively on data science problems.

# Conditional probability

## 1. What is conditional probability? 

- Conditional probability is a measure of the probability of an event (some particular situation occurring) given that (by assumption, presumption, assertion or evidence) another event has occurred.

![image.png](attachment:image.png)

$P(A \text{ and } B) = P(A) \cdot P(B \mid A)$: the probability that both A and B occur. The simpler product $P(A) \cdot P(B)$ holds only when A and B are independent.
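
A minimal sketch of conditional probability on a single roll of a fair die, using the definition $P(A \mid B) = P(A \text{ and } B) / P(B)$. The events chosen here (A = "roll is even", B = "roll is greater than 3") and the helper `prob` are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

p_a_and_b = prob(A & B)             # outcomes in both A and B: {4, 6}
p_a_given_b = p_a_and_b / prob(B)   # P(A|B) = P(A and B) / P(B)

print("P(A and B) =", p_a_and_b)    # 1/3
print("P(A|B)     =", p_a_given_b)  # 2/3
print("P(A)*P(B)  =", prob(A) * prob(B))  # 1/4, so A and B are not independent
```

Because $P(A) \cdot P(B)$ differs from $P(A \text{ and } B)$ here, knowing that B happened changes the probability of A.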

Ref:
- https://www.mathsisfun.com/data/probability-events-conditional.html
- https://www.analyticsvidhya.com/blog/2017/03/conditional-probability-bayes-theorem/

## 2. How is conditional probability used in data science?
Many data science techniques (e.g. Naive Bayes) rely on Bayes' theorem. Bayes' theorem is a formula that describes how to update the probabilities of hypotheses when given evidence.
![image.png](attachment:image.png)

- The Bayes theorem describes the probability of an event based on the prior knowledge of the conditions that might be related to the event. 

We can write the conditional probability as $P(A \mid B)$, the probability that event A occurs given that B has already happened.

If we know the conditional probability $P(A \mid B)$, we can use Bayes' rule to find the reverse probability $P(B \mid A)$.
 ![image.png](attachment:image.png)
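$P(A \cap B) = P(A \mid B) \cdot P(B) = P(B \mid A) \cdot P(A)$, which reduces to $P(A) \cdot P(B)$ only when A and B are independent.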

```python
# calculate P(A|B) given P(A), P(B|A), P(B|not A)
def bayes_theorem(p_a, p_b_given_a, p_b_given_not_a):
    # calculate P(not A)
    not_a = 1 - p_a
    # calculate P(B) with the law of total probability
    p_b = p_b_given_a * p_a + p_b_given_not_a * not_a
    # calculate P(A|B) with Bayes' theorem
    p_a_given_b = (p_b_given_a * p_a) / p_b
    return p_a_given_b

# P(A)
p_a = 0.0002
# P(B|A)
p_b_given_a = 0.85
# P(B|not A)
p_b_given_not_a = 0.05
# calculate P(A|B)
result = bayes_theorem(p_a, p_b_given_a, p_b_given_not_a)
# summarize
print('P(A|B) = %.3f%%' % (result * 100))
```

Output:

```
P(A|B) = 0.339%
```


# Random variables



A random variable is a set of possible values from a random experiment.
![image.png](attachment:image.png)

- A random variable (random quantity, aleatory variable, or stochastic variable) is a variable whose possible values are outcomes of a random phenomenon.
- Random variables can be discrete or continuous. Discrete random variables can only take certain values while continuous random variables can take any value (within a range).
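
To make the discrete/continuous distinction concrete, here is a small sketch that draws samples from one random variable of each kind. The two examples (number of heads in three coin tosses, and a waiting time between 0 and 10 minutes) are assumptions chosen for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so repeated runs give the same samples

def heads_in_three_tosses():
    """Discrete random variable: can only take the values 0, 1, 2 or 3."""
    return sum(rng.choice([0, 1]) for _ in range(3))

def waiting_time():
    """Continuous random variable: any real value between 0 and 10 minutes."""
    return rng.uniform(0.0, 10.0)

discrete_samples = [heads_in_three_tosses() for _ in range(10)]
continuous_samples = [waiting_time() for _ in range(5)]

print("discrete:  ", discrete_samples)
print("continuous:", [round(t, 3) for t in continuous_samples])
```

The discrete samples repeat a handful of values, while the continuous samples are (almost surely) all different.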

## Probability distributions
The probability distribution for a random variable describes how the probabilities are distributed over the values of the random variable.

- For a discrete random variable, x, the probability distribution is defined by a probability mass function (PMF), denoted by f(x). This function provides the probability for each value of the random variable.
- For a continuous random variable, there are infinitely many values in any interval, so we consider the probability that the variable lies within a given interval instead. Here, the probability distribution is defined by a probability density function (PDF), also denoted by f(x).

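Probability functions must satisfy two requirements:
- $f(x) \geq 0$ for all $x$
- $\sum_x f(x) = 1$ for a discrete random variable, summing over all values x it can take; for a continuous random variable the sum becomes an integral, $\int f(x)\,dx = 1$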
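
For a discrete random variable, the two requirements can be checked directly. The sketch below does so for a PMF stored as a dictionary; the helper name `is_valid_pmf`, the tolerance, and the example distributions are assumptions made for illustration.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Check that every probability is non-negative and that they sum to 1."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

# Number of heads in two fair coin tosses
two_coin_heads = {0: 0.25, 1: 0.5, 2: 0.25}
# Probabilities that add up to 1.1, so not a valid PMF
not_a_pmf = {0: 0.5, 1: 0.6}

print(is_valid_pmf(two_coin_heads))  # True
print(is_valid_pmf(not_a_pmf))       # False
```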

## Types of probability distributions

### 1. Binomial distribution 
A binomial experiment is a statistical experiment that has the following properties:
- The experiment consists of n repeated, independent trials.
- Each trial can result in just two possible outcomes (success or failure).
- The probability of success, denoted by P, is the same on every trial.

The binomial distribution describes the number of successes in such an experiment.
![image.png](attachment:image.png)

![image.png](attachment:image.png)
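
As a sketch of how binomial probabilities are computed, the code below uses the standard binomial formula $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$ for exactly k successes in n trials. The function name `binomial_pmf` and the coin-toss numbers are illustrative assumptions, and the notation may differ from the figures above.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.5                        # 10 tosses of a fair coin
p_five_heads = binomial_pmf(5, n, p)  # exactly 5 heads
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(f"P(X = 5) = {p_five_heads:.4f}")   # 0.2461
print(f"sum over all k = {total:.4f}")    # 1.0000
```

The probabilities over all possible values of k add up to 1, as required of any probability mass function.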

### 2. Normal distribution or Gaussian distribution

The normal distribution is a probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence than data far from the mean. It has the following properties:
- The normal curve is symmetrical about the mean μ;
- The mean is at the middle and divides the area into halves;
- The total area under the curve is equal to 1;
- It is completely determined by its mean and standard deviation σ (or variance)
![image.png](attachment:image.png)
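
The properties listed above can be checked numerically with `statistics.NormalDist` from the Python standard library (Python 3.8+). The mean of 170 and standard deviation of 10 are hypothetical values picked for illustration.

```python
from statistics import NormalDist

mu, sigma = 170, 10                 # hypothetical mean and standard deviation
heights = NormalDist(mu=mu, sigma=sigma)

# Symmetry: the density is the same at equal distances from the mean
left_density = heights.pdf(mu - sigma)
right_density = heights.pdf(mu + sigma)

# The mean splits the area into two halves
area_below_mean = heights.cdf(mu)

# Total area: practically all of it lies within 10 standard deviations
total_area = heights.cdf(mu + 10 * sigma) - heights.cdf(mu - 10 * sigma)

# Share of values within one standard deviation of the mean
within_one_sigma = heights.cdf(mu + sigma) - heights.cdf(mu - sigma)

print(f"pdf(mu - sigma) = {left_density:.5f}, pdf(mu + sigma) = {right_density:.5f}")
print(f"area below mean = {area_below_mean:.2f}")
print(f"total area      = {total_area:.4f}")
print(f"within 1 sigma  = {within_one_sigma:.4f}")
```

Changing only `mu` shifts the curve and changing only `sigma` widens or narrows it, which is what "completely determined by its mean and standard deviation" means in practice.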

## Hypothesis testing

- Hypothesis testing is a way for you to test the results of a survey or experiment to see if you have meaningful results based on probability.
- We test whether the results are valid to use or whether they could have happened by chance alone.
- Probability and hypothesis testing give rise to two important concepts:
    - Null hypothesis (H0): the result is no different from the assumption.
        - The null hypothesis is always the accepted fact or our default assumption.
    - Alternative hypothesis (H1): the result contradicts the assumption.

The steps of hypothesis testing are:
1. Figure out your null hypothesis.
2. State your null hypothesis.
3. Choose what kind of test you need to perform.
4. Either reject the null hypothesis or fail to reject it.
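
The steps above can be sketched with an exact two-sided binomial test of whether a coin is fair. The data (60 heads in 100 tosses), the 0.05 significance level, and all names in the code are assumptions made for illustration; this is one possible test, not the only way to carry out the procedure.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Steps 1-2: H0: the coin is fair (p = 0.5); H1: the coin is not fair.
n_tosses, observed_heads, p_null = 100, 60, 0.5
alpha = 0.05  # significance level, assumed for this example

# Step 3: exact two-sided binomial test. Add up the probability, under H0,
# of every outcome at least as far from the expected count as the observed one.
expected = n_tosses * p_null
observed_gap = abs(observed_heads - expected)
p_value = sum(
    binomial_pmf(k, n_tosses, p_null)
    for k in range(n_tosses + 1)
    if abs(k - expected) >= observed_gap
)

# Step 4: compare the p-value with the significance level.
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-value = {p_value:.4f} -> {decision}")
```

With these numbers the p-value is slightly above 0.05, so 60 heads out of 100 is not enough evidence to reject the assumption of a fair coin at the 5% level.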

### Bayesian Hypothesis Testing

#### 1. P-values
The p-value, or calculated probability, is the probability of obtaining the observed results, or more extreme ones, when the null hypothesis (H0) of a study question is true. What counts as "extreme" depends on how the hypothesis is being tested. The p-value is also sometimes described in terms of rejecting H0 when it is actually true; however, it is not a direct probability of that state.

- A p-value is a number that we get by running a hypothesis test on our data.
- A p-value of 0.05 (5%) or less is conventionally treated as statistically significant, that is, enough evidence to reject H0. It does not by itself show that the results are repeatable.
- The p-value is the usual way to assess our results.
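
As a hedged illustration of "observed or more extreme", the sketch below computes a two-sided p-value for a one-sample z-test using `statistics.NormalDist`. The scenario (sample mean 52, hypothesized mean 50, known standard deviation 10, sample size 100) and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test_p_value(sample_mean, null_mean, sigma, n):
    """Two-sided p-value for a z-test with known population standard deviation."""
    z = (sample_mean - null_mean) / (sigma / sqrt(n))
    # Probability, under H0, of a z statistic at least this far from 0
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_value = two_sided_z_test_p_value(sample_mean=52.0, null_mean=50.0, sigma=10.0, n=100)
significant = p_value < 0.05

print(f"p-value = {p_value:.4f}, significant at 5%: {significant}")
# p-value = 0.0455, significant at 5%: True
```

Here the z statistic is 2.0, so the p-value is the area in both tails of the standard normal curve beyond 2 standard errors from the hypothesized mean.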

Exercise: answer the following questions.
- What is a p-value used for?
- What does a high p-value mean? What does a p-value < 0.05 mean?
- When is the alternative hypothesis (H1) accepted?
- What are the types of error, and what do they mean?