## Uncertainty in Machine Learning


- Uncertainty is the biggest source of difficulty for beginners in machine learning, especially developers.
- Noise in data, incomplete coverage of the domain, and imperfect models are the three main sources of uncertainty in machine learning.
- Probability provides the foundation and tools for quantifying, handling, and harnessing uncertainty in applied machine learning.

#### In many cases, it is more practical to use a simple but uncertain rule rather than a complex but certain one, even if the true rule is deterministic and our modeling system has the fidelity to accommodate a complex rule.

- Joint probability is the probability of two or more events occurring simultaneously.
- Marginal probability is the probability of an event irrespective of the outcome of other variables.
- Conditional probability is the probability of one event occurring in the presence of one or more other events.

- Joint Probability: Probability of events A and B.
- Marginal Probability: Probability of event A irrespective of the outcome of variable Y.
- Conditional Probability: Probability of event A given event B.

- The probability of a row of data is the joint probability across each input variable.
- The probability of a specific value of one input variable is the marginal probability across the values of the other input variables.
- The predictive model itself is an estimate of the conditional probability of an output given an input example.
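
#### To make the three quantities concrete, here is a minimal sketch that estimates them by counting rows. The dataset, its column meanings (`outlook`, `play`), and all values are made up for illustration.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy dataset: each row is (outlook, play). Values are invented.
rows = [
    ("sunny", "yes"), ("sunny", "no"), ("rainy", "no"),
    ("rainy", "no"), ("sunny", "yes"), ("rainy", "yes"),
]
n = len(rows)

joint_counts = Counter(rows)                            # counts of (outlook, play) pairs
outlook_counts = Counter(outlook for outlook, _ in rows)  # counts of outlook alone

# Joint: fraction of rows where outlook is sunny AND play is yes.
p_joint = Fraction(joint_counts[("sunny", "yes")], n)

# Marginal: fraction of rows where outlook is sunny, whatever play is.
p_marginal = Fraction(outlook_counts["sunny"], n)

# Conditional: among sunny rows only, the fraction where play is yes.
p_conditional = Fraction(joint_counts[("sunny", "yes")], outlook_counts["sunny"])

print(p_joint, p_marginal, p_conditional)  # 1/3 1/2 2/3
```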

## Joint Probability for Two Variables

#### We may be interested in the probability of two simultaneous events, e.g. the outcomes of two different random variables. The probability of two (or more) events is called the joint probability.

#### The joint probability for events A and B is calculated as the probability of event A given event B multiplied by the probability of event B. This can be stated formally as follows:
- P(A ∩ B) = P(A given B) × P(B)
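
#### As a small worked example of this product rule, assume we draw two cards from a standard 52-card deck without replacement. The card scenario is an illustrative assumption, not part of the original notes.

```python
from fractions import Fraction

# Assumed scenario: two cards drawn without replacement from 52 cards.
# B = the first card is an ace, A = the second card is an ace.
p_b = Fraction(4, 52)          # 4 aces among 52 cards
p_a_given_b = Fraction(3, 51)  # 3 aces left among the remaining 51 cards

# Product rule: P(A ∩ B) = P(A given B) × P(B)
p_a_and_b = p_a_given_b * p_b

print(p_a_and_b)  # 1/221
```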

## Marginal Probability

#### We may be interested in the probability of an event for one random variable, irrespective of the outcome of another random variable. For example, the probability of X = A for all outcomes of Y. The probability of one event in the presence of all (or a subset of) outcomes of the other random variable is called the marginal probability. It is obtained by summing the joint probability over the outcomes of the other variable: P(X = A) = Σ_y P(X = A, Y = y). The full set of marginal probabilities of one random variable in the presence of additional random variables is referred to as the marginal probability distribution.
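
#### A marginal probability can be computed by summing a joint distribution over the other variable. The sketch below assumes a made-up joint table over X and Y; the helper name `marginal_x` is hypothetical.

```python
from fractions import Fraction

# Hypothetical joint distribution P(X, Y). Values are invented and sum to 1.
joint = {
    ("A", "y1"): Fraction(1, 10),
    ("A", "y2"): Fraction(3, 10),
    ("B", "y1"): Fraction(2, 10),
    ("B", "y2"): Fraction(4, 10),
}

def marginal_x(joint, x):
    """Sum the joint probability over every outcome of Y for a fixed X = x."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

print(marginal_x(joint, "A"))  # 2/5
print(marginal_x(joint, "B"))  # 3/5
```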

## Independence

#### If one variable is not dependent on a second variable, this is called independence or statistical independence. This has an impact on calculating the probabilities of the two variables. For example, we may be interested in the joint probability of independent events A and B, that is, the probability that both A and B occur. Probabilities are combined using multiplication, therefore the joint probability of independent events is calculated as the probability of event A multiplied by the probability of event B. This can be stated formally as follows:
- Joint Probability: P(A ∩ B) = P(A) × P(B)

#### We refer to the marginal probability of an independent event as simply the probability. Similarly, the conditional probability of A given B when the variables are independent is simply the probability of A, as the probability of B has no effect. For example:
- Conditional Probability: P(A|B) = P(A)
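
#### Both independence identities can be checked by enumerating outcomes. This sketch assumes two fair six-sided dice, with A and B chosen purely for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed scenario: all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate over (die1, die2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def a(o):
    return o[0] == 6       # A: the first die shows 6

def b(o):
    return o[1] % 2 == 0   # B: the second die is even

p_a, p_b = prob(a), prob(b)
p_a_and_b = prob(lambda o: a(o) and b(o))
p_a_given_b = p_a_and_b / p_b  # definition of conditional probability

print(p_a_and_b == p_a * p_b)  # True
print(p_a_given_b == p_a)      # True
```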

## Exclusivity
#### If the occurrence of one event excludes the occurrence of other events, then the events are said to be mutually exclusive. Such events are also called disjoint, meaning that they cannot occur together. Note that mutually exclusive events are not independent: knowing that one occurred tells us the other did not. If event A is mutually exclusive with event B, then the joint probability of event A and event B is zero.
- P(A ∩ B) = 0.0 

#### Instead, we can ask for the probability that either event A or event B occurs. For mutually exclusive events this is the sum of the two probabilities, stated formally as follows:
- P(A or B) = P(A) + P(B)

#### The "or" is also called a union and is denoted by the symbol ∪, which resembles a capital letter U; for example:
- P(A or B) = P(A ∪ B)
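
#### Events can be modelled as Python sets, where `&` is the intersection (∩) and `|` is the union (∪). The sketch below assumes one roll of a fair six-sided die and two events picked for illustration.

```python
from fractions import Fraction

# Assumed scenario: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event, given as a subset of the sample space."""
    return Fraction(len(event), len(sample_space))

a = {1}  # A: the roll is 1
b = {2}  # B: the roll is 2 (cannot happen together with A)

p_and = prob(a & b)  # intersection is empty for mutually exclusive events
p_or = prob(a | b)   # union

print(p_and)                       # 0
print(p_or)                        # 1/3
print(p_or == prob(a) + prob(b))   # True
```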

#### If the events are not mutually exclusive, we may still be interested in the probability of either event occurring. Simply adding the two probabilities would count the outcomes where both events occur twice, so the probability of non-mutually exclusive events is calculated as the probability of event A plus the probability of event B minus the probability of both events occurring simultaneously. This can be stated formally as follows:
- P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
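
#### The formula can be compared against direct counting. This sketch assumes one roll of a fair six-sided die with two overlapping events chosen for illustration.

```python
from fractions import Fraction

# Assumed scenario: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event, given as a subset of the sample space."""
    return Fraction(len(event), len(sample_space))

a = {2, 4, 6}  # A: the roll is even
b = {4, 5, 6}  # B: the roll is greater than 3 (overlaps A on 4 and 6)

# P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
p_union_formula = prob(a) + prob(b) - prob(a & b)

# Direct count of the union, for comparison.
p_union_direct = prob(a | b)

print(p_union_formula)                    # 2/3
print(p_union_formula == p_union_direct)  # True
```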