# Tutorial
This is a short tutorial to get started with Bayesian Networks and the BayesNet library.

The real world is uncertain: weather, stock prices, sensor readings, and so on. A mathematically sound way to deal with uncertainty is to use probabilities, e.g. the probability that it will rain on a randomly chosen day:
$$ P(Rain=true)=0.16 $$
"Rain" is a discrete random variable that can take the values $\{false, true\}$.

The probabilities of all possible values add up to 1, so:
$$ P(Rain=false) = 1 - P(Rain=true) = 0.84 $$
There is a shorthand for writing both probabilities, for "Rain=false" and "Rain=true", at once:
$$ P(Rain) = \begin{pmatrix}P(Rain=false)\\P(Rain=true)\end{pmatrix} = \begin{pmatrix}0.84\\0.16\end{pmatrix} $$

This is a probability distribution over all possible values of Rain. It can also be represented as a table:

![](pd_rain.svg)

Notice that the values in the table sum to one, since the values represent a probability distribution.
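
As a minimal sketch in plain Python (not the BayesNet API), such a distribution can be stored as a mapping from values to probabilities. Only the value 0.16 comes from the text above; the variable names are illustrative.

```python
import math

# From the text: P(Rain=true) = 0.16
p_rain_true = 0.16

# The complement rule gives the probability of the other value.
p_rain = {False: 1 - p_rain_true, True: p_rain_true}

# A probability distribution has non-negative entries that sum to one.
assert all(p >= 0 for p in p_rain.values())
assert math.isclose(sum(p_rain.values()), 1.0)

print(p_rain)
```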

In graphical models, like Bayesian networks, random variables are represented as circles:

![](one_variable.svg)

Things get interesting when you have more than one variable, because one variable can then influence another.
Graphically, this relation between two variables is represented by an arrow, e.g. the season influences the probability of rain:

![](season_rain.svg)

This could be read as:
* season "influences" or "causes" rain or
* rain "depends on" or "is influenced by" season

Assume season has two values $ \{winter, summer\} $. Then the probability of rain is higher when we know that it is winter than when we know that it is summer.  
This is now a **conditional probability**: $P(Rain|Season)$.  
The bar "|" reads as "given": "the probability of Rain given the Season". Knowing the season can change our belief about whether it will rain today.

We can express this as a table:

![](cpd_rain_given_season.svg)

This is a conditional probability table. The first row is $P(Rain|Season=winter)$ and the second row is $P(Rain|Season=summer)$.

Note again that the values in each row of the table sum to one, because each row represents a (conditional) probability distribution.
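
The row-wise normalization can be checked with a small sketch in plain Python. The numbers below are hypothetical and are not read from the figure; the nested-dictionary layout is just one convenient choice, not the BayesNet representation.

```python
import math

# Hypothetical conditional probability table P(Rain | Season).
# Outer key: value of the parent "Season"; inner key: value of "Rain".
p_rain_given_season = {
    "winter": {False: 0.78, True: 0.22},
    "summer": {False: 0.90, True: 0.10},
}

# Each row is a distribution over Rain, so each row must sum to one.
for season, row in p_rain_given_season.items():
    assert math.isclose(sum(row.values()), 1.0), season

# The columns need not sum to one: they mix different conditions.
column_true = sum(row[True] for row in p_rain_given_season.values())
print(column_true)
```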

Now, the **key to understanding Bayesian Networks** is that the diagram above, with the two random variables "Season" and "Rain", represents a joint probability distribution: the joint probability of season and rain, $P(Season,Rain)$.

The rules for Bayesian Networks say how to calculate the joint probability that corresponds to such a diagram: there is one factor per variable, and all these factors are multiplied. The factor for a variable without incoming edges is a probability, e.g. $P(Season)$; the factor for a variable with incoming edges is a conditional probability, e.g. $P(Rain|Season)$.

So in the case above, $ P(Season, Rain)$ would be:

$$ P(Season,Rain) = P(Season) * P(Rain|Season) $$
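
As a worked instance with hypothetical numbers (assumed for illustration, not taken from the figures), let $P(Season=winter)=0.5$ and $P(Rain=true|Season=winter)=0.22$. One entry of the joint distribution is then:

$$ P(Season=winter,Rain=true) = P(Season=winter) * P(Rain=true|Season=winter) = 0.5 * 0.22 = 0.11 $$

The remaining three entries follow the same pattern, and all four entries of the joint sum to one.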

Variables that we condition on, like "Season", are called "parents": "Season" is a parent of "Rain". The general formula for calculating the joint probability of a Bayesian network is:

$$ P(X_1, X_2, ... , X_n) = \prod_i P(X_i|pa(X_i)) $$

Here $pa(X_i)$ denotes the parents of $X_i$. This is called the **Chain Rule** for Bayesian Networks.
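
The chain rule can be sketched generically in plain Python. This is an illustration under assumptions, not the BayesNet API: the dictionaries `values`, `parents` and `cpds`, the function `joint_probability`, and all numbers are hypothetical.

```python
import math
from itertools import product

# Hypothetical network: Season -> Rain. All numbers are illustrative.
values = {"Season": ["winter", "summer"], "Rain": [False, True]}
parents = {"Season": [], "Rain": ["Season"]}

# One table per variable, keyed by (tuple of parent values, own value).
cpds = {
    "Season": {((), "winter"): 0.5, ((), "summer"): 0.5},
    "Rain": {
        (("winter",), False): 0.78, (("winter",), True): 0.22,
        (("summer",), False): 0.90, (("summer",), True): 0.10,
    },
}

def joint_probability(assignment):
    """Chain rule: multiply P(X_i | pa(X_i)) over all variables X_i."""
    result = 1.0
    for var, table in cpds.items():
        parent_values = tuple(assignment[p] for p in parents[var])
        result *= table[(parent_values, assignment[var])]
    return result

# Enumerate every full assignment and collect the joint table.
names = list(values)
joint = {
    combo: joint_probability(dict(zip(names, combo)))
    for combo in product(*(values[n] for n in names))
}

# The joint is itself a probability distribution, so it sums to one.
assert math.isclose(sum(joint.values()), 1.0)
```

Adding a variable to this sketch only requires a new entry in `values`, `parents` and `cpds`; the function itself stays unchanged.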

<div class="alert alert-block alert-warning"><b>Note:</b> A Bayesian network is just a graphical representation of a joint probability distribution. The arrows in the diagram indicate how this joint distribution factorizes into (conditional) probabilities.</div>






You might say: **"What is the benefit of knowing the joint distribution and how it factorizes?"**

The benefit is that we can derive (infer) quantities of interest that are not given directly.

For example, from the above joint and factorization of "Season" and "Rain" $P(Season,Rain) = P(Season) * P(Rain|Season)$ we could compute the following distributions:
* $P(Season|Rain)$: the probability of the season, given it is raining
* $P(Rain)$: the (unconditional/marginal) probability of rain

Calculating such values is called "inference".
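
Both queries can be sketched in a few lines of plain Python. The numbers are hypothetical, chosen so that the resulting $P(Rain=true)$ matches the 0.16 used at the start of this tutorial; none of this is BayesNet API.

```python
import math

# Hypothetical inputs: P(Season) and P(Rain | Season).
p_season = {"winter": 0.5, "summer": 0.5}
p_rain_given_season = {
    "winter": {False: 0.78, True: 0.22},
    "summer": {False: 0.90, True: 0.10},
}

# Joint distribution P(Season, Rain) = P(Season) * P(Rain | Season).
joint = {
    (s, r): p_season[s] * p_rain_given_season[s][r]
    for s in p_season
    for r in (False, True)
}

# Marginal P(Rain): sum the joint over all values of Season.
p_rain = {r: sum(joint[(s, r)] for s in p_season) for r in (False, True)}

# Conditional P(Season | Rain=true): joint entries divided by P(Rain=true).
p_season_given_rain = {s: joint[(s, True)] / p_rain[True] for s in p_season}

assert math.isclose(p_rain[True], 0.16)
assert math.isclose(sum(p_season_given_rain.values()), 1.0)
```

With these assumed numbers, observing rain raises the belief in winter from 0.5 to about 0.69.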

You can think of a joint distribution as a database that you can query for information. The information in this "database" is interlinked, as in a brain, and these links are taken into account when the result of a query is calculated.



When doing calculations based on Bayesian Networks, we often calculate not with probabilities but with "factors". Factors are similar to probability distributions, but more general:

<div class="alert alert-block alert-warning"><b>Factor:</b> A factor is simply a function that maps each configuration (a.k.a. assignment) of its arguments to a real value.</div>

$$ f: (arg_1, arg_2, \dots, arg_n) \to \mathbb{R} $$

If the arguments are discrete, we can represent a factor as a table, similar to a (conditional) probability distribution.
But in contrast to conditional probability tables, the values are stored in a single vector. They do not have to sum to 1, and they can even be negative.

We can convert probability distributions into factors. In the diagram you see the probability distributions of "Season" and "Rain" on the left and their corresponding factors on the right.

![](cpds_and_factors.svg)

Note: In the BayesNet library, we follow the convention that the value of the variable in the first column changes fastest from row to row.
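
This ordering convention can be sketched in plain Python. It is an illustration with hypothetical numbers and names, and it does not show the BayesNet API or its internal storage; the row order in the figure may also list the values in a different sequence.

```python
from itertools import product

# Hypothetical conditional probability table P(Rain | Season).
season_values = ["winter", "summer"]
rain_values = [False, True]
p_rain_given_season = {
    "winter": {False: 0.78, True: 0.22},
    "summer": {False: 0.90, True: 0.10},
}

# Factor f(Season, Rain): rows ordered so that the first variable (Season)
# changes fastest. itertools.product varies its LAST argument fastest, so
# iterate over (Rain, Season) and swap each pair back.
rows = [(s, r) for r, s in product(rain_values, season_values)]

# The factor values live in one flat vector, aligned with the rows.
factor_values = [p_rain_given_season[s][r] for s, r in rows]

for (s, r), v in zip(rows, factor_values):
    print(s, r, v)
```

The four values sum to 2, not 1: a factor built from a conditional probability table is not itself a probability distribution.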

Note: When working with factors, there is no notion of direction. From $f(Season,Rain)$ alone we cannot tell that "Season" influences "Rain". So without knowing the original graph and its edges, we cannot convert a factor back into a conditional probability distribution.

