In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

## Introduction

In this chapter, we will look at the core ideas surrounding statistical inference on graphs.

## Statistics refresher

Before we can proceed with statistical inference on graphs,
we must first refresh our memory of some ideas from the world of statistics.
Otherwise, the methods that we will end up using
may seem a tad _weird_, and hence difficult to follow.

To review statistical ideas,
let's set up a few statements and explore what they mean.

## We are concerned with models of randomness

As with all things statistics, we are concerned with models of randomness.
Here, probability distributions give us a way to think about random events
and how to assign credibility points to them.

### In an abstract fashion...

The supremely abstract way of thinking about a probability distribution
is that it is the space of all possibilities of "stuff"
with different credibility points _distributed_ amongst each possible "thing".

### More concretely: the coin flip

A more concrete example is to consider the coin flip.
Here, the space of all possibilities of "stuff" is the set of "heads" and "tails".
If we have a fair coin, then we have 0.5 credibility points _distributed_
to each of "heads" and "tails".
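As a minimal sketch of the coin flip, we can write the distribution down as a dictionary and simulate some flips. The names `fair_coin` and `flip`, the seed, and the number of flips are illustrative choices, not anything prescribed by this chapter.

```python
import random
from collections import Counter

# Credibility points for a fair coin: each state gets 0.5.
fair_coin = {"heads": 0.5, "tails": 0.5}


def flip(coin, n, seed=0):
    """Simulate n flips of a coin given as a {state: probability} dict."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    states = list(coin)
    weights = [coin[state] for state in states]
    return Counter(rng.choices(states, weights=weights, k=n))


n_flips = 10_000
counts = flip(fair_coin, n=n_flips)

# The observed fraction of each state should be close to its credibility.
frequencies = {state: counts[state] / n_flips for state in fair_coin}
print(frequencies)
```

With many flips, the observed fractions should land near 0.5 each, which is what the "credibility points" picture predicts for a fair coin.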

### Another example: dice rolls

Another concrete example is to consider the six-sided die.
Here, the space of all possibilities of "stuff" is the set of integers $\{1, 2, \ldots, 6\}$.
If we have a fair die, then 1/6 of the credibility points are assigned
to each of the numbers.
(An unfair die will have an unequal _distribution_ of credibility points across its faces.)
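To make the fair-versus-unfair contrast concrete, here is a small sketch using exact fractions. The loaded die below is a made-up example (face 6 gets half of the credibility), chosen purely for illustration.

```python
from fractions import Fraction

faces = range(1, 7)

# Fair die: every face receives the same share of credibility.
fair_die = {face: Fraction(1, 6) for face in faces}

# A hypothetical loaded die: face 6 gets half of the credibility,
# and the other five faces split the remainder equally.
loaded_die = {face: Fraction(1, 10) for face in range(1, 6)}
loaded_die[6] = Fraction(1, 2)

for name, die in [("fair", fair_die), ("loaded", loaded_die)]:
    # However the credibility is distributed, it must sum to exactly 1.
    assert sum(die.values()) == 1
    print(name, {face: str(p) for face, p in die.items()})
```

In both cases the credibility sums to 1; only the way it is spread across the faces differs.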

### The key idea: "distribution of credibility points"

With dice and coins, we _distribute_ credibility points
across _discrete_ states.
The fraction of credibility points assigned to a given state 
(i.e. heads/tails, or integers in a fixed range)
tells us the _probability_ of that state showing up.
Putting these two ideas together
gives rise to our notion of a "discrete probability distribution".
(_Notice the italicized words:
they are there to prime you for the vocabulary
used to describe statistics!_)

To visualize this in your mind,
think of placing dollar coins on each state,
with the number of coins being proportional 
to the credibility assigned to each state.
The **fraction of dollar coins assigned to each state, then,
represents the probability of that state**.
The _normalized_ number of coins,
rescaled so that the totals across all states sum to 1,
is that same fraction, and tells us how _likely_ that state is.
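The coin-stacking picture can be sketched in a few lines: start from raw coin counts on each state and divide by the total. The helper `normalize` and the counts used here are hypothetical, chosen only to illustrate the idea.

```python
def normalize(coins):
    """Turn raw coin counts per state into fractions that sum to 1."""
    total = sum(coins.values())
    if total <= 0:
        raise ValueError("need at least one coin to normalize")
    return {state: count / total for state, count in coins.items()}


# Hypothetical stacks of dollar coins placed on each state.
coins_on_states = {"heads": 30, "tails": 10}

probabilities = normalize(coins_on_states)
print(probabilities)  # {'heads': 0.75, 'tails': 0.25}
```

Doubling every stack would leave the result unchanged, because only the fraction of coins on each state matters.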

### Infinite-faced dice

Now, if you've played Dungeons and Dragons,
you'll know that there are dice with many more faces than six.
One of them has as many as 20 faces.
So we know that dice don't necessarily _have_ to be limited to six faces.

Let's take the multi-faced dice idea to an extreme in a thought experiment.
Imagine with me for a moment that you have an infinite-faced die,
with numbers starting at $0$ and going to $+\infty$ (positive infinity).
