# Forking Experiments

This notebook presents experimental results from several network simulations. They help us understand how certain settings and parameters affect the network's throughput, and more specifically the evolution of chain forks, under a Proof of Stake v3 proposing mechanism.

Before showing the experimental results, we'll work through some basic calculations to see what the theory predicts and to get a better idea of the questions we're trying to answer.

## Simplest case: 0-delay

Given any arbitrary target spacing (average time between blocks), which in our case is 16s, and 0 delay, two variables control the probability of having forks:

  - The number of staked coins.
  - The time granularity used to compute block hashes.

**Some assumptions:**

  - There's no delay: once a block is created, it arrives immediately at all the nodes in the network.
    Although it's intuitive to expect that higher delays lead to a higher number of forks, we'll
    assume for now that there's no delay, to get a better intuition for how other factors affect
    the outcome.
  - All coins have the same denomination and each proposer holds a single coin. This is a strong (and unrealistic) assumption, but not a very problematic one.
  - The difficulty adjustment mechanism is good enough to keep empirical probabilities close enough to the desired ones.
  - Given that the delay is 0, and that we have a fork choice rule, the "weakest" forks are rapidly discarded and the nodes don't try to build chains on top of them.

**Some definitions:**

  - $C$ : number of staked coins
  - $T$ : target spacing, or expected average time between blocks
  - $m$ : number of hashes a node is able to try in $T$ seconds
  - $P\left(C,m,k\right)$ : Probability that a coin gives us the ability to propose during a period of $k\frac{T}{m}$ seconds when there are $C$ staked coins, the target spacing is $T$ and a node can try to propose every $\frac{T}{m}$ seconds.
  - $\mathsf{P}$ : $P\left(C,m,1\right)$.
  - $F\left(C,m,k\right)$ : Probability of having at least one fork during a period of $k\frac{T}{m}$ seconds when there are $C$ staked coins, the target spacing is $T$ and a node can try to propose every $\frac{T}{m}$ seconds.
  - $\mathsf{F}$ : $F\left(C,m,1\right)$.
  - $\mathbb{F}$ : $F\left(C,m,m\right)$.
  

**Some calculations:**

The probability of having at least one fork for a given instant is $1$ minus the probability of not having any block proposal, minus the probability of having exactly $1$ block proposal:

$$\mathsf{F} = 1 - \mathsf{P}\left(1 - \mathsf{P}\right)^{C-1} - \left(1 - \mathsf{P}\right)^C = 1 - \left(1 - \mathsf{P}\right)^{C-1}$$

The probability of having at least one fork during $T$ seconds is:

$$\mathbb{F} = 1 - \left(1-\mathsf{F}\right)^m$$

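We want to find expressions that depend only on $C$ and $m$, so let's make some substitutions. The target spacing is $T$ and there are $m$ slots of $\frac{T}{m}$ seconds in that time, so the probability that at least one coin proposes in a given slot must be $\frac{1}{m}$: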

$$\frac{1}{m} = 1 - \left(1-\mathsf{P}\right)^C \implies \mathsf{P} = 1-\left(1-\frac{1}{m}\right)^{\frac{1}{C}}$$

and hence

$$\mathsf{F} = 1 - \left(1-\mathsf{P}\right)^{C-1} = 1 - \left( 1 - \left( 1 - \left( 1 - \frac{1}{m}\right)^{\frac{1}{C}}\right)\right)^{C-1} = 1 - \left(1-\frac{1}{m}\right)^{\frac{C-1}{C}}$$

therefore

$$\mathbb{F} = 1 - \left(1-\mathsf{F}\right)^m = 1 - \left( 1 - \left( 1 - \left( 1 - \frac{1}{m}\right)^{\frac{C-1}{C}}\right)\right)^m = 1 - \left(1-\frac{1}{m}\right)^{m\frac{C-1}{C}}$$
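The chain of substitutions above can be checked numerically. The following is a minimal sketch, not part of the simulation code: the function names are hypothetical, and it simply evaluates $\mathsf{P}$, $\mathsf{F}$ and $\mathbb{F}$ step by step as defined in this section, then compares the result with the closed form.

```python
def proposal_prob(C: int, m: int) -> float:
    """P: per-coin probability of proposing in one slot of T/m seconds."""
    return 1.0 - (1.0 - 1.0 / m) ** (1.0 / C)


def fork_prob_slot(C: int, m: int) -> float:
    """F for a single slot, using this section's expression 1 - (1 - P)^(C - 1)."""
    return 1.0 - (1.0 - proposal_prob(C, m)) ** (C - 1)


def fork_prob_period(C: int, m: int) -> float:
    """F over m slots (T seconds), composed from the per-slot value."""
    return 1.0 - (1.0 - fork_prob_slot(C, m)) ** m


def fork_prob_period_closed(C: int, m: int) -> float:
    """Closed form: 1 - (1 - 1/m)^(m (C - 1) / C)."""
    return 1.0 - (1.0 - 1.0 / m) ** (m * (C - 1) / C)


# Illustrative (C, m) pairs; these values are arbitrary choices for the check.
for C, m in [(1, 16), (10, 16), (1000, 16), (1000, 160)]:
    step_by_step = fork_prob_period(C, m)
    closed_form = fork_prob_period_closed(C, m)
    print(f"C={C:5d} m={m:4d} step-by-step={step_by_step:.6f} closed={closed_form:.6f}")
```

Both columns should agree up to floating-point error, and the row with $C = 1$ should show a fork probability of $0$.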

As we can see, when the number of coins ($C$) is big enough, the probability of having at least one fork every $T$ seconds is $1-\left(1-\frac{1}{m}\right)^m$, which is a decreasing function of $m$; and $\lim_{m\to\infty}1-\left(1-\frac{1}{m}\right)^m = 1-e^{-1}$.
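To see how quickly the large-$C$ expression approaches its limit, one can tabulate it for a few values of $m$. This is an illustrative sketch only; the function name and the chosen values of $m$ are assumptions made for the example.

```python
import math


def large_c_fork_prob(m: int) -> float:
    """Large-C fork probability per T seconds: 1 - (1 - 1/m)^m."""
    return 1.0 - (1.0 - 1.0 / m) ** m


limit = 1.0 - math.exp(-1.0)  # value approached as m grows

# Arbitrary sample of time granularities.
for m in [1, 2, 4, 16, 64, 1024]:
    value = large_c_fork_prob(m)
    print(f"m={m:5d}  fork prob={value:.6f}  gap to limit={value - limit:.6f}")
```

The printed gap shrinks as $m$ grows but stays positive, in line with the expression being decreasing in $m$ and bounded below by $1-e^{-1}$.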

We can also look at what happens to $\mathbb{F}$ when $m$ is very large. The probability of having at least one fork every $T$ seconds tends to $1-e^{-1+\frac{1}{C}}$, which makes sense: the more proposers, the higher the probability of having forks (and having just one proposer gives us a probability equal to $0$).
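The large-$m$ limit can be spelled out in one line. This is only a sketch of the intermediate step, relying on the standard limit $\lim_{m\to\infty}\left(1-\frac{1}{m}\right)^m = e^{-1}$ and on the closed form for $\mathbb{F}$ derived earlier:

$$\lim_{m\to\infty}\mathbb{F} = 1 - \left(\lim_{m\to\infty}\left(1-\frac{1}{m}\right)^{m}\right)^{\frac{C-1}{C}} = 1 - e^{-\frac{C-1}{C}} = 1 - e^{-1+\frac{1}{C}}$$

For $C = 1$ the exponent is $0$ and the expression equals $0$; as $C\to\infty$ it approaches $1-e^{-1}$.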

From this we know that, with negligible delay, it's worth having a large $m$ parameter, but we still face a fundamental limit that we can't cross: some amount of forking is inevitable. With a target spacing of 16s, we'll see a fork approximately every 25.31s on average, since $\frac{16}{1-e^{-1}} \approx 25.31$.
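The 25.31s figure can be reproduced with a short calculation. The sketch below assumes that each window of $T$ seconds is treated as an independent trial that contains a fork with probability $1-e^{-1}$ (the large-$C$, large-$m$ limit), so that the mean waiting time is $T$ divided by that probability. The variable names are hypothetical.

```python
import math

T = 16.0  # target spacing in seconds, as stated in the text

# Limiting fork probability per T-second window (large C, large m).
fork_prob_per_period = 1.0 - math.exp(-1.0)

# Assumption: windows are independent trials, so the mean waiting time
# until a window with a fork is T / probability.
mean_seconds_between_forks = T / fork_prob_per_period

print(f"{mean_seconds_between_forks:.2f} s")
```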

## A more convoluted case: deterministic $\delta$-delay

Now let's see what happens when the block propagation delay is positive. Although this is more realistic, we'll still keep things simple by assuming a deterministic, fixed delay.