# Problem

## Round, Round, Get a Round, I Get a Round

_Suppose you have two real numbers, like 3.14 and 2.718. If you round these two numbers and add these rounded values together, you get 3 + 3, or 6. Alternatively, if you add the original two numbers and then round the sum, you still get 6._

_But rounding then adding doesn’t always give you the same result as adding then rounding. For example, if the two numbers are 2.4 and 3.4, rounding then adding gives you 5 (i.e., 2 + 3), whereas adding then rounding gives you 6 (i.e., 2.4 + 3.4 = 5.8, which rounds to 6)._

_How likely is it that rounding then adding gives you the same result as adding then rounding?_

_To be more precise, suppose you randomly, uniformly, and independently pick two real numbers between 0 and 1. What is the probability that rounding the two numbers and then adding gives you the same result as adding the two numbers and then rounding?_

# Solution

There are a couple of intuitive ways of going about this problem. Here's one that I think will set us up well for the extra credit!

Let $X_1$ and $X_2$ be our randomly sampled numbers. Let's first consider the distributions of $Y = \operatorname{round}(X_1) + \operatorname{round}(X_2)$ and $Z = \operatorname{round}(X_1 + X_2)$.

We want the following probabilities, which we will then sum over $i$:

$ \mathbb{P}(Y=i \land Z=i) \text{ for all } i \in \{0,1,2\}$

But $Y$ and $Z$ are not independent, so we can't naively multiply the marginals:

$ \mathbb{P}(Y=i \land Z=i) \neq \mathbb{P}(Y=i) \mathbb{P}(Z=i)$

Instead let's use some conditional probabilities

$ \mathbb{P}(Y=i \land Z=i) = \mathbb{P}(Z=i | Y=i ) \mathbb{P}(Y=i) $

$\mathbb{P}(Y=i)$ is actually pretty easy to deal with: each number rounds up with probability $\frac{1}{2}$, so $Y$ follows a binomial distribution with $N=2$ trials and success probability $\frac{1}{2}$.
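As a small sketch of that binomial claim, the marginal probabilities can be tabulated exactly. The helper name `prob_y` is my own and not part of the original solution; it assumes each draw rounds up with probability exactly 1/2.

```python
from fractions import Fraction
from math import comb

def prob_y(i, n=2):
    """P(Y = i): exactly i of the n uniform draws round up, each with probability 1/2."""
    return Fraction(comb(n, i), 2**n)

# Marginal distribution of Y for two draws: i = 0, 1, 2.
marginals = [prob_y(i) for i in range(3)]
print(marginals)
```

For two draws this gives 1/4, 1/2 and 1/4, which are the values of $\mathbb{P}(Y=i)$ used when everything is put together.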

$\mathbb{P}(Z=i | Y=i )$ is a bit more complex so let's take it step by step.

First, let's consider the case where both $Y$ and $Z$ are 0. Since $Z=0$ means $X_1 + X_2<\frac{1}{2}$ and $Y=0$ means both numbers round down, we have

$\mathbb{P}(Z=0 | Y=0 ) = \mathbb{P}(X_1 + X_2<\frac{1}{2} | Y=0 ) = \mathbb{P}(X_1 + X_2<\frac{1}{2} | X_1<\frac{1}{2} , X_2<\frac{1}{2} )$

Looking at the condition closely, we notice that $X_1$ and $X_2$ are still independent and uniformly distributed, just over the interval $[0,0.5]$ instead of $[0,1]$. The density of their sum is therefore a convolution of two uniform densities. For uniforms on $[0,1]$ this is a [well-known distribution](https://math.stackexchange.com/questions/357672/density-of-sum-of-two-independent-uniform-random-variables-on-0-1): the triangle distribution.

*Triangle Distribution*

$
Z' = X_1 + X_2, \qquad f_{Z'}(z') =
\begin{cases}
z' & \text{if } 0 \le z' \le 1 \\
2-z' & \text{if } 1 < z' \le 2
\end{cases}
$

We just need to modify this distribution to account for the fact that we are only considering the case where $X_1$ and $X_2$ are less than 0.5. This is a simple matter of rescaling the horizontal axis by 0.5 and renormalizing so that the total area is still 1.

$
Z' = X_1 + X_2, \qquad f(z') =
\begin{cases}
4z' & \text{if } 0 \le z' \le 0.5 \\
4-4z' & \text{if } 0.5 < z' \le 1
\end{cases}
$

We integrate this from 0 to 0.5 to get the probability that $Z'<0.5$

$
\int_0^{0.5} 4z' dz' = 2z'^2 |_0^{0.5} = 0.5
$
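The integral is easy enough to do by hand, but a quick numerical check is cheap. The sketch below is my own addition: it codes up the scaled triangle density as `f` and integrates it with a plain midpoint rule (`integrate` is a hypothetical helper, not a library call).

```python
def f(z):
    """Density of X1 + X2 when both are uniform on [0, 0.5] (the scaled triangle)."""
    if 0 <= z <= 0.5:
        return 4 * z
    if 0.5 < z <= 1:
        return 4 - 4 * z
    return 0.0

def integrate(func, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (j + 0.5) * h) for j in range(steps))

half_mass = integrate(f, 0, 0.5)   # probability that the sum is below 0.5
full_mass = integrate(f, 0, 1)     # total area under the density
print(half_mass, full_mass)
```

The first value should come out very close to 0.5 and the second very close to 1, confirming that the scaled triangle is a proper density with half its mass below 0.5.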

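For $i=1$, we can use the same logic. By symmetry it is enough to look at the case where $X_1>0.5$ and $X_2<0.5$; the mirrored case gives the same conditional probability, and the factor of 2 for the two arrangements is accounted for in $\mathbb{P}(Y=1)$ below.

$ \mathbb{P}(Z=1 | Y=1) = \mathbb{P}(\frac{1}{2}<X_1 + X_2<\frac{3}{2} | X_1>\frac{1}{2} , X_2<\frac{1}{2} )$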

Again, this is just our triangle distribution, except that now $X_1$ is uniform on $[0.5,1]$ and $X_2$ is uniform on $[0,0.5]$. This requires the same scaling of the distribution, plus a shift to the right by 0.5.

$\int_{0.5}^{1.5} f(z'-0.5) dz'$

Though if we consider the bound and the shift, this is the same as

$\int_{0}^{1} f(z') dz'=1$

Finally, for the case where $i=2$, we can observe that it's symmetric with the $i=0$ case, so

$\mathbb{P}(Z=2 | Y=2 ) = 0.5$

Now we can put it all together

$\mathbb{P}(Y=i \land Z=i) = \mathbb{P}(Z=i | Y=i ) \mathbb{P}(Y=i) $

$\mathbb{P}(Y=0 \land Z=0) = \frac{1}{2} \cdot \mathbb{P}(Y=0) = \frac{1}{2} \cdot \mathbb{P}(X_1<\frac{1}{2} , X_2<\frac{1}{2}) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8}$

$\mathbb{P}(Y=1 \land Z=1) = 1 \cdot \mathbb{P}(Y=1) = \binom{2}{1} \mathbb{P}(X_1>\frac{1}{2} , X_2<\frac{1}{2}) = 2 \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{2}$

$\mathbb{P}(Y=2 \land Z=2) = \frac{1}{2} \cdot \mathbb{P}(Y=2) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8}$

Summed together, we get

$\frac{1}{8} + \frac{1}{2} + \frac{1}{8} = \frac{3}{4}$
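A Monte Carlo run is a reasonable sanity check on both the conditional probabilities and the final answer. This is a sketch of my own, not part of the original derivation: `round_half_up` and `simulate_pair` are hypothetical helper names, and I round with `floor(x + 0.5)` because Python's built-in `round` sends exact ties to the nearest even integer (ties have probability zero here, so the choice does not affect the result).

```python
import math
import random
from collections import Counter

def round_half_up(x):
    """Round to the nearest integer, sending exact .5 ties upward."""
    return math.floor(x + 0.5)

def simulate_pair(trials=200_000, seed=1):
    """Estimate P(Z = i | Y = i) for each i, and the overall P(Y = Z)."""
    rng = random.Random(seed)
    seen = Counter()    # number of trials with Y = i
    agree = Counter()   # number of trials with Y = i and Z = i
    for _ in range(trials):
        x1, x2 = rng.random(), rng.random()
        y = round_half_up(x1) + round_half_up(x2)
        z = round_half_up(x1 + x2)
        seen[y] += 1
        agree[y] += int(y == z)
    conditional = {i: agree[i] / seen[i] for i in sorted(seen)}
    overall = sum(agree.values()) / trials
    return conditional, overall

conditional, overall = simulate_pair()
print(conditional)
print(overall)
```

The conditional estimates should land near 0.5, 1 and 0.5, and the overall estimate near 0.75, up to sampling noise.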

# Extra Credit

Analyzing the problem in this way sets us up well for the extra credit. We can now consider the case where we have $N$ samples and we want to find the probability that $Y=Z$. This is a bit more complex but we can use the same logic as above.

Now we have

$X_1 + X_2 + \ldots + X_N = Z' = f(z')$

This distribution is a convolution of $N$ uniform distributions and if I was a badass, I would derive the distribution here. But that's a bit beyond what I'd like to do for this problem, and it is a [well-known distribution](https://en.wikipedia.org/wiki/Irwin%E2%80%93Hall_distribution) called the Irwin-Hall distribution. Its CDF is given by

$F(x,n) = \frac{1}{n!} \sum_{k=0}^{\lfloor x \rfloor} (-1)^k \binom{n}{k} (x-k)^n$
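As a quick worked check of my own (keeping only the terms with $k \le x$ in the sum), take $n=2$, where the result should agree with the triangle distribution on $[0,2]$ from the first part:

$F(1,2) = \frac{1}{2!} \left[ (1-0)^2 - \binom{2}{1} (1-1)^2 \right] = \frac{1}{2}$

$F(2,2) = \frac{1}{2!} \left[ (2-0)^2 - \binom{2}{1} (2-1)^2 + \binom{2}{2} (2-2)^2 \right] = \frac{1}{2} \left[ 4 - 2 + 0 \right] = 1$

Half of the mass sits below 1 and all of it below 2, which is what the symmetric triangle gives.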

Of course, this formulation is for the case where $X_i$ are uniform on $[0,1]$. We can modify this to account for the fact that we are only considering the case where the range of $X_i$ is $0.5$ and shifted on intervals of $0.5$. Instead, I opted to do a close examination of the ranges of the integral of our compressed distribution and converted them into the appropriate ranges for the Irwin-Hall distribution. The pattern came out to be

$
\begin{aligned}
Y=Z=0 &\rightarrow [0,1] \\
Y=Z=1 &\rightarrow [0,2] \\
Y=Z=2 &\rightarrow [1,3] \\
Y=Z=3 &\rightarrow [2,4] \\
&\cdots \\
Y=Z=N-1 &\rightarrow [N-2,N] \\
Y=Z=N &\rightarrow [N-1,N]
\end{aligned}
$
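One way to see where these ranges come from (this is my own reconstruction; the helper variables $U_i$ and $b_i$ are not in the original): let $b_i \in \{0,1\}$ record whether $X_i$ rounds up, and write $X_i = \frac{1}{2} U_i + \frac{1}{2} b_i$. Given the $b_i$, each $U_i$ is uniform on $[0,1]$ and they are independent. If $Y=k$, exactly $k$ of the $b_i$ equal 1, so

$\sum_{i=1}^N X_i = \frac{1}{2} \sum_{i=1}^N U_i + \frac{k}{2}$

Then $Z=k$ requires

$k - \frac{1}{2} \le \frac{1}{2} \sum_{i=1}^N U_i + \frac{k}{2} < k + \frac{1}{2} \iff k-1 \le \sum_{i=1}^N U_i < k+1$

Since $\sum U_i$ follows the Irwin-Hall distribution on $[0,N]$, the range for $Y=Z=k$ is $[k-1,k+1]$ clipped to $[0,N]$, which reproduces the pattern in the table.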

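So we need to calculate the probability $S$ that $Y=Z$. Writing $F(x)$ for $F(x,N)$, weighting each range by the number of ways to choose which samples round up, and noting that the two end cases contribute equally by symmetry, we get

$S = \left( \dfrac{1}{2} \right)^N \left[ 2(F(1) - F(0)) + \sum_{k=0}^{N-2} \binom{N}{k+1} \left( F(k+2) - F(k) \right) \right]$

Since $F(0)=0$ and $F(1)=\frac{1}{N!}$, this becomes

$S = \left( \dfrac{1}{2} \right)^N \left[ \frac{2}{N!}+ \sum_{k=0}^{N-2} \binom{N}{k+1} \left( F(k+2) - F(k) \right) \right]$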

Frankly, after much effort, I couldn't find a way to simplify this, so this is where I'll leave it! I think this generates the correct answer!

```python
import math

def irwinhall_cdf(x, n):
    """Irwin-Hall CDF F(x, n), evaluated at an integer x in [0, n]."""
    return 1/math.factorial(n) * sum([(-1)**k * math.comb(n, k) * (x-k)**n for k in range(x+1)])

def compute_sum(n):
    """Probability that rounding then adding matches adding then rounding for n samples."""
    # End cases Y = 0 and Y = n correspond to the ranges [0, 1] and [n-1, n].
    total = irwinhall_cdf(1, n) - irwinhall_cdf(0, n)
    total += irwinhall_cdf(n, n) - irwinhall_cdf(n-1, n)
    # Interior cases Y = k+1 correspond to the range [k, k+2].
    for k in range(0, n-1):
        total += math.comb(n, k+1) * (irwinhall_cdf(k+2, n) - irwinhall_cdf(k, n))
    return (1/2)**n * total

for i in range(1, 15):
    print("{}={}".format(i, compute_sum(i)))
```

```
1=1.0
2=0.75
3=0.6666666666666667
4=0.5989583333333334
5=0.55
6=0.5110243055555556
7=0.4793650793650793
8=0.4529209681919643
9=0.4304177689594357
10=0.4109626428244185
11=0.39392556517556515
12=0.37884408454473006
13=0.3653708694854528
14=0.35323915669918926
```
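To gain some confidence in these numbers without relying on the Irwin-Hall formula at all, here is a brute-force simulation sketch of my own. The names `round_half_up` and `simulate` are hypothetical helpers, and rounding is done with `floor(x + 0.5)` rather than Python's built-in `round`, which sends exact ties to the nearest even integer.

```python
import math
import random

def round_half_up(x):
    """Round to the nearest integer, sending exact .5 ties upward."""
    return math.floor(x + 0.5)

def simulate(n, trials=100_000, seed=0):
    """Estimate the probability that rounding then adding equals adding then rounding."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        if sum(round_half_up(x) for x in xs) == round_half_up(sum(xs)):
            hits += 1
    return hits / trials

# Compare against the first few values printed above.
estimates = {n: simulate(n) for n in range(1, 6)}
for n, p in estimates.items():
    print("{}={}".format(n, p))
```

With this many trials the estimates should agree with the values above to roughly two decimal places.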
