# Alternative Solution Using Dependent Variables

## Problem Statement

Let:
- $C_1$, $C_2 \sim U(0,1)$ be the candle positions
- $K \sim U(0,1)$ be the knife position

Denote the probability of the knife landing between the two candles as

$$P(C_{1,2} < K < C_{1,2}) = P(C_1 < K < C_2) + P(C_2 < K < C_1).$$

The knife is equally likely to land between the candles in either orientation, since $C_1$ and $C_2$ are identically distributed and all three positions are independent, i.e.

$$P(C_1 < K < C_2) = P(C_2 < K < C_1).$$

Therefore

$$P(C_{1,2} < K < C_{1,2}) = 2 \cdot P(C_1 < K < C_2).$$
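The symmetry argument can be sanity-checked by simulation. The following is a minimal sketch, assuming the three positions are drawn independently; the function name `estimate_orientations`, the sample size and the seed are illustrative choices, not part of the original solution.

```python
import random

def estimate_orientations(n_samples=200_000, seed=0):
    """Estimate P(C1 < K < C2) and P(C2 < K < C1) by simulation."""
    rng = random.Random(seed)
    first = second = 0
    for _ in range(n_samples):
        # Independent U(0,1) candle and knife positions (assumption)
        c1, c2, k = rng.random(), rng.random(), rng.random()
        if c1 < k < c2:
            first += 1
        elif c2 < k < c1:
            second += 1
    return first / n_samples, second / n_samples

p_first, p_second = estimate_orientations()
print(f"P(C1 < K < C2) ~ {p_first:.3f}")
print(f"P(C2 < K < C1) ~ {p_second:.3f}")
print(f"Sum            ~ {p_first + p_second:.3f}")
```

Up to sampling noise, the two estimates should agree with each other, which is all the factor of 2 relies on.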

## Joint CDF

We can further simplify
$$
\begin{align*}
P(C_1 < K < C_2)
&= P(C_1 - K < 0\ \mathrm{and}\ C_2 - K > 0)\\
&= P(X < 0\ \mathrm{and}\ Y < 0)
\end{align*}
$$
where $X = C_1 - K$ and $Y = K - C_2$.

Therefore the solution to the problem is given by the joint cumulative distribution function of $X$ and $Y$ evaluated at $(0, 0)$,

$$P(X < 0\ \mathrm{and}\ Y < 0) = \int_{-1}^0 \int_{\alpha(y)}^{\beta(y)} f_{X,Y}(x,y) \, dx \, dy,$$

where $f_{X,Y}(x,y)$ is the joint density of $X$ and $Y$. The integration limits $\alpha$, $\beta$ of $X$ are functions of $Y$ due to the dependency of both variables on $K$.
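To see why the joint density is needed rather than the product of the marginals, the dependence between $X$ and $Y$ can be estimated numerically. Since $\mathrm{Cov}(X, Y) = -\mathrm{Var}(K) = -\tfrac{1}{12}$ and $\mathrm{Var}(X) = \mathrm{Var}(Y) = \tfrac{1}{6}$, the correlation should be close to $-\tfrac{1}{2}$. The sketch below assumes independent positions; `simulate_xy` and `correlation` are hypothetical helper names.

```python
import random

def simulate_xy(n_samples=200_000, seed=0):
    """Draw samples of X = C1 - K and Y = K - C2."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n_samples):
        c1, c2, k = rng.random(), rng.random(), rng.random()
        xs.append(c1 - k)
        ys.append(k - c2)
    return xs, ys

def correlation(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / (var_x * var_y) ** 0.5

xs, ys = simulate_xy()
n = len(xs)
rho = correlation(xs, ys)
# Joint probability versus the (incorrect) product of marginal probabilities
p_joint = sum(x < 0 and y < 0 for x, y in zip(xs, ys)) / n
p_product = (sum(x < 0 for x in xs) / n) * (sum(y < 0 for y in ys) / n)
print(f"correlation          ~ {rho:.3f}")
print(f"P(X < 0 and Y < 0)   ~ {p_joint:.3f}")
print(f"P(X < 0) * P(Y < 0)  ~ {p_product:.3f}")
```

The gap between the last two numbers is the effect of the shared variable $K$.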

### Joint Density

The joint density can be decomposed into

$$
\begin{align*}
f_{X,Y}(x,y) &= f_{Y\mid X}(y\mid x)f_X(x) \\
&= f_{X\mid Y}(x\mid y)f_Y(y)
\end{align*}
$$

#### Marginals

$X$ and $Y$ are each the difference of two independent standard uniform random variables, so both follow a triangular distribution, e.g.

$$K - C_2 = Y \sim \text{Triangular}(-1, 0, 1).$$

Therefore the marginals $f_X(x)$ and $f_Y(y)$ are PDFs of Triangular distributions.
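As a quick check of the marginal, the empirical CDF of simulated values of $K - C_2$ can be compared with the closed-form CDF of $\text{Triangular}(-1, 0, 1)$. This is a sketch under the independence assumption; `triangular_cdf` and the evaluation points are illustrative.

```python
import random

def triangular_cdf(t):
    """CDF of Triangular(-1, 0, 1)."""
    if t <= -1:
        return 0.0
    if t <= 0:
        return (1 + t) ** 2 / 2
    if t < 1:
        return 1 - (1 - t) ** 2 / 2
    return 1.0

rng = random.Random(1)
# Samples of Y = K - C2 with K, C2 independent U(0,1)
samples = [rng.random() - rng.random() for _ in range(200_000)]

max_error = 0.0
for t in (-0.5, 0.0, 0.25, 0.75):
    empirical = sum(s <= t for s in samples) / len(samples)
    max_error = max(max_error, abs(empirical - triangular_cdf(t)))
    print(f"t = {t:+.2f}: empirical {empirical:.3f}, analytic {triangular_cdf(t):.3f}")
```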

#### Conditional Probability

The conditional probability distribution of $X_{Y = y}$ ($X$ given $Y=y$) can be established through the shared variable $K$. If we observe a specific value $Y=y$, then $K = y + C_2$ with $C_2 \in [0, 1]$, and $K$ must also lie in $[0, 1]$, so we can narrow the support of $K$ to

$$K \in [\max(y, 0), \min(1, 1+y)],$$

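and, since $K$ and $C_2$ are independent and uniform, conditioning on $Y = y$ leaves $K$ uniform on that interval: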

$$K \sim U(\max(y, 0), \min(1, 1+y)).$$

Information about $Y$ does not influence $C_1$, so $C_1$ still follows a standard uniform distribution. Therefore the conditional distribution $X_{Y = y}$ is the difference of two uniform variables with different supports and consequently follows a trapezoidal distribution.
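The narrowed support of $K$ can be illustrated by rejection sampling: keep only the draws whose $Y$ falls in a thin band around a chosen value and inspect the surviving knife positions. This is a rough sketch; the band width `eps`, the value `y0 = -0.4` and the helper name `conditional_knife_samples` are assumptions made for illustration.

```python
import random

def conditional_knife_samples(y0, eps=0.01, n_samples=400_000, seed=0):
    """Return knife positions K from draws with |(K - C2) - y0| < eps."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_samples):
        k, c2 = rng.random(), rng.random()
        if abs((k - c2) - y0) < eps:
            kept.append(k)
    return kept

y0 = -0.4
# Predicted support of K given Y = y0
k_low, k_high = max(y0, 0.0), min(1.0, 1.0 + y0)
kept = conditional_knife_samples(y0)
sample_mean = sum(kept) / len(kept)
print(f"predicted support: [{k_low:.2f}, {k_high:.2f}]")
print(f"observed range:    [{min(kept):.2f}, {max(kept):.2f}]")
print(f"observed mean:     {sample_mean:.3f} (midpoint {(k_low + k_high) / 2:.3f})")
```

The observed range should match the predicted support up to the band width, and a mean near the midpoint is consistent with a uniform conditional distribution.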

#### Trapezoidal Distribution Refresher

Trapezoidal distributions have four parameters, $a$, $b$, $c$ and $d$, which correspond to the lower bound, first bend, second bend and upper bound respectively.

Let $F \sim U(l,m)$ and $G \sim U(n, o)$ be independent, with the support of $G$ no wider than that of $F$ ($o - n \le m - l$), and let $H = F-G$. Then

$$H\sim \text{Trapezoidal}(l-o,l-n,m-o,m-n).$$

Taking $F = C_1 \sim U(0,1)$ and $G = K$ given $Y = y$, the distribution of $X$ for a known value of $Y$ is therefore

$$X_{Y = y} \sim \text{Trapezoidal}(-\min(1, 1+y), -\max(y, 0), 1-\min(1, 1+y), 1-\max(y, 0))$$

and the conditional probability density $f_{X\mid Y}$ is the PDF of this distribution.
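The parameter mapping and the resulting density can be written out directly. The sketch below uses hypothetical helper names (`trapezoid_params`, `trapezoid_pdf`, `midpoint_integral`) and checks numerically that the density integrates to one for an example value $y = -0.4$.

```python
def trapezoid_params(y):
    """Return (a, b, c, d) of the conditional distribution of X given Y = y."""
    k_low, k_high = max(y, 0.0), min(1.0, 1.0 + y)  # support of K given Y = y
    return 0.0 - k_high, 0.0 - k_low, 1.0 - k_high, 1.0 - k_low

def trapezoid_pdf(x, a, b, c, d):
    """Density of Trapezoidal(a, b, c, d); assumes a < b <= c < d."""
    if x < a or x > d:
        return 0.0
    height = 2.0 / (d + c - a - b)  # plateau height that normalises the area
    if x < b:
        return height * (x - a) / (b - a)  # rising edge
    if x > c:
        return height * (d - x) / (d - c)  # falling edge
    return height

def midpoint_integral(func, lower, upper, steps=10_000):
    """Midpoint-rule approximation of the integral of func on [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

a, b, c, d = trapezoid_params(-0.4)
area = midpoint_integral(lambda x: trapezoid_pdf(x, a, b, c, d), a, d)
print(f"a={a:.1f}, b={b:.1f}, c={c:.1f}, d={d:.1f}, area={area:.6f}")
```

Note that $d + c - a - b = 2$ for every $y$ in this problem, so the plateau height is always 1.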

## Solution

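Finally, given the support of $X_{Y = y}$, and noting that the trapezoid's plateau height $2/(d + c - a - b)$ equals 1 here because $d + c - a - b = 2$ for every $y$, the joint CDF is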

$$ P(X < 0\ \mathrm{and}\ Y < 0) = \int_{-1}^0 \int_{a}^{\min(d,0)} \left(\begin{cases} y + 1 & \text{for}\: y < 0 \\1 - y & \text{otherwise} \end{cases}\right) \left(\begin{cases} \frac{x + \min\left(1, y + 1\right)}{- \max\left(0, y\right) + \min\left(1, y + 1\right)} & \text{for}\: x < - \max\left(0, y\right) \\\frac{- x - \max\left(0, y\right) + 1}{- \max\left(0, y\right) + \min\left(1, y + 1\right)} & \text{for}\: x > 1 - \min\left(1, y + 1\right) \\1 & \text{otherwise} \end{cases}\right) \, dx \, dy$$

Evaluating the integral yields

$$P(C_1 < K < C_2)=\frac{1}{6}$$

therefore

$$P(C_{1,2} < K < C_{1,2}) = 2 \cdot \frac{1}{6} = \frac{1}{3}.$$
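The evaluation of the integral can also be carried out by hand. For $-1 \le y \le 0$ we have $\max(y, 0) = 0$ and $\min(1, 1+y) = 1 + y$, so $a = -(1+y)$, $b = 0$, $c = -y$ and $d = 1$. The upper limit $\min(d, 0) = 0$ coincides with $b$, so only the rising edge of the trapezoid contributes:

$$
\begin{align*}
P(X < 0\ \mathrm{and}\ Y < 0)
&= \int_{-1}^0 (1 + y) \int_{-(1+y)}^{0} \frac{x + 1 + y}{1 + y} \, dx \, dy \\
&= \int_{-1}^0 \int_{-(1+y)}^{0} (x + 1 + y) \, dx \, dy \\
&= \int_{-1}^0 \frac{(1 + y)^2}{2} \, dy \\
&= \frac{1}{6}.
\end{align*}
$$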

# SymPy Verification

In [1]:
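from sympy import symbols, integrate, Piecewise, Max, Min

# Marginal density of Y ~ Triangular(-1, 0, 1)
y = symbols("y")
triangular = Piecewise(
    (y + 1, y < 0),
    (-y + 1, True)
)

# Support [k0, k1] of K given Y = y
x = symbols("x")
k0 = Max(y, 0)
k1 = Min(1, 1+y)

# Trapezoid parameters of X given Y = y
a = -k1
b = -k0
c = 1 - k1
d = 1 - k0

# Conditional density of X given Y = y on [a, d]; the plateau height
# 2/(d + c - a - b) is 1 here, so no normalising factor is needed
trapezoid = Piecewise(
    ((x-a)/(b-a), x < b),
    ((d-x)/(d-c), x > c),
    (1, True)
)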

In [2]:
triangular*trapezoid

Piecewise((y + 1, y < 0), (1 - y, True))*Piecewise(((x + Min(1, y + 1))/(-Max(0, y) + Min(1, y + 1)), x < -Max(0, y)), ((-x - Max(0, y) + 1)/(-Max(0, y) + Min(1, y + 1)), x > 1 - Min(1, y + 1)), (1, True))

In [3]:
2 * integrate(triangular*trapezoid, (x, a, Min(d, 0)), (y, -1, 0) )

1/3
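As an independent cross-check that bypasses $X$ and $Y$ entirely, one can count orderings. This is a different technique from the integral above: assuming the three positions are independent and continuous, ties have probability zero and all $3!$ orderings of $C_1$, $C_2$ and $K$ are equally likely, so the answer is the fraction of orderings with the knife in the middle.

```python
from fractions import Fraction
from itertools import permutations

# All possible left-to-right orderings of the two candles and the knife
orderings = list(permutations(("C1", "C2", "K")))

# Orderings in which the knife sits between the two candles
knife_in_middle = [order for order in orderings if order[1] == "K"]

p_between = Fraction(len(knife_in_middle), len(orderings))
print(f"{len(knife_in_middle)} of {len(orderings)} orderings -> P = {p_between}")
```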

# Numeric Integration

In [4]:
import numpy as np
from scipy.integrate import dblquad
from scipy.stats import trapezoid, triang
import time

# Used for tighter inner integral bounds
def x_lower(y):
    # Support of K
    k_range = [np.maximum(y, 0), np.minimum(1, 1+y)]

    # Support of C1
    c1_range = [0, 1]

    # Conventional trapezoid parameters
    a = c1_range[0] - k_range[1]
    
    return a

def x_upper(y):
    # Support of K
    k_range = [np.maximum(y, 0), np.minimum(1, 1+y)]

    # Support of C1
    c1_range = [0, 1]

    # Conventional trapezoid parameters
    d = c1_range[1] - k_range[0]
    
    return np.minimum(d, 0)

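def f(x, y):
    # dblquad calls the integrand as func(inner, outer): x is the inner
    # variable (bounded by gfun/hfun) and y the outer one (from a to b)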
    # Y marginal
    y_density = triang.pdf(y, loc=-1, scale=2, c=0.5)

    # Support of K
    k_range = [np.maximum(y, 0), np.minimum(1, 1+y)]

    # Support of C1
    c1_range = [0, 1]

    # Conventional trapezoid parameters
    a = c1_range[0] - k_range[1]
    b = c1_range[0] - k_range[0]
    c = c1_range[1] - k_range[1]
    d = c1_range[1] - k_range[0]

    # Scipy trapezoid parameters
    loc = a
    scale = d-a
    c1 = (b-a) / scale
    d1 = (c-a) / scale

    # X conditional probability density
    x_density = trapezoid.pdf(x, c=c1, d=d1, loc=loc, scale=scale)

    return y_density*x_density

p = dblquad(f, a=-1, b=0, gfun=x_lower, hfun=x_upper)[0]
print(f"Probability: {p*2}")

Probability: 0.3333333333333333
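The quadrature result can be cross-checked without SciPy using a plain nested midpoint rule. This is a minimal sketch, not the method used above: it hard-codes the closed-form integrand valid on the integration region ($-1 \le y \le 0$, $-(1+y) \le x \le 0$), and the names `joint_density` and `midpoint_double_integral` are illustrative.

```python
def joint_density(x, y):
    """f_Y(y) * f_{X|Y}(x|y) for -1 < y <= 0 and -(1 + y) <= x <= 0."""
    y_density = 1.0 + y                    # rising edge of Triangular(-1, 0, 1)
    x_density = (x + 1.0 + y) / (1.0 + y)  # rising edge of the trapezoid
    return y_density * x_density

def midpoint_double_integral(steps=400):
    """Nested midpoint rule over y in [-1, 0] and x in [-(1 + y), 0]."""
    total = 0.0
    dy = 1.0 / steps
    for i in range(steps):
        y = -1.0 + (i + 0.5) * dy          # midpoints never hit y = -1
        x_low = -(1.0 + y)
        dx = (0.0 - x_low) / steps
        inner = sum(joint_density(x_low + (j + 0.5) * dx, y) for j in range(steps))
        total += inner * dx * dy
    return total

p_midpoint = midpoint_double_integral()
print(f"Probability: {2 * p_midpoint:.6f}")
```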


## Optimisation - Manual PDFs

SciPy's `stats` classes are quite slow to evaluate, so this version constructs the PDFs of the distributions manually.

In [5]:
from scipy.integrate import dblquad
import time
from math import fabs
s = time.time()

def x_lower(y):
    return 0 - (1+y if y<0 else 1)

def x_upper(y):
    d = 1 - (y if y>0 else 0)
    return 0 if d>0 else d

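def f(x, y):
    # dblquad calls the integrand as func(inner, outer): x is the inner
    # variable (bounded by gfun/hfun) and y the outer one (from a to b)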
    # Y marginal
    y_density = -fabs(y) + 1

    # Support of K
    k_range0 = y if y>0 else 0
    k_range1 = 1+y if y<0 else 1

    # Conventional trapezoid parameters
    a = -k_range1
    b = -k_range0
    c = 1 - k_range1
    d = 1 - k_range0

    # X conditional probability density
    x_density = (2/(d+c-a-b))
    
    if x < b:
        x_density = x_density*((x-a)/(b-a))
        x_density = x_density if x_density>0 else 0
    elif x > c:
        x_density = x_density*((d-x)/(d-c))
        x_density = x_density if x_density>0 else 0
    
    return y_density*x_density

tol = 1e-06 # Increase for even faster results
st = time.time()
p = dblquad(f, a=-1, b=0, gfun=x_lower, hfun=x_upper, epsabs=tol, epsrel=tol)[0]
en = time.time()
print(f"Probability: {p*2}")
print(f"Calculated in {en-st:.4f} seconds")

Probability: 0.3333333333333333
Calculated in 0.0003 seconds
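A single `time.time()` measurement of sub-millisecond code is noisy, so comparisons between implementations are more stable when repeated. The sketch below times one evaluation of a manually constructed integrand with the standard library's `timeit`; the function `manual_joint_density` restates the manual PDF logic and the repeat counts are arbitrary choices.

```python
import timeit

def manual_joint_density(x, y):
    """Manual f_Y(y) * f_{X|Y}(x|y), mirroring the hand-written PDFs."""
    y_density = 1 - abs(y)
    k_low = y if y > 0 else 0
    k_high = 1 + y if y < 0 else 1
    a, b, c, d = -k_high, -k_low, 1 - k_high, 1 - k_low
    x_density = 2 / (d + c - a - b)
    if x < b:
        x_density = max(x_density * (x - a) / (b - a), 0)
    elif x > c:
        x_density = max(x_density * (d - x) / (d - c), 0)
    return y_density * x_density

calls = 10_000
# Repeat the measurement and keep the fastest run to reduce noise
timings = timeit.repeat(lambda: manual_joint_density(-0.2, -0.3), number=calls, repeat=5)
best_per_call = min(timings) / calls
print(f"Best time per call: {best_per_call * 1e6:.3f} microseconds")
```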


## Cython Implementation

For fun, I tried to squeeze out even more performance by writing a compiled implementation.

In [7]:
%load_ext cython

In [8]:
%%cython
from scipy.integrate import dblquad
from libc.math cimport fabs
import cython
import time
import numpy as np

def x_lower(y):
    return 0 - (1+y if y<0 else 1)

def x_upper(y):
    d = 1 - (y if y>0 else 0)
    return 0 if d>0 else d

@cython.cdivision(True)
def f(x: cython.double, y: cython.double):
    # dblquad calls the integrand as func(inner, outer): x is the inner
    # variable (bounded by gfun/hfun) and y the outer one (from a to b)
    # Y marginal
    cdef double y_density = -fabs(y) + 1

    # Support of K
    cdef double k_range0 = y if y>0 else 0.0
    cdef double k_range1 = 1+y if y<0 else 1.0

    # Conventional trapezoid parameters
    cdef double a = 0.0 - k_range1
    cdef double b = 0.0 - k_range0
    cdef double c = 1.0 - k_range1
    cdef double d = 1.0 - k_range0

    # X conditional probability density
    cdef double x_density = (2.0/(d+c-a-b))
    
    if x < b:
        x_density = x_density*((x-a)/(b-a))
        x_density = x_density if x_density>0 else 0.0
    elif x > c:
        x_density = x_density*((d-x)/(d-c))
        x_density = x_density if x_density>0 else 0.0
    
    return y_density*x_density

tol = 1e-06 # Increase for even faster results
st = time.time()
p = dblquad(f, a=-1, b=0, gfun=x_lower, hfun=x_upper, epsabs=tol, epsrel=tol)[0]
en = time.time()
print(f"Probability: {p*2}")
print(f"Calculated in {en-st:.4f} seconds")

Content of stderr:
Calculated in 0.0012 seconds
