**TODO-Analytic:**
  * Confidence bounds $l(p)$, $r(p)$ for the pairwise order coefficient
  * Confidence bounds $l(p)$, $r(p)$ for the multivariate order coefficient

**TODO-Numeric:**
  * Test effect of trial number
  * Test effect of event number
  * Test effect of bootstrap count

In [None]:
import numpy as np
import matplotlib.pyplot as plt

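# Pairwise orderability coefficient: 0 if t1 < t2 in half of the trials,
# 1 if t1 is always earlier or always later than t2
def metric(t1, t2):
    ord12 = (t1 < t2).astype(float)
    return np.abs(2*np.mean(ord12) - 1)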

## Ordering for data

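1. Subtract a baseline (the minimum, or a low percentile for noisy data), clip at zero, and normalize to unit sum
2. Compute the CDF of each trace
3. Subtract one CDF from the other and integrate the difference; the sign tells which trace is earlier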
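
As a quick sanity check, the three steps can be sketched on toy data where the answer is known. The helper name `order_function` and the two Gaussian pulses are illustrative assumptions, and this sketch uses the plain minimum as baseline; it also assumes neither trace is constant, since a constant trace has zero mass after baseline subtraction.

```python
import numpy as np

def order_function(x, y):
    """Integrated CDF difference; positive when the mass of x sits earlier than that of y."""
    def to_cdf(v):
        v = np.asarray(v, dtype=float)
        p = v - v.min()          # step 1: subtract the minimum...
        p = p / p.sum()          # ...and normalize to unit mass
        return np.cumsum(p)      # step 2: cumulative distribution
    return np.sum(to_cdf(x) - to_cdf(y))   # step 3: subtract CDFs and integrate

# Toy example (assumed data): two identical pulses, the first peaking earlier
t = np.arange(100)
early = np.exp(-0.5 * ((t - 30) / 5.0) ** 2)
late = np.exp(-0.5 * ((t - 60) / 5.0) ** 2)

print(order_function(early, late))   # positive: `early` comes first
print(order_function(late, early))   # negative: same magnitude, opposite sign
```

For normalized traces the integrated CDF difference equals the difference of the two centers of mass, so its magnitude can be read as a time lag in samples.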

In [None]:
##################
# Generate Data
##################

# Slow down white noise with a first-order recursive filter (modifies data in place)
def addConv(data, r):
    for i in range(len(data)-1):
        data[i+1] += r * data[i]
    return data

# Convert data to probability distribution
def rescale(x):
    #xnorm = x - np.min(x)            # Subtract minimum
    xnorm = x - np.percentile(x, 40)  # Subtract 40th percentile as baseline
    xnorm[xnorm < 0] = 0              # Clip values below baseline to zero
    return xnorm / np.sum(xnorm)      # Normalize

def CDF(p):
    rez = np.zeros(len(p) + 1)
    for i in range(len(p)):
        rez[i+1] = rez[i] + p[i]
    return rez

# Generate slow data
nPoints = 100
x = addConv(np.random.normal(0, 1, nPoints), 0.7)
y = addConv(np.random.normal(0, 1, nPoints), 0.7)
x = rescale(x)
y = rescale(y)

##################
# Generate CDF
##################
cdfX = CDF(x)
cdfY = CDF(y)
deltaCDF = cdfX - cdfY
orderFunc = np.sum(deltaCDF)

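# Non-negative orderFunc: cdfX lies above cdfY on balance, so the mass of data1 arrives earlier
earlier = 'data2' if orderFunc < 0 else 'data1'
print("Order function is", orderFunc, "so the earlier trace is", earlier)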

fig, ax = plt.subplots(ncols = 3, figsize=(15,5))
ax[0].set_title("Two slow noisy datasets")
ax[0].plot(x, label='data1')
ax[0].plot(y, label='data2')
ax[0].legend()

ax[1].set_title("CDF's for both data sets")
ax[1].plot(cdfX, label='data1')
ax[1].plot(cdfY, label='data2')
ax[1].legend()

ax[2].set_title("CDF difference")
ax[2].plot(deltaCDF)

plt.show()

## Pairwise orderability

How to determine if one event consistently happens before/after another one?

Approach:
  1. For each trial $i$, compute the indicator $b_i = [x_i < y_i]$. For noisy data, one can compare means or cumulative distributions instead of single samples
  2. Assume the $b_i$ are iid Bernoulli with parameter $p$, so that $N_{<} = \sum_i b_i \sim Bin(N, p)$
  3. Estimate $\hat{p} = \frac{N_{<}}{N}$
  4. Define the orderability coefficient $c = |2\hat{p} - 1|$, so that $c \in [0,1]$
  5. Compute the empirical distribution of $c$ under $H_0 : p=0.5$, and from it the confidence threshold $c_{\max}(\alpha)$ such that under $H_0$ we have $c > c_{\max}$ with probability at most $\alpha$.
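
Because $N_{<}$ is binomial under $H_0$, the threshold in step 5 can also be computed exactly rather than by simulation. The following is a minimal sketch of that idea, not the notebook's method; the function name `exact_null_threshold` is an assumption, and the float conversion of the binomial weights limits it to moderate trial counts (a few hundred at most).

```python
import numpy as np
from math import comb

def exact_null_threshold(n_trial, alpha):
    """Smallest attainable c_max with P(c > c_max | p = 0.5) <= alpha, plus that tail probability."""
    k = np.arange(n_trial + 1)                         # possible values of N_<
    pmf = np.array([comb(n_trial, int(ki)) for ki in k], dtype=float) / 2.0 ** n_trial
    d = np.abs(2 * k - n_trial)                        # integer |2 N_< - N|, so c = d / N
    for d_max in np.unique(d):                         # candidate thresholds, ascending
        tail = pmf[d > d_max].sum()                    # exact P(c > d_max / N) under H0
        if tail <= alpha:
            return d_max / n_trial, tail

for alpha in [0.01, 0.05]:
    c_max, tail = exact_null_threshold(59, alpha)
    print("alpha =", alpha, "c_max =", c_max, "exact tail probability =", tail)
```

Working with the integer $|2N_{<} - N|$ instead of the float $c$ keeps the two symmetric counts $N_{<}$ and $N - N_{<}$ in the same bin. Since $c$ only takes discrete values, the exact tail probability is usually somewhat below $\alpha$.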

In [None]:
pValLst = [0.01, 0.05]
pValCol = ['r', 'y']
nTrial = 59
nBootstrap = 10000

metricLst = []
for i in range(nBootstrap):
    # Simulate H0: event times in both channels are iid uniform, so p = 0.5
    times1 = np.random.uniform(0, 1, nTrial)
    times2 = np.random.uniform(0, 1, nTrial)
    metricLst += [metric(times1, times2)]

plt.figure()
plt.hist(metricLst, bins='auto')
plt.xlabel("Orderability coefficient")
plt.title("Orderability of random data")

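for pVal, col in zip(pValLst, pValCol):
    # empR is the one-sided threshold c_max(pVal) from step 5;
    # the interval [empL, empR] covers roughly 1 - 2*pVal of the null distribution
    empL = np.percentile(metricLst, 100*pVal)
    empR = np.percentile(metricLst, 100*(1-pVal))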
    print("For p =", pVal, "empirical confidence interval for bivariate order metric is", [empL, empR])
    plt.axvline(x=empL, linestyle='--', color=col, label=str(pVal))
    plt.axvline(x=empR, linestyle='--', color=col)

plt.legend()
plt.show()

## Multivariate Orderability

How to determine if multiple events are consistently orderable?

Approach:
  1. Compute the pairwise "earlier" count $N^{<}_{ij}$ for every pair of events $i, j$
  2. Compute the matrix of pairwise orderability coefficients $c_{ij}$ (excluding the diagonal)
  3. Define the multivariate orderability $C = \bar{c}$ as the mean over all pairwise coefficients
  4. Compute empirical confidence intervals for $C$ under $H_0 : t_{ij} \sim U(0,1) \; \forall i,j$, i.e. the times of all events $i$ at all repetitions $j$ are independent and uniform.
  5. Test if $C$ for real data is inside the interval
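
Steps 1 to 3 can be written without explicit loops over event pairs. The sketch below is one possible vectorized form using NumPy broadcasting; the function name `multivariate_orderability`, the fixed seed, and the synthetic "ordered" dataset are assumptions made for illustration only.

```python
import numpy as np

def multivariate_orderability(times):
    """Mean pairwise coefficient C for an array of shape (n_event, n_trial)."""
    # less[i, j, k] is True when event i precedes event j in trial k
    less = times[:, None, :] < times[None, :, :]
    p_hat = less.mean(axis=2)                    # fraction of trials with i before j
    c = np.abs(2 * p_hat - 1)                    # pairwise coefficients c_ij
    iu = np.triu_indices(times.shape[0], k=1)    # each unordered pair once, no diagonal
    return c[iu].mean()

rng = np.random.default_rng(0)   # seed chosen arbitrarily, for reproducibility
n_event, n_trial = 20, 59

# Null data: all times iid uniform, as under H0
c_null = multivariate_orderability(rng.uniform(0, 1, (n_event, n_trial)))

# Hypothetical ordered data: event i occurs near time i, with small jitter
ordered = np.arange(n_event)[:, None] + rng.normal(0, 0.1, (n_event, n_trial))
c_ordered = multivariate_orderability(ordered)

print("C for random data:", c_null)
print("C for ordered data:", c_ordered)
```

Strongly ordered events should give $C$ close to 1, while random data gives a small but nonzero $C$, because each $c_{ij}$ is an absolute value and therefore biased away from zero at finite trial number.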

In [None]:
pValLst = [0.01, 0.05]
pValCol = ['r', 'y']
nEvent = 20
nTrial = 59
nBootstrap = 10000

metricLst = []

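for iBoot in range(nBootstrap):
    # Simulate H0: all event times are iid uniform
    times = np.random.uniform(0, 1, (nEvent, nTrial))

    # Pairwise coefficients for all unordered event pairs (i < j)
    ordMatL = []
    for i in range(nEvent):
        for j in range(i+1, nEvent):
            ordMatL += [metric(times[i], times[j])]

    metricLst += [np.mean(ordMatL)]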

plt.figure()
plt.hist(metricLst, bins='auto')
plt.xlabel("Mean Orderability coefficient")
plt.title("Orderability of random data of " + str(nEvent) + " events")

for pVal, col in zip(pValLst, pValCol):
    empL = np.percentile(metricLst, 100*pVal/2)
    empR = np.percentile(metricLst, 100*(1-pVal/2))
    print("For p =", pVal, "empirical confidence interval for multivariate order metric is", [empL, empR])
    plt.axvline(x=empL, linestyle='--', color=col, label=str(pVal))
    plt.axvline(x=empR, linestyle='--', color=col)
plt.legend()
plt.show()