# STAT 345: Nonparametric Statistics

## Lesson 10.2: Mood's Median Test

**Reading: Conover Section 4.3**

*Prof. John T. Whelan*

Tuesday 15 April 2025

These lecture slides are in a computational notebook.  You have access to them through http://vmware.rit.edu/

Flat HTML and slideshow versions are also in MyCourses.

The notebook can run Python commands (other notebooks can use R or Julia; "Ju-Pyt-R").  Think: computational data analysis, not "coding".

Standard commands to activate inline interface and import libraries:

In [1]:
%matplotlib inline

In [2]:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (8.0,5.0)
plt.rcParams['font.size'] = 14

## Mood’s Median Test

- Today: an application of the $\chi^2$ test for a $2\times c$ contingency table.

- Consider $c$ samples from different populations; do the populations have the same median?

- In notation of this chapter, samples have sizes $\{c_j|j=1,\ldots,c\}$, total # of data values is $\sum_{j=1}^c c_j=N$ & data are $\{x_{jk}|j=1,\ldots c;k=1,\ldots,c_j\}$ (use $k$ to avoid confusion w/row label $i$.)

- Kruskal-Wallis test can check if they all come from same distribution, but want $H_0$ to allow for different distributions w/same median.

- Estimate median w/"grand median" $\tilde{X}$ of all the $\{x_{jk}\}\equiv \{X_I|I=1\ldots,N\}$ & count points in each sample above or below grand median.

- Since some values can exactly equal the grand median (if $N$ is odd and/or some data values are equal), define categories as $x_{jk}>\tilde{X}$ and $x_{jk}\le\tilde{X}$.

- Arrange number counts into a $2\times c$ contingency table. (Note order of rows, following Conover)

| |  $$j=1$$   |   $$j=2$$   |  $$\cdots$$   | $$j=c$$   | Total |
| ------- | ---------- | ----------| ----------|---------- |-------|
|     $$>$$ | $$O_{11}$$ |  $$O_{12}$$ |  $$\cdots$$ |  $$O_{1c}$$ | $$r_1$$ |
|   $$\le$$ | $$O_{21}$$ |  $$O_{22}$$ |  $$\cdots$$ |  $$O_{2c}$$ | $$r_2$$ |
| **Total** |          $$c_1$$   |   $$c_2$$ |    $$\cdots$$   | $$c_c$$ |   $$N$$ |

- If samples all drawn from dists w/same median, should have same probability of lying above or below the median. (Nominally $1/2$, but could differ for discrete dists.)  So construct standard $\chi^2$ statistic, now for independence of columns:
$T =
  \sum\limits_{j=1}^c \sum\limits_{i=1}^2 \frac{N}{r_ic_j}\left(O_{ij}-\frac{r_ic_j}{N}\right)^2$

- Can simplify $T$ somewhat by noting that $O_{2j}=c_j-O_{1j}$ and $r_2=N-r_1$, so
$$O_{2j}-\frac{r_2c_j}{N} = c_j - O_{1j} - \frac{(N-r_1)c_j}{N}
  = -\left(O_{1j} - \frac{r_1c_j}{N}\right)$$

- and so
$$T = N \sum_{j=1}^c \frac{\left(O_{1j} - r_1c_j/N\right)^2}{c_j}
  \left(\frac{1}{r_1}+\frac{1}{r_2}\right)$$

- but
$\frac{1}{r_1}+\frac{1}{r_2} = \frac{r_2+r_1}{r_1r_2} = \frac{N}{r_1r_2}$
so
$$T = \frac{N^2}{r_1r_2} \sum_{j=1}^c \frac{\left(O_{1j} - r_1c_j/N\right)^2}{c_j}$$

- If none of the original sample values are equal to the grand median, so
that $r_1=r_2=N/2$ by definition, we have the further simplification
$$T = 4 \sum_{j=1}^c \frac{\left(O_{1j} - c_j/2\right)^2}{c_j}
  = \sum_{j=1}^c \frac{\left(O_{1j} - O_{2j}\right)^2}{c_j}
  \quad \hbox{if $r_1=r_2$}$$
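
As a quick numerical check of the simplification above, here is a minimal sketch that evaluates both the double-sum form and the simplified form of $T$ on a small table. The counts in `O_ij` are made up for illustration and are not from the lecture data.

```python
import numpy as np

# Hypothetical 2 x c table of counts (illustration only, not lecture data)
O_ij = np.array([[3, 5, 2],
                 [4, 1, 6]])
r_i = O_ij.sum(axis=1)      # row totals r_1, r_2
c_j = O_ij.sum(axis=0)      # column totals c_j
N = O_ij.sum()

# Standard chi-squared statistic: double sum over rows and columns
E_ij = np.outer(r_i, c_j) / N
T_double = np.sum((O_ij - E_ij)**2 / E_ij)

# Simplified form, using only the first row of the table
T_simple = N**2 / (r_i[0]*r_i[1]) * np.sum((O_ij[0] - r_i[0]*c_j/N)**2 / c_j)

print(T_double, T_simple)   # the two forms should agree up to roundoff
```

Note that $r_1\ne r_2$ in this made-up table, so the further simplification in terms of $O_{1j}-O_{2j}$ would not apply to it.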

Note that if the sample size is not large enough to use the chi-squared
approximation, the setup of the test means that the column totals (the
sizes of the $c$ samples) are fixed, and the row totals are
approximately fixed since we know $r_1\approx\frac{N}{2}\approx r_2$.
The null distribution that’s appropriate is thus the one associated with
fixed marginal totals.

For a demonstration, consider the data from the Kruskal-Wallis lesson:

In [3]:
x_j_k = [np.array([ 14.97,   5.80,  25.03,   5.50 ]),
       np.array([  5.83,  13.96,  21.96]),
       np.array([ 17.89,  23.03,  61.09,   18.62,  55.51])]
grandmedian=np.median(np.concatenate(x_j_k)); grandmedian

18.255000000000003

Note that since $N=12$, the grand median $\tilde{X}$ is between the 6th and 7th largest values, $17.89$ and $18.62$.

We convert the data into a contingency table:

In [4]:
O1_j = np.array([np.sum(xj_k>grandmedian) for xj_k in x_j_k])
O2_j = np.array([np.sum(xj_k<=grandmedian) for xj_k in x_j_k])
r1 = np.sum(O1_j); r2 = np.sum(O2_j); c_j = O1_j + O2_j; N=r1+r2; c=len(c_j)

In [5]:
print('|                |'+('|'.join([(r'$$j=%d$$'%j) for j in range(c)]))+'|Total|');
print('|   ------:      |'+' :---: |'*c+' :-- |');
print(r'| $$>\tilde{X}$$ |'+('|'.join([('  %3s  '%O1j) for O1j in O1_j]))+('|%3s  |'%r1));
print(r'|$$\le\tilde{X}$$|'+('|'.join([('  %3s  '%O2j) for O2j in O2_j]))+('|%3s  |'%r2));
print(r'|    **Total**   |'+('|'.join([('  %3s  '%cj) for cj in c_j]))+('|%3s  |'%N));

|                |$$j=0$$|$$j=1$$|$$j=2$$|Total|
|   ------:      | :---: | :---: | :---: | :-- |
| $$>\tilde{X}$$ |    1  |    1  |    4  |  6  |
|$$\le\tilde{X}$$|    3  |    2  |    1  |  6  |
|    **Total**   |    4  |    3  |    5  | 12  |



We construct $T = \frac{N^2}{r_1r_2} \sum_{j=1}^c \frac{\left(O_{1j} - r_1c_j/N\right)^2}{c_j}$

In [6]:
T = N**2/(r1*r2) * np.sum((O1_j-r1*c_j/N)**2/c_j); T

3.1333333333333333

Because $r_1=r_2=N/2$, we could also use the form $T = \sum_{j=1}^c \frac{\left(O_{1j} - O_{2j}\right)^2}{c_j}$

In [7]:
np.sum((O1_j-O2_j)**2/c_j)

3.1333333333333333


In [8]:
T

3.1333333333333333

We compute the $p$-value assuming it's a $\chi^2$ w/$(2-1)(c-1)=c-1=2$ dof:

In [9]:
stats.chi2(df=c-1).sf(T)

0.20873982339007963
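
As a cross-check, SciPy has a built-in `scipy.stats.median_test`. The sketch below assumes that `ties='below'` reproduces the $>$ / $\le$ categories used here and that `correction=False` gives the plain $\chi^2$ statistic; the data are retyped from the cell above so that the block runs on its own.

```python
import numpy as np
from scipy import stats

samples = [np.array([14.97, 5.80, 25.03, 5.50]),
           np.array([5.83, 13.96, 21.96]),
           np.array([17.89, 23.03, 61.09, 18.62, 55.51])]

# ties='below': values equal to the grand median are counted in the <= row
# correction=False: no continuity correction
stat, pvalue, med, table = stats.median_test(*samples, ties='below',
                                             correction=False)
print(stat, pvalue, med)
print(table)   # first row: counts above the grand median; second row: the rest
```

The statistic, $p$-value and table should match the hand-built values above.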

But the numbers are small; is the $\chi^2$ approximation really valid?

There are $\frac{12!}{4!3!5!}=27720$ different ways to divide the $N=12$ values among samples of sizes $4$, $3$ and $5$.

In [10]:
from scipy.special import factorial
factorial(12)/(factorial(4)*factorial(3)*factorial(5)) # Not the best way to calculate this, but works for small numbers

27720.0
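
One way to avoid floating-point factorials is to build the multinomial coefficient from exact integer binomial coefficients. This is a sketch using the standard-library `math.comb` (Python 3.8+); the helper name `multinomial` is made up here.

```python
from math import comb

def multinomial(ns):
    """Number of ways to split sum(ns) distinct items into groups of sizes ns."""
    total, remaining = 1, sum(ns)
    for n in ns:
        total *= comb(remaining, n)   # choose the members of this group
        remaining -= n
    return total

print(multinomial([4, 3, 5]))  # 27720
```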

In [11]:
# We can loop through the arrangements of the 12 values using this function from PS 05.1:
import itertools
def multinomial_combinations(items, ns):
    """Yield every way to split `items` into groups of sizes `ns` (in order)."""
    if len(ns) == 1:
        for c in itertools.combinations(items, ns[0]):
            yield (c,)
    else:
        for c_first in itertools.combinations(items, ns[0]):
            # The set difference assumes the items are all distinct
            items_remaining = set(items) - set(c_first)
            for c_other in multinomial_combinations(items_remaining, ns[1:]):
                yield (c_first,) + c_other

We could use the actual numbers, but as long as there are no ties, the ranks work as well:

In [12]:
R_r = np.arange(1,N+1); Rbar = (N+1)/2; Rbar,np.mean(R_r)

(6.5, 6.5)

In [13]:
R_I_j_k = [R_j_k for R_j_k in multinomial_combinations(R_r,c_j)]; R_I_j_k

[((1, 2, 3, 4), (5, 6, 7), (8, 9, 10, 11, 12)),
 ((1, 2, 3, 4), (5, 6, 8), (7, 9, 10, 11, 12)),
 ((1, 2, 3, 4), (5, 6, 9), (7, 8, 10, 11, 12)),
 ((1, 2, 3, 4), (5, 6, 10), (7, 8, 9, 11, 12)),
 ((1, 2, 3, 4), (5, 6, 11), (7, 8, 9, 10, 12)),
 ((1, 2, 3, 4), (5, 6, 12), (7, 8, 9, 10, 11)),
 ((1, 2, 3, 4), (5, 7, 8), (6, 9, 10, 11, 12)),
 ((1, 2, 3, 4), (5, 7, 9), (6, 8, 10, 11, 12)),
 ((1, 2, 3, 4), (5, 7, 10), (6, 8, 9, 11, 12)),
 ((1, 2, 3, 4), (5, 7, 11), (6, 8, 9, 10, 12)),
 ((1, 2, 3, 4), (5, 7, 12), (6, 8, 9, 10, 11)),
 ((1, 2, 3, 4), (5, 8, 9), (6, 7, 10, 11, 12)),
 ((1, 2, 3, 4), (5, 8, 10), (6, 7, 9, 11, 12)),
 ((1, 2, 3, 4), (5, 8, 11), (6, 7, 9, 10, 12)),
 ((1, 2, 3, 4), (5, 8, 12), (6, 7, 9, 10, 11)),
 ((1, 2, 3, 4), (5, 9, 10), (6, 7, 8, 11, 12)),
 ((1, 2, 3, 4), (5, 9, 11), (6, 7, 8, 10, 12)),
 ((1, 2, 3, 4), (5, 9, 12), (6, 7, 8, 10, 11)),
 ((1, 2, 3, 4), (5, 10, 11), (6, 7, 8, 9, 12)),
 ((1, 2, 3, 4), (5, 10, 12), (6, 7, 8, 9, 11)),
 ((1, 2, 3, 4), (5, 11, 12), (6, 7, 8, 9

In [14]:
# Convert each tuple of ranks to an array so the comparison is elementwise
O1_Ij = np.array([[np.sum(np.array(R_j_k[j])>Rbar) for j in range(c)] for R_j_k in R_I_j_k]); O1_Ij

array([[0, 1, 5],
       [0, 1, 5],
       [0, 1, 5],
       ...,
       [4, 1, 1],
       [4, 2, 0],
       [4, 2, 0]])

We can check that the $\{O^{(I)}_{1j}\}$ we've constructed satisfy $\sum_{j=1}^c O^{(I)}_{1j}=r_1$:

In [15]:
np.unique(np.sum(O1_Ij,axis=-1)), r1

(array([6]), 6)

And then we can compute $O^{(I)}_{2j}=c_j-O^{(I)}_{1j}$:

In [16]:
O2_Ij = c_j[None,:] - O1_Ij
np.unique(np.sum(O2_Ij,axis=-1)), r2

(array([6]), 6)

Now we compute $T^{(I)} = \sum_{j=1}^c \frac{\left(O^{(I)}_{1j} - O^{(I)}_{2j}\right)^2}{c_j}$.  Avoid roundoff by multiplying by common denominator $c_1c_2c_3=60$.

In [17]:
T_I = np.round(60*np.sum((O1_Ij-O2_Ij)**2/c_j[None,:],axis=-1))/60; np.unique(T_I)

array([0.53333333, 1.53333333, 3.13333333, 4.2       , 4.8       ,
       6.13333333, 7.2       , 9.        , 9.33333333])

Recompute $T$ to avoid roundoff there:

In [18]:
T = np.round(60*np.sum((O1_j-O2_j)**2/c_j))/60; T

3.1333333333333333
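
An alternative to rounding, sketched here, is to do the arithmetic in exact rationals with the standard-library `fractions.Fraction`, so that equal values of $T$ compare as exactly equal. The counts are retyped from the contingency table above.

```python
from fractions import Fraction

O1_j = [1, 1, 4]   # counts above the grand median
O2_j = [3, 2, 1]   # counts at or below the grand median
c_j = [a + b for a, b in zip(O1_j, O2_j)]

# Exact rational version of T = sum_j (O1j - O2j)^2 / c_j (valid since r1 = r2)
T_exact = sum(Fraction((a - b)**2, cj) for a, b, cj in zip(O1_j, O2_j, c_j))
print(T_exact, float(T_exact))   # T_exact is the fraction 47/15
```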

We see that the exact $p$-value is indeed different from the $\chi^2$ approximation

In [19]:
np.mean(T_I>=T), stats.chi2(df=c-1).sf(T)

(0.35064935064935066, 0.20873982339007963)

but note that, because the exact null distribution of $T$ is discrete, $P(\color{royalblue}{T}\mathbin{\ge} t)\ne P(\color{royalblue}{T}\mathbin{>} t)$:

In [20]:
np.mean(T_I>T)

0.22077922077922077
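
Since $T$ depends on an arrangement only through the counts $\{O_{1j}\}$, the same exact $p$-values can be obtained without looping over all 27720 arrangements. With fixed margins, the counts follow a multivariate hypergeometric distribution, $P(\{O_{1j}\})=\prod_j\binom{c_j}{O_{1j}}\big/\binom{N}{r_1}$. The following is a sketch of that alternative, not the method used in the cells above; the function name `exact_pvalues` and the tolerance `tol` are choices made here.

```python
from itertools import product
from math import comb
import numpy as np

def exact_pvalues(O1_obs, c_j, tol=1e-9):
    """Return (P(T >= t), P(T > t)) under fixed row and column totals."""
    O1_obs = np.asarray(O1_obs); c_j = np.asarray(c_j)
    N = int(c_j.sum()); r1 = int(O1_obs.sum()); r2 = N - r1

    def T_of(O1):
        return N**2 / (r1*r2) * np.sum((O1 - r1*c_j/N)**2 / c_j)

    t_obs = T_of(O1_obs)
    p_ge = p_gt = 0.0
    # Loop over all first-row count vectors with 0 <= O1j <= c_j
    for O1 in product(*[range(int(cj) + 1) for cj in c_j]):
        if sum(O1) != r1:
            continue   # keep only tables with the observed row total
        # Multivariate hypergeometric probability of this table
        prob = np.prod([comb(int(cj), o) for cj, o in zip(c_j, O1)]) / comb(N, r1)
        t = T_of(np.array(O1))
        p_ge += prob * (t >= t_obs - tol)
        p_gt += prob * (t > t_obs + tol)
    return p_ge, p_gt

print(exact_pvalues([1, 1, 4], [4, 3, 5]))
```

For the table above this should reproduce the two values found by enumeration, $324/924\approx0.3506$ and $204/924\approx0.2208$. The tolerance guards against roundoff when different tables give the same $T$.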

What if some values are equal to the grand median?

In [21]:
x_j_k = [np.array([ 14.97,   5.80,  25.03,   5.50 ]),
       np.array([  5.83,  13.96,  21.96]),
       np.array([ 18,  23.03,  61.09,   18,  55.51])]
grandmedian=np.median(np.concatenate(x_j_k)); grandmedian

18.0

Note that since $N=12$, the grand median $\tilde{X}$ is the average of the 6th and 7th values in sorted order, which are both $18$, so two data values are exactly equal to $\tilde{X}$.

We convert the data into a contingency table:

In [22]:
O1_j = np.array([np.sum(xj_k>grandmedian) for xj_k in x_j_k])
O2_j = np.array([np.sum(xj_k<=grandmedian) for xj_k in x_j_k])
r1 = np.sum(O1_j); r2 = np.sum(O2_j); c_j = O1_j + O2_j; N=r1+r2; c=len(c_j)

In [23]:
print('|                |'+('|'.join([(r'$$j=%d$$'%j) for j in range(c)]))+'|Total|');
print('|   ------:      |'+' :---: |'*c+' :-- |');
print(r'| $$>\tilde{X}$$ |'+('|'.join([('  %3s  '%O1j) for O1j in O1_j]))+('|%3s  |'%r1));
print(r'|$$\le\tilde{X}$$|'+('|'.join([('  %3s  '%O2j) for O2j in O2_j]))+('|%3s  |'%r2));
print(r'|    **Total**   |'+('|'.join([('  %3s  '%cj) for cj in c_j]))+('|%3s  |'%N));

|                |$$j=0$$|$$j=1$$|$$j=2$$|Total|
|   ------:      | :---: | :---: | :---: | :-- |
| $$>\tilde{X}$$ |    1  |    1  |    3  |  5  |
|$$\le\tilde{X}$$|    3  |    2  |    2  |  7  |
|    **Total**   |    4  |    3  |    5  | 12  |



Note that now $r_1=5$ and $r_2=7$ are not equal to $N/2=6$, because the two values tied at the grand median both count in the $\le$ row.

We construct $T = \frac{N^2}{r_1r_2} \sum_{j=1}^c \frac{\left(O_{1j} - r_1c_j/N\right)^2}{c_j}$

In [24]:
T = N**2/(r1*r2) * np.sum((O1_j-r1*c_j/N)**2/c_j); T

1.234285714285714

Because $r_1\ne r_2$, we can't use the form $T = \sum_{j=1}^c \frac{\left(O_{1j} - O_{2j}\right)^2}{c_j}$, which gives a different value:

In [25]:
np.sum((O1_j-O2_j)**2/c_j)

1.5333333333333332


In [26]:
T

1.234285714285714

We compute the $p$-value assuming it's a $\chi^2$ w/$(2-1)(c-1)=c-1=2$ dof:

In [27]:
stats.chi2(df=c-1).sf(T)

0.5394836194862993
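
The same statistic and $p$-value can be cross-checked with `scipy.stats.chi2_contingency`, which handles unequal row totals automatically. A minimal sketch, with the table retyped from above; `correction=False` is passed to make explicit that no continuity correction is wanted.

```python
import numpy as np
from scipy import stats

table = np.array([[1, 1, 3],    # counts above the grand median
                  [3, 2, 2]])   # counts at or below the grand median

# Returns the statistic, p-value, degrees of freedom and expected counts
chi2_stat, pvalue, dof, expected = stats.chi2_contingency(table, correction=False)
print(chi2_stat, pvalue, dof)
print(expected)   # expected counts r_i c_j / N
```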

But the numbers are small; is the $\chi^2$ approximation really valid?

There are still $\frac{12!}{4!3!5!}=27720$ different ways to divide the $N=12$ values among the three samples.

We can still use the ranks, but now there are ties:

In [28]:
R_r = np.sort(stats.rankdata(np.concatenate(x_j_k))); R_r

array([ 1. ,  2. ,  3. ,  4. ,  5. ,  6.5,  6.5,  8. ,  9. , 10. , 11. ,
       12. ])

In [29]:
Rbar = (N+1)/2; Rbar,np.mean(R_r)

(6.5, 6.5)

It's a little tricky to loop through the different groupings, since the `multinomial_combinations` function doesn't work if the elements are not unique:

In [30]:
[R_j_k for R_j_k in multinomial_combinations(R_r,c_j)]

[((1.0, 2.0, 6.5, 6.5), (3.0, 4.0, 5.0), (8.0, 9.0, 10.0, 11.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 4.0, 8.0), (5.0, 9.0, 10.0, 11.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 4.0, 9.0), (5.0, 8.0, 10.0, 11.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 4.0, 10.0), (5.0, 8.0, 9.0, 11.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 4.0, 11.0), (5.0, 8.0, 9.0, 10.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 4.0, 12.0), (5.0, 8.0, 9.0, 10.0, 11.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 5.0, 8.0), (4.0, 9.0, 10.0, 11.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 5.0, 9.0), (4.0, 8.0, 10.0, 11.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 5.0, 10.0), (4.0, 8.0, 9.0, 11.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 5.0, 11.0), (4.0, 8.0, 9.0, 10.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 5.0, 12.0), (4.0, 8.0, 9.0, 10.0, 11.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 8.0, 9.0), (4.0, 5.0, 10.0, 11.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 8.0, 10.0), (4.0, 5.0, 9.0, 11.0, 12.0)),
 ((1.0, 2.0, 6.5, 6.5), (3.0, 8.0, 11.0), (4.0, 5.0, 9.0, 10.0, 

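Note that the tied ranks always get placed in the first group: the function builds the remaining items with `set`, which merges the two equal ranks, so any grouping that separates them or puts them in a later group is never generated.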
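
To see the problem in isolation: the function forms the remaining items with `set`, which merges equal values. The sketch below (the generator is copied from the earlier cell so that the block runs on its own, and the ranks are typed in as a plain list) counts how many groupings come out for the tied ranks.

```python
import itertools

def multinomial_combinations(items, ns):
    # Copied from the earlier cell
    if len(ns) == 1:
        for c in itertools.combinations(items, ns[0]):
            yield (c,)
    else:
        for c_first in itertools.combinations(items, ns[0]):
            items_remaining = set(items) - set(c_first)
            for c_other in multinomial_combinations(items_remaining, ns[1:]):
                yield (c_first,) + c_other

ranks = [1, 2, 3, 4, 5, 6.5, 6.5, 8, 9, 10, 11, 12]
groupings = list(multinomial_combinations(ranks, [4, 3, 5]))

# set() keeps only one copy of the tied rank 6.5 ...
print(len(ranks), len(set(ranks)))
# ... so far fewer than the 27720 groupings are generated
print(len(groupings))
# every surviving grouping has both tied ranks in its first group
print(all(g[0].count(6.5) == 2 for g in groupings))
```

Whenever the first group does not contain both copies of 6.5, the remaining set is one item short and the last group cannot be filled, so the generator silently skips those groupings.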

Instead we have to use the groupings of the indices 0 to 11 and then use those indices to access elements from the list of ranks:

In [31]:
r_r = np.arange(N); r_r, R_r[r_r]

(array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]),
 array([ 1. ,  2. ,  3. ,  4. ,  5. ,  6.5,  6.5,  8. ,  9. , 10. , 11. ,
        12. ]))

In [32]:
r_I_j_k = [r_j_k for r_j_k in multinomial_combinations(r_r,c_j)]; r_I_j_k

[((0, 1, 2, 3), (4, 5, 6), (7, 8, 9, 10, 11)),
 ((0, 1, 2, 3), (4, 5, 7), (6, 8, 9, 10, 11)),
 ((0, 1, 2, 3), (4, 5, 8), (6, 7, 9, 10, 11)),
 ((0, 1, 2, 3), (4, 5, 9), (6, 7, 8, 10, 11)),
 ((0, 1, 2, 3), (4, 5, 10), (6, 7, 8, 9, 11)),
 ((0, 1, 2, 3), (4, 5, 11), (6, 7, 8, 9, 10)),
 ((0, 1, 2, 3), (4, 6, 7), (5, 8, 9, 10, 11)),
 ((0, 1, 2, 3), (4, 6, 8), (5, 7, 9, 10, 11)),
 ((0, 1, 2, 3), (4, 6, 9), (5, 7, 8, 10, 11)),
 ((0, 1, 2, 3), (4, 6, 10), (5, 7, 8, 9, 11)),
 ((0, 1, 2, 3), (4, 6, 11), (5, 7, 8, 9, 10)),
 ((0, 1, 2, 3), (4, 7, 8), (5, 6, 9, 10, 11)),
 ((0, 1, 2, 3), (4, 7, 9), (5, 6, 8, 10, 11)),
 ((0, 1, 2, 3), (4, 7, 10), (5, 6, 8, 9, 11)),
 ((0, 1, 2, 3), (4, 7, 11), (5, 6, 8, 9, 10)),
 ((0, 1, 2, 3), (4, 8, 9), (5, 6, 7, 10, 11)),
 ((0, 1, 2, 3), (4, 8, 10), (5, 6, 7, 9, 11)),
 ((0, 1, 2, 3), (4, 8, 11), (5, 6, 7, 9, 10)),
 ((0, 1, 2, 3), (4, 9, 10), (5, 6, 7, 8, 11)),
 ((0, 1, 2, 3), (4, 9, 11), (5, 6, 7, 8, 10)),
 ((0, 1, 2, 3), (4, 10, 11), (5, 6, 7, 8, 9)),
 ((0, 1, 2, 3

In [33]:
O1_Ij = np.array([[np.sum(R_r[np.array(r_j_k[j])]>Rbar) for j in range(c)] for r_j_k in r_I_j_k]); O1_Ij

array([[0, 0, 5],
       [0, 1, 4],
       [0, 1, 4],
       ...,
       [4, 1, 0],
       [4, 1, 0],
       [4, 1, 0]])

We can check that the $\{O^{(I)}_{1j}\}$ we've constructed satisfy $\sum_{j=1}^c O^{(I)}_{1j}=r_1=5$ (not $6$):

In [34]:
np.unique(np.sum(O1_Ij,axis=-1)), r1

(array([5]), 5)

And then we can compute $O^{(I)}_{2j}=c_j-O^{(I)}_{1j}$:

In [35]:
O2_Ij = c_j[None,:] - O1_Ij
np.unique(np.sum(O2_Ij,axis=-1)), r2

(array([7]), 7)

Now we compute $T^{(I)} = \frac{N^2}{r_1r_2}\sum_{j=1}^c \frac{\left(O^{(I)}_{1j} - r_1c_j/N\right)^2}{c_j}$.  Avoid roundoff by rounding to 8 decimal places:

In [36]:
T_I = np.round(N**2/(r1*r2)*np.sum((O1_Ij-r1*c_j[None,:]/N)**2/c_j[None,:],axis=-1),8)
T = np.round(T,8); T, np.unique(T_I)

(1.23428571,
 array([ 0.20571429,  1.23428571,  1.85142857,  2.88      ,  2.94857143,
         3.97714286,  4.32      ,  5.62285714,  5.96571429,  6.17142857,
         7.06285714,  7.88571429,  8.70857143,  9.25714286, 12.        ]))

We see that the exact $p$-value is indeed different from the $\chi^2$ approximation

In [37]:
np.mean(T_I>=T), stats.chi2(df=c-1).sf(T)

(0.7727272727272727, 0.5394836206423356)

but note that, because the exact null distribution of $T$ is discrete, $P(\color{royalblue}{T}\mathbin{\ge} t)\ne P(\color{royalblue}{T}\mathbin{>} t)$:

In [38]:
np.mean(T_I>T)

0.4696969696969697
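
Full enumeration is feasible here only because $N=12$. For larger samples, one option (not covered in the cells above) is to approximate the fixed-margins null distribution by randomly shuffling the pooled values among the samples. The sketch below does this for the tied data; the function name `mood_T`, the seed, and the number of shuffles are arbitrary choices.

```python
import numpy as np

samples = [np.array([14.97, 5.80, 25.03, 5.50]),
           np.array([5.83, 13.96, 21.96]),
           np.array([18, 23.03, 61.09, 18, 55.51])]
pooled = np.concatenate(samples)
sizes = [len(s) for s in samples]
splits = np.cumsum(sizes)[:-1]          # where to cut the pooled array
above = pooled > np.median(pooled)      # True for values above the grand median

def mood_T(above_flags):
    """Mood's statistic from a pooled array of above-median flags."""
    c_j = np.array(sizes)
    O1_j = np.array([np.sum(part) for part in np.split(above_flags, splits)])
    N = c_j.sum(); r1 = O1_j.sum(); r2 = N - r1
    return N**2 / (r1*r2) * np.sum((O1_j - r1*c_j/N)**2 / c_j)

rng = np.random.default_rng(345)        # arbitrary seed
T_obs = mood_T(above)
# Shuffling the flags reassigns values to samples with both margins fixed
T_sim = np.array([mood_T(rng.permutation(above)) for _ in range(5000)])

# Small tolerance so equal values of T are not split by roundoff
p_ge = np.mean(T_sim >= T_obs - 1e-9)
print(T_obs, p_ge)
```

With enough shuffles, `p_ge` should approach the exact value $P(T\ge t)\approx0.773$ found by enumeration. The grand median does not change under shuffling, so it is enough to permute the above-median flags.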