# STAT 345: Nonparametric Statistics

## Lesson 07.1: The Friedman and Quade Tests

**Reading: Conover Section 5.8**

*Prof. John T. Whelan*

Thursday 6 March 2025

These lecture slides are in a computational notebook.  You have access to them through http://vmware.rit.edu/

Flat HTML and slideshow versions are also in MyCourses.

The notebook can run Python commands (other notebooks can use R or Julia; "Ju-Pyt-R").  Think: computational data analysis, not "coding".

Standard commands to activate inline interface and import libraries:

In [1]:
%matplotlib inline

In [2]:
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = (8.0,5.0)
plt.rcParams['font.size'] = 14

## The Complete Block Design
So far: rank-based tests on paired and
independent samples. We can categorize them as:

Independent samples:
  -   Two independent samples: Wilcoxon rank sum (Mann-Whitney), Conover squared ranks

  -   $k$ independent samples: Kruskal-Wallis (generalization of rank sum), Conover squared ranks

Paired samples:
-   Two paired samples: Sign test, Wilcoxon signed rank (also correlation coëfficients)

Now extend paired samples to "blocks" of $k$ related observations.

- Instead of $n$ paired samples, consider $b$ "blocks" of $k$ "treatments"

- Name comes from experimental design with $b$ groups of $k$ subjects each.<br>Subjects in a block are considered identical, but different blocks may not be.

- "Randomized complete block design" ("Randomized" means we randomly selected which of the $k$ subjects in each block receives which treatment.)

- The result of the experiment is a $b\times k$ matrix of observations: $$\{X_{ij}\} =
  \begin{pmatrix}
    X_{11} & X_{12} & \cdots & X_{1k} \\
    X_{21} & X_{22} & \cdots & X_{2k} \\
    \vdots & \vdots & \ddots & \vdots \\
    X_{b1} & X_{b2} & \cdots & X_{bk} \\
  \end{pmatrix}$$

- $H_0$: within a block, each treatment is as likely to give a larger result as a smaller one when compared with any other treatment,
$P({\color{royalblue}{X_{ij}}}{\mathbin{>}}{\color{royalblue}{X_{i\ell}}})=P({\color{royalblue}{X_{ij}}}{\mathbin{<}}{\color{royalblue}{X_{i\ell}}})$ for all $i=1,\ldots,b$, $j,\ell=1,\ldots,k$

In [3]:
X_ij = np.array([[  2.  ,  19.86,   9.17],
                 [  1.05,   3.1 ,   3.34],
                 [  0.14,  25.4 ,  26.59],
                 [ 14.6 ,   3.93,  10.95]]); b,k = np.shape(X_ij); print('%d blocks of %d treatments' % (b,k))

4 blocks of 3 treatments


We define $\{R_{ij}|j=1,\ldots,k\}$ to be the ranks of the responses to
the $k$ treatments within block $i$:

In [4]:
R_ij = stats.rankdata(X_ij,axis=-1); R_ij

array([[1., 3., 2.],
       [1., 2., 3.],
       [1., 2., 3.],
       [3., 1., 2.]])

$$\hbox{and let}\quad R_j = \sum_{i=1}^b R_{ij}\quad\hbox{be the sum of ranks for treatment $j$.}$$

In [5]:
R_j = np.sum(R_ij,axis=0); R_j

array([ 6.,  8., 10.])

Note that the minimum possible value for $R_j$ is $b$ and the maximum is
$kb$.

In our specific example,

In [6]:
R_ij, R_j

(array([[1., 3., 2.],
        [1., 2., 3.],
        [1., 2., 3.],
        [3., 1., 2.]]),
 array([ 6.,  8., 10.]))

the sums of the ranks for each treatment are
$$\begin{gathered}
    R_1 = 1+1+1+3=6 \\
    R_2 = 3+2+2+1=8 \\
    R_3 = 2+3+3+2=10
  \end{gathered}$$

**WARNING**: be sure to keep straight which are the blocks and which are the treatments!<br>(You are comparing the treatments, not the blocks.)

## The Friedman Test

- Friedman (as in Milton) test makes a $\chi^2$ statistic from the $\{R_j=\sum_{i=1}^b R_{ij}\}$.

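- Under $H_0$, $E({\color{royalblue}{R_{ij}}})=\frac{k+1}{2}$ and, if there are no
ties,
$\operatorname{Var}({\color{royalblue}{R_{ij}}})=\frac{(k+1)(k-1)}{12}$

$\color{royalblue}{R_{ij}}$ and $\color{royalblue}{R_{\ell j}}$ are independent random variables for $i\ne\ell$, so $E({\color{royalblue}{R_{j}}})=\frac{b(k+1)}{2}$ and, if there are no ties,
$\operatorname{Var}({\color{royalblue}{R_{j}}})=\frac{b(k+1)(k-1)}{12}$. The ${\color{royalblue}{R_{j}}}$ for different $j$ are not independent of each other (their sum is fixed), which is why the statistic below carries a factor of $\frac{k-1}{k}$ relative to $\sum_{j=1}^k \left({\color{royalblue}{R_j}}-E({\color{royalblue}{R_j}})\right)^2/\operatorname{Var}({\color{royalblue}{R_j}})$.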
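
The moments of a single within-block rank can be checked numerically. This is only a sketch, assuming no ties, so that $R_{ij}$ is equally likely to be any of $1,\ldots,k$; the variable names are illustrative. It confirms a mean of $\frac{k+1}{2}$ and a variance of $\frac{k^2-1}{12}=\frac{(k+1)(k-1)}{12}$.

```python
import numpy as np

k = 3                          # number of treatments (illustrative choice)
ranks = np.arange(1, k + 1)    # with no ties, R_ij is uniform on 1..k

mean_rank = ranks.mean()       # compare with (k+1)/2
var_rank = ranks.var()         # population variance; compare with (k^2-1)/12

print(mean_rank, (k + 1) / 2)
print(var_rank, (k**2 - 1) / 12)
```

Multiplying both by $b$ gives the mean and variance of the rank sum $R_j$, since the ranks in different blocks are independent.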

$$\hbox{So}\qquad {\color{royalblue}{T_1}}
  =\frac{12}{bk(k+1)}\sum_{j=1}^k \left({\color{royalblue}{R_j}}-\frac{b(k+1)}{2}\right)^2
  =\frac{(k-1)\sum_{j=1}^k\left({\color{royalblue}{R_j}}-b\overline{R}\right)^2}
  {\sum_{i=1}^b\sum_{j=1}^k\left(\color{royalblue}{R_{ij}}-\overline{R}\right)^2}$$
is approximately chi-squared with $k-1$ degrees of freedom under $H_0$.

- $k-1$ degrees of freedom because of constraint $\sum_{j=1}^k R_j =
\frac{bk(k+1)}{2}$.
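
The constraint holds for any data set, because each block contributes ranks that sum to $\frac{k(k+1)}{2}$. A minimal check on made-up data (the seed and the simulated array `X` are assumptions for illustration, not part of the lecture example):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)     # arbitrary seed, for reproducibility only
b, k = 4, 3                        # same shape as the lecture example
X = rng.normal(size=(b, k))        # made-up continuous data, so ties have probability zero

R_ij = stats.rankdata(X, axis=-1)  # rank within each block (row)
R_j = R_ij.sum(axis=0)             # rank sum for each treatment (column)

print(R_j, R_j.sum(), b * k * (k + 1) / 2)
```

Once $k-1$ of the rank sums are known, the last one is determined, which is where the lost degree of freedom goes.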

In [7]:
T1 = (12/(b*k*(k+1)))*np.sum((R_j-0.5*b*(k+1))**2); T1

2.0

The second form also works if there are ties; if there are no ties, the two forms are equal:

In [8]:
Rbar = 0.5*(k+1); (k-1)*np.sum((R_j-b*Rbar)**2)/np.sum((R_ij-Rbar)**2)

2.0

In [9]:
stats.chi2(df=k-1).sf(T1)

0.36787944117144245

In [10]:
stats.friedmanchisquare(*X_ij.T) # The built-in takes one 1-D sample per treatment rather than a 2-D array, so unpack the columns

FriedmanchisquareResult(statistic=2.0, pvalue=0.36787944117144245)
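
The tie-tolerant second form of $T_1$ can be wrapped in a small helper and compared with the built-in when ties are present. This is a sketch rather than a reference implementation: the function name `friedman_T1` and the tied data set `X_tied` are made up for illustration.

```python
import numpy as np
from scipy import stats

def friedman_T1(X):
    """Friedman statistic and chi-squared p-value from a (blocks x treatments)
    array, using the second (tie-tolerant) form of T_1."""
    X = np.asarray(X, dtype=float)
    b, k = X.shape
    R_ij = stats.rankdata(X, axis=-1)   # midranks within each block
    R_j = R_ij.sum(axis=0)              # rank sum for each treatment
    Rbar = 0.5 * (k + 1)
    T1 = (k - 1) * np.sum((R_j - b * Rbar)**2) / np.sum((R_ij - Rbar)**2)
    return T1, stats.chi2(df=k - 1).sf(T1)

# Made-up data with a tie in the first block (not from the lecture)
X_tied = np.array([[1.0, 1.0, 2.0],
                   [0.5, 2.5, 1.5],
                   [3.0, 4.0, 5.0],
                   [2.0, 1.0, 3.0]])
T1_tied, p_tied = friedman_T1(X_tied)
builtin = stats.friedmanchisquare(*X_tied.T)
print(T1_tied, builtin.statistic)
print(p_tied, builtin.pvalue)
```

The two should agree, since the tie correction applied by `stats.friedmanchisquare` is algebraically the same as using the actual sum of squared rank deviations in the denominator.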

Note that the normal approximation that leads to the approximate
chi-squared distribution is not very good for this small sample size.

Conover asserts that we get a better approximation with the transformed
statistic (from applying two-way ANOVA to the ranks)
$$T_2 = \frac{(b-1)T_1}{b(k-1)-T_1}$$ which should have
an $F$ distribution with degree-of-freedom parameters $k-1$ and
$(b-1)(k-1)$.

In [11]:
T2 = ((b-1)*T1)/(b*(k-1)-T1); T2

1.0

In [12]:
stats.f(k-1,(b-1)*(k-1)).sf(T2), stats.chi2(df=k-1).sf(T1)

(0.421875, 0.36787944117144245)

An interesting exercise is to work out the exact distribution for these
statistics for this case. In general there are $(k!)^b$ different
arrangements of ranks possible with no ties; in this case that is
$6^4=1296$.  (You'll explore this on the homework.)
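
One possible way to set up that enumeration is sketched below, assuming no ties so that all $(k!)^b$ arrangements are equally likely under $H_0$. The variable names and the small round-off tolerance are choices made here, not part of the lecture.

```python
import itertools
import numpy as np

b, k = 4, 3
# The k! possible rankings of a single block
perms = np.array(list(itertools.permutations(range(1, k + 1))))

# One ranking per block: (k!)^b equally likely arrangements under H0
T1_all = []
for choice in itertools.product(range(len(perms)), repeat=b):
    R_j = perms[list(choice)].sum(axis=0)   # rank sums for this arrangement
    T1_all.append(12 / (b * k * (k + 1)) * np.sum((R_j - 0.5 * b * (k + 1))**2))
T1_all = np.array(T1_all)

T1_obs = 2.0                                 # observed value from the lecture example
p_exact = np.mean(T1_all >= T1_obs - 1e-9)   # tolerance guards against round-off
print(len(T1_all), p_exact)
```

The same array `T1_all` can be transformed to $T_2$ to compare the exact tail probability with the $F$ approximation.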

Note that, although we use ranks, this is actually the $k$-sample
analog of the sign test, not of the signed rank test. For $k=2$, the ranking within block $i$ carries the same
information as the sign of $y_i-x_i$, which we could rename as $X_{i2}-X_{i1}$: it is
equivalent to the ordering of $X_{i1}$ and $X_{i2}$.
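
For $k=2$ with no ties, substituting $R_2 = b + n_+$ (where $n_\pm$ count the positive and negative differences) into $T_1$ gives $T_1 = (n_+-n_-)^2/b$, the square of the normal-approximation sign test statistic. A quick numerical check on made-up paired data (the seed and sample size are arbitrary assumptions; $T_1$ is computed by hand here rather than with the built-in):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)   # arbitrary seed
b, k = 10, 2
X = rng.normal(size=(b, k))      # made-up paired data: 2 treatments per block

# Friedman statistic computed by hand (first form, valid with no ties)
R_j = stats.rankdata(X, axis=-1).sum(axis=0)
T1 = 12 / (b * k * (k + 1)) * np.sum((R_j - 0.5 * b * (k + 1))**2)

# Sign-test counts for the differences X_i2 - X_i1
n_plus = np.sum(X[:, 1] > X[:, 0])
n_minus = np.sum(X[:, 1] < X[:, 0])

print(T1, (n_plus - n_minus)**2 / b)
```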

## The Quade Test

Quade, [*Journal of the American Statistical Association*, **74**, 680 (1979)](https://www.jstor.org/stable/2286991).

To get an analog of the Wilcoxon signed rank statistic, we need the
equivalent of the magnitude of the difference.
The obvious choice is the spread of the values $X_{ij}$ within block $i$, which we write as
$$M_i = \max_{j} X_{ij} - \min_{j} X_{ij}$$

In [13]:
X_ij

array([[ 2.  , 19.86,  9.17],
       [ 1.05,  3.1 ,  3.34],
       [ 0.14, 25.4 , 26.59],
       [14.6 ,  3.93, 10.95]])

In [14]:
M_i = np.max(X_ij,axis=-1)-np.min(X_ij,axis=-1); M_i

array([17.86,  2.29, 26.45, 10.67])

The ranks of these ranges are called $Q_i$:

In [15]:
Q_i = stats.rankdata(M_i); Q_i

array([3., 1., 4., 2.])

and the equivalent of the signed ranks are then
$$S_{ij} = Q_i \left(R_{ij} - \frac{k+1}{2}\right)$$

In [16]:
S_ij = Q_i[:,None]*(R_ij-0.5*(k+1)); S_ij

array([[-3.,  3.,  0.],
       [-1.,  0.,  1.],
       [-4.,  0.,  4.],
       [ 2., -2.,  0.]])

Here the $Q_i$ play the role of the ranks of the magnitudes,
and the quantity in parentheses is the generalization of the sign of the
difference. The statistic is then constructed out of the sums of these,
$S_j = \sum_{i=1}^b S_{ij}$

In [17]:
S_j = np.sum(S_ij,axis=0); S_j

array([-6.,  1.,  5.])

and the test statistic is
$$T_3 = \frac{(b-1)\frac{1}{b}\sum_{j=1}^k S_j^2}
  {\sum_{i=1}^b\sum_{j=1}^k S_{ij}^2-\frac{1}{b}\sum_{j=1}^k S_j^2}$$
which is again supposed to be $F(k-1,(b-1)(k-1))$-distributed.

In [18]:
B = np.sum(S_j**2)/b; A2 = np.sum(S_ij**2); T3 = (b-1)*B/(A2-B); B, A2, T3

(15.5, 60.0, 1.0449438202247192)

In [19]:
stats.f(k-1,(b-1)*(k-1)).sf(T3)

0.40796817129629637
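
The steps of the Quade test can be collected into one function. This is a minimal sketch that simply strings together the cells above; the name `quade_test` is made up here, and it does not guard against the degenerate case where the denominator vanishes.

```python
import numpy as np
from scipy import stats

def quade_test(X):
    """Quade statistic T_3 and its F-approximation p-value for a
    (blocks x treatments) array."""
    X = np.asarray(X, dtype=float)
    b, k = X.shape
    R_ij = stats.rankdata(X, axis=-1)        # ranks within each block
    M_i = X.max(axis=-1) - X.min(axis=-1)    # range of each block
    Q_i = stats.rankdata(M_i)                # ranks of the block ranges
    S_ij = Q_i[:, None] * (R_ij - 0.5 * (k + 1))
    S_j = S_ij.sum(axis=0)
    B = np.sum(S_j**2) / b
    A2 = np.sum(S_ij**2)
    T3 = (b - 1) * B / (A2 - B)
    return T3, stats.f(k - 1, (b - 1) * (k - 1)).sf(T3)

# The lecture's example data
X_ij = np.array([[ 2.  , 19.86,  9.17],
                 [ 1.05,  3.1 ,  3.34],
                 [ 0.14, 25.4 , 26.59],
                 [14.6 ,  3.93, 10.95]])
T3, p3 = quade_test(X_ij)
print(T3, p3)
```

Applied to the example data, this should reproduce the values of $T_3$ and the tail probability found step by step above.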

It seems as though the statistic depends on more than just the $\{S_j\}$ due to
the first term in the denominator, but that is only true if there are
ties. If there are no ties, $\sum_{i=1}^b\sum_{j=1}^k S_{ij}^2$ has a fixed,
if somewhat complicated, value in terms of $b$ and $k$:

In [20]:
b*(b+1)*(2*b+1)*k*(k**2-1)/72

60.0
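
The expression evaluated in the last cell can be derived as follows, assuming no ties among the block ranges or within any block. Then $\{Q_i\}$ is a permutation of $1,\ldots,b$ and each row of $\{R_{ij}\}$ is a permutation of $1,\ldots,k$, so the double sum factorizes:

$$\sum_{i=1}^b\sum_{j=1}^k S_{ij}^2
  = \sum_{i=1}^b Q_i^2 \sum_{j=1}^k\left(R_{ij}-\frac{k+1}{2}\right)^2
  = \frac{b(b+1)(2b+1)}{6}\cdot\frac{k(k^2-1)}{12}
  = \frac{b(b+1)(2b+1)\,k(k^2-1)}{72}$$

For $b=4$ and $k=3$ this is $30\times 2=60$, matching the value of `A2` computed from the data.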