This repository contains R programs for the article “Nonparametric Comparisons of Multiple Distributions under Uniform Stochastic Ordering.”
Before running any R program in this repository, please download the main R program EGJ_USO_Library.R.
Since both the distinguishing distribution methods and the goodness-of-fit (GOF) tests depend on the ordinal dominance curve (ODC) between consecutive distributions, it suffices to generate random samples from the ODCs with the first distribution taken to be uniform.
In the manuscript, we consider G_q with q between 0 and 1 for star-shaped ODCs and G_q for non-star-shaped ODCs. See the top-left panel of Figure 1.
The sequence of ODCs K_delta from Wang, Tang, and Tebbs (2020), shown on the right of Figure 1, is used for the power curve comparison.
We provide R code for generating random samples from G_q: rUSO.samples.R.
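As a rough illustration of the sampling idea described above, the following sketch draws two samples whose ODC is a chosen curve, with the first sample uniform on (0, 1). It is not the code in rUSO.samples.R. The curve odc(u) = u^2, its inverse odc_inv, and the convention that the ODC acts as the cdf of the second sample on [0, 1] are all assumptions made for this example; the manuscript's G_q is not reproduced here.

```r
set.seed(1)

# Hypothetical ODC for illustration only (NOT the manuscript's G_q).
odc     <- function(u) u^2       # treated as a cdf on [0, 1]
odc_inv <- function(p) sqrt(p)   # its inverse, used for inverse-transform sampling

n  <- 200
x1 <- runif(n)                   # first sample: Uniform(0, 1)
x2 <- odc_inv(runif(n))          # second sample: cdf equals odc on [0, 1]

# A curve is star-shaped when odc(u) / u is nondecreasing in u.
u <- seq(0.01, 1, by = 0.01)
is_star_shaped <- all(diff(odc(u) / u) >= 0)
print(is_star_shaped)
```

Because the first distribution is uniform, sampling the second distribution reduces to inverse-transform sampling from the curve itself, which is why only the ODC needs to be specified.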
The computation times reported below are based on a computer with a 3.0GHz processor and 64GB of memory.
R code for the ODC plots in Figure 1: Figure_1_ODCs.R.
The R code to reproduce Table 1 is attached: Testing_Equality_k3.R. The calculation took approximately 10 minutes.
Testing_GOF_k3.R provides the size and power studies for k=3 samples with sample size n=200, which require approximately 6 hours.
The power curve comparison in Figure 2 (R code: Testing_GOF_k3_PC.R), with k=3 samples and equal sample sizes n=200, requires approximately 6 hours in total on a computer with a 3.0GHz processor and 64GB of memory.
The R code to reproduce Table 3 is attached: Testing_Jump_k3.R. The calculation took approximately 10 minutes.
In addition to the simulation results in the manuscript, more simulation results are provided in the supplementary materials, with the R code listed below.
In addition to k=3 samples, we applied the distinguishing distribution methods to k=4 and k=5 samples with sample sizes n=60, 100, and 200. All the calculations took less than 10 minutes.
We provide the size and power studies for k=4 and k=5 samples with sample sizes n=60, 100, and 200. All the calculations took less than 10 minutes.
We also consider more settings for the power curve comparison of the GOF tests, with k=4 and k=5 samples and sample size n=200, with R code attached.
- For k=4 with sample size n=200, the calculation took approximately 8 hours on a computer with a 3.0GHz processor and 64GB of memory.
- For k=5 with sample size n=200, the calculation took approximately 10 hours.
We applied both the distinguishing distribution methods and the GOF tests to the microfibrillar-associated protein 4 (MFAP4) data, with clinical cohort characteristics and MFAP4 serum levels from Bracht et al. (2016), in MFAP4.xlsx. We grouped the MFAP4 levels by fibrosis stage and saved them as the R data object data_MFAP4 in MFAP4.Rdata.
Here we provide the empirical estimators and the estimators under USO for the ODCs between consecutive fibrosis stages, with R code attached: Figure_3.
The first part (first 3 rows) of Table 3 gives the differences of consecutive distributions from equality in the L_p norms with p=1 and p=2 and in the supremum norm, respectively. A threshold is provided for each L_p difference to determine whether the consecutive distributions are distinct.
The second part (last 3 rows) of Table 3 gives the departures of consecutive distributions from USO in the L_p norms with p=1 and p=2 and in the supremum norm, respectively.
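To make the three norms concrete, the sketch below computes L_1, L_2, and supremum distances between an empirical ODC and the identity line, which is the ODC under equality of the two distributions. It is only an illustration under stated assumptions: the simulated samples x and y, the grid u, the ODC direction F(G^{-1}(u)), and the Riemann-sum approximations are choices made for this example and are not taken from the repository's scripts or thresholds.

```r
set.seed(2)

# Simulated stand-in data (hypothetical, not the MFAP4 data).
x <- runif(200)
y <- sqrt(runif(200))

# Empirical ODC on a grid: F_hat evaluated at the empirical quantiles of y.
u       <- seq(0.005, 0.995, by = 0.005)
Fx      <- ecdf(x)
emp_odc <- Fx(quantile(y, probs = u, type = 1, names = FALSE))

# Distances from the identity line (equality of the two distributions),
# approximated by averages over the grid.
d     <- emp_odc - u
d_L1  <- mean(abs(d))       # L_1 norm
d_L2  <- sqrt(mean(d^2))    # L_2 norm
d_sup <- max(abs(d))        # supremum norm

print(c(L1 = d_L1, L2 = d_L2, sup = d_sup))
```

With these grid averages the three values are always ordered L_1 <= L_2 <= supremum, so a threshold that is meaningful for one norm is not directly comparable to a threshold for another.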
The critical values boot.cv.Skps and boot.cv.Wkps for the cumulated test statistics Skp and Wkp, respectively, are provided. The Bonferroni-corrected critical values are saved in function Bon.cvs with L_p norms Bon.cv.p1, Bon.cv.p2, and Bon.cv.ps.
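For readers unfamiliar with the Bonferroni correction mentioned above, here is a generic sketch of how a corrected bootstrap critical value differs from an uncorrected one. It does not reproduce Bon.cvs or the bootstrap in the repository: the simulated statistics in boot_stats, the level alpha, and the choice of k - 1 consecutive comparisons are assumptions made for this example.

```r
set.seed(3)

alpha <- 0.05
k     <- 3          # number of samples (assumed for illustration)
m     <- k - 1      # number of consecutive comparisons
B     <- 1000       # number of bootstrap replicates

# Stand-in bootstrap statistics: B rows, one column per comparison.
boot_stats <- matrix(abs(rnorm(B * m)), nrow = B, ncol = m)

# Uncorrected critical values use level alpha for each comparison.
cv_unadj <- apply(boot_stats, 2, quantile, probs = 1 - alpha)

# Bonferroni-corrected critical values use level alpha / m for each comparison.
cv_bonf <- apply(boot_stats, 2, quantile, probs = 1 - alpha / m)

print(rbind(unadjusted = cv_unadj, bonferroni = cv_bonf))
```

The corrected values are never smaller than the uncorrected ones, which reflects the more conservative per-comparison level.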
The jump detection methods, including J_p^0 and J_p^* with p=1, 2, and infinity, for the MFAP4 data are coded in MFAP4_Jump_Detection.R.
- Thilo Bracht, Christian Mölleken, Maike Ahrens, Gereon Poschmann, Anders Schlosser, Martin Eisenacher, Kai Stühler, Helmut E. Meyer, Wolff H. Schmiegel, Uffe Holmskov, Grith L. Sorensen and Barbara Sitek (2016). Evaluation of the biomarker candidate MFAP4 for non-invasive assessment of hepatic fibrosis in hepatitis C patients. Journal of Translational Medicine. 14:201.
- Dewei Wang, Chuan-Fa Tang, and Joshua M. Tebbs (2020). More powerful goodness-of-fit tests for uniform stochastic ordering. Computational Statistics & Data Analysis. 144:106898.
- Chuan-Fa Tang, Dewei Wang, and Joshua M. Tebbs (2017). Nonparametric goodness-of-fit tests for uniform stochastic ordering. The Annals of Statistics. 45:2565-2589.