BatchClassifierPaper

Repository for the analysis of Coleman et al., 2022. Note that all scripts assume that the BatchClassifierPaper directory is the working directory.
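
Because every script relies on that working directory, a quick check before running anything can save a confusing file-not-found error. The snippet below is a minimal sketch, not part of the repository; the helper name in_repo_root is hypothetical, and it assumes the repository was cloned into a directory that kept the name BatchClassifierPaper.

```r
# Hypothetical helper (not in the repository): TRUE when `dir` is a directory
# named BatchClassifierPaper. Assumes the clone kept the default directory name.
in_repo_root <- function(dir = getwd(), expected = "BatchClassifierPaper") {
  identical(basename(normalizePath(dir, mustWork = FALSE)), expected)
}

if (!in_repo_root()) {
  message(
    "Working directory is ", getwd(),
    "; the scripts expect the BatchClassifierPaper directory."
  )
}
```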

Many of the scripts in this repo require some of the lead author's R packages. To install them, please run the following lines of code (the devtools package must already be installed):

devtools::install_github("stcolema/BatchMixtureModel", ref = "Paper")
devtools::install_github("stcolema/mdiHelpR")

Simulation study

To recreate the simulation study analysis, run the scripts in the Scripts/Simulations/ directory in the following order:

dataGeneration.R: generate ten datasets in each of the six scenarios and save the scenario description file.
modelling.R: run the models and save various outputs.
acceptance.R: check that acceptance rates are close to the [0.1, 0.5] range (a good target for efficient exploration).
gewekeConvergence.R: use the Geweke statistic for the complete log-likelihood to check within-chain convergence.
completeLikelihood.R: use the complete log-likelihood to check across-chain convergence. This is a manual step: we identify chains that achieved within-chain convergence but settled in a different mode from the other chains in the same simulation, as well as any chains that incorrectly passed the previous step.
modelComparison.R: visualise the performance of the various models having dropped the poorly-behaved chains.
lookAtGeneratedData.R: this produces a plot of the observed and batch-corrected datasets from one example for each scenario. These are used in the Supplementary Material.
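
The ordering above can be sketched as a small driver. This is an illustration under assumptions rather than something shipped with the repository: the helper run_in_order is hypothetical, and it assumes each script can be run with source() without command-line arguments. It stops before completeLikelihood.R because that step is manual.

```r
# Hypothetical helper (not in the repository): source scripts in a fixed order,
# failing early if any of them is missing.
run_in_order <- function(script_dir, scripts) {
  paths <- file.path(script_dir, scripts)
  missing_paths <- paths[!file.exists(paths)]
  if (length(missing_paths) > 0) {
    stop("Missing scripts: ", paste(missing_paths, collapse = ", "))
  }
  for (path in paths) {
    message("Running ", path)
    # Each script gets a fresh environment so objects do not leak between steps.
    source(path, local = new.env())
  }
  invisible(paths)
}

# Automated steps only; the manual check in completeLikelihood.R comes next.
simulation_scripts <- c(
  "dataGeneration.R",
  "modelling.R",
  "acceptance.R",
  "gewekeConvergence.R"
)

# Uncomment to run from the BatchClassifierPaper directory:
# run_in_order("Scripts/Simulations", simulation_scripts)
```

Running the steps one at a time is still advisable, since the later scripts depend on which chains are dropped during the manual check.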

Dopico et al. analysis

To recreate the analysis of the ELISA data from Dopico et al. (2021), please download the dataset from https://github.com/chr1swallace/seroprevalence-paper/blob/master/adjusted-data.RData and save it to ./Data/ELISA/Sweden/. Then run:

swedenModelling.R: run MCMC for the MVT mixture model.
swedenModelCheck.R: assess the chains using log-likelihood trace plots.
swedenHyperParameterAndSeroprevalencePlot.R: visualise the seroprevalence estimate and the prior distributions.
swedenDataPlots.R: compare the observed data to the inferred data.
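
Before running these scripts it can help to confirm that the downloaded file is where they expect it. The sketch below is not part of the repository; the helper list_rdata_objects is hypothetical, and the path is assembled from the directory and file name given above. No assumption is made about which objects the file contains.

```r
# Path assembled from the download instructions above.
elisa_path <- file.path("Data", "ELISA", "Sweden", "adjusted-data.RData")

# Hypothetical helper (not in the repository): list the objects stored in an
# .RData file without loading them into the global environment.
list_rdata_objects <- function(path) {
  if (!file.exists(path)) {
    stop("File not found: ", path)
  }
  env <- new.env()
  load(path, envir = env)
  sort(ls(env))
}

if (file.exists(elisa_path)) {
  print(list_rdata_objects(elisa_path))
} else {
  message("Expected dataset at ", elisa_path)
}
```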

ELISA-like simulations

To recreate the simulation study of data generated from the MVT model applied to the ELISA data from Dopico et al. (2021), run the scripts in the Scripts/pseudo-ELISA/ directory in the following order:

elisaSimGen.R: generate the data.
elisaLikeModelling.R: apply each of the models and save various outputs.
elisaLikeConvergence.R: check within-chain convergence using the Geweke statistic.
elisaLikeLikelihood.R: plot the sampled likelihoods and manually remove chains that found a local minimum.
elisaLikeModelComparison.R: compare the model performance under the F1 score and the distance.
elisaFinalResults.R: make the plots actually used in the paper.
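
The Geweke check in the convergence step compares the mean of an early segment of a chain with the mean of a late segment; a large absolute z-score suggests the chain has not settled. The base-R sketch below is illustrative only and is not taken from the repository: the function name geweke_z, the 10% and 50% window fractions, and the simulated chains are all assumptions. The coda package provides geweke.diag for the same purpose, and the repository scripts may compute the statistic differently.

```r
# Hypothetical helper (not in the repository): Geweke-style z-score for one chain.
# The variance of each segment mean is estimated from the spectral density at
# frequency zero of a fitted autoregressive model.
geweke_z <- function(x, frac_first = 0.1, frac_last = 0.5) {
  n <- length(x)
  first <- x[seq_len(floor(frac_first * n))]
  last <- x[seq(n - floor(frac_last * n) + 1, n)]

  spec0 <- function(y) {
    fit <- ar(y, aic = TRUE)
    fit$var.pred / (1 - sum(fit$ar))^2
  }

  (mean(first) - mean(last)) /
    sqrt(spec0(first) / length(first) + spec0(last) / length(last))
}

set.seed(1)
# Simulated stand-ins for the complete log-likelihood of two chains.
stationary <- rnorm(2000)
drifting <- rnorm(2000) + seq(0, 5, length.out = 2000)

z_stationary <- geweke_z(stationary)
z_drifting <- geweke_z(drifting)
```

Under stationarity the z-score behaves approximately like a standard normal draw, so a threshold such as |z| > 1.96 is a common rule of thumb; the threshold used in the paper is not stated here.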

Dingens et al. analysis

To recreate the analysis of the ELISA data from Dingens et al. (2020), please run:

seattleModelling.R: run multiple MCMC chains on the data.
seattleModelCheck.R: assess convergence using the log-likelihood.
seattleResultsPlot.R: create the figure used in the paper.
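
Assessing convergence across several chains usually means overlaying their log-likelihood traces and looking for chains that sit apart from the rest. The sketch below uses simulated values, not output from seattleModelling.R; the matrix layout (one column per chain), the burn-in length and the helper plot_traces are assumptions made for illustration.

```r
set.seed(1)
n_iter <- 1000
burn_in <- 500

# Simulated log-likelihood traces: chain3 has settled in a different mode.
loglik <- cbind(
  chain1 = rnorm(n_iter, mean = -100, sd = 2),
  chain2 = rnorm(n_iter, mean = -100, sd = 2),
  chain3 = rnorm(n_iter, mean = -120, sd = 2)
)

# Hypothetical helper (not in the repository): overlay one trace per chain.
plot_traces <- function(loglik) {
  matplot(loglik, type = "l", lty = 1,
          xlab = "Iteration", ylab = "Log-likelihood")
  legend("bottomright", legend = colnames(loglik),
         col = seq_len(ncol(loglik)), lty = 1)
}

# Only draw when a graphics device is expected to be available.
if (interactive()) plot_traces(loglik)

# A numeric summary to accompany the visual check: post burn-in means per chain.
chain_means <- colMeans(loglik[-seq_len(burn_in), , drop = FALSE])
```

The per-chain means are only an aid; the decision about which chains to keep remains a judgement made from the plots.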
