Skip to content

dherrera1911/seroreversion_metaanalysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 

Repository files navigation

Dynamics of SARS-CoV-2 seroassay sensitivity: a systematic review and modeling study

This is the code associated with the paper Dynamics of SARS-CoV-2 seroassay sensitivity: a systematic review and modeling study, (https://www.medrxiv.org/content/10.1101/2022.09.08.22279731v3), currently in press at Eurosurveillance.

Data files

Raw data files:

Data files to reproduce the analysis from scratch are found in data/raw_data. These have to be preprocessed before fitting the model.

PCR_to_serotest_known.csv: Serology testing on PCR diagnosed individuals, with known diagnosis-to-serology time.

PCR_to_serotest_unknown.csv: Same as above, but with unknown diagnosis-to-serology time.

death_dynamics.csv: Time series of reported cases and deaths for different locations used in the analysis.

assay_characteristics.csv: Technical characteristics of each test

Secondary files:

Pre-processed data files for jumping straight into fitting the analysis are in data/processed_data:

PCR_to_serotest_estimated_times.csv: Same as PCR_to_serotest_unknown.csv, but with estimated times between COVID diagnosis and serology testing.

PCR_to_serotest_all.csv: Data table putting together all the data to be fitted in the analysis. This is the file needed to jump straight into model fitting.

assay_specificity.csv: Data table with reported specificities for each assay

Results data files:

Results from model fitting are in data/analysis_results/, and these are used for making plots and statistics in the rest of the code. Results are saved here automatically at the end of analysis scripts.

We include some results files that have summaries of the model fits, but we exclude the files with the posterior samples, due to their size. Some figures of the manuscript can be done with the available files.

Files with 'XXX_sensitivity_curve.csv' name structure have the time-dependent sensitivity curves estimated from the fitted model specified in XXX. These files contain estimated sensitivity profiles for each assay, as well as for the average estimated for a given kind of assay.

Files with 'XXX_parameter_summary.csv' have the summary statistics of each parameter fitted by the Bayesian models.

Files with 'XXX_posterior_samples.csv' contain the posterior samples of the model that were used to build the previous two files. These files are not included due to their size.

Files with 'XXX_predicted_sensitivities_grouped_CV.csv', have the cross-validation results.

Files 'XXX_statistical_significance.csv' contain statistical tests as proportions of posterior samples that show a result of interest.

Files 'sensitivity_profile_table*.csv' contain the resulting time-varying sensitivities for different assays and types of assays.

Important variables:

The data file fitted in the statistical analyses is PCR_to_serotest_all.csv. The most important variables of this dataset are:

Variable Description
testName Indicates the identity of the assay used
testTime Time elapsed from diagnosis to antibody testing
citationID Identifier of the study reporting the data
nSamples Number of people with COVID diagnosis that were tested for antibodies
nSeropositives Number of positive antibody tests among the nSamples tested
sensitivityMean Mean sensitivity for this time point
sensitivityL(H) Binomial 95% CI lower (higher) bound of sensitivity
timeKnown Indicates whether testTime is reported in the source, or estimated here
antigenTarget Indicates what antigens are targeted by the assay
antibodyTarget Indicates what antibody isotypes are targeted by the assay
technique Indicates analytic technique (i.e. LFIA or which type of Quantitative detection)
design Indicates the test design (indirect/direct/competitive)
sampleType Indicates what kind of population was sampled
midpointDate Median date of sample collection

Analysis code

Each script includes a description at the beggining. The scripts are numbered so as to be run in order. Paths are built so as to be ran from the directory where a script file is located.

Description of the scripts in the code directory:

01_death_dynamics_table.R: (Preprocessing) Builds the data/raw_data/death_dynamics.csv file by putting together different case and death data sources.

02_estimate_seroreversion_delays.R: (Preprocessing) Uses the data in data/raw_data/death_dynamics.csv to estimate the delays between diagnosis and serology testing for the data in PCR_to_serotest_unknown.csv

03_organize_seroreversion_data.R: (Preprocessing) Tidies the data and does som extra pre-processing. The output file of this script is PCR_to_serotest_all.csv, the input to the statistical analysis.

04_average_sensitivity_analysis.R: (Analysis) Fits a hierarchical Bayesian regression to the data in PCR_to_serotest_all.csv, without taking into account assay characteristics. Outputs files to data/analysis_results with descriptions of the fitted model

04_bis_average_sensitivity_analysis_CV.R: (Analysis) Fits the same model as the previous file, but for doing cross-validation. Outputs the cross-validation results, but no model summary.

05_characteristics_sensitivity_analysis.R: (Analysis) Fits a hierarchical Bayesian regression to the data in PCR_to_serotest_all.csv. Unlike script 04, it includes an effect for the different test characteristics.

05_bis_characteristics_sensitivity_analysis_CV.R: (Analysis) Fits a hierarchical Bayesian model like the script above, but for doing cross-validation. Only outputs the CV results, and not a model summary.

06_positive_slope_analysis.R: (Analysis) Fits a model with two slopes, an early slope and a later slope. It does so on a small set of tests that show positive slopes in the main analysis.

07_manufacturer_comparison.R: (Analysis) Compares the results from previous model fittings to manufacturer reported sensitivities.

08_characteristics_analysis_known_times.R: (Analysis) Does the same as script 05, but for excluding data points where we estimated the time from diagnosis to testing.

09_organize_specificity_data.R: (Preprocessing) Prepare the specificity data to analyze how it changes across assays.

10_analyze_specificity_data.R: (Analysis) Fit a Bayesian model to the specificity data, to find effects of assay characteristics on specificity.

11_serotracker_analysis.R: (Analysis) Computes how many data points in SeroTracker, that are Unity-aligned, use assays at high-risk of seroreversion.

functions_auxiliary.R: Contains miscellaneous functions for small tasks.

functions_seroreversion_fit_analysis.R: Contains functions related to the Bayesian analysis fit. For example, preparing the initialization values, extracting the posterior samples in a tidy format, etc.

Directory plotting_tabulating_scripts has scripts that generate the figures for the paper. Each script includes a description of what Figures it generates. Like the analysis scripts, paths are made so as to have these scripts ran from the directory where they are located.

Directory stan_models has the .stan files that implement the Bayesian models that are fit in the main analysis scripts described above.

Meta-analysis summary

In directory data/systematic_review_summary/, several .csv files with notes on the systematic review procedure, data on the analyzed cohorts, a list of the PRISMA systematic review procedure, and tables with assay characteristics are found.

Contact

For questions, contact dherrera1911[at]gmail.com

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published