This study compares the effectiveness of the Oxford-AstraZeneca vaccine versus the Pfizer-BioNTech vaccine in vaccinated health and social care workers.
- If you are interested in how we defined our codelists, look in the `codelists/` directory.
- Analysis scripts are in the `analysis/` directory.
  - The instructions used to extract data from the OpenSAFELY-TPP database are specified in the study definition; this is written in Python, but non-programmers should be able to understand what is going on there.
  - The `lib/` directory contains useful functions and look-up tables.
  - The remaining folders mostly contain the R scripts that process, describe, and analyse the extracted database data.
- Non-disclosive model outputs, including tables, figures, etc., are in the `released_outputs/` directory.
- The `project.yaml` defines run-order and dependencies for all the analysis scripts. This file should not be edited directly. To make changes to the yaml, edit and run the `create-project.R` script instead.
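
To illustrate what "run-order and dependencies" means in practice, here is a minimal Python sketch that derives a valid execution order from a set of actions and the actions each one depends on. The action names and the dependency structure below are hypothetical and chosen only for illustration; they are not copied from this repository's `project.yaml`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical actions mapped to the actions they depend on.
# These names are illustrative assumptions, not the real project.yaml contents.
actions = {
    "extract": [],
    "data_process": ["extract"],
    "data_selection": ["data_process"],
    "table1": ["data_selection"],
    "model_plr": ["data_selection"],
    "report_plr": ["model_plr"],
}


def run_order(needs):
    """Return one valid order in which every action runs after its dependencies."""
    return list(TopologicalSorter(needs).static_order())


order = run_order(actions)
print(order)
```

Several orders can be valid for the same dependency graph; the only guarantee is that an action never appears before something it depends on.
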
Scripts are organised into five directories:
- `data_makedummy.R` contains the script used to generate dummy data. This is used instead of the usual dummy data specified in the study definition, because it is then possible to impose some more useful structure in the data, such as ensuring nobody has a first dose of both the Pfizer and AstraZeneca vaccines. If the study definition is updated, this script must also be updated to ensure variable names and types match.
- `design.R` defines some common design elements used throughout the study, such as follow-up dates, model outcomes, and covariates.
- `data_process.R` imports the extracted database data (or dummy data), standardises some variables and derives some new ones.
- `data_selection.R` filters out participants who should not be included in the main analysis, and creates a small table used for the inclusion/exclusion flowchart.
- `data_properties.R` tabulates and summarises the variables in the processed data for inspection / sense-checking.
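
The kind of structure that custom dummy data can impose is sketched below in Python: each simulated person receives a first dose of exactly one vaccine brand, never both. This is not the repository's R script. The column names, date range, and brand labels are assumptions made for illustration.

```python
import random
from datetime import date, timedelta


def make_dummy_patients(n, seed=42):
    """Simulate n patients who each have exactly one first-dose brand."""
    rng = random.Random(seed)
    start = date(2020, 12, 8)  # hypothetical start of the vaccination window
    patients = []
    for patient_id in range(1, n + 1):
        brand = rng.choice(["pfizer", "az"])  # pick one brand per person
        first_dose = start + timedelta(days=rng.randrange(0, 90))
        patients.append({
            "patient_id": patient_id,
            # Hypothetical variable names; only one of the two dates is set.
            "covid_vax_pfizer_1_date": first_dose if brand == "pfizer" else None,
            "covid_vax_az_1_date": first_dose if brand == "az" else None,
        })
    return patients


dummy = make_dummy_patients(1000)
```

Independently simulated date columns would let some people receive both brands as a first dose, which is why the brand is chosen first and the date assigned to a single column.
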

- `table1.R` creates a "table 1" table describing cohort characteristics at baseline, stratified by vaccine type.
- `table1_allvax.R` is the same as `table1.R`, but on the pre-exclusion cohort.
- `table_irr.R` calculates unadjusted incidence rates for various outcomes, stratified by vaccine type and time since vaccination.
- `km.R` produces unadjusted Kaplan-Meier plots for outcomes, by vaccine type.
- `seconddose.R` produces the cumulative incidence of second dose coverage, by vaccine type.
- `vaxdate.R` produces the cumulative coverage of first vaccine dose over calendar time.
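
As a rough illustration of an unadjusted, stratified incidence rate (events divided by person-time at risk within each stratum), here is a short Python sketch. It is not the code in `table_irr.R`. The record layout, the stratum labels, and the numbers are made up for the example.

```python
from collections import defaultdict


def incidence_rates(records, per=1000):
    """Return events per `per` person-years for each (vaccine, period) stratum.

    records: iterable of (vaccine, period, events, person_years) tuples.
    """
    totals = defaultdict(lambda: [0, 0.0])  # stratum -> [events, person-years]
    for vaccine, period, events, person_years in records:
        totals[(vaccine, period)][0] += events
        totals[(vaccine, period)][1] += person_years
    # Strata with no person-time at risk are skipped to avoid division by zero.
    return {
        stratum: per * events / person_years
        for stratum, (events, person_years) in totals.items()
        if person_years > 0
    }


# Made-up illustrative inputs, not study results.
example = [
    ("az", "0-13 days", 5, 100.0),
    ("az", "0-13 days", 5, 100.0),
    ("pfizer", "0-13 days", 2, 400.0),
]
rates = incidence_rates(example)
```
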

- `model_plr.R` fits the pooled logistic regression models. This script takes four arguments:
  - `outcome`, for example `postest` for positive SARS-CoV-2 test or `covidadmitted` for COVID-19 hospitalisation
  - `timescale`, either `calendar` for calendar-time or `timesincevax` for vaccination-time
  - `censor_seconddose`, whether (`1`) or not (`0`) to censor follow-up at the second dose
  - `samplesize_nonoutcomes_n`, the size of the sample of those who did not experience the outcome of interest, used to reduce computation time. All those who experienced the outcome are included.
- `report_plr.R` outputs summary information, effect estimates, and marginalised cumulative incidence estimates for the pooled logistic regression models from `model_plr.R`. This script takes the same `outcome`, `timescale`, and `censor_seconddose` arguments as `model_plr.R`, so that it picks up the correct models.
- `model_cox.R` fits Cox models with time-varying effects, which were used to check consistency with the pooled logistic regression models.
- `report_cox.R` outputs summary information for the Cox models from `model_cox.R`.
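
The idea behind `samplesize_nonoutcomes_n` can be sketched in Python as follows: keep everyone who experienced the outcome and draw a random sample of those who did not. This is an illustration, not the code in `model_plr.R`. The record layout is hypothetical, and the sampling weight attached to each row is an assumption about how such subsampling is commonly accounted for, not something this README confirms.

```python
import random


def sample_nonoutcomes(patients, samplesize_nonoutcomes_n, seed=0):
    """Keep all outcome cases and a random sample of non-cases.

    patients: list of dicts with a boolean "outcome" key (hypothetical layout).
    Returns new dicts with an added "sample_weight" key.
    """
    rng = random.Random(seed)
    cases = [p for p in patients if p["outcome"]]
    noncases = [p for p in patients if not p["outcome"]]
    k = min(samplesize_nonoutcomes_n, len(noncases))
    sampled = rng.sample(noncases, k)
    # Assumed weighting: each sampled non-case stands in for len(noncases) / k people.
    weight = len(noncases) / k if k else 0.0
    return (
        [{**p, "sample_weight": 1.0} for p in cases]
        + [{**p, "sample_weight": weight} for p in sampled]
    )


cohort = [{"patient_id": i, "outcome": i < 10} for i in range(100)]
analysis_set = sample_nonoutcomes(cohort, samplesize_nonoutcomes_n=30)
```

With weights defined this way, the weighted row count matches the size of the full cohort, so the reduced data set can stand in for it in a weighted model fit.
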

- `report_objects.R` collates some of the baseline data and model outputs and saves them to file, to make it easier to incorporate outputs across different actions in the main R Markdown script that generates the study manuscript.
- `effectiveness_report.rmd` is an R Markdown file that puts a lot of the outputs together in one file for easy checking and distribution. It is a precursor to the manuscript.
- `effectiveness_report_comparemodels.rmd` makes it easy to compare Cox versus PLR models, and calendar-time versus vaccination-time timescales.

Materials for the manuscript are in the `manuscript/` directory. This includes a bibliography, author list, citation style, the R Markdown document where the manuscript is authored, and rendered copies of the latest version of the manuscript itself.
Figures, tables, and inline numbers in the R Markdown manuscript are taken from non-disclosive, released materials from the OpenSAFELY platform.
The OpenSAFELY framework is a secure analytics platform for electronic health records research in the NHS.
Instead of requesting access to slices of patient data and transporting them elsewhere for analysis, the framework supports developing analytics against dummy data, and then running them against the real data within the same infrastructure in which the data is stored. Read more at OpenSAFELY.org.
Developers and epidemiologists interested in the framework should review the OpenSAFELY documentation.