Details of the purpose and any published outputs from this project can be found at the link above.
The contents of this repository MUST NOT be considered an accurate or valid representation of the study or its purpose. This repository may reflect an incomplete or incorrect analysis with no further ongoing work. The content has ONLY been made public to support the OpenSAFELY open science and transparency principles and to support the sharing of re-usable code for other subsequent users. No clinical, policy or safety conclusions must be drawn from the contents of this repository.
- Detailed protocols are in the `protocol` folder. `post-covid-events-neurodegenerative` contains the outcome-specific elements necessary to implement `post-covid-events-ehrql`.
- If you are interested in how we defined our code lists, look in the `codelists` folder.
- Analysis scripts are in the `analysis` directory:
  - Dataset definition scripts are in the `dataset_definition` directory:
    - If you are interested in how we defined our variables, we use the variable script `variable_helper_functions` to define functions that generate variables. We then apply these functions in `variables_cohorts` to create a dictionary of variables for cohort definitions, and in `variables_dates` to create a dictionary of variables for calculating study start dates and end dates.
    - If you are interested in how we defined study dates (e.g., index and end dates), these vary by cohort and are described in the protocol. We use the script `dataset_definition_dates` to generate a dataset with all required dates for each cohort. This script imports all variables generated from `variables_dates`.
    - If you are interested in how we defined our cohorts, we use the dataset definition script `dataset_definition_cohorts` to define a function that generates cohorts. This script imports all variables generated from `variables_cohorts` using the patient's index date, the cohort start date and the cohort end date. This approach is used to generate three cohorts: pre-vaccination, vaccinated, and unvaccinated, found in `dataset_definition_prevax`, `dataset_definition_vax`, and `dataset_definition_unvax`, respectively.
  - Dataset cleaning scripts are in the `dataset_clean` directory:
    - This directory also contains all the R scripts that process, describe, and analyse the extracted data.
    - `dataset_clean` is the core script which executes all the other scripts in this folder.
    - `fn-preprocess` is the function carrying out initial preprocessing, formatting columns correctly.
    - `fn-modify_dummy` is called from within `fn-preprocess.R`, and alters the proportions of dummy variables to better suit analyses.
    - `fn-inex` is the inclusion/exclusion function.
    - `fn-qa` is the quality assurance function.
    - `fn-ref` is the function that sets the reference levels for factors.
  - Table 1 scripts are in the `table1` directory:
    - This directory contains a single script: `table1.R`. This script works with the output of `dataset_clean` to describe the patient characteristics.
  - Modelling scripts are in the `model` directory:
    - `make_model_input.R` works with the output of `dataset_clean` to prepare suitable data subsets for Cox analysis. It combines each outcome and subgroup in one formatted .rds file.
    - `fn-prepare_model_input.R` is a companion function to `make_model_input.R` which handles the interaction with `active_analyses.rds`.
    - `cox-ipw` is a reusable action which uses the output of `make_model_input.R` to fit a Cox model to the data. (NB: it is not a file on the server.)
  - Output scripts are in the `make_output` directory:
    - `make_model_output.R` combines all the Cox results in one formatted .csv file per subgroup.
    - `make_other_output.R` combines cohort-specific outputs (e.g. the table1 outputs) into one .csv file.
    - `make_aer_input.R` generates summary statistics by age and sex required for AER (Absolute Excess Risk) estimation for each outcome (using the model input files for the main analysis generated from `make_model_input`).
- `active_analyses` contains a list of active analyses.
- The `project.yaml` defines run-order and dependencies for all the analysis scripts. This file should not be edited directly. To make changes to the yaml, edit and run the `create_project_actions.R` script, which generates all the actions.
- Descriptive and model outputs, including figures and tables, are in the `released_outputs` directory.
Outputs follow OpenSAFELY naming conventions related to suppression rules by adding the suffix "_midpoint6". The suffix "_midpoint6_derived" means that the value(s) are derived from the midpoint6 values. Detailed information regarding naming conventions can be found here.
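
As a rough illustration of what the `_midpoint6` suffix implies, the sketch below rounds a count to the midpoint of its band of six. It assumes the commonly used convention in which counts 1 to 6 become 3, counts 7 to 12 become 9, and zero stays zero; the function name `roundmid6` is hypothetical, the example counts are made up, and the naming-convention documentation remains the authoritative definition. The summed total is only one possible way a `_midpoint6_derived` value could arise.

```python
def roundmid6(count: int, to: int = 6) -> int:
    """Round a non-negative count to the midpoint of its band of width `to`.

    Assumed convention: ceil(count / to) * to - floor(to / 2), with 0 kept as 0.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count == 0:
        return 0
    bands = (count + to - 1) // to  # integer ceiling of count / to
    return bands * to - to // 2


# Made-up counts, for illustration only (not study results).
unexposed_events_midpoint6 = roundmid6(14)  # 14 lies in the band 13-18
exposed_events_midpoint6 = roundmid6(4)     # 4 lies in the band 1-6

# A "_midpoint6_derived" value is computed from already-rounded values,
# here assumed to be a simple sum of two midpoint6-rounded counts.
total_events_midpoint6_derived = (
    unexposed_events_midpoint6 + exposed_events_midpoint6
)
```

Rounding before summing means a derived total can differ slightly from the rounded true total, which is why the two suffixes are kept distinct.
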
Variable | Description |
---|---|
Characteristic | patient characteristic under consideration |
Subcharacteristic | patient sub characteristic under consideration |
N [midpoint6 derived] | number of people with characteristic |
(%) [midpoint6 derived] | % of total people with characteristic |
COVID-19 [diagnoses midpoint6] | number of people with characteristic and COVID-19 |
Variable | Description |
---|---|
name | unique identifier for analysis |
cohort | cohort used for the analysis |
exposure | exposure used for the analysis |
outcome | outcome used for the analysis |
analysis | string to identify whether this is the ‘main’ analysis or a subgroup |
unexposed_person_days | number of person days before or without exposure in the analysis |
unexposed_events_midpoint6 | number of unexposed people with the outcome in the analysis |
exposed_person_days | number of person days after exposure in the analysis |
exposed_events_midpoint6 | number of exposed people with the outcome in the analysis |
total_person_days | number of person days in the analysis |
total_events_midpoint6_derived | number of people with the outcome in the analysis |
day0_events_midpoint6 | number of people with the exposure and outcome on the same day |
total_exposed_midpoint6 | number of people with the exposure in the analysis |
sample_size_midpoint6 | number of people in the analysis |
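
The person-day and event columns above are enough to compute crude event rates. The sketch below shows one way to do so. The helper name `rate_per_100k_person_years` is hypothetical, the numbers are made up for illustration (they are not study results), and a year is assumed to be 365.25 days.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length in days


def rate_per_100k_person_years(events: float, person_days: float) -> float:
    """Crude event rate per 100,000 person-years."""
    if person_days <= 0:
        raise ValueError("person_days must be positive")
    person_years = person_days / DAYS_PER_YEAR
    return events / person_years * 100_000


# Made-up row using the column names from the table above.
row = {
    "unexposed_person_days": 3_652_500,
    "unexposed_events_midpoint6": 15,
    "exposed_person_days": 365_250,
    "exposed_events_midpoint6": 9,
}

unexposed_rate = rate_per_100k_person_years(
    row["unexposed_events_midpoint6"], row["unexposed_person_days"]
)
exposed_rate = rate_per_100k_person_years(
    row["exposed_events_midpoint6"], row["exposed_person_days"]
)
```

Because the event counts are midpoint6-rounded, rates computed this way are approximate, particularly when counts are small.
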
Variable | Description |
---|---|
name | unique identifier for analysis |
cohort | cohort used for the analysis |
outcome | outcome used for the analysis |
analysis | string to identify whether this is the ‘main’ analysis or a subgroup |
error | captured error message if analysis did not run |
model | string to identify the model adjustment |
term | string to identify the term in the analysis |
lnhr | log hazard ratio for the analysis |
se_lnhr | standard error for the log hazard ratio for the analysis |
hr | hazard ratio for the analysis |
conf_low | lower confidence limit for the analysis |
conf_high | upper confidence limit for the analysis |
N_total_midpoint6 | total number of people in the analysis |
N_exposed_midpoint6 | total number of people with the exposure in the analysis |
N_events_midpoint6 | total number of people with the outcome following exposure in the analysis |
person_time_total | total person time included in the analysis |
outcome_time_median | median time to outcome following exposure |
strata_warning | string to identify strata variables that may cause model faults |
surv_formula | survival formula for the analysis |
source | language used for Cox calculation |
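
The `hr`, `conf_low` and `conf_high` columns relate to `lnhr` and `se_lnhr` in the usual way for a Cox model. The sketch below assumes a 95% Wald interval (z = 1.96), which this README does not state explicitly; the function name `hazard_ratio_summary` and the example numbers are illustrative only.

```python
import math

Z_95 = 1.96  # assumption: two-sided 95% Wald interval


def hazard_ratio_summary(lnhr: float, se_lnhr: float, z: float = Z_95) -> dict:
    """Exponentiate a log hazard ratio and its Wald confidence limits."""
    return {
        "hr": math.exp(lnhr),
        "conf_low": math.exp(lnhr - z * se_lnhr),
        "conf_high": math.exp(lnhr + z * se_lnhr),
    }


# Made-up values: a log hazard ratio of ln(2) with standard error 0.1.
summary = hazard_ratio_summary(lnhr=math.log(2.0), se_lnhr=0.1)
```

The interval is symmetric on the log scale, so `conf_low * conf_high` equals `hr` squed only approximately in floating point; on the hazard ratio scale the interval is asymmetric around `hr`.
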
These outputs are structured like the table1, table2 and venn outputs, but combined across cohorts. They may contain additional columns indicating the cohort and subgroup of the analysis.
Variable | Description |
---|---|
aer_sex | sex subgroup under consideration |
aer_age | age subgroup under consideration |
analysis | string to identify whether this is the ‘main’ analysis or a subgroup |
cohort | cohort used for the analysis |
outcome | outcome used for the analysis |
unexposed_person_days | unexposed person days in the age/sex grouping |
unexposed_events | number of events in unexposed people in the age/sex grouping |
total_exposed | total number of people with the exposure in the age/sex grouping |
sample_size | total number of people in the age/sex grouping |
The OpenSAFELY framework is a Trusted Research Environment (TRE) for electronic health records research in the NHS, with a focus on public accountability and research quality.
Read more at OpenSAFELY.org.
As standard, research projects have an MIT license.