opensafely/post-covid-pre-vaccinated-cardiovascular

Impact of vaccination on the association of COVID-19 with arterial and venous thrombotic diseases: an OpenSAFELY cohort study using linked electronic health records

Abstract: The incidence of cardiovascular diseases increases after COVID-19 diagnosis. How COVID-19 vaccination and different SARS-CoV-2 variants impact on this increase is unclear. The objective was to quantify associations between COVID-19 diagnosis and cardiovascular diseases in different vaccination and variant eras in England.

This project compares three cohorts: pre-vaccination, unvaccinated and vaccinated. This repository creates the results for the pre-vaccination cohort. The repository for the unvaccinated and vaccinated cohorts can be found here

You can run this project via Gitpod in a web browser by clicking on this badge: Gitpod ready-to-code

  • The preprint can be found here
  • The project protocol, which contains in-depth details of the analysis plan, can be found here
  • Analysis scripts are in the analysis directory
  • If you are interested in how we defined our code lists, look in the codelists directory
  • Developers and epidemiologists interested in the framework should review the OpenSAFELY documentation

project.yaml

The project.yaml defines project actions, run-order and dependencies for all analysis scripts. This file should not be edited directly. To make changes to the yaml, edit and run the create_project.R script instead. Project actions are then run securely using OpenSAFELY Jobs. Details of the purpose and any published outputs from this project can be found at this link as well.
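As a rough illustration of how run-order follows from dependencies, the sketch below orders four actions so that each one runs only after the actions it depends on. The action names are taken from the list of actions in this README, but the `needs` edges are assumptions inferred from the action descriptions, not copied from project.yaml, and `run_order` is a hypothetical helper rather than part of OpenSAFELY.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each action maps to the actions it needs.
# The edges are assumptions inferred from the README descriptions.
needs = {
    "vax_eligibility_inputs": [],
    "generate_study_population": ["vax_eligibility_inputs"],
    "preprocess_data": ["generate_study_population"],
    "stage1_data_cleaning": ["preprocess_data"],
}


def run_order(needs):
    """Return one valid order in which every action follows its dependencies."""
    return list(TopologicalSorter(needs).static_order())


order = run_order(needs)
for position, action in enumerate(order, start=1):
    print(position, action)
```

Because these four actions form a single chain, only one order is valid; with branching dependencies several orders could be equally correct.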

Below is a description of each action in the project.yaml. Arguments are denoted by {arg} in the action name.

  • vax_eligibility_inputs
    • Runs vax_eligibility_inputs.R which creates metadata for aspects of the study design which are required for the generate_study_population actions.
    • Creates dataframes that contain dates for each phase of vaccination and conditions for defining JCVI vaccination groups
  • generate_study_population
    • Runs study_definition.py
    • These scripts are used to define JCVI groups, vaccine variables, variables used to apply eligibility criteria, outcome variables and covariate variables. Common variables used in all three scripts can be found in common_variables.py.
  • preprocess_data
    • Runs preprocess_data.R to apply dataframe tidying to input.feather (generated by generate_study_population)
    • Tidies vaccine variables, determines the patient study index date, creates additional variables (e.g. covid phenotype), tidies the dataset and ensures all variables are in the correct format (e.g. numeric, character).
  • stage1_data_cleaning
    • Runs Stage1_data_cleaning.R
    • Applies quality assurance rules and inclusion/exclusion criteria
    • The output dataset is analysis-ready
  • stage1_end_date_table
  • stage2_missing_table1
    • Runs Stage2_missing_table1.R
    • Checks for missing data within variables
    • Creates the summary statistics for Table 1 of the manuscript which can then be outputted
  • stage4_table_2_{follow_up}_{outcome_postion}
    • Runs table_2.R, which calculates pre- and post-exposure event counts and person-days of follow-up for all outcomes and subgroups.
    • Used for Table 2 in the manuscript
  • venn_diagram
    • Runs venn_diagram.R
    • Creates Venn diagram data for all outcomes, reporting where each outcome is sourced from, i.e. primary care, secondary care or deaths data
  • Analysis_cox_{outcome}
    • Runs 01_cox_pipeline.R
    • Each action runs all subgroups for the outcome and cohort of interest
    • Detailed descriptions of each script used to fit the Cox models can be found in the model README file
  • stata_cox_model_{outcome}_{subgroup}_{time_periods}
  • format_stata_output
  • format_R_output
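The pre- and post-exposure event counts and person-days of follow-up mentioned for table_2.R can be illustrated with a minimal Python sketch. This is not a translation of table_2.R: the function names, the field names and the rule that an event on the exposure date counts as post-exposure are all assumptions made for illustration, and the two patients are made-up examples.

```python
from datetime import date


def split_follow_up(index_date, end_date, exposure_date=None, event_date=None):
    """Split one patient's follow-up into pre- and post-exposure parts.

    Follow-up runs from index_date to the event or end_date, whichever
    comes first. An exposure after follow-up has stopped is ignored.
    """
    had_event = event_date is not None and event_date <= end_date
    stop = event_date if had_event else end_date

    if exposure_date is None or exposure_date > stop:
        # Never exposed during follow-up: everything is pre-exposure.
        return {
            "pre_days": (stop - index_date).days,
            "post_days": 0,
            "pre_events": int(had_event),
            "post_events": 0,
        }

    # Assumption: exposure before the index date is treated as exposure at index.
    exposed_from = max(exposure_date, index_date)
    return {
        "pre_days": (exposed_from - index_date).days,
        "post_days": (stop - exposed_from).days,
        "pre_events": 0,
        "post_events": int(had_event),
    }


def tally(patients):
    """Sum person-days and event counts over all patients."""
    totals = {"pre_days": 0, "post_days": 0, "pre_events": 0, "post_events": 0}
    for patient in patients:
        for key, value in split_follow_up(**patient).items():
            totals[key] += value
    return totals


# Made-up example patients (not real data).
patients = [
    {
        "index_date": date(2020, 1, 1),
        "end_date": date(2020, 12, 31),
        "exposure_date": date(2020, 4, 1),
        "event_date": date(2020, 6, 30),
    },
    {"index_date": date(2020, 1, 1), "end_date": date(2020, 12, 31)},
]
totals = tally(patients)
print(totals)
```

In this sketch each patient contributes time to the pre-exposure period until a COVID-19 diagnosis and to the post-exposure period afterwards, so a single patient can appear in both denominators.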

Creating the study population

In OpenSAFELY a study definition is a formal specification of the data that you want to extract from the OpenSAFELY database. This includes:

  • the patient population (dataset rows)
  • the variables (dataset columns)
  • the expected distributions of these variables for use in dummy data

Further details on creating the study population can be found in the OpenSAFELY documentation.
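To make the three parts of a study definition concrete, here is a minimal plain-Python sketch of a specification with a population, variables and expected distributions, together with a function that draws dummy data from it. It deliberately does not use the OpenSAFELY API: `study_spec`, `make_dummy_data` and every field name below are hypothetical and exist only to illustrate the idea.

```python
import random

# Hypothetical specification that mimics the three parts listed above.
# This is NOT the OpenSAFELY study definition API.
study_spec = {
    "population_size": 5,  # number of dummy patients (dataset rows)
    "variables": {  # dataset columns, each with an expected distribution
        "age": {"type": "int", "low": 18, "high": 110},
        "sex": {"type": "category", "ratios": {"F": 0.5, "M": 0.5}},
    },
}


def make_dummy_data(spec, seed=0):
    """Draw one dummy row per patient from the expected distributions."""
    rng = random.Random(seed)  # seeded so repeated runs give the same rows
    rows = []
    for patient_id in range(spec["population_size"]):
        row = {"patient_id": patient_id}
        for name, expected in spec["variables"].items():
            if expected["type"] == "int":
                # Uniform integer between low and high, both inclusive.
                row[name] = rng.randint(expected["low"], expected["high"])
            else:
                categories = list(expected["ratios"])
                weights = list(expected["ratios"].values())
                row[name] = rng.choices(categories, weights=weights, k=1)[0]
        rows.append(row)
    return rows


dummy = make_dummy_data(study_spec)
for row in dummy:
    print(row)
```

Dummy data of this kind lets analysis code be developed and tested without access to real patient records.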

The contents of this repository MUST NOT be considered an accurate or valid representation of the study or its purpose. This repository may reflect an incomplete or incorrect analysis with no further ongoing work. The content has ONLY been made public to support the OpenSAFELY open science and transparency principles and to support the sharing of re-usable code for other subsequent users. No clinical, policy or safety conclusions must be drawn from the contents of this repository.

About the OpenSAFELY framework

The OpenSAFELY framework is a Trusted Research Environment (TRE) for electronic health records research in the NHS, with a focus on public accountability and research quality.

Read more at OpenSAFELY.org.

Licences

As standard, research projects have an MIT license.