CADDE-CENTRE/covid19_brazil_hfr

covid19_brazil_hfr

Overview

This repository contains the code and the data necessary to run the model and associated Bayesian analysis via Stan in:

  • A Brizzi, C Whittaker, LMS Servo et al. "Factors driving extensive spatial and temporal fluctuations in COVID-19 fatality rates in Brazilian hospitals". Imperial College London (06-10-2021), doi https://doi.org/10.25561/91875.

Data

The directory inst/data contains:

License

The code in this repository is licensed under CC BY 4.0.

Warranty

Imperial makes no representation or warranty about the accuracy or completeness of the data nor that the results will not constitute an infringement of third-party rights. Imperial accepts no liability or responsibility for any use which may be made of any results, for the results, nor for any reliance which may be placed on any such work or results.

System Requirements

  • macOS or UNIX; the code was developed on macOS Mojave 10.14.
  • R version >= 4.0.3

Package requirements are reported in the covid19_brazil_hfr.yml file, and can be installed as follows:

$ cd covid19_brazil_hfr
$ conda env create -f covid19_brazil_hfr.yml
$ source activate covid19_brazil_hfr
$ export TBB_CXX_TYPE=gcc
$ export CXXFLAGS+=-fPIE

or alternatively:

$ cd covid19_brazil_hfr
$ conda create --name covid19_brazil_hfr
$ # activate first, so that the packages are installed into the new environment
$ source activate covid19_brazil_hfr
$ conda install -c conda-forge r-base r-rcpp r-ggplot2 r-data.table r-viridis r-gtools r-bayesplot r-mglm r-fitdistrplus r-actuar r-abind r-knitr r-rmarkdown r-yaml r-stringi r-codetools r-ggsci r-bh r-matrix r-inline r-gridextra r-rcppparallel r-loo r-pkgbuild r-withr r-v8
$ export TBB_CXX_TYPE=gcc
$ export CXXFLAGS+=-fPIE

To install the cmdstanr package, start R and run:

install.packages("cmdstanr", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))
library(cmdstanr)  # makes check_cmdstan_toolchain() and install_cmdstan() available
check_cmdstan_toolchain()
install_cmdstan(cores = 4)

Usage

To reproduce the main analyses from a UNIX environment, move to the inst directory and run:

$ Rscript start_local.R -outdir foo

specifying an output directory which will contain all the plots and Hamiltonian Monte Carlo posterior draws. The R script produces a bash script containing instructions to preprocess the data, fit the model and then analyse the results. We typically run 4 HMC chains in parallel, though the number of chains and the number of iterations can be modified in inst/script/HFR.fit.cmdstan.R by modifying the arguments in:

m_fit <- m$sample(
  data = stan_data, seed = 42,
  refresh = 1e2, iter_warmup = 5e2, iter_sampling = 2e3, chains = 4,
  parallel_chains = 4, threads_per_chain = 1, save_warmup = TRUE,
  # one set of initial values per chain: keep the list length equal to chains
  init = list(stan_init, stan_init, stan_init, stan_init)
)

To run sensitivity analyses with other genomic data sources, it is first necessary to run the chains with the specified data by adding one of the flags -random or -fiocruz to the above command, or including the flag in the directory name.

$ # The following are equivalent:
$ Rscript start_local.R -outdir foo -random
$ Rscript start_local.R -outdir foo-random

In the first case, the '-random' flag is appended to the output directory name, so in both cases the results are stored in foo-random.
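
The naming rule can be sketched in shell as follows. This is only an illustration, not code from the repository: the function name resolve_outdir is hypothetical, and the sketch assumes the flag is appended only when the directory name does not already end with it.

```shell
# Hypothetical helper (not part of the repository): mimics the naming rule
# described above for the -random and -fiocruz flags.
resolve_outdir() {
  outdir="$1"   # value passed to -outdir
  flag="$2"     # "random", "fiocruz", or empty for the standard analysis
  case "$flag" in
    "") printf '%s\n' "$outdir" ;;
    random|fiocruz)
      case "$outdir" in
        *-"$flag") printf '%s\n' "$outdir" ;;        # assumption: flag already in the name
        *)         printf '%s\n' "$outdir-$flag" ;;  # flag appended to the name
      esac ;;
    *) echo "unknown flag: $flag" >&2; return 1 ;;
  esac
}

resolve_outdir foo random      # prints foo-random
resolve_outdir foo-random ""   # prints foo-random
```

Both calls print the same directory name, which matches the equivalence of the two commands above.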

After the above analyses are completed, it is possible to reproduce the plots in the supplementary materials by moving to the inst/scripts directory and running:

$ # if -random sensitivity:
$ Rscript sensitivity_analysis_ControlledSequenceDataCollection.R -dirs foo bar

$ # if -fiocruz sensitivity:
$ Rscript sensitivity_analysis_Fiocruz.R -dirs foo bar

where foo and bar are the directories containing the results from the standard and from the sensitivity analysis respectively.
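
As a rough summary, the three steps of one sensitivity analysis can be listed with a small dry-run helper. The function plan_sensitivity is hypothetical and only prints the commands; it assumes the sensitivity run is stored under the standard directory name with the flag appended, and that the same directory names can be passed to -dirs. How the scripts resolve these paths is not stated here, so adjust the names as needed.

```shell
# Hypothetical dry-run helper (not part of the repository): prints the
# commands for one sensitivity analysis instead of executing them.
plan_sensitivity() {
  base="$1"   # output directory of the standard analysis, e.g. foo
  flag="$2"   # "random" or "fiocruz"
  case "$flag" in
    random)  script="sensitivity_analysis_ControlledSequenceDataCollection.R" ;;
    fiocruz) script="sensitivity_analysis_Fiocruz.R" ;;
    *) echo "unknown flag: $flag" >&2; return 1 ;;
  esac
  # The first two commands are run from inst, the third from inst/scripts.
  echo "Rscript start_local.R -outdir $base"
  echo "Rscript start_local.R -outdir $base -$flag"
  echo "Rscript $script -dirs $base $base-$flag"
}

plan_sensitivity foo random
```

Printing the commands first makes it easy to check the directory names before starting runs that may take a long time.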

We also include the start.HPC.R script, which runs the models for the different cities in parallel. It was designed for the Imperial College London Research Computing Service, but can be adapted to other setups. The script writes and queues bash scripts that run the preprocessing analyses and then run the state capital models in parallel.

Acknowledgements

We thank all contributors to GISAID for making SARS‐CoV‐2 sequence data information publicly available as listed in acknowledgments_GISAID; all contributors to Rede Genomica Fiocruz for making SARS‐CoV‐2 variant frequency data publicly available; all members of the CADDE network for their comments throughout the project and on earlier versions of the manuscript; Oliver G. Pybus, Andrew Rambaut and JT. McCrone for their insightful comments on SARS‐CoV‐2 phylogenetic analyses; and the Imperial College Research Computing Service, DOI: 10.14469/hpc/2232, for providing the computational resources to perform this study.

Funding

This study was supported by the Medical Research Council-São Paulo Research Foundation (FAPESP) CADDE partnership award (MR/S0195/1 and FAPESP18/14389-0) (https://caddecentre.org) and by the EPSRC through the EPSRC Centre for Doctoral Training in Modern Statistics and Statistical Machine Learning at Imperial and Oxford. RSA acknowledges support from the Rede Coronaômica BR MCTI/FINEP affiliated to RedeVírus/MCTI (FINEP 01.20.0029.000462/20, CNPq 404096/2020-4), from CNPq (312688/2017-2 and 439119/2018-9), MEC/CAPES (14/2020 - 23072.211119/2020-10), FINEP (0494/20 01.20.0026.00); LSB acknowledges support from Inova Fiocruz (48401485034116); DSC is funded by the Clarendon Fund, University of Oxford Department of Zoology and Merton College; NF from Wellcome Trust and Royal Society (N.R.F.: Sir Henry Dale Fellowship. 204311/Z/16/Z) and from the Bill & Melinda Gates Foundation (INV-034540); CAP was supported by FAPESP (2019/21858-0), Fundação Faculdade de Medicina and Coordenação de Aperfeiçoamento de Pessoal de Nível Superior Brasil (CAPES) Finance Code 001; OTR from the Instituto de Salud Carlos III (Sara Borrell fellowship, CD19/00110); OR from the Bill & Melinda Gates Foundation (OPP1175094); ES from Bill & Melinda Gates Foundation (INV-034652); RPS from the Rede Coronaômica BR MCTI/FINEP affiliated to RedeVírus/MCTI (FINEP 01.20.0029.000462/20, CNPq 404096/2020-4), from CNPq (310627/2018-4), MEC/CAPES (14/2020 - 23072.211119/2020-10), FINEP (0494/20 01.20.0026.00), FAPEMIG (APQ-00475-20); WMS from FAPESP (2017/13981-0, 2019/24251-9) and the NIH (AI12094); CW acknowledges an MRC Doctoral Training partnership studentship.