source files for paper 'Antibiotic-resistant Neisseria gonorrhoeae spread faster with more treatment, not more sexual partners'
data
scripts
sensitivity
license.txt
readme.txt

# README
# descriptions of contents of data/, scripts/

# data/
The following data files are included:

  - behav_het.data:
    generated by: scripts/s_mle-het.R
    description: partner change rates and proportion of the population in the low sexual activity group for heterosexual men and women, estimated by maximum likelihood from two Poisson distributions
    columns: a: proportion of population in low activity class, m1: average number of partners per year for low activity class, m2: average number of partners per year for high activity class

  - behav_msm.data:
    generated by: scripts/s_mle-msm.R
    description: partner change rates and proportion of the population in the low sexual activity group for men who have sex with men, estimated by maximum likelihood from two Poisson distributions
    columns: a: proportion of population in low activity class, m1: average number of partners per year for low activity class, m2: average number of partners per year for high activity class

  - output_het_993734.data:
    note: please request or generate yourself
    generated by: workflow on cluster; script: scripts/cluster/s_para_merge.R
    description: equilibrium prevalences for 10^7 parameter sets drawn with seed 993734, for heterosexual men and women
    columns: 1: prevalence in low activity class, 2: prevalence in high activity class, 3: prevalence in overall population, 4: incidence in low activity class, 5: incidence in high activity class, 6: incidence in overall population, 7: sexual mixing coefficient, 8: transmission probability within low activity class, 9: transmission probability within high activity class, 10: average duration of infection (in years), 11: frequency of diagnosis and treatment
    note: the incidence of diagnosed infections is calculated as incidence times frequency of diagnosis and treatment (e.g. column 6 * column 11 for overall population)

  - output_msm_312774.data:
    note: please request or generate yourself
    generated by: workflow on cluster; script: scripts/cluster/s_para_merge.R
    description: equilibrium prevalences for 10^7 parameter sets drawn with seed 312774, for men who have sex with men
    columns: 1: prevalence in low activity class, 2: prevalence in high activity class, 3: prevalence in overall population, 4: incidence in low activity class, 5: incidence in high activity class, 6: incidence in overall population, 7: sexual mixing coefficient, 8: transmission probability within low activity class, 9: transmission probability within high activity class, 10: average duration of infection (in years), 11: frequency of diagnosis and treatment
    note: the incidence of diagnosed infections is calculated as incidence times frequency of diagnosis and treatment (e.g. column 6 * column 11 for overall population)

  - outros_het_993734.data:
    note: please request or generate yourself
    generated by: workflow on cluster; script: scripts/cluster/s_para_condition.R
    description: parameter sets that give desired resistant-free equilibrium prevalences and incidences, for heterosexual men and women, seed 993734
    columns: 1: prevalence in low activity class, 2: prevalence in high activity class, 3: prevalence in overall population, 4: incidence in low activity class, 5: incidence in high activity class, 6: incidence in overall population, 7: sexual mixing coefficient, 8: transmission probability within low activity class, 9: transmission probability within high activity class, 10: average duration of infection (in years), 11: frequency of diagnosis and treatment
    note: the incidence of diagnosed infections is calculated as incidence times frequency of diagnosis and treatment (e.g. column 6 * column 11 for overall population)

  - outros_msm_312774.data:
    note: please request or generate yourself
    generated by: workflow on cluster; script: scripts/cluster/s_para_condition.R
    description: parameter sets that give desired resistant-free equilibrium prevalences and incidences, for men who have sex with men, seed 312774
    columns: 1: prevalence in low activity class, 2: prevalence in high activity class, 3: prevalence in overall population, 4: incidence in low activity class, 5: incidence in high activity class, 6: incidence in overall population, 7: sexual mixing coefficient, 8: transmission probability within low activity class, 9: transmission probability within high activity class, 10: average duration of infection (in years), 11: frequency of diagnosis and treatment
    note: the incidence of diagnosed infections is calculated as incidence times frequency of diagnosis and treatment (e.g. column 6 * column 11 for overall population)

  - propresT_het_993734.data:
    note: please request or generate yourself
    generated by: scripts/cluster/s_res_run-res-eq_local.R
    description: proportion of the total infected population that carries the resistant strain, results from all calibrated simulations for heterosexual men and women, seed 993734
    rows: proportion resistant in total population at time point indicated by same row number in file times_het_993734.data
    columns: parameter sets, same order as rows in outros_het_993734.data

  - propresT_msm_312774.data:
    note: please request or generate yourself
    generated by: scripts/cluster/s_res_run-res-eq_local.R
    description: proportion of the total infected population that carries the resistant strain, results from all calibrated simulations for men who have sex with men, seed 312774
    rows: proportion resistant in total population at time point indicated by same row number in file times_msm_312774.data
    columns: parameter sets, same order as rows in outros_msm_312774.data

  - natsal.RData:
    note: not provided because not our data. data is available upon registration, download at: http://discover.ukdataservice.ac.uk/catalogue/?sn=5223&type=Data%20catalogue
    description: Natsal-2 (National Survey of Attitudes and Lifestyles 2) data set, see Natsal-2 documentation

  - rates.data:
    generated by: scripts/s_exp-fit_p_fig2.R
    description: exponential growth rates from fitting logistic functions to surveillance data
    columns: as indicated by column names

  - surveillance.data:
    generated by: manual, digitized with plotdigitizer
    description: digitized surveillance data from GISP and GRASP programs
    columns: as indicated by column names

  - theta312774.data:
    note: please request or generate yourself
    generated by: workflow on cluster; script: scripts/cluster/s_para_draw.R
    description: parameter sets drawn with seed 312774 from priors for men who have sex with men
    columns: as indicated by column names; epsilon: sexual mixing coefficient, betaL: transmission probability within low activity class, betaH: transmission probability within high activity class, D: average duration of infection in years, f: frequency of diagnosed and treated infections

  - theta993734.data:
    note: please request or generate yourself
    generated by: workflow on cluster; script: scripts/cluster/s_para_draw.R
    description: parameter sets drawn with seed 993734 from priors for heterosexual men and women
    columns: as indicated by column names; epsilon: sexual mixing coefficient, betaL: transmission probability within low activity class, betaH: transmission probability within high activity class, D: average duration of infection in years, f: frequency of diagnosed and treated infections
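
The columns a, m1 and m2 in behav_het.data and behav_msm.data come from a maximum likelihood fit of two Poisson distributions. A minimal sketch of such a fit in R is given below. It assumes a two-component Poisson mixture and uses simulated partner numbers in place of the Natsal-2 data, which is not distributed here. The function name nll_two_poisson, the starting values and the simulated data are illustrative assumptions; this is not the code in scripts/s_mle-het.R or scripts/s_mle-msm.R.

```r
# Negative log-likelihood of a two-component Poisson mixture.
# Parameters are transformed so the optimiser can search without bounds:
#   par[1] = logit(a), par[2] = log(m1), par[3] = log(m2)
nll_two_poisson <- function(par, x) {
  a  <- plogis(par[1])  # proportion of population in low activity class
  m1 <- exp(par[2])     # mean partners per year, low activity class
  m2 <- exp(par[3])     # mean partners per year, high activity class
  -sum(log(a * dpois(x, m1) + (1 - a) * dpois(x, m2)))
}

# Simulated partner numbers (assumption: stands in for the survey data).
set.seed(1)
n <- 5000
is_low <- runif(n) < 0.9
x <- ifelse(is_low, rpois(n, 0.5), rpois(n, 8))

# Start with m1 < m2 so that "low" and "high" keep their meaning.
start <- c(qlogis(0.5), log(0.2), log(5))
fit <- optim(start, nll_two_poisson, x = x, control = list(maxit = 2000))

# Back-transform to the scale used in the behav_*.data columns.
est <- c(a = plogis(fit$par[1]), m1 = exp(fit$par[2]), m2 = exp(fit$par[3]))
print(round(est, 2))
```

The estimates should land close to the values used to simulate the data (a = 0.9, m1 = 0.5, m2 = 8). The logit and log transformations are one way to keep a in (0, 1) and the means positive without a bounded optimiser.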

# scripts/
All scripts use relative paths and should be run from within the scripts/ folder, i.e. set the working directory with the R command setwd("scripts/"). The seed used for men who have sex with men (MSM) is 312774 and for heterosexual men and women (HMW) 993734.
The following scripts are included:

  - f_mixing.R:
    aim: R function to compute mixing matrix rho_{ij} for model runs

  - f_model.R:
    aim: R function to provide model equations to be solved in ODE solver

  - f_multipanel.R:
    aim: provide a plotting function that allows for multiple panels in one plot with alphabetical panel labelling

  - p_fig3.R:
    aim: plot prior and posterior distributions of the parameters: sexual mixing coefficient, fraction of diagnosed and treated infections, average duration of infection, transmission probability within low activity group, transmission probability within high activity group
    output: figures/Fig3.tiff

  - p_fig4.R:
    aim: plot timeline of the increase in resistance for the median of simulations, 50% of simulations, 95% of simulations
    output: figures/Fig4.tiff

  - p_fig5.R:
    aim: plot distributions of treatment rates from simulations
    output: figures/Fig5.tiff

  - p_Sfigs.R:
    aim: plot supplementary figures: prevalences and incidences for heterosexual men and women/men who have sex with men
    output: figures/S1fig.tiff, figures/S2fig.tiff

  - s_exp-fit_p_fig2.R:
    aim: fit antibiotic resistance surveillance data to logistic growth models, estimate exponential growth rates from data, estimate doubling times from data (no output, only in script), plot data and logistic growth models
    output: figures/Fig2.tiff, data/rates.data

  - s_mle-het.R:
    aim: estimate partner change rate and proportion low activity group for heterosexual men and women
    output: data/behav_het.data
    note: needs data set ‘data/natsal.Rdata’ which is not provided by us, but can be downloaded at http://discover.ukdataservice.ac.uk/catalogue/?sn=5223&type=Data%20catalogue

  - s_mle-msm.R:
    aim: estimate partner change rate and proportion low activity group for men who have sex with men
    output: data/behav_msm.data
    note: needs data set ‘data/natsal.Rdata’ which is not provided by us, but can be downloaded at http://discover.ukdataservice.ac.uk/catalogue/?sn=5223&type=Data%20catalogue

  - s_natsal-condom-usage.R:
    aim: check whether men who have sex with men use more condoms than heterosexual men and women
    output: men who have sex with men use more condoms than heterosexual men and women, which supports the corresponding claim in the manuscript.
    note: needs data set ‘data/natsal.Rdata’ which is not provided by us, but can be downloaded at http://discover.ukdataservice.ac.uk/catalogue/?sn=5223&type=Data%20catalogue

  - s_res_run-res-eq.R:
    aim: simulate model with resistant strain for all calibrated parameter sets
    output: data/times*.data, data/propresL*.data, data/propresH*.data, data/propresT*.data
    note: can be run locally, but takes rather long (hours). It takes longer for men who have sex with men than for heterosexual men and women, because more parameter sets fit the prevalences and incidences of men who have sex with men, so there are more simulations to run.
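
The fit described for s_exp-fit_p_fig2.R (logistic growth, growth rate, doubling time) can be illustrated with a minimal R sketch. It assumes a logistic curve that saturates at 100% resistance, a least-squares fit with optim, and made-up data points instead of data/surveillance.data. The names logistic, sse, t_obs and p_obs are hypothetical, and the actual script may parameterise or fit the model differently.

```r
# Logistic growth of the resistant proportion with growth rate r and
# midpoint t0 (time at which half of the isolates are resistant).
logistic <- function(t, r, t0) 1 / (1 + exp(-r * (t - t0)))

# Made-up observations (assumption: stands in for surveillance data).
set.seed(2)
t_obs <- 0:12
p_obs <- logistic(t_obs, r = 0.6, t0 = 6) + rnorm(length(t_obs), sd = 0.01)

# Sum of squared errors between observed and modelled proportions.
sse <- function(par) sum((p_obs - logistic(t_obs, par[1], par[2]))^2)

fit <- optim(c(r = 0.3, t0 = 4), sse)
r_hat <- unname(fit$par[1])

# While resistance is still rare, the logistic curve grows approximately
# exponentially at rate r, so the doubling time is log(2) / r.
doubling_time <- log(2) / r_hat
print(c(rate = r_hat, doubling_time = doubling_time))
```

The doubling time is only meaningful for the early, approximately exponential phase of the curve, which is why it is derived from the growth rate rather than read off the fitted proportions.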

## cluster/:
    description: This folder contains all scripts that are designed to be run on a cluster, so that several simulations can be computed in parallel. The scripts are designed to be called from the terminal. The files with the suffix ‘_local’ were not used to run the simulations shown in the manuscript. Instead, they provide the same code we ran on the cluster in a form that can be used on a local computer. The main file ‘s_para_main.R’, which calls all other scripts, is not provided in a local version because it is specifically designed to run the entire workflow on the cluster. A description of all scripts included in cluster/ follows.

    - s_para_condition_local.R:
      description: same as s_para_condition.R (see below); once data/output*.data has been generated, it can be run locally to reproduce data/outros_*.data via the terminal command ‘Rscript s_para_condition.R POPULATION SEED CONDITION’, where POPULATION: het/msm, SEED: seed used to draw from prior distributions, CONDITION: het/msm
      output: data/outros*.data

    - s_para_condition.R:
      called by: s_para_main.R (on cluster)
      aim: select parameter sets that give desired prevalences and incidences (defined in here)
      output: equivalent to locally generated data/outros_*.data

    - s_para_draw_local.R:
      description: same as s_para_draw.R, however can be run locally to generate data/theta*.data via the terminal command ‘Rscript s_para_draw_local.R POPULATION SIZE SEED’, where POPULATION: het/msm, SIZE: number of samples (i.e. parameter sets) that should be drawn from the prior distributions, SEED: seed used to draw from prior distributions
      output: data/theta*.data

    - s_para_draw.R:
      called by: s_para_main.R (on cluster)
      aim: draw parameter sets from prior distributions (defined in s_para_sampling.R)
      output: equivalent to locally generated data/theta*.data

    - s_para_main.R:
      aim: combine entire simulation workflow into this file
      description: The simulation workflow consists of the following steps:
        - draw the seed with which samples are drawn from the prior distributions
        - call s_para_draw.R: draw from prior distributions
        - call s_para_run-free-eq.R (parallel calling to simulate bundles of 1000 parameter sets): run simulations for given parameter sets until resistant-free equilibrium reached
        - call s_para_merge.R: merge output from parallel simulations into one file
        - call s_para_condition.R: select parameter sets that yield relevant prevalences and incidences (posteriors)
        - call s_plot_para.R: plot control plot to check parameter priors and posteriors
      This file needs to be adapted to be run locally.

    - s_para_merge_local.R:
      description: same as s_para_merge.R, however can be run locally to merge data/free-eq*.data (once generated by ‘s_para_run-free-eq_local.R’) into data/output*.data.
      output: data/output*.data
      note: to reproduce data/output*.data as used in the manuscript, data/free-eq*.data need to be generated first

    - s_para_merge.R:
      called by: s_para_main.R (on cluster)
      aim: merge output from parallel simulation runs to one single file
      output: generated on cluster, saved here as data/output*.data

    - s_para_run-free-eq_local.R:
      description: same as s_para_run-free-eq.R, however can be run locally to generate data/free-eq_*.data (once data/theta*.data is generated) via terminal command ‘Rscript s_para-run-free-eq BUNDLE POPULATION SEED’ where BUNDLE: index of the bundle of 1000 parameter sets, i.e. 0=theta[1:1000,], 1=theta[1001:2000,], …, POPULATION: het/msm, SEED: seed used to draw from prior distributions. The file data/free-eq_0*.data corresponds to the first 1000 rows of output*.data for the corresponding population and seed, and so on for further rows in output*.data. It is not recommended to reproduce all 2*10^7/1000 free-eq*.data files locally (2 populations * 10^7 parameter sets drawn from prior distributions / 1000 parameter sets per bundle) because it would take very long if done in non-parallel fashion. But of course you are very welcome to if you are curious.

    - s_para_run-free-eq.R:
      called by: s_para_main.R (on cluster)
      aim: simulate model for 1000 parameter sets
      output: generated on cluster, not saved here (because it would have been 10^4 files); instead, merged files containing the outputs from all simulations are saved here as data/output*.data

    - s_para_sampling.R:
      called by: s_para_draw.R (on cluster)
      aim: function to define prior distributions

    - s_plot_para.R:
      called by: s_para_main.R (on cluster)
      aim: plot prevalences, incidences and parameters before and after conditioning (i.e. priors and posteriors) to check simulations and conditioning visually
      output: generates figures/fig*.pdf, not yet generated in this data set
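
The BUNDLE argument of the free-equilibrium scripts maps to rows of theta as described above (0 = rows 1 to 1000, 1 = rows 1001 to 2000, and so on). The following R sketch shows that mapping and one possible way to read the three command line arguments. The helper names bundle_rows and parse_run_args are hypothetical and do not appear in the repository; in a script called with Rscript, the argument vector would typically come from commandArgs(trailingOnly = TRUE).

```r
# Rows of theta handled by a given bundle (bundles are numbered from 0).
bundle_rows <- function(bundle, bundle_size = 1000) {
  (bundle * bundle_size + 1):((bundle + 1) * bundle_size)
}

# Interpret the arguments 'BUNDLE POPULATION SEED' (hypothetical helper).
parse_run_args <- function(args) {
  stopifnot(length(args) == 3, args[2] %in% c("het", "msm"))
  list(bundle     = as.integer(args[1]),
       population = args[2],
       seed       = as.integer(args[3]))
}

# Example: second bundle for men who have sex with men.
# In a real script: args <- parse_run_args(commandArgs(trailingOnly = TRUE))
args <- parse_run_args(c("1", "msm", "312774"))
rows <- bundle_rows(args$bundle)

# Number of bundles per population for 10^7 parameter sets.
n_bundles <- 1e7 / 1000
print(c(first_row = rows[1], last_row = rows[length(rows)], n_bundles = n_bundles))
```

With 10^7 parameter sets and 1000 sets per bundle, each population needs 10^4 bundles, which matches the number of free-eq files mentioned for the cluster run.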