Home
This GitHub repository accompanies the publication: Pilosof S, He Q, Tiedje KE, Ruybal-Pesántez S, Day KP, Pascual M. Competition for hosts modulates vast antigenic diversity to generate persistent strain structure in Plasmodium falciparum. PLOS Biology, doi:
We assume that you have read the paper, so we use its terminology throughout.
The analysis is fully reproducible but complex, with multiple steps that depend on each other. Because it is computationally intensive, it is performed mainly on a High Performance Computing (HPC) grid; we ran ours on the University of Chicago's RCC Midway. The full analysis cannot be run on a local machine, and some adjustments for your specific local machine and HPC system will be necessary.
In any file that you use, please make sure to change the working folders, as well as the folders that hold the result files. Note that it is also necessary to change the folders in the function get_data in the file functions.R.
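One way to keep these folder changes manageable is to define the folders once and build every file path from them. The following is only a minimal sketch in R under assumed names: the `paths` list, the `result_file` helper and the folder names are hypothetical and are not part of the repository, and the placeholder locations under `tempdir()` should be replaced with the real folders on your machine or cluster.

```r
# Hypothetical settings block: none of these names come from the repository.
# Replace the placeholder locations with your own working and results folders.
paths <- list(
  work    = file.path(tempdir(), "working_folder"),  # placeholder working folder
  results = file.path(tempdir(), "results_folder")   # placeholder results folder
)

# Create the folders if they do not exist yet.
for (p in paths) {
  dir.create(p, recursive = TRUE, showWarnings = FALSE)
}

# Hypothetical helper: build a result-file path from the shared settings
# instead of hard-coding the folder in every script.
result_file <- function(name, base = paths$results) {
  file.path(base, name)
}

example_path <- result_file("example_output.csv")
```

With a block like this at the top of a script, moving the analysis to another system means editing only the two entries in `paths`. The folders referenced inside get_data in functions.R still have to be changed by hand.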
All the necessary files are in this repository (see Files in this repository).
Start by reading the General workflow pipeline. Then you can read about each of the phases of the analysis, which are performed in the following order:
Low, medium and high diversity, 3 scenarios each.
- Run ABM (50 runs per scenario per diversity regime)
- Select a threshold for edge weights
- Obtain results for benchmark scenarios (using the selected cutoff values for each regime).
- Sensitivity analysis
High diversity, selection scenario only
- Run ABM (50 runs)
- Select a threshold for edge weights
- Obtain results for benchmark scenarios
- Sensitivity analysis
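To make the thresholding step in the lists above concrete, here is a toy sketch in R of applying a cutoff to edge weights. It assumes the cutoff is expressed as a quantile of the edge-weight distribution and that edges below it are dropped. The `edges` data frame, the `apply_cutoff` helper and the value 0.6 are made up for illustration; the procedure actually used to select cutoff values is described in the paper and in the corresponding wiki page.

```r
# Toy weighted edge list; the values are invented for illustration only.
edges <- data.frame(
  from   = c("A", "A", "B", "C", "D"),
  to     = c("B", "C", "C", "D", "E"),
  weight = c(0.10, 0.40, 0.80, 0.55, 0.95)
)

# Hypothetical helper: keep only edges whose weight is at least the
# given quantile of all edge weights.
apply_cutoff <- function(edges, cutoff_quantile) {
  threshold <- quantile(edges$weight, probs = cutoff_quantile, names = FALSE)
  edges[edges$weight >= threshold, , drop = FALSE]
}

# Retain the strongest edges under an arbitrary example cutoff.
strong_edges <- apply_cutoff(edges, cutoff_quantile = 0.6)
```

Raising `cutoff_quantile` keeps fewer, stronger edges, which is why the cutoff is selected separately for each diversity regime before the benchmark results are computed.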
In this analysis we use a bioinformatic pipeline to cluster sequences into alleles. See the paper for details.
- All the parameter files used in the ABM are in the file ABM_parameter_files in FIGSHARE.
- We have put all the data that underlie the figures published in the paper, including the SI, in a dedicated repository, FIGSHARE. These data were produced using the workflow described above. Each zip file in the repository is named after the corresponding figure. The figures can be reproduced using these data with the code in the file Results_PLOS_biol.R.
The empirical data were also placed in the repository under the name empirical_data.zip. Follow the code in the file empirical.R to analyze the data and produce Figure 3 in the main text.