Repository for "Extensively parameterized mutation–selection models reliably capture site-specific selective constraint", by SJS* and COW.
Contents of repository:
data/ contains all simulated alignments and the 512-taxon balanced trees (with branch lengths of either 0.5 or 0.01) used during simulation. Alignments are named with this format:
bl indicates the branch lengths of the tree used for simulation.
simulation/ contains all code used for simulating sequences, as well as for simulating the parameters used in sequence simulation.
ramsey2011_alignments contains all sequence alignments from Ramsey et al. (2011).
derive_natural_simulation_parameters.py derives parameters for simulating natural sequences.
derive_dms_simulation_parameters.py derives parameters for simulating DMS sequences. Note that experimental preferences are in the directory
true_simulation_parameters contains all true parameters for simulation, including true dN/dS and entropy, amino-acid fitness, codon frequencies, and selection coefficients.
simulate_alignments.py simulates a sequence alignment, specifically on UT's (now defunct) PhyloCluster.
inference/ contains all code used for mutation-selection model inference. All scripts named *.qsub are used for submitting jobs to UT's PhyloCluster, and all *.py scripts conduct and process inferences.
results/ contains all inference results.
swmutsel/ contains all inference results with swMutSel for a variety of penalizations, indicated in the file name. The script ./results/extract_sw_fitness.py extracts fitness values from the swMutSel MLE inferences into separate text files for later use.
phylobayes/ contains all inference results with pbMutSel.
postprocessing/ contains all code used to process, analyze, and plot data (mostly R scripts). All generated plots are also in this directory. All scripts should be executed from this directory! Note: R code requires the packages (and their dependencies) cowplot, ggrepel, dplyr, tidyr, readr, grid, lme4, multcomp, and lmerTest.
calculate_inferred_quantities.py calculates dN/dS, entropy, selection coefficient distributions, and JSD for inferences in results/. Resulting quantities are in the subdirectory dataframes.
process_results.R processes inference results in dataframes to create the final CSV file inference_results.csv.
plot_figures.R makes all the figures in the manuscript. Figures are saved in either the subdirectory
universal_functions.py is a Python module containing various functions used throughout the repository.
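
As a rough illustration of two of the site-wise quantities listed for calculate_inferred_quantities.py, the sketch below computes Shannon entropy and a Jensen-Shannon divergence from amino-acid frequency vectors. It is not taken from the repository: the function names, the use of natural logarithms, and the choice to return the divergence (rather than its square root, the Jensen-Shannon distance) are all assumptions made for illustration.

```python
import math


def shannon_entropy(freqs):
    """Shannon entropy (in nats) of a frequency vector that sums to 1.

    Hypothetical helper; zero-frequency states contribute nothing.
    """
    return -sum(p * math.log(p) for p in freqs if p > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q), skipping states where p is 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence between two frequency vectors.

    Hypothetical helper; take math.sqrt of the result if the
    Jensen-Shannon distance is wanted instead.
    """
    # The mixture m is positive wherever p or q is positive,
    # so the KL terms below never divide by zero.
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Example: a true and an inferred frequency vector over a toy 4-state alphabet.
true_freqs = [0.7, 0.1, 0.1, 0.1]
inferred_freqs = [0.4, 0.3, 0.2, 0.1]
site_entropy = shannon_entropy(true_freqs)
site_jsd = jensen_shannon_divergence(true_freqs, inferred_freqs)
```

In practice the vectors would have 20 entries (one per amino acid) for each alignment site, and the comparison would be between true simulation parameters and the values inferred by swMutSel or pbMutSel.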