Code and data for Spielman and Wilke, The relationship between dN/dS and scaled selection coefficients, Mol. Biol. Evol. 2015.

Omega_MutSel

Repository for "The relationship between dN/dS and scaled selection coefficients", Stephanie J. Spielman and Claus O. Wilke. All code written by SJS (contact at stephanie.spielman@gmail.com).

Description of Contents


datasets/

Contains tab-delimited summary files for simulated datasets. All simulated alignments available from TBD (for now, email stephanie.spielman@gmail.com).

scripts/

Contains scripts used in analysis. [NOTE: all simulated alignments were created using a custom sequence simulation library, pyvolve. See within for details.]

  • experimental_data/

    • nucleoprotein_amino_preferences.txt
      • This file corresponds exactly to supplementary_file_1.xls from Bloom 2014. It gives amino acid preference/fitness data for each of the 498 positions in NP. Each row is a position, and each column is an amino acid preference (in alphabetical order).
    • np_codon_eqfreqs.txt
      • Contains codon equilibrium frequencies computed from NP preference data and NP mutation rates. Each row is a position, and values are alphabetical (first column is AAA, second column is AAC, etc.). Generated by prefs_to_freqs.py .
    • yeast_codon_eqfreqs.txt
      • Contains codon equilibrium frequencies computed from yeast preference data and yeast mutation rates. Each row is a position, and values are alphabetical (first column is AAA, second column is AAC, etc.). Generated by prefs_to_freqs.py .
    • polio_codon_eqfreqs.txt
      • Contains codon equilibrium frequencies computed from polio preference data and polio mutation rates. Each row is a position, and values are alphabetical (first column is AAA, second column is AAC, etc.). Generated by prefs_to_freqs.py .
  • np_scripts/

  • simulation_scripts/ Scripts in this directory were created to run specifically on The University of Texas at Austin's Center for Computational Biology and Bioinformatics cluster, Phylocluster. All files with the extension ".qsub" are job submission scripts corresponding to a particular Python script, such that xyz.qsub goes with run_xyz.py.

    • run_sim_nyp.py
      • simulate alignments which use NP amino acid fitness data and either NP, yeast, or polio mutation rates
    • run_nyp.py
      • infer dN/dS and omega for NP, yeast, or polio datasets
    • run_siminf.py
      • simulate alignments and subsequently infer dN/dS and omega for the "synonymous selection" and "no synonymous selection" sets
    • run_convergence.py
      • simulate alignments and infer dN/dS and omega to demonstrate omega convergence with datasets of increasing size
    • functions_omega_mutsel.py
      • Contains functions used by scripts in this directory.
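
The column layout of the *_codon_eqfreqs.txt files described above can be sketched in Python as follows. This is a minimal illustration, not code from this repository: it assumes each row holds 61 whitespace-separated values (the sense codons of the standard genetic code, with stop codons TAA, TAG, and TGA omitted), and the names `SENSE_CODONS`, `label_frequencies`, and `read_eqfreqs` are hypothetical. If the files actually contain a different number of columns, the length check raises an error rather than mislabeling values.

```python
from itertools import product

# Assumption: stop codons of the standard genetic code are not in the files.
STOP_CODONS = {"TAA", "TAG", "TGA"}

# product() over "ACGT" yields triplets in alphabetical order: AAA, AAC, AAG, ...
SENSE_CODONS = [
    codon
    for codon in ("".join(triplet) for triplet in product("ACGT", repeat=3))
    if codon not in STOP_CODONS
]


def label_frequencies(line):
    """Map one row (one position) of frequencies to a {codon: frequency} dict."""
    values = [float(field) for field in line.split()]
    if len(values) != len(SENSE_CODONS):
        raise ValueError(
            "expected %d columns, found %d" % (len(SENSE_CODONS), len(values))
        )
    return dict(zip(SENSE_CODONS, values))


def read_eqfreqs(path):
    """Return a list with one {codon: frequency} dict per position (row)."""
    with open(path) as handle:
        return [label_frequencies(line) for line in handle if line.strip()]
```

For example, `read_eqfreqs("np_codon_eqfreqs.txt")[0]["AAA"]` would give the equilibrium frequency of codon AAA at the first position, provided the assumptions above hold for that file.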

hyphy_files/

Contains files used in HYPHY inference.

  • globalDNDS_fequal.bf
    • hyphy batchfile to infer omega according to the GY94 M0 model with the Fequal (1/61 for all codons) frequency parameterization. Used to determine omega for nosynsel.txt, synsel.txt, and conv.txt.
  • globalDNDS_np.bf
    • hyphy batchfile to infer omega for simulations with experimental NP amino acid fitness data and NP mutation rates, according to a variety of frequency parameterizations. Generated by prefs_to_freqs.py .
  • globalDNDS_yeast.bf
    • hyphy batchfile to infer omega for simulations with experimental NP amino acid fitness data and yeast mutation rates, according to a variety of frequency parameterizations. Generated by prefs_to_freqs.py .
  • globalDNDS_polio.bf
    • hyphy batchfile to infer omega for simulations with experimental NP amino acid fitness data and polio mutation rates, according to a variety of frequency parameterizations. Generated by prefs_to_freqs.py .
  • CF3x4.bf
  • GY94.mdl
    • contains standard GY94 rate matrix
  • MG_np.mdl
    • contains MG1 and MG3 matrices for NP mutation rates
  • MG_yeast.mdl
    • contains MG1 and MG3 matrices for yeast mutation rates
  • MG_polio.mdl
    • contains MG1 and MG3 matrices for polio mutation rates
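
As a small illustration of the Fequal parameterization mentioned for globalDNDS_fequal.bf, the sketch below builds the uniform frequency vector in Python. It is not part of the HyPhy batchfiles; it assumes the standard genetic code (three stop codons, hence 61 sense codons), and the variable names are hypothetical.

```python
from itertools import product

# Assumption: standard genetic code, so TAA, TAG, and TGA are the stop codons.
stop_codons = {"TAA", "TAG", "TGA"}
sense_codons = [
    codon
    for codon in ("".join(triplet) for triplet in product("ACGT", repeat=3))
    if codon not in stop_codons
]

# Fequal: every sense codon receives the same equilibrium frequency, 1/61.
fequal = {codon: 1.0 / len(sense_codons) for codon in sense_codons}
```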