Learning Natural Selection from the Site Frequency Spectrum

SFselect

Learning Natural Selection from the Site Frequency Spectrum

This repo includes two main programs:

  1. SFselect.py -- a standalone program that applies pre-trained SVMs of the site frequency spectrum (SFS) to allele frequency data. For each sliding genomic window, it outputs the probability, under the model, that the window is evolving under a selective sweep.

For more details on using SFselect.py, see http://bioinf.ucsd.edu/~rronen/sfselect.html

  2. SFselect_train.py -- a program for training SVMs of the site frequency spectrum (SFS) to distinguish regions evolving neutrally from those evolving under a hard selective sweep. This program requires simulated population data as input (which can be generated with simulators such as ms or msms). See 'params.py' for setting the simulation parameters (from which the data file names are constructed, among other things).

To run it, install scikit-learn (sklearn) v0.13 by running:

sudo pip install -Iv https://pypi.python.org/packages/source/s/scikit-learn/scikit-learn-0.13.tar.gz#md5=8d6029f668a330aded7afe5df18df4dc
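
To make the "SFS per sliding genomic window" idea concrete, here is a minimal, standard-library-only sketch. It is not taken from SFselect.py and does not reproduce its feature construction: the function names (`window_sfs`, `sliding_windows`), the input format (position-sorted `(position, derived_allele_count)` pairs), and the window size and step are all assumptions made for illustration.

```python
from collections import Counter


def window_sfs(derived_counts, n):
    """Normalized (unfolded) SFS for one window.

    derived_counts: derived-allele count of each SNP in the window.
    n: number of sampled chromosomes.
    Returns a list of length n-1 whose entry i-1 is the fraction of
    segregating sites with derived-allele count i.
    """
    # Keep segregating sites only (count strictly between 0 and n).
    seg = [c for c in derived_counts if 0 < c < n]
    total = len(seg)
    if total == 0:
        return [0.0] * (n - 1)
    tally = Counter(seg)
    return [tally[i] / float(total) for i in range(1, n)]


def sliding_windows(snps, size, step):
    """Yield (start, end, counts) for half-open windows [start, end).

    snps: list of (position, derived_allele_count), sorted by position.
    size, step: hypothetical window length and stride, in base pairs.
    """
    if not snps:
        return
    last_pos = snps[-1][0]
    start = 0
    while start <= last_pos:
        end = start + size
        yield start, end, [c for pos, c in snps if start <= pos < end]
        start += step


# Hypothetical toy data: 5 SNPs sampled from n = 5 chromosomes.
example_snps = [(100, 1), (250, 1), (400, 3), (1200, 2), (1500, 4)]
for start, end, counts in sliding_windows(example_snps, size=1000, step=500):
    print(start, end, window_sfs(counts, n=5))
```

Each per-window vector is the kind of input a classifier could score. How SFselect.py actually scales or bins the spectrum is described at the URL above, not here.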

### Dependencies

numpy, matplotlib, scikit-learn (tested with v0.13)