reference-free ddRADseq analysis tools
The pipeline script generates reference-sorted, indexed BAM from uniqued reads from radtag sequencing lanes.
To generate uniqued reads, see preprocess_radtag_lane.py

Four accessory programs and three Python libraries are used, listed below.
For parallel execution, GNU parallel is also HIGHLY recommended.
Experimental LSF support is also available.

REQUIREMENTS:
- PATH must contain:
    blat
    mcl
    mcxload
    muscle
    samtools
    [parallel]

- PYTHONPATH must contain:
    numpy
    gdata
    editdist

Download locations (at the time of this writing, March 09 2011):
    blat          http://hgdownload.cse.ucsc.edu/downloads.html
    mcl/mcxload   http://www.micans.org/mcl/
    muscle        http://www.drive5.com/muscle/
    samtools      http://samtools.sourceforge.net/
    GNU parallel  http://savannah.gnu.org/projects/parallel/
    numpy *       http://sourceforge.net/projects/numpy/files/
    gdata         install gdata v2.0.10 included in this repository
                  (recent versions are known to be incompatible with rtd code, but are
                  available at: http://code.google.com/p/gdata-python-client/downloads/list)
    editdist      http://www.mindrot.org/projects/py-editdist/

* N.B. numpy is also available as part of the excellent Enthought Python Distribution,
  available free for academic/non-profit use at http://www.enthought.com/products/epd.php

NOTE ON GOOGLE DOCUMENTS SPREADSHEETS:
As of this writing (June 2012), the Google Spreadsheets API appears to query all fields of a user-edited spreadsheet correctly only if the first column is blank. Column A is therefore left blank in the tables generated by initialize_sample_DB.py (I recommend hiding column A of all programmatically accessed GDoc spreadsheets).
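The PATH and PYTHONPATH requirements listed under REQUIREMENTS can be checked before a run with a short script. The following is a minimal sketch and is not part of this repository: the helper names (find_on_path, is_importable, missing_programs, missing_modules) are hypothetical, and it assumes a POSIX-style PATH and that it is run with the same Python interpreter used for the pipeline.

```python
import os

# Names taken from the REQUIREMENTS list; "parallel" is optional there.
REQUIRED_PROGRAMS = ["blat", "mcl", "mcxload", "muscle", "samtools"]
OPTIONAL_PROGRAMS = ["parallel"]
REQUIRED_MODULES = ["numpy", "gdata", "editdist"]


def find_on_path(program, path=None):
    """Return the first executable file named `program` found on PATH, or None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def is_importable(module):
    """Return True if `module` can be imported by this interpreter."""
    try:
        __import__(module)
    except ImportError:
        return False
    return True


def missing_programs(names, path=None):
    """Return the subset of `names` with no executable on PATH."""
    return [name for name in names if find_on_path(name, path) is None]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    return [name for name in names if not is_importable(name)]


if __name__ == "__main__":
    print("missing programs: %s" % missing_programs(REQUIRED_PROGRAMS))
    print("missing optional programs: %s" % missing_programs(OPTIONAL_PROGRAMS))
    print("missing Python modules: %s" % missing_modules(REQUIRED_MODULES))
```

An empty list on each line means the corresponding requirement is satisfied. The check only confirms that gdata is importable, not that it is the v2.0.10 copy included in this repository.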