reference-free ddRADseq analysis tools

.gitignore
101013_lane7_sample_data.csv
DB_index_by_well.csv
LICENSE
LSF.py
README
RE_site_dropout.py
USAGE_NOTES
__init__.py
bam2fastq_by_index.py
calc_offby.py
config.template.py
convert_fq.py
estimate_error_by_clustering.py
evaluate_rtd_clustering.py
extract_perfect_RE_reads.py
find_perfect_match_reads.py
gdata-2.0.10.tar.gz
get_uniqued_lines_by_cluster.py
initialize_sample_DB.py
iterative_rtd.py
mcl_id_triples_by_blat.py
musclemap.py
overlap_preprocess.py
overlap_rtd.py
plot_error.py
pool_lane_counts.py
preprocess_radtag_lane.py
preprocess_radtag_lane_vlbc.py
read_quality_statistics.py
rtd_run.py
run_safe.py
s_7_sequence-1M.txt.gz
sam_from_clust_uniqued.py
simulate_loci.py
strip_rqtl_header_add_phenocols.py
summarize_sequencing_stats.py
vcf_to_rqtl.py
vcf_to_rqtl_DB.py
vcf_to_rqtl_from_template_map.py

README

The pipeline script generates a reference-sorted, indexed BAM file from uniqued reads from radtag sequencing lanes.

To generate uniqued reads, see preprocess_radtag_lane.py

Four accessory programs and three Python libraries are required; they are listed below.
For parallel execution, GNU parallel is also HIGHLY recommended.
Experimental LSF support is also available.

REQUIREMENTS:
-        PATH must contain: blat mcl mcxload muscle samtools [parallel]
-  PYTHONPATH must contain: numpy gdata editdist
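
The requirements above can be checked before a run with a small helper. This is a minimal sketch and is not part of the repository: the function names are hypothetical, and it assumes an interpreter that provides shutil.which and importlib.util.find_spec (Python 3.4 or later). Module checks only reflect the interpreter that runs the helper, so run it with the interpreter you intend to use for the pipeline.

```python
import importlib.util
import shutil

# Names copied from the REQUIREMENTS list; GNU parallel is optional.
REQUIRED_PROGRAMS = ["blat", "mcl", "mcxload", "muscle", "samtools"]
OPTIONAL_PROGRAMS = ["parallel"]
REQUIRED_MODULES = ["numpy", "gdata", "editdist"]


def missing_programs(names):
    """Return the names that cannot be found as executables on PATH."""
    return [name for name in names if shutil.which(name) is None]


def missing_modules(names):
    """Return the top-level module names Python cannot locate for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    checks = [
        ("required programs", missing_programs(REQUIRED_PROGRAMS)),
        ("optional programs", missing_programs(OPTIONAL_PROGRAMS)),
        ("required modules", missing_modules(REQUIRED_MODULES)),
    ]
    for label, missing in checks:
        status = "all found" if not missing else "missing: " + " ".join(missing)
        print(label + ": " + status)
```

An empty list from either function means everything in that group was found; a module being locatable does not guarantee that its version is compatible.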

Download locations (as of this writing, March 09 2011):
  blat         http://hgdownload.cse.ucsc.edu/downloads.html
  mcl/mcxload  http://www.micans.org/mcl/
  muscle       http://www.drive5.com/muscle/
  samtools     http://samtools.sourceforge.net/
  GNU parallel http://savannah.gnu.org/projects/parallel/

  numpy *      http://sourceforge.net/projects/numpy/files/
  gdata        install gdata v2.0.10, included in this repository
               (more recent versions are known to be incompatible
               with the rtd code, but are available at
               http://code.google.com/p/gdata-python-client/downloads/list)
  editdist     http://www.mindrot.org/projects/py-editdist/

* N.B. numpy is also available as part of the excellent Enthought Python Distribution,
available free for academic/non-profit use at http://www.enthought.com/products/epd.php

NOTE ON GOOGLE DOCUMENTS SPREADSHEETS:
As of this writing (June 2012), the Google Spreadsheets API appears to correctly query
all fields of a user-edited spreadsheet only if the first column is blank.
Column A is therefore left blank in the tables generated by initialize_sample_DB.py.
(I recommend hiding column A of all programmatically accessed GDoc spreadsheets.)
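
The blank-column layout can be illustrated with a short sketch. It does not reproduce initialize_sample_DB.py and does not call the Google API; it only shows rows padded so that column A stays empty. The function name, field names, and values are hypothetical.

```python
import csv
import io


def rows_with_blank_first_column(header, records):
    """Yield the header and each record with an empty cell prepended (column A)."""
    yield [""] + list(header)
    for record in records:
        yield [""] + [record.get(field, "") for field in header]


# Hypothetical fields and values, for illustration only.
header = ["flowcell", "lane", "index"]
records = [{"flowcell": "FC1", "lane": "7", "index": "1"}]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerows(rows_with_blank_first_column(header, records))
table_text = buffer.getvalue()
print(table_text)
```

Every line of the resulting text starts with a comma, so a spreadsheet built from it has real data beginning in column B.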