reference-free ddRADseq analysis tools
.gitignore updated .gitignore to include gdata build directory
101013_lane7_sample_data.csv added test sample data csv
DB_index_by_well.csv DB_index_by_well.csv added
LICENSE lgpl3 added
LSF.py prepro
README README updated
RE_site_dropout.py add iterative
USAGE_NOTES Update USAGE_NOTES
__init__.py 20120208 radtag_denovo code added
bam2fastq_by_index.py 20120208 radtag_denovo code added
calc_offby.py add iterative
config.template.py config.template.py corrected
convert_fq.py add iterative
estimate_error_by_clustering.py tagged version
evaluate_rtd_clustering.py added run_safe.py
extract_perfect_RE_reads.py add iterative
find_perfect_match_reads.py 20120208 radtag_denovo code added
gdata-2.0.10.tar.gz added gdata 2.0.10
get_uniqued_lines_by_cluster.py multiple subject support added, bugfixes
initialize_sample_DB.py 20120208 radtag_denovo code added
iterative_rtd.py iterative_rtd updates
mcl_id_triples_by_blat.py rtd fixes
musclemap.py 20120208 radtag_denovo code added
overlap_preprocess.py fixed sq lookup
overlap_rtd.py add iterative
plot_error.py add plot_error.py
pool_lane_counts.py add pool
preprocess_radtag_lane.py passthough for db records in legacy lookup
preprocess_radtag_lane_vlbc.py refactored vcf_to_rqtl
read_quality_statistics.py read_quality_statistics added
rtd_run.py commit
run_safe.py exception on 0 length return
s_7_sequence-1M.txt.gz added sample sequence data
sam_from_clust_uniqued.py DB_index_by_well.csv added
simulate_loci.py simulation scripts updated to include efficiency predictions
strip_rqtl_header_add_phenocols.py 20120208 radtag_denovo code added
summarize_sequencing_stats.py switched .uniqued handling to compressed by default
vcf_to_rqtl.py Merge branch 'master' of github.com:brantp/rtd
vcf_to_rqtl_DB.py added htseq style vcf_to_rqtl (vcf_to_rqtl_DB.py)
vcf_to_rqtl_from_template_map.py prepro

README

The pipeline script generates a reference-sorted, indexed BAM file from the uniqued reads of radtag sequencing lanes.

To generate uniqued reads, see preprocess_radtag_lane.py

Four accessory programs and three Python libraries are required, listed below.
For parallel execution, GNU parallel is also HIGHLY recommended.
Experimental LSF support is also available.

REQUIREMENTS:
-        PATH must contain: blat mcl mcxload muscle samtools [parallel]
-  PYTHONPATH must contain: numpy gdata editdist
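
The requirements above can be checked before a run with a short script. The following is a minimal sketch, not part of rtd: it assumes a Python 3 interpreter (which may differ from the interpreter the rtd scripts themselves target), and the function names are hypothetical. It only tests that each program is findable on PATH and each module is importable; it does not check versions.

```python
import importlib.util
import shutil

# Names taken from the REQUIREMENTS list; "parallel" is optional there.
REQUIRED_PROGRAMS = ["blat", "mcl", "mcxload", "muscle", "samtools"]
OPTIONAL_PROGRAMS = ["parallel"]
REQUIRED_MODULES = ["numpy", "gdata", "editdist"]


def missing_programs(names, which=shutil.which):
    """Return the names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


def missing_modules(names, find_spec=importlib.util.find_spec):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


def report():
    """Print one line per category of missing requirement."""
    print("missing programs:", missing_programs(REQUIRED_PROGRAMS) or "none")
    print("missing optional programs:", missing_programs(OPTIONAL_PROGRAMS) or "none")
    print("missing modules:", missing_modules(REQUIRED_MODULES) or "none")


if __name__ == "__main__":
    report()
```

Run it with the same PATH and PYTHONPATH that the pipeline will use, so the result reflects the environment of the actual run.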

Download locations (as of this writing, March 09 2011):
  blat         http://hgdownload.cse.ucsc.edu/downloads.html
  mcl/mcxload  http://www.micans.org/mcl/
  muscle       http://www.drive5.com/muscle/
  samtools     http://samtools.sourceforge.net/
  GNU parallel http://savannah.gnu.org/projects/parallel/

  numpy *      http://sourceforge.net/projects/numpy/files/
  gdata        install gdata v2.0.10, included in this repository as gdata-2.0.10.tar.gz
               (more recent versions are known to be incompatible
               with rtd code, but are available at:
               http://code.google.com/p/gdata-python-client/downloads/list)
  editdist     http://www.mindrot.org/projects/py-editdist/

* N.B. numpy is also available as part of the excellent Enthought Python Distribution,
available free for academic/non-profit use at http://www.enthought.com/products/epd.php

NOTE ON GOOGLE DOCUMENTS SPREADSHEETS:
As of this writing (June 2012), the Google Spreadsheets API appears to query all fields of a
user-edited spreadsheet correctly only if the first column is blank.
Column A is therefore left blank in the tables generated by initialize_sample_DB.py.
(I recommend hiding column A of all programmatically accessed GDoc spreadsheets.)
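
The blank-column layout can be illustrated with a small sketch. This is not the code in initialize_sample_DB.py and it does not call the gdata API; it only shows, under assumed field names, how each row of a table can be padded so that column A stays empty. The helper names and the example fields (flowcell, lane, index) are hypothetical.

```python
import csv
import io


def rows_with_blank_first_column(header, records):
    """Yield a header row and data rows, each with an empty leading cell."""
    yield [""] + list(header)
    for record in records:
        # Fields absent from a record are written as empty cells.
        yield [""] + [record.get(field, "") for field in header]


def to_csv_text(header, records):
    """Render the padded rows as CSV text, e.g. for upload as a spreadsheet."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerows(rows_with_blank_first_column(header, records))
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical fields, for illustration only.
    header = ["flowcell", "lane", "index"]
    records = [{"flowcell": "FC1", "lane": 7, "index": "ACGT"}]
    print(to_csv_text(header, records), end="")
```

Every line of the resulting text begins with a comma, so the data starts in column B once the table is loaded into a spreadsheet.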