Partitioning and analysis methods for large, complex sequence datasets
LSA
LSFScripts
docs
misc
HashCounting.sh
KmerSVDClustering.sh
LICENSE
README.md
ReadPartitioning.sh
testData.tar.gz

Welcome to the Latent Strain Analysis (LSA) code repository!

LSA was developed as a pre-assembly tool for partitioning metagenomic reads. It uses a hyperplane hashing function and streaming SVD to find covariance relations between k-mers. The code, and in particular the process outlined in LSFScripts, has been optimized to scale to massive data sets in fixed memory in a highly distributed computing environment.
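
To give a rough feel for the hyperplane hashing step, here is a minimal standard-library sketch. It is not the code in this repository. The one-hot base encoding, the Gaussian random hyperplanes, and all function names (`encode`, `make_hyperplanes`, `hyperplane_hash`, `count_hashed_kmers`) are assumptions made for illustration only. Each k-mer becomes a vector, each hyperplane contributes one bit depending on which side the vector falls, and the bits together form the hash bin. The streaming SVD step is not shown.

```python
import random

BASES = "ACGT"


def encode(kmer):
    """One-hot encode a k-mer as a flat vector of length 4*k (assumed encoding)."""
    vec = []
    for base in kmer:
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec


def make_hyperplanes(k, n_bits, seed=0):
    """Draw n_bits random hyperplane normals, one per output bit (assumed Gaussian)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(4 * k)] for _ in range(n_bits)]


def hyperplane_hash(kmer, planes):
    """Hash a k-mer to an integer: one bit per hyperplane, set by the sign of the dot product."""
    vec = encode(kmer)
    h = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        h = (h << 1) | (1 if dot >= 0 else 0)
    return h


def count_hashed_kmers(reads, k, planes):
    """Count how many k-mers from the reads fall into each hash bin."""
    counts = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Skip k-mers containing ambiguous bases such as N.
            if any(b not in BASES for b in kmer):
                continue
            h = hyperplane_hash(kmer, planes)
            counts[h] = counts.get(h, 0) + 1
    return counts


if __name__ == "__main__":
    planes = make_hyperplanes(k=5, n_bits=8)
    print(count_hashed_kmers(["ACGTACGTTA", "ACGNACGTA"], 5, planes))
```

Computing one such count vector per sample would give a bin-by-sample matrix of the kind a covariance analysis could work on. Because the hash has a fixed number of bits, the number of bins stays bounded no matter how many distinct k-mers the data contains.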

Documentation

Documentation for LSA, including a "getting started" tutorial with accompanying test data, and step-by-step instructions for analyzing large collections, can be found at: http://latentstrainanalysis.readthedocs.org/