graphsample

Subsample a FASTQ file by sampling connected components of a de Bruijn graph

Subsampling reads directly from a FASTQ file can leave some regions poorly covered, so the subsample's informational properties are not representative of the full read set.

graphsample addresses this problem by building a de Bruijn graph from the reads, identifying all of its connected components, and randomly sampling those components. It outputs the reads that belong to the sampled components.
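To make the idea concrete, here is a minimal Python sketch of component-based sampling. It is not graphsample's actual implementation, and every name in it (`UnionFind`, `sample_reads_by_component`, `k`, `fraction`, `seed`) is hypothetical. It assumes reads are plain sequence strings, treats each k-mer as a node, joins consecutive k-mers within a read, ignores reverse complements, and skips reads shorter than `k`.

```python
import random
from collections import defaultdict


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


class UnionFind:
    """Disjoint-set structure used to track connected components of k-mers."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path straight at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb


def sample_reads_by_component(reads, k, fraction, seed=0):
    """Return (indices of kept reads, mapping of component root -> read indices).

    Assumption: consecutive k-mers in a read are adjacent in the graph, so
    all k-mers of one read share a component, and reads that share any
    k-mer end up in the same component.
    """
    uf = UnionFind()
    first_kmer = {}
    for idx, seq in enumerate(reads):
        prev = None
        for km in kmers(seq, k):
            if prev is None:
                first_kmer[idx] = km
                uf.find(km)  # register the node even if the read has one k-mer
            else:
                uf.union(prev, km)
            prev = km

    # Group reads by the component of their first k-mer.
    components = defaultdict(list)
    for idx, km in first_kmer.items():
        components[uf.find(km)].append(idx)

    roots = sorted(components)
    if not roots:
        return [], components

    # Sample whole components, never individual reads.
    rng = random.Random(seed)
    n_keep = min(len(roots), max(1, round(fraction * len(roots))))
    chosen = rng.sample(roots, n_keep)
    kept = sorted(i for root in chosen for i in components[root])
    return kept, components


reads = ["ACGTACGA", "TACGATTT", "GGGGCCCC", "GCCCCAAA", "TTTTTTTT"]
kept, components = sample_reads_by_component(reads, k=4, fraction=0.5)
print(sorted(components.values()))  # reads grouped by component
print(kept)                         # indices of the reads that were kept
```

Because whole components are kept or dropped together, every region represented in the subsample retains all of the reads that cover it, which is the property the description above is after.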

What's the point? Glad you asked! graphsample allows you to take a small subsample from a large set of reads, and use the subsample to optimise the parameters of any tools and algorithms you want to run on the full set.

Compiling

$ git clone --recursive https://github.com/Blahah/graphsample.git
$ cd graphsample
$ make

Running

$ bin/graphsample --help
