Generate a mask from variant-spanning k-mers (#289)
The two stages of the kevlar workflow that are currently the most memory-intensive, and therefore the most expensive, are novel k-mer finding and the likelihood calculations. Both stages currently require loading k-mer counts for all 3-4 samples into memory. We can drastically reduce memory usage during novel k-mer finding by 1) performing error correction on the reads and 2) reducing the CountMin sketch sizes. Although 2) reduces the accuracy of the k-mer abundance estimates, this can be compensated for during the subsequent filtering step. However, we still need accurate k-mer abundances for all samples when calculating variant likelihoods. This update adds a feature to kevlar alac that creates a mask of all variant-spanning k-mers. The mask lets the user trade additional runtime (recounting the k-mers) for a drastic reduction in the memory required for the final step of the pipeline.
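The idea of a variant-spanning k-mer mask can be illustrated with a minimal sketch. This is not kevlar's implementation: the function names (`canonical`, `variant_spanning_kmers`, `build_mask`, `recount`) are hypothetical, a plain Python `set` stands in for whatever data structure the mask actually uses, and including k-mers from both the reference and alternate haplotypes is an assumption made for illustration. The sketch collects every k-mer that overlaps a variant, then recounts only those k-mers in a set of reads, so exact counts are stored for a small number of k-mers rather than for every k-mer in every sample.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    revcomp = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, revcomp)


def variant_spanning_kmers(refr, pos, ref_allele, alt_allele, k):
    """Yield canonical k-mers overlapping the variant on both haplotypes.

    Hypothetical helper: `refr` is a reference window, `pos` is the 0-based
    offset of the variant in that window, and alleles are non-empty
    (VCF-style) strings.
    """
    if refr[pos:pos + len(ref_allele)] != ref_allele:
        raise ValueError("reference allele does not match the reference window")
    alt = refr[:pos] + alt_allele + refr[pos + len(ref_allele):]
    for seq, allele in ((refr, ref_allele), (alt, alt_allele)):
        first = max(0, pos - k + 1)                     # leftmost k-mer touching the allele
        last = min(len(seq) - k, pos + len(allele) - 1)  # rightmost k-mer touching the allele
        for i in range(first, last + 1):
            yield canonical(seq[i:i + k])


def build_mask(variants, k):
    """Collect variant-spanning k-mers for all variants into one set (the mask)."""
    mask = set()
    for refr, pos, ref_allele, alt_allele in variants:
        mask.update(variant_spanning_kmers(refr, pos, ref_allele, alt_allele, k))
    return mask


def recount(reads, mask, k):
    """Count exactly, but only for k-mers present in the mask."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = canonical(read[i:i + k])
            if kmer in mask:
                counts[kmer] += 1
    return counts
```

In this sketch the recounting pass reads each sample again, which is the extra runtime the commit message refers to, while memory scales with the size of the mask instead of with the full k-mer content of the samples.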
Showing 4 changed files with 46 additions and 1 deletion.