Skip empty FASTA sequences, and allow to force taxids #27

Open
wants to merge 117 commits into
base: master

fbreitwieser
Collaborator

This pull request includes two changes:

  • In previous versions, the program exited when it encountered an empty sequence. This is fixed. Additionally, the number of empty sequences and the number of sequences without a valid taxonomy mapping are counted and reported at the end.
  • Added the option -T to force set_lcas to set the taxid of the sequence as the taxid of its k-mers, instead of the LCA. This is very useful for resetting the taxids of contaminant sequences after the database build.
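The first change can be illustrated with a minimal Python sketch. It is not the C++ code in this pull request: the function name `scan_sequences`, the `Counters` class, and the `(seq_id, sequence)` input shape are all hypothetical, and the sketch assumes that sequences without a taxonomy mapping are skipped as well as counted.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    """Tallies reported once all sequences have been scanned."""
    processed: int = 0
    empty: int = 0
    unmapped: int = 0


def scan_sequences(records, seqid_to_taxid):
    """Skip empty or unmapped sequences instead of exiting, and count them.

    records: iterable of (seq_id, sequence) pairs (hypothetical input shape).
    seqid_to_taxid: dict mapping sequence IDs to taxids.
    """
    stats = Counters()
    kept = []
    for seq_id, seq in records:
        if not seq:
            # Earlier behaviour would have aborted here; now just count and move on.
            stats.empty += 1
            continue
        taxid = seqid_to_taxid.get(seq_id)
        if taxid is None:
            # Assumption: sequences without a taxid are skipped as well as counted.
            stats.unmapped += 1
            continue
        stats.processed += 1
        kept.append((seq_id, taxid, seq))
    return kept, stats
```

Calling `scan_sequences([("a", "ACGT"), ("b", "")], {"a": 562})` keeps only sequence `a` and leaves `stats.empty` at 1, which is the kind of summary the pull request says is reported at the end.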

fbreitwieser and others added 30 commits July 10, 2015 10:41
…output

When -T is set, each observed k-mer is assigned the taxid of the sequence, instead of the
lowest common ancestor of the sequence taxid and the currently set taxid. This is useful
for setting the taxid of contaminant sequences, which may also be observed in database
genomes, to the contaminant taxid.

-v gives more verbose output
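
The difference between the default LCA update and the forced update under -T can be sketched as follows. This is an illustrative Python model, not the set_lcas implementation: the names `lca` and `assign_kmers`, the dict-based k-mer table, the child-to-parent map, and the toy taxids are assumptions, as is falling back to the root taxid 1 when no common ancestor is found.

```python
def lca(parent, a, b):
    """Lowest common ancestor of taxids a and b; 0 means 'no taxid set yet'.

    parent: dict mapping each taxid to its parent (root maps to 0 or to itself).
    """
    if a == 0:
        return b
    if b == 0:
        return a
    ancestors = set()
    while a:
        ancestors.add(a)
        nxt = parent.get(a, 0)
        if nxt == a:  # root that is its own parent
            break
        a = nxt
    while b:
        if b in ancestors:
            return b
        nxt = parent.get(b, 0)
        if nxt == b:
            break
        b = nxt
    return 1  # assumption: fall back to the root taxid


def assign_kmers(db, kmers, seq_taxid, parent, force=False):
    """Update the k-mer -> taxid table for one sequence.

    force=False: store the LCA of the current value and the sequence taxid.
    force=True:  overwrite with the sequence taxid (the behaviour described for -T).
    """
    for kmer in kmers:
        if force:
            db[kmer] = seq_taxid
        else:
            db[kmer] = lca(parent, db.get(kmer, 0), seq_taxid)
```

With a toy taxonomy where taxid 99 is a contaminant whose only shared ancestor with a database genome is the root, the default update would move a shared k-mer up to the root, while `force=True` pins it to 99.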
… to run set_lcas on sequences that were not in the DB build originally
* 'master' of https://github.com/fbreitwieser/kraken-hll:
  Dump taxdb without ending separators (for kraken-report)
* 'master' of https://github.com/fbreitwieser/kraken-hll:
  Change name to KrakenHLL
  Add environment CPPFLAGS and LDFLAGS
  Fix parent map generation in taxDB
  Make sure TaxDB is not copied
  Many small changes to make it working for GCC 4.4 / C++0x

Conflicts:
	tests/build-dbs.sh
* 'master' of https://github.com/fbreitwieser/kraken-hll:
  Update
  Change name to KrakenHLL

Conflicts:
	install_kraken.sh
	tests/build-dbs.sh
	tests/test-on-simulated-reads.sh
* 'master' of https://github.com/fbreitwieser/kraken-hll:
  Compute k-mer counts if not present
  Create taxDB if not present
  Update
  Change name to KrakenHLL
* 'master' of https://github.com/fbreitwieser/kraken-hll:
  Fix gzstream building and licensing
  Fix gzstream compilation, update on HLL

Conflicts:
	src/hyperloglogplus.h

1 participant