Given a set
A minimal (
>original_header 20 6.13
TGGATAAAAAGGCTGACGAAAGGTCTAGCTAAAATTGTCAGGTGCTCTCAGATAAAGCAGTAAGCGAGTTGGTGTTCGCTGAGCGTCGACTAGGCAACGTTAAAGCTATTTTAGGC...
In this case 20 kmers are shared with the indexed kmers. This represents 6.13% of the kmers in the sequence.
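The two numbers in the header can be reproduced by hand on a toy example. The following is only a minimal sketch, not the tool's implementation: the function name `shared_kmer_stats` is hypothetical, and it assumes that every kmer position in the sequence is counted and that kmers are compared as plain strings (no reverse-complement handling).

```python
def shared_kmer_stats(sequence, indexed_kmers, k):
    """Return (number of shared kmers, percentage of shared kmers).

    Assumption: each position of the sequence contributes one kmer,
    and a kmer is "shared" if it is present in indexed_kmers.
    """
    total = len(sequence) - k + 1  # number of kmers in the sequence
    if total <= 0:
        return 0, 0.0
    shared = sum(1 for i in range(total) if sequence[i:i + k] in indexed_kmers)
    return shared, round(100.0 * shared / total, 2)


# Toy data: one indexed kmer, one short sequence containing it twice.
indexed = {"ACGT"}
count, percent = shared_kmer_stats("ACGTACGT", indexed, k=4)
print(count, percent)  # prints "2 40.0"
```

Here the sequence holds 5 kmers of length 4, 2 of which are indexed, hence 40%.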
back_to_sequences --in-kmers kmers.fasta --in-sequences reads.fasta --out-sequences filtered_reads.fasta --out-kmers counted_kmers.txt
The filtered_reads.fasta file contains the original sequences (here reads) from reads.fasta that contain at least one of the kmers from kmers.fasta. The header of each read is the same as in reads.fasta, plus the number of shared kmers and the estimated ratio of shared kmers.
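For downstream filtering, the appended values can be read back from each header. This is a hedged sketch that assumes the layout of the example header above (original header, then the number of shared kmers, then the percentage, separated by whitespace); `parse_filtered_header` is a hypothetical helper, not part of the tool.

```python
def parse_filtered_header(header_line):
    """Split a header from filtered_reads.fasta into its three parts.

    Assumption: the last two whitespace-separated fields are the number
    of shared kmers and the percentage of shared kmers.
    """
    fields = header_line.lstrip(">").rstrip().rsplit(None, 2)
    if len(fields) != 3:
        raise ValueError(f"unexpected header layout: {header_line!r}")
    original, count, ratio = fields
    return original, int(count), float(ratio)


print(parse_filtered_header(">original_header 20 6.13"))
# prints ('original_header', 20, 6.13)
```

Splitting from the right keeps any spaces inside the original header intact.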
As the --out-kmers option is used, the file counted_kmers.txt contains, for each kmer in kmers.fasta, the number of times it was found in filtered_reads.fasta.
Full documentation is available at https://b2s-doc.readthedocs.io/en/latest/