How to use -F option? #454

Open
bede opened this issue Dec 10, 2023 · 3 comments

Comments

@bede

bede commented Dec 10, 2023

I am struggling to understand the useful looking -F option, which allows one to pass a fasta file from which k-mers are extracted and used as reads. I suspect I have misunderstood the manual:

-F k:<int>,i:<int>

	Reads are substrings (k-mers) extracted from a FASTA file <s>. Specifically, for every reference sequence in FASTA file <s>, Bowtie 2 aligns the k-mers at offsets 1, 1+i, 1+2i, ... until reaching the end of the reference. Each k-mer is aligned as a separate read. Quality values are set to all Is (40 on Phred scale). Each k-mer (read) is given a name like <sequence>_<offset>, where <sequence> is the name of the FASTA sequence it was drawn from and <offset> is its 0-based offset of origin with respect to the sequence. Only single k-mers, i.e. unpaired reads, can be aligned in this way. 
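
To make the quoted description concrete, here is a minimal Python sketch of the k-mer enumeration it describes. This is an illustration only, not Bowtie 2's implementation: the function name `fasta_kmers` is hypothetical, and it assumes that only full-length k-mers are emitted, since the quoted text does not say how a trailing window shorter than k is handled.

```python
from typing import Iterator, Tuple


def fasta_kmers(name: str, seq: str, k: int, i: int) -> Iterator[Tuple[str, str]]:
    """Yield (read_name, kmer) pairs following the quoted manual text.

    Assumption: only full-length k-mers are yielded; the quoted text does
    not state what happens to a trailing window shorter than k.
    """
    if k <= 0 or i <= 0:
        raise ValueError("k and i must be positive integers")
    # 0-based start offsets 0, i, 2i, ... (1, 1+i, 1+2i, ... in 1-based terms).
    for offset in range(0, len(seq) - k + 1, i):
        # Read names use the 0-based offset, as in <sequence>_<offset>.
        yield f"{name}_{offset}", seq[offset:offset + k]


if __name__ == "__main__":
    # Toy 10-base sequence with k=4 and i=3.
    for read_name, kmer in fasta_kmers("toy", "ACGTACGTAC", k=4, i=3):
        print(read_name, kmer)
```

Under these assumptions the toy sequence yields three reads, starting at 0-based offsets 0, 3 and 6.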

I have unsuccessfully tried, for example, the following:

$ bowtie2 -x NC_029549.1 -f NC_029549.1.fa -F k:150,i:1
FASTA and FASTA sampling formats are mutually exclusive.
(ERR): bowtie2-align exited with value 1

Might someone be able to provide an example of how this feature should be used?

Thank you!

@BenLangmead
Owner

Thank you -- we are looking into this. I suspect the mention of <s> is a spurious hold-over from the Bowtie 1 manual, and that we should have said <r>, which is the variable we use in the Bowtie 2 manual to refer to the unpaired reads file specified with -U. We'll get a more definitive answer soon.

@ch4rr0
Collaborator

ch4rr0 commented Dec 10, 2023

Hello,

Your command line was fine, except that the k: and i: prefixes should be left out (that is, -F 150,1 rather than -F k:150,i:1). I have pushed a fix to the bug_fixes branch that should resolve the mutually exclusive error thrown when -f was specified with -F.

@BenLangmead -- we updated the -f option to behave like -q in that it is simply a flag that specifies the format of the input files to follow. That way a user can do something like this:

bowtie2 -x index -f -1 mate1.fa -2 mate2.fa
bowtie2 -x index -q -1 mate1.fq -2 mate2.fq
bowtie2 -x index -f --interleaved input.fa
bowtie2 -x index -q --interleaved input.fa

In the case of FASTA-continuous, this allows any one of the following to be parsed the same way:

N.B. unpaired reads, -U, are default in bowtie2

bowtie2 -x index -f -F 10,2 input.fa # fasta explicit, unpaired inferred
bowtie2 -x index -F 10,2 input.fa # fasta and unpaired are inferred
bowtie2 -x index -F 10,2 -U input.fa # fasta is inferred, unpaired explicit
bowtie2 -x index -f -F 10,2 -U input.fa # all explicitly specified

I hope this makes sense.

@bede
Author

bede commented Dec 10, 2023

Speedy! Thank you both 🙏

Thanks for fixing the mutual exclusivity issue and for correcting how I was using -F. The bug_fixes branch is now working as expected with bowtie2 -x NC_029549.1 -f NC_029549.1.fa -F 150,1
