Warning message when running the following command
$ seqkit seq -m 10000 <path_to_fastq.gz> | gzip > <output.fastq.gz>
[WARN] you may switch on flag -g/--remove-gaps to remove spaces
The usage page documents -g as removing gaps as opposed to removing spaces (which looks more likely to be right).
Assuming that it is indeed "removing gaps", does it mean that bases that are literally N will be dropped?
Second, what could be triggering this warning message? I'd hope my FASTQ doesn't have spaces in its sequence lines....
provide a reproducible example
Unfortunately, the data I work with is protected. Sorry about this.
Thank you!
Steve
Hi Steve, the warning is triggered by the -m or -M options, which filter sequences by length. It is shown just in case the sequences contain spaces, which might result in incorrect outputs.
By default, -g removes the characters "- \t.", which can be changed with:
-G, --gap-letters string gap letters (default "- \t.")
You're right, in most scenarios it's more like removing spaces than gaps. I used the term “gap” with multiple sequence alignment files in mind, where a gap is marked as “-”.
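To make the answer to the "N" question concrete, here is a minimal Python sketch of what removing the default gap letters amounts to. It is only an illustration of the behaviour described above, not seqkit's actual code; the helper name `strip_gap_letters` is hypothetical. Since "N" is not among the default gap letters "- \t.", it is left in place.

```python
# Default value of -G/--gap-letters as quoted above: dash, space, tab, dot.
DEFAULT_GAP_LETTERS = "- \t."


def strip_gap_letters(seq: str, gap_letters: str = DEFAULT_GAP_LETTERS) -> str:
    """Return seq with every character listed in gap_letters removed.

    Hypothetical helper for illustration only; it mimics the described
    effect of -g/--remove-gaps and is not taken from seqkit.
    """
    # Map each gap letter to None so str.translate deletes it.
    return seq.translate({ord(c): None for c in gap_letters})


raw = "ACGT NNN-AC.GT"
cleaned = strip_gap_letters(raw)
# The N bases survive; only the space, dash, and dot are removed,
# so the length used for -m/-M style filtering changes from 14 to 11.
print(cleaned, len(raw), len(cleaned))
```

This also shows why length filtering could give unexpected results on sequences that contain spaces: the raw and cleaned lengths differ.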
Thanks for the explanation!
It looks like you've labeled this a todo item, so I'll leave this open. (But please feel free to close it when appropriate).
-g, --remove-gaps remove gaps letters set by -G/--gap-letters, e.g., spaces, tabs, and
dashes (gaps "-" in aligned sequences)
-G, --gap-letters string gap letters to be removed with -g/--remove-gaps (default "- \t.")
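If you want to confirm that a FASTQ file really has no gap letters in its sequence lines before deciding whether to pass -g, a rough check along these lines may help. This is a sketch under assumptions: it expects plain four-line FASTQ records (no wrapped sequences), and the function name `count_reads_with_gap_letters` is made up for this example.

```python
import gzip

# Default gap letters from the help text above: dash, space, tab, dot.
GAP_LETTERS = frozenset("- \t.")


def count_reads_with_gap_letters(path, gap_letters=GAP_LETTERS):
    """Count FASTQ records whose sequence line contains any gap letter.

    Assumes strict four-line records: header, sequence, '+', quality.
    Only the sequence line (second line of each record) is inspected,
    so '-' or '.' characters in quality strings are ignored.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    hits = 0
    with opener(path, "rt") as handle:
        for index, line in enumerate(handle):
            if index % 4 != 1:
                continue  # not a sequence line
            sequence = line.rstrip("\r\n")
            if any(char in gap_letters for char in sequence):
                hits += 1
    return hits


# Example usage (the path is a placeholder):
# print(count_reads_with_gap_letters("reads.fastq.gz"))
```

A result of 0 suggests the warning can be safely ignored for that file.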