pcr.seqs removes start position, keeps end position #348
mothur-westcott added Bugs Command:Pcr.seqs labels Jun 28, 2017
mothur-westcott added this to the Version 1.40.0 milestone Jun 28, 2017
mothur-westcott added a commit (232c6f5) that referenced this issue Jul 21, 2017
mothur-westcott closed this Jul 21, 2017
rrohwer commented Jun 27, 2017
In the pcr.seqs command, if you specify the start and end positions, the resulting sequences exclude the start position but include the end position. This inconsistency causes a problem when following the directions for trimming a reference alignment (Pat's blog post: http://blog.mothur.org/2016/07/07/Customization-for-your-region/ ): the trimmed reference alignment does not include the first bp of your region, so your sequences lose their first bp when they get trimmed after aligning. This is confusing, so here's an example:
Trim the reference alignment to the primer region as outlined in the blog post.
Manually align the beginning of the resulting sequences after removing dashes (I truncated seqs for readability and chose an example silva sequence from the trimmed reference alignment).
The beginning base pair is missing after calling pcr.seqs(start= ,end=); however, the end of the trimmed sequence is not similarly offset.
So as you can see, pcr.seqs with the oligos= option behaves as expected, but with the start= and end= options it behaves inconsistently: start is not included and end is. You can narrow it down to this use of the command because the E. coli sequence was trimmed correctly, but the silva sequence is missing the first bp with or without the primer included.
A temporary workaround is to use keepprimer=T when creating the reference alignment so that the full sequence region is included in it.
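To make the off-by-one concrete, here is a minimal Python sketch of the two trimming behaviors described above. It is only a model of what I observed, not mothur's actual code: the function names, the toy alignment, and the start/end values are all hypothetical, and positions are assumed to be 1-based alignment columns.

```python
def trim_inclusive(aligned, start, end):
    """Keep alignment columns start..end (1-based, both ends included)."""
    return aligned[start - 1:end]


def trim_as_reported(aligned, start, end):
    """Model of the reported behavior: column `start` is dropped, column `end` is kept."""
    return aligned[start:end]


# Hypothetical toy alignment; the region of interest occupies columns 3..10.
aligned = "..ACGTACGT.."
start, end = 3, 10

print(trim_inclusive(aligned, start, end))    # ACGTACGT
print(trim_as_reported(aligned, start, end))  # CGTACGT (first base lost, last base kept)
```

Under these assumptions, the second function returns a region one column shorter at the start only, which matches the symptom: the end lines up, but the first bp is gone.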
Thanks as always!
Robin