DADA2 Not recovering known community members in mock community samples #1005
To clarify, the sequences linked here are the mock community sequences you are trying to recover? And is it just these sequences being denoised, or are they part of a longer sequenced region? Also, do "first" and "second" in your text correspond to the 1st and 2nd sequences in Pmb.F.priors? |
Yes, the sequences listed are those that I am trying to recover, and they
should be the only sequences present in the samples. These sequences are
the complete amplicon (after removing primers); they are not part of a
longer sequenced region. I realize the reasons for metabarcoding such a
tiny region are not obvious, but I promise they are good ones. Yes, "first"
and "second" correspond to the order in the Pmb.F.priors vector.
--
James Skelton
Community Ecologist
webpage:
poetsworm.com
email:
skelto3@g <skelto3@vt.edu>mail.com
|
That is... strange. When I use the dada2 alignment from within the R package, these sequences are all clearly distinguished from one another, so what is going on?
unname(outer(Pmb.F.priors, Pmb.F.priors, nwhamming, vec=TRUE))
What version of the dada2 R package are you using? Could you share an example fastq file with me? |
Using v1.14.0. I would be willing to share a fastq privately. How may I do
so?
|
You can email me: benjamin DOT j DOT callahan AT gmail DOT com |
Did we get this figured out over email? |
Yes. Changing gap_penalty to 20 resolved the issue. Thank you for checking
back.
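For anyone landing here later, one way to see what the gap penalty does is to align the first two priors under two gap costs with dada2's `nwalign`. This is only a sketch under assumptions: it requires the dada2 package, it uses -8 because that is the documented default for the `GAP_PENALTY` option, and it reads the "20" above as -20, since the thread does not give the sign or the exact call that was changed.

```r
library(dada2)  # assumes the dada2 package is installed

# First two mock community members from Pmb.F.priors
s1 <- "AGCTATTCTATTCCTAAATAATACATCCAACACTCCAACACTATTATTCCTAGCAACC"
s2 <- "AGCTATTCTATTCCTAAATAATACTCTCAACACTCCAACACTATTATTCCTAGCAACC"

# Align the same pair under two gap costs.
# -8 is the documented default GAP_PENALTY; -20 is an assumed reading of "20".
al_default <- nwalign(s1, s2, gap = -8)
al_stiff   <- nwalign(s1, s2, gap = -20)

# Count the gap characters ("-") that each alignment introduces
count_gaps <- function(al) sum(nchar(al) - nchar(gsub("-", "", al, fixed = TRUE)))

print(al_default)
print(al_stiff)
cat("gaps at -8: ", count_gaps(al_default), "\n")
cat("gaps at -20:", count_gaps(al_stiff), "\n")
```

If the two alignments differ in how many gaps they introduce, that would be consistent with the fix reported above, though the thread itself does not confirm the mechanism.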
|
Hello,
I constructed simple mock communities composed of short synthetic genes that vary only in a 6 bp region in the middle, combined in various known concentrations (example sequences from one such mock community are pasted below). In every attempt so far, at least one of the known mock community members is not recovered after denoising, even though the raw reads contain abundant perfect matches to it. I have supplied the known sequences as priors (forward sequences and reverse complements, prior to merging), tried both pooling and no pooling, and tried selfConsist = T and F, all to no avail. In the example mock community below, judging from the known concentrations of the different variants that went into the mock, the first and second sequences appear to be assigned to the same ASV, which is given the sequence of the second mock member, so I recover zero perfect matches for the first mock member. This is particularly puzzling because the first mock member makes up roughly a third of the raw reads in some samples.
Is there anything else I can try to get DADA2 to discriminate among these similar sequences?
Thank you.
Pmb.F.priors <- c("AGCTATTCTATTCCTAAATAATACATCCAACACTCCAACACTATTATTCCTAGCAACC",
"AGCTATTCTATTCCTAAATAATACTCTCAACACTCCAACACTATTATTCCTAGCAACC",
"AGCTATTCTATTCCTAAATAATAAGAGCAACACTCCAACACTATTATTCCTAGCAACC",
"AGCTATTCTATTCCTAAATAATAATGACAACACTCCAACACTATTATTCCTAGCAACC",
"AGCTATTCTATTCCTAAATAATATACACAACACTCCAACACTATTATTCCTAGCAACC")
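As a sanity check that does not depend on dada2, the differences among these priors can be counted position by position in base R. This is a minimal sketch: `hamming` is a helper name introduced here for illustration, and it assumes all sequences have the same length, so it counts substitutions only and says nothing about how an aligner that allows gaps would pair them up.

```r
Pmb.F.priors <- c("AGCTATTCTATTCCTAAATAATACATCCAACACTCCAACACTATTATTCCTAGCAACC",
                  "AGCTATTCTATTCCTAAATAATACTCTCAACACTCCAACACTATTATTCCTAGCAACC",
                  "AGCTATTCTATTCCTAAATAATAAGAGCAACACTCCAACACTATTATTCCTAGCAACC",
                  "AGCTATTCTATTCCTAAATAATAATGACAACACTCCAACACTATTATTCCTAGCAACC",
                  "AGCTATTCTATTCCTAAATAATATACACAACACTCCAACACTATTATTCCTAGCAACC")

# Hypothetical helper: number of mismatching positions between two
# equal-length sequences (no alignment, no gaps).
hamming <- function(a, b) {
  stopifnot(nchar(a) == nchar(b))
  sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# 5 x 5 matrix of pairwise mismatch counts; the diagonal is 0
d <- outer(Pmb.F.priors, Pmb.F.priors, Vectorize(hamming))
print(d)

# Positions (1-based) at which the first two priors differ
diff_pos <- which(strsplit(Pmb.F.priors[1], "")[[1]] !=
                  strsplit(Pmb.F.priors[2], "")[[1]])
print(diff_pos)
```

A nonzero off-diagonal entry confirms that two priors are distinct as raw strings, which is a useful baseline before looking at how the denoiser's alignment treats them.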