July 8 2019 release - does it REQUIRE Blast 2.9.0? #20

wolfgangrumpf · 2019-07-16T15:09:37Z

I'm considering upgrading BLCA but our cluster doesn't have BLAST 2.9 on it yet. Is 2.9 required, or will the July 8 2019 release work with BLAST 2.8?

yingeddi2008 · 2019-07-16T15:45:42Z

You are right. The July 8 2019 release should work with previous versions of BLAST. I made some minor changes so that it also works with the latest version, blastn 2.9. Let me know if you find any problems.
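For anyone who wants to confirm which BLAST version a cluster provides before upgrading, here is a minimal sketch. It assumes `blastn` is on the `PATH` and that `blastn -version` prints a first line of the form `blastn: 2.8.1+`. The helper names `parse_blast_version` and `installed_blast_version` are illustrative only and are not part of BLCA.

```python
import re
import shutil
import subprocess


def parse_blast_version(text):
    """Extract (major, minor, patch) from `blastn -version` style output."""
    match = re.search(r"blastn:\s*(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("could not find a blastn version string")
    return tuple(int(part) for part in match.groups())


def installed_blast_version():
    """Return the local blastn version, or None if blastn is not on PATH."""
    if shutil.which("blastn") is None:
        return None
    result = subprocess.run(
        ["blastn", "-version"], capture_output=True, text=True, check=True
    )
    return parse_blast_version(result.stdout)
```

Calling `installed_blast_version()` on a login node returns a tuple such as `(2, 8, 1)` that can be logged alongside BLCA results, which helps when comparing runs made on different BLAST versions.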

wolfgangrumpf · 2019-07-16T17:55:57Z

Okay, I’m working on the install now. Another question for you: I’m using an older release of BLCA with an older database. Some of its results don’t agree with a “simple” BLAST against the current NCBI database. For example, one sequence I am working with is from an ATCC E. coli strain, which NCBI recognizes if I use their online search, but BLCA says it’s E. fergusonii. Is this a database mismatch issue? That is, if I update to the newer BLCA and database, do you think the sequence will be recognized as E. coli?

Cheers,

Wolfgang

yingeddi2008 · 2019-07-18T16:42:34Z

Hi Wolfgang,

Please note that the default BLCA database is the 16S rRNA database, not the NT database that you are searching when you run BLASTN online. We have noticed some issues with the 16S rRNA database, such as some of the 16S rRNA fragments not coming from type strains. I believe that's why the annotation is off. Since we have no control over NCBI's 16S rRNA database, I can't say that updating the BLCA software will fix your misclassification issue. I do recommend that you use a manually curated database, such as Greengenes or SILVA, instead.

I hope this helps,

Eddi

dswan · 2019-07-19T12:28:06Z

There's also a plethora of sequences in the NCBI 16S database with ambiguous nucleotides. I'd actually thought of applying a filter to remove some of the more egregiously poor sequences. It's a shame, because the ITS targeted loci project at the NCBI is far better curated for quality and really focuses on type strains.

One of the things I've been meaning to dig into a little further is the provenance of these files:

ftp://ftp.ncbi.nlm.nih.gov/refseq/TargetedLoci/Bacteria/bacteria.16SrRNA.fna.gz

and

ftp://ftp.ncbi.nlm.nih.gov/refseq/TargetedLoci/Archaea/archaea.16SrRNA.fna.gz

as opposed to the pre-formatted BLAST database. Technically they should all come from the same project, I imagine, but I've noticed a few formatting issues with the BLAST database, probably down to sequence redundancy.

(updated) Having checked these files they're similar enough to satisfy me that they're the same source!
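One way to run this kind of check is to compare the sequence identifiers in two FASTA files. The sketch below is not necessarily how the check above was done. It assumes the first whitespace-delimited token of each header is a comparable identifier in both files, which may not hold if the header formats differ. The function names are illustrative. Sequences can be dumped from a BLAST database to FASTA with `blastdbcmd -db <name> -entry all`.

```python
def fasta_ids(lines):
    """Collect the first whitespace-delimited token of each FASTA header."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def compare_id_sets(first, second):
    """Summarise the overlap between two sets of sequence identifiers."""
    return {
        "shared": len(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }
```

A large `shared` count with short `only_first` and `only_second` lists suggests the two files come from the same source. Comparing identifiers alone does not show that the sequences themselves are identical.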

qunfengdong · 2019-07-19T14:42:20Z

If you can remove those poor sequences from the NCBI 16S database, I do believe it would be better. Any other ITS loci sequences should also work, as long as you can compile the corresponding taxonomic annotation.

dswan · 2019-07-19T14:57:32Z

> If you can remove those poor sequences in NCBI 16S database, I do believe that it'd be better. Any other ITS loci sequences should also work as long as you can compile the corresponding taxonomic annotation.

I did wonder how BLAST handled these ambiguities, but I assume they would be penalised.

wolfgangrumpf · 2019-07-19T14:58:36Z

I saw that there are instructions for generating the SILVA LSU database for BLCA, but not for the SSU. I don’t suppose anyone has done this already? Or will Greengenes provide sufficient resolution?

Cheers,

Wolfgang

qunfengdong · 2019-07-19T15:17:42Z

Yes, BLAST should penalize those.

qunfengdong · 2019-07-19T15:32:48Z

No, we have tried neither SILVA LSU nor SSU (the LSU instructions were kindly provided by Dr. Daniel Swan), and we have not done any systematic comparison with Greengenes either. We are just making those options available for the community to use. Sometimes we do apply multiple databases to our own projects.
