Good's coverage estimate #255

jansuategui · 2012-10-15T22:45:52Z

Good's Coverage estimate using Qiime

gregcaporaso · 2012-11-28T02:02:00Z

Would this generally be useful functionality? Respond with +1 if you would use this (and ideally describe a specific use case).

antgonza · 2012-11-28T12:54:04Z

+1 This is a metric that a few users have asked for (examples below).
The calculation is pretty simple; the algorithm is described in the first
link.

https://groups.google.com/forum/?fromgroups#!topic/qiime-forum/0S_WyC5q79s
https://groups.google.com/forum/#!msg/qiime-forum/husZ_TVFGOM/nmOspuCZ4DEJ
https://groups.google.com/forum/?fromgroups=#!topic/qiime-forum/4E_jr-34G7k
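
For readers who have not met the metric, here is a minimal sketch of the standard definition of Good's coverage, C = 1 - F1/N, where F1 is the number of OTUs seen exactly once and N is the total number of observations. The function name and the input format (a plain list of per-OTU counts for one sample) are assumptions made for illustration; this is not QIIME's implementation.

```python
def goods_coverage(counts):
    """Return Good's coverage, 1 - F1/N, for one sample.

    counts: per-OTU observation counts (hypothetical input format).
    F1 is the number of singleton OTUs, N the total observation count.
    """
    n = sum(counts)
    if n == 0:
        raise ValueError("cannot estimate coverage for an empty sample")
    singletons = sum(1 for c in counts if c == 1)
    # float() keeps the division exact under both Python 2 and Python 3
    return 1.0 - singletons / float(n)


if __name__ == "__main__":
    # 2 singletons among 20 observations
    print(goods_coverage([10, 5, 1, 1, 3]))
```

A sample with no singletons gets a coverage of 1.0, and a sample made up only of singletons gets 0.0.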

gregcaporaso · 2012-11-28T14:34:38Z

OK, any takers on this one? @wdwvt1 or @justin212k seem like likely candidates.

rob-knight · 2012-11-28T21:25:46Z

+1 This should integrate with Manuel's diversity estimation stuff (not sure where that ended up). It is a very common use case: justifying sampling effort to reviewers.

Rob


justin212k · 2012-12-06T01:01:49Z

Hmm, we could add another file or two to qiime/pycogent_backports. But that adds a decent amount of complexity to the qiime codebase. What do you all think of adding good's coverage to alpha_diversity.py directly?

antgonza · 2012-12-06T12:41:33Z

I think that should work ...

rob-knight · 2012-12-06T17:17:26Z

Agree. Probably want to merge in whatever module has Jens's implementation of Manuel's coverage estimators.


justin212k · 2012-12-06T17:27:33Z

Sounds good. @jens_the_kraut, you know which module that is?


jensreeder · 2012-12-06T17:52:57Z

That would be conditional_uncovered_probability.py.
It might be useful to rename this script to coverage.py, as that seems to be the term people look for.
That requires taking 1 - x of the probabilities from Manuel's estimators.

In the long term, I suggest putting Good's into pycogent, as that is where e.g. the Robbins estimator lives as well.
Maybe for the upcoming release, stashing it in qiime is the best solution.

Practically, we could also combine the metrics in this module with alpha_diversity.py.
I am already using AlphaDiversityCalcs(), so merging would be a no-brainer.
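
The "1 - x" step could be sketched roughly as below: a wrapper that turns any estimator of the uncovered probability into a coverage estimator. All names here are hypothetical and are not taken from conditional_uncovered_probability.py; the toy estimator (singletons divided by total observations) is used only because 1 minus that quantity is Good's coverage.

```python
def as_coverage(uncovered_estimator):
    """Wrap an uncovered-probability estimator so it reports coverage."""
    def coverage(counts):
        # coverage is the complement of the uncovered probability
        return 1.0 - uncovered_estimator(counts)
    return coverage


def singleton_fraction(counts):
    """Toy uncovered-probability estimate: singletons / total observations."""
    return sum(1 for c in counts if c == 1) / float(sum(counts))


# Hypothetical usage: the complement of the singleton fraction
goods = as_coverage(singleton_fraction)

if __name__ == "__main__":
    print(goods([10, 5, 1, 1, 3]))
```

Wrapping the estimators rather than rewriting them would keep the original uncovered-probability functions available to any existing callers.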

rob-knight · 2012-12-06T17:55:51Z

All this stuff is related to alpha_diversity but having a separate coverage module that encompasses all this stuff (and is imported from alpha_diversity) might make sense. pycogent is a more logical home than qiime for all the general-purpose stuff like metrics I agree.


gregcaporaso · 2012-12-06T17:56:53Z

We currently don't have anyone assigned to this issue. @justin212k-the-moustache-enthusiast, do you want to take this one?

justin212k · 2012-12-06T18:11:11Z

sign me up. (I don't think I can do that myself).

gregcaporaso · 2012-12-07T20:14:54Z

sign me up. (I don't think I can do that myself).

Done - you should be able to sign yourself up.

rob-knight · 2012-12-09T22:37:25Z

Anyone know the answer to this? It looked like from earlier emails that this might be in progress?

Rob

Begin forwarded message:

From: "marzia@berkeley.edu" <marzia@berkeley.edu>
Subject: QIIME and Roche v2.8 software
Date: December 8, 2012 12:28:08 PM MST
To: Rob Knight <rob.knight@colorado.edu>

Dear Rob,

I am the postdoc working with Steven Lindow at UC Berkeley on the
Sloan-funded indoor air microbial ecology project (BIMERC). We talked in
Boulder in October; I attended the workshop you and Mitch Sogin organized
for QIIME and VAMPS (I enjoyed it very much! And thanks for posting the
videos of the presentations online, really useful).

I have a question for you about QIIME. I recently sent some samples for
sequencing and I have been told that Roche made available a new software
(Roche v2.8 software with flow Pattern B) that apparently increases
quality and quantity from amplicon sequencing runs. I heard also that it
does not currently work with the QIIME denoising tool, but I also heard
that you guys are working with Roche to fix this problem. I was asked how
I want my samples processed, with the original software or with the new
software.

I would go for the new Roche software that apparently improves the
quality/quantity of data. But I also do want to use QIIME for my analyses.
So I was wondering if you could kindly give me an update on your work with
Roche, and a suggestion on how to proceed.

Thank you and I wish you a nice weekend!

Marzia

jensreeder · 2012-12-09T23:04:46Z

454 runs with the randomized flow pattern B cannot be denoised with Qiime
at this point.
I briefly looked into the code and figured that it will take me some time
to fix it.
My previous suggestion in the other thread was to ask the sequencing
center to keep the regular flow order.

Up to now, I haven't seen any official documentation of this new feature,
so I am hesitant to jump at it without more information. I will bug the
sequencing folks here at work and see if they know anything about it.

In any case, I think we have to caution people against blindly denoising FLX+ data
using the Titanium or FLX error profiles.
As I have no idea how much the profiles differ for these extremely long
reads, I can't really say anything about the effectiveness. Maybe someone
should run a mock community on FLX+ and have a look at the denoising outcome.

Jens


rob-knight · 2012-12-09T23:06:23Z

Thanks, Jens. Is it just denoising that fails, i.e. they can do the rest of the analysis? Can they use e.g. Acacia or ampliconnoise for denoising?

Rob


jensreeder · 2012-12-10T00:04:50Z

It's just denoising that fails; the rest of qiime will be fine.
Not sure how ampliconnoise or Acacia behave, but a simple grep over
ampliconnoise's code base showed several hardcoded occurrences of the regular
flow order TACG, so I assume that it might have some issues as well.

Jens


rob-knight · 2012-12-10T01:10:56Z

OK thanks. Can anyone confirm whether acacia is wrapped in qiime yet as an alternative denoising procedure?


gregcaporaso · 2012-12-10T03:53:07Z

I do not think that it is, and a search for acacia in the full code base doesn't return any hits.

justin212k · 2012-12-11T01:39:40Z

Hey folks, note that all these emails are being posted on github under Issue #255.

justin212k · 2012-12-11T01:47:18Z

And returning to Issue #255, what does everyone think of merging the coverage stuff with alpha_diversity.py? Good's coverage, for example, doesn't estimate the diversity of the community, but rather the extent to which it has been adequately sampled. Still, rarefaction curves with e.g. Good's seem informative, and it'd be nice to have all the workflow scripts that interact with alpha_diversity.py work with e.g. Good's coverage. I'd like to 1-x the metrics in conditional_uncovered_probability.py, delete that file, and put the new coverage estimators in alpha_diversity.py. Then add a little documentation noting how we've blurred the boundaries of what alpha_diversity.py does.

I'm tepid myself, anyone dislike this idea?
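
To illustrate what a rarefaction curve with Good's coverage might look like, here is a rough, self-contained sketch that subsamples one sample's counts at several depths and averages the coverage at each depth. The function names, the list-of-counts input, and the subsampling scheme (sampling observations without replacement) are assumptions for illustration only; this is not how alpha_rarefaction.py or alpha_diversity.py are implemented.

```python
import random


def goods_coverage(counts):
    """Good's coverage, 1 - F1/N, for a list of per-OTU counts."""
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / float(sum(counts))


def subsample_counts(counts, depth, rng):
    """Draw `depth` observations without replacement; return per-OTU counts."""
    # Expand counts into one entry per observation, labelled by OTU index
    pool = [otu for otu, c in enumerate(counts) for _ in range(c)]
    sub = [0] * len(counts)
    for otu in rng.sample(pool, depth):
        sub[otu] += 1
    return sub


def coverage_rarefaction(counts, depths, reps=10, seed=0):
    """Return [(depth, mean Good's coverage over `reps` subsamples), ...]."""
    rng = random.Random(seed)  # seeded so the curve is reproducible
    curve = []
    for depth in depths:
        values = [goods_coverage(subsample_counts(counts, depth, rng))
                  for _ in range(reps)]
        curve.append((depth, sum(values) / len(values)))
    return curve


if __name__ == "__main__":
    sample = [50, 30, 10, 5, 2, 1, 1, 1]  # made-up counts, 100 observations
    for depth, cov in coverage_rarefaction(sample, [10, 50, 100]):
        print(depth, round(cov, 3))
```

At the full depth every subsample equals the original sample, so the last point of the curve is simply the coverage of the whole sample.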

gregcaporaso · 2012-12-11T02:29:10Z

I don't think this is a bad idea, but note that we could also modify the alpha_rarefaction.py workflow to use either script if you think that will be clearer/easier to document.

justin212k · 2012-12-13T02:39:14Z

Sorry folks, I don't think I can get this in by Dec 13th at 7am. I've made a few changes, but nothing near ready for a pull request.

gregcaporaso · 2012-12-13T20:36:10Z

Would an extra day help?

wdwvt1 · 2012-12-13T20:45:45Z

I may be able to help - I have finished the gini index stuff. How far
along are you?

ghost assigned justin212k Dec 7, 2012

wdwvt1 mentioned this issue Dec 14, 2012

Alpha diversity changes #523

Merged

gregcaporaso closed this as completed Dec 18, 2012

jansuategui unassigned justin212k Mar 17, 2014
