Clarify N50, L50, metrics in anvi-display-contigs-stats #849

Closed
brymerr921 opened this issue Jun 5, 2018 · 6 comments

@brymerr921
Contributor

Hi everyone,

I have a quick comment about anvi-display-contigs-stats. I noticed that the N50 and L50 values returned by anvi-display-contigs-stats were different from other programs I have used (e.g. QUAST) and realized that they are just switched. From googling around it appears that they are (unfortunately) used interchangeably. I think it might be helpful to other users to clarify:

anvi'o reports N50 (and other N values) as a number of contigs.
anvi'o reports L50 (and other L values) as a contig length.
(This makes sense to me, N=number, L=length)

For example, based on anvi'o reports, if the N50 of my contigs database is 22000 and my L50 is 5530 bp, it means that the 22000 longest contigs, each at least 5530 bp long, together contain at least 50% of my assembled data.

For this same set of contigs, QUAST would report an N50 of 5530bp and an L50 of 22000.
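
To make the two definitions concrete, here is a minimal Python sketch of the QUAST-style convention described above (N50 as a length, L50 as a count). The function name `nx_lx` and the toy contig lengths are made up for illustration; this is not anvi'o or QUAST code, and it assumes the common "reach at least x% of the total" threshold.

```python
def nx_lx(lengths, x=50):
    """Return (Nx, Lx) under the QUAST-style convention.

    Nx is a contig length; Lx is a number of contigs.
    Hypothetical helper for illustration, not part of anvi'o or QUAST.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    if not 0 < x <= 100:
        raise ValueError("x must be in (0, 100]")

    # Amount of sequence that must be covered to reach x% of the assembly.
    threshold = sum(lengths) * x / 100

    running = 0
    # Walk the contigs from longest to shortest, accumulating their lengths.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= threshold:
            # length -> Nx (a contig length), count -> Lx (a number of contigs)
            return length, count


# Toy data: five contigs, 300 bp in total, so the 50% threshold is 150 bp.
contig_lengths = [100, 80, 60, 40, 20]
n50, l50 = nx_lx(contig_lengths)
print(f"N50 = {n50} bp, L50 = {l50} contigs")  # N50 = 80 bp, L50 = 2 contigs
```

With these toy lengths, the two longest contigs (100 + 80 = 180 bp) are the first to cover half of the 300 bp total, so the length-valued metric is 80 bp and the count-valued metric is 2. Whichever label a tool puts on each number, one of the pair is always a length and the other a count.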

@ozcan
Contributor

ozcan commented Jun 6, 2018

Hi @brymerr921,

Thanks for bringing up this issue. We had a discussion about it yesterday and slightly changed how we report N/L values on the plot.

Now it looks like this:
[Screenshot of the updated plot, taken 2018-06-06 at 9:40 am]

Is it clearer and/or more accurate like this? Do you have any suggestions?

Best,

@brymerr921
Contributor Author

@ozcan, this looks great! I didn't mention this before, but the original graphical version is actually what helped me make sense of things. This is still a great improvement though. Adding clarification to the command-line help for anvi-display-contigs-stats, as well as to the text-only output (which is what led me to discover the ambiguity), would also be great. Thanks!

@watsonar
Contributor

It appears that while N50 and L50 are sometimes used interchangeably, according to the official definition we have the terms swapped in anvi'o. To fix this we would just need to swap the labels N50 and L50 in anvi-display-contigs-stats; everything else would remain unchanged.

So currently it looks like this:

anvi'o reports N50 (and other N values) as a number of contigs.
anvi'o reports L50 (and other L values) as a contig length.

But it should look like this:

anvi'o reports L50 (and other L values) as a number of contigs.
anvi'o reports N50 (and other N values) as a contig length.

I am really sorry about this, because I'm almost certain it was me who originally explained these concepts backwards while anvi-display-contigs-stats was being developed. :(

@meren
Member

meren commented Mar 25, 2019

Ah. Someone else recently complained about this and I couldn't fix it. Can you fix it in the repository?

@ozcan
Contributor

ozcan commented Mar 25, 2019

@watsonar, here are some code pointers if you are still interested in this.

For the interactive plot:
https://github.com/merenlab/anvio/blob/master/anvio/data/interactive/js/contigs-plot.js#L49-L62
https://github.com/merenlab/anvio/blob/master/anvio/data/interactive/js/contigs-plot.js#L82-L98

For the text report:
https://github.com/merenlab/anvio/blob/master/anvio/interactive.py#L2180

@watsonar
Contributor

Thank you @ozcan! I believe I have fixed the problem now.

@meren closed this as completed Apr 17, 2019