how to interpret yak inspect to evaluate the completeness of an assembly #6

Open
Ashleyfarlow opened this issue Nov 11, 2020 · 1 comment

@Ashleyfarlow
Thank you for this tool. After running:

yak count -b37 -t4 -o sr.yak myreads.fq.gz
yak count -b37 -t4 -o scaffold.yak scaffold.fa
yak inspect sr.yak scaffold.yak > sr-scaffold.kqv.txt

I obtain these results in sr-scaffold.kqv.txt:

...
SN      5       29997294        2943642 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
SN      4       54814313        5427049 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
SN      3       119820206       12359127        0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
SN      2       577560021       57889442        0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000
QV      20      1686753367      1686676513      4.915   -1.000
QV      19      1806056255      1805978021      4.893   -1.000
QV      18      1918241646      1918162190      4.874   -1.000
QV      17      2021819785      2021739232      4.857   -1.000
QV      16      2115831944      2115750327      4.843   -1.000
QV      15      2199727376      2199644874      4.831   -1.000
QV      14      2273569360      2273486096      4.821   -1.000
QV      13      2337885508      2337801575      4.813   -1.000
QV      12      2393381989      2393297514      4.806   -1.000
QV      11      2441048817      2440963948      4.799   -1.000
QV      10      2481791146      2481705971      4.794   -1.000
QV      9       2516588006      2516502574      4.789   -1.000
QV      8       2546760834      2546675206      4.785   -1.000
QV      7       2573349003      2573263248      4.781   -1.000
QV      6       2600045275      2599959418      4.778   -1.000
QV      5       2630042569      2629956665      4.773   -1.000
QV      4       2684856882      2684770932      4.764   -1.000

How should I interpret these results? This is a human assembly from PacBio reads, and the short reads are 80x Illumina.

@ASLeonard
I am also wondering how to interpret these results, as mine are similar. From the inspect.c code, the QV lines appear to be written by printf("QV\t%d\t%ld\t%ld\t%.3f\t%.3f\n", i, (long)qs.tot, (long)acc[i * YAK_N_COUNTS], qs.qv_raw, qs.qv); so the columns after the QV tag would be i, qs.tot, acc[i * YAK_N_COUNTS], qs.qv_raw and qs.qv.

The -1.000 in the last column (qs.qv) may mean something other than a real quality value, but otherwise some problem must be producing a negative QV. I'm also not sure how to determine assembly completeness from these values.
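
In case it helps anyone compare outputs, here is a minimal Python sketch that splits the QV rows into named fields. The field names are simply borrowed from the variables in the printf call quoted above; what each value actually means is an assumption on my part and is not confirmed by the yak documentation. The `QVRow` type and `parse_qv_lines` function are hypothetical helpers, not part of yak.

```python
from typing import List, NamedTuple


class QVRow(NamedTuple):
    """One 'QV' row; names mirror the printf arguments (meaning unconfirmed)."""
    i: int          # first integer column (loop index i in inspect.c)
    tot: int        # qs.tot
    acc: int        # acc[i * YAK_N_COUNTS]
    qv_raw: float   # qs.qv_raw
    qv: float       # qs.qv (-1.000 on every row in the output above)


def parse_qv_lines(text: str) -> List[QVRow]:
    """Collect every whitespace-separated row that starts with the 'QV' tag."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A QV row has the tag plus five values; skip SN rows and anything else.
        if len(fields) != 6 or fields[0] != "QV":
            continue
        rows.append(QVRow(int(fields[1]), int(fields[2]), int(fields[3]),
                          float(fields[4]), float(fields[5])))
    return rows


if __name__ == "__main__":
    # Two rows copied from the output posted above.
    sample = (
        "QV      20      1686753367      1686676513      4.915   -1.000\n"
        "QV      4       2684856882      2684770932      4.764   -1.000\n"
    )
    for row in parse_qv_lines(sample):
        print(row.i, row.tot, row.acc, row.qv_raw, row.qv)
```

To run it on a real file, read sr-scaffold.kqv.txt into a string and pass it to `parse_qv_lines`. This only makes the columns easier to inspect; it does not answer what they say about completeness.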
