This is a placeholder for a discussion on what base to put in the FASTA files. Currently:
SNP fail - N
NON-SNP fail - N
While this works, it seems too simplistic. For example, at a coverage of 100 with a 90-10 split between REF and ALT, it doesn't make sense (to me) to put an N. We certainly can't call a SNP here, but it's not an N either. Where there are 50/50 splits, I also think we shouldn't be putting an N; it should be a proper mixed base coded with the extended IUPAC code. The same goes for positions with 0 REF and 2 _ALT_s. A lot of information is lost by collapsing these into N.
In fact, should N be used only when all 4 bases have been observed? Otherwise, the position should be excluded from the analysis (the whole column will be, in this case).
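To make the proposal concrete, here is a rough sketch (not the pipeline's current code) of how per-position base counts could be mapped to IUPAC ambiguity codes. The function name `iupac_call` and the `min_fraction` cutoff of 0.05 are assumptions for illustration only; the right threshold is part of what needs discussing.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases observed.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def iupac_call(counts, min_fraction=0.05):
    """Return an IUPAC code for one position, or None if there is no usable data.

    counts: dict mapping "A"/"C"/"G"/"T" to read counts at this position.
    min_fraction: hypothetical cutoff; bases below this share of the
        coverage are treated as noise and ignored.
    """
    total = sum(counts.values())
    if total == 0:
        return None  # no coverage: caller may exclude the column
    observed = frozenset(
        base for base, n in counts.items() if n / total >= min_fraction
    )
    # None if nothing passes the cutoff or a key is not one of A/C/G/T.
    return IUPAC.get(observed)


print(iupac_call({"A": 90, "G": 10}))          # 90-10 split
print(iupac_call({"A": 50, "G": 50}))          # 50/50 split
print(iupac_call({"A": 0, "C": 60, "T": 40}))  # 0 REF (A), two ALTs
```

With this mapping, N only comes out when all four bases pass the cutoff, and a position with no coverage returns `None` so the caller can decide whether to drop the column.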