Hi Filipe,
I'm excited to use your program once more (it's been almost a year)! I work with a non-model organism and have access to two possible references to work with (both from the same individual):
A scaffold reference fasta which contains N characters to position between contigs (though with undetermined gap lengths)
A contig reference fasta which contains ungapped ATCG characters only.
I don't think anything will change with respect to some calculations like genotype probabilities, estimations of heterozygosity, Fst, etc. However, I think this will likely significantly impact LD estimates.
I was wondering if it is assumed that the input used for ngsLD uses reads that were aligned to a reference that contains only contigs (no N's), or, if it doesn't matter if the alignment itself is discontiguous.
My worry is that the resulting .bam files produced by aligning against a scaffold with N's give a false picture of the distance between sites when the sliding window spans those N's.
Thanks very much
ngsLD does not assume anything about the input data, except that it only calculates LD between sites on the same contig/scaffold. So, if you use contigs, some SNP pairs won't be evaluated (linked SNPs that fall on different contigs are not taken into account).
On the other hand, if you use scaffolds, you get more SNP pairs, but the distance between two SNPs that sit on different contigs of the same scaffold might not be accurate (since it seems you do not fully trust the gap lengths between contigs).
I guess in the end it depends on what you want and how much you trust the distance between contigs on your scaffolds. :)
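If you go with scaffolds, one way to keep an eye on the affected pairs is to flag SNP pairs whose interval contains a run of N's, so their distances can be treated with caution or excluded downstream. The snippet below is only a rough sketch in plain Python and is not part of ngsLD; the helper names `gap_intervals` and `spans_gap`, the toy scaffold sequence and the SNP positions are all made up for illustration, and positions are assumed to be 0-based.

```python
import re
from itertools import combinations


def gap_intervals(seq):
    """Return 0-based, half-open (start, end) intervals for every run of N's."""
    return [m.span() for m in re.finditer(r"[Nn]+", seq)]


def spans_gap(pos_a, pos_b, gaps):
    """True if at least one N run lies entirely between the two positions."""
    lo, hi = sorted((pos_a, pos_b))
    return any(lo < start and end <= hi for start, end in gaps)


# Hypothetical toy scaffold: two 4 bp contigs joined by a 10 bp N gap.
scaffold = "ACGT" + "N" * 10 + "ACGT"
gaps = gap_intervals(scaffold)

# Hypothetical SNP positions (0-based) on that scaffold.
snps = [1, 3, 15]

# Map each SNP pair to whether its distance crosses an undetermined gap.
flagged = {
    (a, b): spans_gap(a, b, gaps)
    for a, b in combinations(snps, 2)
}

for pair, crosses in flagged.items():
    print(pair, "spans a gap" if crosses else "same contig")
```

Pairs flagged as spanning a gap are the ones whose distance depends on the undetermined gap length; pairs on the same contig are unaffected by the choice of reference.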