question three steps before using ngsLD #2

Closed
devonorourke opened this issue Jul 16, 2018 · 1 comment

Comments

@devonorourke

Hi Filipe,
I'm excited to use your program once more (it's been almost a year)! I work with a non-model organism and have access to two possible references to work with (both from the same individual):

  1. A scaffold reference fasta in which runs of N characters separate the contigs (the gap lengths are undetermined)
  2. A contig reference fasta which contains ungapped ATCG characters only.

I don't think anything will change with respect to some calculations like genotype probabilities, estimations of heterozygosity, Fst, etc. However, I think this will likely significantly impact LD estimates.

I was wondering whether ngsLD assumes that its input comes from reads aligned to a reference containing only contigs (no N's), or whether it doesn't matter if the alignment itself is discontiguous.

My worry is that the .bam files produced by aligning against a scaffold with N's give a false picture of the distance between sites when the sliding window spans those N's.

Thanks very much

@fgvieira
Owner

Hi Devon,

glad to hear you find ngsLD useful!

ngsLD does not assume anything about the input data, apart from the fact that it only calculates LD between sites on the same contig/scaffold. So, if you use contigs, some SNP pairs won't be evaluated (since linked SNPs that fall on different contigs won't be taken into account).
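To make the effect on pair counts concrete, here is a minimal sketch that enumerates same-sequence pairs for the same four SNPs, placed once on two contigs and once on a scaffold joining them. It is not ngsLD code; the function `same_sequence_pairs`, the sequence names, and all coordinates are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def same_sequence_pairs(sites):
    """Return every SNP pair whose two sites lie on the same sequence.

    `sites` is a list of (sequence_name, position) tuples (hypothetical data).
    """
    by_seq = defaultdict(list)
    for name, pos in sites:
        by_seq[name].append(pos)

    pairs = []
    for name, positions in by_seq.items():
        # Pairs are only formed within one sequence, never across sequences.
        for a, b in combinations(sorted(positions), 2):
            pairs.append((name, a, b))
    return pairs


# Hypothetical layout: contig1 is 1000 bp, followed by 100 Ns, then contig2.
contig_sites = [("contig1", 100), ("contig1", 900),
                ("contig2", 50), ("contig2", 400)]
scaffold_sites = [("scaffold1", 100), ("scaffold1", 900),
                  ("scaffold1", 1150), ("scaffold1", 1500)]

print(len(same_sequence_pairs(contig_sites)))    # 2
print(len(same_sequence_pairs(scaffold_sites)))  # 6
```

With contigs, only the two within-contig pairs remain; with the scaffold, the four cross-contig pairs are added, and those are the ones whose distance depends on the assumed gap length.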

On the other hand, if you use scaffolds, you get more SNP pairs but the distance between them might not be the most accurate (since it seems you do not trust your distance between contigs that well).
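One possible way to handle this, sketched below, is to locate the runs of N in each scaffold and flag SNP pairs whose interval contains a whole run, so that they can be treated separately or dropped downstream. This is an assumption about a workflow around ngsLD, not something ngsLD does itself; `n_runs`, `spans_gap`, and the toy sequence are hypothetical.

```python
import re


def n_runs(sequence):
    """Return (start, end) for each run of N/n, 1-based and inclusive."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"[Nn]+", sequence)]


def spans_gap(pos_a, pos_b, gaps):
    """True if a whole N run lies strictly between the two 1-based positions."""
    lo, hi = sorted((pos_a, pos_b))
    return any(lo < start and end < hi for start, end in gaps)


# Toy scaffold: 12 bp of sequence, a 10 bp gap of Ns, then 12 bp of sequence.
scaffold = "ACGT" * 3 + "N" * 10 + "ACGT" * 3
gaps = n_runs(scaffold)

print(gaps)                     # [(13, 22)]
print(spans_gap(5, 30, gaps))   # True: the pair straddles the gap
print(spans_gap(2, 10, gaps))   # False: both sites are on the first contig
```

Pairs flagged this way are the ones whose reported distance inherits the uncertainty of the gap length; pairs that are not flagged have the same distance they would have on the contig reference.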

I guess in the end it depends what you want and how much you trust the distance between contigs on your scaffolds. :)

cheers,
