Ambiguous nucleotides as input for Hyphy positive selection #1540

Closed
psorigue opened this issue Nov 8, 2022 · 2 comments

psorigue commented Nov 8, 2022

Hi!

I am using some of HyPhy's methods for detecting positive selection. My alignment covers 243 species with 2 individuals per species, so I have two different haplotypes per species. To reduce complexity and run time, I considered pooling the two haplotypes, but this would generate some ambiguous (IUPAC-coded) nucleotides.

My question is the following: when a codon contains an IUPAC symbol that represents two possible nucleotides, does HyPhy consider both nucleotide options in the analysis, or does it treat the nucleotide as "missing data"? In the latter case, I would rather choose only one of the two haplotypes for the analysis.

Thanks in advance for your help.

Best,
Pol Sorigué

Member

spond commented Nov 8, 2022

Dear @psorigue,

HyPhy will treat nucleotide ambiguities as partially missing data (e.g. a Y maps to [0,1,0,1] probabilities for the corresponding leaf). Partially resolved codons work the same way, e.g. CRA ~ CAA or CGA.

Hope this helps,
Sergei
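
To make the reply above concrete, here is a minimal sketch of how an IUPAC symbol can be turned into a 0/1 compatibility vector at a leaf, and how a partially resolved codon expands into the codons it is compatible with. This is only an illustration of the idea, not HyPhy's own code: the function names `leaf_vector` and `compatible_codons` are hypothetical, and the A, C, G, T state order is an assumption chosen so that Y gives the vector quoted in the reply.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one is compatible with.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}

# Assumed state order for the leaf vector (not taken from HyPhy's source).
BASES = "ACGT"


def leaf_vector(symbol):
    """Return a 0/1 vector marking which bases are compatible with `symbol`."""
    compatible = IUPAC[symbol.upper()]
    return [1 if base in compatible else 0 for base in BASES]


def compatible_codons(codon):
    """List every fully resolved codon compatible with a partially resolved one."""
    options = (IUPAC[symbol] for symbol in codon.upper())
    return ["".join(triplet) for triplet in product(*options)]


print(leaf_vector("Y"))          # [0, 1, 0, 1]
print(compatible_codons("CRA"))  # ['CAA', 'CGA']
```

A fully ambiguous `N` gives `[1, 1, 1, 1]` under this scheme, i.e. completely missing data, whereas a two-base code such as `Y` or `R` still rules out half of the states.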

Author

psorigue commented Nov 9, 2022 via email

spond closed this as completed Nov 17, 2022