Results issue and running error #1668
Comments
Dear @fatima-akhtar113,

1. I am afraid I can't help you unless you provide more information about the MEME analysis. If you ran it in Datamonkey, please include the URL for the results page.
2. No, you cannot conclude that a gene is under selection if one or two sites are under selection. See https://academic.oup.com/mbe/article/32/5/1365/1134918. Use BUSTED to look for gene-level selection.

image.png (view on web): https://github.com/veg/hyphy/assets/1018513/1cf455e8-d1a6-40ec-9c3b-be1628aa9329

Best,
Sergei
|
Thank you for replying. Here is the URL:
https://www.datamonkey.org/meme/656497ed1fdac30a835a1cd3
Can FEL and SLAC be used for gene-level selection? Thank you for suggesting BUSTED. What can I interpret from my FEL and SLAC results, though? Can they be significant? Also, is MEME a good option for analyzing gene-level selection?
|
I skipped the picture earlier. Now I know MEME is not a good option; please guide me on the other two.
|
Dear @fatima-akhtar113,

Like I said previously, if the goal is to identify selection at the level of a gene you should use BUSTED. However, the sequences you submitted to Datamonkey have not been properly aligned. Datamonkey will "pad" sequences of unequal lengths with ? at the end and this is what happened here (https://www.datamonkey.org/meme/656497ed1fdac30a835a1cd3/fasta).

Datamonkey requires codon-aware multiple sequence alignments. If you are not familiar with how to obtain those, you may want to take a look elsewhere, e.g. https://github.com/veg/hyphy-analyses/blob/master/codon-msa/README.md and #1477.

I attach an aligned version of your data (using the codon-msa workflow I linked to above).

If you run it through BUSTED in HyPhy, like so

hyphy busted --alignment /Users/sergei/Desktop/seqs.msa --tree neighbor-joining --starting-points 5

you will get a significant result for positive selection (p ~ 0), but a very odd looking ω distribution.

image.png (view on web): https://github.com/veg/hyphy/assets/1018513/d0612854-bdd5-49ff-a442-1c3d38c32cba

A dN/dS of 2000 is indicative of some pathologies with the data / model. For example, here's one site which shows several multi-nucleotide substitutions.

image.png (view on web): https://github.com/veg/hyphy/assets/1018513/a09a972c-9b0d-4c28-a3ee-dacdc8a8007b

If you then run BUSTED with support for multiple hits (see https://academic.oup.com/mbe/article/40/7/msad150/7217158)

hyphy busted --alignment /Users/sergei/Desktop/seqs.msa.gz --starting-points 5 --tree neighbor-joining --multiple-hits Double+Triple

a very odd result is obtained

### Partition-level rates for multiple-hit substitutions
* rate at which 2 nucleotides are changed instantly within a single codon : 1.9304
* Corresponding fraction of substitutions : 45.463%
* rate at which 3 nucleotides are changed instantly within a single codon : 1.9649
* Corresponding fraction of substitutions : 5.696%

| Selection mode | dN/dS |Proportion, %| Notes |
|-----------------------------------|---------------|-------------|-----------------------------------|
| Negative selection | 0.967 | 0.000 | Not supported by data |
| Negative selection | 0.999 | 0.000 | Not supported by data |
| Diversifying selection | 244.639 | 100.000 | |

Having more than 50% of the substitutions occur due to multiple hits is very odd.

May I ask where these sequences come from? (unless they are simulated).

Best,
Sergei

seqs.msa.gz: https://github.com/veg/hyphy/files/13481434/seqs.msa.gz
|
I took the coding sequences from BLAST orthologs of my protein, then converted them into DNA sequences using reverse translation. What do I do to get results? I can align my sequences using UGENE. Also, do I have to remove gaps?
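On the alignment requirement raised above: sequences of unequal length are a quick sign that a FASTA file is not yet an alignment. The following is only a rough sketch of such a pre-check, not part of HyPhy or Datamonkey; the function names `read_fasta` and `alignment_problems` and the toy sequences are made up for illustration. Passing the check is necessary but not sufficient, since it says nothing about whether the codons are actually aligned correctly.

```python
from typing import Dict, List


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into {name: sequence}."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def alignment_problems(seqs: Dict[str, str]) -> List[str]:
    """List reasons why `seqs` cannot be a codon alignment (hypothetical helper)."""
    problems = []
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) > 1:
        # Unequal lengths are what would get padded at the end.
        problems.append(f"unequal sequence lengths: {sorted(lengths)}")
    for name, seq in seqs.items():
        if len(seq) % 3 != 0:
            problems.append(f"{name}: length {len(seq)} is not a multiple of 3")
    return problems


# Toy input (not the data from this issue).
example = """>seqA
ATGGCTGCT
>seqB
ATGGCT
>seqC
ATGGCTGC
"""

for problem in alignment_problems(read_fasta(example)):
    print(problem)
```

An empty list from `alignment_problems` only means the file has the shape of a codon alignment; producing a proper one still requires a codon-aware workflow such as the one linked above.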
|
homosapien
MDEPPFSEAALEQALGEPCDLDAALLTDIEGEVGAGRGRANGLDAPRAGADRGAMDCTFE
DMLQLINNQDSDFPGLFDPPYAGSGAGGTDPASPDTSSPGSLSPPPATLSSSLEAFLSGP
QAAPSPLSPPQPAPTPLKMYPSMPAFSPGPGIKEESVPLSILQTPTPQPLPGALLPQSFP
APAPPQFSSTPVLGYPSPPGGFSTGSPPGNTQQPLPGLPLASPPGVPPVSLHTQVQSVVP
QQLLTVTAAPTAAPVTTTVTSQIQQVPVLLQPHFIKADSLLLTAMKTDGATVKAAGLSPL
VSGTTVQTGPLPTLVSGGTILATVPLVVDAEKLPINRLAAGSKAPASAQSRGEKRTAHNA
IEKRYRSSINDKIIELKDLVVGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTA
VHKSKSLKDLVSACGSGGNTDVLMEGVKTEVEDTLTPPPSDAGSPFQSSPLSLGSRGSGS
GGSGSDSEPDSPVFEDSKAKPEQRPSLHSRGMLDRSRLALCTLVFLCLSCNPLASLLGAR
GLPSPSDTTSVYHSPGRNVLGTESRDGPGWAQWLLPPVVWLLNGLLVLVSLVLLFVYGEP
VTRPHSGPAVYFWRHRKQADLDLARGDFAQAAQQLWLALRALGRPLPTSHLDLACSLLWN
LIRHLLQRLWVGRWLAGRAGGLQQDCALRVDASASARDAALVYHKLHQLHTMGKHTGGHL
TATNLALSALNLAECAGDAVSVATLAEIYVAAALRVKTSLPRALHFLTRFFLSSARQACL
AQSGSVPPAMQWLCHPVGHRFFVDGDWSVLSTPWESLYSLAGNPVDPLAQVTQLFREHLL
ERALNCVTQPNPSPGSADGDKEFSDALGYLQLLNSCSDAAGAPAYSFSISSSMATTTGVD
PVAKWWASLTAVVIHWLRRDEEAAERLCPLVEHLPRVLQESERPLPRAALHSFKAARALL
GCAKAESGPASLTICEKASGYLQDSLATTPASSSIDKAVQLFLCDLLLVVRTSLWRQQQP
PAPAPAAQGTSSRPQASALELRGFQRDLSSLRRLAQSFRPAMRRVFLHEATARLMAGASP
TRTHQLLDRSLRRRAGPGGKGGAVAELEPRPTRREHAEALLLASCYLPPGFLSAPGQRVG
MLAEAARTLEKLGDRRLLHDCQQMLMRLGGGTTVTSS
Pantroglodytes
MDEPPFSEAALEQALGEPCDLDAALLTDIEGEVGAGRGRANRLDAPRAGADHGAMDCTFEDMLQLINNQD
SDFPGLFDPPYAGSGAGGTDPASPDTSSPGSLSPPPATLSSSLEAFLSGPQAAPSPLSPPQPAPTPLKMY
PSVPTFSPGPGIKEESVPLSILQTPTPQPLPGALLPQSFPAPAPPQFSSTPVLGYPSPPGGFSTGSPPGS
TQQPLPGLPLASPPGVPPISLHTQVQSVVPQQLLTVTAAPTAAPVTTTVTSQIQQVPVLLQPHFIKADSL
LLTAMKTDGATVKAAGLSPLVSGTTVQTGPLPTLVSGGTILATVPLVVDAEKLPINRLAAGSKAPASAQS
RGEKRTAHNAIEKRYRSSINDKIIELKDLVVGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTA
VHKSKSLKDLVSACGSGGNTDVLMEGVKTEVEDTLTPPPSDAGSPFQSSPLSLGSRGSGSGGSGSDSEPD
SPVFEDSKAKPEQRPSLHSRGMLDRSRLALCTLVFLCLSCNPLASLLGARGLPSPSDTTSIYHSPGRNVL
GTESRDGPGWAQWLLPPVVWLLNGLLVLVSLVLLFVYGEPVTRPHSGPAVYFWRHRKQADLDLARGDFAQ
AAQQLWLALRALGRPLPTSHLDLACSLLWNLIRHLLQRLWVGRWLAGRAGGLQQDCALRVDARASARDAA
LVYHKLHQLHTMGKHTGGHLTATNLALSALNLAECAGDAVSVATLAEIYVAAALRVKTSLPRALHFLTRF
FLSSARQACLAQSGSVPPAMQWLCHPVGHRFFVDGDWSVLSTPWESLYSLAGNPVDPLAQVTQLFREHLL
ERALNCVTQPNPSPGSADGDKEFSDALGYLQLLNSCSDAAGAPACSFSISSSMATTTGVDPVAKWWASLT
AVVIHWLRRDEEAAERLCPLVEHLPRVLQESERPLPRAALHSFKAARALLGCAKAESGPASLTICEKASG
YLQDSLATTPASSSIDKAVQLFLCDLLLVVRTSLWRQQQPPAPAPAAQGTSSRPQASALELRGFQRDLSS
LRRLAQSFRPAMRRVFLHEATARLMAGASPTRTHQLLDRSLRRRAGPGGKGGAVAELEPRPTRREHAEAL
LLASCYLPPGFLSAPGQRVGMLAEAARTLEKLGDRRLLHDCQQMLMRLGGGTTVTSS
Panpaniscus
MDEPPFSEAALEQALGEPCDLDAALLTDIEGEVGAGRGRANRLDAPRAGADHGAMDCTFEDMLQLINNQD
SDFPGLFDPPYAGSGAGGTDPASPDTSSPGSLSPPPATLSSSLEAFLSGPQAAPSPLSPPQPAPTPLKMY
PSVPTFSPGPGIKEESVPLSILQTPTPQPLPGALLPQSFPAPAPPQFSSTPVLGYPSPPGGFSTGSPPGS
TQQLLPGLPLASPPGVPPISLHTQVQSVVPQQLLTVTAAPTAAPVTTTVTSQIQQVPVLLQPHFIKADSL
LLTAMKTDGATVKAAGLSPLVSGTTVQTGPLPTLVSGGTILATVPLVVDAEKLPINRLAAGSKAPASAQS
RGEKRTAHNAIEKRYRSSINDKIIELKDLVVGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTA
VHKSKSLKDLVSACGSGGNTDVLMEGVKTEVEDTLTPPPSDAGSPFQSSPLSLGSRGSGSGGSGSDSEPD
SPVFEDSKAKPEQRPSLHSRGMLDRSRLALCTLVFLCLSCNPLASLLGARGLPSPSDTTSIYHSPGRNVL
GTESRDGPGWAQWLLPPVVWLLNGLLVLVSLVLLFVYGEPVTRPHSGPAVYFWRHRKQADLDLARGDFDQ
AAQQLWLALRALGRPLPTSHLDLACSLLWNLIRHLLQRLWVGRWLAGRAGGLQQDCALRVDARASARDAA
LVYHKLHQLHTMGKHTGGHLTATNLALSALNLAECAGDAVSVATLAEIYVAAALRVKTSLPRALHFLTRF
FLSSARQACLAQSGSVPPAMQWLCHPVGHRFFVDGDWSVLSTPWESLYSLAGNPVDPLAQVTQLFREHLL
ERALNCVTQPNPSPGSADGDKEFSDALGYLQLLNSCSDAAGAPACSFSISSSMATTTGVDPVAKWWASLT
AVVIHWLRRDEEAAERLCPLVEHLPRVLQESERPLPRAALHSFKAARALLGCAKAESGPASLTICEKASG
YLQDSLATTPASSSIDKAVQLFLCDLLLVVRTSLWRQQQPPAPAPAAQGTSSRPQASALELRGFQRDLSS
LRRLAQSFRPAMRRVFLHEATARLMAGASPTRTHQLLDRSLRRRAGPGGKGGAVAELEPRPTRREHAEAL
LLASCYLPPGFLSAPGQRVGMLAEAARTLEKLGDRRLLHDCQQMLMRLGGGTTVTSS
Pongoabelii
MDEPPFSEAALEQALGEPCDLDLALLTDIEGEVGAGRGRANRLDAPRAGADRGAMDCTFEDMLQLINNQD
SDFPGLFDPPYAGSGAGGTDPASPDTSSPGSLSPPPATLSSSLEAFLSGPKAAPSPLSPPQPAPTPLKMY
PSMPAFSPGPGIKEESVPLSILQTPTPQPLPGALLPQSFPAPAPPQFSSTPVLGYPSPPGGFSTGSPPGS
TQQPLPGLPLASPPGVPPVSLHTQAQSVVPQQLLTVTAAPTAAPVTTTVTSQIQQVPVLLQPHFIKADSL
LLTAVKTDGATVKAAGLSPLVSGTTVQTGPLPTLVSGGTILATVPLVVDADKLPINRLAAGSKASGSAQS
RGEKRTAHNAIEKRYRSSINDKIIELKDLVVGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTA
VHKSKSLKDLVSACGSGGNTDVLMEGVKTEVEDTLTPPPSDAGSPFQSSPLSLGSRGSGSGGSGSDSEPD
SPAFEDSKAKPEQRPSSHSRGMLDRSRLALCTLVFLCLSCNPLASLLGARGLPSPSDTTSVYHSPGRNVL
GTESRDGPGWAQWLLPPVVWLLNGLLVLVSLVLLFVYGEPVTRPHSGPAVYFWRHRKQADLDLARGDFAQ
AAQQLWLALRALGRPLPTSHLDLACSLLWNLIRHLLQRLWVGRWLAGRAGGLQQDCALRVDACASARDAA
LVYHKLHQLHTMGKYTGGHLTATNLALSALNLAECAGDAVSVATLAEIYVAAALRVKTSLPRALHFLTRF
FLSSARQACLAQSGSVPPAMQWLCHPVGHRFFVDGDWAVLSTPRESLYSLAGNPVDPLAQVTQLFREHLL
ERALNCVTQPSPSPGSADGDKEFSDALGYLQLLNSCSDAAGAPACSFSISSSMATTTGVDPVAKWWASLT
AVVIHWLRRDEEAAERLCPLVEHLPRVLQESERPLPRAALHSFKAAWALLGCAKAESGPASLTICEKASG
YLQDSLATTPASSSIDKAVQLFLCDLLLVVRTSLWRQQQPPAPAPAAQGTSSRPHASALELRGFQRDLSS
LRRLAQSFRPAMRRVFLHEATARLMAGASPTRTHQLLDRSLRRRAGPGGKGGAVAELEPRPTRREHAEAL
LLASCYLPPGFLSAPGQRVGMLAEAARTLEKLGDRRLLHDCQQMLMRLGGGTTVTSS
Macacafascicularis
MDEPPFSEAALEQALGGPCDLDAALLTDIEGEVGAGRGRASRLDAPRAGADRGAMDCTFEDMLQLINNQD
SDFPGLFDPPYAGSGAGGTDPASPDTSSPGSLSPPPTTLSSSLEDFLSGPKAAPSPLSPPQPAPTPLKMY
PSVPTFSPGPGIKEESVPLSILQTPTPQPLPGALLPQSFPAPAPPQFSSTPVLGYPSPPGGFSTGSPPGS
TQQPLPGLPLASPPGVPPVSLHTQVQSVAPQRLLTVTAAPTAAPATTTVTSQIQQVPVLLQPHFIKADSL
LLTAMKTDGTTVKAAGLSPLVSGTTVQTGPLPTLVSGGTILATVPLVVDADKLPINRLAAGSKAPGSAQS
RGEKRTAHNAIEKRYRSSINDKIIELKDLVVGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTA
VHKSKSLKDLVSACGSEGNTDVLMEGVKTEVEDTLTPPPSDAGSPFQSSPLSLGSRGSGSGGSGSDSEPD
SPVFEDSKAKPEQRPSPHSRGMLDRSRLALCTLVFLCLSCNPLASLLGARGLPGPSDITSVYHSPGRNVL
GTESRDGPGWAQWLLPPVVWLLNGLLVLVSLVLLFVYGEPVTRPHSGPAVHFWRHRKQADLDLARGDFAQ
AAQQLWLALRALGRPLPTSHLDLACSLLWNLIRHLLQRLWVGRWLAGRAGGLQRDCSLRVDARASARDAA
LVYHKLHQLHTMGKYTGGHLTATNLALSALNLAECAGDAVSVATLAEIYVAAALRVKTSLPRTLHFLTRF
FLSSARQACLAQSGSVPPAMQWLCHPVGHRFFVDGDWAVLSTPRETLYSLAGNPVDPLAQVTQLFREHLL
ERALNCVTQPNPSPGSADGDKEFSDALGYLQLLNSCSDAAGAPACSFSISSSMATTTGIDPVAKWWASLT
AVVIHWLRRDEEAAERLCPLVEHLPRVLQESERPLPRAALHSFKAARALLGCAKAESGPASLTICEKASG
YLQDSLATTPASSSIDKAVQLFLCDLLLVVRTSLWRQQQPPAPAPAAQGTSSGPQASALELRGFQRDLSS
LRRLAQSFRPAMRRVFLHEATARLMAGASPTRTHQLLDRSLRRRAGPGGKGGAVAELEPRPTRREHAEAL
LLASCYLPPGFLSAPGQRVGMLAEAARTLEKLGDRRLLHDCQQMLMRLGGGTTVTSS
Nomascusleucogenys
MDEPPFSEAALEQALGEPCDLDAALLTDIEGARRGAGRGRANRLDAPRAGADRGAMDCTFEDMLQLINNQ
DSDFPGLFDPPYAGSGAGGTDPASPDTSSPGSLSPPPATLSSSLEAFLSGPKAAPSPLSPPQPAPTPLKM
YPSVPAFSPGPGIKEESVPLSILQTPTPHPLPGALLPQSFPAPAPPQFSSTPVLGYPSPPEGFSTGSPPG
STQQPLPGLPLASPPGVPPVSLHTQVQSVVPQQLLTVTAAPTAAPVTTTVTSQIQQVPVLLQPHFIKADS
LLLTAMKTDGATVKAAGLSPLVSGTTVQTGPLPTLVSGGTILATVPLVVDADKLPINRLAAGSKAPGSAQ
SRGEKRTAHNAIEKRYRSSINDKIIELKDLVVGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRT
AVHKSKSLKDLVSACGSGGNTDVLMEGVKTEVEDTLTPPPSDAGSPFQSSPLSLGSRGSGSGGSGSDLEP
DSPVFEDSKAKPEQWPSPHSRGMLDRSRLALCTLVFLCLSCNPLASLLGARGLPSPSDTTSVYHSPGRNV
LGTESRDGPGWAQWLLPPVVWLLNGLLVLVSLVLLFVYGEPVTRPHSGPAVYFWRHRKQADLDLARGDFA
QAAQQLWLALRALGRPLPTSHLDLACSLLWNLIRHLLQRLWVGRWLAGRAGGLQQDCALRVDARASARDA
ALVYHKLHQLHTMGKYTGGHLTATNLALSALNLAECAGDAVSVATLAEIYVAAALRVKTSLPRALHFLTR
FFLSSARQACLAQSGSVPPAMQWLCHPVGHRFFVDGDWAVLSTPRESLYSLAGNPVDPLAQVTQLFREHL
LERALNCVTQPNPSPGSADGDKEFSDALGYLQLLNSCSDAAGTPACSFSISSSMATTTGVDPVAKWWASL
TAVVIHWLRRDEEAAERLCPLLEHLPRVLQESERPLPRAALHSFKAARALLGCAKAESGPASLTICEKAS
GYLQDSLATTPTSSSIDKAVQLFLCDLLLVVRTSLWQQQQPLAPAPASQSASSRPQASALELRGFQRDLS
SLRRLAQSFRPAMRRVFLHEATARLMAGASPTRTHQLLDRSLRRRAGPGGKGGAVAELEPRPTRREHAEA
LLLASCYLPPGFLSAPGQRVGMLAEAARTLEKLGDRWLLHDCQQMLMRLGGGTTVTS
Chlorocebussabaeus
MDEPPFSKAALEQALGGPCDLDAALLTDIEGEVGAGRGRASRLDAPRAGADRGAMDCTFEDMLQLINNQD
SDFPGLFDPPYAGSGAGGTDPASPDTSSPGSLSPPPTTLSSSLEDFLSGPKAAPSPLSPPQPAPTPLKMY
PSVPTFSPGPGIKEESVPLSILQTPTPQPLPGALLPQSFPAPAPPQFSSTPVLGYPSPPGAFSTGSPPGS
TQQPLPGLPLASPPGVPPVSLHTQVQSVAPQRLLTVTAAPTAAPATTTVTSQIQQVPVLLQPHFIKADSL
LLTAMKTDGATVKAAGLSPLVSGTTVQTGPLPTLVSGGTILATVPLVVDADKLPINRLAAGSKAPGSAQS
RGEKRTAHNAIEKRYRSSINDKIIELKDLVVGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTA
VHKSKSLKDLVSACGSEGNTDVLMEGVKTEVEDTLTPPPSDAGSPFQSSPLSLGSRGSGSGGSGSDSEPD
SPVFEDSKAKPEQRPSPHSRGMLDRSRLALCTLVFLCLSCNPLASLLGARGLPGPSDITSVYHSPGRNVL
GTESRDGPGWAQWLLPPVVWLLNGLLVLVSLVLLFVYGEPVTRPHSGPAVHFWRHRKQADLDLARGDFAQ
AAQQLWLALRALGRPLPTSHLDLACSLLWNLIRHLLQRLWVGRWLAGRAGGLQRDCSLRVDARASARDAA
LVYHKLHQLHTMGKYTGGHLTATNLALSALNLAECAGDAVSVATLAEIYVAAALRVKTSLPRTLHFLTRF
FLSSARQACLAQSGSVPPAMQWLCHPVGHRFFVDGDWAVLSTPRETLYSLAGNPVDPLAQVTQLFREHLL
ERALNCVTQPNPSPGSADGDKEFSDALGYLQLLNSCSDAAGAPACSFSISSSMATTTGVDPVAKWWASLT
AVVIHWLRRDEEAAERLCPLVEHLPRVLQESERPLPRAALHSFKAARALLGCAKAESGPASLTICEKASG
YLQDSLTTTPASSSIDKAVQLFLCDLLLVVRTSLWRQQQPPAPAPAAQGTSSGPQASALELRGFQRDLSS
LRRLAQSFRPAMRRVFLHEATARLMAGASPTRTHQLLDRSLRRRAGPSGKGGAVAELEPRPTRREHAEAL
LLASCYLPPGFLSAPGQRVGMLAEAARTLEKLGDRRLLHDCQQMLMRLGGGTTVTSS
Cercocebusatys
MDEPPFSEAALEQALGGPCDLDAALLTDIEGEVGARRGRASRLDAPRAGADRGAMDCTFEDMLQLINNQD
SDFPGLFDPPYAGSGAGGTDPASPDTSSPGSLSPPPTTLSSSLEDFLSGPKAAPSPLSPPQPAPTPLKMY
PSVPTFSPGPGIKEESVPLSILQTPTPQPLPGALLPQSFPAPAPPQFSSTPVLGYPSPPGGFSTGSPPGS
TQQPLPGLPLASPPGVPPVSLHTQVQSVAAQRLLTVTAAPTAAPATTTVTSQIQQVPVLLQPHFIKADSL
LLTAMKTDGTTVKAAGLSPLVSGTTVQTGPLPTLVSGGTILATVPLVVDADKLPINRLAAGSKAQGSAQS
RGEKRTAHNAIEKRYRSSINDKIIELKDLVVGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTA
VHKSKSLKDLVSACGSEGNTDVLMEGVKTEVEDTLTPPPSDAGSPFQSSPLSLGSRGSGSGGSGSDSEPD
SPVFEDSKAKPEQRPSPHSRGMLDRSRLALCTLVFLCLSCNPLASLLGARGLPGPSDITSVYHSPGRNVL
GTESRDGPGWAQWLLPPVVWLLNGLLVLVSLVLLFVYGEPVTRPHSGPAVHFWRHRKQADLDLARGDFAQ
AAQQLWLALRALGRPLPTSHLDLACSLLWNLIRHLLQRLWVGRWLAGRAGGLQRDCSLRVDARASARDAA
LVYHKLHQLHTMGKYTGGHLTATNLALSALNLAECAGDAVSVATLAEIYVAAALRVKTSLPRTLHFLTRF
FLSSARQACLAQSGSVPPAMQWLCHPVGHRFFVDGDWAVLSTPRETLYSLAGNPVDPLAQVTQLFREHLL
ERALNCVTQPNPSPGSADGDKEFSDALGYLQLLNSCSDAAGAPACSFSISSSMATTTGIDPVAKWWASLT
AVVIHWLRRDEEAAERLCPLVEHLPRVLQESERPLPRAALHSFKAARALLGCAKAESGPASLTICEKASG
YLQDSLATTPASSSIDKAVQLFLCDLLLVVRTSLWRQQQPPAPAPAAQGTSSGPQASALELRGFQRDLSS
LRRLAQSFRPAMRRVFLHEATARLMAGASPTRTHQLLDRSLRRRAGPGGKGGAVAELEPRPTRREHAEAL
LLASCYLPPGFLSAPGQRVGMLAEAARTLEKLGDRRLLHDCQQMLMRLGGGTTVTSS
Gorillagorillagorilla
MDEPPFSEAALEQALGEPCDLDAALLTDIEDMLQLINNQDSDFPGLFDPPYAGSGAGGTDPASPDTSSPG
SLSPPPATLSSSLEAFLSGPKAAPSPLSPPQPAPTPLKMYPSVPAFSPGPGIKEESVPLSILQTPTPQPL
PGALLPQSFPAPAPPQFSSTPVLGYPSPPGGFSTGSPPGSTQQPLPGLPLASPPGVPPVSLHTQVQSVVP
QQLLTVTAAPTAAPVTTTVTSQIQQVLLQPHFIKADSLLLTAMKTDGATVKAAGLSPLVSGTTVQTGPLP
TLVSGGTILATVPLVVDAEKLPINRLAAGSKAPASAQSRGEKRTAHNAIEKRYRSSINDKIIELKDLVVG
TEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTAVHKSKSLKDLVSACGSGGNTDMLMEGVKTEVE
DTLTPPASDAGSPFQSSPLSLGSRGSGSGGSGSDSEPDSPVFEDSKAKPEQRPSLHSRGMLDRSRLALCT
LVFLCLSCNPLASLLGARGLPSPSDTTSVYHSPGRNVLGTESRDGPGWAQWLLPPVVWLLNGLLVLVSLV
LLFVYGEPVTRPHSGPAVYFWRHRKQADLDLARGDFAQAAQQLWLALRALGRPLPTSHLDLACSLLWNLI
RHLLQRLWVGRWLAGRAGGLQQDCALRVDARASARDAALVYHKLHQLHTMGKHTGGHLTATNLALSALNL
AECAGDAVSVATLAEIYVAAALRVKTSLPRALHFLTRFFLSSARQACLAQSGSVPPAMQWLCHPVGHRFF
VDGDWSVLSTPWESLYSLAGNPVDPLAQVTQLFREHLLERALNCVTQPNPSPGSADGDKEFSDALGYLQL
LNSCSDAAGAPACSFSISSSMATTTGVDPVAKWWASLTAVVIHWLRRDEEAAERLCPLVEHLPRVLQESE
RPLPRAALHSFKAARALLGCAKAESGPASLTICEKASGYLQDSLATTPASSSIDKAVQLFLCDLLLVVRT
SLWRQQQPPAPAPAAQGTSSRPQASALELRGFQRDLSSLRRLAQSFRPAMRRVFLHEATARLMAGASPTR
THQLLDRSLRRRAGPGGKGGTVAELEPRPTRREHAEALLLASCYLPPGFLSAPGQRVGMLAEAARTLEKL
GDRRLLHDCQQMLMRLGGGTTVTSS
Rhinopithecusroxellana
MDESPFSEAALEQALGGPCDLDAALLTDIEDMLQLINNQDSDFPGLFDPPYAGSGAGGTDPASPDTSSPG
SLSPPPATLSSSLEDFLSGPKAAPSPLSPPQPAPTPLKMYPSVPTFSPGPGIKEESVPLSILQTPTPQPL
PGALLPQSFPAPAPTQFSSTPVLGYPSPPGGFSTGSPPGSTQQPLPGLPLASPPGVPPVSLHTQVQSVAP
QRLLTVTAAPTAAPATTTVTSQIQQVPVLLQPHFIKADSLLLTAMKTDGATVKAAGLSPLVSGTAVQTGP
LPTLVSGGTILATVPLVVDADKLPINRLAAGSKAPVSAQSRGEKRTAHNAIEKRYRSSINDKIIELKDLV
VGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTAVHKSKSLKDLVSACGSEGNTDVLMEGVKTE
VEDTLTPPPSDAGSPFQSSPLSLGSRGSGSGGSGSDSEPDSPVFEDSKAKPEQRPSPHSRGMLDRSRLAL
CTLVFLCLSCNPLASLLGARGLPGPSDITSVYHSPGRNVLGTESRDGPGWAQWLLPPVVWLLNGLLVLVS
LVLLFVYGEPVTRPHSGPAVHFWRHRKQADLDLARGDFAQAAQQLWLALRALGRPLPTSHLDLACSLLWN
LIRHLLQRLWVGRWLAGRAGGLQRDCSLRVDARASARDAALVYHKLHQLHTMGKYTGGHLTATNLALSAL
NLAECAGDAVSVATLAEIYVAAALRVKTSLPRTLHFLTRFFLSSARQACLAQSGSVPPAMQWLCHPVGHR
FFVDGDWVVLSTPRETLYSLAGNPVDPLAQVTQLFREHLLERALNCVTQPNPSPGSADGDKEFSDALGYL
QLLNSCSDAAGAPACSFSISSSMATTTGVDPVAKWWASLTAVVIHWLRRDEEAAERLCPLVEHLPRVLQE
SERPLPRAALHSFKAARALLGCAKAESGPASLTICEKASGYLQDSLATTPASSSIDKAVQLFLCDLLLVV
RTSLWRQQQPPAPAPAAQGASSGPQASALELRGFQRDLSSLRRLAQSFRPAMRRVFLHEATARLMAGASP
TRTHQLLDRSLRRRAGPGGKGGTVAELEPRPTRREHAEALLLASCYLPPGFLSAPGQRVGMLAEAARTLE
KLGDRRLLHDCQQMLMRLGGGTTVTSS
Aotusnancymaae
MDELSFSEAVLEQALSEPCDLDAALLTDIEGEVGAGRGRASRLDALWAGADRGAMDCTFEDMLQLINNQD
SDFPGLFDPPYAGGGAGGTDPASPDTSSPASLSPPPATLSSSLEGFLSGPEAAPSPLSPPQPAPAPLKMY
PPLPTFSPGPGIKEESVPLSILQTPTPQPLPGALLPQSFPAPAPPQFSSTPVMGYASPAGGFSTGSPPAS
TQQPLPGLPLASPPGVPPVSLHTQVQSAAPQQLLTVTAAPTAAPATTTVNSQIQQVPVLLQPHFIKADSL
LLTTMKTDGATVKAASLGPLVSGATVQTGPLPTLVSGGTILATVPLVVDADKLPINRLAAGSKAPGSAQS
RGEKRTAHNAIEKRYRSSINDKIIELKDLVVGTEAKLNKSAVLRKAIDYIRFLQHSNQKLKQENLSLRTA
VHKSKSLKDLVSACGGGGNTDVLMEGVKTEVEDTLTPPPSDAGSPFQSSPLSLGSRGSGSGGSGSDSEPD
SPVFEDSQAKPEQRPTAHSGGMPDRSRLALCTLVFLCLSCNPLASLLGARGLPSPSETTSIYHSPGRNVL
GTESRDGPGWAQWLLPPVVWLLNGLLVLVSLVLLFVYGEPVTRPHSGPAVHFWRHRKQADLDLARGDFAQ
AAQQLWLALRALGRPLPTSHLDLACSLLWNLIRHLLQRLWVGRWLASRAGGLQRDCALRVDARASARDAA
LVYHKLHQLHTMGKYTGGHLTATNLALSALNLAECAGDAVSVATLAEIYVVAALRVKTSLPWALHFLTRF
FLSSARQVCLAQSGSVPLAMQWLCHPVGHRFFVDGDWAVLSTPRESPYSLAGNPVDPLAQVTQLFREHLL
ERALNCVAQPNPSPGSADGDKEFSDALGYLQLLNSCSDAAGAPTCSFSISSSMATTTGVDPVAKWWASLT
AVVIHWLRRDEEAAEQLCPLVEHLPRVLQESERPLPRAALHSFKAARALLGCAKAESGPASLTICEKASG
YLQDSLATTPAGSSLDKAVQLFLCDLLLVVRTSLWRQQQLPAPAPAGQGASSGPQASALELRGFQRDLSS
LRRLAQSFRPAMRRVFLHEATARLMAGASPARTHQLLDRSLRRRAGPGGKGGVAPAELEPRPTRREHAEA
LLLASCYLPPGFLSAPGQRMGMLAEAARTLEKLGDRRLLHDCQQMLMRLGGGTTVTSS
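
A note on the reverse-translation step mentioned earlier in the thread, offered as a sketch under assumptions and not as the maintainers' diagnosis. If a reverse-translation tool assigns one fixed codon to each amino acid (an assumption; the tool used here is not specified), the resulting DNA cannot contain synonymous differences, which are what the dS part of dN/dS measures. The codon table below is a hypothetical, partial table chosen for illustration, and the two peptides are the first eight residues of the homosapien and Aotusnancymaae entries above.

```python
from typing import Tuple

# Hypothetical fixed table: exactly one codon per amino acid (partial, for illustration).
FIXED_CODON = {
    "M": "ATG", "D": "GAT", "E": "GAA", "P": "CCG",
    "F": "TTT", "S": "AGC", "L": "CTG",
}
# Inverse lookup, valid only for codons produced by the table above.
CODON_TO_AA = {codon: aa for aa, codon in FIXED_CODON.items()}


def reverse_translate(protein: str) -> str:
    """Back-translate a protein using the fixed one-codon-per-residue table."""
    return "".join(FIXED_CODON[aa] for aa in protein)


def count_differences(dna_a: str, dna_b: str) -> Tuple[int, int]:
    """Count (synonymous, nonsynonymous) codon differences between two equal-length sequences."""
    synonymous = nonsynonymous = 0
    for i in range(0, len(dna_a), 3):
        codon_a, codon_b = dna_a[i:i + 3], dna_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TO_AA[codon_a] == CODON_TO_AA[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


human = reverse_translate("MDEPPFSE")  # start of the homosapien entry
aotus = reverse_translate("MDELSFSE")  # start of the Aotusnancymaae entry
syn, nonsyn = count_differences(human, aotus)
print(f"synonymous: {syn}, nonsynonymous: {nonsyn}")
```

Under this assumption every codon difference is an amino-acid difference, so the synonymous count is zero by construction. The original coding sequences (CDS) from the source database would retain that information.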
|
I have attached the protein file.
|
I have 49 genes; I used this alignment file for another gene.
|
I got these results:
https://www.datamonkey.org/busted/65656bb91fdac30a835a487a
I will attach the sequence file with the protein translated into DNA; I have already sent the alignment file that I submitted.
Are my results right? Also, do I just have to mention that there is no selection? How can I interpret the results tables in my analysis?
Also, thank you so much for helping. I am doing this for the first time and I am trying to get my head around all of it.
Regards,
Fatima Akhtar
On Tue, Nov 28, 2023 at 8:37 AM fatima khan ***@***.***> wrote:
i have 49 genes i used this alignment file for other gene
On Tue, Nov 28, 2023 at 8:31 AM fatima khan ***@***.***>
wrote:
> i have attached protein file
>
> On Tue, Nov 28, 2023 at 8:30 AM fatima khan ***@***.***>
> wrote:
>
>>
>>
>> On Tue, Nov 28, 2023 at 7:04 AM fatima khan ***@***.***>
>> wrote:
>>
>>> I took the coding sequences from the BLAST orthologues of my protein, then
>>> converted them into DNA sequences using reverse translation. What do I do to
>>> get results? I can align my sequences using UGENE. Also, do I have to remove
>>> gaps?
>>>
>>>
>>> -------- Original Message --------
>>> Subject: Re: [veg/hyphy] results isuue and running error (Issue #1668)
>>> From: Sergei Pond
>>> To: veg/hyphy
>>> CC: fatima-akhtar113 ,Mention
>>>
>>> Dear @fatima-akhtar113 <https://github.com/fatima-akhtar113>,
>>>
>>> As I said previously, if the goal is to identify selection at the
>>> level of a gene, you should use BUSTED. However, the sequences you
>>> submitted to Datamonkey have not been properly aligned. Datamonkey will
>>> "pad" sequences of unequal lengths with ? at the end, and this is what
>>> happened here (
>>> https://www.datamonkey.org/meme/656497ed1fdac30a835a1cd3/fasta).
>>>
>>> Datamonkey requires codon-aware multiple sequence alignments. If you
>>> are not familiar with how to obtain those, you may want to take a look
>>> elsewhere, e.g.
>>> https://github.com/veg/hyphy-analyses/blob/master/codon-msa/README.md
>>> and #1477 <#1477>
>>>
>>> I attach an aligned version of your data (using the codon-msa workflow
>>> I linked to above).
>>>
>>> If you run it through BUSTED in HyPhy, like so
>>>
>>> hyphy busted --alignment /Users/sergei/Desktop/seqs.msa --tree neighbor-joining --starting-points 5
>>>
>>> You will get a significant result for positive selection (p ~ 0), but a
>>> very odd looking ω distribution
>>> image.png (view on web)
>>> <https://github.com/veg/hyphy/assets/1018513/d0612854-bdd5-49ff-a442-1c3d38c32cba>
>>>
>>> A dN/dS of 2000 is indicative of some pathologies with the data /
>>> model. For example here's one site which shows several multi-nucleotide
>>> substitutions
>>> image.png (view on web)
>>> <https://github.com/veg/hyphy/assets/1018513/a09a972c-9b0d-4c28-a3ee-dacdc8a8007b>
>>>
>>> If you then run BUSTED with support for multiple hits (see
>>> https://academic.oup.com/mbe/article/40/7/msad150/7217158)
>>>
>>> hyphy busted --alignment /Users/sergei/Desktop/seqs.msa.gz --starting-points 5 --tree neighbor-joining --multiple-hits Double+Triple
>>>
>>> a very odd result is obtained
>>>
>>> ### Partition-level rates for multiple-hit substitutions
>>> * rate at which 2 nucleotides are changed instantly within a single codon : 1.9304
>>> * Corresponding fraction of substitutions : 45.463%
>>> * rate at which 3 nucleotides are changed instantly within a single codon : 1.9649
>>> * Corresponding fraction of substitutions : 5.696%
>>>
>>> | Selection mode | dN/dS |Proportion, %| Notes |
>>> |-----------------------------------|---------------|-------------|-----------------------------------|
>>> | Negative selection | 0.967 | 0.000 | Not supported by data |
>>> | Negative selection | 0.999 | 0.000 | Not supported by data |
>>> | Diversifying selection | 244.639 | 100.000 | |
>>>
>>>
>>> Having more than 50% of the substitutions occur due to multiple hits is
>>> very odd.
>>>
>>> May I ask where these sequences come from? (unless they are simulated).
>>>
>>> Best,
>>> Sergei
>>>
>>> seqs.msa.gz <https://github.com/veg/hyphy/files/13481434/seqs.msa.gz>
>>>
>>>
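The padding problem described in the quoted reply can be caught with a quick length check before submitting to Datamonkey. The sketch below is a hypothetical helper (not part of HyPhy or Datamonkey) and assumes the sequences are already loaded into a dict mapping name to sequence. Passing it is necessary but not sufficient for a codon-aware alignment, since it says nothing about whether the gaps respect codon boundaries.

```python
def alignment_problems(seqs: dict) -> list:
    """Return human-readable problems that rule out a codon alignment."""
    problems = []
    lengths = {name: len(seq) for name, seq in seqs.items()}

    # All rows of an alignment must have the same number of columns.
    if len(set(lengths.values())) > 1:
        problems.append(f"unequal sequence lengths: {lengths}")

    # A codon alignment has a whole number of codons in every row.
    for name, length in lengths.items():
        if length % 3 != 0:
            problems.append(f"{name}: length {length} is not a multiple of 3")

    return problems


# Toy data: the second set mimics unaligned input of unequal length.
aligned = {"a": "ATGAAA", "b": "ATG---"}
unaligned = {"a": "ATGAAA", "b": "ATGA"}

print(alignment_problems(aligned))
print(alignment_problems(unaligned))
```

An empty list only means the input could be an alignment; a proper codon-aware aligner, such as the codon-msa workflow linked in the reply, is still needed to produce one.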
Homosapiens
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQIALAFGLAIGTLAQAL
GPVSGGHINPAITLALLVGNQISLLRAFFYVAAQLVGAIAGAGILYGVAPLNARGNLAVN
ALNNNTTQGQAMVVELILTFQLALCIFASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFT
GCSMNPARSFGPAVVMNRFSPAHWVFWVGPIVGAVLAAILYFYLLFPNSLSLSERVAIIK
GTYEPDEDWEEQREERKKTMELTTR
Pantroglodytes
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQIALAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAFFYVAAQLVGAIAGASILYGVAPLNARGNLAVNALNNNTTQGQAMVVELILTF
QLALCIFASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPAHWVFWVGP
IVGAVLAAILYFYLLFPNSLSLSERVAIIKGTYEPDEDWEEQREERKKTMELTTR
Macacanemestrina
MRGVRDTGPVYTAAWSRGPDGSALSAGGAGPRGCAGRRGRARAAGTPPNPCAAAALSVWSRPPPPRRRPA
PAPAPIESRPSRARRPGPARSGCWDRARRHPARPRPAASTSSAACDPTGAPRRGRRRRAPAATMKKEVCS
VAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQIALAFGLAIGTLAQALGPVSGGHINPAITLALL
VGNQISLLRAFFYVAAQLVGAIAGAGILYGVAPLNARGNLAVNALNNNTTQGQAMVVELILTFQLALCIF
ASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPAHWVFWVGPIVGAVLA
AILYFYLLFPNSLSLSERVDIIKGTYEPDEDWEEQREERKKTMELTAR
Pongoabelii
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQIALAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAFFYVAAQLVGAIAGAGILYGVAPLNARGNLAVNALNNNTTQGQAMVVELILTF
QLALCIFASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPAHWVFWVGP
IVGAVLAAILYFYLLFPNSLSLSERVAIIKGTYEPDEDWEEQREERKKTMELTTH
Papioanubis
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQIALAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAFFYVAAQLVGAIAGAGILYGVAPLNARGNLAINALNNNTTQGQAMVVELILTF
QLALCIFASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPAHWVFWVGP
IVGAVLAAILYFYLLFPNSLSLSERVDIIKGTYEPDEDWEEQREERKKTMELTAR
Mandrillusleucophaeus
MEGPQTQAWETESAAQFSRPRLTPPSRQVDKGNPAWERAPPGVHCLVQVCSVAFLKAVFAEFLATLIFVF
FGLGSALKWPSALPTILQIALAFGLAIGTLAQALGPVSGGHINPAITLALLVGNQISLLRAFFYVAAQLV
GAIAGAGILYGVAPLNARGNLAVNALNNNTTQGQAMVVELILTFQLALCIFASTDSRRTSPVGSPALSIG
LSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPAHWVFWVGPIVGAVLAAILYFYLLFPNSLSLSERV
DIIKGTYEPDEDWEEQREERKKTMELTAR
Galeopterusvariegatus
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQISLAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAVFYVVAQLVGAIAGAGILYGLAPLNARGNLAVNALNNNTTQGQAMVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMKRFSPAHWVFWVGP
IVGAVLAAILYFYLLFPNSLSLSERVAVFKGTYEPEEDWEEQREERKKTMELTAR
Mustelaputoriusfuro
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPSILQISLAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAVFYVAAQLVGAIAGAGILYGLAPLNARGNLAINALNNNTTQGQAMVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSSAHWVFWVGP
IVGAILAAILYFYLLFPNSLSVSERVAVIKGTYEPEEDWEEQREERKKTMELTAR
Carlito syrichta
MKKEVCSVAFVKAVFAEFLATLVFVFFGLGSALRWPSALPTILQIALAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRALFYVVAQLVGAIAGAGILYGLAPLNARGNLAVNALNNNTTPGQAMAVELILTF
QLALCVFASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPAHWVFWVGP
IVGAVLAAILYFYLLFPHSLSLSERVAIIKGTYEPDEDWEEQREERKKTMELTAR
Ailuropodamelanoleuca
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPSILQISLAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAAFYVVAQLVGAIAGAGILYGLAPLNARGNLAINALNNNTTQGQAMVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSSAHWVFWVGP
IVGAILAAVLYFYLLFPNSLSLSERVAVIKGTYEPEEDWEEQREERKKTMELTAR
Saimiriboliviensisboliviensis
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQISLAFGLAIGTLVQALGPVSGGHINP
AVTLALLVGNQISLLRALFYVVAQLVGAIAGAGILYGLAPLNARGNLAVNALNNNTTPGQATAVELILTF
QLALCIFASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPVHWVFWVGP
IVGAVLAAILYFYLLFPNSLSLSERVAIFKGTYEPDEDWEEQREERKKTMELTAR
Cebusimitator
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQISLAFGLAIGTLVQALGPVSGGHINP
AVTLALLVGNQISLLRALFYVVAQLVGAIAGAGILYGLAPLNARGNLAVNAVNKNTTPGQAMAVELILTF
QLALCIFASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSRAHWVFWVGP
IVGAVLAAILYFYLLFPNSLSLGERVAIFKGTYEPDEDWEEQREERKKTMELTAR
Aotusnancymaae
MKKEVCSVAFLKAVFAEFLATLIFIFFGLGSALKWPSALPTILQISLAFGLAIGTLVQALGPVSGGHINP
AVTLALLVGNQISLLRALFYVVAQLVGAIAGAGILYGLAPLNARGNLAVNGINSNTTPGQAMAVELILTF
QLALCIFASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSSAHWVFWVGP
IVGAVLAAILYFYLLFPNSLSLGERVAIFKGTYEPDEDWEEQREERKKTMELTAR
Rattusnorvegicus
MKKEVCSLAFFKAVFAEFLATLIFVFFGLGSALKWPSALPTILQISIAFGLAIGTLAQALGPVSGGHINP
AITLALLIGNQISLLRAVFYVAAQLVGAIAGAGILYWLAPLNARGNLAVNALNNNTTPGKAMVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPSHWVFWVGP
IVGAMLAAILYFYLLFPSSLSLHDRVAVVKGTYEPEEDWEDHREERKKTIELTAH
Musmusculus
MKKEVCSVAFFKAVFAEFLATLIFVFFGLGSALKWPSALPTILQISIAFGLAIGILAQALGPVSGGHINP
AITLALLIGNQISLLRAIFYVAAQLVGAIAGAGILYWLAPGNARGNLAVNALSNNTTPGKAVVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPSHWVFWVGP
IVGAVLAAILYFYLLFPSSLSLHDRVAVVKGTYEPEEDWEDHREERKKTIELTAH
Oryctolaguscuniculus
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPSILQIALAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAVFYVAAQLVGAIAGAGILYGLAPLNARGNLAVNALNNNTTPGQAVVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMKRFSPSHWVFWVGP
IVGAILAAILYFYLLFPTSLSLSERVAVVKGSYEPEEDWEEHREKTLELTSR
Myotislucifugus
MKKEVCSVAFVKAVFTEFLATLIFVFFGLGSALQWPSALPSILQISLAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAVFYVVAQLVGAIAGAGILYGLAPLNARGSLAVNALNNNTTPGQAMVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMKRFSSAHWVFWVGP
IVGAALAAILYFYLLFPNSLSLSERVAVVKGTYEPEEDWEEQREERKKTMELTAH
Susscrofa
MKKEVCSLAFLKAVFAEFLATLIFVFFGLASALKWPSALPTILQIALAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAVFYVVAQLVGAIAGAGILYGLAPGNARGNLAVNSLNNNTTPGQAVVVEMILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMNRFSPSHWVFWVGP
IVGAAVAAILYFYLLFPNSLSLSERVAVVKGTYESEEDWEEQREERKKTMELTAH
Heterocephalusglaber
MKKEMCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPSILQISMAFGLAIGTLAQALGPVSGGHINP
AVTLALLVGNQISLLRAVFYVAAQLVGAIAGAGILYGVAPTNARGNLAVNALNNNTTPGQAVVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGFSVALGHLVGIYFTGCSMNPARSFGPAVVMKRFSSSHWVFWVGP
IVGAMLAAILYFYLLFPHSLSLSERMAIIKGTYEPEDDWEDQREERKKTIELTAH
Caviaporcellus
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQISLAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAVFYVIAQLVGAIAGAGILYGVAPTNARGNLAVNALNSNITTGQAVVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMKRFSSTHWVFWVGP
IVGAVLAAILYFYVLFPHSLSISDRVAIVKGTYEPEEDWEEQHEERKKTIELTAR
Manisjavanica
MKKEVCSVAFLKAVFAEFLATLIFVFLGLGSALKWPSALPSVLQISLAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAVFYVVAQLVGAIAGAGILYGLAPVNVRGNLAVNSLNNNTTPGQAMAVELILTF
QLALCIFSSTDSRRTSPMGSPALSIGLSVTLGHFVGIYFTGCSMNPARSFGPAVVMKWFSPAHWVFWVGP
IVGAALAAILYFYLLFPNSLSLSERVAVIKGTYEPEEDWEEQREERKKTMELTAH
Chinchillalanigera
MLRPAAAQPVYTAGWVTWPGQGGRRAGVGVGARPGARGARAAAAGSAPCAPCGPPSAGAAHCPPARAPRP
GARPVYSAQCQLAGRPARAEPGARPAPQPACASAPTAAARRRRAPEATMKKEVCSVAFLKAVFAEFLATL
IFVFFGLGSALKWPSALPTILQISLAFGLAIGTLVQALGPVSGGHINPAITLALLVGNQISLLRAVFYVI
AQLVGAIAGAGILYGVAPTNARGNLAVNALNNNTTAGQAVVVELILTFQLALCIFASTDTRRSSPVGAPA
LSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMKRFSSSHWVFWVGPIVGAVLASILYFYLLFPHSLSL
SERVAIVKGTYEPEDDWEEQREERKKTIELTAH
Bostaurus
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPSVLQISLAFGLAIGTMAQALGPVSGGHMNP
AITLALLVGNQISLLRAVFYVVAQLVGAIAGAAILYGLAPYNARGNLAVNALNNNTTAGQAVVAEMILTF
QLALCVFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPSVIMNRFSSAHWVFWVGP
IVGAAVAAIIYFYLLFPHSLSLSDRAAILKGTYEPDEDWEESQEERKKTMELTAH
Feliscatus
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPSILQISLAFGLAIGTLAQALGPVSGGHINP
AITLALLVGNQISLLRAVFYVVAQLVGAIAGAGILYGLAPINARGNLAINALNNNTTQGQAMVVELILTF
QLALCVFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFTGCSMNPARSFGPAVVMKRFSPAHWVFWVGP
IVGAILAAILYFYLLFPNSLSLSERVAVVKGTYEPEEDWEEQREERKKTMELTAR
Equuscaballus
MKKEVCSVAFFKAVFAEFLATLIFVFFGLGSALQWPSALPSILQISMAFGLAIGTLAQALGPVSGGHINP
AITLALFVGNQISLLRALFYVVAQLVGAIAGAAILYGLAPRNARGNLAINSLNSNTTPGQAMVVELILTF
QLALCIFSSTDSRRTSPVGSPALSIGLSVTLGHLLGIHFTGCSMNPARSFGPAVIMKRFSSAHWVFWVGP
IVGAALAAILYFYLLFPNSLSLSERVAIVKGTYEPEEDWEEQREERKKTMELTAH
gorilla
MKKEVCSVAFLKAVFAEFLATLIFVFFGLGSALKWPSALPTILQIALAFGLAIGTLAQAL
GPVSGGHINPAITLALLVGNQISLLRAFFYVAAQLVGAIAGAGILYGVAPLNARGNLAVN
ALNNNTTQGQAMVVELILTFQLALCIFASTDSRRTSPVGSPALSIGLSVTLGHLVGIYFT
GCSMNPARSFGPAVVMNRFSPAHWVFWVGPIVGAVLAAILYFYLLFPNSLSLSERVAIIK
GTYEPDEDWEEQREERKKTMELTTR
pteropusvampyrus
MKKEVCSVAFIKAVFTEFLATLIFVFFGLGSALQWPSALPSILQISLAFGLAIGTLAQAL
GPVSGGHINPAITLALLVGNQISLLRATFYVVAQLLGAIAGAGILYGLAPTNARGNLAVN
ALNNNTTPGQAVVVELILTFQLALCVFSSTDSRRTSPVGSPALSIGLSVTLGHLVGIYFT
GCSMNPARSFGPAVVMKRFSPAHWVFWVGPIVGAALAAILYFYLLFPNSLSLSERVAVVK
GTYEPEEDWEEQREERKKTMELTAR
loxodontafricana
MWELRSIAFSRAVFSEFLATLLFVFFGLGSALNWPQALPSVLQIAMAFGLAIGTLVQTLG
HISGAHINPAVTVACLVGCHVSFLRATFYLAAQLLGAVAGAALLHELTPPDIRGDLAVNA
LSNNTTVGQAVTVELFLTLQLVLCIFASTDDRRGDNLGTPALSIGFSVALGHLLGIHYTG
CSMNPARSLAPAIITGKFDDHWVFWIGPLVGGILGSLLYNYVLFPHSKSLSERLAVLKGL
EPDTDWEEREVRRRQSVELHSPQSLQRAARP
human
atgaaaaaagaagtgtgcagcgtggcgtttctgaaagcggtgtttgcggaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgaaatggccgagcgcgctgccg
accattctgcagattgcgctggcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcgtttttttatgtggcggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtatggcgtggcgccgctgaacgcgcgcggcaacctggcggtgaac
gcgctgaacaacaacaccacccagggccaggcgatggtggtggaactgattctgaccttt
cagctggcgctgtgcatttttgcgagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaccgctttagc
ccggcgcattgggtgttttgggtgggcccgattgtgggcgcggtgctggcggcgattctg
tatttttatctgctgtttccgaacagcctgagcctgagcgaacgcgtggcgattattaaa
ggcacctatgaaccggatgaagattgggaagaacagcgcgaagaacgcaaaaaaaccatg
gaactgaccacccgc
chimp
atgaaaaaagaagtgtgcagcgtggcgtttctgaaagcggtgtttgcggaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgaaatggccgagcgcgctgccg
accattctgcagattgcgctggcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcgtttttttatgtggcggcgcagctggtgggcgcgattgcg
ggcgcgagcattctgtatggcgtggcgccgctgaacgcgcgcggcaacctggcggtgaac
gcgctgaacaacaacaccacccagggccaggcgatggtggtggaactgattctgaccttt
cagctggcgctgtgcatttttgcgagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaccgctttagc
ccggcgcattgggtgttt
macaque
atgcgcggcgtgcgcgataccggcccggtgtataccgcggcgtggagccgcggcccggat
ggcagcgcgctgagcgcgggcggcgcgggcccgcgcggctgcgcgggccgccgcggccgc
gcgcgcgcggcgggcaccccgccgaacccgtgcgcggcggcggcgctgagcgtgtggagc
cgcccgccgccgccgcgccgccgcccggcgccggcgccggcgccgattgaaagccgcccg
agccgcgcgcgccgcccgggcccggcgcgcagcggctgctgggatcgcgcgcgccgccat
ccggcgcgcccgcgcccggcggcgagcaccagcagcgcggcgtgcgatccgaccggcgcg
ccgcgccgcggccgccgccgccgcgcgccggcggcgaccatgaaaaaagaagtgtgcagc
gtggcgtttctgaaagcggtgtttgcggaatttctggcgaccctgatttttgtgtttttt
ggcctgggcagcgcgctgaaatggccgagcgcgctgccgaccattctgcagattgcgctg
gcgtttggcctggcgattggcaccctggcgcaggcgctgggcccggtgagcggcggccat
attaacccggcgattaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcg
tttttttatgtggcggcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggc
gtggcgccgctgaacgcgcgcggcaacctggcggtgaacgcgctgaacaacaacaccacc
cagggccaggcgatggtggtggaactgattctgacctttcagctggcgctgtgcattttt
gcgagcaccgatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctg
agcgtgaccctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcg
cgcagctttggcccggcggtggtgatgaaccgctttagcccggcgcattgggtgttttgg
gtgggcccgattgtgggcgcggtgctggcggcgattctgtatttttatctgctgtttccg
aacagcctgagcctgagcgaacgcgtggatattattaaaggcacctatgaaccggatgaa
gattgggaagaacagcgcgaagaacgcaaaaaaaccatggaactgaccgcgcgc
Pongoabelii
atgaaaaaagaagtgtgcagcgtggcgtttctgaaagcggtgtttgcggaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgaaatggccgagcgcgctgccg
accattctgcagattgcgctggcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcgtttttttatgtggcggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtatggcgtggcgccgctgaacgcgcgcggcaacctggcggtgaac
gcgctgaacaacaacaccacccagggccaggcgatggtggtggaactgattctgaccttt
cagctggcgctgtgcatttttgcgagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaccgctttagc
ccggcgcattgggtgttttgggtgggcccgattgtgggcgcggtgctggcggcgattctg
tatttttatctgctgtttccgaacagcctgagcctgagcgaacgcgtggcgattattaaa
ggcacctatgaaccggatgaagattgggaagaacagcgcgaagaacgcaaaaaaaccatg
gaactgaccacccat
Papioanubis
gcgattaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcgtttttttat
gtggcggcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggcgtggcgccg
ctgaacgcgcgcggcaacctggcgattaacgcgctgaacaacaacaccacccagggccag
gcgatggtggtggaactgattctgacctttcagctggcgctgtgcatttttgcgagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccggcggtggtgatgaaccgctttagcccggcgcattgggtgttttgggtgggcccg
attgtgggcgcggtgctggcggcgattctgtatttttatctgctgtttccgaacagcctg
agcctgagcgaacgcgtggatattattaaaggcacctatgaaccggatgaagattgggaa
gaacagcgcgaagaacgcaaaaaaaccatggaactgaccgcgcgc
Mandrillusleucophaeus
atggaaggcccgcagacccaggcgtgggaaaccgaaagcgcggcgcagtttagccgcccg
cgcctgaccccgccgagccgccaggtggataaaggcaacccggcgtgggaacgcgcgccg
ccgggcgtgcattgcctggtgcaggtgtgcagcgtggcgtttctgaaagcggtgtttgcg
gaatttctggcgaccctgatttttgtgttttttggcctgggcagcgcgctgaaatggccg
agcgcgctgccgaccattctgcagattgcgctggcgtttggcctggcgattggcaccctg
gcgcaggcgctgggcccggtgagcggcggccatattaacccggcgattaccctggcgctg
ctggtgggcaaccagattagcctgctgcgcgcgtttttttatgtggcggcgcagctggtg
ggcgcgattgcgggcgcgggcattctgtatggcgtggcgccgctgaacgcgcgcggcaac
ctggcggtgaacgcgctgaacaacaacaccacccagggccaggcgatggtggtggaactg
attctgacctttcagctggcgctgtgcatttttgcgagcaccgatagccgccgcaccagc
ccggtgggcagcccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggc
atttattttaccggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatg
aaccgctttagcccggcgcattgggtgttttgggtgggcccgattgtgggcgcggtgctg
gcggcgattctgtatttttatctgctgtttccgaacagcctgagcctgagcgaacgcgtg
gatattattaaaggcacctatgaaccggatgaagattgggaagaacagcgcgaagaacgc
aaaaaaaccatggaactgaccgcgcgc
Galeopterusvariegatus
atgaaaaaagaagtgtgcagcgtggcgtttctgaaagcggtgtttgcggaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgaaatggccgagcgcgctgccg
accattctgcagattagcctggcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcggtgttttatgtggtggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtatggcctggcgccgctgaacgcgcgcggcaacctggcggtgaac
gcgctgaacaacaacaccacccagggccaggcgatggtggtggaactgattctgaccttt
cagctggcgctgtgcatttttagcagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaacgctttagc
ccggcgcattgggtgttttgggtgggcccgattgtgggcgcggtgctggcggcgattctg
tatttttatctgctgtttccgaacagcctgagcctgagcgaacgcgtggcggtgtttaaa
ggcacctatgaaccggaagaagattgggaagaacagcgcgaagaacgcaaaaaaaccatg
gaactgaccgcgcgc
Mustelaputoriusfuro
gcgattaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcggtgttttat
gtggcggcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggcctggcgccg
ctgaacgcgcgcggcaacctggcgattaacgcgctgaacaacaacaccacccagggccag
gcgatggtggtggaactgattctgacctttcagctggcgctgtgcatttttagcagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccggcggtggtgatgaaccgctttagcagcgcgcattgggtgttttgggtgggcccg
attgtgggcgcgattctggcggcgattctgtatttttatctgctgtttccgaacagcctg
agcgtgagcgaacgcgtggcggtgattaaaggcacctatgaaccggaagaagattgggaa
gaacagcgcgaagaacgcaaaaaaaccatggaactgaccgcgcgc
Carlito syrichta
atgaaaaaagaagtgtgcagcgtggcgtttgtgaaagcggtgtttgcggaatttctggcg
accctggtgtttgtgttttttggcctgggcagcgcgctgcgctggccgagcgcgctgccg
accattctgcagattgcgctggcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcgctgttttatgtggtggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtatggcctggcgccgctgaacgcgcgcggcaacctggcggtgaac
gcgctgaacaacaacaccaccccgggccaggcgatggcggtggaactgattctgaccttt
cagctggcgctgtgcgtgtttgcgagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaccgctttagc
ccggcgcattgggtgttttgggtgggcccgattgtgggcgcggtgctggcggcgattctg
tatttttatctgctgtttccgcatagcctgagcctgagcgaacgcgtggcgattattaaa
ggcacctatgaaccggatgaagattgggaagaacagcgcgaagaacgcaaaaaaaccatg
gaactgaccgcgcgc
Ailuropodamelanoleuca
atgaaaaaagaagtgtgcagcgtggcgtttctgaaagcggtgtttgcggaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgaaatggccgagcgcgctgccg
agcattctgcagattagcctggcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcggcgttttatgtggtggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtatggcctggcgccgctgaacgcgcgcggcaacctggcgattaac
gcgctgaacaacaacaccacccagggccaggcgatggtggtggaactgattctgaccttt
cagctggcgctgtgcatttttagcagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaccgctttagc
agcgcgcattgggtgttttgggtgggcccgattgtgggcgcgattctggcggcggtgctg
tatttttatctgctgtttccgaacagcctgagcctgagcgaacgcgtggcggtgattaaa
ggcacctatgaaccggaagaagattgggaagaacagcgcgaagaacgcaaaaaaaccatg
gaactgaccgcgcgc
Saimiriboliviensisboliviensis
gcggtgaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcgctgttttat
gtggtggcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggcctggcgccg
ctgaacgcgcgcggcaacctggcggtgaacgcgctgaacaacaacaccaccccgggccag
gcgaccgcggtggaactgattctgacctttcagctggcgctgtgcatttttgcgagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccggcggtggtgatgaaccgctttagcccggtgcattgggtgttttgggtgggcccg
attgtgggcgcggtgctggcggcgattctgtatttttatctgctgtttccgaacagcctg
agcctgagcgaacgcgtggcgatttttaaaggcacctatgaaccggatgaagattgggaa
gaacagcgcgaagaacgcaaaaaaaccatggaactgaccgcgcgc
Cebusimitator
atgaaaaaagaagtgtgcagcgtggcgtttctgaaagcggtgtttgcggaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgaaatggccgagcgcgctgccg
accattctgcagattagcctggcgtttggcctggcgattggcaccctggtgcaggcgctg
ggcccggtgagcggcggccatattaacccggcggtgaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcgctgttttatgtggtggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtatggcctggcgccgctgaacgcgcgcggcaacctggcggtgaac
gcggtgaacaaaaacaccaccccgggccaggcgatggcggtggaactgattctgaccttt
cagctggcgctgtgcatttttgcgagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaccgctttagc
cgcgcgcattgggtgttttgggtgggcccgattgtgggcgcggtgctggcggcgattctg
tatttttatctgctgtttccgaacagcctgagcctgggcgaacgcgtggcgatttttaaa
ggcacctatgaaccggatgaagattgggaagaacagcgcgaagaacgcaaaaaaaccatg
gaactgaccgcgcgc
Aotusnancymaae
gcggtgaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcgctgttttat
gtggtggcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggcctggcgccg
ctgaacgcgcgcggcaacctggcggtgaacggcattaacagcaacaccaccccgggccag
gcgatggcggtggaactgattctgacctttcagctggcgctgtgcatttttgcgagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccggcggtggtgatgaaccgctttagcagcgcgcattgggtgttttgggtgggcccg
attgtgggcgcggtgctggcggcgattctgtatttttatctgctgtttccgaacagcctg
agcctgggcgaacgcgtggcgatttttaaaggcacctatgaaccggatgaagattgggaa
gaacagcgcgaagaacgcaaaaaaaccatggaactgaccgcgcgc
Rattusnorvegicus
atgaaaaaagaagtgtgcagcctggcgttttttaaagcggtgtttgcggaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgaaatggccgagcgcgctgccg
accattctgcagattagcattgcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctgattggcaac
cagattagcctgctgcgcgcggtgttttatgtggcggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtattggctggcgccgctgaacgcgcgcggcaacctggcggtgaac
gcgctgaacaacaacaccaccccgggcaaagcgatggtggtggaactgattctgaccttt
cagctggcgctgtgcatttttagcagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaccgctttagc
ccgagccattgggtgttttgggtgggcccgattgtgggcgcgatgctggcggcgattctg
tatttttatctgctgtttccgagcagcctgagcctgcatgatcgcgtggcggtggtgaaa
ggcacctatgaaccggaagaagattgggaagatcatcgcgaagaacgcaaaaaaaccatt
gaactgaccgcgcat
Musmusculus
atgaaaaaagaagtgtgcagcgtggcgttttttaaagcggtgtttgcggaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgaaatggccgagcgcgctgccg
accattctgcagattagcattgcgtttggcctggcgattggcattctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctgattggcaac
cagattagcctgctgcgcgcgattttttatgtggcggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtattggctggcgccgggcaacgcgcgcggcaacctggcggtgaac
gcgctgagcaacaacaccaccccgggcaaagcggtggtggtggaactgattctgaccttt
cagctggcgctgtgcatttttagcagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaccgctttagc
ccgagccattgggtgttttgggtgggcccgattgtgggcgcggtgctggcggcgattctg
tatttttatctgctgtttccgagcagcctgagcctgcatgatcgcgtggcggtggtgaaa
ggcacctatgaaccggaagaagattgggaagatcatcgcgaagaacgcaaaaaaaccatt
gaactgaccgcgcat
Oryctolaguscuniculus
gcgattaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcggtgttttat
gtggcggcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggcctggcgccg
ctgaacgcgcgcggcaacctggcggtgaacgcgctgaacaacaacaccaccccgggccag
gcggtggtggtggaactgattctgacctttcagctggcgctgtgcatttttagcagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccggcggtggtgatgaaacgctttagcccgagccattgggtgttttgggtgggcccg
attgtgggcgcgattctggcggcgattctgtatttttatctgctgtttccgaccagcctg
agcctgagcgaacgcgtggcggtggtgaaaggcagctatgaaccggaagaagattgggaa
gaacatcgcgaaaaaaccctggaactgaccagccgc
Myotislucifugus
atgaaaaaagaagtgtgcagcgtggcgtttgtgaaagcggtgtttaccgaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgcagtggccgagcgcgctgccg
agcattctgcagattagcctggcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcggtgttttatgtggtggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtatggcctggcgccgctgaacgcgcgcggcagcctggcggtgaac
gcgctgaacaacaacaccaccccgggccaggcgatggtggtggaactgattctgaccttt
cagctggcgctgtgcatttttagcagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaacgctttagc
agcgcgcattgggtgttttgggtgggcccgattgtgggcgcggcgctggcggcgattctg
tatttttatctgctgtttccgaacagcctgagcctgagcgaacgcgtggcggtggtgaaa
ggcacctatgaaccggaagaagattgggaagaacagcgcgaagaacgcaaaaaaaccatg
gaactgaccgcgcatattgtgggcgcgattctggcggcgattctgtatttttatctgctg
tttccgaccagcctgagcctgagcgaacgcgtggcggtggtgaaaggcagctat
Susscrofa
gcgattaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcggtgttttat
gtggtggcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggcctggcgccg
ggcaacgcgcgcggcaacctggcggtgaacagcctgaacaacaacaccaccccgggccag
gcggtggtggtggaaatgattctgacctttcagctggcgctgtgcatttttagcagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccggcggtggtgatgaaccgctttagcccgagccattgggtgttttgggtgggcccg
attgtgggcgcggcggtggcggcgattctgtatttttatctgctgtttccgaacagcctg
agcctgagcgaacgcgtggcggtggtgaaaggcacctatgaaagcgaagaagattgggaa
gaacagcgcgaagaacgcaaaaaaaccatggaactgaccgcgcat
Heterocephalusglaber
atgaaaaaagaaatgtgcagcgtggcgtttctgaaagcggtgtttgcggaatttctggcg
accctgatttttgtgttttttggcctgggcagcgcgctgaaatggccgagcgcgctgccg
agcattctgcagattagcatggcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcggtgaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcggtgttttatgtggcggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtatggcgtggcgccgaccaacgcgcgcggcaacctggcggtgaac
gcgctgaacaacaacaccaccccgggccaggcggtggtggtggaactgattctgaccttt
cagctggcgctgtgcatttttagcagcaccgatagccgccgcaccagcccggtgggcagc
ccggcgctgagcattggctttagcgtggcgctgggccatctggtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaacgctttagc
agcagccattgggtgttttgggtgggcccgattgtgggcgcgatgctggcggcgattctg
tatttttatctgctgtttccgcatagcctgagcctgagcgaacgcatggcgattattaaa
ggcacctatgaaccggaagatgattgggaagatcagcgcgaagaacgcaaaaaaaccatt
gaactgaccgcgcat
Caviaporcellus
gcgattaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcggtgttttat
gtgattgcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggcgtggcgccg
accaacgcgcgcggcaacctggcggtgaacgcgctgaacagcaacattaccaccggccag
gcggtggtggtggaactgattctgacctttcagctggcgctgtgcatttttagcagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccggcggtggtgatgaaacgctttagcagcacccattgggtgttttgggtgggcccg
attgtgggcgcggtgctggcggcgattctgtatttttatgtgctgtttccgcatagcctg
agcattagcgatcgcgtggcgattgtgaaaggcacctatgaaccggaagaagattgggaa
gaacagcatgaagaacgcaaaaaaaccattgaactgaccgcgcgc
Manisjavanica
atgaaaaaagaagtgtgcagcgtggcgtttctgaaagcggtgtttgcggaatttctggcg
accctgatttttgtgtttctgggcctgggcagcgcgctgaaatggccgagcgcgctgccg
agcgtgctgcagattagcctggcgtttggcctggcgattggcaccctggcgcaggcgctg
ggcccggtgagcggcggccatattaacccggcgattaccctggcgctgctggtgggcaac
cagattagcctgctgcgcgcggtgttttatgtggtggcgcagctggtgggcgcgattgcg
ggcgcgggcattctgtatggcctggcgccggtgaacgtgcgcggcaacctggcggtgaac
agcctgaacaacaacaccaccccgggccaggcgatggcggtggaactgattctgaccttt
cagctggcgctgtgcatttttagcagcaccgatagccgccgcaccagcccgatgggcagc
ccggcgctgagcattggcctgagcgtgaccctgggccattttgtgggcatttattttacc
ggctgcagcatgaacccggcgcgcagctttggcccggcggtggtgatgaaatggtttagc
ccggcgcattgggtgttttgggtgggcccgattgtgggcgcggcgctggcggcgattctg
tatttttatctgctgtttccgaacagcctgagcctgagcgaacgcgtggcggtgattaaa
ggcacctatgaaccggaagaagattgggaagaacagcgcgaagaacgcaaaaaaaccatg
gaactgaccgcgcat
Chinchillalanigera
ggcgcgcgcccggtgtatagcgcgcagtgccagctggcgggccgcccggcgcgcgcggaa
ccgggcgcgcgcccggcgccgcagccggcgtgcgcgagcgcgccgaccgcggcggcgcgc
cgccgccgcgcgccggaagcgaccatgaaaaaagaagtgtgcagcgtggcgtttctgaaa
gcggtgtttgcggaatttctggcgaccctgatttttgtgttttttggcctgggcagcgcg
ctgaaatggccgagcgcgctgccgaccattctgcagattagcctggcgtttggcctggcg
attggcaccctggtgcaggcgctgggcccggtgagcggcggccatattaacccggcgatt
accctggcgctgctggtgggcaaccagattagcctgctgcgcgcggtgttttatgtgatt
gcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggcgtggcgccgaccaac
gcgcgcggcaacctggcggtgaacgcgctgaacaacaacaccaccgcgggccaggcggtg
gtggtggaactgattctgacctttcagctggcgctgtgcatttttgcgagcaccgatacc
cgccgcagcagcccggtgggcgcgccggcgctgagcattggcctgagcgtgaccctgggc
catctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagctttggcccg
gcggtggtgatgaaacgctttagcagcagccattgggtgttttgggtgggcccgattgtg
ggcgcggtgctggcgagcattctgtatttttatctgctgtttccgcatagcctgagcctg
agcgaacgcgtggcgattgtgaaaggcacctatgaaccggaagatgattgggaagaacag
cgcgaagaacgcaaaaaaaccattgaactgaccgcgcat
Bostaurus
gcgattaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcggtgttttat
gtggtggcgcagctggtgggcgcgattgcgggcgcggcgattctgtatggcctggcgccg
tataacgcgcgcggcaacctggcggtgaacgcgctgaacaacaacaccaccgcgggccag
gcggtggtggcggaaatgattctgacctttcagctggcgctgtgcgtgtttagcagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccgagcgtgattatgaaccgctttagcagcgcgcattgggtgttttgggtgggcccg
attgtgggcgcggcggtggcggcgattatttatttttatctgctgtttccgcatagcctg
agcctgagcgatcgcgcggcgattctgaaaggcacctatgaaccggatgaagattgggaa
gaaagccaggaagaacgcaaaaaaaccatggaactgaccgcgcat
Feliscatus
gcgattaccctggcgctgctggtgggcaaccagattagcctgctgcgcgcggtgttttat
gtggtggcgcagctggtgggcgcgattgcgggcgcgggcattctgtatggcctggcgccg
attaacgcgcgcggcaacctggcgattaacgcgctgaacaacaacaccacccagggccag
gcgatggtggtggaactgattctgacctttcagctggcgctgtgcgtgtttagcagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctggtgggcatttattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccggcggtggtgatgaaacgctttagcccggcgcattgggtgttttgggtgggcccg
attgtgggcgcgattctggcggcgattctgtatttttatctgctgtttccgaacagcctg
agcctgagcgaacgcgtggcggtggtgaaaggcacctatgaaccggaagaagattgggaa
gaacagcgcgaagaacgcaaaaaaaccatggaactgaccgcgcgc
Equuscaballus
gcgattaccctggcgctgtttgtgggcaaccagattagcctgctgcgcgcgctgttttat
gtggtggcgcagctggtgggcgcgattgcgggcgcggcgattctgtatggcctggcgccg
cgcaacgcgcgcggcaacctggcgattaacagcctgaacagcaacaccaccccgggccag
gcgatggtggtggaactgattctgacctttcagctggcgctgtgcatttttagcagcacc
gatagccgccgcaccagcccggtgggcagcccggcgctgagcattggcctgagcgtgacc
ctgggccatctgctgggcattcattttaccggctgcagcatgaacccggcgcgcagcttt
ggcccggcggtgattatgaaacgctttagcagcgcgcattgggtgttttgggtgggcccg
attgtgggcgcggcgctggcggcgattctgtatttttatctgctgtttccgaacagcctg
agcctgagcgaacgcgtggcgattgtgaaaggcacctatgaaccggaagaagattgggaa
gaacagcgcgaagaacgcaaaaaaaccatggaactgaccgcgcat
|
Dear @fatima-akhtar113,
I am not sure what you mean by "reverse-translate". There is no 1-1 way to reverse translate a protein sequence because of redundant codons. For example, you can use any of the 6 available codons for Serine. You need to find the underlying CDS sequences for each corresponding species.
If you take your protein sequence for homo and use blastp on it, you get the following result:
image.png (view on web) <https://github.com/veg/hyphy/assets/1018513/2b256e72-d053-4f4c-b195-cf2e26e8613b>
However, if you take the nucleotide sequence you provided and run blastn on it, you get complete nonsense:
image.png (view on web) <https://github.com/veg/hyphy/assets/1018513/f91cab25-f7ef-4371-822c-613fd91b3ff7>
This nucleotide sequence does not exist in nature. You should instead pull out the corresponding CDS for each sequence; for example, https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi?REQUEST=CCDS&DATA=CCDS8793.1 will give you the human sequences.
Best, |
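To make the point about redundant codons concrete, here is a small Python sketch that counts how many different DNA sequences encode a given peptide under the standard genetic code. The function name `count_back_translations` is hypothetical and chosen only for this illustration.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of synonymous codons per amino acid (for example, 6 for serine).
CODONS_PER_AA = Counter(CODON_TABLE.values())


def count_back_translations(protein: str) -> int:
    """Count the distinct DNA sequences that translate to `protein`."""
    protein = protein.upper()
    unknown = set(protein) - set(CODONS_PER_AA)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    return prod(CODONS_PER_AA[aa] for aa in protein)


print(count_back_translations("S"))      # serine alone
print(count_back_translations("MKKEV"))  # first five residues of the human protein
```

The count multiplies with every residue, so for a protein a few hundred residues long the number of possible encodings is astronomically large, and a reverse-translated sequence is only one arbitrary choice among them.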
I will try to do it the way you told me. I also used Perl to translate the
protein sequence of my gene's transcript into a DNA sequence. I BLASTed the
human protein sequence, picked orthologues from my species of
interest, and then converted them into DNA. I will try to understand your way.
Thank you.
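One rough way to recognise a sequence produced by this kind of protein-to-DNA conversion is to check whether every amino acid is always written with the same codon, since real coding sequences usually mix synonymous codons. The Python sketch below is a heuristic only: the function names are hypothetical, and a short genuine CDS could pass the check by chance.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_used_per_amino_acid(cds: str) -> dict:
    """Map each amino acid to the set of codons that encode it in `cds`."""
    cds = cds.upper().replace("U", "T")
    used = defaultdict(set)
    # Walk complete codons only; a trailing partial codon is ignored.
    for start in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[start:start + 3]
        amino_acid = CODON_TABLE.get(codon)
        if amino_acid is not None:  # skip gaps and ambiguous bases
            used[amino_acid].add(codon)
    return dict(used)


def looks_back_translated(cds: str) -> bool:
    """True if no amino acid is ever encoded by more than one codon."""
    return all(len(c) == 1 for c in codons_used_per_amino_acid(cds).values())


# First line of the "human" nucleotide sequence posted in this thread.
posted = "atgaaaaaagaagtgtgcagcgtggcgtttctgaaagcggtgtttgcggaatttctggcg"
print(looks_back_translated(posted))
print(codons_used_per_amino_acid(posted))
```

In the posted fragment every valine is `GTG`, every alanine is `GCG`, and every leucine is `CTG`, which is the pattern a fixed lookup table produces.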
…On Tue, Nov 28, 2023, 7:50 PM Sergei Pond ***@***.***> wrote:
Dear @fatima-akhtar113 <https://github.com/fatima-akhtar113>,
I am not sure what you mean by "reverse-translate". There is no 1-1 way to
reverse translate a protein sequence because of redundant codons. For
example, you can use any of the 6 available codons for Serine. You need to
find the underlying CDS sequences for each corresponding species.
If you take your protein sequence for homo and use blastp on it, the
following result
image.png (view on web)
<https://github.com/veg/hyphy/assets/1018513/2b256e72-d053-4f4c-b195-cf2e26e8613b>
However, if you take the nucleotide sequence you provided and run blastn
on it, you get complete nonsense.
image.png (view on web)
<https://github.com/veg/hyphy/assets/1018513/f91cab25-f7ef-4371-822c-613fd91b3ff7>
This nucleotide sequence does not exist in nature.
You should instead pull out the corresponding CDS for each sequence, for
example
https://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi?REQUEST=CCDS&DATA=CCDS8793.1
will give you the human sequences.
Best,
Sergei
|
Hey, can I use my protein sequence to BLAST against a nucleotide database and
extract the nucleotide sequences of my transcript's orthologues?
|
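Dear @fatima-akhtar113, How you collect your data is really up to you, and depends on the problem at hand. But based on what you describe, this seems sensible. The database will have underlying CDS sequences for your proteins. Best, |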
Hey, I have a few queries. What does this p-value in BUSTED suggest?
p=8.692e-12
Also, when I run aBSREL the p-value is 0, which does not make sense. I took
the CDS file from Ensembl; I will attach it.
…On Wed, Nov 29, 2023 at 6:39 PM Sergei Pond ***@***.***> wrote:
Dear @fatima-akhtar113 <https://github.com/fatima-akhtar113>,
How you collect your data is really up to you, and depends on the problem
at hand.
But based on what you describe, this seems sensible. The database will
have underlying CDS sequences for your proteins.
Best,
Sergei
|
The methodology I followed was: I collected the CDS file, aligned it,
stripped all stop codons using HyPhy, and then ran it in Datamonkey.
|
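Before aligning, it can help to confirm that each downloaded sequence really is an in-frame CDS. The sketch below is only an illustration of that check: it assumes the standard genetic code's stop codons and an unaligned, gap-free sequence, and `check_cds` is a hypothetical helper, not a HyPhy or Datamonkey function.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def check_cds(seq):
    """Return a list of problems found in an unaligned coding sequence."""
    seq = seq.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon anywhere except the final position suggests a frame problem.
    for index, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"internal stop codon {codon} at codon {index}")
    return problems


print(check_cds("ATGGCGTAA"))  # terminal stop only
print(check_cds("ATGTAAGCG"))  # stop codon in the middle
```

A sequence that reports internal stop codons is more likely out of frame or not a CDS at all, which stripping stop codons would hide rather than fix.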
Dear @fatima-akhtar113, I am afraid I don't fully understand what you are asking. If you are including attachments, you should do it via a web-browser (not e-mail), because otherwise the attachments will be stripped out. Best, |
https://www.datamonkey.org/absrel/65978ad3ba6f2072cc42906e
Are my results accurate? Is a p-value of 0.00 correct?
|
Dear @fatima-akhtar113, Yes, based on your alignment, However, I would encourage you to check the alignment for robustness. Some of the "hotspots" for positive selection signal, e.g. codons around position 1150, seem to correspond to a gappy region which may have been misaligned. Best, |
Intron(s) used as neutral proxy with HKY85 model.
Non-branch specific test.
NULL=Negative selection and Neutral evolution.
ALTERNATE= Class 1: Negative, Class 2: neutral evolution and Class 3:
positive selection.
*** Null model ***
inverse kappa: 0.3760629370048015
*** Alternate model ***
f0: 0.3091412697694165
f1: 0.01606826205589621
f2: 0.6747904681746874
f3: 0.03334068966119959
zeta0: 0.01182198580767017
zeta1: 1
zeta2: 0.0253951718952625
Lk null = -11802.92742646848 Lk alt = -10934.97392787508
LRT = 1735.906997186801
LRT p-value = 0.0000000000
calculate NEB
calculate BEB
discretization= 10
Please explain this output.
…On Mon, Nov 27, 2023 at 6:03 PM Sergei Pond ***@***.***> wrote:
Dear @fatima-akhtar113 <https://github.com/fatima-akhtar113>,
1. I am afraid I can't help you unless you provide more information
about the MEME analysis. If you ran it in Datamonkey, please include the
URL for the results page.
2. No, you cannot conclude that a gene is under selection if one or
two sites are under selection. See
https://academic.oup.com/mbe/article/32/5/1365/1134918. Use BUSTED to
look for gene-level selection.
image.png (view on web)
<https://github.com/veg/hyphy/assets/1018513/1cf455e8-d1a6-40ec-9c3b-be1628aa9329>
.
Best,
Sergei
|
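The "LRT" line in the output quoted above can be reproduced from the two log-likelihoods, and the same arithmetic suggests why the p-value prints as 0. This is a rough sketch only: the degrees of freedom for this test are not given in the output, so 2 is assumed purely for illustration (with 2 degrees of freedom the chi-square tail probability is exp(-LRT/2)), and tests on mixture models may not follow a plain chi-square distribution at all.

```python
import math

# Log-likelihoods copied from the output above.
lnl_null = -11802.92742646848
lnl_alt = -10934.97392787508


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio test statistic: twice the log-likelihood gain."""
    return 2.0 * (lnl_alt - lnl_null)


def log10_p_chi2_df2(stat):
    """log10 of the chi-square tail probability, assuming 2 degrees of freedom."""
    return -stat / (2.0 * math.log(10.0))


stat = lrt_statistic(lnl_null, lnl_alt)
p_direct = math.exp(-stat / 2.0)   # underflows to 0.0 in double precision
log10_p = log10_p_chi2_df2(stat)   # stays finite on the log scale

print(f"LRT = {stat:.6f}")
print(f"p (direct) = {p_direct}")
print(f"log10 p (df = 2 assumed) = {log10_p:.2f}")
```

Under that assumption the p-value is not exactly zero; it is too small to represent as a floating-point number or in ten decimal places, so it is displayed as 0.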
I cannot run my orthologue file in MEME. There are 11 sequences, and it runs fine in FEL and SLAC.
Also, can we say a gene is under positive selection if there is selection on one or two codons? How do we interpret Datamonkey results?