alphafold feature generation #18

Closed
luwei0917 opened this issue Jan 15, 2020 · 4 comments

@luwei0917

Hello,
I'm interested in the details of the feature generation described in the Methods section of the paper. Is there a plan to open source the script for generating the features from MSA data? Or does it already exist and I just didn't see it?

Thanks!

@Augustin-Zidek
Collaborator

Sorry, there is currently no plan to open source the feature generation code. (It's tightly coupled to our internal infrastructure as well as external tools which we cannot open source.)

However, I updated the README with descriptions of all the features. I also added code snippets that show how to calculate most of the features that are needed. I hope this helps.

@kad-ecoli

kad-ecoli commented Feb 5, 2020

Why do the pseudolikelihood estimation parameters have 484 (=22*22) entries instead of 441 (=21*21)? Almost all Potts model implementations have only 441 parameters for each residue pair. Is it possible to release AlphaFold's pseudolikelihood estimation pipeline?

@huhlim

huhlim commented Feb 5, 2020

I guess they used 20 amino acid types + X + gap.
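
The arithmetic behind this guess can be checked with a minimal sketch. The alphabet strings and their symbol order below are assumptions made for illustration; the thread does not confirm which symbols or ordering AlphaFold actually uses.

```python
# Hypothetical alphabets; the symbol order is an assumption, not taken from AlphaFold.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"      # 20 standard amino acid types

alphabet_21 = AMINO_ACIDS + "-"           # 20 amino acids + gap
alphabet_22 = AMINO_ACIDS + "X" + "-"     # 20 amino acids + unknown (X) + gap

# A pairwise coupling block has one parameter per ordered pair of symbols.
params_21 = len(alphabet_21) ** 2
params_22 = len(alphabet_22) ** 2

print(params_21, params_22)
```

Under these assumptions, the 21-symbol alphabet gives the 441 parameters per residue pair seen in most Potts model implementations, and the 22-symbol alphabet gives 484.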

@huhlim

huhlim commented Feb 5, 2020

Thank you for sharing some code snippets for the input feature generation.
The "deletion_probability" feature looks odd to me. According to the A3M format (https://github.com/soedinglab/hh-suite/wiki#the-same-alignment-in-a3m),
a deletion cannot be mapped to a position in the query sequence. For example,
an MSA in FASTA format,

>Query
ACDEFG--HIK
>Templ_1
ACD--GYWHI-
>Templ_2
ACDEFGFWH-K

can be converted into A3M format (using HH-suite's reformat.pl) as

>Query
ACDEFGHIK
>Templ_1
ACD--GywHI-
>Templ_2
ACDEFGfwH-K
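
For readers who want to reproduce this conversion, here is a minimal Python sketch. It is not reformat.pl; it assumes the first sequence defines the match columns, and the function name `fasta_to_a3m` is hypothetical. In columns where the query has a gap, residues in other sequences become lowercase and gaps are dropped.

```python
def fasta_to_a3m(query, others):
    """Convert aligned FASTA rows to A3M rows, using the query to define match columns.

    Simplified illustration (assumption: all rows have equal length and the
    first sequence is the query). This is not the HH-suite implementation.
    """
    # Columns where the query has a gap are treated as insert columns.
    insert_cols = [ch == "-" for ch in query]

    def convert(seq):
        out = []
        for ch, is_insert in zip(seq, insert_cols):
            if is_insert:
                # Residues in insert columns are lowercased; gaps are dropped.
                if ch != "-":
                    out.append(ch.lower())
            else:
                # Match columns are copied unchanged (residue or gap).
                out.append(ch)
        return "".join(out)

    return convert(query), [convert(seq) for seq in others]


a3m_query, a3m_others = fasta_to_a3m(
    "ACDEFG--HIK", ["ACD--GYWHI-", "ACDEFGFWH-K"]
)
print(a3m_query)
for row in a3m_others:
    print(row)
```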

Since the deletions in Templ_1 (yw) and Templ_2 (fw) are aligned(?) to gaps in the Query sequence, they would not be counted.
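
To make the question concrete, here is a minimal Python sketch that counts lowercase letters in an A3M row. As one possible convention (an assumption for illustration, not something confirmed in this thread), each run of lowercase letters is attributed to the next match column, which gives it a position in the query after all. The function name `lowercase_runs` is hypothetical.

```python
def lowercase_runs(a3m_row):
    """For each match column, count the lowercase letters immediately before it.

    Assumed convention: a run of lowercase letters is attributed to the next
    uppercase letter or gap. A trailing lowercase run with no following match
    column is dropped by this sketch.
    """
    counts = []
    run = 0
    for ch in a3m_row:
        if ch.islower():
            run += 1            # lowercase letter: extend the current run
        else:
            counts.append(run)  # match column (residue or '-'): record the run
            run = 0
    return counts


rows = {
    "Query": "ACDEFGHIK",
    "Templ_1": "ACD--GywHI-",
    "Templ_2": "ACDEFGfwH-K",
}
per_row = {name: lowercase_runs(row) for name, row in rows.items()}
for name, counts in per_row.items():
    print(name, counts)
```

Under this convention, the runs `yw` and `fw` are both attributed to the query position of `H`, so they would be counted there rather than ignored.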
