Feature representations for new Proteins in DiG #184

Open
sai-advaith opened this issue Apr 23, 2024 · 1 comment


sai-advaith commented Apr 23, 2024

Hi,

This is regarding protein generation in DiG.

I would like to know how you obtained the features stored in the protein pickle files. According to Appendix B.1 of the paper, the single and pair representations are the outputs of a pre-trained Evoformer model from AlphaFold, given the corresponding protein's FASTA sequence and MSAs.

I set up OpenFold on our systems and saved the Evoformer representations for the corresponding protein in a pickle file, using the single and pair keys of the output dictionary in this link. To get the MSAs for the FASTA sequence, I queried the ColabFold server.

Unfortunately, the representations I obtained from OpenFold's Evoformer differ substantially from those in the dataset's pickle file.
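For anyone trying to reproduce the mismatch, here is a minimal sketch of how the two feature sets could be compared numerically. It assumes both pickle files hold a dictionary with `single` and `pair` array entries; the key names for the dataset's pickle, the file paths, and the helper names `load_features` and `compare_representations` are assumptions for illustration, not something confirmed by the DiG code.

```python
import pickle

import numpy as np


def load_features(path):
    """Load a feature dictionary from a pickle file."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def compare_representations(ours, reference, keys=("single", "pair")):
    """Report shapes and difference statistics for each representation key.

    The default key names are an assumption; adjust them to match the
    actual layout of the pickle files being compared.
    """
    report = {}
    for key in keys:
        a = np.asarray(ours[key], dtype=np.float64)
        b = np.asarray(reference[key], dtype=np.float64)
        entry = {"shapes": (a.shape, b.shape), "same_shape": a.shape == b.shape}
        if entry["same_shape"]:
            # Element-wise statistics only make sense when the shapes agree.
            entry["max_abs_diff"] = float(np.abs(a - b).max())
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            cosine = np.dot(a.ravel(), b.ravel()) / denom if denom > 0 else np.nan
            entry["cosine"] = float(cosine)
        report[key] = entry
    return report


# Example usage (hypothetical file names):
# ours = load_features("openfold_features.pkl")
# reference = load_features("dig_dataset_features.pkl")
# for key, stats in compare_representations(ours, reference).items():
#     print(key, stats)
```

A shape mismatch would point to a difference in sequence length or channel dimensions, while matching shapes with low cosine similarity would be consistent with different model weights or different input MSAs.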

Could you please let me know the exact method you used to obtain the single and pair representations for each protein's FASTA sequence?

zhengsx (Collaborator) commented May 27, 2024

Please use AlphaFold's representations.
