Reasoning behind use of uniref90.a3m MSA files for pairing #3

Open
Maryam-Haghani opened this issue Aug 3, 2023 · 1 comment

@Maryam-Haghani

Hi,

Your paper claims:

"We use of the default AlphaFold-Multimer MSA search setting with JackHMMER to search the UniProt database for MSA pairing."

However, after examining the provided code and example, it appears that the code handles only 'uniref90.a3m' MSA files. That is, pairing relies on MSAs obtained from the UniRef90 database in 'a3m' format, which diverges from AlphaFold's default approach, where the MSA used for pairing comes from a JackHMMER search of UniProt and is stored as 'uniprot.sto'.

Additionally, the code uses TaxID for species grouping. However, a TaxID in the UniRef90 database can refer to taxonomic ranks other than species, which may cause problems in the pairing process.
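To make the concern concrete, here is a minimal sketch of TaxID-based grouping, assuming UniRef-style description lines that carry `Tax=` and `TaxID=` fields. The function names are hypothetical and this is not the repository's code. Because UniRef reports the common taxon of a cluster, the TaxID picked up this way may be a genus or a higher rank rather than a species.

```python
import re
from collections import defaultdict

# Assumed UniRef-style description line, for example:
# ">UniRef90_P12345 Some protein n=3 Tax=Homo sapiens TaxID=9606 RepID=P12345_HUMAN"
_TAXID_RE = re.compile(r"\bTaxID=(\d+)")


def taxid_from_uniref_header(header):
    """Return the TaxID string found in a UniRef-style header, or None."""
    match = _TAXID_RE.search(header)
    return match.group(1) if match else None


def group_by_taxid(headers):
    """Map each TaxID to the indices of the MSA rows that carry it.

    Rows without a TaxID field are skipped. No check is made that the
    TaxID is at species rank, which is exactly the issue raised above.
    """
    groups = defaultdict(list)
    for index, header in enumerate(headers):
        taxid = taxid_from_uniref_header(header)
        if taxid is not None:
            groups[taxid].append(index)
    return dict(groups)
```

Checking the rank of each TaxID would need an external taxonomy lookup, which this sketch leaves out.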

I kindly request clarification regarding these discrepancies and the reasoning behind them.
Thank you for your attention to these concerns, and I eagerly await your response.

Sincerely,
Maryam

@allanchen95
Owner

Hi,
I hope the following comments address your concerns:

  1. Yes, we use the UniProt database for MSA pairing in the paper.
  2. Our code provides the species processing pipeline for both UniProt and UniRef format data. You can switch to UniProt processing by activating the code from https://github.com/allanchen95/ESMPair/blob/main/msa_pair/data/species_processing.py#L57 to line 61.
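For readers comparing the two formats, the sketch below shows one way a species label could be read from a UniProt-style sequence identifier such as `tr|A0A023GPI8|A0A023GPI8_CANLF/1-120`, where the suffix after the last underscore is the UniProt species mnemonic. This is an illustration under that assumed identifier layout, with a hypothetical function name. It does not reproduce what lines 57 to 61 of `species_processing.py` do.

```python
import re

# Assumed UniProt-style identifier: "<db>|<accession>|<entry>_<SPECIES>",
# optionally followed by a "/start-end" range as written by JackHMMER.
_UNIPROT_ID_RE = re.compile(
    r"^(?:sp|tr)\|[A-Z0-9]+\|[A-Z0-9]+_(?P<species>[A-Z0-9]{1,5})(?:/\d+-\d+)?$"
)


def species_from_uniprot_id(header):
    """Return the species mnemonic from a UniProt-style header, or None."""
    tokens = header.lstrip(">").split()
    if not tokens:
        return None
    match = _UNIPROT_ID_RE.match(tokens[0])
    return match.group("species") if match else None
```

A UniRef-style identifier such as `UniRef90_P12345` does not match this pattern, so the function returns `None` for it and the caller would need the TaxID route instead.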
