Using DeeplyTough as an embedder #16

Open
vinbl opened this issue Feb 18, 2022 · 1 comment

vinbl commented Feb 18, 2022

Hello Josh,

I am thinking of the possibility of using DeeplyTough as an embedder for protein pockets, so that each pocket is mapped to a vector of descriptors. Could you provide some guidance on how these could be obtained?

Also, is it possible to process a custom pdb as the input containing only the pocket residues, instead of relying on the automated pocket detection?

Thank you very much

Collaborator

JoshuaMeyers commented Feb 21, 2022

Hello @vinbl, thanks for raising an issue! Yep, DeeplyTough could be exactly what you're looking for. Apologies for the delay.

Obtaining the descriptors for each pocket is relatively straightforward. There are many strategies, but I'll suggest the one with the fewest code changes.

  1. A prerequisite is that you can run 'custom_evaluation.py' as described in the README. This involves running DeeplyTough pairwise pocket matching for a set of (pdb, pocket) pairs defined in pairs.csv.
  2. Within custom_evaluation.py, building the entries dictionary involves calculating descriptors for each pocket, so if you modify the code to save this dict somewhere you should be good to go.
  3. The simplest way to set up pairs.csv would be to just duplicate your pocket entries (it will essentially be calculating the distance between pocket 1 and pocket 1, which should be 0). This allows you to loop over just the pockets you care about without needing to modify the current interface.
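As a rough sketch of steps 2 and 3, the snippet below writes a pairs.csv in which every pocket is paired with itself, and pickles whatever object holds the descriptors. The function names are hypothetical and not part of DeeplyTough. The row layout (protein A, pocket A, protein B, pocket B) is an assumption, so check the README for the exact format. The internal structure of the entries dictionary is not described in this thread, so it is saved whole rather than unpacked.

```python
import csv
import pickle


def write_self_pairs(pockets, csv_path):
    """Write one row per pocket, pairing each pocket with itself.

    `pockets` is an iterable of (protein_pdb, pocket_pdb) paths.
    ASSUMPTION: each row is protein A, pocket A, protein B, pocket B.
    """
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for protein_pdb, pocket_pdb in pockets:
            # Same pocket on both sides, so the reported distance should be 0.
            writer.writerow([protein_pdb, pocket_pdb, protein_pdb, pocket_pdb])


def save_entries(entries, out_path):
    """Pickle the object that holds the per-pocket descriptors."""
    with open(out_path, "wb") as handle:
        pickle.dump(entries, handle)


def load_entries(in_path):
    """Load a previously pickled descriptor object."""
    with open(in_path, "rb") as handle:
        return pickle.load(handle)
```

A call to something like `save_entries(entries, "entries.pkl")` would go in custom_evaluation.py right after the entries dictionary has been computed.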
  • The descriptor for each pocket has a dimensionality of 128.
  • "is it possible to process a custom pdb as the input containing only the pocket residues, instead of relying on the automated pocket detection?" No and yes.
    -- Automated pocket detection is definitely not required; you can specify your own pockets.
    -- But you can't provide a specific set of residues as the protein file. DeeplyTough takes a full protein pdb and a pocket pdb to define a 24 angstrom cube around the centroid of the pocket residues, which it crops from the original protein coordinates. If you provide it with a partial protein to begin with, this deviates from the training conditions and it might not perform as expected.
    -- What you should do is save a custom pdb containing only the pocket residues; this defines your pocket file. Then specify (originalPDB, pocketFile) in pairs.csv as described above.
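To make the pocket-file idea concrete, here is a minimal standard-library sketch that pulls the chosen residues out of a full PDB and illustrates the centroid and cube crop described above. It is not DeeplyTough's own code. All function names are hypothetical, and treating 24 angstrom as the cube's edge length is an assumption. Atoms are read from the fixed PDB columns for chain ID, residue number and x, y, z.

```python
def atom_records(pdb_lines):
    """Yield (line, chain, resseq, (x, y, z)) for ATOM/HETATM records."""
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            chain = line[21]
            resseq = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield line, chain, resseq, xyz


def extract_pocket(pdb_lines, residues):
    """Keep only atoms whose (chain, residue number) is in `residues`."""
    return [line for line, chain, resseq, _ in atom_records(pdb_lines)
            if (chain, resseq) in residues]


def centroid(pdb_lines):
    """Mean x, y, z over all atom records in `pdb_lines`."""
    coords = [xyz for _, _, _, xyz in atom_records(pdb_lines)]
    return tuple(sum(c[i] for c in coords) / len(coords) for i in range(3))


def crop_cube(pdb_lines, center, edge=24.0):
    """Keep atoms inside an axis-aligned cube centred on `center`.

    ASSUMPTION: `edge` is the full side length of the cube in angstroms.
    """
    half = edge / 2.0
    return [line for line, _, _, xyz in atom_records(pdb_lines)
            if all(abs(xyz[i] - center[i]) <= half for i in range(3))]
```

Only the output of `extract_pocket` would be written out as the pocket file. The protein entry in pairs.csv stays the full original structure, so that the crop is taken from complete coordinates.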

P.S. I would suggest starting from our image on Docker Hub (docker pull joshuameyers/deeplytough), or building the Docker image yourself, since this repo has a few stale dependencies now which can be a bit fiddly to install.

Hope this helps
