
Fix sequence at certain positions in Binder protocol #107

Open
aminsagar opened this issue Dec 3, 2022 · 8 comments

@aminsagar

Hello.
Thanks for this amazing work.
I am trying to redesign a peptide binder while keeping the sequence at some positions.
For example, I would like to keep the prolines at positions 2, 9, and 16 as present in the input peptide.
I tried the following.

pep_model = mk_afdesign_model(protocol="binder")
pep_model.prep_inputs(pdb_filename="./data/Complex.pdb", chain="A", binder_chain="B",
                      hotspot="37,38,67,68,69", fix_pos="2,9,16", fix_seq=True)

However, the designed peptides don't retain prolines at these positions.
Am I doing something wrong here?
I would be really grateful for any suggestions.
Thanks.
Amin.

@shanilpanara

I believe fix_pos is currently only supported in the "fixbb" and "partial" protocols (as per the README.md).
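For reference, in the supported protocols it would look something like this (just a sketch, assuming the same prep_inputs signature as above):

from colabdesign import mk_afdesign_model

model = mk_afdesign_model(protocol="fixbb")
# fix_pos keeps the native identities at these positions during design
model.prep_inputs(pdb_filename="./data/Complex.pdb", chain="A", fix_pos="2,9,16")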

@amin-sagar

I see. I can try to implement it if @sokrypton or other developers can give me some pointers.
Maybe I need to change something here??

def _mutate(self, seq, plddt=None, logits=None, mutation_rate=1):

to disallow mutations for some positions.
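Something like this, maybe, as a very rough sketch (self.fixed_pos is a hypothetical attribute, and I'm guessing at the shape of seq):

import numpy as np

def _mutate(self, seq, plddt=None, logits=None, mutation_rate=1):
    """Rough sketch: mutate random positions, skipping any in self.fixed_pos."""
    mut_seq = seq.copy()
    length = seq.shape[-1]
    fixed = set(getattr(self, "fixed_pos", []))  # hypothetical attribute
    allowed = np.array([i for i in range(length) if i not in fixed])
    for _ in range(mutation_rate):
        pos = np.random.choice(allowed)            # never pick a fixed position
        mut_seq[..., pos] = np.random.randint(20)  # random amino-acid index
    return mut_seq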
I would be really grateful for any suggestions.

@sokrypton
Owner

sokrypton commented Jan 11, 2023 via email

@amin-sagar

amin-sagar commented Jan 11, 2023

Thanks @sokrypton. I tried the following script, but the generated peptides don't have prolines at the specified positions.

import numpy as np
from tqdm import tqdm

from colabdesign import mk_afdesign_model
from colabdesign.af.alphafold.common import residue_constants

af_model = mk_afdesign_model(protocol="binder", data_dir="/home/amin/softwares/Protein-Design")
af_model.prep_inputs(pdb_filename="../data/protein-pep.pdb", chain="A", binder_chain="B")
fixpos = [1, 8, 15]  # 0-indexed binder positions for residues 2, 9, and 16

print(af_model._inputs["bias"])

for i in tqdm(range(5)):
    print(i)
    af_model.restart()
    # bias the fixed positions strongly towards proline
    af_model._inputs["bias"][fixpos, residue_constants.restype_order["P"]] = 10000000
    print(af_model._inputs["bias"])
    af_model.design_pssm_semigreedy(120, 32)  # soft_iters, hard_iters
    af_model.save_pdb("Design_bind17_fix_pos_seq2_sm_" + str(i) + ".pdb")

The bias matrix looks like this, which seems correct.

[[       0.        0.        0.        0.        0.        0.        0.
         0.        0.        0.        0.        0.        0.        0.
         0.        0.        0.        0.        0.        0.]
 [       0.        0.        0.        0.        0.        0.        0.
         0.        0.        0.        0.        0.        0.        0.
  10000000.        0.        0.        0.        0.        0.]
 ...]

(17 rows × 20 columns, truncated; rows 1, 8, and 15 each carry the 10000000 bias in column 14, the proline index, and every other entry is 0.)

Could you please see what I am doing wrong?
Thanks,
Amin.

@sokrypton
Owner

sokrypton commented Jan 16, 2023

Thanks for the report. The issue has been fixed in v1.1.1; if you want to patch your existing copy, see:
aa3ced6

@amin-sagar

Thanks @sokrypton.
I updated to v1.1.1, but this doesn't seem to completely solve the issue.
The generated peptides still don't retain the amino acids at the biased positions.
I printed out mut_seq from design_semigreedy and I see that each mutation cycle still changes these amino acids, so maybe the mutate function isn't considering the bias.
As a test, if I mutate the fixed residues back after they pass through the mutate function, it works:

for t in range(int(num_tries)):
    mut_seq = self._mutate(seq, plddt, logits=(seq_logits + self._inputs["bias"]))
    # manually restore the fixed positions to proline after each mutation step
    for fixaa in [1, 8, 15]:
        mut_seq[0, fixaa] = 14  # residue_constants.restype_order["P"]
    print(mut_seq)

I am trying to figure out what's happening, but maybe it's instantly clear to you.
Thanks again.

@sokrypton
Owner

Should be fixed now!
I tracked the bug down to the predict() function: it turns out that when I was making a copy of the inputs dictionary before/after prediction, the copy wasn't actually being made.
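For illustration, this is the classic failure mode (a generic sketch, not the actual colabdesign code):

import copy
import numpy as np

inputs = {"bias": np.zeros((17, 20))}

# A shallow copy gives a new dict, but the arrays inside are shared,
# so in-place edits leak into the supposed backup.
backup = dict(inputs)
inputs["bias"][1, 14] = 1e7
print(backup["bias"][1, 14])   # 10000000.0 -- the "backup" changed too

# A deep copy duplicates the arrays, so the backup is preserved.
backup = copy.deepcopy(inputs)
inputs["bias"][8, 14] = 1e7
print(backup["bias"][8, 14])   # 0.0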

Please try again!

Suggested pipeline:

import numpy as np
from colabdesign.af.alphafold.common import residue_constants

bias = np.zeros((af_model._binder_len, 20))
# example: force the first binder position to be proline
bias[0, residue_constants.restype_order["P"]] = 1e8

af_model.restart()
af_model.set_seq(bias=bias)
af_model.design_pssm_semigreedy()
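Adapted to the prolines in this thread (residues 2, 9, and 16, i.e. 0-indexed binder positions 1, 8, and 15), the same pattern would presumably be:

import numpy as np
from colabdesign.af.alphafold.common import residue_constants

bias = np.zeros((af_model._binder_len, 20))
bias[[1, 8, 15], residue_constants.restype_order["P"]] = 1e8  # pin the three prolines

af_model.restart()
af_model.set_seq(bias=bias)
af_model.design_pssm_semigreedy()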

@amin-sagar

Thanks @sokrypton.
It works perfectly now; the residues are retained at the defined positions.
I think I am experiencing the issue described in #85, so I will post the results on that issue.
Thanks again.
Amin.
