Hello RFdiffusion devs! I'm trying to do something pretty specific in RFdiffusion, and I am hoping for some guidance. I want to do two-sided partial diffusion on a binder-target complex, where the target contains a gap in the sequence (missing residues). I've constructed a simple minimal example using a modified version of PDB ID 9SPS (attached) where I set the 'binder' as chain A, 'target' as chain B, and I deleted residues 80-100 from chain B.
INPUT_PDB="9sps_binderA_targetBgaps.pdb"
9sps_binderA_targetBgaps.pdb
One-sided partial diffusion works fine and preserves the sequence gap. Of course, it helps here that the contig includes the chain label, so we can just specify chain B twice!
run_inference.py \
"inference.input_pdb=$INPUT_PDB" \
'inference.num_designs=3' \
'diffuser.partial_T=15' \
'contigmap.contigs=["124-124/0 B1-79/0 B101-195/0"]' \
"inference.schedule_directory_path=$OUTPUT_DIR" \
"inference.output_prefix=$OUTPUT_DIR/onesided"
onesided_0.pdb
I tried two different versions of the syntax for two-sided partial diffusion:
# version one - pretend the target is continuous
run_inference.py \
"inference.input_pdb=$INPUT_PDB" \
'inference.num_designs=3' \
'diffuser.partial_T=15' \
'contigmap.contigs=["124-124/0 174-174/0"]' \
'contigmap.provide_seq=["124-297"]' \
"inference.schedule_directory_path=$OUTPUT_DIR" \
"inference.output_prefix=$OUTPUT_DIR/continuous"
# version two - include the sequence gap in the target
run_inference.py \
"inference.input_pdb=$INPUT_PDB" \
'inference.num_designs=3' \
'diffuser.partial_T=15' \
'contigmap.contigs=["124-124/0 79-79/0 95-95/0"]' \
'contigmap.provide_seq=["124-297"]' \
"inference.schedule_directory_path=$OUTPUT_DIR" \
"inference.output_prefix=$OUTPUT_DIR/breaks"
continuous_0.pdb
breaks_0.pdb
Unsurprisingly, version one ("continuous") stitches together the gap in the target sequence, creating an unhelpful franken-target. Version two ("breaks") is more promising, except that it interprets the discontinuity as a chain break too, and so the target gets split into chain B before the break and chain C after the break.
Breaking up the provide_seq flag into two parts, e.g. 'contigmap.provide_seq=["124-202,203-297"]', produced the same output as version two.
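For anyone checking the index arithmetic, here is a small sketch of how I derived those ranges. It assumes provide_seq uses 0-indexed, inclusive positions over the concatenated contig, which is consistent with the 124-297 range used above; `provide_seq_ranges` is a hypothetical helper of my own, not an RFdiffusion function.

```python
def provide_seq_ranges(segment_lengths, skip_first=1):
    """Return 0-indexed, inclusive ranges for each contig segment.

    segment_lengths: lengths of the contig segments in order,
                     e.g. [124, 79, 95] for binder, target part 1, target part 2.
    skip_first:      number of leading segments to leave out of the result
                     (here the 124-residue binder, which is not sequence-fixed).
    """
    ranges = []
    start = 0
    for i, length in enumerate(segment_lengths):
        end = start + length - 1  # inclusive end index
        if i >= skip_first:
            ranges.append(f"{start}-{end}")
        start = end + 1
    return ",".join(ranges)


# Version two: target split into 79 + 95 residues
print(provide_seq_ranges([124, 79, 95]))  # 124-202,203-297
# Version one: target treated as one 174-residue segment
print(provide_seq_ranges([124, 174]))     # 124-297
```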
Other than the chain labels, version two gives me exactly what I'm looking for, so I could just do some post-processing to combine chains B and C back into chain B and call it a day. But it would certainly be preferable to have this handled by RFdiffusion itself! Perhaps the syntax could support something like 'contigmap.provide_seq=["124-202/0 203-297"]'.
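For reference, the post-processing workaround could look roughly like the sketch below. This is not part of RFdiffusion: `merge_chain` and `merge_chain_file` are hypothetical helpers, and the code assumes standard fixed-column PDB records (chain ID in column 22, residue number in columns 23-26). Renumbering is optional; the offset is computed from whatever number the first chain C residue carries in the output, so no particular output numbering is assumed.

```python
def merge_chain(lines, src="C", dst="B", first_resnum=None):
    """Relabel chain `src` as `dst` in PDB ATOM/HETATM/TER records.

    If `first_resnum` is given, residues of `src` are shifted so that the
    first one gets that number (e.g. 101 to restore the 80-100 gap).
    """
    out = []
    offset = None
    for line in lines:
        record = line[:6].strip()
        is_src = (
            record in ("ATOM", "HETATM", "TER")
            and len(line) > 26
            and line[21] == src
        )
        if is_src:
            # Drop a TER sitting directly before the first `src` atom, so the
            # merged chain is not terminated in the middle.
            if record != "TER" and out and out[-1].startswith("TER"):
                out.pop()
            resnum = int(line[22:26])
            if first_resnum is not None:
                if offset is None:
                    offset = first_resnum - resnum
                resnum += offset
            line = f"{line[:21]}{dst}{resnum:>4d}{line[26:]}"
        out.append(line)
    return out


def merge_chain_file(in_path, out_path, **kwargs):
    """Read a PDB file, merge the chains, and write the result."""
    with open(in_path) as handle:
        lines = handle.readlines()
    with open(out_path, "w") as handle:
        handle.writelines(merge_chain(lines, **kwargs))
```

With the files from this post, the call would be something like `merge_chain_file("breaks_0.pdb", "breaks_0_merged.pdb", src="C", dst="B", first_resnum=101)`, where the output filename is just a placeholder.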
To summarize, is there any way to do two-sided partial diffusion, while preserving the gaps in the input sequence and preserving the chain labels of the input sequence? Thank you!