# Hallucinating Scaffold 101

Authors: Angelica Lam

Last Updated: Aug. 24, 2021

### Introduction

The Baker Lab has developed a deep learning method for scaffolding the functional sites of proteins, including enzyme active sites. The method starts from random protein sequences and feeds them to trRosetta, a structure prediction network whose predictions guide the sequence design. The Baker Lab demonstrated that a de novo protein generated by this method binds the substrate of its native enzyme. It remains unclear, however, whether that protein retains the catalytic activity of the native enzyme.
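To make the "start from random sequences and let a score guide the design" idea concrete, here is a toy sketch of a mutate-and-accept search. It is not the Baker Lab's implementation and it does not call trRosetta: `toy_score` is a made-up placeholder for whatever structure-based score the real method uses, and `hill_climb` and its parameters are hypothetical names chosen for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_score(seq):
    """Placeholder score (an assumption, NOT a trRosetta output).

    Returns the fraction of A/E/L residues, just so the search has
    something to optimize. A real pipeline would score the structure
    predicted for `seq` instead.
    """
    return sum(seq.count(aa) for aa in "AEL") / len(seq)


def hill_climb(length, score_fn, steps, seed=0):
    """Start from a random sequence and keep mutations that do not lower the score."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    start = best = score_fn("".join(seq))
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(AMINO_ACIDS)  # propose a single point mutation
        new = score_fn("".join(seq))
        if new >= best:
            best = new        # accept the mutation
        else:
            seq[pos] = old    # reject: restore the previous residue
    return "".join(seq), start, best


if __name__ == "__main__":
    designed, start, best = hill_climb(length=30, score_fn=toy_score, steps=200)
    print(designed, round(start, 3), round(best, 3))
```

The point of the sketch is the loop structure: propose a change to the sequence, score it, and keep it only if the score does not get worse. Swapping `toy_score` for a structure-based score is where a predictor such as trRosetta would come in.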

Our team wants to determine whether the Baker Lab’s method can hallucinate scaffolds for enzymatic active sites that support catalytic activity. We also want to determine how much of the native target enzyme must be carried over into the de novo protein for it to exhibit the native enzyme's catalytic activity.

### Proposed Plan

1. First, we need to pick an enzyme. We restrict ourselves to relatively simple enzymes that have a single active site and whose activity is easy to assay. After choosing an enzyme, we need to identify its active site and its catalytic residues (amino acids). Then we need to select “sets” of residues to scaffold. The sets differ in how much of the native enzyme they include. For instance, one set may contain the residues within 2 Å of the substrate, another the residues within 5 Å, and so on.

2. Next, we need to use the Baker Lab’s deep learning method to scaffold our sets of residues. The method outputs the amino acid sequences of multiple de novo proteins that should hold our input sets of residues in the correct geometry.

3. Then we need to computationally validate the de novo protein sequences to find the best candidates for testing in the wet lab. We use Rosetta forward folding, both with and without the substrate, to independently confirm that each sequence folds into the intended de novo structure. We also use Rosetta's docking protocols, in particular ligand docking, to confirm that the de novo proteins can bind the desired substrate.

4. Finally, we test the de novo proteins in the wet lab.
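Steps 2 and 3 both depend on the scaffolded residues ending up "in the correct geometry". One simple way to quantify that, sketched here, is a distance RMSD (dRMSD): compare every pairwise distance between the motif atoms in the native enzyme with the same distance in the design. This is an illustrative metric only, not a Rosetta function or the Baker Lab's scoring, and `distance_rmsd` and the coordinates below are made up for the example.

```python
import math
from itertools import combinations


def distance_rmsd(coords_a, coords_b):
    """Root-mean-square difference between matching pairwise distances.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples in
    Å, listed in the same atom order. Because only internal distances
    are compared, no structural superposition is required.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both coordinate lists must have the same length")
    if len(coords_a) < 2:
        raise ValueError("need at least two atoms to compare distances")
    squared_diffs = []
    for i, j in combinations(range(len(coords_a)), 2):
        d_a = math.dist(coords_a[i], coords_a[j])
        d_b = math.dist(coords_b[i], coords_b[j])
        squared_diffs.append((d_a - d_b) ** 2)
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


if __name__ == "__main__":
    # Made-up coordinates standing in for three motif atoms.
    native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
    design = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 5.8, 0.0)]
    print(distance_rmsd(native, design))
```

A value near 0 Å means the motif's internal geometry is preserved. Note that dRMSD cannot tell a structure from its mirror image, so a superposition-based RMSD is a useful complementary check.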

### Where are we now?

We have selected TEM-1 beta-lactamase and identified its catalytic residues. TEM-1 beta-lactamase has a single active site and is easy to assay because it gives E. coli resistance to beta-lactam antibiotics (by hydrolyzing the antibiotic's beta-lactam ring).

Think ahead: We want our de novo proteins to have the same catalytic function as TEM-1 beta-lactamase. How might we test that these proteins function as expected in the wet lab?

Answer:

Activity 1: Look at the [enzyme mechanism for beta-lactamase](https://www.ebi.ac.uk/thornton-srv/m-csa/entry/2/). Name at least 5 catalytic residues.

Answer:

See the end of this notebook for answers.

### Next Steps

* Find PDB (Protein Data Bank) entries for crystal structures of TEM-1 beta-lactamase with a bound beta-lactam substrate (e.g., ampicillin, penicillin).
* Write a script to identify the "sets" of residues mentioned in Step 1.
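As a starting point for the script in the second bullet, here is a minimal sketch that reads PDB-format text and collects the protein residues with at least one atom within a given distance of the substrate. It assumes the substrate is stored as `HETATM` records under a known residue name, relies on the standard fixed-column PDB layout, and uses only the Python standard library. The function names, the ligand name `LIG`, and the cutoffs are placeholders, not choices we have settled on.

```python
import math


def parse_atoms(pdb_text):
    """Extract ATOM/HETATM records from PDB-format text (fixed columns)."""
    atoms = []
    for line in pdb_text.splitlines():
        record = line[0:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        atoms.append({
            "record": record,
            "resname": line[17:20].strip(),   # columns 18-20
            "chain": line[21],                # column 22
            "resseq": int(line[22:26]),       # columns 23-26
            "xyz": (float(line[30:38]),       # columns 31-38
                    float(line[38:46]),       # columns 39-46
                    float(line[46:54])),      # columns 47-54
        })
    return atoms


def residues_within(atoms, ligand_resname, cutoff):
    """Return sorted (chain, number, name) tuples for residues near the ligand.

    A residue is selected if any of its atoms lies within `cutoff` Å of
    any ligand atom.
    """
    ligand = [a["xyz"] for a in atoms
              if a["record"] == "HETATM" and a["resname"] == ligand_resname]
    if not ligand:
        raise ValueError(f"no HETATM atoms named {ligand_resname!r} found")
    selected = set()
    for atom in atoms:
        if atom["record"] != "ATOM":
            continue  # skip ligands, waters, and other heteroatoms
        if any(math.dist(atom["xyz"], xyz) <= cutoff for xyz in ligand):
            selected.add((atom["chain"], atom["resseq"], atom["resname"]))
    return sorted(selected)


def residue_sets(atoms, ligand_resname, cutoffs):
    """Build one residue set per distance cutoff, e.g. 2 Å, 5 Å, ..."""
    return {c: residues_within(atoms, ligand_resname, c) for c in cutoffs}
```

With a downloaded structure, usage would look like `residue_sets(parse_atoms(open(path).read()), "LIG", [2.0, 5.0])`, where `path` and `"LIG"` stand in for the real file and the substrate's residue name in that entry. Structures in which the substrate is covalently attached to the enzyme may record it differently, so the `HETATM` assumption should be checked for each entry.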

### Technical Skills

In future steps, we will likely use trRosetta, Rosetta forward (ab initio) folding, and Rosetta docking. Here, we provide some simple exercises to give you a feel for what the software is like. We are new to Rosetta ourselves, so these exercises are not comprehensive. Feel free to do your own exploring!

### Rosetta forward (ab initio) folding

### trRosetta

### Rosetta docking

### Answers

Answers may vary.

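Think ahead: Use Golden Gate or Gibson assembly to build a plasmid carrying the de novo protein sequence, transform the plasmid into E. coli, and then test whether the bacteria become resistant to a beta-lactam antibiotic. The plasmid's own selection marker should not be ampicillin resistance, because that marker is itself a beta-lactamase and would mask the result.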

Activity 1: Ser70, Lys73, Ser130, Glu166, Lys234, and Ala237