# Simulation of Recombinant Protein Expression
***

# Introduction

This is an initial version of the Biotechnology Laboratory Simulator. Your goal is to maximize the expression rate of a recombinant protein of your choice. This introduction is followed by the tasks you perform in the virtual lab to find out how to reach that maximum; the results are evaluated at the end.
After the necessary preparations, the laboratory workflow comprises strain characterization cultivations, experiments for promoter sequence selection, and finally the experiment that measures the achieved expression rate.

# Laboratory Tasks

This section covers all of the laboratory work. As in every laboratory, your resources are limited.

**1. Preparations for the experiments**
 
*1.1 Set up your laboratory*    
*1.2 Choose host organism*
 
**2. Experiments**
 
*2.1 Strain characterization cultivations*     
*2.2 Promoter sequence selection*    
*2.3 Measurement of the expression rate*



## 1. Preparations for the experiments

### 1.1 Set up your laboratory

In [None]:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# A Mutant is the outcome of a cloning: the host organism together with its promoter sequence.
from BioLabSimFun import Mutant

### 1.2 Choose host organism
Choosing your biotech host is extremely simple: just pass the name of your favorite bug to the 'Mutant' command. Your company provides two organisms, namely *E. coli* (abbr. Ecol) and *P. putida* (abbr. Pput). Use the abbreviation to select one.

In [None]:
# create a mutant from the chosen host organism
myhost = Mutant('Ecol')
myhost.show_BiotechSetting()
# inspect the attributes of the host object
# list(vars(myhost).keys())
print(vars(myhost))

## 2. Experiments

### 2.1 Strain characterization cultivations
Your organization has a strain that is similar to the one you wanted but not identical, and no one knows its optimal cultivation conditions. All you have been told is that both organisms are mesophilic bacteria. You can look up the range of temperatures suitable for growth in the following excerpt from the book "Biotechnology":    
[Schmid, Rolf D., and Claudia Schmidt-Dannert. Biotechnology: An illustrated primer. John Wiley & Sons, 2016.](https://application.wiley-vch.de/books/sample/3527335153_c01.pdf)  
Find out the optimal growth temperature by cultivating your strain at different temperatures and calculating the growth rates based on the measured biomass concentrations. Pay attention to the maximum biomass concentration that can be reached.
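
The growth-rate calculation can be sketched as follows. This is only an illustration under stated assumptions: the time series is synthetic (a logistic curve with made-up parameters), hours are assumed as the time unit, and `estimate_growth_rate` with its `max_fraction` cut-off is a hypothetical helper that is not part of BioLabSim. The idea is that biomass grows roughly exponentially early in the cultivation, so the slope of ln(biomass) over time approximates the growth rate, while the plateau of the curve gives the maximum biomass.

```python
import numpy as np

def estimate_growth_rate(time_h, biomass, max_fraction=0.2):
    """Slope of ln(biomass) over time, using only the early exponential phase."""
    time_h = np.asarray(time_h, dtype=float)
    biomass = np.asarray(biomass, dtype=float)
    # keep only points well below the plateau, where growth is still exponential
    early = biomass < max_fraction * biomass.max()
    slope, _intercept = np.polyfit(time_h[early], np.log(biomass[early]), 1)
    return slope

# synthetic logistic growth curve standing in for one measured cultivation
# (all parameter values are made up for illustration)
t = np.linspace(0, 24, 49)
mu_true, x0, x_max = 0.5, 0.1, 50.0
x = x_max / (1 + (x_max / x0 - 1) * np.exp(-mu_true * t))

mu_est = estimate_growth_rate(t, x)
x_max_est = x.max()
print(f"growth rate: {mu_est:.2f} 1/h, maximum biomass: {x_max_est:.1f}")
```

Repeating this for every cultivation temperature and comparing the estimated rates points to the optimal growth temperature.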

In [None]:
# cultivation temperatures to test (in °C)
temperatures = np.array([20, 30, 32, 35, 38, 38])
# run the temperature growth experiment
myhost.Make_TempGrowthExp(temperatures)

myhost.show_BiotechSetting()
#print('OptTemp: {}'.format(myhost._Mutant__OptTemp))

### 2.2 Promoter sequence selection
You need to identify the optimal promoter sequence for the expression of your gene of interest. Read the following article to become an expert on sigma70-driven prokaryotic gene expression: [https://doi.org/10.3390/biom5031245](https://doi.org/10.3390/biom5031245).
Think of some promoters and test them, but be aware that each test costs resources.

Each promoter must have a total length of 40 nt. In addition, its genetic distance to the reference sequence of the expression tests should not exceed 0.4. You can first check that as follows:
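
A minimal sketch of such a check, assuming that the genetic distance is the fraction of positions at which two sequences of equal length differ (the simulator may define it differently). The reference sequence is not given here, so one of the example promoters from the cloning section stands in as a placeholder. `genetic_distance` and `check_promoter` are hypothetical helpers, not BioLabSim functions.

```python
REQUIRED_LENGTH = 40   # nt, as required for the promoters
MAX_DISTANCE = 0.4     # upper limit for the genetic distance

def genetic_distance(seq, ref):
    """Fraction of differing positions (assumed definition of genetic distance)."""
    if len(seq) != len(ref):
        raise ValueError("sequences must have the same length")
    mismatches = sum(a != b for a, b in zip(seq.upper(), ref.upper()))
    return mismatches / len(ref)

def check_promoter(seq, ref):
    """True if the promoter has the required length and is close enough to ref."""
    return len(seq) == REQUIRED_LENGTH and genetic_distance(seq, ref) <= MAX_DISTANCE

# placeholder reference: replace with the actual reference sequence
reference = 'GCCCATTGACGCTGCCGTAGCGCTCCTATACCCTTGCACG'
candidate = 'CCGCATTGACGCTGCCGTAGCGCTCCTATACCCTTGCACG'

print(genetic_distance(candidate, reference), check_promoter(candidate, reference))
```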

Then write down your promoter sequences in an Excel sheet.
To test the sequences, you have to clone each of them, introduce the resulting construct into the organism and then perform an expression test.

#### 2.2.1 Cloning
You are given a publication from which you can identify positions for integration.  
Pseudomonas GC content: [https://doi.org/10.1111/1462-2920.14130](https://doi.org/10.1111/1462-2920.14130)   
First create the primers matching your promoter sequences and write them down in your Excel sheet.  
For cloning to work, the primer length should deviate from the optimum by no more than 20 % and should not exceed 30 nt.    
Then calculate the melting temperature of each primer and enter it in your Excel sheet. Formulas for calculating the melting temperature are given on the following website; assume a sodium concentration of 50 mM. The melting temperature should deviate from the optimum by no more than 20 %.   
[Proceedings of ICSIIT 2010](https://core.ac.uk/download/pdf/35391868.pdf#page=190)      
Finally perform a cloning with each pair of primers followed directly by an expression test.
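
Judging from the example primers used in this notebook, a primer is the base-by-base complement of the leading part of the promoter (not the reverse complement). Under that assumption, a primer and its relative length deviation can be derived as sketched here. `make_primer` and `relative_deviation` are hypothetical helpers, and the optimum length of 25 nt is only a placeholder because the actual optimum is not stated.

```python
COMPLEMENT = str.maketrans('ACGT', 'TGCA')
MAX_PRIMER_LENGTH = 30  # nt, upper limit for cloning to work

def make_primer(promoter, length):
    """Complement of the first `length` nucleotides of the promoter."""
    if length > MAX_PRIMER_LENGTH:
        raise ValueError("primer must not be longer than 30 nt")
    return promoter[:length].upper().translate(COMPLEMENT)

def relative_deviation(value, optimum):
    """Absolute deviation from the optimum as a fraction of the optimum."""
    return abs(value - optimum) / optimum

promoter = 'GCCCATTGAGCTGTTAGCCTAAACTAGCTAAATTTGCACG'
primer = make_primer(promoter, 24)
assumed_optimum_length = 25  # placeholder, not taken from the simulator

print(primer)
print(relative_deviation(len(primer), assumed_optimum_length) <= 0.20)
```

The same `relative_deviation` check applies to the melting temperature once you have calculated it with a formula from the linked reference.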

Run the cloning and the expression test directly one after the other: the promoter is transferred to the mutant only if the cloning is successful.
If the cloning fails, the melting temperature or the total primer length may need to be adjusted.

In [None]:
# Construct Primer and calculate melting temperature
Clone_ID1 = 'Test1'
Promoter1 =  'GCCCATTGACGCTGCCGTAGCGCTCCTATACCCTTGCACG'
TestPrimer = 'CGGGTAACTGCGACGGCATCGCGAG' #GAT'
Tm = 25

Clone_ID2 = 'Test2'
Promoter2 = 'A'*40

Clone_ID3 = 'Test3'
Promoter3 = 'CCGCATTGACGCTGCCGTAGCGCTCCTATACCCTTGCACG'
TestPrimer3 = 'GGCGTAACTGCGACGGCATCGCG' #AG' #GAT'
Tm = 29

Clone_ID4 = 'BestEcol'
OptPromoterEcol = 'GCCCATTGACAAGGCTCTCGCGGCCAGGTATAATTGCACG'
TestPrimer4 = 'CGGGTAACTGTTCCGAGAGCGC' #CG' #G

Clone_ID5 = 'Test5'
Promoter5 =   'GCCCATTGAGCTGTTAGCCTAAACTAGCTAAATTTGCACG'
TestPrimer5 = 'CGGGTAACTCGACAATCGGATTTG'


# Run the cloning experiment. The primer length is calculated automatically
# and does not need to be entered. Note: Tm holds the last value assigned above.
myhost.Make_Cloning(Clone_ID5, Promoter5, TestPrimer5, Tm)
myhost.show_Library()
myhost.show_BiotechSetting()

#### 2.2.2 Measurement of the promoter strength

In [None]:
myhost.Make_MeasurePromoterStrength('Test5')
myhost.show_Library()
myhost.show_BiotechSetting()
# print('Promoter strength: ', myhost.Promoter_Strength)

### 2.3 Measurement of the expression rate
Now perform the production experiment with the best promoter sequence, the optimal growth temperature (integer only) and the maximum possible biomass (integer only), both of which you can determine from your data. This gives you the maximum expression rate and thus the maximum yield of your product.   
The following command shows the minimum expression rate your experiment should achieve.

In [None]:
myhost.show_TargetExpressionRate()

In [None]:
# arguments: clone ID of the best promoter, growth temperature, maximum biomass
myhost.Make_ProductionExperiment('BestEcol', 28, 100)
myhost.show_Library()
myhost.show_BiotechSetting()

# Evaluation and visualization of the results

Finally, summarize your results in a graph by plotting the promoter strengths of your promoter sequences against their GC contents. Because the promoter strength is directly proportional to the expression rate, highlight the promoter strength of the best promoter sequence, the one you used in the production experiment. The command `plot_ReferencePromoterStrength()` additionally displays the promoter strength of a very well suited sequence so that you can compare your results against it.
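
The GC contents for the x-axis can be computed directly from the promoter sequences. A small sketch, using two of the example promoters from the cloning section; `gc_content` is a hypothetical helper, not a BioLabSim function.

```python
def gc_content(seq):
    """Fraction of G and C nucleotides in a sequence."""
    seq = seq.upper()
    return (seq.count('G') + seq.count('C')) / len(seq)

promoters = {
    'Test1': 'GCCCATTGACGCTGCCGTAGCGCTCCTATACCCTTGCACG',
    'Test5': 'GCCCATTGAGCTGTTAGCCTAAACTAGCTAAATTTGCACG',
}
gc = {clone_id: gc_content(seq) for clone_id, seq in promoters.items()}
for clone_id, value in gc.items():
    print(f"{clone_id}: GC content = {value:.3f}")
```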

In [None]:
# GC contents (x) and measured promoter strengths (y) of the tested promoters
x = np.array([0.45, 0.63])
y = np.array([1.17, 1.13])

plt.figure(figsize = (4.5,3), dpi = 120)
plt.plot(x,y, linestyle = '--', marker = 'x', color = 'grey')
# highlight best sequence
plt.plot(0.45, 1.17, marker = 'X', color = 'black', markersize = 8)

myhost.plot_ReferencePromoterStrength()

plt.xlabel('GC content [-]')
plt.ylabel('promoter strength [-]')
plt.show()