# Creation of in vitro insertion parts for the pfa pathway in *C. cinerea*

Now that the promoters have been collected, generated, and run through a quantitative fluorescence analysis, we can create the full list of inserts and the necessary primers for the top 5 promoters and terminators.

The insert will contain: <br>
- PABA marker gene <br>
- PUFA synthesis pathway <br>
    - pfa1, pfa2, pfa3 and a PPTase <br>
- Hygromycin marker gene

*this is where we insert a cool image of the creation*

## Loading Libraries

## Genetic elements

### HR Domain

The insertion site in *C. cinerea* will be the spoII site, based on the report: <br> *Non-conventional fungi are efficient heterologous hosts for natural product production.*

### PABA marker

The PABA marker complements the host's PABA auxotrophy (restoring prototrophy) by introducing an amino acid change from glutamic acid to arginine through a codon change.

### PUFA Synthesis pathway

The PUFA synthesis pathway is from *Aetherobacter fasciculatus* (SBSr002) and contains pfa1, pfa2, pfa3 and a PPTase for DHA synthesis.

### Hygromycin marker

The hygromycin marker is an antibiotic selection marker, which is needed due to the size of the insert. The sequence file includes the B-TUB promoter and terminator, which have proven effective in the report: *"[Non-conventional fungi are efficient heterologous hosts for natural product production](https://findit.dtu.dk/en/catalog/68c2c7e59104491074663843)"*

## Primer creation using TEEMI

### Imports

In [None]:
import os

from IPython.display import display
from teemi.design.combinatorial_design import DesignAssembly
from src.smart_functions import read_fasta_to_dseqrecords

# Move up one level so the relative data/ paths below resolve from the repository root
os.chdir("..")

### Fetching everything

In [None]:
top_promoters = r'data/promoter_terminator_library/promoters.fasta'
top_terminators = r'data/promoter_terminator_library/terminators.fasta'

hr_fa = r'data/insert_sequences/HR.fasta'
m_paba_fa = r'data/insert_sequences/PABA.fasta'
m_hygro_fa = r'data/insert_sequences/Hygromycin.fasta'  # hygromycin marker including promoter and terminator
cds_fa = r'data/insert_sequences/pufa_optimized.fasta'  # pfa1, pfa2, pfa3 and PPTase

promoters, promoter_names = read_fasta_to_dseqrecords(top_promoters)
terminators, terminator_names = read_fasta_to_dseqrecords(top_terminators)

cds_records, cds_names = read_fasta_to_dseqrecords(cds_fa)
m_paba, m_paba_names = read_fasta_to_dseqrecords(m_paba_fa)
m_hygro, m_hygro_names = read_fasta_to_dseqrecords(m_hygro_fa)
hr_records, HR_names = read_fasta_to_dseqrecords(hr_fa)
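
`read_fasta_to_dseqrecords` is a helper from this project's `src/smart_functions.py`, and its implementation is not shown in this notebook. Judging only from the calls above, it appears to return two parallel lists: the records and their names. The sketch below is a hypothetical stand-in that mimics that return shape with plain strings (the real helper presumably returns `Dseqrecord` objects). The function name `parse_fasta_text` and the toy sequences are assumptions made for illustration.

```python
def parse_fasta_text(text):
    """Parse FASTA-formatted text into (sequences, names), two parallel lists."""
    sequences, names = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: the name is the first whitespace-delimited token
            header = line[1:].split()
            names.append(header[0] if header else "")
            sequences.append("")
        elif sequences:
            # Sequence line: append to the most recent record
            sequences[-1] += line.upper()
    return sequences, names


# Toy input, not real promoter sequences
example = """>PKG1 toy promoter
acgtac
gtta
>ADH1
ttgaca
"""
seqs, names = parse_fasta_text(example)
print(names)  # ['PKG1', 'ADH1']
print(seqs)   # ['ACGTACGTTA', 'TTGACA']
```

Returning names alongside the records keeps the part labels available for the later tables without having to inspect each record.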

#### A quick count of sequences

In [3]:
print(f"Promoters: {len(promoters)}, CDS: {len(cds_records)}, Terminators: {len(terminators)}")

Promoters: 2, CDS: 4, Terminators: 2


### Putting it all together in a list for DesignAssembly

In [4]:
list_of_seqs  = [[hr_records[0]], [m_paba[0]], 
                 promoters, [cds_records[0]], terminators, 
                 promoters, [cds_records[1]], terminators, 
                 promoters, [cds_records[2]], terminators,
                 promoters, [cds_records[3]], terminators,
                 [m_hygro[0]], [m_paba[1]], [hr_records[1]]]
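
Each single-element slot in `list_of_seqs` is fixed, while every `promoters` and `terminators` slot can take any entry from its library. The number of construct variants is therefore the product of the slot sizes: with 2 promoters and 2 terminators, the four promoter slots and four terminator slots give 2^8 = 256 combinations. The sketch below reproduces that arithmetic with part names as plain strings. The `demo_` variables are illustrative stand-ins for the real records, and using `itertools.product` is an assumption about how the combinations can be enumerated, not a statement about TEEMI's internals.

```python
from itertools import product
from math import prod

# Stand-in part names; the real slots hold sequence records
demo_promoters = ["PKG1", "ADH1"]
demo_terminators = ["PKG1_T", "ADH1_T"]

# Same slot layout as list_of_seqs: fixed parts are one-element lists
demo_slots = [["HR_DW"], ["PABA-UP"],
              demo_promoters, ["pfa1"], demo_terminators,
              demo_promoters, ["pfa2"], demo_terminators,
              demo_promoters, ["pfa3"], demo_terminators,
              demo_promoters, ["PPtase"], demo_terminators,
              ["Hygromycin"], ["PABA-DW"], ["HR_UP"]]

expected = prod(len(slot) for slot in demo_slots)  # product of slot sizes
variants = list(product(*demo_slots))              # explicit enumeration
print(expected, len(variants))  # 256 256
```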

### DesignAssembly

In [5]:
TARGET_TM = 65
LIMIT = 28
OVERLAP = 35


design = DesignAssembly(
    list_of_seqs,
    list_of_pads=[],
    positions_of_pads=[],
    target_tm=TARGET_TM,
    limit=LIMIT,
    overlap=OVERLAP,
)
variants_df = design.show_variants_lib_df()
primers_df = design.primer_list_to_dataframe()
pcrs_df = design.pcr_list_to_dataframe()


out_dir = os.getcwd()
variants_csv = os.path.join(out_dir, "data/constructs/full_construct_variants_library.csv")
primers_csv  = os.path.join(out_dir, "data/constructs/full_construct_primers_list.csv")
pcrs_csv = os.path.join(out_dir, "data/constructs/full_construct_pcr_plan.csv")

variants_df.to_csv(variants_csv, index=False)
primers_df.to_csv(primers_csv, index=False)
pcrs_df.to_csv(pcrs_csv, index=False)

print(f"Variants: {len(variants_df)} saved -> {variants_csv}")
print(f"Primers:  {len(primers_df)} saved -> {primers_csv}")
print(f"PCRs:     {len(pcrs_df)} saved -> {pcrs_csv}")
display(variants_df.head())
display(primers_df.head())
display(pcrs_df.head())

Variants: 256 saved -> c:\Users\Bruger\Desktop\GIT-Projects\Synthetic biology\27460_synthetic_promoters\data/constructs/full_construct_variants_library.csv
Primers:  56 saved -> c:\Users\Bruger\Desktop\GIT-Projects\Synthetic biology\27460_synthetic_promoters\data/constructs/full_construct_primers_list.csv
PCRs:     51 saved -> c:\Users\Bruger\Desktop\GIT-Projects\Synthetic biology\27460_synthetic_promoters\data/constructs/full_construct_pcr_plan.csv


| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | Systematic_name | Variant |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | HR_DW | PABA-UP | PKG1 | pfa1 | PKG1_T | PKG1 | pfa2 | PKG1_T | PKG1 | pfa3 | PKG1_T | PKG1 | PPtase_BBa_K5300011 | PKG1_T | Hygromycin | PABA-DW | HR_UP | (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ... | 0 |
| 1 | HR_DW | PABA-UP | PKG1 | pfa1 | PKG1_T | PKG1 | pfa2 | PKG1_T | PKG1 | pfa3 | PKG1_T | PKG1 | PPtase_BBa_K5300011 | ADH1_T | Hygromycin | PABA-DW | HR_UP | (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, ... | 1 |
| 2 | HR_DW | PABA-UP | PKG1 | pfa1 | PKG1_T | PKG1 | pfa2 | PKG1_T | PKG1 | pfa3 | PKG1_T | ADH1 | PPtase_BBa_K5300011 | PKG1_T | Hygromycin | PABA-DW | HR_UP | (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, ... | 2 |
| 3 | HR_DW | PABA-UP | PKG1 | pfa1 | PKG1_T | PKG1 | pfa2 | PKG1_T | PKG1 | pfa3 | PKG1_T | ADH1 | PPtase_BBa_K5300011 | ADH1_T | Hygromycin | PABA-DW | HR_UP | (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, ... | 3 |
| 4 | HR_DW | PABA-UP | PKG1 | pfa1 | PKG1_T | PKG1 | pfa2 | PKG1_T | PKG1 | pfa3 | ADH1_T | PKG1 | PPtase_BBa_K5300011 | PKG1_T | Hygromycin | PABA-DW | HR_UP | (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, ... | 4 |


| | id | anneals to | sequence | annealing temperature | length | price(DKK) | description | footprint | len_footprint |
|---|---|---|---|---|---|---|---|---|---|
| 0 | P001 | HR_DW | (C, A, C, G, G, G, G, A, A, G, G, C, G, A, T, ... | 79.29 | 28 | 50.4 | Anneals to HR_DW | (C, A, C, G, G, G, G, A, A, G, G, C, G, A, T, ... | 28 |
| 1 | P002 | HR_DW | (G, G, A, A, A, G, A, T, G, C, C, A, G, A, A, ... | 79.38 | 68 | 122.4 | Anneals to HR_DW, overlaps to 1135bp_PCR_prod | (A, G, G, C, T, C, T, T, G, G, A, C, C, A, A, ... | 50 |
| 2 | P003 | PABA-UP | (A, T, G, T, T, G, G, T, C, C, A, A, G, A, G, ... | 68.88 | 46 | 82.8 | Anneals to PABA-UP, overlaps to HR_DW | (T, T, C, T, T, C, T, G, G, C, A, T, C, T, T, ... | 28 |
| 3 | P004 | PABA-UP | (A, T, C, A, C, G, A, C, C, A, G, A, T, A, A, ... | 69.46 | 46 | 82.8 | Anneals to PABA-UP, overlaps to 1036bp_PCR_prod | (C, C, T, C, T, C, T, T, A, C, T, C, C, C, G, ... | 28 |
| 4 | P005 | PKG1 | (G, G, A, C, G, G, G, A, G, T, A, A, G, A, G, ... | 71.94 | 46 | 82.8 | Anneals to PKG1, overlaps to PABA-UP | (G, T, G, T, T, A, T, C, T, G, G, T, C, G, T, ... | 28 |


| | pcr_number | template | forward_primer | reverse_primer | f_tm | r_tm |
|---|---|---|---|---|---|---|
| 0 | PCR1 | HR_DW | P001 | P002 | 79.29 | 79.38 |
| 1 | PCR2 | PABA-UP | P003 | P004 | 68.88 | 69.46 |
| 2 | PCR3 | PKG1 | P005 | P006 | 71.94 | 71.95 |
| 3 | PCR4 | pfa1 | P007 | P008 | 81.13 | 81.41 |
| 4 | PCR5 | PKG1_T | P009 | P010 | 64.79 | 66.24 |
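
A common sanity check on a PCR plan is that the forward and reverse primers of each reaction have similar melting temperatures. The sketch below runs that check on the five rows displayed above, entered as literal data. The 5 °C cutoff `MAX_TM_DIFF` and the helper `mismatched_pairs` are assumptions for illustration, not values or functions taken from this notebook or from TEEMI.

```python
# (pcr_number, f_tm, r_tm) copied from the head of the PCR plan shown above
pcr_rows = [
    ("PCR1", 79.29, 79.38),
    ("PCR2", 68.88, 69.46),
    ("PCR3", 71.94, 71.95),
    ("PCR4", 81.13, 81.41),
    ("PCR5", 64.79, 66.24),
]

MAX_TM_DIFF = 5.0  # assumed cutoff in degrees Celsius


def mismatched_pairs(rows, max_diff):
    """Return the PCR ids whose primer Tm values differ by more than max_diff."""
    return [pcr for pcr, f_tm, r_tm in rows if abs(f_tm - r_tm) > max_diff]


print(mismatched_pairs(pcr_rows, MAX_TM_DIFF))  # []
```

To screen the full plan, the same function could be applied to the `pcr_number`, `f_tm` and `r_tm` columns of `pcrs_df`.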
