# Creation of in vitro insertion parts for the pfa chain in *C. cinerea*

Now that the promoters have been collected, generated, and run through a quantitative fluorescence analysis, we can create the full list of inserts and the necessary primers for the top 5 promoters and terminators.

This insert will contain: <br>
- PABA marker gene <br>
- PUFA synthesis pathway <br>
    - pfa1, pfa2, pfa3 and a pptase <br>
- Hygromycin marker gene.

*this is where we insert a cool image of the creation*

## Loading Libraries

In [1]:
from teemi.design.combinatorial_design import DesignAssembly
import os
from IPython.display import display

os.chdir("..") #move to root to allow relative import
from src.smart_functions import read_fasta_to_dseqrecords

## Genetic elements

### Homologous recombination PABA marker

The PABA marker contains a homology arm that rescues the PABA auxotrophy by introducing an amino acid change from glutamic acid to arginine. Correct homologous recombination can then be confirmed by diagnostic PCR.

### PUFA Synthesis pathway

The PUFA synthesis pathway is from *Aetherobacter fasciculatus* (SBSr002) and contains pfa1, pfa2, pfa3 and a pptase for DHA synthesis.

## Primer creation using TEEMI

### Fetching everything

In [2]:
top_promoters = r'data/promoter_terminator_library/subset_promoter.fasta'
top_terminators = r'data/promoter_terminator_library/subset_terminator.fasta'

m_paba_fa = r'data/insert_sequences/PABA.fasta'
cds_fa = r'data/insert_sequences/pufa_optimized.fasta' #pfa123 and pptase

promoters, promoter_names = read_fasta_to_dseqrecords(top_promoters)
terminators, terminator_names = read_fasta_to_dseqrecords(top_terminators)

cds_records, cds_names = read_fasta_to_dseqrecords(cds_fa)
m_paba, m_paba_names = read_fasta_to_dseqrecords(m_paba_fa)
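
The implementation of `read_fasta_to_dseqrecords` lives in `src/smart_functions` and is not shown in this notebook. As a rough illustration of the `(records, names)` return shape used above, here is a minimal standard-library stand-in. The function name `read_fasta_plain` is hypothetical, and it returns plain strings rather than `Dseqrecord` objects, so it is a sketch of the idea, not the project's helper.

```python
from io import StringIO
from typing import List, TextIO, Tuple


def read_fasta_plain(handle: TextIO) -> Tuple[List[str], List[str]]:
    """Return (sequences, names) from a FASTA handle, in file order.

    Hypothetical stand-in: the real helper is assumed to return
    Dseqrecord objects instead of plain strings.
    """
    sequences: List[str] = []
    names: List[str] = []
    chunks: List[str] = []
    for raw_line in handle:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if names:
                # close the previous record before starting a new one
                sequences.append("".join(chunks))
            header = line[1:].split()
            names.append(header[0] if header else "")
            chunks = []
        elif names:
            # sequence line belonging to the current record
            chunks.append(line.upper())
    if names:
        sequences.append("".join(chunks))
    return sequences, names


# Small in-memory example instead of a file on disk
example = StringIO(">prom_A first promoter\nacgt\nAC\n>prom_B\nGG\n")
example_seqs, example_names = read_fasta_plain(example)
print(example_names, example_seqs)
```

Returning two parallel lists keeps each name aligned with its sequence by index, which is how `promoters` and `promoter_names` are used in the cells that follow.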

#### A quick count of sequences

In [3]:
print(f"Promoters: {len(promoters)}, CDS: {len(cds_records)}, Terminators: {len(terminators)}")

Promoters: 4, CDS: 4, Terminators: 3


### Putting it all together in a list for DesignAssembly

In [4]:
list_of_seqs  = [[m_paba[0]], 
                 promoters, [cds_records[0]], terminators, 
                 promoters, [cds_records[1]], terminators, 
                 promoters, [cds_records[2]], terminators,
                 promoters, [cds_records[3]], terminators,
                 [m_paba[1]], 
                 ]
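
Before running the design it can help to estimate how large the combinatorial library will be. The sketch below assumes that `DesignAssembly` builds every combination of the options given at each position (a full Cartesian product); that behaviour is an assumption here, not something this notebook verifies. The counts are taken from the output above (4 promoters, 3 terminators, 4 CDS positions), and `count_variants` is a hypothetical helper name.

```python
from math import prod
from typing import List


def count_variants(slot_sizes: List[int]) -> int:
    """Number of combinations when one option is picked per position."""
    return prod(slot_sizes)


n_promoters = 4    # from the sequence count above
n_terminators = 3  # from the sequence count above
n_cds_slots = 4    # pfa1, pfa2, pfa3 and the pptase

# Mirror the layout of list_of_seqs: first PABA fragment, then
# promoter / CDS / terminator for each gene, then second PABA fragment.
slot_sizes = [1]
for _ in range(n_cds_slots):
    slot_sizes += [n_promoters, 1, n_terminators]
slot_sizes.append(1)

n_variants = count_variants(slot_sizes)
print(f"Positions: {len(slot_sizes)}, expected variants: {n_variants}")
# (4 * 3) ** 4 = 20736 variants under the full-product assumption
```

Under that assumption the library grows as (promoters × terminators) raised to the number of genes, so trimming the promoter or terminator subset by even one sequence reduces the design size substantially.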

### DesignAssembly

In [5]:
TARGET_TM = 65  # target primer melting temperature (°C)
LIMIT = 28      # lower bound on primer annealing length (nt)
OVERLAP = 35    # overlap between adjacent fragments (bp)

design = DesignAssembly(
    list_of_seqs,
    list_of_pads=[],
    positions_of_pads=[],
    target_tm=TARGET_TM,
    limit=LIMIT,
    overlap=OVERLAP,
)
variants_df = design.show_variants_lib_df()
primers_df = design.primer_list_to_dataframe()
pcrs_df = design.pcr_list_to_dataframe()


out_dir = os.path.join(os.getcwd(), "data", "constructs")
os.makedirs(out_dir, exist_ok=True)  # make sure the output folder exists before writing
variants_csv = os.path.join(out_dir, "full_construct_variants_library.csv")
primers_csv = os.path.join(out_dir, "full_construct_primers_list.csv")
pcrs_csv = os.path.join(out_dir, "full_construct_pcr_plan.csv")

variants_df.to_csv(variants_csv, index=False)
primers_df.to_csv(primers_csv, index=False)
pcrs_df.to_csv(pcrs_csv, index=False)

print(f"Variants: {len(variants_df)}")
print(f"Primers:  {len(primers_df)}")
print(f"PCRs:     {len(pcrs_df)}")
display(variants_df.head())
display(primers_df.head())
display(pcrs_df.head())

KeyboardInterrupt: 