# PCR Primer design
So far we have used SnapGene to design our primers. However, to explore open-source alternatives, this notebook uses `Biopython` in conjunction with Primer3.
## Installation
First we need to install the Python wrapper `primer3-py`:

In [1]:
!pip install primer3-py



## Import
We can now import it along with the other libraries we need:

In [2]:
from Bio import SeqIO
import primer3
import os
os.chdir("..")

## Designing our first primer

In [3]:
dna_seq = str("AGCGCACATCTCTAATCTTTGATTTACGTATAATCGTTACACAACCATCCCCCTAATTCCTCTACAGAGGCGCGTCCTGCTGCCCCTTCCCATGGCACATGCTTCCAACTTACGATTTCTCACGTCCCACTCGTCCCAGGTTTCGCCTTTTCGGATTTCGTCCGTGTTGTGTTTTGTCTTCTTCCGTTGTCTTCCTCCGTTGTCTTCGATTTGATTCCCCACACCCGACTGCCTATCATTGTTCTTGCTTCGGATTCCTGGGCCACCTCGTTGAACTGGTGCTACTCAATAATTTATCGTAATTTACCATACGTTATCGATACATACTCCCCCCCATTCCGTTGTATTCCTTCGCTGTAGCTTTTGAGACCACCACCACCACCCTAGCCTCTGTTGTATAGTTTTCGACTTCGTCCGCCCGCCGTTGTCCTTGTGAACCATTTTTGTTGCTATCGCACACATACATACCAAGTACACTGCTTCTTGCTATCCTCCCTACTAGCTTTCTACATGCATTGCTACTATACAGCCCTCTACAGCACTTGAAATGAAGTAATTTGAGCCCGGAGCATCCACGAAACCTACGCGAGGAAGAGGTTCTTTAGGCGCTGGTCTGCATTTTCGCCCATACTCAGTCCCAAATTCCAGGGAACGAGGCCAGTAATACTACTGTATTGATTGTCGTTTTACAGCCACTGTCACTGTCATTCAATGTCAACGAACGGCGTCCAGACTTGTCAAGCATCGTGCCAACGATTTGTTTCCGCGTTCACAGCATCCGCTCCAACATCTCCCTATTTTCGACGTACAAATCAAACCCTAGTTGACAGCAAACCTAACGGTAGTGTTGTTGCGAGGTGGAGGTCCCTCAAGGTGGAGTTACCTTGAGGTCCCGTATTTGTGATGGTGGCGTTCTCAAGGTCCCTTGAGGTGGCCTTGATGTGTTTTCCCATCCCTGTAACGGAATCTCCGGGACGGTGATGTGGCAGCGAGTGCGAGC")

primers = primer3.bindings.design_primers(
    {'SEQUENCE_TEMPLATE': dna_seq},
    {
        'PRIMER_OPT_SIZE': 20,
        'PRIMER_MIN_SIZE': 18,
        'PRIMER_MAX_SIZE': 25,
        'PRIMER_OPT_TM': 60.0,
        'PRIMER_MIN_TM': 57.0,
        'PRIMER_MAX_TM': 63.0,
        'PRIMER_PAIR_MAX_DIFF_TM': 3.0,
        'PRIMER_PRODUCT_SIZE_RANGE': [[900, 1100]],
        'SEQUENCE_INCLUDED_REGION': [0, len(dna_seq)],
        # Restrict both primers to the outermost 25 bases of the template
        'SEQUENCE_PRIMER_PAIR_OK_REGION_LIST': [[0, 25, len(dna_seq) - 25, 25]],
    },
)

# Print the first primer pair
print("Forward primer:", primers['PRIMER_LEFT_0_SEQUENCE'])
print("Reverse primer:", primers['PRIMER_RIGHT_0_SEQUENCE'])
print("Product size:", primers['PRIMER_PAIR_0_PRODUCT_SIZE'])

Forward primer: AGCGCACATCTCTAATCTTTGA
Reverse primer: ACTCGCTGCCACATCACC
Product size: 994
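
To see where a reported primer pair sits on its template, one can search for the forward primer and for the reverse complement of the reverse primer. The sketch below is a minimal illustration in plain Python. The helper names `reverse_complement` and `locate_primers` are hypothetical, and the toy template only joins the first and last bases of the sequence above with an arbitrary filler, so its product size differs from the real one.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def locate_primers(template: str, forward: str, reverse: str) -> dict:
    """Report where a primer pair anneals and how much template is left out."""
    start = template.find(forward)
    # The reverse primer anneals to the opposite strand, so on the
    # template it appears as its reverse complement.
    rev_site = reverse_complement(reverse)
    rev_start = template.rfind(rev_site)
    if start == -1 or rev_start == -1:
        raise ValueError("primer not found on template")
    end = rev_start + len(rev_site)
    return {
        "product_size": end - start,
        "left_trim": start,                  # bases lost upstream
        "right_trim": len(template) - end,   # bases lost downstream
    }


# Toy template (assumption): real start and end of the sequence above,
# joined by a filler that is NOT part of the original sequence.
toy_template = (
    "AGCGCACATCTCTAATCTTTGATTTACG" + "ACGT" * 10 + "GGTGATGTGGCAGCGAGTGCGAGC"
)
info = locate_primers(
    toy_template, "AGCGCACATCTCTAATCTTTGA", "ACTCGCTGCCACATCACC"
)
print(info)
```

On this toy template the forward primer starts at the first base, while the reverse primer leaves the last six bases out of the product.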


As we can see, the product is slightly shorter than the input sequence: Primer3 still optimizes primer placement even when the allowed positions are constrained with the last argument. In this case we choose to ignore it, as it is unlikely to be a problem. However, if an important motif is left out, the promoter may not work well. To solve this, one could screen the input sequence for restriction sites, choose the recognition sequence of a non-cutting enzyme to append to the primer, and include a wider piece of the genome to ensure the entire upstream region is extracted.
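
The screening step described above could be sketched as follows. This is a minimal illustration, not part of our pipeline: the enzyme panel is a small hand-picked set of common recognition sites, the helpers `non_cutters` and `add_site` are hypothetical, and the `GCGC` clamp in front of the site is an arbitrary assumption. Biopython's `Bio.Restriction` module covers a far larger enzyme catalogue and would be the natural choice for real use.

```python
# Small illustrative panel of recognition sites (5'->3').
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "NotI": "GCGGCCGC",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def non_cutters(seq: str, sites: dict = SITES) -> list:
    """Names of enzymes whose site occurs on neither strand of seq."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    return sorted(
        name for name, site in sites.items()
        if site not in seq and site not in rc
    )


def add_site(primer: str, enzyme: str, clamp: str = "GCGC") -> str:
    """Prepend a clamp (assumed) and the enzyme's site to the primer's 5' end."""
    return clamp + SITES[enzyme] + primer


toy_region = "AAAGAATTCAAAGGATCCTTT"  # toy sequence with EcoRI and BamHI sites
candidates = non_cutters(toy_region)
print(candidates)
print(add_site("AGCGCACATCTCTAATCTTTGA", candidates[-1]))
```

Only the 3' part of the extended primer anneals to the template, so the melting temperature reported by Primer3 still refers to the original primer sequence.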
## Designing primers for genomic extraction of upstream sequences

In [4]:
from src.utils import design_primers

pth = "data/primer_design/promoters_for_extraction.fasta"

design_primers(pth)

Record: PKG1_promoter
up primer: CGTGATTTCCTCTGCCTCGT
down primer: ACGGAATGGTTATCGCCCTT
product size: 935
primr melting temp:  60.108614558536146 59.45530340378093

Record: ADH1_promoter
up primer: CAGGGTCAAGCAGAGCAGAA
down primer: GGCACAGGGAGGTGTAAGTC
product size: 947
primr melting temp:  59.96411926156776 60.0362939165052

Record: TDH3_promoter
up primer: CGTTCCCACTTTGGACGTGT
down primer: GAGAGAAAAGCGCAGTTGGC
product size: 944
primr melting temp:  60.812412072024244 60.11000999450579

Record: ACT_promoter
up primer: TCGAGAAGAGAGGTAGGCGG
down primer: TGATAGAGCTGTAGGGCGGG
product size: 950
primr melting temp:  60.464013016904346 60.827924519505984

Record: TEF1_promoter
up primer: GGGCGAGTGTCCATTCATGA
down primer: CATCTCGCACCAGTGGATGA
product size: 926
primr melting temp:  60.10770297561146 59.82355063247462

Record: DED1_promoter
up primer: CCACAGATGCAAACGCAACA
down primer: ATCGCTGTGGATATCGGTGG
product size: 933
primr melting temp:  59.96949614280919 59.681559465123314

Record: gpdA_

We'll also generate primers for extraction of the downstream sequences used as terminators:

In [5]:
pth = "data/primer_design/terminators_for_extraction.fasta"

design_primers(pth)

Record: PKG1_terminator
up primer: TGATGATTATTAGTGAGAGCGTGGA
down primer: GCCCGATACCTCCAAGGAAA
product size: 951
primr melting temp:  59.69819491103988 59.45530340378093

Record: ADH1_terminator
up primer: TGTGACAGATGACGGACACA
down primer: AAGGCTTGGTTTACGACGGT
product size: 931
primr melting temp:  58.96329264587649 59.89273163212596

Record: TDH3_terminator
up primer: TCGTTACACAACCATCCCCC
down primer: ACACATCAAGGCCACCTCAA
product size: 912
primr melting temp:  59.67439756087623 59.52192805017313

Record: ACT_terminator
up primer: ACGACCTCCTTACGACCCTT
down primer: TCAACCAAGCCACAAGTCGT
product size: 992
primr melting temp:  60.25151896976945 60.107410468332944

Record: TEF1_terminator
up primer: TCACCACCTCGTTCTCGTTT
down primer: TGACTAGGCTGCCTTTGACC
product size: 942
primr melting temp:  59.251549596098016 59.67505708032144

Record: DED1_terminator
up primer: GCCCCTTCTCTTTTCGACGA
down primer: ATGTGATGCCAACATGCTGC
product size: 938
primr melting temp:  60.0376045227884 59.825155079357785

### A very specific function for generating a table in our report

In [6]:
from src.utils import design_primers_latex_table
#promoters
design_primers_latex_table("data/primer_design/promoters_for_extraction.fasta")

\textit{PKG1_promoter} & Up & 60 & 5'-CGTGATTTCCTCTGCCTCGT-3' \\
& Down & 59 & 5'-ACGGAATGGTTATCGCCCTT-3' \\
\textit{ADH1_promoter} & Up & 60 & 5'-CAGGGTCAAGCAGAGCAGAA-3' \\
& Down & 60 & 5'-GGCACAGGGAGGTGTAAGTC-3' \\
\textit{TDH3_promoter} & Up & 61 & 5'-CGTTCCCACTTTGGACGTGT-3' \\
& Down & 60 & 5'-GAGAGAAAAGCGCAGTTGGC-3' \\
\textit{ACT_promoter} & Up & 60 & 5'-TCGAGAAGAGAGGTAGGCGG-3' \\
& Down & 61 & 5'-TGATAGAGCTGTAGGGCGGG-3' \\
\textit{TEF1_promoter} & Up & 60 & 5'-GGGCGAGTGTCCATTCATGA-3' \\
& Down & 60 & 5'-CATCTCGCACCAGTGGATGA-3' \\
\textit{DED1_promoter} & Up & 60 & 5'-CCACAGATGCAAACGCAACA-3' \\
& Down & 60 & 5'-ATCGCTGTGGATATCGGTGG-3' \\
\textit{gpdA_promoter} & Up & 61 & 5'-CGTTCCCACTTTGGACGTGT-3' \\
& Down & 60 & 5'-GAGAGAAAAGCGCAGTTGGC-3' \\
\textit{pkiA_promoter} & Up & 60 & 5'-GAGGCAATGCTGGGTTTTCC-3' \\
& Down & 60 & 5'-GTGTCCCTTTAAGTGGCGGA-3' \\
\textit{mdhA_promoter} & Up & 60 & 5'-CCAGTACCGCGATCCTTTGT-3' \\
& Down & 60 & 5'-GAAGGTGGTGGTTGTGGAGA-3' \\


In [7]:
#terminators
pth = "data/primer_design/terminators_for_extraction.fasta"
design_primers_latex_table(pth)

\textit{PKG1_terminator} & Up & 60 & 5'-TGATGATTATTAGTGAGAGCGTGGA-3' \\
& Down & 59 & 5'-GCCCGATACCTCCAAGGAAA-3' \\
\textit{ADH1_terminator} & Up & 59 & 5'-TGTGACAGATGACGGACACA-3' \\
& Down & 60 & 5'-AAGGCTTGGTTTACGACGGT-3' \\
\textit{TDH3_terminator} & Up & 60 & 5'-TCGTTACACAACCATCCCCC-3' \\
& Down & 60 & 5'-ACACATCAAGGCCACCTCAA-3' \\
\textit{ACT_terminator} & Up & 60 & 5'-ACGACCTCCTTACGACCCTT-3' \\
& Down & 60 & 5'-TCAACCAAGCCACAAGTCGT-3' \\
\textit{TEF1_terminator} & Up & 59 & 5'-TCACCACCTCGTTCTCGTTT-3' \\
& Down & 60 & 5'-TGACTAGGCTGCCTTTGACC-3' \\
\textit{DED1_terminator} & Up & 60 & 5'-GCCCCTTCTCTTTTCGACGA-3' \\
& Down & 60 & 5'-ATGTGATGCCAACATGCTGC-3' \\
\textit{gpdA_terminator} & Up & 61 & 5'-CGTTCCCACTTTGGACGTGT-3' \\
& Down & 60 & 5'-GAGAGAAAAGCGCAGTTGGC-3' \\
\textit{pkiA_terminator} & Up & 60 & 5'-GCTTTCGCCATTCTACTCGC-3' \\
& Down & 60 & 5'-CCCTTGCCTGTCTATCGACC-3' \\
\textit{mdhA_terminator} & Up & 60 & 5'-GGCGGGTGGTTAGATGGTAG-3' \\
& Down & 60 & 5'-CCGATTTACCTCTCCCAGCG-3' 
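
For reference, rows in this format could be produced along the following lines. This is a hypothetical sketch, not the code of `design_primers_latex_table`: the function name `latex_rows` and its arguments are assumptions. Unlike the output above, it escapes underscores in the record name, since a bare `_` outside math mode typically triggers a LaTeX error.

```python
def latex_rows(name, up_seq, down_seq, up_tm, down_tm):
    """Format one primer pair as two LaTeX table rows (hypothetical helper)."""
    safe_name = name.replace("_", r"\_")  # escape underscores for LaTeX
    return [
        rf"\textit{{{safe_name}}} & Up & {round(up_tm)} & 5'-{up_seq}-3' \\",
        rf"& Down & {round(down_tm)} & 5'-{down_seq}-3' \\",
    ]


rows = latex_rows(
    "PKG1_promoter",
    "CGTGATTTCCTCTGCCTCGT",
    "ACGGAATGGTTATCGCCCTT",
    60.108614558536146,
    59.45530340378093,
)
print("\n".join(rows))
```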

## Generating primers for PCR validation of construct
To validate the inserts with PCR, we need primers annealing to known sequences on each promoter and terminator. This way, we can easily verify whether the assembly was successful by combining the products with restriction enzymes and running them on a gel. We can then assess whether the observed lengths match our expectations.
### Full construct validation
Each promoter and terminator needs its own primer. We design the primers so that they anneal at different distances from the end of the promoter. This allows us to identify which promoter or terminator was inserted by checking the length of the resulting bands. Furthermore, we need both forward and reverse primers on each CDS, a forward primer on the upstream PABA, and a reverse primer on the downstream PABA. With this set of primers we can verify each step of the final assembly. Please see the figure below:

![alt text](primers.png)

Note that we will also need reverse primers on promoters and forward primers on terminators to verify assembly of the PABA ends. Also note that the figure shows the CDS primers near the middle. In reality they are placed near the ends to keep the resulting products small, which makes the length differences distinguishable.
### Primers for promoters
Each promoter needs a forward primer annealing at a varying position. The reverse primer does not need to vary in position, as the promoter can be identified via the product between promoter and CDS.

In [8]:
from src.utils import create_end_primers

# generate the primers we need for promoters  
primers = create_end_primers("data/promoter_terminator_library/promoter_library.fasta", out_fasta="data/primer_design/promoters_end_primers.fasta", primer_len=30, first_offset=20, step=30, reverse_complement=False)
total_records = len(primers)
print(f"N primers: {total_records}")

N primers: 19
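
The staggering idea can be illustrated with a small stand-alone sketch. It is not the implementation of `create_end_primers`: the function `staggered_forward_primers` is hypothetical, and it assumes that the offset is measured from the 3' end of each promoter and grows by `step` for every subsequent record. Each promoter then contributes a different number of bases to the PCR product, which is what makes the bands distinguishable.

```python
def staggered_forward_primers(promoters, primer_len=30, first_offset=20, step=30):
    """Pick one forward primer per promoter at an increasing distance from its end.

    Returns {name: (primer, bases_contributed)}, where bases_contributed is
    the distance from the primer's 5' end to the end of the promoter.
    """
    result = {}
    for i, (name, seq) in enumerate(promoters.items()):
        offset = first_offset + i * step   # gap between primer and promoter end
        end = len(seq) - offset
        start = end - primer_len
        if start < 0:
            raise ValueError(f"{name} is too short for offset {offset}")
        result[name] = (seq[start:end], offset + primer_len)
    return result


# Toy promoters (assumption): arbitrary repeats, not real sequences.
toy_promoters = {
    "P1": "ACGT" * 50,
    "P2": "TTGCA" * 40,
    "P3": "GATTACA" * 30,
}
staggered = staggered_forward_primers(toy_promoters)
for name, (primer, contributed) in staggered.items():
    print(name, contributed, primer)
```

With the default settings the three toy promoters contribute 50, 80 and 110 bases respectively, so products that share the same downstream primer differ in length by 30 bases.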


We also need reverse primers for the promoters. For these we can simply take the reverse complement of the start of each promoter:

In [9]:
read_file = "data/promoter_terminator_library/promoter_library.fasta"
to_file = "data/primer_design/promoters_reverse_primers.fasta"

primers = create_end_primers(read_file, out_fasta=to_file, primer_len=30, first_offset=950, step=30, reverse_complement=True)
print(f"N primers: {len(primers)}")

N primers: 19


### Terminator primers
The above procedure is repeated for the terminators:

In [10]:
read_file = "data/promoter_terminator_library/terminator_library.fasta"
to_file = "data/primer_design/terminators_reverse_primers.fasta"

primers = create_end_primers(read_file, out_fasta=to_file, primer_len=30, first_offset=930, step=-30, reverse_complement=True)
print(f"N reverse primers: {len(primers)}")

#forward primers for terminators
read_file = "data/promoter_terminator_library/terminator_library.fasta"
to_file = "data/primer_design/terminators_forward_primers.fasta"

primers = create_end_primers(read_file, out_fasta=to_file, primer_len=30, first_offset=20, step=30, reverse_complement=False)
print(f"N forward primers: {len(primers)}")

N reverse primers: 9
N forward primers: 9


### PABA primers
We also need a forward primer on the upstream PABA sequence and a reverse primer on the downstream PABA sequence:

In [11]:
read_file = "data/insert_sequences/PABA.fasta"

#reading to seq object
records = list(SeqIO.parse(read_file, "fasta"))
paba_rec_1 = records[0]
paba_up = paba_rec_1.seq
paba_rec = records[1]
paba_down = paba_rec.seq

# we need a forward primer for the upstream PABA:
paba_up_forward_primer = paba_up[-50:-20]  # 30 bases ending 20 bases before the end of the upstream sequence
print("PABA upstream forward primer:", paba_up_forward_primer)

# and a reverse primer for the downstream PABA:
paba_down_reverse_primer = paba_down[20:50].reverse_complement()
print("PABA downstream reverse primer:", paba_down_reverse_primer)

PABA upstream forward primer: TGATAGGTCGTAGCTGGCACACAGAATGAG
PABA downstream reverse primer: GAAGCGACATTTGGGATCAGGAGAGTGGCC
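
The slicing convention used here (a 30-base window set 20 bases in from one end) can be wrapped in a small helper. The sketch below is illustrative only: `inset_window` and `reverse_complement` are hypothetical names, plain strings stand in for Biopython `Seq` objects, and the toy sequence is an arbitrary repeat.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def inset_window(seq: str, inset: int = 20, length: int = 30, from_end: bool = False) -> str:
    """Return a window of `length` bases set `inset` bases in from one end."""
    if len(seq) < inset + length:
        raise ValueError("sequence too short for the requested window")
    if from_end:
        stop = len(seq) - inset
        return seq[stop - length:stop]   # same as seq[-(inset + length):-inset]
    return seq[inset:inset + length]


toy = "ACGTTGCA" * 15  # 120-base toy sequence (assumption)

# Forward primer near the end of an upstream fragment: used as is.
forward = inset_window(toy, from_end=True)
# Reverse primer near the start of a downstream fragment: reverse-complemented.
reverse = reverse_complement(inset_window(toy))
print(forward, reverse)
```

Keeping the inset and length as named parameters makes it harder for the slice and its comment to drift apart.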


## dTomato validation
Finally, we need forward and reverse primers on dTomato (and on the negative control CDS) to validate the dTomato variants:

In [12]:
read_file = "data/insert_sequences/dTomato_non_optimized.fasta"

#reading to seq objects
records = list(SeqIO.parse(read_file, "fasta"))
dTomato_rec = records[0]
dTomato = dTomato_rec.seq
non_fluo_rec = records[1]
non_fluo = non_fluo_rec.seq

# forward and reverse for dTomato
dTomato_forward_primer = dTomato[20:50]
dTomato_reverse_primer = dTomato[-50:-20].reverse_complement()
print("dTomato forward primer:", dTomato_forward_primer)
print("dTomato reverse primer:", dTomato_reverse_primer)
# forward and reverse for non-fluorescent control
non_fluo_forward_primer = non_fluo[20:50]
non_fluo_reverse_primer = non_fluo[-50:-20].reverse_complement()

print("Control forward primer:", non_fluo_forward_primer)
print("Control reverse primer:", non_fluo_reverse_primer)

dTomato forward primer: GGTCATCAAAGAGTTCATGCGCTTCAAGGT
dTomato reverse primer: CGTACAGGAACAGGTGGTGGCGGCCCTCGG
Control forward primer: GCACAGCTCCGACGAGTTCGGAATAGAGAC
Control reverse primer: GGGTCTCCACGTTTCCGGTCAACGGATGCA
