# Assay Design Pipeline

**General requirements:**
- Amplicon length: 200 bp
- Primer and probe length: 18-32 bp, with an optimal size of 24 bp
- Annealing temperature (Ta): 56 C

**Reaction constants:**
- Expected input DNA concentration: 100 nM
- Monovalent Cations: 50 mM
- Divalent Cations: 4.5 mM
- dNTP: 0.8 uM
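
As a rough sketch, the reaction constants above could be collected into one dictionary keyed by the Primer3 tag names that this notebook passes to Primer3 later on. The names `REACTION_CONSTANTS` and `merge_settings` are hypothetical. Primer3 documents `PRIMER_DNA_CONC` in nM and the salt and dNTP tags in mM, so the dNTP value is worth double-checking against the unit listed above.

```python
# Hypothetical container for the reaction constants; the values are taken
# from the list above and passed through unchanged.
REACTION_CONSTANTS = {
    'PRIMER_DNA_CONC': 100.0,        # nM
    'PRIMER_SALT_MONOVALENT': 50.0,  # mM
    'PRIMER_SALT_DIVALENT': 4.5,     # mM
    'PRIMER_DNTP_CONC': 0.8,         # Primer3 reads this value as mM
}

def merge_settings(base, overrides):
    """Return a new settings dict without mutating either input."""
    merged = dict(base)
    merged.update(overrides)
    return merged

# Example: add the optimal oligo size on top of the reaction constants.
settings = merge_settings(REACTION_CONSTANTS, {'PRIMER_OPT_SIZE': 24})
print(settings)
```

Keeping the constants in one place makes it easier to reuse the same values for the design call and for the later Tm and dimer calculations.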

**Quality control:**
- Avoid the recognition sites of the user-specified restriction enzymes ‘CviAII’, ‘FatI’, ‘Hpy188III’, ‘NlaIII’, ‘CviQI’, ‘RsaI’
- Filter out primers and probes that have G/C clamps
- Filter out any oligos with more than 4 consecutive repeated bases
- Remove any primers and probes that form heterodimers with a Tm in the range of 55-70 C
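
The restriction-site rule can be sketched as a lookup from enzyme name to recognition site, where `N` stands for any base. This is a minimal illustration; `RECOGNITION_SITES`, `site_to_regex` and `find_enzymes` are hypothetical names. All three distinct sites (CATG, GTAC, TCNNGA) are palindromic, so scanning one strand is enough.

```python
import re

# Recognition sites for the enzymes listed above (N = any base).
RECOGNITION_SITES = {
    'CviAII': 'CATG',
    'FatI': 'CATG',
    'NlaIII': 'CATG',
    'Hpy188III': 'TCNNGA',
    'CviQI': 'GTAC',
    'RsaI': 'GTAC',
}

def site_to_regex(site):
    """Translate a recognition site into a regex, expanding N to any base."""
    return site.replace('N', '[ACGT]')

def find_enzymes(seq):
    """Return the sorted names of enzymes whose site occurs in seq."""
    seq = seq.upper()
    return sorted(name for name, site in RECOGNITION_SITES.items()
                  if re.search(site_to_regex(site), seq))

print(find_enzymes('AAGAGCTTCTTTGATGTTGCATTA'))  # no site present
print(find_enzymes('AAGTACAA'))                  # contains GTAC
```

An oligo passes this check when `find_enzymes` returns an empty list.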

**Result ranking:** 
- Rank results first by how close the primers and probes are to the optimal size, then by how close their Tm is to the user-specified annealing temperature.
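
One way to read this ranking rule is as a lexicographic sort: size deviations are compared first, and temperature deviations only break ties. The sketch below uses plain Python and made-up candidates; the dictionary layout and the name `rank_key` are assumptions for illustration, not the notebook's data structures.

```python
OPT_SIZE = 24       # optimal oligo length from the requirements above
TARGET_TEMP = 56.0  # user-specified annealing temperature

def rank_key(candidate):
    """Sort key: per-oligo size deviations first, then temperature deviations."""
    size_devs = tuple(abs(len(seq) - OPT_SIZE) for seq in candidate['oligos'])
    temp_devs = tuple(abs(t - TARGET_TEMP) for t in candidate['temps'])
    return size_devs + temp_devs

# Made-up candidates: (forward, reverse, probe) placeholders and temperatures.
candidates = [
    {'name': 'c1', 'oligos': ['A' * 24, 'A' * 23, 'A' * 24], 'temps': [57.0, 58.0, 60.0]},
    {'name': 'c2', 'oligos': ['A' * 24, 'A' * 24, 'A' * 24], 'temps': [59.0, 59.0, 61.0]},
    {'name': 'c3', 'oligos': ['A' * 24, 'A' * 24, 'A' * 24], 'temps': [57.0, 59.0, 61.0]},
]

ranked = sorted(candidates, key=rank_key)
print([c['name'] for c in ranked])
```

Here `c1` ranks last despite its closer temperatures, because its reverse primer is one base off the optimal size.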

**What to return:**
- Forward primer sequence and Tm
- Reverse primer sequence and Tm
- Probe sequence and Tm
- Tm at which these primers and probes will have homodimers
- Amplicon sequence

# Stage I - Prepare the environment and run Primer3 with user-defined parameters.
Environment setup  
--- Ubuntu 18.0.4 LTS 
--- Bioconda + Python 3.8.3 
--- Primer3-py 0.6.1

In [1]:
# Load necessary libraries

import primer3
import re
import pandas as pd
import textwrap
import sys

In [2]:
# Open the FASTA file and read the first 500 bases

import re

class InputOutput:
    def __init__(self, file_loc_name = '', data = ''):
        self.file_loc_name = file_loc_name
        self.data = data
        
    def save_as_file(self, file_loc_name = '', data = ''):
        with open (file_loc_name, 'w') as f:
            f.write(data)
            print('Saved!')
            
    def read_seq(self, file_loc_name = '', data = ''):
        with open(file_loc_name, 'r') as f:
            # Skip the FASTA header line and join the remaining lines.
            ls = ''.join(line.rstrip() for line in f.readlines()[1:])
        # re.sub returns a new string, so the result must be assigned back.
        ls = re.sub('[^a-zA-Z]+', '', ls)
        source_seq = ls[:500]
        print('Source sequence is: \n' + source_seq)
        return source_seq
    
new_io = InputOutput()

try:
    read_sequence = new_io.read_seq('/home/qiime2/Downloads/sequence.fasta')
except IOError as e:
    print('An IOError has occurred. Please locate your sequence file correctly.')
    raise  # Later cells need read_sequence, so do not continue without it.

Source sequence is: 
ATGAGCAGCGGCGCCAACATCACCTATGCCAGCCGCAAGCGGCGCAAGCCGGTGCAGAAAACAGTAAAGCCCGTCCCTGCTGAAGGAATTAAGTCAAACCCTTCTAAACGACACAGAGACCGGCTGAACACAGAGTTAGACCGCCTGGCTAGCCTGCTGCCCTTCCCACAAGATGTTATTAATAAGCTGGACAAACTCTCCGTTCTAAGGCTCAGCGTCAGCTACCTGAGGGCCAAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAAGTAGAGGCCAGGACCAGTGTAGAGCACAAGTCAGAGACTGGCAGGACTTGCAAGAAGGAGAGTTCTTGTTACAGGCGCTGAATGGCTTTGTTCTGGTTGTCACGGCAGATGCCTTGGTCTTCTATGCGTCTTCCACTATCCAAGATTACCTGGGCTTTCAGCAATCTGATGTCATACATCAGAGCGTGTATGAGCTTATCCATACAGAAGACCGAGCTGA
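
The reader above assumes a single-record FASTA file. As a hedged alternative, the sketch below stops at the second header so that only the first record is used; `read_first_record` is a hypothetical helper, and `io.StringIO` stands in for a real file so the example is self-contained.

```python
import io
import re

def read_first_record(handle, max_len=500):
    """Return up to max_len bases from the first FASTA record in handle."""
    chunks = []
    seen_header = False
    for line in handle:
        if line.startswith('>'):
            if seen_header:
                break  # a second header starts the next record
            seen_header = True
            continue
        chunks.append(line.strip())
    # Keep letters only, then truncate to the requested length.
    sequence = re.sub(r'[^A-Za-z]+', '', ''.join(chunks))
    return sequence[:max_len]

fasta = io.StringIO('>seq1 demo\nATGAGC AGC\nGGCGCC\n>seq2\nTTTT\n')
print(read_first_record(fasta, max_len=10))
```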


In [3]:
''' Use Primer3-py, set the parameters and run the assay.
The customer-specified Ta is 56C. Since Primer3-py does not seem to provide a Ta calculation, 
I use Tm = 61C as a starting estimate for Ta = 56C (in practice, Ta is ~5C lower than Tm).
Later, in the result trimming and sorting stage, I will use Ta=0.3xTm_primer+0.7xTm_amplicon-14.9 
to calculate Ta. If another Ta equation is preferred, we can adjust accordingly.
I set the MAX and MIN TM parameters to allow a wider range.
'''

import primer3

class SetParamRunPrimer3:
    
    def set_primer_product_size_range(self, lst = [[]]):
        return lst
    
    def set_primer_opt_size(self, num = 20):
        return num
    
    def set_primer_min_size(self, num = 10):
        return num
    
    def set_primer_max_size(self, num = 50):
        return num
    
    def set_primer_internal_opt_size(self, num = 20):
        return num
    
    def set_primer_internal_min_size(self, num = 10):
        return num
    
    def set_primer_internal_max_size(self, num = 50):
        return num
    
    def set_primer_dna_conc(self, conc = 100.0):
        return conc
    
    def set_primer_salt_momovalent(self, conc = 0.0):
        return conc
    
    def set_primer_salt_divalent(self, conc = 0.0):
        return conc
    
    def set_primer_dntp_conc(self, conc = 0.0):
        return conc
    
    def set_primer_opt_tm(self, flt = 55.0):
        return flt
    
    def set_primer_internal_opt_tm(self, flt = 55.0):
        return flt
           
new_setrun = SetParamRunPrimer3()

# User specified amplicon size = 200bp
user_specified_amplicon_length = 200

primer_product_size_range = new_setrun.set_primer_product_size_range([[200,200]])
primer_opt_size = new_setrun.set_primer_opt_size(24)
primer_min_size = new_setrun.set_primer_min_size(18)
primer_max_size = new_setrun.set_primer_max_size(32)
primer_internal_opt_size = new_setrun.set_primer_internal_opt_size(24)
primer_internal_min_size = new_setrun.set_primer_internal_min_size(18)
primer_internal_max_size = new_setrun.set_primer_internal_max_size(32)
primer_dna_conc = new_setrun.set_primer_dna_conc(100)
primer_salt_momovalent = new_setrun.set_primer_salt_momovalent(50)
primer_salt_divalent = new_setrun.set_primer_salt_divalent(4.5)
primer_dntp_conc = new_setrun.set_primer_dntp_conc(0.8)

# User-specified Ta = 56C
user_specified_Ta = 56.0

# Optimal Tm is temporarily set as user_specified_Ta +5C, which is 61C
primer_opt_tm = new_setrun.set_primer_opt_tm(user_specified_Ta + 5)
primer_internal_opt_tm = new_setrun.set_primer_internal_opt_tm(user_specified_Ta + 5)

primer_prelim = (primer3.bindings.designPrimers(
    {
        'SEQUENCE_ID': 'ASSAY000001',
        'SEQUENCE_TEMPLATE': read_sequence,
        'SEQUENCE_INCLUDED_REGION': [0,len(read_sequence)]
    },
    {
        'PRIMER_TASK': 'pick_pcr_primers_and_hyb_probe',
        'PRIMER_PICK_LEFT_PRIMER': 1,
        'PRIMER_PICK_INTERNAL_OLIGO': 1,
        'PRIMER_PICK_RIGHT_PRIMER': 1,
        'PRIMER_NUM_RETURN': 30,
        'PRIMER_OPT_SIZE': primer_opt_size,
        'PRIMER_MIN_SIZE': primer_min_size,
        'PRIMER_MAX_SIZE': primer_max_size,
        'PRIMER_INTERNAL_OPT_SIZE': primer_internal_opt_size,
        'PRIMER_INTERNAL_MIN_SIZE': primer_internal_min_size,
        'PRIMER_INTERNAL_MAX_SIZE': primer_internal_max_size,
        'PRIMER_OPT_TM': primer_opt_tm,
        'PRIMER_MAX_TM': 72.0,
        'PRIMER_MIN_TM': 50.0,
        'PRIMER_INTERNAL_OPT_TM': primer_internal_opt_tm,
        'PRIMER_INTERNAL_MAX_TM': 72.0,
        'PRIMER_INTERNAL_MIN_TM': 50.0,
        'PRIMER_PRODUCT_SIZE_RANGE': primer_product_size_range,
        'PRIMER_SALT_MONOVALENT': primer_salt_momovalent,
        'PRIMER_SALT_DIVALENT': primer_salt_divalent,
        'PRIMER_DNTP_CONC': primer_dntp_conc,
        'PRIMER_DNA_CONC': primer_dna_conc,
    }))
print(primer_prelim)

{'PRIMER_LEFT_EXPLAIN': 'considered 4515, GC content failed 3, low tm 35, high tm 1403, high hairpin stability 1, ok 3073', 'PRIMER_RIGHT_EXPLAIN': 'considered 4515, low tm 10, high tm 722, ok 3783', 'PRIMER_INTERNAL_EXPLAIN': 'considered 7140, GC content failed 3, low tm 896, high tm 78, high hairpin stability 176, ok 5987', 'PRIMER_PAIR_EXPLAIN': 'considered 22717, unacceptable product size 22681, ok 36', 'PRIMER_LEFT_NUM_RETURNED': 30, 'PRIMER_RIGHT_NUM_RETURNED': 30, 'PRIMER_INTERNAL_NUM_RETURNED': 30, 'PRIMER_PAIR_NUM_RETURNED': 30, 'PRIMER_PAIR_0_PENALTY': 2.2928180392926265, 'PRIMER_LEFT_0_PENALTY': 0.02046254673200565, 'PRIMER_RIGHT_0_PENALTY': 2.272355492560621, 'PRIMER_INTERNAL_0_PENALTY': 0.13477322369681133, 'PRIMER_LEFT_0_SEQUENCE': 'AAGAGCTTCTTTGATGTTGCATTA', 'PRIMER_RIGHT_0_SEQUENCE': 'CCCAGGTAATCTTGGATAGTGGAA', 'PRIMER_INTERNAL_0_SEQUENCE': 'AGGCCAGGACCAGTGTAGAGCACA', 'PRIMER_LEFT_0': (234, 24), 'PRIMER_RIGHT_0': (433, 24), 'PRIMER_INTERNAL_0': (284, 24), 'PRIMER_LEFT_0
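
The docstring of cell In [3] proposes estimating Ta as 0.3 x Tm_primer + 0.7 x Tm_amplicon - 14.9. A minimal worked example of that arithmetic follows; the function name `estimate_ta` and the input temperatures are illustrative assumptions, not values produced by this notebook.

```python
def estimate_ta(tm_primer, tm_amplicon):
    """Ta = 0.3 x Tm_primer + 0.7 x Tm_amplicon - 14.9 (all in degrees C)."""
    return 0.3 * tm_primer + 0.7 * tm_amplicon - 14.9

# Illustrative inputs only.
ta = estimate_ta(tm_primer=61.0, tm_amplicon=85.0)
ta_diff = abs(ta - 56.0)  # distance from the user-specified Ta

print(round(ta, 1))       # 0.3*61.0 + 0.7*85.0 - 14.9 = 62.9
print(round(ta_diff, 1))  # 62.9 - 56.0 = 6.9
```

Because the amplicon term carries the larger weight, a primer Tm set 5C above the target Ta does not by itself guarantee that this estimate lands near the target.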

# Stage II -  Pre-process results.
--- Read from the original result and extract out info based on primer pair number.
--- Further convert it into Pandas dataframe.
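
The regrouping idea can be sketched without pandas: strip the pair number out of each key and collect the values per parameter, ordered by pair number. This is an alternative illustration under stated assumptions rather than the notebook's implementation; `group_by_parameter` is a hypothetical helper, and the sample values are copied from the Primer3 output above.

```python
import re
from collections import defaultdict

def group_by_parameter(flat):
    """Collect numbered Primer3-style keys into {parameter: [value per index]}."""
    indexed = defaultdict(dict)
    for key, value in flat.items():
        match = re.fullmatch(r'(.+?)_(\d+)(_.+)?', key)
        if match is None:
            continue  # summary keys such as *_NUM_RETURNED carry no index
        prefix, idx, suffix = match.group(1), int(match.group(2)), match.group(3) or ''
        indexed[prefix + suffix][idx] = value
    # Order each parameter's values by pair number.
    return {param: [by_idx[i] for i in sorted(by_idx)]
            for param, by_idx in indexed.items()}

flat = {
    'PRIMER_PAIR_NUM_RETURNED': 2,
    'PRIMER_LEFT_0_SEQUENCE': 'AAGAGCTTCTTTGATGTTGCATTA',
    'PRIMER_LEFT_0': (234, 24),
    'PRIMER_LEFT_1_SEQUENCE': 'CAAGAGCTTCTTTGATGTTGCATT',
    'PRIMER_LEFT_1': (233, 24),
}
grouped = group_by_parameter(flat)
print(grouped)
```

Each list in `grouped` can then become one dataframe column, with one row per primer pair.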

In [4]:
# Put the result into a more structured and manipulable dictionary.

import pandas as pd

class PreProcess():
    def __init__(self, results):
        self.results = results
        
    def pre_process(self, results):
        # 'results' is the flat dictionary returned by Primer3 (named to avoid shadowing the built-in dict).
        initial_dict = {}
        for pair_id in range(results['PRIMER_PAIR_NUM_RETURNED']):
            primer_id = str(pair_id)
            for key in results:
                # Keep only keys whose sole digits are this pair number, e.g. 'PRIMER_LEFT_0_SEQUENCE' for pair 0.
                if primer_id in key and not re.search(r'\d', key.replace(primer_id, '', 1)):
                    # Strip the pair number so values from all pairs share one column name.
                    param = re.sub(r'_([\d]+)', '', key)
                    initial_dict.setdefault(param, []).append(results[key])
    
        # Convert to Pandas dataframe to facilitate following steps.
        primer = pd.DataFrame.from_dict(initial_dict,orient="index").T
        return primer
    
new_preprocess = PreProcess(primer_prelim)
df = new_preprocess.pre_process(primer_prelim)

In [5]:
df

Unnamed: 0,PRIMER_PAIR_PENALTY,PRIMER_LEFT_PENALTY,PRIMER_RIGHT_PENALTY,PRIMER_INTERNAL_PENALTY,PRIMER_LEFT_SEQUENCE,PRIMER_RIGHT_SEQUENCE,PRIMER_INTERNAL_SEQUENCE,PRIMER_LEFT,PRIMER_RIGHT,PRIMER_INTERNAL,...,PRIMER_RIGHT_SELF_END_TH,PRIMER_INTERNAL_SELF_END_TH,PRIMER_LEFT_HAIRPIN_TH,PRIMER_RIGHT_HAIRPIN_TH,PRIMER_INTERNAL_HAIRPIN_TH,PRIMER_LEFT_END_STABILITY,PRIMER_RIGHT_END_STABILITY,PRIMER_PAIR_COMPL_ANY_TH,PRIMER_PAIR_COMPL_END_TH,PRIMER_PAIR_PRODUCT_SIZE
0,2.29282,0.0204625,2.27236,0.134773,AAGAGCTTCTTTGATGTTGCATTA,CCCAGGTAATCTTGGATAGTGGAA,AGGCCAGGACCAGTGTAGAGCACA,"(234, 24)","(433, 24)","(284, 24)",...,0,0,34.93,44.6293,46.6334,1.9,3.53,0.0,0.0,200
1,2.33472,1.33274,1.00198,0.134773,CAAGAGCTTCTTTGATGTTGCATT,CCAGGTAATCTTGGATAGTGGAAG,AGGCCAGGACCAGTGTAGAGCACA,"(233, 24)","(432, 24)","(284, 24)",...,0,0,36.777,44.6293,46.6334,3.56,3.46,0.0,0.0,200
2,2.34484,0.0204625,2.32438,0.134773,AAGAGCTTCTTTGATGTTGCATTA,CCCAGGTAATCTTGGATAGTGG,AGGCCAGGACCAGTGTAGAGCACA,"(234, 24)","(433, 22)","(284, 24)",...,0,0,34.93,44.6293,46.6334,1.9,4.0,0.0,0.0,200
3,2.37977,1.33274,1.04703,0.134773,CAAGAGCTTCTTTGATGTTGCATT,CCAGGTAATCTTGGATAGTGGAA,AGGCCAGGACCAGTGTAGAGCACA,"(233, 24)","(432, 23)","(284, 24)",...,0,0,36.777,44.6293,46.6334,3.56,3.53,0.0,0.0,200
4,2.75892,0.0204625,2.73846,0.134773,AAGAGCTTCTTTGATGTTGCATTA,CCCAGGTAATCTTGGATAGTGGA,AGGCCAGGACCAGTGTAGAGCACA,"(234, 24)","(433, 23)","(284, 24)",...,0,0,34.93,44.6293,46.6334,1.9,4.02,0.0,0.0,200
5,2.7781,1.77612,1.00198,0.134773,CAAGAGCTTCTTTGATGTTGCAT,CCAGGTAATCTTGGATAGTGGAAG,AGGCCAGGACCAGTGTAGAGCACA,"(233, 23)","(432, 24)","(284, 24)",...,0,0,36.777,44.6293,46.6334,3.96,3.46,0.0,0.0,200
6,2.82315,1.77612,1.04703,0.134773,CAAGAGCTTCTTTGATGTTGCAT,CCAGGTAATCTTGGATAGTGGAA,AGGCCAGGACCAGTGTAGAGCACA,"(233, 23)","(432, 23)","(284, 24)",...,0,0,36.777,44.6293,46.6334,3.96,3.53,0.0,0.0,200
7,2.93188,0.0204625,2.91141,0.134773,AGAGCTTCTTTGATGTTGCATTAA,GCCCAGGTAATCTTGGATAGTG,AGGCCAGGACCAGTGTAGAGCACA,"(235, 24)","(434, 22)","(284, 24)",...,0,0,0.0,44.6293,46.6334,1.4,2.74,0.0,0.0,200
8,3.06664,2.91395,0.152691,0.134773,CCAAGAGCTTCTTTGATGTTGCAT,CAGGTAATCTTGGATAGTGGAAGA,AGGCCAGGACCAGTGTAGAGCACA,"(232, 24)","(431, 24)","(284, 24)",...,0,0,36.777,0.0,46.6334,3.96,2.87,6.33944,1.79767,200
9,3.08282,1.07941,2.00341,0.134773,AGCTTCTTTGATGTTGCATTAAAA,AAGCCCAGGTAATCTTGGATAG,AGGCCAGGACCAGTGTAGAGCACA,"(237, 24)","(436, 22)","(284, 24)",...,0,0,0.0,44.6293,46.6334,1.52,2.08,0.0,0.0,200


# Stage III - Trim results by filtering (quality control).

In [6]:
''' Apply 4 methods to remove unwanted oligos, using regular expression rules and Primer3 APIs.
* filter_RE: discards oligos containing restriction enzyme recognition sites, using regular expression rules.
The recognition sites for the user-defined enzymes are CATG|GTAC|TCNNGA. 
A module could be written to crawl the NEB site for the corresponding enzymes and their recognition sites.
* filter_GC_clamps: filters out oligos with a run of >=4 G/Cs within the last 5 bases at the 3' end, using regular expression rules 
(adjustable; personally I think <=3 G/Cs improves specific binding, while too many make the binding too strong).
* filter_multi_cons: removes oligos with >3 consecutive repeated bases, using regular expression rules.
* filter_hetero_dimer: discards oligo pairs whose heterodimer Tm lies between 55C and 70C.
'''

import re

class QualityControl:

    def filter_RE(self, seq):
        return re.search(r'CATG|GTAC|TC\w{2}GA', seq)

    def filter_GC_clamps(self, seq):
        return re.search(r'[GC]{4}', seq[-5:])    

    def filter_multi_cons(self, seq):
        return re.search(r'(.)\1\1\1', seq)

    def filter_hetero_dimer(self, seq1, seq2):
        return 55 < primer3.calcHeterodimer(seq1, seq2, mv_conc=primer_salt_momovalent, 
                                                       dv_conc=primer_salt_divalent, dntp_conc=primer_dntp_conc, 
                                                       dna_conc=primer_dna_conc, temp_c=37, max_loop=30).tm < 70

    def filtering(self, dataframe):
        
        # Drop the columns with non-critical info to reduce data size and expedite processing.
        dataframe.drop(dataframe.columns[[0,1,2,3,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29]], axis = 1, inplace = True)

        # Mark the rows containing filtered targets (use the argument, not the global df).
        marked_idx = []
        for idx in range(dataframe.shape[0]):
            primer_left, primer_right, probe = dataframe.at[idx, 'PRIMER_LEFT_SEQUENCE'], dataframe.at[idx, 'PRIMER_RIGHT_SEQUENCE'], dataframe.at[idx, 'PRIMER_INTERNAL_SEQUENCE']

            # Skip to the next row once this one is marked; a 'break' here would stop scanning all remaining rows.
            if self.filter_hetero_dimer(primer_left, primer_right) or self.filter_hetero_dimer(probe, primer_right):
                marked_idx.append(idx)
                continue

            for seq in (primer_left, primer_right, probe): 
                if self.filter_RE(seq) or self.filter_GC_clamps(seq) or self.filter_multi_cons(seq):
                    marked_idx.append(idx)
                    break   

        # Delete those rows containing filtered targets.           
        dataframe.drop(index = marked_idx, inplace=True)
            
        # Reset index.
        dataframe.reset_index(drop=True, inplace=True)    
        return dataframe
    
new_QC = QualityControl()
df = new_QC.filtering(df)
df

Unnamed: 0,PRIMER_LEFT_SEQUENCE,PRIMER_RIGHT_SEQUENCE,PRIMER_INTERNAL_SEQUENCE,PRIMER_LEFT,PRIMER_RIGHT,PRIMER_INTERNAL
0,AAGAGCTTCTTTGATGTTGCATTA,CCCAGGTAATCTTGGATAGTGGAA,AGGCCAGGACCAGTGTAGAGCACA,"(234, 24)","(433, 24)","(284, 24)"
1,CAAGAGCTTCTTTGATGTTGCATT,CCAGGTAATCTTGGATAGTGGAAG,AGGCCAGGACCAGTGTAGAGCACA,"(233, 24)","(432, 24)","(284, 24)"
2,AAGAGCTTCTTTGATGTTGCATTA,CCCAGGTAATCTTGGATAGTGG,AGGCCAGGACCAGTGTAGAGCACA,"(234, 24)","(433, 22)","(284, 24)"
3,CAAGAGCTTCTTTGATGTTGCATT,CCAGGTAATCTTGGATAGTGGAA,AGGCCAGGACCAGTGTAGAGCACA,"(233, 24)","(432, 23)","(284, 24)"
4,AAGAGCTTCTTTGATGTTGCATTA,CCCAGGTAATCTTGGATAGTGGA,AGGCCAGGACCAGTGTAGAGCACA,"(234, 24)","(433, 23)","(284, 24)"
5,CAAGAGCTTCTTTGATGTTGCAT,CCAGGTAATCTTGGATAGTGGAAG,AGGCCAGGACCAGTGTAGAGCACA,"(233, 23)","(432, 24)","(284, 24)"
6,CAAGAGCTTCTTTGATGTTGCAT,CCAGGTAATCTTGGATAGTGGAA,AGGCCAGGACCAGTGTAGAGCACA,"(233, 23)","(432, 23)","(284, 24)"
7,AGAGCTTCTTTGATGTTGCATTAA,GCCCAGGTAATCTTGGATAGTG,AGGCCAGGACCAGTGTAGAGCACA,"(235, 24)","(434, 22)","(284, 24)"
8,CCAAGAGCTTCTTTGATGTTGCAT,CAGGTAATCTTGGATAGTGGAAGA,AGGCCAGGACCAGTGTAGAGCACA,"(232, 24)","(431, 24)","(284, 24)"
9,CCAAGAGCTTCTTTGATGTTGC,CAGGTAATCTTGGATAGTGGAAGA,AGGCCAGGACCAGTGTAGAGCACA,"(232, 22)","(431, 24)","(284, 24)"
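
The sequence-based checks are easy to try in isolation. The sketch below contrasts two readings of the G/C clamp rule: counting G/C bases in the last five positions versus requiring four of them in a row, which is what the `[GC]{4}` pattern in the cell above tests. All function names here are hypothetical.

```python
import re

def has_homopolymer_run(seq, run_length=4):
    """True if any base repeats run_length or more times in a row."""
    pattern = r'(.)\1{%d}' % (run_length - 1)
    return re.search(pattern, seq) is not None

def gc_count_3prime(seq, window=5):
    """Number of G/C bases among the last `window` bases."""
    return sum(base in 'GC' for base in seq[-window:].upper())

def has_consecutive_gc_clamp(seq, window=5, run_length=4):
    """True if the 3' window contains run_length consecutive G/C bases."""
    return re.search(r'[GC]{%d}' % run_length, seq[-window:].upper()) is not None

# 'GGAGG' at the 3' end has four G/Cs in five bases, but no run of four.
print(gc_count_3prime('ACGTGGAGG'), has_consecutive_gc_clamp('ACGTGGAGG'))
print(has_homopolymer_run('AGCTTCTTTTGA'))              # contains TTTT
print(has_homopolymer_run('AAGAGCTTCTTTGATGTTGCATTA'))  # longest run is TTT
```

Which reading is intended depends on the assay requirements; the count-based check is the stricter of the two.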


# Stage IV - (generate new columns and) Rank results.

In [7]:
'''Generate the following extra columns for sorting and ranking:
1. Tm for each oligo.
2. Homodimer Tm for each oligo.
3. Temperature difference between the predicted Ta and the user-designated Ta (56C) for each oligo
(Ta=0.3xTm_primer+0.7xTm_amplicon-14.9).
4. 200bp amplicon sequence for each predicted PCR product.
5. Difference between the optimal size (24bp) and the predicted size for each oligo
(used for sorting the results).
'''

class GenNewCol:

    def __init__(self):
        self.lst_Tm_amplicon = []
        
    # Generate the following lists first so that they can readily become columns of the target dataframe.
    
    def gen_lst_amplicon(self, dataframe, idx_col):
        lst_amplicon = []
        for row in range(dataframe.shape[0]):
            lst_amplicon.append(read_sequence[dataframe.iat[row,idx_col][0]:dataframe.iat[row,idx_col][0]+user_specified_amplicon_length])
            self.lst_Tm_amplicon.append(primer3.calcTm(lst_amplicon[row], mv_conc=primer_salt_momovalent, 
                                                       dv_conc=primer_salt_divalent, dntp_conc=primer_dntp_conc, 
                                                       dna_conc=primer_dna_conc, max_nn_length=60, tm_method='santalucia', salt_corrections_method='santalucia'))
        return lst_amplicon

    def gen_lst_oligo_len_diff(self, dataframe, idx_col):
        lst_oligo_len_diff = []
        for row in range(dataframe.shape[0]):
            lst_oligo_len_diff.append(abs(dataframe.iat[row,idx_col][1]- primer_opt_size)) 
        return lst_oligo_len_diff
    
    def gen_lst_others(self, dataframe, idx_col):
        lst_Tm_oligo, lst_Tm_Homo_oligo, lst_Ta, lst_Ta_diff= [], [], [], []
        for row in range(dataframe.shape[0]):
            lst_Tm_oligo.append(primer3.calcTm(dataframe.iat[row,idx_col], mv_conc=primer_salt_momovalent, 
                                                       dv_conc=primer_salt_divalent, dntp_conc=primer_dntp_conc, 
                                                       dna_conc=primer_dna_conc, max_nn_length=60, tm_method='santalucia', salt_corrections_method='santalucia'))
            lst_Tm_Homo_oligo.append(primer3.calcHomodimer(dataframe.iat[row,idx_col], mv_conc=primer_salt_momovalent, 
                                                       dv_conc=primer_salt_divalent, dntp_conc=primer_dntp_conc, 
                                                       dna_conc=primer_dna_conc, temp_c=37, max_loop=30).tm)
            lst_Ta.append(0.3*lst_Tm_oligo[row] + 0.7*self.lst_Tm_amplicon[row] - 14.9)
            lst_Ta_diff.append(abs(lst_Ta[row] - user_specified_Ta))
        return [lst_Tm_oligo, lst_Tm_Homo_oligo, lst_Ta_diff]
    

    # This is the method to generate new columns.
    def gen_new_col(self, dataframe, col_name, lst):
        dataframe[col_name] = pd.Series(lst)
        return dataframe
    
    # Swap the values of the tuples (oligo_start, oligo_len) for each oligo so that the oligo length can be easily seen.
    def tuple_swap(self, dataframe, idx_col):
        for row in range(dataframe.shape[0]):
            lst_tuple_swap = []
            lst_tuple_swap.append(dataframe.iat[row,idx_col][1])
            lst_tuple_swap.append(dataframe.iat[row,idx_col][0])
            dataframe.iat[row,idx_col] = tuple(lst_tuple_swap)
        return dataframe
    
    # Sort and re-index. Sorting is based on the differences between the predicted oligo length/Ta 
    # and the expected values (24bp and Ta=56C, respectively).
    def rank(self, dataframe):
        sort_cols = ['PRIMER_LEFT_LENGTH_DIFF','PRIMER_RIGHT_LENGTH_DIFF','PROBE_LENGTH_DIFF','PRIMER_LEFT_Ta_DIFF','PRIMER_RIGHT_Ta_DIFF','PROBE_Ta_DIFF']
        df = dataframe.sort_values(by=sort_cols, ascending=True).reset_index(drop=True)
        return df
    
gnc = GenNewCol()

# Generate column 'AMPLICON'
gnc.gen_new_col(df, 'AMPLICON', gnc.gen_lst_amplicon(df, 3))

# Generate the following columns   
gnc.gen_new_col(df, 'PRIMER_LEFT_LENGTH_DIFF', gnc.gen_lst_oligo_len_diff(df, 3))
gnc.gen_new_col(df, 'PRIMER_RIGHT_LENGTH_DIFF', gnc.gen_lst_oligo_len_diff(df, 4))
gnc.gen_new_col(df, 'PROBE_LENGTH_DIFF', gnc.gen_lst_oligo_len_diff(df, 5))

lst_lst_LEFT = gnc.gen_lst_others(df, 0)
gnc.gen_new_col(df, 'PRIMER_LEFT_Tm', lst_lst_LEFT[0])
gnc.gen_new_col(df, 'PRIMER_LEFT_Tm_Homo', lst_lst_LEFT[1])
gnc.gen_new_col(df, 'PRIMER_LEFT_Ta_DIFF', lst_lst_LEFT[2])

lst_lst_RIGHT = gnc.gen_lst_others(df, 1)
gnc.gen_new_col(df, 'PRIMER_RIGHT_Tm', lst_lst_RIGHT[0])
gnc.gen_new_col(df, 'PRIMER_RIGHT_Tm_Homo', lst_lst_RIGHT[1])
gnc.gen_new_col(df, 'PRIMER_RIGHT_Ta_DIFF', lst_lst_RIGHT[2])

lst_lst_PROBE = gnc.gen_lst_others(df, 2)
gnc.gen_new_col(df, 'PROBE_Tm', lst_lst_PROBE[0])
gnc.gen_new_col(df, 'PROBE_Tm_Homo', lst_lst_PROBE[1])
gnc.gen_new_col(df, 'PROBE_Ta_DIFF', lst_lst_PROBE[2])
    
# Swap values of (oligo_start, oligo_len) tuples so that sizes of oligos are more visible.
gnc.tuple_swap(df, 3)
gnc.tuple_swap(df, 4)
gnc.tuple_swap(df, 5)

result = gnc.rank(df)
result

Unnamed: 0,PRIMER_LEFT_SEQUENCE,PRIMER_RIGHT_SEQUENCE,PRIMER_INTERNAL_SEQUENCE,PRIMER_LEFT,PRIMER_RIGHT,PRIMER_INTERNAL,AMPLICON,PRIMER_LEFT_LENGTH_DIFF,PRIMER_RIGHT_LENGTH_DIFF,PROBE_LENGTH_DIFF,PRIMER_LEFT_Tm,PRIMER_LEFT_Tm_Homo,PRIMER_LEFT_Ta_DIFF,PRIMER_RIGHT_Tm,PRIMER_RIGHT_Tm_Homo,PRIMER_RIGHT_Ta_DIFF,PROBE_Tm,PROBE_Tm_Homo,PROBE_Ta_DIFF
0,AAGAGCTTCTTTGATGTTGCATTA,CCCAGGTAATCTTGGATAGTGGAA,AGGCCAGGACCAGTGTAGAGCACA,"(24, 234)","(24, 433)","(24, 284)",AAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAA...,0,0,0,61.020463,-27.722902,10.153455,63.272355,1.178132,10.829023,71.038005,8.457218,13.158718
1,CAAGAGCTTCTTTGATGTTGCATT,CCAGGTAATCTTGGATAGTGGAAG,AGGCCAGGACCAGTGTAGAGCACA,"(24, 233)","(24, 432)","(24, 284)",CAAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGA...,0,0,0,62.332739,-27.722902,10.547138,62.001977,-33.79308,10.44791,71.038005,8.457218,13.158718
2,CCAAGAGCTTCTTTGATGTTGCAT,CAGGTAATCTTGGATAGTGGAAGA,AGGCCAGGACCAGTGTAGAGCACA,"(24, 232)","(24, 431)","(24, 284)",CCAAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAG...,0,0,0,63.913946,10.780594,11.021501,61.152691,-38.878269,10.193124,71.038005,8.457218,13.158718
3,AAGAGCTTCTTTGATGTTGCATTA,CCCAGGTAATCTTGGATAGTGGA,AGGCCAGGACCAGTGTAGAGCACA,"(24, 234)","(23, 433)","(24, 284)",AAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAA...,0,1,0,61.020463,-27.722902,10.153455,62.738458,1.178132,10.668854,71.038005,8.457218,13.158718
4,CAAGAGCTTCTTTGATGTTGCATT,CCAGGTAATCTTGGATAGTGGAA,AGGCCAGGACCAGTGTAGAGCACA,"(24, 233)","(23, 432)","(24, 284)",CAAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGA...,0,1,0,62.332739,-27.722902,10.547138,61.047034,-33.79308,10.161427,71.038005,8.457218,13.158718
5,GAGCTTCTTTGATGTTGCATTAAA,AGCCCAGGTAATCTTGGATAGT,AGGCCAGGACCAGTGTAGAGCACA,"(24, 236)","(22, 435)","(24, 284)",GAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAAGT...,0,2,0,60.285313,-27.722902,10.076411,62.103388,1.178132,10.621833,71.038005,8.457218,13.302218
6,AAGAGCTTCTTTGATGTTGCATTA,CCCAGGTAATCTTGGATAGTGG,AGGCCAGGACCAGTGTAGAGCACA,"(24, 234)","(22, 433)","(24, 284)",AAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAA...,0,2,0,61.020463,-27.722902,10.153455,61.324379,1.178132,10.24463,71.038005,8.457218,13.158718
7,AGAGCTTCTTTGATGTTGCATTAA,GCCCAGGTAATCTTGGATAGTG,AGGCCAGGACCAGTGTAGAGCACA,"(24, 235)","(22, 434)","(24, 284)",AGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAAG...,0,2,0,61.020463,-27.722902,10.296955,61.911414,1.178132,10.564241,71.038005,8.457218,13.302218
8,AGAGCTTCTTTGATGTTGCATTAA,GCCCAGGTAATCTTGGATAGT,AGGCCAGGACCAGTGTAGAGCACA,"(24, 235)","(21, 434)","(24, 284)",AGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAAG...,0,3,0,61.020463,-27.722902,10.296955,60.586664,1.178132,10.166816,71.038005,8.457218,13.302218
9,AAGAGCTTCTTTGATGTTGCATT,CCCAGGTAATCTTGGATAGTGGAA,AGGCCAGGACCAGTGTAGAGCACA,"(23, 234)","(24, 433)","(24, 284)",AAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAA...,1,0,0,61.161851,-27.722902,10.195872,63.272355,1.178132,10.829023,71.038005,8.457218,13.158718
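
The amplicon column relies on Primer3's (start, length) tuples. In the results shown here, the left tuple gives the 0-based start of the forward primer and the right tuple gives the position of the reverse primer's 5' end, which is the last base of the product on the template strand. For the first pair, 433 - 234 + 1 = 200 bp. The sketch below checks that arithmetic on a dummy template; the helper names are hypothetical.

```python
def amplicon_from_coords(template, left, right):
    """Slice the product spanned by Primer3-style (start, length) tuples."""
    return template[left[0]:right[0] + 1]

def reverse_complement(seq):
    """Reverse-complement an uppercase DNA sequence."""
    return seq.translate(str.maketrans('ACGT', 'TGCA'))[::-1]

# Dummy 500-base template, used only to check the coordinate arithmetic.
template = 'ACGT' * 125
amplicon = amplicon_from_coords(template, left=(234, 24), right=(433, 24))
print(len(amplicon))

# The reverse primer is the reverse complement of the product's last bases.
# The 24 bases below are the tail of the first reported amplicon.
print(reverse_complement('TTCCACTATCCAAGATTACCTGGG'))
```

When the product size is fixed at 200 bp, slicing `start:start + 200` as the notebook does gives the same result as slicing up to the right primer position.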


# Stage V - Report results.
--- Output assay result.

In [8]:
''' Print required contents including oligo sequences, corresponding Tm and Tm_homodimers, and amplicon sequences.
Alternatively, could use sys.stdout to log the report.
'''

# import sys

# class Logger:
#     def __init__(self, filename='default.log'):
#         self.terminal = sys.stdout
#         self.log = open(filename, 'a')

#     def write(self, msg):
#         self.terminal.write(msg)
#         self.log.write(msg)

#     def flush(self):
#         pass

# sys.stdout = Logger('Assay design report.txt')

import textwrap

class Report:
    def report(self, dataframe):
        row_num = dataframe.shape[0]
        print(f'We found {row_num} groups of oligos. \n')
        for row in range(row_num):
            print(f'# {row+1}')
            print(f'Forward - {dataframe.iat[row, 0]}'
                  f'  Tm={round(dataframe.iat[row, 10], 1)}'
                  f'  Tm_homodimer={round(dataframe.iat[row, 11], 1)}')
            print(f'Reverse - {dataframe.iat[row, 1]}'
                  f'  Tm={round(dataframe.iat[row, 13], 1)}'
                  f'  Tm_homodimer={round(dataframe.iat[row, 14], 1)}')
            print(f'Probe   - {dataframe.iat[row, 2]}'
                  f'  Tm={round(dataframe.iat[row, 16], 1)}'
                  f'  Tm_homodimer={round(dataframe.iat[row, 17], 1)}')
            # Report the actual amplicon length instead of a hard-coded label.
            amplicon = dataframe.iat[row, 6]
            print(f'Amplicon sequence ({len(amplicon)}bp) - \n' + textwrap.fill(amplicon, width=50) + '\n')
            
result_report = Report()
result_report.report(result)

We found 22 groups of oligos. 

# 1
Forward - AAGAGCTTCTTTGATGTTGCATTA  Tm=61.0  Tm_homodimer=-27.7
Reverse - CCCAGGTAATCTTGGATAGTGGAA  Tm=63.3  Tm_homodimer=1.2
Probe   - AGGCCAGGACCAGTGTAGAGCACA  Tm=71.0  Tm_homodimer=8.5
Amplicon sequence (200bp) - 
AAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAAGTAG
AGGCCAGGACCAGTGTAGAGCACAAGTCAGAGACTGGCAGGACTTGCAAG
AAGGAGAGTTCTTGTTACAGGCGCTGAATGGCTTTGTTCTGGTTGTCACG
GCAGATGCCTTGGTCTTCTATGCGTCTTCCACTATCCAAGATTACCTGGG

# 2
Forward - CAAGAGCTTCTTTGATGTTGCATT  Tm=62.3  Tm_homodimer=-27.7
Reverse - CCAGGTAATCTTGGATAGTGGAAG  Tm=62.0  Tm_homodimer=-33.8
Probe   - AGGCCAGGACCAGTGTAGAGCACA  Tm=71.0  Tm_homodimer=8.5
Amplicon sequence (200bp) - 
CAAGAGCTTCTTTGATGTTGCATTAAAATCCACCCCGGCTGACAGAAGTA
GAGGCCAGGACCAGTGTAGAGCACAAGTCAGAGACTGGCAGGACTTGCAA
GAAGGAGAGTTCTTGTTACAGGCGCTGAATGGCTTTGTTCTGGTTGTCAC
GGCAGATGCCTTGGTCTTCTATGCGTCTTCCACTATCCAAGATTACCTGG

# 3
Forward - CCAAGAGCTTCTTTGATGTTGCAT  Tm=63.9  Tm_homodimer=10.8
Reverse - CAGGTAATCTTGGATAGTGGAAGA  Tm=61.2  Tm