As of 2019, Illumina will discontinue all ordering of TruSeq Custom Amplicon (TSCA) reagents and materials. As of March 2018 they have also discontinued access to their probe design algorithm. So here I will recreate the TSCA probe design from scratch, removing any need to involve Illumina in the process.

In [25]:
# make editing code easier
%reload_ext autoreload
%autoreload 2

The primer3 docs are [here](https://libnano.github.io/primer3-py/quickstart.html#installation).

In [2]:
import primer3
from Bio import Seq

primer3 can calculate Tm if passed a particular probe sequence.

In [5]:
primer3.calcTm('GTAAAACGACGGCCAGT')

49.16808228911765

It can also calculate the heterodimer thermodynamics of two sequences.

In [8]:
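# Heterodimer thermodynamics for a pair of oligos
primer3.calcHeterodimer('CAACGTGGAATGTGCCCTGGTAGCAGAA', 'TGTTATGGTCCAGGAATGTGACATGGGTTG')
# Homodimer and hairpin thermodynamics for a single oligo
x=primer3.bindings.calcHomodimer('GTAAAACGACGGCCAGT')
y=primer3.bindings.calcHairpin('GTAAAACGACGGCCAGT')
# 3' end stability of the first sequence against the second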
z=primer3.bindings.calcEndStability('GTAAAACGACGGCCAGT','CTGTAACTCTGTGAAAATCAGTGTTTAAAATGTGTGACAAAAAGCAATAAAATCATGTTGATCGGCATACAAGAGATCAACGTGGAATGTGCCCTGGTAGCAGAAACAGGGTGGAGGAAAGTTGGAATTCACAAACATGTTTATAGATCTCTGGTTTTCTAAGTCCAGTTAGAAGATATTCAACCCATGTCACATTCCTGGACCATAACATTGCTCTGATGTTGATCTAGAAGCTGCCATCTATTGTACAGTTGAATCCGTCTATGGTAACTAGGCTAATCAATCAAGGAGGAAAATCAAGACAGGGAGCTTGTGAGAGTGGATGTGGTTTCTGGTCACAAGGCTTCCAGG')

If primer3 is passed a template sequence, it will attempt to return sets of internal, left, and right oligos, up to the number requested with `PRIMER_NUM_RETURN`.

In [19]:
x='GCTTGCATGCCTGCAGGTCGACTCTAGAGGATCCCCCTACATTTTAGCATCAGTGAGTACAGCATGCTTACTGGAAGAGAGGGTCATGCAACAGATTAGGAGGTAAGTTTGCAAAGGCAGGCTAAGGAGGAGACGCACTGAATGCCATGGTAAGAACTCTGGACATAAAAATATTGGAAGTTGTTGAGCAAGTNAAAAAAATGTTTGGAAGTGTTACTTTAGCAATGGCAAGAATGATAGTATGGAATAGATTGGCAGAATGAAGGCAAAATGATTAGACATATTGCATTAAGGTAAAAAATGATAACTGAAGAATTATGTGCCACACTTATTAATAAGAAAGAATATGTGAACCTTGCAGATGTTTCCCTCTAGTAG'
primers = primer3.bindings.designPrimers(
    {
        'SEQUENCE_ID': 'MH1000',
        'SEQUENCE_TEMPLATE': x,
        'SEQUENCE_INCLUDED_REGION': [36,342]
    },
    {
        'PRIMER_OPT_SIZE': 27, 
        'PRIMER_PICK_INTERNAL_OLIGO': 1,
        'PRIMER_INTERNAL_MAX_SELF_END': 8,
        'PRIMER_MIN_SIZE': 22, 
        'PRIMER_MAX_SIZE': 30, 
        'PRIMER_OPT_TM': 69.9,
        'PRIMER_MIN_TM': 66.7,
        'PRIMER_MAX_TM': 73.2,
        'PRIMER_MIN_GC': 40.4,
        'PRIMER_MAX_GC': 59.6,
        'PRIMER_MAX_POLY_X': 100,
        'PRIMER_INTERNAL_MAX_POLY_X': 100,
        'PRIMER_SALT_MONOVALENT': 50.0,
        'PRIMER_DNA_CONC': 50.0,
        'PRIMER_MAX_NS_ACCEPTED': 0,
        'PRIMER_MAX_SELF_ANY': 12, 
        'PRIMER_MAX_SELF_END': 8,
        'PRIMER_PAIR_MAX_COMPL_ANY': 12, 
        'PRIMER_PAIR_MAX_COMPL_END': 8,
        'PRIMER_NUM_RETURN': 20
    })


The probes that I have designed for the original hematopoietic FERMI panel are [here](https://docs.google.com/spreadsheets/d/1VtoJPKQnmHPC3fwBRYzlXbNGHJrcByrO7C9IgmI79WY/edit#gid=0).

The probe designer will try to build probes whose GC content and melting temperature fall within the ranges of the probes designed for the hematopoietic panel, which are listed below.

Up GC: 45.3-59.6%  
Up Tm: 67.7-72.9 °C  
Down GC: 40.4-59.2%  
Down Tm: 66.7-73.2 °C  

The scheme for the two probe designs is as follows:  
Up  
[Binding] [UMI] [Targetarm]  
Down  
[Targetarm] [UMI] [Binding]  

UMI: NNNNNN  
Up Binding: CAACGATCGTCGAAATTCGC  
Down Binding: AGATCGGAAGAGCGTCGTGTA  
Targetarm len: 22-30bp
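
The scheme above can be sketched as a small helper that joins the pieces in the order they are written. This is only an illustration: the name `build_probe` is hypothetical, and it assumes the segments are simply concatenated left to right exactly as the scheme shows.

```python
# Constants copied from the scheme above.
UMI = "NNNNNN"
UP_BINDING = "CAACGATCGTCGAAATTCGC"
DOWN_BINDING = "AGATCGGAAGAGCGTCGTGTA"
ARM_MIN, ARM_MAX = 22, 30  # allowed target arm length in bp


def build_probe(target_arm, side):
    """Assemble a full probe from a target arm (hypothetical helper).

    Assumes plain left-to-right concatenation in the order of the scheme.
    """
    arm = target_arm.upper()
    if not ARM_MIN <= len(arm) <= ARM_MAX:
        raise ValueError(f"target arm must be {ARM_MIN}-{ARM_MAX} bp, got {len(arm)}")
    if side == "up":
        # [Binding] [UMI] [Targetarm]
        return UP_BINDING + UMI + arm
    if side == "down":
        # [Targetarm] [UMI] [Binding]
        return arm + UMI + DOWN_BINDING
    raise ValueError("side must be 'up' or 'down'")
```

Calling `build_probe(arm, "up")` on a 22 bp arm gives a 48-base probe (20 + 6 + 22), and the down probe is one base longer because its binding site is 21 bases.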


All of the necessary sites can be entered below and used to design probes.

I now realize that the previous way of going about this will be cumbersome at best. Instead, I am going to write a probe designer by hand, using primer3 only for the Tm, dimerization, and related calculations, while choosing the oligos myself.  

For the dimerization calculations, results are given according to the formula ΔG = ΔH − TΔS. A ΔG above -9 kcal/mol is generally acceptable for heterodimerization, and a ΔG above -5 kcal/mol is generally acceptable for homodimerization.
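
As a worked example of the formula and the two thresholds, here is a minimal sketch. The function names and the input values are made up for illustration, and it assumes ΔH is given in kcal/mol, ΔS in cal/(mol·K), and T defaults to 37 °C. If the values come from primer3-py, check the units it reports before comparing against these kcal/mol thresholds.

```python
HETERODIMER_MIN_DG = -9.0  # kcal/mol, threshold quoted above
HOMODIMER_MIN_DG = -5.0    # kcal/mol, threshold quoted above


def delta_g(dh_kcal, ds_cal_per_k, temp_c=37.0):
    """Return dG = dH - T*dS in kcal/mol.

    dh_kcal is in kcal/mol and ds_cal_per_k in cal/(mol*K); the
    temperature is converted from Celsius to Kelvin.
    """
    temp_k = temp_c + 273.15
    return dh_kcal - temp_k * ds_cal_per_k / 1000.0


def dimer_ok(dg_kcal, kind):
    """True if the dimer dG is above the threshold for 'hetero' or 'homo'."""
    if kind == "hetero":
        return dg_kcal > HETERODIMER_MIN_DG
    if kind == "homo":
        return dg_kcal > HOMODIMER_MIN_DG
    raise ValueError("kind must be 'hetero' or 'homo'")
```

With the made-up values ΔH = -50 kcal/mol and ΔS = -140 cal/(mol·K), `delta_g(-50.0, -140.0)` is -50 + 310.15 × 0.140 = -6.579 kcal/mol, which would pass the heterodimer threshold but fail the homodimer one.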

I think the best strategy will be to create oligos that sit 75 bp away on either side of a mutation, then step one base at a time further away, and maybe one base at a time closer as well, until I have a decently sized list of oligos that can be sequentially eliminated based on a number of desirable parameters.  
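
The stepping strategy could look something like the sketch below. It is not the code in `probeDesign`; `candidate_arms` and its `wobble` parameter are hypothetical, and it assumes `site` is a 0-based index into the sequence and that "75 bp away" means the arm base nearest the mutation is 75 positions from it.

```python
def candidate_arms(seq, site, offset=75, wobble=5, min_len=22, max_len=30):
    """Yield (side, start, end, oligo) for arms flanking `site`.

    `start`/`end` are 0-based, end-exclusive. The nearest base of each
    arm lies `dist` positions from the site, with `dist` stepping from
    offset - wobble to offset + wobble (an assumed search range).
    """
    for dist in range(offset - wobble, offset + wobble + 1):
        for length in range(min_len, max_len + 1):
            # Upstream arm: its last base is `dist` positions before the site.
            up_end = site - dist + 1
            up_start = up_end - length
            if up_start >= 0:
                yield ("up", up_start, up_end, seq[up_start:up_end])
            # Downstream arm: its first base is `dist` positions after the site.
            down_start = site + dist
            down_end = down_start + length
            if down_end <= len(seq):
                yield ("down", down_start, down_end, seq[down_start:down_end])
```

With `wobble=1`, three distances and nine lengths give 27 candidate arms per side, provided the sequence extends far enough in both directions.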

Then from this list I will eliminate all oligos with a homodimer ΔG below -5 kcal/mol.  

Then eliminate all oligos with a hairpin ΔG below -2 kcal/mol.  

Then eliminate all probes with GC content lower than 40.4% or higher than 59.6%.  

Then eliminate all probes with poor GC clamps.  

Then eliminate all probes with melting temperatures lower than 66.7 °C or higher than 73.2 °C.  
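
The elimination steps above can be sketched as a single predicate applied to each oligo. This is a rough illustration, not the `probeDesign` implementation: the record keys are hypothetical, the thermodynamic values are assumed to be precomputed (for example with primer3) in kcal/mol and °C, and since "poor GC clamp" is not defined here, the clamp rule (1 to 3 G/C bases among the last five 3' bases) is an assumption borrowed from a common rule of thumb.

```python
def gc_content(oligo):
    """Percent of bases in the oligo that are G or C."""
    oligo = oligo.upper()
    return 100.0 * sum(base in "GC" for base in oligo) / len(oligo)


def has_gc_clamp(oligo, window=5, min_gc=1, max_gc=3):
    """ASSUMPTION: a good clamp has 1-3 G/C bases in the last 5 bases (3' end)."""
    tail = oligo.upper()[-window:]
    n_gc = sum(base in "GC" for base in tail)
    return min_gc <= n_gc <= max_gc


def passes_filters(rec):
    """Apply the filters in the order listed above.

    `rec` is a dict with hypothetical keys 'seq', 'homodimer_dg' and
    'hairpin_dg' (kcal/mol), and 'tm' (degrees Celsius).
    """
    return (
        rec["homodimer_dg"] >= -5.0
        and rec["hairpin_dg"] >= -2.0
        and 40.4 <= gc_content(rec["seq"]) <= 59.6
        and has_gc_clamp(rec["seq"])
        and 66.7 <= rec["tm"] <= 73.2
    )
```

A list of candidates would then be reduced with something like `kept = [rec for rec in candidates if passes_filters(rec)]`.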

Heterodimerization and end stability should then both be optimized: the less negative the end-stability ΔG and the less negative the heterodimerization ΔG, the better the primers. The best way I can think of to optimize these parameters simultaneously is to associate each remaining oligo with one value for heterodimerization and one for end stability, rank the oligos by each value to give two lists, and then read down both lists together. The first oligo, or group of oligos, to have appeared in both lists will be the top oligo group.
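
One possible reading of this two-list ranking is sketched below. The function name and input layout are hypothetical, and it assumes both values are ΔG figures where higher (less negative) is better: both rankings are walked from the top in parallel, and the walk stops at the first depth where some oligo has been seen in both.

```python
def top_by_joint_rank(oligos):
    """Return the first oligo(s) to appear in both ranked lists.

    `oligos` maps a name to (heterodimer_dg, end_stability_dg); for both
    values, higher (less negative) is treated as better. Returns a sorted
    list of names, or [] if `oligos` is empty.
    """
    by_het = sorted(oligos, key=lambda name: oligos[name][0], reverse=True)
    by_end = sorted(oligos, key=lambda name: oligos[name][1], reverse=True)
    seen_het, seen_end = set(), set()
    for het_name, end_name in zip(by_het, by_end):
        seen_het.add(het_name)
        seen_end.add(end_name)
        common = seen_het & seen_end
        if common:
            # More than one name can become common at the same depth.
            return sorted(common)
    return []
```

For instance, if oligo `b` is second in both lists while the two first-ranked oligos differ, `b` is returned, because it is the first name present in both lists at a depth of two.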



In [1]:
import primer3
from Bio import Seq
ref = '../../../ReferenceGenomes/hg19.fa'
regions = {'ref':ref, 'seqs':[
    {'name':'TIIIN', 'chrom':'chr15', 'loc':92527082}
]}

This next part of the code will then get a sequence that surrounds the regions of interest.

In [2]:
from probeDesign import getSeq
seq = getSeq(regions)

Now the whole sequence will be used to design all possible primers that are between 22 and 30 bp long. Thermodynamics will be associated with each primer in a DataFrame, that will then be sorted as needed.
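
For orientation, enumerating every 22 to 30 bp window might look like the sketch below. It is not the actual `possibleOligos` function: `all_oligos` and its record layout are hypothetical, it covers only the given strand, and it leaves the thermodynamic columns to be filled in separately.

```python
def all_oligos(seq, min_len=22, max_len=30):
    """Return one record per window of length min_len..max_len in seq.

    Each record is a dict with the 0-based start, the length, and the
    oligo sequence; thermodynamic values are not computed here.
    """
    records = []
    for length in range(min_len, max_len + 1):
        # range() is empty when the sequence is shorter than the window.
        for start in range(len(seq) - length + 1):
            records.append(
                {"start": start, "length": length, "seq": seq[start:start + length]}
            )
    return records
```

A list of dicts like this can be handed directly to `pandas.DataFrame(records)` if a DataFrame is wanted for sorting.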

In [3]:
from probeDesign import possibleOligos
df = possibleOligos(seq)

Now, upstream oligos will be paired with the downstream oligos to which they have the least chance of forming a heteroduplex. I will have to test how many of these pairs need to be generated for a full panel to be created because heteroduplex capacity must be considered across all probes within a panel. For the moment I will find 1 pair per possible oligo.  
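
The one-pair-per-oligo step could be sketched as follows. The names are hypothetical, and the scoring function is passed in as an argument so the sketch stays independent of how the heterodimer ΔG is actually computed; it is assumed to return a ΔG where higher (less negative) means a weaker duplex.

```python
def pair_least_dimerizing(ups, downs, heterodimer_dg):
    """Map each upstream oligo to its least-dimerizing downstream oligo.

    `heterodimer_dg(up, down)` is a caller-supplied function returning
    the heterodimer dG; the downstream oligo with the highest (least
    negative) dG is chosen for each upstream oligo.
    """
    if not downs:
        return {}
    return {
        up: max(downs, key=lambda down: heterodimer_dg(up, down))
        for up in ups
    }
```

This only considers each pair in isolation; checking heteroduplex formation across a whole panel would need a further pass over all selected probes.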

The first thing to do is eliminate the bottom 75% of oligos by 3' end stability. This will only retain those oligos with the highest ΔG values which will therefore be at a reduced risk of mispriming.
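
Keeping the top quarter by end stability could be done along these lines. This is a sketch rather than the `sampleStability` function imported below: the function name and the `end_stability_dg` key are hypothetical, and rounding the kept count up is an arbitrary choice.

```python
import math


def keep_most_stable(records, fraction=0.25, key="end_stability_dg"):
    """Keep the top `fraction` of records by dG, highest (least negative) first.

    The number kept is rounded up, so a non-empty input always keeps at
    least one record; an empty input returns an empty list.
    """
    ranked = sorted(records, key=lambda rec: rec[key], reverse=True)
    n_keep = math.ceil(len(ranked) * fraction)
    return ranked[:n_keep]
```

With eight records and the default fraction, the two with the highest ΔG values are retained.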

In [None]:
from probeDesign import sampleStability
