jbloomlab/CodonTilingPrimers

Tiling primers for codon mutagenesis

This repository contains information and a Python script for designing codon-mutant libraries of genes using a protocol from the Bloom lab.

Codon mutagenesis

Codon mutagenesis can be used to create mutant libraries of a gene with all possible codon mutations. Such libraries are useful for experiments such as deep mutational scanning or mutational antigenic profiling.

The script described below can design primers for a codon mutagenesis protocol used in the Bloom lab. This protocol randomly mutates each codon to all other possible codons.

The protocol was originally described in Bloom (2014), and the methods section of that paper describes the protocol in detail. Additionally, here are Jesse's lab notes for the codon mutagenesis experiments described in that paper. Here is a schematic from Hugh Haddox's paper on making libraries of HIV Env that graphically illustrates the process.

The protocol was subsequently improved by Adam Dingens, who designed the primers to have roughly equal melting temperatures rather than equal lengths. We think this modification improves the protocol by making the rate of mutation more uniform across the gene. The improvement is described in Dingens et al (2017), and the script included in this repository (and described below) uses it.

A few important considerations regarding our codon mutagenesis protocol:

  • Both the number of PCR cycles and accurate quantification of the template concentrations are important. Don't deviate from the protocol on these points unless you have a good reason.
  • It is important to use linear PCR product as your initial template rather than plasmid. This is because PCR is more efficient from linear templates, and the cycle numbers are optimized for that.
  • The total number of mutations that you get depends on how many rounds of the whole process you perform. In the original Bloom (2014) paper, we used three overall rounds of the process. This gave an average of almost three codon mutations per gene. That number is probably too high for most applications, so in more recent work we have used two (or sometimes even just one) rounds of the overall process to get an average closer to 1 to 1.5 codon mutations per gene. (Note that recently Shirleen in the lab has found that using just one round of the whole process but increasing the number of cycles in the fragment PCR from 7 to 10 also gives 1 to 1.5 codon mutations per gene, although these libraries are not yet verified by deep sequencing.)
  • We recommend that you Sanger sequence some clones to make sure things look reasonable before proceeding. Examples of the types of features that you want to look at are in S1 Fig of Haddox et al (2016). A script that can generate these figures from Sanger sequencing data is at https://github.com/jbloomlab/SangerMutantLibraryAnalysis

Finally, note that the number of mutations per gene variant generated by our protocol will be roughly Poisson distributed. This means that some variants will have a single mutation, while others will have zero or multiple mutations. If the mean number of mutations is one, then about 37% of variants will have no mutations, 37% will have a single mutation, and the rest (about 26%) will have more than one mutation (mostly two mutations).
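The percentages above follow directly from the Poisson formula. The short sketch below is not part of this repository; the function names are hypothetical and it assumes the number of mutations per variant is exactly Poisson, which the protocol only approximates. It can be used to estimate the expected fractions for other mean mutation rates, such as 1.5.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k mutations if the count is Poisson(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def mutation_fractions(mean):
    """Fractions of variants with 0, 1, 2, and more than 2 codon mutations."""
    p0, p1, p2 = (poisson_pmf(k, mean) for k in range(3))
    return {"0": p0, "1": p1, "2": p2, ">2": 1.0 - p0 - p1 - p2}


if __name__ == "__main__":
    # Compare a library averaging 1 mutation per gene with one averaging 1.5.
    for mean in (1.0, 1.5):
        fractions = mutation_fractions(mean)
        print(mean, {k: round(v, 3) for k, v in fractions.items()})
```

For a mean of one, this gives about 37% unmutated, 37% singly mutated, 18% doubly mutated, and 8% with three or more mutations.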

Example uses of this protocol

Here are some of the genes for which we have used this protocol:

Alternate protocols

There are alternative codon mutagenesis protocols that are intended to give just single codon mutations. Here are some of these alternative protocols:

Our lab has not tested any of these other protocols, so we cannot personally offer advice on how well any of them work.

Running the script to design primers

The create_primers.py Python script can be used to create NNN primers that tile the codons of a gene in both the forward and reverse directions. You can use these primers to make codon mutagenesis libraries. You might want to order these primers in the form of an IDT 96-well plate.
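To make the idea of a tiling primer pair concrete, here is a minimal sketch. It is not the code in create_primers.py: the function names are hypothetical, and it assumes a fixed number of flanking nucleotides on each side of the codon, whereas the script adjusts primer length to hit a melting temperature window.

```python
# Complement table for upper-case DNA plus the ambiguous base N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence that may contain N."""
    return seq.translate(COMPLEMENT)[::-1]


def codon_primer_pair(gene, codon_start, flank, ambiguous_codon="NNN"):
    """Forward and reverse primers that randomize one codon.

    gene: upper-case sequence including flanking regions.
    codon_start: 0-based index of the first nucleotide of the codon.
    flank: nucleotides kept on each side of the codon (an assumption).
    """
    if codon_start - flank < 0 or codon_start + 3 + flank > len(gene):
        raise ValueError("not enough flanking sequence around this codon")
    upstream = gene[codon_start - flank:codon_start]
    downstream = gene[codon_start + 3:codon_start + 3 + flank]
    forward = upstream + ambiguous_codon + downstream
    return forward, reverse_complement(forward)


if __name__ == "__main__":
    gene = "GGATCCATGAAGGCTTAAGAATTC"
    # Randomize the codon AAG, which starts at index 9.
    print(codon_primer_pair(gene, codon_start=9, flank=4))
```

The reverse primer is simply the reverse complement of the forward primer, so both anneal around the same codon on opposite strands.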

The script takes command line arguments. To see the help message describing them, type:

python create_primers.py -h

For instance, the file EN72-HA_primers.txt included in the repository can be generated with:

python create_primers.py EN72-HA.txt EN72 1 EN72-HA_primers.txt

Note that the parent gene file (EN72-HA.txt in this case) should have upper-case letters for the coding sequence to mutate and lower case letters for the flanking region.
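Before running the script, it can be useful to check that the parent sequence follows this case convention. The sketch below is not part of the repository; the function name is hypothetical, and it assumes the sequence has already been read into a string containing only nucleotides and whitespace.

```python
import re


def split_parent_sequence(seq):
    """Split a parent sequence into (upstream flank, coding, downstream flank).

    Flanks are expected in lower case and the coding sequence in upper case.
    """
    seq = "".join(seq.split())  # drop spaces and line breaks
    match = re.fullmatch(r"([acgt]*)([ACGT]+)([acgt]*)", seq)
    if match is None:
        raise ValueError("expected lower-case flanks around one upper-case block")
    upstream, coding, downstream = match.groups()
    if len(coding) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return upstream, coding, downstream


if __name__ == "__main__":
    up, coding, down = split_parent_sequence("ggatccATGAAGGCTtaagaattc")
    print(len(up), len(coding) // 3, len(down))
```

A check like this catches stray upper-case letters in the flanks or a coding region that is not a whole number of codons.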

There are a variety of optional parameters specifying primer length and melting temperature constraints; the default values for these optional parameters are displayed when you run the program with the -h option to get the help message.

For instance, if you want to make NNS rather than NNN codon mutations, use the option --ambiguous_codon NNS. Currently, NNN, NNS, NNK, NNG, and NNC are the supported ambiguous codons.

Additionally, the script has two --output options: plates and opools. The plates output is the original output format that lists oligos in sets of 96 for ordering as plates from IDT. The opools output removes the plate separation and adds Pool name and Ambiguous codon columns. The Pool name column is necessary for ordering oPools from IDT. The Ambiguous codon column records which ambiguous codon was used; this matters because creating an NNS oligo pool requires making primers separately for NNG and NNC.
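The relationship between these ambiguous codons can be checked by expanding the standard IUPAC codes (N = A/C/G/T, S = C/G, K = G/T). The sketch below is illustrative only; the names are hypothetical and it is not taken from create_primers.py.

```python
from itertools import product

# IUPAC nucleotide codes needed for the supported ambiguous codons.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "S": "CG", "K": "GT"}


def expand(ambiguous_codon):
    """Return the set of concrete codons encoded by an ambiguous codon."""
    choices = (IUPAC[base] for base in ambiguous_codon)
    return {"".join(codon) for codon in product(*choices)}


if __name__ == "__main__":
    for codon in ("NNN", "NNS", "NNK", "NNG", "NNC"):
        print(codon, len(expand(codon)))
    # Pooling NNG and NNC primers covers exactly the NNS codons.
    print(expand("NNG") | expand("NNC") == expand("NNS"))
```

NNN covers all 64 codons, while NNS and NNK each cover 32, and NNG and NNC each cover 16.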

You can also adjust the optional parameters described in the help message, such as:

python create_primers.py EN72-HA.txt EN72 2 EN72-HA_primers.txt --minprimertm 65 --maxprimertm 66

The script works as follows:

  1. For each codon, it first makes an ORIGINAL primer of the length specified by --startprimerlength
  2. If the original primer has a melting temperature (Tm) greater than the value specified by --maxprimertm, then nucleotides are trimmed off one by one (first from the 5' end, then the 3' end, then the 5' end again, etc) until the melting temperature is less than --maxprimertm or the length is reduced to --minlength.
  3. If the original primer has a Tm less than --minprimertm, then nucleotides are added one-by-one (first to the 3' end, then the 5' end, then the 3' end again, etc) until the melting temperature is greater than --minprimertm or the length reaches --maxlength.
  4. Note that because the primer lengths are constrained to lie between --minlength and --maxlength, the Tm may not always fall between --minprimertm and --maxprimertm. This can also happen if a primer initially exceeds --maxprimertm but the first trimming that drops it below this value also drops it below --minprimertm, or vice-versa if the primer is being extended to increase its melting temperature.
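The steps above can be sketched roughly as follows. This is not the code in create_primers.py: the function names are hypothetical, the starting primer is assumed to be centered on the codon, and toy_tm is a simple Wallace-rule stand-in (with N given the average weight) rather than the nearest-neighbor Tm the script uses.

```python
def toy_tm(primer):
    """Stand-in Tm: 2 per A/T, 4 per G/C, 3 per N (average of the four)."""
    weights = {"A": 2, "T": 2, "G": 4, "C": 4, "N": 3}
    return sum(weights[base] for base in primer)


def design_primer(seq, codon_start, start_length, min_tm, max_tm,
                  min_length, max_length, tm=toy_tm, ambiguous_codon="NNN"):
    """Forward primer for the codon at seq[codon_start:codon_start + 3].

    Assumes min_length is large enough that trimming never reaches the codon.
    """
    template = seq[:codon_start] + ambiguous_codon + seq[codon_start + 3:]
    # Step 1: original primer of start_length, roughly centered on the codon.
    left = codon_start - (start_length - 3) // 2
    right = left + start_length
    if left < 0 or right > len(template):
        raise ValueError("not enough flanking sequence for the starting primer")
    start_tm = tm(template[left:right])
    if start_tm > max_tm:
        # Step 2: trim alternately from the 5' end, then the 3' end, etc.
        trim_5prime = True
        while tm(template[left:right]) >= max_tm and right - left > min_length:
            if trim_5prime:
                left += 1
            else:
                right -= 1
            trim_5prime = not trim_5prime
    elif start_tm < min_tm:
        # Step 3: extend alternately at the 3' end, then the 5' end, etc.
        extend_3prime = True
        while tm(template[left:right]) <= min_tm and right - left < max_length:
            if extend_3prime and right < len(template):
                right += 1
            elif not extend_3prime and left > 0:
                left -= 1
            else:
                break  # simplification: stop when the template runs out
            extend_3prime = not extend_3prime
    # Step 4: the result may still fall outside [min_tm, max_tm].
    return template[left:right]


if __name__ == "__main__":
    print(design_primer("G" * 30, 12, 15, min_tm=45, max_tm=50,
                        min_length=9, max_length=25))
```

Passing a different tm function (for example one based on nearest-neighbor thermodynamics) changes the temperatures but not the trimming and extension logic.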

The Tm_NN command of the MeltingTemp module of Biopython is used to calculate Tm of primers. This calculation is based on nearest neighbor thermodynamics; nucleotides labeled N are given average values in the Tm calculation.

The result of running this script is the file specified by outfile, which lists the primers. The forward primers have names consisting of the prefix specified by primerprefix, then -for-mut, then the codon number starting with firstcodon. The reverse primers are named similarly, but with the for replaced by rev. The forward primers are grouped in sets of 96 (for ordering in 96-well plates), as are the reverse primers. The file EN72-HA_primers.txt shows an example output file.
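As an illustration of this naming and grouping scheme, here is a short sketch. The helper names are hypothetical and the code only mirrors the description above; consult EN72-HA_primers.txt for the exact layout the script writes.

```python
def primer_names(primerprefix, firstcodon, n_codons, direction="for"):
    """Names such as EN72-for-mut1, EN72-for-mut2, ... for consecutive codons."""
    if direction not in ("for", "rev"):
        raise ValueError("direction must be 'for' or 'rev'")
    return [f"{primerprefix}-{direction}-mut{firstcodon + i}"
            for i in range(n_codons)]


def group_into_plates(items, plate_size=96):
    """Split a list of primers into consecutive groups of up to plate_size."""
    return [items[i:i + plate_size] for i in range(0, len(items), plate_size)]


if __name__ == "__main__":
    forward = primer_names("EN72", firstcodon=1, n_codons=200)
    plates = group_into_plates(forward)
    # First and last primer name on each plate.
    print([(plate[0], plate[-1]) for plate in plates])
```

With 200 codons, the forward primers fill two complete plates and part of a third; the reverse primers are grouped the same way.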

Ordering mutagenesis primers from IDT

Primers can be ordered in 96-well plates from IDT.

First, manually generate an Excel document from the primer list text file output of create_primers.py in the appropriate format for submission:

  • Open this file with Excel, using the comma as the delimiter.

  • Separate the plates, giving each plate its own spreadsheet.

  • Add a column specifying the well position of each primer, going from A1 to H12, so that each sheet has three columns in the following order:

    WellPosition, Name, Sequence.
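If you prefer to script this step rather than edit the spreadsheet by hand, the sketch below shows one way to attach well positions to a plate of primers. It is not part of the repository: the function names are hypothetical, and it assumes wells are filled row by row (A1 to A12, then B1 to B12, and so on) and that a header row is wanted, so check both against IDT's current upload template.

```python
import csv
import io


def well_positions():
    """Well labels A1..A12, B1..B12, ..., H1..H12 (row-by-row order assumed)."""
    return [f"{row}{col}" for row in "ABCDEFGH" for col in range(1, 13)]


def plate_rows(primers):
    """Pair up to 96 (name, sequence) tuples with consecutive well positions."""
    if len(primers) > 96:
        raise ValueError("a plate holds at most 96 primers")
    return [(well, name, seq)
            for well, (name, seq) in zip(well_positions(), primers)]


def plate_csv(primers):
    """Return CSV text with columns WellPosition, Name, Sequence."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["WellPosition", "Name", "Sequence"])
    writer.writerows(plate_rows(primers))
    return buffer.getvalue()


if __name__ == "__main__":
    # Placeholder sequences for illustration only.
    example = [("EN72-for-mut1", "CATGNNNGCTT"), ("EN72-for-mut2", "GAAGNNNTAAG")]
    print(plate_csv(example))
```

The resulting CSV text can be saved to a file and opened in Excel, with one file or sheet per plate.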

Then, upload this document to the IDT website.

Set the IDT 96-well plate parameters:

  • scale: 25 nmol DNA oligo
  • purification: standard desalting
  • plate type: deep-well plate
  • ship option: wet
  • buffer: IDTE pH 7.5
  • normalization type: full yield
  • concentration: 100 uM (once this is set, the volume option goes away)