GetPrimers command line manual
A 64-bit Linux operating system with GCC, the g++ compiler, gzip and Perl (>= 5.8) is required. Run the following commands to install GetPrimers and the third-party software:
git clone https://github.com/codeatcg/GetPrimers.git
cd GetPrimers
sh install.sh
Third-party software
For the short primer strategy, GetPrimers uses WindowMasker to mask repeat regions in the genome and then extracts gene sequences based on the GFF file. Primer3 is used to generate raw primers. After preliminary quality control, the primers are mapped to the genome with blastn, and the product sizes are predicted from the alignment information. The primers are then filtered further. Next, all combinations of upstream and downstream primers are evaluated and graded. Finally, the specified number of gene targeting primer sets is output.
For the long primer strategy, GetPrimers also uses WindowMasker to mask repeat regions in the genome and then extracts gene sequences based on the GFF file. The default primer size is 65 bp, with 45 bp of homology to the yeast genome and 20 bp of homology to the plasmid. Users can change the primer size by modifying the config file. The primer section that is homologous to the yeast genome is then aligned to the source genome with blastn, and the hit counts are saved.
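The default long primer arithmetic above (45 bp + 20 bp = 65 bp) can be illustrated with a minimal shell sketch. The sequences are placeholders, and the order of the two sections is an assumption made for illustration; the actual layout is defined by GetPrimers and its config file.

```shell
# Placeholder sequences for illustration only; not real yeast or plasmid sequences.
# 45 bp section homologous to the yeast genome
genome_part=$(awk 'BEGIN { for (i = 0; i < 45; i++) printf "A"; print "" }')
# 20 bp section homologous to the plasmid
plasmid_part="ACGTACGTACGTACGTACGT"

# Assumption: the genome-homologous section is placed 5' of the plasmid section.
long_primer="${genome_part}${plasmid_part}"

echo "primer length: ${#long_primer} bp"
```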
- pipeline of short primer strategy
Short primer strategy
- Knockout
- C' tagging
- N' tagging
Long primer strategy
- Knockout
- C' tagging
- N' tagging
Verification primers
- Knockout
- C' tagging
- N' tagging
For the short primer strategy, the candidate gene targeting primers are graded based on a series of criteria. When the ‘--blast’ option is set, in silico PCR amplification is run and the number of probable products is used as one of the criteria for grading the primer sets. Here, a targeting primer set is composed of primers G1, G3, G4 and G2. A verification primer set is composed of V1, V3 or V4, V2 for knockout, and of V1, V2 for C’ tagging and N’ tagging.
- in silico PCR
- rating criteria for gene targeting primers
- rating criteria for verification primers
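The idea of counting probable products from alignment hits can be sketched as follows. This is a simplified illustration, not the GetPrimers implementation: the hit table layout, the primer labels F and R, and the 3000 bp size limit are all assumptions, and only forward-on-plus / reverse-on-minus pairs are considered.

```shell
tmp=$(mktemp -d)
# Hypothetical hit table: primer, chromosome, 5' end of hit, 3' end of hit, strand.
cat > "$tmp/hits.txt" <<'EOF'
F I 1000 1019 plus
R I 1500 1481 minus
F II 200 219 plus
R I 90000 89981 minus
EOF

# Pair every forward hit with every reverse hit on the same chromosome and
# report pairs that face each other within the assumed maximum product size.
products=$(awk -v max=3000 '
  $1 == "F" && $5 == "plus"  { nf++; fchr[nf] = $2; fpos[nf] = $3 }
  $1 == "R" && $5 == "minus" { nr++; rchr[nr] = $2; rpos[nr] = $3 }
  END {
    for (i = 1; i <= nf; i++)
      for (j = 1; j <= nr; j++) {
        size = rpos[j] - fpos[i] + 1
        if (fchr[i] == rchr[j] && size > 0 && size <= max)
          print fchr[i], fpos[i], rpos[j], size
      }
  }' "$tmp/hits.txt")

# Each output line: chromosome, product start, product end, product size.
echo "$products"
```

A primer pair that yields exactly one line here would have a single predicted product; more lines would indicate additional probable products.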
--ctag design primers for adding tags at the C terminal
--ntag design primers for adding tags at the N terminal
--knockout design primers for gene knockout
--force force output of G3 and G4 primers that don't meet the primer design criteria
--refine refine primer design
--sscodon primers for gene knockout contain start codon and stop codon
--mode [string] method for primer design, by default: short (long | short)
--coordinate [string] coordinate of target gene, any location in the gene. format: chr:coordinate
--coordList [file] list of coordinates of target genes. one line per gene
--geneName [string] name of target gene, name can be ID, GeneName or DbxrefID
--geneList [file] list of names of target genes. one line per gene
--all design primers for all genes
--gff [file] GFF file
--genome [file] reference genome file
--mask mask repetitive elements of genome
--marker [file] selection marker or insertion sequence
--rec redesign common sequence of primers, which is homologous to the insertion cassette (section of P1 and P2, V5, V6)
--outDir [dir] output directory
--config [file] config file (in most cases there's no need to modify the file)
--nump [int] maximum number of primers to output, by default 5
--bundle [int] when designing primers for many genes, bundle a certain number of genes together, by default 1000
--blast run blast
--thread [int] number of threads
--pcon [float] primer concentration, by default 50 nM (unit nM)
--salt [float] concentration of monovalent cation, by default 50 mM (unit mM)
--dsalt [float] concentration of divalent cation, by default 1.5 mM (unit mM)
--dntp [float] concentration of dNTP, by default 0.6 mM (unit mM)
--thermo use thermodynamic models for oligo-oligo interactions and hairpins
Flat and gzip-compressed files are supported. When using files downloaded from Ensembl or NCBI, please make sure that the genome file and the GFF file come from the same database.
The marker file contains a DNA sequence from a plasmid, which is the insertion cassette. Several insertion cassettes are available in the ‘plasmid’ directory.
In most situations there is no need to modify the config file ‘config.txt’ in the script directory. The program searches for the config file ‘config.txt’ in the script directory automatically. You can also specify a different file path with ‘--config’. Please read the comment lines in ‘config.txt’ carefully if you want to modify the default values.
When the genome file and GFF file are downloaded from NCBI, gene names, annotation IDs and gene IDs are all supported. For example:
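One quick way to catch a genome/GFF mismatch is to compare sequence names in the two files. The sketch below is an optional pre-check, not part of GetPrimers; it builds two tiny stand-in files (the file names and contents are made up) so it can run anywhere, and it assumes gzip-compressed input.

```shell
tmp=$(mktemp -d)
# Tiny stand-in files; replace with the real genome and GFF files.
printf '>I dna:chromosome\nACGT\n>II dna:chromosome\nACGT\n' | gzip > "$tmp/genome.fa.gz"
printf '##gff-version 3\nI\tsrc\tgene\t1\t4\t.\t+\t.\tID=g1\nIII\tsrc\tgene\t1\t4\t.\t+\t.\tID=g2\n' | gzip > "$tmp/anno.gff3.gz"

# Sequence names in the FASTA headers (text up to the first whitespace).
gzip -dc "$tmp/genome.fa.gz" | grep '^>' | sed 's/^>//; s/[[:space:]].*//' | sort -u > "$tmp/genome.ids"
# Sequence names in column 1 of the GFF, skipping comment lines.
gzip -dc "$tmp/anno.gff3.gz" | grep -v '^#' | cut -f 1 | sort -u > "$tmp/gff.ids"

# Names that appear in the GFF but not in the genome.
missing=$(comm -13 "$tmp/genome.ids" "$tmp/gff.ids")
echo "GFF sequence names missing from the genome: $missing"
```

An empty result suggests the two files use the same naming scheme; names such as `NC_001133.9` in one file and `I` in the other would show up here.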
- Annotation ID List:
YAL067C
YAL064C-A
- Gene Name List:
SEO1
TDA8
- Gene ID List:
851230
851234
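A gene list file for ‘--geneList’ holds one name per line. The following sketch writes such a file and, as a cautious cleanup, drops blank and repeated lines first; whether GetPrimers itself tolerates such lines is not stated here, and the raw input file name is hypothetical.

```shell
tmp=$(mktemp -d)
# Raw input with a blank line and a repeated name.
printf '%s\n' SEO1 '' TDA8 SEO1 > "$tmp/raw.list"

# Keep non-empty lines and drop repeats while preserving the original order.
awk 'NF && !seen[$0]++' "$tmp/raw.list" > "$tmp/geneName1.list"

cat "$tmp/geneName1.list"
```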
When the genome file and GFF file are downloaded from Ensembl, gene names and annotation IDs are supported, but gene IDs are not.
The format of a gene coordinate is (Chromosome name):(Coordinate). The coordinate can be any position in a gene. For example:
- NCBI
NC_001133.9:7236
NC_001133.9:13364
- Ensembl
I:7236
I:13364
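Before passing a coordinate list to ‘--coordList’, each line can be checked against the chromosome:coordinate format. The sketch below is an optional pre-check under the assumption that chromosome names contain no colon or whitespace; the third line of the sample file is deliberately malformed.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'I:7236' 'NC_001133.9:13364' 'I-7236' > "$tmp/coord1.list"

# Print every line (with its line number) that is not of the form name:integer.
bad=$(grep -vnE '^[^:[:space:]]+:[0-9]+$' "$tmp/coord1.list")
echo "malformed lines: $bad"
```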
Four sub-directories are created automatically in the output directory. Intermediate files are saved in the directories 'primer_blast', 'primer_para' and 'primer_raw'. The final results are saved in the directory 'primer_result'.
wget -c http://ftp.ensemblgenomes.org/pub/fungi/release-53/fasta/saccharomyces_cerevisiae/dna/Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz
wget -c http://ftp.ensemblgenomes.org/pub/fungi/release-53/gff3/saccharomyces_cerevisiae/Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz
GetPrimers can automatically design gene targeting primers for all genes in the genome, for a list of genes, or for a single gene. Genes can be specified by name or by location. Please use absolute file paths when running the tests. The default value of the '--mode' option is 'short'. To use the long primer strategy, add '--mode long' to the following command lines.
- All genes
knockout
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_knockout.fa --outDir all_knockout --mask --all --thread 10 --knockout --nump 10 --blast --force --refine --thermo
C’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_ctag.fa --outDir all_ctag --mask --all --thread 10 --ctag --nump 10 --blast --force --refine --thermo
N’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_GFP_ntag.fa --outDir all_ntag --mask --all --thread 10 --ntag --nump 10 --blast --force --refine --thermo
- Gene coordinate list
knockout
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_knockout.fa --outDir coord_list_knockout --mask --coordList coord1.list --thread 10 --knockout --nump 10 --blast --force --refine --thermo
C’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_ctag.fa --outDir coord_list_ctag --mask --coordList coord1.list --thread 10 --ctag --nump 10 --blast --force --refine --thermo
N’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_GFP_ntag.fa --outDir coord_list_ntag --mask --coordList coord1.list --thread 10 --ntag --nump 10 --blast --force --refine --thermo
- Gene Name list
knockout
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_knockout.fa --outDir name_list_knockout --mask --geneList geneName1.list --thread 10 --knockout --nump 10 --blast --force --refine --thermo
C’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_ctag.fa --outDir name_list_ctag --mask --geneList geneName1.list --thread 10 --ctag --nump 10 --blast --force --refine --thermo
N’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_GFP_ntag.fa --outDir name_list_ntag --mask --geneList geneName1.list --thread 10 --ntag --nump 10 --blast --force --refine --thermo
- Gene coordinate
knockout
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_knockout.fa --outDir coord_one_knockout --mask --coordinate I:7236 --thread 10 --knockout --nump 10 --blast --force --refine --thermo
C’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_ctag.fa --outDir coord_one_ctag --mask --coordinate I:7236 --thread 10 --ctag --nump 10 --blast --force --refine --thermo
N’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_GFP_ntag.fa --outDir coord_one_ntag --mask --coordinate I:7236 --thread 10 --ntag --nump 10 --blast --force --refine --thermo
- Gene name
knockout
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_knockout.fa --outDir name_one_knockout --mask --geneName SEO1 --thread 10 --knockout --nump 10 --blast --force --refine --thermo
C’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_ctag.fa --outDir name_one_ctag --mask --geneName SEO1 --thread 10 --ctag --nump 10 --blast --force --refine --thermo
N’ tagging
perl mainFun.pl --genome Saccharomyces_cerevisiae.R64-1-1.dna_sm.toplevel.fa.gz --gff Saccharomyces_cerevisiae.R64-1-1.53.gff3.gz --marker pFA6_GFP_ntag.fa --outDir name_one_ntag --mask --geneName SEO1 --thread 10 --ntag --nump 10 --blast --force --refine --thermo
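The command lines above use bare file names for brevity. Since absolute paths are recommended, the sketch below shows one way to resolve them with POSIX tools and to assemble a long-strategy command. The `abs_path` helper and the stand-in file names are hypothetical, and the command is only printed so the snippet runs without GetPrimers installed.

```shell
# Hypothetical helper: resolve a path to an absolute one using only POSIX tools.
abs_path() {
  ( cd "$(dirname "$1")" && printf '%s/%s\n' "$(pwd)" "$(basename "$1")" )
}

# Stand-in files so the sketch runs anywhere; replace with the real files.
work=$(mktemp -d)
cd "$work" || exit 1
touch genome.fa.gz anno.gff3.gz marker.fa

genome=$(abs_path genome.fa.gz)
gff=$(abs_path anno.gff3.gz)
marker=$(abs_path marker.fa)

# Print the command instead of running it; remove "echo" to execute.
echo perl mainFun.pl --genome "$genome" --gff "$gff" --marker "$marker" \
  --outDir "$work/out" --mode long --knockout --geneName SEO1
```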