For Users: CLI

billzt edited this page Aug 25, 2022 · 10 revisions

The CLI version deploys a command called primertool.

Running modes

Full mode

$ primertool full -h

In this mode, primertool accepts user-defined target regions or target sequences (called the query), designs primers for these regions, and searches user-defined databases (called templates) to determine primer specificity.

Design mode

$ primertool design -h

In this mode, primertool only designs primers and does not run a specificity check. This is typically used for NGS target sequencing panels, where primer specificity is less critical than in conventional PCR.

Checking mode

$ primertool check -h

In this mode, primertool accepts a set of user-defined primer sequences (called the query) and checks their specificity against one or more databases (called templates). This is typically used when users already have primer sequences from the literature or from colleagues.

Input (Required, important):

Two Mandatory Parameters

  query                 query file. (STDIN is acceptable)
  templates             template file in FASTA format. Allowing multiple files
                        (separated by comma), where the first one is used to
                        design primers and/or order the primer specificity

query file formats

(1) Full mode or design mode. There are two choices:

A) If you have partial template sequences, you can provide them directly in FASTA format:

>site1
TGTGATATTAAGTAAAGGAACATTAAACAATCTCGACACCAGATTGAATATCGATACAGA
TACCCCAACTGCCGCCAATTCAACCGACCCTTCACCACAAAAAAACTAATATTTATCAGC
CAATA[GTTACCTGTGTG]ATTAATAGATAAAGCTACAAAAGCAAGCTTGGTATGATAGT
TAATAATAAAAAAAGAAAAAACAAGTATCCAAATGGCCAACAAAGGCTGTATCAACAAGT
>site2
ACCAGATTGAATATCGATACAGATACCCCAACTGCCGCCAATTCAACCGACCCTTCACCA
CAAAAAAACTAATATTTATCA[GC]CAATAGTTACCTGTGTGATTAATAGATAAAGCTAC
AAGCAAGCTTGGTATGATAGTATTAATAATAAAAAAAGAAAAACAAGTATCCAAATGGCC

Note that each sequence contains a pair of square brackets marking the target. Primers are placed around the target, that is, flanking it. This is the default mode.

If you don't provide square brackets, primers may be picked from any region within each sequence. In this case, set the option --type to SEQUENCE_INCLUDED_REGION.
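As a minimal sketch of preparing such a query, the following Python helper wraps a target in square brackets and emits one FASTA record. The function name `mark_target` and the 0-based `start` argument are assumptions made for this illustration and are not part of primertool. The sequence is written on a single line so that the bracketed target is never split across lines.

```python
def mark_target(name, sequence, start, length):
    """Return a FASTA record with [ ] around sequence[start:start + length].

    `start` is 0-based here; that is a convention of this sketch only.
    """
    if start < 0 or length <= 0 or start + length > len(sequence):
        raise ValueError("target lies outside the sequence")
    end = start + length
    # Insert the brackets around the target and leave the rest untouched.
    marked = sequence[:start] + "[" + sequence[start:end] + "]" + sequence[end:]
    return ">" + name + "\n" + marked + "\n"


if __name__ == "__main__":
    # Hypothetical short sequence, used only to show the output shape.
    print(mark_target("site1", "ACGTACGTAC", 4, 2), end="")
```

Concatenating the records for several sites gives a query file of the kind shown above.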

B) If you have genomic coordinates for each site, you can provide the coordinates like this:

seq1 200 10
seq1 400 10

Each line defines one site, so this input requests primers for two sites. The first site is on seq1, starts at position 200, and is 10 bp long (seq1:200-209). The second site is also on seq1, starts at position 400, and is 10 bp long (seq1:400-409).

If you want the second amplicon to be longer than the first, add the minimum and maximum amplicon lengths as two extra columns:

seq1 200 10 100 150
seq1 400 10 200 250

This means that the first site should yield PCR amplicons of 100-150 bp, and the second site amplicons of 200-250 bp.
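The arithmetic behind these coordinate lines can be sketched in Python as follows. The `Site` type and `parse_site` function are hypothetical names introduced for this illustration, and the parser assumes whitespace-separated columns. It only reproduces the interval calculation described above and is not primertool's own parser.

```python
from typing import NamedTuple, Optional


class Site(NamedTuple):
    template: str
    start: int
    length: int
    size_min: Optional[int] = None  # optional amplicon length bounds (bp)
    size_max: Optional[int] = None

    @property
    def region(self) -> str:
        # A site starting at 200 with length 10 covers positions 200-209.
        return f"{self.template}:{self.start}-{self.start + self.length - 1}"


def parse_site(line: str) -> Site:
    """Parse 'template start length [size_min size_max]'."""
    fields = line.split()
    if len(fields) not in (3, 5):
        raise ValueError(f"expected 3 or 5 columns, got {len(fields)}")
    numbers = [int(value) for value in fields[1:]]
    return Site(fields[0], *numbers)


if __name__ == "__main__":
    for line in ("seq1 200 10 100 150", "seq1 400 10 200 250"):
        site = parse_site(line)
        print(site.region, site.size_min, site.size_max)
```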

(2) Check mode. Here is an example:

P1 CTTCTGCAATGCCAAGTCCAG   GTGGTGAAGGGTCGGTTGAA
P2 ACCAAACCCCAGAGTCAATTAA  TCTATCTATTGCACTGCCTGTTG

This input contains two primer pairs to check, one pair per line: an ID followed by the two primer sequences.
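Before running check mode, it can help to validate such a file. The sketch below assumes the three columns are separated by whitespace (spaces or tabs) and that primers contain only A, C, G, T or N. Both points are assumptions of this example, not documented primertool rules, and `parse_primer_pairs` is a hypothetical helper name.

```python
def parse_primer_pairs(text):
    """Return {pair_id: (first_primer, second_primer)} from query text."""
    pairs = {}
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns")
        pair_id, first, second = fields
        for primer in (first, second):
            # Reject characters that are not plain nucleotide codes.
            if set(primer.upper()) - set("ACGTN"):
                raise ValueError(f"line {line_no}: unexpected base in {primer}")
        pairs[pair_id] = (first, second)
    return pairs


if __name__ == "__main__":
    example = (
        "P1 CTTCTGCAATGCCAAGTCCAG   GTGGTGAAGGGTCGGTTGAA\n"
        "P2 ACCAAACCCCAGAGTCAATTAA  TCTATCTATTGCACTGCCTGTTG\n"
    )
    print(parse_primer_pairs(example))
```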

Overall Settings

  --primer-num-retain PRIMER_NUM_RETAIN
                        The maximum number of primers to retain in each site
                        in the final report. (default: 10)
  --check-multiplex     Checking dimers between primers in different sites,
                        which is useful in multiplex PCR. (default: False)
  --Tm-diff TM_DIFF     The mininum difference of melting temperature (℃)
                        suggested to produce off-target amplicon or primer
                        dimers. Suggest >10 (default: 20)
  -p CPU, --cpu CPU     Used CPU number. (default: 2)
  -o OUT, --out OUT     Output primers in JSON format. default: {query}.json
                        (default: None)
  -t TSV, --tsv TSV     Output primers in TSV format. default: {query}.tsv
                        (default: None)

Mode specific settings

Full mode and Design mode

  --type {SEQUENCE_TARGET,SEQUENCE_INCLUDED_REGION,FORCE_END}
                        designing primer types (default: SEQUENCE_TARGET)
  --pick-oligo          Pick internal Oligos (Probes) for qRT-PCR (default:
                        False)
  --product-size-min PRODUCT_SIZE_MIN
                        Lower limit of the product amplicon size range (bp).
                        (default: 70)
  --product-size-max PRODUCT_SIZE_MAX
                        Upper limit of the product amplicon size range (bp).
                        (default: 1000)
  --Tm-opt TM_OPT       Optmized melting temperature for primers (℃).
                        (default: 60)
  --primer-num-return PRIMER_NUM_RETURN
                        The maximum number of primers to return in Primer3
                        designing results. (default: 30)
  --junction            Primer pair must be separated by at least one intron
                        on the corresponding genomic DNA; or primers must span
                        an exon-exon junction. Junction data in JSON format
                        should be prepared by the command-line primertool-
                        junctions (default: False)

For details of the --type option, please refer to the legacy PrimerServer wiki. Generally speaking:

  • SEQUENCE_TARGET: the default value, used in target sequencing (SSR, SNPs, and so on)
  • SEQUENCE_INCLUDED_REGION: used in qPCR quantification
  • FORCE_END: used in SNP genotyping.

Full mode and Check mode

  -3, --use-3-end       If turned on, primer pairs having at least one
                        mismatch at the 3 end position with templates would
                        not be considered to produce off-target amplicon, even
                        if their melting temperatures are high. Turn on this
                        would find more candidate primers, but might also have
                        more false positives (default: False)
  --checking-size-min CHECKING_SIZE_MIN
                        Lower limit of the checking amplicon size range (bp).
                        (default: 50)
  --checking-size-max CHECKING_SIZE_MAX
                        Upper limit of the checking amplicon size range (bp).
                        (default: 2000)
  --amplicon-num-max AMPLICON_NUM_MAX
                        The maximum number of amplicons for checking.
                        (default: 10)
  -a, --report-amplicon-seqs
                        Get amplicon seqs (might be slow) (default: False)
  --isoform             Allow primers targeting on alternative isoforms and
                        still regard them as specific ones. Isoform data in
                        JSON format should be prepared by the command-line
                        primertool-isoforms (default: False)

Output

TSV format

The TSV output can be pasted into spreadsheet software such as Excel to view the details. Each primer is listed on its own line.

Please note that a primer's position in the templates is shown only if the primer is marked as specific, because a single line has no room for multiple positions.

JSON format

The JSON output contains all information. It can be used for visualization on a website or for extracting information for downstream analysis.

Suggestions

I want the results to be as strict as possible

Increase --checking-size-max

Increase --Tm-diff

I want as many primers as possible while accepting some false positives.

Decrease --checking-size-max

Decrease --Tm-diff

Turn on --use-3-end

Increase --primer-num-return
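The two profiles above can be turned into command lines with a small Python helper. This is only a sketch: the numeric values are illustrative choices relative to the documented defaults (2000, 20 and 30), not recommendations from the authors, the helper name `build_full_command` is hypothetical, and the positional order (query, then templates) is assumed from the help text above.

```python
import shlex


def build_full_command(query, templates, strict=True):
    """Return a shell-quoted `primertool full` command line."""
    # Multiple template files are joined with commas, as described above.
    argv = ["primertool", "full", query, ",".join(templates)]
    if strict:
        # Stricter: check larger amplicons and require a larger Tm difference.
        argv += ["--checking-size-max", "3000", "--Tm-diff", "25"]
    else:
        # More primers, at the cost of some false positives.
        argv += [
            "--checking-size-max", "1000",
            "--Tm-diff", "15",
            "--use-3-end",
            "--primer-num-return", "50",
        ]
    return shlex.join(argv)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(build_full_command("query.fa", ["genome.fa"], strict=True))
    print(build_full_command("query.fa", ["genome.fa", "extra.fa"], strict=False))
```

The function only builds the command string. Running it is left to the user, for example by pasting it into a shell.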