RM-seq: Resistance Mutation SEQuencing

Analysis bioinformatic pipeline for high-throughput assessment of resistance mutations. RM-seq is an amplicon-based, deep-sequencing technique using single molecule barcoding. We have adapted this method to identify and characterise antibiotic resistance mutation.

The complete description of the RM-seq workflow is available here

Is this the right tool for me?

To be able to us this pipeline you need to have sequenced amplicon library with molecular barcodes.
It only supports paired-end FASTQ reads (including .gz compressed fastq files).
It needs paired reads that are overlapping.
It needs a reference fasta sequences of the sequenced gene (DNA sequence).
It's written in Python3 and Perl.

Installation

Install RM-seq pipeline

pip3 install rmseq

Dependencies

RM-seq has the following package dependencies:

EMBOSS >= 6.6 for clustalo, cons, getorf, diffseq
clustal-omega >= 1.2.1
bwa >= 0.7.15
samtools >= 1.3
bedtools >= 2.26.0
pear >= 0.9.10
cd-hit >= 4.7
trimmomatic >= 0.36
seqtk >= 1.3-r106 (only if you subsample reads)
python modules: plumbum, Biopython

If you are using the OSX Brew or LinuxBrew packaging system:

brew tap homebrew/science
brew tap tseemann/bioinformatics-linux
brew install parallel; parallel --citation # please write will cite
brew install bedtools
brew install EMBOSS
brew install clustal-omega
brew install bwa
brew install samtools
brew install pear
brew install cd-hit
brew install trimmomatic
brew install seqtk
pip3 install plumbum
pip3 install biopython

Quick start

Do

rmseq

Help

usage: rmseq [-h]  ...

Run RM-seq pipeline.

optional arguments:
  -h, --help  show this help message and exit

Commands:

    run       Run the pipeline.
    version   Print version.
    check     Check pipeline dependencies
    test      Run the test data set.

To check dependencies are installed

rmseq check

To run the test dataset

rmseq test

To run analysis pipeline, follow the steps in

rmseq run -h
usage: rmseq run [options]

Run the pipeline

positional arguments:
   R1                    Path to read pair 1
   R2                    Path to read pair 2
   refnuc                Reference sequence that will be used for premapping
                    filtering and mutation annotation (fasta).
   outdir                Output directory.

optional arguments:
   -h, --help            show this help message and exit
   -d, --debug_on        Switch on debug mode.
   -f, --force           Force overwite of existing.
   -b BARLEN, --barlen BARLEN
                         Length of barcode (default 16)
   -m MINFREQ, --minfreq MINFREQ
                         Minimum barcode frequency to keep (default 5)
   -q BASEQUAL, --basequal BASEQUAL
                         Minimum base quality threshold used for trimming the
                         end of reads (trimmomatic TRAILING argument) (default
                         30)
   -c CPUS, --cpus CPUS  Number of CPUs to use (default 72)
   -t TRANSLATION, --translation TRANSLATION
                         Manually set the reading frame for translation (use 1,
                         2 or 3 - use getorf by default)
   -r MINSIZE, --minsize MINSIZE
                         Minimum ORF size in bp used when annotating variants
                         (default 200)
   -w WSIZE, --wsize WSIZE
                         Word-size option to pass to diffseq for comparison
                         with reference sequence (default 5)
   -s SUBSAMPLE, --subsample SUBSAMPLE
                         Only examine this many reads.
   -k, --keepfiles       Keep the intermediate files (default remove)
   -n, --noaln           Skip reads alignment when generating consensus (to use
                         for indel quantification only) (default align)

To check the version

rmseq version

Outputs

RM-seq produces a tap-separated output file called amplicons.effect where each raw correspond to a consensus amplicon (a genetic variant in the sequenced population):

Column	Example	Description
barcode	GACACAACTGAGATTA	sequence of the barcode
sample	Rifampicin1	output folder name
prot_mutation	H481N	annotation of the amino acid change (Histidine residue 481 substituted by Asparagine)
prot_start	481	start coordinate of the mutation
prot_end	481	end coordinate of the mutation
nuc_mutation	C1443G	annotation of the nucleotide change
nuc_start	1443	start coordinate of the nucleotide change
nuc_end	1443	end coordinate of the nucleotide change
prot	VRPPDKNNRFVGLYCTLV...	protein sequence of the consensus sequence
dna	GGTTAGACCACCCGATAA...	dna sequence of the consensus sequence
reference_barcode	CTGACACGTCCTGAAG	barcode of the identical consesnsus amplicon used for annotation

The other files produced by RM-seq are:

File name	Description
amplicons.barcodes	Table with the count of each barcode sequence
amplicons.fna	Multifasta file containing all the consensus nucleotide sequence (header of sequence is the barcode)
amplicons.faa	Multifasta file containing all the consensus protein sequence (header of sequence is the barcode)
amplicons.fna.cdhit	Multifasta file containing all the unique consensus nucleotide sequence (header of sequence is the barcode)
amplicons.faa.cdhit	Multifasta file containing all the unique consensus amino acid sequence (header of sequence is the barcode)

Issues

Please report problems to the Issues Page.

Authors

Romain Guerillot (github: rguerillot) | Torsten Seemann (github: tseemann) | Mark B Schultz (github: schultzm)

Citation

If you use this tool please cite: Guérillot R et al. Comprehensive antibiotic-linked mutation assessment by Resistance Mutation Sequencing (RM-seq). 2018. doi:10.1101/257915.

Name		Name	Last commit message	Last commit date
Latest commit History 107 Commits
RMseq		RMseq
.gitignore		.gitignore
MANIFEST.in		MANIFEST.in
README		README
README.md		README.md
consensus_reads_nb_vs_reads.md		consensus_reads_nb_vs_reads.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

RM-seq: Resistance Mutation SEQuencing

Is this the right tool for me?

Installation

Install RM-seq pipeline

Dependencies

Quick start

Help

To check dependencies are installed

To run the test dataset

To run analysis pipeline, follow the steps in

To check the version

Outputs

Issues

Authors

Citation

About

Releases

Packages

Contributors 2

Languages

rguerillot/RM-seq

Folders and files

Latest commit

History

Repository files navigation

RM-seq: Resistance Mutation SEQuencing

Is this the right tool for me?

Installation

Install RM-seq pipeline

Dependencies

Quick start

Help

To check dependencies are installed

To run the test dataset

To run analysis pipeline, follow the steps in

To check the version

Outputs

Issues

Authors

Citation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages