vsearch

Commands for installing vsearch

Cloning the repo. You will need Git, autoconf and automake to clone the repository and build VSEARCH. On a Debian-based Linux system, the three packages can be installed with:

sudo apt-get install git autoconf automake

To clone the repository and install VSEARCH use the following commands:

$ git clone https://github.com/torognes/vsearch.git
$ cd vsearch
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install

Binary distribution. If cloning/compiling fails, you may directly download the pre-compiled VSEARCH binary for your system. If you are on a Linux system:

wget https://github.com/torognes/vsearch/releases/download/v2.3.0/vsearch-2.3.0-linux-x86_64.tar.gz
tar xzf vsearch-2.3.0-linux-x86_64.tar.gz

Or, if you are on macOS:

wget https://github.com/torognes/vsearch/releases/download/v2.3.0/vsearch-2.3.0-osx-x86_64.tar.gz
tar xzf vsearch-2.3.0-osx-x86_64.tar.gz

You will now have the binary distribution in a folder named after the archive (vsearch-2.3.0-linux-x86_64 on Linux, vsearch-2.3.0-osx-x86_64 on macOS), which contains three subfolders: bin, man and doc. We recommend making a copy of, or a symbolic link to, the vsearch binary bin/vsearch in a folder included in your $PATH, and a copy of, or a symbolic link to, the vsearch man page man/vsearch.1 in a folder included in your $MANPATH. The PDF version of the manual is available in doc/vsearch_manual.pdf.
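The linking step recommended above can be sketched as a small shell helper. This is only an illustration: the function name `link_vsearch` and the default target folders (`$HOME/bin`, `$HOME/man/man1`) are assumptions, and those folders still need to be on your $PATH and $MANPATH.

```shell
# Hypothetical helper: link an unpacked VSEARCH binary distribution
# into user-level folders. Folder defaults are assumptions, adjust as needed.
link_vsearch() {
    local dist_dir="$1"
    local bin_dir="${2:-$HOME/bin}"
    local man_dir="${3:-$HOME/man/man1}"

    if [ ! -x "$dist_dir/bin/vsearch" ]; then
        echo "no executable found at $dist_dir/bin/vsearch" >&2
        return 1
    fi

    mkdir -p "$bin_dir" "$man_dir" || return 1

    # Use an absolute path so the links stay valid from any working directory.
    dist_dir=$(cd "$dist_dir" && pwd) || return 1

    ln -sf "$dist_dir/bin/vsearch" "$bin_dir/vsearch" || return 1

    # The man page is optional; link it only if the distribution ships it.
    if [ -f "$dist_dir/man/vsearch.1" ]; then
        ln -sf "$dist_dir/man/vsearch.1" "$man_dir/vsearch.1" || return 1
    fi
}
```

For example, `link_vsearch vsearch-2.3.0-linux-x86_64` would link the Linux distribution into the default folders. Symbolic links are used rather than copies so that replacing the unpacked folder with a newer release only requires re-running the helper.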

Usage

Overview. VSEARCH includes commands to perform de novo clustering using a greedy, heuristic, centroid-based algorithm with an adjustable sequence similarity threshold specified with the --id option (e.g., --id 0.97). The input sequences are either processed in the user-supplied order (--cluster_smallmem) or pre-sorted by length (--cluster_fast) or abundance (--cluster_size).

Method. Each input sequence is used as a query against an initially empty database of centroid sequences. The query sequence is clustered with the first centroid sequence found with similarity equal to or above the threshold (--id). If no matches are found, the query sequence becomes the centroid of a new cluster and is added to the database. If --maxaccepts is higher than 1 (default: 1), several centroids with sufficient sequence similarity may be found and considered. By default, the query is clustered with the centroid presenting the highest sequence similarity (distance-based greedy clustering), or, if the --sizeorder option is used, the centroid with the highest abundance (abundance-based greedy clustering).
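The procedure described above, for the default case where the first acceptable centroid wins, can be illustrated with a toy sketch. This is not VSEARCH's implementation. It assumes that similarity is the percentage of identical positions between two unaligned sequences (VSEARCH computes similarity from pairwise alignments) and that centroids are scanned in the order they were created. The function name `greedy_cluster` is hypothetical.

```shell
# Toy greedy centroid clustering (illustration only, not VSEARCH's algorithm).
# $1: identity threshold in percent; remaining arguments: sequences, in
# the order in which they should be processed.
greedy_cluster() {
    local threshold="$1"
    shift
    printf '%s\n' "$@" | awk -v t="$threshold" '
        # Assumption: identity = matching positions / longer length, no alignment.
        function identity(a, b,    i, n, m) {
            n = (length(a) > length(b)) ? length(a) : length(b)
            if (n == 0) return 0
            m = 0
            for (i = 1; i <= n; i++)
                if (i <= length(a) && substr(a, i, 1) == substr(b, i, 1)) m++
            return 100 * m / n
        }
        {
            hit = 0
            # Query the current centroid database; stop at the first hit.
            for (c = 1; c <= ncentroids; c++)
                if (identity($0, centroid[c]) >= t) { hit = c; break }
            # No hit: the query becomes the centroid of a new cluster.
            if (hit == 0) { centroid[++ncentroids] = $0; hit = ncentroids }
            print $0, "cluster=" hit
        }'
}

greedy_cluster 90 AAAAAAAAAA AAAAAAAAAT CCCCCCCCCC
```

In this call the second sequence differs from the first at one position out of ten, so it joins the first cluster at a 90% threshold, while the third sequence matches no centroid and starts a cluster of its own. Changing the order of the arguments can change which sequences become centroids, which is why the three clustering commands can give different results on the same input.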

Examples

vsearch --cluster_fast BR_cob_57ind_no_outgr.fasta --id 0.97 --centroids centroids-cf.fa --msaout msaout-cf.txt

vsearch --cluster_smallmem BR_cob_57ind_no_outgr.fasta --usersort --id 0.97 --centroids centroids-sm.fa --msaout msaout-sm.txt

vsearch --cluster_size BR_cob_57ind_no_outgr.fasta --id 0.97 --centroids centroids-sz.fa --msaout msaout-sz.txt

Note: When using `--cluster_smallmem`, the `--usersort` option indicates that the sequences are not pre-sorted by length.
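As an alternative to `--usersort`, the input can be sorted by length first and then passed to `--cluster_smallmem`. The following is a minimal sketch of that workflow, assuming VSEARCH is on your $PATH. It uses the VSEARCH command `--sortbylength` with `--output`; the function name `cluster_presorted` and the output file names are hypothetical.

```shell
# Hypothetical wrapper: sort by length, then cluster without --usersort.
# $1: input FASTA file; $2: identity threshold (default 0.97).
cluster_presorted() {
    local input="$1"
    local identity="${2:-0.97}"

    if [ ! -r "$input" ]; then
        echo "cannot read input file: $input" >&2
        return 1
    fi
    if ! command -v vsearch >/dev/null 2>&1; then
        echo "vsearch not found in PATH" >&2
        return 127
    fi

    # Sort sequences by decreasing length (file name is an assumption).
    vsearch --sortbylength "$input" --output sorted-by-length.fasta || return 1

    # The input is now length-sorted, so --usersort is not needed.
    vsearch --cluster_smallmem sorted-by-length.fasta --id "$identity" \
        --centroids centroids-sorted.fa
}
```

For example, `cluster_presorted BR_cob_57ind_no_outgr.fasta 0.97` would run both steps on the tutorial input.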

Exercise files

Input files

Filename	Description
CHANGE FILES
Anolis.fas	Input sequences

Produced output files

Filename	Description
[centroids-cf.fa](place link)	Centroids for `--cluster_fast`
[centroids-sm.fa](place link)	Centroids for `--cluster_smallmem`
[centroids-sz.fa](place link)	Centroids for `--cluster_size`
[msaout-cf.txt](place link)	Clusters for `--cluster_fast`
[msaout-sm.txt](place link)	Clusters for `--cluster_smallmem`
[msaout-sz.txt](place link)	Clusters for `--cluster_size`
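Each centroid file listed above holds one FASTA record per cluster, so counting header lines gives the number of clusters produced by each command. A small sketch for comparing the three runs follows; the function name `count_fasta_records` is hypothetical.

```shell
# Hypothetical helper: count FASTA records (lines starting with ">") in a file.
count_fasta_records() {
    if [ ! -r "$1" ]; then
        echo "cannot read file: $1" >&2
        return 1
    fi
    # grep -c prints the count but exits non-zero when it is 0, hence "|| true".
    grep -c '^>' "$1" || true
}

# Report the number of clusters for each centroid file that exists.
for f in centroids-cf.fa centroids-sm.fa centroids-sz.fa; do
    if [ -r "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_fasta_records "$f")"
    fi
done
```

Differences between the three counts reflect the different processing orders (length, user-supplied, abundance) rather than different similarity thresholds.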

More information on VSEARCH

Check the VSEARCH wiki page on clustering.
