align_seq
align_seq creates an alignment of all sequences in the stream.
align_seq currently uses Muscle as its alignment engine, so Muscle must be installed for align_seq to work.
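Because align_seq depends on Muscle, it can help to confirm that the program is available before building a pipeline. The following is a minimal sketch, assuming the Muscle executable is named `muscle` and is expected on your PATH; `have_tool` is a hypothetical helper and not part of Biopieces.

```shell
# Hypothetical helper: succeed if the named executable can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Assumption: the Muscle executable is called "muscle" and is looked up on PATH.
if have_tool muscle; then
    echo "muscle found: $(command -v muscle)"
else
    echo "muscle not found on PATH" >&2
fi
```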
For more about Muscle:
... | align_seq [options]
[-? | --help] # Print full usage description.
[-I <file!> | --stream_in=<file!>] # Read input from stream file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output to stream file - Default=STDOUT
[-v | --verbose] # Verbose output.
Consider the file test.fna containing these FASTA entries:
>test1
CTAGCTTCGACT
>test2
GAATCGACT
>test3
ACGAAACTAGCATC
>test4
AGCATCGACT
>test5
TAACAGGCACT
To align these sequences, read the file with read_fasta and pipe the stream to align_seq:
read_fasta -i test.fna | align_seq
SEQ: ---TAACAGGCACT
SEQ_LEN: 14
SEQ_NAME: test5
---
SEQ: -----GAATCGACT
SEQ_LEN: 14
SEQ_NAME: test2
---
SEQ: --CTAGCTTCGACT
SEQ_LEN: 14
SEQ_NAME: test1
---
SEQ: ACGAAACTAGCATC
SEQ_LEN: 14
SEQ_NAME: test3
---
SEQ: ----AGCATCGACT
SEQ_LEN: 14
SEQ_NAME: test4
---
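In the stream shown above, each record is a set of `KEY: value` lines terminated by `---`, and every aligned SEQ has the same length. A quick sanity check of that property could look like the sketch below. `check_aligned` is a hypothetical helper written for illustration, not a Biopiece, and it assumes the record layout is exactly as printed above.

```shell
# Hypothetical helper: read a record stream on STDIN and succeed only if
# at least one SEQ line is present and all SEQ values have the same length.
check_aligned() {
    awk '
        /^SEQ: / {
            len = length($2)              # $2 is the aligned sequence
            if (count == 0) first = len   # remember the first length seen
            else if (len != first) mismatch = 1
            count++
        }
        END { if (count == 0 || mismatch) exit 1 }
    '
}
```

It can be placed at the end of the pipeline, for example `read_fasta -i test.fna | align_seq | check_aligned && echo "alignment rows have equal length"`.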
The resulting alignment can then be written in FASTA format using write_fasta:
read_fasta -i test.fna | align_seq | write_fasta -x
>test5
---TAACAGGCACT
>test2
-----GAATCGACT
>test1
--CTAGCTTCGACT
>test3
ACGAAACTAGCATC
>test4
----AGCATCGACT
Or you can write the alignment in pretty text format using write_align:
read_fasta -i test.fna | align_seq | write_align -x
.
test5 ---TAACAGGCACT
test2 -----GAATCGACT
test1 --CTAGCTTCGACT
test3 ACGAAACTAGCATC
test4 ----AGCATCGACT
Consensus: 50% ----A-C----ACT
If there are only two aligned sequences in the stream, write_align will output a pairwise alignment in pretty text:
read_fasta -i test.fna -n 2 | align_seq | write_align -x
.
test1 CTAGCTTCGACT
         |  ||||||
test2 ---GAATCGACT
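The marker line in a pairwise alignment places a `|` under each column where the two sequences carry the same residue. The sketch below reproduces that idea for two equal-length aligned strings. `match_line` is a hypothetical function written for illustration; it is not how write_align is implemented, and it assumes a case-sensitive comparison in which gap characters never count as matches.

```shell
# Hypothetical helper: print "|" for each column where two equal-length
# aligned sequences share the same residue, and a space elsewhere.
# Assumption: comparison is case-sensitive and "-" never matches.
match_line() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        line = ""
        for (i = 1; i <= length(a); i++) {
            x = substr(a, i, 1)
            y = substr(b, i, 1)
            line = line ((x == y && x != "-") ? "|" : " ")
        }
        print line
    }'
}

# Marker line for the test1/test2 pair shown above.
match_line CTAGCTTCGACT ---GAATCGACT
```

For this pair the matches fall in column 4 and columns 7 to 12.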
Martin Asser Hansen - Copyright (C) - All rights reserved.
August 2007
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
align_seq is part of the Biopieces framework.