read_fasta
read_fasta reads sequence entries from FASTA files. Each sequence entry consists of a header line of its own, made up of a '>' followed by the sequence name, and then one or more lines of sequence that run until the next entry or the end of the file. Each entry is emitted as a biopiece record of the following type:
SEQ_NAME: test
SEQ_LEN: 10
SEQ: ATCGATCGAC
---
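To make the mapping from FASTA entry to record concrete, here is a minimal Python sketch of the parsing described above. It is not the Biopieces implementation; the function names `parse_fasta` and `make_record` are hypothetical, and treating everything after the '>' as SEQ_NAME is an assumption.

```python
import io


def make_record(name, chunks):
    """Join the sequence lines and build a record with the three keys shown above."""
    seq = "".join(chunks)
    return {"SEQ_NAME": name, "SEQ_LEN": len(seq), "SEQ": seq}


def parse_fasta(handle):
    """Yield one record per FASTA entry read from a text file handle.

    Assumption: the whole header line after '>' is used as SEQ_NAME.
    """
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if name is not None:
                yield make_record(name, chunks)
            name, chunks = line[1:].strip(), []
        elif name is not None and line:
            # Sequence may span several lines; collect them all.
            chunks.append(line)
    # The end of the file closes the last entry.
    if name is not None:
        yield make_record(name, chunks)


example = ">test\nATCGA\nTCGAC\n"
records = list(parse_fasta(io.StringIO(example)))
for record in records:
    for key, value in record.items():
        print(f"{key}: {value}")
    print("---")
```

Run on the two-line entry above, the sketch prints the same record as shown: the sequence lines are joined into one SEQ and SEQ_LEN is its length.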
Input files may be compressed with gzip or bzip2.
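How read_fasta detects compression is not stated here. As an illustration only, the Python sketch below opens a file as text whether it is plain, gzip, or bzip2 by checking the leading magic bytes; the helper name `open_maybe_compressed` is hypothetical, and sniffing magic bytes (rather than trusting the file extension) is an assumption of this sketch.

```python
import bz2
import gzip


def open_maybe_compressed(path):
    """Return a text-mode handle for a plain, gzip, or bzip2 file.

    Assumption: the format is detected from the first bytes of the file,
    not from its name.
    """
    with open(path, "rb") as probe:
        magic = probe.read(3)
    if magic[:2] == b"\x1f\x8b":  # gzip magic number
        return gzip.open(path, "rt")
    if magic == b"BZh":  # bzip2 magic number
        return bz2.open(path, "rt")
    return open(path, "r")
```

The returned handle can be used like any text file, for example `with open_maybe_compressed("test.fna.gz") as handle: ...`, where the file name is only an example.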
For more about the FASTA format:
http://en.wikipedia.org/wiki/Fasta_format
read_fasta [options] -i <FASTA file(s)>
[-? | --help] # Print full usage description.
[-i <files!> | --data_in=<files!>] # Comma separated list of files or glob expression to read.
[-n <uint> | --num=<uint>] # Limit number of records to read.
[-I <file!> | --stream_in=<file!>] # Read input stream from file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output stream to file - Default=STDOUT
[-v | --verbose] # Verbose output.
To read all FASTA entries from a file:
read_fasta -i test.fna
To read in only 10 records from a FASTA file:
read_fasta -n 10 -i test.fna
To read all FASTA entries from multiple files:
read_fasta -i test1.fna,test2.fna
To read FASTA entries from multiple files using a glob expression:
read_fasta -i '*.fna'
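Note that the glob expression is quoted in the example above, so the shell passes the pattern through unexpanded. The Python sketch below illustrates one way the `-i` value and the `-n` limit could be interpreted; it is not the Biopieces code, the names `expand_data_in` and `limit_records` are hypothetical, and the sorted order of glob matches is an assumption.

```python
import glob
from itertools import islice


def expand_data_in(spec):
    """Expand a comma separated list of files and/or glob expressions.

    Assumptions: glob matches are returned in sorted order, and a part
    that matches nothing is kept as-is so that opening it later fails
    with a clear error.
    """
    paths = []
    for part in spec.split(","):
        matches = sorted(glob.glob(part))
        paths.extend(matches if matches else [part])
    return paths


def limit_records(records, num=None):
    """Yield at most `num` records, or all of them when `num` is None."""
    return islice(records, num)
```

For example, `expand_data_in("test1.fna,test2.fna")` yields the two named files in the order given, and `limit_records(records, 10)` stops after ten records, mirroring `-n 10`.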
Martin Asser Hansen - Copyright (C) - All rights reserved.
August 2007
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
read_fasta is part of the Biopieces framework.