read_psl
Martin Asser Hansen edited this page Oct 2, 2015 · 8 revisions
read_psl reads PSL data from one or more files. The PSL format consists of up to 21 columns:
- MATCHES - Number of non-repeat matches.
- MISMATCHES - Number of mismatches.
- REPMATCHES - Number of repeat matches.
- NCOUNT - Number of Ns.
- QNUMINSERT - Number of inserts in query.
- QBASEINSERT - Number of bases inserted in query.
- SNUMINSERT - Number of inserts in subject.
- SBASEINSERT - Number of bases inserted in subject.
- STRAND - Strand.
- Q_ID - Query ID.
- Q_LEN - Query length.
- Q_BEG - Query begin.
- Q_END - Query end.
- S_ID - Subject ID.
- S_LEN - Subject length.
- S_BEG - Subject begin.
- S_END - Subject end.
- BLOCKCOUNT - Block count.
- BLOCKSIZES - Block sizes.
- Q_BEGS - Begin positions of the query sequence blocks.
- S_BEGS - Begin positions of the subject sequence blocks.
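To make the column-to-key mapping concrete, here is a minimal Python sketch that splits one tab-separated PSL line into a record keyed by the names listed above. It is not the read_psl implementation: the function name `parse_psl_line`, the choice of which keys to convert to integers, and the example line are all assumptions made for illustration.

```python
# Key names follow the column list above, in column order.
PSL_KEYS = [
    "MATCHES", "MISMATCHES", "REPMATCHES", "NCOUNT",
    "QNUMINSERT", "QBASEINSERT", "SNUMINSERT", "SBASEINSERT",
    "STRAND", "Q_ID", "Q_LEN", "Q_BEG", "Q_END",
    "S_ID", "S_LEN", "S_BEG", "S_END",
    "BLOCKCOUNT", "BLOCKSIZES", "Q_BEGS", "S_BEGS",
]

# Assumption: these columns hold single integers; the rest stay as strings.
INT_KEYS = {
    "MATCHES", "MISMATCHES", "REPMATCHES", "NCOUNT",
    "QNUMINSERT", "QBASEINSERT", "SNUMINSERT", "SBASEINSERT",
    "Q_LEN", "Q_BEG", "Q_END", "S_LEN", "S_BEG", "S_END", "BLOCKCOUNT",
}


def parse_psl_line(line):
    """Map the tab-separated fields of one PSL line onto the key names."""
    fields = line.rstrip("\n").split("\t")
    # zip() stops at the shorter sequence, so lines with fewer than
    # 21 columns simply yield fewer keys.
    record = dict(zip(PSL_KEYS, fields))
    for key in INT_KEYS & record.keys():
        record[key] = int(record[key])
    return record


# Hypothetical alignment line, invented for this example.
example_line = (
    "59\t9\t0\t0\t1\t823\t1\t96\t+\tquery1\t1000\t10\t901\t"
    "chr1\t5000\t100\t264\t2\t30,38,\t10,863,\t100,226,"
)
record = parse_psl_line(example_line)
print(record["Q_ID"], record["S_ID"], record["MATCHES"])
```

The block list columns (BLOCKSIZES, Q_BEGS, S_BEGS) are left as raw comma-separated strings here; whether read_psl keeps them in that form is not stated on this page.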
read_psl adds three additional keys:
- SCORE - Score calculated as in web BLAT results.
- SPAN - The span of the hit.
- REC_TYPE - Record type.
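As an illustration of the SCORE key, the sketch below follows the score formula that the UCSC BLAT FAQ gives for nucleotide alignments: matches plus half the repeat matches (integer division), minus mismatches and the number of inserts on each side. That read_psl computes SCORE in exactly this way is an assumption, and `psl_score` is a hypothetical helper name.

```python
def psl_score(record):
    """Web BLAT style score for a nucleotide alignment.

    Assumption: mirrors the pslScore formula from the UCSC BLAT FAQ
    with a size multiplier of 1 (nucleotide, not protein).
    """
    return (
        record["MATCHES"]
        + record["REPMATCHES"] // 2
        - record["MISMATCHES"]
        - record["QNUMINSERT"]
        - record["SNUMINSERT"]
    )


# Hypothetical record holding only the keys the formula needs.
hit = {
    "MATCHES": 59,
    "MISMATCHES": 9,
    "REPMATCHES": 0,
    "QNUMINSERT": 1,
    "SNUMINSERT": 1,
}
print(psl_score(hit))  # 59 + 0 - 9 - 1 - 1 = 48
```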
For more about the PSL format:
http://genome.ucsc.edu/FAQ/FAQformat#format2
read_psl [options] -i <PSL file(s)>
[-? | --help] # Print full usage description.
[-i <files!> | --data_in=<files!>] # Comma-separated list of files or glob expression to read.
[-n <uint> | --num=<uint>] # Limit number of records to read.
[-I <file!> | --stream_in=<file!>] # Read input stream from file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output to stream file - Default=STDOUT
[-v | --verbose] # Verbose output.
To read all PSL entries from a file:
read_psl -i test.psl
To read in only 10 records from a PSL file:
read_psl -n 10 -i test.psl
To read all PSL entries from multiple files:
read_psl -i test1.psl,test2.psl
To read PSL entries from multiple files using a glob expression:
read_psl -i '*.psl'
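The examples above combine three behaviours: a comma-separated file list, a glob expression, and a record limit. The Python sketch below shows one way such arguments could be resolved. It is not how read_psl is implemented; the names `expand_file_spec` and `read_lines` are hypothetical, and it assumes the limit applies to the total across all files and that every line is one record (no header lines).

```python
import glob
from itertools import islice


def expand_file_spec(spec):
    """Turn 'a.psl,b.psl' or '*.psl' into a list of file paths."""
    paths = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        matches = sorted(glob.glob(part))
        # Keep an unmatched name as-is so that opening it later
        # reports a clear "file not found" error.
        paths.extend(matches if matches else [part])
    return paths


def read_lines(spec, num=None):
    """Return lines from all files in spec, stopping after num lines.

    Assumption: num limits the total across files, not per file.
    """
    def lines():
        for path in expand_file_spec(spec):
            with open(path) as handle:
                for line in handle:
                    yield line.rstrip("\n")

    # islice with None as the stop value yields everything.
    return list(islice(lines(), num))
```

Quoting the glob on the command line, as in the last example, keeps the shell from expanding it so that the tool receives the pattern itself; the sketch mirrors that by expanding the pattern inside `expand_file_spec`.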
Martin Asser Hansen - Copyright (C) - All rights reserved.
August 2007
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
read_psl is part of the Biopieces framework.