Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: embl_index

Description

EMBL files can be indexed with embl_index, which for each EMBL entry outputs a Biopiece record with the entry ID, the file OFFSET position and LEN of the EMBL entry, and the full file path:

FILE: /Users/maasha/DATA/EMBL/rel_ann_pro_01_r100.dat
ID: CH482373
OFFSET: 0
LEN: 6647452
---

The index records can be piped to write_mysql to create a database for fast lookups of EMBL entries.

Note that the EMBL files cannot be gzipped, because gzipped files do not allow random access.
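The OFFSET and LEN fields are what make random access useful: a reader can seek straight to an entry and read exactly LEN bytes. Below is a minimal Python sketch of such a lookup. It assumes that OFFSET and LEN are byte counts into the uncompressed file. The function name fetch_entry is hypothetical and not part of Biopieces.

```python
def fetch_entry(path, offset, length):
    """Return the raw bytes of one entry, given its byte OFFSET and LEN.

    Hypothetical helper: assumes OFFSET and LEN are byte counts
    into the uncompressed EMBL file.
    """
    # Binary mode, so that positions are bytes rather than characters.
    with open(path, "rb") as handle:
        handle.seek(offset)          # jump directly to the start of the entry
        return handle.read(length)   # read exactly one entry
```

Under that assumption, fetch_entry("/Users/maasha/DATA/EMBL/rel_ann_pro_01_r100.dat", 0, 6647452) would return the CH482373 entry without reading the rest of the file. A gzip stream would instead have to be decompressed from the beginning to reach the same position.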

Usage

embl_index [options] -i <EMBL file(s)>

Options

[-?          | --help]               #  Print full usage description.
[-i <files!> | --data_in=<files!>]   #  Comma separated list of files or glob expression to read.
[-I <file>   | --stream_in=<file!>]  #  Read input stream from file  -  Default=STDIN
[-O <file>   | --stream_out=<file>]  #  Write output stream to file  -  Default=STDOUT
[-v          | --verbose]            #  Verbose output.

Examples

embl_index -i rel_ann_pro_01_r100.dat | head_records -n 1

FILE: /Users/maasha/DATA/EMBL/rel_ann_pro_01_r100.dat
ID: CH482373
OFFSET: 0
LEN: 6647452
---
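Records in this format are simple to consume from other tools. The following Python sketch parses "KEY: VALUE" lines terminated by "---", as in the output above, into dictionaries. The helper parse_records is hypothetical and not part of Biopieces, and the sketch assumes that values never span multiple lines.

```python
def parse_records(lines):
    """Yield one dict per record from 'KEY: VALUE' lines separated by '---'."""
    record = {}
    for line in lines:
        line = line.rstrip("\n")
        if line == "---":
            # The record terminator: emit what has been collected so far.
            if record:
                yield record
            record = {}
        elif line:
            # Split on the first ': ' only, so values may contain colons.
            key, _, value = line.partition(": ")
            record[key] = value
    if record:  # tolerate a final record without a trailing '---'
        yield record


sample = """FILE: /Users/maasha/DATA/EMBL/rel_ann_pro_01_r100.dat
ID: CH482373
OFFSET: 0
LEN: 6647452
---
"""
records = list(parse_records(sample.splitlines()))
print(records[0]["ID"], int(records[0]["OFFSET"]), int(records[0]["LEN"]))
```

The numeric fields arrive as strings, so OFFSET and LEN are converted with int() before being used as file positions.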

To create a MySQL database of EMBL entries, do:

embl_index -i rel_ann_pro_01_r100.dat | write_mysql -d EMBL -r Release_100 -x

See also

read_embl

write_mysql

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

mail@maasha.dk

July 2009

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

embl_index is part of the Biopieces framework.

http://www.biopieces.org
