Adding example use-case to Bio.SearchIO.index_db docstring.

No functional changes, skip testing with TravisCI [ci skip]
1 parent 7c5fa9f, commit 69c8c7d501ddc890a09634c54e97699695ccb59b, peterjc committed Dec 3, 2012
Showing with 7 additions and 1 deletion.
  1. +7 −1 Bio/SearchIO/
@@ -505,7 +505,7 @@ def index_db(index_filename, filenames=None, format=None,
The `index_db` function is similar to `index` in that it indexes the start
position of all queries from search output files. The main difference is
- instead of storing these indices in-memory, they are written into a flat
+ instead of storing these indices in-memory, they are written to disk as an
SQLite database file. This allows the indices to persist between Python
sessions. This enables access to any queries in the file without any
indexing overhead, provided it has been indexed at least once.
@@ -529,6 +529,12 @@ def index_db(index_filename, filenames=None, format=None,
>>> db_idx['33212']
QueryResult(id='33212', 44 hits)
+ One common example where this is helpful is if you had a large set of
+ query sequences (say ten thousand) which you split into ten query files
+ of one thousand sequences each in order to run as ten separate BLAST jobs
+ on a cluster. You could use `index_db` to index the ten BLAST output
+ files together for seamless access to all the results as one dictionary.
Note that ':memory:' rather than an index filename tells SQLite to hold
the index database in memory. This is useful for quick tests, but using
the Bio.SearchIO.index(...) function instead would use less memory.
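
The cluster use case added to the docstring could look roughly like the sketch below. It is not part of the commit. The file names (`blast_job_01.xml` and so on), the index name, the `blast-xml` format choice, and both helper functions are assumptions made for illustration; only `Bio.SearchIO.index_db` itself comes from the diff.

```python
def blast_output_names(n_jobs=10, prefix="blast_job_", ext=".xml"):
    """Return hypothetical per-job output names, e.g. blast_job_01.xml."""
    return ["%s%02d%s" % (prefix, i, ext) for i in range(1, n_jobs + 1)]


def open_combined_index(index_filename, filenames, fmt="blast-xml"):
    """Index several search output files as one SQLite-backed dictionary."""
    # Imported lazily so the filename helper works without Biopython.
    from Bio import SearchIO
    return SearchIO.index_db(index_filename, filenames, fmt)


# Usage, assuming the ten BLAST XML files exist in the working directory:
#   db_idx = open_combined_index("all_blast.idx", blast_output_names())
#   result = db_idx["33212"]   # look up any query ID across all ten files
#   db_idx.close()
```

On the first call the index file is built on disk; later sessions that pass the same index filename should be able to reuse it without re-indexing, as the docstring describes.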
