Zebulun Arendsee edited this page Feb 3, 2022 · 5 revisions

Welcome to the octofludb wiki!

Big picture

octofludb is designed to automatically retrieve and clean all data needed for routine IVA work and then to make it readily accessible.

octofludb is based on a triplestore graph database that is queryable via SPARQL. I chose this database rather than a relational database (e.g., a SQL database) for several reasons:

  • New tables of data linking strains or segments to new metadata may be easily added without modifying a schema.

  • Complex nested data, such as the data from GenBank, can be trivially loaded.

  • Data can be modularly parsed into triples.

  • SPARQL queries are often simpler than the equivalent SQL queries because no explicit table joins are required.

  • SPARQL databases support the inference of new data based on simple ontologies.
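To make the triple idea concrete, here is a minimal sketch in plain Python. It is a toy in-memory store, not octofludb's actual backend or API; the class `TripleStore`, the strain names, and the predicates (`host`, `subtype`, `state`) are all hypothetical and chosen only for illustration. It shows that new metadata is just more triples, with no schema change, and that a query is a pattern match over shared subjects rather than a table join.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """A toy in-memory triple store (illustration only, not octofludb's backend)."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Store one (subject, predicate, object) fact."""
        self.triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples that fit the pattern; None acts as a wildcard."""
        for s, p, o in self.triples:
            if subject in (None, s) and predicate in (None, p) and obj in (None, o):
                yield (s, p, o)


store = TripleStore()

# Hypothetical strains and predicates, invented for this example.
store.add("strain_A", "host", "swine")
store.add("strain_A", "subtype", "H1N1")
store.add("strain_B", "host", "swine")
store.add("strain_B", "subtype", "H3N2")

# Linking a strain to a new kind of metadata needs no schema change:
# it is just one more triple.
store.add("strain_A", "state", "Iowa")

# "Which swine strains are H1N1?" Two patterns that share a subject.
swine = {s for s, _, _ in store.match(predicate="host", obj="swine")}
h1n1 = {s for s, _, _ in store.match(predicate="subtype", obj="H1N1")}
swine_h1n1 = sorted(swine & h1n1)
print(swine_h1n1)  # ['strain_A']
```

In a real triplestore the same question would be written as a SPARQL query with two triple patterns sharing one variable; the set intersection above plays that role in this sketch.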

Usage and help

octofludb is a command-line Python utility. Its functionality is divided into many subcommands. Overall usage help and a description of each subcommand may be accessed with octofludb -h. Further usage information for a subcommand may be accessed by appending the subcommand name and -h, for example, octofludb query -h.

Extended examples can be found on the Examples page.

Installation