Ontobio is designed to work with either local files or with remote information accessed via Services.
Access is generally mediated by a factory object. The client requests an ontology by passing a handle to the factory, and the factory returns an instance of the relevant implementation.
ontobio.ontol_factory
You can load an ontology from disk (or a URL) that conforms to the obographs JSON standard.
Command line example:
ogr.py -r path/to/my/file.json
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("/path/to/my/file.json")
Requirement: OWLTools
Command line example:
ogr.py -r path/to/my/file.owl
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("/path/to/my/file.owl")
SKOS is an RDF data model for representing thesauri and terminologies.
See the SKOS primer for more details.
Command line example:
ogr.py -r path/to/my/skosfile.ttl
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("skos:/path/to/my/skosfile.ttl")
The default SPARQL service is OntoBee, which provides access to all OBO library ontologies.
Command line example:
ogr.py -r cl
Note that the official OBO library prefix must be used, e.g. cl, go, hp. See http://obofoundry.org/
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("cl")
Command line example:
ogr.py -r scigraph:ontology
Code example, using an OntologyFactory
from ontobio.ontol_factory import OntologyFactory
ont = OntologyFactory().create("scigraph:ontology")
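The handle forms used in the examples above can be summarised in a short sketch. The function below is hypothetical and is not ontobio's actual dispatch logic; the name `classify_handle` and the labels it returns are assumptions made for illustration, based only on the handles shown in this section.

```python
def classify_handle(handle):
    """Guess which kind of source a handle refers to (illustrative only)."""
    # Explicit prefixes such as "skos:" or "scigraph:" name the source type.
    if handle.startswith("skos:"):
        return "skos"
    if handle.startswith("scigraph:"):
        return "scigraph"
    # File extensions distinguish obographs JSON files from OWL files.
    if handle.endswith(".json"):
        return "obographs-json"
    if handle.endswith(".owl"):
        return "owl"
    # Anything else is treated as an OBO library prefix for the SPARQL service.
    return "sparql"


for h in ["/path/to/my/file.json", "/path/to/my/file.owl",
          "skos:/path/to/my/skosfile.ttl", "cl", "scigraph:ontology"]:
    print(h, "->", classify_handle(h))
```

In every case the client code stays the same: only the handle string passed to the factory changes.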
ontobio.assocmodel
The ontobio.AssociationSet class provides a lightweight way of storing sets of associations.
ontobio.assoc_factory
Code example: parse all associations from a GAF, and filter according to provider:
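from ontobio.io.gafparser import GafParser
# POMBASE is assumed to hold the path to a PomBase GAF file
p = GafParser()
assocs = p.parse(open(POMBASE, "r"))
pombase_assocs = [a for a in assocs if a['provided_by'] == 'PomBase']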
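Before filtering, it can help to see which providers a file actually contains. The sketch below assumes only that each association behaves like a dict with a `provided_by` key, as in the filter above. The sample records and the helper name `count_providers` are made up for illustration and do not reflect the full shape of parsed associations.

```python
from collections import Counter

# Hypothetical records standing in for parsed associations.
sample_assocs = [
    {"id": "assoc-1", "provided_by": "PomBase"},
    {"id": "assoc-2", "provided_by": "UniProt"},
    {"id": "assoc-3", "provided_by": "PomBase"},
]


def count_providers(assocs):
    """Return a Counter mapping provider name to number of associations."""
    return Counter(a["provided_by"] for a in assocs)


provider_counts = count_providers(sample_assocs)
print(provider_counts.most_common())
```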
Code example, creating AssociationSet objects, using an AssociationSetFactory
from ontobio.assoc_factory import AssociationSetFactory
afactory = AssociationSetFactory()
aset = afactory.create_from_file(file=args.assocfile, ontology=ont)
ontobio.golr.golr_query
GOlr is the name given to the Solr instance used by the Gene Ontology and Planteome projects. This has been generalized for use with the Monarch Initiative project.
GOlr provides fast access and faceted search on top of Associations (see the concepts section for more on the concept of associations). Ontobio provides both a transparent facade over GOlr, and also direct access to advanced queries.
By default an eager loading strategy is used: given a set of query criteria (minimally, subject and object categories plus a taxon, and optionally evidence, etc.), all asserted pairwise associations are loaded into an association set. For example:
aset = afactory.create(ontology=ont,
                       subject_category='gene',
                       object_category='function',
                       taxon=MOUSE)
The result is also cached, so subsequent calls do not incur the service overhead.
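The effect of caching can be illustrated with a generic sketch. This is not how ontobio implements its cache: `fetch_associations` is a hypothetical stand-in for a remote call, and `functools.lru_cache` is used only to show that repeating a query with the same criteria does not reach the service again.

```python
from functools import lru_cache

# Counts how often the stand-in "service" is actually hit.
stats = {"service_calls": 0}


@lru_cache(maxsize=None)
def fetch_associations(subject_category, object_category, taxon):
    """Hypothetical stand-in for an eager remote fetch."""
    stats["service_calls"] += 1
    # A real service would return many associations; one placeholder tuple here.
    return ((subject_category, object_category, taxon),)


# NCBITaxon:10090 is the taxon identifier for mouse.
first = fetch_associations("gene", "function", "NCBITaxon:10090")
second = fetch_associations("gene", "function", "NCBITaxon:10090")
print(stats["service_calls"])
```

The second call is answered from the cache, so the counter is incremented only once.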
For performing advanced analytic queries over the complete GOlr database, see the GolrAssociationQuery class. TODO: provide examples.
ontobio.sparql.wikidata
TODO
When remote services are used to access ontology or association set objects, caching is applied automatically to avoid repeated access. Currently an eager strategy is used, in which large blocks are fetched in advance; lazy strategies may be offered as an option in the future.
- Remote access to SciGraph/Neo4J
- Remote access to Chado databases
- Remote access to Knowledge Beacons