BioInterchange Ontologies

Bioinformatics-related ontologies, intended especially for generating RDF content using BioInterchange.

Genomic Feature and Variation Ontology (GFVO)

An ontology for describing genomic features and variants; in particular the contents of GFF3, GTF, GVF and VCF files.

Build Instructions

These build instructions are intended for collaborators and enthusiasts who would like to contribute to the BioInterchange ontologies. The ontologies can be edited using Protege, but a few post-processing steps are necessary to remove additional information that Protege inserts on its own.

Post-Protege-Cleanup

Saving an ontology with Protege introduces explicit class definitions and individuals for external URIs. These have to be removed so that the ontologies describe only BioInterchange URIs. The provided script scripts/cleanse.rb takes care of this and additionally increments the patch-level version number of the ontology.

For example, the following commands can be used to create a new cleaned version of the GFVO ontology:

./scripts/cleanse.rb < gfvo.xml > gfvo.tmp
mv gfvo.tmp gfvo.xml
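
If the cleanup script fails partway through, the two commands above would still overwrite the ontology with a truncated file. A minimal sketch of a more defensive variant follows. The function name `cleanse_in_place` is hypothetical and not part of the repository, and the sketch assumes only what the commands above show: that the script reads the ontology on standard input and writes the cleaned version to standard output.

```shell
# Hypothetical helper (not part of the repository): run a filter script over a
# file and replace the file only if the script succeeds with non-empty output.
cleanse_in_place() {
  script="$1"   # filter that reads stdin and writes stdout, e.g. ./scripts/cleanse.rb
  file="$2"     # file to rewrite, e.g. gfvo.xml
  tmp="${file}.tmp"
  if "$script" < "$file" > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$file"
  else
    # Leave the original untouched and discard the partial output.
    rm -f "$tmp"
    echo "cleanup failed; $file left unchanged" >&2
    return 1
  fi
}
```

With this helper defined, `cleanse_in_place ./scripts/cleanse.rb gfvo.xml` would perform the same two steps, but only when the script exits successfully.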

Generating GFVO for BioPortal

Due to technical limitations of BioPortal, the version of GFVO published there cannot import other ontologies or contain SIO class or property equivalences. If the imports and equivalences are kept, BioPortal's summary statistics and class browser show thousands of classes that are not part of GFVO itself.

The following command removes OWL imports as well as class and property equivalences:

grep -v '<owl:imports ' gfvo.xml | grep -v '<owl:equivalentProperty ' | grep -v '<owl:equivalentClass ' > gfvo_bioportal.xml
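
Because `grep -v` works line by line, the command above relies on each of these elements occupying a single line of the file. As a quick sanity check, the following sketch counts any lines that still match; the function name `count_bioportal_blockers` is hypothetical and not part of the repository.

```shell
# Hypothetical check (not part of the repository): print the number of lines
# that still contain an OWL import or a class/property equivalence.
count_bioportal_blockers() {
  # grep -c prints the count but exits non-zero when it is 0; "|| true" keeps
  # that case from being treated as an error by callers using "set -e".
  grep -c -E '<owl:(imports|equivalentProperty|equivalentClass) ' "$1" || true
}
```

Running `count_bioportal_blockers gfvo_bioportal.xml` after the filtering step should print 0 if none of these lines remain.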

Generating Statistics

Summary statistics about classes and properties can be output in human-readable form and as HTML via:

./scripts/stats.rb < gfvo.xml

Generating new GO Abbreviation Collection Link-Outs

A regular expression that matches valid URIs, as defined in the Gene Ontology Abbreviation Collection, can be generated automatically using the following command:

./scripts/go_xref2xsd_pattern.rb

On Mac OS X, the generated regular expression can be copied to the clipboard for subsequent pasting by piping it to the pbcopy command:

./scripts/go_xref2xsd_pattern.rb | pbcopy

Deprecated Ontologies

The following ontologies were prototypes that eventually merged into the Genomic Feature and Variation Ontology (GFVO).

Generic Feature Format Version 3 Ontology (GFF3O)

An ontology for describing GFF3 file contents.

Genome Variation Format Version 1 Ontology (GVF1O)

An ontology for describing GVF file contents.