Beginner's guide to RDF: 7. Ontologies and OWL
7.1. What is an ontology (again)?
In section 4.2 it was noted that there is no strict definition of an ontology. However, in general an ontology describes the formal constraints of the terms in a vocabulary and expresses relationships among terms in the vocabulary using some kind of ontology representation language. The transition from simple vocabulary to complex ontology is a continuum without a distinct boundary. At one extreme are vocabularies such as Darwin Core where, although defined in RDF, properties have few or no expressed relationships to other properties or classes, and there are no defined relationships among the classes. At the other extreme are complex ontologies which exhibit many of the features described in the following section.
7.2. Some features that may apply to ontologies
7.2.1. Hierarchical relationships
Ontologies can express how one term is related hierarchically to another, e.g. a class is a subclass of another, a class is broader than another, object A is
partOf object B, etc.
7.2.2. Characteristics of properties
Ontologies can define the characteristics of properties using terms that are well known to machines. For example, a property can be the inverse of another, a property can be transitive, symmetric, or functional, etc. Annotation properties (e.g.
rdfs:label) can be applied as properties to describe the characteristics of other properties. (1)
7.2.3. Properties with assumed types of values
Properties can be defined to have values that are identifiable objects, or defined to have values that are literals.
7.2.4. Ontologies can facilitate inferences
The relationships and restrictions expressed in an ontology can allow facts to be inferred that are not stated explicitly. They can also make it possible to detect logical inconsistencies in data described using terms from the ontology.
7.3. Examples of ontologies
7.3.1. Friend of a Friend (FOAF) vocabulary
The FOAF vocabulary (foaf: = http://xmlns.com/foaf/0.1/) is a very simple ontology used to describe people and their relationships to information. (2) Some terms in the vocabulary, such as foaf:fundedBy, have a meaning that is conveyed only through a human-readable text description: the definition implies nothing about either the subject or the object of the term, and the term has no relationships to other terms in the ontology. Other terms have range, domain, subclass, and subproperty assignments as described in the DCMI abstract model and RDFS (section 4.3 and section 4.4). For example, because of its range declaration, using the term
foaf:publications implies that the object of the statement is a
foaf:Document. Instances of the class
foaf:Person are also instances of
foaf:Agent because the
foaf:Person class is a subclass of the
foaf:Agent class. There are also properties that are declared to be the inverse of other properties. For example, foaf:made and foaf:maker are inverse properties, so if A foaf:made B, then it can be inferred that B foaf:maker A. Because of these declarations of the characteristics of FOAF properties, it is possible to infer additional facts that are not explicitly expressed.
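The inferences described above can be sketched in a few lines of Python. This is only an illustration: it assumes triples are stored as plain tuples and hand-codes three rules (rdfs:range, rdfs:subClassOf, and owl:inverseOf), which is far simpler than what a real reasoner does. The resources ex:alice and ex:paper1 are hypothetical.

```python
# Toy forward-chaining sketch, not a real reasoner.
# ex:alice and ex:paper1 are hypothetical resources used for illustration.
schema = {
    ("foaf:publications", "rdfs:range", "foaf:Document"),
    ("foaf:Person", "rdfs:subClassOf", "foaf:Agent"),
    ("foaf:made", "owl:inverseOf", "foaf:maker"),
}
data = {
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:publications", "ex:paper1"),
    ("ex:alice", "foaf:made", "ex:paper1"),
}


def infer(schema, data):
    """Apply the three rules repeatedly until no new triples appear."""
    triples = set(data)
    while True:
        new = set()
        for s, p, o in triples:
            for a, rel, b in schema:
                if rel == "rdfs:range" and p == a:
                    # The object of property a is an instance of class b.
                    new.add((o, "rdf:type", b))
                elif rel == "rdfs:subClassOf" and p == "rdf:type" and o == a:
                    # An instance of class a is also an instance of class b.
                    new.add((s, "rdf:type", b))
                elif rel == "owl:inverseOf":
                    # Swap subject and object when moving to the inverse property.
                    if p == a:
                        new.add((o, b, s))
                    elif p == b:
                        new.add((o, a, s))
        if new <= triples:
            return triples - set(data)  # only the triples that were not stated
        triples |= new


inferred = infer(schema, data)
for triple in sorted(inferred):
    print(*triple)
```

Running the sketch yields three triples that were never stated explicitly: ex:alice is a foaf:Agent, ex:paper1 is a foaf:Document, and ex:paper1 foaf:maker ex:alice.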
7.3.2. The Plant Ontology (PO)
The Plant Ontology is a controlled vocabulary (
po: = http://owlfiles.plantontology.org/PO_) that describes plant morphology and developmental stages. (3) It follows the principles of the Open Biological and Biomedical Ontologies (OBO) Foundry. (4) The PO is currently used to describe patterns of gene expression and the phenotypes of genetic variants.
Relationships among terms in the PO are described by a set of Relationship Types, which includes is_a, part_of, develops_from, has_part, participates_in, adjacent_to, and derives_from.
For example, the term po:0020043, which represents compound leaf, has the following relationships (detailed at http://www.plantontology.org/amigo/go.cgi?view=details&query=PO:0020043):
po:0020043 is_a po:0009025 (vascular leaf)
po:0020055 (leaf rachis) part_of po:0020043
po:0020049 (leaflet) part_of po:0020043
po:0020046 (palmate leaf) is_a po:0020043
po:0020045 (pinnate leaf) is_a po:0020043
Although the PO uses few properties to relate its terms, it has many terms. So its complexity is due to size rather than complexity of relationships.
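The relationships listed above for compound leaf can be held as simple subject-relation-object tuples and queried directly. The following is a minimal sketch using only the terms and labels given in this section; the helper name subjects_of is made up for illustration.

```python
# Relationships involving po:0020043 (compound leaf), as listed in this section.
relations = [
    ("po:0020043", "is_a", "po:0009025"),
    ("po:0020055", "part_of", "po:0020043"),
    ("po:0020049", "part_of", "po:0020043"),
    ("po:0020046", "is_a", "po:0020043"),
    ("po:0020045", "is_a", "po:0020043"),
]
labels = {
    "po:0020043": "compound leaf",
    "po:0009025": "vascular leaf",
    "po:0020055": "leaf rachis",
    "po:0020049": "leaflet",
    "po:0020046": "palmate leaf",
    "po:0020045": "pinnate leaf",
}


def subjects_of(relation, target):
    """Return the labels of all terms that have the given relation to target."""
    return sorted(labels[s] for s, r, o in relations if r == relation and o == target)


kinds = subjects_of("is_a", "po:0020043")    # kinds of compound leaf
parts = subjects_of("part_of", "po:0020043")  # parts of a compound leaf
print(kinds)
print(parts)
```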
7.3.3. TaxonConcept Ontology (txn)
This is a large and complex ontology that imports terms from other vocabularies and ontologies as well as coining new terms. It is beyond the scope of this guide to describe the TaxonConcept ontology, but it can be explored at http://lod.taxonconcept.org/ontology/doc/index.html or by examining the raw RDF at http://lod.taxonconcept.org/ontology/txn.owl .
7.4. What is Web Ontology Language (OWL)?
Web Ontology Language (OWL) is a declarative language for expressing ontologies. (5) This means that it is used to describe a state of affairs in a domain of interest in a logical way. (6) It is a knowledge representation language - it does not "do" anything in contrast to a computer language which can cause actions by providing instructions for the functioning of a computer. However, applications called "reasoners" can infer information about the state of affairs by assessing statements made using the language of an OWL ontology.
An OWL ontology can be considered an abstract model of knowledge in some domain, and is sometimes expressed in other modeling languages such as UML (7), a modeling language familiar to many programmers which shares with OWL the notions of classes and relations between them. However, OWL was designed so that ontologies could be expressed as RDF graphs, with RDF/XML as the default exchange serialization. (8) As an extension of RDF, OWL is more expressive than either its precursor RDFS (section 4.4) or generic RDF (part 3). However, this increased expressivity comes at the expense of increased complexity.
7.4.1. Varieties of OWL
Meanings can be assigned to ontologies through OWL in two ways: OWL DL and OWL Full. (9) The details of the distinction between these two semantics are beyond the scope of this guide. However, at the risk of oversimplification, one can say that OWL Full is less restrictive than OWL DL. In OWL Full, the same URI can refer to both a class and an instance of a class, or to both a class and a property. This increased expressivity comes at a price: the restrictions imposed by OWL DL ensure that a reasoner can, at least in principle, always come up with "an answer", while OWL Full is by its nature undecidable. So constraining an ontology to OWL DL makes life easier for implementers.
There is an additional version of OWL known as OWL Lite. (10) It consists of a subset of the terms available in OWL DL and OWL Full. It is less expressive but easier to implement and does provide useful terms that are not available in RDFS for describing relationships.
7.4.2. OWL language profiles
There are three profiles (11), OWL 2 EL, OWL 2 QL, and OWL 2 RL, which restrict the expressiveness of OWL in order to achieve efficiency under different circumstances. For example, OWL 2 EL is intended to be useful in circumstances where there are many properties or classes, while OWL 2 QL is intended to be useful in applications with large amounts of instance data where query answering is the primary reasoning task.
7.5. How does one create an OWL ontology?
Because of the complexity of OWL, OWL-based ontologies are nearly always constructed using a software tool called an ontology editor. The most widespread is Protégé (12) which is free and open source, although other editors are available. (13) The Protégé OWL Tutorial (14) is a straightforward guide (with examples) for using Protégé.
7.6. OWL basics
The namespace abbreviation for OWL terms is
owl: = http://www.w3.org/2002/07/owl#
7.6.1. OWL and RDFS
Since OWL is an extension of RDFS, it contains some of the concepts introduced to RDF by RDFS. OWL uses classes and properties, and includes terms from RDFS and RDF including
rdfs:domain which were discussed in section 4.3 and section 4.4.
7.6.2. owl:Class and individuals
The concept of a class in OWL is similar to its meaning in RDFS. In fact,
owl:Class rdfs:subClassOf rdfs:Class
(15) In OWL, there is a built-in most general class named
owl:Thing. All other OWL classes are automatically subclasses of
owl:Thing. In OWL, instances of classes are called individuals.
OWL allows any two classes to be declared disjoint, i.e. it is not allowable for an individual to be an instance of both classes. Unless disjointness is declared explicitly, it is allowable for an individual to be simultaneously a member of any two (or more) classes.
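A disjointness declaration gives a reasoner something to check. The sketch below is a simplified illustration, not a reasoner: it assumes class membership is already known for each individual and merely looks for individuals typed with both classes of a disjoint pair. All ex: names are hypothetical.

```python
def disjoint_violations(types, disjoint_pairs):
    """Return (individual, classA, classB) for every individual that is
    an instance of both classes of a declared disjoint pair."""
    found = []
    for individual, classes in sorted(types.items()):
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                found.append((individual, a, b))
    return found


# Hypothetical ontology statement: ex:Plant owl:disjointWith ex:Animal
disjoint_pairs = [("ex:Plant", "ex:Animal")]

# Hypothetical instance data: each individual and the classes it belongs to.
types = {
    "ex:oak1": {"ex:Plant", "ex:Tree"},       # fine: no disjointness declared
    "ex:thing2": {"ex:Plant", "ex:Animal"},   # violates the declaration
}

violations = disjoint_violations(types, disjoint_pairs)
print(violations)
```

Note that ex:oak1 belongs to two classes without any problem, because no disjointness between them has been declared.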
7.6.3. Properties in OWL
In OWL, properties state relationships involving individuals, just as in RDFS properties state relationships involving instances of classes. OWL defines two types of properties:
owl:ObjectProperty which relates individuals to other individuals and
owl:DatatypeProperty which relates individuals to data values (strings, numbers, etc.). Both of these kinds of properties are subclasses of rdf:Property.
Outside of OWL, if a particular term is defined to be
rdf:type rdf:Property, then it is not necessarily clear whether the object of a triple which has that property as a predicate should be a literal or a URI. For example, if one wanted to describe a name in RDF using
dwc:nameAccordingTo, a Darwin Core property, should the object be a literal as in:
urn:lsid:ubio.org:namebank:2472422 dwc:nameAccordingTo "Claramunt Derryberry et al. 2010"
or should it be a URI?
urn:lsid:ubio.org:namebank:2472422 dwc:nameAccordingTo http://dx.doi.org/10.1525/auk.2009.09022
(16) There is no clear answer to this question. (17) However, in OWL the distinction is clear. In the following example the appropriate type of the object resource is unambiguous. (
tc: = http://rs.tdwg.org/ontology/voc/TaxonConcept#)
urn:lsid:ubio.org:namebank:2472422 tc:accordingToString "Claramunt Derryberry et al. 2010"
urn:lsid:ubio.org:namebank:2472422 tc:accordingTo http://dx.doi.org/10.1525/auk.2009.09022
tc:accordingToString is defined as an owl:DatatypeProperty.
tc:accordingTo is defined as an owl:ObjectProperty.
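The practical value of the distinction is that software can check whether the object of a triple is of the expected kind. The following sketch assumes, as the example above implies, that tc:accordingToString is a datatype property and tc:accordingTo is an object property. The Literal and URIRef classes and the object_matches helper are hypothetical stand-ins written for this illustration, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A data value such as a string or number."""
    value: str


@dataclass(frozen=True)
class URIRef:
    """A resource identified by a URI."""
    uri: str


# Assumed property declarations (see the lead-in above).
PROPERTY_KINDS = {
    "tc:accordingToString": "owl:DatatypeProperty",
    "tc:accordingTo": "owl:ObjectProperty",
}


def object_matches(prop, obj):
    """True if obj is the kind of value the property declaration calls for."""
    if PROPERTY_KINDS[prop] == "owl:DatatypeProperty":
        return isinstance(obj, Literal)
    return isinstance(obj, URIRef)


name = Literal("Claramunt Derryberry et al. 2010")
doi = URIRef("http://dx.doi.org/10.1525/auk.2009.09022")

print(object_matches("tc:accordingToString", name))
print(object_matches("tc:accordingTo", doi))
print(object_matches("tc:accordingTo", name))
```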
7.6.4. Statements of equivalence in OWL
OWL provides a means of indicating that resources are equivalent.
Making the statement
X owl:equivalentClass Y essentially means that two named classes are synonymous, i.e. that all instances of class X are instances of class Y and vice versa. For example, in the TDWG Taxon Concept Ontology (18), the
tc:Taxon class and the
tc:TaxonConcept class are declared to be equivalent.
In OWL, if two properties are declared to be equivalent, they relate an individual to the same set of other individuals. (19) For example, Dublin Core has declared that
dcterms:creator owl:equivalentProperty foaf:maker
This means that each of the following two triples can be inferred from the other:
kimage:ac1490 dcterms:creator agents:kirchoff#coblea
kimage:ac1490 foaf:maker agents:kirchoff#coblea
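The effect of an owl:equivalentProperty declaration can be sketched as a rule that copies every triple using one property into a triple using the other. This is a minimal illustration with triples held as tuples; the function name expand_equivalents is made up for this example.

```python
def expand_equivalents(triples, equivalent_pairs):
    """Return the triples plus those implied by owl:equivalentProperty pairs."""
    result = set(triples)
    for s, p, o in triples:
        for a, b in equivalent_pairs:
            if p == a:
                result.add((s, b, o))
            elif p == b:
                result.add((s, a, o))
    return result


# dcterms:creator owl:equivalentProperty foaf:maker
equivalent_pairs = [("dcterms:creator", "foaf:maker")]

stated = {("kimage:ac1490", "dcterms:creator", "agents:kirchoff#coblea")}
expanded = expand_equivalents(stated, equivalent_pairs)
for triple in sorted(expanded):
    print(*triple)
```

Starting from the foaf:maker triple instead would produce the dcterms:creator triple in the same way, since equivalence works in both directions.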
The definition of
foaf:maker (20), which is written in OWL, declares that
foaf:maker is an
owl:ObjectProperty which means that the object of a triple using that property should be a URI. The definition of
dcterms:creator, which primarily uses RDFS, specifies a non-literal range (
dcterms:Agent), so there is no inconsistency. However, there has historically been confusion over whether
creator in Dublin Core should refer to a person or a person's name. The FOAF guidelines suggest that
dc:creator (as opposed to
dcterms:creator) should be used for textual names and
foaf:maker should be used to refer to the creators as identified by URIs. (21)
The property owl:sameAs is used to state that two individuals (i.e. class instances) are the same. The practical effect of this is to say that if we declare
uri1 owl:sameAs uri2, and a statement is made in the form of a triple containing
uri1, one can infer that the triple formed by substituting
uri2 represents a logically correct assertion (assuming that the original assertion is itself correct). Informally, this means that uri1 and uri2 identify the same resource. For example, if
http://biocol.org/urn:lsid:biocol.org:col:35115 owl:sameAs urn:lsid:biocol.org:col:35115
(22) and the triple
http://bioimages.vanderbilt.edu/ind-baskauf/11657.rdf foaf:maker http://biocol.org/urn:lsid:biocol.org:col:35115
is stated, then it can be reasoned that
http://bioimages.vanderbilt.edu/ind-baskauf/11657.rdf foaf:maker urn:lsid:biocol.org:col:35115
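The substitution described above can be sketched as follows. This is an illustrative toy, assuming triples are tuples and that owl:sameAs statements are supplied separately as pairs of URIs; the function name apply_same_as is hypothetical.

```python
def apply_same_as(triples, same_as_pairs):
    """Add every triple obtained by replacing a subject or object URI
    with a URI declared owl:sameAs it, until nothing new appears."""
    # owl:sameAs is symmetric, so substitutions work in both directions.
    swaps = set(same_as_pairs) | {(b, a) for a, b in same_as_pairs}
    result = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(result):
            for old, new in swaps:
                candidates = (
                    (new if s == old else s, p, o),
                    (s, p, new if o == old else o),
                )
                for candidate in candidates:
                    if candidate not in result:
                        result.add(candidate)
                        changed = True
    return result


same_as_pairs = [
    ("http://biocol.org/urn:lsid:biocol.org:col:35115",
     "urn:lsid:biocol.org:col:35115"),
]
stated = {
    ("http://bioimages.vanderbilt.edu/ind-baskauf/11657.rdf",
     "foaf:maker",
     "http://biocol.org/urn:lsid:biocol.org:col:35115"),
}

merged = apply_same_as(stated, same_as_pairs)
for triple in sorted(merged):
    print(*triple)
```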
owl:sameAs is a very powerful and a very dangerous thing. If Institution A describes item uri1 using many triples stored in a dataset and Institution B describes item uri2 using many other triples stored in the same dataset, then an assertion by Person C that
uri1 owl:sameAs uri2
effectively merges all parts of the graph which relate to uri1 and to uri2. This may be a good thing, but if uri1 and uri2 aren't actually precisely the same thing, the result might be silly or possibly logically inconsistent statements. (25)
7.6.5. OWL property characteristics
OWL allows the creator of an ontology to specify special characteristics of properties that can be used by machines (reasoners) to infer triples that are not explicitly stated. Several of these characteristics are described below.
If
propertyA owl:inverseOf propertyB
then if
x propertyA y
it can be inferred that
y propertyB x
Well-known pairs of inverse properties include:
If a property is transitive, then if
x property y
and
y property z
it can be inferred that
x property z
For example, in the Plant Ontology,
is_a is a transitive property. If
po:0020045 (pinnate leaf) is_a po:0020043 (compound leaf)
po:0020043 (compound leaf) is_a po:0009025 (vascular leaf)
then it can be inferred that
po:0020045 (pinnate leaf) is_a po:0009025 (vascular leaf)
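Inference over a transitive property amounts to computing a transitive closure. The sketch below does this naively for the two is_a statements above; it is meant only to illustrate the rule and would be too slow for an ontology the size of the PO.

```python
def transitive_closure(pairs):
    """Return all (x, z) pairs implied by chaining (x, y) and (y, z)."""
    closure = set(pairs)
    while True:
        new = {(a, d) for a, b in closure for c, d in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


# The two stated is_a relationships from the example above.
is_a = {
    ("po:0020045", "po:0020043"),  # pinnate leaf is_a compound leaf
    ("po:0020043", "po:0009025"),  # compound leaf is_a vascular leaf
}

closure = transitive_closure(is_a)
inferred = closure - is_a
print(inferred)
```

The only inferred pair is pinnate leaf is_a vascular leaf, matching the conclusion stated above.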
A property that is declared to be an
owl:FunctionalProperty can have only one value as its object for any given subject. An example is foaf:primaryTopic, which relates a document to the thing which is its main topic. Because of the owl:FunctionalProperty declaration, a document can have only one foaf:primaryTopic. In a manner similar to the OWL terms of equivalence, using
owl:FunctionalProperty can cause a reasoner to infer that individuals identified by different URIs are the same. For example, if two triples using
foaf:primaryTopic to describe the same subject resource have different object URIs, a reasoner would infer that the resources identified by those object URIs are the same. For example, if I state
http://bioimages.vanderbilt.edu/baskauf/10998.rdf foaf:primaryTopic http://bioimages.vanderbilt.edu/baskauf/10998
(a true statement declaring an RDF-formatted document to have an image as its
foaf:primaryTopic), and if the following statement were also made (perhaps by mistake):
http://bioimages.vanderbilt.edu/baskauf/10998.rdf foaf:primaryTopic http://bioimages.vanderbilt.edu/ind-baskauf/10997
(that the primary topic of the document is the trees which are depicted in the image), then a reasoner would conclude
http://bioimages.vanderbilt.edu/baskauf/10998 owl:sameAs http://bioimages.vanderbilt.edu/ind-baskauf/10997
(i.e. the image is the same thing as the tree) and subsequently could conclude that all properties of the image also apply to the tree and vice versa, e.g. that the tree was a StillImage created by Steven J. Baskauf and that the image was a naturalized IndividualOrganism. For this reason functional properties (and inverse functional properties) should be used with caution.
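The way a functional property leads to an owl:sameAs conclusion can be sketched as follows. This is a simplified illustration rather than a description of how any particular reasoner works: it groups the objects of each functional property by subject and declares any two distinct objects to be the same. The function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def same_as_from_functional(triples, functional_properties):
    """Infer owl:sameAs triples wherever a functional property has
    more than one object for the same subject."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p in functional_properties:
            values[(s, p)].add(o)
    inferred = set()
    for objects in values.values():
        for a, b in combinations(sorted(objects), 2):
            inferred.add((a, "owl:sameAs", b))
    return inferred


functional_properties = {"foaf:primaryTopic"}
triples = {
    ("http://bioimages.vanderbilt.edu/baskauf/10998.rdf",
     "foaf:primaryTopic",
     "http://bioimages.vanderbilt.edu/baskauf/10998"),
    # The second, mistaken statement from the example above.
    ("http://bioimages.vanderbilt.edu/baskauf/10998.rdf",
     "foaf:primaryTopic",
     "http://bioimages.vanderbilt.edu/ind-baskauf/10997"),
}

inferred = same_as_from_functional(triples, functional_properties)
for triple in sorted(inferred):
    print(*triple)
```

A single mistaken triple is enough to trigger the unwanted owl:sameAs inference, which is why the caution above applies.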
7.6.5.4. Other property characteristics
7.7. What is reasoning?
Software applications which are designed to examine sets of OWL statements and draw inferences from them are called reasoners. The function of reasoners can be described as follows:
"When humans think, they draw consequences from their knowledge. An important feature of OWL is that it captures this aspect of human intelligence for the forms of knowledge that it can represent. But what does it mean, generally speaking, that a statement is a consequence of other statements? Essentially it means that this statement is true whenever the other statements are. In OWL terms: we say, a set of statements
A entails a statement
a if in any state of affairs wherein all statements from
A are true, also
a is true. Moreover, a set of statements may be consistent (that is, there is a possible state of affairs in which all the statements in the set are jointly true) or inconsistent (there is no such state of affairs). The formal semantics of OWL specifies, in essence, for which possible “states of affairs” a particular set of OWL statements is true. There are OWL tools – reasoners – that can automatically compute consequences." (23)
One basic function that a reasoner can perform is to check an OWL ontology for consistency, i.e. whether it is possible for a class to have any instances. It can also infer an ontology class hierarchy which may go beyond the hierarchy asserted explicitly by OWL statements. (24)
Some reasoners which are available are listed at (13).
7.7.1. Some publications related to reasoning
A method of reasoning, which uses rules to infer triples from an OBO ontology followed by SPARQL queries, is described in Blondé et al. 2011 (doi: 10.1093/bioinformatics/btr164).
Hogan et al. describe a "system for performing rule-based forward-chaining reasoning which we call SAOR: Scalable Authoritative OWL Reasoner" at http://www.deri.ie/fileadmin/documents/DERI-TR-2009-04-21.pdf. Aidan Hogan, Andreas Harth and Axel Polleres. Scalable Authoritative OWL Reasoning for the Web. International Journal on Semantic Web and Information Systems, 5(2), pages 49-90, April-June 2009.
The principles of Referent Tracking, which can be used to represent and track "particulars" (instances which are based in objective reality), are discussed in Ceusters and Smith 2007. Referent Tracking is designed to support unique identifiers and reduce ambiguity with the ultimate goal of facilitating reasoning.
Calder et al. doi:10.1016/j.ecoinf.2009.08.007 describe a validation tool which uses machine reasoning to draw inferences about anomalous sensor data.
Rector et al. OWL Pizzas: Practical Experience of Teaching OWL-DL: Common Errors & Common Patterns http://www.co-ode.org/resources/papers/ekaw2004.pdf
1 Annotations in Web Ontology Language
In some forms of OWL (e.g. OWL DL) there are restrictions on the use of annotation properties because if they are used incorrectly they would prevent a machine reasoner from completing its task. See http://www.w3.org/TR/owl-ref/#Annotations
2 FOAF Vocabulary Specification
3 Plant Ontology
4 The Open Biological and Biomedical Ontologies
For unambiguous formal definitions of the primitive relationships used in OBO ontologies, see Smith et al. 2005. Relations in biomedical ontologies.
Genome Biology 6:R46 doi:10.1186/gb-2005-6-5-r46
5 OWL 2 Web Ontology Language Primer
OWL is a W3C Recommendation with a number of normative documents. This particular document was designed for novices.
6 What is OWL 2?
7 OMG Unified Modeling Language (UML)
8 OWL 2 Overview
9 OWL 2 DL and OWL 2 Full
10 OWL Lite Feature Synopsis
11 OWL 2 Web Ontology Language Profiles
12 Protégé ontology editor
13 OWL tools
A commonly used reasoner is Pellet:
14 A Practical Guide To Building OWL Ontologies Using Protégé 4 and CO-ODE Tools
15 Axiomatic Triples for the Classes of the OWL 2 RDF-Based Vocabulary
16 Expressing DOIs as HTTP URIs
17 Should Darwin Core terms have literals or URIs as objects?
Darwin Core includes "ID" versions of several terms. For example
dwc:nameAccordingTo shows strings as its examples, while its corresponding ID term
dwc:nameAccordingToID is defined to be an identifier. However, since Darwin Core does not have an RDF Guide indicating how its terms should be used and since the normative definitions of terms in RDF describe the property terms as
rdf:type rdf:Property, there are no clear guidelines for how the ID terms might be used in RDF triples. The example given in the XML Guide (http://rs.tdwg.org/dwc/terms/guides/xml/index.htm#classes) uses
dwc:locationID as both an identifier for the subject resource and as an ID reference for the object of a property. The situation is further complicated by the fact that all of the ID terms are declared to be subproperties of
dcterms:identifier which has range
rdfs:Literal. This implies that the ID terms also have range
rdfs:Literal, which might not be the intent of Darwin Core. See this for more information.
In some cases (such as
dwc:recordedBy), the definition clearly states that the object of the property should be text. But in many other cases, usage is not clear.
18 TDWG Taxon Concept Ontology
19 Equivalent properties in RDFS
In RDFS, the same thing can be accomplished by declaring two properties to be subproperties of each other, e.g.
A rdfs:subPropertyOf B B rdfs:subPropertyOf A
20 Definition of the FOAF vocabulary in RDF
21 Definition of foaf:maker
For additional discussion of
The DCMI notes on the specifications for Dublin Core metadata in RDF specify that literal value strings for a value should be expressed using the
rdf:value property for resources that should be represented as URI references.
This strategy is clarified at
which describes the best practices for this situation in Dublin Core. Use of
rdf:value is described at
22 Life Science Identifiers (LSID) Applicability Statement
The LSID Applicability Statement (pdf viewable in browser) states in Recommendation 30 that descriptions of objects identified by LSIDs must contain an OWL statement of equivalence (e.g.
owl:sameAs) that relates the LSID to its HTTP proxied form.
23 Description of reasoning in basic notions on modeling knowledge
24 Inconsistent Classes
Refer to section 4.9.2. of A Practical Guide To Building OWL Ontologies Using Protégé 4 and CO-ODE Tools at
25 Reference on owl:sameAs
Harry Halpin, Patrick J. Hayes, James P. McCusker, Deborah L. McGuinness, and Henry S. Thompson. 2010. When owl:sameAs isn’t the Same: An Analysis of Identity in Linked Data. International Semantic Web Conference (ISWC).
Thanks to Paul Murray and Bob Morris for helpful suggestions on this page.
Questions? Comments? Contact Steve Baskauf