We need a standardized approach that enables efficient access to information in Electronic Dental Records (EDRs) and integration across different dental care providers and EHR systems. Our method is to structure data from dental patient records using a realist approach. We interleave the construction of our Oral Health and Disease Ontology (OHD) with the re-encoding of the EDR data using the OHD, which more directly represents what happens during dental visits. The OHD includes terms relevant to the diagnosis and treatment of dental maladies, and is publicly available. Notably, we did not start from scratch: the OHD incorporates terms from a growing network of interoperable ontologies built using principles of the OBO Foundry. On this page we describe initial efforts to represent dental patient data contained in an EDR and to build the supporting OHD. This project includes a snapshot of the in-development ontology and sample queries that retrieve relevant data. Deidentified selections from patient records represented using the OHD are not yet publicly available. The papers below discuss the benefits and challenges of our approach with respect to meaningful use of EDRs aggregated across practices and practice software, and combined with other sources of health information such as the EHR.

OHD is in early development and subject to change without notice. The ontology can be accessed at or browsed at

For more information contact,

Publications and Presentations

Working with Data in R

We are just starting to realize the goal of working directly in R to do statistics on our data. While we are still arranging for the release of the dataset we are working with, we do have running code which you can have a look at in the repository. We are collecting it in /trunk/src/analysis. Below is the first plot demonstrating proof of concept.

The code to do this is in simple-statistics.r, and the plot is generated by a call to age_to_first_treatment_statistics(). The SPARQL query (minus the prefixes, which are defined in environment.r) is below. It puts SPARQL 1.1's aggregate functions to good use, in this case pulling back one row per patient with their birth date and the date of their earliest treatment.

# One row per patient: birth date and date of the earliest dental procedure.
SELECT ?patient (SAMPLE(?birth_date) AS ?bdate) (MIN(?treatdatei) AS ?treatdate)
WHERE {
  ?patient rdf:type dental_patient: .
  ?patient participates_in: ?procedure .
  ?procedure rdf:type dental_procedure: .
  ?procedure occurrence_date: ?treatdatei .
  ?patient birth_date: ?birth_date .
}
GROUP BY ?patient
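
Once a result of this shape is in R, the age at first treatment follows from simple date arithmetic. The sketch below is only an illustration and is not the code in simple-statistics.r: the data frame `res`, its values, and the helper `age_at_first_treatment()` are hypothetical, with column names borrowed from the SELECT clause and a year approximated as 365.25 days.

```r
# Hypothetical stand-in for the query result: one row per patient.
# Column names follow the SELECT clause; the values are invented.
res <- data.frame(
  patient   = c("p1", "p2", "p3"),
  bdate     = as.Date(c("1950-03-14", "1982-11-02", "2001-06-30")),
  treatdate = as.Date(c("2004-03-14", "2010-05-02", "2009-06-30")),
  stringsAsFactors = FALSE
)

# Age at first treatment in years (assumption: 365.25 days per year).
# Subtracting two Date vectors yields a difference in days.
age_at_first_treatment <- function(bdate, treatdate) {
  as.numeric(treatdate - bdate) / 365.25
}

res$age <- age_at_first_treatment(res$bdate, res$treatdate)

# Basic descriptive statistics; a histogram is one way to plot the result.
print(summary(res$age))
# hist(res$age, xlab = "Age at first treatment (years)", main = "")
```

Because the query already returns the minimum treatment date per patient, no further grouping is needed on the R side.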

Note the line ?procedure rdf:type dental_procedure: . Dental procedure is a parent term for a hierarchy of dental procedures currently consisting of 50 terms with a maximum depth of 6. That line illustrates an advantage of SPARQL in that querying to retrieve all children of a term in a hierarchy is a natural operation. The same operation is not as straightforward to implement in a relational database.
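
To make the contrast concrete, here is a rough sketch of what collecting all children of a term can look like when the hierarchy is stored as a flat child/parent table, as it might be in a relational setting. The table `edges`, the term names in it, and the function `descendants()` are hypothetical and are not taken from the OHD or from our repository.

```r
# Hypothetical child/parent table; the term names are invented, not OHD terms.
edges <- data.frame(
  child  = c("restorative procedure", "filling", "crown",
             "endodontic procedure", "root canal"),
  parent = c("dental procedure", "restorative procedure", "restorative procedure",
             "dental procedure", "endodontic procedure"),
  stringsAsFactors = FALSE
)

# Collect every descendant of `root` by expanding the hierarchy one level
# at a time until no new terms turn up.
descendants <- function(edges, root) {
  found <- character(0)
  frontier <- root
  while (length(frontier) > 0) {
    children <- edges$child[edges$parent %in% frontier]
    frontier <- setdiff(children, found)  # skip terms already visited
    found <- union(found, frontier)
  }
  found
}

print(descendants(edges, "dental procedure"))
```

The explicit loop (or an equivalent recursive query) is the extra work that the single rdf:type pattern avoids on the SPARQL side.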

We are grateful to OntoText for their generous grant of a research license for OWLIM, which powers our SPARQL endpoint.